AWS Snowball
User Guide
Copyright © 2018 Amazon Web Services, Inc. and/or its affiliates. All rights reserved.

Summary of Contents for AWS Snowball

  • Page 1 AWS Snowball User Guide AWS Snowball: User Guide Copyright © 2018 Amazon Web Services, Inc. and/or its affiliates. All rights reserved.
  • Page 2 AWS Snowball User Guide Amazon's trademarks and trade dress may not be used in connection with any product or service that is not Amazon's, in any manner that is likely to cause confusion among customers, or in any manner that disparages or discredits Amazon. All other trademarks not owned by Amazon are the property of their respective owners, who may or may not be affiliated with, connected to, or sponsored by...
  • Page 3: Table Of Contents

    Snowball Features ........................ 1
    Prerequisites for Using AWS Snowball .................. 2
    Tools and Interfaces ........................ 2
    Related Services ......................... 2
    Are You a First-Time User of AWS Snowball? ................. 2
    Pricing ............................ 3
    Device Differences ........................ 3
    Use Case Differences ...................... 3
    Hardware Differences ......................
  • Page 4
    Checksum Validation of Transferred Data .................. 87
    Common Validation Errors ...................... 87
    Manual Data Validation for Snowball During Transfer .............. 88
    Manual Data Validation for Snowball After Import into Amazon S3 .......... 89
    Notifications ............................ 90
    Specifications ........................... 91
    Supported Network Hardware .................... 91
    Workstation Specifications ......................
  • Page 5 AWS Snowball User Guide Document History .......................... 106 AWS Glossary ..........................108...
  • Page 6 AWS Snowball User Guide This guide is for the Snowball (50 TB or 80 TB of storage space). If you are looking for documentation for the Snowball Edge, see the AWS Snowball Edge Developer Guide.
  • Page 7: What Is A Snowball

    AWS Key Management Service (AWS KMS) and made physically rugged to secure and protect your data while the Snowball is in transit. In the US regions, Snowballs come in two sizes: 50 TB and 80 TB. All other regions have 80 TB Snowballs only.
  • Page 8: Prerequisites For Using Aws Snowball

    Snowball uses the AWS Snowball Management Console and the job management API for creating and managing jobs. To perform data transfers on the Snowball appliance locally, use the Snowball client or the Amazon S3 Adapter for Snowball. To learn more about using these in detail, see the following topics: •...
  • Page 9: Pricing

    AWS Snowball Pricing. AWS Snowball Device Differences The Snowball and the Snowball Edge are two different devices. This guide is for the Snowball. If you are looking for documentation for the Snowball Edge, see the AWS Snowball Edge Developer Guide. Both...
  • Page 10 AWS Snowball User Guide Hardware Differences – Each device has different storage capacities, as follows (usable capacity in parentheses): Snowball: 50 TB (42 TB) – US regions only, and 80 TB (72 TB); Snowball Edge: ...
  • Page 11: Tool Differences

    Amazon S3 Adapter for Snowball with Snowball Edge • Is already installed on the Snowball Edge by default. It does not need to be downloaded or installed. • Can transfer data to or from the Snowball Edge. For more information, see Using the Amazon S3 Adapter.
  • Page 12: Other Differences

    • https://aws.amazon.com/snowball • https://aws.amazon.com/snowball-edge How AWS Snowball Works with the Standard Snowball Appliance Following, you can find information on how AWS Snowball works, including concepts and its end-to-end implementation. Topics • How It Works: Concepts (p. 7) • How It Works: Implementation (p. 9)
  • Page 13: How It Works: Concepts

    Each import job uses a single Snowball appliance. After you create a job in the AWS Snowball Management Console or the job management API, we ship you a Snowball. When it arrives in a few days, you’ll connect the Snowball to your network and transfer the data that you want imported into Amazon S3 onto that Snowball using the Snowball client or the Amazon S3 Adapter for Snowball.
  • Page 14 However, this process can take longer. Once the export is done, AWS gets the Snowball ready for pickup by your region's carrier. When the Snowball arrives at your data center or office in a few days, you’ll connect the Snowball to your network and transfer the data that you want exported to your servers by using the Snowball client or the Amazon S3 Adapter for Snowball.
  • Page 15: How It Works: Implementation

    2. A Snowball is prepared for your job – We prepare a Snowball for your job, and the status of your job is now Preparing Snowball. For security purposes, data transfers must be completed within 90 days of the Snowball being prepared.
  • Page 16 In transit to AWS. 13. AWS gets the Snowball – The Snowball arrives at AWS, and the status for your job becomes At AWS. On average, it takes about a day for AWS to begin importing your data into Amazon S3.
  • Page 17: Jobs

    12. Your region's carrier returns the Snowball to AWS – When the carrier has the Snowball, the status for the job becomes In transit to AWS. At this point, if your export job has more job parts, the next job part enters the Preparing Snowball status.
  • Page 18: Job Details

    Your data source for an export job is one or more Amazon S3 buckets. Once the data for a job part is moved from Amazon S3 to a Snowball, you can download a job report. This report will alert you to any objects that failed the transfer to the Snowball.
  • Page 19: Job Statuses

    AWS Key Management Service in Snowball (p. 84). • Snowball capacity – In the US regions, Snowballs come in two sizes: 50 TB and 80 TB. All other regions have the 80 TB Snowballs only. • Storage service – The AWS storage service associated with this job, in this case Amazon S3.
  • Page 20: Setting Up

    Root Account Credentials vs. IAM User Credentials in the AWS General Reference and IAM Best Practices in IAM User Guide. If you signed up for AWS but have not created an IAM user for yourself, you can create one using the IAM console.
  • Page 21: Next Step

    Access Management Example Policies. To sign in as this new IAM user, sign out of the AWS Management Console, then use the following URL, where your_aws_account_id is your AWS account number without the hyphens (for example, if your AWS account number is 1234-5678-9012, your AWS account ID is 123456789012): https://your_aws_account_id.signin.aws.amazon.com/console/...
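Removing the hyphens is the whole transformation. A small shell sketch, using the guide's example account number (not a real account):

```shell
# Strip the hyphens from the displayed AWS account number (example value
# from this guide) to get the account ID used in the sign-in URL.
account_number="1234-5678-9012"
account_id=$(printf '%s' "$account_number" | tr -d '-')
echo "https://${account_id}.signin.aws.amazon.com/console/"
# → https://123456789012.signin.aws.amazon.com/console/
```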
  • Page 22: Getting Started

    Both sets of instructions assume that you'll use the AWS Snowball Management Console to create your job and the Snowball client to locally transfer your data. If you'd rather work programmatically, to create jobs you can use the job management API for Snowball. For more information, see AWS Snowball API Reference.
  • Page 23: Create An Import Job

    Amazon S3 bucket, the specific Amazon S3 bucket to receive your imported data, and the storage size of the Snowball. If you don't already have an Amazon S3 bucket, you can create one on this page. If you create a new Amazon S3 bucket for your destination, note that the Amazon S3 namespace for buckets is shared universally by all AWS users as a feature of the service.
  • Page 24: Receive The Snowball

    Snowball specifically for your import job into Amazon S3. During the processing stage, if there's an issue with your job, we contact you by email. Otherwise, we ship a Snowball to the address you provided when you created the job. Shipping can take a few days, but you can track the shipping status of the Snowball we prepared for your job.
  • Page 25 • Workstation – This computer hosts your mounted data source. You'll use this workstation to transfer data to the Snowball. We highly recommend that your workstation be a powerful computer, able to meet high demands in terms of processing, memory, and networking. For more information, see Workstation Specifications (p.
  • Page 26: Connect The Snowball To Your Local Network

    flipped up to rest on the top of the Snowball. Open the front panel first, flip it on top of the Snowball, and then open the back panel, flipping it up to rest on the first.
  • Page 27: Transfer Data

    Changing Your IP Address (p. 49). Make a note of the IP address shown, because you'll need it to configure the Snowball client. Important To prevent corrupting your data, do not disconnect the Snowball or change its network settings while transferring data.
  • Page 28 Now that you have your credentials, you're ready to transfer data. Install the AWS Snowball Client The Snowball client is one of the tools that you can use to transfer data from your on-premises data source to the Snowball. You can download the Snowball client for your operating system from...
  • Page 29 During data transfer, you'll notice that there is at least one folder at the root level of the Snowball. This folder and any others at this level have the same names as the destination buckets that were chosen when this job was created.
  • Page 30: Return The Appliance

    Once your package arrives at AWS and the Snowball is delivered to processing, your job status changes from In transit to AWS to At AWS. On average, it takes a day for your data import into Amazon S3 to begin. When it does, the status of your job changes to Importing. From this point on, it takes an average of two business days for your import to reach Completed status.
  • Page 31: Create An Export Job

    Snowballs that will be used with this job. We recommend that you let AWS decide on the Snowball sizes for each job part, as we will optimize for cost efficiency and speed for each job part. When you create an export job in the...
  • Page 32: Receive The Snowball

    In transit to you status. Shipping can take a few days, and you can track the shipping status of the Snowball we prepared for your job. In your job's details, you'll see a link to the tracking webpage with your tracking number provided.
  • Page 33 • Data destination – This on-premises device will hold the data that you want to transfer from the Snowball. It can be a single device, such as a hard drive or USB stick, or it can be separate destinations of data within your data center. The data destination must be mounted onto your workstation in order to transfer data to it.
  • Page 34: Connect The Snowball To Your Local Network

    flipped up to rest on the top of the Snowball. Open the front panel first, flip it on top of the Snowball, and then open the back panel, flipping it up to rest on the first.
  • Page 35: Transfer Data

    Changing Your IP Address (p. 49). Make a note of the IP address shown, because you'll need it to configure the Snowball client. Important To prevent corrupting your data, do not disconnect the Snowball or change its network settings while transferring data.
  • Page 36 Disconnect the AWS Snowball Appliance (p. 31) Get Your Credentials Each AWS Snowball job has a set of credentials that you must get to authenticate your access to the Snowball. These credentials are an encrypted manifest file and an unlock code. The manifest file contains important information about the job and permissions associated with it.
  • Page 37 Install the AWS Snowball Client The Snowball client is one of the tools that you can use to manage the flow of data from your on- premises data source to the Snowball. You can download the Snowball client for your operating system...
  • Page 38: Return The Appliance

    Snowball. Always use the shipping label that is displayed on the Snowball's E Ink display. When your region's carrier gets the Snowball, the status for the job becomes In transit to AWS. At this point, if your export job has more job parts, the next job part enters the Preparing Snowball status.
  • Page 39: Best Practices

    Snowball. For example, you might take this approach: First, save a copy of the manifest to the workstation. Then, email the unlock code to the AWS Identity and Access Management (IAM) user to perform the data transfer from the workstation. This approach limits access to the Snowball to individuals who have access to files saved on the workstation and also...
  • Page 40: Performance

    1. Use the latest Mac or Linux Snowball client – The latest Snowball clients for Mac and Linux both support the Advanced Encryption Standard New Instructions (AES-NI) extension to the x86 instruction set architecture.
  • Page 41 30 MB/second. You open a second terminal window, and run a second snowball cp command on another set of files that you want to transfer. You see that both commands are performing at 30 MB/second. In this case, your total transfer performance is 60 MB/second.
  • Page 42: How To Transfer Petabytes Of Data Efficiently

    Performance Considerations for HDFS Data Transfers When getting ready to transfer data from a Hadoop Distributed File System (HDFS) cluster (version 2.x) into a Snowball, we recommend that you follow the guidance in the previous section, and also the following tips: •...
  • Page 43 file size, and the speed at which data can be read from your local servers. The Snowball client copies data to the Snowball as fast as conditions allow. It can take as little as a day to fill a 50 TB Snowball depending on your local environment. You can copy twice as much data in the same amount of time by using two 50 TB Snowballs in parallel.
  • Page 44: Calibrating A Large Transfer

    When planning your segments, make sure that all the sizes of the data for each segment combined fit on the Snowball for this job. When segmenting your data transfer, take care not to copy the same files or directories multiple times.
  • Page 45: Transferring Data In Parallel

    AWS Snowball User Guide Transferring Data in Parallel this point, you can end the last active instance of Snowball client and make a note of your new target transfer rate. Important Your workstation should be the local host for your data. For performance reasons, we don't recommend reading files across a network when using Snowball to transfer data.
  • Page 46: Using The Snowball Console

    Management Console All jobs for AWS Snowball are created and managed from either the AWS Snowball Management Console or the job management API for AWS Snowball. The following provides an overview of how to use the AWS Snowball Management Console.
  • Page 47: Export Range Examples

    AWS Snowball User Guide Export Range Examples • The numbers 0-9 come before both uppercase and lowercase English characters. • Uppercase English characters come before all lowercase English characters. • Lowercase English characters come last when sorted against uppercase English characters and numbers.
  • Page 48: Getting Your Job Completion Report And Logs

    Whenever data is imported into or exported out of Amazon S3, you'll get a downloadable PDF job report. For import jobs, this report becomes available at the very end of the import process. For export jobs, your job report typically becomes available for you while the Snowball for your job part is being delivered to you.
  • Page 49: Canceling Jobs

    Canceling Jobs in the Console If you need to cancel a job for any reason, you can do so before it enters the Preparing Snowball status. You can only cancel jobs when they have the Job created status. Once a job begins processing, you can...
  • Page 50 AWS Snowball User Guide Canceling Jobs – To cancel a job: 1. Sign in to the AWS Management Console and open the AWS Snowball Management Console. 2. Search for and choose your job from the table. 3. From Actions, choose Cancel job. You have now canceled your job.
  • Page 51: Using A Snowball

    Following, you can find an overview of the Snowball appliance, the physically rugged appliance protected by AWS Key Management Service (AWS KMS) that you use to transfer data between your on- premises data centers and Amazon Simple Storage Service (Amazon S3). This overview includes images of the Snowball, instructions for preparing the appliance for data transfer, and networking best practices to help optimize your data transfer.
  • Page 52 AWS Snowball User Guide It has two panels, a front and a back, which are opened by latches and flipped up to rest on the top of the Snowball.
  • Page 53 AWS Snowball User Guide Open the front panel first, flip it on top of the Snowball, and then open the back panel second, flipping it up to rest on the first.
  • Page 54 E Ink display. You’ll hear the Snowball internal fans start up, and the display changes from your shipping label to say Preparing. Wait a few minutes, and the Ready screen appears. When that happens, the Snowball is ready...
  • Page 55: Changing Your Ip Address

    Changing Your IP Address The E Ink display is used to provide you with shipping labels and with the ability to manage the IP address that the Snowball uses to communicate across your local network. Changing Your IP Address You can change your IP address to a different static address, which you provide by following this procedure.
  • Page 56 At the top of the page, either RJ45, SFP+ Copper, or SFP+ Optical has been highlighted. This value represents the type of connection that Snowball is currently using to communicate on your local network. Here, on the active DHCP tap, you see your current network settings. To change to a static...
  • Page 57: Transferring Data

    – A standalone terminal application that you run on your local workstation to do your data transfer. You don't need any coding knowledge to use the Snowball client. It provides all the functionality you need to transfer data, including handling errors and writing logs to your local workstation for troubleshooting and auditing.
  • Page 58: Transferring Data With The Snowball Client

    Using the Snowball Client (p. 52) Using the Snowball Client Following, you can find an overview of the Snowball client, one of the tools that you can use to transfer data between your on-premises data center and the Snowball. The Snowball client supports transferring the following types of data to and from a Snowball.
  • Page 59 The first 10 days that the Snowball is on-site at your facility are free, and you'll want to test your data transfer ahead of time to prevent fees starting on the eleventh day.
  • Page 60 Snowball while it’s at your facility. Locate the IP address for the Snowball on the Snowball's E Ink display. When the Snowball is connected to your network for the first time, it automatically creates a DHCP IP address. If you want to use a different IP address, you can change it from the E Ink display.
  • Page 61 HDFS data transfer. Although you can write HDFS data to a Snowball, you can't write Hadoop data from a Snowball to your local HDFS. As a result, export jobs are not supported for HDFS.
  • Page 62 Before using the Snowball client to copy HDFS (version 2.x) data, take the following steps: 1. To transfer data from an HDFS cluster, get the latest version of the Snowball client. You can download and install the Snowball client from the AWS Snowball Tools Download page.
  • Page 63 Amazon S3 buckets that you chose when this job was created. You can't write data to the root level of the Snowball. All data must be written into one of the bucket folders or into their subfolders.
  • Page 64 The snowball retry command retries the snowball cp command for all the files that didn't copy the last time snowball cp was executed. The list of files that weren't copied is saved in a plaintext log in your workstation's temporary directory. The exact path to that log is printed to the terminal if the snowball cp command fails to copy a file.
  • Page 65 Validate Command for the Snowball Client Unless you specify a path, the snowball validate command validates all the metadata and transfer statuses for the objects on the Snowball. If you specify a path, then this command validates the content...
  • Page 66 Options for the snowball cp Command Following, you can find information about snowball cp command options and also syntax guidelines for using this command. You use this command to transfer data from your workstation to a Snowball. Command Option Description String.
  • Page 67 The preceding use cases are not mutually exclusive. We recommend that you use -f with care to prevent delays in data transfer. On and set to false by default.
    -h, --help – Displays the usage information for the snowball cp command in the terminal.
    --noBatch – String. Disables automatic batching of small files.
  • Page 68 However, there are some notable differences. In the following topics, you can find a reference for the syntax used by the snowball cp command. Failure to follow this syntax can lead to unexpected results when copying data to or from a Snowball.
  • Page 69 -f option. • Copying a file to a bucket on Snowball with or without a trailing slash – Copies the file into the root level directory on the Snowball that shares the name of an Amazon S3 bucket.
  • Page 70 However, because the -f is used, the file dir6 is forcibly overwritten as a directory with the contents from the source dir5. • Copying a directory to a bucket on a Snowball – Specify the bucket name in the destination path. snowball cp -r /tmp/dir7 s3://bucket-name/...
  • Page 71: Transferring Data With The Amazon S3 Adapter For Snowball

    Transferring Data with the Amazon S3 Adapter for Snowball The Amazon S3 Adapter for Snowball is a programmatic tool that you can use to transfer data between your on-premises data center and a Snowball. It replaces the functionality of the Snowball client. As with the Snowball client, you need to download the adapter's executable file from the...
  • Page 72: Using The Amazon S3 Adapter For Snowball

    Installing the adapter also adds the snowball subdirectory to your .aws directory. Within this snowball directory, you can find the logs and config subdirectories. Their contents are as follows: • The logs directory is where you find the log files for your data transfers to the Snowball through the Amazon S3 Adapter for Snowball.
  • Page 73 You can't use the AWS CLI or any of the AWS SDKs to retrieve status in this way. However, you can easily test a HEAD request over the wire by running a curl command on the adapter, as in the following example.
  • Page 74 Using the Amazon S3 Adapter for Snowball, you can programmatically transfer data to and from a Snowball with Amazon S3 API actions. However, not all Amazon S3 transfer features and API actions are supported for use with a Snowball device. For more information on the supported features, see the following: •...
  • Page 75 Usage and Example [default] profile. To specify a different profile, use this option followed by the profile name. The AWS secret key that you want to use to sign snowball-adapter -s requests to the Snowball. By default, the adapter --aws-secret- uses the key present in the default profile specified...
  • Page 76 Guide. Specifying the Adapter as the AWS CLI Endpoint When you use the AWS CLI to issue a command to the Snowball, specify that the endpoint is the Amazon S3 Adapter for Snowball, as shown following. aws s3 ls --endpoint http://<IP address for the S3...
  • Page 77 AWS Support for troubleshooting purposes. You can't use this operation with the AWS SDKs or the AWS CLI. We recommend that you use curl or an HTTP client. The request doesn't need to be signed for this operation.
  • Page 78 • The Snowball adapter is not optimized for large list operations. For example, you might have a case with over a million objects per folder where you want to list the objects after you transfer them to the device. In this type of case, we recommend that you order a Snowball Edge for your job instead. •...
  • Page 79: Shipping Considerations

    Disconnect and stow the cables the Snowball was sent with. The back panel of the Snowball has a cable caddie that holds the cables safely during the return trip. Close the two panels on the back and front of the Snowball, pressing in until you hear and feel them click.
  • Page 80: Region-Based Shipping Restrictions

    Snowball: • You arrange for UPS to pick up the Snowball by scheduling a pickup with UPS directly, or take the Snowball to a UPS package drop-off facility to be shipped to AWS. To schedule a pickup with UPS, you...
  • Page 81 In Australia, if you're shipping a Snowball back to AWS, send an email to snowball-pickup@amazon.com with Snowball Pickup Request in the subject line so we can schedule the pickup for you. In the body of the email, include the following information: •...
  • Page 82 E Ink display. • The Snowball is delivered to an AWS sorting facility and forwarded to the AWS data center. Schenker-Seino automatically reports back a tracking number for your job.
  • Page 83: Security

    GCM 256-bit keys, and the keys are cycled for every 60 GB of data transferred. 2. SSL encryption is a second layer of encryption for all data going onto or off of a standard Snowball. AWS Snowball uses server-side encryption (SSE) to protect data at rest.
  • Page 84 Snowball Use the following procedure in the Amazon S3 Management Console to enable SSE-S3 for data being imported into Amazon S3. No configuration is necessary in the AWS Snowball Management Console or on the Snowball device itself. To enable SSE-S3 encryption for the data that you're importing into Amazon S3, simply update the bucket policies for all the buckets that you're importing data into.
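The bucket-policy approach described above is commonly implemented by denying PutObject requests that don't carry the SSE-S3 header. A hedged sketch; the bucket name is a placeholder, and you should verify the policy shape against current Amazon S3 documentation before relying on it:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnencryptedUploads",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::example-import-bucket/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "AES256"
        }
      }
    }
  ]
}
```

With this in place, imported objects must arrive with SSE-S3 (`AES256`) or the upload is rejected.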
  • Page 85: Authorization And Access Control

    Creating an IAM User for Snowball If the account doing the work in the Snowball console is not the root account or administrator, you must use the IAM Management Console to grant the user the permissions necessary to create and manage jobs.
  • Page 86 Creating an IAM user generates an access key consisting of an access key ID and a secret access key, which are used to sign programmatic requests that you make to AWS. If you want to download these security credentials, choose Download. Otherwise, choose Close to return to your list of users.
  • Page 87 An IAM role must be created with read and write permissions for your Amazon S3 buckets. The role must also have a trust relationship with Snowball, so AWS can write the data in the Snowball and in your Amazon S3 buckets, depending on whether you're importing or exporting data. Creating this role is done as a step in the job creation wizard for each job.
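The trust relationship that the job creation wizard sets up can be expressed as a policy document along these lines. This is a sketch only; the service principal shown (`importexport.amazonaws.com`) is an assumption to verify against the role the wizard actually generates for your job:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "importexport.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```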
  • Page 88: Access Control

    "Resource": "arn:aws:s3:::*" Access Control As IAM resource owner, you have responsibility for access control and security for the Snowball console, Snowball, and other assets associated with using Snowball. These assets include such things as Amazon S3 buckets, the workstation that the data transfer goes through, and your on-premises data itself.
  • Page 89 AWS Snowball User Guide Access Control In some cases, we can help you grant and manage access control to the resources used in transferring your data with Snowball. In other cases, we suggest that you follow industry-wide best practices for access control. Resource...
  • Page 90: Aws Key Management Service In Snowball

    Specifically, the Amazon Resource Name (ARN) of the AWS KMS key that you choose for a job in AWS Snowball identifies the KMS key that is used to encrypt the unlock code for your job.
  • Page 91: Using The Aws-Managed Customer Master Key For Snowball

    Set your security options. Under Encryption, for KMS key either choose the AWS-managed CMK or a custom CMK that was previously created in AWS KMS, or choose Enter a key ARN if you need to enter a key that is owned by a separate account.
  • Page 92: Other Security Considerations For Snowball

    Snowball appliance sent for that job. • Don't leave the Snowball sitting on a loading dock. Left on a loading dock, it can be exposed to the elements. Although the Snowball is rugged, weather can damage the sturdiest of hardware. Report stolen, missing, or broken Snowballs as soon as possible.
  • Page 93: Data Validation

    Checksum Validation of Transferred Data When you copy a file from a local data source to the Snowball using the Snowball client or the Amazon S3 Adapter for Snowball, a number of checksums are created. These checksums are used to automatically validate data as it's transferred.
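Those checksums are created and checked automatically, but the same idea also works manually: record a digest for each file before the copy and compare it against the imported object later. A minimal sketch, assuming a Linux workstation with `md5sum` available (on macOS, `md5 -q` is the rough equivalent):

```shell
# Compute an MD5 digest for a file before transfer so it can be compared
# against a digest of the object after import into Amazon S3.
# Sketch only; /tmp/sample-object stands in for a real source file.
printf 'hello' > /tmp/sample-object
md5sum /tmp/sample-object | awk '{print $1}'
# → 5d41402abc4b2a76b9719d911017c592
```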
  • Page 94: Manual Data Validation For Snowball During Transfer

    Use the --verbose option for the Snowball client copy command When you run the Snowball client copy command, you can use the --verbose option to list all the files that are transferred to the Snowball. You can use this list to validate the content that was transferred to the Snowball.
  • Page 95: Manual Data Validation For Snowball After Import Into Amazon S3

    If your workstation can connect to the internet, you can do a final validation of all your transferred files by running the AWS CLI command aws s3 sync. This command syncs directories and S3 prefixes. This command recursively copies new and updated files from the source directory to the destination. For more information, see http://docs.aws.amazon.com/cli/latest/reference/s3/sync.html.
  • Page 96: Notifications

    AWS Snowball User Guide Snowball Notifications Snowball is designed to take advantage of the robust notifications delivered by Amazon Simple Notification Service (Amazon SNS). While creating a job, you can provide a list of comma-separated email addresses to receive email notifications for your job.
  • Page 97: Specifications

    Temperature range: 0–40°C (operational); non-operational: not specified. Altitude (operational): 0 to 3,000 m (0 to 10,000 ft). Supported Network Hardware After you open the back panel of the Snowball, you see the network ports shown in the following photograph.
  • Page 98 1G operation is indicated by a blinking amber light. 1G operation is not recommended for large-scale data transfers to the Snowball device, as it will dramatically increase the time it takes to transfer data. 10G operation is indicated by a blinking green light. It requires a Cat6A UTP cable with a maximum operating distance of 180 feet (55 meters).
  • Page 99: Workstation Specifications

    We recommend that your workstation be a computer dedicated to the task of running the Snowball client or the Amazon S3 Adapter for Snowball while you're transferring data. Each instance of the client...
  • Page 100 Snowball client or the Amazon S3 Adapter for Snowball. Once the file is encrypted, chunks of the encrypted file are sent to the Snowball. At no point is any data stored to disk. All data is kept in memory, and only encrypted data is sent to the Snowball. These steps of loading into memory, encrypting, chunking, and sending to the Snowball are both CPU- and memory- intensive.
  • Page 101: Limits

    Note The guide you're reading now is for the Snowball, which has 50 TB or 80 TB of storage space. If you are looking for documentation for the Snowball Edge, see the...
  • Page 102: Limitations On Transferring On-Premises Data With A Snowball

    • When using the Amazon S3 Adapter for Snowball with the AWS CLI to transfer data, note that the --recursive option for the cp command is only supported for uploading data to a Snowball, not for transferring data from a Snowball.
  • Page 103 • The appliance must not be physically damaged. You can prevent damage by closing the two panels on the Snowball until the latches make an audible clicking sound. • The Snowball's E Ink display must be visible, and must show the return label that was automatically generated when you finished transferring your data onto the Snowball.
  • Page 104: Troubleshooting

    The following can help you troubleshoot issues you might have with connecting to your Snowball. • Routers and switches that work at a rate of 100 megabits per second won't work with a Snowball. We recommend that you use a switch that works at a rate of 1 gigabit per second (or faster).
  • Page 105: Troubleshooting Client Problems

    If you receive this error message, you can resolve the issue by reducing the object's key length. • If you're using Linux and you can't upload files with UTF-8 characters to a Snowball, it might be because your Linux workstation doesn't recognize UTF-8 character encoding. You can correct this issue by installing the locales package on your Linux workstation and configuring it to use one...
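The key-length error above can be caught before a transfer starts. Amazon S3 limits object keys to 1,024 bytes of UTF-8, so a key can be too long even when its character count looks fine, because multi-byte characters count per byte:

```python
# Check whether an object key exceeds Amazon S3's 1,024-byte UTF-8 limit.

S3_MAX_KEY_BYTES = 1024

def key_too_long(key: str) -> bool:
    """True if the key's UTF-8 encoding exceeds the S3 key-length limit."""
    return len(key.encode("utf-8")) > S3_MAX_KEY_BYTES

if __name__ == "__main__":
    print(key_too_long("backups/" + "a" * 1100))   # True
    print(key_too_long("backups/räksmörgås.txt"))  # False
```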
  • Page 106: Hdfs Troubleshooting

  • The size of the file is greater than 5 TB – Objects in Amazon S3 must be 5 TB or less in size, so files that are larger than 5 TB can't be transferred to the Snowball. If you encounter this problem, separate the file into parts smaller than 5 TB, compress the file so that it's within the 5 TB limit, or...
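The splitting workaround can be sketched as follows, scaled down to tiny sizes so it runs anywhere. In practice the part limit would sit just under 5 TB, and you would likely use a tool such as split(1) rather than Python:

```python
# Split a stream into parts no larger than a limit, so each part fits
# under the Amazon S3 object-size cap. Sizes here are tiny for illustration.
import io

PART_LIMIT = 1000  # bytes in this sketch; just under 5 TB in real use

def split_stream(src, limit=PART_LIMIT):
    """Read src and return a list of parts, each at most `limit` bytes."""
    parts = []
    while True:
        chunk = src.read(limit)
        if not chunk:
            break
        parts.append(chunk)
    return parts

if __name__ == "__main__":
    parts = split_stream(io.BytesIO(b"x" * 2500))
    print([len(p) for p in parts])  # [1000, 1000, 500]
```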
  • Page 107: Troubleshooting Import Job Problems

    In some cases, the list is prohibitively large, or the files in the list are too large to transfer over the internet. In these cases, you should create a new Snowball import job, change the file and folder names to comply with Amazon S3 rules, and transfer the files again.
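One way to change file and folder names so they comply with Amazon S3 rules before re-running the import job is to substitute anything outside a safe character set. The set below (alphanumerics plus ! - _ . * ' ( ) and the / delimiter) follows S3's key naming guidelines, but treat it as an illustration rather than the authoritative rule list:

```python
# Replace characters outside an assumed S3-safe set with underscores
# so renamed files can be imported without key-name errors.
import re

SAFE_KEY = re.compile(r"[^A-Za-z0-9!\-_.*'()/]")

def to_safe_key(name: str) -> str:
    """Return name with every character outside the safe set replaced by _."""
    return SAFE_KEY.sub("_", name)

if __name__ == "__main__":
    print(to_safe_key("reports/Q1 2018 [final].csv"))  # reports/Q1_2018__final_.csv
```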
  • Page 108: Job Management Api

    US for that account, respectively. The job management API for Snowball is an RPC model, in which there is a fixed set of operations and the syntax for each operation is known to clients without any prior interaction. Following, you can find a description of each API operation using an abstract RPC notation, with an operation name that does not appear on the wire.
  • Page 109: Api Version

    AWS Snowball User Guide API Version For a list of AWS Regions that Snowball supports (where you can create and manage jobs), see Import/Export in the AWS General Reference. The region-specific API endpoint defines the scope of the Snowball resources that are accessible when you make an API call.
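A minimal helper for building the region-specific endpoint mentioned above. The snowball.&lt;region&gt;.amazonaws.com hostname pattern follows the general AWS endpoint convention and is assumed here rather than quoted from this guide:

```python
# Build the region-specific Snowball API endpoint; the hostname pattern
# is an assumption based on the common AWS endpoint convention.

def snowball_endpoint(region: str) -> str:
    """Return the assumed HTTPS endpoint for the given AWS Region."""
    return f"https://snowball.{region}.amazonaws.com"

if __name__ == "__main__":
    print(snowball_endpoint("us-east-1"))  # https://snowball.us-east-1.amazonaws.com
```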
  • Page 110 Amazon S3 Bucket Policy Principal for Creating Jobs If the Amazon S3 buckets that you use with Snowball have bucket policies in place that require listing the role session name of the assumed role, then you'll need to specify a principal in those policies that identifies AWSImportExport-Validation.
  • Page 111: Related Topics

    "Statement": {
        "Sid": "Allow AWS Snowball To Create Jobs",
        "Effect": "Deny",
        "NotPrincipal": {
            "AWS": [
                "arn:aws:iam::111122223333:role/rolename",
                "arn:aws:sts::111122223333:assumed-role/rolename/AWSImportExport-Validation",
                "arn:aws:iam::111122223333:root"
            ]
        },
        "Action": "S3:*",
        "Resource": ["arn:aws:s3:::examplebucket/*"]
    }
    In this policy example, we deny access to all principals except the one named in the NotPrincipal element.
  • Page 112 Document History The following table describes the important changes to the documentation since the last release of AWS Snowball. • API version: latest • Latest document update: April 4, 2018 • Change: India carrier. Description: The carrier in India is now Blue Dart.
  • Page 113 AWS Snowball API now supported; details are in the AWS Snowball API Reference. For more information on using the Amazon S3 Adapter for Snowball to call Amazon S3 REST API actions to transfer data with a Snowball, see Transferring Data with the Amazon S3 Adapter for Snowball (p.
  • Page 114 A unique identifier that's associated with a secret access key (p. 151); the access key ID and secret access key are used together to sign programmatic AWS requests cryptographically. access key rotation A method to increase security by changing the AWS access key ID. This method...
  • Page 115 (p. 144) is evaluated. When a user makes a request to AWS, AWS evaluates the request based on all permissions that apply to the user and then returns either allow or deny. Amazon API Gateway A fully managed service that makes it easy for developers to create, publish,...
  • Page 116 See Also https://aws.amazon.com/cloudfront. Amazon CloudSearch A fully managed service in the AWS cloud that makes it easy to set up, manage, and scale a search solution for your website or application. Amazon CloudWatch A web service that enables you to monitor and manage various metrics, and configure alarm actions based on data from those metrics.
  • Page 117 See Also https://aws.amazon.com/elasticache/. Amazon Elasticsearch Service An AWS managed service for deploying, operating, and scaling Elasticsearch, an (Amazon ES) open-source search and analytics engine, in the AWS Cloud. Amazon Elasticsearch Service (Amazon ES) also offers security options, high availability, data durability, and direct access to the Elasticsearch APIs.
  • Page 118 See Also https://aws.amazon.com/lightsail/. Amazon Lumberyard A cross-platform, 3D game engine for creating high-quality games. You can connect games to the compute and storage of the AWS cloud and engage fans on Twitch. See Also https://aws.amazon.com/lumberyard/. Amazon Machine Image (AMI)
  • Page 119 A next-generation web browser available only on Fire OS tablets and phones. Built on a split architecture that divides processing between the client and the AWS cloud, Amazon Silk is designed to create a faster, more responsive mobile browsing experience.
  • Page 120 A set of tools for creating and running high-quality 3D, augmented reality (AR), and virtual reality (VR) applications on the web. See Also https://aws.amazon.com/sumerian/. Amazon Virtual Private Cloud A web service for provisioning a logically isolated section of the AWS cloud where (Amazon VPC) you can launch AWS resource (p.
  • Page 121 scripts—along with an application specification file (p. 115). Revisions are stored in Amazon S3 (p. 113) bucket (p. 122)s or GitHub (p. 133) repositories. For Amazon S3, a revision is uniquely identified by its Amazon S3 object key and its ETag, version, or both.
  • Page 122 See Also https://aws.amazon.com/cloudhsm/. AWS CloudTrail A web service that records AWS API calls for your account and delivers log files to you. The recorded information includes the identity of the API caller, the time of the API call, the source IP address of the API caller, the request parameters, and the response elements returned by the AWS service.
  • Page 123 See Also https://aws.amazon.com/dms. AWS Data Pipeline A web service for processing and moving data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals. See Also https://aws.amazon.com/datapipeline. AWS Device Farm An app testing service that allows developers to test Android, iOS, and Fire OS devices on real, physical phones and tablets that are hosted by AWS.
  • Page 124 (ETL) (p. 132) service that you can use to catalog data and load it for analytics. With AWS Glue, you can discover your data, develop scripts to transform sources into targets, and schedule and run ETL jobs in a serverless environment.
  • Page 125 See Also https://aws.amazon.com/ec2/vcenter-portal/. AWS Marketplace A web portal where qualified partners market and sell their software to AWS customers. AWS Marketplace is an online software store that helps customers find, buy, and immediately start using the software and services that run on AWS.
  • Page 126 AWS Service Catalog A web service that helps organizations create and manage catalogs of IT services that are approved for use on AWS. These IT services can include everything from virtual machine images, servers, software, and databases to complete multitier application architectures.
  • Page 127 A web application firewall service that controls access to content by allowing or blocking web requests based on criteria that you specify, such as header values or the IP addresses that the requests originate from. AWS WAF helps protect web applications from common web exploits that could affect application availability, compromise security, or consume excessive resources.
  • Page 128 AWS CodeDeploy: A deployment method in which the instances in a deployment group (the original environment) are replaced by a different set of instances (the replacement environment). bootstrap action A user-specified default or custom action that runs a script or an application on...
  • Page 129 X.509 certificate (p. 159). The certificate is paired with a private key. chargeable resources Features or services whose use incurs fees. Although some AWS products are free, others include charges. For example, in an AWS CloudFormation (p. 116) stack (p.
  • Page 130 Amazon CloudFront (p. 110) distributions. Conditions can include values such as the IP addresses that web requests originate from or values in request headers. Based on the specified conditions, you can configure AWS WAF to allow or block web requests to AWS resources.
  • Page 131 You can see a combined view of AWS costs that are incurred by all accounts in your organization, and you can get detailed cost reports for individual accounts.
  • Page 132 In AWS CodeCommit (p. 116) and AWS CodeDeploy (CodeDeploy) (p. 116), you can configure cross-account access so that a user in AWS account A can access an AWS CodeCommit repository created by account B. Or a pipeline in AWS CodePipeline (p. 117) created by account A can use AWS CodeDeploy resources created by account B.
  • Page 133 takes time for the data to propagate to all storage locations. To support varied application requirements, Amazon DynamoDB (p. 110) supports both eventually consistent and strongly consistent reads. See Also eventual consistency, eventually consistent read, strongly consistent read.
  • Page 134 (p. 148)s in your AWS account. Between two AWS accounts: Setting up a trust between the account that owns the resource (the trusting account), and the account that contains the users that need to access the resource (the trusted account).
  • Page 135 Amazon Elastic Block Store (Amazon EBS). Amazon Elastic Compute Cloud (Amazon EC2). EC2 compute unit An AWS standard for compute CPU and memory. You can use this measure to evaluate the CPU capacity of different EC2 instance (p. 129) types. EC2 instance A compute instance (p.
  • Page 136 Logstash, Kibana, and Beats—that are designed to take data from any source and search, analyze, and visualize it in real time. Amazon Elasticsearch Service (Amazon ES) is an AWS managed service for deploying, operating, and scaling Elasticsearch in the AWS Cloud.
  • Page 137 The method through which AWS products achieve high availability, which involves replicating data across multiple servers in Amazon's data centers. When data is written or updated and Success is returned, all copies of the data are updated.
  • Page 138 isn't frequently requested, CloudFront might evict the object (remove the object before its expiration date) to make room for objects that are more popular. exbibyte A contraction of exa binary byte, an exbibyte is 2^60 or 1,152,921,504,606,846,976 bytes.
  • Page 139 federated identity management (FIM) Allows individuals to sign in to different networks or services, using the same group or personal credentials to access data across all networks. With identity federation in AWS, external identities (federated users) are granted secure access to resource (p. 148)s in an AWS account (p.
  • Page 140 A construction for calculating a message authentication code (MAC) involving a cryptographic hash function in combination with a secret key. You can use it to verify both the data integrity and the authenticity of a message at the same time. AWS calculates the HMAC using a standard, cryptographic hash algorithm, such as SHA-256.
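The HMAC construction described above is available directly in most standard libraries; for example, in Python:

```python
# HMAC with SHA-256: a shared secret key plus a hash function yields a tag
# that verifies both the integrity and the authenticity of a message.
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    """Hex-encoded HMAC-SHA256 tag for message under key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

if __name__ == "__main__":
    tag = sign(b"secret-key", b"GET /jobs HTTP/1.1")
    # Verification recomputes the tag and compares in constant time.
    print(hmac.compare_digest(tag, sign(b"secret-key", b"GET /jobs HTTP/1.1")))  # True
```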
  • Page 141 AWS CodeDeploy: A deployment method in which the application on each instance in the deployment group is stopped, the latest application revision is installed, and the new version of the application is started and validated. You can choose to use a load balancer so each instance is deregistered during its deployment and then restored to service after the deployment is complete.
  • Page 142 121): An attribute that specifies the IP addresses or IP address ranges that web requests originate from. Based on the specified IP addresses, you can configure AWS WAF to allow or block web requests to AWS resource (p. 148)s such as Amazon CloudFront (p.
  • Page 143 A five-character, alphanumeric string that uniquely identifies an AWS Import/ Export (p. 118) storage device in your shipment. AWS issues the job ID in response to a CREATE JOB email command. job prefix An optional string that you can add to the beginning of an AWS Import/ Export (p.
  • Page 144 For example, you might have an EC2 instance (p. 129) with the tag key of Owner and the tag value of Jan. You can tag an AWS resource (p. 148) with up to 10 key–value pairs. Not all AWS resources can be tagged.
  • Page 145 When sending a create job request for an import or export operation, you describe your job in a text file called a manifest. The manifest file is a YAML-formatted file that specifies how to transfer data between your storage device and the AWS cloud.
  • Page 146 marker See pagination token. master node A process running on an Amazon Machine Image (AMI) (p. 112) that keeps track of the work its core and task nodes complete. maximum price The maximum price you will pay to launch one or more Spot Instance (p.
  • Page 147 multi-factor authentication (MFA) Once you enable AWS MFA, you must provide a six-digit, single-use code in addition to your sign-in credentials whenever you access secure AWS webpages or the AWS Management Console (p. 119). You get this single-use code from an authentication device that you keep in your physical possession.
  • Page 148 110), optimistic locking support is provided by the AWS SDKs. organization AWS Organizations (p. 119): An entity that you create to consolidate and manage your AWS accounts. An organization has one master account along with zero or more member accounts.
  • Page 149 CloudFront (p. 110). original environment The instances in a deployment group at the start of an AWS CodeDeploy blue/ green deployment. OSB transformation Orthogonal sparse bigram transformation. In machine learning, a transformation that aids in text string analysis and that is an alternative to the n-gram transformation.
  • Page 150 IAM (p. 118): A document defining permissions that apply to a user, group, or role; the permissions in turn determine what users can do in AWS. A policy typically allow (p. 109)s access to specific actions, and can optionally grant that the actions are allowed for specific...
  • Page 151 AWS cloud-based applications. Amazon stores public data sets at no charge to the community and, like all AWS services, users pay only for the compute and storage they use for their own applications. These data sets currently include data from the Human Genome Project, the U.S.
  • Page 152 • The number of cache clusters for each AWS account (p. 109) • The number of cache nodes per cache cluster • The total number of cache nodes per AWS account across all cache clusters created by that AWS account Numbers and Symbols (p. 108) A (p.
  • Page 153 When training data is overfitted, the ML model performs well on the training data but does not perform well on the evaluation data or on new data. replacement environment The instances in a deployment group after the AWS CodeDeploy blue/green deployment. replica shard See shard.
  • Page 154 153). requester The person (or application) that sends a request to AWS to perform a specific action. When AWS receives a request, it first evaluates the requester's permissions to determine whether the requester is allowed to perform the request action (if applicable, for the requested resource (p.
  • Page 155 AWS CloudFormation (p. 116) stack (p. 153). All resource (p. 148)s associated with the failure are deleted during the rollback. For AWS CloudFormation, you can override this behavior using the --disable-rollback option on the command line. root AWS Organizations (p.
  • Page 156 sandbox A testing location where you can test the functionality of your application without affecting production, incurring charges, or purchasing products. Amazon SES (p. 113): An environment that is designed for developers to test and evaluate the service. In the sandbox, you have full access to the Amazon SES API, but you can only send messages to verified email addresses and the mailbox...
  • Page 157 AWS service so it can access AWS resource (p. 148)s. The policies that you attach to the service role determine which AWS resources the service can access and what it can do with those resources. Amazon Simple Email Service (Amazon SES).
  • Page 158 Secure Hash Algorithm. SHA1 is an earlier version of the algorithm, which AWS has deprecated in favor of SHA256. shard Amazon Elasticsearch Service (Amazon ES) (p. 111): A partition of data in an index. You can split an index into multiple shards, which can include primary shards (original shards) and replica shards (copies of the primary shards).
  • Page 159 121): An attribute that specifies the part of web requests, such as a header or a query string, that AWS WAF inspects for malicious SQL code. Based on the specified conditions, you can configure AWS WAF to allow or block web requests to AWS resource (p.
  • Page 160 121): An attribute that specifies the strings that AWS WAF searches for in a web request, such as a value in a header or a query string. Based on the specified strings, you can configure AWS WAF to allow or block web requests to AWS resource (p.
  • Page 161 Z (p. 159) table A collection of data. Similar to other database systems, DynamoDB stores data in tables. tag Metadata that you can define and assign to AWS resource (p. 148)s, such as an EC2 instance (p. 129). Not all AWS resources can be tagged.
  • Page 162 The version of an AWS CloudFormation (p. 116) template design that determines the available features. If you omit the AWSTemplateFormatVersion section from your template, AWS CloudFormation assumes the most recent format version. template validation The process of confirming the use of JSON (p.
  • Page 163 (p. 109) that needs to make API calls to AWS products. Each user has a unique name within the AWS account, and a set of security credentials not shared with other users. These credentials are separate from the AWS account's security credentials. Each user is associated with one and...
  • Page 164 (key). For example, you might have an EC2 instance (p. 129) with the tag key of Owner and the tag value of Jan. You can tag an AWS resource (p. 148) with up to 10 key–value pairs. Not all AWS resources can be tagged.
  • Page 165 Z (p. 159) Amazon WorkSpaces Application Manager (Amazon WAM). web access control list AWS WAF (p. 121): A set of rules that defines the conditions that AWS WAF searches for in web requests to AWS resource (p. 148)s such as Amazon CloudFront (p.
  • Page 166 use the Amazon Elasticsearch Service Configuration API to replicate your data for your Elasticsearch cluster.

This manual is also suitable for:

Snowball edge

Table of Contents