Cloud Blog: Cloud Storage bucket relocation: An industry first for non-disruptive bucket migrations

Source URL: https://cloud.google.com/blog/products/storage-data-transfer/introducing-cloud-storage-bucket-relocation/
Source: Cloud Blog
Title: Cloud Storage bucket relocation: An industry first for non-disruptive bucket migrations

Feedly Summary: As your operational needs change, sometimes you need to move data residing within Google’s Cloud Storage to a new location, to improve resilience, optimize performance, meet compliance needs, or simply to reorganize your infrastructure. Yet moving buckets can be a daunting, complex, risky endeavor that involves manual scripting, painstaking coordination, and the risk of data loss, or worse yet, extended downtime. This can discourage organizations from making the changes they need to their storage environments.
We recently introduced Cloud Storage bucket relocation, a unique feature among leading hyperscalers that makes it easy to change your bucket’s location. Bucket relocation eliminates the need for complex manual planning and helps prevent extended downtime, delivering a smooth transition with minimal application disruption and strong data integrity. Your bucket’s name, and all the object metadata within it, remain identical throughout the relocation, so there are no path changes, and your applications experience minimal downtime while the underlying storage is moved. Furthermore, your objects retain their original storage class (e.g., Standard, Nearline, Coldline, Archive) and time-in-class in the new location. This is key for many cost-efficiency strategies, helping ensure capabilities such as Autoclass continue to operate intelligently to optimize your storage costs post-migration.
Bucket relocation is a key capability within the Storage Intelligence suite, alongside tools like Storage Insights, which provides deep visibility into your storage landscape and identifies optimization opportunities. Bucket relocation then lets you act on these insights and move your data between diverse Cloud Storage locations — regional locations for low latency, dual-regions for high availability and disaster recovery, or multi-regions for global accessibility — to meet your business, performance, and compliance objectives.


Bucket relocation under the hood
Bucket relocation relies on two critical techniques:

Asynchronous data copy: Bucket relocation leverages a unique and optimized asynchronous data transfer mechanism that copies data in the background to minimize impact to ongoing operations. Existing operations like writing, reading, and updating objects continue while the entire dataset is being copied.
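For example, routine object operations like the following continue to succeed against a bucket mid-relocation (the bucket and object names here are hypothetical):

# Write, read, and update objects while the relocation copy runs in the background.
gcloud storage cp report.csv gs://my-bucket/reports/report.csv
gcloud storage cat gs://my-bucket/reports/report.csv
gcloud storage objects update gs://my-bucket/reports/report.csv --custom-metadata=reviewed=true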

Metadata preservation: Historically, Google Cloud customers moved data with the Storage Transfer Service, which copied the objects to a new bucket and deleted existing ones. Bucket relocation, on the other hand, automatically and meticulously moves all your bucket’s and objects’ associated metadata, thereby preserving state. This includes information like:

Storage class: Your objects retain their original storage class (e.g., Standard, Nearline, Coldline, Archive) in the new location.

Bucket and object names: The naming structure of your buckets and objects remains identical.

Creation and update timestamps: These markers are preserved, so that features like object lifecycle management (OLM) rules continue to operate.

Access Control Lists (ACLs) and IAM policies: Bucket- and object-level permissions are transferred to help maintain your security posture.

Custom metadata: Any user-defined metadata associated with your objects is also migrated.

By handling the complexities of asynchronous data transfer and automatic metadata migration, bucket relocation minimizes the risks and overhead associated with a manual bucket migration. Crucially, because the bucket name is preserved throughout the relocation process, applications accessing the bucket don’t need to be modified.
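One way to verify this preservation is to snapshot an object’s metadata before the move and compare it afterwards. A minimal sketch, with hypothetical bucket and object names:

# Record an object's metadata before starting the relocation.
gcloud storage objects describe gs://my-bucket/reports/2024.csv > before.yaml
# After the relocation finalizes, capture it again and compare.
gcloud storage objects describe gs://my-bucket/reports/2024.csv > after.yaml
diff before.yaml after.yaml

Fields such as the storage class, creation and update timestamps, and custom metadata should match across the two snapshots.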
Relocate your bucket in a few simple steps
With bucket relocation, you can move your Cloud Storage buckets in three simple steps. Here’s a breakdown:
1. Initiate a dry run:

Before starting the actual relocation, it’s highly recommended to perform a dry run. This simulates the process without moving any data, allowing you to identify potential issues early on, such as incompatible configurations.
The dry run checks for incompatibilities like customer-managed encryption keys (CMEK), locked retention policies, objects with temporary holds, and bucket tags, without you having to manually validate each of them.
Make sure to add the --dry-run flag!

gcloud storage buckets relocate gs://BUCKET_NAME --location=LOCATION --dry-run

Replace BUCKET_NAME with the name of your bucket and LOCATION with the desired destination.
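For example, a dry run for a hypothetical bucket being moved to the us-central1 region would look like this:

gcloud storage buckets relocate gs://my-analytics-bucket --location=us-central1 --dry-run

Review any incompatibilities the dry run reports and resolve them before starting the actual relocation.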
2. Start the relocation process:

This step initiates the actual data transfer from the source bucket to the destination bucket. During this phase, you can still read, modify, and delete objects in the bucket. However, the bucket metadata (i.e., bucket-level parameters and configurations) is write-locked to prevent changes that could affect the relocation.

Note: Removing the --dry-run flag from the dry-run command initiates the relocation.

gcloud storage buckets relocate gs://BUCKET_NAME --location=LOCATION
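The relocate command returns a long-running operation. While the background copy runs, you can monitor its status; a minimal sketch, assuming the gcloud storage operations commands available in recent gcloud releases:

# List relocation operations on the bucket to find the operation ID.
gcloud storage operations list gs://BUCKET_NAME
# Inspect a specific operation, including its progress.
gcloud storage operations describe projects/_/buckets/BUCKET_NAME/operations/OPERATION_ID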

3. Finalize the relocation process:

Once the incremental data copy is complete, you’re ready to trigger the final synchronization step (except when moving between multi-region and configurable dual-region). This involves a brief period where writes to the bucket are disabled to help ensure data integrity; any last-second changes made to objects in the bucket while the incremental copy was in progress are copied to the destination. After the data’s integrity is verified, the bucket’s location is updated and all requests are automatically redirected to the new location. During the final synchronization step, attempts to update objects in the bucket result in an HTTP 412 error.
Do not initiate the final synchronization step until the relocation’s progress reaches ~99%. This minimizes downtime, because most of the data has already been synchronized in the background.

Note: If you’re moving between multi-regions and configurable dual-regions within the same multi-region code, you’re all set — bucket relocation handles the transition in the background, no finalization or downtime required!

gcloud storage buckets relocate --finalize --operation=projects/_/buckets/BUCKET_NAME/operations/OPERATION_ID

The OPERATION_ID is provided in the output of Step 2, listed under the name key. For instance:
name: projects/_/buckets/my-bucket/operations/AbCJYd8jKT1n-Ciw1LCNXIcubwvij_TdqO-ZFjuF2YntK0r74
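If you want to automate the hand-off, a small script can poll the operation and finalize once the copy is nearly complete. This is a sketch, not a definitive recipe: it assumes the gcloud storage operations describe command and a metadata.progressPercent field in its output, so verify the field name against the actual output in your environment.

# Hypothetical bucket name and operation ID; substitute your own.
BUCKET="my-bucket"
OPERATION="projects/_/buckets/${BUCKET}/operations/OPERATION_ID"

while true; do
  # Read the reported progress percentage (field name is an assumption).
  PROGRESS=$(gcloud storage operations describe "${OPERATION}" \
      --format="value(metadata.progressPercent)")
  PCT="${PROGRESS%.*}"   # strip any decimal part for integer comparison
  echo "Relocation progress: ${PROGRESS:-unknown}%"
  # Finalize only once ~99% of the data has been copied, per the guidance above.
  if [ "${PCT:-0}" -ge 99 ]; then
    gcloud storage buckets relocate --finalize --operation="${OPERATION}"
    break
  fi
  sleep 60
done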
And there you have it — in just three steps, you’ve moved your entire bucket, with its data and metadata, to its new location.
Early users of bucket relocation have had great success with the new feature. 
“With Storage Intelligence and bucket relocation, we effortlessly transitioned to dual-region buckets. The seamless process, powered by the bucket relocation, minimized downtime and ensured data integrity. We migrated the buckets with peace of mind and without the manual headaches.” – Adam Steele, Product Manager, Spotify
“We recently utilized the bucket relocation feature of Storage Intelligence to successfully complete a migration project of ~300 buckets and PBs of data from multi-region to regional storage, to optimize network data transfer costs. Without bucket relocation, this process would have required extensive automation and scripting, resulting in increased downtime and effort.” – Deepak Mahato, Data Platform Infrastructure Manager, GroupOn
Experience the ease and efficiency of managing your Cloud Storage buckets with bucket relocation in Storage Intelligence. To learn more, visit the bucket relocation documentation and the Storage Intelligence overview.

AI Summary and Description: Yes

**Summary:** The text discusses a new feature called Cloud Storage bucket relocation by Google Cloud, aimed at simplifying the process of moving data across various cloud storage locations. This feature addresses challenges such as minimizing downtime, ensuring data integrity, and preserving metadata, which are crucial for compliance and operational efficiency.

**Detailed Description:**

The text highlights the new Cloud Storage bucket relocation feature from Google Cloud, focusing on its implications for organizations moving data within cloud environments. Here are the major points discussed:

– **Purpose of Bucket Relocation:**
– Designed to enhance resilience, optimize performance, and meet compliance requirements by simplifying the process of moving data without significant downtime or complexity.
– Critical for organizations aiming to reorganize their storage infrastructure.

– **Key Benefits of Bucket Relocation:**
– **Simplified Process:** Eliminates the need for complex manual planning and reduces the risks associated with data loss.
– **Minimal Application Downtime:** The bucket’s name and object metadata remain unchanged, allowing applications to keep operating with minimal interruption during the transition.
– **Cost Efficiency Strategies:** The feature helps retain the storage class and time-in-class, ensuring that automatic cost optimization strategies continue to operate correctly post-migration.

– **Technical Mechanisms:**
– **Asynchronous Data Copy:** Utilizes a background data transfer mechanism so that ongoing operations (like reading and updating objects) continue largely unaffected during the data movement.
– **Metadata Preservation:** Maintains important metadata (like access controls and custom metadata) throughout the relocation process, which is crucial for security and operational continuity.

– **Bucket Relocation Process:**
1. **Initiate a Dry Run:** Users should begin with a dry run to identify potential incompatibilities without affecting any existing data.
2. **Start the Relocation Process:** This involves initiating the actual transfer while locking metadata to prevent changes that could disrupt the process.
3. **Finalize the Relocation Process:** A short synchronization step ensures data integrity before the new location is fully operational.

– **User Testimonials:** Early adopters have found success in using the bucket relocation feature, noting the reduction in downtime and efficiency in handling large-scale data migrations.

– Adam Steele from Spotify emphasizes the ease of transitioning to dual-region buckets with minimal headaches.
– Deepak Mahato from GroupOn describes utilizing the feature to handle a significant migration while avoiding extensive automation efforts.

This innovation within the realm of Cloud Computing Security is particularly relevant for professionals involved in data management and infrastructure security as it enhances operational efficiency while ensuring compliance and data integrity.