Cloud Blog: New Cassandra to Spanner adapter simplifies Yahoo’s migration journey

Source URL: https://cloud.google.com/blog/products/databases/new-proxy-adapter-eases-cassandra-to-spanner-migration/
Source: Cloud Blog
Title: New Cassandra to Spanner adapter simplifies Yahoo’s migration journey

Feedly Summary: Cassandra, a key-value NoSQL database, is prized for its speed and scalability, and used broadly for applications that require rapid data retrieval and storage such as caching, session management, and real-time analytics. Its simple key-value pair structure helps ensure high performance and easy management, especially for large datasets.
But this simplicity also leads to limitations like poor support for complex queries, potential data redundancy, and difficulty in modeling intricate relationships. Spanner, Google Cloud’s always-on, globally consistent, and virtually unlimited-scale database, combines the scalability and availability of NoSQL with the strong consistency and relational model of traditional databases, positioning it for traditional Cassandra workloads. And today, it’s easier than ever to switch from Cassandra to Spanner, with the introduction of the Cassandra to Spanner Proxy Adapter, an open-source tool for plug-and-play migrations of Cassandra workloads to Spanner, without any changes to the application logic.


Spanner for NoSQL workloads
Spanner provides strong consistency, high availability, virtually unlimited scalability, and a familiar relational data model with support for SQL and ACID transactions for data integrity. As a fully managed service, it helps simplify operations, allowing teams to focus on application development rather than database administration. Furthermore, Spanner’s high availability, even at a massive global scale, supports business continuity by minimizing database downtime.

We’re constantly evolving Spanner to meet the needs of modern businesses. Some of the latest Spanner capabilities include enhanced multi-model capabilities such as graph, full-text search, vector search, improved performance for analytical queries with Spanner Data Boost, and unique enterprise features such as geo-partitioning and dual-region configurations. For Cassandra users, these powerful features, along with Spanner’s compelling price-performance, unlock a world of new, exciting possibilities.
The Cassandra to Spanner adapter — battle-tested by Yahoo!
If you’re wondering, “Spanner sounds like a leap forward from Cassandra. How do I get started?” the proxy adapter provides a plug-and-play way to forward your client applications’ Cassandra Query Language (CQL) traffic to Spanner. Under the hood, the adapter functions as a Cassandra client for the application but operates internally by interacting with Spanner for all data manipulation tasks. With the Cassandra to Spanner proxy adapter, you don’t need to migrate your application code: it just works!
Yahoo successfully migrated from Cassandra to Spanner, reaping the benefits of improved performance, scalability, consistency, and operational efficiency. And the proxy adapter made it easy to migrate. 
“The Cassandra Adapter has provided a foundation for migrating the Yahoo Contacts workload from Cassandra to Spanner without changing any of our CQL queries. Our migration strategy has more flexibility, and we can focus on other engineering activities while utilizing the scale, redundancy, and support of Spanner without updating the codebase. Spanner is cost-effective for our specific needs, delivering the performance required for a business of our scale. This transition enables us to maintain operational continuity while optimizing cost and performance.” – Patrick JD Newnan, Principal Product Manager, Core Mail and Analytics, Yahoo 
Another Google Cloud customer that successfully migrated from Cassandra to Spanner recently is Reltio. Reltio benefited from an effortless migration process to minimize downtime and disruption to their services while reaping the benefits of a fully managed, globally distributed, and strongly consistent database.
These success stories demonstrate that migrating from Cassandra to Spanner can be a transformative step for businesses seeking to modernize their data infrastructure, unlock new capabilities, and accelerate innovation.
How does the new proxy adapter simplify your migration?
A typical database migration involves the following steps:

Some of these steps — migrate your application (step 4) and migrate the data (step 6) — are more complex than others. The proxy adapter vastly simplifies migrating a Cassandra-backed application to point to Spanner. Here’s a high-level overview of the steps involved when using the new proxy adapter:
1. Assessment: Evaluate your Cassandra schema, data model, and query patterns, and identify which ones you can simplify after moving to Spanner.
2. Schema design: Spanner’s table declaration syntax and data types are similar to Cassandra’s; the documentation covers these similarities and differences in depth. With Spanner, you can also take advantage of relational capabilities and features like interleaved tables for optimal performance.
3. Data migration: There are several steps to migrate your data:

Bulk load: Export data from Cassandra and import it into Spanner using tools like the Spanner Dataflow connector or BigQuery reverse ETL.
Replicate incoming data: Replicate incoming updates to your Cassandra cluster to Spanner in real-time using Cassandra’s Change Data Capture (CDC). Another possibility is to update your application logic to perform dual-writes to Cassandra and Spanner. We don’t recommend this approach if you’re trying to minimize changes to your application code.

4. Set up the proxy adapter and update your Cassandra configuration: Download and run the Cassandra to Spanner Proxy Adapter, which runs as a sidecar next to your application. By default, the proxy adapter runs on port 9042. If you use a different port, update your application to point to the proxy adapter on that port.
5. Testing: Thoroughly test your migrated application and data in a non-production environment to ensure everything works as expected.
6. Cutover: Once you’re confident in the migration, switch your application traffic to Spanner. Monitor closely for any issues and fine-tune performance as needed.
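
For the testing step, one simple check is to read the same rows from Cassandra and through the proxy adapter and compare them. The following is a minimal sketch, assuming each row has already been fetched as a dictionary and that a single column identifies it; the function name `diff_rows` and the sample data are hypothetical and are not part of the adapter.

```python
def diff_rows(cassandra_rows, spanner_rows, key):
    """Compare two result sets keyed by `key` and report the differences.

    Both arguments are lists of dicts, e.g. rows fetched from the original
    Cassandra cluster and the same rows fetched through the proxy adapter.
    """
    left = {row[key]: row for row in cassandra_rows}
    right = {row[key]: row for row in spanner_rows}
    shared = left.keys() & right.keys()
    return {
        "missing_in_spanner": sorted(left.keys() - right.keys()),
        "missing_in_cassandra": sorted(right.keys() - left.keys()),
        "mismatched": sorted(k for k in shared if left[k] != right[k]),
    }


# Hypothetical sample data for illustration only.
old = [{"id": 1, "name": "ada"}, {"id": 2, "name": "bob"}, {"id": 3, "name": "cy"}]
new = [{"id": 1, "name": "ada"}, {"id": 2, "name": "rob"}]
report = diff_rows(old, new, key="id")
print(report)
```

An empty report for a representative sample of keys is a reasonable signal before cutover, although it does not replace load and failure testing.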
What’s under the hood of the new proxy adapter?
The new proxy adapter presents itself as a Cassandra client to the application. From the application’s perspective, the only noticeable change is the IP address or hostname of the Cassandra endpoint, which now points to the proxy adapter. This streamlines the Spanner migration, without requiring extensive modifications to application code.
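
As an illustration of how small the application-side change can be, here is a minimal sketch that assumes the application uses the DataStax Python driver (`cassandra-driver`) and reads its endpoint from configuration. The setting names `CASSANDRA_HOST` and `CASSANDRA_PORT` are hypothetical; only the endpoint changes, and the CQL the application sends stays the same.

```python
from dataclasses import dataclass

DEFAULT_CQL_PORT = 9042  # the proxy adapter's default port, per the steps above


@dataclass(frozen=True)
class CassandraEndpoint:
    host: str
    port: int = DEFAULT_CQL_PORT


def proxy_endpoint(settings):
    """Build the endpoint from a dict of settings (hypothetical key names)."""
    host = settings.get("CASSANDRA_HOST", "127.0.0.1")  # sidecar on the same host
    port = int(settings.get("CASSANDRA_PORT", DEFAULT_CQL_PORT))
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    return CassandraEndpoint(host, port)


def connect(endpoint):
    """Open a session against the endpoint, exactly as for a Cassandra node."""
    # Imported lazily so the helpers above work without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster(contact_points=[endpoint.host], port=endpoint.port)
    return cluster.connect()


endpoint = proxy_endpoint({})  # defaults to 127.0.0.1:9042
print(endpoint)
```

Calling `connect(endpoint)` requires a running proxy adapter (or Cassandra node) at that address.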

We designed the proxy adapter to establish a one-to-one mapping between each Cassandra cluster and a corresponding Spanner database. The proxy instance employs a multi-listener architecture, with each listener bound to a distinct port. This facilitates concurrent handling of multiple client connections, where each listener manages a distinct connection with the specified Spanner database. 
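
The listener layout described above can be pictured with a small sketch. This is not the adapter's configuration format (see its documentation for that); the `Listener` class and its field names are hypothetical and only illustrate the rule that each listener has its own port and its own Spanner database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listener:
    cluster_name: str      # label for the Cassandra cluster being replaced
    port: int              # port the listener binds to
    spanner_database: str  # projects/<project>/instances/<instance>/databases/<db>


def build_port_map(listeners):
    """Return {port: spanner_database}, rejecting shared ports or databases."""
    port_map = {}
    seen_databases = set()
    for listener in listeners:
        if listener.port in port_map:
            raise ValueError(f"port {listener.port} is bound by two listeners")
        if listener.spanner_database in seen_databases:
            raise ValueError(f"{listener.spanner_database} is mapped twice")
        port_map[listener.port] = listener.spanner_database
        seen_databases.add(listener.spanner_database)
    return port_map


# Hypothetical project, instance, and database names.
listeners = [
    Listener("contacts", 9042, "projects/p/instances/i/databases/contacts"),
    Listener("sessions", 9043, "projects/p/instances/i/databases/sessions"),
]
port_map = build_port_map(listeners)
print(port_map)
```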
The proxy’s translation layer handles the intricacies of the Cassandra protocol. This layer performs message decoding and encoding, manages buffers and caches, and crucially, parses incoming CQL queries and translates them into Spanner-compatible equivalents.
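
To give a feel for what such a translation involves, the sketch below rewrites one narrow shape of CQL statement into Spanner GoogleSQL. It is a toy under stated assumptions, not the adapter's parser: it handles only `INSERT` statements with positional `?` placeholders, and it maps them to `INSERT OR UPDATE` because a CQL `INSERT` overwrites an existing row with the same key. The function name and the choice of named parameters are illustrative, and the adapter's actual translation rules may differ.

```python
import re

# Matches: INSERT INTO <table> (<col>, ...) VALUES (?, ...)[;]
_INSERT_RE = re.compile(
    r"^\s*INSERT\s+INTO\s+(?P<table>\w+)\s*\((?P<cols>[^)]*)\)"
    r"\s*VALUES\s*\((?P<vals>[^)]*)\)\s*;?\s*$",
    re.IGNORECASE,
)


def translate_insert(cql):
    """Translate a simple CQL INSERT into a Spanner GoogleSQL upsert."""
    match = _INSERT_RE.match(cql)
    if match is None:
        raise ValueError("only simple INSERT statements are handled in this sketch")
    columns = [c.strip() for c in match.group("cols").split(",")]
    placeholders = [v.strip() for v in match.group("vals").split(",")]
    if len(columns) != len(placeholders) or any(p != "?" for p in placeholders):
        raise ValueError("expected one '?' placeholder per column")
    # Spanner uses named parameters (@name) instead of positional '?'.
    params = ", ".join(f"@{c}" for c in columns)
    table = match.group("table")
    return f"INSERT OR UPDATE INTO {table} ({', '.join(columns)}) VALUES ({params})"


sql = translate_insert("INSERT INTO users (id, name) VALUES (?, ?);")
print(sql)
```

A real translation layer also has to cover reads, deletes, TTLs, collections, and data type mapping, which is why the adapter's documentation lists its supported features and limitations.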
The proxy adapter supports OpenTelemetry to collect and export traces to Cloud Trace. 
For more details about different ways of setting up the adapter, limitations, mapping of CQL data types to Spanner, and more, refer to the proxy adapter documentation.
Addressing common concerns and challenges
Let’s address a few concerns you may have with your migrations:

Cost: Have a look at Accenture’s benchmark result that demonstrates that Spanner ensures not only consistent latency and throughput but also cost efficiency. Furthermore, Spanner now offers a new tiered pricing model (Spanner editions) that delivers better cost transparency and cost savings opportunities to help you take advantage of all of Spanner’s capabilities.

Latency increases: To minimize an increase in query latencies, we recommend running the proxy adapter on the same host as the client application (as a side-car proxy) or running on the same Docker network when running the proxy adapter in a Docker container. We also recommend keeping the CPU utilization of the proxy adapter host to under 80%.

Schema flexibility: While Cassandra offers schema flexibility, Spanner’s stricter relational schema provides advantages in terms of data integrity, query power, and consistency.

Learning curve: Spanner’s data types differ in some respects from Cassandra’s. The proxy adapter documentation covers these differences and can ease the transition.

Get started today 
The benefits of strong consistency, simplified operations, enhanced data integrity, and global scalability make Spanner a compelling option for businesses looking to leverage the cloud’s full potential for NoSQL workloads. With the new Cassandra to Spanner proxy adapter, we are making it easier to plan and execute on your migration strategy, so you can unlock a new era of data-driven innovation for your organization.
Download the new Cassandra to Spanner proxy adapter, and try it out on a Spanner Free Trial instance at no cost today.

AI Summary and Description: Yes

Summary: The text describes how to migrate from Cassandra, a NoSQL database, to Google’s Spanner database using the newly introduced Cassandra to Spanner Proxy Adapter. The tool lets existing Cassandra applications run against Spanner without changes to their core logic.

Detailed Description:

– **Cassandra Overview**:
    – A key-value NoSQL database known for speed and scalability.
    – Commonly used for applications needing rapid data retrieval and storage.
    – Simplifies management of large datasets but comes with limitations like poor support for complex queries and data redundancy.

– **Introduction to Google Cloud Spanner**:
    – Spanner combines NoSQL scalability with traditional database consistency.
    – Fully managed service that streamlines operations, allowing developers to focus on application development rather than database management.
    – Key features include high availability, strong consistency, relational data model, and SQL support.

– **Cassandra to Spanner Proxy Adapter**:
    – An open-source tool designed for easy migration of Cassandra workloads to Spanner.
    – Acts as a bridge by forwarding Cassandra Query Language (CQL) traffic to Spanner, eliminating the need for application code changes.
    – Example success story: Yahoo successfully transitioned to Spanner, benefiting from enhanced performance and operational efficiency.

– **Migration Process**:
    1. **Assessment**: Evaluate existing Cassandra schemas and query patterns.
    2. **Schema Design**: Align table declarations and data types, leveraging Spanner’s relational features.
    3. **Data Migration**: Methods include bulk loading and real-time updates, ensuring minimal downtime.
    4. **Proxy Adapter Setup**: Run the adapter alongside applications to facilitate smooth data operations.
    5. **Testing**: Conduct thorough tests in a non-production environment.
    6. **Cutover**: Switch application traffic to Spanner and monitor performance.

– **Under the Hood of the Proxy Adapter**:
    – Functions as a Cassandra client from the application’s perspective, so redirecting traffic requires only changing the endpoint’s IP address or hostname.
    – A multi-listener architecture allows concurrent connections, and a translation layer converts CQL to Spanner-compatible commands.

– **Addressing Common Concerns**:
    – **Cost Efficiency**: Spanner’s tiered pricing model can lead to cost savings.
    – **Latency**: Strategies for optimizing query latencies include appropriate architecture placement.
    – **Schema Flexibility**: Spanner’s stricter schema fosters data integrity and enhanced query capabilities.
    – **Learning Curve**: Documentation is available to assist in the transition between different data types.

– **Conclusion**:
    – Transitioning to Spanner offers significant business advantages, including strong consistency and global scalability. The Proxy Adapter simplifies the migration process, fostering a transformative opportunity for organizations seeking to modernize their data infrastructure.

This information is crucial for security and compliance professionals as it highlights the importance of understanding infrastructure changes and their implications for data security and integrity during the migration process.