Cloud Blog: African super app Yassir delivers on data with BigQuery migration

Source URL: https://cloud.google.com/blog/products/data-analytics/african-super-app-delivers-on-data-with-bigquery-migration/
Source: Cloud Blog
Title: African super app Yassir delivers on data with BigQuery migration

Feedly Summary: Yassir is a super app that supports the daily lives of users in more than 45 cities across Algeria, Morocco, Tunisia, South Africa, and Senegal, who rely on our ride-hailing, last-mile delivery, and financial services solutions. These users include both consumers and vendors, such as drivers, couriers, and restaurants, who use our platform to run their businesses.
At Yassir, we process a wide variety of datasets to ensure we provide the best and most reliable solutions for our users across all of our offerings, and we depend on that data to continually improve those services. However, our previous infrastructure made unifying data and AI difficult. 
Previously, we had two separate data systems: one using Databricks for deploying and training machine learning models and another through Google Cloud and BigQuery for storing and analyzing data. This setup led to several issues, such as formatting incompatibilities that we could not resolve. In addition, retrieving data from Databricks for processing within Google Cloud wasn’t possible, and this disconnect directly impacted our application performance.
These siloed environments meant our teams often had to duplicate work to develop and maintain data projects and pay to maintain separate environments. Despite all of this, they still could not get the information they needed at the desired pace.
To address these issues, we decided to consolidate our data infrastructure with Google Cloud to bring all of these functions into one place. This migration would allow us to provide better access to data and more scalability, and create new opportunities to analyze, review, and improve performance.

Creating a more flexible, unified data platform
Our existing relationship with the Google Cloud team provided a strong foundation to not only resolve our data connectivity roadblocks but also implement new data processing workflows using BigQuery and deploy new AI and machine learning models with Vertex AI. Consolidating with a single data provider also gave us a centralized place to review and control expenses as well as simple, centralized data governance controls. As a growing company, being able to scale our cloud usage up or down to optimize costs allows us to test and iterate without a hard commitment to every project, and that flexibility is invaluable. 
We worked closely with the Google Cloud team to design a solution that aligns with our growth goals. This meant participating in technical and strategic workshops to help train our team on the ins and outs of BigQuery — and its real-time, governance, and open-source capabilities — empowering our engineers with the tools and resources they need to experiment. This collaborative approach allows us to nurture the type of engineering culture we want to promote at Yassir; rather than simply using out-of-the-box solutions, we can tackle more complex problems by adapting flexible, existing solutions to our specific use cases.
After conducting our internal compatibility reviews, we migrated individual models from our previous solution into Vertex AI to test their consistency, and now they’re up and running nearly autonomously. By migrating from Databricks to BigQuery and combining our own models with the models provided by Google Cloud, we’ve improved the performance and efficiency of our machine learning processes and better positioned ourselves for ongoing growth. We may not be processing petabytes of data yet, but we know that we have the capability to do so when needed.

Evolving from data processing to data insights
Our previously disconnected data solutions made it difficult to provide secure access to specific data for specific teams. Since we stored our data in BigQuery but deployed models with Databricks, granting a user or a team access to information meant giving them the keys to everything. Now, we can implement role-based access controls and use Infrastructure as Code (IaC) Terraform scripts to automatically grant and revoke access to datasets for individuals or teams. Sharing data through Looker Studio Pro and directly providing access to BigQuery tables for our more technical users also means we can ensure the required data reaches the right users.
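To make the idea of automatically granting and revoking dataset access more concrete, here is a minimal sketch in Python of the declarative pattern that IaC tools follow: compare the access a dataset currently has with the access it should have, then derive what to grant and what to revoke. This is not Yassir's implementation and it does not call Terraform or any Google Cloud API. The `AccessEntry` type, the function names, and the example principals are all hypothetical; the role names are modeled on BigQuery's basic dataset roles.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class AccessEntry:
    """One role binding on a dataset (hypothetical type for illustration)."""
    role: str       # e.g. "READER", "WRITER", "OWNER"
    principal: str  # e.g. "group:marketing@example.com"


def _sorted(entries: Iterable[AccessEntry]) -> List[AccessEntry]:
    # Sort for deterministic, reviewable output.
    return sorted(entries, key=lambda e: (e.principal, e.role))


def plan_access_changes(
    current: Iterable[AccessEntry], desired: Iterable[AccessEntry]
) -> Tuple[List[AccessEntry], List[AccessEntry]]:
    """Return (to_grant, to_revoke) needed to move from current to desired."""
    current_set, desired_set = set(current), set(desired)
    to_grant = _sorted(desired_set - current_set)
    to_revoke = _sorted(current_set - desired_set)
    return to_grant, to_revoke


def apply_plan(
    current: Iterable[AccessEntry],
    to_grant: Iterable[AccessEntry],
    to_revoke: Iterable[AccessEntry],
) -> List[AccessEntry]:
    """Apply a plan to an in-memory access list and return the new list."""
    revoked = set(to_revoke)
    kept = [e for e in current if e not in revoked]
    added = [e for e in to_grant if e not in kept]
    return _sorted(kept + added)


if __name__ == "__main__":
    # Hypothetical current state: one stale individual grant.
    current = [
        AccessEntry("OWNER", "group:data-platform@example.com"),
        AccessEntry("READER", "user:former-analyst@example.com"),
    ]
    # Hypothetical desired state, as it might be declared in version control.
    desired = [
        AccessEntry("OWNER", "group:data-platform@example.com"),
        AccessEntry("READER", "group:marketing@example.com"),
    ]
    grant, revoke = plan_access_changes(current, desired)
    print("grant:", grant)
    print("revoke:", revoke)
```

The design choice worth noting is that access is described as a desired end state rather than as a sequence of manual grants, so removing an entry from the declaration is enough to have it revoked on the next run.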
With our data unified in BigQuery and connected to our machine learning models, we can better support everything from customer growth and retention to marketplace optimization by providing insights into product usage, customer data, and more. To ensure we’re hitting our internal and customer-related goals, we closely monitor and create dashboards for operational and analytical datasets. 
Our operational dashboards give our sales and marketing teams the insights they need to better target and reach merchants and consumers. They also include insights into our staffing processes, helping us to gradually reduce delivery times, complete more rides faster, and improve how we support specific markets. We also have product-level detection and monitoring that help us provide real-time dynamic pricing and identify fraudulent trips and orders. Each data point we collect gives us more opportunities to build a more personalized and consistent customer experience. 
Our leadership team relies on our rapidly available datasets to drive strategic decision-making, including regional investment decisions to grow the business, macro-level plans for growth trajectories and marketing budgets, and identification of the areas of the business that need the most support or attention. These roadmap decisions are core to our overall growth strategy, and they wouldn’t be possible without the flexibility and scalability we’ve been able to achieve with BigQuery.

AI Summary and Description: Yes

Summary: The text discusses Yassir’s transition to a unified data platform on Google Cloud, highlighting the challenges faced with previous siloed data systems. The migration improves data accessibility and scalability, enhances machine learning capabilities, enables role-based access controls, and supports strategic business insights. These outcomes are relevant to professionals involved in cloud computing security and data management.

Detailed Description: The narrative focuses on Yassir’s journey to consolidate its data infrastructure, primarily using Google Cloud technologies. The text emphasizes how this shift allows for better data processing, improved machine learning model deployment, and enhanced governance and cost management. Here are the key points:

– **Previous Infrastructure Challenges**:
  – Siloed systems: Databricks for ML and Google Cloud for data storage created connectivity issues.
  – Formatting incompatibilities hindered data processing and performance.
  – Duplication of efforts and increased maintenance costs due to separate environments.

– **Migration to Google Cloud**:
  – Consolidation into a unified platform facilitated improved data accessibility and scalability.
  – Implementation of BigQuery and Vertex AI, allowing better data processing workflows and efficient ML model deployment.

– **Data Governance and Security**:
  – Introduction of role-based access controls to manage data access securely.
  – Use of Infrastructure as Code (IaC) with Terraform scripts to automate access controls.

– **Enhanced Analytical and Operational Capabilities**:
  – Deployment of operational dashboards that provide insights for marketing, sales, and delivery processes, optimizing customer engagement and operational efficiency.
  – Use of real-time analytics for dynamic pricing and fraud detection, improving service reliability.

– **Business Impact**:
  – Strategic decision-making fueled by rapid data availability, impacting regional investments and growth strategies.
  – General capability to scale and adapt to increasing data demands, positioning Yassir for long-term growth.

This consolidated data platform enhances Yassir’s ability to use data for operational insights and supports a more personalized customer experience, reflecting essential advancements in cloud data management, governance, and machine learning operations.