Source URL: https://cloud.google.com/blog/products/data-analytics/new-objectref-data-type-brings-unstructured-data-into-bigquery/
Source: Cloud Blog
Title: Introducing BigQuery ObjectRef: Supercharge your multimodal data and AI processing
Feedly Summary: Traditional data warehouses simply can’t keep up with today’s analytics workloads, because most data generated today is unstructured and multimodal (documents, audio files, images, and videos). Given the complexity of cleaning and transforming unstructured data, organizations have historically had to maintain siloed data pipelines for unstructured and structured data, and for analytics and AI/ML use cases. Between these fragmented data platforms, data access restrictions, slow consumption, and outdated information, enterprises struggle to unlock the full potential of their data. The same issues hinder AI initiatives.
Today we’re introducing a new data type, ObjectRef, now in preview in BigQuery, that represents a reference to any object in Cloud Storage with a URI and additional metadata. ObjectRef complements Object Tables, read-only tables over unstructured data objects in Cloud Storage, to integrate unstructured data like images and audio into existing BigQuery tables. The ObjectRef data type removes fragmentation in data processing and access control, providing a unified, multimodal, and governed way to process all modalities of data. You can process unstructured data with large language models (LLMs), ML models, and open-source Python libraries using the same SQL or Python scripts that process tabular data. You can also store structured and unstructured data in the same row throughout different data engineering stages (extract, load, transform a.k.a. ELT), and govern it using a similar access control model.
For example, to answer the question “of the customers who complained about performance issues during interactions last month, show me the top 10 by revenue,” you would need to perform natural language processing (NLP) on audio calls, emails, and online chat transcripts to normalize the data, identify whether each interaction discussed “performance issues,” and detect whether the customer complained. For each of these steps, you would need to decide how to build a pipeline over data in Cloud Storage, run AI/ML models on the data, and host those models (e.g., on Compute Engine, Google Kubernetes Engine, or Vertex AI). The normalized and extracted data would then need to be saved in structured format (e.g., in a BigQuery table) and joined with each customer’s revenue data.
With the launch of ObjectRef, you can now answer this question with a simple SQL query. Suppose you’ve combined call center audio files and agent chat text into one BigQuery table, customer_interactions, with columns (1) audio_ref of type ObjectRef and (2) chat of type STRING. Filtering for customers who complained about performance issues is then as easy as adding one more condition to the WHERE clause.
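The post’s query itself isn’t reproduced in this extract, but here is a minimal sketch of what it could look like. The customer_revenue table, interaction_date column, and connection name are illustrative assumptions, and AI.GENERATE_BOOL (one of the scalar functions introduced below) is used as one plausible way to express the complaint filter:

-- Hypothetical sketch: top 10 complaining customers by revenue.
-- Only customer_interactions, audio_ref, and chat come from the text
-- above; customer_revenue, interaction_date, and the connection name
-- are assumptions.
SELECT i.customer_id, r.revenue
FROM customer_interactions AS i
JOIN customer_revenue AS r USING (customer_id)
WHERE i.interaction_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH)
  AND AI.GENERATE_BOOL(
        prompt => ("Did this customer complain about performance issues?",
                   OBJ.GET_ACCESS_URL(i.audio_ref, "r"), i.chat),
        connection_id => "analysis.US.gemini-connection",
        endpoint => "gemini-2.0-flash").result
ORDER BY r.revenue DESC
LIMIT 10;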
BigQuery with ObjectRef unlocks unique platform capabilities across data and AI:
Multimodality: Natively handle structured (tabular) data, unstructured data, and a combination of the two, in a single table via ObjectRef. Now, you can build multimodal ELT data pipelines to process both structured and unstructured data.
Full SQL and Python support: Use your favorite language without worrying about interoperability. If it works in SQL, it works in Python (via BigQuery DataFrames), and vice versa. Object transformations, saving transformed objects back to Cloud Storage, and any other aggregations or filtering, can all be done in one SQL or Python script.
Gen-AI-ready, serverless, and auto-scaled data processing: Spend more time building your data pipelines, not managing infrastructure. Process unstructured data with LLMs, or use serverless Python UDFs with your favorite open-source library. Create embeddings, generate summaries using a prompt, use a BigQuery table as an input to Vertex AI jobs, and much more.
Unified governance and access control: Use familiar BigQuery governance features such as fine-grained access control, data masking, and connection-delegated access on unstructured data. There is no need to manage siloed governance models for structured versus unstructured data.
ObjectRef in action
Let’s take a closer look at how to use the ObjectRef data type.
What is an ObjectRef?
First, it helps to understand ObjectRef under the hood. Simply put, ObjectRef is a STRUCT containing object storage and access control metadata. With this launch, when you create an Object Table, it is populated with a new ObjectRef column named ref.
struct {
  uri string,
  authorizer string,
  version string,
  details json {
    gcs_metadata json
  }
}
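For a concrete feel, you can also construct an ObjectRef by hand and inspect its fields. In this sketch, the Cloud Storage URI and the authorizer (a BigQuery connection) are placeholders:

-- Hypothetical: build an ObjectRef manually with OBJ.MAKE_REF and read
-- its fields. The URI and connection name are placeholders.
SELECT ref.uri, ref.authorizer
FROM (
  SELECT OBJ.MAKE_REF('gs://my-bucket/call-123.mp3',
                      'my-project.us.my-connection') AS ref
);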
Create a BigQuery table with ObjectRefs
Imagine a call center that stores structured information in a standard BigQuery table ingestion.sessions, and call audio in a Cloud Storage bucket, with a BigQuery Object Table ingestion.audios created on that bucket. While this example is based on audio, ObjectRefs can also represent images, documents, and videos.
With ObjectRef, you can join these two tables on the sessions.recording_id and audios.uri columns to create a single BigQuery table. The new table contains an audio column of type ObjectRef, populated from the ref column of the ingestion.audios table.
CREATE OR REPLACE TABLE analysis.sessions
AS
SELECT sessions.session_id, sessions.date, sessions.customer_id, object_table.ref AS audio
FROM ingestion.sessions INNER JOIN ingestion.audios object_table
ON object_table.uri = sessions.recording_id;
Capturing the object version allows BigQuery zero-copy snapshots and clones of analysis.sessions to be reproducible and consistent across structured and unstructured data. This allows reproducibility in downstream applications such as ML training and LLM fine-tuning.
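For instance, snapshotting the combined table uses standard BigQuery snapshot syntax (a minimal sketch; the snapshot name is a placeholder):

-- Snapshot the combined table. Because each ObjectRef captures the
-- object version, the snapshot remains consistent even if the
-- underlying Cloud Storage objects are later overwritten.
CREATE SNAPSHOT TABLE analysis.sessions_snapshot
CLONE analysis.sessions;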
Being a STRUCT, ObjectRef also supports nesting in ARRAY. The main audio file represented by Audio can be chunked (for example, into segments per agent ID), and the resulting objects represented in a new column Chunked of type ARRAY<ObjectRef>. This preserves the order of chunks, and stores them alongside the main audio file in the same row. This data transformation lets you report the number of agent handoffs per call and further analyze each call segment separately.
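As an illustration, once the Chunked column exists, reporting handoffs is a plain array operation (a sketch; ARRAY_LENGTH is standard BigQuery SQL and the column names follow the example above):

-- Hypothetical: number of audio segments (and hence agent handoffs)
-- per call, using the chunked ARRAY<ObjectRef> column described above.
SELECT session_id, ARRAY_LENGTH(chunked) AS num_segments
FROM analysis.sessions
WHERE chunked IS NOT NULL;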
Process using serverless Python
With Python UDF integration, you can bring your favorite open-source Python library to BigQuery as a user-defined function (UDF). Easily derive structured or unstructured data from a source ObjectRef and store the results in the same row.
The new function OBJ.GET_ACCESS_URL(ref ObjectRef, mode STRING) -> ObjectRefRuntime enables delegated access to the object in Cloud Storage. ObjectRefRuntime provides signed URLs to read and write data, letting you manage governance and access control entirely in BigQuery and removing the need to manage separate Cloud Storage access controls.
Serverless Python use case 1: Multimodal data to structured data
For example, imagine you want to get the duration of every audio file in the analysis.sessions table. Assume that a Python UDF analysis.GET_DURATION(object_ref_runtime_json STRING) -> INT has already been registered in BigQuery. GET_DURATION uses signed URLs from ObjectRefRuntime to read the Cloud Storage bytes.
-- Object is passed to Python UDF using read-only signed URLs
SELECT analysis.GET_DURATION(TO_JSON_STRING(OBJ.GET_ACCESS_URL(audio, "R"))) AS duration
FROM analysis.sessions
WHERE audio IS NOT NULL
import bigframes.pandas as bpd

df = bpd.read_gbq("analysis.sessions")
func = bpd.read_gbq_function("analysis.get_duration")
# Object is passed to Python UDF using read-only signed URLs
df["duration"] = df["audio"].blob.get_runtime_json_str(mode="R").apply(func).cache()  # cache to execute
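The post doesn’t show GET_DURATION’s implementation. Below is a minimal sketch of what such a UDF body could look like; the ObjectRefRuntime JSON field names (access_urls, read_url) are assumptions about its payload, and mutagen is just one example open-source library:

import io
import json
import urllib.request

import mutagen  # example open-source audio library

def get_duration(object_ref_runtime_json: str) -> int:
    # ObjectRefRuntime carries signed URLs; the field names used here
    # ("access_urls", "read_url") are assumptions about its JSON layout.
    runtime = json.loads(object_ref_runtime_json)
    signed_url = runtime["access_urls"]["read_url"]
    # Read the object bytes via the read-only signed URL.
    with urllib.request.urlopen(signed_url) as resp:
        audio = mutagen.File(io.BytesIO(resp.read()))
    return int(audio.info.length)  # duration in seconds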
Serverless Python use case 2: Multimodal data to processed multimodal data
As another example, here’s how to remove noise from every audio file in the analysis.sessions table, assuming that a Python UDF analysis.DENOISE(src_object_ref_runtime_json STRING, dst_object_ref_runtime_json STRING) -> object_ref_runtime_json STRING has already been registered in BigQuery. This function reads the source audio, writes the noise-removed audio to Cloud Storage, and returns a reference to the new audio file.
ObjectRefRuntime provides signed URLs for reading and writing object bytes.
SELECT analysis.DENOISE(
  -- Source is accessed using read-only signed URL
  TO_JSON_STRING(OBJ.GET_ACCESS_URL(audio, "R")),
  -- Destination is written using read-write signed URL with prefix "denoised-"
  TO_JSON_STRING(OBJ.GET_ACCESS_URL(
    OBJ.MAKE_REF(
      CONCAT("denoised-", audio.uri), audio.authorizer),
    "RW")))
FROM analysis.sessions
WHERE audio IS NOT NULL
import bigframes.pandas as bpd

df = bpd.read_gbq("analysis.sessions")

df["denoised"] = ("denoised-" + df["audio"].blob.uri()).str.to_blob()
func_df = df[["audio", "denoised"]]

func = bpd.read_gbq_function("analysis.denoise")
# Source is accessed using read-only signed URL
func_df["audio"] = func_df["audio"].blob.get_runtime_json_str("R")
# Destination is written using read-write signed URL with prefix "denoised-"
func_df["denoised"] = func_df["denoised"].blob.get_runtime_json_str("RW")
func_df.apply(func, axis=1).cache()  # cache to execute
Process using Gemini and BigQuery ML
All BigQuery ML generative AI functions, such as AI.GENERATE, ML.GENERATE_TEXT, and ML.GENERATE_EMBEDDING, now support ObjectRefs as first-class citizens. This enables a number of use cases.
BQML use case 1: Multimodal inference using Gemini
You can now pass multiple ObjectRefs in the same Gemini prompt for inference.
Here, you can use Gemini to evaluate noise-removal quality by comparing the original audio file with the noise-removed audio file. This script assumes the noise-removed audio file’s ObjectRef is already stored in the Denoised column.
SELECT AI.GENERATE(
  prompt => ("Compare original audio file to audio file with noise removed, and output quality of noise removal as either good or bad. Original audio is", OBJ.GET_ACCESS_URL(audio, "r"), "and noise removed audio is", OBJ.GET_ACCESS_URL(denoised, "r")),
  -- BQ connection with permission to call Gemini
  connection_id => "analysis.US.gemini-connection",
  endpoint => "gemini-2.0-flash"
).result
FROM analysis.sessions WHERE audio IS NOT NULL AND denoised IS NOT NULL;
import bigframes.pandas as bpd
from bigframes.ml import llm

gemini = llm.GeminiTextGenerator(model_name="gemini-2.0-flash", connection_name="analysis.US.gemini-connection")
df = bpd.read_gbq("analysis.sessions")
result = gemini.predict(df, prompt=["Compare original audio file to audio file with noise removed, and output quality of noise removal as either good or bad. Original audio is", df["audio"], "and denoised audio is", df["denoised"]])
result[["ml_generate_text_llm_result"]]
As another example, here’s how to transcribe the Audio file using Gemini.
SELECT AI.GENERATE(
  prompt => ("Transcribe this audio file", OBJ.GET_ACCESS_URL(audio, "r")),
  -- BQ connection with permission to call Gemini
  connection_id => "analysis.US.gemini-connection",
  endpoint => "gemini-2.0-flash").result AS transcript
FROM analysis.sessions
WHERE audio IS NOT NULL
import bigframes.pandas as bpd
from bigframes.ml import llm

gemini = llm.GeminiTextGenerator(model_name="gemini-2.0-flash", connection_name="analysis.US.gemini-connection")
df = bpd.read_gbq("analysis.sessions")
result = gemini.predict(df, prompt=["Transcribe this audio file", df["audio"]])
result[["ml_generate_text_llm_result"]]
With BQML + Gemini, you can also generate structured or semi-structured results from multimodal inference. For example, you can perform speaker diarization on the Audio file using Gemini to identify the operator vs. the customer.
SELECT AI.GENERATE(
  prompt => ("Generate audio diarization for this interview. Use JSON format for the output, with the following keys: speaker, transcription. If you can classify the speaker as customer vs operator, please do. If not, use speaker A, speaker B, etc.", OBJ.GET_ACCESS_URL(audio, "r")),
  -- BQ connection with permission to call Gemini
  connection_id => "analysis.US.gemini-connection",
  endpoint => "gemini-2.0-flash").result AS diarized_json
FROM analysis.sessions
WHERE audio IS NOT NULL;
import bigframes.pandas as bpd
from bigframes.ml import llm

gemini = llm.GeminiTextGenerator(model_name="gemini-2.0-flash", connection_name="analysis.US.gemini-connection")
df = bpd.read_gbq("analysis.sessions")
result = gemini.predict(df, prompt=["Generate audio diarization for this interview. Use JSON format for the output, with the following keys: speaker, transcription. If you can classify the speaker as customer vs operator, please do. If not, use speaker A, speaker B, etc.", df["audio"]])
result[["ml_generate_text_llm_result"]]
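To make those semi-structured results queryable, the JSON can be flattened with standard BigQuery JSON functions. A sketch, assuming the diarization output was materialized to a hypothetical table analysis.diarized_sessions and that the model returned a clean JSON array of {speaker, transcription} objects:

-- Hypothetical follow-up: flatten the diarization JSON into rows.
-- analysis.diarized_sessions is an assumed table holding the
-- diarized_json strings produced by the query above.
SELECT
  session_id,
  JSON_VALUE(turn, '$.speaker') AS speaker,
  JSON_VALUE(turn, '$.transcription') AS transcription
FROM analysis.diarized_sessions,
UNNEST(JSON_EXTRACT_ARRAY(diarized_json)) AS turn;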
BQML use case 2: Multimodal embeddings using Gemini
With ML.GENERATE_EMBEDDING support, you can use ObjectRefs with text embedding and multimodal embedding models to create vector indexes and power RAG workflows that ground LLMs.
Assume we have an Object Table ingestion.images with the ref column containing image ObjectRefs.
CREATE OR REPLACE MODEL `ingestion.multimodal_embedding_model`
REMOTE WITH CONNECTION `ingestion.US.gemini-connection`
OPTIONS (ENDPOINT = 'multimodalembedding@001');

SELECT ref, ml_generate_embedding_result AS embedding
FROM ML.GENERATE_EMBEDDING(
  MODEL `ingestion.multimodal_embedding_model`,
  (
    SELECT OBJ.GET_ACCESS_URL(ref, 'r') AS content, ref
    FROM ingestion.images
  ),
  STRUCT (256 AS output_dimensionality)
);
import bigframes.pandas as bpd
from bigframes.ml import llm

embed_model = llm.MultimodalEmbeddingGenerator(model_name="multimodalembedding@001", connection_name="ingestion.US.gemini-connection")
df = bpd.read_gbq("ingestion.images")
embeddings = embed_model.predict(df["ref"])
embeddings
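From here, the embeddings can feed a vector index and similarity search. A sketch using BigQuery’s vector search syntax; the ingestion.image_embeddings table is an assumed materialization of the embedding query above, with columns (ref, embedding):

-- Hypothetical: index the stored embeddings, then retrieve the images
-- most similar to a text query.
CREATE VECTOR INDEX image_embedding_idx
ON ingestion.image_embeddings(embedding)
OPTIONS (index_type = 'IVF', distance_type = 'COSINE');

SELECT base.ref, distance
FROM VECTOR_SEARCH(
  TABLE ingestion.image_embeddings, 'embedding',
  (
    SELECT ml_generate_embedding_result AS embedding
    FROM ML.GENERATE_EMBEDDING(
      MODEL `ingestion.multimodal_embedding_model`,
      (SELECT 'photos of damaged packaging' AS content),
      STRUCT (256 AS output_dimensionality))
  ),
  top_k => 5);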
What’s new
To summarize, here’s a list of all the new capabilities for performing analytics on unstructured and/or multimodal data using BigQuery:
New types and functions for handling multimodal data (documents, audio files, images, and videos):
ObjectRef and ObjectRefRuntime types along with new functions: OBJ.MAKE_REF, OBJ.GET_ACCESS_URL and OBJ.FETCH_METADATA
Object Table enhancements:
Scalability: Object Tables now support consistent views of Cloud Storage buckets, scaling 5x from 65M to 300M+ objects per table, and ingesting up to 1M object changes per hour per table
Interop with ObjectRef: New ref column provides pre-constructed ObjectRefs directly from Object Tables
BQML Gen-AI multimodal capabilities:
Support for multimodal inference in the TVFs ML.GENERATE_TEXT and AI.GENERATE_TABLE, and in scalar functions such as AI.GENERATE and AI.GENERATE_BOOL, by encapsulating multiple objects in the same prompt for Gemini using ObjectRef. Objects can be sourced from different columns and from complex types such as arrays.
Support for embedding ObjectRefs via the ML.GENERATE_EMBEDDING function
BigQuery DataFrames multimodal dataframe support:
An extension of the pandas-like dataframe that includes unstructured data (powered by ObjectRef) as just another column
Wrangle, process, and filter mixed-modality data with the familiarity of dataframe operations
Special transformers for unstructured data, such as chunking, image processing, and transcription, made available through server-side processing functions and BQML
Python UDF support:
Leverage the rich Python library ecosystem for advanced unstructured data manipulation in a fully managed, serverless experience with BigQuery governance
Get started today
ObjectRef is now in preview. Follow these simple steps to get started:
Watch to learn more: see live demos from Cloud Next on unifying unstructured and structured data, and on generating text with LLMs and performing vector search.
Learn by doing: try out ObjectRefs with the multimodal data tutorial, available in both SQL and Python versions.
Build your use case: locate the Cloud Storage bucket containing the unstructured data you want to analyze, then create an Object Table or set up automatic Cloud Storage discovery to pull this data into BigQuery. The Object Table will contain a column of ObjectRefs, and you’re ready to start transforming the data.
AI Summary and Description: Yes
Summary: The introduction of the new ObjectRef data type in BigQuery enables organizations to effectively unify and govern structured and unstructured data, addressing the challenges of siloed data pipelines. This innovation facilitates the integration of multimodal data types into analytics workflows, significantly enhancing the potential for AI and machine learning applications.
Detailed Description:
The text discusses a significant development in data management and analytics, particularly within Google Cloud’s BigQuery platform. The introduction of ObjectRef addresses longstanding issues companies face when dealing with both structured and unstructured data, making it crucial for professionals in AI, cloud computing, and data analytics. The ObjectRef data type is critical for advancing the capabilities of data processing and governance, particularly in an era where enterprises increasingly rely on diverse data types.
Key Insights and Major Points:
– **Unified Data Processing**:
– ObjectRef allows for a new way to refer to any object in Cloud Storage with rich metadata, enabling seamless integration of unstructured data (images, audio, video) with structured data.
– **Extensive Compatibility**:
– Tools and scripts designed to process structured data can now handle unstructured data without significant modifications, supporting both SQL and Python.
– **Enhanced Analytics Capability**:
– Users can perform complex data analysis using natural language processing (NLP) techniques over audio calls and chat transcripts directly in BigQuery, greatly simplifying the data preparation process.
– **Multimodal Data Management**:
– Organizations can manage multimodal datasets (structured and unstructured combined) in a single environment, making it easier to build data pipelines and perform advanced analytics.
– **Governance and Access Control**:
– ObjectRef offers a unified approach to data governance, allowing the application of existing BigQuery access controls to unstructured data, removing the need for separate governance models.
– **Serverless Processing**:
– The architecture allows for serverless processing capabilities with auto-scaling features, which reduces operational overhead for organizations focusing on analytics rather than infrastructure management.
– **Application of Generative AI**:
– The integration of BigQuery with generative AI models (Gemini) supports sophisticated applications such as audio transcription, performance assessment, and more, further emphasizing the utility of ObjectRef in AI initiatives.
– **Practical Use Cases**:
– New use cases include multimodal inference, generation of multimodal embeddings, and support for processing complex structures such as JSON for audio diarization tasks.
– **Getting Started**:
– The launch of ObjectRef is in preview, and organizations are encouraged to explore its capabilities through live demos and tutorials available from Google Cloud.
Overall, the ObjectRef data type represents a transformative capability that aligns with modern data analytics needs and empowers organizations to maximize the value of their data assets effectively while ensuring robust governance and compliance.