Source URL: https://cloud.google.com/blog/products/data-analytics/run-open-source-llms-on-bigquery-ml/
Source: Cloud Blog
Title: BigQuery ML is now compatible with open-source gen AI models
Feedly Summary: BigQuery Machine Learning allows you to use large language models (LLMs), like Gemini, to perform tasks such as entity extraction, sentiment analysis, translation, text generation, and more on your data using familiar SQL syntax.
Today, we are extending this capability with support for any open-source LLM from the Vertex AI Model Garden — including models you deploy from Hugging Face and open-source models you have tuned yourself. This greatly expands the model choice available to developers.
In this post, we use the Meta Llama 3.3 70B model to illustrate how this integration works. However, you can use any of 170K+ text generation models available on Hugging Face by following the same steps. We’ve also got a tutorial notebook ready for you, or you can jump right into the steps below.
Using Open-Source Software (OSS) models with BigQuery ML
1. Host the model on a Vertex endpoint
First, choose a text-generation model from Hugging Face. Then, navigate to Vertex AI Model Garden > Deploy from Hugging Face. Enter the model URL and, optionally, modify the endpoint name, deployment region, and machine spec for the deployment endpoint.
Alternatively, you can search for ‘Llama 3.3’ from the Vertex AI Model Garden UI, accept the terms, and deploy the model endpoint. You can also do this step programmatically (see the tutorial notebook here).
Note: To use Llama models, you need to agree to the Llama 3.3 Community License Agreement on the Llama 3.3 model card on Hugging Face, or accept the terms in the Vertex AI Model Garden UI. You need to complete this step before deploying the model.
2. Create a remote model in BigQuery
Model deployment takes several minutes. After the deployment is complete, create a remote model in BigQuery using a SQL statement like the following:
```sql
CREATE OR REPLACE MODEL bqml_tutorial.llama_3_3_70b
REMOTE WITH CONNECTION `LOCATION.CONNECTION_ID`
OPTIONS (
  endpoint = 'https://<region>-aiplatform.googleapis.com/v1/projects/<project_name>/locations/<region>/endpoints/<endpoint_id>'
);
```
To allow BigQuery to connect to a remote endpoint, you need to provide a connection. If you don’t already have a connection, you can create one by following the instructions here. Replace the placeholder endpoint in the code sample above with your endpoint URL. You can find the endpoint_id in the console via Vertex AI > Online Prediction > Endpoints > Sample Request.
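If you prefer the command line, a Cloud resource connection can also be created with the bq tool. The connection ID, location, and project ID below are placeholders; after creating the connection, grant its service account the Vertex AI User role so BigQuery can call the endpoint:

```shell
# Create a Cloud resource connection for BigQuery (placeholder names).
bq mk --connection \
  --location=us \
  --project_id=my_project \
  --connection_type=CLOUD_RESOURCE \
  my_connection

# Show the connection details, including its service account ID,
# which needs the Vertex AI User role on the project.
bq show --connection my_project.us.my_connection
```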
3. Perform inference
You are now ready to perform inference against this model from BigQuery ML. For this scenario, take this medical transcripts dataset as an example. It contains unstructured, varied raw transcripts capturing the history, diagnosis, and treatment of patients visiting a medical facility. A sample transcript looks like the image below:
Create a table
To analyze this data in BigQuery, first create a table.
```sql
LOAD DATA OVERWRITE bqml_tutorial.medical_transcript
FROM FILES (
  format = 'NEWLINE_DELIMITED_JSON',
  uris = ['gs://cloud-samples-data/vertex-ai/model-evaluation/peft_eval_sample.jsonl']
);
```
Perform inference
You can now use your Llama model to extract structured data from the unstructured transcripts in your table. Say you want to extract the patient’s age, gender and list of diseases for each entry. You can do so with a SQL statement like the following and save the derived insights to a table. Include the information you want to extract and its schema in the model prompt.
```sql
CREATE TEMP FUNCTION ExtractOutput(s STRING)
RETURNS STRING
AS (
  SUBSTR(s, INSTR(s, "Output:") + 8)
);

CREATE OR REPLACE TABLE bqml_tutorial.medical_transcript_analysis_results AS (
  SELECT
    ExtractOutput(ml_generate_text_llm_result) AS generated_text,
    * EXCEPT(ml_generate_text_llm_result)
  FROM
    ML.GENERATE_TEXT(
      MODEL `bqml_tutorial.llama_3_3_70b`,
      (
        SELECT
          CONCAT('Extract the Gender, Age (in years), and Disease information from the following medical transcript. Return **only** a JSON in the following schema: \n{ "Age": Int, "Gender": "String", "Disease": ["String"]}. If Age, Gender, or Disease information is not found, return `null` for that field. Summarize the disease(s) in 1 to 5 words. If the patient has multiple diseases, include them in a comma-separated list within the "Disease" field. Do not include any other text or labels in your response.**. \n', input_text) AS prompt
        FROM
          bqml_tutorial.medical_transcript
      ),
      STRUCT(
        0 AS temperature,
        0.001 AS top_p,
        1 AS top_k,
        128 AS max_output_tokens,
        TRUE AS flatten_json_output
      )
    )
);

SELECT * FROM bqml_tutorial.medical_transcript_analysis_results;
```
The output returned from this Llama endpoint includes the input prompt, so we also wrote and used an ExtractOutput function to help us parse the output. The output table, with the results in the ‘generated_text’ column, is as follows:
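For reference, the ExtractOutput UDF above simply returns everything after the literal marker "Output:" — position of the marker, plus its 7 characters, plus one more (the space that follows it). A minimal Python equivalent of that logic, using a hypothetical model response, looks like this:

```python
def extract_output(s: str) -> str:
    """Mirror the BigQuery UDF SUBSTR(s, INSTR(s, "Output:") + 8).

    INSTR is 1-based, so marker position + 8 skips the 7-character
    marker "Output:" plus the one character after it (a space).
    """
    idx = s.find("Output:")  # 0-based index; -1 if the marker is absent
    if idx == -1:
        # Small safety addition over the UDF: return input unchanged
        # when the marker is missing.
        return s
    return s[idx + 8:]

# Hypothetical response that echoes the prompt before the answer:
response = 'Extract the Gender, Age ... Output: {"Age": 35, "Gender": "Female", "Disease": ["Hypertension"]}'
print(extract_output(response))  # → {"Age": 35, "Gender": "Female", "Disease": ["Hypertension"]}
```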
Perform analytics on results
You can now perform all sorts of analytics on this data. For example, you can answer ‘What are the most common diseases in females aged 30+ in our sample?’ with a simple SQL query. You can see that ‘Hypertension’, ‘Arthritis’, and ‘Hyperlipidemia’ are the most common.
```sql
WITH parsed_data AS (
  SELECT
    JSON_EXTRACT_SCALAR(generated_text, '$.Gender') AS gender,
    CAST(JSON_EXTRACT_SCALAR(generated_text, '$.Age') AS INT64) AS age,
    JSON_EXTRACT_ARRAY(generated_text, '$.Disease') AS diseases
  FROM
    bqml_tutorial.medical_transcript_analysis_results
)
SELECT
  disease,
  COUNT(*) AS occurrence
FROM
  parsed_data, UNNEST(diseases) AS disease
WHERE
  LOWER(gender) = 'female'
  AND age >= 30
GROUP BY
  disease
ORDER BY
  occurrence DESC
LIMIT 3;
```
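To see what this aggregation is doing, the same filter-and-count logic can be sketched in plain Python. The rows below are hypothetical generated_text values shaped like the model's JSON output, not real query results:

```python
import json
from collections import Counter

# Hypothetical generated_text values in the shape the prompt requests.
rows = [
    '{"Age": 45, "Gender": "Female", "Disease": ["Hypertension", "Arthritis"]}',
    '{"Age": 52, "Gender": "Female", "Disease": ["Hypertension"]}',
    '{"Age": 29, "Gender": "Female", "Disease": ["Asthma"]}',          # under 30: filtered out
    '{"Age": 61, "Gender": "Male", "Disease": ["Hyperlipidemia"]}',    # male: filtered out
]

counts = Counter()
for raw in rows:
    record = json.loads(raw)
    age, gender = record.get("Age"), record.get("Gender")
    # Mirror the WHERE clause: females aged 30 and above.
    if gender and gender.lower() == "female" and age is not None and age >= 30:
        counts.update(record.get("Disease") or [])

# Mirror ORDER BY occurrence DESC LIMIT 3.
for disease, occurrence in counts.most_common(3):
    print(disease, occurrence)
```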
Get started today
Try out BigQuery with your preferred open model, or a tuned or distilled model, using the BigQuery and Vertex AI Model Garden integration today. Learn more in our documentation.
AI Summary and Description: Yes
**Summary:** The text discusses the extended capabilities of BigQuery Machine Learning, which now supports any open-source large language model (LLM) from the Vertex AI Model Garden. This includes detailed instructions on deploying models from Hugging Face and performing inference directly within BigQuery, thus enhancing data analytics and machine learning workflows for developers.
**Detailed Description:**
The announcement elaborates on several key points regarding the integration of open-source LLMs with BigQuery Machine Learning:
– **Integration with Vertex AI:**
– BigQuery Machine Learning now allows users to leverage large language models, such as Meta’s Llama 3.3, for various tasks including text generation and entity extraction, using standard SQL syntax.
– Support has been expanded to include over 170,000 text generation models available on Hugging Face through the Vertex AI Model Garden, providing developers significant flexibility and choice.
– **Deployment Steps:**
1. **Model Hosting:**
– Users can deploy a selected text-generation model directly from Hugging Face onto a Vertex endpoint, following either a manual or programmatic approach.
– Compliance with licensing agreements (e.g., the Llama 3.3 Community License Agreement) is necessary before deployment.
2. **Creating Remote Models:**
– After model deployment, users can create a remote model in BigQuery, enabling BigQuery to connect and interact with the model endpoint.
3. **Inference Execution:**
   – Once set up, users can perform inference using SQL queries against different datasets, with an example provided on extracting structured data from medical transcripts.
– **Analytics Capability:**
– The output of the Llama model can be processed to derive insights, such as extracting patient information and performing analytics tasks, like identifying common diseases in specific demographic groups.
– **Practical Application:**
– The integration allows data professionals to perform complex data manipulation and analysis tasks seamlessly within BigQuery, effectively utilizing advanced language models for enhanced data insights and analytics capabilities.
**Practical Implications for Security and Compliance Professionals:**
– The deployment of open-source LLMs in BigQuery environments raises security considerations, including data privacy, license compliance for models, and governance over data access and model integration.
– It is crucial for security professionals to consider the potential risks associated with using third-party models, including ensuring that datasets used for inference do not expose sensitive information.
– Adopting a strong compliance framework that governs the use of public models and data privacy during machine learning operations would be advisable to mitigate risks effectively.
This update highlights a significant evolution in AI capabilities within cloud environments, paving the way for more sophisticated data analytics while prioritizing responsible model usage and data security.