Source URL: https://cloud.google.com/blog/topics/partners/datadog-integrates-google-cloud-ai/
Source: Cloud Blog
Title: Datadog expands its AI observability capabilities with new integrations across the Google Cloud stack
Feedly Summary: Datadog and Google Cloud have long provided customers with powerful capabilities that enable performant, scalable, and differentiated applications in the cloud; in the past two years alone, Datadog’s revenue on Google Cloud Marketplace has more than doubled. As these customers bring Google Cloud’s AI capabilities into their technology stacks, they require observability tools that allow them to better troubleshoot errors, optimize usage, and improve product performance.
Today, Datadog is announcing expanded AI monitoring capabilities with Vertex AI Agent Engine monitoring in its new AI Agents Console. This new feature joins a large and growing set of Google Cloud AI monitoring capabilities that allow joint customers to better innovate and optimize product performance across the AI stack.
Full-stack AI observability
With this extensive set of AI observability capabilities, Datadog customers with workloads on Google Cloud have enhanced visibility into all the layers of an AI application.
Application layer: As businesses adopt autonomous agents to power key workflows, visibility and governance become critical. Datadog’s new AI Agents Console now supports monitoring of agents deployed via Google’s Vertex AI Agent Engine, providing customers with a unified view of the actions, permissions, and business impact of third-party agents — including those orchestrated by Agent Engine.
Model layer: Datadog LLM Observability allows users to monitor, troubleshoot, improve and secure their large language model (LLM) applications. Earlier this year, Datadog introduced auto-instrumentation for Gemini models and LLMs in Vertex AI, which allows teams to start monitoring quickly, minimizing setup work and jumping right into troubleshooting efforts.
Infrastructure layer: In February, Datadog announced a new integration with Cloud TPU, allowing customers to monitor utilization, resource usage, and performance at the container, node, and worker levels. This helps customers rightsize TPU infrastructure and balance training performance with cost.
Data layer: Many Google Cloud customers use BigQuery for data insights. Datadog’s expanded BigQuery monitoring capabilities — launched at Google Cloud Next — help teams optimize costs by showing BigQuery usage per user and project, identifying top spenders and slow queries. It also flags failed jobs for immediate action and identifies data quality issues.
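To make the per-user and per-project breakdown concrete, here is a minimal sketch in plain Python. It is not Datadog's implementation: the job records, the field names (`user`, `project`, `bytes_billed`, `duration_s`, `state`), and the 60-second slow-query threshold are all hypothetical stand-ins for whatever job metadata a monitoring tool collects.

```python
from collections import defaultdict

# Hypothetical job records; the field names are illustrative only and do not
# reflect a real BigQuery or Datadog schema.
JOBS = [
    {"user": "ana", "project": "sales", "bytes_billed": 5_000, "duration_s": 12.0, "state": "DONE"},
    {"user": "ben", "project": "sales", "bytes_billed": 20_000, "duration_s": 95.0, "state": "DONE"},
    {"user": "ana", "project": "ml", "bytes_billed": 1_000, "duration_s": 3.0, "state": "FAILED"},
    {"user": "ben", "project": "ml", "bytes_billed": 4_000, "duration_s": 40.0, "state": "DONE"},
]


def usage_by(jobs, key):
    """Sum billed bytes per value of `key` (for example "user" or "project")."""
    totals = defaultdict(int)
    for job in jobs:
        totals[job[key]] += job["bytes_billed"]
    return dict(totals)


def top_spenders(jobs, key, n=1):
    """Return the n largest (name, bytes_billed) pairs, biggest first."""
    totals = usage_by(jobs, key)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


def slow_jobs(jobs, threshold_s=60.0):
    """Jobs whose runtime exceeds the (assumed) slow-query threshold."""
    return [job for job in jobs if job["duration_s"] > threshold_s]


def failed_jobs(jobs):
    """Jobs that ended in the FAILED state and may need immediate action."""
    return [job for job in jobs if job["state"] == "FAILED"]
```

Calling `top_spenders(JOBS, "user")` on the sample data ranks `ben` first, while `slow_jobs(JOBS)` and `failed_jobs(JOBS)` each surface one job to investigate. A real setup would read these records from job metadata rather than a hard-coded list.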
Optimize monitoring costs
Datadog has regularly invested in reducing the cost of its Google Cloud integrations. Datadog customers can now use Google Cloud’s Active Metrics APIs, which ensure that Datadog calls Google Cloud APIs only when there is new data. This significantly reduces API calls and associated costs without sacrificing visibility. It complements Datadog’s support for Google Cloud’s Private Service Connect, which allows Datadog users running on Google Cloud to reduce data transfer costs.
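The saving from calling APIs only when there is new data can be illustrated with a small simulation. This is a sketch under stated assumptions, not the shape of Google Cloud's Active Metrics APIs or of Datadog's collector: `FakeMonitoringAPI`, `list_active_metrics`, `fetch`, and the metric names are all hypothetical, and the "API" is an in-memory object that only counts calls.

```python
class FakeMonitoringAPI:
    """Hypothetical in-memory stand-in for a cloud monitoring API; counts calls."""

    def __init__(self, active):
        self.active = set(active)  # metrics that currently have new data
        self.calls = 0             # total number of API calls made

    def list_active_metrics(self):
        """One cheap call that reports which metrics have new data."""
        self.calls += 1
        return set(self.active)

    def fetch(self, metric):
        """One call per metric; returns no points if the metric has no new data."""
        self.calls += 1
        return [(metric, 1.0)] if metric in self.active else []


def poll_everything(api, configured):
    """Naive collector: fetch every configured metric on every cycle."""
    points = []
    for metric in configured:
        points.extend(api.fetch(metric))
    return points


def poll_active_only(api, configured):
    """Ask what is active first, then fetch only those metrics."""
    active = api.list_active_metrics()
    points = []
    for metric in configured:
        if metric in active:
            points.extend(api.fetch(metric))
    return points
```

With five configured metrics of which only one has new data, the naive collector makes five calls and the active-only collector makes two, and both return the same data points. The gap widens as the share of idle metrics grows.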
Get started today
Datadog’s unified observability and security platform offers a powerful advantage for organizations that want to use Google Cloud’s cutting-edge AI services. By monitoring the full Google Cloud stack across a breadth of telemetry types, Datadog gives Google Cloud customers the tools and insights they need to build more performant, cost-efficient, and scalable applications.
Ready to try it for yourself? Purchase Datadog directly from the Google Cloud Marketplace and start monitoring your environment in minutes. And if you’re in the New York area, you can see some of these new capabilities in action by visiting the Google Cloud booth at Datadog’s annual conference, DASH, on June 10-11.
AI Summary and Description: Yes
Summary: The text announces Datadog’s new AI monitoring capabilities integrated with Google Cloud, emphasizing the importance of observability in AI application performance. It highlights the need for enhanced monitoring tools as organizations increasingly adopt AI technologies, particularly in the context of large language model (LLM) applications and efficient infrastructure utilization.
Detailed Description:
The content reveals Datadog’s latest enhancements in AI observability features designed to support users harnessing Google Cloud’s AI technologies. These capabilities aim to provide comprehensive insights across various layers of AI applications, ultimately enhancing performance, governance, and cost optimization for organizations leveraging AI in their workflows.
Key Points:
– **Introduction of AI Monitoring Capabilities**:
  – Datadog has expanded its AI monitoring features specifically for Google Cloud customers.
  – The new AI Agents Console monitors agents deployed via Google’s Vertex AI Agent Engine, alongside other third-party agents.
– **Full-stack AI Observability**:
  – **Application Layer**:
    – Critical visibility and governance for autonomous agents.
    – Unified view of agents’ actions, permissions, and business impact.
  – **Model Layer**:
    – Datadog LLM Observability to monitor, troubleshoot, improve, and secure LLM applications.
    – Auto-instrumentation for Gemini models and LLMs in Vertex AI minimizes setup work.
  – **Infrastructure Layer**:
    – Integration with Cloud TPU enables monitoring of utilization, resource usage, and performance at the container, node, and worker levels.
    – Helps rightsize TPU infrastructure and balance training performance with cost.
  – **Data Layer**:
    – Expanded BigQuery monitoring to optimize costs and identify data quality issues.
    – Shows usage per user and project, identifies top spenders and slow queries, and flags failed jobs for immediate action.
– **Cost Optimization Features**:
  – Use of Google Cloud’s Active Metrics APIs means Datadog calls Google Cloud APIs only when there is new data, lowering costs while maintaining visibility.
  – Support for Google Cloud’s Private Service Connect further reduces data transfer costs.
– **Call to Action**:
  – Organizations are encouraged to try Datadog for unified observability and security, purchasing it directly from the Google Cloud Marketplace.
  – An invitation to see these new features at Datadog’s annual conference, DASH.
Overall, the announcement underscores the growing need for robust monitoring tools as companies adopt AI technologies, and it highlights Datadog’s commitment to monitoring every layer of an AI application to improve performance and control costs for businesses using Google Cloud.