Tag: audio

  • Cloud Blog: Build with more flexibility: New open models arrive in the Vertex AI Model Garden

    Source URL: https://cloud.google.com/blog/products/ai-machine-learning/deepseek-r1-is-available-for-everyone-in-vertex-ai-model-garden/ Source: Cloud Blog Title: Build with more flexibility: New open models arrive in the Vertex AI Model Garden Feedly Summary: In our ongoing effort to provide businesses with the flexibility and choice needed to build innovative AI applications, we are expanding the catalog of open models available as Model-as-a-Service (MaaS) offerings in…

  • Simon Willison’s Weblog: Voxtral

    Source URL: https://simonwillison.net/2025/Jul/16/voxtral/#atom-everything Source: Simon Willison’s Weblog Title: Voxtral Feedly Summary: Mistral released their first audio-input models yesterday: Voxtral Small and Voxtral Mini. These state‑of‑the‑art speech understanding models are available in two sizes: a 24B variant for production-scale applications and a 3B variant for local and edge deployments. Both versions are released under the Apache…

  • Simon Willison’s Weblog: Introducing Gemma 3n: The developer guide

    Source URL: https://simonwillison.net/2025/Jun/26/gemma-3n/ Source: Simon Willison’s Weblog Title: Introducing Gemma 3n: The developer guide Feedly Summary: Extremely consequential new open weights model release from Google today. Multimodal by design: Gemma 3n natively supports image, audio, video, and text inputs and text outputs. Optimized for on-device: Engineered with a focus…

  • The Cloudflare Blog: Orange Me2eets: We made an end-to-end encrypted video calling app and it was easy

    Source URL: https://blog.cloudflare.com/orange-me2eets-we-made-an-end-to-end-encrypted-video-calling-app-and-it-was/ Source: The Cloudflare Blog Title: Orange Me2eets: We made an end-to-end encrypted video calling app and it was easy Feedly Summary: Orange Meets, our open-source video calling web application, now supports end-to-end encryption using the MLS protocol with continuous group key agreement.

  • Cloud Blog: How to use Gemini 2.5 to fine-tune video outputs on Vertex AI

    Source URL: https://cloud.google.com/blog/products/ai-machine-learning/how-to-fine-tune-video-outputs-using-vertex-ai/ Source: Cloud Blog Title: How to use Gemini 2.5 to fine-tune video outputs on Vertex AI Feedly Summary: Recently, we announced Gemini 2.5 is generally available on Vertex AI. As part of this update, tuning capabilities have extended beyond text outputs – now, you can tune image, audio, and video outputs on…

  • Tomasz Tunguz: The Multimodal Lake House : Partnering with Lance

    Source URL: https://www.tomtunguz.com/partnering-with-lance/ Source: Tomasz Tunguz Title: The Multimodal Lake House : Partnering with Lance Feedly Summary: Remember when you took a family photo & Ghibli-styled it? Or that vibe coding session, when you pasted a screenshot of the browser so the AI could help you debug some JavaScript? Today, we expect AI to be…

  • Cloud Blog: How Conversational Agents and Looker can boost contact center efficiency and enhance constituent services

    Source URL: https://cloud.google.com/blog/topics/public-sector/how-conversational-agents-and-looker-can-boost-contact-center-efficiency-and-enhance-constituent-services/ Source: Cloud Blog Title: How Conversational Agents and Looker can boost contact center efficiency and enhance constituent services Feedly Summary: Conversational agents are transforming the way public sector agencies engage with constituents — enabling new levels of hyper-personalization, multimodal conversations, and improving interactions across touchpoints. And this is just the beginning. Our…