Tag: speech-to-speech

  • Simon Willison’s Weblog: Introducing gpt-realtime

    Source URL: https://simonwillison.net/2025/Sep/1/introducing-gpt-realtime/#atom-everything
    Source: Simon Willison’s Weblog
    Title: Introducing gpt-realtime
    Feedly Summary: Released a few days ago (August 28th), gpt-realtime is OpenAI’s new “most advanced speech-to-speech model”. It looks like this is a replacement for the older gpt-4o-realtime-preview model that was released last October. This is a slightly confusing release. The previous realtime…

  • The Register: ‘Savvy’ shortcuts produce near-instant speech-to-speech translation of 36 languages

    Source URL: https://www.theregister.com/2025/01/15/babel_fish_translations/
    Source: The Register
    Title: ‘Savvy’ shortcuts produce near-instant speech-to-speech translation of 36 languages
    Feedly Summary: Babel Fish-like ML model emerges after training on 4.5 million hours of multilingual spoken audio. Meta has developed a machine learning model that its researchers claim offers near-instant speech-to-speech translation between around 36 languages.…
    AI Summary and…

  • Simon Willison’s Weblog: First impressions of the new Amazon Nova LLMs (via a new llm-bedrock plugin)

    Source URL: https://simonwillison.net/2024/Dec/4/amazon-nova/
    Source: Simon Willison’s Weblog
    Title: First impressions of the new Amazon Nova LLMs (via a new llm-bedrock plugin)
    Feedly Summary: Amazon released three new Large Language Models yesterday at their AWS re:Invent conference. The new model family is called Amazon Nova and comes in three sizes: Micro, Lite and Pro. I built…

  • Hacker News: Hugging Face tackles speech-to-speech

    Source URL: https://github.com/huggingface/speech-to-speech
    Source: Hacker News
    Title: Hugging Face tackles speech-to-speech
    Feedly Summary: Comments
    AI Summary and Description: Yes
    Summary: The text describes an open-sourced, modular Speech-to-Speech pipeline utilizing various advanced AI models available on the Hugging Face Hub. This initiative provides significant potential for developers and researchers interested in integrating speech processing capabilities into…