Tag: extraction

  • Simon Willison’s Weblog: How ProPublica Uses AI Responsibly in Its Investigations

    Source URL: https://simonwillison.net/2025/Mar/14/propublica-ai/ Source: Simon Willison’s Weblog Title: How ProPublica Uses AI Responsibly in Its Investigations Feedly Summary: Charles Ornstein describes how ProPublica used an LLM to help analyze data for their recent story A Study of Mint Plants. A Device to Stop Bleeding. This Is the…

  • Simon Willison’s Weblog: Introducing Command A: Max performance, minimal compute

    Source URL: https://simonwillison.net/2025/Mar/13/command-a/#atom-everything Source: Simon Willison’s Weblog Title: Introducing Command A: Max performance, minimal compute Feedly Summary: New LLM release from Cohere. It’s interesting to see which aspects of the model they’re highlighting, as an indicator of what their commercial customers value the most (highlight mine): Command A…

  • Simon Willison’s Weblog: Notes on Google’s Gemma 3

    Source URL: https://simonwillison.net/2025/Mar/12/gemma-3/ Source: Simon Willison’s Weblog Title: Notes on Google’s Gemma 3 Feedly Summary: Google’s Gemma team released an impressive new model today (under their not-open-source Gemma license). Gemma 3 comes in four sizes – 1B, 4B, 12B, and 27B – and while 1B is text-only the larger three models are all multi-modal for…

  • Simon Willison’s Weblog: Notes on Google’s Gemma 3

    Source URL: https://simonwillison.net/2025/Mar/12/notes-on-googles-gemma-3/ Source: Simon Willison’s Weblog Title: Notes on Google’s Gemma 3 Feedly Summary: Google’s Gemma team released an impressive new model today (under their not-open-source Gemma license). Gemma 3 comes in four sizes – 1B, 4B, 12B, and 27B – and while 1B is text-only the larger three models are all multi-modal for…

  • Simon Willison’s Weblog: Cutting-edge web scraping techniques at NICAR

    Source URL: https://simonwillison.net/2025/Mar/8/cutting-edge-web-scraping/#atom-everything Source: Simon Willison’s Weblog Title: Cutting-edge web scraping techniques at NICAR Feedly Summary: Here’s the handout for a workshop I presented this morning at NICAR 2025 on web scraping, focusing on lesser-known tips and tricks that became possible only with recent developments in LLMs. For…

  • Hacker News: Mistral OCR

    Source URL: https://mistral.ai/news/mistral-ocr Source: Hacker News Title: Mistral OCR Feedly Summary: Comments AI Summary and Description: Yes Summary: The text introduces Mistral OCR, a new Optical Character Recognition API that significantly enhances document understanding capabilities by accurately extracting content from complex documents. This technology presents valuable applications for various fields and…

  • Hacker News: Launch HN: Cenote (YC W25) – Back Office Automation for Medical Clinics

    Source URL: https://news.ycombinator.com/item?id=43280836 Source: Hacker News Title: Launch HN: Cenote (YC W25) – Back Office Automation for Medical Clinics Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses Cenote, a company using AI to streamline referral intake for medical clinics by automating data extraction and insurance verification processes. This innovation is particularly…

  • Cloud Blog: GoStringUngarbler: Deobfuscating Strings in Garbled Binaries

    Source URL: https://cloud.google.com/blog/topics/threat-intelligence/gostringungarbler-deobfuscating-strings-in-garbled-binaries/ Source: Cloud Blog Title: GoStringUngarbler: Deobfuscating Strings in Garbled Binaries Feedly Summary: Written by: Chuong Dong Overview In our day-to-day work, the FLARE team often encounters malware written in Go that is protected using garble. While recent advancements in Go analysis from tools like IDA Pro have simplified the analysis process, garble…