Tag: human judgment
-
Cloud Blog: Beyond guardrails: A taxonomy of platform engineering control mechanisms
Source URL: https://cloud.google.com/blog/products/application-modernization/platform-engineering-control-mechanisms/
Source: Cloud Blog
Title: Beyond guardrails: A taxonomy of platform engineering control mechanisms
Feedly Summary: The promise of platform engineering is to accelerate software delivery by empowering developers with self-service capabilities. However, this must be balanced with security, compliance, and operational stability, and for this, you need robust controls. But all too…
-
Tomasz Tunguz: The SQL Gap
Source URL: https://www.tomtunguz.com/spider-2-benchmark-trends/
Source: Tomasz Tunguz
Title: The SQL Gap
Feedly Summary: GPT-5 achieves 94.6% accuracy on AIME 2025, suggesting near-human mathematical reasoning. Yet ask it to query your database, and success rates plummet to the teens. The Spider 2.0 benchmarks reveal a yawning gap in AI capabilities. Spider 2.0 is a comprehensive text-to-SQL benchmark…