Tag: software engineering
-
Hacker News: SWE-Bench tainted by answer leakage; real pass rates significantly lower
Source URL: https://arxiv.org/abs/2410.06992
Source: Hacker News
Title: SWE-Bench tainted by answer leakage; real pass rates significantly lower
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The paper “SWE-Bench+: Enhanced Coding Benchmark for LLMs” addresses significant data quality issues in the evaluation of Large Language Models (LLMs) for coding tasks. It presents empirical analysis revealing…
-
New York Times – Artificial Intelligence : A.I. Is Prompting an Evolution, Not an Extinction, for Coders
Source URL: https://www.nytimes.com/2025/02/20/business/ai-coding-software-engineers.html
Source: New York Times – Artificial Intelligence
Title: A.I. Is Prompting an Evolution, Not an Extinction, for Coders
Feedly Summary: A.I. tools from Microsoft and other companies are helping write code, placing software engineers at the forefront of the technology’s potential to disrupt the work force.
AI Summary and Description: Yes
Summary:…
-
Hacker News: It’s time to become an ML engineer
Source URL: https://blog.gregbrockman.com/its-time-to-become-an-ml-engineer
Source: Hacker News
Title: It’s time to become an ML engineer
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text discusses the evolution and significance of AI models like GPT-3 and DALL-E 2, highlighting their practical applications and the importance of software engineering in advancing AI. It emphasizes the blend…
-
Slashdot: AI Can Write Code But Lacks Engineer’s Instinct, OpenAI Study Finds
Source URL: https://developers.slashdot.org/story/25/02/19/1212257/ai-can-write-code-but-lacks-engineers-instinct-openai-study-finds
Source: Slashdot
Title: AI Can Write Code But Lacks Engineer’s Instinct, OpenAI Study Finds
Feedly Summary:
AI Summary and Description: Yes
Summary: The text discusses a study by OpenAI researchers that evaluates the capabilities of leading AI models in fixing code, highlighting that while these models show promise, they significantly fall short…
-
Hacker News: SWE-Lancer: a benchmark of freelance software engineering tasks from Upwork
Source URL: https://arxiv.org/abs/2502.12115
Source: Hacker News
Title: SWE-Lancer: a benchmark of freelance software engineering tasks from Upwork
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text introduces SWE-Lancer, a benchmark designed to evaluate large language models’ capability in performing freelance software engineering tasks. It is relevant for AI and software security professionals as…
-
Hacker News: To avoid being replaced by LLMs, do what they can’t
Source URL: https://www.seangoedecke.com/what-llms-cant-do/
Source: Hacker News
Title: To avoid being replaced by LLMs, do what they can’t
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text discusses the implications of advanced large language models (LLMs) on the field of software engineering, outlining strategies for engineers to adapt in light of the impending shift…
-
The Register: Only 4 percent of jobs rely heavily on AI, with peak use in mid-wage roles
Source URL: https://www.theregister.com/2025/02/11/ai_impact_hits_midtohigh_wage_jobs/
Source: The Register
Title: Only 4 percent of jobs rely heavily on AI, with peak use in mid-wage roles
Feedly Summary: Mid-salary knowledge jobs in tech, media, and education are changing. Folk in physical jobs have less to sweat about. Workers in just four percent of occupations use AI for three quarters…
-
Hacker News: The LLM Curve of Impact on Software Engineers
Source URL: https://serce.me/posts/2025-02-07-the-llm-curve-of-impact-on-software-engineers
Source: Hacker News
Title: The LLM Curve of Impact on Software Engineers
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The article discusses the varying impact of large language models (LLMs) on software engineers’ productivity based on their experience level. It highlights that junior engineers find LLMs particularly useful for learning…