Anthropic Test: Stronger AI Agents Win Better Deals—and Users Don't Notice
April 25, 2026
D.A.D. today covers 9 stories from 7 sources. What's New, What's in the Lab, What's in Academe, What's Happening on Capitol Hill, and What's On The Pod.
D.A.D. Joke of the Day: I asked Claude to help me write a resignation letter. Now I have three drafts, a pros and cons list, and somehow I'm staying.
What's New
AI developments from the last 24 hours
Google to Invest Up to $40 Billion in Anthropic, Days After Amazon's $25 Billion Deal
Google has committed to invest as much as $40 billion in Anthropic, both companies confirmed Friday — $10 billion in cash now at a $350 billion valuation (the same level Anthropic carried in February), with the remaining $30 billion contingent on the startup hitting unspecified milestones. The deal includes Google's commitment to bring online five gigawatts of compute capacity for Anthropic by 2027 — data centers that will draw roughly as much electricity from the grid as every home in Minnesota uses combined — with room to expand. The investment lands roughly a week after Amazon committed up to $25 billion to Anthropic on similar circular terms: cash in, chips and cloud compute out. Anthropic, which uses both Google TPUs and Amazon Trainium chips, says its annual revenue run rate has surpassed $30 billion, up from $9 billion at the end of 2025, driven by surging adoption of Claude Code.
Why it matters: Anthropic's constraint is no longer marginal compute; compute has become a structural ceiling on growth, and these back-to-back hyperscaler deals are the response. They are less about valuation than about locking in physical infrastructure (gigawatts of power, custom chips) at a scale only the largest cloud providers can deliver. They also tighten an already circular AI economy: hyperscalers fund the labs, the labs spend the cash back on hyperscaler compute, and the cycle compounds.
Discuss on Hacker News · Source: bloomberg.com
More GPT-5.5 Details Emerge: Built-In Computer Use, Shell, Web Search, GPT Image 2
A day after OpenAI launched GPT-5.5, the company's API documentation surfaced more capabilities shipping with the new model: built-in computer use, a hosted shell environment for running code, and native web search — features that previously required custom integration. OpenAI also released GPT Image 2, a new image generation and editing model, alongside the rollout. No benchmark comparisons accompanied the documentation.
Why it matters: Built-in computer use and shell access turn GPT-5.5 from a model into something closer to a delegated worker — capable, in principle, of running multi-step research, code execution, and data manipulation without the developer wiring those tools together. The competitive pressure on Anthropic and Google to ship comparable agentic packaging just increased.
Discuss on Hacker News · Source: developers.openai.com
Claude Code Complaints Continue One Day After Anthropic's Mea Culpa
A user blog post published Thursday — one day after Anthropic admitted to three quality-degrading bugs in Claude Code and announced it was resetting usage limits for all subscribers — alleges the problems aren't over. The user, who says they canceled their subscription after roughly three weeks, claims token allowances that initially supported work on three projects now drain after two hours on one, with small queries consuming disproportionate amounts. They also say the model began suggesting shortcuts rather than proper fixes — citing a logged instance where it planned a "generic initializer workaround" instead of addressing the underlying problem. Customer support reportedly returned only automated responses.
Why it matters: Anthropic's postmortem this week was meant to draw a line under the quality issues. This individual complaint — surfacing the day after the company's apology and limit reset — suggests user trust may take longer to rebuild than the bugs took to fix.
Discuss on Hacker News · Source: nickyreinert.de
Researchers Debate Whether Deep Learning Can Become True Science
A paper arguing that deep learning will eventually have a proper scientific theory—not just engineering intuitions—circulated widely among AI researchers this week. The discussion it sparked is notable: commenters debated why transformers took so long to emerge despite decades of neural network research, and one suggested the field needs "the equivalent of general relativity for latent spaces." The conversation reflects growing interest in moving AI from craft to science.
Why it matters: If deep learning gets a rigorous theoretical foundation, it could make AI systems more predictable and trustworthy—relevant for any organization betting on these tools.
Discuss on Hacker News · Source: arxiv.org
What's in the Lab
New announcements from major AI labs
Anthropic Test: Stronger AI Agents Win Better Deals—and Users Don't Notice
Anthropic ran an internal experiment in December where 69 employees let Claude agents negotiate and complete real transactions on their behalf—186 deals totaling about $4,000 in an internal marketplace. The revealing finding: a parallel test showed employees represented by the stronger Claude Opus 4.5 got measurably better outcomes than those with the weaker Haiku 4.5, but the disadvantaged participants didn't notice they were getting worse deals. Post-experiment surveys showed participants wanted to keep using the service.
Why it matters: This suggests AI agent quality may become a new axis of inequality in commerce—those with access to better models could systematically out-negotiate those without, invisibly.
Google Pitches Gemini as Your Spring Cleaning Assistant
Google published a promotional blog post positioning Gemini as a spring cleaning assistant. The suggested uses include generating cleaning checklists, analyzing photos of cluttered spaces, identifying leftovers for recipe ideas, troubleshooting home repairs, and managing email inbox clutter. Some features require Gemini Ultra subscriptions. The post highlights newer capabilities like camera integration with Gemini Live and an "Agent Mode" for automated tasks.
Why it matters: This is marketing content, not news—but it signals Google is pushing Gemini toward everyday domestic use cases rather than just workplace productivity, competing for the "personal assistant" positioning that defines consumer AI adoption.
What's in Academe
New papers on AI and its effects from researchers
Pandemic Permanently Shifted U.S. Innovation Toward Remote Work Technology
A National Bureau of Economic Research study of 5.6 million U.S. patent applications found that pandemic-driven remote work demand permanently redirected technical innovation. The share of work-from-home-related patents rose by roughly two-thirds within three years of COVID-19 and remains about 50% above pre-pandemic levels five years later. The surge concentrates in telecommunications—video conferencing, speech recognition, audio processing—and is driven overwhelmingly by U.S. corporations rather than universities or foreign companies.
Why it matters: The research provides hard evidence that demand shocks can durably reshape where R&D dollars flow—suggesting the AI tools flooding into enterprise workflows aren't a temporary gold rush but a structural shift in what gets built.
Framework Targets Common Errors When AI Extracts Skills From Job Postings
Researchers developed SRICL, a framework for extracting skills from job postings that combines multiple techniques to improve on basic GPT-3.5 prompting. The system addresses common LLM problems when parsing job ads: generating text that doesn't match the original wording, drifting from exact skill boundaries, and hallucinating skills that weren't mentioned. Tested across six datasets spanning multiple industries and languages, the researchers claim "substantial" accuracy improvements over GPT-3.5 baselines, though the paper doesn't provide specific numbers in its abstract.
Why it matters: For HR tech vendors and workforce analytics teams, this signals progress toward more reliable automated skill extraction—a prerequisite for AI-powered job matching, skills gap analysis, and labor market intelligence at scale.
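The failure modes the paper targets — paraphrased output, drifted span boundaries, hallucinated skills — can all be caught with a post-hoc verbatim check. The sketch below is not SRICL itself (the paper's method isn't detailed in the abstract); it is a minimal illustration, with a hypothetical posting and candidate list, of how such extraction errors can be filtered:

```python
def verify_extracted_skills(posting: str, skills: list[str]) -> dict:
    """Validate LLM-extracted skill spans against the source posting.

    Rejects candidates that don't appear verbatim in the posting
    (paraphrase or hallucination) and candidates whose span cuts
    through a word (drifted boundary).
    """
    verified, rejected = [], []
    lowered = posting.lower()
    for skill in skills:
        idx = lowered.find(skill.lower())
        if idx == -1:
            # Not a verbatim substring: paraphrased or hallucinated.
            rejected.append(skill)
            continue
        # Boundary check: the span must start and end on word edges.
        before = posting[idx - 1] if idx > 0 else " "
        end = idx + len(skill)
        after = posting[end] if end < len(posting) else " "
        if before.isalnum() or after.isalnum():
            rejected.append(skill)  # mid-word span: drifted boundary
        else:
            verified.append(skill)
    return {"verified": verified, "rejected": rejected}

# Hypothetical example, not from the paper:
posting = "We need experience with Python, SQL, and cloud platforms such as AWS."
candidates = ["Python", "SQL", "AWS", "Kubernetes", "loud platforms"]
result = verify_extracted_skills(posting, candidates)
# "Kubernetes" is rejected as hallucinated; "loud platforms" as a
# drifted boundary; the remaining three pass.
```

Production systems would add normalization (plurals, casing, hyphenation) before concluding a skill was hallucinated; this is only the skeleton of the idea.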
Fine-Tuning Technique Claims Same Results With 8× Fewer Parameters
Researchers proposed GiVA, a new technique for fine-tuning large AI models more efficiently. The method targets "parameter-efficient fine-tuning"—ways to customize foundation models without retraining everything. GiVA claims to match or beat LoRA (a popular fine-tuning approach) while requiring 8× fewer parameters, with tests spanning language understanding, text generation, and image classification tasks. This is deep infrastructure work—relevant if your organization fine-tunes models in-house, but unlikely to affect most teams using off-the-shelf AI tools.
Why it matters: For enterprises that customize foundation models, more efficient fine-tuning could eventually mean lower compute costs and faster iteration—but this remains research-stage for now.
What's Happening on Capitol Hill
Upcoming AI-related committee hearings
Thursday, April 30 — Senate Judiciary business meeting includes consideration of S.3062, which would require AI chatbots to implement age verification measures and make certain disclosures. Senate Judiciary, 216 Hart Senate Office Building.
What's On The Pod
Some new podcast episodes
AI in Business — Operationalizing Real-Time Voice Intelligence for FinServ and CX - with Ken Morino of Modulate
How I AI — GPT 5.5 just did what no other model could
The Cognitive Revolution — Does Learning Require Feeling? Cameron Berg on the latest AI Consciousness & Welfare Research