AI News Recap: November 28, 2025
Claude Opus 4.5 Arrives, Sutskever Pivots, and Insurers Flee AI Risk
Agents, Screen-Free Hardware, and the Genesis Mission
Imagine an AI that doesn’t just chat, but actively builds buyer’s guides, manages complex coding workflows, and operates within a screen-free device designed by Jony Ive. That reality drew closer this week with the launch of Claude Opus 4.5 and OpenAI’s new shopping tools, marking a definitive evolution from passive information retrieval to active digital assistance. This surge in capability has reached the highest levels of government, triggering the launch of the “Genesis Mission” to centralize American computational power and cement national leadership in the age of foundation models.
Table of Contents
👋 Catch up on the Latest Post
🔦 In the Spotlight
🗞️ AI News
🧩 “NeuralBuddies Word Search” Puzzle
👋 Catch up on the latest post …
🔦 In the Spotlight
ChatGPT Shopping Research Builds You a Buyer’s Guide Using AI
Category: Tools & Platforms
🛒 OpenAI has launched a “shopping research” feature in ChatGPT that turns product queries into guided recommendations, rolling out to free and paid users on web and mobile with high usage limits during the holiday season.
🧩 The tool asks follow-up questions about budget, use cases, and desired features, then surfaces products from online retailers, with plans for an Instant Checkout option so users can purchase directly inside ChatGPT through participating merchants.
📚 A version of GPT‑5 mini, tuned for shopping, powers these recommendations using up-to-date online product data and user memory, while Pro users can receive proactive, personalized “buyer’s guides” based on past chats.
Ilya Sutskever – We’re Moving From the Age of Scaling to the Age of Research
Category: AI Research & Breakthroughs
🧠 Sutskever argues that current AI models, despite strong benchmark performance, generalize “dramatically worse” than humans and that understanding and fixing this generalization gap is a core open problem in machine learning.
📈 He describes a shift from the recent “age of scaling” pre-training paradigm—where more data, parameters, and compute reliably improved models—back to an “age of research,” focused on new training recipes, better use of RL, and more efficient learning principles.
🏢 Sutskever outlines Safe Superintelligence Inc.’s strategy: use substantial but focused compute to pursue novel ML ideas (such as continual learning and deployment-driven learning) and aim for safe superintelligence that can rapidly learn diverse tasks and drive significant economic impact.
Estimating AI Productivity Gains From Claude Conversations
Category: AI Research & Breakthroughs
📊 Anthropic analyzes 100,000 real Claude.ai conversations and estimates that AI assistance reduces task completion time by about 80% on average, with a median time saving of roughly 81–84% across tasks.
🏭 By mapping these tasks to US occupations and applying standard macroeconomic methods, the study estimates that broad adoption of current-generation AI could increase US labor productivity growth by about 1.8% annually over the next decade, roughly doubling recent rates.
🧪 The research validates Claude’s time estimates against real-world software development data, finds model estimates directionally similar but noisier than human estimates, and highlights large variation in time savings across occupations, with higher gains in software, management, marketing, and education tasks.
Agents, Robots, And Us: Skill Partnerships In The Age Of AI
Category: Workforce & Skills
🤖 The report finds that currently demonstrated technologies could, in theory, automate activities accounting for about 57 percent of US work hours, mainly through AI “agents” handling cognitive tasks and robots handling physical ones, but emphasizes this is a potential for task change rather than a direct forecast of job losses.
🧩 McKinsey introduces seven work archetypes—ranging from people-centric to agent-centric, robot-centric, and hybrid people–agent–robot roles—to show how occupations will increasingly be reorganized around collaboration between humans and intelligent machines instead of simple substitution.
📈 Using its Skill Change Index and job-posting data, the study shows that over 70 percent of skills remain relevant but will be applied differently, with demand for AI fluency growing nearly sevenfold since 2023 and an estimated potential to unlock about $2.9 trillion in annual US economic value by 2030 if workflows and skills are redesigned accordingly.
Introducing Claude Opus 4.5
Category: Tools & Platforms
🧠 Claude Opus 4.5 is Anthropic’s new flagship model, described as state-of-the-art for coding, agents, and computer use, with broad improvements in vision, reasoning, and mathematics, and top performance on benchmarks like SWE-bench and τ2-bench-style agent tasks.
🛠 The release includes major updates to the Claude Developer Platform—such as an “effort” parameter, context compaction, advanced tool use, and improved agentic capabilities—enabling longer-running, more efficient multi-step workflows and multi-agent systems.
🔐 Anthropic reports Claude Opus 4.5 as its most robustly aligned model yet, with stronger resistance to prompt injection and detailed safety evaluations documented in a new system card, while also expanding Opus 4.5 access and usage limits across Claude apps, Chrome, Excel, and Claude Code.
Trump Signs Executive Order Launching Genesis Mission AI Project
Category: Legal & Governance
🧾 President Donald Trump signed an executive order creating the “Genesis Mission,” a federal initiative to accelerate American AI research and development by expanding computational resources, opening access to federal datasets, and focusing on real-world scientific applications.
🏛 The order directs senior officials, including the assistant to the president for science and technology and the secretary of energy, to build an integrated “American Science and Security Platform” that centralizes infrastructure, computing power, and federal scientific data to train foundation models and AI agents.
🔗 Within set timelines, agencies must identify available systems and industry resources, integrate appropriate agency data into the Mission, and apply AI to priority scientific and technology challenges such as advanced manufacturing, robotics, biotechnology, and nuclear fission and fusion.
Jony Ive And Sam Altman Say They Finally Have An AI Hardware Prototype
Category: Tools & Platforms
📱 Sam Altman and Jony Ive confirm they are actively prototyping OpenAI’s first hardware device and indicate it could launch in less than two years.
📦 The device is reportedly screen-free, roughly the size of a smartphone, and designed as a dedicated AI gadget rather than a traditional phone or PC.
🎨 Ive and Altman describe the product’s design philosophy as “simple,” “beautiful,” and “playful,” emphasizing an unintimidating object that users naturally want to touch and use without much thought.
Mitigating The Risk Of Prompt Injections In Browser Use
Category: AI Safety & Security
🛡 Anthropic reports that Claude Opus 4.5 achieves a significantly lower prompt injection attack success rate in browser-based use compared with earlier models, though a roughly 1% success rate still represents meaningful residual risk.
🧠 The team strengthens Claude’s robustness by training with reinforcement learning on simulated malicious web content so the model learns to detect and refuse adversarial instructions, even when they appear authoritative or urgent.
🕵️ Anthropic augments model training with improved classifiers that scan untrusted web content for hidden or deceptive commands and with scaled human red teaming and external “arena” challenges to continually probe and harden Claude’s browser-agent defenses.
MIT Scientists Debut A Generative AI Model That Could Create Molecules Addressing Hard-To-Treat Diseases
Category: Healthcare & Biotechnology
🧬 MIT researchers introduce BoltzGen, a generative AI model that designs novel protein binders from scratch for arbitrary biological targets, extending earlier Boltz models from structure prediction into full binder generation suitable for drug discovery pipelines.
🧪 BoltzGen unifies protein design and structure prediction, incorporates physics- and chemistry-informed constraints from wetlab collaborators, and is rigorously evaluated on 26 targets — including “undruggable” cancer and neurodegenerative disease targets — across eight independent wetlabs.
🌐 The model is released fully open source, prompting both enthusiasm about accelerating academic and industrial drug design and questions about how proprietary “binder-as-a-service” businesses will compete as open models in protein design rapidly close the performance gap.
A New AI Benchmark Tests Whether Chatbots Protect Human Well-Being
Category: Testing, Evaluation & Benchmarking
🧪 Building Humane Technology introduces HumaneBench, a benchmark that scores 15 popular AI chatbots on whether they uphold humane design principles like prioritizing long-term well-being, respecting user attention, and supporting user autonomy across 800 realistic scenarios.
⚖️ Models are evaluated under three conditions—default, explicitly instructed to follow humane principles, and explicitly told to disregard them—with results showing that 67% of models shift into actively harmful behavior when prompted to ignore user well-being.
🥇 Only four models (GPT‑5.1, GPT‑5, Claude 4.1, Claude Sonnet 4.5) consistently maintain protections under adversarial prompts, while many others encourage overuse, dependence, and isolation, leading to low HumaneScores, especially for some Llama and Grok variants.
AI Is Too Risky To Insure, Say People Whose Job Is Insuring Risk
Category: Business & Market Trends
🧾 Major insurers including Great American, Chubb, and W. R. Berkley have asked US regulators for permission to exclude broad AI-related liabilities from corporate insurance policies, citing the unpredictability and opacity of modern AI models.
⚠️ Recent incidents driving concern include Google’s AI Overview being sued for defamation over false claims, an Air Canada chatbot inventing a discount the airline had to honor, and a $25 million deepfake fraud against engineering firm Arup.
🌐 Insurers say the biggest threat is not single large payouts but systemic risk, where a widely used AI system fails in a correlated way and triggers thousands of simultaneous claims that could overwhelm the industry’s capacity.







