The Yardstick Feed
AI updates, frontier-model releases, and standards work.
A weekly digest for B2B AI buyers: capability and safety announcements from frontier labs, AI standards and regulatory updates, agent-reliability research, and Yardstick methodology notes. Sourced from a fixed list of labs, regulators, and research institutions.
AI Updates — Week of May 22, 2026
Frontier-lab consolidation and oversight signals both showed up this week. Cohere shipped Command A+ and announced two strategic MOUs the same day, DeepMind released its Co-Scientist research-agent system, AWS Bedrock unlocked per-request cost attribution, and the UK AI Security Institute published its Frontier AI Trends Report.
Frontier labs
Introducing Command A+ (Cohere Blog)
An open-weights mixture-of-experts model built for high-performance agentic tasks that can be privately deployed on as few as two H100 GPUs. The pitch: enterprise-grade reasoning and multimodal capability without the inference-cost profile of a frontier dense model.
Read at cohere.com →
Co-Scientist: A multi-agent AI partner to accelerate research (Google DeepMind)
A multi-agent system built on Gemini that iteratively generates, debates, and refines novel hypotheses for life-sciences research. Positioned as a collaborator for working scientists rather than a one-shot reasoning engine.
Read at deepmind.google →
Cohere acquires Reliant AI to expand sovereign enterprise AI (Cohere Blog)
Reliant AI is a Montreal- and Berlin-based biopharma AI company. The acquisition extends Cohere's sovereign-AI positioning into healthcare-vertical workflows where data residency and compliance shape vendor selection more than raw model capability.
Read at cohere.com →
Infrastructure & platforms
Amazon Bedrock expands support for request-level usage attribution (AWS What's New)
Customers can now tag InvokeModel and InvokeModelWithResponseStream calls with team, application, environment, and experiment metadata for per-request cost reporting. Closes a long-standing gap for enterprises that need to attribute Bedrock spend across business units.
Read at aws.amazon.com →
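For a sense of what per-request attribution enables downstream, here is a minimal sketch that rolls up spend by tag. The record layout, tag keys, and cost figures are hypothetical placeholders chosen for illustration; they are not the format AWS emits, and the sketch makes no Bedrock API calls.

```python
from collections import defaultdict

# Hypothetical usage records: field names and costs are illustrative only,
# not the actual Bedrock reporting format.
records = [
    {"tags": {"team": "search", "environment": "prod"}, "cost_usd": 0.50},
    {"tags": {"team": "search", "environment": "dev"}, "cost_usd": 0.25},
    {"tags": {"team": "support", "environment": "prod"}, "cost_usd": 1.00},
    {"tags": {}, "cost_usd": 0.125},
]


def spend_by_tag(records, key):
    """Sum cost per value of one tag key; requests missing the key go to 'untagged'."""
    totals = defaultdict(float)
    for record in records:
        value = record["tags"].get(key, "untagged")
        totals[value] += record["cost_usd"]
    return dict(totals)


team_totals = spend_by_tag(records, "team")
print(team_totals)
```

Keeping an explicit "untagged" bucket makes gaps in tagging coverage visible instead of silently dropping that spend from the report.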
Standards & regulation
Frontier AI Trends Report (UK AI Security Institute)
An assessment of the AI oversight landscape, its robustness to capability advances, and the pathways that could lead to its degradation. Pairs with NIST's RMF as buyer-facing evidence on whether a vendor's claims about model safety hold up to independent evaluation.
Read at aisi.gov.uk →
By Yardstick Research
AI Updates — Week of May 15, 2026
Distribution and oversight dominated the week. OpenAI stood up a Deployment Company to embed engineers with enterprise customers, the UK AI Security Institute shipped two safety papers, and Anthropic announced a four-year, $200M partnership with the Gates Foundation.
Frontier labs
OpenAI launches the OpenAI Deployment Company (OpenAI)
A new arm of OpenAI focused on embedding Forward Deployed Engineers with enterprise customers, paired with the acquisition of Tomoro. Signals an industry shift toward services-and-models bundles rather than model-API-only sales.
Read at openai.com →
Anthropic forms $200 million partnership with the Gates Foundation (Anthropic News)
Four-year commitment of grant funding, Claude usage credits, and technical support across global health, life sciences, education, and economic mobility. Foundation-grade enterprises now have a defined path to Claude deployment with vendor support included.
Read at anthropic.com →
Standards & regulation
Alignment research paper (UK AI Security Institute)
AISI's alignment work probes whether frontier-model behavior actually matches stated objectives under adversarial conditions. Useful as a procurement reference when a vendor claims its agent is safe by design.
Read at aisi.gov.uk →
Cyber autonomous systems capabilities research (UK AI Security Institute)
AISI's narrow-cyber suite finds that the length of tasks frontier models can autonomously complete in cybersecurity scenarios is doubling every few months. The direct implication for any AI buyer doing threat modeling: the offensive-capability curve outpaces typical procurement cycles.
Read at aisi.gov.uk →
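To make the procurement-cycle comparison concrete, the short sketch below computes how much autonomous task length would grow over one buying cycle under steady doubling. Both inputs are assumptions for illustration: the summary above says only "every few months," so the 4-month doubling time and 12-month cycle are placeholders, not AISI figures.

```python
def growth_factor(elapsed_months, doubling_months):
    """Multiplier on task length after elapsed_months of steady doubling."""
    return 2 ** (elapsed_months / doubling_months)


# Assumed values for illustration only; swap in your own estimates.
doubling_months = 4
procurement_cycle_months = 12

factor = growth_factor(procurement_cycle_months, doubling_months)
print(f"Task length grows about {factor:.0f}x over one procurement cycle")
```

Under these assumed inputs, a threat model written at the start of a vendor evaluation would describe capabilities roughly three doublings behind those in play by contract signature.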
By Yardstick Research
AI Updates — Week of May 8, 2026
Three deployment-relevant shifts. OpenAI released GPT-5.5 Instant with a measured drop in hallucination rate, AWS Bedrock pushed AgentCore into GovCloud, and HiDream-O1-Image opened a state-of-the-art image model under open weights.
Frontier labs
GPT-5.5 Instant: smarter, clearer, and more personalized (OpenAI)
New default model in ChatGPT. OpenAI reports 52.5% fewer hallucinated claims than GPT-5.3 Instant in internal evaluations covering medicine, law, and finance. Worth knowing if you are running a Claude-vs-GPT bake-off for a B2B vertical with hallucination risk.
Read at openai.com →
Infrastructure & platforms
Amazon Bedrock AgentCore now available in AWS GovCloud (US-West) (AWS What's New)
Enterprise-grade agentic AI capabilities now reach AWS GovCloud, unlocking agent deployment for workloads with elevated compliance needs. Relevant for federal contractors and regulated buyers whose data-residency rules previously blocked Bedrock.
Read at aws.amazon.com →
Amazon Bedrock now offers OpenAI models, Codex, and Managed Agents (Limited Preview) (AWS What's New)
GPT-5.5 and GPT-5.4 come to Bedrock with unified security, governance, and cost controls. The Codex coding agent runs inside existing AWS environments, processing inference through Bedrock and applying usage toward AWS commitments — useful for shops with AWS spend commitments who want OpenAI-tier coding agents without a separate vendor contract.
Read at aws.amazon.com →
Research
HiDream-O1-Image: open-weights text-to-image released (Hugging Face)
An 8B-parameter image model open-sourced with both undistilled and distilled variants plus a Reasoning-Driven Prompt Agent. It debuted at #8 on the Artificial Analysis Text-to-Image Arena, a useful reference point for buyers evaluating generative-imagery vendors against an open-weights baseline.
Read at huggingface.co →
By Yardstick Research
AI Updates — Week of May 1, 2026
The week's signal was platform-side. Microsoft and OpenAI restructured the next phase of their partnership, Microsoft 365 E7 and Agent 365 reached general availability, and AWS Bedrock added 18 fully managed open-weight models.
Frontier labs
The next phase of the Microsoft-OpenAI partnership (Microsoft Official Blog)
Restructures the multi-year arrangement that has anchored OpenAI's compute and Microsoft's frontier-AI distribution. The detail buyers should care about: which OpenAI APIs land in Azure first, and on what timeline.
Read at blogs.microsoft.com →
Anthropic acquires Stainless (Anthropic News)
Stainless built the SDK and MCP-server tooling that powers every official Anthropic API client. The acquisition pulls developer-experience infrastructure in-house, which usually shows up downstream as faster SDK iteration and tighter integration between Claude and the MCP ecosystem.
Read at anthropic.com →
Infrastructure & platforms
Accelerating Frontier Transformation with Microsoft partners (Microsoft Official Blog)
Frames Microsoft 365 E7 (the Frontier Suite) and Microsoft Agent 365 — both reaching general availability May 1 — as the first Microsoft SKU bundle where AI-agent capacity is sold as a per-seat line item, not an add-on.
Read at blogs.microsoft.com →
Amazon Bedrock adds 18 fully managed open weight models (AWS News Blog)
New additions include Mistral Large 3 and Ministral 3, behind the same managed inference and governance that Bedrock provides for proprietary models. Closes the buyer-side argument that managed inference is closed-model-only.
Read at aws.amazon.com →
By Yardstick Research
Tip us on a source
A source we should be tracking?
Email a link to a frontier-lab blog, a standards body, or a research feed worth adding to the source list. Vendor pitches belong on the submit-vendor page instead.
hello@yardstickresearch.app