The Yardstick Feed

AI updates, frontier-model releases, and standards work.

A weekly digest for B2B AI buyers: capability and safety announcements from frontier labs, AI standards and regulatory updates, agent-reliability research, and Yardstick methodology notes. Sourced from a fixed list of labs, regulators, and research institutions.

  1. Weekly Digest

    AI Updates — Week of May 22, 2026

    Frontier-lab consolidation and new oversight signals both showed up this week. Cohere shipped Command A+ and announced two strategic MOUs the same day, DeepMind released its Co-Scientist research-agent system, AWS Bedrock added per-request cost attribution, and the UK AI Security Institute published its Frontier AI Trends Report.

    Frontier labs

    Infrastructure & platforms

    Standards & regulation

    • Frontier AI Trends Report (UK AI Security Institute)

      An assessment of the AI oversight landscape, its robustness to capability advances, and the pathways that could lead to its degradation. Pairs with NIST's AI Risk Management Framework (AI RMF) as buyer-facing evidence on whether a vendor's model-safety claims hold up under independent evaluation.

      Read at aisi.gov.uk →

    By Yardstick Research

  2. Weekly Digest

    AI Updates — Week of May 15, 2026

    Distribution and oversight dominated the week. OpenAI stood up a Deployment Company to embed engineers with enterprise customers, the UK AI Security Institute shipped two safety papers, and Anthropic announced a four-year, $200M partnership with the Gates Foundation.

    Frontier labs

    Standards & regulation

    • Alignment research paper (UK AI Security Institute)

      AISI's alignment work probes whether frontier-model behavior actually matches stated objectives under adversarial conditions. Useful as a procurement reference when a vendor claims its agent is safe by design.

      Read at aisi.gov.uk →
    • Cyber autonomous systems capabilities research (UK AI Security Institute)

      AISI's narrow-cyber suite finds that the length of tasks frontier models can autonomously complete in cybersecurity scenarios is doubling every few months. The direct implication for any AI buyer doing threat modeling: the offensive-capability curve outpaces typical procurement cycles.

      Read at aisi.gov.uk →

    By Yardstick Research

  3. Weekly Digest

    AI Updates — Week of May 8, 2026

    Three deployment-relevant shifts. OpenAI released GPT-5.5 Instant with a measured drop in hallucination rate, AWS Bedrock pushed AgentCore into GovCloud, and HiDream-O1-Image, a state-of-the-art image model, was released with open weights.

    Frontier labs

    Infrastructure & platforms

    Research

    By Yardstick Research

  4. Weekly Digest

    AI Updates — Week of May 1, 2026

    The week's signal was platform-side. Microsoft and OpenAI restructured the next phase of their partnership, Microsoft 365 E7 and Agent 365 reached general availability, and AWS Bedrock added 18 fully managed open-weight models.

    Frontier labs

    • The next phase of the Microsoft-OpenAI partnership (Microsoft Official Blog)

      Restructures the multi-year arrangement that has anchored OpenAI's compute and Microsoft's frontier-AI distribution. The detail buyers should care about: which OpenAI APIs land in Azure first, and on what timeline.

      Read at blogs.microsoft.com →
    • Anthropic acquires Stainless (Anthropic News)

      Stainless built the SDK and MCP-server tooling that powers every official Anthropic API client. The acquisition pulls developer-experience infrastructure in-house, which usually shows up downstream as faster SDK iteration and tighter integration between Claude and the MCP ecosystem.

      Read at anthropic.com →

    Infrastructure & platforms

    By Yardstick Research

Tip us on a source

A source we should be tracking?

Email a link to a frontier-lab blog, a standards body, or a research feed worth adding to the source list. Vendor pitches belong on the submit-vendor page instead.

hello@yardstickresearch.app