Yardstick Research tear-sheet / AI sales cohort

Methodology · how we score · rubric weights in plain sight · vendors received this sheet seven days before publication and could flag factual errors, never rankings

11x

Founded: 2022 [CITED — Crunchbase, GlobeNewswire]
Headquarters: San Francisco, CA (relocated from London in 2024) [CITED — GlobeNewswire Series A]
Funding: $76M+ total disclosed — Seed (~$2M, 2023, Project A Ventures); Series A ($24M, Sept 16, 2024, Benchmark; ~$90M post-money per TechCrunch); Series B ($50M, Nov 11, 2024, Andreessen Horowitz lead; ~$320–350M post-money). Investor stack includes Quiet Capital, SV Angel, Abstract Ventures, Lux Capital, HubSpot Ventures, 20VC, Activant. [CITED — TechCrunch, Bloomberg, GlobeNewswire]
Leadership: CEO transition May 5, 2025 — founder Hasan Sukkar stepped down; CTO Prabhav Jain became CEO; Sukkar moved to non-executive chairman. Stated rationale: 0→1-to-scaling-stage transition. [CITED — TechCrunch May 5, 2025; Bloomberg; Tech.eu; Sifted]
Archetype: Autonomous BDR replacement — Alice (AI SDR), Julian (AI Phone Rep, formerly "Mike"; renamed May 7, 2025), Platform X (no-code digital-worker builder).

ICP (vendor-stated): "Digital workers, human results." Sold as a replacement for a junior SDR seat (not augmentation), with Julian as a 24/7 inbound/outbound voice rep. [CITED — 11x.ai homepage, /worker/alice, /worker/julian]

Headline numbers

| Observation | Value | Evidence |
|---|---|---|
| Total score (0–100) | 32.5 / 100 | weighted sum of dimension scores |
| Vendor-claimed ARR (Series B, 2024) | "Approaching $10M" — directly contested in subsequent reporting | CITED — TechCrunch (Sept 2024); see Material risk note below |
| Production-scale numbers (Jan 2025, 11x staff at LangChain Interrupt) | ~2M leads sourced cumulative; ~3M messages sent; ~21K replies; ~2% reply rate "matching human SDR performance" | CITED — V2 Sherwood Callaway + Keith Fearon, LangChain Interrupt 2025; ZenML LLMOps Database summary. Note: platform-aggregate including 11x's own internal use, not a customer outcome |
| Mike Donohue (CRO) on-record claim | Alice + Julian "generate over 90% of [11x's] pipeline"; "Traditional BDR hiring may soon be obsolete" | CITED — V5 Revenue Leadership Podcast w/ Kyle Norton, ep. 33, 2025. Self-reported internal use, not a customer-deployment outcome |
| Customer-evidence on video | Essentially zero — across 14 videos surveyed, no on-camera 11x customer states a deployment outcome with named result | CITED — demos/11x.md §7; Yardstick desk research 2026-04-29 |
| Personalization grade (sample prospect) | Pre-test hypothesis: capability-of-platform real (multi-agent researcher + positioning + LinkedIn writer + email writer per V2); output-quality reported by independent reviewers as templated-feeling at scale ("name, company, and a single research hook, not the multi-paragraph 'uncanny' personalization that buyers expect from the price tag"). Capability ≠ quality. | MEASURED (pending May 5–18 test pass) |
| Setup time (sign-up → first send) | 1–4 weeks reported by independent reviewers; time-to-first-meaningful-meeting 6–12 weeks | CITED — V9, V11 reviewers; reply.io review; coldreach review |
| Reply rate | v2 — held-out test data forthcoming | |
| Cost per booked meeting | v2 — held-out test data forthcoming. Modeled: at ~$50K/year for one Alice seat, even at an aggressive 30 booked meetings/month, modeled cost-per-meeting is ~$140 — high end of cohort. | CITED — 11x.md §4 modeled estimate |
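
As a quick arithmetic check on the platform-aggregate row above, the sketch below divides the cited reply count by each of the two cited volumes. The sheet does not state which denominator the ~2% claim uses, so both quotients are shown; the variable names are illustrative, and the inputs are the rounded figures quoted from V2, not measured Yardstick data.

```python
# Rounded platform-aggregate figures as cited in the V2 row (approximate).
leads_sourced = 2_000_000
messages_sent = 3_000_000
replies = 21_000

# The denominator behind the "~2% reply rate" claim is not stated in the
# source, so compute the ratio against both cited volumes.
replies_per_message = replies / messages_sent
replies_per_lead = replies / leads_sourced

print(f"replies / messages sent: {replies_per_message:.2%}")
print(f"replies / leads sourced: {replies_per_lead:.2%}")
```

Neither quotient reproduces ~2% from the rounded inputs, so the vendor figure presumably rests on a different denominator (for example, prospects actually contacted); that is an assumption, and it is one reason the v2 held-out reply-rate measurement should state its denominator explicitly.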

Dimension scores (0–4) — verbatim from v1 scoring file

| Dimension | Weight | Score (0–4) | Evidence |
|---|---|---|---|
| Personalization quality | 25% | 1/4 | "Generic messaging at scale — multiple reviewers describe Alice's emails as templated with shallow personalization (name, company), not the 'uncanny' tier you get from Clay+Lemlist or Lavender-coached human SDRs" (11x.md §8); demos: "the output is reported as templated-feeling at scale … Reviewer in V9 quantifies Artisan as 'approximately 4x higher response rates than 11x' and notes 11x 'frequently struggles with spam issues'" (demos/11x.md §4). |
| Deliverability infrastructure quality | 20% | 2/4 | Native warmup advertised, multi-channel autonomous send (email + LinkedIn) per 11x.md §3; but "campaigns breaking, analytics not loading, integration failures" — reply.io and coldreach reviews (11x.md §8). 11x is a sender-of-record (unlike Outreach), so scores higher than non-senders, but reliability complaints are real. |
| CRM integration depth | 15% | 2/4 | "Native: Salesforce, HubSpot, Pipedrive, Zoho. Bi-directional sync (emails, replies, call status, transcripts, custom fields)" — 11x.ai/platform/integrations/crm-integration (11x.md §5). Pending MEASURED check; vendor claim only. |
| Cost-per-seat efficiency | 15% | 0/4 | "Most expensive in the cohort: ~$40K–$60K+/year per agent [THIRD-PARTY], with no free or self-serve trial" (11x.md tear-sheet §11); Vendr median $40,125/yr; "at ~$50K/year for one Alice seat, even at an aggressive 30 booked meetings/month, modeled cost-per-meeting is ~$140" (11x.md §4). 0 on cost-per-seat efficiency (~$50K/agent/yr). |
| Setup time | 10% | 1/4 | "Setup is 1–3 weeks, not 'minutes' … Time-to-first-meeting is closer to 6–12 weeks" (11x.md §8); demos confirm "1–4 week ramp involving domain warm-up, ICP definition, knowledge-base ingestion, and persona configuration before Alice sends the first email" (demos/11x.md §3). |
| UI / UX | 10% | 2/4 | "Live UI elements visible in V1, V9, V11 (third-party reviewers' screen-recordings) — ICP/persona builder, campaign dashboard, prospect list view, message preview pane, CRM-sync status" (demos/11x.md §3). Specific control names UNKNOWN; "bugs / platform reliability — campaigns breaking, analytics not loading" reported. |
| Data accuracy | 5% | 1/4 | "Account research and persona-tuned messaging across 21+ enrichment data sources [VENDOR-CLAIMED]" (11x.md §3); no independent benchmark; one customer reportedly "ran a one-month trial in mid-January–mid-February and that '11x's product performed significantly worse than our SDR employees'" — TechCrunch March 24, 2025 (11x.md §7). |
| Total | 100% | 32.5 / 100 | |
Reply rate (v2) v2 forthcoming
Cost per booked meeting (v2) v2 forthcoming
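
The 32.5 / 100 total can be reproduced from the weights and scores in the table above. The sketch below assumes the total is the weighted sum of the 0–4 scores rescaled to 0–100 (divide by the 4-point maximum); that formula is inferred from the numbers, not quoted from the scoring file, and the names `DIMENSIONS` and `total_score` are illustrative.

```python
# (weight in percent, score out of 4), copied from the dimension table.
DIMENSIONS = {
    "Personalization quality": (25, 1),
    "Deliverability infrastructure quality": (20, 2),
    "CRM integration depth": (15, 2),
    "Cost-per-seat efficiency": (15, 0),
    "Setup time": (10, 1),
    "UI / UX": (10, 2),
    "Data accuracy": (5, 1),
}
MAX_SCORE = 4  # each dimension is scored 0-4


def total_score(dimensions, max_score=MAX_SCORE):
    """Weighted sum of dimension scores, rescaled to a 0-100 total.

    Assumed formula: sum(weight_pct * score) / max_score.
    """
    weight_sum = sum(weight for weight, _ in dimensions.values())
    if weight_sum != 100:
        raise ValueError(f"weights must sum to 100, got {weight_sum}")
    return sum(weight * score for weight, score in dimensions.values()) / max_score


print(total_score(DIMENSIONS))  # 32.5
```

Because the v2 rows (reply rate, cost per booked meeting) carry no weight yet, they are left out of the calculation.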

Pricing tiers (CITED — paid-only; no free tier, no free trial, no self-serve signup; all numbers reconstructed from third-party reports)

| Tier (de facto) | Annual cost (reported) | Notes | Evidence |
|---|---|---|---|
| Free / trial | None | Sales-gated only | CITED — 11x.ai pricing flow |
| Alice (single agent) | ~$50,000–$60,000 / year (~$5,000/mo) | Per-agent, not per-seat | ESTIMATED — MarketBetter 2026 pricing breakdown; SyncGTM review |
| Alice — Vendr median customer | $40,125 / year median; $38,250–$65,550 typical range | Negotiated outcomes | ESTIMATED — Vendr marketplace |
| Julian (AI phone, standard inbound qualification) | ~$48,000–$72,000 / year (~$4,000–$6,000/month) | Voice agent | ESTIMATED — Landbase 2026 |
| Multi-agent / enterprise | Custom; reports of $150K+ ACV multi-year | Multi-year (24-month) terms now used routinely as a discount lever | ESTIMATED — SyncGTM review |

Contract terms reported [CITED — TechCrunch, third-party reviews]:
- Annual minimum is the floor; 24-month terms now common as a discount lever.
- 3-month break clause referenced in former-employee accounts is the period after which contracts "stick" for ARR purposes (central to the inflated-ARR allegation in Material risk note).
- Cancellation friction is a recurring complaint in user reviews (Trustpilot, G2).

Integrations

AI feature stack (THIRD-PARTY-VIDEO-OBSERVED upgrades from demo dossier)

| Feature | What it does | Evidence |
|---|---|---|
| Multi-agent supervisor topology (researcher + positioning + LinkedIn writer + email writer) | Supervisor agent routes each lead through researcher → positioning report → channel-specific writer | THIRD-PARTY-VIDEO-OBSERVED — V2 LangChain Interrupt 2025 (first-party engineering keynote by Sherwood Callaway + Keith Fearon); ZenML LLMOps Database summary |
| Multi-channel outbound (email + LinkedIn first-class) | Distinct sub-agents per channel | THIRD-PARTY-VIDEO-OBSERVED — V2 |
| Account research / persona-tuned messaging | Researcher + positioning report sub-agents | THIRD-PARTY-VIDEO-OBSERVED — V2 |
| Knowledge-base / company-document ingestion (LlamaParse-backed) | Customers upload one-pagers, decks, case studies; powers researcher + positioning agents | THIRD-PARTY-VIDEO-OBSERVED — V2 + LlamaIndex companion content |
| Julian (voice agent) — 30+ languages | Inbound + outbound voice 24/7 | THIRD-PARTY-VIDEO-OBSERVED (partial) — V4 Sukkar on Bloomberg Nov 11, 2024 confirms 30-language support |
| Sub-20-second response time (inbound form fills) | Latency claim | VENDOR-CLAIMED — not heard on any surveyed video |
| 21+ enrichment data sources | Personalization corpus | VENDOR-CLAIMED — number not visible in any video |
| Platform X (no-code digital-worker builder) | Compose new agents on 11x's primitives | VENDOR-CLAIMED — no demo found in 14 videos surveyed |

Material risk note — March 2025 TechCrunch report on customer/case-study quality

This is the single most important non-product factor in any 2026 11x evaluation. It is captured neutrally: Yardstick does not speculate beyond what the reporting describes and does not use the word "fraud," because that legal characterization is not Yardstick's call. 11x has right of reply.

Primary source: Marina Temkin and Charles Rollet, "a16z- and Benchmark-backed 11x has been claiming customers it doesn't have," TechCrunch, March 24, 2025. Reporting drew on "nearly two dozen sources" including investors, current employees, and former employees. [CITED]

Specific allegations as reported by TechCrunch:

  1. Customer-logo misrepresentation. As of March 21, 2025, 11x's website continued to display logos of companies that said they were not 11x customers — including Airtable (which said the product "was never used in production") and ZoomInfo (which stated it ran a one-month trial mid-Jan–mid-Feb and that "11x's product performed significantly worse than our SDR employees"). ZoomInfo had reportedly demanded 11x stop using its logo.
  2. Very high churn. A current employee was quoted: "We were losing 70-80% of customers that came through the door."
  3. ARR inflation via the 3-month break clause. Per former-and-current employee accounts, 11x counted contracts in headline ARR before the 3-month break clause expired. TechCrunch reported the company "might say it had $14 million in annual recurring revenue when in reality, the number of contracts that passed the three-month trial period totaled only about $3 million."

Corroborating coverage:
- Futurism: "Startup Reportedly Claimed Fake Clients as Its AI-Powered Sales Bot Flailed" (https://futurism.com/ai-sales-bot-11x)
- Sifted: 11x toxic-culture follow-up (https://sifted.eu/articles/11x-toxic-culture-ceo-working-nights-a16z)
- Techmeme summary: https://www.techmeme.com/250324/p29

11x's response: publicly limited. The CEO transition six weeks later (May 5, 2025) was framed by the company as a scaling-stage move, not a response to the reporting. No on-camera response from 11x leadership exists — across 14 videos surveyed, no major broadcast outlet (Bloomberg, CNBC, WSJ video) produced a follow-up; the story moved primarily in text. [CITED — demos/11x.md §6]

Yardstick stance: report TechCrunch's findings factually with citations; do not speculate beyond them; flag the right-of-reply window so 11x can respond on the record before publication. Buyers should ask 11x directly during diligence about (a) which customer logos on the site are paying past the break clause, (b) net revenue retention by cohort, and (c) median contract length actually executed in 2025.

Editorial assessment

11x is the most-funded entrant in the autonomous-AI-SDR cohort with the strongest brand-name investor stack (a16z + Benchmark, $76M+ raised) and a multi-product surface (Alice + Julian + Platform X) that supports both email/LinkedIn and voice motions. The architectural promise is real: V2's first-party engineering keynote at LangChain Interrupt 2025 documents a credible multi-agent topology (supervisor + researcher + positioning + per-channel writers) that distinguishes 11x from template-merge competitors.

But two findings dominate the 2026 evidence stack:

  1. Customer-evidence on video is essentially zero. Across 14 videos surveyed (including vendor channel content, founder interviews, two TechCrunch news segments, and 8+ third-party reviews), no on-camera 11x customer states a deployment outcome with a named result. Every quantitative outcome claim on video is 11x's own internal use (V5 Donohue's "90% of pipeline," V2 Callaway/Fearon's 2% platform-aggregate reply rate, V4 Sukkar's "$4M pipeline") or third-party reviewers' own short-window tests. For a vendor charging $40K–$60K+/year and historically using customer logos contested by those customers, the absence is itself a credibility signal. Cite neutrally; do not editorialize.

  2. The capability-vs-output gap is real. Independent video reviewers (V9, V11, V12) and an independent tester writing on Medium converge on "templated-feeling at scale despite per-prospect architecture." The V9 reviewer reports Artisan achieving approximately 4× higher response rates than 11x and notes that 11x "frequently struggles with spam issues." This is one reviewer's own test, not a controlled cohort study, but it is consistent with V11 and V12. The personalization-capability claim from the parent dossier survives as a capability-of-the-platform claim but does not upgrade to a quality-of-output claim.

Cost economics are punishing. At ~$50K/year per Alice seat, even at an aggressive 30 booked meetings/month, modeled cost-per-meeting is ~$140 — sitting at the high end of the cohort. Lower volumes balloon $/meeting fast. This is the basis of the 0/4 cost-per-seat efficiency score (~$50K/agent/yr).
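
To make the sensitivity to volume concrete, the sketch below recomputes the modeled cost per meeting from the ~$50K/year figure. Only the 30-meetings/month case comes from this sheet; the lower volumes are hypothetical inputs chosen for illustration, and `cost_per_meeting` is an illustrative helper, not part of any Yardstick tooling.

```python
def cost_per_meeting(annual_cost: float, meetings_per_month: float) -> float:
    """Annual seat cost divided by the number of meetings booked in a year."""
    if meetings_per_month <= 0:
        raise ValueError("meetings_per_month must be positive")
    return annual_cost / (meetings_per_month * 12)


ANNUAL_COST = 50_000  # ~$50K/year for one Alice seat, as modeled in the text

# 30/month is the sheet's "aggressive" case; 20, 10 and 5 are hypothetical.
for meetings in (30, 20, 10, 5):
    cost = cost_per_meeting(ANNUAL_COST, meetings)
    print(f"{meetings:>2} meetings/month -> ${cost:,.0f} per meeting")
```

At 30 meetings/month the result is about $139, which the sheet rounds to ~$140; halving or quartering the volume doubles or quadruples the figure, since cost per meeting scales inversely with meetings booked.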

Setup-to-results timeline is 6–12 weeks, not days. Foundation/Pilot-stage teams will burn budget before seeing pipeline. The Yardstick test-pass methodology must distinguish the vendor's setup window (signup → access) from the Yardstick stopwatch (access → first send); both should be reported in the final scoring.
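
One way to keep the two intervals separate in the test-pass log is sketched below. The record type, field names, and dates are all hypothetical placeholders, not Yardstick results; the point is only that the two durations are derived from three timestamps and reported side by side.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SetupTimeline:
    """Three timestamps for one vendor test pass (hypothetical schema)."""
    signed_up: date        # contract or sign-up date
    access_granted: date   # vendor hands over a working account
    first_send: date       # first message actually sent

    @property
    def vendor_setup_days(self) -> int:
        """Vendor's setup window: signup -> access."""
        return (self.access_granted - self.signed_up).days

    @property
    def yardstick_stopwatch_days(self) -> int:
        """Yardstick stopwatch: access -> first send."""
        return (self.first_send - self.access_granted).days


# Placeholder dates for illustration only; not measured data.
example = SetupTimeline(
    signed_up=date(2026, 1, 5),
    access_granted=date(2026, 1, 19),
    first_send=date(2026, 1, 26),
)
print(example.vendor_setup_days, example.yardstick_stopwatch_days)
```

Reporting both numbers avoids attributing vendor-side onboarding delay to the tester, or the reverse.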

Best for

Stage fit (verbatim from scoring file):
- Foundation: no — "Apollo, Reply, Lemlist all serve this segment far better and ~10–20× cheaper" (11x.md §9).
- Pilot: no — 6–12-week time-to-first-meeting plus $50K minimum burns budget before pipeline.
- Scale: conditional — mid-market or enterprise with established ICP + working sequences + 8–12 weeks of tuning capacity.
- Optimization: conditional — only if buyer has appetite to test seat-replacement (not seat-augmentation) on Salesforce/HubSpot.

Best-fit company profile:
- Company size: Mid-market+ with named RevOps owner
- Industry: B2B SaaS with broad ICP and validated messaging
- Sales motion: Seat-replacement (Alice + Julian) — autonomous outbound + voice
- Annual tooling budget: $50K–$200K+

Top strength (1 sentence): Most-funded entrant in the autonomous-AI-SDR cohort with the strongest brand-name investor stack (a16z + Benchmark) and a multi-product surface (Alice + Julian + Platform X), backed by the most architecturally serious public engineering disclosure in the category (V2 LangChain Interrupt 2025).

Top gap (1 sentence): March 2025 TechCrunch reporting on customer-logo misrepresentation, ~70–80% reported churn, and ARR-inflation allegations, followed by the May 2025 CEO transition, raises the evidence bar on every other claim, and customer-outcome evidence on video is essentially zero across 14 surveyed videos.

Not for: Foundation/Pilot teams without defined ICP, validated messaging, or working warm-domain pool — 11x amplifies what's there; if inputs are weak, output is generic spam at $50K/year. Lean SMB. Buyers who need contractual flexibility (month-to-month, escape hatches, public pricing). Buyers nervous about vendor-stability risk after March 2025 reporting (legitimate concern; mitigate via shorter terms and named-customer references).

Right of reply

11x will receive the draft tear-sheet seven calendar days before publication. The vendor has that seven-day window to flag factual errors and may submit an on-the-record response, which Yardstick will print verbatim if it arrives within the window. Stance: factual citations to named published reporting (TechCrunch, Sifted) stay; quotes from named-byline sources stay; Yardstick will not speculate beyond what the reporting describes. [Vendor response will be populated post-window or recorded as "no response received within window."]

Affiliate disclosure

Yardstick Research does not currently have an active affiliate relationship with 11x.ai. 11x has no public self-serve signup; all paid acquisition runs through their sales team.

Sources (URLs verified to resolve, 2026-04-29)

Vendor / official:
- 11x.ai homepage: https://www.11x.ai/
- Alice product page: https://www.11x.ai/worker/alice
- Julian product page: https://www.11x.ai/worker/julian
- CRM integrations: https://www.11x.ai/platform/integrations/crm-integration
- Knowledge-base launch sequence: https://www.11x.ai/launch-sequence/alice-knowledge-base

Funding / corporate:
- Series A coverage (Sept 16, 2024): https://techcrunch.com/2024/09/16/ai-digital-employee-startup-11xai-raises-24m-led-by-benchmark/
- Series A press: https://www.globenewswire.com/news-release/2024/09/16/2946867/0/en/11x-Secures-24-Million-Series-A-Funding-Led-by-Benchmark-to-Create-the-Future-of-Digital-Work.html
- Series B coverage (Nov 11, 2024): https://www.bloomberg.com/news/articles/2024-11-11/andreessen-horowitz-leads-50-million-funding-for-ai-startup-11x
- Series B press: https://www.globenewswire.com/news-release/2024/11/11/2978485/0/en/11x-Raises-a-50-Million-Series-B-Led-by-Andreessen-Horowitz-to-Accelerate-the-Era-of-Digital-Workers.html

TechCrunch March 2025 controversy (primary):
- TechCrunch — Marina Temkin + Charles Rollet (March 24, 2025): https://techcrunch.com/2025/03/24/a16z-and-benchmark-backed-11x-has-been-claiming-customers-it-doesnt-have/
- Futurism: https://futurism.com/ai-sales-bot-11x
- Sifted toxic-culture follow-up: https://sifted.eu/articles/11x-toxic-culture-ceo-working-nights-a16z
- OnlyCFO analysis: https://www.onlycfo.io/p/ai-company-accused-of-fraud
- Pivot to AI: https://pivot-to-ai.com/2025/03/25/ai-sales-startup-11x-claims-customers-it-doesnt-have-for-software-that-doesnt-work/

CEO transition (May 2025):
- TechCrunch (May 5, 2025): https://techcrunch.com/2025/05/05/11x-ceo-hasan-sukkar-steps-down/
- Bloomberg: https://www.bloomberg.com/news/articles/2025-05-05/ceo-of-andreessen-horowitz-backed-ai-startup-11x-steps-down
- Tech.eu: https://tech.eu/2025/05/06/founder-of-a16z-backed-11x-steps-down-as-ceo/
- Sifted: https://sifted.eu/articles/11x-ceo-steps-down

Engineering / leadership video:
- LangChain Interrupt 2025 (Callaway + Fearon): https://www.youtube.com/watch?v=fegwPmaAPQk
- ZenML LLMOps DB summary: https://www.zenml.io/llmops-database/rebuilding-an-ai-sdr-agent-with-multi-agent-architecture-for-enterprise-sales-automation
- AI + a16z podcast w/ Prabhav Jain (March 20, 2025): https://www.youtube.com/watch?v=jDb8IF_BrA0
- Bloomberg Technology w/ Sukkar (Nov 11, 2024): https://www.bloomberg.com/news/videos/2024-11-11/andreessen-horowitz-puts-50m-in-ai-sales-video
- Revenue Leadership Podcast ep. 33 w/ Mike Donohue: https://www.youtube.com/watch?v=4Y2_zzxa1mE
- Lawrence Wu LangChain Interrupt recap: https://lawwu.github.io/posts/2025-05-23-langchain-interrupt-2025-recap/

Voice stack:
- Cartesia case study: https://cartesia.ai/customers/11x
- Ultravox case study: https://www.ultravox.ai/case-studies/how-11x-outsourced-voice-ai-innovation-to-dominate-their-market

Pricing aggregators (ESTIMATED):
- Vendr marketplace: https://www.vendr.com/marketplace/11x
- MarketBetter pricing 2026: https://marketbetter.ai/blog/11x-ai-pricing-2026/
- MarketBetter review: https://marketbetter.ai/blog/11x-ai-review-2026/
- Landbase pricing 2026: https://www.landbase.com/blog/11x-ai-pricing
- SyncGTM review: https://syncgtm.com/blog/11x-ai-review

Third-party reviews:
- G2: https://www.g2.com/products/11x/reviews
- Trustpilot: https://www.trustpilot.com/review/11x.ai
- Reply.io review: https://reply.io/blog/11x-ai-review/
- Coldreach review: https://coldreach.ai/blog/11x.ai-review
- Independent Medium tester: https://saassalesdirector.medium.com/11x-ai-review-worth-the-hype-a33e89a9716f


Compiled for Yardstick Research 2026 Yardstick Report tear-sheets-final, 2026-04-29. Schema: aligned with software-research-2026q2.md tear-sheet 8. All claims labelled. Reply rate and cost-per-booked-meeting marked v2 — held-out test data forthcoming. TechCrunch reporting cited via Temkin/Rollet byline, neutrally. 11x given right-of-reply.