The May 2026 Joint Ventures Between AI Labs and Private Equity

Main essay: The Substrate Inversion

2026-05-08

1. Inner mechanics of the JV/DeployCo structures

1.1 Background and structure of the deals

On May 4, 2026, two announcements landed within hours of each other.

Anthropic, Blackstone, Hellman and Friedman, and Goldman Sachs announced a joint venture to build a new enterprise AI services company, with $1.5 billion in committed capital. The principal anchor commitments are approximately $300 million each from Anthropic, Blackstone, and Hellman and Friedman. Goldman Sachs and General Atlantic each committed approximately $150 million. The remaining capital comes from a consortium that includes Apollo Global Management, Leonard Green, GIC (Singapore), and Sequoia Capital. The new entity is described in Anthropic’s announcement as “a standalone entity with Anthropic engineering and partnership resources embedded directly within its team.” Goldman’s Marc Nachmann publicly framed the operating model as embedding engineers inside companies “to redesign workflows and integrate AI into core processes,” distinguishing it from a traditional consulting firm. A CEO has not been publicly announced. The proving ground is the founding investors’ portfolio companies, especially in healthcare services, manufacturing, financial services, retail, real estate, and infrastructure.
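The capital stack described above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the approximate figures quoted in the announcement coverage; the variable names are illustrative, and the consortium remainder is derived, not separately reported.

```python
# Approximate commitments in $ millions, as quoted above.
TOTAL_COMMITTED = 1_500
anchors = {
    "Anthropic": 300,
    "Blackstone": 300,
    "Hellman and Friedman": 300,
    "Goldman Sachs": 150,
    "General Atlantic": 150,
}

named_total = sum(anchors.values())
# Derived figure: what is left for Apollo, Leonard Green, GIC, Sequoia, and others.
consortium_remainder = TOTAL_COMMITTED - named_total
consortium_share = consortium_remainder / TOTAL_COMMITTED

print(f"Named commitments: ${named_total}M")
print(f"Implied consortium remainder: ${consortium_remainder}M ({consortium_share:.0%})")
```

On these figures the five named investors account for four-fifths of the vehicle, with the rest spread across the consortium.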

The same afternoon, Bloomberg reported that OpenAI had finalized a separate vehicle called The Deployment Company, internally and in trade press referred to as DeployCo. It is a Delaware LLC valued at $10 billion at close. OpenAI contributes an initial $500 million of equity with an option to deploy up to a further $1 billion, for a potential total OpenAI commitment of $1.5 billion. The PE syndicate puts in approximately $4 billion across a five-year window. There are 19 investors total. Anchor investors named in Bloomberg, FT, and Reuters reporting are TPG, Brookfield Asset Management, Bain Capital, Advent International, and Goanna Capital. Bloomberg reported that the 19 also include Dragoneer, SoftBank, and a mix of consulting firms whose identities have not been disclosed. Brad Lightcap, recently transitioned from OpenAI COO into a special projects role reporting to Sam Altman, runs DeployCo as CEO. OpenAI retains super-voting shares and majority economic and strategic control. The forward-deployed engineering team is being hired on the Palantir model, with dozens of engineers as of early May.

The most asymmetric structural detail in either announcement is a return commitment in DeployCo. The Financial Times reported that OpenAI guarantees its PE backers a 17.5% annualized return over a five-year window. Reuters and Bloomberg cited the FT figure but did not independently confirm it. The legal characterization, per Private Equity Insights and LetsDataScience, is closer to a subordinated-credit-with-equity-kicker structure than a standard equity contribution. Anthropic’s joint venture has no publicly reported equivalent floor.
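To give a sense of scale for the reported floor, the sketch below compounds 17.5% annually over five years. It assumes annual compounding and that the full approximately $4 billion of PE capital is drawn on day one; neither assumption is confirmed by the reporting, and a phased drawdown across the window would produce a smaller figure.

```python
def guaranteed_multiple(annual_rate: float, years: int) -> float:
    """Multiple of invested capital implied by an annually compounding return floor."""
    return (1.0 + annual_rate) ** years

FLOOR_RATE = 0.175     # 17.5% annualized, as reported by the FT
WINDOW_YEARS = 5       # five-year window, as reported
PE_CAPITAL_BN = 4.0    # approximate PE syndicate commitment, $ billions

multiple = guaranteed_multiple(FLOOR_RATE, WINDOW_YEARS)
# Assumption: all capital is drawn up front, so this is an upper bound
# on the dollar value implied by the floor under annual compounding.
floor_value_bn = PE_CAPITAL_BN * multiple

print(f"Implied multiple over {WINDOW_YEARS} years: {multiple:.2f}x")
print(f"Floor value on ${PE_CAPITAL_BN:.0f}B drawn up front: ${floor_value_bn:.2f}B")
```

Under these assumptions the floor implies returning roughly 2.2x the capital contributed, which is why the structure reads more like credit than equity.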

1.2 How the joint ventures will operate

Both vehicles will acquire AI services and consulting firms, embed engineers from the lab parent inside acquired companies and selected portfolio companies of the financial backers, and rebuild operational workflows around AI agents over multi-year engagements. Reuters reported in early May that both ventures are in talks to acquire AI services firms specifically to add hundreds of engineers and consultants to their deployment capacity.

The operating model is illustrated by Anthropic’s own example. A typical engagement starts with a small team of engineers from the joint venture working alongside Anthropic Applied AI staff inside a customer company. In the healthcare services example given in Anthropic’s announcement, this means engineers sitting with clinicians and IT staff to identify where time disappears in workflows like documentation, coding, prior authorization, and compliance, then building tools around the answer.

This differs from three prior PE-AI hybrid structures. General Catalyst’s Creation Strategy program co-founds holding companies with industry experts and has those new companies acquire 5-15 services targets, but the labs are not shareholders. Sequence Holdings, seeded by 8VC and led by alumni of Scale, Cognition, and Lone Pine, is a venture-style multi-vertical operator. Thrive Holdings, OpenAI’s December 2025 partnership with Joshua Kushner’s firm, is an equity-only embedded-engineer arrangement focused on accounting (Crete) and IT services (Shield). The May 4 vehicles differ from each. In the Anthropic case, the lab is a co-founder with equity and supplies engineers from its own team. In the OpenAI case, OpenAI keeps super-voting shares and operating control, and the structure is underwritten by the reported return floor.

1.3 Unit economics

The financial backers and the labs are allocating capital primarily to the engineering bench, integration work, and acquired-company integration costs. Inference costs paid back to the lab parents represent a small and declining share of total deployment economics.

Forward-deployed engineer compensation at lab-JV scale runs approximately $400,000-$550,000 in total compensation for mid-to-senior roles, per the Hashnode 2026 compensation guide. Levels.fyi data shows Palantir Forward-Deployed Software Engineers at $171,000-$415,000 total comp, with median around $215,000-$238,000 and staff-level engineers clearing $630,000. Fully loaded with overhead, the cost per engineer in the JV operating context is in the $500,000-$700,000 range.

The deployment pattern reported across Sierra, Harvey, and Crete is 3-5 forward-deployed engineers per acquired services firm during the ramp phase, decaying to 1-2 in steady state. At fully loaded cost, a five-engineer ramp consumes $2.5-3.5 million per year per portfolio company in engineering bench cost alone. Across 50-100 mid-market platform companies, the implied target portfolio for a $1.5-10 billion vehicle, that is $125-350 million per year while every company is at the five-engineer ramp level. At 1-2 engineers per company in steady state, the same arithmetic gives approximately $25-140 million per year.
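The bench-cost arithmetic can be reproduced directly. The sketch below is illustrative only: it multiplies the staffing, per-engineer cost, and portfolio-size ranges quoted above, computing the five-engineer ramp and the 1-2 engineer steady state separately. The function name and the low/high tuple convention are assumptions made for the example.

```python
def bench_cost_range(engineers, cost_per_engineer_m, companies):
    """Return (low, high) annual bench cost in $ millions.

    Each argument is a (low, high) tuple; the low ends are multiplied
    together, and likewise the high ends.
    """
    low = engineers[0] * cost_per_engineer_m[0] * companies[0]
    high = engineers[1] * cost_per_engineer_m[1] * companies[1]
    return low, high

COST_PER_ENGINEER_M = (0.5, 0.7)   # fully loaded, $ millions per year
PORTFOLIO = (50, 100)              # mid-market platform companies

ramp_per_company = bench_cost_range((5, 5), COST_PER_ENGINEER_M, (1, 1))
ramp_portfolio = bench_cost_range((5, 5), COST_PER_ENGINEER_M, PORTFOLIO)
steady_portfolio = bench_cost_range((1, 2), COST_PER_ENGINEER_M, PORTFOLIO)

for label, (lo, hi) in [
    ("Ramp, per company", ramp_per_company),
    ("Ramp, whole portfolio", ramp_portfolio),
    ("Steady state, whole portfolio", steady_portfolio),
]:
    print(f"{label}: ${lo:.1f}M-${hi:.1f}M per year")
```

Keeping ramp and steady state as separate calls makes it clear which staffing level a given portfolio-wide figure corresponds to.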

Acquisition targets, including mid-market healthcare services, accounting platforms, MSPs, and HOA management firms, typically transact at 8-12x EBITDA in the $50-500 million enterprise value range. Crete is at approximately $300 million in revenue with $500 million earmarked for 24 months of acquisitions. Long Lake has raised approximately $670 million and acquired 18 businesses (Hemant Taneja, October 2025).

Inference costs paid back to the lab parents run approximately 5-15% of total deployment cost in mature deployments, and that fraction is declining. Ramp’s 2026 enterprise data shows per-million-token costs across major providers falling from approximately $10 to approximately $2.50 over a single year. Stanford HAI’s 2025 AI Index documented an approximately 280-fold drop in inference cost for fixed performance over roughly two years (November 2022 to October 2024). Sam Altman has written that the cost of a given level of AI capability falls roughly tenfold every twelve months.

The largest cost categories in the joint venture economics are the engineering bench, integration custom work, evaluation framework infrastructure, and acquired-company integration. The smallest is what flows back to the lab as model API revenue.

For comparison: Constellation Software acquired approximately 30 vertical software businesses starting in 1995 on an initial $25 million of capital. A typical IT services or BPO platform PE rollup runs $300 million-$1 billion in committed capital across a 5-7 year hold. The $10 billion DeployCo vehicle and the $1.5 billion Anthropic JV, normalized for portfolio size and engineering bench cost, run at approximately 2-4x the per-platform capital intensity of a non-AI rollup.

1.4 What each party brings and gets

Anthropic. Krishna Rao, Anthropic’s CFO, framed the rationale: “Enterprise demand for Claude is significantly outpacing any single delivery model.” The rate-limiting constraint on Anthropic’s enterprise revenue is delivery capacity, not capability. The joint venture manufactures delivery capacity through Wall Street’s portfolio access. Anthropic preserves the model API revenue stream and gains an equity claim on the joint venture. The absence of a return floor on the Anthropic side may indicate higher conviction, stronger negotiating leverage, or simply less disclosure. Anthropic’s $30 billion annualized revenue run rate as of April 2026 (Anthropic disclosure cited in Morningstar, SaaStr, TheAICorner) exceeds OpenAI’s reported run rate, and 80% of Anthropic’s revenue is enterprise versus a more consumer-weighted mix at OpenAI.

Dario Amodei has not commented directly on the joint venture in published material. The announcement was led by Rao and Blackstone COO Jon Gray, not Amodei. The framing inside Anthropic appears to be about finance and distribution rather than strategic pivot.

OpenAI. The DeployCo structure does three things for OpenAI. It captures patient capital locked up for five years (per FT) without a parent-level dilution event. It converts OpenAI’s growth optionality into a tradeable, capped, fixed-yield instrument that PE balance sheets can underwrite using their internal credit-fund frameworks. And it gives OpenAI distribution into approximately 2,000 PE portfolio companies in a single transaction, at a moment when OpenAI’s enterprise sales motion is reported to be lagging Anthropic’s (per the leaked Dresser memo cited in Morningstar).

The 17.5% return floor is open to two readings. One reading: OpenAI is hedging the downside scenario in which token-level capability commoditizes faster than its enterprise revenue ramps, and the floor caps OpenAI’s upside but transfers downside risk to PE, making the deal cheaper to close. Another reading: OpenAI is willing to subordinate returns for fast distribution because the IPO window matters more than marginal channel economics, and 17.5% approximately matches the rate at which capital allocated to AI infrastructure SPVs has cleared in 2025-2026, meaning the floor is the alternative cost of capital rather than a premium. Both readings are consistent with the structure as reported.

OpenAI is running DeployCo and Thrive Holdings simultaneously. Thrive (December 2025) is an equity-only, embedded-engineer arrangement with sectoral focus on accounting and IT services. DeployCo (May 2026) is a structured-finance arrangement with a captive multi-sector portfolio. The coexistence of both suggests OpenAI does not yet know which model wins.

The PE firms. Blackstone (Anthropic JV). Jon Gray’s framing is that the bottleneck is “highly skilled implementation partners.” Blackstone contributes its 230+ portfolio companies and its operating playbook. Blackstone has built out Blackstone Insurance Solutions and Blackstone Data Centers as adjacent moats. The joint venture is a third leg that monetizes the combined Insurance, Real Estate, and PE portfolio universe through a different revenue stream.

Hellman and Friedman. Mid-market focus, longer hold periods, less aggressive financial engineering. H&F contributes a particular type of mid-market portfolio (often founder-led, often $500 million-$3 billion enterprise value) that the JV’s playbook is calibrated to.

Goldman Sachs (Anthropic JV). Marc Nachmann’s framing is that the joint venture will “democratize access to forward-deployed engineers.” In practice Goldman serves three functions: capital provider via Goldman Sachs Asset Management’s $3.7 trillion in assets under supervision, M&A advisor on portfolio acquisitions, and distribution channel into private wealth for adjacent financial products. Goldman is closer to a financial sponsor than a deployment partner. It does not have a credible AI deployment team.

TPG, Bain, Brookfield, Advent (DeployCo). These firms contribute capital, board representation, and portfolio access. None has a meaningful in-house AI deployment team. The forward-deployed engineering function is sourced from OpenAI. The PE firms get equity stakes, board seats, captive distribution into their own portfolios, and the 17.5% return floor.

Apollo, General Atlantic, Sequoia, GIC, Leonard Green (Anthropic JV consortium). Apollo brings insurance permanent capital (private credit is now 18% of US life general account assets per NAIC 2025) seeking long-duration credit-like exposure. GIC is a sovereign LP. Sequoia’s participation is notable because it confirms its published “Services: The New Software” thesis as a fund-level commitment.

Tom Tunguz’s December 2025 piece on PE-AI convergence describes “dedicated AI centers of excellence” at “forward-thinking PE firms,” but the named examples are mostly portfolio-company implementation teams sitting inside acquired businesses. Neither KKR nor Carlyle currently has an in-house cohort of forward-deployed engineers comparable to Anthropic’s Applied AI team. The joint venture structure exists in part because the PE firms cannot supply the engineering function organically.

1.5 What the structures imply about model-layer economics

Both labs reached the same operational conclusion in the same week: the model layer alone, as a five-year revenue stream, is insufficient to support the equity outcomes the labs need. Three pieces of evidence support this reading.

First, the dollars in both structures flow primarily to engineering and integration, not to model API revenue. Inference revenue back to the lab parents is 5-15% of deployment cost and falling.

Second, both labs are willing to accept structures that subordinate model-layer revenue to deployment-layer equity. The 17.5% floor in DeployCo is the most explicit version. The Anthropic structure, with no floor but with embedded Applied AI engineers and equity participation in the joint venture, is a softer version of the same trade.

Third, the empirical case for model commoditization is clear. Token costs at fixed capability are falling roughly tenfold per year. Open-weight models reached production deployment economics in 2025; current examples include DeepSeek V4 (April 2026) and Qwen 3.5 and 3.6 (February to April 2026) from Chinese labs, plus Meta’s Llama 4 Maverick. Frontier-equivalent capability now runs on commodity infrastructure for most enterprise tasks. Multi-model routing is standard at the application layer. The clearest single signal: Harvey, the legal AI company at $11 billion valuation, scrapped its proprietary legal model in 2025 after frontier reasoning models from Google, xAI, OpenAI, and Anthropic began outperforming Harvey’s own custom model on Harvey’s own internal benchmark. Harvey now routes across multiple foundation models depending on the task. Sierra’s moat is similarly described, in its own materials and in analyst reporting, as the customer evaluation loop rather than the agent infrastructure itself.

2. Outside-in versus inside-out adoption

The May 4 structures are outside-in plays: capital comes from outside the operating company, AI capability is built outside it, organizational design starts from the outcome rather than the legacy structure. This contrasts with inside-out programs, in which an incumbent attempts AI transformation from within using its own people, budgets, and IT structure.

2.1 Outside-in case studies

Long Lake (HOA management, General Catalyst Creation Strategy). Reports 25-30% productivity gains in HOA management (GC’s published number), 10x increase in new customer pipeline, 18 acquisitions across HOA, residential services, business services, and infrastructure as of October 2025, approximately $670 million raised, and doubled free cash flow within two years (Hemant Taneja, October 2025).

Crete Professionals Alliance (accounting, Thrive-backed). Reports approximately $300 million in ARR with 20+ accounting businesses (Reuters, June 2025), 900 employees across 17 offices including Indian operations in Gurugram, and a $500 million acquisition program over 24 months. Bennie Lewis at Assurance Dimensions, quoted in Reuters, reports AI tools “saving hundreds of hours per month” for audit testing teams.

Eudia (legal autopilot, GC Creation Strategy). Reports $10 million ARR achieved in less than 2 years, on track for $20 million by end of 2025 (Business Insider, AOL, July 2025). $105 million Series A in February 2025, structured as $30 million upfront plus $75 million earmarked for acquisitions subject to GC approval. Acquired Johnson Hana (Dublin, July 2025) and Out-House (Maryland, October 2025). Launched Eudia Counsel under Arizona’s Alternative Business Structure framework in September 2025, the first non-lawyer ownership of a regulated US legal entity at this scale. Clients include DHL, Intuit, Cargill, Duracell, Coherent.

Sierra (customer experience). Reports $150 million ARR by January-February 2026 (8 quarters from launch), $100 million ARR by November 2025, $26 million ARR end of 2024 (Sacra). $950 million raise at $15.8 billion post-money on May 4, 2026 (CNBC), the same day as the joint venture announcements. Outcome-based pricing per resolution. More than 40% of Fortune 50 as customers. Voice agents surpassed text as primary channel by October 2025.

Harvey (legal). Reports approximately $195 million ARR by January 2026, up from $100 million in August 2025 (Sacra). Customer count over 1,300 organizations, approximately 100,000 lawyers. Most of the AmLaw 100. $200 million raise at $11 billion valuation in March 2026, co-led by GIC and Sequoia. Approximately 10% of headcount in ex-lawyer customer-success and forward-deployed roles. Pricing: $1,200 per lawyer per month plus custom model fees, 20-seat minimum, 12-month commitments.

Failure modes. Thrasio, the Amazon-aggregator rollup, raised billions on cheap debt and filed for bankruptcy in February 2024 after canceling its SPAC IPO. It is cited in Contrary Research as the cautionary precedent for VC-backed services rollups that fail to reconcile VC return targets (5-10x) with rollup IRR economics (mid-double-digit). The Apollo-Athene-style permanent-capital partnerships in the May 4 structures are how the PE-AI hybrids attempt to fix this. No published rate exists for how often AI-rollup acquisitions get unwound or written down. The cohort is too young.

2.2 Inside-out adoption signal

McKinsey’s State of AI 2025 (September 2025 survey, 1,993 respondents) reports that nearly all organizations use AI, but only 39% report EBIT impact at the enterprise level. Only approximately 6% of respondents qualify as “AI high performers” with 5%+ EBIT attributable to AI. The high performers redesign workflows; the rest do not. The implication is that, after three or more years of effort, approximately 61% of inside-out programs have produced no enterprise-level EBIT impact and approximately 94% have not reached the 5% high-performer threshold.

Russell Reynolds’s 2024 Fortune 500 tech leadership analysis reports that the traditional CIO title is held by only 49% of Fortune 500 tech leaders, down from 68% five years prior. 54% of new tech-officer appointments since 2024 carry hybrid titles such as Chief Digital and Information Officer or Chief Strategy and Transformation Officer. Spencer Stuart 2024 reports CEO tenures declining to 7.2 years globally from 8.4 years in 2021/2023.

Gartner forecasts 35% of large organizations will have a Chief AI Officer reporting to CEO or COO by end of 2025. Approximately 30% of companies surveyed across multiple firms report having or hiring one. The role is too new for time-series data on attrition.

A counterexample worth noting: Eli Lilly’s LillyPod is a 9,000+ petaflop on-premises inference cluster used for drug discovery and clinical trial automation. Lilly’s case is unusual in that the AI workload sits at the strategic core of the business and the regulatory data must stay on-premises. This is closer to a special case than to general inside-out transformation.

2.3 Predictive variable

The cleanest predictive variable from the empirical material: whether the AI workload sits inside the regulated and data-bound core of the business or in commodity workflows around the edges. Inside-out programs work better where data residency, IP sensitivity, or regulatory chain-of-custody dominates. Outside-in vehicles work better where workflows are repeatable, regulation is moderate, and the customer pays for an outcome rather than for a tool. Both can coexist within the same company along different workflow dimensions.

2.4 Hybrid cases

Microsoft 1990s-2000s, Adobe SaaS transition, Schibsted, NYT, AmEx, and Walmart all share a common pattern: a CEO with multi-decade tenure, founder-level board protection, and willingness to accept a 3-5 year P&L hit. These are the structural conditions absent from most current Fortune 500 incumbents. Microsoft itself, under Satya Nadella’s 12-year tenure, is the canonical inside-out success story but cannot be replicated at companies whose CEOs face 3-year board patience windows.

Two AI-specific hybrid cases worth tracking. Salesforce Agentforce (Bret Taylor’s former employer) reached general availability in February 2026, doing inside-out via acquisition plus product wrap. Microsoft 365 Copilot has 15 million paying customers (per Stratechery, 2026); the per-seat pricing model is the legacy, and the agentic shift threatens it.

3. Macro restructuring patterns

3.1 Work composition

Professional services. Harvey at $195 million ARR in January 2026, with 100,000 lawyers using the platform and approximately 10% of Harvey’s headcount in forward-deployed roles. Harvey’s $11 billion valuation in March 2026 is approximately 2x the entire $5.59 billion legal-AI software market it sits inside. Crete at $300 million ARR in accounting. Eudia at $10-20 million ARR in legal.

The Brynjolfsson, Chandar, and Chen ADP payroll study (2025) shows an approximately 13% drop in employment among 22-25-year-olds in AI-exposed occupations (notably entry-level accounting and software engineering) from late 2022 to mid-2025, against 6-12% growth for older workers in the same fields.

Non-professional services. Long Lake doubling free cash flow in HOA management within two years. Sierra at a $400 billion total addressable market estimate (Bret Taylor) and more than 5x ARR growth from late 2024 to early 2026 ($26 million to $150 million). Dwelly reports doubling EBITDA margins in deployed property-management agencies and 40% reduction in repair wait times.

Aggregate vs. firm-level effects. Aggregate productivity effects from AI are still small in published macro estimates. Daron Acemoglu’s “Simple Macroeconomics of AI” (2025) estimates a 0.66% TFP gain over 10 years, or 0.064% per year. Hartley, Jolevski, Melo, and Moore (2026) report no statistically significant aggregate displacement signal through 2024-2025. At the firm level, the rollup case studies show labor-share-of-revenue dropping 15-30 percentage points within 12-24 months of AI deployment. The discrepancy is real and worth tracking.

New work categories. Forward-deployed engineering, AI operations, evaluation engineering, model routing, and agent observability (“silent failure” detection) are emerging as named functions. Forward-deployed engineering job postings grew approximately 800% from January to September 2025 (Indeed/FT). Salesforce committed to 1,000 forward-deployed engineers in late 2025. EY launched a dedicated forward-deployed engineering practice.

3.2 AI deployment patterns

Copilot vs. autopilot. Sequoia partner Julien Bek’s framing (March 2026) distinguishes copilots, which compete with each model release, from autopilots, which benefit from each model release. Empirically, autopilot revenue (Sierra, Harvey, Crete, Cognition Devin) is growing approximately 5-10x faster than copilot revenue at comparable maturity. Cognition’s reported $25 billion valuation in April 2026, on $100-155 million ARR (post-Windsurf), implies a revenue multiple of roughly 160-250x.
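The implied revenue multiples can be computed from the valuation and ARR figures quoted in this essay. The sketch below is a rough comparison only: the valuation and ARR dates do not line up exactly (for example, Harvey’s March 2026 valuation against January 2026 ARR), and the helper name is an assumption made for the example.

```python
def revenue_multiple(valuation_m: float, arr_m: float) -> float:
    """Valuation divided by annual recurring revenue, both in $ millions."""
    return valuation_m / arr_m

# Figures as quoted in this essay; dates of valuation and ARR differ slightly.
cognition_low = revenue_multiple(25_000, 155)   # high end of the ARR range
cognition_high = revenue_multiple(25_000, 100)  # low end of the ARR range
sierra = revenue_multiple(15_800, 150)
harvey = revenue_multiple(11_000, 195)

print(f"Cognition: {cognition_low:.0f}x-{cognition_high:.0f}x")
print(f"Sierra:    {sierra:.0f}x")
print(f"Harvey:    {harvey:.0f}x")
```

Note that the higher ARR figure produces the lower multiple, so the range endpoints are inverted relative to the ARR range.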

Token consumption. Reasoning models (o1, o3, Claude with thinking modes) generate 5-30x more tokens per task than non-reasoning models (Gartner, March 2026). Agentic workloads chain 5-15x more LLM calls than chat workloads. Per-token costs are declining; total inference spending grew 320% over 2023-2025 (Stabilarity Hub 2026). Inference is approximately 85% of enterprise AI budgets, up from one-third in 2023.

Model mix. Multi-model routing is dominant in 2026. Cloudshim and AnalyticsWeek 2026 data shows 80% of routine traffic going to cost-optimized small models, with frontier capability reserved for hard tasks.
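A minimal sketch of what an 80/20 routing split does to blended inference price. The two per-million-token prices are hypothetical placeholders, not quoted provider rates, and the example assumes that routine and hard requests consume the same number of tokens, which is unlikely to hold in practice.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelTier:
    name: str
    price_per_m_tokens: float  # $ per million tokens (hypothetical)

SMALL = ModelTier("cost-optimized-small", 0.25)   # hypothetical price
FRONTIER = ModelTier("frontier", 10.00)           # hypothetical price

def route(is_hard_task: bool) -> ModelTier:
    """Send hard tasks to the frontier tier and everything else to the small tier."""
    return FRONTIER if is_hard_task else SMALL

def blended_price(routine_share: float) -> float:
    """Token-weighted average price, assuming equal tokens per request in both tiers."""
    return (routine_share * SMALL.price_per_m_tokens
            + (1.0 - routine_share) * FRONTIER.price_per_m_tokens)

blended = blended_price(0.80)  # 80% routine share, as cited above
savings_vs_frontier = 1.0 - blended / FRONTIER.price_per_m_tokens

print(f"Blended price: ${blended:.2f} per million tokens")
print(f"Savings versus frontier-only routing: {savings_vs_frontier:.0%}")
```

With these placeholder prices, almost all of the blended cost comes from the 20% of traffic that still goes to the frontier tier, which is why the routing threshold matters more than the small-model price.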

3.3 Internal IT and AI organization

The traditional CIO role is held by 49% of Fortune 500 tech leaders, down from 68% five years prior. Hybrid Chief Digital and Information Officer or Chief Strategy and Transformation Officer titles account for 54% of new appointments since 2024 (Russell Reynolds).

Approximately 85% of enterprise AI budgets go to inference (AnalyticsWeek 2026). Gartner global AI spending forecast is $2.5 trillion+ for 2026.

Sovereign LLM and on-premises hosting investments by national governments include India’s IndiaAI Mission ($1.25 billion), Korea’s 2026 AI budget (9.9 trillion won, approximately $6.7 billion), Japan’s $20 billion sovereign-model investment (via SoftBank/Preferred Networks consortium), Saudi Arabia ($40 billion, Vision 2030), and UAE (Falcon, AI71).

3.4 Cognitive labor inside firms

Brynjolfsson, Chandar, Chen (2025) document a 13% drop in 22-25-year-old employment in AI-exposed occupations against 6-12% growth for older workers in the same occupations. Bifurcation, not aggregate decline.

A separate “evaluation tier” of cognitive labor is emerging. At Harvey, approximately 10% of headcount is ex-lawyer evaluators of AI work (Sacra). At Crescendo, AI-augmented customer experience with human-in-the-loop is structural to the offering. Brynjolfsson’s “tacit knowledge buffer” finding raises a 5-10 year pipeline question: if AI absorbs entry-level work, the cohort of senior workers in 2030-2035 may lack the experiential base prior cohorts had.

In parallel, major professional services firms are scaling back junior hiring. PwC quietly scrapped its 2021 pledge to add 100,000 to its worldwide headcount by mid-2026, cut staff by 5,600 in the year ending June 30, 2025, and is reducing entry-level hiring across multiple regions (FT, Business Insider). PwC UK is cutting 200 entry-level roles (1,300 vs. 1,500 prior year). Business Insider reported PwC plans to cut US graduate hiring by approximately one-third by 2028.

3.5 Labor market

Aggregate labor displacement from AI is still small in published estimates. Acemoglu (2025) puts TFP gains at 0.66% over 10 years. Hartley et al. (2026) report no statistically significant aggregate displacement signal. Burtch Works (2025) reports a 12% rise in entry-level AI worker starting salaries.

Geographic restructuring is partial and bidirectional. Crete uses Indian operations as an AI-augmented delivery hub. Crescendo’s PartnerHero acquisition rebuilt onshore. Both globalization-reverser and globalization-extender effects are visible in 2026.

3.6 Investment flows

Hyperscaler capital expenditure for 2026 is estimated by Goldman Sachs at $527-700 billion, by CreditSights at approximately $602 billion, and by Tom’s Hardware/FT at $725 billion. Big Tech bond issuance is over $100 billion in 2026 to fund AI capex. CDS spreads are at record levels.

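AI-services rollup funding is concentrated in General Catalyst ($1.5 billion Creation Strategy), Thrive Holdings ($1 billion+), 8VC’s Sequence and Kodiak, Slow Ventures, Khosla, and Bessemer-Lightspeed-Sequoia at the ecosystem level. The two May 4 vehicles add roughly $6-7 billion in committed capital ($1.5 billion for the Anthropic JV plus approximately $4.5-5.5 billion for DeployCo), or $11.5 billion if DeployCo is counted at its $10 billion valuation.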

The PE/VC split is shifting toward permanent-capital partners (Apollo-Athene, Blue Owl-Kuvare, KKR-Global Atlantic) for the operationally heavy rollup vehicles, with venture capital concentrated in autopilot platforms.

Insurance-linked permanent capital deployed approximately $180 billion into private credit in 2025 (McKinsey, cited in ABF Journal). Private credit is 18% of US life general account assets (NAIC 2025). The Apollo-Athene template is now industry standard for long-duration AI deployment vehicles.

3.7 Private equity playbook

AI deployment readiness is now a standard diligence dimension. Long Lake’s “AI-native rollup” pattern (doubled free cash flow in 2 years) is the new value-creation template, replacing legacy “10% margin compression by Year 3” models. Exit timelines are stretching, partly by design (permanent-capital partners) and partly by necessity (IPO-window uncertainty). A new operating-partner type, the “applied AI partner,” is appearing at PE firms, often poached from Palantir or labs.

3.8 Workforce and enterprise types

Regulated knowledge workers (lawyers, doctors, accountants, financial advisors) face slower deployment but higher per-unit value. Harvey is the canonical legal case; healthcare revenue cycle management is the next vertical (Bek). Unregulated knowledge workers (consultants, marketers, designers, software engineers) face faster, deeper deployment; the Brynjolfsson 13% entry-level disruption clusters here.

Public mega-cap incumbents are struggling with inside-out transformation, per the McKinsey 6% high-performer cohort. Mid-market PE-acquired firms are the May 4 structures’ primary target. Mid-market founder-led and family-owned firms face a succession crisis that doubles as deal flow (Japan: 50,000+ SMEs annually facing closure risk per M&A Research Institute and Compound and Fire reporting).

3.9 Globalization

AI rollups can act as a globalization-reverser (services that were previously offshored can be onshored as AI replaces the offshored labor) and as a globalization-extender (AI-native services exported globally at marginal labor cost). Sierra had offices in Tokyo, Singapore, Madrid, Paris, London, and Sydney by March 2026.

Sovereign-AI investment is significant in the US, EU, China, India, Korea, Japan, Saudi Arabia, and UAE. Alibaba’s Qwen passed 1 billion downloads on Hugging Face by March 2026 (153.6 million in February 2026 alone). Alibaba, ByteDance, and Tencent are simultaneously running first-party services arms and shipping open-weight models globally; the Chinese strategy is closer to open-weight commoditization than to deployment-services rollup.

4. Capability commoditization and where value is moving

4.1 Token cost decline

Stanford HAI’s 2025 AI Index documents an approximately 280-fold drop in inference cost for fixed performance over roughly two years (November 2022 to October 2024). Ramp’s 2026 enterprise data shows average per-million-token costs across major providers falling from approximately $10 to approximately $2.50 in a single year. Epoch AI documents 9-1,000x annual price declines for fixed capability, with median 200x for post-January 2024 data. Gartner (March 2026) projects inference cost for a 1-trillion-parameter LLM falling 90%+ by 2030 versus 2025. TokenMix 2026 documents 60-80% per-token cost drops across every major provider in 2025-2026, with Gemini Flash-Lite at $0.25 per million input tokens.

A counter-trend: arxiv 2511.23455 documents that per-token prices are declining but the cost of running frontier-level models has nonetheless risen approximately 18x per year because frontier models use exponentially more tokens per task.

The honest summary: at fixed capability, inference cost falls roughly 10x per year. At frontier capability, total spend per task is rising because frontier models consume more tokens per task and more sophisticated workloads chain more calls.
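The two halves of that summary can be put on a common footing. The sketch below first converts the quoted decline figures into per-year factors, then shows how a price decline can be outrun by token growth. The second part is an illustration under loud assumptions: it treats the reasoning-token multiplier (5-30x) and the agentic call multiplier (5-15x) quoted in section 3.2 as compounding multiplicatively against a single year of 10x price decline, which the sources do not claim.

```python
def months_between(start_year: int, start_month: int, end_year: int, end_month: int) -> int:
    """Whole months from the start month to the end month."""
    return (end_year - start_year) * 12 + (end_month - start_month)

def annualized_factor(total_factor: float, months: float) -> float:
    """Convert a total decline factor over `months` into an equivalent per-year factor."""
    return total_factor ** (12.0 / months)

# Stanford HAI: ~280x between November 2022 and October 2024.
stanford_months = months_between(2022, 11, 2024, 10)
stanford_per_year = annualized_factor(280.0, stanford_months)

# Ramp: ~$10 to ~$2.50 per million tokens in a single year.
ramp_per_year = annualized_factor(10.0 / 2.50, 12)

def relative_task_cost(price_decline: float, token_multiplier: float, call_multiplier: float) -> float:
    """Cost per task relative to baseline: more tokens and calls, cheaper tokens.

    Assumption: the two multipliers compound multiplicatively.
    """
    return token_multiplier * call_multiplier / price_decline

task_cost_low = relative_task_cost(10.0, 5.0, 5.0)
task_cost_high = relative_task_cost(10.0, 30.0, 15.0)

print(f"Stanford HAI, annualized over {stanford_months} months: {stanford_per_year:.1f}x per year")
print(f"Ramp, annualized: {ramp_per_year:.1f}x per year")
print(f"Relative cost per task after a 10x price drop: {task_cost_low:.1f}x-{task_cost_high:.1f}x")
```

Even at the low end of both multipliers, cost per task rises in this illustration despite a tenfold fall in per-token price, which is the mechanism the summary describes.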

4.2 Open-weight model trajectory

DeepSeek V4 (April 2026): 1.6 trillion total parameters, 49 billion active in mixture-of-experts; 1 million context; approximately $0.14-1.74 per million tokens; approximately 80% on SWE-Bench Verified; MIT license.

Qwen 3.5/3.6 (February-April 2026): 397 billion total, 17 billion active; 256K context; Apache 2.0; 1 billion+ cumulative downloads; matches GPT-5-class on multiple benchmarks.

Llama 4 Maverick: 400 billion total, 17 billion active, 10 million context.

Combined DeepSeek and Qwen share of global AI market reportedly grew from approximately 1% in January 2025 to approximately 15% in January 2026 (particula.tech).

Andrej Karpathy’s October 2025 Dwarkesh interview frames the trajectory as a “decade of agents, not year of agents.” Frontier models remain 6-12 months ahead of open-weight models on the highest-stakes tasks, but the capability is commoditizing rapidly.

4.3 Where the moat is moving

Vertical integration. Sierra’s customer evaluation loop, Harvey’s law-firm relationship density, and Decagon’s customer-service deployment depth are concrete examples of vertical-integration moats that have held even as the underlying models commoditize.

Distribution. PE board control (the May 4 structures’ explicit thesis). Existing customer base. Regulatory familiarity (Eudia’s Arizona ABS framework).

Data and substrate. Vertical-specific data ownership, evaluation framework infrastructure, deployment pattern libraries, customer-specific fine-tunes. General Catalyst’s Creation Strategy explicitly notes that workflows for HOA, MSP, accounting, and property management are “very similar across [verticals]” and that the technology and data infrastructure is reusable.

4.4 Coding agents as test case

Coding agents are the most empirically advanced category for testing the moat-migration thesis.

Cursor/Anysphere reports $2 billion ARR by February 2026 (Bloomberg), was valued at $29.3 billion in November 2025, and is in talks at a $50 billion+ valuation, with approximately 60% of revenue from enterprise.

Cognition (Devin) grew from $73 million ARR in June 2025 to approximately $155 million post-Windsurf ARR by early 2026, with a reported raise at a $25 billion valuation in April 2026.

Anthropic reports $2.5 billion ARR for Claude Code by February 2026 (Anthropic disclosure, cited in The Deep View and on the Anthropic blog).

GitHub Copilot is approaching $1 billion ARR.

Cursor, Claude Code, and Devin all use frontier models (Claude Opus, GPT-5, Gemini 3.1 Pro) as the inference substrate. Multi-model routing is standard. Differentiation lives in workflow design, IDE integration, autonomy gradient, and pricing model.

Coding has structured verifiability (tests pass or fail), so the autopilot loop closes more cleanly than in legal or healthcare, where ground truth is harder to establish. This may make coding the easiest case for the autopilot pattern, not the hardest. The implication for generalization is uncertain.

4.5 Where value capture is migrating fastest

Coding has already migrated. Approximately $7 billion+ in coding-agent ARR vs. approximately $20-30 billion in foundation-model coding revenue; integration captures more.

Legal is in progress. Harvey’s $11 billion valuation is greater than the entire $5.59 billion legal-AI software market it sits inside.

Customer experience is in progress. Sierra at $15.8 billion valuation, Salesforce Agentforce at general availability in February 2026, both capturing the integration layer.

Accounting is early. Crete is at $300 million ARR vs. an approximately $5 billion accounting-software market.

Healthcare is very early. Regulatory friction is slowing deployment.
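The size comparisons for legal and accounting in this subsection reduce to simple ratios. A minimal sketch using only the figures stated above follows. The function and dictionary names are illustrative. The Harvey comparison sets a valuation against an annual market size, so it indicates relative scale rather than market share.

```python
def ratio(numerator_bn: float, denominator_bn: float) -> float:
    """Return numerator / denominator, with both values in billions of USD."""
    return numerator_bn / denominator_bn


# Figures are taken from the text above, in billions of USD.
comparisons = {
    "Harvey valuation vs. legal-AI software market": ratio(11.0, 5.59),
    "Crete ARR vs. accounting-software market": ratio(0.3, 5.0),
}

for label, value in comparisons.items():
    print(f"{label}: {value:.2f}x")
```

On these numbers Harvey's valuation is roughly twice the size of the market it is measured against, while Crete's ARR is about 6% of its market.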

Open questions and flagged uncertainties

The Anthropic JV’s CEO and board composition were not publicly announced as of May 7, 2026.

The exact forward-deployed engineer cost per portfolio company in steady state is extrapolated from publicly available compensation data, not from JV disclosures.

The audited accounting of OpenAI’s 17.5% return floor in different downside scenarios is not fully disclosed. The legal characterization is publicly described; the exact mechanics (cash top-up, equity, model-credit) are not.

The actual share of the Anthropic JV’s $1.5 billion that flows back to Anthropic as token revenue versus equity participation is not disclosed.

The SpaceX-Anysphere acquisition reportedly closing April 21, 2026 (per Remio) is a single-source claim that needs cross-confirmation before citing.

The number of Anthropic enterprise contracts at $1 million+ ARR (1,000+ per Anthropic’s April 2026 disclosure) is disputed by OpenAI per the leaked Dresser memo.

Failure rate of outside-in AI rollups: no published data; pre-IPO cohort; statistically uninformative as of May 2026.

Per-customer agent population in Sierra, Harvey, and Crete deployments: not publicly disclosed at any meaningful granularity.

The Cognition/Devin reported $25 billion raise: reported but not confirmed via an official channel as of the source-review date.

Whether OpenAI’s parallel Thrive Holdings and DeployCo structures represent two iterations of one experiment or two separate strategic bets remains undetermined.

Sources

Primary lab and PE communications. Anthropic press release (anthropic.com/news/enterprise-ai-services-company), Blackstone press release confirming JV terms, GIC press release, Anthropic April 2026 revenue disclosure, OpenAI internal communications (where leaked).

News reporting. Bloomberg coverage of DeployCo, Financial Times reporting on the 17.5% floor and DeployCo terms, Reuters on both ventures including the May 5 acquisition-talks story, CNBC and Fortune on the May 4 announcements, Yahoo Finance on consortium composition.

Equity research. Sacra (Sierra, Harvey, Cognition, Crete), Contrary Research (Vertical AI Playbook), Tom Tunguz on PE-AI convergence.

Industry analysis and primary research. Stanford HAI 2025 AI Index, Epoch AI on inference price decline, Ramp 2026 enterprise data, McKinsey State of AI 2025, Russell Reynolds Architects of Change analysis, Spencer Stuart CEO tenure data, Burtch Works 2025 compensation data.

Compensation and operating data. Levels.fyi, Hashnode 2026 compensation guide, EntrepreneurLoop on FDE costs.

Academic. Daron Acemoglu, “Simple Macroeconomics of AI” (NBER 2025); Erik Brynjolfsson, Sahil Chandar, Wenchao Chen (2025) on entry-level employment displacement; Jonathan Hartley, Filip Jolevski, Vinicios Melo, Brendon Moore (2026) on aggregate displacement signals.

Open-weight model coverage. TokenMix 2026, Nerova, Codersera, particula.tech on DeepSeek and Qwen market share.

Hyperscaler capex. Goldman Sachs, CreditSights, Tom’s Hardware/FT, IEEE Communications Society blog.

Insurance and PE structure. ABF Journal on insurance-linked private credit, NAIC 2025 data on life insurance general account allocations.

Sovereign AI. Korea Herald, The Economy (Japan), The Diplomat (Japan), Digital in Asia AI policy tracker.

PwC headcount. Financial Times (PwC headcount target abandonment), Business Insider (US graduate hiring cuts).

Vendor and case study materials. Vantaca and HOAi case studies on HOA management, Eudia public materials, Harvey’s own blog (“Why Harvey is Multi-Model by Design,” “Expanding Harvey’s Model Offerings”), Sierra’s own customer materials.

The empirical claims in this document have been verified against primary sources where possible. Where claims rest on a single source or could not be independently verified, the source is named or the claim is flagged. Where the financial topology is extrapolated from public compensation data and deployment patterns rather than from JV disclosures, that is stated. The 17.5% return floor in DeployCo is FT-reported; Reuters and Bloomberg cite the FT but did not independently confirm it.