The United States invented AI-first drug discovery. It deployed more capital, more talent, and more compute than any other country. It still has no FDA approval and it is losing the productivity race to China. This is the audit.
Fourteen years after Atomwise shipped the first convolutional-neural-net virtual screener, the United States remains the undisputed capital and talent hub of AI drug discovery. It also remains an industry whose headline productivity records are being set 12,000 kilometres away.
The first dedicated AI drug-discovery company (Atomwise, 2012) was American. So were Recursion (2013), Insilico's original Baltimore incorporation (2014), and Relay (2016). AlphaFold 2 came out of DeepMind but its open-source release in July 2021 and 43,000+ citations rewired every US pharma computational chemistry team. By 2024 US-based AI-native biotechs had raised roughly $27–30 billion in venture and public equity, with another $50 billion+ in pharma-deal biobucks layered on top.
Two Americans share the 2024 Nobel Prize in Chemistry. Most of the sector's open-source models (RFdiffusion, ESM, Chroma) trace to US labs. NVIDIA's BioNeMo is an American product. The talent stack is not the problem.
No US-origin AI-designed drug has received FDA approval as of Q2 2026. A decade into the wave, the sector has produced ~50 clinical-stage assets and zero NDAs. The best-funded US platform company, Recursion, has roughly 5 wholly-originated clinical assets despite raising north of $2 billion. Its Phase 2 readout on REC-994 missed its primary endpoint in Q3 2024.
By comparison, a single global AI platform (Insilico Medicine) has filed 13 INDs, nominated 30+ PCCs, runs 9 clinical programmes (6 Phase 1, 3 Phase 2), and has logged 0 clinical failures on roughly one-third the capital. It generated $85.8M in 2024 revenue and $56.2M in H1 2025. Industry PCC-per-year records are being set in Suzhou, not Salt Lake City.
The thesis of this report. Generating a hit is easy. Generating a quality Development Candidate package — the GLP tox, CMC, ADME/PK, formulation, stability, and safety pharmacology bundle required for an IND — is hard, slow, and expensive. Most US AI biotechs optimised for the part of the funnel AI accelerates (hit generation, SAR, selectivity) while underinvesting in the part AI barely touches (wet-lab integration, CRO orchestration, process chemistry). The companies that treated AI as software-to-sell rather than a pipeline-to-build have produced few drugs. The ones that integrated AI into an end-to-end wet-lab operation have produced many. That second pattern, so far, is almost entirely a Chinese phenomenon.
The table below lists public tickers, founding years, HQs, cumulative funding, disclosed revenue, clinical-stage asset counts, pipeline metrics (DC / IND / Phase 1-2-3), novelty scores, and landmark deals. The Type column distinguishes platform-only from integrated pipeline plays, the central fault line of the US sector. The Status column encodes current trajectory.
Ranked #1 by pipeline output per dollar: Insilico Medicine. 30+ PCCs, 13 INDs, 6 Phase 1, 3 Phase 2, 0 failures on ~$700M raised – 4.3 PCCs per $100M deployed, the industry record.
| # | Company | Status | Type | Founded | HQ | Funding (USD) | FY24 Rev | DCs | INDs | Ph1 | Ph2 | Ph3 | Novelty | Market Cap | Landmark Deal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Insilico Medicine (3696.HK): Pharma.AI end-to-end | 🟢 Active & Growing | Integrated | 2014 | Global (NYC · Boston · Abu Dhabi · Shanghai · Suzhou · HK · Montreal · Taipei) | ~$700M + HK$2.277B IPO (~$292M) | $85.8M | 30+ | 13 | 6 | 3 | 0 | ★★★★★ | ~$5B (HK$36B) | Lilly $2.75B · Sanofi $1.2B · Servier $888M · Menarini $550M+ |
| 2 | Phenomics + Centaur (post-merger) | 🔴 Declining | Integrated | 2013 | Salt Lake City, UT | ~$2.2B+ (post-merger) | $58.8M | ~8 | ~10 | ~6 | ~4 | 0 | ★★★ | ~$2B | Roche $150M upfront / up to $12B (2021); Exscientia $688M acq. (2024) |
| 3 | Motion-based / Dynamo | 🟢 Active & Growing | Integrated | 2016 | Cambridge, MA | ~$980M | ~$10M | ~5 | 4 | 2 | 1 | 0 | ★★★★ | ~$500M | Genentech SHP2 $75M upfront / $695M (2020) |
| 4 | Schrödinger (SDGR): FEP+ / physics×ML | 🟡 Active, Stable | Hybrid | 1990 | New York, NY | ~$380M | $204M | ~3 | 3 | 3 | 0 | 0 | ★★★★ | ~$1.75B | Novartis up to $2.3B, $150M upfront (2023) |
| 5 | Tempus AI (TEM): Clinical genomics + AI | 🟢 Active & Growing | Genomics | 2015 | Chicago, IL | ~$1.7B (pre-IPO $1.3B) | $693M | 0 | 0 | 0 | 0 | 0 | ★★★ | ~$9B | AstraZeneca $200M multimodal (2024) |
| 6 | AbCellera (ABCL): Microfluidics + ML | 🟡 Active, Stable | Biologics | 2012 | Vancouver, BC | ~$700M | $27M | 2 | 1 | 1 | 0 | 0 | ★★★ | ~$850M | Lilly bamlanivimab (>$800M royalties) |
| 7 | Absci (ABSI): Zero-shot antibody design | 🟡 Active, Stable | Biologics | 2011 | Vancouver, WA | ~$600M | $1.9M | ~1 | 1 | 1 | 0 | 0 | ★★★ | ~$450M | AstraZeneca $247M biobucks (2023) |
| 8 | BenevolentAI (BAI.AS): Knowledge graph | 🔴 Declining | Integrated | 2013 | London, UK | ~$292M | ~$8M | ~2 | 2 | 1 | 1 | 0 | ★★★ | ~$150M | AstraZeneca CKD/IPF up to $800M (2019) |
| – | Exscientia (acq. 2024): Centaur / Manifold → RXRX $688M | ⚫ Acquired → Recursion | Acquired | 2012 | Oxford, UK | ~$860M | ~$15M (partial) | 3 | ~5 | 3 | 0 | 0 | ★★★ | $688M exit | Sanofi $100M upfront / $5.2B (2022) |
| 9 | RFdiffusion + ESM | 🟢 Active & Growing | Integrated | 2023 | SF Bay Area | $1B (launch) | n/a | 0 | 0 | 0 | 0 | 0 | ★★★★ | ~$2–3B (private) | Self-funded; no pharma deal disclosed |
| 10 | AlphaFold 3 | 🟢 Active & Growing | Hybrid | 2021 | London (Alphabet) | ~$1B (Alphabet + $600M Thrive) | n/a | 0 | 0 | 0 | 0 | 0 | ★★★★ | Private | Lilly $1.7B + Novartis $1.2B (Jan 2024) |
| – | Chroma generative → Novartis | ⚫ Acquired → Novartis ~$1B | Biologics | 2018 | Somerville, MA | ~$670M | n/a | 2 | 2 | 2 | 0 | 0 | ★★★ | ~$1B+ (Novartis acq) | Amgen 5 targets / $1.9B (2022) |
| 15 | ML + iPSC functional genomics | 🟡 Active, Stable | Hybrid | 2018 | South SF, CA | ~$743M | n/a | 0 | 0 | 0 | 0 | 0 | ★★★ | ~$2.5B (2021 peak) | BMS ALS $50M / $2B (2020) |
| 11 | Virtual pharma + Schrödinger | 🟡 Active, Stable | Integrated | 2009 | Boston, MA | ~$710M | n/a | ~4 | 4 | 2 | 1 | 1 (via Takeda) | ★★★★ | $6.1B TYK2 exit | Takeda TYK2 $4B upfront / $6.1B (2022) |
| 16 | Opal platform | 🔴 Declining | Hybrid | 2019 | Boston, MA | ~$750M | n/a | ~3 | 3 | 1 | 2 | 0 | ★★ | ~$2.8B (2021 peak) | Novo Nordisk $60M / $4.6B cardiometabolic (2024) |
| 14 | GEMS / spatiotemporal GNN | 🟡 Active, Stable | Integrated | 2019 | Burlingame, CA | ~$280M | n/a | ~1 | 0 | 0 | 0 | 0 | ★★★ | Private | Eli Lilly $670M/program (2024) |
| – | AtomNet CNN → Sanofi | ⚫ Acquired → Sanofi | Platform | 2012 | San Francisco, CA | ~$174M | n/a | 0 | 0 | 0 | 0 | 0 | ★★ | ~$100M+ exit | Sanofi 5 targets / $1B+ (2022); acq. 2024 |
| 20 | CONVERGE human-tissue ML | 🔴 Declining | Integrated | 2015 | South SF, CA | ~$150M | n/a | ~1 | 1 | 1 | 0 | 0 | ★★★ | Private | Eli Lilly $25M / $694M 4 targets (2021) |
| 13 | NeuralPLexer / OrbNet | 🟡 Active, Stable | Integrated | 2020 | San Diego, CA | ~$220M | n/a | 1 | 1 | 1 (IAM1363) | 0 | 0 | ★★★ | Private | NVIDIA-backed; Lilly indirect (2024) |
| 22 | tNova microwell chemistry | 🔵 Pre-Revenue/Early | Platform | 2018 | Monrovia, CA | ~$120M | n/a | 0 | 0 | 0 | 0 | 0 | ★★ | Private | BMS platform deal (2024) |
| 28 | DEL + ML | 🔵 Pre-Revenue/Early | Platform | 2020 | Cambridge, MA | ~$46M | n/a | 0 | 0 | 0 | 0 | 0 | ★★ | Private | Undisclosed pharma collabs |
| 24 | Allen Institute spinout | 🔵 Pre-Revenue/Early | Integrated | 2021 | Seattle, WA | ~$115M | n/a | 0 | 0 | 0 | 0 | 0 | ★★ | Private | – |
| 25 | Milliner closed-loop Ab | 🟡 Active, Stable | Biologics | 2019 | San Mateo, CA | ~$95M | n/a | 0 | 0 | 0 | 0 | 0 | ★★ | Private | Amgen antibody deal (2023) |
| 27 | AI + cryo-EM | 🔵 Pre-Revenue/Early | Biologics | 2020 | Burnaby, BC | ~$60M | n/a | 0 | 0 | 0 | 0 | 0 | ★★ | Private | Undisclosed |
| 18 | Cell-state AI (Flagship) | 🟡 Active, Stable | Hybrid | 2017 | Cambridge, MA | ~$230M | n/a | 0 | 0 | 0 | 0 | 0 | ★★★ | Private | Novo Nordisk (undisclosed) |
| 19 | RNA therapeutics AI | 🔴 Declining | Hybrid | 2015 | Toronto, ON | ~$230M | n/a | 0 | 0 | 0 | 0 | 0 | ★★ | Private | – |
| 29 | EVA closed-loop Ab | 🟡 Active, Stable | Biologics | 2012 | London, UK | ~$46M | n/a | 0 | 0 | 0 | 0 | 0 | ★★ | Private | – |
| 30 | Protein LLMs / OpenCRISPR | 🔵 Pre-Revenue/Early | Platform | 2022 | Berkeley, CA | ~$44M | n/a | 0 | 0 | 0 | 0 | 0 | ★★★ | Private | – |
| 26 | Protein design SaaS | 🟡 Active, Stable | Platform | 2021 | Amsterdam / ZRH | ~$97M | n/a | 0 | 0 | 0 | 0 | 0 | ★★ | Private | Johnson & Johnson, Novo (SaaS) |
| 31 | NVIDIA-backed protein design | 🔵 Pre-Revenue/Early | Platform | 2019 | Chicago, IL | ~$80M | n/a | 0 | 0 | 0 | 0 | 0 | ★★ | Private | – |
| – | Morphic Therapeutic (acq. 2024): Schrödinger-founded → LLY $3.2B | ⚫ Acquired → Lilly $3.2B | Acquired | 2015 | Waltham, MA | ~$400M | n/a | 2 | 2 | 1 | 1 | 0 | ★★★ | $3.2B exit | Eli Lilly cash acquisition (Aug 2024) |
| 17 | Evotec SE (EVO): PanHunter / PanOmics | 🟡 Active, Stable | Platform | 1993 | Hamburg, DE | – | €770M | 0 | 0 | 0 | 0 | 0 | ★★ | ~€1.6B | BMS TPD $200M upfront (2022) |
Figures are best-available estimates as of May 2026 based on SEC filings, investor-relations pages, and aggregator databases (Deep Pharma Intelligence, BiopharmaTrend, Crunchbase). Clinical-stage counts include Phase 1 and later; wholly-owned assets only except where noted. Market caps are point-in-time and volatile. Activity status was assessed by an ensemble of LLMs.
A decade of AI drug discovery has generated an enormous amount of marketing language about productivity. The numbers below are the ones that actually matter for investors, regulators, and patients: speed to PCC, cost per IND, phase-transition rates, DC package completion, and pipeline output per dollar deployed.
Company-disclosed figures (2020–2024). Insilico's benchmark (INS018_055, ISM3412) is the publicly verified industry record.
External R&D spend from target nomination to IND-enabling package. Traditional figure from Paul et al. Nat Rev Drug Discov 2010 inflated to 2024 dollars. China figure from Insilico HKEX prospectus (∼$2.6M external spend per PCC; $3–5M to full IND).
BIO Industry Analysis (2011–2020) for traditional; Jayatunga et al. Nature 2024 for AI. Caveat: AI sample size small (n=24) and biased toward well-validated targets. The Phase 1 uplift is real (better potency/ADME design). Phase 2 tests biology, not chemistry, so AI's lift narrows. Phase 3 remains untested.
Cumulative PCCs disclosed divided by cumulative capital raised (VC + IPO + follow-on, through 2024). Insilico leads on roughly every productivity axis — the direct result of operating an integrated wet-lab stack inside China's CRO infrastructure.
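The output-per-dollar metric defined above is simple division, and a minimal sketch makes the headline number reproducible. The function name `pccs_per_100m` is hypothetical, and the second example treats the "DCs" column of the company table as equivalent to PCCs, which is an assumption rather than something the report states.

```python
def pccs_per_100m(pccs: float, capital_usd_m: float) -> float:
    """PCCs disclosed per $100M of cumulative capital raised.

    pccs          -- cumulative preclinical candidates disclosed
    capital_usd_m -- cumulative capital raised, in millions of USD
    """
    if capital_usd_m <= 0:
        raise ValueError("capital must be positive")
    return pccs / capital_usd_m * 100.0


# Figures quoted in this report: 30+ PCCs on ~$700M raised.
insilico = pccs_per_100m(pccs=30, capital_usd_m=700)

# Assumption: row 2 of the company table (~8 DCs on ~$2.2B), with DCs
# counted as if they were PCCs.
row_two = pccs_per_100m(pccs=8, capital_usd_m=2200)

print(f"Insilico: {insilico:.1f} PCCs per $100M")
print(f"Row 2:    {row_two:.1f} PCCs per $100M")
```

Because both inputs are approximate ("30+", "~$700M"), the result is best read as an order-of-magnitude comparison rather than a precise ratio.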
The productivity delta is not about talent or technology. Insilico's Pharma.AI stack (PandaOmics + Chemistry42 + inClinico) is comparable in capability to Recursion OS, Relay's Dynamo, and Exscientia's Centaur. The difference is that Insilico pairs its AI with a Chinese CRO+CDMO stack running 6–7 day weeks at one-fifth US loaded cost and a wholly-owned robotic wet-lab (Life Star 1, Suzhou). The US platforms that never vertically integrated paid for that choice in PCCs-per-dollar and INDs-per-year.
Insilico deliberately expanded into China to compete with efficient local companies. By establishing Life Star 1 (a robotic wet-lab in Suzhou), leveraging China’s CDE regulatory fast-track paths, and accessing provincial biotech cluster incentives (Zhangjiang, Suzhou BioBAY), Insilico achieved a cost structure of ~$3–5M per IND vs $10–20M in the US. This is not cost-cutting — it is strategic positioning at the intersection of AI capability and operational efficiency.
The most important data in drug discovery is generated between target identification (0) and development candidate (DC). This is where the real science happens: potency optimization, selectivity profiling, ADME/PK, in vivo efficacy, safety pharmacology. With 30+ completed 0-to-DC trajectories, Insilico has built the largest proprietary dataset of full drug discovery campaigns in the AI industry. Each completed program trains the next generation of models. This compounding data advantage — the 0-to-DC flywheel — makes each subsequent program faster and cheaper. No other AI drug company has this volume of end-to-end experimental data.
In a strategic shift, Insilico has begun providing training and benchmarking services to foundation model companies and LLM vendors — including Liquid Networks and others — to help them build drug-discovery-specific capabilities. The model: Insilico provides reinforcement learning signals, curated molecular datasets, and real-world experimental validation. Foundation model companies provide compute and architectural innovation. Insilico then tests the resulting models experimentally in its wet labs, closing the loop between in silico prediction and in vitro/in vivo reality. This positions Insilico as both a drug company and the industry’s benchmarking standard — the proving ground where AI models graduate from molecular generation to actual drug candidates.
A Development Candidate is not a molecule. It is a dossier of roughly 25–40 preclinical studies that together justify the first human dose. The sequence below is what the FDA expects in an IND under 21 CFR 312.23. Each step is wet-lab work. AI accelerates step 1 dramatically. The rest of the pipeline has barely changed since 1995.
A typical US integrated AI biotech takes 18–30 months and $20–50 million to go from preclinical candidate to IND submission. The bulk of that time is GLP toxicology (which runs in calendar time regardless of compute) and CMC scale-up (kilogram API synthesis, stability studies, formulation, release testing).
AI compresses target identification and hit-to-lead by 60–80%. It compresses GLP tox by roughly 0%. The wet-lab bottleneck is therefore a larger fraction of total timeline for AI companies than for traditional pharma — a counter-intuitive result that explains why US AI biotechs report similar target-to-IND times (24–36 months) to the better traditional pharma programmes despite much faster hit generation.
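The bottleneck argument above can be checked with a short worked example. The sketch below assumes a hypothetical baseline of 24 months of discovery work and 24 months of wet-lab work; those durations, the function name `wet_lab_share`, and the 70% compression (the midpoint of the 60–80% range quoted above) are illustrative choices, not figures disclosed by any company.

```python
def wet_lab_share(discovery_months: float,
                  wet_lab_months: float,
                  discovery_compression: float,
                  wet_lab_compression: float = 0.0) -> tuple[float, float]:
    """Return (total_months, wet_lab_fraction) after compressing each stage.

    A compression of 0.70 means the stage takes 30% of its baseline time.
    """
    discovery = discovery_months * (1.0 - discovery_compression)
    wet_lab = wet_lab_months * (1.0 - wet_lab_compression)
    total = discovery + wet_lab
    return total, wet_lab / total


# Hypothetical baseline: no AI compression on either stage.
baseline_total, baseline_share = wet_lab_share(24, 24, 0.0)

# AI case: discovery compressed by 70%, GLP tox and CMC by roughly 0%.
ai_total, ai_share = wet_lab_share(24, 24, 0.70)

print(f"baseline: {baseline_total:.1f} months, wet lab = {baseline_share:.0%}")
print(f"with AI:  {ai_total:.1f} months, wet lab = {ai_share:.0%}")
```

Under these assumed inputs the total timeline shrinks by about a third while the wet-lab share rises from half to roughly three-quarters, which is the counter-intuitive effect described above.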
Loaded med-chem FTE cost in China is $80–120k/year vs $250–400k in the US. GLP toxicology at NMPA-accredited CROs runs $1–3M vs $5–10M at Charles River or Covance. CMC turn-around on kilogram API is weeks in Suzhou, months in New Jersey. Chinese CROs run 6–7 day weeks and rotating 24-hour shifts on CMC programmes.
The compound effect: target→IND in 24 months for ~$5M in China vs 36 months for $20M+ in the US. For a platform running 30 programmes, that is the difference between 10 INDs a year and 3. It is also the difference between Insilico's output and Recursion's.
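The 10-versus-3 comparison above is not derived in the text. A minimal sketch of one possible reading follows, assuming throughput is capped both by pipeline turnover (programmes divided by cycle time) and by a fixed annual wet-lab budget. The $50M budget is a hypothetical number chosen purely for illustration, and the function name and parameters are likewise assumptions. Attrition is ignored.

```python
def inds_per_year(programmes: int,
                  months_to_ind: float,
                  annual_budget_usd_m: float,
                  cost_per_ind_usd_m: float) -> float:
    """Steady-state INDs per year under two caps: pipeline and budget."""
    # Pipeline cap: each programme slot yields one IND per cycle.
    pipeline_limit = programmes / (months_to_ind / 12.0)
    # Budget cap: how many IND packages the annual budget can pay for.
    budget_limit = annual_budget_usd_m / cost_per_ind_usd_m
    return min(pipeline_limit, budget_limit)


BUDGET_USD_M = 50.0  # hypothetical annual budget, not from the report

# Timelines and costs quoted above: 24 months / ~$5M vs 36 months / $20M+.
china = inds_per_year(30, 24, BUDGET_USD_M, 5.0)
us = inds_per_year(30, 36, BUDGET_USD_M, 20.0)

print(f"China-style stack: {china:.1f} INDs/year")
print(f"US-style stack:    {us:.1f} INDs/year")
```

With these inputs the budget cap binds in both cases, so the gap is driven mainly by cost per IND rather than by calendar time. A different assumed budget would change the absolute numbers but not that conclusion.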
Figures consolidated from Paul et al. Nat Rev Drug Discov 2010 inflated to 2024 dollars, DiMasi 2016, and disclosed programme budgets from Recursion, Relay, Schrödinger, Insilico.
The strategic implication for US AI biotechs. Pure AI speed is not a moat if you run the wet-lab portion of your programme through the same US CRO network the incumbents use. The bottleneck rewards vertical integration (Recursion's BioHive-2, Insilico's Life Star), geographic arbitrage (Insilico HK/Suzhou), or exit to big pharma before the DC bill lands (Nimbus/TYK2 is the canonical example). Platform-only business models — Atomwise, Schrödinger, Isomorphic — effectively subcontract the expensive phase to partners while booking smaller upfronts and longer-dated biobucks.
The US AI drug-discovery story has four distinct eras: the CNN pioneers (2012–2016), the generative + phenomics wave (2016–2020), the AlphaFold / IPO frenzy (2020–2022), and the LLM era with its consolidation correction (2022–2026). The timeline below maps every landmark event by dollar, deal, or data point.
Every landmark AI drug-discovery deal involving a US sponsor, sorted by total potential value. "Biobucks" = upfront + research milestones + development milestones + sales milestones + royalties. Actual cash paid is usually 5–15% of headline value over the life of the contract.
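To make the 5–15% rule of thumb concrete, the sketch below converts a headline biobucks figure into an implied cash range. The function name `implied_cash_range` is hypothetical, and applying a sector-wide rule of thumb to any single deal is an assumption: actual payments depend on milestones that may never trigger.

```python
def implied_cash_range(headline_usd_m: float,
                       low_pct: float = 5.0,
                       high_pct: float = 15.0) -> tuple[float, float]:
    """Return (low, high) implied lifetime cash in $M for a headline value."""
    if not 0 <= low_pct <= high_pct <= 100:
        raise ValueError("percentages must satisfy 0 <= low <= high <= 100")
    return (headline_usd_m * low_pct / 100.0,
            headline_usd_m * high_pct / 100.0)


# Example: a $2.75B headline deal, expressed in $M.
low, high = implied_cash_range(2750.0)
print(f"Implied cash over contract life: ${low:.1f}M to ${high:.1f}M")
```

The same helper can be run over a list of deals to compare them on implied cash rather than headline value, which tends to compress the apparent gap between the largest and smallest deals.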
The US raised roughly 3× the capital deployed by China-based AI drug-discovery companies. It still runs an IND productivity ratio of roughly 1:3 per dollar of AI funding vs the Chinese ecosystem. Four structural reasons explain the gap, and none of them are about AI.
Most US AI biotechs sell a platform: Schrödinger licences software, Isomorphic ships design services, Atomwise runs virtual screens. The companies that tried to build both platform and pipeline (Recursion, Relay, Exscientia) subcontracted wet lab to US CROs running on US cost and US calendars.
Insilico in contrast runs Pharma.AI (target ID + generative chemistry + clinical prediction) and a wholly-owned robotic wet lab (Life Star 1, Suzhou) and partners with WuXi / Pharmaron for GLP tox and CMC. End-to-end beats platform-only on throughput.
Loaded FTE cost: US med-chem $250–400k; China $80–120k. GLP toxicology in China $1–3M per programme; US $5–10M. CMC kilogram API in China: 4–8 weeks; US: 3–6 months. In vivo PK in two species: $500k in China, $1.5–2M in the US.
Compounded across a 25-study DC package, a Chinese programme costs roughly one-fifth of the US equivalent and completes in roughly half the calendar time.
NMPA reduced CTA review to ~60 days (from 150+) in 2017 reforms. FDA IND defaults to 30 days but requires a much thicker package. Australian TGA CTN allows first-in-human dosing with minimal review, which is why Insilico's INS018_055 first-in-human (Feb 2022) was conducted in Australia before cross-filing.
China/Australia/HK bundling creates a 6–12 month acceleration over US-only strategy for early phase.
China's CRO ecosystem runs 6–7 day weeks and rotating 24-hour shifts, and it sits in physical proximity to AI compute (WuXi has formal ML partnerships). The US CRO industry is union-adjacent and geographically dispersed. AI efficiency compounds on top of wet-lab speed; when the wet lab is slow, the compound rate is slow.
An integrated AI+CRO stack produces PCC-per-week outputs that would take a US platform-only model a quarter to replicate.
A full, independent audit of the Chinese AI drug-discovery ecosystem — including Insilico, XtalPi, Galixir, StoneWise, Neomer and the broader Four Dragons landscape — is published at aipharmachina.com. The Chinese portal documents roughly 80% of the global AI-discovered PCC output coming from one platform operating on one-third the capital of US peers.
India is 3–5 years behind China in AI drug discovery — not because of talent, but because of infrastructure. With the world’s largest software talent pool (5M+ developers), global leadership in generic pharmaceuticals (60% of the world’s vaccines), and 1.4 billion genetically diverse citizens, India has the raw ingredients. Generative AI is the inflection point: it shifts discovery from wet labs to dry labs, playing directly to India’s strengths. Companies like Jubilant Therapeutics, Verseon, and Elucidata are already in clinical stages. The full analysis is at aipharmaindia.com.
Big pharma's AI spend is concentrated in three buckets: genetic target ID, molecular design, and clinical-trial operations. The real budget flows through deal biobucks to the AI-native partners rather than pure in-house build. Below: the nine most active pharma AI programmes serving the US market.
The single largest buyer of external AI drug-discovery work in 2022–2026. Deals with Isomorphic ($1.7B), Insilico ($2.75B), Genesis ($670M/pgm), Iambic (2024), Verge ($694M), plus the Morphic cash acquisition ($3.2B, Aug 2024). Internal LillyTOI discovery engine, and OpenAI antimicrobial partnership 2024.
The $6.1B Nimbus TYK2 deal (2022), the single largest AI-adjacent deal in history, went to Takeda (now TAK-279), not BMS. Schrödinger computational-chemistry collaboration. Evotec TPD deal ($200M upfront). Insitro ALS deal ($2B biobucks). Sustained multi-year commitment; heavy user of physics-based design.
AI-driven target identification via partnerships with Tempus, Schrödinger, XtalPi. Internal PfizerWorks computational platform. Acquired Trillium Therapeutics ($2.3B, 2021) for CD47 programme. Relatively less noisy than BMS or Lilly on external AI biotech deals but heavier on internal ML-ops.
Partnerships with Atomwise (2020, 3 targets up to $610M), Absci (2022), and AiCure on clinical-trial ops. Internal Modeller platform; invested in Variational AI, Cyrus Biotech, others. Historically conservative on AI-biotech biobucks but active acquirer (Acceleron, Peloton Therapeutics).
Acquired deCODE Genetics ($415M 2012; valued >$14B in talent+data leverage) as the industry's largest genetic target-ID asset. Generate Biomedicines partnership ($1.9B, 2022). BigHat Biosciences partnership (2023). Internal ML group staffed with ex-Google Brain and Genentech talent.
Operates the Regeneron Genetics Center with exome sequencing of 500,000+ participants (the largest private sequencing operation outside 23andMe/UK Biobank). Multiple partnerships with AbCellera and other antibody-discovery AI platforms. Internal ML focused on exome-linked drug ID.
BenevolentAI deal ($800M+, 2019, CKD/IPF) — reference case for disappointing biobucks. Absci oncology deal ($247M biobucks, 2023). Tempus multimodal foundation-model deal ($200M, 2024). Internal AI Research unit based in Cambridge UK with heavy NLP/imaging work.
Exscientia $5.2B biobucks (2022, 15 targets). Insilico $1.2B biobucks (2022). Atomwise $1B+ biobucks (2022). Reported acquisition of Atomwise (~$100M+, 2024). Recursion phenomics ($20M upfront, 2024). Most diversified portfolio of external AI partners of any big pharma.
Isomorphic Labs ($1.2B, Jan 2024). Schrödinger ($2.3B, 2023). Generate Biomedicines acquisition (~$1B, late 2024). Ongoing Microsoft Research partnership. Data42 internal data platform. Heavy bet on AI + protein design.
Annual AI drug-discovery venture funding peaked at $5.2B in 2021, bottomed at $2.2B in 2023, and recovered to $5B+ in 2024 on the back of a single round (Xaira, $1B). The generative-AI / LLM cycle reset expectations upward but corporate VC (Lilly Ventures, Novo Holdings, Eli Lilly Asia Ventures) has replaced exotic growth capital (SoftBank, Tiger Global) as the most consistent backer.
Third-party investment only; excludes pharma biobucks. Sources: BCG October 2022 analysis (Jayatunga et al.); Deep Pharma Intelligence; BiopharmaTrend; company disclosures.
Combined NIH AI-drug-discovery-adjacent funding runs at approximately $500M/year (2024) across grants, contracts, and intramural IT infrastructure. That is dwarfed by private capital but materially larger than Chinese equivalents on the basic-science end.
With IPO windows narrow and SPAC routes discredited, the 2023–2025 cycle has been the M&A cycle. Pharma bought the platforms it had been leasing. AI-native biotechs bought one another for scale. The deals below are the largest.
The public AI-drug-discovery basket peaked in Q1 2021 as COVID liquidity, AlphaFold narrative, and SPAC money all compounded. Two years later it was down 70–90% across the board. In 2024–2025 Tempus AI's IPO and Recursion's Exscientia deal rerated the highest-quality names upward; BenevolentAI, Absci, and others remain at or near all-time lows.
| Ticker | IPO date | IPO raise | IPO price | Peak price | Peak market cap | 2025 price range | 2025 market cap | Peak drawdown |
|---|---|---|---|---|---|---|---|---|
| TEM Tempus AI | Jun 2024 | $410M | $37 | ~$80 (Aug 2024) | ~$18B | $55–70 | ~$9B | –25% |
| SDGR Schrödinger | Feb 2020 | $232M | $17 | ~$110 (Jan 2021) | ~$8B | $20–25 | ~$1.75B | –78% |
| RXRX Recursion | Apr 2021 | $436M | $18 | ~$45 (Jul 2021) | ~$10B | $5–7 | ~$2B | –80% |
| RLAY Relay | Jul 2020 | $400M | $20 | ~$48 (Feb 2021) | ~$6B | $3–5 | ~$500M | –90% |
| ABCL AbCellera | Dec 2020 | $556M | $20 | ~$58 (Feb 2021) | ~$15B | $2.50–3.50 | ~$850M | –94% |
| ABSI Absci | Jul 2021 | $211M | $16 | ~$28 (Oct 2021) | ~$2.5B | $3–5 | ~$450M | –85% |
| EXAI Exscientia | Oct 2021 | $350M | $22 | ~$25 (late 2021) | ~$2.9B | acq. Nov 2024 | $688M | –77% pre-exit |
| BAI.AS BenevolentAI | Apr 2022 SPAC | €232M | €12 | €13 (May 2022) | ~€1.5B | <€1 | ~€150M | –93% |
The lesson of the basket. Every name in the basket except Tempus AI, which listed later (June 2024), drew down more than 75% from peak. Tempus is also the only one with >$500M of diversified commercial revenue, and it is fundamentally a clinical-genomics company, not an AI drug-discovery pure-play. Investors in 2025 discount the "AI" tag entirely for biotech pricing and value these companies on pipeline, partnership revenue, and cash burn: the same metrics as any other biotech.
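The drawdown column in the table above can be sanity-checked from the peak price and the 2025 price range. The sketch below is a minimal price-based calculation; the table does not say whether its drawdown figures are based on price or market cap, so the results need not match it exactly. The function name `drawdown` and the choice of three tickers are illustrative.

```python
def drawdown(peak: float, current: float) -> float:
    """Fractional change from peak to current (negative means a decline)."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return current / peak - 1.0


# (peak price, (low, high) of the 2025 price range), copied from the table.
basket = {
    "SDGR": (110.0, (20.0, 25.0)),
    "RLAY": (48.0, (3.0, 5.0)),
    "ABCL": (58.0, (2.50, 3.50)),
}

for ticker, (peak, (low, high)) in basket.items():
    worst = drawdown(peak, low)   # drawdown at the bottom of the range
    best = drawdown(peak, high)   # drawdown at the top of the range
    print(f"{ticker}: {worst:.0%} to {best:.0%} from peak")
```

Reporting a range rather than a single number reflects the fact that the table gives a 2025 price band, not a point-in-time close.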
The BIOSECURE Act passed the House in September 2024 and heads toward Senate implementation in 2026. It bars federal contracts with named Chinese biotech companies — WuXi AppTec, WuXi Biologics, BGI, MGI Tech, Complete Genomics — with an eight-year wind-down to 2032. The consequences ripple through the AI drug-discovery CRO stack on which many US biotechs depend.
WuXi AppTec runs roughly 35% of small-molecule CMC manufacturing for US biotech (BIOSECURE Act House committee testimony, 2024). Separation costs are estimated at $1–2B across the sector, adding 12–24 months to affected programmes. Indian CROs (Syngene, Aragen Life Sciences, Piramal Pharma Solutions) and US CROs (Catalent, Lonza US) are the primary beneficiaries.
Companies with primarily US-based wet-lab operations are insulated. Tempus AI runs its own CLIA labs. AbCellera has GMP manufacturing coming online in Vancouver in 2025. Schrödinger sells software; no CRO dependency.
The Life Star 1 robotic lab in Suzhou is owned, not rented. Insilico operates through Hong Kong entities with non-restricted CRO networks. But some US-side federal contracts (DoD biodefence, NIH grants) could be affected if a programme is deemed "Chinese-affiliated". Insilico has stated 2025 plans to scale US-side CDMO relationships.
The strategic question for 2026–2030. BIOSECURE accelerates the decoupling of US pharma's wet-lab footprint from China's CRO stack. It does not change the underlying cost or speed differentials — Indian CROs are cheaper than US but still 2–3x more expensive than Chinese. The geopolitical premium paid for "secure" manufacturing is a real cost to US AI-biotech productivity; it may also mean US biotechs increasingly route through Korea (Samsung Biologics), Japan, or India to reclaim some of the cost arbitrage.
Founders, chief scientists, and technologists whose decisions moved the sector. The 2024 Nobel laureates in Chemistry (Baker, with Hassabis and Jumper sharing the other half), multiple Nature covers, and roughly $10B of AI-driven biotech value created between them.
Ten US and US-adjacent university groups account for the majority of AI drug-discovery founder teams, open-source codebases, and Nature/Science covers. The three most consequential are David Baker's IPD (Washington), Demis Hassabis's DeepMind (UK, but open-sourced AlphaFold is a US-pharma enabler), and the broader MIT CSAIL / Broad Institute cluster.
Led by David Baker (2024 Nobel). RoseTTAFold, RFdiffusion, ProteinMPNN. Spawned Xaira co-founders (Hetu Kamisetty), Generate Biomedicines talent, Cyrus Biotech, Arzeda, Neoleukin, A-Alpha Bio, and AÖP Biosciences. Single most prolific drug-discovery founder factory in the sector.
Regina Barzilay's group published Chemprop (message-passing neural nets for molecular property prediction) and early GCN work. Broad Institute's Connectivity Map (CMap) + imaging datasets underlie phenomics approaches (Recursion, Cellarity). MIT Jameel Clinic partnership with Cambridge focuses on antibiotics (halicin 2020).
Vijay Pande's group produced Genesis Therapeutics founder Evan Feinberg and foundational GNN work on molecular property prediction. Pande now runs a16z Bio — the single most active AI-biotech VC partner. Stanford AIMI (AI in Medicine and Imaging) is a major applied-ML hub.
Alán Aspuru-Guzik collaborated with Insilico on 2016–2018 GAN/RL molecular generation papers; now leads the Acceleration Consortium at University of Toronto. Wyss Institute spawned major therapeutics via Don Ingber's lab. Harvard Medical School hosts the Laboratory of Systems Pharmacology under Peter Sorger.
David Koes's group produced GNINA (docking with neural nets), foundational for structure-based AI screening. CMU Computational Biology department is a major pipeline into Roivant, Schrödinger, and Insitro. Josh Bloomer / Russ Schwartz group on cancer genomics ML.
Barry Honig (RosettaFold predecessor methods) and Richard Friesner (Schrödinger founder) anchor the Columbia biophysics tradition. Joachim Frank (2017 Nobel, cryo-EM) built the reconstruction theory that underpins Gandeeva Therapeutics and others.
Andrej Sali's MODELLER (1993) is the foundational homology-modelling code in the field. UCSF QBI (Nevan Krogan) runs proteomics at scale. Major pipeline into BridgeBio and Genentech. UCSF hosted the 2022 RoseTTAFold Protein Contact Prediction collaboration.
Barbara Engelhardt (now Gladstone Institutes) on statistical genomics and Bayesian modelling. Princeton Ludwig Institute ML-in-cancer work. Strong pipeline into Flagship-adjacent companies and Google X.
Three discussion papers in 2023, a draft guidance in January 2025, a formal AI Council in CDER. The FDA has moved faster on AI-in-drug-development than most regulators and is the single most important external dependency for every US AI biotech. The agency's posture is risk-based, context-specific, and explicitly non-prescriptive about model architecture.
The FDA's Jan 2025 draft guidance introduces a risk-based credibility framework. A sponsor must specify:
Higher-risk contexts (e.g., model-informed dose selection in pivotal trials) require substantially more validation than lower-risk contexts (e.g., model-aided lead prioritisation).
The FDA has been clear: an AI-origin drug is held to the same standard as any other drug. No special pathway, no accelerated approval for "AI-discovered" status alone. The DC package (21 CFR 312.23) is unchanged. The pivotal efficacy bar is unchanged. The CMC bar is unchanged.
The only substantive regulatory benefit AI confers today is reduction in preclinical attrition: better ADME/PK, better selectivity, cleaner tox profiles at first-in-human. That may translate into Phase 1 success-rate uplift (as seen in the Jayatunga analysis) but does not change the approval bar.
Programmes closest to a US approval with significant AI contribution to discovery or design:
Six notable failures — covering technical flops, SPAC blowups, management collapses, and strategic mis-bets. Each carries a specific, repeatable lesson for operators, investors, and regulators.
2022 SPAC valuation: ~€1.5B. 2025: ~€150M (–90%). Laid off 180 staff (~50%) in 2023. Lead asset BEN-2293 (atopic dermatitis topical) missed Phase 2 in 2023 — a key blow after the company had publicly positioned text-mining + knowledge graphs as competitive with structure-based methods.
Lesson: Text-mining and knowledge graphs are useful for target repurposing (baricitinib/COVID) but have not yet produced a first-in-class de novo programme that cleared Phase 2. The gap between "the graph suggests X" and "the molecule works in humans" is entirely wet-lab work — which BenevolentAI under-invested in.
DSP-1181 (5-HT1A agonist for OCD, partnered with Sumitomo Dainippon) was the first "AI-designed" molecule in a human trial (Jan 2020). Phase 1 terminated 2021 — did not progress. Exscientia's EXS21546 (A2A/A2B immuno-oncology) followed into discontinuation 2023.
Lesson: AI improves the odds of reaching the clinic. It does not change the biology once there. A "designed" molecule can still have a bad target.
Valo announced a SPAC merger with Khosla Ventures Acquisition II (KVSA) at a $2.8B valuation in Dec 2021. Terminated May 2022 after market conditions deteriorated. Restructured private 2023; raised Series C of ~$175M; headcount down from ~450 to ~300. Novo Nordisk cardiometabolic deal (Mar 2024, $60M upfront, $4.6B biobucks) stabilised the company.
Lesson: SPAC market-timing cut both ways. Valo survived by cutting fast; many peers did not.
Verge Genomics' VRG50635 (PIKfyve inhibitor for ALS) was a flagship for "human-data-first" neurodegeneration AI. First-in-patient Phase 1 data (Mar 2024) showed safety and target engagement but limited efficacy signal. The company discontinued Phase 2 planning and pivoted to other targets.
Lesson: Human-tissue multi-omics helps prioritise but does not guarantee. ALS in particular has produced a decade of promising-preclinical, failing-Phase-2 programmes regardless of discovery modality.
Recursion's REC-994 (Phase 2 SYCAMORE in cerebral cavernous malformation, Q3 2024) missed its primary efficacy endpoint. The company is continuing into Phase 3 on secondary-endpoint signals. For a flagship platform asset, the data were underwhelming and the stock reacted accordingly.
Lesson: Even the best-funded AI platform cannot de-risk biology. Phenomics identifies candidates; it does not validate disease mechanism.
Insitro has raised $743M cumulative at a peak $2.5B valuation. It has no wholly-owned clinical assets as of mid-2026 — eight years post-founding. Its Gilead NASH partnership ended in 2023; BMS ALS partnership continues but has produced no IND. Laid off ~22% Oct 2024.
Lesson: "Machine learning + functional genomics" is a scientifically beautiful thesis with a punishing capital-efficiency profile. Target validation at scale is slow; capital efficiency is almost the opposite of what this model produces.
The AI drug-discovery sector is roughly 14 years old. No Phase 3 success yet, no FDA approval yet, and an unresolved productivity gap with China. The next four years will deliver most of the answers investors have been waiting for.
Most likely candidates (ranked): Takeda TAK-279 (ex-Nimbus TYK2, Schrödinger-designed); Relay RLY-4008 (FGFR2 cholangiocarcinoma); Insilico INS018_055 (TNIK IPF). If TAK-279 wins first, the narrative will emphasise physics-based design; if RLY-4008 wins first, it will emphasise MD + ML; if INS018_055 wins first, it will emphasise end-to-end generative. Industry perception will shift toward whichever approach crosses first.
The ~30 well-funded US AI biotechs will compress to roughly 15 survivors by 2028 through M&A, attrition, and fold-ins. Tempus, Recursion, Schrödinger, Xaira, Isomorphic, Generate, Nimbus (post-next-exit), and two to four others will be the 2030 cohort.
Lilly, BMS, and Novartis will all build meaningfully larger internal AI teams by 2028. The licensing model (pay-per-target) remains; the outsourcing of the full AI stack (pay-for-platform-access) narrows as pharma ML teams mature. Schrödinger and similar sellers face margin pressure.
Companies without owned wet-lab stacks underperform per unit capital. Recursion (BioHive + imaging), Insilico (Life Star 1), and Xaira (integrated foundation-model-plus-lab thesis) are the structural winners. Platform-only sellers (Atomwise-as-Numerion, Isomorphic) face commoditisation pressure from open-source models.
The 2024 Jayatunga analysis reported Phase 1 success rates of 80–90% for AI-discovered molecules, which implies a materially larger AI-origin clinical pipeline entering Phase 3 in 2027–2029. Even with industry-standard Phase 2→3 attrition, 3–5 NDAs by 2030 is plausible. The sector's economic thesis (<$500M per NME) remains untested at approval scale.
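The arithmetic behind an estimate of this kind can be sketched as a stage-by-stage funnel. Only the 80–90% Phase 1 range comes from the Jayatunga analysis cited above. The cohort size of 40 and the Phase 2 and Phase 3 probabilities are illustrative assumptions chosen for this sketch, not figures from this report, and the names `Funnel` and `expected_ndas` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Funnel:
    """Stage-to-stage success probabilities for a clinical cohort."""
    phase1: float  # P(asset advances from Phase 1 to Phase 2)
    phase2: float  # P(asset advances from Phase 2 to Phase 3)
    phase3: float  # P(asset advances from Phase 3 to an NDA filing)


def expected_ndas(phase1_entrants: int, funnel: Funnel) -> float:
    """Expected NDA filings from a cohort that enters Phase 1 together."""
    return phase1_entrants * funnel.phase1 * funnel.phase2 * funnel.phase3


# ASSUMPTION: a hypothetical cohort of 40 AI-origin assets entering Phase 1.
COHORT = 40

# Phase 1 bounds are the 80-90% range from the Jayatunga analysis.
# ASSUMPTION: the Phase 2 (20-30%) and Phase 3 (50%) rates are placeholders.
low_case = Funnel(phase1=0.80, phase2=0.20, phase3=0.50)
high_case = Funnel(phase1=0.90, phase2=0.30, phase3=0.50)

low = expected_ndas(COHORT, low_case)
high = expected_ndas(COHORT, high_case)

print(f"Expected NDAs, low case:  {low:.1f}")
print(f"Expected NDAs, high case: {high:.1f}")
```

Under these placeholder inputs the result moves far more with the Phase 2 assumption than with the Phase 1 range, so any NDA forecast is only as good as its Phase 2 success rate.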
If BIOSECURE fully separates US and Chinese pharma supply chains by 2028, Insilico and its Chinese peers retain their cost and speed advantage only for non-US markets. The US sector may rebuild Indian/Korean CRO redundancy at 2–3x Chinese cost, narrowing but not closing the productivity gap. The global AI drug market bifurcates.
The single most important signal to watch. Whether the first FDA-approved AI-discovered NME comes out of a Chinese-operating programme (Insilico INS018_055) or a US-operating programme (Relay RLY-4008, Takeda TAK-279). If China gets there first on a drug with a US label, it will force a strategic rethink across every US pharma's AI allocation. If the US gets there first, the productivity-gap narrative softens — but does not disappear, because approvals are a lagging indicator and PCC output is the leading one.
Fifty-plus AI-derived molecules are in active US clinical development as of May 2026. The table below lists the most-watched programmes by sponsor, phase, indication, and near-term readout. Green rows are Phase 2b or later; amber are Phase 1b/2a; blue are Phase 1 initiation or earlier.
| Sponsor | Asset | Target / MoA | Indication | Phase | Next readout | NCT |
|---|---|---|---|---|---|---|
| Relay Therapeutics | RLY-4008 (lirafugratinib) | FGFR2-selective | Cholangiocarcinoma (FGFR2 fusion) | Ph 2 pivotal | NDA filing 2025 | NCT04526106 |
| Relay Therapeutics | RLY-2608 | Mutant-selective PI3Kα | HR+/HER2– breast cancer | Ph 1b/2 | Ph 3 plan 2025 | NCT05216432 |
| Relay Therapeutics | RLY-5836 | PI3Kα CNS-penetrant | Advanced solid tumours | Ph 1 | 2025 safety | NCT05759949 |
| Recursion | REC-994 | Undisclosed / superoxide | Cerebral cavernous malformation | Ph 2/3 | Ph 3 design 2025 | NCT05085535 |
| Recursion | REC-2282 | Pan-HDAC | NF2 meningiomas | Ph 2/3 POPLAR | Interim 2025 | NCT05130866 |
| Recursion | REC-4881 | MEK1/2 | FAP | Ph 2 TUPELO | Readout 2026 | NCT05552741 |
| Recursion | REC-3964 | Toxin B inhibitor | C. difficile | Ph 2 | 2025 | NCT05963321 |
| Recursion | REC-1245 | RBM39 degrader | Advanced solid tumours | Ph 1 | 2026 | — |
| Recursion (ex-Exscientia) | GTAEXS617 | CDK7 selective | Solid tumours (HR+ BC, OvCa) | Ph 1/2 | 2025 dose-esc | NCT05985655 |
| Recursion (ex-Exscientia) | EXS74539 | LSD1 | SCLC / AML | Ph 1 | 2025 | NCT06266545 |
| Recursion (ex-Exscientia) | EXS73565 | MALT1 | B-cell malignancies | Ph 1 | 2025 | NCT06136559 |
| Schrödinger | SGR-1505 | MALT1 | B-cell lymphomas | Ph 1 | 2026 dose-esc | NCT05544019 |
| Schrödinger | SGR-2921 | CDC7 | AML / MDS | Ph 1 | 2025 | NCT06077162 |
| Schrödinger | SGR-3515 | Wee1 / Myt1 | Advanced solid tumours | Ph 1 | 2026 | NCT06207526 |
| Absci | ABS-101 | TL1A antibody (AI-designed) | IBD | Ph 1 | 2025 SAD/MAD | NCT06449585 |
| Takeda (ex-Nimbus) | TAK-279 (zasocitinib) | TYK2 allosteric | Plaque psoriasis, PsA, IBD | Ph 3 | Ph 3 readout 2025/26 | NCT06088043 |
| AbCellera | ABCL575 | OX40L antibody | Atopic dermatitis | Ph 1 | 2025 | — |
| Insilico Medicine | INS018_055 (rentosertib) | TNIK inhibitor (AI target + AI mol) | IPF | Ph 2a/b | Ph 2b ongoing | NCT05154240 |
| Insilico Medicine | ISM3412 | MAT2A | MTAP-deleted cancers | Ph 1 | 2025 | NCT06187857 |
| Insilico Medicine | ISM5939 | ENPP1 | Solid tumours | Ph 1 (US) | 2025 | NCT06183294 |
| Generate Biomedicines | GB-0895 | Anti-TSLP antibody | Severe asthma | Ph 1 | 2025 | — |
| Generate Biomedicines | GB-0669 | Anti-RSV antibody | RSV prophylaxis | Ph 1 | 2025 | — |
| Iambic Therapeutics | IAM1363 | HER2-mutant selective | HER2-mutant solid tumours | Ph 1/2 | 2025 dose-esc | NCT06253871 |
| BenevolentAI | BEN-8744 | PDE10 | Ulcerative colitis | Ph 1 | 2025 | — |
| BenevolentAI | BEN-34712 | RARβ agonist | ALS | Ph 1 IND | 2025/26 | — |
| Nimbus Therapeutics | NDI-219216 | HPK1 | Advanced solid tumours | Ph 1 | 2025 | — |
| Valo Health | OPL-0301 | Undisclosed cardiovascular | Post-MI CV | Ph 2 | 2026 | — |
| Valo Health | OPL-0401 | ROCK1/2 | Diabetic retinopathy | Ph 2 | 2026 | — |
| Verge Genomics | VRG50635 | PIKfyve | ALS (winding down) | Ph 1 halted | n/a | NCT04768972 |
Source: ClinicalTrials.gov lookups, company press releases, 10-K/10-Q filings, May 2026. "AI origin" is defined broadly to include molecules where AI played a material role in target selection, hit ID, lead optimisation, or any combination of these. TAK-279 is included because Nimbus/Schrödinger ML drove the discovery programme, although the molecule is now wholly owned by Takeda.
Twelve papers and seven open-source releases account for most of the technical lineage of modern AI drug discovery. All are freely accessible; most have citation counts in the thousands or tens of thousands.
A pattern worth naming. Every foundational open-source release in the sector came from academia or a research lab. Not one came from a US AI biotech. The commercial layer has consumed open-source aggressively (RFdiffusion inside Xaira and Generate, AlphaFold inside every pharma computational chemistry team) and contributed back sparingly. That is consistent with drug-discovery economics (IP concentrated on molecules, not methods) but also explains why the sector's technical narrative is increasingly written in DeepMind's London office and Baker's Seattle lab rather than in Boston or Salt Lake City.
Modern AI drug discovery runs on two inputs: compute and proprietary biological data. NVIDIA has become the default hardware vendor, with equity stakes in at least six US AI biotechs. Proprietary datasets — phenomics images, antibody sequences, DEL binding curves, clinical genomics — are the only durable moats in a world where models are increasingly commoditised.
NVIDIA has taken equity stakes in, or formed strategic partnerships with, Recursion ($50M, Jul 2023), Schrödinger, Iambic, Terray, Evozyne, Generate Biomedicines, and Genesis Therapeutics. Its BioNeMo foundation-model service (GA 2023) is the most-used non-internal model stack in the sector. The bet is obvious: drug discovery is the largest serious enterprise workload for NVIDIA GPUs outside hyperscale AI, and model training at scale requires NVIDIA silicon.
The corollary: NVIDIA has structural pricing power over any AI biotech running 10,000+ GPU training jobs. In Q4 2024 NVIDIA DGX cluster lead times for biotech customers stretched to nine months.
Models commoditise; data does not. The cost of training a state-of-the-art protein language model has fallen ~100x since 2021 (ESM-1 vs ESM-2; AlphaFold 2 vs 3). The cost of generating 65PB of phenomics data, or 500k exomes, or 8M oncology records has not. The US AI biotechs with genuine long-term moats are the ones running proprietary wet-lab or clinical data loops at scale. Pure-model companies face the same margin compression as every other ML-as-a-service vendor confronting open-weights competition.