AI Drug Discovery in America · The Platform that Changed Pharma

01 · Executive Summary

The country that invented the field is no longer winning it.

Fourteen years after Atomwise shipped the first convolutional-neural-net virtual screener, the United States remains the undisputed capital and talent hub of AI drug discovery. It also remains an industry whose headline productivity records are being set 12,000 kilometres away.

The American Lead

The first dedicated AI drug-discovery company (Atomwise, 2012) was American. So were Recursion (2013), Insilico's original Baltimore incorporation (2014), and Relay (2016). AlphaFold 2 came out of DeepMind but its open-source release in July 2021 and 43,000+ citations rewired every US pharma computational chemistry team. By 2024 US-based AI-native biotechs had raised roughly $27–30 billion in venture and public equity, with another $50 billion+ in pharma-deal biobucks layered on top.

Two of the three 2024 Nobel laureates in Chemistry, David Baker and John Jumper, are American. Most of the sector's open-source models (RFdiffusion, ESM, Chroma) trace to US labs. NVIDIA's BioNeMo is an American product. The talent stack is not the problem.

The Productivity Problem

No US-origin AI-designed drug has received FDA approval as of Q2 2026. A decade into the wave, the sector has produced ~50 clinical-stage assets and zero NDAs. The best-funded US platform company, Recursion, has roughly 5 wholly-originated clinical assets despite raising north of $2 billion. Its phase-2 readout on REC-994 missed its primary endpoint in Q3 2024.

By comparison, a single global AI platform (Insilico Medicine) has filed 13 INDs, nominated 30+ PCCs, runs 9 clinical programmes (6 Phase 1, 3 Phase 2), and has logged 0 clinical failures on roughly one-third the capital. It generated $85.8M in 2024 revenue and $56.2M in H1 2025. Industry PCC-per-year records are being set in Suzhou, not Salt Lake City.

$27–30B

Cumulative VC + public equity 2015–2025

$80B+

Committed pharma biobucks

~75

AI-derived molecules in clinic globally

0

FDA approvals of AI-origin NME

~43k

AlphaFold 2 citations (Nov 2025)

2024 Nobel laureates in chemistry

–80%

Median AI biotech IPO drawdown from peak

200+

US AI drug-discovery companies

The thesis of this report. Generating a hit is easy. Generating a quality Development Candidate package — the GLP tox, CMC, ADME/PK, formulation, stability, and safety pharmacology bundle required for an IND — is hard, slow, and expensive. Most US AI biotechs optimised for the part of the funnel AI accelerates (hit generation, SAR, selectivity) while underinvesting in the part AI barely touches (wet-lab integration, CRO orchestration, process chemistry). The companies that treated AI as software-to-sell rather than a pipeline-to-build have produced few drugs. The ones that integrated AI into an end-to-end wet-lab operation have produced many. That second pattern, so far, is almost entirely a Chinese phenomenon.

02 · The Leaderboard

Thirty-two companies. Three tiers. Ranked by output, not hype.

Public tickers, founding years, HQs, cumulative funding, disclosed revenue, clinical-stage asset counts, pipeline metrics (DC / IND / Phase 1-2-3), novelty scores, and landmark deals. Type column distinguishes platform-only vs integrated pipeline plays – the central fault line of the US sector. Status pill encodes current trajectory.

Ranked #1 by pipeline output per dollar: Insilico Medicine. 30+ PCCs, 13 INDs, 6 Phase 1, 3 Phase 2, 0 failures on ~$700M raised – 4.3 PCCs per $100M deployed, the industry record.

#	Company	Status	Type	Founded	HQ	Funding (USD)	FY24 Rev	DCs	INDs	Ph1	Ph2	Ph3	Novelty	Market Cap	Landmark Deal
1	Insilico Medicine (3696.HK) Pharma.AI end-to-end	🟢 Active & Growing	Integrated	2014	Global (NYC · Boston · Abu Dhabi · Shanghai · Suzhou · HK · Montreal · Taipei)	~$700M + HK$2.277B IPO (~$292M)	$85.8M	30+	13	6	3	0	★★★★★	~$5B (HK$36B)	Lilly $2.75B · Sanofi $1.2B · Servier $888M · Menarini $550M+
2	Recursion + Exscientia (RXRX) Phenomics + Centaur (post-merger)	🔴 Declining	Integrated	2013	Salt Lake City, UT	~$2.2B+ (post-merger)	$58.8M	~8	~10	~6	~4	0	★★★	~$2B	Roche $150M upfront / up to $12B (2021); Exscientia $688M acq. (2024)
3	Relay Therapeutics (RLAY) Motion-based / Dynamo	🟢 Active & Growing	Integrated	2016	Cambridge, MA	~$980M	~$10M	~5	4	2	1	0	★★★★	~$500M	Genentech SHP2 $75M upfront / $695M (2020)
4	Schrödinger (SDGR) FEP+ / physics×ML	🟡 Active, Stable	Hybrid	1990	New York, NY	~$380M	$204M	~3	3	3	0	0	★★★★	~$1.75B	Novartis up to $2.3B, $150M upfront (2023)
5	Tempus AI (TEM) Clinical genomics + AI	🟢 Active & Growing	Genomics	2015	Chicago, IL	~$1.7B (pre-IPO $1.3B)	$693M	0	0	0	0	0	★★★	~$9B	AstraZeneca $200M multimodal (2024)
6	AbCellera (ABCL) Microfluidics + ML	🟡 Active, Stable	Biologics	2012	Vancouver, BC	~$700M	$27M	2	1	1	0	0	★★★	~$850M	Lilly bamlanivimab (>$800M royalties)
7	Absci (ABSI) Zero-shot antibody design	🟡 Active, Stable	Biologics	2011	Vancouver, WA	~$600M	$1.9M	~1	1	1	0	0	★★★	~$450M	AstraZeneca $247M biobucks (2023)
8	BenevolentAI (BAI.AS) Knowledge graph	🔴 Declining	Integrated	2013	London, UK	~$292M	~$8M	~2	2	1	1	0	★★★	~$150M	AstraZeneca CKD/IPF up to $800M (2019)
—	Exscientia (acq. 2024) Centaur / Manifold → RXRX $688M	⚫ Acquired → Recursion	Acquired	2012	Oxford, UK	~$860M	~$15M (partial)	3	~5	3	0	0	★★★	$688M exit	Sanofi $100M upfront / $5.2B (2022)
9	Xaira Therapeutics RFdiffusion + ESM	🟢 Active & Growing	Integrated	2023	SF Bay Area	$1B (launch)	n/a	0	0	0	0	0	★★★★	~$2–3B (private)	Self-funded; no pharma deal disclosed
10	Isomorphic Labs AlphaFold 3	🟢 Active & Growing	Hybrid	2021	London (Alphabet)	~$1B (Alphabet + $600M Thrive)	n/a	0	0	0	0	0	★★★★	Private	Lilly $1.7B + Novartis $1.2B (Jan 2024)
—	Generate Biomedicines Chroma generative → Novartis	⚫ Acquired → Novartis ~$1B	Biologics	2018	Somerville, MA	~$670M	n/a	2	2	2	0	0	★★★	~$1B+ (Novartis acq)	Amgen 5 targets / $1.9B (2022)
15	Insitro ML + iPSC functional genomics	🟡 Active, Stable	Hybrid	2018	South SF, CA	~$743M	n/a	0	0	0	0	0	★★★	~$2.5B (2021 peak)	BMS ALS $50M / $2B (2020)
11	Nimbus Therapeutics Virtual pharma + Schrödinger	🟡 Active, Stable	Integrated	2009	Boston, MA	~$710M	n/a	~4	4	2	1	1 (via Takeda)	★★★★	$6.1B TYK2 exit	Takeda TYK2 $4B upfront / $6.1B (2022)
16	Valo Health Opal platform	🔴 Declining	Hybrid	2019	Boston, MA	~$750M	n/a	~3	3	1	2	0	★★	~$2.8B (2021 peak)	Novo Nordisk $60M / $4.6B cardiometabolic (2024)
14	Genesis Therapeutics GEMS / spatiotemporal GNN	🟡 Active, Stable	Integrated	2019	Burlingame, CA	~$280M	n/a	~1	0	0	0	0	★★★	Private	Eli Lilly $670M/program (2024)
—	Atomwise AtomNet CNN → Sanofi	⚫ Acquired → Sanofi	Platform	2012	San Francisco, CA	~$174M	n/a	0	0	0	0	0	★★	~$100M+ exit	Sanofi 5 targets / $1B+ (2022); acq. 2024
20	Verge Genomics CONVERGE human-tissue ML	🔴 Declining	Integrated	2015	South SF, CA	~$150M	n/a	~1	1	1	0	0	★★★	Private	Eli Lilly $25M / $694M 4 targets (2021)
13	Iambic Therapeutics NeuralPLexer / OrbNet	🟡 Active, Stable	Integrated	2020	San Diego, CA	~$220M	n/a	1	1	1 (IAM1363)	0	0	★★★	Private	NVIDIA-backed; Lilly indirect (2024)
22	Terray Therapeutics tNova microwell chemistry	🔵 Pre-Revenue/Early	Platform	2018	Monrovia, CA	~$120M	n/a	0	0	0	0	0	★★	Private	BMS platform deal (2024)
28	Anagenex DEL + ML	🔵 Pre-Revenue/Early	Platform	2020	Cambridge, MA	~$46M	n/a	0	0	0	0	0	★★	Private	Undisclosed pharma collabs
24	Cajal Neuroscience Allen Institute spinout	🔵 Pre-Revenue/Early	Integrated	2021	Seattle, WA	~$115M	n/a	0	0	0	0	0	★★	Private	—
25	BigHat Biosciences Milliner closed-loop Ab	🟡 Active, Stable	Biologics	2019	San Mateo, CA	~$95M	n/a	0	0	0	0	0	★★	Private	Amgen antibody deal (2023)
27	Gandeeva Therapeutics AI + cryo-EM	🔵 Pre-Revenue/Early	Biologics	2020	Burnaby, BC	~$60M	n/a	0	0	0	0	0	★★	Private	Undisclosed
18	Cellarity Cell-state AI (Flagship)	🟡 Active, Stable	Hybrid	2017	Cambridge, MA	~$230M	n/a	0	0	0	0	0	★★★	Private	Novo Nordisk (undisclosed)
19	Deep Genomics RNA therapeutics AI	🔴 Declining	Hybrid	2015	Toronto, ON	~$230M	n/a	0	0	0	0	0	★★	Private	—
29	LabGenius EVA closed-loop Ab	🟡 Active, Stable	Biologics	2012	London, UK	~$46M	n/a	0	0	0	0	0	★★	Private	—
30	Profluent Protein LLMs / OpenCRISPR	🔵 Pre-Revenue/Early	Platform	2022	Berkeley, CA	~$44M	n/a	0	0	0	0	0	★★★	Private	—
26	Cradle Protein design SaaS	🟡 Active, Stable	Platform	2021	Amsterdam / ZRH	~$97M	n/a	0	0	0	0	0	★★	Private	Johnson & Johnson, Novo (SaaS)
31	Evozyne NVIDIA-backed protein design	🔵 Pre-Revenue/Early	Platform	2019	Chicago, IL	~$80M	n/a	0	0	0	0	0	★★	Private	—
—	Morphic Therapeutic (acq. 2024) Schrödinger-founded → LLY $3.2B	⚫ Acquired → Lilly $3.2B	Acquired	2015	Waltham, MA	~$400M	n/a	2	2	1	1	0	★★★	$3.2B exit	Eli Lilly cash acquisition (Aug 2024)
17	Evotec SEEVO PanHunter / PanOmics	🟡 Active, Stable	Platform	1993	Hamburg, DE	—	€770M	0	0	0	0	0	★★	~€1.6B	BMS TPD $200M upfront (2022)

Figures are best-available estimates as of May 2026 based on SEC filings, investor-relations pages, and aggregator databases (Deep Pharma Intelligence, BiopharmaTrend, Crunchbase). Clinical-stage counts include Phase 1 and later; wholly-owned assets only except where noted. Market caps are point-in-time and volatile. Activity status was assessed by an ensemble of LLMs.

03 · Performance Scorecard

The metrics that matter — and how US platforms score.

A decade of AI drug discovery has generated an enormous amount of marketing language about productivity. The numbers below are the ones that actually matter for investors, regulators, and patients: speed to PCC, cost per IND, phase-transition rates, DC package completion, and pipeline output per dollar deployed.

12–15yr

Traditional target → approval

DiMasi/Tufts CSDD 2016; $2.6B capitalised cost per NME.

1.2%

Big-pharma R&D IRR (Deloitte 2022)

Worst in 13 years. Eroom's Law still compounding.

12–18 mo

AI target → PCC (Insilico benchmark)

Vs 3–5 years traditional. Best public benchmark in the sector.

<$500M

Promised AI cost per NME

Vs $2.6B traditional. No AI drug has reached approval yet to test the claim.

Time to Preclinical Candidate, by company

Traditional pharma

48–60 months

Recursion

24–30 months

Exscientia (pre-acq.)

~24 months

Relay Therapeutics

24–28 months

Iambic (IAM1363)

24 months

Insilico Medicine

12–18 months

Company-disclosed figures (2020–2024). Insilico's benchmark (INS018_055, ISM3412) is the publicly verified industry record.

Cost per IND, AI vs traditional

Traditional NME

$80–100M

US AI biotech (avg)

$15–25M

China-integrated AI

$3–5M

External R&D spend from target nomination to IND-enabling package. Traditional figure from Paul et al. Nat Rev Drug Discov 2010 inflated to 2024 dollars. China figure from Insilico HKEX prospectus (~$2.6M external spend per PCC; $3–5M to full IND).

Phase transition success rates

P1→P2 traditional

52%

P1→P2 AI-derived (n=24)

80–90%

P2→P3 traditional

29%

P2→P3 AI-derived

~40%

P3→approval traditional

58%

P3→approval AI-derived

n/a (0 AI NMEs approved)

BIO Industry Analysis (2011–2020) for traditional; Jayatunga et al. Drug Discovery Today 2024 for AI. Caveat: AI sample size small (n=24) and biased toward well-validated targets. The Phase 1 uplift is real (better potency/ADME design). Phase 2 tests biology, not chemistry, so AI's lift narrows. Phase 3 remains untested.
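The per-phase rates above can be chained into cumulative odds. The sketch below is a minimal illustration that multiplies the quoted transition rates, under the simplifying assumption that phases are independent. The variable names are illustrative, and no Phase 3 figure is assumed for AI-derived molecules because none exists yet.

```python
from math import prod

# Transition rates as quoted in the scorecard (fractions, not percentages).
TRADITIONAL = {"P1->P2": 0.52, "P2->P3": 0.29, "P3->approval": 0.58}
# AI-derived Phase 1 success is quoted as a range; Phase 3 is still unobserved.
AI_LOW = {"P1->P2": 0.80, "P2->P3": 0.40}
AI_HIGH = {"P1->P2": 0.90, "P2->P3": 0.40}


def cumulative(rates):
    """Chance of clearing every listed transition, assuming independence."""
    return prod(rates.values())


# Traditional odds of a Phase 1 entrant reaching Phase 3, then approval.
trad_to_p3 = TRADITIONAL["P1->P2"] * TRADITIONAL["P2->P3"]
trad_to_approval = cumulative(TRADITIONAL)

# AI-derived odds of reaching Phase 3 (there is no Phase 3 data to go further).
ai_to_p3_low = cumulative(AI_LOW)
ai_to_p3_high = cumulative(AI_HIGH)

print(f"Traditional, Phase 1 entrant reaching Phase 3: {trad_to_p3:.1%}")
print(f"Traditional, Phase 1 entrant reaching approval: {trad_to_approval:.1%}")
print(f"AI-derived, Phase 1 entrant reaching Phase 3: {ai_to_p3_low:.0%} to {ai_to_p3_high:.0%}")
```

On these inputs roughly 15% of traditional Phase 1 entrants reach Phase 3, against 32–36% for the AI-derived sample, which is about double. With n=24 that gap should be read as indicative rather than settled.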

Pipeline productivity: PCCs nominated per $100M deployed

Insilico Medicine

~4.3 PCCs / $100M

Exscientia (pre-acq.)

~1.1 PCCs / $100M

Relay Therapeutics

~0.7 PCCs / $100M

Recursion

~0.6 PCCs / $100M

BenevolentAI

~0.3 PCCs / $100M

Cumulative PCCs disclosed divided by cumulative capital raised (VC + IPO + follow-on, through 2024). Insilico leads on roughly every productivity axis — the direct result of operating an integrated wet-lab stack inside China's CRO infrastructure.
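The metric is a simple ratio, sketched below for reference. The function names are illustrative. The forward example uses the Insilico figures quoted in the leaderboard (30 PCCs, roughly $700M raised). The inverse helper backs out the PCC count implied by a reported ratio, which is only a rough consistency check because the capital figures are approximate.

```python
def pccs_per_100m(pccs, capital_musd):
    """PCCs nominated per $100M of cumulative capital raised (capital in $M)."""
    if capital_musd <= 0:
        raise ValueError("capital_musd must be positive")
    return pccs / (capital_musd / 100.0)


def implied_pccs(ratio_per_100m, capital_musd):
    """Inverse of the metric: PCC count implied by a reported ratio."""
    return ratio_per_100m * (capital_musd / 100.0)


# Insilico: 30 PCCs on roughly $700M raised, as quoted in the leaderboard.
insilico_ratio = pccs_per_100m(30, 700)
print(f"Insilico: {insilico_ratio:.1f} PCCs per $100M")

# Recursion: a reported ~0.6 on roughly $1.5B implies about nine PCCs.
recursion_implied = implied_pccs(0.6, 1500)
print(f"Recursion implied PCCs: {recursion_implied:.0f}")
```

Because the denominator is capital raised rather than capital spent, the ratio understates productivity for companies that are sitting on unspent cash.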

Head-to-head: the four comparable platforms

Metric	Insilico (Global)	Recursion (US)	Relay (US)	Exscientia (UK)
HQ	Global (8 offices)	Salt Lake City	Cambridge, MA	Oxford (acq.)
Status	🟢 Active & Growing	🔴 Declining	🟢 Active & Growing	⚫ Acquired by RXRX
Founded	2014	2013	2016	2012
Capital raised	~$700M + HK$2.277B IPO (~$292M)	~$1.5B+	~$980M	~$860M
Market cap (2026)	~$5B (HK$36B)	~$2B	~$500M	$688M exit
FY24 revenue	$85.8M	$58.8M	~$10M	~$15M (partial)

H1 2025 revenue

$56.2M

n/d

n/a

Employees

~350

~1,200 (post-merger)

~350

~400 (pre-acq.)

0-to-DC programs completed

30+

PCCs disclosed

30+

Pipeline programs

40+

~20

~10

Clinical-stage assets

9 (6 Ph1 / 3 Ph2)

10 (post-Exscientia)

IND filings cumulative

~10

Clinical failures

1 (REC-994 missed primary)

1 (EXS21546 disc.)

2 (DSP-1181, EXS21546)

Metric	Insilico (Global)	Recursion (US)	Relay (US)	Exscientia (UK)
Target → PCC	12–18 mo	24–30 mo	24–28 mo	~24 mo
External $ / PCC	~$2.6M	~$15–30M est.	~$20–35M est.	~$15–25M est.
Cost per IND	~$3–5M (China)	~$15–25M (US)	~$20M (US)	~$15M (UK)
Primary CRO geography	China + HK + APAC + US	US + limited APAC	US + EU	UK + US
Wet-lab owned	Life Star 1 robotic lab (Suzhou)	Imaging only (BioHive-2)	Modest	Modest (Oxford)
Novelty score	★★★★★	★★★	★★★★	★★★
First AI drug in humans	ISM001-055 Feb 2022	REC-994 (2021)	RLY-1971 (2019)	DSP-1181 Jan 2020 (disc.)

The productivity delta is not about talent or technology. Insilico's Pharma.AI stack (PandaOmics + Chemistry42 + inClinico) is comparable in capability to Recursion OS, Relay's Dynamo, and Exscientia's Centaur. The difference is that Insilico pairs its AI with a Chinese CRO+CDMO stack running 6–7 day weeks at one-fifth US loaded cost and a wholly-owned robotic wet-lab (Life Star 1, Suzhou). The US platforms that never vertically integrated paid for that choice in PCCs-per-dollar and INDs-per-year.

Insilico’s China Strategy – Competing Where Efficiency Wins

Insilico deliberately expanded into China to compete with efficient local companies. By establishing Life Star 1 (a robotic wet-lab in Suzhou), leveraging China’s CDE regulatory fast-track paths, and accessing provincial biotech cluster incentives (Zhangjiang, Suzhou BioBAY), Insilico achieved a cost structure of ~$3–5M per IND vs $15–25M in the US. This is not cost-cutting; it is strategic positioning at the intersection of AI capability and operational efficiency.

The 0-to-DC Data Flywheel

The most important data in drug discovery is generated between target identification (0) and development candidate (DC). This is where the real science happens: potency optimization, selectivity profiling, ADME/PK, in vivo efficacy, safety pharmacology. With 30+ completed 0-to-DC trajectories, Insilico has built the largest proprietary dataset of full drug discovery campaigns in the AI industry. Each completed program trains the next generation of models. This compounding data advantage — the 0-to-DC flywheel — makes each subsequent program faster and cheaper. No other AI drug company has this volume of end-to-end experimental data.

Democratizing AI Drug Discovery – Training the Next Generation

In a strategic shift, Insilico has begun providing training and benchmarking services to foundation model companies and LLM vendors — including Liquid Networks and others — to help them build drug-discovery-specific capabilities. The model: Insilico provides reinforcement learning signals, curated molecular datasets, and real-world experimental validation. Foundation model companies provide compute and architectural innovation. Insilico then tests the resulting models experimentally in its wet labs, closing the loop between in silico prediction and in vitro/in vivo reality. This positions Insilico as both a drug company and the industry’s benchmarking standard — the proving ground where AI models graduate from molecular generation to actual drug candidates.

04 · The Development Candidate Problem

AI wins the hit. The DC package wins the IND.

A Development Candidate is not a molecule. It is a dossier of roughly 25–40 preclinical studies that together justify the first human dose. The sequence below is what the FDA expects in an IND under 21 CFR 312.23. Each step is wet-lab work. AI accelerates step 1 dramatically. The rest of the pipeline has barely changed since 1995.

Step	Duration	Cost
Potency & Selectivity	3–4 mo	$0.3–0.8M
ADME / PK in vitro	3–4 mo	$0.5–1.2M
In vivo PK (2 species)	4–6 mo	$0.8–2M
Efficacy models	4–8 mo	$1–3M
Safety pharmacology	3–5 mo	$0.8–2M
GLP tox (28-day, 2 sp.)	6–9 mo	$3–10M
CMC & GMP supply	9–15 mo	$5–20M
Genotoxicity + IND write	3–5 mo	$0.5–1.5M
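To see what the eight steps add up to, the sketch below totals the quoted cost and duration ranges. It is a rough tally, not a project plan. How the steps overlap is not stated here, so the code reports two bounds that are assumptions of this sketch: a fully serial sum as the upper bound and the single longest step as the lower bound.

```python
# (step, months_low, months_high, cost_low_musd, cost_high_musd), as listed above.
STEPS = [
    ("Potency & selectivity", 3, 4, 0.3, 0.8),
    ("ADME / PK in vitro", 3, 4, 0.5, 1.2),
    ("In vivo PK (2 species)", 4, 6, 0.8, 2.0),
    ("Efficacy models", 4, 8, 1.0, 3.0),
    ("Safety pharmacology", 3, 5, 0.8, 2.0),
    ("GLP tox (28-day, 2 species)", 6, 9, 3.0, 10.0),
    ("CMC & GMP supply", 9, 15, 5.0, 20.0),
    ("Genotoxicity + IND write", 3, 5, 0.5, 1.5),
]

# External cost is additive regardless of how the steps are scheduled.
cost_low = sum(step[3] for step in STEPS)
cost_high = sum(step[4] for step in STEPS)

# Upper bound on calendar time: every step run strictly one after another.
serial_low = sum(step[1] for step in STEPS)
serial_high = sum(step[2] for step in STEPS)

# Lower bound on calendar time: everything overlaps with the longest step.
longest_low = max(step[1] for step in STEPS)
longest_high = max(step[2] for step in STEPS)

print(f"Summed cost: ${cost_low:.1f}M to ${cost_high:.1f}M")
print(f"Fully serial: {serial_low} to {serial_high} months")
print(f"Longest single step: {longest_low} to {longest_high} months")
```

CMC and GLP toxicology together account for most of both totals, which is why the rest of this section treats them as the bottleneck.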

Why the DC package is the real bottleneck

A typical US integrated AI biotech takes 18–30 months and $20–50 million to go from preclinical candidate to IND submission. The bulk of that time is GLP toxicology (which runs in calendar time regardless of compute) and CMC scale-up (kilogram API synthesis, stability studies, formulation, release testing).

AI compresses target identification and hit-to-lead by 60–80%. It compresses GLP tox by roughly 0%. The wet-lab bottleneck is therefore a larger fraction of total timeline for AI companies than for traditional pharma — a counter-intuitive result that explains why US AI biotechs report similar target-to-IND times (24–36 months) to the better traditional pharma programmes despite much faster hit generation.
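The arithmetic behind that counter-intuitive result resembles Amdahl's law: compressing one stage makes the untouched stage dominate. The sketch below uses illustrative inputs, which are assumptions chosen from ranges quoted in this report: 48 months for traditional target-to-PCC and 24 months for the PCC-to-IND wet-lab package. It then applies the 60–80% compression to the first stage only.

```python
def timeline(discovery_months, wetlab_months, compression):
    """Return (total_months, wetlab_share) after compressing discovery only.

    compression is the fraction of discovery time removed (0.7 = 70% faster);
    the wet-lab stage is left untouched, mirroring the roughly 0% effect on GLP tox.
    """
    if not 0.0 <= compression < 1.0:
        raise ValueError("compression must be in [0, 1)")
    compressed = discovery_months * (1.0 - compression)
    total = compressed + wetlab_months
    return total, wetlab_months / total


# Illustrative assumptions, not company data.
DISCOVERY_MONTHS = 48.0
WETLAB_MONTHS = 24.0

results = {c: timeline(DISCOVERY_MONTHS, WETLAB_MONTHS, c) for c in (0.0, 0.6, 0.8)}
for compression, (total, share) in results.items():
    print(f"compression {compression:.0%}: total {total:.1f} mo, wet-lab share {share:.0%}")
```

With these inputs the wet-lab share rises from about a third of the timeline to between 56% and 71%, while the overall timeline shrinks by only 40–53%.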

Why Chinese CROs shift the equation

Loaded med-chem FTE cost in China is $80–120k/year vs $250–400k in the US. GLP toxicology at NMPA-accredited CROs runs $1–3M vs $5–10M at Charles River or Covance. CMC turn-around on kilogram API is weeks in Suzhou, months in New Jersey. Chinese CROs run 6–7 day weeks and rotating 24-hour shifts on CMC programmes.

The compound effect: target→IND in 24 months for ~$5M in China vs 36 months for $20M+ in the US. For a platform running 30 programmes, that is the difference between 10 INDs a year and 3. It is also the difference between Insilico's output and Recursion's.
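One way to read the 10-versus-3 comparison is as a fixed-budget calculation. The sketch below is a hedged illustration only: the $50M annual external budget is a hypothetical figure chosen for the example and is not taken from any company disclosure. A second helper shows the pure cycle-time view (programmes in flight divided by cycle length), which ignores both budget and attrition.

```python
def inds_per_year_by_budget(annual_budget_musd, cost_per_ind_musd):
    """INDs a fixed annual external budget can fund at a given cost per IND."""
    return annual_budget_musd / cost_per_ind_musd


def inds_per_year_by_pipeline(programmes_in_flight, cycle_months):
    """Steady-state output if every programme in flight completes (no attrition)."""
    return programmes_in_flight / (cycle_months / 12.0)


HYPOTHETICAL_BUDGET_MUSD = 50.0  # assumption for illustration only

china_budget_view = inds_per_year_by_budget(HYPOTHETICAL_BUDGET_MUSD, 5.0)
us_budget_view = inds_per_year_by_budget(HYPOTHETICAL_BUDGET_MUSD, 20.0)

china_cycle_view = inds_per_year_by_pipeline(30, 24)
us_cycle_view = inds_per_year_by_pipeline(30, 36)

print(f"Budget view: {china_budget_view:.1f} vs {us_budget_view:.1f} INDs/year")
print(f"Cycle-time view: {china_cycle_view:.1f} vs {us_cycle_view:.1f} INDs/year")
```

Under the budget view the fourfold cost gap yields roughly 10 against 2.5 INDs a year. The cycle-time view alone yields 15 against 10, so most of the gap in output per dollar comes from cost rather than speed.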

The 21 CFR 312.23 IND checklist

Required modules

Form FDA 1571 — Cover sheet
Table of Contents
Introductory Statement & General Investigational Plan — 1-year forward view
Investigator's Brochure — integrated preclinical + clinical summary
Clinical Protocol — FIH Phase 1 design, stopping rules, dose-esc
CMC — identity, potency, purity, stability, manufacturing, release specs
Pharmacology & Toxicology — PK/ADME ≥2 species, GLP tox 14d + 28d rodent + non-rodent, safety pharm, genotoxicity battery
Previous Human Experience — if any

Typical wall-clock post-PCC

GLP tox studies: 12–18 months, $3–10M
CMC / GMP supply: 12–24 months, $5–20M
Full IND package: 18–30 months, $20–50M
Pre-IND FDA meeting: +3 months (optional, common)
IND submission: +30 day FDA review clock
First-in-human: Typical 24–36 months post-PCC in US; 18–24 in China

Figures consolidated from Paul et al. Nat Rev Drug Discov 2010 inflated to 2024 dollars, DiMasi 2016, and disclosed programme budgets from Recursion, Relay, Schrödinger, Insilico.

The strategic implication for US AI biotechs. Pure AI speed is not a moat if you run the wet-lab portion of your programme through the same US CRO network the incumbents use. The bottleneck rewards vertical integration (Recursion's BioHive-2, Insilico's Life Star), geographic arbitrage (Insilico HK/Suzhou), or exit to big pharma before the DC bill lands (Nimbus/TYK2 is the canonical example). Platform-only business models — Atomwise, Schrödinger, Isomorphic — effectively subcontract the expensive phase to partners while booking smaller upfronts and longer-dated biobucks.

05 · Historical Timeline

Fourteen years, four waves, one thesis under pressure.

The US AI drug-discovery story has four distinct eras: the CNN pioneers (2012–2016), the generative + phenomics wave (2016–2020), the AlphaFold / IPO frenzy (2020–2022), and the LLM era with its consolidation correction (2022–2026). The timeline below maps every landmark event by dollar, deal, or data point.

2012

Atomwise founded (first mover)

Abraham Heifets and Izhar Wallach launch Atomwise in San Francisco. AtomNet — the first CNN-based structure-based virtual screening system — becomes one of the earliest dedicated AI drug discovery platforms. Y Combinator W15. Eventually raises ~$174M across rounds, rebrands as Numerion Labs in 2026.

2012

Exscientia founded (UK, US-active)

Andrew Hopkins spins Exscientia out of the University of Dundee, pairing AI-driven design with DMTA cycles. Will become the first AI biotech to put a designed molecule into human trials (DSP-1181, Jan 2020, partnered with Sumitomo).

2013

Recursion founded

Chris Gibson, Blake Borgeson, and Dean Li found Recursion in Salt Lake City around a phenomics model: brute-force cellular imaging across millions of conditions, ML-derived morphological embeddings. Spun out of the University of Utah. Over the next decade raises ~$1.5B, builds BioHive-1 and BioHive-2 GPU clusters, and becomes the US sector's largest platform by capital and pipeline breadth.

2013

BenevolentAI founded (UK, US deals)

Ken Mulvany launches Stratified Medical (rebranded BenevolentAI 2014) with a text-mining + knowledge graph thesis. Will reach a peak private valuation of ~$2B before SPAC-listing in 2022 and collapsing >95%.

2014

Insilico Medicine founded (end-to-end)

Alex Zhavoronkov founds Insilico Medicine in Baltimore, pivoting a longevity-research thesis into a generative AI pharmacology company. Seminal 2016–2018 GAN/RL molecular-generation work is published with Alán Aspuru-Guzik at Harvard. By 2019 Insilico establishes a Hong Kong R&D hub and starts the China CRO integration that will later define its productivity edge.

2015

AtomNet preprint published

Atomwise posts "AtomNet: A Deep CNN for Bioactivity Prediction in Structure-based Drug Discovery" to arXiv (1510.02855). One of the first applications of deep learning to docking. Cited thousands of times; foundational to the CNN era.

2016

Relay Therapeutics founded

Third Rock spins Relay out of D.E. Shaw Research, licensing the Anton supercomputer heritage for "motion-based drug design". Raises ~$520M pre-IPO from Third Rock, SoftBank, GV, D.E. Shaw. Lead programme RLY-4008 (FGFR2, cholangiocarcinoma) will receive Breakthrough Therapy designation in 2023.

2017

BenevolentAI $115M round

BenevolentAI reaches a private valuation near $2B on a $115M raise, cementing early-era hype around text-mining/knowledge-graph approaches. By 2023 it will have laid off roughly half its staff.

2018

AlphaFold 1 wins CASP13 (inflection)

DeepMind's first competition entry tops overall CASP13 rankings in December. Nature paper in 2020. The signal: deep learning can do structural biology. Every pharma computational chemistry team rewrites its roadmap.

2018

Insitro founded

Daphne Koller (Coursera, ex-Calico) launches Insitro with an ML-plus-functional-genomics thesis and a $100M Series A from ARCH and a16z. Will later raise a total of ~$743M at a peak $2.5B valuation without producing an IND.

2018

Generate Biomedicines founded

Flagship Pioneering spins out Generate with a generative-protein thesis. Will later develop Chroma, a generative diffusion model analogous to David Baker's RFdiffusion, publish it in Nature (2023), and ink a $1.9B Amgen deal in 2022.

2019

BenevolentAI / AstraZeneca $800M+ deal

AstraZeneca signs BenevolentAI for chronic kidney disease and idiopathic pulmonary fibrosis targets, later extended to heart failure. Structured as up to $800M+ in biobucks. Five years later, the collaboration is widely cited as the canonical example of knowledge-graph repurposing over-promising and under-delivering.

2020

AlphaFold 2 solves protein folding (seismic)

At CASP14 (Nov 2020), AlphaFold 2 posts a median GDT_TS of ~92 and scores above 90, a level comparable to experimental accuracy, on roughly two-thirds of targets. Nature paper and open-source code release in July 2021; 200M-structure database released via EMBL-EBI in July 2022. ~43,000 citations by Nov 2025. Hassabis, Jumper, Baker win the 2024 Nobel Prize in Chemistry. Among the most cited deep-learning papers of the decade.

2020

BenevolentAI / baricitinib COVID story

Feb 2020: BenevolentAI's knowledge graph suggests baricitinib (Olumiant, JAK1/2) for COVID-19. Lancet publication (Richardson et al.) leads to NIH ACTT-2, FDA EUA Nov 2020, full approval May 2022. The first "AI-suggested" COVID therapeutic — though strictly repurposing, not de novo design.

2020

IPO wave begins

Schrödinger (Feb 2020, $202M, SDGR). Relay Therapeutics (Jul 2020, $400M, RLAY). AbCellera (Dec 2020, $556M, ABCL). COVID-era liquidity meets AlphaFold narrative. Every AI biotech on deck reprices upward.

2021

Peak hype: $5.2B AI drug VC; four more IPOs

Third-party AI drug-discovery investment peaks at $5.2B (BCG). Recursion IPO April ($436M, RXRX, $2.9B val). Absci IPO July ($211M). Exscientia IPO October ($350M). Atomwise Series C ($123M). Insitro Series C ($400M). Exscientia's DSP-1181, which in Jan 2020 became the first AI-designed molecule to enter a human clinical trial, is terminated. Isomorphic Labs founded Nov 2021.

2022

Nimbus / Takeda TYK2 $6.1B record exit

Feb 2023 close (announced Dec 2022): Takeda acquires Nimbus Lakshmi (TYK2 allosteric inhibitor NDI-034858) for $4B cash plus $2B in milestones. $6.1B total. Largest AI-adjacent deal ever, built on Schrödinger's physics-based platform inside Nimbus's virtual-pharma model. Sanofi-Insilico (Nov 2022) $21.5M upfront / $1.2B biobucks across 6 targets. Broader biotech sector collapses; XBI down ~50% peak-to-trough.

2023

Trough of disillusionment

AI drug VC falls to ~$2.2B. Exscientia's EXS21546 (A2A/A2B immuno-oncology) discontinued. BMS-Exscientia collaboration partially cancelled. BenevolentAI lays off 180 (~50%). Recursion REC-994 Phase 2 shows modest effects. Recursion acquires Cyclica and Valence Labs (May 2023, ~$47M + ~$40M all-stock) as bolt-ons. NVIDIA invests $50M in Recursion, launches BioNeMo.

2024

Consolidation year; biggest biobucks in sector history

Isomorphic Labs announces same-day $82.5M combined upfronts from Lilly ($45M) and Novartis ($37.5M) for up to $3B biobucks combined (Jan). Xaira Therapeutics launches with $1B seed (ARCH + Foresite, April), one of the largest biotech launch rounds in history. Tempus AI IPO (Jun, $410M, $6.1B val). Morphic Therapeutic acquired by Lilly ($3.2B cash, Aug). Recursion acquires Exscientia ($688M all-stock, announced Aug, closed Nov). Novartis reportedly acquires Generate Biomedicines (~$1B, late 2024). Insilico-Lilly deal announced ($115M upfront, $2.75B biobucks). Sanofi reportedly acquires Atomwise (~$100M+, 2024).

2024–25

AlphaFold 3 published; first FDA AI-drug guidance

AlphaFold 3 (Nature, May 2024): 50%+ improvement on protein-ligand-nucleic-acid complex prediction over prior methods. Hassabis and Jumper share half of the 2024 Nobel Prize in Chemistry; Baker receives the other half for computational protein design. FDA releases draft guidance "Considerations for the Use of AI to Support Regulatory Decision-Making for Drug and Biological Products" (Jan 2025), defining the Context-of-Use credibility framework.

2025

Insilico HKEX IPO; Recursion–Roche extension

Insilico Medicine lists on HKEX in late 2025 (ticker 3696, ~$292M raised). Becomes first AI drug-discovery company to IPO in Hong Kong. Recursion extends Roche/Genentech partnership with a fresh $150M upfront tranche. FDA formalises the AI Council in CDER/CBER.

2026

R&D productivity records in China; BIOSECURE takes effect

Insilico-Lilly $2.75B biobucks deal formally announced March 2026 ($115M upfront). Insilico–Servier $888M (Jan 2026). PCC production records set by AI platforms operating in China. BIOSECURE Act implementation begins, restricting federal contract work with Chinese CROs (WuXi AppTec, BGI). US and China AI drug-discovery sectors formally bifurcate on regulatory terms.

06 · Deal Tracker

$80B+ in biobucks committed. One exit above $5B.

Every landmark AI drug-discovery deal involving a US sponsor, sorted by total potential value. "Biobucks" = upfront + research milestones + development milestones + sales milestones + royalties. Actual cash paid is usually 5–15% of headline value over the life of the contract.
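The gap between headline and cash can be made concrete with the upfronts quoted in this tracker. The sketch below is illustrative: it computes each deal's upfront as a share of headline value, and applies the 5–15% rule of thumb to the $80B+ total. The deal labels are shorthand, and the list covers only deals for which a firm upfront figure is stated.

```python
# (deal, upfront_musd, headline_musd), using figures quoted in this tracker.
DEALS = [
    ("Nimbus Lakshmi (TYK2 exit)", 4000.0, 6100.0),
    ("Insilico / Eli Lilly", 115.0, 2750.0),
    ("Isomorphic / Eli Lilly", 45.0, 1700.0),
    ("Isomorphic / Novartis", 37.5, 1200.0),
    ("Insilico / Sanofi", 21.5, 1200.0),
    ("Exscientia / Sanofi", 100.0, 5200.0),
    ("Valo Health / Novo Nordisk", 60.0, 4600.0),
    ("Recursion / Roche-Genentech", 150.0, 12000.0),
]


def upfront_share(upfront_musd, headline_musd):
    """Upfront cash as a fraction of the headline biobucks value."""
    return upfront_musd / headline_musd


def cash_band(headline_musd, low=0.05, high=0.15):
    """Expected lifetime cash under the 5-15% rule of thumb."""
    return headline_musd * low, headline_musd * high


shares = {name: upfront_share(up, total) for name, up, total in DEALS}
for name, share in shares.items():
    print(f"{name}: {share:.1%} upfront")

sector_low, sector_high = cash_band(80_000.0)
print(f"$80B headline implies roughly ${sector_low / 1000:.0f}B to ${sector_high / 1000:.0f}B in cash")
```

The Nimbus transaction stands apart because it was an asset acquisition paid mostly in cash, while the collaboration deals carry upfronts in the low single digits as a share of headline.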

Mega-deals > $2B

Deal	Terms	Total value	Date
Recursion → Roche/Genentech	40 programmes, neuroscience + GI oncology. $150M upfront. Extended 2025 with fresh $150M tranche.	$12B biobucks	Dec 2021
Nimbus Lakshmi → Takeda (largest exit)	TYK2 allosteric inhibitor NDI-034858 (now TAK-279) for plaque psoriasis. $4B cash upfront, $2B in milestones.	$6.1B	Dec 2022
Exscientia → Sanofi	15 oncology + immunology targets. $100M upfront. Largest AI-biotech biobucks headline as of 2022. Partially cancelled in 2023.	$5.2B	Jan 2022
Valo Health → Novo Nordisk	11 cardiometabolic programmes. $60M upfront. Largest single Valo deal after failed SPAC.	$4.6B	Mar 2024
Insilico → Eli Lilly	Oral small-molecule therapeutic, AI-discovered via Pharma.AI. $115M upfront + milestones + royalties.	$2.75B	2024/26

Strategic deals $500M–$2B

Deal	Terms	Total value	Date
Insitro → BMS	ALS and FTD targets. $50M upfront.	$2.0B	2020
Generate Biomedicines → Amgen	5 targets, generative protein design. ~$50M upfront reported.	$1.9B	Jan 2022
Isomorphic Labs → Eli Lilly	Multiple small-molecule targets via AlphaFold-based design. $45M upfront.	$1.7B	Jan 2024
Isomorphic Labs → Novartis	3 undisclosed small-molecule targets. $37.5M upfront. Same-day signing as Lilly.	$1.2B	Jan 2024
Insilico → Sanofi	6 AI-discovered targets. $21.5M upfront + equity. Template for subsequent Insilico biobucks deals.	$1.2B	Nov 2022
Exscientia → BMS	Oncology preclinical work. $50M upfront. Partially unwound 2023.	$1.2B	2021
Recursion → Bayer	Fibrosis focus, extended 2023. ~$50M upfront reported.	$1.0B	2020
Generate Biomedicines → Novartis	Immunology targets. Rumoured Novartis acquisition late 2024 at ~$1B.	$1.0B+	Oct 2023
Insilico → Servier	USP1 inhibitor programme. $32M upfront.	$888M	Jan 2026
BenevolentAI → AstraZeneca	Chronic kidney disease, heart failure, fibrosis. Historical benchmark: over-promised and under-delivered.	$800M+	2019
Relay → Genentech	SHP2 programme (GDC-1971 / migoprotafib). $75M upfront.	$695M	Dec 2020
Verge Genomics → Eli Lilly	4 neuro targets. $25M upfront + $50M equity.	$694M	Feb 2021
Genesis Therapeutics → Eli Lilly	Multi-target neuroscience. ~$20M upfront.	$670M/pgm	Feb 2024
Absci → Almirall	2 AI-designed antibodies, dermatology. $5.3M upfront.	$650M	Jan 2024

Annual AI drug-discovery deal volume (US sponsors, total biobucks)

2020

$3B

2021

$5B

2022

$10B (Nimbus $6.1B)

2023

$6B

2024

$15B+ (Isomorphic, Insilico/Lilly, Valo/Novo, Morphic acq.)

2025 YTD

$5B+

07 · The China Productivity Gap

Why the records are being set 12,000km away.

The US raised roughly 3× the capital deployed by China-based AI drug-discovery companies. It still runs an IND productivity ratio of roughly 1:3 per dollar of AI funding vs the Chinese ecosystem. Four structural reasons explain the gap, and none of them are about AI.

1. End-to-end vs platform-only

Most US AI biotechs sell a platform: Schrödinger licences software, Isomorphic ships design services, Atomwise runs virtual screens. The companies that tried to build both platform and pipeline (Recursion, Relay, Exscientia) subcontracted wet lab to US CROs running on US cost and US calendars.

Insilico in contrast runs Pharma.AI (target ID + generative chemistry + clinical prediction) and a wholly-owned robotic wet lab (Life Star 1, Suzhou) and partners with WuXi / Pharmaron for GLP tox and CMC. End-to-end beats platform-only on throughput.

2. Cost structure

Loaded FTE cost: US med-chem $250–400k; China $80–120k. GLP toxicology in China $1–3M per programme; US $5–10M. CMC kilogram API in China: 4–8 weeks; US: 3–6 months. In vivo PK in two species: $500k in China, $1.5–2M in the US.

Compounded across a 25-study DC package, a Chinese programme costs roughly one-fifth of the US equivalent and completes in roughly half the calendar time.

3. Regulatory rhythm

NMPA reduced CTA review to ~60 days (from 150+) in 2017 reforms. FDA IND defaults to 30 days but requires a much thicker package. Australian TGA CTN allows first-in-human dosing with minimal review, which is why Insilico's INS018_055 first-in-human (Feb 2022) was conducted in Australia before cross-filing.

China/Australia/HK bundling creates a 6–12 month acceleration over US-only strategy for early phase.

4. Wet-lab integration

China's CRO ecosystem runs 6–7 day weeks and rotating 24-hour shifts, and sits physically close to AI compute (WuXi has formal ML partnerships). The US CRO industry is union-adjacent and geographically dispersed. AI efficiency compounds on top of wet-lab speed; when the wet lab is slow, the compound rate is slow.

An integrated AI+CRO stack produces PCC-per-week outputs that would take a US platform-only model a quarter to replicate.

A full, independent audit of the Chinese AI drug-discovery ecosystem — including Insilico, XtalPi, Galixir, StoneWise, Neomer and the broader Four Dragons landscape — is published at aipharmachina.com. The Chinese portal documents roughly 80% of the global AI-discovered PCC output coming from one platform operating on one-third the capital of US peers.


AI Drug Discovery in India – An Emerging Opportunity

India is 3–5 years behind China in AI drug discovery, not because of talent but because of infrastructure. With the world’s largest software talent pool (5M+ developers), global leadership in generic pharmaceuticals and vaccine supply (60% of the world’s vaccines), and 1.4 billion genetically diverse citizens, India has the raw ingredients. Generative AI is the inflection point: it shifts discovery from wet labs to dry labs, playing directly to India’s strengths. Companies such as Jubilant Therapeutics, Verseon, and Elucidata already have clinical-stage programmes. The full analysis is at aipharmaindia.com.


08 · Big Pharma AI Centres

Every major US pharma now runs an AI centre. Few produce AI drugs.

Big pharma's AI spend is concentrated in three buckets: genetic target ID, molecular design, and clinical-trial operations. Most of the budget flows through deal biobucks to AI-native partners rather than pure in-house build. Below: the nine most active big-pharma AI programmes (six US-headquartered, three European).

Eli Lilly · most aggressive

The single largest buyer of external AI drug-discovery work in 2022–2026. Deals with Isomorphic ($1.7B), Insilico ($2.75B), Genesis ($670M/pgm), Iambic (2024), Verge ($694M), plus the Morphic cash acquisition ($3.2B, Aug 2024). Internal LillyTOI discovery engine, and OpenAI antimicrobial partnership 2024.

Bristol-Myers Squibb

The $6.1B Nimbus TYK2 sale (2022), the single largest AI-adjacent deal in history, went to Takeda (now TAK-279) rather than BMS, which competes in TYK2 with its own programme. Schrödinger computational-chemistry collaboration. Evotec TPD deal ($200M upfront). Insitro ALS deal ($2B biobucks). Sustained multi-year commitment; heavy user of physics-based design.

Pfizer

AI-driven target identification via partnerships with Tempus, Schrödinger, XtalPi. Internal PfizerWorks computational platform. Acquired Trillium Therapeutics ($2.3B, 2021) for CD47 programme. Relatively less noisy than BMS or Lilly on external AI biotech deals but heavier on internal ML-ops.

Merck & Co.

Partnerships with Atomwise (2020, 3 targets up to $610M), Absci (2022), and AiCure on clinical-trial ops. Internal Modeller platform; invested in Variational AI, Cyrus Biotech, others. Historically conservative on AI-biotech biobucks but active acquirer (Acceleron, Peloton Therapeutics).

Amgen

Acquired deCODE Genetics ($415M 2012; valued >$14B in talent+data leverage) as the industry's largest genetic target-ID asset. Generate Biomedicines partnership ($1.9B, 2022). BigHat Biosciences partnership (2023). Internal ML group staffed with ex-Google Brain and Genentech talent.

Regeneron

Operates the Regeneron Genetics Center with exome sequencing of 500,000+ participants (the largest private sequencing operation outside 23andMe/UK Biobank). Multiple partnerships with AbCellera and other antibody-discovery AI platforms. Internal ML focused on exome-linked drug ID.

AstraZeneca

BenevolentAI deal ($800M+, 2019, CKD/IPF) — reference case for disappointing biobucks. Absci oncology deal ($247M biobucks, 2023). Tempus multimodal foundation-model deal ($200M, 2024). Internal AI Research unit based in Cambridge UK with heavy NLP/imaging work.

Sanofi

Exscientia $5.2B biobucks (2022, 15 targets). Insilico $1.2B biobucks (2022). Atomwise $1B+ biobucks (2022). Reported acquisition of Atomwise (~$100M+, 2024). Recursion phenomics ($20M upfront, 2024). Most diversified portfolio of external AI partners of any big pharma.

Novartis

Isomorphic Labs ($1.2B, Jan 2024). Schrödinger ($2.3B, 2023). Generate Biomedicines acquisition (~$1B, late 2024). Ongoing Microsoft Research partnership. Data42 internal data platform. Heavy bet on AI + protein design.

09 · VC & Investment Landscape

2015–2025 cumulative: $27–30B of US AI drug-discovery venture and public equity.

Annual AI drug-discovery venture funding peaked at $5.2B in 2021, bottomed at $2.2B in 2023, and recovered to about $5B in 2024, helped by a single $1B round (Xaira). The generative-AI / LLM cycle reset expectations upward, but corporate VC (Lilly Ventures, Novo Holdings, Lilly Asia Ventures) has replaced exotic growth capital (SoftBank, Tiger Global) as the most consistent backer.

Annual US AI drug-discovery VC volume (BCG + supplemental estimates)

Year	VC volume
2015	$0.45B
2018	$0.9B
2019	$1.1B
2020	$2.4B
2021 (peak)	$5.2B
2022	$3.5B
2023 (trough)	$2.2B
2024 (rebound)	$5.0B
2025 YTD	$4.0B+ est.

Third-party investment — excludes pharma biobucks. Sources: BCG 2022 Oct analysis (Jayatunga et al.); Deep Pharma Intelligence; BiopharmaTrend; company disclosures.
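The peak, trough, and rebound in the annual series can be checked directly. The sketch below is a minimal illustration that uses only the years listed above; 2016 and 2017 are not listed, and the 2025 figure is a year-to-date lower bound, so the summed total understates the full 2015–2025 period. The variable and function names are illustrative.

```python
# Annual US AI drug-discovery VC volume in $B, as listed above.
# 2016 and 2017 are absent from the source series; 2025 is a YTD estimate.
VC_BY_YEAR = {
    2015: 0.45,
    2018: 0.9,
    2019: 1.1,
    2020: 2.4,
    2021: 5.2,
    2022: 3.5,
    2023: 2.2,
    2024: 5.0,
    2025: 4.0,
}


def pct_change(start, end):
    """Percentage change from start to end."""
    return (end - start) / start * 100


listed_total = sum(VC_BY_YEAR.values())
peak_to_trough = pct_change(VC_BY_YEAR[2021], VC_BY_YEAR[2023])
rebound = pct_change(VC_BY_YEAR[2023], VC_BY_YEAR[2024])

print(f"Sum of listed years: ${listed_total:.2f}B")
print(f"2021 peak to 2023 trough: {peak_to_trough:.1f}%")
print(f"2023 trough to 2024 rebound: {rebound:+.1f}%")
```

The listed years sum to a little under $25B, which suggests the $27–30B cumulative figure also counts the unlisted years and public equity.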

Tier 1 VCs

ARCH Venture Partners — Xaira $1B lead, Insitro, Generate, Recursion
Flagship Pioneering — Generate Biomedicines, Cellarity, Valo Health (all in-house)
Foresite Labs — Xaira co-lead
Third Rock Ventures — Relay Therapeutics founding
a16z Bio — Insitro, BigHat, Genesis, Profluent

Corporate / strategic

GV (Google Ventures) — Recursion, Relay, Isomorphic
NVIDIA — Recursion ($50M 2023), Iambic, Terray, Evozyne, Generate
Lilly Asia Ventures — Insilico (through Series C), others
Novo Holdings — Cellarity, Valo
SoftBank Vision Fund — Relay, Exscientia, Insitro (2021 cycle)

Specialist / data-pure

Lux Capital — Recursion, Xaira
Data Collective (DCVC) — Recursion, AbCellera
Sequoia — Xaira, Insilico
Baillie Gifford — Recursion, Tempus
OrbiMed — Insilico, others

NIH / ARPA-H public funding

$130M

NIH Bridge2AI (4 years, 2022–2026)

$20M/yr

NCATS ASPIRE

$50M/yr

NCI AI cancer grants

$1B+

ARPA-H (2022 launch; AI incl.)

Aggregate NIH AI-drug-discovery-adjacent funding runs at approximately $500M/year (2024) across grants, contracts, and intramural IT infrastructure. That is dwarfed by private capital but materially larger than Chinese equivalents on the basic-science end.

10 · The Acquisition Wave

2023–2025: the consolidation era.

With IPO windows narrow and SPAC routes discredited, the 2023–2025 cycle has been the M&A cycle. Pharma bought the platforms it had been leasing. AI-native biotechs bought one another for scale. The deals below are the largest.

Recursion acquires Exscientia

All-stock, announced Aug 2024, closed Nov 2024. Exscientia shareholders received 0.7729 RXRX shares per EXAI share. Combines Recursion's phenomics platform with Exscientia's Centaur/Manifold design stack. Post-merger: largest public AI drug company by pipeline breadth, ~1,200 employees.

$688M

Nov 2024

Eli Lilly acquires Morphic Therapeutic

$3.2B cash. MORF-057 (α4β7 integrin oral, IBD) Phase 2b-ready asset is the strategic prize. Schrödinger holds equity — nets ~$77M on the transaction.

$3.2B

Aug 2024

Novartis acquires Generate Biomedicines

Reported ~$1B acquisition late 2024. Consolidates Novartis's bet on generative protein therapeutics after the Oct 2023 immunology collaboration.

~$1B

late 2024

Sanofi acquires Atomwise

Reported $100M+ acquisition during 2024. Historical Atomwise-Sanofi partnership (2022, $20M upfront, $1B+ biobucks) converts to outright ownership. Atomwise rebrands to Numerion Labs.

~$100M+

2024

Recursion acquires Cyclica

All-stock. Polypharmacology / proteome-wide screening platform. Integrated into Recursion OS.

~$40M

May 2023

Recursion acquires Valence Labs

All-stock. Generative chemistry. Integrated into Recursion OS after the acquisition.

~$47.5M

May 2023

Tempus AI acquires Ambry Genetics

$600M all-cash. Expands hereditary cancer testing; extends Tempus's clinical-genomic dataset.

$600M

Nov 2024

Relay Therapeutics acquires ZebiAI

All-stock. DNA-encoded library + ML capability bolt-on. Closed 2021 before Relay stock decline.

undisclosed

2021

Amgen acquires deCODE Genetics (historical)

$415M cash 2012. Valued today for the genetic target-ID leverage at roughly $14.3B in effective R&D NPV. Still the single most consequential AI-adjacent pharma acquisition in US history.

$415M

2012

11 · Public Market Performance

From $15B peak to $1B trough — and partway back.

The public AI-drug-discovery basket peaked in Q1 2021 as COVID liquidity, AlphaFold narrative, and SPAC money all compounded. Two years later it was down 70–90% across the board. In 2024–2025 Tempus AI's IPO and Recursion's Exscientia deal rerated the highest-quality names upward; BenevolentAI, Absci, and others remain at or near all-time lows.

Ticker / company	IPO date	IPO raise	IPO price	Peak price	Peak market cap	2025 price range	2025 market cap	Peak drawdown
TEM Tempus AI	Jun 2024	$410M	$37	~$80 (Aug 2024)	~$18B	$55–70	~$9B	–25%
SDGR Schrödinger	Feb 2020	$232M	$17	~$110 (Jan 2021)	~$8B	$20–25	~$1.75B	–78%
RXRX Recursion	Apr 2021	$436M	$18	~$45 (Jul 2021)	~$10B	$5–7	~$2B	–80%
RLAY Relay	Jul 2020	$400M	$20	~$48 (Feb 2021)	~$6B	$3–5	~$500M	–90%
ABCL AbCellera	Dec 2020	$556M	$20	~$58 (Feb 2021)	~$15B	$2.50–3.50	~$850M	–94%
ABSI Absci	Jul 2021	$211M	$16	~$28 (Oct 2021)	~$2.5B	$3–5	~$450M	–85%
EXAI Exscientia	Oct 2021	$350M	$22	~$25 (late 2021)	~$2.9B	acq. Nov 2024	$688M	–77% pre-exit
BAI.AS BenevolentAI	Apr 2022 SPAC	€232M	€12	€13 (May 2022)	~€1.5B	<€1	~€150M	–93%

The lesson of the basket. Every AI biotech in the basket that IPO'd in 2020–2021 drew down more than 75% from peak; Tempus AI, which listed in June 2024, is the exception. Tempus is also the only one with >$500M of diversified commercial revenue, and it is fundamentally a clinical-genomics company, not an AI drug-discovery pure-play. Investors in 2025 discount the "AI" tag entirely for biotech pricing and value these companies on pipeline, partnership revenue, and cash burn: the same metrics as any other biotech.
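For readers who want to reproduce the drawdown column, the sketch below recomputes it from the peak price and the midpoint of the 2025 price range for the US-dollar tickers. Using the range midpoint is an assumption made here for illustration; the table does not state its method. On this basis most rows land within a few points of the table, while RXRX comes out nearer –87% than –80%, which suggests some rows may follow market cap rather than share price.

```python
# (peak price, 2025 range low, 2025 range high) in USD, taken from the table above.
PRICES = {
    "TEM": (80, 55, 70),
    "SDGR": (110, 20, 25),
    "RXRX": (45, 5, 7),
    "RLAY": (48, 3, 5),
    "ABCL": (58, 2.5, 3.5),
    "ABSI": (28, 3, 5),
}


def drawdown_pct(peak, current):
    """Percentage change from the peak price to the current price."""
    return (current - peak) / peak * 100


def midpoint_drawdown(ticker):
    """Drawdown measured at the midpoint of the 2025 price range (an assumption)."""
    peak, low, high = PRICES[ticker]
    return drawdown_pct(peak, (low + high) / 2)


for ticker in PRICES:
    print(f"{ticker}: {midpoint_drawdown(ticker):.0f}%")
```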

12 · The BIOSECURE Factor

A policy shift that redraws the CRO map.

The BIOSECURE Act passed the House in September 2024 and is headed for Senate action and implementation in 2026. It bars federal contracts with named Chinese biotech companies (WuXi AppTec, WuXi Biologics, BGI, MGI Tech, Complete Genomics), with an eight-year wind-down to 2032. The consequences ripple through the AI drug-discovery CRO stack on which many US biotechs depend.

Direct CRO impact

WuXi AppTec runs roughly 35% of small-molecule CMC manufacturing for US biotech (BIOSECURE Act House committee testimony, 2024). Separation costs are estimated at $1–2B across the sector, adding 12–24 months to affected programmes. Indian CROs (Syngene, Aragen Life Sciences, Piramal Pharma Solutions) and US CROs (Catalent, Lonza US) are the primary beneficiaries.

Tempus / AbCellera / Schrödinger: minimal exposure

Companies with primarily US-based wet-lab operations are insulated. Tempus AI runs its own CLIA labs. AbCellera has GMP manufacturing coming online in Vancouver in 2025. Schrödinger sells software; no CRO dependency.

Insilico: partially insulated

Life Star 1 robotic lab in Suzhou is owned, not rented. Insilico operates through Hong Kong entities with non-restricted CRO networks. But some US-side federal contracts (DoD biodefence, NIH grants) could be affected if a programme is deemed "Chinese-affiliated". Insilico has stated 2025 plans to scale US-side CDMO relationships.

The strategic question for 2026–2030. BIOSECURE accelerates the decoupling of US pharma's wet-lab footprint from China's CRO stack. It does not change the underlying cost or speed differentials — Indian CROs are cheaper than US but still 2–3x more expensive than Chinese. The geopolitical premium paid for "secure" manufacturing is a real cost to US AI-biotech productivity; it may also mean US biotechs increasingly route through Korea (Samsung Biologics), Japan, or India to reclaim some of the cost arbitrage.

13 · Key People

Fifteen builders who shaped the US AI drug-discovery stack.

Founders, chief scientists, and technologists whose decisions moved the sector. Three Nobel laureates (Baker, Hassabis, and Jumper, who shared the 2024 Chemistry prize), multiple Nature covers, and roughly $10B of AI-driven biotech value created between them.

David Baker

Univ. of Washington · IPD

2024 Nobel Chemistry (shared) for computational protein design. His lab created RoseTTAFold and RFdiffusion, which underpin the Xaira, Generate, and Isomorphic protein stacks. Co-founded Arzeda, Neoleukin, others.

Demis Hassabis

Isomorphic Labs · DeepMind

2024 Nobel Chemistry (shared). Founder of DeepMind; CEO of Isomorphic Labs. AlphaFold 1/2/3. Drives the most cited deep-learning paper of the decade and the tightest pharma-licencing narrative in the sector.

John Jumper

DeepMind

2024 Nobel Chemistry (shared). Senior staff research scientist at DeepMind; lead author on AlphaFold 2. Among the most recruited researchers in the field post-Nobel.

Daphne Koller

Insitro

CEO and founder of Insitro. Former Calico, Coursera, Stanford. Pioneer of probabilistic graphical models; now pairs ML with iPSC functional genomics. $743M raised, 0 INDs to date.

Chris Gibson

Recursion

Co-founder and CEO of Recursion. PhD University of Utah. Built the largest public AI biotech by headcount, cash, and compute (BioHive-2, 504 NVIDIA H100 GPUs across 63 DGX nodes).

Alex Zhavoronkov

Insilico Medicine

Founder and CEO of Insilico Medicine. Wrote the first paper pairing deep learning with drug discovery for aging (2016). Architect of Pharma.AI. Listed in Hong Kong in 2025. 30+ PCCs, 13 INDs: an industry record.

Richard Friesner

Schrödinger

Co-founder of Schrödinger; professor at Columbia University. Architect of FEP+, the gold-standard free-energy perturbation method in physics-based drug design. Foundational figure for the field's non-ML side.

Mark Murcko

Relay · ex-Vertex CSO

Co-founder of Relay Therapeutics; ex-CSO of Vertex. Pioneered structure-based drug design practice at Vertex in the 1990s. Key architect of Dynamo / motion-based design thesis.

Eric Lefkofsky

Tempus AI

Founder and CEO of Tempus AI. Serial entrepreneur (Groupon). Converted Tempus from clinical-genomics lab into largest real-world-data platform in US oncology. Public June 2024.

Andrew Hopkins

Exscientia

Founder of Exscientia, University of Dundee. First AI-designed drug into human trial (DSP-1181). Left the CEO role in February 2024, before the Recursion merger (announced Aug 2024). Advocate for integrated design-make-test-learn.

Marc Tessier-Lavigne

Xaira Therapeutics

CEO of Xaira. Former President of Stanford, ex-CSO of Genentech, Rockefeller. Raised $1B seed (April 2024) — largest biotech launch round in history. Neuroscience + generative design thesis.

Carl Hansen

AbCellera

Founder and CEO of AbCellera. UBC professor. Built the largest antibody-discovery platform globally via single-cell microfluidics; IPO'd Dec 2020 at $15B peak valuation.

Bruce Booth

Atlas Venture · Nimbus

Atlas Venture partner and Nimbus Therapeutics founding investor/chairman. Architect of the Nimbus virtual-pharma + AI model. Engineered the $6.1B TYK2 exit to Takeda (2022).

Sean McClain

Absci

Founder and CEO of Absci. Built the first zero-shot AI antibody design platform; ABS-101 (TL1A IBD) is Absci's first AI-originated molecule in clinic (2025).

Abraham Heifets

Atomwise / Numerion

Co-founder of Atomwise (2012). Trained at University of Toronto. AtomNet CNN preprint (2015) is foundational for the CNN era. Rebranded Atomwise as Numerion Labs in 2026.

14 · Academic Foundations

The research labs that seeded the sector.

Ten US and US-adjacent university groups account for the majority of AI drug-discovery founder teams, open-source codebases, and Nature/Science covers. The three most consequential are David Baker's IPD (Washington), Demis Hassabis's DeepMind (UK-based, but its open-source AlphaFold is a US-pharma enabler), and the broader MIT CSAIL / Broad Institute cluster.

Institute for Protein Design (Univ. of Washington)

Led by David Baker (2024 Nobel). RoseTTAFold, RFdiffusion, ProteinMPNN. Spawned Xaira co-founders (Hetu Kamisetty), Generate Biomedicines talent, Cyrus Biotech, Arzeda, Neoleukin, A-Alpha Bio, and AÖP Biosciences. Single most prolific drug-discovery founder factory in the sector.

MIT CSAIL / Barzilay Lab / Broad Institute

Regina Barzilay's group published Chemprop (message-passing neural nets for molecular property prediction) and early GCN work. Broad Institute's Connectivity Map (CMap) + imaging datasets underlie phenomics approaches (Recursion, Cellarity). MIT Jameel Clinic partnership with Cambridge focuses on antibiotics (halicin 2020).

Stanford / Pande Lab

Vijay Pande's group produced Genesis Therapeutics founder Evan Feinberg and foundational GNN work on molecular property prediction. Pande now runs a16z Bio — the single most active AI-biotech VC partner. Stanford AIMI (AI in Medicine and Imaging) is a major applied-ML hub.

Harvard / Wyss Institute / Aspuru-Guzik (now Toronto)

Alán Aspuru-Guzik collaborated with Insilico on 2016–2018 GAN/RL molecular generation papers; now leads the Acceleration Consortium at University of Toronto. Wyss Institute spawned major therapeutics via Don Ingber's lab. Harvard Medical School hosts the Laboratory of Systems Pharmacology under Peter Sorger.

Carnegie Mellon University

David Koes's group produced GNINA (docking with neural nets), foundational for structure-based AI screening. CMU Computational Biology department is a major pipeline into Roivant, Schrödinger, and Insitro. Josh Bloomer / Russ Schwartz group on cancer genomics ML.

Columbia / Honig & Friesner

Barry Honig (RoseTTAFold predecessor methods) and Richard Friesner (Schrödinger co-founder) anchor the Columbia biophysics tradition. Joachim Frank (2017 Nobel, cryo-EM) built the reconstruction theory that underpins Gandeeva Therapeutics and others.

UCSF / Sali Lab / QBI

Andrej Sali's MODELLER (1993) is the foundational homology-modelling code in the field. UCSF QBI (Nevan Krogan) runs proteomics at scale. Major pipeline into BridgeBio and Genentech. UCSF hosted the 2022 RoseTTAFold Protein Contact Prediction collaboration.

Princeton / Engelhardt Group

Barbara Engelhardt (now Gladstone Institutes) on statistical genomics and Bayesian modelling. Princeton Ludwig Institute ML-in-cancer work. Strong pipeline into Flagship-adjacent companies and Google X.

15 · Regulatory Landscape

The FDA is the adult in the room.

Two discussion papers in 2023, a draft guidance in January 2025, a formal AI Council in CDER. The FDA has moved faster on AI-in-drug-development than most regulators and is the single most important external dependency for every US AI biotech. The agency's posture is risk-based, context-specific, and explicitly non-prescriptive about model architecture.

Timeline of FDA action

May 2023: FDA Discussion Paper, "Using AI & ML in the Development of Drug & Biological Products"
May 2023: FDA Discussion Paper, "AI in Drug Manufacturing"
Oct 2023: FDA holds public workshop on AI in drug development (1,200+ attendees)
2024: CDER/CBER AI Council formalised under PDUFA VII commitments
Jan 2025: Draft Guidance, "Considerations for the Use of AI to Support Regulatory Decision-Making", which defines the Context-of-Use (COU) credibility framework
2025: First FDA AI-drug acceptance criteria memo (internal)

Context-of-Use (COU) framework

The FDA's Jan 2025 draft guidance introduces a risk-based credibility framework. A sponsor must specify:

Question of interest — what regulatory question the model addresses
Context of use — the specific role of the model output in decision-making
Model risk — influence of the model on the decision × consequence of a wrong decision
Credibility activities — validation, verification, uncertainty quantification commensurate with risk

Higher-risk contexts (e.g., model-informed dose selection in pivotal trials) require substantially more validation than lower-risk contexts (e.g., model-aided lead prioritisation).
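The "influence × consequence" definition of model risk lends itself to a small lookup. The sketch below is illustrative only: the three-level scale, the multiplicative score, and the tier thresholds are assumptions introduced here, since the draft guidance is described above as risk-based and non-prescriptive rather than numeric.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-point scale; not taken from the FDA draft guidance."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def model_risk(influence: Level, consequence: Level) -> Level:
    """Combine model influence and decision consequence into a risk tier.

    The thresholds below are illustrative assumptions.
    """
    score = int(influence) * int(consequence)  # possible values: 1, 2, 3, 4, 6, 9
    if score >= 6:
        return Level.HIGH
    if score >= 3:
        return Level.MEDIUM
    return Level.LOW


# Model-informed dose selection in a pivotal trial: high influence, high consequence.
print(model_risk(Level.HIGH, Level.HIGH).name)

# Model-aided lead prioritisation: moderate influence, low consequence.
print(model_risk(Level.MEDIUM, Level.LOW).name)
```

In practice a sponsor would justify each rating in prose for the specific context of use; the point of the sketch is only that the amount of validation scales with the combined tier.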

What AI drugs still need for approval

The FDA has been clear: an AI-origin drug is held to the same standard as any other drug. No special pathway, no accelerated approval for "AI-discovered" status alone. The IND package (21 CFR 312.23) is unchanged. The pivotal efficacy bar is unchanged. The CMC bar is unchanged.

The only substantive regulatory benefit AI confers today is reduction in preclinical attrition: better ADME/PK, better selectivity, cleaner tox profiles at first-in-human. That may translate into Phase 1 success-rate uplift (as seen in the Jayatunga analysis) but does not change the approval bar.

FDA approval watchlist

Programmes closest to a US approval with significant AI contribution to discovery or design:

Relay RLY-4008 (lirafugratinib): FGFR2-selective inhibitor, Breakthrough Therapy 2023, NDA 2025 expected
Recursion REC-994: CCM, Phase 2/3 transition; missed primary 2024
Takeda TAK-279 (ex-Nimbus TYK2): PsO Phase 3 ongoing; if approved 2026/27 would be first Schrödinger-designed drug to market
Insilico INS018_055 (rentosertib): IPF Phase 2b/3 transition; first dual AI-target + AI-molecule filing
Generate GB-0895: anti-TSLP Phase 1

16 · Failures & Lessons

The graveyard is as instructive as the leaderboard.

Six notable failures — covering technical flops, SPAC blowups, management collapses, and strategic mis-bets. Each carries a specific, repeatable lesson for operators, investors, and regulators.

BenevolentAI: the knowledge-graph downfall

2022 SPAC valuation: ~€1.5B. 2025: ~€150M (–90%). Laid off 180 staff (~50%) in 2023. Lead asset BEN-2293 (atopic dermatitis topical) missed Phase 2 in 2023 — a key blow after the company had publicly positioned text-mining + knowledge graphs as competitive with structure-based methods.

Lesson: Text-mining and knowledge graphs are useful for target repurposing (baricitinib/COVID) but have not yet produced a first-in-class de novo programme that cleared Phase 2. The gap between "the graph suggests X" and "the molecule works in humans" is entirely wet-lab work — which BenevolentAI under-invested in.

Exscientia DSP-1181: the first AI drug, discontinued

DSP-1181 (5-HT1A agonist for OCD, partnered with Sumitomo Dainippon) was the first "AI-designed" molecule in a human trial (Jan 2020). Phase 1 terminated 2021 — did not progress. Exscientia's EXS21546 (A2A/A2B immuno-oncology) followed into discontinuation 2023.

Lesson: AI improves the odds of reaching the clinic. It does not change the biology once there. A "designed" molecule can still have a bad target.

Valo Health: the SPAC that never closed

Valo announced a SPAC merger with Khosla Ventures Acquisition II (KVSA) at a $2.8B valuation in Dec 2021. Terminated May 2022 after market conditions deteriorated. Restructured private 2023; raised Series C of ~$175M; headcount down from ~450 to ~300. Novo Nordisk cardiometabolic deal (Mar 2024, $60M upfront, $4.6B biobucks) stabilised the company.

Lesson: SPAC market-timing cut both ways. Valo survived by cutting fast; many peers did not.

Verge Genomics VRG50635: the ALS programme that faded

VRG50635 (PIKfyve inhibitor for ALS) was a flagship for "human-data-first" neurodegeneration AI. First-in-patient Phase 1 data (Mar 2024) showed safety and target engagement but limited efficacy signal. Company discontinued Phase 2 planning; pivoted to other targets.

Lesson: Human-tissue multi-omics helps prioritise but does not guarantee. ALS in particular has produced a decade of promising-preclinical, failing-Phase-2 programmes regardless of discovery modality.

Recursion REC-994: a textbook missed primary endpoint

REC-994 (Phase 2 SYCAMORE in cerebral cavernous malformation, Q3 2024) missed primary efficacy endpoint. Company is continuing into Phase 3 on secondary-endpoint signals. For a flagship platform asset, the data were underwhelming and the stock reacted accordingly.

Lesson: Even the best-funded AI platform cannot de-risk biology. Phenomics identifies candidates; it does not validate disease mechanism.

Insitro: $743M raised, 0 INDs

Insitro has raised $743M cumulative at a peak $2.5B valuation. It has no wholly-owned clinical assets as of mid-2026 — eight years post-founding. Its Gilead NASH partnership ended in 2023; BMS ALS partnership continues but has produced no IND. Laid off ~22% Oct 2024.

Lesson: "Machine learning + functional genomics" is a scientifically beautiful thesis with a punishing capital-efficiency profile. Target validation at scale is slow, and capital efficiency is close to the opposite of what this model delivers.

17 · Future Outlook

Five predictions for 2026–2030.

The AI drug-discovery sector is roughly 14 years old. No Phase 3 success yet, no FDA approval yet, and an unresolved productivity gap with China. The next four years will deliver most of the answers investors have been waiting for.

01 · First FDA approval of an AI-designed drug by 2027/28

Most likely candidates (ranked): Takeda TAK-279 (ex-Nimbus TYK2, Schrödinger-designed); Relay RLY-4008 (FGFR2 cholangiocarcinoma); Insilico INS018_055 (TNIK IPF). If TAK-279 wins first, the narrative will emphasise physics-based design; if RLY-4008 wins first, it will emphasise MD + ML; if INS018_055 wins first, it will emphasise end-to-end generative. Industry perception will shift on whoever crosses first.

02 · Consolidation accelerates; 30 to 15 to 8

The ~30 well-funded US AI biotechs will compress to roughly 15 survivors by 2028 through M&A, attrition, and fold-ins. Tempus, Recursion, Schrödinger, Xaira, Isomorphic, Generate, Nimbus (post-next-exit), and two to four others will be the 2030 cohort.

03 · Pharma internalises the software layer

Lilly, BMS, and Novartis will all build meaningfully larger internal AI teams by 2028. The licensing model (pay-per-target) remains; the outsourcing of the full AI stack (pay-for-platform-access) narrows as pharma ML teams mature. Schrödinger and similar sellers face margin pressure.

04 · Wet-lab integration wins

Companies without owned wet-lab stacks underperform per unit capital. Recursion (BioHive + imaging), Insilico (Life Star 1), and Xaira (integrated foundation-model-plus-lab thesis) are the structural winners. Platform-only sellers (Atomwise-as-Numerion, Isomorphic) face commoditisation pressure from open-source models.

05 · FDA approves ~3–5 AI-origin NMEs by 2030

The 2024 Jayatunga Phase 1 success-rate uplift (80–90%) implies a materially larger AI-origin clinical pipeline entering Phase 3 in 2027–2029. Even with industry-standard Phase 2→3 attrition, 3–5 NDAs by 2030 is plausible. The sector's economic thesis (<$500M per NME) remains untested at approval scale.
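The funnel arithmetic behind a "3–5 approvals" estimate can be sketched as a product of stage success rates. Only the Phase 1 rate (midpoint of the 80–90% range) and the ~30 Phase 1 asset count come from this report; the Phase 2, Phase 3, and approval rates below are placeholder assumptions in the neighbourhood of commonly quoted industry averages, and the calculation ignores timing, so it says nothing about whether approvals would land by 2030.

```python
def expected_approvals(n_assets, stage_probs):
    """Expected number of approvals if every asset must clear every stage."""
    p_all = 1.0
    for p in stage_probs:
        p_all *= p
    return n_assets * p_all


n_phase1 = 30  # "~30" US AI-origin Phase 1 assets, per the clinical watchlist

stage_probs = [
    0.85,  # Phase 1 success: midpoint of the 80-90% range cited above
    0.30,  # Phase 2 success: ASSUMPTION, placeholder
    0.60,  # Phase 3 success: ASSUMPTION, placeholder
    0.90,  # Approval after filing: ASSUMPTION, placeholder
]

estimate = expected_approvals(n_phase1, stage_probs)
print(f"Expected approvals from {n_phase1} Phase 1 assets: {estimate:.1f}")
```

The result is most sensitive to the assumed Phase 2 rate, which is where the failures catalogued in the previous section occurred.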

BONUS · The China question is not resolved

If BIOSECURE fully separates US and Chinese pharma supply chains by 2028, Insilico and its Chinese peers retain their cost and speed advantage only for non-US markets. The US sector may rebuild Indian/Korean CRO redundancy at 2–3x Chinese cost, narrowing but not closing the productivity gap. The global AI drug market bifurcates.

The single most important signal to watch. Whether the first FDA-approved AI-discovered NME comes out of a Chinese-operating programme (Insilico INS018_055) or a US-operating programme (Relay RLY-4008, Takeda TAK-279). If China gets there first on a drug with a US label, it will force a strategic rethink across every US pharma's AI allocation. If the US gets there first, the productivity-gap narrative softens — but does not disappear, because approvals are a lagging indicator and PCC output is the leading one.

18 · Clinical Watchlist

Every US AI-origin drug currently in human trials.

Fifty-plus AI-derived molecules are in active US clinical development as of May 2026. The table below lists the most-watched programmes by sponsor, phase, indication, and near-term readout.

Sponsor	Asset	Target / MoA	Indication	Phase	Next readout	NCT
Relay Therapeutics	RLY-4008 (lirafugratinib)	FGFR2-selective	Cholangiocarcinoma (FGFR2 fusion)	Ph 2 pivotal	NDA filing 2025	NCT04526106
Relay Therapeutics	RLY-2608	Mutant-selective PI3Kα	HR+/HER2– breast cancer	Ph 1b/2	Ph 3 plan 2025	NCT05216432
Relay Therapeutics	RLY-5836	PI3Kα CNS-penetrant	Advanced solid tumours	Ph 1	2025 safety	NCT05759949
Recursion	REC-994	Undisclosed / superoxide	Cerebral cavernous malformation	Ph 2/3	Ph 3 design 2025	NCT05085535
Recursion	REC-2282	Pan-HDAC	NF2 meningiomas	Ph 2/3 POPLAR	Interim 2025	NCT05130866
Recursion	REC-4881	MEK1/2	FAP	Ph 2 TUPELO	Readout 2026	NCT05552741
Recursion	REC-3964	Toxin B inhibitor	C. difficile	Ph 2	2025	NCT05963321
Recursion	REC-1245	RBM39 degrader	Advanced solid tumours	Ph 1	2026	—
Recursion (ex-Exscientia)	GTAEXS617	CDK7 selective	Solid tumours (HR+ BC, OvCa)	Ph 1/2	2025 dose-esc	NCT05985655
Recursion (ex-Exscientia)	EXS74539	LSD1	SCLC / AML	Ph 1	2025	NCT06266545
Recursion (ex-Exscientia)	EXS73565	MALT1	B-cell malignancies	Ph 1	2025	NCT06136559
Schrödinger	SGR-1505	MALT1	B-cell lymphomas	Ph 1	2026 dose-esc	NCT05544019
Schrödinger	SGR-2921	CDC7	AML / MDS	Ph 1	2025	NCT06077162
Schrödinger	SGR-3515	Wee1 / Myt1	Advanced solid tumours	Ph 1	2026	NCT06207526
Absci	ABS-101	TL1A antibody (AI-designed)	IBD	Ph 1	2025 SAD/MAD	NCT06449585
Takeda (ex-Nimbus)	TAK-279 (zasocitinib)	TYK2 allosteric	Plaque psoriasis, PsA, IBD	Ph 3	Ph 3 readout 2025/26	NCT06088043
AbCellera	ABCL575	OX40L antibody	Atopic dermatitis	Ph 1	2025	—
Insilico Medicine	INS018_055 (rentosertib)	TNIK inhibitor (AI target + AI mol)	IPF	Ph 2a/b	Ph 2b ongoing	NCT05154240
Insilico Medicine	ISM3412	MAT2A	MTAP-deleted cancers	Ph 1	2025	NCT06187857
Insilico Medicine	ISM5939	ENPP1	Solid tumours	Ph 1 (US)	2025	NCT06183294
Generate Biomedicines	GB-0895	Anti-TSLP antibody	Severe asthma	Ph 1	2025	—
Generate Biomedicines	GB-0669	Anti-RSV antibody	RSV prophylaxis	Ph 1	2025	—
Iambic Therapeutics	IAM1363	HER2-mutant selective	HER2-mutant solid tumours	Ph 1/2	2025 dose-esc	NCT06253871
BenevolentAI	BEN-8744	PDE10	Ulcerative colitis	Ph 1	2025	—
BenevolentAI	BEN-34712	RARb agonist	ALS	Ph 1 IND	2025/26	—
Nimbus Therapeutics	NDI-219216	HPK1	Advanced solid tumours	Ph 1	2025	—
Valo Health	OPL-0301	Undisclosed cardiovascular	Post-MI CV	Ph 2	2026	—
Valo Health	OPL-0401	ROCK1/2	Diabetic retinopathy	Ph 2	2026	—
Verge Genomics	VRG50635	PIKfyve	ALS (winding down)	Ph 1 halted	n/a	NCT04768972

~30

US AI-origin Phase 1 assets

Initiated 2020–2025; majority oncology + I&I.

~12

In Phase 2

Includes Relay RLY-2608 / -4008, Recursion REC-994, Insilico INS018_055.

In or approaching Phase 3

Takeda TAK-279 (ex-Nimbus TYK2) is the most advanced AI-associated programme.

Source: ClinicalTrials.gov lookups, company press releases, 10-K/10-Q filings, May 2026. "AI origin" defined broadly to include molecules where AI played a material role in target selection, hit ID, lead optimisation, or any combination of these. TAK-279 included because Nimbus/Schrödinger ML drove the discovery programme although the molecule is now wholly owned by Takeda.

19 · Foundational Papers & Open Source

The literature that built the sector.

Ten papers and eight open-source releases account for most of the technical lineage of modern AI drug discovery. All are freely accessible; most have citation counts in the thousands or tens of thousands.

Foundational papers

Wallach et al., 2015. AtomNet: A Deep CNN for Bioactivity Prediction in Structure-based Drug Discovery. arXiv:1510.02855.
Segler, Kogej, Tyrchan & Waller, 2018. Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks. ACS Central Science.
Zhavoronkov et al., 2019. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nature Biotechnology. First high-profile demonstration of generative molecular design.
Jumper et al., 2021. Highly accurate protein structure prediction with AlphaFold. Nature. ~43,000 citations.
Baek et al., 2021. Accurate prediction of protein structures and interactions using a three-track neural network (RoseTTAFold). Science.
Lin et al., 2023. Evolutionary-scale prediction of atomic-level protein structure (ESMFold). Science.
Watson et al., 2023. De novo design of protein structure and function with RFdiffusion. Nature.
Ingraham et al., 2023. Illuminating protein space with a programmable generative model (Chroma, Generate Biomedicines). Nature.
Abramson et al., 2024. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature.
Ren et al., 2024. A small-molecule TNIK inhibitor targets fibrosis in preclinical and clinical models (INS018_055 / rentosertib). Nature Biotechnology.

Open-source releases that shaped practice

AlphaFold 2 (DeepMind, 2021) — inference code under Apache 2.0; model parameters under a Creative Commons licence (CC BY 4.0).
AlphaFold Protein Structure Database (EMBL-EBI, 2022) — 200M+ predicted structures.
RoseTTAFold / RFdiffusion (Baker Lab) — de novo protein design via diffusion.
ESM-2 / ESMFold (Meta AI / FAIR, 2022–2023) — protein language models.
Chemprop (MIT Barzilay Lab) — directed message-passing NN for molecular property prediction.
NVIDIA BioNeMo (2023–) — pharma-grade foundation-model stack in the NVIDIA Clara suite.
DiffDock (MIT, 2023) — diffusion-generative docking.
OpenCRISPR / ProGen (Profluent / Salesforce Research) — generative protein language models and the CRISPR editors designed with them.

A pattern worth naming. Almost every foundational open-source release in the sector came from academia or a research lab. Of the releases listed above, only OpenCRISPR (Profluent) came from a US AI biotech. The commercial layer has consumed open source aggressively (RFdiffusion inside Xaira and Generate, AlphaFold inside every pharma computational chemistry team) and contributed back sparingly. That is consistent with drug-discovery economics (IP concentrated on molecules, not methods), but it also explains why the sector's technical narrative is increasingly written in DeepMind's London office and Baker's Seattle lab rather than in Boston or Salt Lake City.

20 · Compute & Data Infrastructure

The GPU arms race and the data moat.

Modern AI drug discovery runs on two inputs: compute and proprietary biological data. NVIDIA has become the default hardware vendor, with equity stakes in at least six US AI biotechs. Proprietary datasets — phenomics images, antibody sequences, DEL binding curves, clinical genomics — are the only durable moats in a world where models are increasingly commoditised.

504

H100 GPUs in Recursion BioHive-2

63 NVIDIA DGX H100 systems, Salt Lake City. ~2.2 exaflops AI performance. Commissioned 2024.

65PB

Recursion phenomics dataset

Cellular images + biological/chemical multi-omics. Largest single-company phenomics corpus globally.

8M

Tempus AI oncology records

De-identified clinical records + 1.5M clinical-grade genomic profiles. Largest US RWD oncology asset.

500k

Regeneron GC exomes

Largest private exome-sequencing operation. Feeds AI target ID across the Regeneron pipeline.

NVIDIA's pharma equity portfolio

NVIDIA has taken equity stakes in, or formed strategic partnerships with, Recursion ($50M, Jul 2023), Schrödinger, Iambic, Terray, Evozyne, Generate Biomedicines, and Genesis Therapeutics. Its BioNeMo foundation-model service (GA 2023) is the most-used non-internal model stack in the sector. The bet is obvious: drug discovery is the largest serious enterprise workload for NVIDIA GPUs outside hyperscale AI, and model training at scale requires NVIDIA silicon.

The corollary: NVIDIA has structural pricing power over any AI biotech running 10,000+ GPU training jobs. In Q4 2024 NVIDIA DGX cluster lead times for biotech customers stretched to nine months.

Proprietary data moats by company

Recursion — 65PB phenomics + CellProfiler-derived embeddings. Hardest to replicate.
AbCellera — millions of antibody sequences per Celium campaign.
Absci — billion-interactions/day SoluPro E. coli screens.
Terray — tNova chip generates the largest structured biochemistry dataset for ML training anywhere outside pharma.
Tempus AI — 8M patient records, 1.5M genomic profiles, 200+ pharma data contracts.
Schrödinger — 35 years of physics-based FEP data; no ML substitute exists.
Isomorphic Labs — exclusive access to AlphaFold 3 inference at scale, Alphabet compute.

Models commoditise; data does not. The cost of training a state-of-the-art protein language or structure-prediction model has fallen ~100x since 2021 (ESM-1 vs ESM-2; AlphaFold 2 vs 3). The cost of generating 65PB of phenomics data, or 500k exomes, or 8M oncology records has not. The US AI biotechs with genuine long-term moats are the ones running proprietary wet-lab or clinical data loops at scale. Pure-model companies face the same margin compression as every other ML-as-a-service vendor confronting open-weights competition.

AI Drug Discovery in America · 2012 – 2026
