For years now, I've documented AI in drug discovery startups, pharma's use of AI in drug discovery, and drugs in the AI in drug discovery pipeline. When I started, you could read each of those posts in a few minutes, so it was easy to stay up to date. Now each one is epic in length, and it's harder to find the signal in the noise. To help, I created this post. My goal here is to highlight key trends and statistics related to AI in drug discovery. As with my other posts, it will help if this one is interactive. So if you have ideas for trends and statistics you would like to see analyzed, please post them in the comments. Now on to the data!
Research applying AI to drug discovery is accelerating
It certainly feels like research applying AI to drug discovery is increasing. But is it really? I quantified this by counting publications in PubMed that mention AI in drug discovery or development (using this query). I compared that count to papers on drug discovery and development in general (using this query). The finding? Not just an increase, but growth that outpaces drug discovery papers in general, so AI's share of the literature is rising. That said, the share is still quite low overall. So despite all the hype, there's lots of room to grow.
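For readers who want to run the same comparison on their own query results, here is a minimal sketch of the arithmetic: divide each year's AI-related paper count by that year's count of all drug discovery papers. The function name `ai_share_by_year` and the counts are hypothetical placeholders for illustration only. They are not the PubMed numbers behind the chart.

```python
def ai_share_by_year(ai_counts, all_counts):
    """Return {year: fraction of drug discovery papers that mention AI}."""
    shares = {}
    for year in sorted(ai_counts):
        total = all_counts.get(year, 0)
        if total > 0:  # skip years with no baseline papers
            shares[year] = ai_counts[year] / total
    return shares


# Placeholder counts for illustration only (NOT real PubMed data).
ai_counts = {2015: 10, 2016: 20, 2017: 45}
all_counts = {2015: 1000, 2016: 1000, 2017: 1000}

shares = ai_share_by_year(ai_counts, all_counts)
for year, share in shares.items():
    print(f"{year}: {share:.1%} of drug discovery papers mention AI")
```

A share that climbs from year to year means AI papers are growing faster than the field as a whole, even while the absolute share stays small.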
Startup formation may be slowing
The data I've gathered suggest that AI drug discovery startup formation may have peaked. It's possible this will change as I become aware of newer startups not yet in my database. But my qualitative sense of the market aligns with the quantitative data. We may have hit peak startup.
Most startups focus on new or repurposed entities
I use somewhat arbitrary categories in my startups post. But they're descriptive. And as this chart shows, the majority of startups focus on new and repurposed molecular entities. This includes generating novel drug candidates, repurposing existing drugs, designing drugs, and validating and optimizing drug candidates. Together, these account for more than half of startup activity. This also means there is far less competition in other areas of the drug discovery process. (PS: Sorry some of the labels got cut off. It's an issue with Google Sheets, from which I generate the charts. I'll see if I can fix it in the future.)
Oncology, neurology, and infectious disease are the therapeutic areas with the most candidates
Using data I've gathered on drugs in the AI in drug discovery pipeline, I wanted to look at where companies are focusing their efforts. Similar to biopharma overall, oncology is well in the lead, followed by neurology and infectious disease.
Investment continues to increase, but the pace is slowing
Funding has increased every year since 2016. And there were huge increases year-over-year in 2017 and 2018. But the pace of growth appears to be slowing as the market matures.
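"Increasing, but at a slowing pace" can be checked with a simple year-over-year growth calculation. Here is a minimal sketch. The function name `yoy_growth` and the funding totals are hypothetical placeholders, not figures from my dataset or Crunchbase.

```python
def yoy_growth(totals):
    """Return {year: fractional growth versus the previous year in the data}."""
    years = sorted(totals)
    growth = {}
    for prev, curr in zip(years, years[1:]):
        if totals[prev] > 0:  # avoid dividing by zero
            growth[curr] = (totals[curr] - totals[prev]) / totals[prev]
    return growth


# Placeholder annual funding totals for illustration only (NOT real data).
funding = {2016: 100, 2017: 200, 2018: 300, 2019: 360}

growth = yoy_growth(funding)
for year, rate in growth.items():
    print(f"{year}: {rate:+.0%} vs. prior year")
```

In this made-up series, totals rise every year while the growth rate falls, which is the pattern described above.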
Venture investment is shifting to later stages
Funding data looks a bit different when you focus on venture capital only. Rather than slowing, venture investment is maturing, with later stage funding increasing.
A few companies get most of the funding
You would expect the 80/20 rule to apply to funding of AI in drug discovery startups. Indeed, about 20% of startups have received 80% of funds. Here are the top 15:
US startups dominate the industry
A few countries account for the majority of AI in drug discovery startups:
Among these countries, the US is by far the leader. And only a few countries have more than one startup on the list:
Several cities, including outside the US, have become hotbeds of activity
While the US dominates as a country, several dispersed cities have emerged as hotbeds. They have commonalities. They tend to be areas of high startup activity in general, like San Francisco. Or they tend to have strength in both machine learning and biology, like Toronto.
Methodology and next steps
I compiled the data above using my own datasets, such as those on startups using AI in drug discovery, pharma companies using AI in drug discovery, and drugs in the AI in drug discovery pipeline. I also used data from third-party sources such as Crunchbase. I'm generating the charts using Google Sheets. This is all something of a beta, so if you see something buggy, please let me know.
I intend to expand this analysis over time. If you have ideas for trends and statistics you would like to see here, please post them in the comments.