This July marks five years since BenchSci launched its first product. As we hire more talent to join us in hypergrowth (see our open roles), I thought now would be a good time to tell the BenchSci origin story.

At BenchSci, we're developing the world’s most advanced biomedical AI software, built on a platform that can read scientific research papers like a Ph.D. scientist. This is a complex challenge that has us working at the intersection of biomedical research data, artificial intelligence, and the very real problems faced by our users: scientists working to advance biomedical research and bring life-saving medicines to patients.

Despite technological advancements in most sectors, the preclinical space has mostly been ignored. BenchSci is changing that, but chances are that BenchSci would not exist today if not for a perfect storm of events that came together in 2015. I talked briefly about why BenchSci was started a few years ago (pre-launch!), but I’d like to break down some of the events that brought us here today. The three key players in this story are significant policy changes, substantial advancements in machine learning technology, and one tenacious scientist who was determined to find a better way.

NIH Public Access Policy

In 2008, the US government enacted the NIH Public Access Policy, requiring that all papers describing research funded by the National Institutes of Health be made publicly available within 12 months of their initial publication in a free, digital repository called PubMed Central. By 2015, this repository had reached a critical mass of 1.2M papers describing biomedical and public health research, and it has continued to grow by more than 1M publications per year. This policy effectively unlocked a massive dataset of peer-reviewed biomedical research, making it freely available to researchers around the world.

The invention of modern AI

In 2012, Professor Geoffrey E. Hinton, experimental psychologist and computer scientist at the University of Toronto, demonstrated that artificial neural networks could be used as the basis for deep learning, a technology that could drastically increase machines’ ability to recognize patterns. Although computer scientists, including Hinton, had been working on neural networks for decades, deep learning enabled a jump in accuracy that had significant implications for software with a wide range of applications, including voice recognition and computer vision.

A pain point for research scientists everywhere

Experiments are the lifeblood of scientific research, but more than 80% of experiments fail. Scientists advance scientific discovery based on what has been learned in the past, as documented in research papers. But extracting up-to-date and reliable information from these papers, and connecting the dots between the different biological entities described within them, is a huge challenge that often leads to unnecessary experimental error and missed research opportunities. While completing his Ph.D. in epigenetics at the University of Toronto (U of T), BenchSci co-founder (now Chief Science Officer) Tom Leung was frustrated by this problem and recognized that new AI techniques could be applied to the freely available dataset in PubMed Central to potentially unlock a solution.

The perfect storm

Dr. Tom Leung, or as we all call him, Tom, had an ambitious idea for reducing the number of experiments that fail but knew he couldn’t do it alone. He had a science background but needed AI, data, and business knowledge as well. So he started recruiting his team.

Tom searched on LinkedIn for “Python” and “biology,” and the first result was David Chen, a computer programmer who was pursuing his Ph.D. in neuroimaging, also at U of T. Tom messaged David and asked him to grab a beer at a local college pub (even though Tom does not drink alcohol). David agreed, and by the end of their meeting, he was committed.

But they still needed to round out the team, so David suggested they speak to Elvis Wianda, another U of T Ph.D. student with an interest in big data. The two of them pitched the concept to Elvis, who sat there quietly, saying nothing. Neither Tom nor David knew quite what to make of Elvis’ response, but to their surprise, the very next day he sent them the first lines of prototype code.

With the technical team in place, they started to build a solution, but none of them had ever built a business before, and they knew they would need someone with experience in entrepreneurship. One day, seemingly out of the blue, the founding team received a message from me, an entrepreneur and MBA candidate. At the time, I was working for the Creative Destruction Lab (CDL), a startup accelerator focused on machine learning out of the University of Toronto, and I wanted to recruit the BenchSci team to apply.

The team was successful in joining CDL, and I started working with them to support their commercialization. After a few weeks of working together, I joined as co-founder and CEO. And just like that, we were off to the races.

Flash forward to today

We now have over 100 proprietary machine learning models and empower 49,000 scientists globally to optimize their experiment designs and their research productivity. Our platform includes comprehensive data from 12.4 million scientific publications (including open- and closed-access papers plus preprint data) and over 400 vendor catalogs. BenchSci went from a team of five to a team of over 250 and counting, and as of January, we have raised over $100M through our Series C funding led by Inovia Capital and TCV. We’re using that money to fund our hypergrowth and grow our team to over 400 members who will help us further transform drug research and development with AI-powered software.

Tom may have been the one who started all this, but we know from our thousands of users that he is far from the only scientist who has struggled at the bench due to insufficient tools. As we near our fifth anniversary, BenchSci is more committed than ever to building great tools that help scientists do their work. 

Written By:
Liran Belenzon (he/him)