Hi, I’m Patrick, and I’m part of the Data Engineering team at BenchSci. Our team works on the back end of BenchSci’s platform, building and maintaining our data pipelines. As we continually expand BenchSci’s database, scientific publications first come through us. We build the machine learning models that analyze those publications, extract biological information, and place it in context within our existing biomedical database, the most comprehensive of its kind. That information is then structured and visualized on our platform, where it helps biomedical scientists quickly glean valuable insights related to their research.
The biggest reason I joined BenchSci, which still ignites passion in me every day, is the meaningful impact we’re able to make on people’s health worldwide. As the world enters the third year of the COVID pandemic, our ability as a collective to expand our biomedical knowledge is more important than ever. At BenchSci, we empower biomedical scientists to increase the speed and quality of their research so the new treatments they develop can save more patients’ lives.
One of the challenges our team works to address for scientists is ambiguity in scientific publications. Because scientists in different parts of the world often work concurrently on similar research, it’s common for the same gene to be given multiple names. Not only that, some of these names are pretty unintuitive, like “Sonic Hedgehog” or “Van Gogh.” In fact, up until a few years ago, some gene names in regular use looked like calendar dates (“SEPT2,” for example). These were finally renamed because software like Microsoft Excel would often convert them to dates automatically, potentially causing major problems.
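To make the naming problem concrete, here is a minimal Python sketch of two checks this kind of ambiguity calls for: mapping different names for the same gene to one canonical symbol, and flagging symbols that a spreadsheet might read as a date. It is only an illustration, not how our pipeline works. The alias table, the function names `normalize_gene_name` and `looks_like_date`, and the date pattern are all assumptions made for this example.

```python
import re

# Illustrative alias table (an assumption for this sketch, not real pipeline data).
# Keys are lowercased names as they might appear in a paper; values are the
# canonical symbol they should resolve to.
ALIASES = {
    "sonic hedgehog": "SHH",
    "shh": "SHH",
    "sept2": "SEPTIN2",   # old date-like symbol, since renamed
    "septin2": "SEPTIN2",
    "march1": "MARCHF1",  # old date-like symbol, since renamed
    "marchf1": "MARCHF1",
}

# Rough heuristic (an assumption, not Excel's actual parsing rules):
# a month name or abbreviation followed by one or two digits.
DATE_LIKE = re.compile(
    r"^(JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC)-?\d{1,2}$",
    re.IGNORECASE,
)


def normalize_gene_name(name):
    """Return the canonical symbol for a gene name, or None if it is unknown."""
    key = " ".join(name.split()).lower()  # collapse whitespace, ignore case
    return ALIASES.get(key)


def looks_like_date(symbol):
    """Return True if a symbol resembles a calendar date such as SEPT2."""
    return bool(DATE_LIKE.match(symbol.strip()))
```

In this sketch, `normalize_gene_name("Sonic Hedgehog")` and `normalize_gene_name("shh")` resolve to the same symbol, while `looks_like_date("SEPT2")` flags the old name and `looks_like_date("SEPTIN2")` does not. A real system has far more to handle than a lookup table can cover, which is where context from the rest of the publication comes in.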
Efforts are underway to improve how scientists document their experiments, but eliminating all ambiguity is likely impossible. There are always going to be things that stump our process, in which case we may need to find clues in other parts of a publication or build a new machine learning system to understand the full experimental context. It definitely keeps us on our toes, but it’s really satisfying to find the solution and see everything fall into place.
It may surprise you that, though we operate within the life science industry, not everyone on the Data Engineering team comes to BenchSci with preexisting biomedical knowledge. I actually do come from a biomedical background, but anyone skilled enough with Python and SQL can be hired here and be successful. Working with data that intrinsically contains biological information tends to rub off on people: after a few months here, team members generally pick up quite a bit of scientific understanding. Lifelong learning is such a valuable skill to exercise and develop in our ever-changing modern world, and there are so many incredibly smart people at BenchSci to learn from, not only about machine learning and bioscience, but about business, leadership, and compassion as well. Everyone here has unique knowledge and experiences to share.
We’re a company in hypergrowth, and Engineering is both our largest and fastest-growing department. We have plenty of opportunities for talented and passionate engineers looking for a fulfilling role where they can change the world for the better. We’re also a remote-first company, which opens opportunities to folks like me who aren’t near our headquarters in Toronto (I’m based in Vancouver, BC). You can learn more about what working at BenchSci is like on our Careers site, or check out our open engineering roles!