A New Era of Drug Discovery Powered by AI and Global Collaboration

Benjamin Haibe-Kains, Executive AI Scientific Director at the University Health Network (UHN) in Toronto, Canada

Artificial intelligence, large-scale computing, and high-quality data are coming together to accelerate biomedical discovery, turning breakthroughs that once took decades into achievements that can now happen in a fraction of the time.

In this interview, Benjamin Haibe-Kains discusses how initiatives such as the AirCheck Project will benefit from My Research Cloud (MRCA), a service developed by Computing for Humanity (CFH), and how this helps build a more open, collaborative, and globally accessible research ecosystem where innovation is not constrained by geography or institutional resources. Haibe-Kains is Executive AI Scientific Director at the University Health Network (UHN), Scientific Director of the AI Hub and Cancer Digital Intelligence program, and Director of Data Science at the Structural Genomics Consortium (SGC). He is also a Senior Scientist at the Princess Margaret Cancer Centre and Professor at the University of Toronto.

Building the foundation for AI in drug discovery

At the core of modern drug discovery is a simple but powerful idea: if we can understand how molecules interact with proteins, we can design better therapies for human disease. To achieve this, researchers rely on vast amounts of experimental data combined with advanced AI models capable of identifying patterns far beyond human scale. But as Benjamin explains, the real bottleneck is not only algorithms—it is data quality and accessibility.

That is where the Structural Genomics Consortium (SGC) and its initiatives come in. Through programs like AirCheck, researchers are generating and curating high-quality chemical and biological datasets designed specifically to be “AI-ready.” This means standardized formats, harmonized metadata, and reproducible pipelines that allow machine learning models to learn reliably and at scale.

AirCheck: turning experiments into AI-ready knowledge

AirCheck is designed to bridge experimental science and artificial intelligence.

It integrates large-scale screening technologies such as:

  • DNA-encoded chemical libraries capable of testing billions of compounds

  • Mass spectrometry methods for smaller, more robust screens

  • Systematic hit characterization pipelines used in pharmaceutical research

These technologies generate massive datasets—but more importantly, AirCheck ensures that the data is structured, annotated, and shared in a way that AI systems can effectively use. As Benjamin highlights, the goal is not just to collect data, but to make it reusable, transparent, and globally accessible, enabling researchers anywhere in the world to build and validate AI models.

Benjamin Haibe-Kains' laboratory at the University Health Network (UHN), where approximately 25–30 researchers work at the intersection of artificial intelligence, computational biology, and drug discovery.

My Research Cloud and Computing for Humanity: democratizing access

A major challenge in AI-driven science is computational power. Training and validating models at scale requires significant infrastructure—often beyond the reach of individual labs or institutions.

To address this, CFH will provide AirCheck team members with access to My Research Cloud, a platform designed to make computing resources more accessible and easier to use.

This initiative is grounded in a simple principle: scientific innovation should not depend on where you are or how much funding your institution has. To support this goal, My Research Cloud aims to complement Canada’s academic and research computing ecosystem by lowering barriers to entry and enabling broader participation. In particular, it helps ensure that researchers across high-, middle-, and low-income contexts can participate on equal footing, allows students and early-career scientists to work with real-world datasets, and empowers citizen scientists to contribute new ideas and approaches.

The critical role of donors and partners

None of this would be possible without the support of donors, partners, and collaborators who help build and sustain this infrastructure.

As Benjamin emphasizes, supporters of CFH contribute in essential ways:

  • Donating high-performance computing equipment that powers large-scale AI research

  • Providing financial support that sustains data infrastructure and global access platforms

  • Contributing services and expertise that allow research systems to scale

These contributions are not secondary; they are foundational to scientific progress.

Every donated server, every funded system, and every shared resource becomes part of an infrastructure that enables researchers to process complex biological data and develop AI models that can accelerate drug discovery.

Importantly, through platforms like My Research Cloud, these resources are not confined to a single institution. Instead, free access for Canada’s research community multiplies their impact across thousands of researchers and projects.

By making advanced computing resources accessible to researchers worldwide, donors and partners are helping build the foundation for the next generation of scientific breakthroughs. This shared infrastructure supports ambitious global initiatives such as Target 2035, which aims to accelerate the discovery of new medicines through AI and large-scale biological data.

Target 2035: Scaling discovery for the next era of medicine

Target 2035 is a major international initiative from the Structural Genomics Consortium, aimed at developing pharmacological modulators for all human proteins by 2035. The initiative seeks to enhance our understanding of human biology and disease and, as a result, advance drug discovery. A key component of the project involves generating high-quality data for up to 2,000 proteins, creating a foundation for AI models that can learn general principles of how molecules interact with biological targets. However, this goal is only achievable if computational and AI/ML methods improve significantly. As Benjamin Haibe-Kains explains, the long-term goal is to enable predictive systems that can identify promising compounds without relying on costly, repeated laboratory experiments. Supported by global industry partners and public funders, the initiative also depends on open, equitable access to computing and data infrastructure to ensure researchers worldwide can contribute to and benefit from these advances.

Why open science matters

Across all these efforts, one principle remains central: open science.

By sharing data, tools, and results openly, the research community can collectively accelerate progress in areas like:

  • Cancer research

  • Neurological diseases

  • Infectious diseases

  • Precision medicine

Benjamin emphasizes that scientific progress depends on strong foundations—especially data management, annotation, and sharing practices that are often undervalued but essential.

Without these foundations, even the most advanced AI systems cannot reach their full potential.

A global scientific ecosystem

From computational infrastructure to experimental biology, the future of research is increasingly interconnected.

Initiatives like AirCheck, My Research Cloud, and Target 2035 reflect a shared vision:

A world where scientific discovery is faster, more open, and accessible to researchers everywhere.

Through collaboration between institutions like UHN, SGC, Computing for Humanity, and global partners, this vision is becoming a reality.

NOTE: This interview has been edited for length and clarity.
