What is COVID-19?

COVID-19 has caused one of the most rapidly spreading pandemics we have witnessed, with the greatest loss of human life since the 1918 influenza pandemic. Existing antiviral drugs have shown limited activity against SARS-CoV-2 (the virus responsible for COVID-19), and they are often accompanied by adverse events. It is therefore imperative that we discover drugs that are both efficacious and safe. The current situation presents a unique opportunity for scientists to make a real impact on public health.


Much effort has been invested in engineering a vaccine or discovering new compounds against this novel virus. However, vaccines and new drugs are slow to develop, and their approval takes on the order of years. To make an immediate impact on the pandemic, it is most realistic to repurpose existing drugs, whether as individual treatments or as components of a drug combination.

SARS-CoV-2 is a betacoronavirus, a genus that also includes SARS-CoV and MERS-CoV. While coronaviruses are a broad class of viruses that infect humans and many animals, they are rarely studied outside of public health emergencies. As a result, our knowledge of coronaviruses is limited primarily to SARS-CoV and MERS-CoV. Currently, there are three main drug targets in the SARS-CoV-2 virus: the RNA polymerase, the 3CL protease (also called the main protease, Mpro), and the papain-like (PL) protease. These targets are, to some extent, similar to their counterparts in SARS-CoV.


From Antibacterial to Antiviral

We recently used a machine learning approach (Chemprop) to discover new antibacterial molecules from both the Broad Institute's Drug Repurposing Hub and the ZINC15 database. Briefly, after training models on only ~2,500 compounds to predict structurally unique antibacterial chemicals, we discovered halicin in the Drug Repurposing Hub. Halicin is a new antibacterial molecule with in vivo efficacy that rapidly kills a wide range of pathogenic bacteria by dissipating the pH gradient across the bacterial cytoplasmic membrane. We also discovered eight more new antibacterial molecules in the ZINC15 database, two of which display broad-spectrum activity. While the empirical data used to train antibiotic prediction models differ from the data needed to train an antiviral model, the underlying machine learning models are similar. We are therefore applying what we learned from the discovery of halicin to rapidly identify potential new treatments for SARS-CoV-2, as well as other emerging viral and bacterial pathogens.


ML Formulation

Machine learning methods can learn to relate compounds to their properties, so long as the molecule-property relationship is illustrated with examples.


For instance, data from molecular screens gives us a set of compounds that are active (e.g., inhibit a specific protein target) and others that are deemed inactive against this target.


From this dataset we can learn a classifier that predicts, for any new compound, whether the molecule has activity against the target. The trained classifier can then be applied across libraries of known compounds in search of candidate drugs against the protein target.
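
The formulation above can be sketched in a few lines of Python. This is a minimal illustration and not the Chemprop model mentioned earlier: it swaps in a plain logistic regression over character bigrams of SMILES strings, using only the standard library. The names `featurize`, `train`, `rank_library`, and `TOY_SCREEN` are hypothetical, and the activity labels are invented for illustration (aromatic rings are arbitrarily marked "active"); they are not real screening results.

```python
import math
from collections import Counter

# Hypothetical screening data: (SMILES string, label), 1 = active, 0 = inactive.
# The labels are invented for illustration only; they are NOT measurements.
TOY_SCREEN = [
    ("c1ccccc1O", 1), ("c1ccccc1N", 1), ("c1ccccc1C", 1),
    ("CCO", 0), ("CCN", 0), ("CCCC", 0),
]


def featurize(smiles, n=2):
    """Count character n-grams of a SMILES string.

    A crude stand-in for a learned molecular representation.
    """
    return Counter(smiles[i:i + n] for i in range(len(smiles) - n + 1))


def predict_proba(weights, bias, smiles):
    """Predicted probability that a compound is active."""
    feats = featurize(smiles)
    z = bias + sum(weights.get(k, 0.0) * v for k, v in feats.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, lr=0.1):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for smiles, label in examples:
            error = predict_proba(weights, bias, smiles) - label
            for k, v in featurize(smiles).items():
                weights[k] = weights.get(k, 0.0) - lr * error * v
            bias -= lr * error
    return weights, bias


def rank_library(weights, bias, library):
    """Score every compound in a library and sort by predicted activity."""
    scored = [(s, predict_proba(weights, bias, s)) for s in library]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    weights, bias = train(TOY_SCREEN)
    # A hypothetical "library" of compounds the model has not seen.
    for smiles, p in rank_library(weights, bias, ["CCCO", "c1ccccc1Cl"]):
        print(f"{smiles}: predicted P(active) = {p:.2f}")
```

The structure mirrors the text: labeled examples from a screen go in, a classifier comes out, and the classifier is then used to rank a library of compounds it was not trained on. In practice the featurization and model would be replaced by something far more expressive, and the training set would come from real assay data.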

Drug Discovery

Traditionally, drug discovery is a long and arduous process that costs over $2 billion and takes 12-15 years on average. Pharmaceutical companies bring unlicensed compounds to market through the following steps:


1. High-throughput screening yields a set of compounds known as drug candidates or hits, which potentially inhibit the target (e.g., the 3CL protease of the SARS-CoV-2 virus).


2. Drug candidates are further filtered into leads and refined through lead optimization, which may include in vivo testing against the virus. These compounds are optimized for maximal efficacy and minimal side effects.


3. After process chemists determine whether the drug candidates can be manufactured at scale and how to do so, preclinical studies are conducted in animals to demonstrate that a drug is non-toxic.


4. Several stages of clinical trials are conducted in humans, starting with a small, healthy cohort and progressing to larger studies with actual patients. If a drug demonstrates both safety and efficacy, it may be approved.

Given the challenges of developing a new drug, it makes sense to search for a SARS-CoV-2 antiviral within the pool of existing drugs, which have already been tested for safety. Multiple drugs with modest individual effects against SARS-CoV-2 can be combined into a more potent mixture, known as a drug combination or drug cocktail.