Could someone direct me to resources for understanding the link between ML and genomics in the context of COVID-19? My background is in ML only, so I do not fully understand what kind of mathematical problem the genomics community tries to solve when developing a cure or a vaccine from the virus' genome. The group I am working with at MIT has developed sparse methods for regression and classification in the regime where the number of samples n is much lower than the number of features p. These methods use a limited number of variables, which improves interpretability, robustness and accuracy vs. existing methods. I understand that the genomics community uses methods such as LASSO and sparse partial least squares. I would be very grateful to anyone who can help with the following questions:
- What is the objective and meaning of the mathematical problems related to this issue? (I understand that genetic similarities with other viruses would also mean that similar vaccines/treatments could help.)
- Which databases of existing viruses are used? Are they available?
- Is there a genome database for COVID-19 that could be used?
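
For context, here is a minimal sketch of the n much lower than p setting I have in mind, using plain LASSO solved by cyclic coordinate descent. It is not our group's method, only the standard baseline mentioned above. The data are synthetic, and the sizes, the three "true" features, the penalty level and the names `soft_threshold` and `lasso_cd` are all illustrative assumptions.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of t * |.|."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (beta_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Synthetic example (assumed sizes): 20 samples, 100 features, 3 relevant ones.
random.seed(0)
n, p = 20, 100
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
true_beta = [0.0] * p
true_beta[3], true_beta[17], true_beta[42] = 2.0, -3.0, 1.5
y = [sum(X[i][j] * true_beta[j] for j in range(p)) + random.gauss(0, 0.1)
     for i in range(n)]

# Smallest penalty for which the all-zero vector is the LASSO solution.
lam_max = max(abs(sum(X[i][j] * y[i] for i in range(n))) / n for j in range(p))

beta = lasso_cd(X, y, lam=0.3 * lam_max)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(len(selected), "of", p, "features selected:", selected)
```

In a genomics setting I would expect the rows of `X` to be samples (e.g. sequenced genomes) and the columns to be genomic features, but that mapping is exactly what I am unsure about and would like pointers on.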