The pace of discovery has always depended on the tools that guide the seekers.
Venturing across unknown seas, early explorers navigated by lodestars at night and lodestone compasses by day. Touchstones sorted true nuggets of precious metals from false, and the inscriptions on the Rosetta Stone helped scholars unlock the mysteries of Egyptian hieroglyphics.
Today we are engaged in one of the most urgent quests for new discoveries in generations. Covid-19 has killed more Americans than died in World War II, along with millions more around the globe. The race between vaccines and variants is getting tighter. So many lives and futures depend on finding better ways to defeat a virus that is plaguing the world.
To aid in this search, a UAB bioinformatics team has created a functional genomic SARS-CoV-2 database that gives researchers free, searchable access to almost 12,000 pieces of data on genetic factors that are powering the pandemic. It includes annotated gene lists, gene signatures, and details about the pathways the virus uses to attack body systems.
"When the virus invades, it triggers very different responses in different people by activating a variety of pathways," Jake Chen, PhD, associate director and chief bioinformatics officer of the UAB School of Medicine Informatics Institute, said. "Understanding more about how the genes of the virus behave and interact with a patient's individual susceptibilities could spark ideas for new therapies and provide clues about which therapeutics we may already have that could be effective against the disease. It would support a data-driven precision medicine approach to matching the right therapy to the right patient."
The database is known as PAGER-CoV. It grew out of the infrastructure of PAGER (pathways, annotated gene lists and gene signatures), a database of gene sets developed by Chen over the past ten years primarily through his work with the genetics of cancer.
Pathways are the roadmaps that describe how genes are turned on and off, and how they establish connections with each other. Annotated gene lists provide empirical information that researchers collect from experiments or the literature, and they help researchers understand how a certain cell type behaves under different conditions. A gene signature is a unique pattern of gene expression within a cell, drawn from a single gene or a group of genes, that provides information about the activity of those genes in the cell.
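For readers who think in code, the three kinds of gene sets described above can be pictured as named collections of gene symbols. The sketch below is only an illustration under stated assumptions: the `GeneSet` class, the `jaccard` function, and the gene names are all hypothetical placeholders, not the PAGER-CoV schema, data, or scoring method. It shows one simple way to ask how much a gene signature overlaps with a pathway.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneSet:
    """A named collection of gene symbols (hypothetical, not PAGER-CoV's schema)."""
    name: str
    kind: str  # "pathway", "annotated_gene_list", or "gene_signature"
    genes: frozenset


def jaccard(a: GeneSet, b: GeneSet) -> float:
    """Genes shared by both sets, as a fraction of all genes in either set."""
    union = a.genes | b.genes
    if not union:
        return 0.0  # two empty sets have no measurable overlap
    return len(a.genes & b.genes) / len(union)


# Placeholder gene symbols, invented for illustration only.
pathway = GeneSet(
    "example_pathway", "pathway",
    frozenset({"GENE_A", "GENE_B", "GENE_C"}),
)
signature = GeneSet(
    "example_signature", "gene_signature",
    frozenset({"GENE_B", "GENE_C", "GENE_D"}),
)

shared = sorted(pathway.genes & signature.genes)
score = jaccard(pathway, signature)
print(shared, score)
```

A higher score would suggest that the signature and the pathway involve many of the same genes, which is the kind of connection a researcher might then examine more closely.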
"Over the summer my colleagues and I began to adapt the PAGER format to collect genetic information currently available related to the virus and to add more as new findings were reported. Part of the data search used automated integration, but much of it was done with individually curated searches. We recruited our doctoral students who had been diverted from their regular work by the pandemic to assist us in combing through the new literature. Before we entered new information in the database, we ran the same research again to verify the findings," Chen said.
A report of the work was published in January in Nucleic Acids Research, a respected international science publication.
The important role of genetics in understanding the virus became evident early in the pandemic with the extreme variability in the course of the disease. "We wondered why some patients become very ill and die, while others show only mild symptoms or none at all. Why are different organs and systems affected? What determines who becomes a long hauler with persisting symptoms? Understanding the genetics should help us find the answers," Chen said.
Each point where the genes of the virus interact with the genes of the patient could offer an opportunity to intervene.
"Something about the hE2 cells deep within the lungs is vulnerable to the virus," Chen said. "That's where it anchors to ACE receptors and begins trying to hijack the body's cells to begin replicating its RNA. Then it starts down molecular pathways, turning genes on and off as it attacks."
As vulnerable genes linked to specific symptoms are identified, researchers may be able to identify therapies to counteract the effects. This could be particularly important to people with severe or persistent symptoms.
The body's response is also responsible for much of the damage. Individual genetic vulnerabilities, existing health conditions and other biological factors can trigger an over-response by the immune system, releasing a cytokine storm or creating a perfect environment where the virus can thrive. Learning how the body's genes affect response could lead to better strategies for regulating the immune system to prevent damage.
"SARS-CoV-2 is a new virus with 15 genes, and there is still much we don't understand about how the virus proteins are interacting with human cells and their downstream effects or what we can do about it," Chen said. "We invite researchers worldwide to make use of the portal and to participate in this community-based knowledge curation effort."
PAGER-CoV is freely available to the public, with no registration or login required, at http://discovery.informatics.uab.edu/PAGER-CoV. The data can be downloaded on the condition that users cite the work when using data from the PAGER-CoV website.
Joint first authors on the paper published in Nucleic Acids Research are Zongliang Yue, PhD, and Eric Zhang. Additional co-authors are Clark Xu, Sunny Khurana, Nishant Batra, Son Do Hai Dang and James J. Cimino, PhD.
PAGER-CoV was developed with support from the University of Alabama at Birmingham Informatics Institute, the UAB Academic Enrichment Fund, the UAB Center for Clinical and Translational Science, the National Cancer Institute, and the National Center for Advancing Translational Sciences of the National Institutes of Health.