Statistical Methods Development in Genetic Epidemiology
David Conti is an associate professor in the USC Keck School of Medicine's
Department of Preventive Medicine, Division of Biostatistics, and the
Zilkha Neurogenetic Institute. His research interests include the development
of statistical methods in genetic epidemiology, as well as the investigation
of genetic contributions to smoking behavior, colon cancer, asthma, lymphoma,
and psychiatric disorders.
Recent advances in technology have made it feasible to measure millions of
single-nucleotide polymorphisms (SNPs), which are variations at a single
nucleotide position in the DNA sequence. Using lab-generated data and leveraging publicly
available data, Conti and his colleagues extend the amount of genetic
information by estimating several million additional SNPs for each
individual, which is a computationally demanding process. For a typical
study, it can take a month to estimate all of an individual's SNPs. By
breaking the genome into small units and using HPCC resources, the time
needed to produce such an estimate can be reduced to a few hours.
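The divide-and-process idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the group's actual pipeline: the chromosome lengths, the chunk size, and the `impute_chunk` function are hypothetical stand-ins, and a thread pool takes the place of a cluster scheduler that would run each unit as a separate job.

```python
from concurrent.futures import ThreadPoolExecutor


def make_chunks(chrom_lengths, chunk_size):
    """Split each chromosome into consecutive windows (1-based, inclusive)."""
    chunks = []
    for chrom, length in chrom_lengths.items():
        for start in range(1, length + 1, chunk_size):
            end = min(start + chunk_size - 1, length)
            chunks.append((chrom, start, end))
    return chunks


def impute_chunk(chunk):
    """Hypothetical placeholder for the real per-region estimation step."""
    chrom, start, end = chunk
    return {"chrom": chrom, "start": start, "end": end,
            "n_bases": end - start + 1}


def run_chunks(chunks, worker=impute_chunk, max_workers=4):
    """Process independent chunks concurrently; results keep chunk order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, chunks))


# Toy lengths, chosen only to keep the example small (assumption).
toy_lengths = {"chr1": 12_000_000, "chr2": 7_500_000}
chunks = make_chunks(toy_lengths, chunk_size=5_000_000)
results = run_chunks(chunks)
```

Because the units are independent, wall-clock time is limited mainly by the number of units that can run at once, which is why spreading them across many cluster nodes shortens the job.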
Armed with this information, Conti and his team can examine the impact of
each SNP on the development of diseases. Rather than testing each SNP
independently, they use Bayesian hierarchical modeling approaches to
interrogate combinations of SNPs for synergistic effects. Since the
number of SNPs is very large, the space of all possible combinations is
astronomical and requires the use of statistical and computational
techniques to limit the search to selected combinations.
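To give a sense of scale, the number of distinct SNP combinations of a given size is a binomial coefficient. The short Python sketch below counts them; the figure of one million SNPs is an assumed round number for illustration, not a value taken from the study.

```python
from math import comb


def count_combinations(n_snps, max_order):
    """Number of distinct SNP subsets of each size from 1 to max_order."""
    return {k: comb(n_snps, k) for k in range(1, max_order + 1)}


# One million SNPs is an illustrative assumption.
counts = count_combinations(1_000_000, max_order=3)
for order, n in counts.items():
    print(f"subsets of size {order}: {n:.3e}")
```

Pairs alone already number roughly half a trillion, and the count grows by several orders of magnitude with each additional SNP in the combination, so exhaustive testing quickly becomes impractical.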
These selected combinations are determined in part by incorporating
known biology through structured ontologies. This allows the search
algorithm to center more heavily on combinations of SNPs from genes
within a biological pathway. For example, when investigating the genetic
role in response to smoking therapies, the search concentrates on
combinations of SNPs from pathways related to nicotine metabolism and
the brain's reward system (i.e., serotonin and dopamine pathways).
Simulations to characterize when and how these methods best identify
causal SNPs serve as the foundation for future analysis.
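One simple way to picture a pathway-guided search is to give SNP pairs from the same pathway more prior weight than other pairs. The Python sketch below does only that; it is not the Bayesian hierarchical model used by Conti's group. The SNP labels, pathway assignments, and weight values are all hypothetical.

```python
from itertools import combinations

# Hypothetical SNP-to-pathway annotations (None = no annotation).
snp_pathway = {
    "snp_A": "nicotine_metabolism",
    "snp_B": "nicotine_metabolism",
    "snp_C": "dopamine",
    "snp_D": "dopamine",
    "snp_E": None,
}


def pair_prior(snp_pathway, same_pathway_weight=5.0, base_weight=1.0):
    """Return a normalized prior over SNP pairs, up-weighting shared pathways."""
    weights = {}
    for a, b in combinations(sorted(snp_pathway), 2):
        pa, pb = snp_pathway[a], snp_pathway[b]
        shared = pa is not None and pa == pb
        weights[(a, b)] = same_pathway_weight if shared else base_weight
    total = sum(weights.values())
    return {pair: w / total for pair, w in weights.items()}


prior = pair_prior(snp_pathway)
# Pairs with the highest prior weight would be visited first by a search.
ranked = sorted(prior, key=prior.get, reverse=True)
```

A search that samples or ranks pairs according to such a prior spends most of its effort on biologically plausible combinations while still leaving some probability for pairs outside any annotated pathway.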
Conti receives funding for research from the National Institutes of
Health and the Leukemia & Lymphoma Society.