

IT WAS IN RECOGNITION of this legacy that, this year, Waterman was elected a fellow of the National Academy of Sciences, one of the highest honors that can be accorded a scientist or engineer. In November 2000, Celera Genomics – the private company that, in competition with a U.S. government team, completed the sequencing of the human genome (with heavy reliance on the Lander-Waterman algorithm) – made Waterman its first Celera Fellow. “Dr. Waterman,” noted the citation, “often referred to as the father of computational biology, is responsible for introducing the most important mathematical algorithms that have allowed scientists to assemble large genomes.” Waterman also founded the field’s major periodical, the Journal of Computational Biology, and its premier conference, RECOMB, Celera officials acknowledged. And he wrote the first textbook on computational biology.
An enormous amount has happened in both biology and its computational arm since the Smith-Waterman algorithm appeared. Waterman recalls a prediction he heard at a conference in 1979: If everything went well, the speaker had said, by the year 2000 we might be sequencing 60,000 base-pairs (genetic letters) per year. “We are now sequencing 1 billion base-pairs every month,” Waterman says with a smile.

Norman Arnheim’s pioneering role in discovering the polymerase chain reaction represented a major breakthrough for DNA analysis. He is now tracking the root cause of Huntington’s disease.

This information avalanche contains the explanation of how cells work at the most basic levels. It also potentially contains the information needed to end cancer – now clearly understood to be a disease of cell regulation – and that would be just the first step toward developing a batch of new cures and treatments for the medical profession. But this gigantic, accumulating body of data must first be analyzed and understood. And, as the founding proposal for USC’s new computational genomics center states, “the basic data of biology are too vast for any one person to grasp more than a small fraction.”
“The problem,” says Arnheim, “is an issue for every life scientist. We now have methods for rapidly collecting large amounts of data. We’re all running into the same problem. Once you’ve collected the data, how can you analyze it? The analysis is becoming the limiting factor.”
Waterman sees the role of the new center as “organizing a response to this data.”
“We want to identify the genes, and their transcription factors, and assemble a complete parts-list for the cell – including the whole range of proteins,” he says. “At the same time, we want to start to understand the regulatory regime and its pathways. And we have to do all of this in accordance with what we learned from [mid-20th-century zoologist and geneticist Theodosius] Dobzhansky: ‘Nothing in biology makes sense except in the light of evolution.’”
“My dream,” adds Waterman, “is to live to see a realistic model of how a cell works.”

THAT DREAM OBVIOUSLY WON'T BE REALIZED in its entirety at USC, but many talented hands besides Waterman’s are hard at work here. Among them is molecular biologist Myron Goodman, whose lab recently cracked a set of technical hurdles one colleague calls – in the jargon of skiing difficulty – “the black-diamond slopes of DNA biochemistry.”
Goodman’s team recreated in a test tube a cell process in which highly stressed bacteria try to repair genetic damage. The scientists found a whole new, previously unknown, less-accurate mechanism for copying DNA, dubbed a “sloppier copier.” This discovery – since confirmed and shown to work in humans as well as bacteria – has sparked a wave of study and speculation around the world.
Goodman now heads the newly organized Molecular and Computational Biology section of the College’s biological sciences department, which brings together computational biologists like Waterman and Tavaré, experimentalists like Arnheim and computing experts like Leonard Adleman.

A single gene is rarely responsible for disease. But finding the right gene combinations can be excruciatingly difficult. “For a pure mathematician, it’s wonderful to work in a field that makes a difference in people’s lives.”

Goodman is brimming with enthusiasm for the future of molecular biology at USC. “We had it all here,” he says, referring to USC’s longstanding grasp of the core elements. “Now it’s recognized.”
What impels Goodman now is the quest to finally catch a glimpse of the intricate clockwork that makes a cell work, while it’s working. “We are going to be able to apply mathematics to investigate the temporal regulation of genes in complex organisms,” he says. New “gene chip” technology – applying fabrication techniques invented for computers to problems of molecular biology – puts this goal squarely on the table. But in order to reliably interpret the data that gene chips supply, exceptionally powerful mathematical tools are needed.
Like Goodman, molecular biologist Norman Arnheim has grand ambitions: he hopes to track down a congenital killer. A leader in the study of heredity and genetic disease, Arnheim is also the experimental ace who pioneered a simple, ingenious way to harvest small amounts of DNA to produce samples large enough for statistical analysis. Using DNA from sperm cells, Arnheim, Tavaré, Goodman and researchers on both campuses are now making a concentrated, interdisciplinary push to find the root cause of Huntington’s disease, a fatal hereditary syndrome that causes chorea (dance-like involuntary movements), violent twitching and jerks, slurred speech, depression, irritability and progressive dementia.
It’s been known for some time that people with Huntington’s disease have a distinct abnormality in their DNA – specifically, three base-pair (letter) triads with a certain spelling (CAG) repeated over and over again, like a key held down too long on a computer keyboardddddddddddddd.
The USC group studying Huntington’s and other “genetic instability” diseases has won unusual long-term support from the National Institutes of Health in its attack on this problem precisely because it has the tools needed to win – not just the capability to create “knockout mice” prone to the genetic repeat, but, crucially, the mathematical and computer tools necessary to understand and model the results.
Similarly, the work of Magnus Nordborg has attracted a $2.7 million grant at least in part because of the full lab and other support USC can offer computational biologists. Nordborg, a Swedish molecular biologist educated in the United States, now collaborates with Tavaré in a subspecialty called “coalescent theory” – tracing traits back in time, building the mathematics needed to assess the relatedness of individuals. He experiments with a small weed called Arabidopsis.
The plant’s genetic makeup has been studied by geneticists for decades and is now well known. Like a modern-day molecular Gregor Mendel, Nordborg crosses the weeds to observe the way forces like cold or drought bring out special traits in the genome – actually observing evolution in action on the genomic level.

