The powerful world of bioinformatics
The
computer is increasingly being used to decipher, manage and organise
the vast genetic information that is the raw resource for the emerging
biotech economy. DEEPAK SHIKARPUR describes the exciting convergence
of IT and biotech and analyses the potential impact on both these
fields, as well as the ramifications for humanity.
After more than 40 years of running on parallel
tracks, information technology and biotechnology are slowly beginning
to fuse into a single technological and economic force. The computer
is increasingly being used to decipher, manage and organise the
vast genetic information that is the raw resource for the emerging
biotech economy. Scientists working in the new field of bioinformatics
are beginning to download this vast genetic information, creating
a powerful data warehouse, a Biological Data Bank. The rich
genetic information in these biological data banks is being used
by researchers to simulate the natural world on the computer.
The industrial age would not have been
possible without the invention of the printing press. Print technology
provided a new means of communication to manage the fast-paced,
complex world of coal and steam power. The print medium also redefined
the way human beings organise knowledge. It introduced charts, lists,
graphs and other visual aids. But today print technology is being
augmented, and increasingly subsumed, by computer technology in the
organisation and management of production, commerce and telecommunication.
The primary role of new communication technologies will be to manage
the genetic information of the new biotech marketplace just as print
was used to manage the industrial marketplace built on fossil fuels.
Information technology is not so much an
economic resource as it is a language of management
and coordination. Its destiny is intimately linked to the raw genetic
resources that it will isolate, download, organise, interpret, edit
and program in this Biotech Century. The computer and
accompanying telecommunication technologies are the extension of
the human nervous system into the world. In the coming years we
will increasingly come to use the computer as a substitute mind,
or language, to
manipulate, redirect and organise the vast genetic information that
makes up the physical substance of living nature.
Vast biological information
Today, molecular biologists around the world are busily engaged
in the most extensive data collection project in history. The researchers
are mapping and sequencing the entire genomes of creatures from
the lowliest bacteria to humans, with the goal of finding new ways
of harnessing and exploiting genetic information for economic purposes.
By the end of the twenty-first century, molecular biologists should
have deciphered and catalogued the genomes of tens of thousands
of living organisms: a vast library containing the evolutionary
blueprints of many of the microorganisms, plants and animals that
populate the Earth. Mapping the genomes of so many species will
yield quantities of information that will dwarf by orders of magnitude
anything encountered before. The biological information being gathered
is so great that it can only be managed by computers and stored
electronically in thousands of databases around the world. For example,
the complete human genome (and humans are only one of the species that will
be sequenced and mapped), were it to be typed out in the form
used in a telephone directory, would take up five hundred volumes
of a typical city's thousand-page directory. That's a database
containing more than three billion entries.
Taking the analogy a step further, if we
were to print out the data on all human diversity, the database
would be at least four orders of magnitude bigger, or ten thousand
times the size of the first database. In future, scientists are
likely to concentrate their efforts on micromanaging and updating
databases for small regions of the genome of individual species,
and coordinating their research with others by way of genome
workstations: computer terminals that can provide researchers
with access to the genomic databases of their colleagues around
the world.
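The telephone-directory analogy above can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope illustration: the figures for entries, volumes, pages and the ten-thousand-fold factor come from the article, while the function names and the assumption that each base can be stored in two bits are ours.

```python
# Back-of-the-envelope check of the telephone-directory analogy.
# Figures taken from the article:
ENTRIES = 3_000_000_000      # "more than three billion entries"
VOLUMES = 500                # five hundred volumes
PAGES_PER_VOLUME = 1_000     # thousand-page directory
DIVERSITY_FACTOR = 10 ** 4   # "four orders of magnitude bigger"


def entries_per_page(entries, volumes, pages_per_volume):
    """How many entries each printed page would have to hold."""
    return entries // (volumes * pages_per_volume)


def packed_size_bytes(entries, bits_per_entry=2):
    """Storage needed if each entry is packed into a fixed number of bits.

    Assumption (not from the article): each entry is one of the four
    bases A, C, G, T, so two bits per entry are enough.
    """
    return entries * bits_per_entry // 8


per_page = entries_per_page(ENTRIES, VOLUMES, PAGES_PER_VOLUME)
genome_bytes = packed_size_bytes(ENTRIES)
diversity_entries = ENTRIES * DIVERSITY_FACTOR

print(f"Entries per directory page: {per_page:,}")
print(f"One genome, packed at 2 bits per entry: {genome_bytes:,} bytes")
print(f"Entries in the human-diversity database: {diversity_entries:,}")
```

Under these assumptions each page would carry about six thousand entries, and a single genome would fit in well under a gigabyte, which suggests the real storage burden lies in the many genomes and the diversity data rather than in any one sequence.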
Multidisciplinary cooperation
Collecting, downloading, managing, and utilising genomic information
will require closer cooperation of researchers in the related fields
of physics, mathematics, engineering, computer science, chemistry
and molecular biology. The Human Genome Project has hastened the
coming together of the computer and genetic sciences. Sequencing
and analysing the three billion base pairs would not be possible
without the help of computer scientists and increasingly sophisticated
computational techniques.
The Human Genome Project is turning biology
into an information science. Mapping and sequencing the genomes
is just the beginning. Reorganising the whole of the natural world
at the genetic level, with an eye to converting it into an array of
useful commodities in the marketplace, is the larger challenge. Understanding
and chronicling all of the webs of relationships between genes,
tissues, organs, organisms and external environments and the perturbations
that trigger genetic mutations and phenotypic responses, is so far
beyond any kind of complex system ever modelled that only an interdisciplinary
approach, leaning heavily on the computational skills of the information
scientists, can hope to accomplish the task.
Virtual biology
Computers are also being used to create virtual biological environments
from which to model complex biological organisms, networks and ecosystems.
The virtual environments help researchers create new hypotheses
and scenarios that will later be used in the laboratory to test
new agricultural and pharmaceutical products and medical treatments
on living organisms. Working in virtual worlds, biologists can create
new synthetic molecules with a few keystrokes, bypassing the often laborious
process, which can take years, of attempting to synthesise a real
molecule on the lab bench. With three-dimensional computer models,
researchers can play with various combinations on the screen, connecting
different molecules to see how they interact. Chemists are already
talking about compounds that could reproduce themselves, conduct
electricity, detect pollution, stop tumours, counter the effects
of cocaine, and block the progress of AIDS.
In 1996, the first DNA chip was made. DNA
chips closely resemble computer chips; they are packed with DNA
and are designed to read the reams of genetic information in the
genomes of living organisms. The first DNA chip was designed to
detect genetic abnormalities. Scientists say that the day is not
far off when DNA chips will be able to scan an individual patient,
read his or her genetic makeup in precise detail and even be able
to detect abnormal or malfunctioning genes. DNA chips will eventually
be able to determine which genes are flicking "on" or
"off" at any given time. Other DNA chips might be used
to scan a throat swab to identify the specific microbe that might
be the cause of a patient's sore throat, even identifying specific
genes in the bacteria that confer resistance to certain antibiotics.
DNA supercomputer
The final integration of the information and life sciences comes
in the form of the molecular computer, a thinking machine
made of DNA strands rather than silicon. Scientists have already
constructed the first DNA computer and a growing number of both
computer scientists and molecular biologists predict that at some
time in the early years of the Biotech Century, much computing will
take place along DNA pathways rather than on the integrated circuitry
of a microchip. DNA's ability to compute information greatly
exceeds that of the most advanced supercomputers that exist today. Unlike
most conventional computers, which are sequential and can only handle
one thing at a time, DNA is a massively parallel computing machine
and can theoretically compute a hundred million billion things at
once. DNA is essentially digital, which means that it can count.
For the first DNA computer, a coding procedure was invented for translating DNA base pairs
into strings of ones and zeroes. The contents
of test tubes filled with genetically sequenced molecules were then poured together, which
allowed the DNA to simulate the electronic gates by which computers
make their yes-no decisions. In short, the DNA was made to "think".
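The idea of translating DNA bases into strings of ones and zeroes can be illustrated on an ordinary computer. The sketch below is only an illustration: the two-bit mapping and the function names are hypothetical choices of ours, not the coding procedure used in the experiment described above, and it shows only the encoding step, not the test-tube computation itself.

```python
# Hypothetical two-bit code for the four DNA bases. Any one-to-one
# mapping would do; this one is chosen purely for illustration.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def encode(sequence):
    """Translate a DNA sequence into a string of ones and zeroes."""
    try:
        return "".join(BASE_TO_BITS[base] for base in sequence.upper())
    except KeyError as err:
        raise ValueError(f"not a DNA base: {err.args[0]!r}") from None


def decode(bits):
    """Translate a string of ones and zeroes back into DNA bases."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string must have an even length")
    try:
        return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))
    except KeyError as err:
        raise ValueError(f"not a bit pair: {err.args[0]!r}") from None


encoded = encode("GATTACA")
print(encoded)          # 10001111000100
print(decode(encoded))  # GATTACA
```

Because there are exactly four bases, two bits per base are enough, which is one simple sense in which DNA can be called digital.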
The DNA supercomputer will bring the information
sciences and life sciences together into a single technology revolution
with the power to remake the world.
To facilitate the research process, many
Internet-based sites and application service providers (ASPs) are including e-commerce features
such as links to suppliers of reagents, DNA sequences, or research
clones. Whether it is with data visualisation programs or through
integration efforts at large corporations and Internet-based ASPs,
the goal increasingly is to put the tools in researchers' hands.
Bioinformatics offerings continue to evolve and target individual
scientists, often through their desktop computers.
Bioinformatics companies say they will
increase access to more data and tools as they are generated, moving
beyond the most widely used gene-sequencing analysis to include
areas such as gene expression, protein identification and structure,
biochemical pathway data, pharmacogenomics, and chemical structure
and activity. From a user perspective, scientists hope to get integrated
packages of data, software, patent citations, literature, and supplier
links to support their research. These combined elements are anticipated
to decrease time spent in handling, manipulating, transmitting,
and analysing data to ultimately speed up drug discovery and development.
Deepak Shikarpur is executive director of the
Computer Society of India. Contact him at deepakshikarpur@yahoo.com