A genome is one complete set of hereditary information that characterizes an organism, as encoded in its DNA (or, for some viruses, RNA). That is, a genome is equivalent to the complete genetic sequence on one of the two sets of chromosomes in the somatic cells of a diploid individual, or the total genetic sequence in the single chromosome of a bacterium, or the RNA sequence of an RNA virus. The genome includes both the genes and the non-coding sequences of DNA.
In eukaryotes, the term genome can be applied specifically to mean that genetic content stored on a complete set of nuclear DNA (i.e., the "nuclear genome") but can also be applied to that stored within organelles that contain their own DNA, as with the mitochondrial genome or the chloroplast genome.
The sequencing and comparison of the genomes of diverse organisms show the remarkable connectedness of living organisms, as even more complex species, higher on the phylogenetic tree, share basic sequences with bacteria. Many sequences in the genomes of the yeast Saccharomyces, the fruit fly Drosophila, and the worm Caenorhabditis are the same, coding for the same genes.
The complexity of the genome is also evident. An analogy to the human genome stored on DNA is that of instructions stored in a book:
The units of heredity in living organisms are encoded in an organism's genetic material, DNA. The nucleic acid DNA (deoxyribonucleic acid) contains the genetic instructions used in the development and functioning of all known living organisms. (Some viruses utilize RNA, but are not universally considered living organisms.) The main role of DNA molecules is the long-term storage of information. DNA works together with the nucleic acid RNA (ribonucleic acid) to oversee and carry out the construction of the tens of thousands of protein molecules needed by living organisms.
As nucleic acids, DNA and RNA contain numerous nucleotides (each composed of a phosphate unit, a sugar unit, and a "base" unit) linked recursively through the sugar and phosphate units to form a long chain with base units protruding from it. Nucleic acids carry the coded genetic information of life according to the order of the base units extending along the length of the molecule. The DNA, which carries genetic information in cells, is normally packaged in the form of one or more large macromolecules called chromosomes.
Genome refers to the total DNA sequence that characterizes a species. That is, a genome is the genetic content (DNA sequences) contained within one set of chromosomes in eukaryotes, or the single chromosome of prokaryotes. For those viruses that utilize only RNA as hereditary material, the genome is equivalent to the RNA sequence. The genome includes not only the coding genes of a chromosome but also the non-coding sequences, sometimes referred to as "junk DNA." In humans, this non-coding DNA may be as much as 97% of the total DNA.
The term genome was coined in 1920 by Hans Winkler, Professor of Botany at the University of Hamburg, Germany. The Oxford English Dictionary suggests the name to be a portmanteau of the words gene and chromosome; however, many related -ome words already existed, such as biome and rhizome, forming a vocabulary into which genome fits systematically.
When people say that the genome of a sexually reproducing species has been "sequenced," typically they are referring to a determination of the sequences of one set of autosomes and one of each type of sex chromosome, which together represent both of the possible sexes. Even in species that exist in only one sex, what is described as "a genome sequence" may be a composite read from the chromosomes of various individuals.
In general use, the phrase "genetic makeup" is sometimes used conversationally to mean the genome of a particular individual or organism. The study of the global properties of genomes of related organisms is usually referred to as genomics, which distinguishes it from genetics, which generally studies the properties of single genes or groups of genes.
The size of genomes is measured in terms of the number of base pairs, although the large numbers mean that the unit used tends to be megabases (Mb), with one megabase corresponding to 1,000,000 base pairs.
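The unit conversion can be sketched in a few lines of Python. This is a minimal illustration, assuming the conventional definitions 1 kb = 1,000 base pairs, 1 Mb = 1,000,000 base pairs, and 1 Gb = 1,000,000,000 base pairs; the kb and Gb units and the function name `format_genome_size` are not from the article and are introduced here only for the example.

```python
# Illustrative helper (hypothetical name): express a genome size given in
# base pairs using the largest unit that keeps the number at or above 1.
UNITS = [("Gb", 1_000_000_000), ("Mb", 1_000_000), ("kb", 1_000)]

def format_genome_size(base_pairs: int) -> str:
    """Return a short human-readable size such as '4 Mb'."""
    for name, factor in UNITS:
        if base_pairs >= factor:
            # ':g' drops trailing zeros, so 4.0 is rendered as '4'.
            return f"{base_pairs / factor:g} {name}"
    return f"{base_pairs} bp"

print(format_genome_size(4_000_000))      # a bacterium of about 4 Mb
print(format_genome_size(3_200_000_000))  # a human-sized genome
print(format_genome_size(3_569))          # a small viral genome
```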
Most biological entities that are more complex than a virus sometimes or always carry additional genetic material besides that which resides in their chromosomes. The plastids of plants and algae, such as chloroplasts, carry genetic material within their membranes, separate and distinct from that of the nucleus. Likewise, the mitochondria of all eukaryotes contain genetic material within their membranes as well, separate and distinct from the nuclear DNA.
Generally, in eukaryotes such as plants, protozoa, and animals, the term "genome" typically refers only to the information in chromosomal DNA. So although these organisms contain mitochondria that have their own DNA, the genes in this mitochondrial DNA are not considered part of the genome. Instead, mitochondria or chloroplasts are sometimes said to have their own genome, often referred to as the "mitochondrial genome" or the "chloroplast genome."
In some contexts, such as sequencing the genome of a pathogenic microbe, "genome" is meant to include information stored on this auxiliary material, which is carried in plasmids or mitochondria. In such circumstances then, "genome" describes all of the genes and information on non-coding DNA that have the potential to be present.
Note that a genome does not capture the genetic diversity or the genetic polymorphism of a species. For example, the human genome sequence in principle could be determined from just half the information on the DNA of one cell from one individual. To learn what variations in genetic information underlie particular traits or diseases requires comparisons across individuals. This point explains the common usage of "genome" (which parallels a common usage of "gene") to refer not to the information in any particular DNA sequence, but to a whole family of sequences that share a biological context.
Although this concept may seem counterintuitive, it is the same concept that says there is no particular shape that is the shape of a cheetah. Cheetahs vary, and so do the sequences of their genomes. Yet both the individual animals and their sequences share commonalities, so one can learn something about cheetahs and "cheetah-ness" from a single example of either.
Technology has developed whereby it is possible to determine the entire DNA sequence of an organism's genome. In 1976, Walter Fiers at the University of Ghent (Belgium) was the first to establish the complete nucleotide sequence of a viral RNA-genome (bacteriophage MS2). The first DNA-genome project to be completed was that of phage Φ-X174, with only 5,386 base pairs, which was sequenced by Fred Sanger in 1977. The first bacterial genome to be completed was that of Haemophilus influenzae, sequenced by a team at The Institute for Genomic Research in 1995. Genomes were subsequently elucidated for several bacteria (including Escherichia coli), then yeast (Saccharomyces), a plant (Arabidopsis), and some animals (the nematode Caenorhabditis and the fruit fly Drosophila).
The genomes of numerous organisms have since been sequenced. The Human Genome Project was organized to map and to sequence the human genome. The completion of the essential sequence of the human genome was announced in June 2000. Other genome projects include mouse, rice, and so forth, with the cost of sequencing continuing to drop and making the process more feasible. In May 2007, the full genome of DNA pioneer James D. Watson was recorded, perhaps a gateway to upcoming personalized genomic medicine.
One of the more interesting results from comparing the genomes of various organisms is that there are basic genes of higher organisms that can be traced back to genes in bacteria.
In general, genome size is larger for organisms higher on the phylogenetic tree, with humans having a genome of about 3,200 Mb and a bacterium only about 4 Mb. However, because genomes contain non-coding as well as coding DNA, genome size does not strictly track complexity: some organisms, such as lungfishes and salamanders, have unusually large genomes. The largest known genome belongs to an amoeba (Amoeba dubia).
|Organism||Genome size (base pairs)||Note|
|Virus, Bacteriophage MS2||3,569||First sequenced RNA-genome|
|Virus, Phage Φ-X174||5,386||First sequenced DNA-genome|
|Virus, Phage λ||50,000|||
|Bacterium, Haemophilus influenzae||1,830,000||First genome of living organism, July 1995|
|Bacterium, Carsonella ruddii||160,000||Smallest non-viral genome|
|Bacterium, Buchnera aphidicola||600,000|||
|Bacterium, Wigglesworthia glossinidia||700,000|||
|Bacterium, Escherichia coli||4,000,000|||
|Amoeba, Amoeba dubia||670,000,000,000||Largest known genome|
|Plant, Arabidopsis thaliana||157,000,000||First plant genome sequenced, Dec 2000|
|Plant, Genlisea margaretae||63,400,000||Smallest recorded flowering plant genome, 2006|
|Plant, Fritillaria assyriaca||130,000,000,000|||
|Plant, Populus trichocarpa||480,000,000||First tree genome, Sept 2006|
|Fungus, Aspergillus nidulans||30,000,000|||
|Nematode, Caenorhabditis elegans||98,000,000||First multicellular animal genome, December 1998|
|Insect, Drosophila melanogaster aka Fruit Fly||130,000,000|||
|Insect, Bombyx mori aka Silk Moth||530,000,000|||
|Insect, Apis mellifera aka Honeybee||1,770,000,000|||
|Fish, Tetraodon nigroviridis, type of Puffer fish||385,000,000||Smallest vertebrate genome known|
|Mammal, Homo sapiens||3,200,000,000|||
|Fish, Protopterus aethiopicus aka Marbled lungfish||130,000,000,000||Largest vertebrate genome known|
Note: The DNA from a single human cell has a length of ~1.8 meters (but at a width of ~2.4 nanometers).
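To make the spread of sizes concrete, the following Python sketch expresses a handful of the genome sizes from the table above as multiples of the human genome. The figures are copied from the table; the choice of organisms and the helper name `size_relative_to` are illustrative assumptions, not part of the original article.

```python
# Genome sizes in base pairs, copied from the table above.
GENOME_SIZES = {
    "Bacteriophage MS2": 3_569,
    "Escherichia coli": 4_000_000,
    "Caenorhabditis elegans": 98_000_000,
    "Drosophila melanogaster": 130_000_000,
    "Homo sapiens": 3_200_000_000,
    "Protopterus aethiopicus": 130_000_000_000,
    "Amoeba dubia": 670_000_000_000,
}

def size_relative_to(reference: str, sizes: dict[str, int]) -> dict[str, float]:
    """Return each genome size divided by the size of the reference genome."""
    ref = sizes[reference]
    return {name: bp / ref for name, bp in sizes.items()}

ratios = size_relative_to("Homo sapiens", GENOME_SIZES)

# Print from smallest to largest genome.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:26s} {ratio:14.6f} x human")
```

On these figures the marbled lungfish and Amoeba dubia genomes come out roughly 40 and 200 times the size of the human genome, which illustrates that genome size alone is a poor proxy for organismal complexity.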
Since genomes and their organisms are very complex, one research strategy is to reduce the number of genes in a genome to the bare minimum and still have the organism in question survive. There is experimental work being done on minimal genomes for single cell organisms as well as minimal genomes for multicellular organisms. The work is both in vivo and in silico.
Genomes are more than the sum of an organism's genes and have traits that may be measured and studied without reference to the details of any particular genes and their products. Researchers compare traits such as chromosome number (karyotype), genome size, gene order, codon usage bias, and GC-content to determine what mechanisms could have produced the great variety of genomes that exist today.
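As one example of such a genome-wide trait, GC-content is simply the fraction of bases in a sequence that are guanine (G) or cytosine (C). The sketch below is a minimal illustration in Python; the function name `gc_content` and the example sequence are invented for this purpose, and ambiguous symbols such as N are ignored by assumption.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C among the A, C, G, T bases in `sequence`.

    Characters other than A, C, G, T (for example N) are ignored.
    """
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return (seq.count("G") + seq.count("C")) / counted

example = "ATGCGCGATTAACG"  # made-up sequence, for illustration only
print(f"GC-content: {gc_content(example):.1%}")
```

Real analyses usually compute this value over whole chromosomes or in sliding windows, but the underlying calculation is the same.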
Duplications play a major role in shaping the genome. Duplications may range from extension of short tandem repeats, to duplication of a cluster of genes, and all the way to duplications of entire chromosomes or even entire genomes. Such duplications are probably fundamental to the creation of genetic novelty.
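The smallest scale of duplication mentioned above, the extension of a short tandem repeat, can be illustrated with a short Python sketch. It counts the longest run of back-to-back copies of a repeat unit in a sequence. The function name `longest_tandem_run` and both sequences are hypothetical, chosen only to show a repeat growing by one copy.

```python
def longest_tandem_run(sequence: str, unit: str) -> int:
    """Return the largest number of consecutive copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("unit must be non-empty")
    best = 0
    # Try every start position and count how many copies follow directly.
    for start in range(len(sequence) - len(unit) + 1):
        copies = 0
        pos = start
        while sequence.startswith(unit, pos):
            copies += 1
            pos += len(unit)
        best = max(best, copies)
    return best

before = "TTCAGCAGCAGGA"     # invented sequence with three CAG copies
after = "TTCAGCAGCAGCAGGA"   # same sequence after the repeat gains one copy

print(longest_tandem_run(before, "CAG"))
print(longest_tandem_run(after, "CAG"))
```

The scan checks every start position, which is quadratic in the worst case but adequate for short illustrative sequences.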
Horizontal gene transfer is invoked to explain how there is often extreme similarity between small portions of the genomes of two organisms that are otherwise very distantly related. Horizontal gene transfer seems to be common among many microbes. The acquisition of entire sets of genes, even whole genomes of organisms, has also been postulated as a major source of transmitted variation in organisms. In addition, eukaryotic cells seem to have experienced a transfer of some genetic material from their chloroplast and mitochondrial genomes to their nuclear chromosomes.
New World Encyclopedia writers and editors rewrote and completed the Wikipedia article in accordance with New World Encyclopedia standards. This article abides by terms of the Creative Commons CC-by-sa 3.0 License (CC-by-sa), which may be used and disseminated with proper attribution. Credit is due under the terms of this license that can reference both the New World Encyclopedia contributors and the selfless volunteer contributors of the Wikimedia Foundation.