From New World Encyclopedia

Diagram showing a greatly simplified gene in relation to the double helix structure of DNA with intron (non-coding) and exon (coding) portions labeled. If the gene is active in a particular cell, it will be transcribed into RNA that serves either a direct function in the cell or as a template for making a protein. In eukaryotic organisms, making protein from RNA entails excising the intron regions and splicing together some of the exon regions to make the template for one or more different proteins depending on which of the RNA exon regions are spliced together.

Genes, the units of heredity in living organisms, are encoded in an organism's genetic material (DNA). They exert a central influence on the organism's physical aspects and are passed on to succeeding generations through the reproduction process. Genetic material can also be passed between unrelated individuals on viruses or through the process of transfection used in genetic engineering.

Common usage of the word "gene" reflects its meaning in molecular biology, namely the segments of DNA that cells transcribe into either RNA that is translated into proteins (DNA=>RNA=>protein) or RNA used for direct purposes (DNA=>RNA). The Sequence Ontology project, a consortium of several centers of genomic studies, defines a gene as: "A locatable region of genomic sequence, corresponding to a unit of inheritance, which is associated with regulatory regions, transcribed regions, and/or other functional sequence regions." The definition reflects the full complexity that has come to be associated with the term gene.

Genes encode the information necessary for constructing the multitude of proteins and RNA units needed to maintain an organism's existence, growth, action, and multiplication. Each gene that serves as the first step in protein formation is a region of DNA comprising a mixture of some sections (exons) that code for proteins, others (introns) that have no apparent function, and still others that define the beginning and end of the gene or the conditions in which the gene will be expressed or not expressed.

Although the human genome comprises roughly 25,000 genes carrying codes for proteins, each human cell has the potential of making about 100,000 different proteins. Further complexity lies in the additional 10,000 or so genes used for making RNA that directly serves such cellular functions as structure, catalysis, and regulation of gene expression. The proteins and RNAs all share in the tasks of maintaining the cell, one of which is the continual fine tuning of the exact selection of genes being expressed according to the cell's function and its continually changing environment. The ongoing discovery of so much functional RNA in the cell, much of it related to the expression of genes, is taken by some as a sign that RNA may deserve co-equal billing with DNA in terms of overall contribution to cellular function.

Genes are of central importance to the physical aspect of a living organism: A person's eye color, the breed of a dog, the gender of a horse. Mouse DNA yields a mouse, not an elephant. However, the impact of genes is sometimes extrapolated to the view that genes control everything about human lives and destiny. This is the concept of genetic determinism whereby human behavior, intelligence, emotions and attitudes, and health are fixed by genetic makeup and thus unchangeable. Such a misconception has at times been used as a base for explaining away racial prejudices, addictions, and criminal behavior, and seeking solutions to social problems by turning to genetic engineering as the ultimate solution.

The more balanced and generally recognized view is that biological contributions to solving social problems must be sought through a biology that takes into account the influence of social and cultural factors in human physical development and behavior.


In molecular biology, a gene is considered to comprise both a coding sequence—the region of DNA (or RNA, in the case of some viruses) that determines the structure of a protein—and a regulatory sequence—the region of DNA that controls when and where the protein will be produced. The genetic code determines how the coding DNA sequence is converted into a protein sequence (via transcription and translation). The genetic code is essentially the same for all known life, from bacteria to humans.

Through the proteins they encode, genes govern the cells in which they reside. In multicellular organisms, much of the development of the individual, as well as the day-to-day functions of the cells, is tied to genes. The genes' protein products fulfill roles ranging from mechanical support of the cell structure to the transportation and manufacture of other molecules and the regulation of other proteins' activities.

Mutations in the sequence of a gene may arise from rare, spontaneous changes (for example, during DNA replication). If these mutations occur in germ line cells, they may be passed on to the organism's offspring. Once propagated to the next generation, a mutation may lead to variation within the population of a species. Variants of a single gene are known as alleles, and differences in alleles may give rise to differences in traits, for example, eye color. A gene's most common allele is called the wild type allele, and rare alleles are called mutants.

The genotype of an individual organism is its specific genetic makeup (the specific genome). The phenotype of an individual organism is determined to some extent by the genotype, or by the identity of the alleles that an individual carries at one or more positions on the chromosomes. A phenotype is either the organism's total physical appearance and constitution or a specific manifestation of a trait, such as size, eye color, or behavior that varies between individuals. Many phenotypes are determined by multiple genes and influenced by environmental factors.

In most cases, RNA is an intermediate product in the process of manufacturing proteins from genes. However, for some gene sequences, the RNA molecules are the actual functional agents. For example, RNAs known as ribozymes are capable of enzymatic function, and small interfering RNAs have a regulatory role. The DNA sequences from which such RNAs are transcribed are known as non-coding RNA genes, or simply RNA genes.

Most living organisms carry their genes as, and transmit them to offspring as, DNA, but some viruses carry only RNA. Because these viruses use RNA, their cellular hosts may synthesize the viral proteins as soon as they are infected and without the delay in waiting for transcription. On the other hand, RNA retroviruses, such as HIV, require the reverse transcription of their genome from RNA into DNA before their proteins can be synthesized.

In common speech, "gene" is often used to refer to the hereditary cause of a trait, disease, or condition—as in "the gene for obesity." Speaking more precisely, a biologist might refer to an allele or a mutation that "has been implicated in" or "is associated with" obesity. This is because biologists know that many factors other than genes decide whether a person is obese or not: Eating habits, exercise, prenatal environment, upbringing, culture, and the availability of food, for example.

Moreover, it is highly unlikely that variations within a single gene, or single genetic locus, would fully determine an individual's genetic predisposition for obesity. Rather, the norm with regard to many and perhaps most ("complex" or "multifactorial") traits is that they reflect the combined effects of several factors including inheritance, interplay between genes and environment, and the combined influence of many genes. The term phenotype refers to the physical characteristics that result from the interplay of all of these factors.

Typical numbers of genes in an organism

This table gives typical numbers of genes and genome size for some organisms. Estimates of the number of genes in an organism are somewhat controversial because they depend on the discovery of genes, and no techniques currently exist to prove that a DNA sequence contains no gene. (In early genetics, genes could be identified only if there were mutations, or alleles.) Nonetheless, estimates are made based on current knowledge.

Organism                 Genes        Base pairs
Plant                    <50,000      <10¹¹
Human, mouse or rat      25,000       3×10⁹
Fruit fly                13,767       1.3×10⁸
Honey bee                15,000       3×10⁸
Worm                     19,000       9.7×10⁷
Fungus                   6,000        1.3×10⁷
Bacterium                500–6,000    5×10⁵–10⁷
Mycoplasma genitalium    500          580,000
DNA virus                10–900       5,000–800,000
RNA virus                1–25         1,000–23,000
Viroid                   0–1          ~500
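
One way to read the table is to divide genome size by gene count, which gives a rough figure for how much DNA there is per gene. The short Python sketch below does this for the single-valued rows only; the dictionary name and the function name are illustrative choices, and the result is only as reliable as the table's own estimates.

```python
# (genes, base pairs) copied from the table above; rows given as ranges are left out.
GENOMES = {
    "Human, mouse or rat": (25_000, 3e9),
    "Fruit fly": (13_767, 1.3e8),
    "Honey bee": (15_000, 3e8),
    "Worm": (19_000, 9.7e7),
    "Fungus": (6_000, 1.3e7),
    "Mycoplasma genitalium": (500, 580_000),
}


def base_pairs_per_gene(genes, base_pairs):
    """Average number of base pairs of genome per gene (coding and non-coding alike)."""
    return base_pairs / genes


for organism, (genes, base_pairs) in GENOMES.items():
    ratio = base_pairs_per_gene(genes, base_pairs)
    print(f"{organism}: about {ratio:,.0f} base pairs per gene")
```

On these figures the human genome has about 120,000 base pairs per gene, against about 1,160 for Mycoplasma genitalium, which illustrates how much of a large genome lies outside protein-coding sequence.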

Human gene nomenclature

For each known human gene, the HUGO Gene Nomenclature Committee (HGNC) approves a gene name and symbol (short-form abbreviation) and stores all approved symbols in the HGNC Database. Each symbol is unique and each gene is given only one symbol. This protocol greatly facilitates clear and precise gene identifications in communications and in electronic data retrieval from publications. By convention, symbols for the different genes within a gene family all share a certain parallelism of construction. The symbols for human genes can also be applied to congruent genes in other species, such as the mouse.


The word "gene" was coined in 1909 by Danish botanist Wilhelm Johannsen for the fundamental physical and functional unit of heredity. The word gene was derived from Hugo De Vries' term pangen, itself a derivative of the word pangenesis, which Darwin (1868) had coined. The word pangenesis is made from the Greek words pan (a prefix meaning "whole," "encompassing") and genesis ("birth") or genos ("origin").

The existence of genes was first suggested by Gregor Mendel, who, in the 1860s, studied inheritance in pea plants and hypothesized a factor that conveys traits from parent to offspring. Although he did not use the term "gene," he explained his results in terms of inherited characteristics. Mendel was also the first to hypothesize independent assortment (the idea that pairs of alleles separate independently during meiosis), the distinction between dominant and recessive traits, the distinction between a heterozygote and homozygote (an organism with different or the same alleles, respectively, of a certain gene on homologous chromosomes), and the difference between what would later be described as genotype (specific genetic make-up) and phenotype (physical manifestation of the genetic make-up). Mendel's concept was finally named when Wilhelm Johannsen coined the word "gene" in 1909.

In the early 1900s, Mendel's work received renewed attention from scientists. In 1910, Thomas Hunt Morgan showed that genes reside on specific chromosomes. He later showed that genes occupy specific locations on the chromosome. With this knowledge, Morgan and his students began the first chromosomal map of the fruit fly Drosophila. In 1928, Frederick Griffith showed that genes could be transferred. In what is now known as Griffith's experiment, mice were injected with a heat-killed deadly strain of bacteria together with a live, harmless strain of the same bacteria; genetic information passed from the dead strain to the live one, which then killed the mice.

In 1941, George Wells Beadle and Edward Lawrie Tatum showed that mutations in genes caused errors in certain steps in metabolic pathways. This showed that specific genes code for specific proteins, leading to the "one gene, one enzyme" hypothesis. Oswald Avery, Colin MacLeod, and Maclyn McCarty showed in 1944 that DNA holds the gene's information. In 1953, James D. Watson and Francis Crick demonstrated the molecular structure of DNA, a double helix. Together, these discoveries established the central dogma of molecular biology, which states that proteins are translated from RNA, which is transcribed from DNA. This dogma has since been shown to have exceptions, such as reverse transcription in retroviruses.

The term "gene" is shared by many disciplines, including classical genetics, molecular genetics, evolutionary biology, and population genetics. Because each discipline models the biology of life differently, the usage of the word gene varies between disciplines. It may refer to either material or conceptual entities.

Evolution and genes

Broadly defined, evolution is any heritable change in a population of organisms over time. As noted by Curtis & Barnes (1989),

The changes in populations that are considered evolutionary are those that are inheritable via the genetic material from one generation to another. As such, evolution can also be defined in terms of allele frequency, with alleles being alternative forms of a gene, such as an allele for blue eye color versus one for brown eye color. Two important and popular evolutionary theories that address the pattern and process of evolution are the theory of descent with modification and the theory of natural selection.

The theory of descent with modification, or the "theory of common descent" deals with the pattern of evolution and essentially postulates that all organisms have descended from common ancestors by a continuous process of branching. The theory of modification through natural selection, or the "theory of natural selection," deals with mechanisms and causal relationships, and offers one explanation for how evolution might have occurred—the process by which evolution took place to arrive at the pattern.

According to the modern evolutionary synthesis, which integrated Charles Darwin's theory of evolution by natural selection with Gregor Mendel's theory of genetics as the basis for biological inheritance and mathematical population genetics, evolution consists primarily of changes in the frequencies of alleles between one generation and another as a result of natural selection. Natural selection has traditionally been viewed as acting on individual organisms, but has also been seen as working on groups of organisms.
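
The idea of evolution as a change in allele frequency can be made concrete with a small calculation. The Python sketch below assumes a single gene with two alleles, written A and a, in a diploid population; the genotype counts for the two generations are invented for illustration and do not come from any study.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (frequency of A, frequency of a) from diploid genotype counts."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)  # each individual carries two alleles
    freq_A = (2 * n_AA + n_Aa) / total_alleles
    return freq_A, 1 - freq_A


# Hypothetical genotype counts (AA, Aa, aa) in two successive generations.
generation_1 = (36, 48, 16)
generation_2 = (49, 42, 9)

p1, q1 = allele_frequencies(*generation_1)
p2, q2 = allele_frequencies(*generation_2)

print(f"Generation 1: A = {p1:.2f}, a = {q1:.2f}")
print(f"Generation 2: A = {p2:.2f}, a = {q2:.2f}")
print(f"Change in frequency of A: {p2 - p1:+.2f}")
```

In this invented example the frequency of A rises from 0.60 to 0.70 between generations. The calculation only measures the change; it says nothing about whether selection, chance, or some other process caused it.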

An alternative model, the gene-centered view of evolution, sees natural selection as working on the level of genes.

Gene-centered view of evolution

The gene-centered view of evolution, gene selection theory, or selfish gene theory, holds that natural selection acts through differential survival of competing genes, increasing the frequency of those alleles whose phenotypic effects successfully promote their own propagation. According to this theory, adaptations are the phenotypic effects through which genes achieve their propagation.

The view of the gene as the unit of selection was mainly developed in the books Adaptation and Natural Selection, by George C. Williams, and also in The Selfish Gene and The Extended Phenotype, both by Richard Dawkins.

Essentially, this view notes that the genes in existence today are those that have reproduced successfully in the past. Often, many individual organisms share a gene; thus, the death of an individual need not mean the extinction of the gene. Indeed, if the sacrifice of one individual enhances the survivability of other individuals with the same gene, the death of an individual may enhance the overall survival of the gene. This is the basis of the selfish gene view, popularized by Richard Dawkins. He points out in his book, The Selfish Gene, that to be successful, genes need have no other "purpose" than to propagate themselves, even at the expense of their host organism's welfare. A human that behaved in such a way would be described as "selfish," although ironically a selfish gene may promote altruistic behaviors. According to Dawkins, the possibly disappointing answer to the question "what is the meaning of life?" may be "the survival and perpetuation of ribonucleic acids and their associated proteins."

However, a number of prominent evolutionists, including Ernst Mayr and Stephen Jay Gould, who do recognize selection at levels other than the individual, nonetheless strongly reject the selfish gene theory. Mayr (2001) states that "the reductionist thesis that the gene is the object of selection" is "invalid." Gould (2002) calls the theory a "conceptual error" that sidetracked the profession, and "inspired both a fervent following of a quasi-religious nature" and "strong opposition from many evolutionists."

Chemistry and function of genes

Chemical structure of a gene

A DNA molecule or strand comprises four kinds of sequentially linked nucleotides, which together constitute the genetic alphabet. A sequence of three consecutive nucleotides, called a codon, is the protein-coding vocabulary. The sequence of codons in a gene specifies the amino acid sequence of the protein it encodes.
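
The way codons spell out a protein can be sketched in a few lines of Python. The example below is a simplified illustration: the codon table holds only a handful of entries from the standard genetic code, the DNA sequence is made up, and the function names are illustrative.

```python
# Partial codon table (standard genetic code); only the codons needed below.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCA": "Ala",
    "UGG": "Trp",
    "UAA": "Stop",
    "UAG": "Stop",
    "UGA": "Stop",
}


def transcribe(coding_strand):
    """Return the mRNA that matches a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE.get(mrna[i:i + 3], "???")  # "???" = not in this partial table
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


dna = "ATGTTTGGCGCATGGTAA"  # made-up coding strand
mrna = transcribe(dna)
print(mrna)
print("-".join(translate(mrna)))
```

For this made-up sequence, the six codons are read as Met-Phe-Gly-Ala-Trp followed by a stop codon.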

In most eukaryotic species, very little of the DNA in the genome actually encodes proteins, and the genes may be separated by vast sequences of so-called "junk DNA." Moreover, the genes are often fragmented internally by non-coding sequences called introns, which can be many times longer than the coding sequence. Introns are removed on the heels of transcription by splicing. In the primary molecular sense, however, they represent parts of a gene.

All the genes and intervening DNA together make up the genome of an organism, which in many species is divided among several chromosomes and typically present in two or more copies. The location (or locus) of a gene and the chromosome on which it is situated is, in a sense, arbitrary. Genes that appear together on the chromosomes of one species, such as humans, may appear on separate chromosomes in another species, such as mice. Two genes positioned near one another on a chromosome may encode proteins that figure in the same cellular process or in completely unrelated processes. As an example of the former, many of the genes involved in spermatogenesis reside together on the Y chromosome.

Many species carry more than one copy of their genome within each of their somatic cells. These organisms are called diploid if they have two copies, or polyploid if they have more than two copies. In such organisms, the copies are practically never identical. With respect to each gene, the copies that an individual possesses are liable to be distinct alleles, which may act synergistically or antagonistically to generate a trait or phenotype. The ways that gene copies interact are explained by chemical dominance relationships.
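
The simplest dominance relationship in a diploid organism can be sketched as follows. This Python example assumes one gene with two alleles, B and b, and complete dominance of B over b. That is a deliberate simplification, since alleles may also interact in other ways; the allele symbols and function names are illustrative.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Count offspring genotypes when each parent passes on one of its two alleles."""
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that "bB" and "Bb" are counted as the same genotype.
        offspring["".join(sorted((allele1, allele2)))] += 1
    return offspring


def phenotype(genotype, dominant="B"):
    """Under complete dominance, one copy of the dominant allele is enough."""
    return "dominant trait" if dominant in genotype else "recessive trait"


for genotype, count in sorted(cross("Bb", "Bb").items()):
    print(genotype, count, phenotype(genotype))
```

Crossing two Bb parents gives the four equally likely allele combinations BB, Bb, Bb, and bb, so under this assumption three in four offspring show the dominant trait.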

Expression of molecular genes

For various reasons, the relationship between a DNA strand and a phenotype trait is not direct. The same DNA strand in two different individuals may result in different traits because of the effect of other DNA strands or the environment.

  • The DNA strand is expressed into a trait only if it is transcribed to RNA. Because the transcription starts from a specific base-pair sequence (a promoter) and stops at another (a terminator), the DNA strand needs to be correctly placed between the two. If not, it is considered junk DNA, and is not expressed.
  • Cells regulate the activity of genes in part by increasing or decreasing their rate of transcription. Over the short term, this regulation occurs through the binding or unbinding of proteins, known as transcription factors, to specific non-coding DNA sequences called regulatory elements. Therefore, to be expressed, the DNA strand needs to be properly regulated by other DNA strands.
  • The DNA strand may also be silenced through DNA methylation or by chemical changes to the protein components of chromosomes.
  • The RNA is often edited before its translation into a protein. Eukaryotic cells splice the transcripts of a gene by keeping the exons and removing the introns. Therefore, the DNA strand needs to be in an exon to be expressed. Because of the complexity of the splicing process, one transcribed RNA may be spliced in alternate ways to produce not one, but a variety of proteins (alternative splicing) from one pre-mRNA (mRNA transcript at pre-splicing stage). Prokaryotes produce a similar effect by shifting reading frames (the three ways the mRNA can be read by grouping the nucleotides into sets of three, as codons) during translation.
  • The translation of RNA into a protein likewise begins at a specific start sequence and ends at a specific stop sequence.
  • Once produced, the protein interacts with the many other proteins in the cell, according to the cell metabolism. This interaction finally produces the trait.
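
Two of the steps above, splicing and the choice of reading frame, can be sketched with a toy example in Python. The pre-mRNA below is invented, with exons in upper case and introns in lower case purely for readability; the exon coordinates and function names are illustrative, and real splicing depends on cellular machinery rather than fixed positions.

```python
def splice(pre_mrna, exons):
    """Join the chosen exon spans (start, end) in order, dropping everything else."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def reading_frames(mrna):
    """Return the three ways of grouping an mRNA into codons, one per offset."""
    frames = []
    for offset in range(3):
        codons = [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]
        frames.append(codons)
    return frames


# Toy pre-mRNA: exon 1, intron, exon 2, intron, exon 3.
pre_mrna = "AUGGCC" + "guaaag" + "UUUGGA" + "gucuag" + "UGGUAA"
exon_1, exon_2, exon_3 = (0, 6), (12, 18), (24, 30)

full_transcript = splice(pre_mrna, [exon_1, exon_2, exon_3])
skipped_transcript = splice(pre_mrna, [exon_1, exon_3])  # alternative splicing: exon 2 left out

print(full_transcript)
print(skipped_transcript)
for codons in reading_frames(skipped_transcript):
    print(codons)
```

The same pre-mRNA yields two different mature transcripts depending on which exons are kept, and each transcript can be grouped into codons in three different ways.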

This complex process helps explain the different meanings of "gene":

  • a nucleotide sequence in a DNA strand;
  • or the transcribed RNA, prior to splicing;
  • or the transcribed RNA after splicing, i.e. without the introns.

The latter meaning of gene is the result of a more "material entity" than the first one.

Mutations and evolution

Just as there are many factors influencing the expression of a particular DNA strand, there are many ways to have genetic mutations.

For example, natural variations within regulatory sequences appear to underlie many of the heritable characteristics seen in organisms. The influence of such variations on the trajectory of evolution may be as large as or larger than variation in sequences that encode proteins. Thus, though regulatory elements are often distinguished from genes in molecular biology, in effect they satisfy the shared and historical sense of the word. Indeed, a breeder or geneticist, in following the inheritance pattern of a trait, has no immediate way of knowing whether this pattern arises from coding sequences or regulatory sequences. Typically, he or she will simply attribute it to variations within a gene.

Errors during DNA replication may lead to the duplication of a gene, which may diverge over time. Though the two sequences may remain the same, or be only slightly altered, they are typically regarded as separate genes (i.e. not as alleles of the same gene). The same is true when duplicate sequences appear in different species. Yet, though the alleles of a gene differ in sequence, nevertheless they are regarded as a single gene (occupying a single locus).

The Shifting Locus of Biological Centrality

When the Human Genome Project began in 1990, scientists estimated that they would find roughly 100,000 to 150,000 genes, largely because of the number of different kinds of proteins found in the body and the assumption that one gene coded for one protein. By the end of the project in 2003, the estimate was 20,000 to 25,000 genes that coded for proteins, which was taken to mean that many genes must be coding for two, three, or four, or perhaps more different kinds of proteins.

This marked the beginning of a shift from the sense that DNA and the genes carried on it exercise singular influence and control in shaping the physical potentials of an individual. If one gene makes more than one protein, then the mechanism deciding which protein is produced from a given gene would be critical to the shaping of the individual's physical potentials. Of comparable importance to the question of the centrality of the genes is that of how a given human cell selects a subset of the 20,000 to 25,000 genes that will ultimately yield the 10,000 or so proteins that the cell needs out of the roughly 100,000 proteins available to it.

After the several decades in which DNA and the genes carried on it have been widely treated as the "stars" of the cellular world, new candidates are challenging for coequal or even perhaps primary recognition in terms of central importance to the cellular function. One, tied to the RNA World model of the origins of life, notes the growing number of identified types of non-coding functional RNA, many of which play a role in the regulation of gene expression. In this view, DNA is a passive, unchanging repository of information, whereas RNA is the active information agent even influencing which segments of DNA are expressed. This view suggests that RNA must deserve at least a co-equal place with DNA as a factor influencing an organism's physiology and psychology.

The second view shifts the focus completely away from the cell nucleus, DNA, and RNA. It notes that cells alter the selection of genes they express according to environmental influences they experience, and further that cells experience the environment through the mediation of the protective cell membrane and the thousands of proteins floating in it. With membrane proteins being sensitive to both magnetic and electromagnetic signals, the cells become object partners to their immediate environment (epigenetic factors), which includes influences from thoughts and emotions of the human mind responding to the environment (such as the adrenaline rush when a person wakes up in a burning house). In this view, mind becomes the intermediate third actor in the traditional dichotomy of nature (genes) or nurture (environment).


  • Curtis, H., and N. S. Barnes. 1989. Biology, Fifth Edition. New York: Worth Publishers.
  • Dawkins, R. 1990. The Selfish Gene. Oxford University Press. ISBN 0192860925
  • Lipton, Bruce. 2005. The Biology of Belief: Unleashing the Power of Consciousness, Matter, and Miracles. Santa Rosa, CA: Mountain of Love Productions. ISBN 0975991477
  • Williams, G. C. 1966. Adaptation and Natural Selection. Princeton, NJ: Princeton University Press.


New World Encyclopedia writers and editors rewrote and completed the Wikipedia article in accordance with New World Encyclopedia standards. This article abides by terms of the Creative Commons CC-by-sa 3.0 License (CC-by-sa), which may be used and disseminated with proper attribution. Credit is due under the terms of this license that can reference both the New World Encyclopedia contributors and the selfless volunteer contributors of the Wikimedia Foundation.

Note: Some restrictions may apply to use of individual images which are separately licensed.