From New World Encyclopedia
A representation of the three-dimensional structure of myoglobin, the oxygen carrier in muscle. Max Perutz and Sir John Cowdery Kendrew shared the 1962 Nobel Prize in Chemistry for their studies of the structures of globular proteins. Kendrew's elucidation of myoglobin's structure in 1958 made it the first protein whose structure was solved using X-ray crystallography. The colored alpha helices represent myoglobin's secondary structure (discussed below).

A protein is a biological polymer comprising numerous amino acids linked sequentially through peptide bonds, each formed between the carboxyl group of one amino acid and the amino group of the next, to form a long chain with the defining side group of each amino acid protruding from it. The sequence of amino acids in a protein is defined by a gene and encoded in the genetic code, which selects protein components from a set of 20 "standard" amino acids.

Some proteins function as separate entities while others associate together to form stable functional complexes, such as the ribosomes, which comprise more than 50 proteins. Along with polysaccharides, lipids, and nucleic acids, proteins are one of the major classes of macromolecules that make up the primary constituents of biological organisms.

As suggested by the etymological origins of the term (from the Greek word proteios, meaning “of the first order”), proteins are of prime importance in the structure and function of all living cells and viruses. Different proteins perform a wide variety of biological functions. Some proteins are enzymes, catalyzing the chemical reactions in an organism. Other proteins play structural or mechanical roles, such as those that form the struts and joints of the cytoskeleton, which is like a system of scaffolding within a cell. Still others, such as antibodies, are able to identify and neutralize foreign substances like bacteria and viruses.

Dietary protein is essential for the survival of animals. Unlike plants, which are able to synthesize all the amino acids they require, animals can only synthesize some of the 20 standard amino acids necessary for normal functioning. The amino acids required in the animal diet are known as essential amino acids, though their specific number and type vary among species.

The functionality of a protein is dependent upon its ability to fold into a precise three-dimensional shape. This complex folding remains a mystery and reveals a remarkable complexity and harmony in our universe. As Lewis (2005) notes, "there are so many solutions it would not be possible for a protein to test all of these until it finds the right one, it would take too long. A small chain of 150 amino acids testing 10^12 different configurations each second would take about 10^26 years (a billion, billion times the age of the universe) to find the 'correct configuration.' Yet, the refolding of a denatured enzyme takes place in less than a minute."

Named by Jöns Jakob Berzelius in 1838, proteins are among the most actively studied molecules in biochemistry. Biochemists are interested in determining a protein's unique amino acid sequence, which is presumed to govern its three-dimensional structure and, in turn, its biological function. Knowing a protein's amino acid sequence can be helpful in the study and treatment of disease, since a change in a single amino acid in a single protein (which often reflects a mutation in a particular gene) can result in diseases such as sickle-cell anemia and cystic fibrosis. Charting the amino acid sequences of proteins contributes to a reconstruction of the history of early life, as proteins resemble one another in sequence only if they evolved from a common ancestor.


The structure of proteins

Components and synthesis

Proteins are built from combinations of 20 different biological amino acids, which are molecules composed of a central or alpha carbon with four attachments: a hydrogen atom, an amino group (-NH2), a carboxylic acid group (-COOH), and a unique R group, or side chain. In proteins, amino acids (specifically, alpha-amino acids) are linked together by peptide bonds, which form when the amino group of one amino acid reacts with the carboxyl group of a second amino acid to form a covalent bond, releasing a water molecule. An amino acid residue is what is left of an amino acid once it has coupled with another amino acid to form a peptide bond.
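
Because each peptide bond releases one water molecule, a chain of n amino acids weighs less than its free components by the mass of n - 1 water molecules. The following minimal sketch illustrates that bookkeeping. The masses are approximate average molecular masses in g/mol for three amino acids only, and the table and function names are hypothetical, introduced here for illustration.

```python
# Mass of a peptide = sum of the free amino acid masses
#                     minus one water molecule per peptide bond.
# ASSUMPTION: approximate average molecular masses (g/mol), rounded to two
# decimals, and only three amino acids are listed to keep the sketch short.

WATER_MASS = 18.015

FREE_AMINO_ACID_MASS = {
    "G": 75.07,   # glycine
    "A": 89.09,   # alanine
    "S": 105.09,  # serine
}


def peptide_mass(sequence):
    """Approximate mass of a linear peptide given one-letter residue codes."""
    if not sequence:
        return 0.0
    free_total = sum(FREE_AMINO_ACID_MASS[residue] for residue in sequence)
    peptide_bonds = len(sequence) - 1  # one bond between each adjacent pair
    return free_total - peptide_bonds * WATER_MASS


# The dipeptide glycine-alanine contains one peptide bond.
print(round(peptide_mass("GA"), 2))
```

The same subtraction explains why the units of a protein chain are called residues: each one is an amino acid minus the atoms lost as water.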

Proteins are generally large molecules (e.g., the muscle protein titin, also called connectin, has a single amino acid chain that is 27,000 subunits long). Such long chains of amino acids are almost universally referred to as proteins, but shorter strings of amino acids may be referred to as polypeptides, peptides, or, less commonly, oligopeptides. The variation in protein size contributes to their functional diversity; for instance, a shorter amino acid chain may be more likely to act as a hormone (like insulin), rather than as an enzyme (which depends on its defined three-dimensional structure for functionality).

The molecular surfaces of several proteins showing their comparative sizes. From left to right: immunoglobulin G (an antibody), hemoglobin (a transport protein), insulin (a hormone), adenylate kinase (an enzyme), and glutamine synthetase (an enzyme).

Proteins are assembled from amino acids based on information encoded as genes, specific nucleotide sequences in the DNA. From the DNA, the protein-coding nucleotide sequences are each transcribed into an immature messenger RNA (mRNA), which is then processed and modified to form the mature mRNA that is translated into a protein. In many cases, the resulting protein is further chemically altered (post-translational modification) before it becomes functional.
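
The flow of information from DNA to protein can be sketched in a few lines of code. This is a simplified illustration, not a model of the cellular machinery: it assumes the input is the coding strand of an already processed sequence (mRNA processing and post-translational modification are skipped), it uses the standard genetic code, and the function names are hypothetical. The example sequence encodes the first seven amino acids of human beta-globin.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
# "*" marks a stop codon. Amino acids use one-letter codes.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Return the mRNA matching a DNA coding strand (T is replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA from the first AUG start codon to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGTGCACCTGACTCCTGAGTAA")
print(mrna)             # AUGGUGCACCUGACUCCUGAGUAA
print(translate(mrna))  # MVHLTPE
```

Each group of three nucleotides (a codon) selects one of the 20 standard amino acids, which is how a gene's nucleotide sequence fixes the protein's amino acid sequence.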

The four levels of protein structure

Proteins fold into unique three-dimensional structures. The shape into which a protein naturally folds is known as its native state, which is presumed to be determined by its sequence of amino acids. Sometimes, however, proteins do not fold properly. The incorrect folding of proteins can lead to illnesses such as Alzheimer’s disease, in which brain function is limited by deposits of incorrectly-folded proteins that can no longer perform their functions. A full understanding of why incorrect protein folding occurs might lead to advances in the treatment of diseases like Alzheimer’s.

Biochemists refer to four distinct aspects of a protein's structure:

  • Primary structure is the linear amino acid sequence encoded by DNA. Any error in this sequence, such as the substitution of one amino acid residue for another, may lead to a congenital disease.
  • Secondary structures are highly patterned sub-structures that form through the interaction of amino acid residues near each other on the chain. The most common are the alpha helix and the beta sheet. Many different secondary motifs can be present in a single protein molecule.
  • Tertiary structure refers to the overall three-dimensional shape of a single protein molecule. This spatial relationship of amino acid residues that are far apart in the sequence is primarily formed by hydrophobic interactions, though hydrogen bonds, ionic interactions, and disulfide bonds are usually involved as well.
  • Some proteins may have a quaternary structure, a shape or structure that results from the union of more than one protein molecule (called subunits in this context), which function as part of the larger assembly, or protein complex. Hemoglobin, which serves as an oxygen carrier in blood, has a quaternary structure of four subunits.
The quaternary structure of hemoglobin. The four subunits are shown in red and yellow; the iron-containing heme groups are in green.
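
Since primary structure is simply an ordered sequence, an error such as the substitution of one residue for another can be located by comparing two sequences position by position. The sketch below is illustrative only. The function name is hypothetical, and the example uses the first eight residues of the mature human beta-globin chain of hemoglobin, in which the sickle-cell variant replaces glutamic acid (E) with valine (V) at position 6.

```python
def point_substitutions(reference, variant):
    """List (position, reference_residue, variant_residue) for each mismatch.

    Positions are 1-based, following the usual convention for protein
    sequences. Both sequences must have the same length, because this
    sketch handles substitutions only, not insertions or deletions.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (position, ref_residue, var_residue)
        for position, (ref_residue, var_residue)
        in enumerate(zip(reference, variant), start=1)
        if ref_residue != var_residue
    ]


# First eight residues of mature beta-globin, one-letter amino acid codes.
normal_beta_globin = "VHLTPEEK"
sickle_beta_globin = "VHLTPVEK"

print(point_substitutions(normal_beta_globin, sickle_beta_globin))
# [(6, 'E', 'V')]
```

A single entry in the result corresponds to a single changed residue in the primary structure, which in this case is enough to alter how hemoglobin molecules interact with one another.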

In addition to these levels of structure, proteins may shift between several similar structures in performing their biological function. In the context of these functional rearrangements, tertiary or quaternary structures are usually referred to as conformations, and transitions between them are called conformational changes. Although any unique polypeptide may have more than one stable folded conformation, each conformation has its own biological activity, and only one conformation is considered to be the active one. This assumption has been recently challenged, however, by the discovery of intrinsically unstructured proteins, which can adopt multiple structures with different biological activities.

Major functions of proteins

The enzyme hexokinase is shown as a simple ball-and-stick molecular model. To scale in the top right-hand corner are its two substrates, ATP and glucose.

Proteins are involved in practically every function performed by a cell, including regulation of cellular functions such as signal transduction and metabolism. However, several major classes of proteins may be identified based on the functions below:

  • Enzyme catalysis. Nearly all of the chemical reactions in living organisms—from the initial breakdown of food nutrients in the saliva to the replication of DNA—are catalyzed by proteins.
  • Transport and storage. Transport proteins, many of them membrane-associated, move their substrates (such as small molecules and ions) from place to place without altering their chemical properties. For example, the protein hemoglobin (pictured above) transports oxygen in blood.
  • Immune protection. Antibodies, the basis of the adaptive immune system, are soluble proteins capable of recognizing and combining with foreign substances. This class also includes toxins, which play a defensive role (e.g., the dendrotoxins of snakes).
  • Signaling. Receptors mediate the responses of nerve cells to specific stimuli. Rhodopsin, for example, is a light-sensitive protein in the rod cells of the retina of vertebrates.
  • Structural support. Examples include tubulin and actin, which form the cytoskeleton, and collagen and keratin, which are important strengthening components of skin, hair, and bone.
The flagellum contains motor proteins that propel sperm cells toward the ovum for fertilization.
  • Coordinated motion. Another special class of proteins consists of motor proteins such as myosin, kinesin, and dynein. These proteins are "molecular motors," generating physical force which can move organelles, cells, and entire muscles. Proteins are the major components of muscle, and muscle contraction involves the sliding motion of two kinds of protein filaments. At the microscopic level, the propulsion of sperm by flagella is produced by protein assemblies.
  • Control of growth and differentiation. In higher organisms, growth factor proteins such as insulin control the growth and differentiation of cells. Transcription factors regulate the activation of transcription in eukaryotes, while cyclins regulate the cell cycle, the series of events in a eukaryotic cell between one cell division and the next.

Proteins in the human diet

Sources of protein

Soybeans are a good source of essential amino acids

Protein is an important macronutrient in the human diet, supplying the body's needs for amino acids, particularly the essential amino acids that humans are unable to synthesize. Between eight and ten amino acids are considered essential for humans.

While animal meats are rich sources of this vital dietary element, protein is also found in plant foods, such as grains and legumes, and in eggs and dairy products, such as milk and yogurt. The best way to obtain the full range of essential amino acids is to consume a variety of protein-rich foods. Soy products such as tofu are particularly important to many vegetarians and vegans as a source of complete protein (a protein that contains significant amounts of all the essential amino acids).

The exact amount of dietary protein needed to satisfy protein requirements for humans, known as a Recommended Dietary Allowance (RDA), may vary widely depending on age, sex, level of physical activity, and medical condition.

Protein deficiency and dietary imbalance

A child with kwashiorkor in Nigeria

Protein deficiency can lead to symptoms such as fatigue, insulin resistance, hair loss, loss of hair pigment, loss of muscle mass, low body temperature, hormonal irregularities, and loss of skin elasticity. Severe protein deficiency is most commonly encountered in developing countries in times of famine, when diets are high in starch and low in protein. Kwashiorkor is a type of childhood malnutrition that is linked to insufficient protein intake (and may also result from deficiencies in various nutrients), though its causes are not fully understood.

Given the central importance of proteins to life, particularly the importance of strong muscles for survival, animals are designed to minimize the loss of protein from muscle during periods of starvation. When dietary proteins and carbohydrates are deficient, proteins may be broken down to synthesize glucose to supply organs, like the brain, that normally utilize glucose as a fuel. However, over a period of days, the body’s metabolism switches to the breakdown of fats, the storage form of fatty acids, which can be precursors for ketone bodies, an alternative fuel for the brain. This mechanism also works to the advantage of migratory birds, such as the ruby-throated hummingbird, which build up their fat stores before journeying long distances over water. The brain’s transition from glucose to ketone bodies occurs quite rapidly, so that hardly any protein in muscle is lost, enabling the birds to make their arduous, 2,400-kilometer flight.

The ruby-throated hummingbird

Excessive protein intake may be linked to some health problems:

  • Liver dysfunction due to increased toxic residues. Because the body is unable to store excess protein, it is broken down and converted into sugars or fatty acids. The liver removes nitrogen from the amino acids, so that they can be burned as fuel, and the nitrogen is incorporated into urea, the substance that is excreted by the kidneys. The liver and kidneys can normally cope with the extra workload, but if kidney disease occurs, a decrease in dietary protein will often be prescribed.
  • Loss of bone density as calcium and glutamine are leached from bone and muscle tissue to balance increased acid intake from the diet. This effect is not present if intake of alkaline minerals is high. In such cases, protein intake helps to strengthen bones.

Studying proteins

Jöns Jakob Berzelius

The word protein was first mentioned in a letter sent by the Swedish chemist Jöns Jakob Berzelius to Gerardus Johannes Mulder on July 10, 1838. He wrote:

The name protein that I propose for the organic oxide of fibrin and albumin, I wanted to derive from the Greek word πρωτειος, because it appears to be the primitive or principal substance of animal nutrition.

In the twentieth-century study of proteins, one of the more striking discoveries was that the native and denatured states in many proteins were interconvertible (denatured refers to a protein that is not in its native state and is generally lacking a well-defined secondary structure). That is, by careful control of solution conditions to separate a denatured protein from the denaturing chemical, a denatured protein could be converted to its native form. The question of how proteins arrive at their native state is an important area of biochemistry, called the study of protein folding.

Through genetic engineering, researchers can alter the amino acid sequence and hence the structure, targeting, susceptibility to regulation, and other properties of a protein. The genetic sequences of different proteins may be spliced together to create chimeric proteins that possess properties of both. This form of tinkering represents one of the chief tools used by cell and molecular biologists to understand the workings of cells. Another area of protein research attempts to engineer proteins with entirely new properties or functions, a field known as protein engineering.


References

  • Atkins, P., and L. Jones. 2005. Chemical Principles, 3rd edition. New York: W. H. Freeman.
  • Lewis, R. L. 2005. Do Proteins Teleport in an RNA World. New York: International Conference on the Unity of the Sciences.
  • Stryer, L. 1995. Biochemistry, 4th edition. New York: W. H. Freeman.


New World Encyclopedia writers and editors rewrote and completed the Wikipedia article in accordance with New World Encyclopedia standards. This article abides by terms of the Creative Commons CC-by-sa 3.0 License (CC-by-sa), which may be used and disseminated with proper attribution. Credit is due under the terms of this license that can reference both the New World Encyclopedia contributors and the selfless volunteer contributors of the Wikimedia Foundation.
