Difference between revisions of "DNA" - New World Encyclopedia

From New World Encyclopedia
(added most recent version from Wikipedia)
Line 1: Line 1:
 
{{Claimed}}
 
{{Claimed}}
[[Image:DNA_Overview.png|thumb|270px|The general structure of a section of DNA]]
+
[[Image:DNA Overview.png|thumb|220px|The structure of part of a DNA double helix]]
'''Deoxyribonucleic acid''' ('''DNA''') is a [[nucleic acid]] —usually in the form of a double [[helix]]— that contains the [[genetics|genetic]] instructions specifying the [[developmental biology|biological development]] of all [[Cell (biology)|cellular]] forms of [[life]], and most [[virus]]es.  DNA is a long [[polymer]] of [[nucleotides]] and encodes the sequence of the [[amino acid residue]]s in [[protein]]s using the [[genetic code]], a triplet code of [[nucleotide]]s.
+
'''Deoxyribonucleic acid''', or '''DNA''', is a [[nucleic acid]] molecule that contains the [[genetics|genetic]] instructions used in the [[developmental biology|development]] and functioning of all [[life|living organisms]]. The main role of DNA is the long-term storage of information, and it is often compared to a set of blueprints, since DNA contains the instructions needed to construct other components of [[cell (biology)|cell]]s, such as [[protein]]s and [[RNA]] [[molecule]]s. The DNA segments that carry this genetic information are called [[gene]]s, but other DNA sequences have structural purposes, or are involved in regulating the use of this genetic information.
  
In complex [[eukaryote|eukaryotic]] [[Cell (biology)|cells]] such as those from [[plant]]s, [[animal]]s, [[fungi]] and [[protist]]s, most of the DNA is located in the [[cell nucleus]]. By contrast, in simpler cells called [[prokaryotes]], including the [[bacterium|eubacteria]] and [[archaea]], DNA is not separated from the [[cytoplasm]] by a [[nuclear envelope]]. The cellular [[organelle]]s known as [[chloroplast]]s and [[mitochondria]] also carry DNA.
+
Chemically, DNA is a long [[polymer]] of simple units called [[nucleotide]]s, with a backbone made of sugars and phosphate groups joined by [[ester]] bonds. Attached to each sugar is one of four types of molecules called [[nucleobase|bases]]. It is the sequence of these four bases along the backbone that encodes information. This information is read using the [[genetic code]], which specifies the sequence of the [[amino acid]]s within proteins. The code is read by copying stretches of DNA into the related nucleic acid [[RNA]], in a process called [[transcription (genetics)|transcription]]. Most of these RNA molecules are used to synthesize proteins, but others are used directly in structures such as [[ribosome]]s and [[spliceosome]]s.
  
DNA is often referred to as the molecule of [[heredity]] as it is responsible for the genetic propagation of most [[biological inheritance|inherited]] [[Trait (biological)|trait]]s. In humans, these traits can range from hair colour to disease susceptibility. During [[cell division]], DNA is [[DNA replication|replicated]] and can be transmitted to offspring during [[reproduction]]. [[Kinship and descent|Lineage]] studies can be done based on the facts that the [[mitochondrial DNA]] only comes from the mother, and the male [[Y chromosome]] only comes from the father.
+
Within cells, DNA is organized into structures called [[chromosome]]s, and the set of chromosomes within a cell makes up a [[genome]]. These chromosomes are duplicated before cells [[cell division|divide]], in a process called [[DNA replication]]. [[Eukaryote|Eukaryotic organisms]] such as [[animal]]s, [[plant]]s, and [[fungi]] store their DNA inside the [[cell nucleus]], while in [[prokaryote]]s such as [[bacteria]] it is found in the cell's [[cytoplasm]]. Within the chromosomes, [[chromatin]] proteins such as [[histone]]s compact and organize DNA, which helps control its interactions with other proteins and thereby control which [[genes]] are transcribed.
  
Every person's DNA, their [[genome]], is inherited from both parents. The mother's [[mitochondrial DNA]] together with twenty-three [[chromosome]]s from each parent combine to form the genome of a [[zygote]], the [[fertilization|fertilized]] [[ovum|egg]]. As a result, with certain exceptions such as [[red blood cell]]s, most human cells contain 23 pairs of chromosomes, together with mitochondrial DNA inherited from the mother.
+
==Physical and chemical properties==
 +
[[Image:DNA_chemical_structure.svg|right|thumb|350px|The chemical structure of DNA.]]
  
==Overview==
+
DNA is a long [[polymer]] made from repeating units called [[nucleotide]]s.<ref name=Alberts>{{cite book | last = Alberts| first = Bruce| coauthors = Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, and Peter Walter | title = Molecular Biology of the Cell; Fourth Edition | publisher = Garland Science| date = 2002 | location = New York and London | url = http://www.ncbi.nlm.nih.gov/books/bv.fcgi?call=bv.View..ShowTOC&rid=mboc4.TOC&depth=2 | id = ISBN 0-8153-3218-1}}</ref><ref name=Butler>Butler, John M. (2001) ''Forensic DNA Typing'' "Elsevier". pp. 14 – 15. ISBN 978-0-12-147951-0.</ref> The DNA chain is 22 to 24&nbsp;[[Ångström]]s wide (2.2 to 2.4&nbsp;[[nanometre]]s), and one nucleotide unit is 3.3&nbsp;Ångströms (0.33&nbsp;nanometres) long.<ref>{{cite journal | author = Mandelkern M, Elias J, Eden D, Crothers D | title = The dimensions of DNA in solution | journal = J Mol Biol | volume = 152 | issue = 1 | pages = 153 – 61 | year = 1981 | id = PMID 7338906}}</ref> Although each individual repeating unit is very small, DNA polymers can be enormous molecules containing millions of nucleotides. For instance, the largest human [[chromosome]], chromosome number 1, is 220 million [[base pair]]s long.<ref>{{cite journal | author = Gregory S, ''et al.'' | title = The DNA sequence and biological annotation of human chromosome 1 | journal = Nature | volume = 441 | issue = 7091 | pages = 315 – 21 | year = 2006 | id = PMID 16710414}}</ref>
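The two figures quoted above (0.33 nanometres per nucleotide unit and 220 million base pairs for chromosome 1) can be combined to estimate how long the molecule would be if stretched out straight. The following is a minimal sketch of that arithmetic; the constant and function names are illustrative assumptions, and the result ignores any compaction of the DNA inside the cell.

```python
# Figures quoted in the text above; the names are illustrative only.
NM_PER_NUCLEOTIDE = 0.33                # length of one nucleotide unit, in nanometres
CHROMOSOME_1_BASE_PAIRS = 220_000_000   # size quoted for human chromosome 1


def contour_length_nm(base_pairs: int, nm_per_unit: float = NM_PER_NUCLEOTIDE) -> float:
    """Length of a fully extended double helix, in nanometres."""
    return base_pairs * nm_per_unit


length_nm = contour_length_nm(CHROMOSOME_1_BASE_PAIRS)
length_mm = length_nm / 1_000_000       # 1 millimetre = 1,000,000 nanometres
print(f"{length_mm:.1f} mm")
```

On these figures the extended molecule would be about 72.6 millimetres long, even though it is only about 2 nanometres wide.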
[[Image:DNA123.png|thumb|right|125px|Space-filling model of a section of DNA molecule]]
 
[[Image:Dna_pairing_aa.gif|thumb|300px|DNA base pairing]]
 
  
Contrary to a common misconception, DNA is not a single molecule, but rather a pair of molecules joined by [[hydrogen bond]]s: it is organized as two complementary strands, oriented in opposite directions, with the hydrogen bonds between them. Each strand of DNA is a chain of chemical "building blocks", called [[nucleotide]]s, of which there are four types: [[adenine]] (abbreviated A), [[cytosine]] (C), [[guanine]] (G) and [[thymine]] (T). (Thymine should not be confused with [[thiamine]], which is vitamin B<sub>1</sub>.) In some organisms, most notably the PBS1 [[phage]], [[uracil]] (U) replaces T in the organism's DNA.<ref>I. Takahashi and J. Marmur. Replacement of thymidylic acid by deoxyuridylic acid in the deoxyribonucleic acid of a transducing phage for Bacillus subtilis. ''Nature'' 197, 794&ndash;795, 1963.</ref> These allowable base components of nucleic acids can be [[polymerized]] in any order, giving the molecules a high degree of uniqueness.
+
In living organisms, DNA does not usually exist as a single molecule, but instead as a tightly-associated pair of molecules.<ref name=Watson>{{cite journal | author = Watson J, Crick F | title = Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid | url=http://profiles.nlm.nih.gov/SC/B/B/Y/W/_/scbbyw.pdf | journal = Nature | volume = 171 | issue = 4356 | pages = 737 – 8 | year = 1953 | id = PMID 13054692}}</ref><ref name=berg>Berg J., Tymoczko J. and Stryer L. (2002) ''Biochemistry.'' W. H. Freeman and Company ISBN 0-7167-4955-6</ref> These two long strands entwine like vines, in the shape of a [[helix|double helix]]. The nucleotide repeats contain both the segment of the backbone of the molecule, which holds the chain together, and a base, which interacts with the other DNA strand in the helix. In general, a base linked to a sugar is called a [[nucleoside]] and a base linked to a sugar and one or more phosphate groups is called a [[nucleotide]]. If multiple nucleotides are linked together, as in DNA, this polymer is referred to as a [[polynucleotide]].<ref name=IUPAC>[http://www.chem.qmul.ac.uk/iupac/misc/naabb.html Abbreviations and Symbols for Nucleic Acids, Polynucleotides and their Constituents] IUPAC-IUB Commission on Biochemical Nomenclature (CBN) Accessed 03 Jan 2006</ref>
  
Between the two strands, each base can only "pair up" with one single predetermined other base: A+T, T+A, C+G and G+C are the only possible combinations; that is, an "A" on one strand of double-stranded DNA will "mate" properly only with a "T" on the other, complementary strand; therefore, naming the bases on the conventionally chosen side of the strand is enough to describe the entire double-strand sequence. Two nucleotides paired together are called a [[base pair]]. On rare occasions, wrong pairing can happen, when [[thymine]] goes into its [[enol]] form or [[cytosine]] goes into its [[imino]] form. The double-stranded structure of DNA provides a simple mechanism for [[DNA replication]]: the DNA double strand is first "unzipped" down the middle, and the "other half" of each new single strand is recreated by exposing each half to a mixture of the four bases. An enzyme makes a new strand by finding the correct base in the mixture and pairing it with the original strand. In this way, the base on the old strand dictates which base will be on the new strand, and the cell ends up with an extra copy of its DNA.
+
The backbone of the DNA strand is made from alternating [[phosphate]] and [[carbohydrate|sugar]] residues.<ref name=Ghosh>{{cite journal | author = Ghosh A, Bansal M | title = A glossary of DNA structures from A to Z | journal = Acta Crystallogr D Biol Crystallogr | volume = 59 | issue = Pt 4 | pages = 620 – 6 | year = 2003 | id = PMID 12657780}}</ref> The sugar in DNA is 2-deoxyribose, which is a [[pentose]] (five [[carbon]]) sugar. The sugars are joined together by phosphate groups that form [[phosphodiester bond]]s between the third and fifth carbon [[atom]]s of adjacent sugar rings. These asymmetric [[covalent bond|bonds]] mean a strand of DNA has a direction. In a double helix the direction of the nucleotides in one strand is opposite to their direction in the other strand. This arrangement of DNA strands is called antiparallel. The asymmetric ends of a strand of DNA bases are referred to as the [[directionality (molecular biology)|5′]] (''five prime'') and [[directionality (molecular biology)|3′]] (''three prime'') ends. One of the major differences between DNA and RNA is the sugar, with 2-deoxyribose being replaced by the alternative pentose sugar [[ribose]] in RNA.<ref name=berg/>
  
DNA contains the genetic [[information]] that is inherited by the offspring of an organism; this information is determined by the [[DNA sequence|sequence]] of base pairs along its length. A strand of DNA contains [[gene]]s, areas that [[gene regulation|regulate genes]], and areas that either have no function, or a function [[junk DNA|yet unknown]]. Genes can be loosely viewed as the organism's "cookbook" or "blueprint".
+
The DNA double helix is stabilized by [[hydrogen bond]]s between the bases attached to the two strands. The four bases found in DNA are [[adenine]] (abbreviated A), [[cytosine]] (C), [[guanine]] (G) and [[thymine]] (T). These four bases are shown below and are attached to the sugar/phosphate to form the complete nucleotide, as shown for adenosine monophosphate.
  
[[Image:DNA Under electron microscope Image 3576B-PH.jpg|thumb|left|250px|DNA Under an electron microscope]]
+
These bases are classified into two types; adenine and guanine are fused five- and six-membered [[heterocyclic compound]]s called [[purine]]s, while cytosine and thymine are six-membered rings called [[pyrimidine]]s.<ref name=IUPAC/> A fifth pyrimidine base, called [[uracil]] (U), usually takes the place of thymine in RNA and differs from thymine by lacking a [[methyl group]] on its ring. Uracil is normally only found in DNA as a breakdown product of cytosine, but a very rare exception to this rule is a [[phage|bacterial virus]] called PBS1 that contains uracil in its DNA.<ref name="nature1963-takahashi">{{cite journal | author=Takahashi I, Marmur J. | title=Replacement of thymidylic acid by deoxyuridylic acid in the deoxyribonucleic acid of a transducing phage for Bacillus subtilis | journal=Nature | year=1963 | pages=794 – 5 | volume=197 | id=PMID 13980287}}</ref> In contrast, following synthesis of certain RNA molecules, a significant number of the uracils are converted to thymines by the enzymatic addition of the missing methyl group. This occurs mostly on structural and enzymatic RNAs like [[transfer RNA]]s and [[ribosomal RNA]].<ref>{{cite journal |author=Agris P |title=Decoding the genome: a modified view |url=http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=14715921 |journal=Nucleic Acids Res |volume=32 |issue=1 |pages=223 – 38 |year=2004 |pmid=14715921}}</ref>
  
Other interesting points:
+
[[Image:DNA orbit animated small.gif|frame|right|Animation of the structure of a section of DNA. The bases lie horizontally between the two spiraling strands. [[:Image:DNA orbit animated.gif|Large version]]<ref>Created from [http://www.rcsb.org/pdb/cgi/explore.cgi?pdbId=1D65 PDB 1D65]</ref>]]
  
 +
The double helix is a right-handed spiral. As the DNA strands wind around each other, they leave gaps between each set of phosphate backbones, revealing the sides of the bases inside (see animation). There are two of these grooves twisting around the surface of the double helix: one groove, the major groove, is 22&nbsp;Å wide and the other, the minor groove, is 12&nbsp;Å wide.<ref>{{cite journal | author = Wing R, Drew H, Takano T, Broka C, Tanaka S, Itakura K, Dickerson R | title = Crystal structure analysis of a complete turn of B-DNA | journal = Nature | volume = 287 | issue = 5784 | pages = 755 – 8 | year = 1980 | id = PMID 7432492}}</ref> The narrowness of the minor groove means that the edges of the bases are more accessible in the major groove. As a result, proteins like [[transcription factor]]s that can bind to specific sequences in double-stranded DNA usually make contacts to the sides of the bases exposed in the major groove.<ref>{{cite journal | author = Pabo C, Sauer R | title = Protein-DNA recognition | journal = Annu Rev Biochem | volume = 53 | issue = | pages = 293 – 321 | year = | id = PMID 6236744}}</ref>
  
 +
<div class="thumb tleft" style="background-color: #f9f9f9; border: 1px solid #CCCCCC; margin:0.5em;">
 +
{|border="0" width=230px border="0" cellpadding="2" cellspacing="0" style="font-size: 85%; border: 1px solid #CCCCCC; margin: 0.3em;"
 +
|[[Image:GC DNA base pair.svg|280px]]
 +
|}
 +
{|border="0" width=230px border="0" cellpadding="2" cellspacing="0" style="font-size: 85%; border: 1px solid #CCCCCC; margin: 0.3em;"
 +
|[[Image:AT DNA base pair.svg|280px]]
 +
|}
 +
<div style="border: none; width:280px;"><div class="thumbcaption">At top, a '''GC''' base pair with three [[hydrogen bond]]s. At the bottom, an '''AT''' base pair with two hydrogen bonds. Hydrogen bonds are shown as dashed lines.</div></div></div>
  
* DNA is an acid because of the phosphate groups that link adjacent deoxyribose sugars. This is the primary reason why DNA has a negative charge.
+
===Base pairing===
* The "polarity" of each pair is important: A+T is not the same as T+A, just as C+G is not the same as G+C (note that "polarity" as such is never used in this context — it's just a suggestive way to get the idea across)
+
{{further|[[Base pair]]}}
* [[Mutation]]s are chemical imperfections in this process, where a base is accidentally skipped, inserted, or incorrectly copied, or the chain is trimmed, or added to; many basic mutations can be described as combinations of these accidental "operations". Mutations can also occur through chemical damage (through [[mutagens]]), light ([[Ultraviolet|UV]] damage), or through other more complicated gene swapping events.
+
Each type of base on one strand forms a bond with just one type of base on the other strand. This is called complementary [[base pair]]ing. Here, purines form [[hydrogen bond]]s to pyrimidines, with A bonding only to T, and C bonding only to G. This arrangement of two nucleotides binding together across the double helix is called a base pair. In a double helix, the two strands are also held together via [[force]]s generated by the [[hydrophobic effect]] and [[pi stacking]], which are not influenced by the sequence of the DNA.<ref>{{cite journal | author = Ponnuswamy P, Gromiha M | title = On the conformational stability of oligonucleotide duplexes and tRNA molecules | journal = J Theor Biol | volume = 169 | issue = 4 | pages = 419 – 32 | year = 1994 | id = PMID 7526075}}</ref> As hydrogen bonds are not [[covalent bond|covalent]], they can be broken and rejoined relatively easily. The two strands of DNA in a double helix can therefore be pulled apart like a zipper, either by a mechanical force or high [[temperature]].<ref>{{cite journal | author = Clausen-Schaumann H, Rief M, Tolksdorf C, Gaub H | title = Mechanical stability of single DNA molecules | url=http://www.pubmedcentral.nih.gov/picrender.fcgi?artid=1300792&blobtype=pdf | journal = Biophys J | volume = 78 | issue = 4 | pages = 1997 – 2007 | year = 2000 | id = PMID 10733978}}</ref> As a result of this complementarity, all the information in the double-stranded sequence of a DNA helix is duplicated on each strand, which is vital in DNA replication. Indeed, this reversible and specific interaction between complementary base pairs is critical for all the functions of DNA in living organisms.<ref name=Alberts/>
*[[Deoxyribozyme|DNA molecules that act as enzymes]] have been produced in laboratories, but none has so far been found in living organisms.
 
* In addition to the traditionally viewed duplex form of DNA, DNA can also acquire triplex and quadruplex forms. Here instead of the Watson-Crick base pairing, [[Hoogsteen base pair|Hoogsteen base pairing]] comes into the picture.
 
* DNA differs from [[ribonucleic acid]] (RNA) by having the sugar 2-deoxyribose instead of [[ribose]] in its backbone. This is the basic chemical distinction between RNA and DNA. In addition, in RNA, the base [[thymine]] (T) is replaced by [[uracil]] (U).
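The pairing rule described above (A only with T, C only with G) means that one strand fully determines the other. A minimal sketch of this, assuming sequences are written as plain strings of the letters A, C, G and T in the 5′ to 3′ direction; the function names are illustrative and not taken from any particular library:

```python
# Watson-Crick pairing partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base that pairs with each position, in the same order."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the opposite strand read in its own 5' to 3' direction.

    The two strands run in opposite directions, so the paired bases
    are reversed as well as complemented.
    """
    return complement(strand)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Applying `reverse_complement` twice returns the original sequence, which reflects the statement that naming the bases on one strand is enough to describe the whole double-stranded sequence.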
 
  
==DNA in practice==
+
The two types of base pairs form different numbers of hydrogen bonds, AT forming two hydrogen bonds, and GC forming three hydrogen bonds (see figures, left). The GC base pair is therefore stronger than the AT base pair. As a result, it is both the percentage of GC base pairs and the overall length of a DNA double helix that determine the strength of the association between the two strands of DNA. Long DNA helices with a high GC content have stronger-interacting strands, while short helices with high AT content have weaker-interacting strands.<ref>{{cite journal | author = Chalikian T, Völker J, Plum G, Breslauer K | title = A more unified picture for the thermodynamics of nucleic acid duplex melting: a characterization by calorimetric and volumetric techniques | url=http://www.pubmedcentral.nih.gov/picrender.fcgi?artid=22151&blobtype=pdf | journal = Proc Natl Acad Sci U S A | volume = 96 | issue = 14 | pages = 7853 – 8 | year = 1999 | id = PMID 10393911}}</ref> Parts of the DNA double helix that need to separate easily, such as the TATAAT [[Pribnow box]] in bacterial [[promoter]]s, tend to have sequences with a high AT content, making the strands easier to pull apart.<ref>{{cite journal | author = deHaseth P, Helmann J | title = Open complex formation by Escherichia coli RNA polymerase: the mechanism of polymerase-induced strand separation of double helical DNA | journal = Mol Microbiol | volume = 16 | issue = 5 | pages = 817 – 24 | year = 1995 | id = PMID 7476180}}</ref> In the laboratory, the strength of this interaction can be measured by finding the temperature required to break the hydrogen bonds, their [[melting temperature]] (also called ''T<sub>m</sub>'' value). When all the base pairs in a DNA double helix melt, the strands separate and exist in solution as two entirely independent molecules. 
These single-stranded DNA molecules have no single common shape, but some conformations are more stable than others.<ref>{{cite journal | author = Isaksson J, Acharya S, Barman J, Cheruku P, Chattopadhyaya J | title = Single-stranded adenine-rich DNA and RNA retain structural characteristics of their respective double-stranded conformations and show directional differences in stacking pattern | journal = Biochemistry | volume = 43 | issue = 51 | pages = 15996 – 6010 | year = 2004 | id = PMID 15609994}}</ref>
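The dependence of strand association on GC content and length can be illustrated with a short calculation. The sketch below computes the GC fraction of a sequence and applies the Wallace rule, a rough rule of thumb for short oligonucleotides that assigns about 2&nbsp;°C per AT pair and 4&nbsp;°C per GC pair. The rule is not taken from this article and is only a coarse approximation; measured melting temperatures also depend on salt concentration and other solution conditions.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of positions in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm_celsius(seq: str) -> int:
    """Rough melting temperature estimate for a short oligonucleotide.

    Wallace rule (an approximation, labelled here as an assumption):
    Tm = 2 * (number of A and T) + 4 * (number of G and C), in degrees Celsius.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


print(gc_fraction("TATAAT"), wallace_tm_celsius("TATAAT"))  # 0.0 12
print(gc_fraction("GCGCGC"), wallace_tm_celsius("GCGCGC"))  # 1.0 24
```

For two sequences of the same length, the AT-only Pribnow box sequence receives half the estimate of the GC-only sequence, consistent with AT-rich regions being easier to pull apart.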
  
===DNA in crime===
+
===Sense and antisense===
{{main|Genetic fingerprinting}}
+
{{further|[[Sense (molecular biology)]]}}
[[Forensic science|Forensic scientists]] can use DNA located in [[blood]], [[semen]], [[skin]], [[saliva]] or hair left at the scene of a crime to identify a possible suspect, a process called [[genetic fingerprinting]] or DNA profiling. In DNA profiling the relative lengths of sections of repetitive DNA, such as [[short tandem repeats]] and [[minisatellite]]s, are compared. DNA profiling was developed in 1984 by English geneticist [[Alec Jeffreys]] of the [[University of Leicester]], and was first used to convict Colin Pitchfork in 1988 in the [[Enderby murders]] case in [[Leicestershire]], [[England]]. Many jurisdictions require people convicted of certain types of crimes to provide a sample of DNA for inclusion in a computerized database. This has helped investigators solve old cases where the perpetrator was unknown and only a DNA sample was obtained from the scene (particularly in [[rape]] cases between strangers). This method is one of the most reliable techniques for identifying a criminal, but it is not infallible: for example, no DNA may be retrievable, or the scene may be contaminated with the DNA of several possible suspects.
 
  
===DNA in computation ===
+
A DNA sequence is called "sense" if its sequence is the same as that of a [[messenger RNA]] (mRNA) copy that is translated into protein. The sequence on the opposite strand is complementary to the sense sequence and is therefore called the "antisense" sequence. Since [[RNA polymerase]]s work by making a complementary copy of their templates, it is this antisense strand that is the template for producing the sense mRNA. Both sense and antisense sequences can exist on different parts of the same strand of DNA (i.e. both strands contain both sense and antisense sequences). In both prokaryotes and eukaryotes, antisense RNA sequences are produced, but the functions of these RNAs are not entirely clear.<ref>{{cite journal | author = Hüttenhofer A, Schattner P, Polacek N | title = Non-coding RNAs: hope or hype? | journal = Trends Genet | volume = 21 | issue = 5 | pages = 289 – 97 | year = 2005 | id = PMID 15851066}}</ref> One proposal is that antisense RNAs are involved in regulating [[gene expression]] through RNA-RNA base pairing.<ref>{{cite journal | author = Munroe S | title = Diversity of antisense regulation in eukaryotes: multiple mechanisms, emerging patterns | journal = J Cell Biochem | volume = 93 | issue = 4 | pages = 664 – 71 | year = 2004 | id = PMID 15389973}}</ref>
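The relationship between the sense sequence, the antisense template and the mRNA can be sketched in a few lines. This is an illustration only, assuming sequences are plain strings written 5′ to 3′; the function name `transcribe` is hypothetical, and real transcription involves promoters, termination and RNA processing that are ignored here.

```python
# RNA base that pairs with each DNA base on the template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_5to3: str) -> str:
    """Return the mRNA (5' to 3') copied from a template strand written 5' to 3'.

    The RNA is complementary and antiparallel to its template, so the
    template is read in reverse while each base is paired.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3.upper()))


sense = "ATGGCC"
antisense = "GGCCAT"   # reverse complement of the sense sequence
mrna = transcribe(antisense)
print(mrna)            # AUGGCC
```

The mRNA has the same sequence as the sense strand, with U in place of T, which is why the antisense strand is the one that serves as the template.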
DNA plays an important role in [[computer science]], both as a motivating research problem and as a method of computation in itself.
 
  
Research on [[string searching algorithm]]s, which find an occurrence of a sequence of letters inside a larger sequence of letters, was motivated in part by DNA research, where it is used to find specific sequences of nucleotides in a large sequence.<ref>Gusfield, Dan. ''Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology''. Cambridge University Press, 15 January [[1997]]. ISBN 0521585198.</ref> In other applications such as [[text editor]]s, even simple algorithms for this problem usually suffice, but DNA sequences cause these algorithms to exhibit near-worst-case behavior due to their small number of distinct characters.
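As an illustration of the string searching problem described above, the following sketch reports every position at which a short pattern occurs in a longer DNA sequence, including overlapping occurrences. It relies on Python's built-in `str.find` rather than any of the specialised algorithms from the literature, and the function name `find_all` is an assumption made for this example.

```python
def find_all(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text.

    Overlapping occurrences are included, because the search resumes
    one position after each match rather than after its end.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


print(find_all("GATATATGCATATACTT", "ATAT"))  # [1, 3, 9]
```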
+
A few DNA sequences in prokaryotes and eukaryotes, and more in [[plasmid]]s and [[virus]]es, blur the distinction made above between sense and antisense strands by having overlapping genes.<ref>{{cite journal | author = Makalowska I, Lin C, Makalowski W | title = Overlapping genes in vertebrate genomes | journal = Comput Biol Chem | volume = 29 | issue = 1 | pages = 1 – 12 | year = 2005 | id = PMID 15680581}}</ref> In these cases, some DNA sequences do double duty, encoding one protein when read 5′ to 3′ along one strand, and a second protein when read in the opposite direction (still 5′ to 3′) along the other strand. In [[bacteria]], this overlap may be involved in the regulation of gene transcription,<ref>{{cite journal | author = Johnson Z, Chisholm S | title = Properties of overlapping genes are conserved across microbial genomes | journal = Genome Res | volume = 14 | issue = 11 | pages = 2268 – 72 | year = 2004 | id = PMID 15520290}}</ref> while in viruses, overlapping genes increase the amount of information that can be encoded within the small viral genome.<ref>{{cite journal | author = Lamb R, Horvath C | title = Diversity of coding strategies in influenza viruses | journal = Trends Genet | volume = 7 | issue = 8 | pages = 261 – 6 | year = 1991 | id = PMID 1771674}}</ref> Another way of reducing genome size is seen in some viruses that contain linear or circular single-stranded DNA as their genetic material.<ref>{{cite journal | author = Davies J, Stanley J | title = Geminivirus genes and vectors | journal = Trends Genet | volume = 5 | issue = 3 | pages = 77 – 81 | year = 1989 | id = PMID 2660364}}</ref><ref>{{cite journal | author = Berns K | title = Parvovirus replication | journal = Microbiol Rev | volume = 54 | issue = 3 | pages = 316 – 29 | year = 1990 | id = PMID 2215424}}</ref>
  
[[Database]] theory has been influenced by DNA research, which poses special problems for storing and manipulating DNA sequences. Databases specialized for DNA research are called [[genomic database]]s, and must address a number of unique technical challenges associated with the operations of approximate matching, sequence comparison, finding repeating patterns, and homology searching.
+
===Supercoiling===
 +
{{Further|[[DNA supercoil]]}}
 +
DNA can be twisted like a rope in a process called [[DNA supercoil]]ing. With DNA in its "relaxed" state, a strand usually circles the axis of the double helix once every 10.4 base pairs, but if the DNA is twisted the strands become more tightly or more loosely wound.<ref>{{cite journal | author = Benham C, Mielke S | title = DNA mechanics | journal = Annu Rev Biomed Eng | volume = 7 | issue = | pages = 21 – 53 | year = | id = PMID 16004565}}</ref> If the DNA is twisted in the direction of the helix, this is positive supercoiling, and the bases are held more tightly together. If they are twisted in the opposite direction, this is negative supercoiling, and the bases come apart more easily. In nature, most DNA has slight negative supercoiling that is introduced by enzymes called [[topoisomerase]]s.<ref name=Champoux>{{cite journal | author = Champoux J | title = DNA topoisomerases: structure, function, and mechanism | journal = Annu Rev Biochem | volume = 70 | issue = | pages = 369 – 413 | year = | id = PMID 11395412}}</ref> These enzymes are also needed to relieve the twisting stresses introduced into DNA strands during processes such as [[transcription (genetics)|transcription]] and [[DNA replication]].<ref name=Wang>{{cite journal | author = Wang J | title = Cellular roles of DNA topoisomerases: a molecular perspective | journal = Nat Rev Mol Cell Biol | volume = 3 | issue = 6 | pages = 430 – 40 | year = 2002 | id = PMID 12042765}}</ref>
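The figure of one turn per 10.4 base pairs gives a simple way to quantify supercoiling. The sketch below computes the number of helical turns in relaxed DNA and the superhelical density, a standard measure defined as the relative difference between the actual and relaxed linking numbers. The linking number is not defined in this article and is introduced here as an assumption: it counts how many times one strand winds around the other in a closed circular molecule.

```python
BP_PER_TURN_RELAXED = 10.4   # base pairs per helical turn in relaxed DNA (from the text)


def relaxed_turns(base_pairs: int) -> float:
    """Number of helical turns in a relaxed molecule of the given length."""
    return base_pairs / BP_PER_TURN_RELAXED


def superhelical_density(linking_number: float, base_pairs: int) -> float:
    """Relative over- or under-winding compared with the relaxed state.

    Negative values correspond to negative supercoiling (under-wound DNA),
    positive values to positive supercoiling (over-wound DNA).
    """
    lk0 = relaxed_turns(base_pairs)
    return (linking_number - lk0) / lk0


# A hypothetical 5,200 base pair circular molecule with 470 turns instead of 500.
print(round(relaxed_turns(5200)))                   # 500
print(round(superhelical_density(470, 5200), 3))    # -0.06
```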
  
In 1994, [[Leonard Adleman]] of the [[University of Southern California]] made headlines when he discovered a way of solving the directed [[Hamiltonian path problem]], an [[NP-complete]] problem, using tools from molecular biology, in particular DNA. The new approach, dubbed [[DNA computing]], has practical advantages over traditional computers in power use, space use, and efficiency, due to its ability to highly parallelize the computation (see [[parallel computing]]), although retrieving the answers involves considerable labor. A number of other problems, including simulation of various [[abstract machine]]s, the [[boolean satisfiability problem]], and the bounded version of the [[Post correspondence problem]], have since been analyzed using DNA computing.
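The directed Hamiltonian path problem asks whether a path exists that starts at one given vertex, ends at another, and visits every vertex exactly once. The sketch below is a software analogy of a generate-and-filter strategy: it enumerates candidate orderings and keeps only those in which every step follows an edge of the graph. It is not a description of the laboratory procedure, in which candidate paths form in parallel as DNA molecules; here they are checked one after another, so the running time grows factorially with the number of vertices. The function name and the example graph are assumptions made for illustration.

```python
from itertools import permutations


def hamiltonian_paths(vertices, edges, start, end):
    """Yield every path from start to end that visits each vertex exactly once.

    vertices: iterable of hashable vertex labels (start and end must differ)
    edges:    iterable of directed (source, target) pairs
    """
    edge_set = set(edges)
    middle = [v for v in vertices if v not in (start, end)]
    for order in permutations(middle):
        path = (start, *order, end)
        # Keep the candidate only if every consecutive pair is an edge.
        if all((a, b) in edge_set for a, b in zip(path, path[1:])):
            yield path


example_edges = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)]
print(list(hamiltonian_paths(range(4), example_edges, start=0, end=3)))
# [(0, 1, 2, 3)]
```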
+
[[Image:A-DNA, B-DNA and Z-DNA.png|thumb|right|290px|From left to right, the structures of A, B and Z DNA]]
  
Due to its compactness, DNA also has a theoretical role in [[cryptography]], where in particular it allows unbreakable [[one-time pad]]s to be efficiently constructed and used.<ref>Ashish Gehani, Thomas LaBean and John Reif. [http://citeseer.ist.psu.edu/gehani99dnabased.html DNA-Based Cryptography]. Proceedings of the 5th DIMACS Workshop on DNA Based Computers, Cambridge, MA, USA, 14&ndash;15 June 1999.</ref>
+
===Alternative double-helical structures===
+
{{Further|[[Mechanical properties of DNA]]}}
  
=== DNA in historical and anthropological study ===
+
DNA exists in several possible [[Conformational isomerism|conformations]]. The conformations so far identified are: [[A-DNA]], B-DNA, C-DNA, D-DNA,<ref name=Hayashi2005>{{cite journal | author = Hayashi G, Hagihara M, Nakatani K | title = Application of L-DNA as a molecular tag | journal = Nucleic Acids Symp Ser (Oxf) | volume = 49 | pages = 261 – 262 | year = 2005 | id = PMID 17150733}}</ref> E-DNA,<ref name=Vargason2000>{{cite journal | author = Vargason JM, Eichman BF, Ho PS | title = The extended and eccentric E-DNA structure induced by cytosine methylation or bromination | journal = Nature Structural Biology | volume = 7 | pages = 758 – 761 | year = 2000 | id = PMID 10966645}}</ref> H-DNA,<ref name=Wang2006>{{cite journal | author = Wang G, Vasquez KM | title = Non-B DNA structure-induced genetic instability | journal = Mutat Res | volume = 598 | issue = 1 – 2 | pages = 103 – 119 | year = 2006 | id = PMID 16516932}}</ref> L-DNA,<ref name=Hayashi2005>{{cite journal | author = Hayashi G, Hagihara M, Nakatani K | title = Application of L-DNA as a molecular tag | journal = Nucleic Acids Symp Ser (Oxf) | volume = 49 | pages = 261 – 262 | year = 2005 | id = PMID 17150733}}</ref> P-DNA,<ref name="Allemand1998">{{cite journal |author=Allemand, et al |title=Stretched and overwound DNA forms a Pauling-like structure with exposed bases |journal=PNAS |volume=24 |pages=14152-14157 |year=1998 |id=PMID 9826669}}</ref> and [[Z-DNA]].<ref name=Ghosh/><ref>{{cite journal | author = Palecek E | title = Local supercoil-stabilized DNA structures | journal = Critical Reviews in Biochemistry and Molecular Biology | volume = 26 | issue = 2 | pages = 151 – 226 | year = 1991 | id = PMID 1914495}}</ref> However, only A-DNA, B-DNA, and Z-DNA have been observed in naturally occurring biological systems. 
Which conformation DNA adopts depends on the sequence of the DNA, the amount and direction of supercoiling, chemical modifications of the bases and also solution conditions, such as the concentration of [[metal]] [[ion]]s and [[polyamine]]s.<ref>{{cite journal | author = Basu H, Feuerstein B, Zarling D, Shafer R, Marton L | title = Recognition of Z-RNA and Z-DNA determinants by polyamines in solution: experimental and theoretical studies | journal = J Biomol Struct Dyn | volume = 6 | issue = 2 | pages = 299 – 309 | year = 1988 | id = PMID 2482766}}</ref> Of these three conformations, the "B" form described above is most common under the conditions found in cells.<ref>{{cite journal |author=Leslie AG, Arnott S, Chandrasekaran R, Ratliff RL |title=Polymorphism of DNA double helices |journal=J. Mol. Biol. |volume=143 |issue=1 |pages=49–72 |year=1980 |pmid=7441761}}</ref> The two alternative double-helical forms of DNA differ in their geometry and dimensions.
  
Because DNA collects mutations over time, which are then passed down from parent to offspring, it contains information about processes that have occurred in the past. By comparing different DNA sequences, geneticists can attempt to infer the history of organisms.  
+
The A form is a wider right-handed spiral, with a shallow and wide minor groove and a narrower and deeper major groove. The A form occurs under non-physiological conditions in dehydrated samples of DNA, while in the cell it may be produced in hybrid pairings of DNA and RNA strands, as well as in enzyme-DNA complexes.<ref>{{cite journal | author = Wahl M, Sundaralingam M | title = Crystal structures of A-DNA duplexes | journal = Biopolymers | volume = 44 | issue = 1 | pages = 45 – 63 | year = 1997 | id = PMID 9097733}}</ref><ref>{{cite journal |author=Lu XJ, Shakked Z, Olson WK |title=A-form conformational motifs in ligand-bound DNA structures |journal=J. Mol. Biol. |volume=300 |issue=4 |pages=819-40 |year=2000 |pmid=10891271}}</ref> Segments of DNA where the bases have been chemically-modified by [[methylation]] may undergo a larger change in conformation and adopt the [[Z-DNA|Z form]]. Here, the strands turn about the helical axis in a left-handed spiral, the opposite of the more common B form.<ref>{{cite journal | author = Rothenburg S, Koch-Nolte F, Haag F | title = DNA methylation and Z-DNA formation as mediators of quantitative differences in the expression of alleles | journal = Immunol Rev | volume = 184 | issue = | pages = 286 – 98 | year = | id = PMID 12086319}}</ref> These unusual structures can be recognised by specific Z-DNA binding proteins and may be involved in the regulation of transcription.<ref>{{cite journal |author=Oh D, Kim Y, Rich A |title=Z-DNA-binding proteins can act as potent effectors of gene expression in vivo |url=http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=12486233 |journal=Proc. Natl. Acad. Sci. U.S.A. |volume=99 |issue=26 |pages=16666-71 |year=2002 |pmid=12486233}}</ref>
  
If DNA sequences from different [[species]] are compared, then the resulting family tree, or [[phylogeny]], can be used to study the [[evolution]] of these species. This field of [[phylogenetics]] is a powerful tool in [[evolutionary biology]]. If DNA sequences within a species are compared, [[population genetics|population geneticists]] can glean information on the history of particular populations. This can be used in studies ranging from [[ecological genetics]] to [[anthropology]] (for example, DNA evidence is being used to try to identify the [[Ten Lost Tribes of Israel]]<ref>''Lost Tribes of Israel'', [[NOVA (TV series)|NOVA]], PBS airdate: 22 February 2000. Transcript available from http://www.pbs.org/wgbh/nova/transcripts/2706israel.html (last accessed on 4 March 2006)</ref><ref>{{cite web| url=http://www.aish.com/societywork/sciencenature/the_cohanim_-_dna_connection.asp|  title=The Cohanim/DNA Connection| first= Yaakov | last=Kleiman| accessdate=2006-03-04}}</ref>).
+
[[Image:Telomere quadruplex.jpg|thumb|left|300px|Structure of a DNA quadruplex formed by [[telomere]] repeats.<ref>Created from [http://ndbserver.rutgers.edu/atlas/xray/structures/U/ud0017/ud0017.html NDB UD0017]</ref>]]
  
DNA has also been used to look at fairly recent issues of family relationships, such as establishing some manner of familial relationship between the descendants of [[Sally Hemings]] and the family of [[Thomas Jefferson]]. This usage is closely related to the use of DNA in criminal investigations detailed above. Indeed, some criminal investigations have been solved when DNA from crime scenes has fortuitously matched relatives of the guilty individual.[http://www.newscientist.com/article.ns?id=dn4908][http://news.bbc.co.uk/1/hi/wales/3044282.stm]
+
===Quadruplex structures===
 +
At the ends of the linear [[chromosome]]s are specialized regions of DNA called [[telomere]]s. The main function of these regions is to allow the cell to replicate chromosome ends using the enzyme [[telomerase]], as the enzymes that normally replicate DNA cannot copy the extreme 3′ ends of chromosomes.<ref name=Greider>{{cite journal | author = Greider C, Blackburn E | title = Identification of a specific telomere terminal transferase activity in Tetrahymena extracts | journal = Cell | volume = 43 | issue = 2 Pt 1 | pages = 405 – 13 | year = 1985 | id = PMID 3907856}}</ref> As a result, if a chromosome lacked telomeres it would become shorter each time it was replicated. These specialized chromosome caps also help protect the DNA ends from [[exonuclease]]s and stop the [[DNA repair]] systems in the cell from treating them as damage to be corrected.<ref name=Nugent>{{cite journal | author = Nugent C, Lundblad V | title = The telomerase reverse transcriptase: components and regulation | url=http://www.genesdev.org/cgi/content/full/12/8/1073 | journal = Genes Dev | volume = 12 | issue = 8 | pages = 1073 – 85 | year = 1998 | id = PMID 9553037}}</ref> In human cells, telomeres are usually lengths of single-stranded DNA containing several thousand repeats of a simple TTAGGG sequence.<ref>{{cite journal | author = Wright W, Tesmer V, Huffman K, Levene S, Shay J | title = Normal human chromosomes have long G-rich telomeric overhangs at one end | url=http://www.genesdev.org/cgi/content/full/11/21/2801 | journal = Genes Dev | volume = 11 | issue = 21 | pages = 2801 – 9 | year = 1997 | id = PMID 9353250}}</ref>
  
==Molecular structure==
+
These guanine-rich sequences may stabilize chromosome ends by forming very unusual structures of stacked sets of four-base units, rather than the usual base pairs found in other DNA molecules. Here, four guanine bases form a flat plate and these flat four-base units then stack on top of each other, to form a stable ''quadruplex'' structure.<ref name=Burge>{{cite journal | author = Burge S, Parkinson G, Hazel P, Todd A, Neidle S | title = Quadruplex DNA: sequence, topology and structure | url=http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=17012276 | journal = Nucleic Acids Res | volume = 34 | issue = 19 | pages = 5402 – 15 | year = 2006 | id = PMID 17012276}}</ref> These structures are stabilized by hydrogen bonding between the edges of the bases and [[chelation]] of a metal ion in the centre of each four-base unit. The structure shown to the left is a top view of the quadruplex formed by a DNA sequence found in human telomere repeats. The single DNA strand forms a loop, with the sets of four bases stacking in a central quadruplex three plates deep. In the space at the centre of the stacked bases are three chelated [[potassium]] ions.<ref>{{cite journal | author = Parkinson G, Lee M, Neidle S | title = Crystal structure of parallel quadruplexes from human telomeric DNA | journal = Nature | volume = 417 | issue = 6891 | pages = 876 – 80 | year = 2002 | id = PMID 12050675}}</ref> Other structures can also be formed, with the central set of four bases coming from either a single strand folded around the bases, or several different parallel strands, each contributing one base to the central structure.
[[Image:NA-comparedto-DNA thymineAndUracilCorrected.png|right|400px|thumb|Comparisons between DNA and single stranded RNA with the diagram of the bases showing.]]
 
Although sometimes called "the molecule of heredity", DNA macromolecules as people typically think of them are not single molecules. Rather, they are pairs of molecules, which entwine like vines to form a '''double [[helix]]''' (see the illustration at the right).
 
  
Each vine-like molecule is a strand of DNA: '''a chemically linked chain of [[nucleotide]]s, each of which consists of a [[sugar]] ([[deoxyribose]]), a [[phosphate]] and one of five kinds of [[nucleobase]]s ("bases")'''. Because DNA strands are composed of these nucleotide subunits, they are [[polymer]]s.
+
In addition to these stacked structures, telomeres also form large loop structures called telomere loops, or T-loops. Here, the single-stranded DNA curls around in a long circle stabilized by telomere-binding proteins.<ref>{{cite journal | author = Griffith J, Comeau L, Rosenfield S, Stansel R, Bianchi A, Moss H, de Lange T | title = Mammalian telomeres end in a large duplex loop | journal = Cell | volume = 97 | issue = 4 | pages = 503 – 14 | year = 1999 | id = PMID 10338214}}</ref> At the very end of the T-loop, the single-stranded telomere DNA is held onto a region of double-stranded DNA by the telomere strand disrupting the double-helical DNA and base pairing to one of the two strands. This triple-stranded structure is called a displacement loop or D-loop.<ref name=Burge/>
  
The diversity of the bases means that there are five kinds of nucleotides, which are commonly referred to by the identity of their bases. These are [[adenine]] (A), [[thymine]] (T), [[uracil]] (U), [[cytosine]] (C), and [[guanine]] (G). U is rarely found in DNA except as a result of chemical degradation of C, but in some viruses, notably PBS1 phage DNA, U completely replaces the usual T in its DNA. Similarly, RNA usually contains U in place of T, but in certain RNAs such as [[transfer RNA]], T is always found in some positions. Thus, the only true difference between DNA and RNA is the sugar, 2-deoxyribose in DNA and ribose in RNA.
+
==Chemical modifications==
 
+
<div class="thumb tright" style="background-color: #f9f9f9; border: 1px solid #CCCCCC; margin:0.5em;">
In a DNA double helix, two polynucleotide strands can associate through the [[hydrophobic effect]] and [[pi stacking]]. Specificity of which strands stay associated is determined by [[base pair|complementary pairing]]. Each base forms [[hydrogen bond]]s readily to only one other (A to T and C to G), so that the sequence of bases on one strand dictates the strength of the association; the more complementary bases exist, the stronger and longer-lasting the association.
+
{|border="0" width=300px border="0" cellpadding="2" cellspacing="0" style="font-size: 85%; border: 1px solid #CCCCCC; margin: 0.3em;"
 
+
|[[Image:Cytosine chemical structure.png|75px]]
The cell's machinery is capable of ''melting'' or dissociating a DNA double helix, and using each DNA strand as a template for synthesizing a new complementary strand, which is nearly identical to the strand previously paired with the template. Errors that occur in the synthesis are known as [[mutations]]. The [[Polymerase chain reaction|polymerase chain reaction]] (PCR) mimics this process [[in vitro]].
+
|[[Image:5-methylcytosine.png|95px]]
 
+
|[[Image:Thymine chemical structure.png|97px]]
Because pairing causes the nucleotide bases to face the helical axis, the sugar and phosphate groups of the nucleotides run along the outside; the two chains they form are sometimes called the "'''backbones'''" of the helix. In fact, it is chemical bonds between the phosphates and the sugars that link one nucleotide to the next in the DNA strand.
+
|-
 
+
|align=center|[[cytosine]]
{{multi-video start}}
+
|align=center|[[5-Methylcytosine|5-methylcytosine]]
{{multi-video item |
+
|align=center|[[thymine]]
  filename      = ADN animation.gif |
+
|}
  title        = Rotating DNA stick model |
+
<div style="border: none; width:300px;font-size: 85%;"><div class="thumbcaption">Structure of cytosine with and without the 5-methyl group. After deamination the 5-methylcytosine has the same structure as thymine</div></div></div>
  description  = Animation of a section of DNA rotating. (1.00 [[Megabyte|MB]], [[animated GIF]] format). |
+
===Base modifications===
  format        = [[animated GIF]]
+
{{further|[[DNA methylation]]}}
}}
+
The expression of genes is influenced by the [[chromatin]] structure of a chromosome and regions of [[heterochromatin]] (low or no gene expression) correlate with the [[methylation]] of [[cytosine]]. For example, cytosine methylation, to produce [[5-Methylcytosine|5-methylcytosine]], is important for [[X-inactivation|X-chromosome inactivation]].<ref>{{cite journal | author = Klose R, Bird A | title = Genomic DNA methylation: the mark and its mediators | journal = Trends Biochem Sci | volume = 31 | issue = 2 | pages = 89 – 97 | year = 2006 | id = PMID 16403636}}</ref> The average level of methylation varies between organisms, with ''[[Caenorhabditis elegans]]'' lacking cytosine methylation, while [[vertebrate]]s show higher levels, with up to 1% of their DNA containing 5-methylcytosine.<ref>{{cite journal | author = Bird A | title = DNA methylation patterns and epigenetic memory | journal = Genes Dev | volume = 16 | issue = 1 | pages = 6 – 21 | year = 2002 | id = PMID 11782440}}</ref> Despite the biological role of 5-methylcytosine it is susceptible to spontaneous [[deamination]] to leave the thymine base, and methylated cytosines are therefore [[mutation]] hotspots.<ref>{{cite journal | author = Walsh C, Xu G | title = Cytosine methylation and DNA repair | journal = Curr Top Microbiol Immunol | volume = 301 | issue = | pages = 283 – 315 | year = | id = PMID 16570853}}</ref> Other base modifications include adenine methylation in bacteria and the [[glycosylation]] of uracil to produce the "J-base" in [[kinetoplastid]]s.<ref>{{cite journal | author = Ratel D, Ravanat J, Berger F, Wion D | title = N6-methyladenine: the other methylated base of DNA | journal = Bioessays | volume = 28 | issue = 3 | pages = 309 – 15 | year = 2006 | id = PMID 16479578}}</ref><ref>{{cite journal | author = Gommers-Ampt J, Van Leeuwen F, de Beer A, Vliegenthart J, Dizdaroglu M, Kowalak J, Crain P, Borst P | title = beta-D-glucosyl-hydroxymethyluracil: a novel modified base present in the 
DNA of the parasitic protozoan T. brucei | journal = Cell | volume = 75 | issue = 6 | pages = 1129 – 36 | year = 1993 | id = PMID 8261512}}</ref>
{{multi-video end}}
 
 
 
==Sequence role==
 
Within a gene, the sequence of [[nucleotides]] along a DNA strand defines a messenger RNA sequence, which in turn defines a [[protein]] that an [[organism]] is liable to manufacture or "[[gene expression|express]]" at one or several points in its life. The relationship between the nucleotide sequence and the [[amino acid|amino-acid]] sequence of the protein is determined by simple cellular rules of [[Translation (genetics)|translation]], known collectively as the [[genetic code]]. The genetic code consists of three-letter 'words' (termed codons) formed from a sequence of three nucleotides (e.g. ACT, CAG, TTT). The codons are first copied into [[messenger RNA]] and then read by [[transfer RNA]]s, with each codon corresponding to a particular amino acid. There are 64 possible codons (4 bases in each of 3 positions, <math>4^3</math>) that encode 20 amino acids. Most amino acids, therefore, have more than one possible codon. There are also three 'stop' or 'nonsense' codons signifying the end of the coding region, namely the UAA, UGA and UAG codons in RNA.
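The codon arithmetic above can be checked with a short sketch. This is only an illustration: the names `CODON_TABLE` and `translate` are hypothetical, the table is the standard genetic code written in the conventional T, C, A, G ordering with one-letter amino acid symbols, and the input is assumed to be a DNA coding strand (so T appears where the messenger RNA would carry U).

```python
from itertools import product

BASES = "TCAG"  # conventional ordering used for compact codon tables

# Standard genetic code, one letter per codon, in the order produced by
# product(BASES, repeat=3); "*" marks the three stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding-strand sequence codon by codon until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon: end of the coding region
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(len(CODON_TABLE))                      # number of possible codons, 4**3
    print(len(set(AMINO) - {"*"}))               # number of distinct amino acids
    print(translate("ATGACTCAGTTTTAA"))          # ATG ACT CAG TTT TAA
```

Because 64 codons map onto 20 amino acids plus the stop signal, most amino acids necessarily appear several times in the table, which is the redundancy described above.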
 
 
 
In many [[species]], only a small fraction of the total sequence of the [[genome]] appears to encode protein. For example, only about 1.5% of the [[human genome]] consists of protein-coding [[exons]]. The function of the rest is a matter of speculation. It is known that certain nucleotide sequences specify affinity for [[DNA binding protein]]s, which play a wide variety of vital roles, in particular through control of replication and transcription. These sequences are frequently called [[regulatory sequence]]s, and researchers assume that so far they have identified only a tiny fraction of the total that exist. "[[Junk DNA]]" represents sequences that do not yet appear to contain genes or to have a function. The reasons for the presence of so much [[non-coding DNA]] in [[eukaryotic]] genomes and the extraordinary differences in [[genome size]] ("[[C-value]]") among species represent a long-standing puzzle in DNA research known as the "[[C-value enigma]]".
 
  
Some DNA sequences play structural roles in chromosomes. [[Telomere]]s and [[centromere]]s typically contain few (if any) protein-coding genes, but are important for the function and stability of chromosomes. Some genes encode RNA rather than protein (see [[tRNA]] and [[rRNA]]). Some of these RNA genes code for transcripts that function as regulatory RNAs (see [[RNA interference|siRNA]]) that influence the function of other RNA molecules. The intron-exon structure of some genes (such as immunoglobulin and protocadherin genes) is important for allowing alternative splicing of pre-mRNA, which allows several different proteins to be made from the same gene. Some non-coding DNA represents [[pseudogene]]s that can be used as raw material for the creation of new genes with new functions. Some non-coding DNA provides hot-spots for duplication of short DNA regions; such sequence duplication has been the major form of genetic change in the human lineage (see evidence from the [[Chimpanzee Genome Project]]). Exons interspersed with introns allow for "exon shuffling" and the creation of modified genes that might have new adaptive functions. Large amounts of non-coding DNA are probably adaptive in that they provide chromosomal regions where [[Genetic recombination|recombination]] between homologous portions of chromosomes can take place without disrupting the function of genes. Some biologists such as [[Stuart Kauffman]] have speculated that non-coding DNA may modify the rate of evolution of a species.{{fact}}
+
===DNA damage===
 +
{{further|[[Mutation]]}}
  
Sequence also determines a DNA segment's susceptibility to cleavage by [[restriction enzyme]]s, the quintessential tools of [[genetic engineering]]. The position of cleavage sites throughout an individual's genome determines one kind of an individual's "[[DNA fingerprinting|DNA fingerprint]]".
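As a rough illustration of how the positions of cleavage sites yield a fingerprint, the sketch below cuts a linear sequence at every occurrence of a recognition site and reports the fragment lengths. The function names and the two toy sequences are hypothetical; the default site GAATTC, cut after the first base, is the recognition sequence of the enzyme EcoRI, and only one strand of a linear molecule is modelled.

```python
def find_sites(dna, site="GAATTC"):
    """Return the start index of every occurrence of the recognition site."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def fragment_lengths(dna, site="GAATTC", cut_offset=1):
    """Lengths of the fragments left after cutting a linear molecule at each site.

    cut_offset is the number of bases of the site that stay on the left-hand
    fragment (1 for a cut between G and AATTC).
    """
    cuts = [p + cut_offset for p in find_sites(dna, site)]
    bounds = [0] + cuts + [len(dna)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    individual_a = "AAGAATTCTTTTGAATTCGG"  # two recognition sites
    individual_b = "AAGAATTCTTTTGAGTTCGG"  # one base changed: second site lost
    print(fragment_lengths(individual_a))
    print(fragment_lengths(individual_b))
```

A single base difference that destroys a site changes the set of fragment lengths, which is the kind of variation a restriction-based fingerprint detects.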
+
[[Image:Benzopyrene DNA adduct 1JDG.png|thumb|right|250px|[[Benzopyrene]], the major mutagen in [[tobacco smoking|tobacco smoke]], in an adduct to DNA.<ref>Created from [http://www.rcsb.org/pdb/cgi/explore.cgi?pdbId=1JDG PDB 1JDG]</ref>]]
 +
DNA can be damaged by many different sorts of [[mutagen]]s. These include [[oxidizing agent]]s, [[alkylating agent]]s and also high-energy [[electromagnetic radiation]] such as [[ultraviolet]] light and [[x-ray]]s. The type of DNA damage produced depends on the type of mutagen. For example, UV light mostly damages DNA by producing [[thymine dimer]]s, which are cross-links between adjacent pyrimidine bases in a DNA strand.<ref>{{cite journal | author = Douki T, Reynaud-Angelin A, Cadet J, Sage E | title = Bipyrimidine photoproducts rather than oxidative lesions are the main type of DNA damage involved in the genotoxic effect of solar UVA radiation | journal = Biochemistry | volume = 42 | issue = 30 | pages = 9221 – 6 | year = 2003 | id = PMID 12885257}},</ref> On the other hand, oxidants such as [[free radical]]s or [[hydrogen peroxide]] produce multiple forms of damage, including base modifications, particularly of guanosine, as well as double-strand breaks.<ref>{{cite journal | author = Cadet J, Delatour T, Douki T, Gasparutto D, Pouget J, Ravanat J, Sauvaigo S | title = Hydroxyl radicals and DNA base damage | journal = Mutat Res | volume = 424 | issue = 1 – 2 | pages = 9 – 21 | year = 1999 | id = PMID 10064846}}</ref> It has been estimated that in each human cell, about 500 bases suffer oxidative damage per day.<ref>{{cite journal | author = Shigenaga M, Gimeno C, Ames B | title = Urinary 8-hydroxy-2′-deoxyguanosine as a biological marker of ''in vivo'' oxidative DNA damage | url=http://www.pnas.org/cgi/reprint/86/24/9697 | journal = Proc Natl Acad Sci U S A | volume = 86 | issue = 24 | pages = 9697 – 701 | year = 1989 | id = PMID 2602371}}</ref><ref>{{cite journal | author = Cathcart R, Schwiers E, Saul R, Ames B | title = Thymine glycol and thymidine glycol in human and rat urine: a possible assay for oxidative DNA damage | url=http://www.pnas.org/cgi/reprint/81/18/5633.pdf | journal = Proc Natl Acad Sci U S A | volume = 81 | issue = 18 | pages = 5633 – 7 | 
year = 1984 | id = PMID 6592579}}</ref> Of these oxidative lesions, the most dangerous are double-strand breaks, as these lesions are difficult to repair and can produce [[point mutation]]s, [[Insertion (genetics)|insertions]] and [[Genetic deletion|deletions]] from the DNA sequence, as well as [[chromosomal translocation]]s.<ref>{{cite journal | author = Valerie K, Povirk L | title = Regulation and mechanisms of mammalian double-strand break repair | journal = Oncogene | volume = 22 | issue = 37 | pages = 5792 – 812 | year = 2003 | id = PMID 12947387}}</ref>
  
==Replication==
+
Many mutagens [[intercalation (chemistry)|intercalate]] into the space between two adjacent base pairs. Intercalators are mostly [[aromaticity|aromatic]] and planar molecules, and include [[ethidium]], [[daunomycin]], [[doxorubicin]] and [[thalidomide]]. In order for an intercalator to fit between base pairs, the bases must separate, distorting the DNA strands by unwinding of the double helix. These structural changes inhibit both transcription and DNA replication, causing toxicity and mutations. As a result, DNA intercalators are often [[carcinogen]]s, with [[benzopyrene|benzopyrene diol epoxide]], [[acridine]]s, [[aflatoxin]] and [[ethidium bromide]] being well-known examples.<ref>{{cite journal | author = Ferguson L, Denny W | title = The genetic toxicology of acridines | journal = Mutat Res | volume = 258 | issue = 2 | pages = 123 – 60 | year = 1991 | id = PMID 1881402}}</ref><ref>{{cite journal | author = Jeffrey A | title = DNA modification by chemical carcinogens | journal = Pharmacol Ther | volume = 28 | issue = 2 | pages = 237 – 72 | year = 1985 | id = PMID 3936066}}</ref><ref>{{cite journal |author=Stephens T, Bunde C, Fillmore B |title=Mechanism of action in thalidomide teratogenesis |journal=Biochem Pharmacol |volume=59 |issue=12 |pages=1489 – 99 |year=2000 |id=PMID 10799645}}</ref> Nevertheless, due to their properties of inhibiting DNA transcription and replication, they are also used in [[chemotherapy]] to inhibit rapidly-growing [[cancer]] cells.<ref>{{cite journal | author = Braña M, Cacho M, Gradillas A, de Pascual-Teresa B, Ramos A | title = Intercalators as anticancer drugs | journal = Curr Pharm Des | volume = 7 | issue = 17 | pages = 1745 – 80 | year = 2001 | id = PMID 11562309}}</ref>
''Main article:'' [[DNA replication]]
 
[[image:dna-split.png|frame|DNA replication]]
 
<!-- summary has been added, below, also include any extra context relevant for this article as well
 
  
..[[origin of replication]]...chromosome...plasmid...DNA polymerase...[[mutation]]...[a paragraph including these ideas would be useful and go well here]
+
==Overview of biological functions==
-->
+
DNA usually occurs as linear [[chromosome]]s in eukaryotes, and circular chromosomes in prokaryotes. The set of chromosomes in a cell makes up its [[genome]]; the [[human genome]] has approximately 3 billion base pairs of DNA arranged into 46 chromosomes.<ref>{{cite journal | author = Venter J, ''et al.'' | title = The sequence of the human genome | journal = Science | volume = 291 | issue = 5507 | pages = 1304 – 51 | year = 2001 | id = PMID 11181995}}</ref> The information carried by DNA is held in the [[DNA sequence|sequence]] of pieces of DNA called [[gene]]s. Transmission of genetic information in genes is achieved via complementary base pairing. For example, in transcription, when a cell uses the information in a gene, the DNA sequence is copied into a complementary RNA sequence through the attraction between the DNA and the correct RNA nucleotides. Usually, this RNA copy is then used to make a matching protein sequence in a process called [[Translation (biology)|translation]] which depends on the same interaction between RNA nucleotides. Alternatively, a cell may simply copy its genetic information in a process called DNA replication. The details of these functions are covered in other articles; here we focus on the interactions between DNA and other molecules that mediate the function of the genome.
DNA replication or DNA synthesis is the process of copying the double-stranded DNA prior to [[cell division]]. Each of the two resulting double strands consists of one original and one newly synthesized strand; this is called ''[[semiconservative replication]]''. The two copies are generally almost perfectly identical, but occasionally errors in replication, or exposure to chemicals or radiation, can result in a less than perfect copy (see [[mutation]]). The process of replication consists of three steps: ''initiation'', ''elongation'' and ''termination''.
+
===Genome structure===
 +
{{further|[[Cell nucleus]], [[Chromatin]], [[Chromosome]], [[Gene]], [[Non-coding DNA]]}}
 +
Genomic DNA is located in the [[cell nucleus]] of eukaryotes, as well as small amounts in [[mitochondrion|mitochondria]] and [[chloroplast]]s. In prokaryotes, the DNA is held within an irregularly shaped body in the cytoplasm called the [[nucleoid]].<ref>{{cite journal | author = Thanbichler M, Wang S, Shapiro L | title = The bacterial nucleoid: a highly organized and dynamic structure | journal = J Cell Biochem | volume = 96 | issue = 3 | pages = 506 – 21 | year = 2005 | id = PMID 15988757}}</ref> The genetic information in a genome is held within genes. A gene is a unit of [[heredity]] and is a region of DNA that influences a particular characteristic in an organism. Genes contain an [[open reading frame]] that can be transcribed, as well as [[regulatory sequence]]s such as [[promoter]]s and [[enhancer (genetics)|enhancers]], which control the expression of the open reading frame.  
  
==Mechanical biological properties==
+
In many [[species]], only a small fraction of the total sequence of the [[genome]] encodes protein. For example, only about 1.5% of the human genome consists of protein-coding [[exon]]s, with over 50% of human DNA consisting of non-coding [[repeated sequence (DNA)|repetitive sequences]].<ref>{{cite journal | author = Wolfsberg T, McEntyre J, Schuler G | title = Guide to the draft human genome | journal = Nature | volume = 409 | issue = 6822 | pages = 824 – 6 | year = 2001 | id = PMID 11236998}}</ref> The reasons for the presence of so much [[noncoding DNA|non-coding DNA]] in eukaryotic genomes and the extraordinary differences in [[genome size]], or ''[[C-value]]'', among species represent a long-standing puzzle known as the "[[C-value enigma]]."<ref>{{cite journal | author = Gregory T | title = The C-value enigma in plants and animals: a review of parallels and an appeal for partnership | url=http://aob.oxfordjournals.org/cgi/content/full/95/1/133 | journal = Ann Bot (Lond) | volume = 95 | issue = 1 | pages = 133 – 46 | year = 2005 | id = PMID 15596463}}</ref>
''Main article:'' [[Mechanical properties of DNA]].
+
[[Image:RNA pol.jpg|thumb|left|300px|[[T7 RNA polymerase]] producing a mRNA (green) from a DNA template (red and blue). The enzyme is shown as a purple ribbon.<ref>Created from [http://www.rcsb.org/pdb/explore/explore.do?structureId=1MSW PDB 1MSW]</ref>]]
 +
Some non-coding DNA sequences play structural roles in chromosomes. [[Telomere]]s and [[centromere]]s typically contain few genes, but are important for the function and stability of chromosomes.<ref name=Nugent/><ref>{{cite journal | author = Pidoux A, Allshire R | title = The role of heterochromatin in centromere function | url=http://www.journals.royalsoc.ac.uk/media/804t6y8vmh5utlb6ua5y/contributions/p/x/7/a/px7ahm740dq5ueuk.pdf | journal = Philos Trans R Soc Lond B Biol Sci | volume = 360 | issue = 1455 | pages = 569 – 79 | year = 2005 | id = PMID 15905142}}</ref> [[Pseudogene]]s, copies of genes that have been disabled by mutation, are an abundant form of non-coding DNA in humans.<ref>{{cite journal | author = Harrison P, Hegyi H, Balasubramanian S, Luscombe N, Bertone P, Echols N, Johnson T, Gerstein M | title = Molecular fossils in the human genome: identification and analysis of the pseudogenes in chromosomes 21 and 22 | url=http://www.genome.org/cgi/content/full/12/2/272 | journal = Genome Res | volume = 12 | issue = 2 | pages = 272 – 80 | year = 2002 | id = PMID 11827946}}</ref> These sequences are usually just molecular [[fossil]]s, although they can occasionally serve as raw genetic material for the creation of new genes through the process of [[gene duplication]] and [[divergent evolution|divergence]].<ref>{{cite journal | author = Harrison P, Gerstein M | title = Studying genomes through the aeons: protein families, pseudogenes and proteome evolution | journal = J Mol Biol | volume = 318 | issue = 5 | pages = 1155 – 74 | year = 2002 | id = PMID 12083509}}</ref>
  
===Strands association and dissociation===
+
===Transcription and translation===
The hydrogen bonds between the strands of the double helix are weak enough that they can be easily separated by [[enzyme]]s. Enzymes known as [[helicase]]s unwind the strands to facilitate the advance of sequence-reading enzymes such as [[DNA polymerase]]. The unwinding builds up torsional strain ahead of the helicase; this is relieved by [[topoisomerase]]s, which transiently cleave the phosphate backbone of one of the strands so that it can swivel around the other. The strands can also be separated by gentle heating, as used in [[PCR]], provided they have fewer than about 10,000 '''base pairs''' (10 kilobase pairs, or 10 kbp). The intertwining of the DNA strands makes long segments difficult to separate.
+
{{further|[[Genetic code]], [[Transcription (genetics)]], [[Protein biosynthesis]]}}
 +
A gene is a sequence of DNA that contains genetic information and can influence the [[phenotype]] of an organism. Within a gene, the sequence of bases along a DNA strand defines a [[messenger RNA]] sequence, which then defines a protein sequence. The relationship between the nucleotide sequences of genes and the [[amino acid|amino-acid]] sequences of proteins is determined by the rules of [[translation (genetics)|translation]], known collectively as the [[genetic code]]. The genetic code consists of three-letter 'words' called ''codons'' formed from a sequence of three nucleotides (e.g. ACT, CAG, TTT).
  
===Circular DNA===
+
In transcription, the codons of a gene are copied into messenger RNA by [[RNA polymerase]]. This RNA copy is then decoded by a [[ribosome]] that reads the RNA sequence by base-pairing the messenger RNA to [[transfer RNA]], which carries amino acids. Since there are 4 bases in 3-letter combinations, there are 64 possible codons (<math>4^3</math> combinations). These encode the twenty [[list of standard amino acids|standard amino acids]], giving most amino acids more than one possible codon. There are also three 'stop' or 'nonsense' codons signifying the end of the coding region; these are the TAA, TGA and TAG codons.
When the ends of a piece of double-helical DNA are joined so that it forms a circle, as in [[plasmid]] DNA, the strands are [[knot theory|topologically]] knotted. This means they cannot be separated by gentle heating or by any process that does not involve breaking a strand. The task of unknotting topologically linked strands of DNA falls to enzymes known as [[topoisomerase]]s. Some of these enzymes unknot circular DNA by cleaving two strands so that another double-stranded segment can pass through. Unknotting is required for the replication of circular DNA as well as for various types of [[recombination]] in linear DNA.
 
  
===Great length versus tiny breadth===
+
[[Image:DNA replication.svg|thumb|450px|right|DNA replication. The double helix is unwound by a [[helicase]] and [[topoisomerase]]. Next, one [[DNA polymerase]] produces the [[leading strand]] copy. Another DNA polymerase binds to the [[lagging strand]]. This enzyme makes discontinuous segments (called [[Okazaki fragment]]s) before [[DNA ligase]] joins them together.]]
The narrow breadth of the double helix makes it impossible to detect by conventional [[transmission electron microscope|electron microscopy]], except by heavy staining. At the same time, the DNA found in many cells can be macroscopic in length: the DNA of a single human cell, laid end to end, would be approximately 2 [[meter]]s long.<ref>{{cite web| url=http://hypertextbook.com/facts/1998/StevenChen.shtml| title=Length of a Human DNA Molecule| accessdate=2006-03-04}}</ref> Consequently, cells must compact or "package" DNA to carry it within them. This is one of the functions of the chromosomes, which contain spool-like [[protein]]s known as [[histone]]s, around which DNA winds.
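The figure of roughly 2 meters can be reproduced with a back-of-the-envelope calculation. The sketch below is not taken from the cited source: it assumes a haploid genome of about 3 billion base pairs, two copies per cell, and a rise of about 0.34 nm per base pair for the B form, and the names used are hypothetical.

```python
BASE_PAIRS_HAPLOID = 3e9   # assumed size of one copy of the human genome
COPIES_PER_CELL = 2        # assumed: a diploid cell carries two copies
RISE_NM_PER_BP = 0.34      # assumed axial rise per base pair in B-DNA, in nm


def dna_length_m(base_pairs, rise_nm=RISE_NM_PER_BP):
    """Contour length in meters of a double helix with the given number of base pairs."""
    return base_pairs * rise_nm * 1e-9  # 1 nm = 1e-9 m


total_m = dna_length_m(BASE_PAIRS_HAPLOID * COPIES_PER_CELL)

if __name__ == "__main__":
    print(f"Total DNA per cell: about {total_m:.2f} m")
```

Under these assumptions the total comes to about 2 m per cell, spread across all of the chromosomes rather than contained in any single one.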
 
  
===Entropic stretching behavior===
+
===Replication===
When DNA is in solution, it undergoes conformational fluctuations due to the energy available in the [[thermal bath]]. For [[Entropy|entropic]] reasons, floppy states are more thermally accessible than stretched out states; for this reason, a single molecule of DNA stretches similarly to a rubber band. Using [[optical tweezers]], the entropic stretching behavior of DNA has been studied and analyzed from a [[polymer physics]] perspective, and it has been found that DNA behaves like the ''Kratky-Porod'' [[worm-like chain]] model with a persistence length of about 53 nm.
+
{{further|[[DNA replication]]}}
  
Furthermore, DNA undergoes a stretching [[phase transition]] at a force of 65 [[Newtons|pN]]; above this force, DNA is thought to take the form that [[Linus Pauling]] originally hypothesized, with the phosphates in the middle and bases splayed outward. This proposed structure for overstretched DNA has been called "P-form DNA," in honor of Pauling.
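The entropic stretching described above can be illustrated numerically. The text names the worm-like chain model but gives no formula, so the sketch below assumes the commonly used Marko-Siggia interpolation for the force-extension relation; the function name, the default temperature of 298 K and the parameter names are illustrative choices, and only the 53 nm persistence length comes from the text.

```python
BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant in J/K


def wlc_force_pn(relative_extension, persistence_length_nm=53.0, temperature_k=298.0):
    """Approximate force (in pN) needed to hold a worm-like chain at a given
    end-to-end extension, expressed as a fraction of its contour length.

    Uses the Marko-Siggia interpolation formula (an assumption of this sketch):
        F = (kT / P) * [ 1 / (4 (1 - x)^2) - 1/4 + x ]
    """
    x = relative_extension
    if not 0.0 <= x < 1.0:
        raise ValueError("relative extension must be in the range [0, 1)")
    thermal_energy_j = BOLTZMANN_J_PER_K * temperature_k
    persistence_length_m = persistence_length_nm * 1e-9
    force_n = (thermal_energy_j / persistence_length_m) * (
        0.25 / (1.0 - x) ** 2 - 0.25 + x
    )
    return force_n * 1e12  # convert newtons to piconewtons


# Force rises slowly at first and steeply as the chain approaches full extension.
for fraction in (0.1, 0.5, 0.9, 0.99):
    print(f"x/L = {fraction:.2f}: about {wlc_force_pn(fraction):.3f} pN")
```

This entropic model only describes the regime below the overstretching transition mentioned above; it says nothing about the structural change near 65 pN.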
+
[[Cell division]] is essential for an organism to grow, but when a cell divides it must replicate the DNA in its genome so that the two daughter cells have the same genetic information as their parent. The double-stranded structure of DNA provides a simple mechanism for [[DNA replication]]. Here, the two strands are separated and then each strand's complementary DNA sequence is recreated by an enzyme called [[DNA polymerase]]. This enzyme makes the complementary strand by finding the correct base through complementary base pairing, and bonding it onto the growing new strand. As DNA polymerases can only extend a DNA strand in a 5′ to 3′ direction, different mechanisms are used to copy the antiparallel strands of the double helix.<ref>{{cite journal | author = Albà M | title = Replicative DNA polymerases | url=http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=11178285 | journal = Genome Biol | volume = 2 | issue = 1 | pages = REVIEWS3002 | year = 2001 | id = PMID 11178285}}</ref> In this way, the base on the old strand dictates which base appears on the new strand, and the cell ends up with a perfect copy of its DNA.
  
===Different helix geometries===
+
==Interactions with proteins==
The DNA helix can assume one of three slightly different geometries, of which the "B" form described by [[James D. Watson]] and [[Francis Crick]] is believed to predominate in cells. It is 2 [[nanometre]]s wide and extends 3.4 nanometres per 10 [[Base pair|bp]] of sequence. This is also the approximate length of sequence in which the double helix makes one complete turn about its axis. This frequency of twist (known as the helical ''pitch'') depends largely on stacking forces that each base exerts on its neighbors in the chain.
+
All the functions of DNA depend on interactions with proteins. These protein interactions can be non-specific, or the protein can bind specifically to a single DNA sequence. Enzymes can also bind to DNA and of these, the polymerases that copy the DNA base sequence in transcription and DNA replication are particularly important.
  
====Supercoiled DNA====
+
===DNA-binding proteins===
{{main|Supercoil}}
+
<div class="thumb tleft" style="background-color: #f9f9f9; border: 1px solid #CCCCCC; margin:0.5em;">
The B form of the DNA helix twists 360° per 10 bp in the absence of strain. But many molecular biological processes can induce strain. A DNA segment with excess or insufficient helical twisting is referred to, respectively, as positively or negatively "supercoiled". DNA ''in vivo'' is typically negatively supercoiled, which facilitates the unwinding of the double-helix required for [[transcription (genetics)|RNA transcription]].
+
{|border="0" width=260px border="0" cellpadding="0" cellspacing="0" style="font-size: 85%; border: 1px solid #CCCCCC; margin: 0.3em;"
 
+
|[[Image:Nucleosome 2.jpg|260px]]
====Sugar pucker====
 
There are four conformations that the [[ribofuranose]] rings in nucleotides can acquire:
 
# C-2' endo
 
# C-2' exo
 
# C-3' endo
 
# C-3' exo
 
Ribose is usually in the C-3' endo conformation, while deoxyribose is usually in the C-2' endo sugar pucker conformation.
 
The A and B forms differ mainly in their ''sugar pucker''.  In the A form, the C3' configuration is above the sugar ring, whilst the C2' configuration is below it.  Thus, the A form is described as "C3'-endo."  Likewise, in the B form, the C2' configuration is above the sugar ring, whilst C3' is below; this is called "C2'-endo."  Altered sugar puckering in A-DNA results in shortening the distance between adjacent phosphates by around one angstrom.  This gives 11 to 12 base pairs to each helical turn, instead of 10.5 in B-DNA.  Sugar pucker gives A-DNA a uniform ribbon shape, a cylindrical open core, and a deep major groove that is narrower and more pronounced than the grooves found in B-DNA.
 
 
 
====A and Z helices formation====
 
The two other known double-helical forms of DNA, called A and [[Z-DNA|Z]], differ modestly in their geometry and dimensions. The A form appears likely to occur only in dehydrated samples of DNA, such as those used in [[crystallography|crystallographic]] experiments, and possibly in hybrid pairings of DNA and [[RNA]] strands. Segments of DNA that cells have [[methylation|methylated]] for regulatory purposes may adopt the Z geometry, in which the strands turn about the helical axis like a mirror image of the B form.
 
 
 
====Properties of different helical forms====
 
{| border="0" align="center" style="border: 1px solid #999; background-color:#FFFFFF"
 
|-align="center" bgcolor="#CCCCCC"
 
!Geometry attribute
 
!A-form
 
!B-form
 
!Z-form
 
 
|-
 
|-
|Helix sense ||align="center"| right-handed ||align="center"| right-handed ||align="center"| left-handed
+
|[[Image:Nucleosome_(opposites_attracts).JPG|260px]]
|-bgcolor="#EFEFEF"
|Repeating unit ||align="right"| 1 bp ||align="right"| 1 bp ||align="right"| 2 bp
|-
|Rotation/bp ||align="right"| 33.6° ||align="right"| 35.9° ||align="right"| 60°/2
|-bgcolor="#EFEFEF"
|Mean bp/turn ||align="right"| 10.7 ||align="right"| 10.4 ||align="right"| 12
|-
|Inclination of bp to axis ||align="right"| +19° ||align="right"| -1.2° ||align="right"| -9°
|-bgcolor="#EFEFEF"
|Rise/bp along axis ||align="right"| 0.23 nm ||align="right"| 0.332 nm ||align="right"| 0.38 nm
|-
|Pitch/turn of helix ||align="right"| 2.46 nm ||align="right"| 3.32 nm ||align="right"| 4.56 nm
|-bgcolor="#EFEFEF"
|Mean propeller twist ||align="right"| +18° ||align="right"| +16° ||align="right"| 0°
|-
|Glycosyl angle ||align="center"| anti ||align="center"| anti ||align="center"| C: anti,<br> G: syn
|-bgcolor="#EFEFEF"
|Sugar pucker ||align="center"| C3'-endo ||align="center"| C2'-endo ||align="center"| C: C2'-endo,<br>G: C2'-exo
|-
|Diameter ||align="right"| 2.6 nm ||align="right"| 2.0 nm ||align="right"| 1.8 nm
 
 
|}
 
|}
 +
<div style="border: none; width:260px;"><div class="thumbcaption">Interaction of DNA with [[histone]]s (shown in white, top). These proteins' basic amino acids (below left, blue) bind to the acidic phosphate groups on DNA (below right, red).</div></div></div>
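The geometric quantities in the table of helical forms above are related by simple arithmetic: the number of base pairs per turn is 360° divided by the rotation per base pair, and the pitch is that number multiplied by the rise per base pair. The sketch below recomputes these values from the tabulated figures as a consistency check. The dictionary layout and function name are illustrative, and the Z-form rotation is entered as the average of 30° per base pair implied by "60°/2".

```python
# Values copied from the table of helical forms; Z-form rotation is the
# per-base-pair average implied by "60 degrees per 2 bp repeating unit".
HELIX_FORMS = {
    "A": {"rotation_deg": 33.6, "bp_per_turn": 10.7, "rise_nm": 0.23, "pitch_nm": 2.46},
    "B": {"rotation_deg": 35.9, "bp_per_turn": 10.4, "rise_nm": 0.332, "pitch_nm": 3.32},
    "Z": {"rotation_deg": 30.0, "bp_per_turn": 12.0, "rise_nm": 0.38, "pitch_nm": 4.56},
}


def derived_geometry(form):
    """Return (bp per turn, pitch in nm) computed from rotation and rise."""
    bp_per_turn = 360.0 / form["rotation_deg"]
    pitch_nm = bp_per_turn * form["rise_nm"]
    return bp_per_turn, pitch_nm


for name, form in HELIX_FORMS.items():
    bp_per_turn, pitch_nm = derived_geometry(form)
    print(
        f"{name}-form: {bp_per_turn:.2f} bp/turn computed "
        f"(table: {form['bp_per_turn']}), "
        f"pitch {pitch_nm:.2f} nm computed (table: {form['pitch_nm']})"
    )
```

The computed pitch agrees with the tabulated pitch for all three forms. For the B form, however, the tabulated 10.4 bp per turn does not follow from the tabulated 35.9° rotation, which gives about 10.0; the sketch reports both figures rather than choosing between them.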
  
===Non-helical forms===
+
Structural proteins that bind DNA are well-understood examples of non-specific DNA-protein interactions. Within chromosomes, DNA is held in complexes with structural proteins. These proteins organize the DNA into a compact structure called [[chromatin]]. In eukaryotes this structure involves DNA binding to a complex of small basic proteins called [[histone]]s, while in prokaryotes multiple types of proteins are involved.<ref>{{cite journal | author = Sandman K, Pereira S, Reeve J | title = Diversity of prokaryotic chromosomal proteins and the origin of the nucleosome | journal = Cell Mol Life Sci | volume = 54 | issue = 12 | pages = 1350 – 64 | year = 1998 | id = PMID 9893710}}</ref><ref>{{cite journal |author=Dame RT |title=The role of nucleoid-associated proteins in the organization and compaction of bacterial chromatin |journal=Mol. Microbiol. |volume=56 |issue=4 |pages=858-70 |year=2005 |pmid=15853876}}</ref> The histones form a disk-shaped complex called a [[nucleosome]], which contains two complete turns of double-stranded DNA wrapped around its surface. 
These non-specific interactions are formed through basic residues in the histones making [[ionic bond]]s to the acidic sugar-phosphate backbone of the DNA, and are therefore largely independent of the base sequence.<ref>{{cite journal | author = Luger K, Mäder A, Richmond R, Sargent D, Richmond T | title = Crystal structure of the nucleosome core particle at 2.8 A resolution | journal = Nature | volume = 389 | issue = 6648 | pages = 251 – 60 | year = 1997 | id = PMID 9305837}}</ref> Chemical modifications of these basic amino acid residues include [[methylation]], [[phosphorylation]] and [[acetylation]].<ref>{{cite journal | author = Jenuwein T, Allis C | title = Translating the histone code | journal = Science | volume = 293 | issue = 5532 | pages = 1074 – 80 | year = 2001 | id = PMID 11498575}}</ref> These chemical changes alter the strength of the interaction between the DNA and the histones, making the DNA more or less accessible to [[transcription factor]]s and changing the rate of transcription.<ref>{{cite journal | author = Ito T | title = Nucleosome assembly and remodelling | journal = Curr Top Microbiol Immunol | volume = 274 | issue = | pages = 1 – 22 | year = | id = PMID 12596902}}</ref> Other non-specific DNA-binding proteins found in chromatin include the high-mobility group proteins, which bind preferentially to bent or distorted DNA.<ref>{{cite journal | author = Thomas J | title = HMG1 and 2: architectural DNA-binding proteins | journal = Biochem Soc Trans | volume = 29 | issue = Pt 4 | pages = 395 – 401 | year = 2001 | id = PMID 11497996}}</ref> These proteins are important in bending arrays of nucleosomes and arranging them into more complex chromatin structures.<ref>{{cite journal | author = Grosschedl R, Giese K, Pagel J | title = HMG domain proteins: architectural elements in the assembly of nucleoprotein structures | journal = Trends Genet | volume = 10 | issue = 3 | pages = 94–100 | year = 1994 | id = PMID 8178371}}</ref>
There is an argument to be made that the native, intracellular form of DNA is not the B-form double helix, as commonly supposed. Rather, this argument proposes, the strands of DNA remain almost entirely separate in their normal states.
 
Information on this alternative theory is available from this online book, presented in PDF format:
 
 
 
http://www.notahelix.com/delmonte/new_struct_mol_biol.pdf
 
  
and a recent research paper summarises some key experimental data which are better explained by side-by-side (SBS) models than by the double helix:
+
A distinct group of DNA-binding proteins are the single-stranded-DNA-binding proteins that specifically bind single-stranded DNA. In humans, replication protein A is the best-characterised member of this family and is essential for most processes where the double helix is separated, including DNA replication, recombination and DNA repair.<ref>{{cite journal | author = Iftode C, Daniely Y, Borowiec J | title = Replication protein A (RPA): the eukaryotic SSB | journal = Crit Rev Biochem Mol Biol | volume = 34 | issue = 3 | pages = 141 – 80 | year = 1999 | id = PMID 10473346}}</ref> These binding proteins seem to stabilize single-stranded DNA and protect it from forming [[stem loop]]s or being degraded by [[nuclease]]s.
  
http://www.ias.ac.in/currsci/dec102003/1564.pdf
+
[[Image:Lambda repressor 1LMB.png|thumb|right|185px|The lambda repressor [[helix-turn-helix]] transcription factor bound to its DNA target<ref>Created from [http://www.rcsb.org/pdb/explore/explore.do?structureId=1LMB PDB 1LMB]</ref>]]
 +
In contrast, other proteins have evolved to specifically bind particular DNA sequences. The most intensively studied of these are the various classes of [[transcription factor]]s, which are proteins that regulate transcription. Each one of these proteins binds to one particular set of DNA sequences and thereby activates or inhibits the transcription of genes with these sequences close to their [[promoter]]s. The transcription factors do this in two ways. Firstly, they can bind the RNA polymerase responsible for transcription, either directly or through other mediator proteins; this locates the polymerase at the promoter and allows it to begin transcription.<ref>{{cite journal | author = Myers L, Kornberg R | title = Mediator of transcriptional regulation | journal = Annu Rev Biochem | volume = 69 | issue = | pages = 729 – 49 | year = | id = PMID 10966474}}</ref> Alternatively, transcription factors can bind [[enzyme]]s that modify the histones at the promoter; this will change the accessibility of the DNA template to the polymerase.<ref>{{cite journal | author = Spiegelman B, Heinrich R | title = Biological control through regulated transcriptional coactivators | journal = Cell | volume = 119 | issue = 2 | pages = 157-67 | year = 2004 | id = PMID 15479634}}</ref>
  
with subsequent correspondence:
+
As these DNA targets can occur throughout an organism's genome, changes in the activity of one type of transcription factor can affect thousands of genes.<ref>{{cite journal | author = Li Z, Van Calcar S, Qu C, Cavenee W, Zhang M, Ren B | title = A global transcriptional regulatory role for c-Myc in Burkitt's lymphoma cells | url=http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=12808131 | journal = Proc Natl Acad Sci U S A | volume = 100 | issue = 14 | pages = 8164 – 9 | year = 2003 | id = PMID 12808131}}</ref> Consequently, these proteins are often the targets of the [[signal transduction]] processes that mediate responses to environmental changes or cellular differentiation and development. The specificity of these transcription factors' interactions with DNA comes from the proteins making multiple contacts to the edges of the DNA bases, allowing them to "read" the DNA sequence. Most of these base-interactions are made in the major groove, where the bases are most accessible.<ref>{{cite journal | author = Pabo C, Sauer R | title = Protein-DNA recognition | journal = Annu Rev Biochem | volume = 53 | issue = | pages = 293 – 321 | year = | id = PMID 6236744}}</ref>
  
http://www.ias.ac.in/currsci/may252004/1352.pdf
+
[[Image:EcoRV 1RVA.png|thumb|left|250px|The [[restriction enzyme]] [[EcoRV]] (green) in a complex with its substrate DNA<ref>Created from [http://www.rcsb.org/pdb/explore/explore.do?structureId=1RVA PDB 1RVA]</ref>]]
  
However, these theories have problems of their own, such as explaining the near-perfect symmetry of DNA in cells and the activity of DNA repair in the absence of a base-paired strand for comparison. Additionally, the activity of [[topoisomerase|topoisomerases]] would be entirely redundant, and not nearly as important to cellular function as it patently is, if not for the fact that base-paired double-strands are at least the primary form of cellular DNA.
+
===DNA-modifying enzymes===
 +
====Nucleases and ligases====
 +
Nucleases are [[enzyme]]s that cut DNA strands by catalyzing the [[hydrolysis]] of the [[phosphodiester bond]]s. Nucleases that hydrolyse nucleotides from the ends of DNA strands are called [[exonuclease]]s, while [[endonuclease]]s cut within strands. The most frequently-used nucleases in [[molecular biology]] are the [[restriction enzyme|restriction endonucleases]], which cut DNA at specific sequences. For instance, the EcoRV enzyme shown to the left recognizes the 6-base sequence 5′-GAT|ATC-3′ and makes a cut at the vertical line. In nature, these enzymes protect [[bacteria]] against [[phage]] infection by digesting the phage DNA when it enters the bacterial cell, acting as part of the [[restriction modification system]].<ref>{{cite journal | author = Bickle T, Krüger D | title = Biology of DNA restriction | url=http://www.pubmedcentral.nih.gov/picrender.fcgi?artid=372918&blobtype=pdf | journal = Microbiol Rev | volume = 57 | issue = 2 | pages = 434 – 50 | year = 1993 | id = PMID 8336674}}</ref> In technology, these sequence-specific nucleases are used in [[clone (genetics)|molecular cloning]] and [[DNA fingerprinting]].
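The sequence-specific cutting described above can be sketched as a simple string operation. This is only an illustration of the idea, not a model of the enzyme: the function name and the example sequence are hypothetical, and the recognition site and cut position are the EcoRV values given in the text (GAT|ATC, a cut three bases into the site). Because this site reads the same on both strands, scanning one strand is enough here.

```python
def digest(sequence, site="GATATC", cut_offset=3):
    """Split `sequence` at every occurrence of `site`, cutting `cut_offset`
    bases into the site, and return the resulting fragments in order."""
    fragments = []
    fragment_start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[fragment_start:cut])
        fragment_start = cut
        position = sequence.find(site, position + 1)
    fragments.append(sequence[fragment_start:])  # remainder after the last cut
    return fragments


# Hypothetical sequence containing two EcoRV sites.
example = "AAGATATCTTGGGATATCAA"
print(digest(example))
```

A sequence with no recognition site is returned as a single uncut fragment, which mirrors why DNA lacking the site is unaffected by the enzyme.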
  
==Strand direction==
+
Enzymes called [[DNA ligase]]s can rejoin cut or broken DNA strands, using the energy from either [[adenosine triphosphate]] or [[nicotinamide adenine dinucleotide]].<ref name=Doherty>{{cite journal | author = Doherty A, Suh S | title = Structural and mechanistic conservation in DNA ligases. | url=http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=11058099 | journal = Nucleic Acids Res | volume = 28 | issue = 21 | pages = 4051 – 8 | year = 2000 | id = PMID 11058099}}</ref> Ligases are particularly important in [[lagging strand]] DNA replication, as they join together the short segments of DNA produced at the [[replication fork]] into a complete copy of the DNA template. They are also used in [[DNA repair]] and [[genetic recombination]].<ref name=Doherty/>
The asymmetric shape and linkage of nucleotides mean that a DNA strand always has a discernible orientation or directionality. Because of this directionality, close inspection of a double helix reveals that nucleotides are heading one way along one strand (the "''ascending strand''"), and the other way along the other strand (the "''descending strand''"). This arrangement of the strands is called '''antiparallel'''.
 
  
===Chemical nomenclature ([[5' end|5']] and [[3' end|3']])===
+
====Topoisomerases and helicases====
For reasons of chemical nomenclature, people who work with DNA refer to the two asymmetric ends of a strand as the [[5' end|5']] ("five prime") and [[3' end|3']] ("three prime") ends. Within a cell, the enzymes that perform [[DNA replication|replication]] and [[DNA transcription|transcription]] read the template DNA in the "'''[[3' end|3']] to [[5' end|5']] direction'''", while the enzymes that perform translation read in the opposite direction (on [[RNA|RNA]]). However, because chemically produced DNA is synthesized and manipulated in the opposite or in non-directional manners, the orientation should not be assumed. In a vertically oriented double helix, the [[3' end|3']] strand is said to be ascending while the [[5' end|5']] strand is said to be descending.
+
[[Topoisomerase]]s are enzymes with both nuclease and ligase activity. These proteins change the amount of [[DNA supercoil|supercoiling]] in DNA. Some of these enzymes work by cutting the DNA helix and allowing one section to rotate, thereby reducing its level of supercoiling; the enzyme then seals the DNA break.<ref name=Champoux/> Other types of these enzymes are capable of cutting one DNA helix and then passing a second strand of DNA through this break, before rejoining the helix.<ref>{{cite journal | author = Schoeffler A, Berger J | title = Recent advances in understanding structure-function relationships in the type II topoisomerase mechanism | journal = Biochem Soc Trans | volume = 33 | issue = Pt 6 | pages = 1465 – 70 | year = 2005 | id = PMID 16246147}}</ref> Topoisomerases are required for many processes involving DNA, such as DNA replication and transcription.<ref name=Wang/>
  
===Sense and antisense===
+
[[Helicase]]s are proteins that are a type of [[molecular motor]]. They use the chemical energy in [[nucleoside triphosphate]]s, predominantly [[Adenosine triphosphate|ATP]], to break hydrogen bonds between bases and unwind the DNA double helix into single strands.<ref>{{cite journal | author = Tuteja N, Tuteja R | title = Unraveling DNA helicases. Motif, structure, mechanism and function | url=http://www.blackwell-synergy.com/links/doi/10.1111%2Fj.1432-1033.2004.04094.x | journal = Eur J Biochem | volume = 271 | issue = 10 | pages = 1849–63 | year = 2004 | id = PMID 15128295}}</ref> These enzymes are essential for most processes where enzymes need to access the DNA bases.
As a result of their antiparallel arrangement and the sequence-reading preferences of enzymes, even if both strands carried identical instead of complementary sequences, cells could properly translate only one of them. The other strand a cell can only read backwards. [[molecular biology|Molecular biologists]] call a sequence "'''sense'''" if it is translated or translatable, and they call its complement  "'''antisense'''". It follows then, somewhat paradoxically, that the template for transcription is the ''antisense'' strand. The resulting transcript is an RNA replica of the ''sense'' strand and is itself ''sense.''
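The relationship between the sense strand, the antisense template and the transcript can be made concrete with a short sketch. The function names and the nine-base example sequence are hypothetical; all sequences are written 5′ to 3′, and transcription is modelled simply as taking the reverse complement of the template and replacing T with U.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand):
    """Return the complementary strand, written 5' to 3'."""
    return strand.translate(COMPLEMENT)[::-1]


def transcribe(template_strand):
    """Return the RNA made from a DNA template strand (both written 5' to 3').

    The template is read 3' to 5', so the transcript is its reverse
    complement with uracil (U) in place of thymine (T).
    """
    return reverse_complement(template_strand).replace("T", "U")


sense = "ATGGCCTAA"                    # hypothetical sense (coding) sequence
antisense = reverse_complement(sense)  # the strand actually used as template
transcript = transcribe(antisense)

print("sense:     ", sense)
print("antisense: ", antisense)
print("transcript:", transcript)
```

Apart from U replacing T, the transcript has the same sequence as the sense strand, which is why the template for transcription is the antisense strand.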
 
  
===Distinction between sense and antisense strands===
+
====Polymerases====
A small proportion of genes in [[prokaryotes]], and more in [[plasmids]] and [[viruses]], blur the distinction made above between sense and antisense strands. Certain sequences of their [[genome|genomes]] do double duty, encoding one protein when read 5' to 3' along one strand, and a second protein when read in the opposite direction (still 5' to 3') along the other strand. As a result, the genomes of these viruses are unusually compact for the number of genes they contain, which biologists view as an [[adaptation (biology)|adaptation]]. This merely confirms that there is no biological distinction between the two strands of the double helix. Typically each strand of a DNA double helix will act as sense and antisense in different regions.
+
Polymerases are enzymes that synthesise polynucleotide chains from [[nucleoside triphosphate]]s. They function by adding nucleotides onto the 3′ [[hydroxyl|hydroxyl group]] of the previous nucleotide in the DNA strand. As a consequence, all polymerases work in a 5′ to 3′ direction.<ref name=Joyce>{{cite journal | author = Joyce C, Steitz T | title = Polymerase structures and function: variations on a theme? | url=http://www.pubmedcentral.nih.gov/picrender.fcgi?artid=177480&blobtype=pdf | journal = J Bacteriol | volume = 177 | issue = 22 | pages = 6321 – 9 | year = 1995 | id = PMID 7592405}}</ref> In the [[active site]] of these enzymes, the nucleoside triphosphate substrate base-pairs to a single-stranded polynucleotide template: this allows polymerases to accurately synthesise the complementary strand of this template. Polymerases are classified according to the type of template that they use.
  
===As viewed by topologists===
+
In [[DNA replication]], a DNA-dependent [[DNA polymerase]] makes a DNA copy of a DNA sequence. Accuracy is vital in this process, so many of these polymerases have a [[Proofreading#Proofreading in biology|proofreading]] activity. Here, the polymerase recognizes the occasional mistakes in the synthesis reaction by the lack of base pairing between the mismatched nucleotides. If a mismatch is detected, a 3′ to 5′ [[exonuclease]] activity is activated and the incorrect base removed.<ref>{{cite journal | author = Hubscher U, Maga G, Spadari S | title = Eukaryotic DNA polymerases | journal = Annu Rev Biochem | volume = 71 | issue = | pages = 133 – 63 | year = | id = PMID 12045093}}</ref> In most organisms DNA polymerases function in a large complex called the [[replisome]] that contains multiple accessory subunits, such as the [[DNA clamp]] or [[helicase]]s.<ref>{{cite journal | author = Johnson A, O'Donnell M | title = Cellular DNA replicases: components and dynamics at the replication fork | journal = Annu Rev Biochem | volume = 74 | issue = | pages = 283 – 315 | year = | id = PMID 15952889}}</ref>
Topologists like to note that the juxtaposition of the [[3′ end]] of one DNA strand beside the [[5′ end]] of the other at both ends of a double-helical segment makes the arrangement a "[[crab canon]]".
 
  
==Single-stranded DNA (ssDNA) and repair of mutations==
+
RNA-dependent DNA polymerases are a specialised class of polymerases that copy the sequence of an RNA strand into DNA. They include [[reverse transcriptase]], which is a [[virus|viral]] enzyme involved in the infection of cells by [[retrovirus]]es, and [[telomerase]], which is required for the replication of [[telomere]]s.<ref>{{cite journal | author = Tarrago-Litvak L, Andréola M, Nevinsky G, Sarih-Cottin L, Litvak S | title = The reverse transcriptase of HIV-1: from enzymology to therapeutic intervention | url=http://www.fasebj.org/cgi/reprint/8/8/497 | journal = FASEB J | volume = 8 | issue = 8 | pages = 497–503 | year = 1994 | id = PMID 7514143}}</ref><ref name=Greider/> Telomerase is an unusual polymerase because it contains its own RNA template as part of its structure.<ref name=Nugent/>
In some [[virus]]es DNA appears in a non-helical, single-stranded form. Because many of the [[DNA repair]] mechanisms of cells work only on paired bases, viruses that carry single-stranded DNA [[genome]]s [[mutation|mutate]] more frequently than they would otherwise. As a result, such species may adapt more rapidly to avoid extinction. The result would not be so favorable in more complicated and more slowly replicating organisms, however, which may explain why only viruses carry single-stranded DNA. These viruses presumably also benefit from the lower cost of replicating one strand versus two.
 
  
==History of DNA research==
+
Transcription is carried out by a DNA-dependent [[RNA polymerase]] that copies the sequence of a DNA strand into RNA. To begin transcribing a gene, the RNA polymerase binds to a sequence of DNA called a [[promoter]] and separates the DNA strands. It then copies the gene sequence into a [[messenger RNA]] transcript until it reaches a region of DNA called the [[terminator (genetics)|terminator]], where it halts and detaches from the DNA. As with human DNA-dependent DNA polymerases, RNA polymerase II, the enzyme that transcribes most of the genes in the human genome, operates as part of a large protein complex with multiple regulatory and accessory subunits.<ref>{{cite journal | author = Martinez E | title = Multi-protein complexes in eukaryotic gene transcription | journal = Plant Mol Biol | volume = 50 | issue = 6 | pages = 925 – 47 | year = 2002 | id = PMID 12516863}}</ref>
[[Image:JamesWatson.jpg|thumb|200px|[[James D. Watson|James Watson]] in the [[Cavendish Laboratory]] at the [[University of Cambridge]]]]
 
The discovery that DNA was the carrier of genetic information was a process that required many earlier discoveries. The existence of DNA was discovered in the mid 19th century. However, it was only in the early 20th century that researchers began suggesting that it might store genetic information. This was only widely accepted after the structure of DNA was elucidated by [[James D. Watson]] and [[Francis Crick]] in their 1953 [[Nature (journal)|''Nature'']] publication. Crick proposed the [[central dogma]] of molecular biology in 1957, describing the process whereby proteins are produced from [[cell nucleus|nuclear]] DNA. In 1962 Watson, Crick, and [[Maurice Wilkins]] jointly received the Nobel Prize for their determination of the structure of DNA. Their model relied critically on the X-ray diffraction work of [[Rosalind Franklin]], notably the radiograph known as Photo Fifty-One. Franklin died of ovarian cancer in 1958, before the prize was awarded, and her contribution received wide recognition only much later. Her exposure to X-ray radiation has been suggested as a possible cause of her cancer.
 
  
===First isolation of DNA===
+
==Genetic recombination==
Working in the 19th century, biochemists initially isolated DNA and RNA (mixed together) from cell nuclei. They were relatively quick to appreciate the polymeric nature of their "nucleic acid" isolates, but realized only later that nucleotides were of two types—one containing [[ribose]] and the other [[deoxyribose]]. It was this subsequent discovery that led to the identification and naming of DNA as a substance distinct from RNA.
+
<div class="thumb tright" style="background-color: #f9f9f9; border: 1px solid #CCCCCC; margin:0.5em;">
 +
{|border="0" width=250px border="0" cellpadding="0" cellspacing="0" style="font-size: 85%; border: 1px solid #CCCCCC; margin: 0.3em;"
 +
|[[Image:Holliday Junction cropped.png|250px]]
 +
|-
 +
|[[Image:Holliday junction coloured.png|250px]]
 +
|}
 +
<div style="border: none; width:250px;"><div class="thumbcaption">Structure of the [[Holliday junction]] intermediate in [[genetic recombination]]. The four separate DNA strands are coloured red, blue, green and yellow.<ref>Created from [http://www.rcsb.org/pdb/explore/explore.do?structureId=1M6G PDB 1M6G]</ref></div></div></div>
 +
{{further|[[Genetic recombination]]}}
 +
[[Image:Chromosomal Recombination.svg|thumb|250px|left|Recombination involves the breakage and rejoining of two chromosomes (M and F) to produce two re-arranged chromosomes (C1 and C2).]]
  
[[Friedrich Miescher]] (1844-1895) discovered a substance he called "nuclein" in 1869. Somewhat later, he isolated a pure sample of the material now known as DNA from the sperm of salmon, and in 1889 his pupil, [[Richard Altmann]], named it  "nucleic acid".  This substance was found to exist only in the chromosomes.
+
A DNA helix does not usually interact with other segments of DNA, and in human cells the different chromosomes even occupy separate areas in the nucleus called "chromosome territories".<ref>{{cite journal | author = Cremer T, Cremer C | title = Chromosome territories, nuclear architecture and gene regulation in mammalian cells | journal = Nat Rev Genet | volume = 2 | issue = 4 | pages = 292–301 | year = 2001 | id = PMID 11283701}}</ref> This physical separation of different chromosomes is important for the ability of DNA to function as a stable repository for information, as one of the few times chromosomes interact is during [[chromosomal crossover]] when they [[genetic recombination|recombine]]. Chromosomal crossover is when two DNA helices break, swap a section and then rejoin.
  
In 1929 [[Phoebus Levene]] at the [[Rockefeller Institute]] identified the components (the four bases, the sugar and the phosphate chain) and he showed that the components of DNA were linked in the order phosphate-sugar-base.  He called each of these units a [[nucleotide]] and suggested the DNA molecule consisted of a string of nucleotide units linked together through the phosphate groups, which are the 'backbone' of the molecule.  However, Levene thought the chain was short and that the bases repeated in the same fixed order.  [[Torbjorn Oskar Caspersson|Torbjorn Caspersson]] and [[Einar Hammarsten]] showed that DNA was a polymer.
+
Recombination allows chromosomes to exchange genetic information and produces new combinations of genes, which increases the efficiency of [[natural selection]] and can be important in the rapid evolution of new proteins.<ref>{{cite journal | author = Pál C, Papp B, Lercher M | title = An integrated view of protein evolution | journal = Nat Rev Genet | volume = 7 | issue = 5 | pages = 337 – 48 | year = 2006 | id = PMID 16619049}}</ref> Genetic recombination can also be involved in DNA repair, particularly in the cell's response to double-strand breaks.<ref>{{cite journal | author = O'Driscoll M, Jeggo P | title = The role of double-strand break repair - insights from human genetics | journal = Nat Rev Genet | volume = 7 | issue = 1 | pages = 45 – 54 | year = 2006 | id = PMID 16369571}}</ref>
  
===Chromosomes and inherited traits===
+
The most common form of chromosomal crossover is [[homologous recombination]], where the two chromosomes involved share very similar sequences. Non-homologous recombination can be damaging to cells, as it can produce [[chromosomal translocation]]s and genetic abnormalities. The recombination reaction is catalyzed by enzymes known as ''recombinases'', such as [[RAD51]].<ref>{{cite journal |author=Vispé S, Defais M |title=Mammalian Rad51 protein: a RecA homologue with pleiotropic functions |journal=Biochimie |volume=79 |issue=9-10 |pages=587-92 |year=1997 |pmid=9466696}}</ref>  The first step in recombination is a double-stranded break either caused by an [[endonuclease]] or damage to the DNA.<ref>{{cite journal |author=Neale MJ, Keeney S |title=Clarifying the mechanics of DNA strand exchange in meiotic recombination |journal=Nature |volume=442 |issue=7099 |pages=153-8 |year=2006 |pmid=16838012}}</ref>  A series of steps catalyzed in part by the recombinase then leads to joining of the two helices by at least one [[Holliday junction]], in which a segment of a single strand in each helix is annealed to the complementary strand in the other helix. The Holliday junction is a tetrahedral junction structure that can be moved along the pair of chromosomes, swapping one strand for another. The recombination reaction is then halted by cleavage of the junction and re-ligation of the released DNA.<ref>{{cite journal | author = Dickman M, Ingleston S, Sedelnikova S, Rafferty J, Lloyd R, Grasby J, Hornby D | title = The RuvABC resolvasome | journal = Eur J Biochem | volume = 269 | issue = 22 | pages = 5492 – 501 | year = 2002 | id = PMID 12423347}}</ref>
[[Max Delbrück]], [[Nikolai V. Timofeeff-Ressovsky]], and [[Karl G. Zimmer]] published results in 1935 suggesting that chromosomes are very large molecules the structure of which can be changed by treatment with [[X-ray]]s, and that by so changing their structure it was possible to change the heritable characteristics governed by those chromosomes.  In 1937 [[William Astbury]] produced the first [[X-ray diffraction]] patterns from DNA.  He was not able to propose the correct structure but the patterns showed that DNA had a regular structure and therefore it might be possible to deduce what this structure was.
 
  
In 1943, [[Oswald Theodore Avery]] and a team of scientists discovered that traits proper to the "smooth" form of the ''Pneumococcus'' could be transferred to the "rough" form of the same bacteria merely by making the killed "smooth" (S) form available to the live "rough" (R) form. Quite unexpectedly, the living R ''Pneumococcus'' bacteria were transformed into a new strain of the S form, and the transferred S characteristics turned out to be heritable. Avery called the medium of transfer of traits the [[transforming principle]]; he identified DNA as the transforming principle, and not [[protein]] as previously thought. In doing so, Avery essentially repeated [[Frederick Griffith]]'s experiment. In 1952, [[Alfred Hershey]] and [[Martha Chase]] performed the [[Hershey-Chase experiment]], which showed, in [[T2 phage]], that DNA is the [[genetic material]] (Hershey later shared the Nobel Prize with Luria).
+
==Evolution of DNA-based metabolism==
 +
DNA contains the genetic information that allows all modern living things to function, grow and reproduce. However, it is unclear how long in the 4-billion-year [[Timeline of evolution|history of life]] DNA has performed this function, as it has been proposed that the earliest forms of life may have used RNA as their genetic material.<ref name=Joyce>{{cite journal |author=Joyce G |title=The antiquity of RNA-based evolution |journal=Nature |volume=418 |issue=6894 |pages=214 – 21 |year=2002 |id=PMID 12110897}}</ref><ref>{{cite journal |author=Orgel L |title=Prebiotic chemistry and the origin of the RNA world | url=http://www.crbmb.com/cgi/reprint/39/2/99.pdf |journal=Crit Rev Biochem Mol Biol |volume=39 |issue=2 |pages=99 – 123 |year= |id=PMID 15217990}}</ref> RNA may have acted as the central part of early cell metabolism as it can both transmit genetic information and carry out [[catalysis]] as part of [[ribozyme]]s.<ref>{{cite journal |author=Davenport R |title=Ribozymes. Making copies in the RNA world |journal=Science |volume=292 |issue=5520 |pages=1278 |year=2001 |pmid=11360970}}</ref> This ancient [[RNA world hypothesis|RNA world]] where nucleic acid would have been used for both catalysis and genetics may have influenced the evolution of the current genetic code based on four nucleotide bases. This would occur since the number of unique bases in such an organism is a trade-off between a small number of bases increasing replication accuracy and a large number of bases increasing the catalytic efficiency of ribozymes.<ref>{{cite journal |author=Szathmáry E |title=What is the optimum size for the genetic alphabet? |url=http://www.pnas.org/cgi/reprint/89/7/2614.pdf |journal=Proc Natl Acad Sci U S A |volume=89 |issue=7 |pages=2614 – 8 |year=1992 |pmid=1372984}}</ref>
  
[[Image:FirstSketchOfDNADoubleHelix.jpg|thumb|200px|[[Francis Crick]]'s first sketch of the [[deoxyribonucleic acid]] double-helix pattern]]
+
Unfortunately, there is no direct evidence of ancient genetic systems, as recovery of DNA from most fossils is impossible. This is because DNA will survive in the environment for less than one million years and slowly degrades into short fragments in solution.<ref>{{cite journal |author=Lindahl T |title=Instability and decay of the primary structure of DNA |journal=Nature |volume=362 |issue=6422 |pages=709 – 15 |year=1993 |id=PMID 8469282}}</ref> Although claims for older DNA have been made, most notably a report of the isolation of a viable bacterium from a salt crystal 250 million years old,<ref>{{cite journal |author=Vreeland R, Rosenzweig W, Powers D |title=Isolation of a 250 million-year-old halotolerant bacterium from a primary salt crystal |journal=Nature |volume=407 |issue=6806 |pages=897 – 900 |year=2000 |id=PMID 11057666}}</ref> these claims are controversial and have been disputed.<ref>{{cite journal |author=Hebsgaard M, Phillips M, Willerslev E |title=Geologically ancient DNA: fact or artefact? |journal=Trends Microbiol |volume=13 |issue=5 |pages=212 – 20 |year=2005 |id=PMID 15866038}}</ref><ref>{{cite journal |author=Nickle D, Learn G, Rain M, Mullins J, Mittler J |title=Curiously modern DNA for a "250 million-year-old" bacterium |journal=J Mol Evol |volume=54 |issue=1 |pages=134 – 7 |year=2002 |id=PMID 11734907}}</ref>
In 1944, the renowned physicist, [[Erwin Schrödinger]], published a brief book entitled ''[[What is Life? (Schrödinger)|What is Life?]]'', where he maintained that chromosomes contained what he called the "hereditary code-script" of life.  He added: "But the term code-script is, of course, too narrow. The chromosome structures are at the same time instrumental in bringing about the development they foreshadow. They are law-code and executive power — or, to use another simile, they are architect's plan and builder's craft in one." He conceived of these dual functional elements as being woven into the molecular structure of chromosomes.  By understanding the exact molecular structure of the chromosomes one could hope to understand both the "architect's plan" and also how that plan was carried out through the "builder's craft."  Three groups took up Schrödinger's challenge to work out the structure of the chromosomes and the question of how the segments of the chromosomes that were conceived to relate to specific traits could possibly do their jobs.
 
  
Just how the presence of specific features in the molecular structure of chromosomes could produce traits and behaviors in living organisms was unimaginable at the time. Because chemical dissection of DNA samples always yielded the same four nucleotides, the chemical composition of DNA appeared simple, perhaps even uniform. Organisms, on the other hand, are fantastically complex individually and widely diverse collectively. Geneticists did not speak of genes as conveyors of "information" in such words, but if they had, they would not have hesitated to quantify the amount of information that genes need to convey as vast. The idea that information might reside in a chemical in the same way that it exists in text—as a finite alphabet of letters arranged in a sequence of unlimited length—had not yet been conceived. It would emerge upon the discovery of DNA's structure, but few researchers imagined that DNA's structure had much to say about genetics.
+
==Uses in technology==
 +
===Genetic engineering===
 +
{{further|[[Molecular biology]] and [[genetic engineering]]}}
  
===Discovery of the structure of DNA===
+
Modern [[biology]] and [[biochemistry]] make intensive use of recombinant DNA technology. [[Recombinant DNA]] is a man-made DNA sequence that has been assembled from other DNA sequences. Such sequences can be [[transformation (genetics)|transformed]] into organisms in the form of [[plasmids]] or, in the appropriate format, by using a [[viral vector]].<ref>{{cite journal |author=Goff SP, Berg P |title=Construction of hybrid viruses containing SV40 and lambda phage DNA segments and their propagation in cultured monkey cells |journal=Cell |volume=9 |issue=4 PT 2 |pages=695–705 |year=1976 |pmid=189942}}</ref> The [[genetic engineering|genetically modified]] organisms produced can be used to make products such as recombinant [[protein]]s, used in medical research,<ref>{{cite journal |author=Houdebine L |title=Transgenic animal models in biomedical research |journal=Methods Mol Biol |volume=360 |issue= |pages=163 – 202 |year= |pmid=17172731}}</ref> or be grown in [[agriculture]].<ref>{{cite journal |author=Daniell H, Dhingra A |title=Multigene engineering: dawn of an exciting new era in biotechnology |journal=Curr Opin Biotechnol |volume=13 |issue=2 |pages=136 – 41 |year=2002 |pmid=11950565}}</ref><ref>{{cite journal |author=Job D |title=Plant biotechnology in agriculture |journal=Biochimie |volume=84 |issue=11 |pages=1105 – 10 |year=2002 |pmid=12595138}}</ref>
In the 1950s, three groups made it their goal to determine the structure of DNA. The first group to start was at [[King's College London]] and was led by [[Maurice Wilkins]] and was later joined by [[Rosalind Franklin]]. Another group consisting of [[Francis Crick]] and [[James D. Watson]] was at [[University of Cambridge|Cambridge]]. A third group was at [[Caltech]] and was led by [[Linus Pauling]].  Crick and Watson built physical models using metal rods and balls, in which they incorporated the known chemical structures of the nucleotides, as well as the known position of the linkages joining one nucleotide to the next along the polymer. At King's College Maurice Wilkins and Rosalind Franklin examined [[crystallography|X-ray diffraction]] patterns of DNA fibers. Of the three groups, only the London group was able to produce good quality diffraction patterns and thus produce sufficient quantitative data about the structure.
 
  
[[Image:DNA-labels.png|thumb|200px|The chemical structure of DNA]]
+
===Forensics ===
 +
{{further|[[Genetic fingerprinting]]}}
  
====Helix structure====
+
[[Forensic science|Forensic scientists]] can use DNA in [[blood]], [[semen]], [[skin]], [[saliva]] or [[hair]] at a crime scene to identify a perpetrator. This process is called [[genetic fingerprinting]], or more accurately, DNA profiling. In DNA profiling, the lengths of variable sections of repetitive DNA, such as [[short tandem repeat]]s and [[minisatellite]]s, are compared between people. This method is usually an extremely reliable technique for identifying a criminal.<ref>{{cite journal | author = Collins A, Morton N | title = Likelihood ratios for DNA identification | url=http://www.pnas.org/cgi/reprint/91/13/6007.pdf | journal = Proc Natl Acad Sci U S A | volume = 91 | issue = 13 | pages = 6007 – 11 | year = 1994 | id = PMID 8016106}}</ref> However, identification can be complicated if the scene is contaminated with DNA from several people.<ref>{{cite journal | author = Weir B, Triggs C, Starling L, Stowell L, Walsh K, Buckleton J | title = Interpreting DNA mixtures | journal = J Forensic Sci | volume = 42 | issue = 2 | pages = 213 – 22 | year = 1997 | id = PMID 9068179}}</ref> DNA profiling was developed in 1984 by British geneticist Sir [[Alec Jeffreys]],<ref>{{cite journal | author = Jeffreys A, Wilson V, Thein S | title = Individual-specific 'fingerprints' of human DNA. | journal = Nature | volume = 316 | issue = 6023 | pages = 76 – 9 | year = | id = PMID 2989708}}</ref> and first used in forensic science to convict Colin Pitchfork in the 1988 [[Enderby murders]] case.<ref>[http://www.forensic.gov.uk/forensic_t/inside/news/list_casefiles.php?case=1 Colin Pitchfork — first murder conviction on DNA evidence also clears the prime suspect] Forensic Science Service Accessed 23 Dec 2006</ref> People convicted of certain types of crimes may be required to provide a sample of DNA for a database. This has helped investigators solve old cases where only a DNA sample was obtained from the scene. 
DNA profiling can also be used to identify victims of mass casualty incidents.<ref>{{cite web |url=http://massfatality.dna.gov/Introduction/ |title=DNA Identification in Mass Fatality Incidents |date=September 2006 |publisher=National Institute of Justice}}</ref>
In 1948 Pauling discovered that many proteins included helical (see [[alpha helix]]) shapes. Pauling had deduced this structure from X-ray patterns. (Pauling was also later to suggest an incorrect three chain helical structure based on Astbury's data.)  Even in the initial diffraction data from DNA by Maurice Wilkins, it was evident that the structure involved helices. But this insight was only a beginning. There remained the questions of how many strands came together, whether this number was the same for every helix, whether the bases pointed toward the helical axis or away, and ultimately what were the explicit angles and coordinates of all the bonds and atoms. Such questions motivated the modeling efforts of Watson and Crick.
 
  
====Complementary nucleotides====
+
===Bioinformatics===
In their modeling, Watson and Crick restricted themselves to what they saw as chemically and biologically reasonable. Still, the breadth of possibilities was very wide. A breakthrough occurred in 1952, when [[Erwin Chargaff]] visited Cambridge and inspired Crick with a description of experiments Chargaff had published in 1947. Chargaff had observed that the proportions of the four nucleotides vary between one DNA sample and the next, but that for particular pairs of nucleotides (adenine and thymine, guanine and cytosine) the two nucleotides are always present in equal proportions.
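Chargaff's observation can be illustrated with a short calculation. The following is a minimal sketch, assuming an arbitrary made-up sequence and hypothetical helper names (<code>reverse_complement</code>, <code>duplex_base_counts</code>) that do not come from any cited source. It counts the bases on both strands of a double-stranded molecule and shows that, once each base is paired with its complement, the counts of adenine and thymine match, as do those of guanine and cytosine.

```python
from collections import Counter

# Complementary partner of each base (A pairs with T, G pairs with C).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the sequence of the opposite strand, read in its own direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def duplex_base_counts(strand: str) -> Counter:
    """Count each base over both strands of the duplex formed by `strand`."""
    return Counter(strand) + Counter(reverse_complement(strand))


# Arbitrary illustrative sequence, not taken from any real organism.
example = "ATGCGGATTACA"
counts = duplex_base_counts(example)

for base in "ATGC":
    print(base, counts[base])
```

A single strand on its own need not show equal proportions; the equality appears only when both strands of the duplex are counted together.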
+
{{further|[[Bioinformatics]]}}
 +
[[Bioinformatics]] involves the manipulation, searching, and [[data mining]] of DNA sequence data. The development of techniques to store and search DNA sequences has led to widely-applied advances in [[computer science]], especially [[string searching algorithm]]s, [[machine learning]] and [[database theory]].<ref>Baldi, Pierre. Brunak, Soren. ''Bioinformatics: The Machine Learning Approach'' MIT Press (2001) ISBN 978-0-262-02506-5</ref> String searching or matching algorithms, which find an occurrence of a sequence of letters inside a larger sequence of letters, were developed to search for specific sequences of nucleotides.<ref>Gusfield, Dan. ''Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology''. Cambridge University Press, 15 January [[1997]]. ISBN 978-0-521-58519-4.</ref> In other applications such as [[text editor]]s, even simple algorithms for this problem usually suffice, but DNA sequences cause these algorithms to exhibit near-worst-case behaviour due to their small number of distinct characters. The related problem of [[sequence alignment]] aims to identify [[homology (biology)|homologous]] sequences and locate the specific [[mutation]]s that make them distinct. These techniques, especially [[multiple sequence alignment]], are used in studying [[phylogenetics|phylogenetic]] relationships and protein function.<ref>{{cite journal | author = Sjölander K | title = Phylogenomic inference of protein molecular function: advances and challenges | url=http://bioinformatics.oxfordjournals.org/cgi/reprint/20/2/170 | journal = Bioinformatics | volume = 20 | issue = 2 | pages = 170-9 | year = 2004 | id = PMID 14734307}}</ref> Data sets representing entire genomes' worth of DNA sequences, such as those produced by the [[Human Genome Project]], are difficult to use without annotations, which label the locations of genes and regulatory elements on each chromosome.
Regions of DNA sequence that have the characteristic patterns associated with protein- or RNA-coding genes can be identified by [[gene finding]] algorithms, which allow researchers to predict the presence of particular [[gene product]]s in an organism even before they have been isolated experimentally.<ref name="Mount">{{cite book|author = Mount DM | title = Bioinformatics: Sequence and Genome Analysis | edition = 2 | publisher = Cold Spring Harbor Laboratory Press | location = Cold Spring Harbor, NY | date = 2004 | isbn = 0879697121}}</ref>
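The string searching problem mentioned above can be sketched in a few lines. The example below is an illustrative implementation of the Knuth-Morris-Pratt algorithm, chosen here only as one well-known linear-time method; the function names and the test sequences are assumptions made for this sketch and are not drawn from the cited texts. Because the algorithm never re-reads a character of the text after a partial match fails, it avoids the slowdown that a naive search can suffer on repetitive, four-letter DNA sequences.

```python
def build_failure(pattern: str) -> list[int]:
    """For each prefix of `pattern`, the length of its longest proper
    prefix that is also a suffix (the KMP failure table)."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def find_all(text: str, pattern: str) -> list[int]:
    """Return the start index of every (possibly overlapping) occurrence
    of `pattern` in `text`."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    hits = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return hits


# Made-up sequences for illustration only.
print(find_all("GATTACAGATTACA", "GATTACA"))
print(find_all("AAAA", "AA"))
```

Production tools for genome-scale data generally rely on indexed structures rather than a single linear scan, so this sketch shows the basic idea and is not a practical search tool.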
  
====Watson and Crick's model====
+
===DNA and computation ===
[[Image:DNA Model Crick-Watson.jpg|thumb|200px|right|Crick and Watson DNA model built in 1953, currently on display at the [[National Science Museum]] in London.]]
+
{{further|[[DNA computing]]}}
 +
DNA was first used in computing to solve a small version of the directed [[Hamiltonian path problem]], an [[NP-complete]] problem.<ref>{{cite journal | author = Adleman L | title = Molecular computation of solutions to combinatorial problems | journal = Science | volume = 266 | issue = 5187 | pages = 1021 – 4 | year = 1994 | id = PMID 7973651}}</ref> [[DNA computing]] has potential advantages over electronic computers in power use, space use, and efficiency, due to its ability to compute in a highly parallel fashion (see [[parallel computing]]). A number of other problems, including simulation of various [[abstract machine]]s, the [[boolean satisfiability problem]], and the bounded version of the [[travelling salesman problem]], have since been analysed using DNA computing.<ref>{{cite journal | author = Parker J | title = Computing with DNA. | url=http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=12524509 | journal = EMBO Rep | volume = 4 | issue = 1 | pages = 7 – 10 | year = 2003 | id = PMID 12524509}}</ref> Due to its compactness, DNA also has a theoretical role in [[cryptography]], where in particular it allows unbreakable [[one-time pad]]s to be efficiently constructed and used.<ref>Ashish Gehani, Thomas LaBean and John Reif. [http://citeseer.ist.psu.edu/gehani99dnabased.html DNA-Based Cryptography]. Proceedings of the 5th DIMACS Workshop on DNA Based Computers, Cambridge, MA, USA, 14 – 15 June 1999.</ref>
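The directed Hamiltonian path problem asks whether a route exists through a directed graph that starts at one given vertex, ends at another, and visits every vertex exactly once. The sketch below is a conventional software analogue of a generate-and-filter approach: it enumerates candidate orderings and keeps those in which every step follows an edge. The graph is a small made-up example rather than the one used in the cited experiment, and the function name <code>hamiltonian_paths</code> is hypothetical.

```python
from itertools import permutations


def hamiltonian_paths(vertices, edges, start, end):
    """Yield every path from `start` to `end` that visits each vertex
    exactly once and only follows directed edges in `edges`."""
    edge_set = set(edges)
    inner = [v for v in vertices if v not in (start, end)]
    # Generate every ordering of the intermediate vertices...
    for middle in permutations(inner):
        path = (start, *middle, end)
        # ...and keep only those in which each consecutive pair is an edge.
        if all((a, b) in edge_set for a, b in zip(path, path[1:])):
            yield path


# Small illustrative directed graph (an assumption for this example).
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)]

paths = list(hamiltonian_paths(vertices, edges, start=0, end=3))
print(paths)
```

On an electronic computer the candidates are checked one after another, so the running time grows factorially with the number of vertices. The appeal of a molecular approach is that very many candidates can be formed and tested in parallel.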
  
[[James D. Watson|Watson]] and [[Francis Crick|Crick]] had begun to contemplate double helical arrangements, but they lacked information about the amount of twist (pitch) and the distance between the two strands. [[Rosalind Franklin]] had to disclose some of her findings for the [[Medical Research Council]] and Crick saw this material through [[Max Perutz|Max Perutz's]] links to the MRC. Franklin's work confirmed a double helix with the backbone on the outside of the molecule and also gave an insight into its symmetry, in particular that the two helical strands ran in opposite directions.
+
===History and anthropology===
 +
{{further|[[Phylogenetics]] and [[Genetic genealogy]]}}
 +
Because DNA collects mutations over time, which are then inherited, it contains historical information and by comparing DNA sequences, geneticists can infer the evolutionary history of organisms, their [[phylogeny]].<ref>{{cite journal | author = Wray G | title = Dating branches on the tree of life using DNA | url=http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=11806830 | journal = Genome Biol | volume = 3 | issue = 1 | pages = REVIEWS0001 | year = 2002 | id = PMID 11806830}}</ref> This field of [[phylogenetics]] is a powerful tool in [[evolutionary biology]]. If DNA sequences within a species are compared, [[population genetics|population geneticists]] can learn the history of particular populations. This can be used in studies ranging from [[ecological genetics]] to [[anthropology]]; for example, DNA evidence is being used to try to identify the [[Ten Lost Tribes of Israel]].<ref>''Lost Tribes of Israel'', [[NOVA (TV series)|NOVA]], PBS airdate: 22 February 2000. Transcript available from [http://www.pbs.org/wgbh/nova/transcripts/2706israel.html PBS.org,] (last accessed on 4 March 2006)</ref><ref>Kleiman, Yaakov. [http://www.aish.com/societywork/sciencenature/the_cohanim_-_dna_connection.asp "The Cohanim/DNA Connection: The fascinating story of how DNA studies confirm an ancient biblical tradition".] ''aish.com'' (January 13, 2000). Accessed 4 March 2006.</ref>
  
Watson and Crick were again greatly assisted by more of Franklin's data. This is controversial because Franklin's critical X-ray pattern was shown to Watson and Crick without Franklin's knowledge or permission. Wilkins showed the famous Photo 51 to Watson at his lab immediately after Watson had been unsuccessful in asking Franklin to collaborate to beat Pauling in finding the structure.
+
DNA has also been used to look at modern family relationships, such as establishing family relationships between the descendants of [[Sally Hemings]] and [[Thomas Jefferson]]. This usage is closely related to the use of DNA in criminal investigations detailed above. Indeed, some criminal investigations have been solved when DNA from crime scenes has matched relatives of the guilty individual.<ref>Bhattacharya, Shaoni. [http://www.newscientist.com/article.ns?id=dn4908 "Killer convicted thanks to relative's DNA".] ''newscientist.com'' (20 April 2004). Accessed 22 Dec 06</ref>
  
From the data in photograph 51, Watson and Crick were able to discern that the distance between the two strands was constant and to measure its value as 2 nanometres. The same photograph also gave them the "pitch" of the helix: 3.4 nanometres per 10 base pairs.
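The pitch figure implies a simple relationship between the number of base pairs and the physical length of a helix. The following is a minimal sketch of that arithmetic, using only the 3.4 nanometre and 10 base pair values given above; the function names and the 1,000 base pair example are assumptions made for illustration, and the calculation treats the helix as perfectly straight and regular.

```python
PITCH_NM = 3.4     # length of one full helical turn, in nanometres
BP_PER_TURN = 10   # base pairs per helical turn


def helix_length_nm(n_bp: int) -> float:
    """Length along the helix axis of an idealised stretch of n_bp base pairs."""
    return n_bp * PITCH_NM / BP_PER_TURN


def helical_turns(n_bp: int) -> float:
    """Number of full helical turns spanned by n_bp base pairs."""
    return n_bp / BP_PER_TURN


# Rise per base pair, and the length of a hypothetical 1,000 bp fragment.
print(helix_length_nm(1))
print(helix_length_nm(1000), helical_turns(1000))
```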
+
==History==
 +
[[Image:Francis Crick.png|thumb|125px|right|[[Francis Crick]]]]
 +
[[Image:JamesDWatson.jpg|thumb|125px|right|[[James D. Watson|James Watson]]]]
  
The final insight came when Crick and Watson saw that a complementary pairing of the bases could provide an explanation for Chargaff's puzzling finding. However, the structure of the bases had been incorrectly guessed in the textbooks as the [[enol]] [[tautomer]] when they were more likely to be in the [[keto]] form. When [[Jerry Donohue]] pointed this fallacy out to Watson, Watson quickly realised that the pairs of adenine and thymine, and guanine and cytosine were almost identical in shape and so would provide equally sized 'rungs' between the two strands. With the base-pairing, Watson and Crick quickly converged upon a model, which they announced before Franklin herself had published any of her work.
+
{{further|[[History of molecular biology]]}}
 +
DNA was first isolated by the [[Switzerland|Swiss]] physician [[Friedrich Miescher]] who, in 1869, discovered a microscopic substance in the [[pus]] of discarded surgical bandages. As it resided in the nuclei of cells, he called it "nuclein".<ref>{{cite journal | author = Dahm R | title = Friedrich Miescher and the discovery of DNA | journal = Dev Biol | volume = 278 | issue = 2 | pages = 274 – 88 | year = 2005 | id = PMID 15680349}}</ref> In 1919 this discovery was followed by [[Phoebus Levene]]'s identification of the base, sugar and phosphate nucleotide unit.<ref>{{cite journal | author = Levene P | title = The structure of yeast nucleic acid | url=http://www.jbc.org/cgi/reprint/40/2/415 | journal = J Biol Chem | volume = 40 | issue = 2 | pages = 415 – 24 | year = 1919}}</ref> Levene suggested that DNA consisted of a string of nucleotide units linked together through the phosphate groups. However, Levene thought the chain was short and the bases repeated in a fixed order. In 1937 [[William Astbury]] produced the first [[X-ray diffraction]] patterns that showed that DNA had a regular structure.<ref>{{cite journal | author = Astbury W | title = Nucleic acid | journal = Symp. Soc. Exp. Biol. | volume = 1 | issue = 66 | year = 1947}}</ref>
  
Franklin was two steps away from the solution.  She had not guessed the base-pairing and had not appreciated the implications of the symmetry that she had described. However, she had been working almost alone and, unlike Crick and Watson, did not have regular contact with a partner or with other experts such as Jerry Donohue. Her notebooks show that she was aware both of Jerry Donohue's work concerning tautomeric forms of bases (she had used the keto forms for three of the bases) and of Chargaff's work.
+
In 1943, [[Oswald Theodore Avery]] discovered that [[trait (biology)|traits]] of the "smooth" form of the ''Pneumococcus'' could be transferred to the "rough" form of the same bacteria by mixing killed "smooth" bacteria with the live "rough" form. Avery identified DNA as this [[transforming principle]].<ref>{{cite journal | author = Avery O, MacLeod C, McCarty M | title = Studies on the chemical nature of the substance inducing transformation of pneumococcal types. Inductions of transformation by a desoxyribonucleic acid fraction isolated from pneumococcus type III | url=http://www.jem.org/cgi/reprint/149/2/297 | journal = J Exp Med | volume = 79 | issue = 2 | pages = 137 – 158 | year = 1944 }}</ref> DNA's role in [[heredity]] was confirmed in 1952, when [[Alfred Hershey]] and [[Martha Chase]] in the [[Hershey-Chase experiment]] showed that DNA is the [[genetic material]] of the [[T2 phage]].<ref>{{cite journal | author = Hershey A, Chase M | title = Independent functions of viral protein and nucleic acid in growth of bacteriophage | url=http://www.jgp.org/cgi/reprint/36/1/39.pdf | journal = J Gen Physiol | volume = 36 | issue = 1 | pages = 39 – 56 | year = 1952 | id = PMID 12981234}}</ref>
  
The disclosure of Franklin's data to Watson has angered some people who believe Franklin did not receive due credit at the time and that she might have discovered the structure on her own before Crick and Watson. In Crick and Watson's famous paper in ''Nature'' in 1953, they said that their work had been stimulated by the work of Wilkins and Franklin, whereas it had been the basis of their work. However, they had agreed with Wilkins and Franklin that they all should publish papers in the same issue of ''Nature'' in support of the proposed structure.
+
In 1953, based on [[Photo 51|X-ray diffraction images]]<ref name=FWPUB>Watson J.D. and Crick F.H.C. [http://www.nature.com/nature/dna50/watsoncrick.pdf "A Structure for Deoxyribose Nucleic Acid".] (PDF) ''Nature'' 171, 737 – 738 (1953). Accessed 13 Feb 2007.</ref> taken by [[Rosalind Franklin]] and the information that the bases were paired, [[James D. Watson]] and [[Francis Crick]] suggested<ref name=FWPUB/> what is now accepted as the first accurate model of [[Molecular structure of Nucleic Acids|DNA structure]] in the journal [[Nature (journal)|''Nature'']].<ref name=Watson/> Experimental evidence for Watson and Crick's model was published in a series of five articles in the same issue of ''Nature''.<ref name=NatureDNA50>Nature Archives [http://www.nature.com/nature/dna50/archive.html Double Helix of DNA: 50 Years]</ref> Of these, [[Rosalind Franklin|Franklin]] and [[Raymond Gosling]]'s paper<ref name=NatFranGos>Molecular Configuration in Sodium Thymonucleate. Franklin R. and Gosling R.G. ''Nature'' 171, 740 – 741 (1953) [http://www.nature.com/nature/dna50/franklingosling.pdf Nature Archives Full Text (PDF)]</ref> contained the X-ray diffraction image<ref>[http://osulibrary.oregonstate.edu/specialcollections/coll/pauling/dna/pictures/franklin-typeBphoto.html Original X-ray diffraction image]</ref> that was key to Watson and Crick's interpretation; another article was co-authored by [[Maurice Wilkins]] and his colleagues.<ref name=NatWilk>Molecular Structure of Deoxypentose Nucleic Acids. Wilkins M.H.F., Stokes A.R. and Wilson H.R. ''Nature'' 171, 738 – 740 (1953) [http://www.nature.com/nature/dna50/wilkins.pdf Nature Archives (PDF)]</ref> Franklin and Gosling's subsequent paper identified the distinctions between the A and B structures of the double helix in DNA.<ref name=NatFrankGos2>Evidence for 2-Chain Helix in Crystalline Structure of Sodium Deoxyribonucleate. Franklin R. and Gosling R.G. ''Nature'' 172, 156 – 157 (1953) [http://www.nature.com/nature/dna50/franklingosling2.pdf Nature Archives, full text (PDF)]</ref> In 1962 Watson, Crick, and [[Maurice Wilkins]] jointly received the [[Nobel Prize]] in [[Nobel Prize in Physiology or Medicine|Physiology or Medicine]] (Franklin did not share the prize with them since she had died earlier).<ref>[http://nobelprize.org/nobel_prizes/medicine/laureates/1962/ The Nobel Prize in Physiology or Medicine 1962] Nobelprize.org Accessed 22 Dec 06</ref>
  
===="Central Dogma"====
+
In an influential presentation in 1957, Crick laid out the [[central dogma of molecular biology|"Central Dogma" of molecular biology]], which foretold the relationship between DNA, RNA, and proteins, and articulated the "adaptor hypothesis".<ref>Crick, F.H.C. [http://genome.wellcome.ac.uk/assets/wtx030893.pdf On degenerate templates and the adaptor hypothesis (PDF).] genome.wellcome.ac.uk (Lecture, 1955). Accessed 22 Dec 2006</ref> Final confirmation of the replication mechanism that was implied by the double-helical structure followed in 1958 through the [[Meselson-Stahl experiment]].<ref>{{cite journal | author = Meselson M, Stahl F | title = The replication of DNA in ''Escherichia coli'' | journal = Proc Natl Acad Sci U S A | volume = 44 | issue = 7 | pages = 671 – 82 | year = 1958 | id = PMID 16590258}}</ref> Further work by Crick and coworkers showed that the genetic code was based on non-overlapping triplets of bases, called codons, allowing [[Har Gobind Khorana]], [[Robert W. Holley]] and [[Marshall Warren Nirenberg]] to decipher the [[genetic code]].<ref>[http://nobelprize.org/nobel_prizes/medicine/laureates/1968/ The Nobel Prize in Physiology or Medicine 1968] Nobelprize.org Accessed 22 Dec 06</ref> These findings represent the birth of [[molecular biology]].
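The idea of non-overlapping triplets can be made concrete with a short example. The sketch below splits a sequence into successive codons starting from a chosen reading frame; the function name <code>codons</code> and the sample sequence are assumptions made for illustration. Each base belongs to exactly one codon, and shifting the starting position by one base produces an entirely different set of triplets.

```python
def codons(seq: str, frame: int = 0) -> list[str]:
    """Split `seq` into non-overlapping triplets, starting at offset `frame`
    (0, 1 or 2). Trailing bases that do not fill a triplet are dropped."""
    usable = len(seq) - (len(seq) - frame) % 3
    return [seq[i:i + 3] for i in range(frame, usable, 3)]


# Made-up sequence for illustration only.
seq = "ATGGCCTAAG"
print(codons(seq))           # reading frame starting at the first base
print(codons(seq, frame=1))  # the same bases read one position later
```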
Watson and Crick's model attracted great interest immediately upon its presentation. Arriving at their conclusion on [[February 21]] [[1953]], Watson and Crick made their first announcement on [[February 28]]. Their paper ''A Structure for Deoxyribose Nucleic Acid''<ref>Watson and Crick, 1953</ref> was published on April 25. In an influential presentation in 1957, Crick laid out the "[[Central Dogma]]", which foretold the relationship between DNA, RNA, and proteins, and articulated the "sequence hypothesis." A critical confirmation of the replication mechanism that was implied by the double-helical structure followed in 1958 in the form of the [[Meselson-Stahl experiment]]. Work by Crick and coworkers showed that the genetic code was based on non-overlapping triplets of bases, called codons, and [[Har Gobind Khorana]] and others deciphered the [[genetic code]] not long afterward. These findings represent the birth of [[molecular biology]].
 
  
[[James D. Watson|Watson]], [[Francis Crick|Crick]], and [[Maurice Wilkins|Wilkins]] were awarded the 1962 [[Nobel Prize for Physiology or Medicine]] for discovering the molecular structure of DNA, by which time [[Rosalind Franklin|Franklin]] had died from cancer at 37. Nobel Prizes are not awarded posthumously. Had she lived, the decision over whom to award the prize would have been complicated, as a prize can be shared between a maximum of three people. Because their work could be considered to be chemistry, it is conceivable that [[Maurice Wilkins|Wilkins]] and [[Rosalind Franklin|Franklin]] could have been awarded the [[Nobel Prize for Chemistry]] instead; see Graeme Hunter's biography of Sir Lawrence Bragg for more information on how scientists were nominated for Nobel Prizes.
+
==See also==
 +
* [[Genetic disorder]]
 +
* [[Plasmid]]
 +
* [[DNA sequencing]]
 +
* [[Southern blot]]
 +
* [[DNA microarray]]
 +
* [[Polymerase chain reaction]]
 +
* [[Protein-DNA interaction site predictor]]
 +
* [[Phosphoramidite]]
 +
* [[Quantification of nucleic acids]]
 +
* [[Guanidium thiocyanate-phenol-chloroform extraction]]
  
 
==References==
 
===Citations===
+
<div class="references-small" style="-moz-column-count:2; column-count:2;">
 
<references/>
 
 +
</div>
  
===General references===
+
==Further reading==
 +
* Clayton, Julie. (Ed.). ''50 Years of DNA'', Palgrave MacMillan Press, 2003. ISBN 978-1-40-391479-8
 +
* Judson, Horace Freeland. ''The Eighth Day of Creation: Makers of the Revolution in Biology'', Cold Spring Harbor Laboratory Press, 1996. ISBN 978-0-87-969478-4
 +
* [[Robert Olby|Olby, Robert]]. ''The Path to The Double Helix: Discovery of DNA'', first published in October 1974 by MacMillan, with foreword by Francis Crick; ISBN 978-0-48-668117-7; the definitive DNA textbook, revised in 1994, with a 9 page postscript.
 +
* [[Matt Ridley|Ridley, Matt]]. ''Francis Crick: Discoverer of the Genetic Code (Eminent Lives)'' HarperCollins Publishers; 192 pp, ISBN 978-0-06-082333-7 [[2006]]
 +
* Rose, Steven. ''The Chemistry of Life'', Penguin, ISBN 978-0-14-027273-4.
 +
* Watson, James D. and Francis H.C. Crick. [http://www.nature.com/nature/dna50/watsoncrick.pdf A structure for Deoxyribose Nucleic Acid] (PDF). ''[[Nature (journal)|Nature]]'' 171, 737 – 738, [[25 April]] [[1953]].
 +
* Watson, James D. ''DNA: The Secret of Life'' ISBN 978-0-375-41546-3.
 +
* Watson, James D. ''[[The Double Helix|The Double Helix: A Personal Account of the Discovery of the Structure of DNA (Norton Critical Editions)]]''. ISBN 978-0-393-95075-5
 +
* Watson, James D. ''Avoid Boring People: Lessons from a Life in Science''. New York: Random House, [[2007]]. 366 pp. ISBN 978-0-375-41284-4 (0-375-41284-0)
 +
* Calladine, Chris R.; Drew, Horace R.; Luisi, Ben F. and Travers, Andrew A. ''Understanding DNA'', Elsevier Academic Press, 2003. ISBN 978-0-12155089-9
  
* Watson, James D. and Francis H.C. Crick. [http://www.nature.com/nature/dna50/watsoncrick.pdf A structure for Deoxyribose Nucleic Acid] (PDF). ''[[Nature (journal)|Nature]]'' 171, 737&ndash;738, [[25 April]] [[1953]].
+
==DVD==
* Watson, James D. ''DNA: The Secret of Life'' ISBN 0375415467.
+
* ''[http://www.windfallfilms.com/html/productions/DNA.htm DNA — The Story of the Pioneers who Changed the World,]'' Windfall Films Production for [http://www.channel4.com/science/microsites/D/dna_thestoryoflife/ Channel Four Television] & [http://www.pbs.org/wnet/dna/ PBS Thirteen-WNET] — 2003, PAL [http://www.ncbe.reading.ac.uk/DNA50/documentaries.html], NTSC [http://www.shoppbs.org/searchHandler/index.jsp?searchId=20744726972&keywords=dna&view=all PBS Shop]
* Watson, James D. [[The Double Helix|The Double Helix: A Personal Account of the Discovery of the Structure of DNA (Norton Critical Editions)]]. ISBN 0393950751
+
* ''[http://www.dnai.org/feature/dnai_dvd.html DNA interactive]'' PAL [http://www.ncbe.reading.ac.uk/DNA50/interactivepal.html], NTSC [http://www.ncbe.reading.ac.uk/DNA50/interactiventsc.html], [http://www.scienceinschool.org/2006/issue1/dnainteractive/]
* Chomet, S. (Ed.), DNA Genesis of a Discovery, ''Newman-Hemisphere Press, London, 1994.
+
* ''[http://www.carolina.com/biotech/DNA_secret.asp DNA: The Secret of Life]'' Carolina Biological
* Delmonte, C.S. and Mann, L.R.B. [http://www.ias.ac.in/currsci/dec102003/1564.pdf Variety in DNA secondary structure]. Current Science, 85 (11), 1564&ndash;1570, 10 December 2003.
+
* ''[http://shop.wgbh.org/webapp/wcs/stores/servlet/ProductDisplay?productId=51808&storeId=11051&catalogId=10051&langId=-1 DNA — Secret of Photo 51]'' Rosalind Franklin — NOVA documentary (NTSC — Region 1?)
*Miller, Kenneth R., and Levin, Joseph. ''Biology''. Upper Saddle River, New Jersey: Prentice Hall, 2002.
+
* ''[http://shop.wgbh.org/webapp/wcs/stores/servlet/ProductDisplay?productId=18308&storeId=11051&catalogId=10051&langId=-1 Cracking the Code of Life]'' NOVA documentary (NTSC — All Regions)
  
<!Not sure if we need this long document as a reference ? If yes, please provide a reference; the PDF looks unpublished.
+
==External links==
* Delmonte, C. S., http://www.notahelix.com/delmonte/new_struct_mol_biol.pdf
+
{{portalpar|Molecular and Cellular Biology|Portal.svg}}
—>
+
{{Spoken Wikipedia|dna.ogg|2007-02-12}}
 +
{{commonscat|DNA}}
 +
* [http://orpheus.ucsd.edu/speccoll/testing/html/mss0660a.html#abstract] Crick's personal papers at Mandeville Special Collections Library, Geisel Library, University of California, San Diego
 +
* [http://www.dnai.org/ DNA Interactive] (requires [[Adobe Flash]])
 +
* [http://www.dnaftb.org/dnaftb/ DNA from the beginning]
 +
* [http://www.ncbe.reading.ac.uk/DNA50/ Double Helix 1953 – 2003] National Centre for Biotechnology Education
 +
* [http://www.nature.com/nature/dna50/archive.html Double helix: 50 years of DNA], ''[[Nature (journal)|Nature]]''
 +
* [http://mason.gmu.edu/~emoody/rfranklin.html Rosalind Franklin's contributions to the study of DNA]
 +
* [http://www.genome.gov/10506367 U.S. National DNA Day] — watch videos and participate in real-time chat with top scientists
 +
* [http://www.genome.gov/10506718 Genetic Education Modules for Teachers] — ''DNA from the Beginning'' Study Guide
 +
* [http://www.bbc.co.uk/bbcfour/audiointerviews/profilepages/crickwatson1.shtml Listen to Francis Crick and James Watson talking on the BBC in 1962, 1972, and 1974]
 +
* {{PDB Molecule of the Month|pdb23_1}}
 +
* [http://www.fidelitysystems.com/Unlinked_DNA.html DNA under electron microscope]
 +
* {{dmoz|Science/Biology/Biochemistry_and_Molecular_Biology/Biomolecules/Nucleic_Acids/DNA/|DNA}}
 +
* [http://dnawiz.com/ DNA Articles] — articles and information collected from various sources
 +
* {{McGrawHillAnimation|genetics|Dna%20Replication}}
 +
* [http://biostudio.com/c_%20education%20mac.htm DNA coiling to form chromosomes]
 +
* [http://pipe.scs.fsu.edu/displar.html DISPLAR: DNA binding site prediction on protein]
 +
* [http://www.dnalc.org/ Dolan DNA Learning Center]
 +
* [[Robert Olby|Olby, R.]] (2003) [http://chem-faculty.ucsd.edu/joseph/CHEM13/DNA1.pdf "Quiet debut for the double helix"] ''Nature'' '''421''' (January 23): 402 – 405.
 +
*[http://www.blackwellpublishing.com/trun/artwork/Animations/cloningexp/cloningexp.html Basic animated guide to DNA cloning]
 +
* [http://nobelprize.org/educational_games/medicine/dna_double_helix/ DNA the Double Helix Game] From the official Nobel Prize web site
  
==External links==
+
{{Nucleic acids}}
*[http://www.dnahack.com/index.html DNA hack: The website for Amateur Genetic Engineering]
 
*[http://www.packer34.freeserve.co.uk/selectedTATAwebsites.htm First press stories on DNA]
 
*[http://en.wikipedia.org/wiki/Image:Rosalindfranklinsjokecard.jpg 'Death' of DNA Helix (Crystaline) joke funeral card].
 
*[http://www.nature.com/nature/dna50/archive.html Double helix: 50 years of DNA], [[Nature (journal)|Nature]].
 
*[http://www.genome.gov/10506367 U.S. National DNA Day] Watch videos and participate in real-time chat with top scientists
 
*[http://www.genome.gov/10506718 Genetic Education Modules for Teachers] ''DNA from the Beginning'' Study Guide
 
*[http://www.genome.gov/glossary.cfm Talking Glossary of Genetic Terms] In Spanish, too
 
*[http://osulibrary.oregonstate.edu/specialcollections/coll/pauling/dna/index.html Linus Pauling and the Race for DNA]
 
*Listen to Francis Crick and James Watson talking on the BBC in 1962, 1972, and 1974:http://www.bbc.co.uk/bbcfour/audiointerviews/profilepages/crickwatson1.shtml
 
*[http://news.bbc.co.uk/1/hi/sci/tech/2949629.stm 17 April, 2003, BBC News: Most ancient DNA ever?]
 
*[http://www.whatsnextnetwork.com/health/index.php?cat=61 Latest Advances In Gene Research]
 
*[http://www.dna-research.org DNA Research News]
 
*[http://www.dnai.org DNA Interactive] (requires [[Macromedia Flash]])
 
*[http://3dscience.com/3d_dna_models.asp Free 3d DNA model Images]
 
*[http://nist.rcsb.org/pdb/molecules/pdb23_1.html DNA: PDB molecule of the month]
 
*[http://www.fidelitysystems.com/Unlinked_DNA.html DNA under electron microscope]
 
*[http://www.ccrnp.ncifcrf.gov/~toms/LeftHanded.DNA.html Left-handed DNA Hall of Fame]
 
*[http://www.myfirstbookaboutdna.com My First Book About DNA] Designed for children to learn more about DNA.
 
*{{dmoz|Science/Biology/Biochemistry_and_Molecular_Biology/Biomolecules/Nucleic_Acids/|Nucleic Acids}}
 
*[http://www.zytologie-online.net/dna.php DNA Replication and Translation / Cell Biology]
 
  
{{credit|51121850}}
+
{{credit|133247152}}
 
[[Category:Life sciences]]
 
[[Category:Life sciences]]

Revision as of 20:49, 24 May 2007

The structure of part of a DNA double helix

Deoxyribonucleic acid, or DNA, is a nucleic acid molecule that contains the genetic instructions used in the development and functioning of all living organisms. The main role of DNA is the long-term storage of information, and it is often compared to a set of blueprints, since DNA contains the instructions needed to construct other components of cells, such as proteins and RNA molecules. The DNA segments that carry this genetic information are called genes, but other DNA sequences have structural purposes, or are involved in regulating the use of this genetic information.

Chemically, DNA is a long polymer of simple units called nucleotides, with a backbone made of sugars and phosphate groups joined by ester bonds. Attached to each sugar is one of four types of molecules called bases. It is the sequence of these four bases along the backbone that encodes information. This information is read using the genetic code, which specifies the sequence of the amino acids within proteins. The code is read by copying stretches of DNA into the related nucleic acid RNA, in a process called transcription. Most of these RNA molecules are used to synthesize proteins, but others are used directly in structures such as ribosomes and spliceosomes.

Within cells, DNA is organized into structures called chromosomes, and the set of chromosomes within a cell makes up a genome. These chromosomes are duplicated before cells divide, in a process called DNA replication. Eukaryotic organisms such as animals, plants, and fungi store their DNA inside the cell nucleus, while in prokaryotes such as bacteria it is found in the cell's cytoplasm. Within the chromosomes, chromatin proteins such as histones compact and organize DNA, which helps control its interactions with other proteins and thereby control which genes are transcribed.

Physical and chemical properties

The chemical structure of DNA.

DNA is a long polymer made from repeating units called nucleotides.[1][2] The DNA chain is 22 to 24 Ångströms wide (2.2 to 2.4 nanometres), and one nucleotide unit is 3.3 Ångströms (0.33 nanometres) long.[3] Although each individual repeating unit is very small, DNA polymers can be enormous molecules containing millions of nucleotides. For instance, the largest human chromosome, chromosome number 1, is 220 million base pairs long.[4]
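
As a rough worked example of the scale involved, the two figures quoted above (0.33 nanometres per nucleotide unit and 220 million base pairs for chromosome 1) can be multiplied out. The sketch below is illustrative arithmetic only: the function name is invented here, and it assumes an idealized, fully extended double helix rather than DNA as it is actually packed in a cell.

```python
# Illustrative arithmetic only: contour length of an idealized, fully
# extended double helix, using the per-nucleotide length quoted in the text.
RISE_NM_PER_BASE_PAIR = 0.33  # nanometres per nucleotide unit (from the text)


def contour_length_mm(base_pairs: int) -> float:
    """Return the end-to-end length, in millimetres, of `base_pairs` of DNA."""
    length_nm = base_pairs * RISE_NM_PER_BASE_PAIR
    return length_nm / 1_000_000  # 1 mm = 1,000,000 nm


if __name__ == "__main__":
    # Chromosome 1 at the 220 million base pairs quoted above.
    print(f"{contour_length_mm(220_000_000):.1f} mm")
```

Under these assumptions the molecule would be several centimetres long while remaining only about 2 nanometres wide.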

In living organisms, DNA does not usually exist as a single molecule, but instead as a tightly-associated pair of molecules.[5][6] These two long strands entwine like vines, in the shape of a double helix. The nucleotide repeats contain both the segment of the backbone of the molecule, which holds the chain together, and a base, which interacts with the other DNA strand in the helix. In general, a base linked to a sugar is called a nucleoside and a base linked to a sugar and one or more phosphate groups is called a nucleotide. If multiple nucleotides are linked together, as in DNA, this polymer is referred to as a polynucleotide.[7]

The backbone of the DNA strand is made from alternating phosphate and sugar residues.[8] The sugar in DNA is 2-deoxyribose, which is a pentose (five carbon) sugar. The sugars are joined together by phosphate groups that form phosphodiester bonds between the third and fifth carbon atoms of adjacent sugar rings. These asymmetric bonds mean a strand of DNA has a direction. In a double helix the direction of the nucleotides in one strand is opposite to their direction in the other strand. This arrangement of DNA strands is called antiparallel. The asymmetric ends of a strand of DNA bases are referred to as the 5′ (five prime) and 3′ (three prime) ends. One of the major differences between DNA and RNA is the sugar, with 2-deoxyribose being replaced by the alternative pentose sugar ribose in RNA.[6]

The DNA double helix is stabilized by hydrogen bonds between the bases attached to the two strands. The four bases found in DNA are adenine (abbreviated A), cytosine (C), guanine (G) and thymine (T). These four bases are shown below and are attached to the sugar/phosphate to form the complete nucleotide, as shown for adenosine monophosphate.

These bases are classified into two types; adenine and guanine are fused five- and six-membered heterocyclic compounds called purines, while cytosine and thymine are six-membered rings called pyrimidines.[7] A fifth base, the pyrimidine uracil (U), usually takes the place of thymine in RNA and differs from thymine by lacking a methyl group on its ring. Uracil is normally only found in DNA as a breakdown product of cytosine, but a very rare exception to this rule is a bacterial virus called PBS1 that contains uracil in its DNA.[9] In contrast, following synthesis of certain RNA molecules, a significant number of the uracils are converted to thymines by the enzymatic addition of the missing methyl group. This occurs mostly on structural and enzymatic RNAs like transfer RNAs and ribosomal RNA.[10]

Animation of the structure of a section of DNA. The bases lie horizontally between the two spiraling strands. Large version[11]

The double helix is a right-handed spiral. As the DNA strands wind around each other, they leave gaps between each set of phosphate backbones, revealing the sides of the bases inside (see animation). There are two of these grooves twisting around the surface of the double helix: one groove, the major groove, is 22 Å wide and the other, the minor groove, is 12 Å wide.[12] The narrowness of the minor groove means that the edges of the bases are more accessible in the major groove. As a result, proteins like transcription factors that can bind to specific sequences in double-stranded DNA usually make contacts to the sides of the bases exposed in the major groove.[13]

At top, a GC base pair with three hydrogen bonds. At the bottom, an AT base pair with two hydrogen bonds. Hydrogen bonds are shown as dashed lines.

Base pairing

Further information: Base pair

Each type of base on one strand forms a bond with just one type of base on the other strand. This is called complementary base pairing. Here, purines form hydrogen bonds to pyrimidines, with A bonding only to T, and C bonding only to G. This arrangement of two nucleotides binding together across the double helix is called a base pair. In a double helix, the two strands are also held together via forces generated by the hydrophobic effect and pi stacking, which are not influenced by the sequence of the DNA.[14] As hydrogen bonds are not covalent, they can be broken and rejoined relatively easily. The two strands of DNA in a double helix can therefore be pulled apart like a zipper, either by a mechanical force or high temperature.[15] As a result of this complementarity, all the information in the double-stranded sequence of a DNA helix is duplicated on each strand, which is vital in DNA replication. Indeed, this reversible and specific interaction between complementary base pairs is critical for all the functions of DNA in living organisms.[1]
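
The pairing rule described above is simple enough to express as a lookup table. The following minimal sketch (function names are chosen here for illustration) derives the partner strand from a given strand; it assumes the input contains only the letters A, C, G and T.

```python
# Watson-Crick pairing as stated in the text: A pairs with T, C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base that pairs with each position of `strand`, in the same order."""
    return "".join(PAIRING[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand written 5'->3'.

    The two strands run in opposite directions, so the complement is reversed.
    """
    return complement(strand)[::-1]


if __name__ == "__main__":
    print(complement("ATGC"))          # position-by-position partners
    print(reverse_complement("ATGC"))  # partner strand read 5'->3'
```

Applying `reverse_complement` twice returns the original sequence, which reflects the point made above that each strand carries the full information of the double helix.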

The two types of base pairs form different numbers of hydrogen bonds, AT forming two hydrogen bonds, and GC forming three hydrogen bonds (see figures, left). The GC base pair is therefore stronger than the AT base pair. As a result, it is both the percentage of GC base pairs and the overall length of a DNA double helix that determine the strength of the association between the two strands of DNA. Long DNA helices with a high GC content have stronger-interacting strands, while short helices with high AT content have weaker-interacting strands.[16] Parts of the DNA double helix that need to separate easily, such as the TATAAT Pribnow box in bacterial promoters, tend to have sequences with a high AT content, making the strands easier to pull apart.[17] In the laboratory, the strength of this interaction can be measured by finding the temperature required to break the hydrogen bonds, their melting temperature (also called Tm value). When all the base pairs in a DNA double helix melt, the strands separate and exist in solution as two entirely independent molecules. These single-stranded DNA molecules have no single common shape, but some conformations are more stable than others.[18]
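
The relationship between base composition and strand association can be illustrated by counting hydrogen bonds, using the two-per-AT and three-per-GC figures given above. This is a deliberately crude sketch with invented function names: the bond count is only a proxy, and it does not predict a melting temperature, which also depends on base stacking and solution conditions.

```python
def gc_fraction(strand: str) -> float:
    """Return the fraction of bases in `strand` that are G or C."""
    strand = strand.upper()
    if not strand:
        raise ValueError("sequence must not be empty")
    return (strand.count("G") + strand.count("C")) / len(strand)


def hydrogen_bonds(strand: str) -> int:
    """Count base-pair hydrogen bonds in a duplex formed by `strand` and its partner.

    Uses the figures from the text: 2 bonds per AT pair, 3 bonds per GC pair.
    """
    strand = strand.upper()
    gc_pairs = strand.count("G") + strand.count("C")
    at_pairs = strand.count("A") + strand.count("T")
    return 3 * gc_pairs + 2 * at_pairs


if __name__ == "__main__":
    # The Pribnow box mentioned above is entirely AT.
    print(gc_fraction("TATAAT"), hydrogen_bonds("TATAAT"))
    # A GC-only stretch of the same length has more hydrogen bonds.
    print(gc_fraction("GCGCGC"), hydrogen_bonds("GCGCGC"))
```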

Sense and antisense

Further information: Sense (molecular biology)

A DNA sequence is called "sense" if its sequence is the same as that of a messenger RNA (mRNA) copy that is translated into protein. The sequence on the opposite strand is complementary to the sense sequence and is therefore called the "antisense" sequence. Since RNA polymerases work by making a complementary copy of their templates, it is this antisense strand that is the template for producing the sense mRNA. Both sense and antisense sequences can exist on different parts of the same strand of DNA (i.e. both strands contain both sense and antisense sequences). In both prokaryotes and eukaryotes, antisense RNA sequences are produced, but the functions of these RNAs are not entirely clear.[19] One proposal is that antisense RNAs are involved in regulating gene expression through RNA-RNA base pairing.[20]
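
The relationship between the sense strand, the antisense template and the mRNA can be sketched in a few lines. The function name and example sequence below are illustrative assumptions; the code models only the base-pairing logic, not the polymerase itself.

```python
# Pairing of a DNA template base with the RNA base placed opposite it.
# RNA uses uracil (U) where DNA would use thymine (T).
DNA_TO_RNA_PAIRING = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_5_to_3: str) -> str:
    """Return the mRNA (5'->3') copied from an antisense template given 5'->3'.

    The template is read in the 3'->5' direction, so it is walked in reverse.
    """
    return "".join(
        DNA_TO_RNA_PAIRING[base] for base in reversed(template_5_to_3.upper())
    )


if __name__ == "__main__":
    sense = "ATGGCC"      # hypothetical sense sequence, 5'->3'
    antisense = "GGCCAT"  # its complementary strand, also written 5'->3'
    mrna = transcribe(antisense)
    # The mRNA matches the sense sequence, with U in place of T.
    print(mrna, mrna == sense.replace("T", "U"))
```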

A few DNA sequences in prokaryotes and eukaryotes, and more in plasmids and viruses, blur the distinction made above between sense and antisense strands by having overlapping genes.[21] In these cases, some DNA sequences do double duty, encoding one protein when read 5′ to 3′ along one strand, and a second protein when read in the opposite direction (still 5′ to 3′) along the other strand. In bacteria, this overlap may be involved in the regulation of gene transcription,[22] while in viruses, overlapping genes increase the amount of information that can be encoded within the small viral genome.[23] Another way of reducing genome size is seen in some viruses that contain linear or circular single-stranded DNA as their genetic material.[24][25]

Supercoiling

Further information: DNA supercoil

DNA can be twisted like a rope in a process called DNA supercoiling. With DNA in its "relaxed" state, a strand usually circles the axis of the double helix once every 10.4 base pairs, but if the DNA is twisted the strands become more tightly or more loosely wound.[26] If the DNA is twisted in the direction of the helix, this is positive supercoiling, and the bases are held more tightly together. If they are twisted in the opposite direction, this is negative supercoiling, and the bases come apart more easily. In nature, most DNA has slight negative supercoiling that is introduced by enzymes called topoisomerases.[27] These enzymes are also needed to relieve the twisting stresses introduced into DNA strands during processes such as transcription and DNA replication.[28]
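
The 10.4 base pairs per turn quoted above gives a simple way to estimate how many times the strands wind around each other in relaxed DNA, and to label a molecule as over- or underwound. The sketch below is a simplification with invented names: it treats the whole difference from the relaxed value as a change in winding and ignores the three-dimensional coiling of the helix axis.

```python
RELAXED_BP_PER_TURN = 10.4  # base pairs per helical turn in relaxed DNA (from the text)


def relaxed_turns(base_pairs: int) -> float:
    """Return the number of helical turns expected for relaxed DNA of this length."""
    return base_pairs / RELAXED_BP_PER_TURN


def supercoiling_sign(base_pairs: int, observed_turns: float, tolerance: float = 1e-9) -> str:
    """Classify DNA as 'positive' (overwound), 'negative' (underwound) or 'relaxed'."""
    excess = observed_turns - relaxed_turns(base_pairs)
    if excess > tolerance:
        return "positive"
    if excess < -tolerance:
        return "negative"
    return "relaxed"


if __name__ == "__main__":
    # A hypothetical 5,200 base pair molecule with fewer turns than expected.
    print(relaxed_turns(5200))
    print(supercoiling_sign(5200, 475))
```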

From left to right, the structures of A, B and Z DNA

Alternative double-helical structures

Further information: Mechanical properties of DNA

DNA exists in several possible conformations. The conformations so far identified are: A-DNA, B-DNA, C-DNA, D-DNA,[29] E-DNA,[30] H-DNA,[31] L-DNA,[29] P-DNA,[32] and Z-DNA.[8][33] However, only A-DNA, B-DNA, and Z-DNA have been observed in naturally occurring biological systems. Which conformation DNA adopts depends on the sequence of the DNA, the amount and direction of supercoiling, chemical modifications of the bases and also solution conditions, such as the concentration of metal ions and polyamines.[34] Of these three conformations, the "B" form described above is most common under the conditions found in cells.[35] The two alternative double-helical forms of DNA differ in their geometry and dimensions.

The A form is a wider right-handed spiral, with a shallow and wide minor groove and a narrower and deeper major groove. The A form occurs under non-physiological conditions in dehydrated samples of DNA, while in the cell it may be produced in hybrid pairings of DNA and RNA strands, as well as in enzyme-DNA complexes.[36][37] Segments of DNA where the bases have been chemically-modified by methylation may undergo a larger change in conformation and adopt the Z form. Here, the strands turn about the helical axis in a left-handed spiral, the opposite of the more common B form.[38] These unusual structures can be recognised by specific Z-DNA binding proteins and may be involved in the regulation of transcription.[39]

Structure of a DNA quadruplex formed by telomere repeats.[40]

Quadruplex structures

At the ends of the linear chromosomes are specialized regions of DNA called telomeres. The main function of these regions is to allow the cell to replicate chromosome ends using the enzyme telomerase, as the enzymes that normally replicate DNA cannot copy the extreme 3′ ends of chromosomes.[41] As a result, if a chromosome lacked telomeres it would become shorter each time it was replicated. These specialized chromosome caps also help protect the DNA ends from exonucleases and stop the DNA repair systems in the cell from treating them as damage to be corrected.[42] In human cells, telomeres are usually lengths of single-stranded DNA containing several thousand repeats of a simple TTAGGG sequence.[43]

These guanine-rich sequences may stabilize chromosome ends by forming very unusual structures of stacked sets of four-base units, rather than the usual base pairs found in other DNA molecules. Here, four guanine bases form a flat plate and these flat four-base units then stack on top of each other, to form a stable quadruplex structure.[44] These structures are stabilized by hydrogen bonding between the edges of the bases and chelation of a metal ion in the centre of each four-base unit. The structure shown to the left is a top view of the quadruplex formed by a DNA sequence found in human telomere repeats. The single DNA strand forms a loop, with the sets of four bases stacking in a central quadruplex three plates deep. In the space at the centre of the stacked bases are three chelated potassium ions.[45] Other structures can also be formed, with the central set of four bases coming from either a single strand folded around the bases, or several different parallel strands, each contributing one base to the central structure.

In addition to these stacked structures, telomeres also form large loop structures called telomere loops, or T-loops. Here, the single-stranded DNA curls around in a long circle stabilized by telomere-binding proteins.[46] At the very end of the T-loop, the single-stranded telomere DNA is held onto a region of double-stranded DNA by the telomere strand disrupting the double-helical DNA and base pairing to one of the two strands. This triple-stranded structure is called a displacement loop or D-loop.[44]

Chemical modifications

Structures of cytosine, 5-methylcytosine and thymine. Cytosine is shown with and without the 5-methyl group. After deamination, 5-methylcytosine has the same structure as thymine.

Base modifications

Further information: DNA methylation

The expression of genes is influenced by the chromatin structure of a chromosome, and regions of heterochromatin (low or no gene expression) correlate with the methylation of cytosine. For example, cytosine methylation, to produce 5-methylcytosine, is important for X-chromosome inactivation.[47] The average level of methylation varies between organisms, with Caenorhabditis elegans lacking cytosine methylation, while vertebrates show higher levels, with up to 1% of their DNA containing 5-methylcytosine.[48] Despite its biological role, 5-methylcytosine is susceptible to spontaneous deamination, which leaves a thymine base, and methylated cytosines are therefore mutation hotspots.[49] Other base modifications include adenine methylation in bacteria and the glycosylation of uracil to produce the "J-base" in kinetoplastids.[50][51]

DNA damage

Further information: Mutation
Benzopyrene, the major mutagen in tobacco smoke, in an adduct to DNA.[52]

DNA can be damaged by many different sorts of mutagens. These include oxidizing agents, alkylating agents and also high-energy electromagnetic radiation such as ultraviolet light and x-rays. The type of DNA damage produced depends on the type of mutagen. For example, UV light mostly damages DNA by producing thymine dimers, which are cross-links between adjacent pyrimidine bases in a DNA strand.[53] On the other hand, oxidants such as free radicals or hydrogen peroxide produce multiple forms of damage, including base modifications, particularly of guanosine, as well as double-strand breaks.[54] It has been estimated that in each human cell, about 500 bases suffer oxidative damage per day.[55][56] Of these oxidative lesions, the most dangerous are double-strand breaks, as these lesions are difficult to repair and can produce point mutations, insertions and deletions from the DNA sequence, as well as chromosomal translocations.[57]

Many mutagens intercalate into the space between two adjacent base pairs. Intercalators are mostly aromatic and planar molecules, and include ethidium, daunomycin, doxorubicin and thalidomide. In order for an intercalator to fit between base pairs, the bases must separate, distorting the DNA strands by unwinding of the double helix. These structural changes inhibit both transcription and DNA replication, causing toxicity and mutations. As a result, DNA intercalators are often carcinogens, with benzopyrene diol epoxide, acridines, aflatoxin and ethidium bromide being well-known examples.[58][59][60] Nevertheless, due to their properties of inhibiting DNA transcription and replication, they are also used in chemotherapy to inhibit rapidly-growing cancer cells.[61]

Overview of biological functions

DNA usually occurs as linear chromosomes in eukaryotes, and circular chromosomes in prokaryotes. The set of chromosomes in a cell makes up its genome; the human genome has approximately 3 billion base pairs of DNA arranged into 46 chromosomes.[62] The information carried by DNA is held in the sequence of pieces of DNA called genes. Transmission of genetic information in genes is achieved via complementary base pairing. For example, in transcription, when a cell uses the information in a gene, the DNA sequence is copied into a complementary RNA sequence through the attraction between the DNA and the correct RNA nucleotides. Usually, this RNA copy is then used to make a matching protein sequence in a process called translation which depends on the same interaction between RNA nucleotides. Alternatively, a cell may simply copy its genetic information in a process called DNA replication. The details of these functions are covered in other articles; here we focus on the interactions between DNA and other molecules that mediate the function of the genome.

Genome structure

Further information: Cell nucleus, Chromatin, Chromosome, Gene, Non-coding DNA

Genomic DNA is located in the cell nucleus of eukaryotes, as well as small amounts in mitochondria and chloroplasts. In prokaryotes, the DNA is held within an irregularly shaped body in the cytoplasm called the nucleoid.[63] The genetic information in a genome is held within genes. A gene is a unit of heredity and is a region of DNA that influences a particular characteristic in an organism. Genes contain an open reading frame that can be transcribed, as well as regulatory sequences such as promoters and enhancers, which control the expression of the open reading frame.

In many species, only a small fraction of the total sequence of the genome encodes protein. For example, only about 1.5% of the human genome consists of protein-coding exons, with over 50% of human DNA consisting of non-coding repetitive sequences.[64] The reasons for the presence of so much non-coding DNA in eukaryotic genomes and the extraordinary differences in genome size, or C-value, among species represent a long-standing puzzle known as the "C-value enigma."[65]

T7 RNA polymerase producing an mRNA (green) from a DNA template (red and blue). The enzyme is shown as a purple ribbon.[66]

Some non-coding DNA sequences play structural roles in chromosomes. Telomeres and centromeres typically contain few genes, but are important for the function and stability of chromosomes.[42][67] Pseudogenes, which are copies of genes that have been disabled by mutation, are an abundant form of non-coding DNA in humans.[68] These sequences are usually just molecular fossils, although they can occasionally serve as raw genetic material for the creation of new genes through the process of gene duplication and divergence.[69]

Transcription and translation

Further information: Genetic code, Transcription (genetics), Protein biosynthesis

A gene is a sequence of DNA that contains genetic information and can influence the phenotype of an organism. Within a gene, the sequence of bases along a DNA strand defines a messenger RNA sequence, which then defines a protein sequence. The relationship between the nucleotide sequences of genes and the amino-acid sequences of proteins is determined by the rules of translation, known collectively as the genetic code. The genetic code consists of three-letter 'words' called codons formed from a sequence of three nucleotides (e.g. ACT, CAG, TTT).

In transcription, the codons of a gene are copied into messenger RNA by RNA polymerase. This RNA copy is then decoded by a ribosome that reads the RNA sequence by base-pairing the messenger RNA to transfer RNA, which carries amino acids. Since there are 4 bases in 3-letter combinations, there are 64 possible codons (4³ combinations). These encode the twenty standard amino acids, giving most amino acids more than one possible codon. There are also three 'stop' or 'nonsense' codons signifying the end of the coding region; these are the TAA, TGA and TAG codons.
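
The counting argument above, and the way codons map to amino acids, can be illustrated with a compact table of the standard genetic code. This is a minimal sketch: the names are chosen here for illustration, amino acids are given by their one-letter abbreviations, and for brevity the function reads codons directly from the sense DNA strand rather than modelling the messenger RNA and ribosome.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon with codons enumerated in
# TCAG order (third position varying fastest); '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate a sense-strand DNA sequence codon by codon until a stop codon."""
    sequence = coding_dna.upper()
    protein = []
    for start in range(0, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE[sequence[start:start + 3]]
        if amino_acid == "*":  # stop codon: end of the coding region
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(len(CODON_TABLE))  # number of possible codons
    print(sorted(c for c, aa in CODON_TABLE.items() if aa == "*"))  # stop codons
    print(translate("ATGTTTTAA"))  # a short hypothetical coding sequence
```

Because 64 codons encode only twenty amino acids plus the stop signal, several codons in the table map to the same letter.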

DNA replication. The double helix is unwound by a helicase and topoisomerase. Next, one DNA polymerase produces the leading strand copy. Another DNA polymerase binds to the lagging strand. This enzyme makes discontinuous segments (called Okazaki fragments) before DNA ligase joins them together.

Replication

Further information: DNA replication

Cell division is essential for an organism to grow, but when a cell divides it must replicate the DNA in its genome so that the two daughter cells have the same genetic information as their parent. The double-stranded structure of DNA provides a simple mechanism for DNA replication. Here, the two strands are separated and then each strand's complementary DNA sequence is recreated by an enzyme called DNA polymerase. This enzyme makes the complementary strand by finding the correct base through complementary base pairing, and bonding it onto the original strand. In this way, the base on the old strand dictates which base appears on the new strand, and the cell ends up with a perfect copy of its DNA. As DNA polymerases can only extend a DNA strand in a 5′ to 3′ direction, different mechanisms are used to copy the antiparallel strands of the double helix.[70]
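
The copying logic described above, in which each old strand dictates the sequence of a new partner, can be sketched as follows. The function names and the tuple representation of a double helix are assumptions made for illustration; the sketch ignores the enzymes involved, the different handling of the two antiparallel strands, and any copying errors.

```python
from typing import Tuple

PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}

Duplex = Tuple[str, str]  # (strand, partner strand), both written 5'->3'


def new_partner(template: str) -> str:
    """Build the complementary strand for `template`, written 5'->3'."""
    return "".join(PAIRING[base] for base in reversed(template.upper()))


def replicate(duplex: Duplex) -> Tuple[Duplex, Duplex]:
    """Separate the two strands and pair each old strand with a newly made one."""
    top, bottom = duplex
    daughter_one = (top, new_partner(top))        # old top strand + new strand
    daughter_two = (new_partner(bottom), bottom)  # new strand + old bottom strand
    return daughter_one, daughter_two


if __name__ == "__main__":
    parent = ("ATGC", "GCAT")  # a hypothetical four-base-pair double helix
    first, second = replicate(parent)
    print(first, second, first == parent == second)
```

Each daughter keeps one strand from the parent and gains one new strand, and both carry the same sequence as the parent.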

Interactions with proteins

All the functions of DNA depend on interactions with proteins. These protein interactions can be non-specific, or the protein can bind specifically to a single DNA sequence. Enzymes can also bind to DNA and of these, the polymerases that copy the DNA base sequence in transcription and DNA replication are particularly important.

DNA-binding proteins

Interaction of DNA with histones (shown in white, top). These proteins' basic amino acids (below left, blue) bind to the acidic phosphate groups on DNA (below right, red).

Structural proteins that bind DNA are well-understood examples of non-specific DNA-protein interactions. Within chromosomes, DNA is held in complexes with structural proteins. These proteins organize the DNA into a compact structure called chromatin. In eukaryotes this structure involves DNA binding to a complex of small basic proteins called histones, while in prokaryotes multiple types of proteins are involved.[71][72] The histones form a disk-shaped complex called a nucleosome, which contains two complete turns of double-stranded DNA wrapped around its surface. These non-specific interactions are formed through basic residues in the histones making ionic bonds to the acidic sugar-phosphate backbone of the DNA, and are therefore largely independent of the base sequence.[73] Chemical modifications of these basic amino acid residues include methylation, phosphorylation and acetylation.[74] These chemical changes alter the strength of the interaction between the DNA and the histones, making the DNA more or less accessible to transcription factors and changing the rate of transcription.[75] Other non-specific DNA-binding proteins found in chromatin include the high-mobility group proteins, which bind preferentially to bent or distorted DNA.[76] These proteins are important in bending arrays of nucleosomes and arranging them into more complex chromatin structures.[77]

A distinct group of DNA-binding proteins are the single-stranded-DNA-binding proteins that specifically bind single-stranded DNA. In humans, replication protein A is the best-characterised member of this family and is essential for most processes where the double helix is separated, including DNA replication, recombination and DNA repair.[78] These binding proteins seem to stabilize single-stranded DNA and protect it from forming stem loops or being degraded by nucleases.

The lambda repressor helix-turn-helix transcription factor bound to its DNA target[79]

In contrast, other proteins have evolved to specifically bind particular DNA sequences. The most intensively studied of these are the various classes of transcription factors, which are proteins that regulate transcription. Each of these proteins binds to one particular set of DNA sequences and thereby activates or inhibits the transcription of genes with these sequences close to their promoters. The transcription factors do this in two ways. Firstly, they can bind the RNA polymerase responsible for transcription, either directly or through other mediator proteins; this locates the polymerase at the promoter and allows it to begin transcription.[80] Alternatively, transcription factors can bind enzymes that modify the histones at the promoter; this will change the accessibility of the DNA template to the polymerase.[81]

As these DNA targets can occur throughout an organism's genome, changes in the activity of one type of transcription factor can affect thousands of genes.[82] Consequently, these proteins are often the targets of the signal transduction processes that mediate responses to environmental changes or cellular differentiation and development. The specificity of these transcription factors' interactions with DNA comes from the proteins making multiple contacts to the edges of the DNA bases, allowing them to "read" the DNA sequence. Most of these base-interactions are made in the major groove, where the bases are most accessible.[83]

The restriction enzyme EcoRV (green) in a complex with its substrate DNA[84]

DNA-modifying enzymes

Nucleases and ligases

Nucleases are enzymes that cut DNA strands by catalyzing the hydrolysis of the phosphodiester bonds. Nucleases that hydrolyse nucleotides from the ends of DNA strands are called exonucleases, while endonucleases cut within strands. The most frequently-used nucleases in molecular biology are the restriction endonucleases, which cut DNA at specific sequences. For instance, the EcoRV enzyme shown to the left recognizes the 6-base sequence 5′-GAT|ATC-3′ and makes a cut at the vertical line. In nature, these enzymes protect bacteria against phage infection by digesting the phage DNA when it enters the bacterial cell, acting as part of the restriction modification system.[85] In technology, these sequence-specific nucleases are used in molecular cloning and DNA fingerprinting.
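As a rough illustration of sequence-specific cutting, the sketch below splits a sequence at every EcoRV site, using the recognition sequence and cut position given above (GAT|ATC). The function name `digest` and its parameters are invented for this example, and only one strand is modelled as a plain string.

```python
def digest(sequence: str, site: str = "GATATC", cut_offset: int = 3) -> list[str]:
    """Cut `sequence` at every occurrence of `site`.

    `cut_offset` is the number of bases into the site where the cut falls;
    3 corresponds to GAT|ATC, the EcoRV example in the text.
    """
    sequence = sequence.upper()
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[start:cut])  # fragment up to the cut
        start = cut
        position = sequence.find(site, position + 1)  # look for the next site
    fragments.append(sequence[start:])  # whatever remains after the last cut
    return fragments


fragments = digest("AAGATATCTT")
```

Joining the fragments back together in order recovers the input, which is a convenient sanity check for this kind of function.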

Enzymes called DNA ligases can rejoin cut or broken DNA strands, using the energy from either adenosine triphosphate or nicotinamide adenine dinucleotide.[86] Ligases are particularly important in lagging strand DNA replication, as they join together the short segments of DNA produced at the replication fork into a complete copy of the DNA template. They are also used in DNA repair and genetic recombination.[86]

Topoisomerases and helicases

Topoisomerases are enzymes with both nuclease and ligase activity. These proteins change the amount of supercoiling in DNA. Some of these enzymes work by cutting the DNA helix and allowing one section to rotate, thereby reducing its level of supercoiling; the enzyme then seals the DNA break.[27] Other types of these enzymes are capable of cutting one DNA helix and then passing a second strand of DNA through this break, before rejoining the helix.[87] Topoisomerases are required for many processes involving DNA, such as DNA replication and transcription.[28]

Helicases are proteins that are a type of molecular motor. They use the chemical energy in nucleoside triphosphates, predominantly ATP, to break hydrogen bonds between bases and unwind the DNA double helix into single strands.[88] These enzymes are essential for most processes where enzymes need to access the DNA bases.

Polymerases

Polymerases are enzymes that synthesise polynucleotide chains from nucleoside triphosphates. They function by adding nucleotides onto the 3′ hydroxyl group of the previous nucleotide in the DNA strand. As a consequence, all polymerases work in a 5′ to 3′ direction.[89] In the active site of these enzymes, the nucleoside triphosphate substrate base-pairs to a single-stranded polynucleotide template: this allows polymerases to accurately synthesise the complementary strand of this template. Polymerases are classified according to the type of template that they use.

In DNA replication, a DNA-dependent DNA polymerase makes a DNA copy of a DNA sequence. Accuracy is vital in this process, so many of these polymerases have a proofreading activity. Here, the polymerase recognizes the occasional mistakes in the synthesis reaction by the lack of base pairing between the mismatched nucleotides. If a mismatch is detected, a 3′ to 5′ exonuclease activity is activated and the incorrect base removed.[90] In most organisms DNA polymerases function in a large complex called the replisome that contains multiple accessory subunits, such as the DNA clamp or helicases.[91]

RNA-dependent DNA polymerases are a specialised class of polymerases that copy the sequence of an RNA strand into DNA. They include reverse transcriptase, which is a viral enzyme involved in the infection of cells by retroviruses, and telomerase, which is required for the replication of telomeres.[92][41] Telomerase is an unusual polymerase because it contains its own RNA template as part of its structure.[42]

Transcription is carried out by a DNA-dependent RNA polymerase that copies the sequence of a DNA strand into RNA. To begin transcribing a gene, the RNA polymerase binds to a sequence of DNA called a promoter and separates the DNA strands. It then copies the gene sequence into a messenger RNA transcript until it reaches a region of DNA called the terminator, where it halts and detaches from the DNA. As with human DNA-dependent DNA polymerases, RNA polymerase II, the enzyme that transcribes most of the genes in the human genome, operates as part of a large protein complex with multiple regulatory and accessory subunits.[93]
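The copying step of transcription follows the same pairing logic as replication, except that uracil (U) takes the place of thymine in RNA. The sketch below is a simplified illustration: the name `transcribe` is an assumption of this example, and promoter recognition, termination and the polymerase complex are not modelled.

```python
# Pairing used when a DNA template is copied into RNA:
# a template A directs U into the transcript instead of T.
RNA_PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand: str) -> str:
    """Return the RNA (5'->3') copied from a DNA template strand given 5'->3'."""
    # The template is read in the opposite direction to synthesis,
    # so the bases are paired in reverse order.
    return "".join(RNA_PAIRING[base] for base in reversed(template_strand.upper()))


messenger = transcribe("CGGCAT")
```

The transcript has the same sequence as the non-template (coding) strand of the gene, with U in place of T.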

Genetic recombination

Structure of the Holliday junction intermediate in genetic recombination. The four separate DNA strands are coloured red, blue, green and yellow.[94]
Further information: Genetic recombination
Recombination involves the breakage and rejoining of two chromosomes (M and F) to produce two re-arranged chromosomes (C1 and C2).

A DNA helix does not usually interact with other segments of DNA, and in human cells the different chromosomes even occupy separate areas in the nucleus called "chromosome territories".[95] This physical separation of different chromosomes is important for the ability of DNA to function as a stable repository for information, as one of the few times chromosomes interact is during chromosomal crossover when they recombine. Chromosomal crossover is when two DNA helices break, swap a section and then rejoin.

Recombination allows chromosomes to exchange genetic information and produces new combinations of genes, which increases the efficiency of natural selection and can be important in the rapid evolution of new proteins.[96] Genetic recombination can also be involved in DNA repair, particularly in the cell's response to double-strand breaks.[97]

The most common form of chromosomal crossover is homologous recombination, where the two chromosomes involved share very similar sequences. Non-homologous recombination can be damaging to cells, as it can produce chromosomal translocations and genetic abnormalities. The recombination reaction is catalyzed by enzymes known as recombinases, such as RAD51.[98] The first step in recombination is a double-stranded break either caused by an endonuclease or damage to the DNA.[99] A series of steps catalyzed in part by the recombinase then leads to joining of the two helices by at least one Holliday junction, in which a segment of a single strand in each helix is annealed to the complementary strand in the other helix. The Holliday junction is a tetrahedral junction structure that can be moved along the pair of chromosomes, swapping one strand for another. The recombination reaction is then halted by cleavage of the junction and re-ligation of the released DNA.[100]

Evolution of DNA-based metabolism

DNA contains the genetic information that allows all modern living things to function, grow and reproduce. However, it is unclear how long in the 4-billion-year history of life DNA has performed this function, as it has been proposed that the earliest forms of life may have used RNA as their genetic material.[89][101] RNA may have acted as the central part of early cell metabolism as it can both transmit genetic information and carry out catalysis as part of ribozymes.[102] This ancient RNA world where nucleic acid would have been used for both catalysis and genetics may have influenced the evolution of the current genetic code based on four nucleotide bases. This would occur since the number of unique bases in such an organism is a trade-off between a small number of bases increasing replication accuracy and a large number of bases increasing the catalytic efficiency of ribozymes.[103]

Unfortunately, there is no direct evidence of ancient genetic systems, as recovery of DNA from most fossils is impossible. This is because DNA will survive in the environment for less than one million years and slowly degrades into short fragments in solution.[104] Although claims for older DNA have been made, most notably a report of the isolation of a viable bacterium from a salt crystal 250 million years old,[105] these claims are controversial and have been disputed.[106][107]

Uses in technology

Genetic engineering

Further information: Molecular biology and genetic engineering

Modern biology and biochemistry make intensive use of recombinant DNA technology. Recombinant DNA is a man-made DNA sequence that has been assembled from other DNA sequences. Such sequences can be introduced into organisms in the form of plasmids or, in the appropriate format, by using a viral vector.[108] The resulting genetically modified organisms can be used to make products such as recombinant proteins, used in medical research,[109] or be grown in agriculture.[110][111]

Forensics

Further information: Genetic fingerprinting

Forensic scientists can use DNA in blood, semen, skin, saliva or hair at a crime scene to identify a perpetrator. This process is called genetic fingerprinting, or more accurately, DNA profiling. In DNA profiling, the lengths of variable sections of repetitive DNA, such as short tandem repeats and minisatellites, are compared between people. This method is usually an extremely reliable technique for identifying a criminal.[112] However, identification can be complicated if the scene is contaminated with DNA from several people.[113] DNA profiling was developed in 1984 by British geneticist Sir Alec Jeffreys,[114] and first used in forensic science to convict Colin Pitchfork in the 1988 Enderby murders case.[115] People convicted of certain types of crimes may be required to provide a sample of DNA for a database. This has helped investigators solve old cases where only a DNA sample was obtained from the scene. DNA profiling can also be used to identify victims of mass casualty incidents.[116]
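The comparison of repeat lengths can be sketched in a few lines. This is a deliberately simplified illustration: the locus names and repeat units below are hypothetical, each sample is a single string, and real profiling measures two alleles per locus in regions flanked by specific primers.

```python
import re


def longest_repeat_run(sequence: str, unit: str) -> int:
    """Count the back-to-back copies of `unit` in its longest uninterrupted run."""
    unit = unit.upper()
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def profile(sample: str, loci: dict[str, str]) -> dict[str, int]:
    """Map each locus name to the repeat count observed in the sample."""
    return {name: longest_repeat_run(sample, unit) for name, unit in loci.items()}


# Hypothetical loci: the names and repeat units are for illustration only.
LOCI = {"locus_1": "GATA", "locus_2": "CA"}

scene = profile("TTGATAGATAGATACCGGCACACACATT", LOCI)
suspect = profile("AAGATAGATAGATATTGGCACACACAGG", LOCI)
profiles_match = scene == suspect
```

Two samples are treated as consistent here only if every locus shows the same repeat count; a real comparison would also attach a statistical weight to the match.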

Bioinformatics

Further information: Bioinformatics

Bioinformatics involves the manipulation, searching, and data mining of DNA sequence data. The development of techniques to store and search DNA sequences has led to widely applied advances in computer science, especially string searching algorithms, machine learning and database theory.[117] String searching or matching algorithms, which find an occurrence of a sequence of letters inside a larger sequence of letters, were developed to search for specific sequences of nucleotides.[118] In other applications such as text editors, even simple algorithms for this problem usually suffice, but DNA sequences cause these algorithms to exhibit near-worst-case behaviour due to their small number of distinct characters. The related problem of sequence alignment aims to identify homologous sequences and locate the specific mutations that make them distinct. These techniques, especially multiple sequence alignment, are used in studying phylogenetic relationships and protein function.[119] Data sets representing entire genomes' worth of DNA sequences, such as those produced by the Human Genome Project, are difficult to use without annotations, which label the locations of genes and regulatory elements on each chromosome. Regions of DNA sequence that have the characteristic patterns associated with protein- or RNA-coding genes can be identified by gene finding algorithms, which allow researchers to predict the presence of particular gene products in an organism even before they have been isolated experimentally.[120]
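The effect of a small alphabet on a simple search can be seen by counting character comparisons. The sketch below uses the textbook naive algorithm, chosen only for illustration and not taken from the cited sources; the name `naive_search` is an assumption of this example.

```python
def naive_search(text: str, pattern: str) -> tuple[list[int], int]:
    """Return (start positions of matches, number of character comparisons)."""
    matches = []
    comparisons = 0
    for start in range(len(text) - len(pattern) + 1):
        for offset, expected in enumerate(pattern):
            comparisons += 1
            if text[start + offset] != expected:
                break  # mismatch: slide the pattern one position along
        else:
            matches.append(start)  # every character of the pattern matched
    return matches, comparisons


hits, _ = naive_search("GATTACAGATTACA", "TTAC")

# Repetitive, small-alphabet text: almost the whole pattern is compared
# at every position before the mismatch is found.
_, dna_like_cost = naive_search("A" * 20, "AAAB")

# Ordinary prose: the first character usually mismatches straight away.
_, prose_cost = naive_search("the quick brown fox!", "AAAB")
```

With only four letters, partial matches are common, so the inner loop runs longer on average than it does on natural-language text. Algorithms that avoid re-examining characters, such as Knuth-Morris-Pratt or Boyer-Moore, address this.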

DNA and computation

Further information: DNA computing

DNA was first used in computing to solve a small version of the directed Hamiltonian path problem, an NP-complete problem.[121] DNA computing is advantageous over electronic computers in power use, space use, and efficiency, due to its ability to compute in a highly parallel fashion (see parallel computing). A number of other problems, including simulation of various abstract machines, the boolean satisfiability problem, and the bounded version of the travelling salesman problem, have since been analysed using DNA computing.[122] Due to its compactness, DNA also has a theoretical role in cryptography, where in particular it allows unbreakable one-time pads to be efficiently constructed and used.[123]
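To make the problem concrete, the sketch below solves a small directed Hamiltonian path instance by generating every candidate ordering and filtering out the invalid ones. This loosely mirrors the generate-and-filter idea, but it runs sequentially on an ordinary computer and says nothing about the chemistry involved; the graph and the name `hamiltonian_paths` are made up for this illustration.

```python
from itertools import permutations


def hamiltonian_paths(vertices, edges, start, end):
    """Return every ordering that visits each vertex once along directed edges."""
    edge_set = set(edges)
    found = []
    for order in permutations(vertices):
        # Keep only candidates with the required first and last vertex.
        if order[0] != start or order[-1] != end:
            continue
        # Every consecutive pair must be joined by a directed edge.
        if all((a, b) in edge_set for a, b in zip(order, order[1:])):
            found.append(order)
    return found


# A made-up four-vertex directed graph.
EDGES = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)]
paths = hamiltonian_paths([0, 1, 2, 3], EDGES, start=0, end=3)
```

The number of candidate orderings grows factorially with the number of vertices, which is why massively parallel approaches are of interest for this class of problem.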

History and anthropology

Further information: Phylogenetics and Genetic genealogy

Because DNA collects mutations over time, which are then inherited, it contains historical information and by comparing DNA sequences, geneticists can infer the evolutionary history of organisms, their phylogeny.[124] This field of phylogenetics is a powerful tool in evolutionary biology. If DNA sequences within a species are compared, population geneticists can learn the history of particular populations. This can be used in studies ranging from ecological genetics to anthropology; for example, DNA evidence is being used to try to identify the Ten Lost Tribes of Israel.[125][126]

DNA has also been used to look at modern family relationships, such as establishing family relationships between the descendants of Sally Hemings and Thomas Jefferson. This usage is closely related to the use of DNA in criminal investigations detailed above. Indeed, some criminal investigations have been solved when DNA from crime scenes has matched relatives of the guilty individual.[127]

History

Francis Crick
Further information: History of molecular biology

DNA was first isolated by the Swiss physician Friedrich Miescher who, in 1869, discovered a microscopic substance in the pus of discarded surgical bandages. As it resided in the nuclei of cells, he called it "nuclein".[128] In 1929 this discovery was followed by Phoebus Levene's identification of the base, sugar and phosphate nucleotide unit.[129] Levene suggested that DNA consisted of a string of nucleotide units linked together through the phosphate groups. However, Levene thought the chain was short and the bases repeated in a fixed order. In 1937 William Astbury produced the first X-ray diffraction patterns that showed that DNA had a regular structure.[130]

In 1943, Oswald Theodore Avery discovered that traits of the "smooth" form of the Pneumococcus could be transferred to the "rough" form of the same bacteria by mixing killed "smooth" bacteria with the live "rough" form. Avery identified DNA as this transforming principle.[131] DNA's role in heredity was confirmed in 1952, when Alfred Hershey and Martha Chase, in the Hershey-Chase experiment, showed that DNA is the genetic material of the T2 phage.[132]

In 1953, based on X-ray diffraction images[133] taken by Rosalind Franklin and the information that the bases were paired, James D. Watson and Francis Crick suggested[133] what is now accepted as the first accurate model of DNA structure in the journal Nature.[5] Experimental evidence for Watson and Crick's model was published in a series of five articles in the same issue of Nature.[134] Of these, Franklin and Raymond Gosling's paper[135] included the X-ray diffraction image[136] that was key to Watson and Crick's interpretation; another article was co-authored by Maurice Wilkins and his colleagues.[137] Franklin and Gosling's subsequent paper identified the distinctions between the A and B structures of the double helix in DNA.[138] In 1962 Watson, Crick, and Maurice Wilkins jointly received the Nobel Prize in Physiology or Medicine (Franklin did not share the prize, as she had died earlier).[139]

In an influential presentation in 1957, Crick laid out the "Central Dogma" of molecular biology, which foretold the relationship between DNA, RNA, and proteins, and articulated the "adaptor hypothesis".[140] Final confirmation of the replication mechanism that was implied by the double-helical structure followed in 1958 through the Meselson-Stahl experiment.[141] Further work by Crick and coworkers showed that the genetic code was based on non-overlapping triplets of bases, called codons, allowing Har Gobind Khorana, Robert W. Holley and Marshall Warren Nirenberg to decipher the genetic code.[142] These findings represent the birth of molecular biology.

See also

  • Genetic disorder
  • Plasmid
  • DNA sequencing
  • Southern blot
  • DNA microarray
  • Polymerase chain reaction
  • Protein-DNA interaction site predictor
  • Phosphoramidite
  • Quantification of nucleic acids
  • Guanidinium thiocyanate-phenol-chloroform extraction

References

  1. Alberts, Bruce and Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, and Peter Walter (2002). Molecular Biology of the Cell; Fourth Edition. New York and London: Garland Science. ISBN 0-8153-3218-1.
  2. Butler, John M. (2001) Forensic DNA Typing "Elsevier". pp. 14 – 15. ISBN 978-0-12-147951-0.
  3. Mandelkern M, Elias J, Eden D, Crothers D (1981). The dimensions of DNA in solution. J Mol Biol 152 (1): 153 – 61. PMID 7338906.
  4. Gregory S, et al. (2006). The DNA sequence and biological annotation of human chromosome 1. Nature 441 (7091): 315 – 21. PMID 16710414.
  5. Watson J, Crick F (1953). Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature 171 (4356): 737 – 8. PMID 13054692.
  6. Berg J., Tymoczko J. and Stryer L. (2002) Biochemistry. W. H. Freeman and Company ISBN 0-7167-4955-6
  7. Abbreviations and Symbols for Nucleic Acids, Polynucleotides and their Constituents IUPAC-IUB Commission on Biochemical Nomenclature (CBN) Accessed 03 Jan 2006
  8. Ghosh A, Bansal M (2003). A glossary of DNA structures from A to Z. Acta Crystallogr D Biol Crystallogr 59 (Pt 4): 620 – 6. PMID 12657780.
  9. Takahashi I, Marmur J. (1963). Replacement of thymidylic acid by deoxyuridylic acid in the deoxyribonucleic acid of a transducing phage for Bacillus subtilis. Nature 197: 794 – 5. PMID 13980287.
  10. Agris P (2004). Decoding the genome: a modified view. Nucleic Acids Res 32 (1): 223 – 38.
  11. Created from PDB 1D65
  12. Wing R, Drew H, Takano T, Broka C, Tanaka S, Itakura K, Dickerson R (1980). Crystal structure analysis of a complete turn of B-DNA. Nature 287 (5784): 755 – 8. PMID 7432492.
  13. Pabo C, Sauer R. Protein-DNA recognition. Annu Rev Biochem 53: 293 – 321. PMID 6236744.
  14. Ponnuswamy P, Gromiha M (1994). On the conformational stability of oligonucleotide duplexes and tRNA molecules. J Theor Biol 169 (4): 419 – 32. PMID 7526075.
  15. Clausen-Schaumann H, Rief M, Tolksdorf C, Gaub H (2000). Mechanical stability of single DNA molecules. Biophys J 78 (4): 1997 – 2007. PMID 10733978.
  16. Chalikian T, Völker J, Plum G, Breslauer K (1999). A more unified picture for the thermodynamics of nucleic acid duplex melting: a characterization by calorimetric and volumetric techniques. Proc Natl Acad Sci U S A 96 (14): 7853 – 8. PMID 10393911.
  17. deHaseth P, Helmann J (1995). Open complex formation by Escherichia coli RNA polymerase: the mechanism of polymerase-induced strand separation of double helical DNA. Mol Microbiol 16 (5): 817 – 24. PMID 7476180.
  18. Isaksson J, Acharya S, Barman J, Cheruku P, Chattopadhyaya J (2004). Single-stranded adenine-rich DNA and RNA retain structural characteristics of their respective double-stranded conformations and show directional differences in stacking pattern. Biochemistry 43 (51): 15996 – 6010. PMID 15609994.
  19. Hüttenhofer A, Schattner P, Polacek N (2005). Non-coding RNAs: hope or hype?. Trends Genet 21 (5): 289 – 97. PMID 15851066.
  20. Munroe S (2004). Diversity of antisense regulation in eukaryotes: multiple mechanisms, emerging patterns. J Cell Biochem 93 (4): 664 – 71. PMID 15389973.
  21. Makalowska I, Lin C, Makalowski W (2005). Overlapping genes in vertebrate genomes. Comput Biol Chem 29 (1): 1 – 12. PMID 15680581.
  22. Johnson Z, Chisholm S (2004). Properties of overlapping genes are conserved across microbial genomes. Genome Res 14 (11): 2268 – 72. PMID 15520290.
  23. Lamb R, Horvath C (1991). Diversity of coding strategies in influenza viruses. Trends Genet 7 (8): 261 – 6. PMID 1771674.
  24. Davies J, Stanley J (1989). Geminivirus genes and vectors. Trends Genet 5 (3): 77 – 81. PMID 2660364.
  25. Berns K (1990). Parvovirus replication. Microbiol Rev 54 (3): 316 – 29. PMID 2215424.
  26. Benham C, Mielke S. DNA mechanics. Annu Rev Biomed Eng 7: 21 – 53. PMID 16004565.
  27. Champoux J. DNA topoisomerases: structure, function, and mechanism. Annu Rev Biochem 70: 369 – 413. PMID 11395412.
  28. Wang J (2002). Cellular roles of DNA topoisomerases: a molecular perspective. Nat Rev Mol Cell Biol 3 (6): 430 – 40. PMID 12042765.
  29. Hayashi G, Hagihara M, Nakatani K (2005). Application of L-DNA as a molecular tag. Nucleic Acids Symp Ser (Oxf) 49: 261 – 262. PMID 17150733.
  30. Vargason JM, Eichman BF, Ho PS (2000). The extended and eccentric E-DNA structure induced by cytosine methylation or bromination. Nature Structural Biology 7: 758 – 761. PMID 10966645.
  31. Wang G, Vasquez KM (2006). Non-B DNA structure-induced genetic instability. Mutat Res 598 (1 – 2): 103 – 119. PMID 16516932.
  32. Allemand, et al (1998). Stretched and overwound DNA forms a Pauling-like structure with exposed bases. PNAS 24: 14152-14157. PMID 9826669.
  33. Palecek E (1991). Local supercoil-stabilized DNA structures. Critical Reviews in Biochemistry and Molecular Biology 26 (2): 151 – 226. PMID 1914495.
  34. Basu H, Feuerstein B, Zarling D, Shafer R, Marton L (1988). Recognition of Z-RNA and Z-DNA determinants by polyamines in solution: experimental and theoretical studies. J Biomol Struct Dyn 6 (2): 299 – 309. PMID 2482766.
  35. Leslie AG, Arnott S, Chandrasekaran R, Ratliff RL (1980). Polymorphism of DNA double helices. J. Mol. Biol. 143 (1): 49–72.
  36. Wahl M, Sundaralingam M (1997). Crystal structures of A-DNA duplexes. Biopolymers 44 (1): 45 – 63. PMID 9097733.
  37. Lu XJ, Shakked Z, Olson WK (2000). A-form conformational motifs in ligand-bound DNA structures. J. Mol. Biol. 300 (4): 819-40.
  38. Rothenburg S, Koch-Nolte F, Haag F. DNA methylation and Z-DNA formation as mediators of quantitative differences in the expression of alleles. Immunol Rev 184: 286 – 98. PMID 12086319.
  39. Oh D, Kim Y, Rich A (2002). Z-DNA-binding proteins can act as potent effectors of gene expression in vivo. Proc. Natl. Acad. Sci. U.S.A. 99 (26): 16666-71.
  40. Created from NDB UD0017
  41. Greider C, Blackburn E (1985). Identification of a specific telomere terminal transferase activity in Tetrahymena extracts. Cell 43 (2 Pt 1): 405 – 13. PMID 3907856.
  42. Nugent C, Lundblad V (1998). The telomerase reverse transcriptase: components and regulation. Genes Dev 12 (8): 1073 – 85. PMID 9553037.
  43. Wright W, Tesmer V, Huffman K, Levene S, Shay J (1997). Normal human chromosomes have long G-rich telomeric overhangs at one end. Genes Dev 11 (21): 2801 – 9. PMID 9353250.
  44. Burge S, Parkinson G, Hazel P, Todd A, Neidle S (2006). Quadruplex DNA: sequence, topology and structure. Nucleic Acids Res 34 (19): 5402 – 15. PMID 17012276.
  45. Parkinson G, Lee M, Neidle S (2002). Crystal structure of parallel quadruplexes from human telomeric DNA. Nature 417 (6891): 876 – 80. PMID 12050675.
  46. Griffith J, Comeau L, Rosenfield S, Stansel R, Bianchi A, Moss H, de Lange T (1999). Mammalian telomeres end in a large duplex loop. Cell 97 (4): 503 – 14. PMID 10338214.
  47. Klose R, Bird A (2006). Genomic DNA methylation: the mark and its mediators. Trends Biochem Sci 31 (2): 89 – 97. PMID 16403636.
  48. Bird A (2002). DNA methylation patterns and epigenetic memory. Genes Dev 16 (1): 6 – 21. PMID 11782440.
  49. Walsh C, Xu G. Cytosine methylation and DNA repair. Curr Top Microbiol Immunol 301: 283 – 315. PMID 16570853.
  50. Ratel D, Ravanat J, Berger F, Wion D (2006). N6-methyladenine: the other methylated base of DNA. Bioessays 28 (3): 309 – 15. PMID 16479578.
  51. Gommers-Ampt J, Van Leeuwen F, de Beer A, Vliegenthart J, Dizdaroglu M, Kowalak J, Crain P, Borst P (1993). beta-D-glucosyl-hydroxymethyluracil: a novel modified base present in the DNA of the parasitic protozoan T. brucei. Cell 75 (6): 1129 – 36. PMID 8261512.
  52. Created from PDB 1JDG
  53. Douki T, Reynaud-Angelin A, Cadet J, Sage E (2003). Bipyrimidine photoproducts rather than oxidative lesions are the main type of DNA damage involved in the genotoxic effect of solar UVA radiation. Biochemistry 42 (30): 9221 – 6. PMID 12885257.
  54. Cadet J, Delatour T, Douki T, Gasparutto D, Pouget J, Ravanat J, Sauvaigo S (1999). Hydroxyl radicals and DNA base damage. Mutat Res 424 (1 – 2): 9 – 21. PMID 10064846.
  55. Shigenaga M, Gimeno C, Ames B (1989). Urinary 8-hydroxy-2′-deoxyguanosine as a biological marker of in vivo oxidative DNA damage. Proc Natl Acad Sci U S A 86 (24): 9697 – 701. PMID 2602371.
  56. Cathcart R, Schwiers E, Saul R, Ames B (1984). Thymine glycol and thymidine glycol in human and rat urine: a possible assay for oxidative DNA damage. Proc Natl Acad Sci U S A 81 (18): 5633 – 7. PMID 6592579.
  57. Valerie K, Povirk L (2003). Regulation and mechanisms of mammalian double-strand break repair. Oncogene 22 (37): 5792 – 812. PMID 12947387.
  58. Ferguson L, Denny W (1991). The genetic toxicology of acridines. Mutat Res 258 (2): 123 – 60. PMID 1881402.
  59. Jeffrey A (1985). DNA modification by chemical carcinogens. Pharmacol Ther 28 (2): 237 – 72. PMID 3936066.
  60. Stephens T, Bunde C, Fillmore B (2000). Mechanism of action in thalidomide teratogenesis. Biochem Pharmacol 59 (12): 1489 – 99. PMID 10799645.
  61. Braña M, Cacho M, Gradillas A, de Pascual-Teresa B, Ramos A (2001). Intercalators as anticancer drugs. Curr Pharm Des 7 (17): 1745 – 80. PMID 11562309.
  62. Venter J, et al. (2001). The sequence of the human genome. Science 291 (5507): 1304 – 51. PMID 11181995.
  63. Thanbichler M, Wang S, Shapiro L (2005). The bacterial nucleoid: a highly organized and dynamic structure. J Cell Biochem 96 (3): 506 – 21. PMID 15988757.
  64. Wolfsberg T, McEntyre J, Schuler G (2001). Guide to the draft human genome. Nature 409 (6822): 824 – 6. PMID 11236998.
  65. Gregory T (2005). The C-value enigma in plants and animals: a review of parallels and an appeal for partnership. Ann Bot (Lond) 95 (1): 133 – 46. PMID 15596463.
  66. Created from PDB 1MSW
  67. Pidoux A, Allshire R (2005). The role of heterochromatin in centromere function. Philos Trans R Soc Lond B Biol Sci 360 (1455): 569 – 79. PMID 15905142.
  68. Harrison P, Hegyi H, Balasubramanian S, Luscombe N, Bertone P, Echols N, Johnson T, Gerstein M (2002). Molecular fossils in the human genome: identification and analysis of the pseudogenes in chromosomes 21 and 22. Genome Res 12 (2): 272 – 80. PMID 11827946.
  69. Harrison P, Gerstein M (2002). Studying genomes through the aeons: protein families, pseudogenes and proteome evolution. J Mol Biol 318 (5): 1155 – 74. PMID 12083509.
  70. Albà M (2001). Replicative DNA polymerases. Genome Biol 2 (1): REVIEWS3002. PMID 11178285.
  71. Sandman K, Pereira S, Reeve J (1998). Diversity of prokaryotic chromosomal proteins and the origin of the nucleosome. Cell Mol Life Sci 54 (12): 1350 – 64. PMID 9893710.
  72. Dame RT (2005). The role of nucleoid-associated proteins in the organization and compaction of bacterial chromatin. Mol. Microbiol. 56 (4): 858-70.
  73. Luger K, Mäder A, Richmond R, Sargent D, Richmond T (1997). Crystal structure of the nucleosome core particle at 2.8 A resolution. Nature 389 (6648): 251 – 60. PMID 9305837.
  74. Jenuwein T, Allis C (2001). Translating the histone code. Science 293 (5532): 1074 – 80. PMID 11498575.
  75. Ito T. Nucleosome assembly and remodelling. Curr Top Microbiol Immunol 274: 1 – 22. PMID 12596902.
  76. Thomas J (2001). HMG1 and 2: architectural DNA-binding proteins. Biochem Soc Trans 29 (Pt 4): 395 – 401. PMID 11497996.
  77. Grosschedl R, Giese K, Pagel J (1994). HMG domain proteins: architectural elements in the assembly of nucleoprotein structures. Trends Genet 10 (3): 94–100. PMID 8178371.
  78. Iftode C, Daniely Y, Borowiec J (1999). Replication protein A (RPA): the eukaryotic SSB. Crit Rev Biochem Mol Biol 34 (3): 141 – 80. PMID 10473346.
  79. Created from PDB 1LMB
  80. Myers L, Kornberg R. Mediator of transcriptional regulation. Annu Rev Biochem 69: 729 – 49. PMID 10966474.
  81. Spiegelman B, Heinrich R (2004). Biological control through regulated transcriptional coactivators. Cell 119 (2): 157-67. PMID 15479634.
  82. Li Z, Van Calcar S, Qu C, Cavenee W, Zhang M, Ren B (2003). A global transcriptional regulatory role for c-Myc in Burkitt's lymphoma cells. Proc Natl Acad Sci U S A 100 (14): 8164 – 9. PMID 12808131.
  83. Pabo C, Sauer R. Protein-DNA recognition. Annu Rev Biochem 53: 293 – 321. PMID 6236744.
  84. Created from PDB 1RVA
  85. Bickle T, Krüger D (1993). Biology of DNA restriction. Microbiol Rev 57 (2): 434 – 50. PMID 8336674.
  86. Doherty A, Suh S (2000). Structural and mechanistic conservation in DNA ligases. Nucleic Acids Res 28 (21): 4051 – 8. PMID 11058099.
  87. Schoeffler A, Berger J (2005). Recent advances in understanding structure-function relationships in the type II topoisomerase mechanism. Biochem Soc Trans 33 (Pt 6): 1465 – 70. PMID 16246147.
  88. Tuteja N, Tuteja R (2004). Unraveling DNA helicases. Motif, structure, mechanism and function. Eur J Biochem 271 (10): 1849–63. PMID 15128295.
  89. Joyce C, Steitz T (1995). Polymerase structures and function: variations on a theme?. J Bacteriol 177 (22): 6321 – 9. PMID 7592405.
  90. Hubscher U, Maga G, Spadari S. Eukaryotic DNA polymerases. Annu Rev Biochem 71: 133 – 63. PMID 12045093.
  91. Johnson A, O'Donnell M. Cellular DNA replicases: components and dynamics at the replication fork. Annu Rev Biochem 74: 283 – 315. PMID 15952889.
  92. Tarrago-Litvak L, Andréola M, Nevinsky G, Sarih-Cottin L, Litvak S (1994). The reverse transcriptase of HIV-1: from enzymology to therapeutic intervention. FASEB J 8 (8): 497–503. PMID 7514143.
  93. Martinez E (2002). Multi-protein complexes in eukaryotic gene transcription. Plant Mol Biol 50 (6): 925 – 47. PMID 12516863.
  94. Created from PDB 1M6G
  95. Cremer T, Cremer C (2001). Chromosome territories, nuclear architecture and gene regulation in mammalian cells. Nat Rev Genet 2 (4): 292–301. PMID 11283701.
  96. Pál C, Papp B, Lercher M (2006). An integrated view of protein evolution. Nat Rev Genet 7 (5): 337 – 48. PMID 16619049.
  97. O'Driscoll M, Jeggo P (2006). The role of double-strand break repair - insights from human genetics. Nat Rev Genet 7 (1): 45 – 54. PMID 16369571.
  98. Vispé S, Defais M (1997). Mammalian Rad51 protein: a RecA homologue with pleiotropic functions. Biochimie 79 (9-10): 587-92.
  99. Neale MJ, Keeney S (2006). Clarifying the mechanics of DNA strand exchange in meiotic recombination. Nature 442 (7099): 153-8.
  100. Dickman M, Ingleston S, Sedelnikova S, Rafferty J, Lloyd R, Grasby J, Hornby D (2002). The RuvABC resolvasome. Eur J Biochem 269 (22): 5492 – 501. PMID 12423347.
  101. Orgel L. Prebiotic chemistry and the origin of the RNA world. Crit Rev Biochem Mol Biol 39 (2): 99 – 123. PMID 15217990.
  102. Davenport R (2001). Ribozymes. Making copies in the RNA world. Science 292 (5520): 1278.
  103. Szathmáry E (1992). What is the optimum size for the genetic alphabet? Proc Natl Acad Sci U S A 89 (7): 2614 – 8.
  104. Lindahl T (1993). Instability and decay of the primary structure of DNA. Nature 362 (6422): 709 – 15. PMID 8469282.
  105. Vreeland R, Rosenzweig W, Powers D (2000). Isolation of a 250 million-year-old halotolerant bacterium from a primary salt crystal. Nature 407 (6806): 897 – 900. PMID 11057666.
  106. Hebsgaard M, Phillips M, Willerslev E (2005). Geologically ancient DNA: fact or artefact? Trends Microbiol 13 (5): 212 – 20. PMID 15866038.
  107. Nickle D, Learn G, Rain M, Mullins J, Mittler J (2002). Curiously modern DNA for a "250 million-year-old" bacterium. J Mol Evol 54 (1): 134 – 7. PMID 11734907.
  108. Goff SP, Berg P (1976). Construction of hybrid viruses containing SV40 and lambda phage DNA segments and their propagation in cultured monkey cells. Cell 9 (4 PT 2): 695–705.
  109. Houdebine L. Transgenic animal models in biomedical research. Methods Mol Biol 360: 163 – 202.
  110. Daniell H, Dhingra A (2002). Multigene engineering: dawn of an exciting new era in biotechnology. Curr Opin Biotechnol 13 (2): 136 – 41.
  111. Job D (2002). Plant biotechnology in agriculture. Biochimie 84 (11): 1105 – 10.
  112. Collins A, Morton N (1994). Likelihood ratios for DNA identification. Proc Natl Acad Sci U S A 91 (13): 6007 – 11. PMID 8016106.
  113. Weir B, Triggs C, Starling L, Stowell L, Walsh K, Buckleton J (1997). Interpreting DNA mixtures. J Forensic Sci 42 (2): 213 – 22. PMID 9068179.
  114. Jeffreys A, Wilson V, Thein S. Individual-specific 'fingerprints' of human DNA. Nature 316 (6023): 76 – 9. PMID 2989708.
  115. Colin Pitchfork — first murder conviction on DNA evidence also clears the prime suspect Forensic Science Service Accessed 23 Dec 2006
  116. DNA Identification in Mass Fatality Incidents. National Institute of Justice (September 2006).
  117. Baldi, Pierre. Brunak, Soren. Bioinformatics: The Machine Learning Approach MIT Press (2001) ISBN 978-0-262-02506-5
  118. Gusfield, Dan. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, 15 January 1997. ISBN 978-0-521-58519-4.
  119. Sjölander K (2004). Phylogenomic inference of protein molecular function: advances and challenges. Bioinformatics 20 (2): 170-9. PMID 14734307.
  120. Mount DM (2004). Bioinformatics: Sequence and Genome Analysis, 2nd ed., Cold Spring Harbor Laboratory Press. ISBN 0879697121.
  121. Adleman L (1994). Molecular computation of solutions to combinatorial problems. Science 266 (5187): 1021 – 4. PMID 7973651.
  122. Parker J (2003). Computing with DNA. EMBO Rep 4 (1): 7 – 10. PMID 12524509.
  123. Ashish Gehani, Thomas LaBean and John Reif. DNA-Based Cryptography. Proceedings of the 5th DIMACS Workshop on DNA Based Computers, Cambridge, MA, USA, 14 – 15 June 1999.
  124. Wray G (2002). Dating branches on the tree of life using DNA. Genome Biol 3 (1): REVIEWS0001. PMID 11806830.
  125. Lost Tribes of Israel, NOVA, PBS airdate: 22 February 2000. Transcript available from PBS.org, (last accessed on 4 March 2006)
  126. Kleiman, Yaakov. "The Cohanim/DNA Connection: The fascinating story of how DNA studies confirm an ancient biblical tradition". aish.com (January 13, 2000). Accessed 4 March 2006.
  127. Bhattacharya, Shaoni. "Killer convicted thanks to relative's DNA". newscientist.com (20 April 2004). Accessed 22 Dec 06
  128. Dahm R (2005). Friedrich Miescher and the discovery of DNA. Dev Biol 278 (2): 274 – 88. PMID 15680349.
  129. Levene P (1919). The structure of yeast nucleic acid. J Biol Chem 40 (2): 415 – 24.
  130. Astbury W (1947). Nucleic acid. Symp Soc Exp Biol 1 (66).
  131. Avery O, MacLeod C, McCarty M (1944). Studies on the chemical nature of the substance inducing transformation of pneumococcal types. Inductions of transformation by a desoxyribonucleic acid fraction isolated from pneumococcus type III. J Exp Med 79 (2): 137 – 158.
  132. Hershey A, Chase M (1952). Independent functions of viral protein and nucleic acid in growth of bacteriophage. J Gen Physiol 36 (1): 39 – 56. PMID 12981234.
  133. Watson J.D. and Crick F.H.C. "A Structure for Deoxyribose Nucleic Acid". (PDF) Nature 171, 737 – 738 (1953). Accessed 13 Feb 2007.
  134. Nature Archives Double Helix of DNA: 50 Years
  135. Molecular Configuration in Sodium Thymonucleate. Franklin R. and Gosling R.G. Nature 171, 740 – 741 (1953). Nature Archives, full text (PDF)
  136. Original X-ray diffraction image
  137. Molecular Structure of Deoxypentose Nucleic Acids. Wilkins M.H.F., Stokes A.R. and Wilson H.R. Nature 171, 738 – 740 (1953). Nature Archives (PDF)
  138. Evidence for 2-Chain Helix in Crystalline Structure of Sodium Deoxyribonucleate. Franklin R. and Gosling R.G. Nature 172, 156 – 157 (1953). Nature Archives, full text (PDF)
  139. The Nobel Prize in Physiology or Medicine 1962 Nobelprize.org Accessed 22 Dec 06
  140. Crick, F.H.C. On degenerate templates and the adaptor hypothesis (PDF). genome.wellcome.ac.uk (Lecture, 1955). Accessed 22 Dec 2006
  141. Meselson M, Stahl F (1958). The replication of DNA in Escherichia coli. Proc Natl Acad Sci U S A 44 (7): 671 – 82. PMID 16590258.
  142. The Nobel Prize in Physiology or Medicine 1968 Nobelprize.org Accessed 22 Dec 06

Further reading

  • Clayton, Julie. (Ed.). 50 Years of DNA, Palgrave MacMillan Press, 2003. ISBN 978-1-40-391479-8
  • Judson, Horace Freeland. The Eighth Day of Creation: Makers of the Revolution in Biology, Cold Spring Harbor Laboratory Press, 1996. ISBN 978-0-87-969478-4
  • Olby, Robert. The Path to The Double Helix: Discovery of DNA, first published in October 1974 by MacMillan, with foreword by Francis Crick; ISBN 978-0-48-668117-7; the definitive DNA textbook, revised in 1994, with a 9-page postscript.
  • Ridley, Matt. Francis Crick: Discoverer of the Genetic Code (Eminent Lives) HarperCollins Publishers; 192 pp, ISBN 978-0-06-082333-7 2006
  • Rose, Steven. The Chemistry of Life, Penguin, ISBN 978-0-14-027273-4.
  • Watson, James D. and Francis H.C. Crick. A structure for Deoxyribose Nucleic Acid (PDF). Nature 171, 737 – 738, 25 April 1953.
  • Watson, James D. DNA: The Secret of Life ISBN 978-0-375-41546-3.
  • Watson, James D. The Double Helix: A Personal Account of the Discovery of the Structure of DNA (Norton Critical Editions). ISBN 978-0-393-95075-5
  • Watson, James D. Avoid Boring People and Other Lessons from a Life in Science. New York: Random House, 2007. 366 pp. ISBN 978-0-375-41284-4 (0-375-41284-0)
  • Calladine, Chris R.; Drew, Horace R.; Luisi, Ben F. and Travers, Andrew A. Understanding DNA, Elsevier Academic Press, 2003. ISBN 978-0-12155089-9

External links

Nucleic acids
Nucleobases: Adenine - Thymine - Uracil - Guanine - Cytosine - Purine - Pyrimidine
Nucleosides: Adenosine - Uridine - Guanosine - Cytidine - Deoxyadenosine - Thymidine - Deoxyguanosine - Deoxycytidine
Nucleotides: AMP - UMP - GMP - CMP - ADP - UDP - GDP - CDP - ATP - UTP - GTP - CTP - cAMP - cGMP
Deoxynucleotides: dAMP - dTMP - dUMP - dGMP - dCMP - dADP - dTDP - dUDP - dGDP - dCDP - dATP - dTTP - dUTP - dGTP - dCTP
Nucleic acids: DNA - RNA - LNA - PNA - mRNA - ncRNA - miRNA - rRNA - siRNA - tRNA - mtDNA - Oligonucleotide

Credits

New World Encyclopedia writers and editors rewrote and completed the Wikipedia article in accordance with New World Encyclopedia standards. This article abides by terms of the Creative Commons CC-by-sa 3.0 License (CC-by-sa), which may be used and disseminated with proper attribution. Credit is due under the terms of this license that can reference both the New World Encyclopedia contributors and the selfless volunteer contributors of the Wikimedia Foundation. To cite this article click here for a list of acceptable citing formats. The history of earlier contributions by Wikipedians is accessible to researchers here:

The history of this article since it was imported to New World Encyclopedia:

Note: Some restrictions may apply to use of individual images which are separately licensed.