Transcription
Lesson Content
Inquire: mRNA Synthesis
Overview
In prokaryotes, mRNA synthesis is initiated at a promoter sequence on the DNA template. Elongation synthesizes new mRNA. Termination liberates the mRNA and occurs by mechanisms that stall the RNA polymerase and cause it to fall off the DNA template. Newly transcribed eukaryotic mRNAs are modified with a cap and a poly-A tail. These structures protect the mature mRNA from degradation and help export it from the nucleus. Eukaryotic mRNAs also undergo splicing, in which introns are removed and exons are reconnected with single-nucleotide accuracy. Only complete mRNAs are exported from the nucleus to the cytoplasm.
Big Question
What are the steps in eukaryotic transcription?
Watch: Molecular Biology
Read: Transcription
Overview
In both prokaryotes and eukaryotes, the second function of DNA (the first being replication) is to provide the information needed to construct the proteins the cell requires to perform all of its functions. To do this, the DNA is “read,” or transcribed, into an mRNA molecule. The mRNA then provides the code to form a protein by a process called translation. Through the processes of transcription and translation, a protein is built with a specific sequence of amino acids that was originally encoded in the DNA. This module discusses the details of transcription.
The Central Dogma: DNA Encodes RNA; RNA Encodes Protein
The flow of genetic information in cells from DNA to mRNA to protein is described by the central dogma, which states that genes specify the sequences of mRNAs, which in turn specify the sequences of proteins.
The copying of DNA to mRNA is relatively straightforward, with one nucleotide being added to the mRNA strand for every complementary nucleotide read in the DNA strand. The translation to protein is more complex because groups of three mRNA nucleotides correspond to one amino acid of the protein sequence. However, as we shall see in the next module, the translation to protein is still systematic: nucleotides 1 to 3 correspond to amino acid 1, nucleotides 4 to 6 correspond to amino acid 2, and so on.
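The triplet grouping described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of the ribosome: the function name `split_into_codons` and the nine-nucleotide sequence are hypothetical, chosen only to show how nucleotide positions map onto amino acid positions.

```python
def split_into_codons(mrna):
    """Return consecutive, non-overlapping groups of three nucleotides."""
    # Ignore any trailing 1-2 nucleotides that do not form a complete triplet.
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]

# Hypothetical nine-nucleotide mRNA fragment, chosen only for illustration.
codons = split_into_codons("AUGCAUCCG")
for position, codon in enumerate(codons, start=1):
    first = 3 * position - 2
    print(f"nucleotides {first} to {first + 2} -> amino acid {position}: {codon}")
```

Each printed line pairs one block of three nucleotides with one amino acid position, which is the systematic correspondence the paragraph describes.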
Transcription: from DNA to mRNA
Both prokaryotes and eukaryotes perform fundamentally the same process of transcription, with the important difference of the membrane-bound nucleus in eukaryotes. Because eukaryotic genes are enclosed in the nucleus, transcription occurs there, and the mRNA transcript must be transported to the cytoplasm. Prokaryotes, which include bacteria and archaea, lack membrane-bound nuclei and other organelles, so transcription occurs in the cytoplasm of the cell. In both prokaryotes and eukaryotes, transcription occurs in three main stages: initiation, elongation, and termination.
Initiation
Transcription requires the DNA double helix to partially unwind in the region of mRNA synthesis. The region of unwinding is called a transcription bubble. The DNA sequence onto which the proteins and enzymes involved in transcription bind to initiate the process is called a promoter. In most cases, promoters exist upstream of the genes they regulate. The specific sequence of a promoter is very important because it determines whether the corresponding gene is transcribed all of the time, some of the time, or hardly ever.
Elongation
Transcription always proceeds from one of the two DNA strands, which is called the template strand. The mRNA product is complementary to the template strand and is almost identical to the other DNA strand, called the nontemplate strand, with the exception that RNA contains a uracil (U) in place of the thymine (T) found in DNA. During elongation, an enzyme called RNA polymerase proceeds along the DNA template adding nucleotides by base pairing with the DNA template in a manner similar to DNA replication, with the difference that an RNA strand is being synthesized that does not remain bound to the DNA template. As elongation proceeds, the DNA is continuously unwound ahead of the core enzyme and rewound behind it.
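The relationship between the template strand, the nontemplate strand, and the mRNA can be checked with a short Python sketch. It assumes the template is written in the 3′ to 5′ direction so that the resulting mRNA reads 5′ to 3′; the function names and the example sequence are hypothetical.

```python
# Base-pairing rules for reading a DNA template into RNA or into DNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Build mRNA (5'->3') by pairing each template base read 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)

def nontemplate_strand(template_3_to_5):
    """Return the complementary DNA strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5)

template = "TACGTAGGC"  # hypothetical template strand, written 3'->5'
mrna = transcribe(template)
coding = nontemplate_strand(template)

print(mrna)    # AUGCAUCCG
print(coding)  # ATGCATCCG

# The mRNA matches the nontemplate strand except that U replaces T.
assert mrna == coding.replace("T", "U")
```

The final assertion restates the point made in the text: the mRNA is complementary to the template strand and nearly identical to the nontemplate strand.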
Termination
Once a gene is transcribed, the prokaryotic polymerase needs to be instructed to dissociate from the DNA template and liberate the newly made mRNA. Depending on the gene being transcribed, there are two kinds of termination signals, but both involve repeated nucleotide sequences in the DNA template that result in RNA polymerase stalling, leaving the DNA template, and freeing the mRNA transcript.
On termination, the process of transcription is complete. In a prokaryotic cell, by the time termination occurs, the transcript has already been used to partially synthesize numerous copies of the encoded protein because these processes can occur concurrently using multiple ribosomes (polyribosomes). In contrast, the presence of a nucleus in eukaryotic cells precludes simultaneous transcription and translation.
Eukaryotic RNA Processing
Newly transcribed eukaryotic mRNAs must undergo several processing steps before they can be transferred from the nucleus to the cytoplasm and translated into a protein. The additional steps involved in eukaryotic mRNA maturation create a molecule that is much more stable than a prokaryotic mRNA: eukaryotic mRNAs last for several hours, whereas the typical prokaryotic mRNA lasts no more than five seconds.
The mRNA transcript is first coated in RNA-stabilizing proteins to prevent it from degrading while it is processed and exported out of the nucleus. In addition, while the pre-mRNA is still being synthesized, a special nucleotide “cap” is added to the 5′ end of the growing transcript. Besides preventing degradation, the cap is recognized by factors involved in protein synthesis, which helps initiate translation by ribosomes.
Once elongation is complete, an enzyme adds a string of approximately 200 adenine residues, called the poly-A tail, to the 3′ end. This modification further protects the pre-mRNA from degradation and signals to cellular factors that the transcript needs to be exported to the cytoplasm.
Eukaryotic genes are made up of protein-coding sequences called exons (ex-on signifies that they are expressed) and intervening sequences called introns (int-ron denotes their intervening role). Intron sequences in mRNA do not encode functional proteins. The process of removing introns and reconnecting exons is called splicing, and it takes place while the pre-mRNA is still in the nucleus; the removed introns are degraded. It is essential that all of a pre-mRNA’s introns be completely and precisely removed before protein synthesis so that the exons join together to code for the correct amino acids. If the process errs by even a single nucleotide, the reading frame of the rejoined exons is shifted, and the resulting protein would be nonfunctional.
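Why splicing must be accurate to a single nucleotide can be shown with a small Python sketch. Everything here is an illustrative assumption: the pre-mRNA sequence is invented, the intron is written in lowercase only to make it visible, and exon boundaries are passed in by hand rather than recognized the way the cell's splicing machinery does.

```python
def splice(pre_mrna, exon_spans):
    """Join exons given as (start, end) index pairs, end exclusive."""
    return "".join(pre_mrna[start:end] for start, end in exon_spans)

def codons(mrna):
    """Split a sequence into complete triplets, dropping any leftover bases."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - len(mrna) % 3, 3)]

# Hypothetical pre-mRNA: exon 1, a lowercase intron, then exon 2.
pre_mrna = "AUGCAU" + "guaaguuuucag" + "CCGCAUCCA"

# Correct splice: the intron (indices 6 to 17) is removed exactly.
correct = splice(pre_mrna, [(0, 6), (18, 27)])
# Faulty splice: exon 2 is cut one nucleotide too late.
shifted = splice(pre_mrna, [(0, 6), (19, 27)])

print(codons(correct))  # ['AUG', 'CAU', 'CCG', 'CAU', 'CCA']
print(codons(shifted))  # ['AUG', 'CAU', 'CGC', 'AUC']
```

In the faulty splice, every codon after the junction differs from the correct message, which is why a one-nucleotide error changes all downstream amino acids.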
Expand: The Central Dogma
Investigate
The Central Dogma describes the normal flow of genetic information from DNA to mRNA to protein: the DNA in genes specifies sequences of mRNA which, in turn, specify amino acid sequences in proteins. The process requires two steps: transcription and translation. During transcription, genes are used to make messenger RNA (mRNA). In turn, the mRNA is used to direct the synthesis of proteins during the process of translation. Translation also requires two other types of RNA: transfer RNA (tRNA) and ribosomal RNA (rRNA). The genetic code is a triplet code, with each RNA codon consisting of three consecutive nucleotides that specify one amino acid or the release of the newly formed polypeptide chain; for example, the mRNA codon CAU specifies the amino acid histidine. The code is degenerate; that is, some amino acids are specified by more than one codon, like the synonyms you study in English class (different word, same meaning). For example, CCU, CCC, CCA, and CCG are all codons for proline. It is important to remember that the genetic code is nearly universal: almost all organisms on Earth use the same code, with small variations in codon assignment in mitochondria and some microorganisms.
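The degeneracy of the code can be illustrated with a deliberately partial lookup table in Python. The table below holds only the codons named in the paragraph above plus the three standard stop codons (UAA, UAG, UGA); the full code has 64 entries. The function name and the two example messages are hypothetical, and the sketch ignores the start codon for simplicity.

```python
# Partial codon table: only the codons named in the text, plus the three
# stop codons. The complete standard table has 64 entries.
CODON_TABLE = {
    "CAU": "His",
    "CCU": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

def translate(mrna):
    """Translate codon by codon until a stop codon or the end of the sequence."""
    peptide = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "STOP":
            break  # a stop codon releases the polypeptide chain
        peptide.append(residue)
    return peptide

# Two hypothetical messages that use different proline codons.
print(translate("CAUCCUCCAUAA"))  # ['His', 'Pro', 'Pro']
print(translate("CAUCCGCCCUGA"))  # ['His', 'Pro', 'Pro']
```

The two messages differ in nucleotide sequence but yield the same amino acids, which is what it means for the code to be degenerate.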
The cellular process of transcription generates messenger RNA (mRNA), a mobile molecular copy of one or more genes with an alphabet of A, C, G, and uracil (U). Translation of the mRNA template on ribosomes converts nucleotide-based genetic information into a protein product. That is the central dogma of DNA-protein synthesis. Protein sequences consist of 20 commonly occurring amino acids; therefore, it can be said that the protein alphabet consists of 20 “letters.” Different amino acids have different chemistries (such as acidic versus basic, or polar and nonpolar) and different structural constraints. Variation in amino acid sequence is responsible for the enormous variation in protein structure and function.
The Central Dogma: DNA Encodes RNA; RNA Encodes Protein
The flow of genetic information in cells from DNA to mRNA to protein is described by the central dogma, which states that genes specify the sequence of mRNAs, which in turn specify the sequence of amino acids making up all proteins. The decoding of one molecule to another is performed by specific proteins and RNAs. As the information stored in DNA is so central to cellular function, it makes sense that the cell makes mRNA copies of this information for protein synthesis, while keeping the DNA itself intact and protected. The copying of DNA to RNA is relatively straightforward, with one nucleotide being added to the mRNA strand for every nucleotide read in the DNA strand. The translation to protein is a bit more complex because three mRNA nucleotides correspond to one amino acid in the polypeptide sequence. However, the translation to protein is still systematic and colinear, such that nucleotides 1 to 3 correspond to amino acid 1, nucleotides 4 to 6 correspond to amino acid 2, and so on.
Lesson Resources
Lesson Toolbox
Additional Resources and Readings
DNA, Hot Pockets, & The Longest Word Ever: Crash Course Biology #11
A Crash Course Biology video covering DNA transcription and translation
An interactive activity allowing you to transcribe a gene and translate it to protein using complementary pairing and the genetic code
DNA Central Dogma Part 1: Transcription
A video of the first part of transcription and the base sequence repetition of the TATA box
An animation of the process of prokaryotic transcription
Lesson Glossary
Terms

- central dogma: states that genes specify the sequence of mRNAs, which in turn specify the sequence of proteins
- colinear: in terms of RNA and protein, three “units” of RNA (nucleotides) specify one “unit” of protein (amino acid) in a consecutive fashion
- exons: sequences present in protein-coding mRNA after completion of pre-mRNA splicing
- introns: non-protein-coding intervening sequences that are spliced from mRNA during processing
- mRNA: messenger RNA; a form of RNA that carries the nucleotide sequence code for a protein sequence that is translated into a polypeptide sequence
- nontemplate strand: the strand of DNA that is not used to transcribe mRNA; this strand is identical to the mRNA except that T nucleotides in the DNA are replaced by U nucleotides in the mRNA
- promoter: a sequence of DNA to which RNA polymerase and associated factors bind and initiate transcription
- RNA polymerase: an enzyme that synthesizes an RNA strand from a DNA template strand
- splicing: the process of removing introns and reconnecting exons in a pre-mRNA
- template strand: the strand of DNA that specifies the complementary mRNA molecule
- transcription bubble: the region of locally unwound DNA that allows for transcription of mRNA
License and Citations
Content License
Lesson Content:
Authored and curated by Jill Carson for The TEL Library. CC BY NC SA 4.0
Adapted Content:
Title: Biology – 15.1 The Genetic Code – The Central Dogma: DNA Encodes RNA: Rice University, OpenStax CNX. License: CC BY 4.0
Title: Biology – 9.3 Transcription – The Central Dogma: DNA Encodes RNA: Rice University, OpenStax CNX. License: CC BY 4.0
Media Sources
Link | Author | Publisher | License |
---|---|---|---|
Genetics Chromosomes Rna | OpenClipart-Vectors | Pixabay | CC 0 |
Figure 2. RNA | OpenStax | OpenStax | CC BY 4.0 |
Figure 1. 20 amino acids | OpenStax | OpenStax | CC BY 4.0 |
Figure 5. Eukaryotic mRNA | OpenStax | OpenStax | CC BY 4.0 |
Figure 3. RNA polymerase | OpenStax | OpenStax | CC BY 4.0 |
Figure 4. Multiple polymerases | OpenStax | OpenStax | CC BY 4.0 |
Figure 2. Transcription | OpenStax | OpenStax | CC BY 4.0 |