Molecular Homology

The genetic code is nearly universal. Almost all organisms use the same four nucleotide bases, read in three-base codons, to code for the same set of 20 amino acids from which their proteins are built. This suggests that all organisms are descended from a single common ancestor.

The image shows a visual representation of the process from DNA to protein. At the top, there is a double-stranded DNA molecule with labeled nucleotide bases. Below it, an arrow labeled "Transcription" points to a single-stranded mRNA molecule, also labeled with nucleotide bases. Further down, another arrow labeled "Translation" points to a chain of amino acids in a protein.

Molecular homology compares the sequences of molecules, such as DNA and proteins, to determine the evolutionary relationships between organisms. It provides strong evidence that species with more similar molecular features are more closely related and share a more recent common ancestor. For example, the following diagram shows the sequence alignment between great apes for a segment of DNA. From this alignment, we can see that humans and chimpanzees share the most similar DNA sequences, gorillas differ somewhat from both, and orangutans differ the most. This corresponds to how recently each species diverged from a common ancestor.

An image of a phylogenetic tree showing the evolutionary relationships among humans, chimpanzees, gorillas, and orangutans. To the right of each species label is a horizontal bar representing a genetic sequence alignment, with blue sections indicating aligned sequences and gray gaps representing missing or unmatched regions in the alignment. Orangutans shared a common ancestor with humans longest ago, and have the fewest gaps and most aligned sequence. Gorillas, with the next oldest common ancestor with humans, have some further gaps in their sequence compared to orangutans. Chimpanzees and humans, who share the most recent common ancestor, have similar sequences, with the most gaps compared to orangutans.
Source: Your Genome (2019) Illustration showing a comparison of the genomes of four great apes and their evolutionary relatedness. LabXchange.

Molecular homology compares DNA sequences and amino acid sequences between species to determine relatedness. This information can be represented as a phylogenetic tree that shows relatedness amongst species, like the one above.


Use this page to revise the following concepts within molecular homology:


DNA sequencing

The order of bases along a DNA strand is its sequence. The more similar the DNA sequences of two organisms are, the more closely related the organisms are and the more recently they shared a common ancestor.

DNA sequencing is used to understand the conservation of genes and to determine genome phylogeny. The differences found in the comparison of DNA sequences between species can be plotted against time in a phylogenetic tree.

Closely related species will show more similarities in the sequence of their common genes and have a more recent common ancestor.

Less closely related species will show more differences in the sequence of their common genes and have a more distant common ancestor.

Essential genes may be strongly conserved over time and will therefore show very similar base sequences, even between distantly related species.
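The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the sequences have already been aligned to the same length; the function names and the three 12-base sequences are invented for this example and are not real genomic data.

```python
def count_differences(seq_a, seq_b):
    """Count the positions at which two aligned sequences differ."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which the two sequences match."""
    differences = count_differences(seq_a, seq_b)
    return 100 * (len(seq_a) - differences) / len(seq_a)


# Invented sequences for illustration only (assumption, not real data).
toy_sequences = {
    "species_A": "ATGCCGTAAGCT",
    "species_B": "ATGCCGTAAGCA",
    "species_C": "ATGTCGTTAGCA",
}

# Compare every sequence against species_A as the reference.
reference = toy_sequences["species_A"]
for name, seq in toy_sequences.items():
    diffs = count_differences(reference, seq)
    identity = percent_identity(reference, seq)
    print(f"{name}: {diffs} differences, {identity:.1f}% identical")
```

In this toy data, species_B differs from the reference at fewer positions than species_C, so it would be placed closer to species_A on a phylogenetic tree.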

Amino acid sequencing

All living things contain proteins, which are built from specific sequences of amino acids.

Although amino acid sequences are encoded in DNA sequences, it is still phylogenetically useful to compare amino acid sequences as well. Because the genetic code is redundant (most amino acids are coded for by more than one codon), numerous silent mutations may accumulate over time without changing an amino acid sequence. Therefore, between related species, amino acid sequences are often more similar than the corresponding DNA sequences.
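The effect of silent mutations can be shown with a short Python sketch. The codon table below is only the part of the standard genetic code needed here (written as coding-strand DNA codons), and the two 12-base sequences and the `translate` helper are invented for illustration.

```python
# Partial codon table from the standard genetic code (coding-strand DNA codons).
# Only the codons used in this example are listed.
CODON_TABLE = {
    "ATG": "Met",
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GTT": "Val", "GTC": "Val", "GTA": "Val", "GTG": "Val",
    "AAA": "Lys", "AAG": "Lys",
}


def translate(dna):
    """Translate a DNA sequence into amino acids, one codon (three bases) at a time."""
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3)]


# Invented sequences (assumption): they differ only at the third base of three codons.
species_x_dna = "ATGGGTGTTAAA"
species_y_dna = "ATGGGCGTGAAG"

dna_differences = sum(1 for a, b in zip(species_x_dna, species_y_dna) if a != b)
protein_differences = sum(
    1 for a, b in zip(translate(species_x_dna), translate(species_y_dna)) if a != b
)

print("DNA differences:", dna_differences)
print("Amino acid differences:", protein_differences)
print("Protein:", "-".join(translate(species_x_dna)))
```

The two DNA sequences differ at three bases, yet both translate to the same four amino acids, which is why the amino acid sequences of related species can look more alike than their DNA.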

The degree of difference between proteins is determined by calculating the number of amino acids that have changed since the two groups diverged.

The following table illustrates the high degree of similarity in the cytochrome c amino acid sequence between humans and pigs. In contrast, there is less similarity in the amino acid sequence between humans and fruit flies.

A table showing a CLUSTAL alignment of amino acid sequences for cytochrome c from four organisms: humans, pigs, chickens, and fruit flies. The sequences are aligned horizontally, showing amino acids such as glycine (Gly), valine (Val), and lysine (Lys). The human and pig sequences are nearly identical, with only minor differences. The chicken sequence shows a few more differences compared to humans, and the fruit fly sequence has the most differences.

The protein cytochrome c is commonly used to compare organisms. It is a vital protein in the electron transport chain in aerobic respiration. The table below shows the relatedness between humans and four other species based on the number of amino acid differences in the sequence of cytochrome c. Species whose sequences differ from the human sequence at more amino acid positions are less closely related to humans.

Organism compared (cytochrome c sequence) | Number of amino acids different from humans
Rhesus monkey | 1
Rabbit | 9
Penguin | 11
Moth | 24
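As a small sketch, the values in the table above can be ranked in Python to read off the order of relatedness to humans. The variable names are chosen for this example; the numbers are the ones listed in the table.

```python
# Amino acid differences from human cytochrome c, as listed in the table above.
differences_from_human = {
    "Rhesus monkey": 1,
    "Rabbit": 9,
    "Penguin": 11,
    "Moth": 24,
}

# Fewer differences suggests a more recent common ancestor with humans.
ranked = sorted(differences_from_human.items(), key=lambda item: item[1])

for species, differences in ranked:
    print(f"{species}: {differences} amino acid differences")

closest_relative = ranked[0][0]
most_distant_relative = ranked[-1][0]
```

Sorting by the number of differences places the rhesus monkey closest to humans and the moth furthest away, matching the order expected on a phylogenetic tree.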

The differences found in the comparison of these amino acid sequences for a specific protein can be plotted against time in a phylogenetic tree.

Closely related species will show more similarities in the sequence of certain proteins and have a more recent common ancestor.

Less closely related species will show more differences in the sequence of certain proteins and have a more distant common ancestor.