Learning from primary structure
Sequence alignment
Sequence alignment
Aligning sequences lets us:
• measure their similarity
• determine the residue-residue correspondences
• observe patterns of conservation and variability
• infer evolutionary relationships
Measure of similarity
Alignment: the identification of residue-residue correspondences.
Correspondences must preserve the order of residues.
Gaps may be introduced.
Example:
First string = a b c d e
Second string = a c d e f
A reasonable alignment:
a b c d e -
a - c d e f
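The two rules above (residue order is preserved, gaps may be introduced) can be checked mechanically. The following is a minimal Python sketch and not part of the original slides; the function name is_valid_alignment and the choice of '-' as the gap character are assumptions made for illustration.

```python
def is_valid_alignment(top, bottom, first, second, gap="-"):
    """Check that (top, bottom) is a legal alignment of first and second.

    Legal here means: both rows have the same length, no column pairs a gap
    with a gap, and removing the gaps gives back the original strings, so the
    order of residues is preserved.
    """
    if len(top) != len(bottom):
        return False
    if any(a == gap and b == gap for a, b in zip(top, bottom)):
        return False
    return top.replace(gap, "") == first and bottom.replace(gap, "") == second


# The alignment from the example, written without spaces.
print(is_valid_alignment("abcde-", "a-cdef", "abcde", "acdef"))
# A row that reorders residues (d before c) is rejected.
print(is_valid_alignment("abcde-", "a-dcef", "abcde", "acdef"))
```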
Measure of similarity
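We must define criteria so that an algorithm can choose the best alignment.
Example:
First sequence: gctgaacg
Second sequence: ctataatc
Alignments:
(1) No residues aligned (every column contains a gap):
- - - - - - - g c t g a a c g
c t a t a a t c - - - - - - -
(2) No gaps:
g c t g a a c g
c t a t a a t c
(3) With gaps:
g c t g a - a - - c g
- - c t - a t a a t c
(4) Another alignment with gaps:
g c t g - a a - c g
- c t a t a a t c -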
Measure of similarity
We need a way to examine all possible alignments systematically. We then need to compute a score reflecting the quality of each possible alignment, and to identify the alignment with the optimal score.
Several different alignments may give the same best score.
Even minor variations in the scoring scheme may change the ranking of alignments, causing a different one to emerge as the best.
Dotplot
A dotplot:
• gives an overview of the similarities between two sequences
• is closely related to the alignment between the two sequences
Dotplot showing identities between the short name (DOROTHYHODGKIN) and the full name (DOROTHYCROWFOOTHODGKIN).
From: Lesk, Introduction to Bioinformatics
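The identity dotplot described on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not code from the slides; the function names dotplot and render, and the '*' and '.' symbols, are assumptions.

```python
def dotplot(rows, cols):
    """Return a grid where cell [i][j] is True when rows[i] == cols[j]."""
    return [[r == c for c in cols] for r in rows]


def render(rows, cols):
    """Draw the grid as text: '*' marks an identity, '.' a non-match."""
    lines = ["  " + cols]
    for residue, row in zip(rows, dotplot(rows, cols)):
        lines.append(residue + " " + "".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


# Short name along the rows, full name along the columns.
print(render("DOROTHYHODGKIN", "DOROTHYCROWFOOTHODGKIN"))
```

In the printed grid, the shared DOROTHY and HODGKIN segments show up as two diagonal runs of matches, displaced from each other by the length of the inserted CROWFOOT.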
Dotplot
Dotplot showing identities between a repetitive sequence (ABRACADABRACADABRA) and itself. The repeats appear on several subsidiary diagonals parallel to the main diagonal.
From: Lesk, Introduction to Bioinformatics
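The subsidiary diagonals can be found by counting identities along each diagonal of the self-comparison. The Python sketch below is an illustration only; the function name diagonal_matches is an assumption. A diagonal at distance offset from the main one compares position i with position i + offset.

```python
def diagonal_matches(seq, offset):
    """Count identities on the diagonal `offset` cells from the main diagonal."""
    return sum(seq[i] == seq[i + offset] for i in range(len(seq) - offset))


sequence = "ABRACADABRACADABRA"
for offset in range(len(sequence)):
    length = len(sequence) - offset  # number of cells on this diagonal
    print(offset, diagonal_matches(sequence, offset), "of", length)
```

Because the unit ABRACAD is 7 letters long, the diagonals at offsets 7 and 14 are filled completely (11 of 11 and 4 of 4 cells), just like the main diagonal at offset 0.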
Dotplot
Dotplot showing identities between the palindromic sequence MAX I STAY AWAY AT SIX AM and itself. The palindrome reveals itself as a stretch of matches perpendicular to the main diagonal.
From: Lesk, Introduction to Bioinformatics
Remember that restriction enzymes and transcriptional regulatory factors may recognize palindromic sequences.
EcoRI: 5'-GAATTC-3'
       3'-CTTAAG-5'
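The two kinds of palindrome on this slide differ slightly: the sentence reads the same backwards letter by letter, whereas the EcoRI site reads the same on the opposite strand, that is, it equals its own reverse complement. A minimal Python sketch of both checks follows; the function names are assumptions, and only the four unambiguous DNA letters are handled.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, read in its own 5' to 3' direction."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def is_text_palindrome(text):
    """True if the letters read the same in both directions (spaces ignored)."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    return letters == letters[::-1]


def is_dna_palindrome(dna):
    """True if the sequence equals its own reverse complement."""
    return dna.upper() == reverse_complement(dna)


print(is_text_palindrome("MAX I STAY AWAY AT SIX AM"))
print(reverse_complement("GAATTC"))
print(is_dna_palindrome("GAATTC"))
```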
Dotplot
Dotplot relating the mitochondrial ATPase-6 genes from a lamprey and a dogfish shark. Similarity of the sequences is weakest near the beginning.
From: Lesk, Introduction to Bioinformatics
The dotplot is a weak approach for comparing related but distant sequences.
Dotplot
Protein dotplot: a dotplot relating the PAX-6 protein of mouse and the eyeless protein of Drosophila melanogaster.
The mouse sequence shows an insertion that is missing in Drosophila.
Adapted from: Lesk, Introduction to Bioinformatics
Dotplot and sequence alignment
The dotplot captures the overall similarity of two sequences and also the complete set and relative quality of the different possible alignments.
A path through the dotplot from the upper left to the lower right corresponds to an alignment. A diagonal move indicates that two residues align; a horizontal move indicates that a gap must be introduced in the sequence shown along the rows; a vertical move indicates that a gap must be introduced in the sequence shown along the columns.
From: Lesk, Introduction to Bioinformatics
DOROTHY--------HODGKIN
DOROTHYCROWFOOTHODGKIN
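The correspondence between a path and an alignment can be made concrete with a short Python sketch. It is an illustration under stated assumptions: the move letters 'D' (diagonal), 'H' (horizontal) and 'V' (vertical) and the function name path_to_alignment are not from the slides, and the short name is taken as the row sequence.

```python
def path_to_alignment(row_seq, col_seq, moves):
    """Turn a path of 'D', 'H', 'V' moves into two gapped rows.

    'D' aligns the next residue of each sequence.
    'H' consumes a column residue and puts a gap in the row sequence.
    'V' consumes a row residue and puts a gap in the column sequence.
    """
    row_out, col_out = [], []
    i = j = 0
    for move in moves:
        if move == "D":
            row_out.append(row_seq[i])
            col_out.append(col_seq[j])
            i += 1
            j += 1
        elif move == "H":
            row_out.append("-")
            col_out.append(col_seq[j])
            j += 1
        elif move == "V":
            row_out.append(row_seq[i])
            col_out.append("-")
            i += 1
        else:
            raise ValueError(f"unknown move: {move!r}")
    if i != len(row_seq) or j != len(col_seq):
        raise ValueError("path must end at the lower-right corner")
    return "".join(row_out), "".join(col_out)


# 7 diagonal moves (DOROTHY), 8 horizontal (CROWFOOT), 7 diagonal (HODGKIN).
moves = "D" * 7 + "H" * 8 + "D" * 7
top, bottom = path_to_alignment("DOROTHYHODGKIN", "DOROTHYCROWFOOTHODGKIN", moves)
print(top)
print(bottom)
```

Running the sketch prints the same two rows as the alignment shown above, with the eight horizontal moves producing the eight gap characters opposite CROWFOOT.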
Measures of sequence similarity
Given two character strings, two measures of the distance between them are:
• The Hamming distance, defined between two strings of equal length, is the number of positions with mismatching characters.
• The Levenshtein distance, or edit distance, between two strings of not necessarily equal length, is the minimal number of 'edit operations' required to change one string into the other, where an edit operation is a deletion, insertion or alteration of a single character in either sequence.
For example:
agtc
cgta
Hamming distance = 2
ag-tcc
cgctca
Levenshtein distance = 3
From: Lesk, Introduction to Bioinformatics
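Both distances are straightforward to compute. The Python sketch below is illustrative and not taken from the slides: the function names are assumptions, and the Levenshtein function uses the standard row-by-row dynamic programming recurrence with unit cost for each deletion, insertion or alteration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a, b):
    """Minimal number of single-character edits turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # keep or alter
            ))
        prev = curr
    return prev[-1]


print(hamming("agtc", "cgta"))
print(levenshtein("agtcc", "cgctca"))
```

The two calls reproduce the values given in the example: 2 for the Hamming distance and 3 for the Levenshtein distance.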
