際際滷

際際滷Share a Scribd company logo
Presented By:-
Ashika
Pokiya(12TI083)
Guide by:-
Nehal Patel
STRING MATCHING
ALGORITHMS
WHAT IS STRING MATCHING
 In computer science, string searching
algorithms, sometimes called string
matching algorithms, that try to find a
place where one or several string (also
called pattern) are found within a larger
string or text.
EXAMPLE
STRING MATCHING PROBLEM
A B C A B A A C A B
A B A A
TEXT
PATTER
N
SHIFT=3
STRING MATCHING
ALGORITHMS
There are many types of String Matching
Algorithms like:-
1) The Naive string-matching algorithm
2) The Rabin-Krap algorithm
3) String matching with finite automata
4) The Knuth-Morris-Pratt algorithm
But we discuss about 2 types of string matching
algorithms.
1) The Naive string-matching algorithm
2) The Rabin-Krap algorithm
THE NAIVE ALGORITHM
The naive algorithm finds all valid shifts using a loop
that checks
the condition P[1.m]=T[s+1. s+m] for each of the n-
m+1
possible values of s.(P=pattern , T=text/string , s=shift)
NAIVE-STRING-MATCHER(T,P)
1) n = T.length
2) m = P.length
3) for s=0 to n-m
4) if P[1m]==T[s+1.s+m]
5) printf Pattern occurs with
shift  s
EXAMPLE
 SUPPOSE,
T=1011101110
P=111
FIND ALL VALID SHIFT
1 0 1 1 1 0 1 1 1 0T=Tex
t
1 1 1P=Patter
n
S=
0
1 0 1 1 1 0 1 1 1 0
1 1 1
S=
1
1 0 1 1 1 0 1 1 1 0
1 1 1
S=2
So, S=2 is a valid shift
1 0 1 1 1 0 1 1 1 0
1 1 1
S=3
1 0 1 1 1 0 1 1 1 0
1 1 1
S=4
1 0 1 1 1 0 1 1 1 0
1 1 1
S=5
1 0 1 1 1 0 1 1 1 0
1 1 1
S=6
So, S=6 is a valid shift
1 0 1 1 1 0 1 1 1 0
1 1 1
S=7
THE RABIN-KARP ALGORITHM
 Rabin and Karp proposed a string
matching algorithm that performs well in
practice and that also generalizes to
other algorithms for related problems,
such as two-dimentional pattern
matching.
ALGORITHM
RABIN-KARP-MATCHER(T,P,d,q)
1) n = T.length
2) m = P.length
3) h = d^(m-1) mod q
4) p = 0
5) t = 0
6) for i =1 to m //pre-processing
7) p = (dp + P[i]) mod q
8) t = (d t + T[i]) mod q
9) for s = 0 to n  m //matching
10) if p == t
11) if P[1m] == T[s+1. s+m]
12) printf  Pattern occurs with shift  s
13) if s< n-m
14) t+1 = (d(t- T[s+1]h)+ T[s+m+1]) mod q
EXAMPLE
Pattern P=26, how many spurious hits does the
Rabin
Karp matcher in the text T=3 1 4 1 5 9 2 6 5 3
5
 T = 3 1 4 1 5 9 2 6 5 3 5
P = 2 6
Here T.length=11 so Q=11
and P mod Q = 26 mod 11
= 4
Now find the exact match of P mod Q
3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
3 1 mod 1 1 = 9 not equal to 4
1 4 mod 1 1 = 3 not equal to 4
4 1 mod 1 1 = 8 not equal to 4
3 1 4 1 5 9 2 6 5 3 5
S=1
S=0
S=2
3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
1 5 mod 1 1 = 4 equal to 4 SPURIOUS HIT
5 9 mod 1 1 = 4 equal to 4 SPURIOUS HIT
9 2 mod 1 1 = 4 equal to 4 SPURIOUS HIT
S=3
S=4
S=5
3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
2 6 mod 1 1 = 4 EXACT MATCH
6 5 mod 1 1 = 10 not equal to 4
5 3 mod 1 1 = 9 not equal to 4
S=7
S=6
S=8
3 1 4 1 5 9 2 6 5 3 5
3 5 mod 1 1 = 2 not equal to 4
S=9
Pattern occurs with shift 6
COMPARISSION
 The Naive String Matching algorithm slides
the pattern one by one. After each slide, it one
by one checks characters at the current shift
and if all characters match then prints the
match.
 Like the Naive Algorithm, Rabin-Karp algorithm
also slides the pattern one by one. But unlike the
Naive algorithm, Rabin Karp algorithm matches
the hash value of the pattern with the hash value
of current substring of text, and if the hash values
match then only it starts matching individual
characters.
THANK YOU

More Related Content

String matching algorithms

  • 2. WHAT IS STRING MATCHING In computer science, string searching algorithms, sometimes called string matching algorithms, that try to find a place where one or several string (also called pattern) are found within a larger string or text.
  • 3. EXAMPLE STRING MATCHING PROBLEM A B C A B A A C A B A B A A TEXT PATTER N SHIFT=3
  • 4. STRING MATCHING ALGORITHMS There are many types of String Matching Algorithms like:- 1) The Naive string-matching algorithm 2) The Rabin-Krap algorithm 3) String matching with finite automata 4) The Knuth-Morris-Pratt algorithm But we discuss about 2 types of string matching algorithms. 1) The Naive string-matching algorithm 2) The Rabin-Krap algorithm
  • 5. THE NAIVE ALGORITHM The naive algorithm finds all valid shifts using a loop that checks the condition P[1.m]=T[s+1. s+m] for each of the n- m+1 possible values of s.(P=pattern , T=text/string , s=shift) NAIVE-STRING-MATCHER(T,P) 1) n = T.length 2) m = P.length 3) for s=0 to n-m 4) if P[1m]==T[s+1.s+m] 5) printf Pattern occurs with shift s
  • 6. EXAMPLE SUPPOSE, T=1011101110 P=111 FIND ALL VALID SHIFT 1 0 1 1 1 0 1 1 1 0T=Tex t 1 1 1P=Patter n S= 0
  • 7. 1 0 1 1 1 0 1 1 1 0 1 1 1 S= 1
  • 8. 1 0 1 1 1 0 1 1 1 0 1 1 1 S=2 So, S=2 is a valid shift
  • 9. 1 0 1 1 1 0 1 1 1 0 1 1 1 S=3
  • 10. 1 0 1 1 1 0 1 1 1 0 1 1 1 S=4
  • 11. 1 0 1 1 1 0 1 1 1 0 1 1 1 S=5
  • 12. 1 0 1 1 1 0 1 1 1 0 1 1 1 S=6 So, S=6 is a valid shift
  • 13. 1 0 1 1 1 0 1 1 1 0 1 1 1 S=7
  • 14. THE RABIN-KARP ALGORITHM Rabin and Karp proposed a string matching algorithm that performs well in practice and that also generalizes to other algorithms for related problems, such as two-dimentional pattern matching.
  • 15. ALGORITHM RABIN-KARP-MATCHER(T,P,d,q) 1) n = T.length 2) m = P.length 3) h = d^(m-1) mod q 4) p = 0 5) t = 0 6) for i =1 to m //pre-processing 7) p = (dp + P[i]) mod q 8) t = (d t + T[i]) mod q 9) for s = 0 to n m //matching 10) if p == t 11) if P[1m] == T[s+1. s+m] 12) printf Pattern occurs with shift s 13) if s< n-m 14) t+1 = (d(t- T[s+1]h)+ T[s+m+1]) mod q
  • 16. EXAMPLE Pattern P=26, how many spurious hits does the Rabin Karp matcher in the text T=3 1 4 1 5 9 2 6 5 3 5 T = 3 1 4 1 5 9 2 6 5 3 5 P = 2 6 Here T.length=11 so Q=11 and P mod Q = 26 mod 11 = 4 Now find the exact match of P mod Q
  • 17. 3 1 4 1 5 9 2 6 5 3 5 3 1 4 1 5 9 2 6 5 3 5 3 1 mod 1 1 = 9 not equal to 4 1 4 mod 1 1 = 3 not equal to 4 4 1 mod 1 1 = 8 not equal to 4 3 1 4 1 5 9 2 6 5 3 5 S=1 S=0 S=2
  • 18. 3 1 4 1 5 9 2 6 5 3 5 3 1 4 1 5 9 2 6 5 3 5 3 1 4 1 5 9 2 6 5 3 5 1 5 mod 1 1 = 4 equal to 4 SPURIOUS HIT 5 9 mod 1 1 = 4 equal to 4 SPURIOUS HIT 9 2 mod 1 1 = 4 equal to 4 SPURIOUS HIT S=3 S=4 S=5
  • 19. 3 1 4 1 5 9 2 6 5 3 5 3 1 4 1 5 9 2 6 5 3 5 3 1 4 1 5 9 2 6 5 3 5 2 6 mod 1 1 = 4 EXACT MATCH 6 5 mod 1 1 = 10 not equal to 4 5 3 mod 1 1 = 9 not equal to 4 S=7 S=6 S=8
  • 20. 3 1 4 1 5 9 2 6 5 3 5 3 5 mod 1 1 = 2 not equal to 4 S=9 Pattern occurs with shift 6
  • 21. COMPARISSION The Naive String Matching algorithm slides the pattern one by one. After each slide, it one by one checks characters at the current shift and if all characters match then prints the match. Like the Naive Algorithm, Rabin-Karp algorithm also slides the pattern one by one. But unlike the Naive algorithm, Rabin Karp algorithm matches the hash value of the pattern with the hash value of current substring of text, and if the hash values match then only it starts matching individual characters.

Editor's Notes

  • #19: Spurious hit is when we have a match but it isnt an actual match to the pattern. When this happen, further testing is done.