This document discusses string matching algorithms. It defines string matching as finding a pattern within a larger text or string. It then summarizes two common string matching algorithms: the naive algorithm and Rabin-Karp algorithm. The naive algorithm loops through all possible shifts of the pattern and directly compares characters. Rabin-Karp also shifts the pattern but compares hash values of substrings first before checking individual characters to reduce comparisons. The document provides examples of how each algorithm works on sample strings.
2. WHAT IS STRING MATCHING
In computer science, string searching
algorithms, sometimes called string
matching algorithms, that try to find a
place where one or several string (also
called pattern) are found within a larger
string or text.
4. STRING MATCHING
ALGORITHMS
There are many types of String Matching
Algorithms like:-
1) The Naive string-matching algorithm
2) The Rabin-Krap algorithm
3) String matching with finite automata
4) The Knuth-Morris-Pratt algorithm
But we discuss about 2 types of string matching
algorithms.
1) The Naive string-matching algorithm
2) The Rabin-Krap algorithm
5. THE NAIVE ALGORITHM
The naive algorithm finds all valid shifts using a loop
that checks
the condition P[1.m]=T[s+1. s+m] for each of the n-
m+1
possible values of s.(P=pattern , T=text/string , s=shift)
NAIVE-STRING-MATCHER(T,P)
1) n = T.length
2) m = P.length
3) for s=0 to n-m
4) if P[1m]==T[s+1.s+m]
5) printf Pattern occurs with
shift s
14. THE RABIN-KARP ALGORITHM
Rabin and Karp proposed a string
matching algorithm that performs well in
practice and that also generalizes to
other algorithms for related problems,
such as two-dimentional pattern
matching.
15. ALGORITHM
RABIN-KARP-MATCHER(T,P,d,q)
1) n = T.length
2) m = P.length
3) h = d^(m-1) mod q
4) p = 0
5) t = 0
6) for i =1 to m //pre-processing
7) p = (dp + P[i]) mod q
8) t = (d t + T[i]) mod q
9) for s = 0 to n m //matching
10) if p == t
11) if P[1m] == T[s+1. s+m]
12) printf Pattern occurs with shift s
13) if s< n-m
14) t+1 = (d(t- T[s+1]h)+ T[s+m+1]) mod q
16. EXAMPLE
Pattern P=26, how many spurious hits does the
Rabin
Karp matcher in the text T=3 1 4 1 5 9 2 6 5 3
5
T = 3 1 4 1 5 9 2 6 5 3 5
P = 2 6
Here T.length=11 so Q=11
and P mod Q = 26 mod 11
= 4
Now find the exact match of P mod Q
17. 3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
3 1 mod 1 1 = 9 not equal to 4
1 4 mod 1 1 = 3 not equal to 4
4 1 mod 1 1 = 8 not equal to 4
3 1 4 1 5 9 2 6 5 3 5
S=1
S=0
S=2
18. 3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
1 5 mod 1 1 = 4 equal to 4 SPURIOUS HIT
5 9 mod 1 1 = 4 equal to 4 SPURIOUS HIT
9 2 mod 1 1 = 4 equal to 4 SPURIOUS HIT
S=3
S=4
S=5
19. 3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
2 6 mod 1 1 = 4 EXACT MATCH
6 5 mod 1 1 = 10 not equal to 4
5 3 mod 1 1 = 9 not equal to 4
S=7
S=6
S=8
20. 3 1 4 1 5 9 2 6 5 3 5
3 5 mod 1 1 = 2 not equal to 4
S=9
Pattern occurs with shift 6
21. COMPARISSION
The Naive String Matching algorithm slides
the pattern one by one. After each slide, it one
by one checks characters at the current shift
and if all characters match then prints the
match.
Like the Naive Algorithm, Rabin-Karp algorithm
also slides the pattern one by one. But unlike the
Naive algorithm, Rabin Karp algorithm matches
the hash value of the pattern with the hash value
of current substring of text, and if the hash values
match then only it starts matching individual
characters.