Boyer Moore String Search Algorithm

Boyer Moore String Search
Algorithm
  • Boyer-Moore string search algorithm was developed by Robert S. Boyer and J Strother Moore in 1977.
  • Concept of Boyer-Moore Algorithm
    1. Scan From right to left
    2. Bad character rule
    3. Good Suffix shift rule
  • There is a preprocess in this algorithm.
    • It's create a table with last occurrence of  pattern characters
    • Example : 
                Pattern : a b a c a b
                              0 1 2 3 4 5
                                    Table
  • Note: If character not in pattern,
                                               T[i] = -1
  • Look at the table and find the last occurrence "a" (T[i]) 
  • "j" is index of character in pattern compare with text.
  •  if T[i] < j  is true
    shift by  j - T[i]
  • else if T[i] < j is false
    Then shift by one.
  • else if T[i] = -1 
    Then shift by while whole pattern pass that character.
  Pattern : a b a c a b
  Text : a b a c a a b a d a c a b a c a b a a b b

                there is miss match 
                ↙ 
a b a c a a b a d a c a b a c a b a a b b
a b a c a
               ↖ j
 if T[i] < j  is true
shift by  j - T[i]

T[a]= 4
j = 5
T[i] < j is true 
shift by 5-4 = 1
               i
            ↙ 
a b a c a a b a d a c a b a c a b a a b b
   a b a c a b
             ↖
                j
T[a] = 4
j = 3
T[i] < j is false
Then shift by one.
                           i
                        ↙ 
 a b a c a a b a d a c a b a c a b a a b b
       a b a c a b
                      
                         j
T[a] = 4
j = 5
T[i] < j is Ture
Then shift by 5-4 = 1
                             i
                          ↙ 
a b a c a a b a d a c a b a c a b a a b b
         a b a c a b
                          
                             j
T[d] = -1
j = 5
if T[i] = -1 
Then shift by while whole pattern pass "d" .

  • Like this shift pattern through out the text. If find string match then return position of string matched.
a b a c a a b a d a c a b a c a b a a b b
                           a b a c a b
                                 a b a c a b
                          
  • Boyer-Moore Algorithm
 k = n;
While k <= m{
   i=n; h =k;
     While i>0 and P(i) = T(j){
          i=i-1; 
          h= h-1;}
     if i = 0 {
             report occurrence of P in T at position k.
             k = k + n -1(2); }
     else {
            Shift P(increase k) by the max amount indicated by the extended bad character rule and the                   good suffix rule.
    }
}

  • Boyer-Moore Algorithm code in c
#include<stdio.h>
#include<string.h>
#define MAX 256
int max(int a,int b ){return a>b?a:b;}
void tablebad(char *str, int m, int table[MAX]){
    int i;
    for(i=0;i<MAX;i++){
        table[i]=-1;
    }
    for(i=0;i<m;i++){
        table[(int)str[i]]=i;
    }
}
void match(char *T,char *P){
    int m=strlen(P);
    int n=strlen(T);

    int table[MAX];
    tablebad(P,m,table);

    int s=0;
    while(s<=(n-m)){
        int j=m-1;
        while(j>-1&&P[j]==T[s+j]){
                j--;
                }
        if(j==-1){
            printf("\nPattern match =%d\n",s+1);
            s+=1;
        }
        else{
            s+=max(1,j-table[T[s+j]]);
        }
    }
}
void main(){
    char T[]="AAabCDabAB";
    char P[]="ab";
    match(T,P);
}

result :   Pattern match =  3
              Pattern match = 7


Comments

Post a Comment

visitors

Popular posts from this blog

String Matching with Finite Automata

Types Of Computer Networking