-------------------------------------------------------------------------------
                Boyer-Moore Fast String Search Algorithm
                            Win32 C/C++ SDK
                              version 1.0

                         (code name is BOYER32)

                          by Patrick KO Shu Pui

                Copyright (c) 1991-1996 All Rights Reserved.
-------------------------------------------------------------------------------

1. INTRODUCTION
===============================================================================

BOYER32 is an implementation of the Boyer-Moore fast string search
algorithm.  BOYER32 provides programmers with a set of functions for
searching text efficiently.

BOYER32 provides two interfaces for different development communities.
The first interface is a C++ class (class API).  The second interface is
a set of DLL function calls (C API), which are callable not only from C
but also from Visual Basic, Centura, etc.

This set of functions supports text search in the forward and backward
directions, either case-sensitive or case-insensitive.  Since it is
implemented for the Win32 environment, the pattern and the search space
can each be as large as 4 GB.

This documentation introduces the API available to you for both
interfaces and addresses some technical issues in using this SDK.
The rest of this documentation uses the following reference model for
the text search problem:

Problem solved by Boyer-Moore:

        Let p, s be strings, and length of p = m, length of s = n.

        For a pattern p, find the position i in search space s where
        p[0] = s[i], p[1] = s[i+1], ... p[m-1] = s[i+m-1].

Time Complexity of Boyer-Moore Algorithm:

        In the "average" case, it takes about n/m steps to find p in s,
        because a mismatch often lets the search skip ahead by up to m
        characters at once.
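
The skipping idea behind the n/m figure can be sketched with a Horspool-style
search, a simplified relative of Boyer-Moore that keeps only a per-byte skip
table.  This is an illustration under stated assumptions, not the BOYER32
source: the name horspool_find and its parameter order are hypothetical, and
the window is compared with memcmp rather than right to left.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Horspool-style search (illustrative only, not the BOYER32 implementation).
// Returns a pointer to the first match of p[0..m-1] in s[0..n-1], or nullptr.
const unsigned char* horspool_find(const unsigned char* p, std::size_t m,
                                   const unsigned char* s, std::size_t n)
{
    assert(p != nullptr && s != nullptr);
    if (m == 0 || m > n) return nullptr;

    // skip[c] = how far the window may shift when byte c is aligned with the
    // last pattern position.  Bytes absent from p[0..m-2] allow a shift of m.
    std::size_t skip[256];
    for (std::size_t c = 0; c < 256; ++c) skip[c] = m;
    for (std::size_t k = 0; k + 1 < m; ++k) skip[p[k]] = m - 1 - k;

    std::size_t i = 0;                       // start of the current window
    while (i + m <= n) {
        if (std::memcmp(s + i, p, m) == 0) return s + i;
        i += skip[s[i + m - 1]];             // shift by the last window byte
    }
    return nullptr;
}
```

When the last byte of the window does not occur in the pattern, the window
moves forward by the full pattern length m, which is where the roughly n/m
behaviour on typical text comes from.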

2. BOYER32 Class API
===============================================================================

Class:          BOYER

Constructors:   BOYER(BYTE * p, UINT plen)
                BOYER(BYTE * p)

Destructor:     ~BOYER()

Public Methods: Find()
                FindIC()
                FindBackward()
                FindBackwardIC()

2.1 BOYER32 Class API - BOYER() constructors
-------------------------------------------------------------------------------
Constructors :  BOYER(BYTE * p, UINT plen)
                BOYER(BYTE * p)

Synopsis :      Create a BOYER object

Parameter:      p = a pointer to the pattern string to search for
                plen = the length of the pattern string; if plen is not
                specified, p is assumed to be an ASCIIZ (null-terminated)
                string

Example:
                BOYER * boyer = new BOYER((BYTE *)"find me");
                ...
                delete boyer;
-------------------------------------------------------------------------------

2.2 BOYER32 Class API - Find(), FindIC()
-------------------------------------------------------------------------------
Function :      BYTE * Find(BYTE * s, UINT slen)
                BYTE * FindIC(BYTE * s, UINT slen)

Synopsis :      Find a pattern inside a string from s[0] to s[n-1]
                Find() is case-sensitive, FindIC() is case-insensitive

Parameter:      s = pointer to the start of search space
                slen = length of the search space s (i.e. n)

Return Value:   NULL = pattern not found in string s
                non NULL = pointer within s where pattern is matched
                           at first position (i.e. p[0])

Example:        To find the pattern "tiger" in a text in RAM starting at
                pointer txtp with a length of 1,000,000 characters,
                program like this:

                BYTE * txtp = /* search space */;
                BYTE * matchp;
                BOYER * boyer = new BOYER((BYTE *)"tiger");

                if ((matchp = boyer->Find(txtp, 1000000)) != NULL)
                {
                    // found
                }
                else
                {
                    // not found
                }
                ...
                delete boyer;

See Also:       FindBackward(), FindBackwardIC()
-------------------------------------------------------------------------------
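
The difference between Find() and FindIC() comes down to how two bytes are
compared.  The sketch below shows one way a case-insensitive comparison can
be written; it is a stand-in built on std::search, and the names
equal_ignore_case and find_ignore_case are hypothetical.  How BOYER32 itself
folds case (for example, which locale it follows) is not stated in this
documentation.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>

// True when two bytes are equal after upper-casing.  std::toupper follows the
// current C locale; the folding rule BOYER32 uses is an open assumption here.
static bool equal_ignore_case(unsigned char a, unsigned char b)
{
    return std::toupper(a) == std::toupper(b);
}

// Illustrative stand-in, not the SDK function: returns a pointer to the first
// case-insensitive match of p[0..m-1] in s[0..n-1], or nullptr.
const unsigned char* find_ignore_case(const unsigned char* p, std::size_t m,
                                      const unsigned char* s, std::size_t n)
{
    assert(p != nullptr && s != nullptr);
    if (m == 0 || m > n) return nullptr;
    const unsigned char* hit =
        std::search(s, s + n, p, p + m, equal_ignore_case);
    return hit == s + n ? nullptr : hit;
}
```

With this comparison, the pattern "tiger" matches "TIGER", "Tiger" and
"tiger" alike, and the returned pointer still refers to the original,
unmodified text.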

2.3 BOYER32 Class API - FindBackward(), FindBackwardIC()
-------------------------------------------------------------------------------
Function :      BYTE * FindBackward(BYTE * s, UINT slen)
                BYTE * FindBackwardIC(BYTE * s, UINT slen)

Synopsis :      Find a pattern inside a string from s[n-1] to s[0]
                FindBackward() is case-sensitive, FindBackwardIC() is
                case-insensitive

Parameter:      s = pointer to the end of the search space, i.e. its last
                    byte (start of the search space + slen - 1)
                slen = length of the search space s (i.e. n)

Return Value:   NULL = pattern not found in string s
                non NULL = pointer within s where pattern is matched
                           at first position (i.e. p[0])

Example:        To find the pattern "tiger" in a text in RAM, searching
                backward from the end of a search space of 1,000,000
                characters, program like this:

                BYTE * txtp = /* search space */;
                BYTE * matchp;
                BOYER * boyer = new BOYER((BYTE *)"tiger");

                if ((matchp =
                    boyer->FindBackward(txtp + 1000000 - 1, 1000000)) != NULL)
                {
                    // found
                }
                else
                {
                    // not found
                }
                ...
                delete boyer;

See Also:       Find(), FindIC()
-------------------------------------------------------------------------------


3. BOYER32 C API
===============================================================================

3.1 BOYER32 C API - CreateFindHandle()
-------------------------------------------------------------------------------
Function :      HFIND CreateFindHandle(BYTE * p, UINT plen)
Synopsis :      Create a find handle and specify the pattern to be matched.
                This function must be called prior to any find function
                that takes a find handle.

Parameter:      p = pattern string to be matched
                plen = length of p

Return Value:   the find handle
Example:
                HFIND hfind;
                char * pattern = "find me";

                /* pass the pattern length explicitly, as in the prototype */
                hfind = CreateFindHandle( (BYTE *)pattern, 7 );
                ...
                FreeFindHandle(hfind);

See Also:       FreeFindHandle()
-------------------------------------------------------------------------------

3.2 BOYER32 C API - FreeFindHandle()
-------------------------------------------------------------------------------
Function :      void FreeFindHandle(HFIND hFind)
Synopsis :      Free a find handle and any memory associated with it.
                FreeFindHandle() must be called when a find handle is no
                longer needed.

Parameter:      hFind = handle to be freed
Return Value:   nothing
Example:        see CreateFindHandle()
See Also:       CreateFindHandle()
-------------------------------------------------------------------------------

3.3 BOYER32 C API - Find(), Find2(), Find3(), FindIC(), FindIC2(), FindIC3()
-------------------------------------------------------------------------------
Function :      BYTE * Find(HFIND hFind, BYTE * s, UINT slen)
                BYTE * Find2(BYTE * p, UINT plen, BYTE * s, UINT slen)
                BYTE * Find3(BYTE * p, BYTE * s, UINT slen)
                BYTE * FindIC(HFIND hFind, BYTE * s, UINT slen)
                BYTE * FindIC2(BYTE * p, UINT plen, BYTE * s, UINT slen)
                BYTE * FindIC3(BYTE * p, BYTE * s, UINT slen)

Synopsis :      Find a pattern inside a string from s[0] to s[n-1].
                Find(), Find2() and Find3() perform the same function
                but take their parameters in different forms.

                Find() expects a find handle, and the same find handle
                can be used to call Find() repeatedly in order to
                search for multiple occurrences of the same pattern.

                Find2() and Find3() are best used to search for the first
                occurrence of the pattern; for Find3(), the p parameter
                must refer to an ASCIIZ (null-terminated) string.

                FindIC(), FindIC2() and FindIC3() are similar to Find(),
                Find2() and Find3() respectively except they are
                case-insensitive.

Parameter:      hFind = the find handle obtained from CreateFindHandle()
                s = pointer to the start of search space
                slen = length of the search space s (i.e. n)
                p = pointer to the pattern string
                plen = length of p

Return Value:   NULL = pattern not found in string s
                non NULL = pointer within s where pattern is matched
                           at first position (i.e. p[0])

Example:        To find a pattern "tiger" in a text in RAM starting at
                pointer s with a length of 1,000,000 characters,
                program like this:

                Example 1:
                HFIND hfind;
                BYTE * s = /* search space */;
                BYTE * matchp;

                if ((hfind = CreateFindHandle((BYTE *)"tiger", 5)) != NULL)
                {
                        matchp = Find( hfind, s, 1000000 );
                        if (matchp != NULL)
                        {
                                /* found */
                        }
                        else
                        {
                                /* not found */
                        }
                        ...
                        /* free the handle only if it was created */
                        FreeFindHandle(hfind);
                }

                Example 2:
                matchp = Find2( (BYTE *)"tiger", 5, s, 1000000 );
                if (matchp != NULL)
                {
                        /* found */
                }
                else
                {
                        /* not found */
                }

                Example 3:
                matchp = Find3( (BYTE *)"tiger", s, 1000000 );
                if (matchp != NULL)
                {
                        /* found */
                }
                else
                {
                        /* not found */
                }

See Also:       FindBackward()
-------------------------------------------------------------------------------
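
Searching for multiple occurrences means resuming the search just past each
match with a correspondingly shorter search space.  The sketch below shows
that pointer arithmetic.  It is self-contained, so it uses a hypothetical
stand-in, find_first, with the same parameter shape as Find2(); Byte and
count_matches are likewise names invented for this illustration, and nothing
here is taken from the BOYER32 source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

typedef unsigned char Byte;   // stand-in for the SDK's BYTE type

// Hypothetical stand-in shaped like Find2(p, plen, s, slen): returns a
// pointer to the first match in s[0..slen-1], or nullptr if there is none.
Byte* find_first(Byte* p, std::size_t plen, Byte* s, std::size_t slen)
{
    assert(p != nullptr && s != nullptr);
    if (plen == 0 || plen > slen) return nullptr;
    Byte* hit = std::search(s, s + slen, p, p + plen);
    return hit == s + slen ? nullptr : hit;
}

// Counts every occurrence (overlapping ones included) by restarting the
// search one byte past the start of each match.
std::size_t count_matches(Byte* p, std::size_t plen, Byte* s, std::size_t slen)
{
    std::size_t count = 0;
    Byte* end = s + slen;
    Byte* cur = s;
    while (cur < end) {
        Byte* hit = find_first(p, plen, cur,
                               static_cast<std::size_t>(end - cur));
        if (hit == nullptr) break;
        ++count;
        cur = hit + 1;        // use hit + plen to skip overlapping matches
    }
    return count;
}
```

With the real API, the same loop would presumably call Find() with one find
handle on each pass, so that the pattern is prepared only once.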

3.4 BOYER32 C API - FindBackward(), FindBackward2(), FindBackward3(),
    FindBackwardIC(), FindBackwardIC2(), FindBackwardIC3()
-------------------------------------------------------------------------------
Function :      BYTE * FindBackward(HFIND hFind, BYTE * s, UINT slen)
                BYTE * FindBackward2(BYTE * p, UINT plen, BYTE * s, UINT slen)
                BYTE * FindBackward3(BYTE * p, BYTE * s, UINT slen)
                BYTE * FindBackwardIC(HFIND hFind, BYTE * s, UINT slen)
                BYTE * FindBackwardIC2(BYTE * p, UINT plen, BYTE * s, UINT slen)
                BYTE * FindBackwardIC3(BYTE * p, BYTE * s, UINT slen)

Synopsis :      Find a pattern inside a string from s[n-1] to s[0].
                FindBackward(), FindBackward2() and FindBackward3() perform
                the same function but take their parameters in different
                forms.

                FindBackward() expects a find handle, and the same find
                handle can be used to call FindBackward() repeatedly in
                order to search for multiple occurrences of the same pattern.

                FindBackward2() and FindBackward3() are best used to search
                for the first occurrence of the pattern; for
                FindBackward3(), the p parameter must refer to an ASCIIZ
                (null-terminated) string.

                FindBackwardIC(), FindBackwardIC2() and FindBackwardIC3() are
                similar to FindBackward(), FindBackward2() and FindBackward3()
                respectively except they are case-insensitive.

Parameter:      hFind = the find handle obtained from CreateFindHandle()
                s = pointer to the end of the search space, i.e. its last
                    byte (start of the search space + slen - 1)
                slen = length of the search space s (i.e. n)
                p = pointer to the pattern string
                plen = length of p

Return Value:   NULL = pattern not found in string s
                non NULL = pointer within s where pattern is matched
                           at first position (i.e. p[0])

Example:        To find a pattern "tiger" in a text in RAM starting at
                pointer s + 1000000 - 1 backward 1,000,000 characters,
                program like this:

                Example 1:
                HFIND hfind;
                BYTE * s = /* search space */;
                BYTE * matchp;

                if ((hfind = CreateFindHandle((BYTE *)"tiger", 5)) != NULL)
                {
                        matchp = FindBackward( hfind, s + 1000000 - 1, 1000000 );
                        if (matchp != NULL)
                        {
                                /* found */
                        }
                        else
                        {
                                /* not found */
                        }
                        ...
                        /* free the handle only if it was created */
                        FreeFindHandle(hfind);
                }

                Example 2:
                matchp =
                FindBackward2( (BYTE *)"tiger", 5, s + 1000000 - 1, 1000000 );

                if (matchp != NULL)
                {
                        /* found */
                }
                else
                {
                        /* not found */
                }

                Example 3:
                /* s must point to the end of the search space */
                matchp = FindBackward3( (BYTE *)"tiger", s + 1000000 - 1, 1000000 );
                if (matchp != NULL)
                {
                        /* found */
                }
                else
                {
                        /* not found */
                }

See Also:       Find()
-------------------------------------------------------------------------------
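
The backward functions take a pointer to the last byte of the search space
rather than the first, which is easy to get wrong by one.  The sketch below
illustrates that convention with a hypothetical stand-in, find_backward,
built on std::find_end; Byte is also a name invented for this illustration.
It assumes that the match nearest the end is the one reported and that the
result points at p[0] of that match, as the Return Value entry above
describes; this is not the BOYER32 implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

typedef unsigned char Byte;   // stand-in for the SDK's BYTE type

// Hypothetical stand-in following the documented convention: `last` points
// at the final byte of the search space and slen is the length of that space.
// Assumption: the match nearest the end is returned, as a pointer to p[0].
Byte* find_backward(Byte* p, std::size_t plen, Byte* last, std::size_t slen)
{
    assert(p != nullptr && last != nullptr);
    if (plen == 0 || plen > slen) return nullptr;
    Byte* first = last - (slen - 1);   // recover s[0] from s[n-1]
    Byte* end = last + 1;              // one past the final byte
    Byte* hit = std::find_end(first, end, p, p + plen);
    return hit == end ? nullptr : hit;
}
```

For a buffer text of n bytes the call is find_backward(p, plen, text + n - 1,
n), mirroring the s + 1000000 - 1 expression used in the examples above.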


4. FAQ
===============================================================================

Q:  How can I use Boyer-Moore functions in my Windows 95 and Windows NT
    program?
A:  Use BOYER32 API and link your program with boyer32.lib.

Q:  Can I use BOYER32 in other visual tools such as Visual Basic, Centura, etc.?
A:  Yes.  BOYER32 is also available in DLL form, callable from all of the above.

Q:  What is the limit of the memory space I can search?
A:  4GB.

Q:  Is BOYER32 thread-safe?
A:  Yes.


5. REFERENCE
===============================================================================

"Algorithms". Robert Sedgewick. Addison Wesley Publishing Company.
1988. 2nd edition. p. 286. QA76.6.S435 1983

-------------------------------------------------------------------------------
