(This is the program documentation in plain text. If you have any Web
browser, you may prefer to use it to read GREP.HTM, which has all the
same text but is much better formatted.)



                 GREP -- Find Regular Expressions in Files

         program and documentation by Stan Brown, Oak Road Systems
                         revised February 20, 1999
        Copyright © 1986-1999 by Oak Road Systems, +1 216 371-0043
  ------------------------------------------------------------------------

GREP is a filter that searches input files, or the standard input, for
lines that contain matches for one or more patterns called regular
expressions and displays those matching lines.

         Why GREP?
         License and warranty
         System requirements
         Installation
         User instructions
         Options
               General options
               Output options
         Regular expressions
               Normal and special characters
               How to construct a regular expression
               Special rules for the command line
         Environment variable
         Return values
         Bugs
         What's new?


  ------------------------------------------------------------------------

                                 Why GREP?

  ------------------------------------------------------------------------
The DOS filter FIND is useful for finding a given string in one or more
files. But what if you want to find the word "the" in caps or lower case,
without also finding "other", "then", and so on? You don't really want to
search for a specific string. Rather, what you're looking for is a regular
expression, namely "the" preceded and followed by something other than a
letter. GREP to the rescue!

GREP combines most features of UNIX grep and fgrep. GREP has many other
advantages over FIND besides using regular expressions:

   * You can search for multiple strings or regular expressions in a single
     pass through the input file(s).
   * You can show any number of context lines before and after the matching
     lines.
   * You have a wide range of choices for output format: whether to show
     the file names or not; whether to show the matching lines, count them,
     or just show the names of files that contain matches; and whether to
     use DOS FIND-style or UNIX grep-style output format.
   * You can store often-used options in an environment variable instead of
     typing them on the command line every time.
   * GREP returns status values that can be useful in batch files, and you
     can control which condition returns which value.

  ------------------------------------------------------------------------

                            License and warranty

  ------------------------------------------------------------------------
GREP is shareware. If you use it past a 30-day evaluation period, you are
morally and legally bound to register and pay for it. Please see the file
LICENSE.TXT for full details, including support and warranty information.

  ------------------------------------------------------------------------

                            System requirements

  ------------------------------------------------------------------------
The 16-bit version runs under DOS 2.0 or higher, including a DOS box under
Windows. The 32-bit version requires a DOS box under Windows 98, Win95, or
Win NT 4.0.

The two versions operate the same and have the same features, except that
the 32-bit version supports long filenames.

  ------------------------------------------------------------------------

                                Installation

  ------------------------------------------------------------------------
There is no special installation procedure. Simply move GREP16.EXE,
GREP32.EXE, or both to any convenient directory in your path.

You may wish to rename the version you use more often to the simpler
GREP.EXE. All the following user instructions will assume you've done that.
Otherwise, just substitute GREP16 or GREP32 wherever you see GREP in the
examples.

  ------------------------------------------------------------------------

                             User instructions

  ------------------------------------------------------------------------
For a quick summary of operating instructions, type

        grep

The full command form is one of

        grep [options] ["regexp"] [<inputfile] [>outputfile]
        grep [options] ["regexp"] inputfiles [>outputfile]

In the first form, GREP is a filter, taking its input from the standard
input (most likely piped from some other command). In the second form, GREP
takes its input from any number of files, possibly specified with paths and
wildcards. Please be aware that the 16-bit and 32-bit GREP programs expand
wildcards slightly differently because the 32-bit version supports long
filenames. Thus the 32-bit version would expand abc* to include all files,
with any extension or none, whose names start with abc; with the 16-bit
version you need abc*.* to get the same result.

In both forms, the outputfile will receive the matching lines (or other
output, depending on the output options). For output to the screen, omit >
and outputfile.

"regexp" is a regular expression; see below for how to construct one. A
regular expression is normally required on the command line; however, if
you use the /F option, regular expressions will be taken from a file
instead of the command line.

Example:

        grep -I "pic[t\s]" \proj\*.cob >prn

will examine every COBOL source file in the PROJ directory and print every
line that contains a picture clause ("pic" followed by either "t" or a
space) in caps or lower case (the /I option).

  ------------------------------------------------------------------------

                                  Options

  ------------------------------------------------------------------------
GREP's operation can be modified by several options, either on the command
line (before or after the regular expression) or in an environment variable
(see below).

You have a lot of freedom about how you enter options. You can use a
leading hyphen or slash mark; you can use upper- or lower-case letters; you
can leave spaces between options or combine them. For instance, the
following are just some of the different ways of turning on the P3 and B
options:

        /p3 -b    /b/P3    /p3B    -B/P3    -P3 -b

This document will always use capital letters for the options, to make it
easier to distinguish letter l and figure 1.
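
The equivalence of those spellings can be pictured with a short Python
sketch. It is only an illustration, not GREP's actual parser: the helper
name parse_options is hypothetical, and the sketch assumes that a number
attaches to the option letter just before it. It handles only letter
options such as P3 and B, not /0, /1, /Dfile, or /Ffile.

```python
import re

def parse_options(text):
    """Return the set of letter options in a string such as '/p3 -b'.

    Hypothetical helper for illustration; not GREP's real parser.
    """
    options = set()
    # Drop the separators (slash, hyphen, space). What remains is a run of
    # option letters, each optionally followed by a numeric argument.
    squeezed = re.sub(r"[/\- ]+", "", text)
    for letter, number in re.findall(r"([A-Za-z?])([0-9,]*)", squeezed):
        options.add(letter.upper() + number)
    return options

for form in ["/p3 -b", "/b/P3", "/p3B", "-B/P3", "-P3 -b"]:
    print(form, "->", sorted(parse_options(form)))   # each prints ['B', 'P3']
```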

General options

Quite a number of GREP's options control the appearance of the output, and
those options are collected in a separate section below. This section
explains the other options.

/? Display a help message and exit with no further processing.

/0 or /1
     These options control the values that GREP returns in the DOS error
     level. /0 returns 0 if there are matches or 1 if there are no
     matches; /1 returns 1 for matches or 0 for no matches. For more
     details, see Return values below.

/Dfile
     Display debugging information. Debugging information includes whether
     you're running the 16-bit or 32-bit version, the value of the
     environment variable, the values of all options specified or implied,
     the raw and interpreted values of the regular expression(s), and
     details of every file scanned. This information is normally
     suppressed, but you may find it helpful if GREP seems to behave in a
     way you don't expect.

     file is optional. If you specify /D by itself (followed by a space),
     debugging information will be sent to the standard error stream, which
     is normally the screen.

     Since the debugging information can be voluminous, you may want to
     specify an output file: file must follow the D with no intervening
     space, and the filename ends at the next space. GREP will append to
     the file if it already exists.

/Ffile
     Read one or more regular expressions from file instead of taking a
     single regular expression from the command line. You must enter the
     regular expressions one per line in the file; don't surround them with
     quotes. (This is similar to the F option in UNIX grep, but unlike UNIX
     grep, you can have multiple regular expressions in the file.)

     file must follow the /F with no intervening space, and the filename
     ends at the next space.

     If you use a minus sign as the filename (/F- option), GREP will accept
     regular expressions from standard input. Don't do this if you are
     redirecting input from a file!

/I
     Ignore case, treating caps and lower case as matching each other.
     (This is the same as the I option in UNIX grep and DOS FIND.)

     Caution: the /I option does not apply to 8-bit characters (characters
     128-255). Because there are many different encoding schemes, GREP
     doesn't know which characters above 127 correspond to each other as
     upper and lower case on your computer. Therefore, if you want
     case-blind comparisons, you must explicitly code any 8-bit upper and
     lower case in your regular expression. For instance, to search for the
     French word "thé" in upper or lower case, code it as th[éÉ]. The "th",
     being 7-bit ASCII characters, will be handled correctly by the /I
     option.
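
The effect of ASCII-only case folding can be tried out in Python, whose
re.ASCII flag restricts case-insensitive matching to 7-bit letters in a
roughly similar way. This is an analogy only: Python's re module is not
GREP's engine, and the accented letter is just an example of an 8-bit
character.

```python
import re

# ASCII-only case folding: an analogy for the /I behavior, not GREP itself.
flags = re.IGNORECASE | re.ASCII

# The accented letter is not folded, so the plain pattern misses "THÉ".
print(bool(re.search("thé", "THÉ", flags)))      # False

# Listing both cases of the 8-bit letter explicitly makes the search work.
print(bool(re.search("th[éÉ]", "THÉ", flags)))   # True
print(bool(re.search("th[éÉ]", "Thé", flags)))   # True
```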

/Q
     Suppress the program logo and all warning messages. Error messages
     will still be displayed (as will debug output, if you set the /D
     option).

/V
     Show or count the lines that don't match instead of those that do.
     (This is the same as the V option in UNIX grep and DOS FIND.)

Output options

This section lists GREP's options that control the appearance of the
output. The other options are listed in the preceding section.

Before going through the options, let's take a moment to look at some of
the possible output formats. By default, GREP's output is similar to that
of DOS FIND:

        ---------- GREP.C
                op_showhead = ShowNoHeads;
                else if (op_showhead == ShowNoHeads)
                op_showhead = ShowNoHeads;

        ---------- GREP_MAT.C
                op_showhead == ShowNoHeads)

However, the /U option (see below) produces UNIX grep-style output like
this:

        GREP.C:        op_showhead = ShowNoHeads;
        GREP.C:        else if (op_showhead == ShowNoHeads)
        GREP.C:        op_showhead = ShowNoHeads;
        GREP_MAT.C:        op_showhead == ShowNoHeads)

As you can see, the main difference is that DOS-style output has the
filename as a header above the group of matching lines from that file, and
UNIX-style output has the name of the file on every matching line.
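
For readers who post-process GREP output, the two layouts can be modeled
with a small Python sketch. The function name format_matches and its input
format are assumptions made for this illustration; the sketch mimics the
layouts shown above and is not GREP's own code.

```python
def format_matches(matches, unix_style=False):
    """Lay out (filename, line_text) pairs in DOS or UNIX style.

    matches must be grouped by file, in file order (hypothetical input).
    """
    out = []
    last_file = None
    for filename, text in matches:
        if unix_style:
            # UNIX style: the filename is repeated on every matching line.
            out.append(f"{filename}:{text}")
        else:
            # DOS style: one header per file, with a blank line between files.
            if filename != last_file:
                if last_file is not None:
                    out.append("")
                out.append(f"---------- {filename}")
                last_file = filename
            out.append(text)
    return out

sample = [("GREP.C", "        op_showhead = ShowNoHeads;"),
          ("GREP_MAT.C", "        op_showhead == ShowNoHeads)")]
print("\n".join(format_matches(sample)))
print("\n".join(format_matches(sample, unix_style=True)))
```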

Now, here are the options that control what GREP outputs and how it is
formatted:

/B
     Display a header for every file examined, even if the file contains no
     matches. (This option is meaningful only with DOS-style output.)

/C
     Display only a count of the matching lines in each file, instead of
     the matching lines themselves. (This is the same as the C option in
     UNIX grep and DOS FIND.)

/H
     Don't display any filenames as headers. This is useful when you're
     using GREP as a filter to extract lines from a file for processing by
     another program, like this:

         grep /H "Directory" <inputfile | otherprogram

/L
     Display only a bare list of the names of files that contain matches,
     not the actual lines that match. With the /V option, display the names
     of files that contain no matches. (This is the same as the L option in
     UNIX grep.)

/N
     Show the line number before each matching line. (This is the same as
     the N option in UNIX grep and DOS FIND.) DOS-style output looks like
     this:

         ---------- GREP.C
         [ 144]        op_showhead = ShowNoHeads;
         [ 178]        else if (op_showhead == ShowNoHeads)
         [ 366]        op_showhead = ShowNoHeads;

         ---------- GREP_MAT.C
         [  98]        op_showhead == ShowNoHeads)

     With both the /N and /U options, the UNIX-style output looks like
     this:

         GREP.C:144:        op_showhead = ShowNoHeads;
         GREP.C:178:        else if (op_showhead == ShowNoHeads)
         GREP.C:366:        op_showhead = ShowNoHeads;
         GREP_MAT.C:98:        op_showhead == ShowNoHeads)

     UNIX-style output is suitable for use with the excellent freeware
     editor Vim.

/Pbefore,after
     Show context lines before and after each match. If you omit after,
     GREP will show the same number of lines after each match as before. If
     you omit both numbers, GREP will show two lines before and two lines
     after.

     Either number can be 0. For instance, use /P0,4 if you want to show
     every match and the four lines that follow it.

     If you use the /P option, you probably want to use the /N option as
     well, to display line numbers. In that case, the punctuation of the
     line numbers will distinguish which lines are actual matches and which
     are displayed for context. Here is some DOS-style output from a run
     with the options /P1,1N set:

         ---------- GREP.C
           143     if (opcount >= argc)
         [ 144]        op_showhead = ShowNoHeads;
           145
           177             PRTDBG "with each matching line");
         [ 178]        else if (op_showhead == ShowNoHeads)
           179             PRTDBG "NO");
           365     if (myToggle('L') || myToggle('U') || myToggle('H'))
         [ 366]        op_showhead = ShowNoHeads;
           367     else if (myToggle('B'))

         ---------- GREP_MAT.C
            97         op_showwhat == ShowMatchCount ||
         [  98]        op_showhead == ShowNoHeads)
            99         headered = TRUE;

     As you can see, the actual matches have square brackets around the
     line numbers, and the context lines do not.
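
The selection of context lines and the bracket convention can be sketched
in Python as follows. The names with_context and render are hypothetical,
the exact column spacing is an approximation of the sample above, and the
merging of overlapping context ranges is an assumption rather than
documented GREP behavior.

```python
def with_context(lines, is_match, before, after):
    """Yield (line_number, text, matched) for matches plus context lines."""
    show = set()
    matched = set()
    for i, text in enumerate(lines):
        if is_match(text):
            matched.add(i)
            first = max(0, i - before)
            last = min(len(lines), i + after + 1)
            show.update(range(first, last))
    for i in sorted(show):
        yield i + 1, lines[i], i in matched

def render(lines, is_match, before, after):
    """Format numbered lines: [ nnn] for matches, plain nnn for context."""
    out = []
    for number, text, matched in with_context(lines, is_match, before, after):
        label = f"[{number:4d}]" if matched else f" {number:4d} "
        out.append(f"{label} {text}")
    return out

text = ["alpha", "beta", "needle here", "gamma", "delta"]
print("\n".join(render(text, lambda s: "needle" in s, 1, 1)))
```

Passing 0 for before, as in /P0,4, shows each match and only the lines
that follow it.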

/U
     Show the filename with each matching line, instead of just once in a
     separate header. This UNIX-style output is useful with editors like
     Vim that can automatically jump to the file that contains a match.
     Some examples of UNIX-style output have been given earlier in this
     section.

     There's one small difference from UNIX grep output: UNIX grep
     suppresses the filename when there is only one input file, but GREP
     assumes that if you didn't want the filename you wouldn't have
     specified the /U option. Neither GREP nor UNIX grep displays a
     filename if input comes from a file via < redirection.

Some combinations of output options are logically incompatible. For
instance, /H/L makes no sense. In such cases, GREP will turn off one of the
incompatible options and tell you what it did.

The following list of incompatibilities is given for completeness only:
       /B    overrides /H; ignored with /L or /U
       /C    overrides /H, /L, /N, /P
       /H    ignored with /B, /C, /L, /U
       /L    overrides /B, /H, /N, /P, /U; ignored with /C
       /N    ignored with /C or /L
       /P    ignored with /C or /L
       /U    overrides /B and /H; ignored with /L

  ------------------------------------------------------------------------

                            Regular expressions

  ------------------------------------------------------------------------
A regular expression is essentially a string with a bunch of operators
thrown in to express possibilities like "any of these characters" and
"repeated".

Normal and special characters

To understand regular expressions, you need to know the difference between
special characters and normal characters. (The meanings of the special
characters will be explained in the next section.)

The following characters are special if they occur in the listed contexts:

   * the backslash (\), always
   * the period (.), asterisk (*), plus sign (+), and left square bracket
     ([), anywhere except within square brackets
   * the caret (^), only at the beginning of the regular expression or
     immediately after a left square bracket
   * the dollar sign ($), only at the end of the regular expression
   * the minus sign or hyphen (-), only between square brackets

Any other character, or one of the above characters not in the listed
context, is a normal character. Any of the above characters also becomes a
normal character if preceded by a backslash, as will be shown below.

How to construct a regular expression

Here are the rules for a regular expression:

single character
     Any normal character matches itself. To match a special character,
     precede it with a backslash (\). Example: to search for the string
     "^abc\def", you must put backslashes before the two special characters
     to make GREP treat them as normal characters and not give them special
     meanings, so that \^abc\\def is your regular expression.

     You can use any character from space through character 255. If using
     8-bit characters on the command line, see Special rules for the
     command line below.

character class
     To match any one of a group of characters, enclose them in square
     brackets ([ ]). Examples: [aA] will match an upper- or lower-case
     letter A; sno[wr]ing will match "snowing" or "snoring".

     You can indicate a character range with the minus sign (-). Examples:
     [0-9] will match any single digit, and [a-zA-Z] will match any English
     letter. To match any Western European letter (under most recent
     versions of Windows, in North America and Western Europe), use
     [a-zA-ZÀ-ÖØ-öø-ÿ].

     A character class can contain both ranges and single characters, and
     the order doesn't matter as long as each range is written low-high.

negative character class
     To match any character that is not in a class, use square brackets
     with a caret (^). Examples: [^0-9 ] matches any character except a
     digit or a space; the[^a-z] matches "the" followed by anything except
     a lower-case letter.

     Note: The negative character class matches any character not within
     the square brackets, but it does match a character. For instance,
     the[^a-z] matches "the" followed by something other than a lower-case
     letter; it does not match "the" at the end of a line because then
     "the" is not followed by any characters. Please see the extended
     example at the end of these rules for further explanation.

repetition
     A plus sign (+) after a character or character class matches one or
     more occurrences; an asterisk (*) matches zero or more occurrences.
     Examples: snor+ing matches "snoring", "snorring", "snorrring", and so
     on, but not "snoing". snor*ing matches "snoing", "snoring", and so on.

     Used with a character class, the plus sign and asterisk match any
     multiple characters in the class, not only multiple occurrences of the
     same character. For instance, sno[rw]+ing matches "snowing",
     "snorwing", "snowrring", and so on.

     Obligatory example: [A-Za-z_]+[A-Za-z0-9_]* matches a C or C++
     identifier, which is at least one letter or underscore, followed by
     any number of letters, digits, and underscores.
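
The repetition examples can be checked with Python's re module, which
treats + and * the same way for these particular patterns. Python is used
here only as a convenient test bed; it is not GREP, and the helper name
matching is hypothetical.

```python
import re

def matching(pattern, words):
    """Return the words that contain a match for pattern."""
    return [w for w in words if re.search(pattern, w)]

words = ["snoing", "snoring", "snorring", "snowing", "snorwing"]

print(matching(r"snor+ing", words))     # ['snoring', 'snorring']
print(matching(r"snor*ing", words))     # ['snoing', 'snoring', 'snorring']
print(matching(r"sno[rw]+ing", words))  # all except 'snoing'

# The identifier pattern matches the leading part of a C identifier.
print(re.match(r"[A-Za-z_]+[A-Za-z0-9_]*", "op_showhead = 1").group())
```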

start of line, end of line
     A caret (^) at the start of a regular expression means that the
     pattern starts at the beginning of a line in the file(s) being
     searched. A dollar sign ($, ASCII 36) at the end of a regular
     expression means that the pattern ends at the end of a line in the
     file(s) being searched.

     Example: ^[wW]hereas matches the word "Whereas" or "whereas" at the
     start of a line, but not in the middle of a line. Blanks are not
     ignored, so if you want to find that word whenever it's the first word
     of the line, you need to use a pattern like ^ *[wW]hereas to allow for
     indentation.

     Examples: ^$ will find lines that contain no characters at all. ^ *$
     will match lines that contain no characters or contain only spaces.
     ^ +$ will match lines that contain only spaces, but not empty lines.

     Examples: ^[A-Za-z]+$ will find every line that contains nothing but
     English letters. ^ *[A-Za-z]+ *$ will find every line that contains
     exactly one English word, possibly preceded or followed by blanks.
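
The anchored patterns above can be tried against a few sample lines in
Python, whose ^ and $ behave the same way for these cases. This is a test
bed for the patterns, not a model of GREP itself; the sample lines and the
helper name matching are made up for the illustration.

```python
import re

lines = ["", "   ", "Whereas", "  whereas the party", "word", "two words"]

def matching(pattern):
    """Return the sample lines that contain a match for pattern."""
    return [line for line in lines if re.search(pattern, line)]

print(matching(r"^$"))               # ['']
print(matching(r"^ *$"))             # ['', '   ']
print(matching(r"^ +$"))             # ['   ']
print(matching(r"^ *[wW]hereas"))    # ['Whereas', '  whereas the party']
print(matching(r"^ *[A-Za-z]+ *$"))  # ['Whereas', 'word']
```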

Extended example: suppose you want to find the word "the" in a file,
whether in caps or lower case. You can use the /I option to make the search
case blind, and concentrate on constructing the regular expressions. At
first glance, [^a-z]the[^a-z] seems adequate: anything other than a letter,
followed by "the", followed by anything but a letter. That lets in "the"
and rules out "then" and "mother". But it also rules out "the" at the
beginning or end of a line. Remember that a negative character class does
insist on matching some character. So the solution is to have four regular
expressions, for "the" at the beginning, middle, or end of a line, or on a
line by itself:

        ^the[^a-z]
        [^a-z]the[^a-z]
        [^a-z]the$
        ^the$

So to search for just the occurrences of the word "the", you'd put those
four lines in a file and then use the /F option on GREP.
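
To see why all four expressions are needed, the same patterns can be run
in Python with case-insensitive matching standing in for the /I option.
This is a sketch for experimentation, not GREP's implementation; the
helper name has_word_the is hypothetical.

```python
import re

# The four expressions from the extended example, one per position of "the".
patterns = [r"^the[^a-z]", r"[^a-z]the[^a-z]", r"[^a-z]the$", r"^the$"]
compiled = [re.compile(p, re.IGNORECASE) for p in patterns]

def has_word_the(line):
    """True if any of the four expressions matches somewhere in line."""
    return any(rx.search(line) for rx in compiled)

for line in ["The cat", "see the cat", "find the", "the", "then", "mother"]:
    print(repr(line), has_word_the(line))
```

The first four sample lines match and the last two do not. Dropping any
one of the four expressions loses one of the first four lines.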

Special rules for the command line

When you enter regular expressions in a file (the /Ffile option) or from
the keyboard (the /F- option), the above rules are sufficient. But when you
enter a regular expression on the command line, you also have to contend
with DOS command parsing. Putting double quotes around the expression will
help, but it doesn't avoid all problems.

Suppose you want to search for a character like < or |. The DOS
command-line parser always gives these characters special meanings, so if
you put them in a regular expression GREP will never see them. Therefore,
GREP defines several escape sequences to let you make an end run around
DOS:

   * \" is the double quote (ASCII 34)
   * \c is the comma (,)
   * \e is the escape character (ASCII 27)
   * \g is the greater-than sign (>)
   * \i is the semicolon (;)
   * \l is the less-than sign (<)
   * \q is the equal sign (=)
   * \s is the space character (ASCII 32)
   * \t is the tab character (ASCII 9, Control-I)
   * \v is the vertical bar (|)

In addition, you can enter any character as a numeric sequence in decimal,
hex (leading 0x), or octal (leading 0). Example: capital A would be \65,
\0x41, or \0101.
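
The escape sequences can be summarized as a translation table. The Python
sketch below decodes them in a separate pass; that is an assumption made
for illustration, since this document does not say how GREP processes them
internally. The names NAMED, ESCAPE, and decode_escapes are hypothetical,
and a numeric escape is assumed to extend over the longest run of valid
digits.

```python
import re

# Letter escapes listed above, mapped to the characters they stand for.
NAMED = {'"': '"', "c": ",", "e": "\x1b", "g": ">", "i": ";",
         "l": "<", "q": "=", "s": " ", "t": "\t", "v": "|"}

# A backslash followed by a hex, octal, or decimal number, or by any one
# other character.
ESCAPE = re.compile(r"\\(0[xX][0-9A-Fa-f]+|0[0-7]*|[1-9][0-9]*|.)")

def decode_escapes(text):
    """Replace the command-line escapes in text with literal characters."""
    def replace(m):
        body = m.group(1)
        if body[0].isdigit():
            if body[:2].lower() == "0x":
                return chr(int(body, 16))      # hex, leading 0x
            if body[0] == "0":
                return chr(int(body, 8))       # octal, leading 0
            return chr(int(body, 10))          # decimal
        # Other sequences (such as \\ or \^) are left for the regular
        # expression rules to interpret.
        return NAMED.get(body, m.group(0))
    return ESCAPE.sub(replace, text)

print(decode_escapes(r"\65\0x41\0101"))   # AAA
print(decode_escapes(r"pic[t\s]"))        # pic[t ]
```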

Finally, if your regular expression begins with a minus (-) or slash (/),
GREP will try to interpret it as an option. Example: if you're searching
for the string "-in-law", GREP will think you're trying to turn on the
options /I, /N, and so on. To avoid this problem, use a leading backslash
(\-in-law).

Remember, the rules in this section are required only to get around parsing
problems on the command line. These escape sequences are not needed, and
don't work, in regular expressions in a file or when you use the /F- option
to enter regular expressions on separate lines from the keyboard.

  ------------------------------------------------------------------------

                            Environment variable

  ------------------------------------------------------------------------
If you use certain options frequently, you can put them in the ORS_GREP
environment variable. You have the same freedom as on the command line:
leading slashes or hyphens, space separation or options run together, caps
or lower case.

Only options can be put in the environment variable. If you want to "can" a
regular expression, put it in a file and put the /Ffile option in the
environment variable.

If you have some options in the environment variable but you don't want one
of them for a particular run of GREP, you don't have to edit the
environment variable. You can make most changes on the command line, like
this:

   * All of the single-letter options function as toggles. That means that
     if you set one of those options in the environment variable, you can
     remove it by specifying it on the command line. For instance, if you
     usually want to see the line numbers of matching lines, put /N in the
     environment variable. Then if you don't want line numbers in a
     particular run of GREP, just specify /N on the command line for that
     run to cancel the /N option set in the environment variable.

   * The numeric options /0 and /1, which set return values from GREP,
     override each other. The latest one specified on the command line will
     be effective.

   * The /D and /F options, if set in the environment variable, cannot be
     turned off on the command line. However, you can specify different
     files on the command line with either of those options.

   * /P in the environment variable can be overridden by a different /P
     setting on the command line. You can use /P0 to request no context
     lines.
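
The toggle rule for single-letter options can be modeled in a few lines of
Python. The helper name effective_toggles is hypothetical, and the sketch
covers only the single-letter toggles, not /0, /1, /D, /F, or /P.

```python
def effective_toggles(env_letters, cmd_letters):
    """Combine toggle options from ORS_GREP and from the command line.

    Both arguments are strings of option letters, such as "NI".
    """
    active = set()
    for letter in env_letters.upper() + cmd_letters.upper():
        # Each occurrence flips the option: on if it was off, and vice versa.
        active ^= {letter}
    return active

# /N and /I in the environment variable, /N again on the command line:
print(sorted(effective_toggles("NI", "n")))   # ['I']
```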

If you're ever in doubt about the interaction of options between the
command line and the environment variable, simply type

        grep /d

and GREP will tell you all the option settings in effect.

  ------------------------------------------------------------------------

                               Return values

  ------------------------------------------------------------------------
By default, GREP will return one of the following values to DOS, and you
can test the return value with IF ERRORLEVEL in a batch file.



You might want to use GREP in a batch file or a makefile and take different
actions depending on whether matches were found or not. To do this, use the
/0 or /1 option. The /1 option returns an error level of 1 if matches were
found or 0 if there were no matches. /0 is the opposite: it returns 0 if
there were matches or 1 if there were none. In other words, the /0 or /1
option gives the value you want GREP to return if matches are found.
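
Stated as code, the rule for /0 and /1 looks like the Python sketch below.
The function name exit_status is hypothetical, and the sketch covers only
the match and no-match outcomes, not any values GREP returns for errors.

```python
def exit_status(matches_found, option):
    """Return the error level for option "0" or "1" (sketch only)."""
    value_if_found = int(option)
    # The option names the value returned when matches are found; the
    # other value of the pair is returned when there are none.
    return value_if_found if matches_found else 1 - value_if_found

for option in ("0", "1"):
    print(f"/{option}: found ->", exit_status(True, option),
          " none ->", exit_status(False, option))
```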

  ------------------------------------------------------------------------

                                    Bugs

  ------------------------------------------------------------------------

Regular expressions are limited to 127 input characters, and GREP will
behave strangely if you enter a longer expression.

GREP's regular expressions are slightly different from UNIX grep's.
Specifically, to accommodate DOS command-line parsing, GREP defines quite a
few more escape characters like \c and \s, as well as numeric escapes. On
the other hand, GREP does not (yet) implement ?, \<, \(, \{, and \| in
regular expressions.

  ------------------------------------------------------------------------

                                What's new?

  ------------------------------------------------------------------------
Here's what's new in version 4.2, the latest version. A complete revision
history is also available.

   * Allow 8-bit characters in regular expressions.
   * Allow multiple regular expressions to be typed in directly, with the
     /F- option.
   * Fix a bug: with /I, character classes entered in lower case were
     expanded incorrectly.
