"INSTANT INDEX" TEXT SEARCH ENGINE IS MANY TIMES FASTER THAN OTHER INDEXING
AND TEXT SEARCH TECHNOLOGIES AND COSTS LESS

MCLEAN, VA, March 22 -- HT ENTERPRISES' INSTANT INDEX IS LESS EXPENSIVE THAN
OTHER TEXT INDEX AND RETRIEVAL PRODUCTS, AND OUTPERFORMS THEM ALL
RADICALLY. THE INSTANT INDEX PERSONAL EDITION SELLS FOR $99.95, THE
PROFESSIONAL VERSION FOR $395.00. INDEXING WITH INSTANT INDEX IS MANY
TIMES FASTER THAN WITH OTHER SEARCH ENGINES; INDEXING RATES APPROACHING
100 MB/MINUTE HAVE BEEN REALIZED WITH FAST, VESA-BUS 486 MACHINES USING
CACHING DISK CONTROLLERS. INDEX SIZES CAN BE VARIED ACCORDING TO NEED, BUT
ARE USUALLY ABOUT 7% OF THE SIZE OF THE ORIGINAL TEXT, IN SHARP CONTRAST TO
OTHER PRODUCTS, WHOSE INDICES ARE MUCH LARGER. UNLIKE OTHER TEXT SEARCH SYSTEMS,
INSTANT INDEX HAS NO FILE-SIZE LIMITS. ACCURACY PROBLEMS ASSOCIATED WITH
OTHER PRODUCTS ARE NOTABLY ABSENT IN INSTANT INDEX. INSTANT INDEX WILL
TURN UP AN OCCASIONAL DOUBLE HIT; SOME IF NOT MOST OTHER PACKAGES APPEAR
TO BE ERRING ON THE SIDE OF NOT FINDING ALL POSSIBLE HITS (SEE INFO WORLD,
5/24/93 PAGE 133). OPERATIONS REQUIRING KNOWLEDGE OF SYMBOLIC LOGIC WITH
OTHER PRODUCTS ARE PERFORMED EASILY WITH INSTANT INDEX AND REQUIRE NO SUCH
EXPERTISE.

Text Indexing developer Holden Technical Enterprises, Inc. (McLean, VA) is
now offering a complete professional copy of a high speed text index and
search software product for under $400. INSTANT INDEX will index any files,
including word processing, database, and executable files. It requires
Windows 3.1, a 386 or 486 CPU, and at least 4 MB of RAM.

"Holden Technical Enterprises, Inc.'s quantum leap in text indexing
technology does not mean a quantum leap in price," said Jim Young, product
manager for INSTANT INDEX. "Users can now realize a many-fold increase in
indexing speed over standard text index/retrieval software with a
relatively small investment."

Instant Index is similar to certain kinds of AI routines in the manner in
which it uses computer memory. Nonetheless, it is a real-world-oriented
program with many features which will endear it to business and government
organizations. Cascade searching allows searching for multiple criteria
which may occur in various parts of the documents being searched. The
STICKY NOTE feature allows you to quickly re-access data turned up by
previous searches. FUZZY LOGIC features allow searching of data in
imperfect condition. Wild-card searching is via simple omission of trailing
letter groups.

In fact, INSTANT INDEX is so fast, feature rich and user friendly, there is
no need to look further. INSTANT INDEX is available to computer dealers
and resellers nationwide.

Holden Technical Enterprises, Inc.
PO Box 3423, McLean, VA 22103
(703) 719-6142,   FAX: (703) 971-8192
-----------------------------------------------------------------
HT Enterprises' INSTANT INDEX
Full Text Search Product
A White Paper

Full text search has a reputation as one of the "black-arts" of computer
science; that users are actually paying as much as $20,000 for a license
to watch something take 30 hours to happen on a 486-based microcomputer is
indeed seen as magical by many observers. A recent issue of PC Week
mentioned a leading text-search product taking more than 2.5 hours to
index a 27 MB file. That's not a solution to anything; that's another
problem.

It appears to go downhill from there. Index files (the keyword tables and
linked lists etc.) tend to be major fractions of the size of the original
data files, which can double a user's costs for disk drives. There are
major questions regarding the accuracy of most commercial text search
software, and there are limitations on file sizes.

HT Enterprises' Instant Index product is intended as a solution to these
and other problems involving text search. Instant Index runs on
microcomputers running MS-Windows 3.1, with OS/2, UNIX, and other versions
available in the near future.

This package does not use word tables or tables of linked lists at all, but
is based upon a distant cousin of the Fourier transform. Like the
Fourier transform, which converts a time-domain function into one in the
frequency domain, the HT Enterprises software converts the ordinary
representation of text, which is content oriented, into a representation
which is location oriented.
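
The paper does not disclose the transform itself, so the following is only
a rough illustration of what a "location oriented" index without word
tables can look like. It swaps in a well-known, unrelated technique (a
per-sector signature built from hashed character trigrams) and is not HT
Enterprises' method. The names SECTOR_SIZE, SIGNATURE_BITS, build_index,
candidate_sectors and search are all hypothetical, and strings that
straddle two sectors are not handled in this sketch.

```python
SECTOR_SIZE = 512      # assumed block size, chosen for illustration only
SIGNATURE_BITS = 256   # assumed signature width per sector (hypothetical)


def trigrams(text):
    """Yield every run of three consecutive characters, lower-cased."""
    lowered = text.lower()
    for i in range(len(lowered) - 2):
        yield lowered[i:i + 3]


def bit_for(trigram):
    """Map a trigram to one bit position with a small deterministic hash."""
    value = 0
    for ch in trigram:
        value = (value * 31 + ord(ch)) % SIGNATURE_BITS
    return 1 << value


def build_index(text):
    """Return one integer bitmask per sector: locations, not a word table."""
    signatures = []
    for start in range(0, len(text), SECTOR_SIZE):
        mask = 0
        for tri in trigrams(text[start:start + SECTOR_SIZE]):
            mask |= bit_for(tri)
        signatures.append(mask)
    return signatures


def candidate_sectors(signatures, query):
    """Sectors whose signature contains every bit the query needs."""
    needed = 0
    for tri in trigrams(query):
        needed |= bit_for(tri)
    return [n for n, mask in enumerate(signatures)
            if (mask & needed) == needed]


def search(text, signatures, query):
    """Filter by signature, then verify each candidate against the text."""
    hits = []
    for n in candidate_sectors(signatures, query):
        sector = text[n * SECTOR_SIZE:(n + 1) * SECTOR_SIZE].lower()
        if query.lower() in sector:
            hits.append(n)
    return hits


sample = "a" * 600 + "needle" + "b" * 600
index = build_index(sample)
print(search(sample, index, "needle"))
```

In this sketch a longer query sets more required bits, so fewer sectors
survive the filter and less text has to be verified. The signature costs
32 bytes per 512-byte sector, a figure that follows from the assumed
constants only and says nothing about the product's internals.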

Times of less than 40 seconds have been noted (on a typical AT-bus 33 MHz
486 PC with a 17 ms disk) for indexing the King James Bible. Indexing a
directory with over 7 MB of text in over 300 files on the same machine
takes less than two minutes. Instant Index solves numerous problems, but
the big emphasis is on indexing speed, which has been the biggest problem
with text search software until now. In this regard, Instant Index is the
breakthrough users have sought.
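
For a rough sense of scale, the figures quoted in this paper can be turned
into indexing rates. The short calculation below uses only numbers stated
above (27 MB in 2.5 hours for the product PC Week described, and 7 MB in
two minutes for Instant Index). Both are bounds rather than measurements,
the hardware behind the PC Week figure is not stated here, and the helper
name mb_per_minute is hypothetical.

```python
def mb_per_minute(megabytes, minutes):
    """Indexing rate in megabytes per minute."""
    return megabytes / minutes


# "more than 2.5 hours to index a 27 MB file": at most this rate
pc_week_rate = mb_per_minute(27, 2.5 * 60)

# "over 7 MB ... takes less than two minutes": at least this rate
directory_rate = mb_per_minute(7, 2)

print(f"Product cited by PC Week: at most {pc_week_rate:.2f} MB/minute")
print(f"Instant Index directory:  at least {directory_rate:.2f} MB/minute")
print(f"Ratio of the two bounds:  about {directory_rate / pc_week_rate:.0f}x")
```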

Index sizes can be varied according to needs with Instant Index, but most
indices are around seven percent the size of the text file(s), as compared
to the 25% - 35% figures for most packages. Files being searched can be
very large (130 MB files have been indexed and searched), and may be
either pure text or application files. Instant Index can find search hits
in application files and then view those files in the application(s) which
created them.
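
The disk-space difference is easy to work out for the 130 MB file
mentioned above. The calculation below applies the seven percent figure
and the 25% - 35% range quoted in this paper; the function name
index_size_mb is hypothetical.

```python
def index_size_mb(text_mb, percent):
    """Size of an index that is `percent` percent of the text size."""
    return text_mb * percent / 100


text_mb = 130  # the large file size quoted in the paper

instant = index_size_mb(text_mb, 7)
other_low = index_size_mb(text_mb, 25)
other_high = index_size_mb(text_mb, 35)

print(f"Index at 7%:        {instant:.1f} MB")
print(f"Index at 25% - 35%: {other_low:.1f} MB to {other_high:.1f} MB")
```

On these figures the index for a 130 MB file is about 9.1 MB, against
32.5 MB to 45.5 MB at the higher percentages.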

Searching speed with Instant Index ranges from average to phenomenal, and
increases radically with practice and familiarity. In contrast to normal
text search software, Instant Index has some of the same characteristics
as human memory. Normal text search software gets slow when given longer
search strings; Instant Index gets faster. The more specific the search
criteria you give it, i.e. the longer the search string, or the more words
you can give it to work with, the faster the process becomes. This is in
keeping with good usage. Rather than turning up 300 hits on "Johnny" and
hoping to find the two which also mention "Frankie" by hand, you normally
(with Instant Index) enter "Frankie Johnny" or "Johnny Frankie" and get
only the hits you really are interested in. No knowledge of symbolic logic
or Boolean operations is required of the user.
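
The behaviour described above, where several words typed together act as
an implicit AND regardless of order, can be sketched in a few lines. This
is an illustration of the user-visible behaviour only, not of how Instant
Index works internally; the function name hits_for and the sample passages
are made up for the example.

```python
def hits_for(passages, criteria):
    """Return indexes of passages containing every word in the criteria.

    Matching is case-insensitive and ignores word order, so the user never
    has to write an explicit Boolean expression.
    """
    words = criteria.lower().split()
    return [
        number
        for number, passage in enumerate(passages)
        if all(word in passage.lower() for word in words)
    ]


passages = [
    "Johnny went home",
    "Frankie and Johnny were sweethearts",
    "Johnny sang",
    "Frankie waited",
]

print(hits_for(passages, "Johnny"))          # every passage naming Johnny
print(hits_for(passages, "Frankie Johnny"))  # only passages naming both
print(hits_for(passages, "Johnny Frankie"))  # same result, order ignored
```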

Words such as "the" or "and" add nothing to a typical search for obvious
reasons, and may be omitted. With Instant Index, you just leave them out.
Other packages require you to set up and maintain lists of official
non-words. Searches which would require symbolic logic operations with
other packages are the normal, simple usage of Instant Index. Users have
noted that Instant Index is friendlier and easier to use than other
packages, and appears to be finding more information, and turning up more
hits on the same data set.

The design of Instant Index calls for erring on the side of turning up an
occasional double hit; other packages appear to be erring on the side of
not finding all possible hits (see Info World, 5/24/93 Page 133).

Wildcards are by omission; e.g. a search for "tradi" will turn up
"trading", "tradition", "traditional", and all other variations. Boolean
joins are by simply including more than one word, not necessarily in
order, in the search criteria. Fuzzy-logic searching is included in Instant
Index and the notion of a wildcard serving for more than one character is
subsumed in this function. Fuzzy logic searches are not 100% reliable, and
require larger indices than the minimal seven percent.
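
The wildcard-by-omission rule described above amounts to matching on the
leading letters of a word. A minimal sketch, assuming that a hit means a
word that starts with the letters typed (the paper does not say whether
matches inside a word also count); the function name prefix_matches is
hypothetical.

```python
import re


def prefix_matches(text, stem):
    """Return the distinct words in `text` that begin with `stem`."""
    stem = stem.lower()
    words = re.findall(r"[a-z0-9]+", text.lower())
    return sorted({word for word in words if word.startswith(stem)})


sample = "Trading in tradition is a traditional trade."
print(prefix_matches(sample, "tradi"))
```

Here "trade" is not returned, because it does not begin with the five
letters "tradi".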

Fuzzy logic is a last resort, and is normally recommended only for scanned
text, in which a certain number of misspellings are expected and routine.
A better solution even then, given time, would be to run the data through
a commercial spell-checker, and then through text search.
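
The paper does not define its fuzzy-logic matching, so the sketch below
substitutes a generic string-similarity measure from the Python standard
library (difflib) to show the kind of scanning error such a search is
meant to tolerate. The cutoff value, the function name fuzzy_candidates
and the sample words are assumptions made for the example.

```python
import difflib


def fuzzy_candidates(term, vocabulary, cutoff=0.8):
    """Return words from `vocabulary` that closely resemble `term`."""
    return difflib.get_close_matches(
        term.lower(), vocabulary, n=5, cutoff=cutoff
    )


# "traditlon" stands in for a scanner misreading "i" as "l"
scanned_words = ["traditlon", "bible", "shakespeare"]
print(fuzzy_candidates("tradition", scanned_words))
```

Lowering the cutoff admits more distant spellings at the price of more
false hits, which is one reason to treat this kind of search as a last
resort.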

Proximity for searching is normally taken to be two lines (i.e. finding the
words in a search phrase within two lines of text), but may be set to
sector proximity. Noise words must then be omitted from search criteria
for obvious reasons.
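
Two-line proximity can be pictured as sliding a window of two consecutive
lines over the text and reporting each window that contains every search
word. The sketch below illustrates that idea only; the function name
proximity_hits, the window parameter and the sample lines are hypothetical,
and sector proximity is not modelled.

```python
def proximity_hits(lines, criteria, window=2):
    """Return 1-based numbers of the first line of each matching window.

    A window matches when every word of the criteria appears somewhere
    within `window` consecutive lines, in any order.
    """
    words = criteria.lower().split()
    hits = []
    for first in range(len(lines)):
        chunk = " ".join(lines[first:first + window]).lower()
        if all(word in chunk for word in words):
            hits.append(first + 1)
    return hits


lines = [
    "Frankie went to town",
    "and met Johnny there",
    "nothing of interest here",
    "Johnny alone",
]

print(proximity_hits(lines, "Frankie Johnny"))            # two-line window
print(proximity_hits(lines, "Frankie Johnny", window=1))  # same line only
```

Because the windows overlap, a line that contains every search word by
itself can be reported by two neighbouring windows in this sketch.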

A count of hits is given and the first hit shown in context on the screen.
A tool dialog box allows a user to quickly view hits by file, and to
launch application files containing hits into their applications. Control
returns to Instant Index.

Motion control within documents containing hits, as shown on the screen, is
via arrow and page keys, vertical scroll bar, and FORWARD and BACK menu
keys which move 512 bytes at a clip.

Indexing and searching is done on a directory level, with an option for
recursive descent into subdirectories, i.e. a single index may suffice for
an entire directory system. Instant Index, in theory at least, could work
with files up to around 700 MB on one of today's 486 micros. Files of 130
MB have actually been indexed and searched.

Instant Index occasionally returns a double hit, i.e. returns the same hit
twice. This is a very minor nuisance which is unavoidable in the design of
such a package. It is a by-product of the system for ensuring that search
strings which span two file sectors still get reported without losing
performance or increasing index file size. Other than in the case of fuzzy
logic searching, there is no known possibility of Instant Index ever
missing a hit for given search criteria. In the case of application
(binary) files, there is no guarantee that Instant Index will be able to
display all hits on the screen, since certain combinations of binary
characters will either defeat the Windows API display routines or appear as
large numbers of line-feeds and scroll off the screen, but Instant Index
will always be able to launch Windows applications to display the documents
they create.
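
The double hits described earlier in this section are what one would
expect from any scheme that reads a little past each sector boundary so
that strings spanning two sectors are not lost. The sketch below shows the
effect with a fixed overlap; it is an illustration under assumed values
(SECTOR_SIZE, OVERLAP) and a hypothetical function name (scan_sectors), not
a description of the product's internals.

```python
SECTOR_SIZE = 512  # assumed sector size for the illustration
OVERLAP = 32       # assumed read-ahead; covers strings up to 33 characters


def scan_sectors(text, query):
    """Scan sector by sector, reading OVERLAP characters into the next one.

    The read-ahead guarantees that a string spanning a sector boundary is
    found, but a string that starts inside the read-ahead region is seen
    from both sectors and therefore reported twice.
    """
    hits = []
    for start in range(0, len(text), SECTOR_SIZE):
        window = text[start:start + SECTOR_SIZE + OVERLAP]
        pos = window.find(query)
        while pos != -1:
            hits.append(start + pos)
            pos = window.find(query, pos + 1)
    return hits


# One "needle" straddles the boundary at 512; another starts just after it.
sample = "x" * 508 + "needle" + "x" * 6 + "needle" + "x" * 500
print(scan_sectors(sample, "needle"))
```

The occurrence at offset 508 straddles the boundary and is reported once;
the occurrence at offset 520 lies in the read-ahead region and is reported
twice. A caller that minds the repeats could collapse them afterwards with
sorted(set(hits)).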

Aside from lines which get redlined by the Verify function (hits), you can
hold the left mouse key down and redline any lines which appear on the
screen. Clicking the right mouse key undoes any redlining. The Copy/Paste
key puts any redlined text into the MS Windows clipboard edit buffer, from
which it may be retrieved using the "Paste" feature of any full-function
MS-Windows word processor. This is the normal use of Instant Index.
Basically, you find something you want in a huge text file, then you paste
it into a word processor and do your own thing with it. This allows maximum
possible flexibility, without the wasted effort of duplicating the
functionality of a word processor within the text search engine.

Instant Index is not case sensitive. That would double times for everything
with very questionable benefit in our view, and Instant Index is optimized
for speed. Instant Index searches on characters given as the alphabet in
the PARAMETERS dialog box while indexing, along with a selectable group of
other characters, typically just the numbers 0 - 9. Other characters
should be added where there is a reason: for instance, for Bible searching,
you might also include a colon (:) to allow you to search for such things
as "Gen 1:7". For Shakespeare, include an apostrophe (since Shakespeare
uses 'd in place of "ed"). Instant Index allows a total of 60 characters
all told, counting each upper/lower-case pair as one character.
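
The effect of the searchable character set can be sketched as a
normalization step applied to both the text and the search string. The
version below assumes the plain 26-letter alphabet plus a user-chosen group
of other characters, folds case, and treats everything else as a separator;
the names make_alphabet and normalize are hypothetical and do not
correspond to anything in the PARAMETERS dialog box.

```python
import string

MAX_CHARACTERS = 60  # limit stated in the paper; a case pair counts once


def make_alphabet(others=string.digits):
    """Build the set of searchable characters: letters plus `others`."""
    alphabet = set(string.ascii_lowercase) | set(others.lower())
    if len(alphabet) > MAX_CHARACTERS:
        raise ValueError("alphabet exceeds 60 searchable characters")
    return alphabet


def normalize(text, alphabet):
    """Fold case and replace every non-searchable character with a space."""
    return "".join(ch if ch in alphabet else " " for ch in text.lower())


default_set = make_alphabet()
bible_set = make_alphabet(string.digits + ":")
shakespeare_set = make_alphabet(string.digits + "'")

print(normalize("Gen 1:7", default_set))        # colon is not searchable
print(normalize("Gen 1:7", bible_set))          # colon kept
print(normalize("Belov'd", shakespeare_set))    # apostrophe kept
```

With the default set the colon in "Gen 1:7" is lost, so the reference
cannot be told apart from the separate tokens "1" and "7"; adding the colon
keeps it searchable as written.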

Foreign languages are no problem for Instant Index, and require only an
MS-Windows font and text which uses that font. In particular, this does
not even depend upon using the Roman alphabet; HTE has demonstrated
searching Pushkin's works with Instant Index. Such capabilities should be
of great use in the intelligence field, as well as in academia.

 ============================================================
 From the  'New Product Information'  Electronic News Service
 ============================================================
 This information was processed from data provided by the
 above mentioned company. For additional details, contact 
 the company at the address or telephone number indicated.
 ============================================================
