
                        VEDA NeurOCR - Version 2.0
                             ( for Windows )


    I. ABOUT VEDA NeurOCR
    =====================

    VEDA NeurOCR is an OCR/ICR software package based on Artificial
    Neural Networks (ANN). It was designed and developed as a tool
    for those who have a large amount of printed text to feed into
    their computers and want to do it in a truly automatic way. VEDA
    NeurOCR is a professional product, the result of hundreds and
    thousands of hours of theoretical research, programming and
    testing. It was developed starting from original theoretical ANN
    approaches in the fields of neural models and classifiers.

    a) INPUT:
    ~~~~~~~~~
    When recognizing with VEDA NeurOCR, we usually assume that you
    have already scanned (in "line-art" mode) the original
    document(s) you want to "feed" the computer with, and have saved
    them as PCX or TIFF (uncompressed or PackBits compressed)
    file(s), which are the input formats accepted by VEDA NeurOCR.
    The acquisition software accompanying any scanner has at least
    one of these image file formats among its saving options. A
    scanning resolution of 300 dpi (400 dpi for text printed in font
    sizes smaller than 8) is recommended.
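
    The PackBits scheme mentioned above is a simple byte-oriented
    run-length encoding (TIFF compression code 32773). The sketch
    below shows how such data can be decoded. It is an illustration
    in Python, not VEDA NeurOCR's own code, and the function name
    unpack_bits is hypothetical.

```python
def unpack_bits(data: bytes) -> bytes:
    """Decode a PackBits-compressed byte string (illustrative sketch)."""
    out = bytearray()
    i = 0
    while i < len(data):
        n = data[i]          # header byte, read as unsigned 0..255
        i += 1
        if n < 128:
            # Header 0..127: copy the next n + 1 bytes literally.
            out += data[i:i + n + 1]
            i += n + 1
        elif n > 128:
            # Header 129..255 (-127..-1 as a signed byte):
            # repeat the next byte 257 - n times (2..128 copies).
            out += data[i:i + 1] * (257 - n)
            i += 1
        # Header 128 (-128 as a signed byte) is a no-op.
    return bytes(out)
```

    Each header byte announces either a literal run or a repeated
    byte, so the decoder never needs to look ahead.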

    VEDA NeurOCR V.2.0 also provides a TWAIN acquisition interface
    for controlling devices (such as scanners) designed in compliance
    with the TWAIN standard. Scanned images are stored (and then
    used) in line-art PCX format.

    The original image can be rotated by 90 degrees, if necessary.

    b) OUTPUT:
    ~~~~~~~~~~
    VEDA NeurOCR uses its "knowledge" and other capabilities to
    "read" text from PCX or TIFF (uncompressed or PackBits
    compressed) image file(s), and stores it for you in the
    corresponding ASCII text file(s).

    VEDA NeurOCR usually provides formatted ASCII text output (the
    recognition result follows the original document layout). The
    output text can also be obtained as "decolumnized" ASCII (any
    columns and/or blocks of text are converted to successive
    paragraphs). A general DTP/WP paragraph-oriented text format
    (with <CR><LF> only at the end of each paragraph, not at the end
    of each line) is also available as an export option for the
    recognized text (DTP = Desktop Publishing, WP = Word Processor).
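
    The difference between line-oriented and paragraph-oriented
    output can be sketched as follows. This is only an illustration
    in Python: it assumes that paragraphs are separated by blank
    lines, which this document does not state, and the function name
    to_paragraph_format is hypothetical.

```python
def to_paragraph_format(text: str) -> str:
    """Join the lines of each paragraph; emit <CR><LF> only at paragraph ends.

    Assumption: a blank line marks the end of a paragraph.
    """
    paragraphs = []
    current = []
    for line in text.splitlines():
        if line.strip():
            # Non-blank line: it belongs to the current paragraph.
            current.append(line.strip())
        elif current:
            # Blank line: close the current paragraph, if any.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "".join(p + "\r\n" for p in paragraphs)
```

    Words hyphenated across line ends are not rejoined by this
    sketch; a real export would have to handle them as well.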

    c) MAIN FEATURES:
    ~~~~~~~~~~~~~~~~~
    VEDA NeurOCR comes with some default (ready-built) font-oriented
    knowledge bases. These were trained on printed samples of the
    main font families: Courier, Dutch (Times Roman), Swiss (Arial,
    Helvetica), 9-pin matrix printer and mechanical typewriter.

    It is also trainable. VEDA NeurOCR can learn any new character
    and font (typeface, style, size) you want. Training is
    interactive, fast and easy. It is always performed off-line, so
    that the recognition process itself can be fully automatic.
    Although characters from more than one font can be learned into
    the same knowledge base without problems, it is still recommended
    to keep each knowledge base dedicated to a single typeface and to
    give it an appropriate name.

    VEDA NeurOCR is able to "read" documents printed in several
    fonts (provided that the knowledge bases for all these fonts are
    available).

    VEDA NeurOCR can even be considered an "omnifont" OCR system.
    This means that, if a multi-font configuration is set up, with
    the default knowledge bases active and most of the customized
    font-related knowledge bases selected (but no more than 20 at
    once), VEDA NeurOCR can directly recognize almost any new source
    document, with no further settings or training needed.

    It can also be considered "multi-lingual", because it is able to
    "read" text from documents printed in many languages (mainly
    those based on Latin-like alphabets). In fact, it can learn any
    graphical sign for which an ASCII equivalent exists. The learning
    process is natural (based on examples), fast and easy.

    VEDA NeurOCR  can work in "batch processing" mode, using up to 50
    different input image files in each batch session.
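
    When more than 50 images have to be processed, they must be
    spread over several batch sessions. A minimal sketch of such a
    split, in Python, is shown below; the function name, the constant
    name and the file names are hypothetical and are not part of VEDA
    NeurOCR.

```python
from typing import Iterator, List

MAX_FILES_PER_BATCH = 50  # per-session limit stated in this document


def split_into_batches(paths: List[str],
                       size: int = MAX_FILES_PER_BATCH) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` file names."""
    for start in range(0, len(paths), size):
        yield paths[start:start + size]


# Example: 120 hypothetical scans need three batch sessions.
scans = [f"page{i:03d}.pcx" for i in range(120)]
batch_sizes = [len(batch) for batch in split_into_batches(scans)]
```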

    VEDA NeurOCR automatically detects and skips graphics (tables,
    boxes, column separators, pictures or other lines and graphical
    elements) in the source image and doesn't reproduce them in the
    recognized text file.

    VEDA NeurOCR V.2.0 also allows, if desired, an off-line
    interactive selection of the zones of the current source image
    that will then be used for recognition and/or training.

    It can obtain good recognition ratios even on poor quality input
    documents (produced on 9-pin matrix printers or old mechanical
    typewriters).

    VEDA NeurOCR solves in an efficient and elegant manner (also
    through neural network methods) the segmentation of physically
    connected ("touching") characters, mainly encountered in Dutch-
    and Swiss-like fonts. In almost all cases, it also splits rows
    that touch vertically.

    It contains an integrated Editor with which the recognized text
    can be analyzed, edited (corrected) and/or printed. The text can
    also be exported from the Editor in a general DTP/WP
    paragraph-oriented text format.

    VEDA NeurOCR provides a brief "Command Help" associated with the
    buttons that start its main functions. It also has an integrated
    "Help Viewer" through which the "User's Manual" text can be
    displayed at any time.

    d) PERFORMANCE:
    ~~~~~~~~~~~~~~~
    The recognition ratio can reach 99.99 % and even 100 %; its
    normal average value is about 98.50 % - 99.50 %.

    The recognition speed is about 30 to over 300  characters/second.
    It  is strongly dependent on:
    - the computer's CPU type and frequency,
    - the  complexity of the source image (scanned document),
    - the number and size of the knowledge base(s) used.
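
    The figures in this section can be turned into rough per-page
    estimates. The sketch below, in Python, assumes a page of 2000
    characters; that page size and the function names are assumptions
    made for illustration only. Under this assumption, the normal
    average ratios correspond to about 10 to 30 misread characters
    per page, and the quoted speeds to roughly 7 to 67 seconds per
    page.

```python
PAGE_CHARS = 2000  # hypothetical page size, not taken from this document


def expected_errors(chars: int, ratio_percent: float) -> float:
    """Expected number of misread characters at a given recognition ratio."""
    return chars * (1.0 - ratio_percent / 100.0)


def recognition_seconds(chars: int, chars_per_second: float) -> float:
    """Time needed to recognize `chars` characters at a given speed."""
    return chars / chars_per_second


# Normal average recognition ratio: 98.50 % to 99.50 %.
worst_errors = expected_errors(PAGE_CHARS, 98.50)
best_errors = expected_errors(PAGE_CHARS, 99.50)

# Recognition speed: 30 to 300 characters/second.
slowest = recognition_seconds(PAGE_CHARS, 30)
fastest = recognition_seconds(PAGE_CHARS, 300)
```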

    e) HARDWARE REQUIREMENTS:
    ~~~~~~~~~~~~~~~~~~~~~~~~~
    VEDA NeurOCR programs can run on any PC-AT with a 386 or higher
    CPU. However, a PC-AT configuration with at least a 486
    DX2/66MHz - DX4/100MHz CPU and 4 - 8 MB of RAM is recommended. At
    least 6.0 MB of free hard disk space must be available. The
    system must also include at least one floppy drive (i.e. a 3.5"
    HD one), a VGA graphics adapter and a mouse. VEDA NeurOCR's speed
    depends directly (and strongly) on the CPU type and frequency;
    for example, on an AT-Pentium/166MHz PC, substantial (>250-300%)
    speed improvements can be observed compared with an AT-486
    DX4/100MHz.

    For document acquisition, a 300-400 dpi, line-art, A4 flatbed
    scanner (possibly TWAIN compliant) is recommended.


    II. COPYRIGHT
    =============

    VEDA NeurOCR is proprietary software (C) and a trademark (TM) of:

                     VEDA INTERNATIONAL Ltd. (R)

                     1 Padurea Craiului Str.
                     Bldg. B2, Entr. 3, Suite 122,
                     Bucharest 3, ROMANIA

             CONTACTS:    Dan ONTANU,   Mihnea VREJOIU

               phones: +40 1 644-3058,  +40 1 726-4366
                faxes: +40 1 410-2404,  +40 1 665-7058

           e-mails: dano@roearn.ici.ro,  mihnea@roearn.ici.ro

    We always appreciate comments, suggestions and/or reports about
    any bugs, errors or other anomalies you may encounter when using
    VEDA NeurOCR V.2.0.

    Don't hesitate to contact us!

                                   * * *
