
G L O S S A R Y

Glossary of Terms from LegalScans


ADF
Annotations
ASCII
ASP
Bar Code
Batch Processing
Bitmap/Bitmapped
BMP
Boolean Logic
Briefcase
Burn (CDs or DVDs)
Caching (of Images)
CD Publishing
CD-R
CD-ROM
CD-ROM Drive
Client-Server Architecture vs. File-Sharing
COLD
COM
COM Object
Compression Ratio
CPU
De-shading
De-skewing
De-speckling
Dithering
Document Imaging
Drag-and-Drop
Duplex Scanners vs. Double-Sided Scanning
DVD
Electronic Document Management
Erasable Optical Drive
Flatbed Scanner
Folder Browser
Forms Processing
Full-text Indexing and Search
Fuzzy Logic
GIF
Gigabyte
Grayscale
Hierarchical Storage Management (HSM)
ICR
Image Enabling
Image Processing Card (IPC)
Index Fields
Internet Publishing
IPX/SPX
ISIS and TWAIN Scanner Drivers
ISO 9660 CD Format
JPEG
Jukebox
Key Field
Magneto-Optical Drive
MAPI
MFP
Near-Line
NetWare Loadable Module (NLM)
NT
n-tier architecture
OCR
Off-Line
On-Line
Optical Disks
Optical Jukebox
Phase Change
Pixel
Portable Volumes
RAID
Raster/Rasterized (Raster or Bitmap Drawing)
Redaction
Region (of an image)
Scale-to-Gray
Scalability
Scanner
SCSI
SCSI Scanner Interface
SQL
TCP/IP
Templates, Document
Thumbnails
TIFF
TIFF Group III (compression)
TIFF Group IV (compression)
Video Scanner Interface
Workflow, Ad Hoc
Workflow, Rule-Based
WORM Disks
ZIP
Zone OCR




ADF
Automatic Document Feeder. This is the means by which a scanner feeds the paper document.

Annotations
The changes or additions made to a document using sticky notes, a highlighter, or other electronic tools. Document images or text can be highlighted in different colors, redacted (blacked-out or whited-out), stamped (e.g. “FAXED” or “CONFIDENTIAL”), or have electronic sticky notes attached. Annotations should be overlaid and not change the original document.

ASCII
American Standard Code for Information Interchange. Used to define computer text built on a 7-bit set of 128 alphanumeric and control characters. ASCII has been a standard, non-proprietary text format since 1963.

ASP
Active Server Pages. A technology that simplifies customization and integration of Web applications. ASPs reside on a Web server and contain a mixture of HTML code and server-side scripts. An example of ASP usage includes having a server accept a request from a client, perform a query on a database, and then return the results of the query in HTML format for viewing by a web browser.

Bar Code
A small pattern of vertical lines that is read by a laser or an optical scanner, and which corresponds to a record in a database. An add-on component to imaging software, this feature is designed to increase the speed with which documents can be archived.

Batch Processing
The name of the technique used to input a large amount of information in a single step, as opposed to individual processes.

Bitmap/Bitmapped
See Raster/Rasterized.

BMP
A native file format of Windows for storing images called “bitmaps.”

Boolean Logic
The use of the terms “AND,” “OR” and “NOT” in conducting searches. Used to widen or narrow the scope of a search.
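
As a rough illustration of how the three operators widen or narrow a search, the sketch below treats each search term as a set of matching document IDs. The index contents and document IDs are made up for the example and do not come from any particular imaging product.

```python
# Hypothetical mini-index: each word maps to the set of document IDs containing it.
index = {
    "contract": {1, 2, 3},
    "lease": {2, 3, 4},
    "draft": {3, 5},
}

# AND narrows the search: documents containing both words.
both = index["contract"] & index["lease"]

# OR widens the search: documents containing either word.
either = index["contract"] | index["lease"]

# NOT excludes: documents containing "contract" but not "draft".
not_draft = index["contract"] - index["draft"]

print(both, either, not_draft)
```

Set intersection, union, and difference map directly onto AND, OR, and NOT, which is why the example uses sets rather than lists.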

Briefcase
A method to simplify the transport of a group of documents from one computer to another.

Burn (CDs or DVDs)
To record or write data on a CD or DVD.

Caching (of Images)
The temporary storage of image files on a hard disk for later migration to permanent storage, like an optical or CD jukebox.

CD Publishing
An alternative to photocopying large volumes of paper documents. This method involves coupling image and text documents with viewer software on CDs. Sometimes search software is included on the CDs to enhance search capabilities.

CD-R
Short for CD-Recordable. This is a CD which can be written (or recorded) only once. It can be copied to distribute a large amount of data. CD-Rs can be read on any CD-ROM drive whether on a standalone computer or network system. This makes interchange between systems easier.

CD-ROM
Compact Disc Read Only Memory. CD-ROMs are mass-produced rather than written on a standard computer CD burner (CD writer). They are an optical storage medium popular for storing computer files as well as digitally recorded music.

CD-ROM Drive
A computer drive that reads compact discs.

Client-Server Architecture vs. File-Sharing
Two common application software architectures found on computer networks. With file-sharing applications, all searches occur on the workstation, while the document database resides on the server. With client-server architecture, CPU intensive processes (such as searching and indexing) are completed on the server, while image viewing and OCR occur on the client. File-sharing applications are easier to develop, but they tend to generate tremendous network data traffic in document imaging applications. They also expose the database to corruption through workstation interruptions. Client-server applications are harder to develop, but dramatically reduce network data traffic and insulate the database from workstation interruptions.

COLD
Computer Output to Laser Disk. A computer programming process that outputs electronic records and printed reports to laser disk instead of a printer. Can be used to replace COM (Computer Output to Microfilm) or printed reports such as green-bar.

COM
Computer Output to Microfilm. A process that outputs electronic records and computer generated reports to microfilm.

COM Object
Component Object Model. COM refers to both a specification and implementation developed by Microsoft Corporation, which provides a framework for integrating components of a software application. COM allows developers to build software by assembling reusable components from different vendors.

Compression Ratio
The ratio of the size of an uncompressed file to the size of the same file after compression, e.g., with a 20:1 compression ratio, an uncompressed file of 1 MB is compressed to 50 KB.
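
The arithmetic in this entry can be checked with a short sketch. The function names are illustrative only, and 1 MB is taken as 1,000 KB to match the figures in the definition.

```python
def compression_ratio(uncompressed_kb: float, compressed_kb: float) -> float:
    """Return N for an N:1 ratio (uncompressed size divided by compressed size)."""
    return uncompressed_kb / compressed_kb


def compressed_size_kb(uncompressed_kb: float, ratio: float) -> float:
    """Return the expected compressed size for a given N:1 ratio."""
    return uncompressed_kb / ratio


# A 1 MB (1,000 KB) file at 20:1 comes out at 50 KB, as in the definition.
size_kb = compressed_size_kb(1000, 20)
ratio = compression_ratio(1000, 50)
print(size_kb, ratio)
```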

CPU
Central Processing Unit. The “brain” of the computer.

De-shading
Removing shaded areas to render images more easily recognizable by OCR. De-shading software typically searches for areas with a regular pattern of tiny dots.

De-skewing
The process of straightening skewed (off-center) images. De-skewing is one of the image enhancements that can improve OCR accuracy. Documents often become skewed when they are scanned or faxed.

De-speckling
Removing isolated speckles from an image file. Speckles often develop when a document is scanned or faxed.

Dithering
The process of converting grays to different densities of black dots, usually for the purposes of printing or storing color or grayscale images as black and white images.
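
There are several dithering methods, and this glossary does not say which one a given product uses. The sketch below shows one simple approach, ordered dithering with a 2x2 threshold matrix, on a tiny grayscale image held as a list of rows. The matrix, the function name, and the 0 = black dot / 1 = white convention are assumptions made for the example.

```python
# 2x2 ordered-dither (Bayer) matrix; each cell picks a different threshold.
BAYER_2X2 = [[0, 2], [3, 1]]


def ordered_dither(gray):
    """Convert rows of 0-255 gray values to rows of 0 (black dot) or 1 (white)."""
    out = []
    for y, row in enumerate(gray):
        out_row = []
        for x, value in enumerate(row):
            # Spread the four thresholds evenly across the 0-255 range.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) * 255 / 4
            out_row.append(1 if value > threshold else 0)
        out.append(out_row)
    return out


# A flat mid-gray patch becomes a checkerboard: half the dots are black.
mid_gray = ordered_dither([[128, 128], [128, 128]])
print(mid_gray)
```

Darker grays fall below more of the thresholds and so produce a denser pattern of black dots, which is the effect the definition describes.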

Document Imaging
Software used to store, manage, retrieve and distribute documents quickly and easily on the computer.

Drag-and-Drop
The movement of on-screen objects by dragging them across the screen with the mouse.

Duplex Scanners vs. Double-Sided Scanning
Duplex scanners automatically scan both sides of a double-sided page, producing two images at once. Double-sided scanning uses a single-sided scanner to scan double-sided pages, scanning one collated stack of paper, then flipping it over and scanning the other side.

DVD
Digital Video Disc or Digital Versatile Disc. A plastic disc, like a CD, on which data can be written and read. DVDs are faster, can hold more information, and can support more data formats than CDs.

Electronic Document Management
Imaging software that helps manage electronic documents.

Erasable Optical Drive
A type of optical drive that uses erasable optical discs.

Flatbed Scanner
A flat-surface scanner that allows users to input books and other documents.

Folder Browser
A system of on-screen folders (usually hierarchical or “stacked”) used to organize documents. For example, the File Manager program in Microsoft Windows is a type of folder browser that displays the directories on your disk.

Forms Processing
A specialized imaging application designed for handling pre-printed forms. Forms processing systems often use high-end (or multiple) OCR engines and elaborate data validation routines to extract hand-written or poor quality print from forms that go into a database. This type of imaging application faces major challenges, since many of the documents scanned were never designed for imaging or OCR.

Full-text Indexing and Search
Enables the retrieval of documents by either their word or phrase content. Every word in the document is indexed into a master word list with pointers to the documents and pages where each occurrence of the word appears.
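
A master word list of this kind is often called an inverted index. The following is a minimal sketch, assuming the text of each page is already available (for example from OCR). The document names and the `build_index` helper are hypothetical, and the sketch records the pages on which a word appears rather than every individual occurrence.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each word to a sorted list of (document name, page number) pointers.

    documents: {document name: [text of page 1, text of page 2, ...]}
    """
    index = defaultdict(set)
    for name, pages in documents.items():
        for page_number, text in enumerate(pages, start=1):
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].add((name, page_number))
    return {word: sorted(hits) for word, hits in index.items()}


# Hypothetical two-document collection.
docs = {
    "lease.tif": ["Lease agreement", "Signed by tenant"],
    "memo.tif": ["Tenant memo"],
}
word_index = build_index(docs)
print(word_index["tenant"])
```

A search then becomes a dictionary lookup instead of a scan through every document, which is what makes full-text retrieval fast on large collections.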

Fuzzy Logic
A full-text search procedure that looks for exact matches as well as similarities to the search criteria, in order to compensate for spelling or OCR errors.
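
One way to approximate this behavior is to compare the search term against the indexed words by string similarity. The sketch below uses `difflib.get_close_matches` from the Python standard library. The vocabulary, the `fuzzy_lookup` name, and the 0.75 cutoff are assumptions chosen for the example, not the method of any specific product.

```python
import difflib


def fuzzy_lookup(term, vocabulary, cutoff=0.75):
    """Return indexed words similar to the term, best match first."""
    return difflib.get_close_matches(term.lower(), vocabulary, n=3, cutoff=cutoff)


# Hypothetical list of indexed words.
vocabulary = ["agreement", "tenant", "lease"]

# "agreernent" mimics a common OCR error, where "m" is read as "rn".
ocr_hits = fuzzy_lookup("agreernent", vocabulary)
exact_hits = fuzzy_lookup("lease", vocabulary)
print(ocr_hits, exact_hits)
```

Lowering the cutoff finds more near-misses at the cost of more false matches.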

GIF
Graphics Interchange Format. CompuServe’s native file format for storing images.

Gigabyte
One billion bytes. Also expressed as one thousand megabytes. In terms of image storage capacity, one gigabyte equals approximately 17,000 8 1/2" x 11" pages scanned at 300 dpi, stored as TIFF Group IV images.
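
The page estimate can be sanity-checked with a rough calculation. The sketch assumes a black and white scan at 1 bit per pixel and the typical 20-to-1 ratio quoted for TIFF Group IV elsewhere in this glossary. Real pages compress better or worse depending on content, so the result is only a ballpark figure.

```python
# Page dimensions in pixels at 300 dpi.
width_px = int(8.5 * 300)   # 2550
height_px = 11 * 300        # 3300

# Black and white scan: 1 bit per pixel, 8 pixels per byte.
uncompressed_bytes = width_px * height_px // 8

# Assumed typical TIFF Group IV ratio of 20:1.
compressed_bytes = uncompressed_bytes / 20

pages_per_gb = 1_000_000_000 / compressed_bytes
print(uncompressed_bytes, round(compressed_bytes), round(pages_per_gb))
```

Under these assumptions a gigabyte holds about 19,000 pages, the same order of magnitude as the 17,000 quoted above, which corresponds to roughly 59 KB per page.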

Grayscale
See “Scale-to-Gray.”

Hierarchical Storage Management (HSM)
Software that automatically migrates files from on-line to near-line storage media, usually on the basis of the age or frequency of use of the files.
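
The age-based rule can be sketched as a simple filter over last-access dates. The file names, the 90-day threshold, and the `files_to_migrate` helper are hypothetical. A real HSM product would also move the files and leave a pointer behind, which this sketch does not attempt.

```python
from datetime import datetime, timedelta


def files_to_migrate(last_access, now, max_age_days=90):
    """Return names of files not accessed within max_age_days, sorted by name."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(name for name, accessed in last_access.items() if accessed < cutoff)


# Hypothetical last-access dates for three image files.
last_access = {
    "a.tif": datetime(2001, 12, 20),
    "b.tif": datetime(2001, 6, 1),
    "c.tif": datetime(2000, 1, 1),
}
stale = files_to_migrate(last_access, now=datetime(2002, 1, 1))
print(stale)
```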

ICR
Intelligent Character Recognition. A software process that recognizes handwritten and printed text as alphanumeric characters.

Image Enabling
Allows for fast, straightforward manipulation of a client through third-party applications. In examples with Laserfiche, image enabling allows for launching the Laserfiche client, displaying search results in the client, and bringing up the scan dialogue box, all from within a third party application.

Image Processing Card (IPC)
A board mounted in either the computer, scanner or printer that facilitates the acquisition and display of images. The primary function of most IPCs is the rapid compression and decompression of image files.

Index Fields
Database fields used to categorize and organize documents. Often user-defined, these fields can be used for searches.

Internet Publishing
Specialized imaging software that allows large volumes of paper documents to be published on the Internet or intranet. These files can be made available to other departments, offsite colleagues or the public for searching, viewing and printing.

IPX/SPX
Communications protocol used by Novell networks.

ISIS and TWAIN Scanner Drivers
Specialized applications used for communication between scanners and computers.

ISO 9660 CD Format
The International Organization for Standardization (ISO) format for creating CD-ROMs that can be read worldwide.

JPEG
Joint Photographic Experts Group (JPEG or JPG). An image compression format used for storing color photographs and images.

Jukebox
A mass storage device that holds optical disks and loads them into a drive.

Key Field
Database fields used for document searches and retrieval. Synonymous with “index field.”

Magneto-Optical Drive
A drive that combines laser and magnetic technology to create high-capacity erasable storage.

MAPI
Messaging Application Programming Interface. This Windows software standard has become a popular e-mail interface and is used by MS Exchange, GroupWise, and other e-mail packages.

MFP
Multifunction Printer or Multifunctional Peripheral. A device that performs any combination of scanning, printing, faxing, or copying.

Near-Line
Documents stored on optical disks or compact disks that are housed in the jukebox or CD changer and can be retrieved without human intervention.

NetWare Loadable Module (NLM)
An application that runs as part of the network operating system (NOS) of a Novell NetWare server.

NT
New Technology. Refers to Microsoft Windows NT server and workstation software.

n-tier architecture
The term can apply to the physical or logical architecture of computing. The term refers to a method of distributed computing in which the processing of a specific application occurs over “n” number of machines across a network. Typical tiers include a data tier, business logic tier, and a presentation tier, wherein a given machine will perform the individualized tasks of a tier. Scalability is a primary advantage of n-tier architecture.

OCR
Optical Character Recognition. A software process that recognizes printed text as alphanumeric characters.

Off-Line
Archival documents stored on optical disks or compact disks that are not connected or installed in the computer, but instead require human intervention to be accessed.

On-Line
Documents stored on the hard drive or magnetic disk of a computer that are available immediately.

Optical Disks
Computer media similar to a compact disc that cannot be rewritten. An optical drive uses a laser to read the stored data.

Optical Jukebox
See “Jukebox.”

Phase Change
A method of storing information on rewritable optical disks.

Pixel
Picture Element. A single dot in an image. It can be black and white, grayscale or color.

Portable Volumes
A feature that facilitates the moving of large volumes of documents without requiring copying multiple files. Portable volumes enable individual CDs to be easily regrouped, detached and reattached to different databases for a broader information exchange.

RAID
Redundant Array of Independent Disks. A collection of hard disks that act as a single unit. Files on RAID drives can be duplicated (“mirrored”) to preserve data. RAID systems vary in their level of redundancy: level 0 spreads (“stripes”) data across disks with no redundancy, level 1 uses two disks that mirror each other, and so on up to level 5, the most common.

Raster/Rasterized (Raster or Bitmap Drawing)
A method of representing an image with a grid (or “map”) of dots or pixels. Typical raster file formats are GIF, JPEG, TIFF, PCX, BMP, etc.

Redaction
A type of document annotation that provides word-level security by concealing from view specific portions of sensitive documents. Like all annotations in a document imaging system, redactions should be image overlays that protect information but do not alter original document images.

Region (of an image)
An area of an image file that is selected for specialized processing. Also called a “zone.”

Scale-to-Gray
An option to display a black and white image file in an enhanced mode, making it easier to view. A scale-to-gray display uses gray shading to fill in gaps or jumps (known as aliasing) that occur when displaying an image file on a computer screen. Also known as grayscale.
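
One common way to produce such a display is to shrink the black and white image and let each screen pixel take the average of the block of dots it covers. The sketch below assumes this block-averaging approach. The function name and the 0 = black / 1 = white convention are choices made for the example, not a description of any particular viewer.

```python
def scale_to_gray(bitmap, factor=2):
    """Shrink a 0/1 bitmap by `factor`, returning 0-255 gray values.

    bitmap: rows of 0 (black) or 1 (white); width and height are assumed
    to be exact multiples of factor.
    """
    out = []
    for y in range(0, len(bitmap), factor):
        row = []
        for x in range(0, len(bitmap[0]), factor):
            block = [
                bitmap[y + dy][x + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            # The share of white dots in the block sets the gray level.
            row.append(round(255 * sum(block) / len(block)))
        out.append(row)
    return out


# One black dot among three white ones becomes a light gray pixel.
gray = scale_to_gray([[0, 1], [1, 1]])
print(gray)
```

Blocks that straddle the edge of a character come out as intermediate grays, which is what softens the jagged edges on screen.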

Scalability
The capacity of a system to expand without requiring major reconfiguration or re-entry of data. Multiple servers or additional storage can be easily added.

Scanner
An input device commonly used to convert paper documents into computer images. Scanner devices are also available to scan microfilm and microfiche.

SCSI
Small Computer System Interface. Pronounced “skuzzy.” A standard for attaching peripherals (notably mass storage devices and scanners) to computers. SCSI allows for up to 7 devices to be attached in a chain via cables. The current SCSI standard is “SCSI II,” also known as “Fast SCSI.”

SCSI Scanner Interface
The device used to connect a scanner with a computer.

SQL
Structured Query Language. The popular standard for running database searches (queries) and reports.
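
As a small illustration, the sketch below runs a SQL query against an in-memory SQLite database using Python's built-in `sqlite3` module. The `documents` table and its columns (`client`, `doc_type`, `filed`) are invented for the example and stand in for the kind of index fields an imaging system might keep.

```python
import sqlite3

# In-memory database with a hypothetical table of document index fields.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, client TEXT, doc_type TEXT, filed TEXT)"
)
conn.executemany(
    "INSERT INTO documents (client, doc_type, filed) VALUES (?, ?, ?)",
    [
        ("Acme", "lease", "2001-03-01"),
        ("Birch", "contract", "2001-04-15"),
        ("Acme", "memo", "2001-05-20"),
    ],
)

# Query: all documents for one client, oldest first.
rows = conn.execute(
    "SELECT id, doc_type FROM documents WHERE client = ? ORDER BY filed",
    ("Acme",),
).fetchall()
print(rows)
conn.close()
```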

TCP/IP
Network communications protocol. This is the protocol used by the Internet.

Templates, Document
Sets of index fields for documents.

Thumbnails
Small versions of an image used for quick overviews or to get a general idea of what an image looks like.

TIFF
Tagged Image File Format. A non-proprietary raster image format, in wide use since 1986, which allows for several different types of compression. TIFFs may be either single or multi-page files. A single-page TIFF is a single image of one page of a document. A multi-page TIFF is a large single file consisting of multiple document pages. Document imaging systems that store documents as single-page TIFFs offer significant network performance benefits over multi-page TIFF systems.

TIFF Group III (compression)
A one-dimensional compression format for storing black and white images that is utilized by most fax machines.
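
“One-dimensional” means each scan line is compressed on its own, as a series of alternating white and black runs. The sketch below shows only that run-length step. The actual fax format goes on to replace each run length with a short code word, which is not reproduced here. The function name and the 0 = white / 1 = black convention are assumptions for the example.

```python
def run_lengths(row):
    """Return alternating run lengths for one scan line, starting with white.

    row: sequence of 0 (white) or 1 (black) pixels. A line that begins with
    black pixels gets a leading white run of length 0.
    """
    runs = []
    current, count = 0, 0  # each line is treated as starting with a white run
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            runs.append(count)
            current, count = pixel, 1
    runs.append(count)
    return runs


# 3 white, 2 black, 5 white: ten pixels stored as three numbers.
line = run_lengths([0, 0, 0, 1, 1, 0, 0, 0, 0, 0])
print(line)
```

Business documents are mostly long stretches of white, so storing run lengths instead of individual pixels saves a great deal of space.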

TIFF Group IV (compression)
A two-dimensional compression format for storing black and white images. Typically compresses at a 20-to-1 ratio for standard business documents.

Video Scanner Interface
A type of device used to connect scanners with computers. Scanners with this interface require a scanner control board designed by Kofax, Xionics or Dunord.

Workflow, Ad Hoc
A simple manual process by which documents can be moved around a multi-user imaging system on an “as-needed” basis.

Workflow, Rule-Based
A programmed series of automated steps that route documents to various users on a multi-user imaging system.

WORM Disks
Write Once Read Many Disks. A popular archival storage medium during the 1980s. Acknowledged as the first optical disks, they are primarily used to store archives of data that cannot be altered. WORM disks are created by standalone PCs and cannot be used on the network, unlike CD-Rs.

ZIP
A common file compression format that allows quick and easy storage for transport.

Zone OCR
An add-on feature of the imaging software that populates document templates by reading certain regions or zones of a document, and then placing the text into a document index field.



