Date of Award

Spring 1998

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computing Sciences - (Ph.D.)

Department

Computer and Information Science

First Advisor

Peter A. Ng

Second Advisor

Murat Tanik

Third Advisor

Daochuan Hung

Fourth Advisor

Ronald S. Curtis

Fifth Advisor

Taiming Chu

Abstract

This dissertation presents a knowledge-based document filing system for TEXPROS. The requirements of a personal document processing system are investigated. In order for the system to be used in various application domains, a flexible, dynamic modeling approach is employed by getting the user involved in document modeling. The office documents are described using a dual model, which consists of a document type hierarchy and a folder organization. The document type hierarchy is used to capture the layout, logical and conceptual structures of documents. The folder organization, which is defined by the user, emulates the real-world structure for organizing and storing documents in an office environment.

Document filing and retrieval are predicate-driven. The user can specify filing criteria and queries in terms of predicates. The predicate specification and folder organization specification are described. It is shown that the new specifications can prevent the false drops that occur in the previous approach.
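As a rough illustration of predicate-driven filing, the following minimal sketch assumes that a frame instance can be treated as a flat attribute-to-value mapping and that each folder carries a single filing criterion. The names `Folder`, `attr_equals`, `both` and `file_document`, along with the sample attributes, are hypothetical and are not taken from TEXPROS.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: a frame instance is a flat attribute -> value mapping.
FrameInstance = Dict[str, str]
Predicate = Callable[[FrameInstance], bool]


@dataclass
class Folder:
    """A user-defined folder with a filing criterion (hypothetical shape)."""
    name: str
    criterion: Predicate
    documents: List[FrameInstance] = field(default_factory=list)


def attr_equals(attribute: str, value: str) -> Predicate:
    """Build a predicate that holds when the attribute has the given value."""
    return lambda frame: frame.get(attribute) == value


def both(p: Predicate, q: Predicate) -> Predicate:
    """Conjunction of two predicates."""
    return lambda frame: p(frame) and q(frame)


def file_document(frame: FrameInstance, folders: List[Folder]) -> List[str]:
    """File the frame into every folder whose criterion it satisfies.

    Returns the names of the folders that accepted the document.
    """
    filed_in = []
    for folder in folders:
        if folder.criterion(frame):
            folder.documents.append(frame)
            filed_in.append(folder.name)
    return filed_in


folders = [
    Folder("Memos", attr_equals("type", "memo")),
    Folder("Memos from Smith",
           both(attr_equals("type", "memo"), attr_equals("sender", "Smith"))),
]
accepted = file_document({"type": "memo", "sender": "Smith"}, folders)
```

In this sketch a document can land in more than one folder, because every criterion is evaluated independently. Whether the dissertation's folder organization allows that is not stated in the abstract.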

The dual models are incorporated by a three-level storage architecture. This storage architecture supports efficient document and information retrieval by limiting the searches to those frame instances of a document type within those folders which appear to be the most similar to the corresponding queries. Specifically, a three-level retrieval strategy is used in document and information retrieval. Firstly, a knowledge-based query preprocess is applied for efficiently reducing the search space to a small set of frame instances, using the information in the query formula. Secondly, the knowledge and content-based retrieval on the small set of frame instances is applied. Finally, the third level storage provides a platform for adopting potential content-based multimedia document retrieval techniques.
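The first two retrieval levels can be pictured as a narrowing step followed by a matching step. The sketch below is only an illustration under stated assumptions: frame instances are indexed by a (folder, document type) pair, and the query supplies the folders, the document type and a content predicate. The names `Index`, `preprocess` and `retrieve` and the sample data are hypothetical, and the third (multimedia) level is not modeled.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: a frame instance is a flat attribute -> value mapping,
# and the store is indexed by (folder name, document type).
Frame = Dict[str, str]
Index = Dict[Tuple[str, str], List[Frame]]


def preprocess(index: Index, folders: List[str], doc_type: str) -> List[Frame]:
    """Level 1: shrink the search space using only the folders and
    document type named in the query, without inspecting frame content."""
    candidates: List[Frame] = []
    for folder in folders:
        candidates.extend(index.get((folder, doc_type), []))
    return candidates


def retrieve(index: Index, folders: List[str], doc_type: str,
             matches: Callable[[Frame], bool]) -> List[Frame]:
    """Level 2: evaluate the content predicate on the reduced set only."""
    return [f for f in preprocess(index, folders, doc_type) if matches(f)]


index: Index = {
    ("Admissions", "memo"): [{"subject": "deadline"}, {"subject": "budget"}],
    ("Admissions", "letter"): [{"subject": "deadline"}],
    ("Payroll", "memo"): [{"subject": "deadline"}],
}
hits = retrieve(index, ["Admissions"], "memo",
                lambda f: f["subject"] == "deadline")
```

Here the content predicate is evaluated on two frame instances rather than all four, which is the effect the query preprocess is meant to achieve.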

A knowledge-based predicate evaluation engine is described for automating document filing. The dissertation presents a knowledge representation model. The knowledge base is dynamically created by a learning agent, which demonstrates that the notion of flexible and dynamic modeling is applicable.

The folder organization is implemented using an agent-based architecture. Each folder is monitored by a filing agent. The basic operations for constructing and reorganizing a folder organization are defined. The dissertation also discusses the cooperation among the filing agents, which is needed for implementing the folder organization.
