Document Type

Dissertation

Date of Award

Spring 5-31-2001

Degree Name

Doctor of Philosophy in Computing Sciences - (Ph.D.)

Department

Computer and Information Science

First Advisor

Gary L. Thomas

Second Advisor

Peter A. Ng

Third Advisor

Daochuan Hung

Fourth Advisor

Ajaz A. Rana

Fifth Advisor

Ronald S. Curtis

Abstract

Document retrieval in an information system is most often accomplished through keyword search. The common technique behind keyword search is indexing. The major drawback of such a search technique is its lack of effectiveness and accuracy. It is very common in a typical keyword search over the Internet to identify hundreds or even thousands of records as the potentially desired records. However, often few of them are relevant to users' interests.

This dissertation presents knowledge-based document retrieval architecture with application to TEXPROS. The architecture is based on a dual document model that consists of a document type hierarchy and, a folder organization. Using the knowledge collected during document filing, the search space can be narrowed down significantly. Combining the classical text-based retrieval methods with the knowledge-based retrieval can improve tremendously both search efficiency and effectiveness.

With the proposed predicate-based query language, users can more precisely and accurately specify the search criteria and their knowledge about the documents to be retrieved. To assist users formulate a query, a guided search is presented as part of an intelligent user interface. Supported by an intelligent question generator, an inference engine, a question base, and a predicate-based query composer, the guided search collects the most important information known to the user to retrieve the documents that satisfy users' particular interests.

A knowledge-based query processing and search engine is presented as the core component in this architecture. Algorithms are developed for the search engine to effectively and efficiently retrieve the documents that match the query. Cache is introduced to speed up the process of query refinement. Theoretical proof and performance analysis are performed to prove the efficiency and effectiveness of this knowledge-based document retrieval approach.

Recommended Citation

Sheng, Fang, "Knowledge-based document retrieval with application to TEXPROS" (2001). Dissertations. 482.
https://digitalcommons.njit.edu/dissertations/482

Download

Included in

Databases and Information Systems Commons, Management Information Systems Commons

COinS

Dissertations

Knowledge-based document retrieval with application to TEXPROS

Document Type

Date of Award

Degree Name

Department

First Advisor

Second Advisor

Third Advisor

Fourth Advisor

Fifth Advisor

Abstract

Recommended Citation

Included in

Search

Browse

Author Corner

Links

Dissertations

Knowledge-based document retrieval with application to TEXPROS

Author

Document Type

Date of Award

Degree Name

Department

First Advisor

Second Advisor

Third Advisor

Fourth Advisor

Fifth Advisor

Abstract

Recommended Citation

Included in

Share

Search

Browse

Author Corner

Links