Document Type
Dissertation
Date of Award
Spring 5-31-2001
Degree Name
Doctor of Philosophy in Computing Sciences - (Ph.D.)
Department
Computer and Information Science
First Advisor
Gary L. Thomas
Second Advisor
Peter A. Ng
Third Advisor
Daochuan Hung
Fourth Advisor
Ajaz A. Rana
Fifth Advisor
Ronald S. Curtis
Abstract
Document retrieval in an information system is most often accomplished through keyword search, which typically relies on indexing. The major drawback of this technique is its lack of effectiveness and accuracy. A typical keyword search over the Internet commonly identifies hundreds or even thousands of records as potentially relevant, yet often only a few of them match the user's interests.
This dissertation presents a knowledge-based document retrieval architecture with application to TEXPROS. The architecture is based on a dual document model that consists of a document type hierarchy and a folder organization. Using the knowledge collected during document filing, the search space can be narrowed down significantly. Combining classical text-based retrieval methods with knowledge-based retrieval can substantially improve both search efficiency and effectiveness.
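The idea of narrowing the search space before matching text can be illustrated with a minimal Python sketch. It is not the TEXPROS implementation: the `Document` class, the `TYPE_PARENT` table, the folder paths, and the `retrieve` function are all hypothetical names chosen for illustration, and the text match is a plain substring test standing in for any classical text-based method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: int
    doc_type: str  # node in a hypothetical document type hierarchy
    folder: str    # path in a hypothetical folder organization
    text: str


# Hypothetical type hierarchy, stored as child type -> parent type.
TYPE_PARENT = {
    "memo": "correspondence",
    "letter": "correspondence",
    "correspondence": "document",
}


def is_subtype(doc_type, ancestor):
    """Return True if doc_type equals ancestor or descends from it."""
    current = doc_type
    while current is not None:
        if current == ancestor:
            return True
        current = TYPE_PARENT.get(current)
    return False


def in_folder(path, folder):
    """Return True if path is the folder itself or lies beneath it."""
    return path == folder or path.startswith(folder + "/")


def retrieve(docs, doc_type, folder, keyword):
    # Step 1: use filing knowledge (type and folder) to shrink the candidate set.
    candidates = [
        d for d in docs
        if is_subtype(d.doc_type, doc_type) and in_folder(d.folder, folder)
    ]
    # Step 2: run the text match only on the remaining candidates.
    return [d.doc_id for d in candidates if keyword.lower() in d.text.lower()]


# Illustrative data only.
docs = [
    Document(1, "memo", "faculty/meetings", "Budget review meeting on Friday"),
    Document(2, "letter", "faculty/meetings", "Invitation to the seminar"),
    Document(3, "memo", "students/admissions", "Budget request for admissions"),
    Document(4, "report", "faculty/meetings", "Annual budget report"),
]
matches = retrieve(docs, "correspondence", "faculty", "budget")
```

In this sketch three of the four sample documents contain the keyword, but only one of them survives the type and folder constraints, so the text match runs on two candidates rather than four.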
With the proposed predicate-based query language, users can specify more precisely both the search criteria and their knowledge about the documents to be retrieved. To help users formulate a query, a guided search is presented as part of an intelligent user interface. Supported by an intelligent question generator, an inference engine, a question base, and a predicate-based query composer, the guided search collects the most important information known to the user in order to retrieve the documents that satisfy that user's particular interests.
A knowledge-based query processing and search engine is presented as the core component of this architecture. Algorithms are developed for the search engine to retrieve the documents that match the query effectively and efficiently. A cache is introduced to speed up the process of query refinement. Theoretical proofs and a performance analysis are presented to demonstrate the efficiency and effectiveness of this knowledge-based document retrieval approach.
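One way a cache could speed up query refinement is sketched below in Python. This is an assumption about the general technique, not the dissertation's algorithm: the `RefinementCache` class, its `search` method, the representation of predicates as `(field, value)` equality pairs, and the `scanned` counter are all hypothetical. The sketch reuses the cached result of an earlier, less restrictive query whenever a new query adds predicates to it.

```python
class RefinementCache:
    """Caches result sets so a refined query only rechecks earlier matches."""

    def __init__(self, corpus):
        self.corpus = corpus  # doc_id -> dict of attribute values
        self.cache = {}       # frozenset of predicates -> set of doc_ids
        self.scanned = 0      # number of documents examined so far

    def search(self, predicates):
        key = frozenset(predicates)
        if key in self.cache:
            return self.cache[key]

        # Pick the largest cached query whose predicates are a subset of this one.
        base_key = max(
            (k for k in self.cache if k <= key), key=len, default=frozenset()
        )
        # Fall back to the whole corpus when no cached query applies.
        candidates = self.cache.get(base_key, set(self.corpus))
        remaining = key - base_key

        result = set()
        for doc_id in candidates:
            self.scanned += 1
            attrs = self.corpus[doc_id]
            # Only the predicates not covered by the cached query are rechecked.
            if all(attrs.get(field) == value for field, value in remaining):
                result.add(doc_id)

        self.cache[key] = result
        return result


# Illustrative data only.
corpus = {
    1: {"type": "memo", "sender": "alice"},
    2: {"type": "memo", "sender": "bob"},
    3: {"type": "letter", "sender": "alice"},
}
engine = RefinementCache(corpus)
first = engine.search([("type", "memo")])
scanned_after_first = engine.scanned
refined = engine.search([("type", "memo"), ("sender", "alice")])
scanned_after_refined = engine.scanned
```

Here the first query examines all three documents, while the refined query examines only the two documents returned by the first one.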
Recommended Citation
Sheng, Fang, "Knowledge-based document retrieval with application to TEXPROS" (2001). Dissertations. 482.
https://digitalcommons.njit.edu/dissertations/482