Document Type
Thesis
Date of Award
8-31-1988
Degree Name
Master of Science in Electrical Engineering - (M.S.)
Department
Electrical Engineering
First Advisor
Yeheskel Bar-Ness
Second Advisor
Fidel Morales-Moreno
Third Advisor
Ali N. Akansu
Abstract
Compressing data from a source with unknown statistics while still achieving a good compression ratio is usually difficult. Because words are the fundamental units of written language, a compression scheme based on words should yield high compression ratios. This thesis discusses a compression scheme that gathers words from the source stream and places them in a dictionary, or word list, for later use in coding; the scheme thereby adapts to the words in a given file without prior knowledge of the source. When a word is in the dictionary, it is encoded by its location in the dictionary. When a word is not in the dictionary, its raw characters are encoded by arithmetic coding, and the word is then added to the dictionary. Initializing the dictionary at the start of compression with several common words found in the data stream yields an even greater compression ratio. The schemes presented achieve compression comparable to that of LZW, and at times much greater.
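The dictionary step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The abstract does not say how word boundaries are found, so the tokenization rule here (runs of word characters and runs of separators are both treated as "words") is an assumption, as are the function names and the `("index", n)` / `("literal", word)` symbol tags. The arithmetic coding of literal characters is omitted: literals are passed through unchanged as a placeholder for that stage.

```python
import re


def tokenize(text):
    # Assumption: a "word" is a maximal run of word characters, and the
    # separator runs between words are treated as tokens as well, so that
    # joining the tokens reproduces the input exactly.
    return re.findall(r"\w+|\W+", text)


def encode(text, initial_words=()):
    # Optionally seed the dictionary with common words (duplicates dropped).
    dictionary = {w: i for i, w in enumerate(dict.fromkeys(initial_words))}
    symbols = []
    for token in tokenize(text):
        if token in dictionary:
            # Known word: emit its location in the dictionary.
            symbols.append(("index", dictionary[token]))
        else:
            # Unknown word: emit its raw characters (placeholder for the
            # arithmetic-coded literal), then add it to the dictionary.
            symbols.append(("literal", token))
            dictionary[token] = len(dictionary)
    return symbols


def decode(symbols, initial_words=()):
    # The decoder rebuilds the same dictionary in the same order.
    words = list(dict.fromkeys(initial_words))
    parts = []
    for kind, value in symbols:
        if kind == "index":
            parts.append(words[value])
        else:
            parts.append(value)
            words.append(value)
    return "".join(parts)
```

Because the decoder adds each literal to its own word list in the order it arrives, no dictionary needs to be transmitted; only the optional initial word list has to be agreed on by both sides.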
Recommended Citation
Peckham, Christopher Dean, "A word based data compression using arithmetic encoding" (1988). Theses. 3169.
https://digitalcommons.njit.edu/theses/3169
