Date of Award

Summer 1974

Document Type

Dissertation

Degree Name

Doctor of Engineering Science in Chemical Engineering

Department

Chemical Engineering and Chemistry

First Advisor

Howard S. Kimmel

Second Advisor

Peter Anders

Third Advisor

Howard David Perlmutter

Fourth Advisor

Edward Charles Roche, Jr.

Fifth Advisor

William H. Snyder

Abstract

An automatic classification of chemical compounds by computer processing of digitized spectral data is presented. The classification system is based on a branch of artificial intelligence known as supervised learning, and uses binary linear classifiers to identify compounds as alcohols, esters, ethers, ketones or compounds containing double bonds.

Each of the 1117 spectra in volume one of Sadtler's Standard Raman Spectra was coded using a scale from 0 to 9 in the range from 4000 to 200 cm-1. One hundred and twelve readings were taken on each spectrum. These data were then examined using pattern recognition techniques, and several methods of combining infrared and Raman data from the same compound were tested.
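The coding step can be pictured with a minimal sketch. The abstract gives only the scale (0 to 9), the range (4000 to 200 cm-1) and the number of readings (112); the even spacing of the readings, the scaling to the strongest band, and the names `code_spectrum` and `intensity_at` are assumptions made here for illustration, not the procedure used in the dissertation.

```python
def code_spectrum(intensity_at, n_readings=112, high=4000.0, low=200.0, levels=10):
    """Reduce a spectrum to n_readings integer codes in 0..levels-1.

    intensity_at is a hypothetical callable returning the band intensity
    at a given wavenumber (cm-1). Even spacing is an assumption.
    """
    step = (high - low) / (n_readings - 1)
    positions = [high - i * step for i in range(n_readings)]
    raw = [intensity_at(p) for p in positions]
    peak = max(raw)
    if peak <= 0:
        # Featureless spectrum: every reading is coded as 0.
        return [0] * n_readings
    # Assumed scaling: the strongest reading maps to the top code.
    return [max(0, min(levels - 1, int(levels * v / peak))) for v in raw]
```

Each coded spectrum is then a fixed-length list of small integers, which is the form a linear classifier needs.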

The classification techniques were most successful when applied to concatenated infrared and Raman data. By taking the infrared data in the range from 500 to 1900 cm-1 and concatenating them with Raman data from the same range, a vector representing each of the L.00 compounds in the data set was obtained. The parallel polarized Raman spectrum was used in preference to the perpendicularly polarized spectrum when both were available; otherwise the nonpolarized spectrum was used. Using an iterative pattern recognition technique, a vector was then calculated that recognizes compounds as members or non-members of a class based on the sign of the dot product of the calculated vector and the vector representing the compound.
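The decision rule described above can be sketched as follows. The abstract does not name the iterative technique, so this example assumes a standard error-correction (perceptron-style) update. The appended constant component, the +1/-1 label convention, and the names `concatenate`, `train` and `classify` are also illustrative assumptions.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def concatenate(infrared, raman):
    # Join the two coded spectra into one pattern vector.
    # The trailing 1.0 (a bias term) is an assumption, not from the abstract.
    return list(infrared) + list(raman) + [1.0]

def train(patterns, labels, max_passes=1000):
    """Assumed error-correction training; labels are +1 (member) or -1."""
    w = [0.0] * len(patterns[0])
    for _ in range(max_passes):
        errors = 0
        for x, y in zip(patterns, labels):
            if y * dot(w, x) <= 0:
                # Misclassified: move the weight vector toward y * x.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                errors += 1
        if errors == 0:
            break  # every training pattern is on the correct side
    return w

def classify(w, x):
    # Class membership is decided by the sign of the dot product.
    return 1 if dot(w, x) > 0 else -1
```

One weight vector is trained per class, so a compound is tested against each class separately. The loop is only guaranteed to stop early when the training patterns are linearly separable, which is why `max_passes` caps it.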

When vectors were calculated using only half the data set and then tested for their ability to classify the remaining compounds, they classified those compounds correctly more than 90% of the time.
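A half-and-half evaluation of this kind might look like the sketch below. The abstract does not say how the training half was chosen, so the seeded random split is an assumption. `holdout_accuracy` is a hypothetical helper, and `train` and `classify` stand for any training routine and decision rule supplied by the caller.

```python
import random

def holdout_accuracy(patterns, labels, train, classify, seed=0):
    """Train on half of the data and score on the other half.

    train(patterns, labels) returns a weight vector;
    classify(w, x) returns a predicted label. Both are caller-supplied.
    """
    indices = list(range(len(patterns)))
    random.Random(seed).shuffle(indices)  # assumed: random assignment
    half = len(indices) // 2
    train_idx, test_idx = indices[:half], indices[half:]
    w = train([patterns[i] for i in train_idx],
              [labels[i] for i in train_idx])
    correct = sum(1 for i in test_idx
                  if classify(w, patterns[i]) == labels[i])
    # Fraction of held-out compounds classified correctly.
    return correct / len(test_idx)
```

Scoring only on compounds the vector never saw during training is what makes the reported figure a measure of prediction and not of memorization.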

Vectors trained using the entire data set were helpful in determining characteristic group frequencies, and each class treated is discussed.

Several compounds for which observed frequencies have been assigned in the literature were tested to determine if the assignments could be supported by the trained vectors. In nearly every case supporting evidence for the assignment was available from the appropriate trained vector.
