Published August 1992 | Published
Journal Article Open

An information theoretic approach to rule induction from databases

Abstract

The knowledge acquisition bottleneck in obtaining rules directly from an expert is well known. Hence, the problem of automated rule acquisition from data is a well-motivated one, particularly for domains where a database of sample data exists. In this paper we introduce a novel algorithm for the induction of rules from examples. The algorithm is novel in the sense that it not only learns rules for a given concept (classification), but also simultaneously learns rules relating multiple concepts. This type of learning, known as generalized rule induction, is considerably more general than existing algorithms, which tend to be classification oriented. Initially we focus on the problem of determining a quantitative, well-defined rule preference measure. In particular, we propose a quantity called the J-measure as an information theoretic alternative to existing approaches. The J-measure quantifies the information content of a rule or a hypothesis. We will outline the information theoretic origins of this measure and examine its plausibility as a hypothesis preference measure. We then define the ITRULE algorithm, which uses the newly proposed measure to learn a set of optimal rules from a set of data samples, and we conclude the paper with an analysis of experimental results on real-world data.
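
The abstract does not state the J-measure's formula. As a rough illustration only, the sketch below uses the form in which the measure is commonly cited for a rule "if Y = y then X = x": the probability of the rule's condition, p(y), multiplied by the relative entropy (in bits) between the posterior and prior distributions of the binary event X = x. The function name `j_measure` and its parameter names are assumptions made for this example and are not taken from the paper.

```python
import math


def j_measure(p_y: float, p_x: float, p_x_given_y: float) -> float:
    """Sketch of the J-measure for a rule "if Y = y then X = x".

    Assumed form (not given in the abstract):
        J = p(y) * [ p(x|y) * log2(p(x|y) / p(x))
                     + (1 - p(x|y)) * log2((1 - p(x|y)) / (1 - p(x))) ]
    """
    if not 0.0 < p_x < 1.0:
        raise ValueError("p_x must lie strictly between 0 and 1")
    if not (0.0 <= p_y <= 1.0 and 0.0 <= p_x_given_y <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")

    def term(posterior: float, prior: float) -> float:
        # By convention, 0 * log(0 / q) is taken to be 0.
        if posterior == 0.0:
            return 0.0
        return posterior * math.log2(posterior / prior)

    # Relative entropy between posterior and prior for the binary event X = x.
    divergence = term(p_x_given_y, p_x) + term(1.0 - p_x_given_y, 1.0 - p_x)
    # Weight by how often the rule's condition Y = y actually applies.
    return p_y * divergence
```

Under this form, a rule whose condition leaves the probability of its conclusion unchanged (p(x|y) = p(x)) scores zero, while a rule that fires often and shifts that probability sharply scores higher.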

Additional Information

© 1992 IEEE. Manuscript received October 25, 1989; revised April 16, 1990. This work was supported in part by Pacific Bell, in part by the U.S. Army Research Office under Contract DAAL03-89-K-0126, and by the California Institute of Technology's program in Advanced Technologies sponsored by Aerojet General, General Motors, and TRW. Part of this work was carried out by the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. The authors gratefully acknowledge the assistance of David Aha of the University of California-Irvine in providing the voting data set, and also Brian Gaines of the University of Calgary and Ross Quinlan of the University of Sydney for providing the chess data set.

Attached Files

Published - 00149926.pdf

Files

00149926.pdf (1.6 MB)
md5:2c4270142c5e679347cb63f87f6a467e

Additional details

Created: August 20, 2023
Modified: October 20, 2023