Published September 2019 | Submitted
Journal Article Open

Deriving Compact Laws Based on Algebraic Formulation of a Data Set

Abstract

In various subjects, there exist concise and consistent relationships between input and output parameters. Discovering these relationships, or more precisely compact laws, in a data set is of great interest in many fields, such as physics, chemistry, and finance. In recent years, the field of data discovery has made great progress towards discovering these laws in practice thanks to the success of machine learning. However, machine learning methods relate the input and output data by considering them separately instead of equally. In addition, the analytical approaches to finding the underlying theories governing the data are relatively slow. In this paper, we develop an innovative approach to discovering compact laws. A novel algebraic equation formulation is proposed such that constant determination and candidate equation verification can be explicitly solved in low computational time. This algebraic equation formulation does not distinguish between input and output variables, and converts the problem of deriving meaning from data into solving a linear algebra equation and searching for linear equations that fit the data. We also derive a more efficient search algorithm using finite fields. Rigorous proofs and computational results are presented to validate these methods. The algebraic formulation allows for the search of equation candidates in an explicit mathematical manner. For a certain type of compact theory, our approach assures convergence, with the discovery being computationally efficient and mathematically precise.
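
The idea of treating input and output variables equally and reducing law discovery to a linear algebra problem can be illustrated with a small sketch. This is not the paper's algorithm, which this record does not reproduce; it is a minimal illustration under the assumption that a candidate law is a linear combination of chosen terms that vanishes on every data point, so that the coefficients form a null vector of the matrix of term values. The names `null_space`, `terms`, and the sample data are hypothetical.

```python
from fractions import Fraction


def null_space(rows):
    """Return a basis of the null space of a matrix, using exact
    Gauss-Jordan elimination over the rationals (illustrative helper)."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0])
    pivots = []  # pivot column of each reduced row, in order
    r = 0
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [v / pivot for v in m[r]]
        # Clear column c in every other row.
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    # Each free (non-pivot) column yields one null-space basis vector.
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for fc in free:
        vec = [Fraction(0)] * n_cols
        vec[fc] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            vec[pc] = -m[row_idx][fc]
        basis.append(vec)
    return basis


def terms(x, y):
    """Candidate terms; x and y are treated symmetrically (assumed choice)."""
    return [1, x, y]


# Hypothetical data generated from the law y = 2x + 3.
data = [(0, 3), (1, 5), (2, 7), (5, 13)]
matrix = [terms(x, y) for x, y in data]

# Each basis vector (a, b, c) encodes a law a*1 + b*x + c*y = 0.
laws = null_space(matrix)
```

Here no variable is singled out as the output: the law is recovered as an implicit equation in all variables, and a candidate set of terms fits the data exactly when the null space is nontrivial.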

Additional Information

© 2019 Published by Elsevier. Received 23 July 2018, Revised 14 January 2019, Accepted 13 June 2019, Available online 15 June 2019. This research was funded by the Gordon and Betty Moore Foundation through Grant GBMF4915 to the Caltech Center for Data-Driven Discovery. This research was conducted as the named Caltech SURF program of Dr. Jane Chen.

Attached Files

Submitted - 1706.05123.pdf

Additional details

Created: August 19, 2023
Modified: October 20, 2023