Deriving Compact Laws Based on Algebraic Formulation of a Data Set
- Creators
- Xu, Wenqing (William)
- Stalzer, Mark
Abstract
In various subjects, there exist concise and consistent relationships between input and output parameters. Discovering these relationships, or more precisely compact laws, in a data set is of great interest in many fields, such as physics, chemistry, and finance. In recent years, the field of data discovery has made great progress towards discovering these laws in practice thanks to the success of machine learning. However, machine learning methods relate the input and output data by considering them separately instead of equally. In addition, the analytical approaches to finding the underlying theories governing the data are relatively slow. In this paper, we develop an innovative approach to discovering compact laws. A novel algebraic equation formulation is proposed such that constant determination and candidate equation verification can be explicitly solved with low computational time. This algebraic equation formulation does not distinguish between input and output variables, and converts the problem of deriving meaning from data into solving a linear algebra equation and searching for linear equations that fit the data. We also derive a more efficient search algorithm using finite fields. Rigorous proofs and computational results are presented to validate these methods. The algebraic formulation allows for the search of equation candidates in an explicit mathematical manner. For a certain type of compact theory, our approach assures convergence, with the discovery being computationally efficient and mathematically precise.
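As a rough illustration of what treating all variables equally and reducing law discovery to a linear algebra problem can look like, the sketch below fits an implicit equation to data by finding a near-null vector of a matrix of candidate terms. It is not the paper's algorithm: the term library (`candidate_terms`), the use of an SVD, and the circle test data are all assumptions made for this example, and the finite-field search mentioned in the abstract is not shown.

```python
import numpy as np

def candidate_terms(x, y):
    # Hypothetical term library: monomials in x and y up to degree 2,
    # in the order [1, x, y, x^2, x*y, y^2]. Neither variable is
    # singled out as an "output".
    return np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])

def fit_implicit_law(x, y):
    # Each row of A holds the candidate terms evaluated at one data point.
    # A law sum_k c_k * term_k = 0 corresponds to a vector c with A @ c = 0,
    # so we take the right singular vector of the smallest singular value.
    A = candidate_terms(x, y)
    _, s, vt = np.linalg.svd(A, full_matrices=False)
    coeffs = vt[-1]
    # Rescale so the largest-magnitude coefficient is 1 (overall sign and
    # scale of an implicit law are arbitrary).
    coeffs = coeffs / coeffs[np.argmax(np.abs(coeffs))]
    return coeffs, s[-1]

# Synthetic data on the unit circle, which satisfies x^2 + y^2 - 1 = 0.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 50)
x, y = np.cos(theta), np.sin(theta)

coeffs, residual = fit_implicit_law(x, y)
print("coefficients for [1, x, y, x^2, xy, y^2]:", np.round(coeffs, 6))
print("smallest singular value:", residual)
```

A smallest singular value close to zero indicates that some combination of the candidate terms vanishes on the data; the recovered coefficients are then proportional to those of the underlying law, here with nonzero weight only on the `1`, `x^2`, and `y^2` terms.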
Additional Information
© 2019 Published by Elsevier. Received 23 July 2018, Revised 14 January 2019, Accepted 13 June 2019, Available online 15 June 2019. This research was funded by the Gordon and Betty Moore Foundation through Grant GBMF4915 to the Caltech Center for Data-Driven Discovery. This research was conducted as the named Caltech SURF program of Dr. Jane Chen.
Attached Files
Submitted - 1706.05123.pdf
Files
Name | Size |
---|---|
md5:80daed9c973ea4569026ed1745816833 | 175.3 kB |
Additional details
- Eprint ID
- 96463
- Resolver ID
- CaltechAUTHORS:20190617-104717869
- GBMF4915
- Gordon and Betty Moore Foundation
- Caltech Summer Undergraduate Research Fellowship (SURF)
- Created
- 2019-06-17 (created from EPrint's datestamp field)
- Updated
- 2021-11-16 (created from EPrint's last_modified field)