Published December 2017 | public
Book Section - Chapter

Massively-parallel best subset selection for ordinary least-squares regression

Abstract

Selecting an optimal subset of k out of d features for linear regression models, given n training instances, is often considered intractable for feature spaces with hundreds or thousands of dimensions. We propose an efficient massively-parallel implementation that selects such optimal feature subsets in a brute-force fashion for small k. By exploiting the enormous compute power of modern parallel devices such as graphics processing units, the implementation can handle thousands of input dimensions even on standard commodity hardware. We evaluate the practical runtime on artificial datasets and sketch the applicability of our framework in the context of astronomy.
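To make the brute-force search described in the abstract concrete, the following is a minimal sequential sketch in Python with NumPy. It is not the authors' GPU implementation. The function name `best_subset`, the use of the residual sum of squares as the selection criterion, and the omission of an intercept term are assumptions made for illustration only.

```python
from itertools import combinations

import numpy as np


def best_subset(X, y, k):
    """Exhaustively search all k-column subsets of X (hypothetical helper).

    Returns the column indices with the smallest residual sum of squares
    for an ordinary least-squares fit without intercept, and that RSS.
    """
    n, d = X.shape
    best_idx, best_rss = None, np.inf
    # There are C(d, k) candidate subsets; each one is fitted independently,
    # which is what makes the search amenable to parallel evaluation.
    for idx in combinations(range(d), k):
        Xs = X[:, list(idx)]
        # Solve the least-squares problem restricted to the chosen columns.
        coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        resid = y - Xs @ coef
        rss = float(resid @ resid)
        if rss < best_rss:
            best_idx, best_rss = idx, rss
    return best_idx, best_rss
```

Calling `best_subset(X, y, k)` with an n-by-d array `X` and a length-n target `y` fits one small regression per subset. Because the number of subsets grows as C(d, k), this sequential loop is only practical for small d and k, which is the cost the abstract proposes to absorb with massively-parallel hardware.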

Additional Information

© 2017 IEEE. Fabian Gieseke acknowledges support from the Danish Industry Foundation through the Industrial Data Analysis Service (IDAS) and Christian Igel acknowledges support from the Innovation Fund Denmark through the Danish Center for Big Data Analytics Driven Innovation (DABAI).

Additional details

Created: August 19, 2023
Modified: October 18, 2023