Published June 2020 | Submitted
Journal Article Open

A prototype knockoff filter for group selection with FDR control

Abstract

In many applications, we need to study a linear regression model that consists of a response variable and a large number of potential explanatory variables, and determine which variables are truly associated with the response. In Foygel Barber & Candès (2015, Ann. Statist., 43, 2055–2085), the authors introduced a new variable selection procedure called the knockoff filter to control the false discovery rate (FDR) and proved that this method achieves exact FDR control. In this paper, we propose a prototype knockoff filter for group selection by extending the Reid–Tibshirani (2016, Biostatistics, 17, 364–376) prototype method. Our prototype knockoff filter improves the computational efficiency and statistical power of the Reid–Tibshirani prototype method when it is applied for group selection. In some cases when the group features are spanned by one or a few hidden factors, we demonstrate that the Principal Component Analysis (PCA) prototype knockoff filter outperforms the Dai–Foygel Barber (2016, 33rd International Conference on Machine Learning (ICML 2016)) group knockoff filter. We present several numerical experiments to compare our prototype knockoff filter with the Reid–Tibshirani prototype method and the group knockoff filter. We have also conducted some analysis of the knockoff filter. Our analysis reveals that some knockoff path method statistics, including the Lasso path statistic, may lead to loss of power for certain design matrices and a specially designed response even if their signal strengths are still relatively strong.

Additional Information

© 2019 The Author(s). Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. Received: 11 June 2017; Revision received: 22 April 2018; Accepted: 24 April 2019; Published: 11 July 2019. The first author's research was conducted during his visit to Applied and Computational Mathematics (ACM) at the California Institute of Technology. We are very thankful for Prof. Emmanuel Candès' valuable comments and suggestions on our work. We also thank Prof. Rina Foygel Barber for communicating with us regarding her group knockoff filter and Dr. Lucas Janson for his insightful comments on our PCA prototype filter. We are grateful to the anonymous referees for their valuable comments and suggestions and for pointing out a potential problem in a numerical example in our earlier manuscript using the glmnet package in solving the Lasso problem. Funding: National Science Foundation (DMS 1318377 and DMS 1613861).

Attached Files

Submitted - 1706.03400.pdf

Additional details

Created: August 22, 2023
Modified: October 19, 2023