Published March 2008 | public
Journal Article

Mixed-Effects Statistical Model for Comparative LC−MS Proteomics Studies

Abstract

Comparing a protein's concentrations across two or more treatments is the focus of many proteomics studies. A frequent source of measurements for these comparisons is a mass spectrometry (MS) analysis of a protein's peptide ions separated by liquid chromatography (LC) following its enzymatic digestion. Alas, LC−MS identification and quantification of equimolar peptides can vary significantly due to their unequal digestion, separation, and ionization. This unequal measurability of peptides, the largest source of LC−MS nuisance variation, stymies confident comparison of a protein's concentration across treatments. Our objective is to introduce a mixed-effects statistical model for comparative LC−MS proteomics studies. We describe LC−MS peptide abundance with a linear model featuring pivotal terms that account for unequal peptide LC−MS measurability. We advance fitting this model to an often incomplete LC−MS data set with REstricted Maximum Likelihood (REML) estimation, producing estimates of model goodness-of-fit, treatment effects, standard errors, confidence intervals, and protein relative concentrations. We illustrate the model with an experiment featuring a known dilution series of a filamentous ascomycete fungus Trichoderma reesei protein mixture. For 781 of the 1546 T. reesei proteins with sufficient data coverage, the fitted mixed-effects models capably described the LC−MS measurements. The LC−MS measurability terms effectively accounted for this major source of uncertainty. Ninety percent of the relative concentration estimates were within 0.5-fold of the true relative concentrations. Akin to the common ratio method, this model also produced biased estimates, albeit less biased. Bias decreased significantly, both absolutely and relative to the ratio method, as the number of observed peptides per protein increased. 
Mixed-effects statistical modeling offers a flexible, well-established methodology for comparative proteomics studies integrating common experimental designs with LC−MS sample processing plans. It favorably accounts for the unequal LC−MS measurability of peptides and produces informative quantitative comparisons of a protein's concentration across treatments with objective measures of uncertainties.
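As an illustration only, a linear mixed-effects model of the kind the abstract describes could take the following general form. The abstract does not give the paper's equation, so every symbol and the particular choice of terms here are assumptions; the published model may include additional or different terms (for example, for the LC−MS sample processing plan).

```latex
% Hypothetical sketch, not the paper's stated model.
% y_{tsp}: log-transformed LC-MS abundance of peptide p of a given protein,
%          measured in sample s under treatment t (notation assumed).
\begin{align}
  y_{tsp} &= \mu + \tau_t + \pi_p + b_{s(t)} + \varepsilon_{tsp}, \\
  b_{s(t)} &\sim \mathcal{N}(0, \sigma_b^2), \qquad
  \varepsilon_{tsp} \sim \mathcal{N}(0, \sigma_\varepsilon^2).
\end{align}
% \mu      : overall mean log abundance of the protein's peptides
% \tau_t   : fixed effect of treatment t
% \pi_p    : peptide term absorbing unequal LC-MS measurability
% b_{s(t)} : random effect of sample s nested within treatment t
% With natural logs, the relative concentration of the protein between
% treatments t and t' would be estimated as
\begin{equation}
  \widehat{R}_{t,t'} = \exp\!\left(\hat{\tau}_t - \hat{\tau}_{t'}\right).
\end{equation}
```

Under this sketch, REML would estimate the variance components \sigma_b^2 and \sigma_\varepsilon^2 from the observed peptides only, which is why an incomplete data set can still be fitted. The peptide term \pi_p cancels in the treatment contrast, so differences in measurability between peptides do not enter the relative concentration estimate directly.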

Additional Information

© 2008 American Chemical Society. Received July 16, 2007. Publication Date (Web): February 6, 2008. The research described in this paper was conducted as part of the Environmental Biomarkers Initiative under the Laboratory Directed Research and Development Program, and the Genomes to Life program at Pacific Northwest National Laboratory, a multiprogram national laboratory operated by Battelle for the U.S. Department of Energy under Contract DE-AC05-76RL01830. The featured proteomics data were processed and archived by the Instrument Development Laboratory at the Environmental Molecular Sciences Laboratory, a national scientific user facility sponsored by the Department of Energy's Office of Biological and Environmental Research.

Additional details

Created: August 19, 2023
Modified: October 25, 2023