Year of Award
Master of Science (MS)
Department or School/College
Douglas Brinkerhoff, Eric Chesebro
Protein inference, bioinformatics, statistics, machine learning, proteomics, protein standard
University of Montana
Applied Statistics | Bioinformatics | Biostatistics | Computational Biology | Data Science | Numerical Analysis and Scientific Computing | Other Computer Sciences
Protein inference is becoming an increasingly important tool that aids in the characterization of complex proteomes and the analysis of complex protein samples. In bottom-up shotgun proteomics experiments, the metrics used to evaluate inference methods (such as AUC and calibration error) are based on an often imperfect target-decoy database. These metrics inherently assume that all of the proteins in the target set are present in the sample being analyzed. In general, this is not the case: the target set is typically a mix of present and absent proteins. To evaluate inference methods objectively, protein standard datasets are used. These datasets are special in that they have been carefully prepared to contain only the proteins specified in the target set. Though this helps, it is still unclear which metrics most adequately capture all the important aspects of a good protein inference method. This manuscript presents a novel protein standard dataset, an ensemble protein inference engine that uses several metrics and protein standard datasets to evaluate the performance of inference methods, and several novel protein inference methods.
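The effect of the target-decoy assumption on the two metrics named in the abstract can be illustrated with a small sketch. This is not code from the thesis: the scores, the label vectors, and the helper names `auc` and `calibration_error` are hypothetical, and the calibration error shown is a simple binned variant chosen for illustration. The same scores are evaluated once against target-decoy labels (every target counted as present) and once against the true presence labels that only a protein standard would provide.

```python
def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_error(scores, labels, n_bins=5):
    """Binned calibration error: weighted mean of |mean score - observed positive rate|."""
    bins = [[] for _ in range(n_bins)]
    for s, y in zip(scores, labels):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 falls in the last bin
        bins[idx].append((s, y))
    total = len(scores)
    err = 0.0
    for b in bins:
        if b:
            mean_score = sum(s for s, _ in b) / len(b)
            pos_rate = sum(y for _, y in b) / len(b)
            err += len(b) / total * abs(mean_score - pos_rate)
    return err


# Hypothetical posterior probabilities for four target and two decoy proteins.
scores = [0.9, 0.8, 0.2, 0.1, 0.3, 0.05]
is_target = [1, 1, 1, 1, 0, 0]   # target-decoy labels: all targets assumed present
is_present = [1, 1, 0, 0, 0, 0]  # true presence, known only for a protein standard

auc_td = auc(scores, is_target)      # 0.75
auc_true = auc(scores, is_present)   # 1.0
ce_td = calibration_error(scores, is_target)
ce_true = calibration_error(scores, is_present)

print(f"AUC: target-decoy={auc_td:.3f}, true presence={auc_true:.3f}")
print(f"Calibration error: target-decoy={ce_td:.3f}, true presence={ce_true:.3f}")
```

In this toy case the method ranks the truly present proteins perfectly, yet the target-decoy labels penalize it for giving low scores to absent targets, so both metrics look worse than they are.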
Lucke, Kyle Lee, "ENSEMBLE PROTEIN INFERENCE EVALUATION" (2021). Graduate Student Theses, Dissertations, & Professional Papers. 11845.
Applied Statistics Commons, Bioinformatics Commons, Biostatistics Commons, Computational Biology Commons, Data Science Commons, Numerical Analysis and Scientific Computing Commons, Other Computer Sciences Commons
© Copyright 2021 Kyle Lee Lucke