Statistical inference methods commonly used in quantitative proteomics are based on the measurement of peptide intensities. They allow protein abundances to be deduced, provided that enough peptides per protein are available. However, they do not satisfactorily handle peptides or proteins whose intensities are missing under certain conditions, even though these are particularly interesting from a biological or medical point of view, since they may explain a difference between the groups being compared. Some state-of-the-art statistical proteomics data processing software imputes these missing values, while other tools simply remove proteins with too many missing peptides. The statistical treatment is not entirely satisfactory when imputation methods are used, notably multiple imputation techniques. Indeed, even if these statistical tools are relevant in this context, the data sets, once imputed, are treated in the subsequent analyses as if they had always been complete: the uncertainty caused by the imputation is not taken into account. These analyses generally conclude with a study of the differences in protein abundances between conditions, using either Student's or Welch's test for the most rudimentary approaches or moderated t-test techniques based on empirical Bayesian approaches. Thus, we propose a new methodology that starts by imputing missing values at the peptide level and estimating the uncertainty associated with this imputation, and that naturally extends to incorporating this uncertainty into the current moderated variance estimation techniques.

Citation: Chion M, Carapito C, Bertrand F (2022) Accounting for multiple imputation-induced variability for differential analysis in mass spectrometry-based label-free quantitative proteomics.
Imputing missing values is common practice in label-free quantitative proteomics. Imputation aims at replacing a missing value with a user-defined one. However, the imputation itself may not be optimally considered downstream of the imputation process, as imputed datasets are often treated as if they had always been complete. Hence, the uncertainty due to the imputation is not adequately taken into account. We provide a rigorous multiple imputation strategy that, thanks to Rubin's rules, leads to a less biased estimation of the parameters' variability. The imputation-based variance estimator of the peptide intensities is then moderated using Bayesian hierarchical models. This estimator is finally included in moderated t-test statistics to provide differential analysis results. This workflow can be used at both the peptide level and the protein level in quantification datasets: an aggregation step is included to obtain protein-level results from peptide-level quantification data. Our methodology, named mi4p, was compared to the state-of-the-art limma workflow implemented in the DAPAR R package, on both simulated and real datasets. We observed a trade-off between sensitivity and specificity, while overall mi4p outperforms DAPAR in terms of F-score.
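To make the role of Rubin's rules concrete, here is a minimal sketch of how estimates from several imputed datasets can be pooled. It is not the mi4p implementation: the function name `pool_rubin` and the example numbers are hypothetical, and the sketch covers only a single scalar parameter (for instance, one peptide's difference in mean log-intensity between two conditions). The total variance adds a between-imputation term to the average within-imputation variance, which is the part that is lost when an imputed dataset is treated as if it had always been complete.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation results for one scalar parameter with Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors of those estimates, in the same order
    Returns (pooled estimate, within variance, between variance, total variance).
    """
    d = len(estimates)
    if d < 2 or d != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)        # combined point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations (divides by d - 1)
    total = within + (1 + 1 / d) * between
    return pooled, within, between, total


# Hypothetical values for illustration only: three imputed datasets.
estimates = [1.0, 1.2, 0.8]
variances = [0.04, 0.05, 0.03]
pooled, within, between, total = pool_rubin(estimates, variances)
print(pooled, within, between, total)
```

In this toy example the between-imputation term is as large as the within-imputation one, so ignoring it would noticeably understate the variance that later enters a test statistic.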