Unbiased data analytic strategies to improve biomarker discovery in precision medicine.
Omics technologies promised improved biomarker discovery for precision medicine. The foremost problem with discovered biomarkers is their irreproducibility between patient cohorts. From a data analytics perspective, the main reasons for these failures are bias in statistical approaches and overfitting resulting from batch effects and confounding factors. The keys to reproducible biomarker discovery are proper study design, unbiased data preprocessing and quality control analyses, and a knowledgeable application of statistics and machine learning algorithms. In this review, we discuss study design and analysis considerations and suggest standards from an expert point of view to promote unbiased decision-making in biomarker discovery in precision medicine.