MSFT-Transformer: a multistage fusion tabular transformer for disease prediction using metagenomic data.
A growing number of studies highlight the crucial role of the human microbiome in maintaining health, and advances in metagenomic sequencing technologies continue to accumulate data associated with human diseases. Although metagenomic data offer rich, multifaceted information, including taxonomic and functional abundance profiles, their full potential remains underutilized, as most approaches rely on only one type of information to identify correlations with disease occurrence. To address this limitation, we propose a multistage fusion tabular transformer architecture (MSFT-Transformer) that aims to effectively integrate various types of high-dimensional tabular information extracted from metagenomic data. Its multistage fusion strategy consists of three modules: a fusion-aware feature extraction module in the early stage to improve the information extracted from the inputs, an alignment-enhanced fusion module in the mid stage to enforce the retention of desired information during cross-modal learning, and an integrated feature decision layer in the late stage to incorporate the desired cross-modal information. We conduct extensive experiments to evaluate the performance of MSFT-Transformer against state-of-the-art models on five standard datasets. Our results indicate that MSFT-Transformer provides stable performance gains with reduced computational costs. An ablation study illustrates the contributions of all three modules relative to a reference multistage fusion transformer without these novel strategies. The analysis of the results suggests that the proposed model has significant potential for future disease prediction with metagenomic data.
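The three-stage flow described above (early feature extraction, mid-stage cross-modal fusion, late decision layer) can be illustrated with a minimal sketch. This is not the authors' implementation: every function name, the per-feature tokenization, the single-head cross-attention without learned projections, the cosine alignment score, and the logistic decision head are assumptions chosen only to show how two tabular modalities (taxonomic and functional abundances) might be fused in stages. Weights are random and untrained.

```python
import math
import random


def init_params(n_taxa, n_funcs, d=4, seed=0):
    """Random, untrained weights (hypothetical layout, for illustration only)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "taxa_emb": mat(n_taxa, d),    # one embedding vector per taxonomic feature
        "func_emb": mat(n_funcs, d),   # one embedding vector per functional feature
        "head": [rng.uniform(-0.5, 0.5) for _ in range(2 * d)],
        "bias": 0.0,
    }


def tokenize(x, emb):
    """Early stage (assumed): turn each scalar feature into a d-dimensional token."""
    return [[xi * e for e in emb_i] for xi, emb_i in zip(x, emb)]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Mid stage (assumed): tokens of one modality attend to the other modality."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[j] for w, kv in zip(weights, keys_values)) for j in range(d)])
    return out


def mean_pool(tokens):
    n = len(tokens)
    return [sum(t[j] for t in tokens) / n for j in range(len(tokens[0]))]


def cosine(a, b):
    """Alignment score (assumed): cosine similarity of the two pooled representations."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def predict(taxa, funcs, params):
    # Early stage: per-modality tokenization
    t_tok = tokenize(taxa, params["taxa_emb"])
    f_tok = tokenize(funcs, params["func_emb"])
    # Mid stage: bidirectional cross-modal attention, then pooling
    t_fused = mean_pool(cross_attend(t_tok, f_tok))
    f_fused = mean_pool(cross_attend(f_tok, t_tok))
    alignment = cosine(t_fused, f_fused)
    # Late stage: concatenate both fused representations and apply a logistic head
    joint = t_fused + f_fused
    logit = sum(w * v for w, v in zip(params["head"], joint)) + params["bias"]
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob, alignment


params = init_params(n_taxa=3, n_funcs=2)
prob, alignment = predict([0.2, 0.5, 0.3], [0.7, 0.3], params)
```

In a trained model, an alignment term such as this one would typically enter the loss alongside the classification objective, so that the two modalities are encouraged to agree during cross-modal learning. How MSFT-Transformer actually defines and weights its alignment term is not stated in the abstract.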