Multi-Label Generalized Zero Shot Chest X-Ray Classification by Combining Image-Text Information With Feature Disentanglement.
In fully supervised medical image classification, the robustness of a trained model depends on its exposure to the full range of disease classes during training. Generalized Zero Shot Learning (GZSL) aims to predict both seen and novel unseen classes. While most GZSL approaches address the single-label case, chest X-rays often carry multiple disease labels. We propose a novel multi-modal multi-label GZSL approach that leverages feature disentanglement and multi-modal information to synthesize features of unseen classes. Disease labels are processed through a pre-trained BioBERT model to obtain text embeddings, which are used to build a dictionary encoding similarity among labels. We then use disentangled features and graph aggregation to learn a second dictionary of inter-label similarities, followed by clustering to identify representative vectors for each class. These dictionaries and representative vectors guide the feature synthesis step, generating realistic multi-label samples of seen and unseen disease classes. Our method outperforms competing approaches in experiments on the NIH and CheXpert chest X-ray datasets.
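As a minimal sketch of the text branch, the code below encodes disease-label names with a public BioBERT checkpoint and builds a label-similarity dictionary from cosine similarities. The checkpoint name (dmis-lab/biobert-v1.1), the illustrative label subset, the mean pooling, and the cosine-similarity measure are assumptions for illustration; the paper's actual embedding and dictionary construction may differ.

    import torch
    from transformers import AutoModel, AutoTokenizer

    # Illustrative subset of chest X-ray disease labels (assumed, not the full label set).
    LABELS = ["Atelectasis", "Cardiomegaly", "Effusion", "Pneumonia", "Pneumothorax"]

    tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-v1.1")
    model = AutoModel.from_pretrained("dmis-lab/biobert-v1.1")
    model.eval()

    with torch.no_grad():
        batch = tokenizer(LABELS, padding=True, return_tensors="pt")
        hidden = model(**batch).last_hidden_state          # (num_labels, seq_len, hidden_dim)
        mask = batch["attention_mask"].unsqueeze(-1)       # mask out padding tokens
        emb = (hidden * mask).sum(dim=1) / mask.sum(dim=1) # mean-pooled label embeddings

    emb = torch.nn.functional.normalize(emb, dim=1)
    similarity_dictionary = emb @ emb.T                    # (num_labels, num_labels) cosine similarities
    print(similarity_dictionary)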
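The clustering step that yields a representative vector per class can likewise be sketched with an off-the-shelf k-means call; the helper name, the single-cluster default, and the assumption that per-class disentangled features are available as a NumPy array are illustrative only, and the paper's clustering procedure may differ.

    import numpy as np
    from sklearn.cluster import KMeans

    def class_representatives(features: np.ndarray, n_clusters: int = 1) -> np.ndarray:
        # Cluster the disentangled features of one class and return the cluster
        # centres as representative vectors; `features` has shape (num_samples, feat_dim).
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)
        return km.cluster_centers_

    # Usage: 200 synthetic 512-d features for one class, one representative vector.
    rep = class_representatives(np.random.randn(200, 512), n_clusters=1)
    print(rep.shape)  # (1, 512)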