DKEC: Domain Knowledge Enhanced Multi-Label Classification for Diagnosis Prediction
Abstract
Multi-label text classification (MLTC) tasks in the medical domain often face the long-tail label distribution problem. Prior works have explored hierarchical label structures to find relevant information for few-shot classes, but mostly neglected to incorporate external knowledge from medical guidelines. This paper presents DKEC, Domain Knowledge Enhanced Classification for diagnosis prediction with two innovations: (1) automated construction of heterogeneous knowledge graphs from external sources to capture semantic relations among diverse medical entities, (2) incorporating the heterogeneous knowledge graphs in few-shot classification using a label-wise attention mechanism. We construct DKEC using three online medical knowledge sources and evaluate it on a real-world Emergency Medical Services (EMS) dataset and a public electronic health record (EHR) dataset. Results show that DKEC outperforms the state-of-the-art label-wise attention networks and transformer models of different sizes, particularly for the few-shot classes. More importantly, it helps the smaller language models achieve comparable performance to large language models.
Approach
Figure 2. We develop a method for the automated construction of heterogeneous knowledge graphs from online sources (e.g., Wikipedia, MayoClinic, ODEMSA) that accurately captures semantic relations among diverse medical entities (e.g., symptoms and diseases, diseases and treatments). The method combines medical entity extraction via chain-of-thought prompting with GPT-4 and UMLS medical concept normalization. Knowledge graph construction: (1) retrieve → (2) step-by-step extraction (token classification, span detection, relation extraction) → (3) UMLS normalization.
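The three-step construction above can be sketched as a toy pipeline. Everything below is a hypothetical stand-in for illustration: the triple "extraction" is hard-coded in place of the one-shot chain-of-thought GPT-4 calls, and the normalization table uses placeholder concept IDs rather than a real UMLS linker.

```python
from collections import defaultdict

# Step 1 stand-in: a retrieved passage (the paper retrieves from Wikipedia,
# MayoClinic, and ODEMSA).
passage = ("Chest pain and shortness of breath may indicate myocardial "
           "infarction, which is often treated with aspirin.")

def extract_triples(text):
    """Step 2 stand-in: token classification, span detection, and relation
    extraction (done in the paper with one-shot CoT GPT-4 prompting);
    the output here is hand-written for illustration."""
    return [
        ("chest pain", "symptom_of", "myocardial infarction"),
        ("shortness of breath", "symptom_of", "myocardial infarction"),
        ("aspirin", "treats", "myocardial infarction"),
    ]

# Step 3 stand-in: UMLS concept normalization, mocked with a tiny lookup
# from surface forms to placeholder concept identifiers.
NORMALIZE = {
    "chest pain": "CUI_CHEST_PAIN",
    "shortness of breath": "CUI_DYSPNEA",
    "aspirin": "CUI_ASPIRIN",
    "myocardial infarction": "CUI_MI",
}

# Assemble a heterogeneous graph: edges grouped by relation type, so that
# symptom-disease and treatment-disease relations stay distinguishable.
graph = defaultdict(set)
for head, rel, tail in extract_triples(passage):
    graph[rel].add((NORMALIZE[head], NORMALIZE[tail]))

print(dict(graph))
```

Keying edges by relation type is what makes the graph heterogeneous: downstream components can treat symptom-of and treats edges differently rather than collapsing them into one adjacency structure.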
Figure 3. We design a heterogeneous label-wise attention mechanism based on graph transformers. It captures diagnosis co-occurrence relations via the relevant medical entities in the knowledge graph, and it can be combined with different text encoders (e.g., Multi-filter CNN, BERT) to improve multi-label classification. DKEC pipeline: text branch → graph branch → heterogeneous label-wise attention → label-attentive document embeddings.
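A minimal NumPy sketch of the label-wise attention step, assuming the text branch yields token embeddings `H` and the graph branch yields one embedding per label `L`; the shapes and the scaled dot-product form are illustrative choices, not the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
num_labels, seq_len, d = 4, 6, 8
H = rng.normal(size=(seq_len, d))     # token embeddings (text branch)
L = rng.normal(size=(num_labels, d))  # label embeddings (graph branch)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Each label attends over the token sequence, so every label gets its own
# view of the document instead of sharing a single pooled vector.
A = softmax(L @ H.T / np.sqrt(d))  # (num_labels, seq_len) attention weights
D = A @ H                          # (num_labels, d) label-attentive doc embeddings

print(D.shape)  # (4, 8)
```

The per-label document embeddings `D` are what each label's classifier scores, which is why knowledge-informed label embeddings can steer attention toward evidence for rare labels.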
Results
How accurate is the knowledge graph? One-shot CoT GPT-4 outperforms the other baselines in medical entity extraction.
Can DKEC improve MLTC performance on class-imbalanced datasets? DKEC helps more on the tail (few-shot) labels.
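To make the head-vs-tail distinction concrete, here is a minimal sketch of computing F1 per label so that frequent and rare labels can be scored separately; the toy predictions and the head/tail assignment are illustrative, not taken from the paper's evaluation code.

```python
import numpy as np

def per_label_f1(y_true, y_pred):
    """F1 computed independently for each label column of a binary
    multi-label matrix (rows = examples, columns = labels)."""
    tp = (y_true & y_pred).sum(axis=0)
    fp = (~y_true & y_pred).sum(axis=0)
    fn = (y_true & ~y_pred).sum(axis=0)
    # guard against 0/0 for labels with no positives or predictions
    return 2 * tp / np.maximum(2 * tp + fp + fn, 1)

# Toy data: label 0 plays a frequent "head" label, label 1 a rarer "tail" one.
y_true = np.array([[1, 0], [1, 1], [0, 1]], dtype=bool)
y_pred = np.array([[1, 0], [1, 0], [0, 1]], dtype=bool)

f1 = per_label_f1(y_true, y_pred)
print(f1)  # label 0 F1 = 1.0, label 1 F1 ≈ 0.67
```

Macro-averaging such per-label scores within frequency buckets is a standard way to surface exactly the few-shot gains reported above, since an overall micro-F1 is dominated by the head labels.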
How does DKEC perform when applied to language models of varying sizes? (1) The performance of DKEC-based models increases less as model size grows. (2) DKEC enables smaller language models to achieve performance comparable to LLMs.
How does DKEC perform as the label set scales? As the number of labels increases, MLTC performance generally drops, but DKEC helps maintain it, particularly when external knowledge is available for all the labels.
Video Presentation
Poster
BibTeX
@inproceedings{ge-etal-2024-dkec,
title = {DKEC: Domain Knowledge Enhanced Multi-Label Classification for Diagnosis Prediction},
author = {Ge, Xueren and Satpathy, Abhishek and Williams, Ronald Dean and Stankovic, John and Alemzadeh, Homa},
booktitle = {Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing},
month = nov,
year = {2024},
address = {Miami, Florida, USA},
publisher = {Association for Computational Linguistics},
url = {https://aclanthology.org/2024.emnlp-main.712/},
doi = {10.18653/v1/2024.emnlp-main.712},
pages = {12798--12813}
}