Machine Learning
Healthcare AI
Signal Processing
Project Overview
Developing an AI system to detect tuberculosis from cough sounds, making TB screening more accessible in resource-limited settings and helping healthcare workers identify cases earlier to save lives.
Detailed Introduction
Motivation. Tuberculosis (TB) remains one of the world's deadliest infectious diseases, with over 10 million new cases and 1.5 million deaths annually. Early detection is crucial for effective treatment and preventing transmission. However, traditional diagnostic methods like sputum microscopy and chest X-rays require specialized equipment and trained personnel, making them inaccessible in many resource-limited settings.
Key project objectives
- Develop a machine learning model that can accurately classify TB from cough sound recordings.
- Create a mobile-friendly screening tool for use in resource-limited settings.
- Validate the system's performance across diverse populations and acoustic environments.
- Ensure the solution is culturally sensitive and accessible to underserved communities.
- Integrate with existing healthcare workflows to support clinical decision-making.
Background: Global TB Challenge
Tuberculosis disproportionately affects low- and middle-income countries, where access to diagnostic tools is limited. The World Health Organization estimates that 3 million TB cases go undiagnosed each year, contributing to continued transmission and poor outcomes. Our project aims to address this gap by leveraging the ubiquity of mobile devices and the distinct acoustic patterns of TB-related coughs.
Technical approach
Our system uses deep learning techniques to analyze cough sound characteristics, including:
- Audio preprocessing: Noise reduction, normalization, and feature extraction from raw audio recordings.
- Feature engineering: Mel-frequency cepstral coefficients (MFCCs), spectral features, and temporal patterns.
- Model architecture: Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) for audio classification.
- Data augmentation: Synthetic data generation to improve model robustness across different recording conditions.
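The preprocessing and augmentation steps above can be sketched with NumPy alone. This is a minimal illustration, not the project's actual pipeline: the frame length, hop size, pre-emphasis coefficient, and noise level are assumed placeholder values, and spectral centroid stands in for the fuller MFCC/spectral feature set described above.

```python
import numpy as np

def preprocess(signal, alpha=0.97):
    """Pre-emphasis filter plus peak normalization (illustrative defaults)."""
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    peak = np.max(np.abs(emphasized))
    return emphasized / peak if peak > 0 else emphasized

def frame_signal(signal, frame_len=400, hop=160):
    """Split a 1-D signal into windowed, overlapping frames
    (400 / 160 samples = 25 ms / 10 ms at an assumed 16 kHz rate)."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)

def spectral_centroid(frames, sr=16000):
    """Per-frame spectral centroid: magnitude-weighted mean frequency."""
    mags = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    return (mags @ freqs) / (mags.sum(axis=1) + 1e-12)

def augment(signal, rng, noise_db=-30.0):
    """Simple augmentation: additive Gaussian noise at a fixed level,
    mimicking variation in recording conditions."""
    noise = rng.standard_normal(len(signal)) * 10 ** (noise_db / 20)
    return signal + noise

# Sanity check on a synthetic 1 kHz tone: centroids should sit near 1 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)
frames = frame_signal(preprocess(tone))
centroids = spectral_centroid(frames, sr)
```

In practice these frame-level features would be aggregated per cough (e.g. mean and variance over frames) before classification.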
Data collection & privacy
- Ethical approval: IRB approval for audio data collection from TB patients and healthy controls.
- Data acquisition: High-quality audio recordings in controlled clinical environments.
- Privacy protection: De-identification of audio samples and secure data storage protocols.
- Quality control: Expert validation of TB diagnosis and audio quality assessment.
Model development strategy
We follow a systematic approach to model development:
- Baseline models: Start with traditional machine learning approaches (SVM, Random Forest) for comparison.
- Deep learning exploration: Implement CNN and RNN architectures optimized for audio classification.
- Ensemble methods: Combine multiple models to improve overall performance and robustness.
- Cross-validation: Rigorous evaluation using stratified k-fold cross-validation and holdout test sets.
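The baseline-plus-cross-validation steps above can be sketched as follows, assuming scikit-learn is available. The feature matrix here is synthetic random data standing in for extracted cough features; it only demonstrates the evaluation scaffolding, not real performance.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
# Placeholder stand-in for per-recording features (e.g. MFCC statistics).
X = rng.standard_normal((200, 13))
y = rng.integers(0, 2, size=200)  # 1 = TB-positive, 0 = control (synthetic)

# Baseline model; a CNN/RNN would be evaluated with the same CV protocol.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="f1")
```

Stratified folds preserve the TB/control ratio in each split, which matters when positive cases are scarce.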
Evaluation & validation
Our evaluation framework includes:
- Performance metrics: Accuracy, sensitivity, specificity, precision, and F1-score.
- Clinical validation: Comparison with standard diagnostic methods (sputum microscopy, chest X-ray).
- Robustness testing: Performance across different recording devices, environments, and patient populations.
- Bias assessment: Evaluation across age, gender, and demographic groups to ensure equitable performance.
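The screening metrics listed above all derive from confusion-matrix counts; a minimal sketch (with hypothetical counts for illustration):

```python
def screening_metrics(tp, fp, tn, fn):
    """Clinical screening metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall: TB cases correctly flagged
    specificity = tn / (tn + fp)          # controls correctly cleared
    precision = tp / (tp + fp)            # flagged cases that are truly TB
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "accuracy": accuracy, "f1": f1}

# Hypothetical counts from a 100-recording validation set (not real results).
m = screening_metrics(tp=40, fp=5, tn=45, fn=10)
```

For screening, sensitivity is usually weighted most heavily, since a missed TB case carries a higher cost than a false referral.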
Team & roles
Core investigators
- June Lee, MS — Project Lead, Machine Learning & Signal Processing
- Alex Ching, PhD — Audio Processing & Model Architecture
- Cynthia Dong, PhD — Clinical Validation & Data Analysis
- Prof. Shwetak N. Patel — Faculty Advisor, Ubiquitous Computing
- David Horne, MD — Clinical Partner, Harborview Medical Center
Collaborators
- Clinical informatics & data governance
- Audio engineering & signal processing
- Mobile app development
- Global health & implementation research
Implementation & deployment
- Mobile application: Cross-platform app for audio recording and real-time analysis.
- Cloud infrastructure: Secure backend for model inference and data processing.
- Integration: API for integration with existing healthcare information systems.
- User interface: Intuitive design for healthcare workers with varying technical expertise.
Timeline & milestones
Months 0–3: Data collection setup, IRB approval, initial audio preprocessing pipeline.
Months 4–9: Model development, baseline performance evaluation, feature engineering optimization.
Months 10–15: Clinical validation, mobile app development, pilot testing in clinical settings.
Months 16+: Field deployment, performance monitoring, iterative improvements based on user feedback.
Expected outcomes
- Validated AI model for TB detection from cough sounds with >90% accuracy.
- Mobile screening tool deployable in resource-limited settings.
- Clinical evidence supporting the use of audio-based TB screening.
- Open-source framework for similar audio-based diagnostic applications.
Challenges & mitigation
- Audio quality variation: Robust preprocessing and data augmentation techniques.
- Cultural sensitivity: Community engagement and culturally appropriate implementation.
- Regulatory approval: Early engagement with health authorities and regulatory bodies.
- Scalability: Cloud-based infrastructure and efficient model deployment strategies.
Global impact potential
This project has the potential to revolutionize TB screening in resource-limited settings by providing an accessible, cost-effective, and non-invasive screening tool. By leveraging the ubiquity of mobile devices, we can bring advanced diagnostic capabilities to underserved populations and contribute to global TB elimination efforts.
Contact & collaboration
If you're interested in collaborating, contributing data, or learning more about this project, please contact:
- June Lee, MS — june0604@uw.edu
- Prof. Shwetak N. Patel — shwetak@cs.washington.edu
- David Horne, MD — dhorne@uw.edu