Brain Trauma Injury Foundation Model

Foundation Model, Multimodal, Large Scale

Project Overview

This project aims to address a pressing challenge in trauma care — the vast amount of underutilized data stored across hospital systems. At Harborview Medical Center, a Level I trauma center, large volumes of imaging, clinical notes, and outcomes data are collected but remain fragmented and difficult to integrate for research or clinical use. Our objective is to explore how these heterogeneous data sources can be transformed into structured, privacy-preserving knowledge that supports clinicians in understanding injuries, tracking recovery, and making equitable treatment decisions. Rather than simply building another foundation model, the project investigates how large multimodal AI systems can be responsibly adapted to serve as domain specialists in traumatic brain and spinal injury.

Detailed Introduction

Motivation. Traumatic brain injury (TBI) and spinal cord injury (SCI) continue to impose major clinical and societal burdens, yet the datasets that could improve care remain largely unexploited because of privacy, accessibility, and scalability barriers. Through close collaboration with Harborview Medical Center, we found that much of this data is securely stored but neither analyzed nor connected across departments. This project addresses that gap by developing and evaluating technical strategies for adapting and fine-tuning large-scale foundation models to this domain. Specifically, we assess how much data, which model configurations, and which governance processes are needed for a general-purpose foundation model (such as Amazon Nova) to become a reliable, transparent, and clinically useful specialist model for TBI and SCI.

Our approach is research-driven and interdisciplinary: we work directly with clinicians from neurosurgery, orthopedics, and rehabilitation medicine to define meaningful downstream tasks while experimenting with scalable, privacy-preserving multimodal learning frameworks. The long-term goal is to establish a reproducible pathway for converting complex, unstructured clinical data into equitable and interpretable AI systems that augment, rather than replace, medical decision-making.

Evaluation, safety & fairness

Evaluation will include standard technical metrics (AUC, sensitivity, specificity, and the Dice coefficient for segmentation), but must also include:

Fairness audits across age, sex, race/ethnicity, payer status, and geography.
Privacy leakage testing (can identifiers be recovered?) and red-team adversarial probes.
Clinical utility studies with clinicians to measure interpretability, workflow fit, and downstream impact on decision-making.
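
As a rough illustration of the technical metrics and subgroup audits listed above, the following is a minimal sketch in plain Python. The function names, the toy labels, and the group labels "A" and "B" are hypothetical and are not taken from the project; AUC is omitted for brevity, and a real audit would more likely rely on an established metrics library and on confidence intervals for small subgroups.

```python
from collections import defaultdict


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def dice(mask_a, mask_b):
    """Dice = 2|A ∩ B| / (|A| + |B|) over flattened binary masks."""
    intersection = sum(a & b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    # Two empty masks agree perfectly by convention.
    return 2 * intersection / total if total else 1.0


def subgroup_report(y_true, y_pred, groups):
    """Compute (sensitivity, specificity) separately for each subgroup."""
    by_group = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        by_group[g][0].append(t)
        by_group[g][1].append(p)
    return {g: sensitivity_specificity(ts, ps) for g, (ts, ps) in by_group.items()}


# Toy data, purely illustrative (not project results).
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(sensitivity_specificity(y_true, y_pred))  # (0.75, 0.75)
print(subgroup_report(y_true, y_pred, groups))  # {'A': (0.5, 1.0), 'B': (1.0, 0.5)}
print(dice([1, 1, 0, 0], [1, 0, 1, 0]))         # 0.5
```

In this toy case the overall numbers look balanced, while the per-group breakdown shows lower sensitivity for one group and lower specificity for the other, which is the kind of gap a fairness audit is meant to surface.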

Team & roles

Core investigators

MS June Lee — Project Lead, Data engineering & model operations
MS Benjamin Han — Data Scientist, Data engineering & model operations
MS Mary Nguyen — Fullstack, Data Engineering
DO Asouri Souri — Neurology, Data Analysis
PhD Rupak Rajachar — Project Advisor
MD Diana Wiseman — Neurosurgery lead, clinical labeling & validation
MD Jamie Ott, MD Heather Barnett — Orthopedics clinical partners
MD Christopher Lewis — Rehabilitation medicine, outcomes & SDOH integration
PhD Vikash Gupta — Sr. Solution Architect, AWS Healthcare
PhD Ujjwal Ratan — Machine Learning Team Leader, AWS Healthcare

Advisors & collaborators

Clinical informatics & IRB governance
Imaging informatics (Radiology IT, PACS/Visage integration)
Privacy & legal (data use, HIPAA compliance)
External auditing partners for fairness/privacy evaluation

Infrastructure & tooling

Secure cloud environment (AWS recommended for Bedrock integration) with encryption at rest/in transit and VPC isolation.
Data lake for SGT-protected artifacts + metadata catalog.
Model training platform supporting multimodal transformers, distributed training, and experiment tracking (MLflow, Weights & Biases, or similar).
API gateway for clinical integration with Epic and Visage (FHIR mappings, HL7/RIS connectors) — outputs should be pluggable clinical decision support artifacts rather than black-box predictions.
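
To make the idea of a pluggable decision support artifact more concrete, here is a minimal sketch that wraps a single model output as a FHIR-style RiskAssessment resource. It assumes FHIR R4 field names (status, subject, method, prediction.probabilityDecimal, prediction.rationale, note); the function name, patient ID, outcome text, and model version are hypothetical. The result is not validated against a FHIR server, and the source does not state which resource types the Epic or Visage integration would actually use.

```python
import json


def to_risk_assessment(patient_id, outcome_text, probability, rationale, model_version):
    """Wrap one model prediction as a FHIR R4-style RiskAssessment dict.

    Carrying the rationale and model version alongside the probability keeps
    the output reviewable rather than a bare score.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return {
        "resourceType": "RiskAssessment",
        # "preliminary" signals that a clinician has not yet reviewed the result.
        "status": "preliminary",
        "subject": {"reference": f"Patient/{patient_id}"},
        "method": {"text": f"model {model_version}"},
        "prediction": [
            {
                "outcome": {"text": outcome_text},
                "probabilityDecimal": probability,
                "rationale": rationale,
            }
        ],
        "note": [{"text": "Decision support only; requires clinician review."}],
    }


# Hypothetical example values, not real patient data or model output.
artifact = to_risk_assessment(
    patient_id="example-123",
    outcome_text="Prolonged rehabilitation stay",
    probability=0.27,
    rationale="Illustrative rationale text produced by the model.",
    model_version="v0-demo",
)
print(json.dumps(artifact, indent=2))
```

A production mapping would also need coded values (for example SNOMED CT or LOINC codes in place of free text) and validation against the target system's FHIR profile.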

Timeline & milestones (example)

Months 0–3: Governance, IRB, data pipeline prototypes, SGT proof-of-concept, pilot fine-tuning on AWS Bedrock (text-only).

Months 4–9: Add imaging modality (CT/CT-angio), structured labeling for key injury classes, clinician-in-the-loop evaluation, initial external audit.

Months 10–18: Scale data, continual pretraining for domain encoder, multimodal fusion, prospective clinical pilot for a targeted use-case (e.g., injury categorization & discharge planning assistance).

Months 18+: Production integration, monitoring, regulatory preparations, multi-center validation and model updates.

Deliverables

SGT-protected trauma dataset & metadata catalog (for internal, governed use).
Multimodal trauma foundation encoder and fine‑tuned downstream models for categorization, segmentation, and prognosis.
Clinician dashboard / FHIR API for integration with Epic and radiology systems.
Evaluation reports (technical, fairness, privacy) and documentation for audits and regulators.

Risks & mitigation

Data privacy risk: Mitigate via SGT, strict governance, limited access controls, and regular privacy audits.
Model bias & fairness: Continuous audits, balanced data sampling, subgroup metrics, and clinician oversight.
Clinical adoption: Early clinician involvement, human-in-the-loop design, interpretable outputs and clear limitations.
Cost & compute: Use a staged approach (Bedrock pilot → expand) to manage spend; use spot instances and efficient training recipes.
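
As one small piece of the regular privacy audits mentioned under data privacy risk, generated text can be scanned for identifier-like strings. The sketch below is a crude first-pass check under stated assumptions: the regular expressions, the "MRN" prefix convention, and the function names are all hypothetical, and the patterns cover only a few US-style formats. It is not a substitute for membership-inference or extraction testing, or for review by privacy specialists.

```python
import re

# Hypothetical patterns for a few identifier formats. A real audit would need
# a much broader set (names, addresses, dates, device IDs, and so on).
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "mrn": re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE),
}


def scan_for_identifiers(text):
    """Return (label, matched_text) pairs for identifier-like strings."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return hits


def leakage_rate(outputs):
    """Fraction of model outputs containing at least one flagged string."""
    if not outputs:
        return 0.0
    flagged = sum(1 for text in outputs if scan_for_identifiers(text))
    return flagged / len(outputs)


# Synthetic strings, purely illustrative.
sample_outputs = [
    "CT shows a small subdural hematoma without midline shift.",
    "Patient MRN: 1234567 called from 206-555-0100.",
]
print(scan_for_identifiers(sample_outputs[1]))
print(leakage_rate(sample_outputs))  # 0.5
```

Tracking this rate across model versions gives a simple regression signal, although a rate of zero only shows that these particular patterns did not match.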

Connection to recent radiology scaling research

Recent work on radiology scaling laws reports large performance gains from continual in-domain pretraining on institution-specific imaging corpora, with even modestly sized datasets providing outsized improvements. This supports our strategy of beginning with Bedrock fine-tuning for speed and safety, then moving to larger in-domain continual pretraining if required.

Contact & next steps

If you're interested in collaborating, contributing labeled data, or reviewing the governance plan, please contact the project leads:

MS June Lee — june0604@uw.edu
Core team: Benjamin Han, Mary Nguyen, Asouri Souri, Rupak Rajachar (internal UW contacts)
