Multimodal AI for Healthcare: Automating Multi-Specialist Medical Diagnosis

Feb 9, 2025

Medical diagnosis often requires input from multiple specialists to get a comprehensive understanding of a patient’s health condition. In this project, I built an AI-powered system using LangChain and OpenAI’s GPT model to analyze medical reports from various specialist perspectives, including cardiology, psychology, pulmonology, neurology, endocrinology, and immunology. The system then aggregates insights from these specialists into a final multidisciplinary diagnosis.

This post walks through the motivation, architecture, implementation, and key takeaways of the project.

Motivation

Traditional medical diagnosis can be time-consuming and requires collaboration between different specialists. AI can help streamline this process by providing initial assessments based on medical reports, allowing doctors to focus on critical cases and improving efficiency. This project aims to:

Automate the analysis of medical reports.
Provide specialist insights for different medical domains.
Generate a comprehensive multidisciplinary diagnosis.

Tech Stack

LangChain: For prompt engineering and structured response generation.
OpenAI’s GPT-3.5 Turbo: For natural language processing.
PyMuPDF (fitz): For extracting text from PDFs.
Python (ThreadPoolExecutor): For parallel processing of AI agents.
JSON & File Handling: For saving and structuring results.

System Architecture

The system consists of:

Medical Report Extraction: Reads and extracts text from a PDF report.
Specialist AI Agents: Each agent analyzes the report from a different medical perspective (e.g., cardiologist, psychologist, etc.).
Multidisciplinary Team Agent: Aggregates responses from all specialists to provide a final comprehensive diagnosis.
Output Storage: Saves the final diagnosis as a text file.

Implementation Details

1. Medical Report Extraction

We use PyMuPDF to extract text from a PDF file:

import fitz  # PyMuPDF

def read_pdf(file_path):
    """Return the text of every page in the PDF, concatenated."""
    # The context manager closes the document once extraction finishes.
    with fitz.open(file_path) as doc:
        return "".join(page.get_text("text") for page in doc)

This function loads a PDF and extracts its text content for further analysis.
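
Extracted PDF text often carries ragged whitespace, and a long report can exceed the model's context window. The sketch below is one way to tidy the text before it reaches the agents. The `clean_report_text` helper and the `max_chars` budget are hypothetical names and values, not part of the original project.

```python
import re

def clean_report_text(text, max_chars=40_000):
    """Normalize whitespace in extracted PDF text and cap its length.

    max_chars is an assumed, illustrative budget; tune it to the model's
    context window.
    """
    # Collapse runs of spaces and tabs inside each line.
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    # Collapse consecutive blank lines into a single blank line.
    cleaned = []
    for line in lines:
        if line or (cleaned and cleaned[-1]):
            cleaned.append(line)
    result = "\n".join(cleaned).strip()
    # Truncate by characters as a rough stand-in for a token limit.
    return result[:max_chars]
```

Character-based truncation is only an approximation. A token-aware limit would be more precise, at the cost of an extra dependency.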

2. Specialist AI Agents

Each medical specialist agent follows a structured prompt template designed for their domain:

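from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

class Agent:
    def __init__(self, medical_report=None, role=None, extra_info=None):
        self.medical_report = medical_report
        self.role = role
        # Specialist responses, used only by the multidisciplinary team agent.
        self.extra_info = extra_info or {}
        self.prompt_template = self.create_prompt_template()
        self.model = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0125")

    def create_prompt_template(self):
        templates = {
            "Cardiologist": """
                Act like a cardiologist. Review the patient's cardiac workup, including ECG, blood tests, and echocardiogram.
                Provide insights on possible cardiac conditions and next steps.
                Medical Report: {medical_report}
            """,
            "Psychologist": """
                Act like a psychologist. Analyze the report for mental health concerns such as anxiety or depression.
                Medical Report: {medical_report}
            """,
            # Placeholder wording (an assumption): the team prompt is not shown in this post.
            "MultidisciplinaryTeam": """
                Act like a multidisciplinary team of healthcare professionals. Review the specialist reports below
                and summarize the most likely health issues.
                Specialist Reports: {specialist_reports}
            """,
        }
        return PromptTemplate.from_template(templates[self.role])

    def run(self):
        # Offer both values, then pass only the variables this role's template declares.
        values = {
            "medical_report": self.medical_report,
            "specialist_reports": "\n\n".join(f"{name}: {text}" for name, text in self.extra_info.items()),
        }
        inputs = {key: values[key] for key in self.prompt_template.input_variables}
        prompt = self.prompt_template.format(**inputs)
        response = self.model.invoke(prompt)
        return response.content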

Each agent is initialized with a role and a specific prompt tailored to that role.
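
The introduction lists six specialties, while the templates above cover only some of them. One way to add the rest without repeating boilerplate is to generate each template from a role name and a short focus description. The focus strings and the `build_templates` helper below are illustrative assumptions, not the project's actual prompts.

```python
# Hypothetical focus descriptions for each specialty; the wording is
# illustrative only.
SPECIALIST_FOCUS = {
    "Cardiologist": "possible cardiac conditions and next steps",
    "Psychologist": "mental health concerns such as anxiety or depression",
    "Pulmonologist": "respiratory issues that could explain the symptoms",
    "Neurologist": "neurological causes such as seizures or migraines",
    "Endocrinologist": "hormonal or metabolic imbalances",
    "Immunologist": "autoimmune or allergic conditions",
}

def build_templates(focus_by_role):
    """Return a {role: template string} mapping with a {medical_report} slot."""
    templates = {}
    for role, focus in focus_by_role.items():
        article = "an" if role[0].lower() in "aeiou" else "a"
        templates[role] = (
            f"Act like {article} {role.lower()}. "
            f"Analyze the report for {focus}.\n"
            # Plain string, so the placeholder braces survive for later formatting.
            "Medical Report: {medical_report}"
        )
    return templates

templates = build_templates(SPECIALIST_FOCUS)
print(templates["Neurologist"].format(medical_report="(report text)"))
```

Each resulting string keeps a `{medical_report}` placeholder, so it can be passed to `PromptTemplate.from_template` in the same way as the hand-written templates.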

3. Running Agents in Parallel

To speed up processing, we run all specialist agents concurrently using Python’s ThreadPoolExecutor:

from concurrent.futures import ThreadPoolExecutor, as_completed

# medical_report holds the text returned by read_pdf() in step 1.
agents = {
    "Cardiologist": Agent(medical_report, role="Cardiologist"),
    "Psychologist": Agent(medical_report, role="Psychologist"),
}

def get_response(agent_name, agent):
    return agent_name, agent.run()

responses = {}
with ThreadPoolExecutor() as executor:
    # Submit every agent at once; each call is I/O-bound, so threads overlap the waiting.
    futures = {executor.submit(get_response, name, agent): name for name, agent in agents.items()}
    # Collect results in whatever order the agents finish.
    for future in as_completed(futures):
        agent_name, response = future.result()
        responses[agent_name] = response
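
One caveat with the loop above: `future.result()` re-raises any exception from the worker, so a single failed API call would abort the whole run. A minimal sketch of per-agent error capture follows. `StubAgent` and `run_agents` are hypothetical stand-ins, used here so the example runs without an API key.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

class StubAgent:
    """Hypothetical stand-in for Agent; returns text or raises on demand."""
    def __init__(self, reply=None, error=None):
        self.reply = reply
        self.error = error

    def run(self):
        if self.error is not None:
            raise self.error
        return self.reply

def run_agents(agents, max_workers=None):
    """Run every agent concurrently; return (responses, errors) dicts."""
    responses, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(agent.run): name for name, agent in agents.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                responses[name] = future.result()
            except Exception as exc:  # keep going if one agent fails
                errors[name] = str(exc)
    return responses, errors

responses, errors = run_agents({
    "Cardiologist": StubAgent(reply="No acute cardiac findings."),
    "Psychologist": StubAgent(error=RuntimeError("API timeout")),
})
print(responses, errors)
```

Keeping failures in a separate dictionary lets the final step decide whether to proceed with partial input or to retry the missing specialists.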

4. Multidisciplinary Analysis

After gathering responses from all specialists, we create a final agent to synthesize the information:

class MultidisciplinaryTeam(Agent):
    def __init__(self, responses):
        # The team agent has no report of its own; it receives the specialists' responses.
        super().__init__(role="MultidisciplinaryTeam", extra_info=responses)

def run_multidisciplinary_analysis(responses):
    team_agent = MultidisciplinaryTeam(responses)
    return team_agent.run()

final_diagnosis = run_multidisciplinary_analysis(responses)

This agent combines insights to produce a final diagnosis.
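
Because `as_completed` yields results in completion order, the `responses` dictionary can be populated in a different order on each run. Sorting by specialist name before building the combined text keeps the team prompt identical across runs. The helper name `format_specialist_reports` and the sample responses are assumptions for illustration.

```python
def format_specialist_reports(responses):
    """Join specialist responses into one labeled block, sorted by role name."""
    sections = []
    for name in sorted(responses):
        sections.append(f"{name} Report:\n{responses[name].strip()}")
    return "\n\n".join(sections)

# Made-up sample responses, inserted in a non-alphabetical order.
example = {
    "Psychologist": "Signs consistent with anxiety.\n",
    "Cardiologist": "ECG within normal limits.",
}
print(format_specialist_reports(example))
```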

5. Saving the Results

Finally, we save the results to a text file:

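from pathlib import Path

output_path = Path("results/final_diagnosis.txt")
# Create the results directory if it does not exist yet.
output_path.parent.mkdir(parents=True, exist_ok=True)
# Explicit UTF-8 avoids platform-dependent default encodings.
with output_path.open("w", encoding="utf-8") as txt_file:
    txt_file.write(final_diagnosis)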

This ensures the final report is stored for further review.
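
The tech stack mentions JSON for structuring results. The sketch below shows how the individual specialist responses could be stored next to the final diagnosis, so a reviewer can trace each conclusion back to its source. The field names and the `save_results_json` helper are assumptions, not the project's actual output format.

```python
import json
from pathlib import Path

def save_results_json(responses, final_diagnosis, output_path):
    """Write specialist responses and the final diagnosis to one JSON file."""
    output_path = Path(output_path)
    output_path.parent.mkdir(parents=True, exist_ok=True)
    payload = {
        "specialist_responses": responses,
        "final_diagnosis": final_diagnosis,
    }
    with output_path.open("w", encoding="utf-8") as json_file:
        # ensure_ascii=False keeps non-ASCII characters readable in the file.
        json.dump(payload, json_file, indent=2, ensure_ascii=False)
    return output_path
```

The function takes the `responses` dictionary and the `final_diagnosis` string from the earlier steps, plus any target path, and returns the path it wrote to.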

Results & Key Takeaways

Results

Each specialist agent provides domain-specific insights.
The multidisciplinary agent synthesizes all inputs into a comprehensive diagnosis.
The system successfully automates the first-level analysis of medical reports.

Key Takeaways

AI can assist but not replace doctors: This tool provides preliminary insights but should always be reviewed by medical professionals.
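Parallel processing speeds up execution: Each agent spends most of its time waiting on an API call, so running them concurrently brings total latency close to that of the slowest single call rather than the sum of all calls.
Structured prompts improve reliability: Precise, role-specific prompts tend to produce more focused and relevant responses, though the output still needs clinical review.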
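
The concurrency point can be checked without calling any API. The sketch below simulates three agents that each wait 0.2 seconds (an arbitrary stand-in for network latency) and times a sequential loop against a thread pool. `fake_agent_call` is a hypothetical placeholder for a real agent call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_agent_call(name, delay=0.2):
    """Simulate an I/O-bound API call by sleeping."""
    time.sleep(delay)
    return f"{name}: done"

roles = ["Cardiologist", "Psychologist", "Pulmonologist"]

# Sequential baseline: total time is roughly the sum of all delays.
start = time.perf_counter()
sequential_results = [fake_agent_call(role) for role in roles]
sequential_seconds = time.perf_counter() - start

# Thread pool: the waits overlap, so total time approaches one delay.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(roles)) as executor:
    # executor.map preserves input order in its results.
    parallel_results = list(executor.map(fake_agent_call, roles))
parallel_seconds = time.perf_counter() - start

print(f"sequential: {sequential_seconds:.2f}s, concurrent: {parallel_seconds:.2f}s")
```

Real API calls vary in latency and may be rate-limited, so the measured gain in practice can be smaller than in this idealized simulation.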

Future Enhancements

More medical specialties: Expanding to dermatology, nephrology, etc.
Integration with EHR systems: To fetch reports directly from electronic health records.
Fine-tuned AI models: Using custom-trained models for better medical analysis.
UI for doctors: Creating a web interface for easier interaction.

Conclusion

This project demonstrates how AI can enhance medical analysis by simulating input from multiple specialists and generating a multidisciplinary diagnosis. While AI cannot replace human doctors, it can provide valuable insights and assist healthcare professionals in making better-informed decisions.

Want to explore the code? Check out my GitHub repository!
