This project presents a Multi-Modal Framework for glaucoma classification, combining Transfer Learning, Explainable AI (XAI), and advanced large language models (LLMs). By integrating LLAMA 3.2 90B Vision Model for visual analysis and LLAMA 3.1 70B LLM for reasoning and textual explanation, this system enhances both prediction accuracy and interpretability.
- Overview
- Framework Architecture
- Technologies
- Project Structure
- Installation
- Usage
- Results and Visualizations
- Contributing
- Snapshots
The system detects glaucoma from fundus images using Transfer Learning and integrates LIME to provide visual explanations for predictions. To further enhance interpretability, it employs:
- LLAMA 3.2 90B Vision Model: Analyzes LIME-based visual explanations to refine interpretability.
- LLAMA 3.1 70B LLM: Generates detailed textual reasoning, combining model outputs and vision insights.
System Components:
- Frontend: React.js-based interface for image upload and result visualization.
- Backend: Express.js and FastAPI for handling predictions and hosting LLMs.
- Image Analysis: Combines CNN-based classification with LIME and LLAMA models for interpretability.
- Database: MongoDB stores user feedback to refine predictions.
The workflow includes the following steps:
- Image Upload: Users upload fundus images for evaluation.
- Classification: A pre-trained CNN model classifies the images as Normal or Glaucoma.
- LIME Explanation: Highlights critical regions influencing the model’s decision (the classification and LIME steps are sketched in code after this list).
- Vision Model Analysis: The LLAMA 3.2 90B Vision Model refines the explanation by analyzing LIME heatmaps and extracting superpixel importance.
- Reasoning and Explanation: The LLAMA 3.1 70B LLM generates a detailed textual explanation, synthesizing classification results and vision insights.
- Feedback: Users can submit feedback for continuous improvement.
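The classification and LIME steps can be pictured with a short Python sketch. This is illustrative rather than the project's exact code: the checkpoint name `glaucoma_vgg16.h5`, the `["Glaucoma", "Normal"]` class order, the 224×224 input size, and the 1/255 preprocessing are assumptions that should be adjusted to match the model trained in `Backend_Model/Notebooks`.

```python
# Minimal sketch: CNN classification + LIME explanation of a fundus image.
# Assumptions (not taken from the project code): checkpoint path, class order,
# 224x224 input size, and 1/255 preprocessing.
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image as keras_image
from lime import lime_image
from skimage.segmentation import mark_boundaries

model = load_model("glaucoma_vgg16.h5")      # hypothetical checkpoint path
labels = ["Glaucoma", "Normal"]              # hypothetical class order

def predict_fn(batch):
    """LIME calls this with batches of perturbed images; return class probabilities."""
    return model.predict(np.asarray(batch, dtype="float32") / 255.0)

img = keras_image.img_to_array(
    keras_image.load_img("fundus.jpg", target_size=(224, 224))
)
probs = predict_fn(img[np.newaxis, ...])[0]
print("Prediction:", labels[int(np.argmax(probs))], "probabilities:", probs)

# Explain which superpixels pushed the model toward its decision.
explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    img.astype("double"), predict_fn, top_labels=2, hide_color=0, num_samples=1000
)
temp, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)
overlay = mark_boundaries(temp / 255.0, mask)  # LIME overlay ready for display or saving
```

The resulting overlay is what the vision model analyzes in the next step, alongside the raw prediction probabilities.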
- Frontend:
  - React.js for an interactive user interface.
- Backend:
  - Express.js for RESTful APIs.
  - FastAPI for serving machine learning models (a minimal endpoint sketch follows this list).
- AI Models:
  - Transfer Learning: Pre-trained CNN (e.g., VGG16) for glaucoma classification.
  - LIME: For visualizing feature importance.
  - LLAMA 3.2 90B Vision Model: Processes LIME-based visual explanations to refine interpretability.
  - LLAMA 3.1 70B LLM: Produces natural language explanations based on model predictions and visual insights.
- Database:
  - MongoDB stores user feedback and metadata for continuous system improvement.
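To show how FastAPI serves the classifier, here is a minimal endpoint sketch. It is illustrative only: the route name, response fields, checkpoint path, and class order are assumptions, not the project's actual API.

```python
# Minimal sketch of a FastAPI endpoint serving the glaucoma classifier.
# Illustrative only: route name, response fields, checkpoint path, and class
# order are assumptions. Requires python-multipart for file uploads.
import io

import numpy as np
from fastapi import FastAPI, File, UploadFile
from PIL import Image
from tensorflow.keras.models import load_model

app = FastAPI()
model = load_model("glaucoma_vgg16.h5")      # hypothetical checkpoint path
labels = ["Glaucoma", "Normal"]              # hypothetical class order

@app.post("/predict")
async def predict(file: UploadFile = File(...)):
    # Read the uploaded fundus image and resize it to the CNN's input size.
    raw = await file.read()
    img = Image.open(io.BytesIO(raw)).convert("RGB").resize((224, 224))
    batch = np.asarray(img, dtype="float32")[np.newaxis, ...] / 255.0

    probs = model.predict(batch)[0]
    return {
        "label": labels[int(np.argmax(probs))],
        "confidence": float(np.max(probs)),
    }
```

Saved as `main.py`, this would be started with the same `uvicorn main:app --reload` command used in the installation steps below.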
Here’s how the project directories are organized:
Project/
│
├── Backend_Model/ # Backend for the machine learning model and API
│ ├── model/ # Model-related files
│ ├── Notebooks/ # Model Training Notebooks
│ ├── predictenv/ # Environment setup for model predictions
│ ├── server/ # Fast API backend for model deployment
│
│
├── Glaucoma-Detection-using-Transfer-Learning/ # Main application
│ ├── node_modules/ # Node.js dependencies
│ ├── public/ # Static files for frontend
│ ├── server/ # Node.js and Express.js backend configuration
│ ├── server.js # Express.js API server entry point
│ ├── src/ # React.js source code (frontend)
│ ├── .env # Environment variables for configuration
│ ├── package.json # NPM package configuration
│ ├── vite.config.js # Vite configuration for frontend
│
├── Test Images/ # Test images for model validation
│ ├── Glaucoma/ # Images of glaucoma-affected eyes
│ └── Normal/ # Images of normal eyes
│
├── .gitignore # Files to be ignored by git
└── README.md # Project documentation
git clone https://github.com/Nehal04052/Explainable-AI-based-Glaucoma-Detection-using-Transfer-Learning-and-LIME.git
cd Explainable-AI-based-Glaucoma-Detection-using-Transfer-Learning-and-LIME
Navigate to the directory with the Python model environment:
cd Backend_Model/predictenv
pip install -r requirements.txt
Navigate to the Glaucoma-Detection-using-Transfer-Learning directory and install the frontend dependencies:
cd ../../Glaucoma-Detection-using-Transfer-Learning
npm install
- Install and start MongoDB locally, or configure a remote MongoDB instance.
Create a `.env` file in both the `Backend_Model/server` and `Glaucoma-Detection-using-Transfer-Learning` directories. Here’s an example `.env` file for the Express.js backend:
MONGO_URI=mongodb://localhost:27017/glaucomaDetection
PORT=5000
Ensure MongoDB is running on your machine.
Start the Express.js API server:
cd ../../Glaucoma-Detection-using-Transfer-Learning
node server.js
In a separate terminal, start the FastAPI model server:
cd Backend_Model/server/
uvicorn main:app --reload
Start the React.js frontend:
cd ../../Glaucoma-Detection-using-Transfer-Learning
npm run dev
Once running, you can access the application in your browser at http://localhost:3000.
- Classification Result: Identifies the input image as Normal or Glaucoma.
- LIME Heatmaps: Highlights important regions in the fundus image.
- LLAMA-Enhanced Explanation:
  - Visual Analysis (LLAMA 3.2 90B): Refines LIME outputs by identifying key superpixels and their relevance.
  - Reasoning (LLAMA 3.1 70B): Provides detailed natural language explanations, offering clinical insights (a hypothetical integration sketch follows this list).
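The reasoning step can be pictured as a single chat-completion call that combines the classifier's output with a summary of the LIME superpixels. The sketch below is hypothetical: the endpoint URL, API key, model identifier, and prompt wording are placeholders (it assumes the LLAMA 3.1 70B model is exposed through an OpenAI-compatible chat API), not the project's actual integration.

```python
# Hypothetical sketch: asking the reasoning LLM for a clinical-style explanation.
# The endpoint URL, API key, model name, and prompt wording are placeholders.
import requests

def explain_prediction(label, confidence, superpixel_summary,
                       api_url="https://api.example.com/v1/chat/completions",
                       api_key="YOUR_API_KEY"):
    prompt = (
        f"A CNN classified a fundus image as {label} with confidence {confidence:.2f}. "
        f"LIME highlighted these regions as most influential: {superpixel_summary}. "
        "Explain the result in plain language suitable for a clinician."
    )
    resp = requests.post(
        api_url,
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": "llama-3.1-70b",            # placeholder model identifier
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    # OpenAI-compatible responses place the generated text here.
    return resp.json()["choices"][0]["message"]["content"]
```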
- Upload a test image (e.g., from the `Test Images` folder).
- Receive the classification result and LIME-based visual heatmap.
- View detailed reasoning generated by the LLAMA models.
- Optionally submit feedback to improve the system.
We welcome contributions! Follow these steps to contribute:
- Fork the repository.
- Create a new branch (`git checkout -b feature/YourFeature`).
- Commit your changes (`git commit -m 'Add your feature'`).
- Push to the branch (`git push origin feature/YourFeature`).
- Open a Pull Request.
Here are some snapshots of the system in action: