Dataset

Dataset Name: RSNA Intracranial Hemorrhage Dataset
Source: Kaggle (Radiological Society of North America)
Sample Size: More than 25,000 head CT scans
Format: DICOM (.dcm) files
Labels: Multi-label annotations for 5 ICH subtypes + Any ICH

The RSNA Intracranial Hemorrhage dataset is a large-scale, expertly annotated collection of brain CT scans specifically curated for developing and evaluating automated ICH detection algorithms. Each scan has been reviewed and labeled by board-certified radiologists, ensuring high-quality ground truth annotations.

Image Preprocessing Pipeline

Medical CT scans require specialized preprocessing to enhance hemorrhage visibility and prepare data for deep learning models.

1. DICOM Input

The system accepts DICOM (.dcm) files, the standard format for medical imaging. Each DICOM file contains both image data and metadata (patient information, scan parameters).

2. Hounsfield Unit Conversion

Raw pixel values are converted to Hounsfield Units (HU), a standardized scale for measuring radiodensity. This normalization ensures consistency across different scanners and imaging protocols.

HU = pixel_value × RescaleSlope + RescaleIntercept
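This conversion can be expressed directly from the DICOM rescale tags. The sketch below assumes NumPy, with the pixel array and tags read beforehand (for example via pydicom's `dcmread`); the function name is illustrative:

```python
import numpy as np

def to_hounsfield(pixel_array, slope, intercept):
    """Apply the DICOM rescale: HU = pixel_value * RescaleSlope + RescaleIntercept."""
    return pixel_array.astype(np.float32) * slope + intercept

# e.g. with pydicom (assumed available):
#   ds = pydicom.dcmread(path)
#   hu = to_hounsfield(ds.pixel_array, float(ds.RescaleSlope), float(ds.RescaleIntercept))
```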
3. Three-Channel Windowing

Three different HU windowing settings are applied to emphasize different tissue types, creating a pseudo-RGB representation:

  • Brain Window (WL: 40 HU, WW: 80 HU): Optimizes visualization of brain parenchyma and gray-white matter differentiation
  • Blood Window (WL: 75 HU, WW: 215 HU): Enhances detection of acute hemorrhage and blood products
  • Bone Window (WL: 600 HU, WW: 2800 HU): Shows skull structures and helps identify fractures

4. RGB Tensor Formation

The three windowed images are stacked to create a 3-channel RGB-like tensor:

  • Red Channel: Blood window values
  • Green Channel: Brain window values
  • Blue Channel: Bone window values

This multi-window approach provides the neural network with comprehensive information about different tissue densities simultaneously, improving hemorrhage detection across all subtypes.
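The windowing and stacking steps can be sketched together in NumPy, using the window level/width settings listed above (function names are illustrative, not the project's actual code):

```python
import numpy as np

def window(hu, level, width):
    """Clip HU values to [level - width/2, level + width/2] and scale to [0, 1]."""
    lo, hi = level - width / 2, level + width / 2
    return (np.clip(hu, lo, hi) - lo) / (hi - lo)

def to_rgb_tensor(hu):
    """Stack blood/brain/bone windows into an RGB-like (H, W, 3) tensor."""
    return np.stack([window(hu, 75, 215),    # R: blood window
                     window(hu, 40, 80),     # G: brain window
                     window(hu, 600, 2800)], # B: bone window
                    axis=-1)
```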

5. Normalization and Resizing

Images are normalized to [0, 1] range and resized to 256×256 pixels for efficient neural network processing while maintaining diagnostic features.
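A minimal sketch of the resizing step, using nearest-neighbour indexing in NumPy purely for illustration; real pipelines typically use bilinear interpolation via a library such as OpenCV or TensorFlow:

```python
import numpy as np

def resize_nearest(img, size=256):
    """Nearest-neighbour resize of an (H, W, C) array to (size, size, C)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]
```

Note that the windowing step already maps each channel to [0, 1], so no separate normalization is needed after windowed channels are stacked.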

🎯 Purpose of Multi-Window Preprocessing

Different ICH subtypes have varying radiodensities and locations. By using three complementary windows, the system can detect hemorrhages regardless of their density or proximity to bone, significantly improving sensitivity and specificity compared to single-window approaches.

Model Architecture

Cascade EfficientNet-V2-ConvNeXt

Hybrid Deep Learning Architecture for Multi-Label ICH Classification

Working Principle

The system employs a novel cascade architecture that combines the strengths of two state-of-the-art convolutional neural networks in a sequential refinement process.

Front-End Network: EfficientNet-V2 (Local Features)

Role: Initial feature extraction and local pattern recognition

Key Characteristics:
  • Focuses on fine-grained and local features
  • Efficient compound scaling of network depth, width, and resolution
  • Optimized training speed with Fused-MBConv blocks
  • Excels at detecting small hemorrhages and subtle density changes
  • Progressive feature resolution for multi-scale detection

Why EfficientNet-V2?

EfficientNet-V2 provides superior parameter efficiency while maintaining high accuracy. Its compound scaling approach ensures optimal resource utilization, critical for medical imaging applications requiring detailed feature extraction.

Cascade Feature Fusion

Back-End Network: ConvNeXt (Global Context)

Role: Global spatial context understanding and feature refinement

Key Characteristics:
  • Enhances global spatial relationships between features
  • Modern convolution design with large kernel sizes (7×7)
  • Layer normalization and GELU activation for stable training
  • Captures anatomical relationships between brain regions
  • Integrates information across the entire image field

Why ConvNeXt?

ConvNeXt modernizes standard CNNs with Vision Transformer-inspired improvements while maintaining the efficiency and interpretability of convolutional architectures. Its ability to model long-range dependencies is crucial for understanding ICH spatial distribution patterns.

Cascade Mechanism

The cascade architecture enables progressive feature refinement through a two-stage process:

  1. Stage 1 (EfficientNet-V2): Extracts hierarchical local features at multiple scales, identifying candidate hemorrhage regions and low-level patterns
  2. Feature Fusion: Selected feature maps from EfficientNet-V2 are forwarded to ConvNeXt through skip connections, preserving fine-grained details
  3. Stage 2 (ConvNeXt): Refines features with global context awareness, distinguishing true hemorrhages from artifacts and integrating spatial relationships
  4. Classification Head: Multi-label classifier produces confidence scores for each of the six ICH categories
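The two-stage cascade can be sketched in Keras (the project's stated training framework). The choice of EfficientNetV2B0 as the stage-1 backbone, the fusion width of 256, and the number of ConvNeXt-style blocks are illustrative assumptions, not the authors' exact configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers

def convnext_block(x, dim):
    """ConvNeXt-style block: 7x7 depthwise conv, LayerNorm, GELU MLP, residual add."""
    shortcut = x
    x = layers.DepthwiseConv2D(7, padding="same")(x)
    x = layers.LayerNormalization(epsilon=1e-6)(x)
    x = layers.Conv2D(4 * dim, 1, activation="gelu")(x)  # pointwise expansion
    x = layers.Conv2D(dim, 1)(x)                         # pointwise projection
    return layers.Add()([shortcut, x])

def build_cascade(input_shape=(256, 256, 3), num_classes=6):
    inputs = layers.Input(shape=input_shape)
    # Stage 1: EfficientNet-V2 backbone for local feature extraction (untrained here)
    backbone = tf.keras.applications.EfficientNetV2B0(
        include_top=False, weights=None, input_tensor=inputs)
    feats = backbone.output                  # feature map, e.g. (8, 8, 1280)
    # Feature fusion: 1x1 conv projects stage-1 features for stage 2
    fused = layers.Conv2D(256, 1)(feats)
    # Stage 2: ConvNeXt-style refinement with large 7x7 receptive fields
    refined = convnext_block(fused, 256)
    refined = convnext_block(refined, 256)
    # Multi-label head: one independent sigmoid per ICH class
    pooled = layers.GlobalAveragePooling2D()(refined)
    outputs = layers.Dense(num_classes, activation="sigmoid")(pooled)
    return tf.keras.Model(inputs, outputs)
```

Because each class gets its own sigmoid, the head naturally supports co-occurring hemorrhage subtypes, matching the multi-label annotation scheme.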

📐 Cascade Architecture Diagram: Input (256×256×3) → EfficientNet-V2 → Feature Fusion → ConvNeXt → 6 ICH Classes

Advantages of Hybrid Cascade Architecture

  • 🎯 Improved Accuracy: Combines local and global features for comprehensive hemorrhage detection across all sizes and locations
  • ⚖️ Balanced Sensitivity-Specificity: Two-stage refinement reduces false positives while maintaining high sensitivity for small hemorrhages
  • 🔄 Feature Reusability: Cascade connections enable efficient feature sharing and gradient flow, improving training stability
  • 📊 Multi-Label Capability: Architecture naturally handles multiple concurrent hemorrhage types, reflecting clinical reality

Multi-Model Validation Approach

The system employs three independently trained models to provide robust predictions and reduce uncertainty.

  • Model A (Primary): Trained with standard augmentation and balanced sampling
  • Model B (Secondary): Trained with different initialization and augmentation strategies
  • Model C (Validation): Trained with emphasis on challenging cases and edge scenarios

Benefits of Multi-Model Approach:

  • Reduces model-specific biases and overfitting
  • Provides confidence intervals through prediction variance
  • Enables cross-validation without requiring separate test set
  • Improves reliability in clinical decision support scenarios
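The source does not specify how the three models' outputs are combined; a common choice, sketched below with hypothetical function names, is to average per-class probabilities and report the prediction variance as the uncertainty signal:

```python
import numpy as np

def ensemble_predict(probs_a, probs_b, probs_c):
    """Average per-class sigmoid outputs from the three models and return
    the across-model variance as a simple uncertainty estimate."""
    stacked = np.stack([probs_a, probs_b, probs_c])  # shape: (3, num_classes)
    return stacked.mean(axis=0), stacked.var(axis=0)
```

High variance on a class flags disagreement between the models, which can be surfaced to the clinician as low confidence.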

Technical Specifications

  • Input Resolution: 256 × 256 × 3 (RGB)
  • Output Classes: 6 (multi-label binary)
  • Training Framework: TensorFlow / Keras
  • Loss Function: Binary Cross-Entropy
  • Optimizer: Adam with learning rate scheduling
  • Batch Size: 32 (training)
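The binary cross-entropy loss listed above treats each of the six labels as an independent binary problem, averaging a per-class term over all labels. A minimal NumPy illustration (not the training code itself, which would use the framework's built-in loss):

```python
import numpy as np

def multilabel_bce(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over all labels in a multi-label setting."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1 - y_true) * np.log(1 - y_pred))))
```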