Model Card for Digepath

Digepath is a self-supervised foundation model for intelligent gastrointestinal pathology image analysis. arXiv preprint: https://arxiv.org/abs/2505.21928

The model is a Vision Transformer Large/16 pretrained with DINOv2 [1] self-supervised learning on 353 million multi-scale images from 210,043 H&E-stained gastrointestinal-related slides.

Introduction of Digepath

Gastrointestinal (GI) diseases represent a clinically significant burden, necessitating precise diagnostic approaches to optimize patient outcomes. Conventional histopathological diagnosis suffers from limited reproducibility and diagnostic variability. To overcome these limitations, we develop Digepath, a specialized foundation model for GI pathology. Our framework introduces a dual-phase iterative optimization strategy combining pretraining with fine-screening, specifically designed to address the detection of sparsely distributed lesion areas in whole-slide images. Digepath was initially pretrained on a large-scale dataset comprising over 353 million multi-scale images derived from 210,043 H&E-stained slides of GI diseases. It was subsequently fine-tuned on 471,443 carefully selected regions of interest (ROIs) in the second stage. It attains state-of-the-art performance on 32 out of 33 tasks related to GI pathology, including pathological diagnosis, protein expression status prediction, gene mutation prediction, and prognosis evaluation. Digepath demonstrates broad applicability across diverse clinical tasks, highlighting its potential for reliable deployment in real-world pathology workflows.

Using Digepath to extract features from gastrointestinal pathology images

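import timm
import torch
import torchvision.transforms as transforms

# Load the pretrained weights from the Hugging Face Hub.
model = timm.create_model('hf_hub:xtxx/Digepath', pretrained=True, init_values=1e-5, dynamic_img_size=True)

# Preprocessing for PIL input images (ImageNet mean/std normalization).
preprocess = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

# Use the GPU when one is available, otherwise fall back to the CPU.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = model.to(device)
model.eval()

# Random tensor standing in for a preprocessed image batch.
images = torch.randn(1, 3, 224, 224, device=device)

with torch.no_grad():
    features = model(images)  # [1, 1024]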

Training Pipeline

Self-Supervised Learning: https://github.com/facebookresearch/dinov2

Evaluation Pipeline

WSI Classification: https://github.com/lingxitong/MIL_BASELINE
ROI Classification: https://github.com/lingxitong/HistoROIBench
ROI Segmentation: https://github.com/lingxitong/PFM_Segmentation

Citation

If Digepath is helpful to you, please cite our work.

@article{zhu2025subspecialty,
  title={Subspecialty-specific foundation model for intelligent gastrointestinal pathology},
  author={Zhu, Lianghui and Ling, Xitong and Ouyang, Minxi and Liu, Xiaoping and Guan, Tian and Fu, Mingxi and Cheng, Zhiqiang and Fu, Fanglei and Zeng, Maomao and Liu, Liming and others},
  journal={arXiv preprint arXiv:2505.21928},
  year={2025}
}

References

[1] Oquab, Maxime, et al. "DINOv2: Learning robust visual features without supervision." arXiv preprint arXiv:2304.07193 (2023).
