Chaoyi Wu

I am a PhD student working on medical image analysis and machine learning at Shanghai Jiao Tong University.

My current research interest is in multimodal learning for medical image analysis.

Email  /  Scholar  /  Github

Research
Towards Building Multilingual Language Model for Medicine
Pengcheng Qiu*, Chaoyi Wu*, Xiaoman Zhang, Weixiong Lin, Haicheng Wang, Ya Zhang, Yanfeng Wang, Weidi Xie
Technical Report, 2024.

In this paper, we aim to develop a multilingual corpus (MMedC), a benchmark (MMedBench), and an open-source multilingual language model (MMedLM) for medicine that benefit a wider, linguistically diverse audience across different regions.

One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts
Ziheng Zhao, Yao Zhang, Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
Technical Report, 2024.

In this paper, we build SAT, a universal medical segmentation model driven by text prompts.

Large-scale Long-tailed Disease Diagnosis on Radiology Images
Qiaoyu Zheng*, Weike Zhao*, Chaoyi Wu*, Xiaoman Zhang*, Ya Zhang, Yanfeng Wang, Weidi Xie
Technical Report, 2024.

In this paper, we collect a large-scale, multi-modal, multi-scan, long-tailed, multi-label diagnosis (classification) dataset. We further propose a vision encoder together with a fusion module that accepts an arbitrary number of scans per case. In evaluation, our method achieves better results on our benchmark and can also serve as a pre-trained model for external datasets.

Can GPT-4V(ision) Serve Medical Applications? Case Studies on GPT-4V for Multimodal Medical Diagnosis
Chaoyi Wu*, Jiayu Lei*, Qiaoyu Zheng*, Weike Zhao*, Weixiong Lin*, Xiaoman Zhang*, Xiao Zhou*, Ziheng Zhao*, Yanfeng Wang, Ya Zhang, Weidi Xie
Technical Report, 2023.

We evaluate GPT-4V on 92 radiographic cases, 20 pathology cases, and 16 location cases across 17 medical systems covering 8 imaging modalities. In general, as the cases show, GPT-4V is still far from clinical use.

UniBrain: Universal Brain MRI Diagnosis with Hierarchical Knowledge-enhanced Pre-training
Jiayu Lei, Lisong Dai, Haoyun Jiang, Chaoyi Wu, Xiaoman Zhang, Yao Zhang, Jiangchao Yao, Weidi Xie, Yanyong Zhang, Yuehua Li, Ya Zhang, Yanfeng Wang
Technical Report, 2023.

We release a knowledge-enhanced pre-trained foundation model for brain MRI that leverages image-report pairs and enables zero-shot diagnosis of unseen brain diseases.

Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data
Chaoyi Wu*, Xiaoman Zhang*, Yanfeng Wang, Ya Zhang, Weidi Xie
Technical Report, 2023.

In this study, we aim to initiate the development of a Radiology Foundation Model, termed RadFM. We construct a large-scale medical multi-modal dataset, MedMD, consisting of 16M 2D and 3D medical scans.

PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering
Xiaoman Zhang*, Chaoyi Wu*, Weixiong Lin, Ziheng Zhao, Yanfeng Wang, Ya Zhang, Weidi Xie
Technical Report, 2023.

In this paper, we focus on the problem of Medical Visual Question Answering (MedVQA). We propose a generative medical VQA model, MedVInT, together with a large-scale MedVQA dataset, PMC-VQA.

PMC-LLaMA: Towards Building Open-source Language Models for Medicine
Chaoyi Wu, Xiaoman Zhang, Yanfeng Wang, Ya Zhang, Weidi Xie
Technical Report, 2023.

In this report, we introduce PMC-LLaMA, an open-source language model trained on a large medical corpus that surpasses ChatGPT on medical QA benchmarks.

PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents
Weixiong Lin, Ziheng Zhao, Xiaoman Zhang, Chaoyi Wu, Yanfeng Wang, Ya Zhang, Weidi Xie
International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2023.

We build PMC-OA, a biomedical dataset of 1.6M image-caption pairs collected from PubMed Central's Open Access subset.

Knowledge-enhanced Pre-training for Auto-diagnosis of Chest Radiology Images
Xiaoman Zhang, Chaoyi Wu, Yanfeng Wang, Ya Zhang, Weidi Xie
Nature Communications, 2023.

Here, we propose a knowledge-enhanced vision-language pre-training approach for auto-diagnosis on chest X-ray images. The approach first trains a knowledge encoder on an existing medical knowledge graph, then uses the pre-trained knowledge encoder to guide visual representation learning.

K-Diag: Knowledge-enhanced Disease Diagnosis in Radiographic Imaging
Chaoyi Wu*, Xiaoman Zhang*, Yanfeng Wang, Ya Zhang, Weidi Xie
MICCAI-BTSD (workshop), 2023, Oral.

In this paper, we consider the problem of disease diagnosis. Unlike the conventional learning paradigm that treats labels independently, we propose a knowledge-enhanced framework that trains visual representations with the guidance of medical domain knowledge.

MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training
Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
International Conference on Computer Vision (ICCV), 2023.

We propose enhancing language-image pre-training with medical-specific knowledge, significantly advancing the ability of pre-trained models to handle unseen diseases on zero-shot classification and grounding tasks.

Boundary-Enhanced Self-supervised Learning for Brain Structure Segmentation
Feng Chang, Chaoyi Wu, Yanfeng Wang, Ya Zhang, Xin Chen, Qi Tian
International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2022.

We propose Boundary-Enhanced Self-Supervised Learning (BE-SSL), which leverages supervoxel segmentation and registration as two related proxy tasks to enhance brain structure segmentation.

Integrating features from lymph node stations for metastatic lymph node detection
Chaoyi Wu, Feng Chang, Xiao Su, Zhihan Wu, Yanfeng Wang, Ling Zhu, Ya Zhang
Computerized Medical Imaging and Graphics (CMIG), 2022, 101: 102108.

We first leverage the information of lymph node (LN) stations for metastatic LN detection. Metastatic LN station classification is proposed as a proxy task for metastatic LN detection. A GCN-based structure is adopted to model the mutual influence among LN stations.
