ISSN# 1545-4428 | Published date: 19 April, 2024
At-A-Glance Session Detail
   
AI/ML: Vision Transformers in MRI
Digital Poster
AI & Machine Learning
Monday, 06 May 2024
Exhibition Hall (Hall 403)
14:45 - 15:45
Session Number: D-160
No CME/CE Credit

Computer #
1949.
81. Categorizing liver stiffness in children and adults through deep learning and multiparametric MRI with segmented liver and spleen data
Jonathan R. Dillman1, Redha Ali1, Hailong Li1, Huixian Zhang1, Wen Pan2, Scott B. Reeder3, David T. Harris4, William Masch5, Anum Alsam5, Krishna Shanbhogue6, Anas Bernieh7, Sarangarajan Ranganathan7, Nehal A. Parikh7, and Lili He1
1Department of Radiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States, 2Department of Radiology, Cincinnati Children's Hospital Medical Center, 45429, OH, United States, 3Department of Radiology, University of Wisconsin-Madison, Madison, WI, United States, 4University of Wisconsin-Madison, Madison, WI, United States, 5Michigan Medicine, University of Michigan, Ann Arbor, MI, United States, 6New York University Langone Health, New York, NY, United States, 7Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States

Keywords: Diagnosis/Prediction, Machine Learning/Artificial Intelligence, Liver, Elastography

Motivation: To address the limited accessibility of Magnetic Resonance Elastography (MRE) for liver stiffness assessment.

Goal(s): To develop an AI-based pipeline for categorizing subjects into no/mild (<3 kPa) and moderate/severe (≥3 kPa) liver stiffening using multiparametric MRI.

Approach: Our model contains two main components: segmentation and classification. We employed a Swin-UNETR model to segment liver and spleen tissue from multiparametric MR images. We then developed a Swin Transformer-based model for liver stiffness stratification and used multi-site ten-fold cross-validation to evaluate our models’ performance.
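
A minimal sketch of such a two-stage pipeline (segmentation, then classification on the masked organs), assuming MONAI and PyTorch; the channel counts, input size, and the plain ViT standing in for the authors' Swin classifier are illustrative assumptions, not the published configuration:

    import torch
    from monai.networks.nets import SwinUNETR, ViT

    # Stage 1: segment liver (label 1) and spleen (label 2) from multiparametric MRI.
    seg_model = SwinUNETR(img_size=(96, 96, 96), in_channels=4,   # e.g. 4 MRI contrasts
                          out_channels=3, feature_size=48)        # bg / liver / spleen

    # Stage 2: classify stiffness (<3 kPa vs >=3 kPa) from the masked organs.
    # A plain MONAI ViT stands in here for the authors' Swin Transformer classifier.
    cls_model = ViT(in_channels=4, img_size=(96, 96, 96), patch_size=(16, 16, 16),
                    classification=True, num_classes=2)

    x = torch.randn(1, 4, 96, 96, 96)                  # batch of multiparametric volumes
    labels = seg_model(x).argmax(dim=1, keepdim=True)  # predicted organ label map
    masked = x * (labels > 0)                          # keep liver and spleen voxels only
    logits, _ = cls_model(masked)                      # ViT also returns hidden states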

Results: Our best model achieved an area under the receiver operating characteristic curve (AUROC) of 0.84 for liver stiffness categorization.

Impact: Offering an accessible and accurate method for liver stiffness categorization, our research may enhance patient care, reduce healthcare costs, and expand the availability of this vital diagnostic tool, benefiting clinicians, researchers, and ultimately patients with liver disease worldwide.

1950.
82. Self-Supervised Pre-Training Based Hybrid Network for Deep Gray Matter Nuclei Segmentation
Lijun Bao1 and Yang Deng2
1Department of Electronic Science, Xiamen University, Xiamen, China, 2Institute of Artificial Intelligence, Xiamen University, Xiamen, China

Keywords: Analysis/Processing, Segmentation

Motivation: Vision Transformers (ViTs) have the potential to outperform convolutional neural networks (CNNs) in deep gray matter nuclei segmentation. However, Transformer-based models require large labeled datasets for training.

Goal(s): Our goal is to design a Transformer-based model and alleviate the model's dependence on labeled data.

Approach: We present a CNN-Transformer hybrid network (CTNet) for deep gray matter nuclei segmentation. Moreover, we propose a novel self-supervised pre-training approach that combines rotation prediction and masked feature reconstruction to pre-train CTNet.
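
The two pretext tasks named here can be sketched as follows; the toy encoder, masking ratio, and equal loss weighting are assumptions for illustration only:

    import torch
    import torch.nn as nn

    encoder = nn.Sequential(nn.Conv3d(1, 32, 3, padding=1), nn.ReLU(),
                            nn.AdaptiveAvgPool3d(4), nn.Flatten())   # toy stand-in
    feat_dim = 32 * 4 * 4 * 4
    rot_head = nn.Linear(feat_dim, 4)         # predict 0/90/180/270-degree rotation
    rec_head = nn.Linear(feat_dim, feat_dim)  # reconstruct features of the intact input

    x = torch.randn(8, 1, 32, 32, 32)         # unlabeled volumes
    k = torch.randint(0, 4, (8,))             # random rotation class per volume
    x_rot = torch.stack([torch.rot90(v, int(r), dims=(-2, -1)) for v, r in zip(x, k)])

    mask = (torch.rand_like(x) > 0.6).float() # random voxel mask, keep ~40%
    with torch.no_grad():
        target = encoder(x)                   # feature targets from the intact input
    loss = (nn.functional.cross_entropy(rot_head(encoder(x_rot)), k)
            + nn.functional.mse_loss(rec_head(encoder(x * mask)), target))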

Results: Our method outperforms comparison models on human brain MRI datasets.

Impact: This is the first study to apply self-supervised learning to deep gray matter nuclei segmentation. Our method achieves outstanding segmentation performance and can assist clinicians in the diagnosis and treatment of neurodegenerative diseases.

1951.
83. Longitudinal oncology lesion tracking using self-supervised vision transformers
Deepa Anand1, Gurunath Reddy M1, Dattesh D Shanbhag1, Sudhanya Chatterjee1, Aanchal Mongia1, Uday Patil1, and Rakesh Mullick1
1GE HealthCare, Bengaluru, India

Keywords: Diagnosis/Prediction, Machine Learning/Artificial Intelligence, Foundation Model, Lesion Tracking, Longitudinal data

Motivation: Automate lesion delineation across longitudinal time points to improve throughput and accuracy, reduce fatigue, and determine disease velocity.

Goal(s): A method in which the user identifies a lesion on one scan, and phenotypically similar lesions are automatically labeled on the other scans.

Approach: We use vision foundation model (DINO V2) features to localize and segment regions in new test data that match a template mask region, yielding segmentations of similar lesions.
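
A hedged sketch of this template-to-test patch-feature matching using the public DINOv2 release; the 224-pixel input, the random placeholders for grayscale MR slices replicated to three channels, and the 0.6 similarity threshold are all assumptions:

    import torch
    import torch.nn.functional as F

    dino = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14').eval()

    def patch_feats(img):                      # img: [1, 3, 224, 224]; 224 = 16 * 14
        with torch.no_grad():
            return F.normalize(dino.forward_features(img)['x_norm_patchtokens'], dim=-1)

    template, test = torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224)
    mask = torch.zeros(16, 16, dtype=torch.bool)
    mask[6:10, 6:10] = True                               # template lesion region

    t_feats = patch_feats(template)[0][mask.flatten()]    # lesion patch features
    s_feats = patch_feats(test)[0]                        # all test-scan patch features
    sim = (s_feats @ t_feats.T).max(dim=1).values         # best template match per patch
    lesion_patches = (sim > 0.6).reshape(16, 16)          # coarse lesion localization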

Results: The method segments lesions reasonably well on serial MRI scans of oncology patients across various MRI protocols, orientations, and contrasts. For the Feret diameter metric, the mean difference (95% CI) was -3.5 mm (-7.6 to 0.7 mm).

Impact: Ability to automatically delineate phenotypically similar lesions on serial imaging data with user interaction at the first time point only. The methodology generalizes across imaging orientations and contrasts, without the need for extensive data labeling or geometric synchronization of serial scans.

1952.
84. Breast tumor segmentation network based on local attention
Binze Han1,2, Long Yang2, Heng Zhang2,3, Zhou Liu4, Meng Wang4, Ya Ren4, Qian Yang4, Wei Cui5, Ye Li2,6,7, Dong Liang2,6,7, Xin Liu2,6,7, Hairong Zheng2,6,7, and Na Zhang2,6,7
1Southern University of Science and Technology (SUSTech), Shenzhen, China, 2Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China, 3Faculty of Robot Science and Engineering, Northeastern University, Shenyang, China, 4National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital & Shenzhen Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Shenzhen, China, 5MR Research, GE Healthcare, Beijing, China, 6Key Laboratory of Biomedical Imaging Science and System, Chinese Academy of Sciences, Shenzhen, China, 7United Imaging Research Institute of Innovative Medical Equipment, Shenzhen, China

Keywords: Analysis/Processing, Cancer, Breast

Motivation: Accurate segmentation of the lesion region is the first step toward early diagnosis. Transformers offer very competitive performance, but at extremely high computational complexity.

Goal(s): To find an efficient, computationally inexpensive way to apply transformers to medical image segmentation, which remains a major challenge.

Approach: We adopt a shifted local self-attention method to extract features, reducing computational complexity while achieving very high segmentation accuracy.
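
A sketch of the core shifted local self-attention operation (Swin-style cyclic shift plus windowed attention); the window size and channel width are illustrative, and the boundary masking used in full Swin blocks is omitted for brevity:

    import torch
    import torch.nn as nn

    win, dim = 8, 96
    attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def shifted_window_attention(x, shift):                        # x: [B, H, W, C]
        B, H, W, C = x.shape
        x = torch.roll(x, shifts=(-shift, -shift), dims=(1, 2))    # cyclic shift
        x = x.view(B, H // win, win, W // win, win, C)             # partition windows
        x = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, win * win, C)  # [B*nW, win*win, C]
        x, _ = attn(x, x, x)                                       # attention stays local
        x = x.view(B, H // win, W // win, win, win, C)
        x = x.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)        # merge windows
        return torch.roll(x, shifts=(shift, shift), dims=(1, 2))   # undo the shift

    y = shifted_window_attention(torch.randn(2, 64, 64, dim), shift=win // 2)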

Results: Experimental results on a dataset comprising 130 breast tumor cases demonstrate that the proposed network accurately segments breast tumors, surpassing the accuracy of many other convolution-based or transformer-based networks.

Impact: This study may inspire scientists to create simpler, efficient components for reduced self-attention computational cost while preserving long-range modeling. The achievement in high-precision segmentation can ease clinicians' workload by reducing image annotation.

1953.
85. Intracranial Large Vessel Severe Stenosis and Occlusion Detection Based on Vision Transformer Fusion Model from Multi-parametric MRI
Mengzhou Sun1, Xiaoyun Liang2, Jing Zhang2, Di Wu3, and Wenzhen Zhu3
1Institute of Research and Clinical Innovations, Neusoft Medical Systems Co., Ltd, Beijing, China, 2Institute of Research and Clinical Innovations, Neusoft Medical Systems Co., Ltd, Shanghai, China, 3Radiology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China

Keywords: Diagnosis/Prediction, Stroke

Motivation: Traditionally, large vessel severe stenosis and occlusion (LVSSO) detection based on CTA requires contrast agent exposure. It is important to develop an LVSSO detection approach using contrast-agent-free MR images that can achieve results comparable to those of clinicians.

Goal(s): To develop a new fusion algorithm that achieves accuracy comparable to clinical diagnostic levels.

Approach: A new fusion model based on a vision transformer was developed. A total of 380 patients were enrolled in the current study.

Results: The proposed model achieved an AUC of 0.963 and an accuracy of 94.7%.

Impact: The proposed model achieved satisfactory accuracy for LVSSO detection (94.7%), indicating that its performance has reached the clinical diagnosis level.

1954.
86. Cerebral Artery Segmentation with Limited Data: Using Hierarchical Transformers
Song Tian1, Sicong Huang1, Zhixuan Song1, Jingyu Xie1, Shanshan Jiang1, Tianwei Zhang2, Wenjing Zhang2, and Su Lui2
1CTS, Philips Healthcare, Beijing, China, 2Department of Radiology, and Functional and Molecular Imaging Key Laboratory of Sichuan Province, West China Hospital of Sichuan University, Chengdu, China

Keywords: Analysis/Processing, Machine Learning/Artificial Intelligence

Motivation: Transformer networks have demonstrated their effectiveness in both large-scale natural language processing and 2D image analysis tasks. However, their potential in 3D medical image analysis, particularly on small training datasets, remains unexplored.

Goal(s): Leveraging transformer-based models to attain highly accurate cerebral artery segmentation in 3D TOF-MRA images.

Approach: We apply SwinUNETR with a hard-example-mining loss function to cerebral artery segmentation on the public CAS2023 dataset.
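
The abstract does not define the loss precisely; one common reading of hard example mining, sketched here under that assumption, is to back-propagate only the top-k highest-loss voxels per volume:

    import torch
    import torch.nn.functional as F

    def hard_example_bce(logits, target, keep_frac=0.1):
        # per-voxel loss, averaged over the hardest keep_frac voxels only
        loss = F.binary_cross_entropy_with_logits(logits, target, reduction='none')
        k = max(1, int(keep_frac * loss[0].numel()))
        hardest, _ = loss.flatten(1).topk(k, dim=1)
        return hardest.mean()

    logits = torch.randn(2, 1, 64, 64, 64)              # e.g. SwinUNETR vessel output
    target = (torch.rand_like(logits) > 0.95).float()   # sparse artery voxels
    print(hard_example_bce(logits, target))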

Results: We obtained average Dice scores of 0.844 overall and 0.889 for the stenosis area, normalized Hausdorff distances of 0.888 overall and 0.8444 for the stenosis area, and a weighted combined Dice and Hausdorff score of 0.867.

Impact: We used fewer than 100 cases to train a transformer model for artery segmentation, indicating that transformers have the potential to replace CNNs in the processing of 3D TOF-MRA medical images, even with a small training dataset.

1955.
87. Cross-Attention Mechanism and Vision Transformer-Enhanced Multimodal Medical Imaging Fusion for Nasopharyngeal Carcinoma Segmentation
Xingyu Xie1, Wenjie Zhao1, Si Tang2, Zhenxing Huang1, Yingying Hu2, Wei Fan2, Yongfeng Yang1,3, Hairong Zheng1,3, Dong Liang1,3, Chuanli Cheng1,3, and Zhanli Hu1,3
1Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Shenzhen, China, 2Department of Nuclear Medicine, Sun Yat‐sen University Cancer Center, Guangzhou, China, 3Key Laboratory of Biomedical Imaging Science and System, Chinese Academy of Sciences, Shenzhen, China

Keywords: Diagnosis/Prediction, Data Analysis

Motivation: The segmentation of nasopharyngeal carcinoma (NPC) is vital for diagnostic and prognostic processes. NPC segmentation is challenging due to its intricate anatomy, variability, and closeness to essential structures.

Goal(s): This study aims to improve NPC segmentation accuracy by leveraging multiple modalities, such as DCE-MRI and PET-CT. Notably, previous research has not fully harnessed cross-modal features through cross-attention mechanisms.

Approach: This paper introduces a new approach that integrates cross-attention with the Vision Transformer structure, enabling efficient interaction between features from different modalities.
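
A minimal sketch of cross-modal attention in the spirit described, assuming token sequences already extracted per modality; the dimensions, head count, and one-directional fusion are illustrative, not the authors' design:

    import torch
    import torch.nn as nn

    class CrossModalBlock(nn.Module):
        def __init__(self, dim=256, heads=8):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.norm_q, self.norm_kv = nn.LayerNorm(dim), nn.LayerNorm(dim)

        def forward(self, q_tokens, kv_tokens):
            # queries from one modality attend to keys/values from the other
            out, _ = self.attn(self.norm_q(q_tokens),
                               self.norm_kv(kv_tokens), self.norm_kv(kv_tokens))
            return q_tokens + out   # residual connection

    mri, petct = torch.randn(1, 196, 256), torch.randn(1, 196, 256)
    fused = CrossModalBlock()(mri, petct)   # MRI tokens enriched with PET/CT context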

Results: The experiments show that the proposed model offers superior performance and state-of-the-art results.

Impact: The proposed method enhances NPC segmentation by fusing multimodal medical images such as DCE-MRI and PET-CT. This approach has the potential to benefit other segmentation tasks involving multimodal medical image data.

1956.
88. Multi Complex-valued Spatio-temporal Fusion Networks for Robust MRF Reconstruction
Tianyi Ding1, Yang Gao2, Zhuang Xiong1, Martijn Cloos1, and Hongfu Sun1
1The University of Queensland, Brisbane, Australia, 2Central South University, Changsha, China

Keywords: AI/ML Image Reconstruction, MR Fingerprinting

Motivation: Basic deep learning architectures prevail in MRF image reconstruction and rely heavily on conventional dictionary matching for ground truth or on paired in vivo acquisitions; this motivated us to seek better-suited architectures.

Goal(s): Our specific intent was to determine whether novel architectures could surpass traditional ones in MRF reconstruction.

Approach: To this end, we introduced the MRF-Mixer, blending complex-valued MLPs with a U-Net, and the more advanced MRF-TransMixer, integrating complex-valued MLPs, a Transformer, and a U-Net.
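
A sketch of one complex-valued MLP layer of the kind such mixers build on, implemented with real linear maps via (a+bi)(W_r + i W_i) = (aW_r - bW_i) + i(aW_i + bW_r); the sizes and CReLU activation are assumptions:

    import torch
    import torch.nn as nn

    class ComplexLinear(nn.Module):
        def __init__(self, d_in, d_out):
            super().__init__()
            self.re, self.im = nn.Linear(d_in, d_out), nn.Linear(d_in, d_out)

        def forward(self, z):                      # z: complex tensor [..., d_in]
            a, b = z.real, z.imag
            return torch.complex(self.re(a) - self.im(b), self.im(a) + self.re(b))

    def crelu(z):                                  # act on real/imag parts separately
        return torch.complex(torch.relu(z.real), torch.relu(z.imag))

    z = torch.randn(4, 1000, dtype=torch.cfloat)   # e.g. an MRF signal evolution
    h = crelu(ComplexLinear(1000, 256)(z))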

Results: Using a purely simulated training dataset, we systematically assessed the performance of both architectures, highlighting advances with the potential to transform MRF reconstruction in practice.

Impact: The MRF-Mixer and MRF-TransMixer offer enhanced MRF image reconstruction, potentially boosting diagnostic accuracy for clinicians. This advancement could lead to safer imaging for patients and motivate researchers to further explore SOTA network architecture applications in MRF.

1957.
89. Transformer residual cross (T-REX) networks for volumetric super-resolution
James Grover1,2, Shanshan Shan1,3, Paul Keall1,2, and David E.J. Waddington1,2
1Image X Institute, The University of Sydney, Sydney, Australia, 2Ingham Institute for Applied Medical Research, Sydney, Australia, 3State Key Laboratory of Radiation Medicine and Protection, Soochow University, Suzhou, China

Keywords: Other AI/ML, Machine Learning/Artificial Intelligence

Motivation: Higher temporal resolution is needed for many MRI-guidance applications. Reducing matrix sizes can increase temporal resolution at the cost of lower spatial resolution. Deep learning-based super-resolution could mitigate the trade-off in spatiotemporal resolution.

Goal(s): Develop and evaluate a unified deep learning-based algorithm that up-samples thick single-slice, low spatial resolution MRI to thin multi-slice, high spatial resolution MRI.

Approach: We developed a transformer residual cross (T-REX) neural network that simultaneously increases in-plane spatial resolution and decreases slice thickness, producing high spatial resolution multi-slice MRI.
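
A minimal sketch of the joint in-plane/through-plane upsampling such a network must perform (a plain convolutional head here, not the T-REX architecture itself); the 2x factors are illustrative:

    import torch
    import torch.nn as nn

    class VolumetricUpsampler(nn.Module):
        def __init__(self, ch=1, feat=32):
            super().__init__()
            self.body = nn.Sequential(nn.Conv3d(ch, feat, 3, padding=1), nn.ReLU())
            self.up = nn.ConvTranspose3d(feat, ch, kernel_size=2, stride=2)

        def forward(self, x):                  # x: [B, C, slices, H, W]
            return self.up(self.body(x))       # doubles slices, H, and W together

    thick_lr = torch.randn(1, 1, 4, 64, 64)       # thick-slice, low-res input
    thin_hr = VolumetricUpsampler()(thick_lr)     # -> [1, 1, 8, 128, 128]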

Results: T-REX was successfully trained and evaluated, showing promising results across a variety of field strengths and sequences.

Impact: The ability to acquire high spatial resolution volumetric MRI quickly has applications in low-field MRI and MRI-guided radiation therapy. Here, we present initial findings from a unified neural network that performs volumetric super-resolution.

1958.
90. Manifold-Aware Swin UNET Transformer for High-Fidelity Diffusion Tensor Imaging from Six Directions
Noga Kertes1 and Moti Freiman1
1Biomedical Engineering, Technion - Israel Institute of Technology, Haifa, Israel

Keywords: Analysis/Processing, Brain, AI/ML Image Reconstruction, Diffusion Analysis & Visualization, DTI

Motivation: Diffusion Tensor Imaging (DTI) requires numerous diffusion-weighted images, resulting in long scan sessions and motivating more efficient DTI estimation techniques.

Goal(s): Our goal is to demonstrate that deep neural networks (DNNs) trained with a manifold-respecting loss function can more accurately estimate diffusion tensors from fewer diffusion-weighted images, surpassing networks trained with Euclidean losses while honoring the tensors' manifold structure.

Approach: We employed the Swin UNET Transformer architecture and trained two models: one with Log-Euclidean loss and another with Euclidean loss.
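
The Log-Euclidean loss can be sketched as the Frobenius distance between matrix logarithms of the symmetric positive-definite tensors, computed by eigendecomposition; the clamping epsilon is an implementation assumption:

    import torch

    def spd_logm(D, eps=1e-8):                 # D: [..., 3, 3] symmetric PSD
        w, V = torch.linalg.eigh(D)            # eigenvalues (ascending), eigenvectors
        w = torch.log(w.clamp_min(eps))        # log of eigenvalues
        return V @ torch.diag_embed(w) @ V.transpose(-1, -2)

    def log_euclidean_loss(D_pred, D_true):
        return (spd_logm(D_pred) - spd_logm(D_true)).square().sum((-2, -1)).mean()

    A = torch.randn(10, 3, 3)
    D_true = A @ A.transpose(-1, -2) + 1e-3 * torch.eye(3)   # random SPD tensors
    D_pred = D_true + 0.01 * torch.randn_like(D_true)
    D_pred = 0.5 * (D_pred + D_pred.transpose(-1, -2))       # re-symmetrize
    print(log_euclidean_loss(D_pred, D_true))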

Results: When evaluating the predicted tensors against conventional techniques, our approach consistently outperformed the rest.

Impact: The study will enable accurate estimation of brain microstructure from DTI data acquired with six gradient directions by developing a manifold-aware DNN for DTI analysis. This breakthrough may reduce patient discomfort and scanning artifacts, and potentially increase imaging centers' throughput.

1959.
91. Deep Learning Classification of Muscular Dystrophy from MR Images using Swin Transformer
Maria Giovanna Taccogna1, Giovanna Rizzo2, Maria Grazia D'Angelo3, Denis Peruzzo4, and Alfonso Mastropietro2
1Istituto di Tecnologie Biomediche, Consiglio Nazionale delle Ricerche, Segrate (MI), Italy, 2Istituto di Sistemi e Tecnologie Industriali Intelligenti per il Manifatturiero Avanzato, Consiglio Nazionale delle Ricerche, Milano, Italy, 3Unit of rehabilitation of rare diseases of the central and peripheral nervous system, Scientific Institute IRCCS "Eugenio Medea", Bosisio Parini (LC), Italy, 4Neuroimaging Unit, Scientific Institute IRCCS “Eugenio Medea”, Bosisio Parini (LC), Italy

Keywords: Diagnosis/Prediction, Muscle, Deep Learning, Classification, MRI, Swin Transformer, Generative AI, Dystrophy

Motivation: This study aims to improve the accuracy of muscular dystrophy (MD) diagnosis by applying AI and multiparametric MRI to distinguish subtypes with similar muscle involvement patterns.

Goal(s): The primary goal is to develop a Swin Transformer (SwinT) AI-based classification approach for BMD, LGMD2, and healthy subjects using muscle MR images and identify the optimal MRI contrast for accurate classification.

Approach: In a retrospective study, we utilized SwinT and VGG19 AI models with various MRI contrasts in a 10-fold cross-validation setup.

Results: SwinT outperformed VGG19, with the Fat Fraction contrast delivering the highest accuracy of 89.3%±4.9%, highlighting the potential for more accurate MD diagnosis.

Impact: This work could improve muscular dystrophy diagnosis, offering clinicians a more objective and accurate tool. Patients may benefit from earlier and more precise interventions, while scientists can explore novel research avenues in AI-driven medical diagnostics, ultimately reducing healthcare disparities.

1960.
92. Hybrid Attention SwinV2 Transformer Cascade Design for Accelerated Multi-Coil MRI Reconstruction
Tahsin Rahman1, Ali Bilgin2, and Sergio Cabrera1
1Electrical and Computer Engineering, The University of Texas at El Paso, El Paso, TX, United States, 2Electrical and Computer Engineering, University of Arizona, Tucson, AZ, United States

Keywords: AI/ML Image Reconstruction, Machine Learning/Artificial Intelligence, Transformer, Swin, Attention

Motivation: Shifted window (Swin) Vision Transformers are increasingly outperforming CNNs in computer vision tasks, particularly if adequate GPU resources are available for training.

Goal(s): In this work, we investigate cascaded Swin transformers with hybrid attention for accelerated MRI reconstruction.

Approach: Our proposed Hybrid SwinV2-MRI-cascade architecture incorporates multi-coil data and k-space consistency constraints while offering a high degree of flexibility in network choice depending on performance requirements and compute capabilities.
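
A sketch of the k-space data-consistency step applied between cascade stages, reduced to a single-coil Cartesian case for brevity; the identity stand-ins for the SwinV2 stages and the sampling pattern are placeholders:

    import torch

    def data_consistency(x, y, mask):
        # x: current image estimate, y: acquired k-space, mask: sampled locations
        k = torch.fft.fft2(x)
        k = torch.where(mask, y, k)            # trust measured samples where available
        return torch.fft.ifft2(k)

    x = torch.randn(1, 256, 256, dtype=torch.cfloat)
    mask = torch.rand(1, 256, 256) < 0.25      # ~4x undersampling pattern
    y = torch.fft.fft2(x) * mask               # simulated acquisition
    for net in [lambda v: v] * 4:              # stand-ins for SwinV2 cascade stages
        x = data_consistency(net(x), y, mask)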

Results: Experiments show that both hybrid attention and longer cascades can be used in a granular manner to improve MRI reconstruction performance in Swin transformer networks.

Impact: A highly configurable cascaded hybrid attention SwinV2 transformer architecture for MRI reconstruction is proposed. Its modular nature offers the ability to create transformer networks that fully leverage available training compute resources while producing high quality output.

1961.
93. Decoding Brain Dynamic Functional Connectivity Implicated in ADHD via Graph Neural Networks and Transformers
Deepank Girish1, Yi Hao Chan1, Jing Xia1, and Jagath C Rajapakse1
1School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore

Keywords: Diagnosis/Prediction, Machine Learning/Artificial Intelligence, Dynamic Functional Connectivity

Motivation: Few studies have investigated the potential of using dynamic functional connectivity for Attention Deficit Hyperactivity Disorder (ADHD) diagnosis and biomarker discovery.

Goal(s): The goal of this study is to effectively capture spatiotemporal dynamic features in resting state fMRI data for detection of ADHD subjects. 

Approach: We present a novel ensemble framework that combines the strengths of Graph Convolutional Networks (GCN), Graph Isomorphism Networks (GIN), and Transformers. 
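
A sketch of the ensemble step only (soft voting over branch probabilities); the branch models are stand-ins, and the input layout for dynamic FC windows is an assumption:

    import torch

    def ensemble_predict(models, dyn_fc):
        # dyn_fc: dynamic FC input, e.g. [batch, windows, ROIs, ROIs]
        probs = [torch.softmax(m(dyn_fc), dim=-1) for m in models]
        return torch.stack(probs).mean(dim=0).argmax(dim=-1)    # soft voting

    # stand-ins for the trained GCN, GIN, and Transformer branches
    gcn = gin = tfm = lambda x: torch.randn(x.shape[0], 2)
    print(ensemble_predict([gcn, gin, tfm], torch.randn(4, 20, 116, 116)))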

Results: On the ADHD-200 dataset, our framework outperforms other state-of-the-art models for ADHD detection. Using explainable AI, we generated biomarkers for ADHD that are consistent with the existing literature.

Impact: The integration of GCN, GIN, and Transformers in our proposed framework enables effective analysis of dynamic functional connectivity for ADHD diagnosis. Its classification performance surpasses existing state-of-the-art models, and the generated biomarkers further affirm the usefulness of our method.

1962.
94. Generalizable Transformer-based Automatic MRI Quality Control for Infant Brain Imaging
Haowen Deng1, Gaofeng Wu1, Zihao Zhu1, Zhuoyang Gu1, Xinyi Cai1, Tianli Tao1, Lixuan Zhu1, Yitian Tao1, Dinggang Shen1,2, and Han Zhang1,3
1School of Biomedical Engineering, ShanghaiTech University, Shanghai, China, 2Shanghai Clinical Research and Trial Center, Shanghai, China, 3Shanghai Clinical Research and Trial Center, Shanghai, China

Keywords: Analysis/Processing, Brain, Quality Control, Infant

Motivation: Manual quality control (QC) for infant brain MRI is time-consuming and labor-intensive. The implementation of automatic QC is necessary for clinical scenarios.

Goal(s): To develop a generalizable, highly accurate, automatic tool for infant brain T1w-MRI quality control.

Approach: We design a generalizable automatic model with Residual Network (ResNet) and Vision Transformer (ViT) modules for infant brain T1w-MRI QC. Our model is trained and validated on two large-scale multi-site infant MRI datasets: the Baby Connectome Project (BCP) and the China Baby Connectome Project (CBCP).
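
One plausible ResNet-then-Transformer composition for slice-wise QC, sketched under the assumption of 2D slices; the exact coupling of the ResNet and ViT modules in the authors' model is not specified:

    import torch
    import torch.nn as nn
    from torchvision.models import resnet18

    class ResNetViTQC(nn.Module):
        def __init__(self, dim=256, n_classes=2):
            super().__init__()
            backbone = resnet18(weights=None)
            self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # local features
            self.proj = nn.Conv2d(512, dim, 1)
            enc = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
            self.vit = nn.TransformerEncoder(enc, num_layers=4)        # global context
            self.head = nn.Linear(dim, n_classes)

        def forward(self, x):                        # x: [B, 3, 224, 224]
            f = self.proj(self.cnn(x))               # [B, dim, 7, 7]
            tokens = f.flatten(2).transpose(1, 2)    # [B, 49, dim]
            return self.head(self.vit(tokens).mean(dim=1))

    print(ResNetViTQC()(torch.randn(2, 3, 224, 224)).shape)   # torch.Size([2, 2])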

Results: Our method automatically classifies data quality with an accuracy of over 95% on the BCP and CBCP datasets.

Impact: Our automatic MRI quality control tool considers both local and global image features and shows excellent performance and efficiency, particularly on infant 3D brain T1w-MRI. It considerably reduces the labor required by the traditional QC process.

1963.
95. Application of Swin Transformer for Alzheimer’s Disease Classification on Structural MRI and FDG-PET Brain Scans Using ADNI Data
Chiara Weber1, Jakob Seeger1, Ben Isselmann1, Matthias Günther2,3,4, Andreas Weinmann1, and Johannes Gregori1,2
1Department of Mathematics and Natural Sciences, University of Applied Sciences Darmstadt, Darmstadt, Germany, 2mediri GmbH, Heidelberg, Germany, 3Fraunhofer Mevis, Bremen, Germany, 4University of Bremen, Bremen, Germany

Keywords: Diagnosis/Prediction, PET/MR, Brain

Motivation: The detection of Alzheimer’s Disease (AD) can be supported by automated computer vision solutions, which can potentially enable earlier diagnosis and improved patient treatment.

Goal(s): Our goal is to apply a state-of-the-art deep learning approach to the field of AD diagnosis based on brain scans.

Approach: A pretrained Swin Transformer model is fine-tuned on FDG-PET and structural MRI brain scans to classify AD.
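
A minimal fine-tuning sketch with torchvision's ImageNet-pretrained Swin-T; the 2D slice framing, binary AD/CN labels, and optimizer settings are assumptions:

    import torch
    import torch.nn as nn
    from torchvision.models import swin_t, Swin_T_Weights

    model = swin_t(weights=Swin_T_Weights.IMAGENET1K_V1)
    model.head = nn.Linear(model.head.in_features, 2)   # AD vs cognitively normal

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    x = torch.randn(4, 3, 224, 224)                     # dummy slice batch
    y = torch.randint(0, 2, (4,))
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()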

Results: Our model achieves a competitive area under the curve (AUC) of 97.8% / 99.7% and accuracy of 97.0% / 99.5% (MRI / PET) on independent test data.

Impact: We show how a modern deep neural network can be trained with reasonable effort while achieving results comparable to established approaches. This procedure can lead the way toward classifying AD on more challenging modalities, such as ASL.

1964.
96. A deep learning-based approach to generate synthetic CT images from multi-modal MRI data
Zhuoyao Xin1, Christopher Wu2, Dong Liu3, Chunming Gu1,4,5, Jia Guo2, and Jun Hua1,4,5
1F.M. Kirby Research Center for Functional Brain Imaging, Kennedy Krieger Institute, Baltimore, MD, United States, 2Department of Biomedical Engineering, Columbia University, New York City, NY, United States, 3Department of Neuroscience, Columbia University, New York City, NY, United States, 4Neurosection, Division of MRI Research, Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD, United States, 5Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States

Keywords: Analysis/Processing, Machine Learning/Artificial Intelligence, multi-modal MRI

Motivation: Synthetic CT is a useful technique for generating CT images from MR images. Most existing methods exploit only a single MRI modality, such as T1-weighted (T1w) images.

Goal(s): We aim to develop a synthetic CT method integrating dual-channel T1w+FLAIR input images.

Approach: A dual-channel, multi-task deep learning approach based on a 3D Transformer U-net was tested on a public human brain MRI-CT dataset. Its performance was compared to single-modal, T1w-based CT synthesis.
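
The dual-channel idea reduces to stacking T1w and FLAIR along the input channel axis; in this sketch MONAI's UNETR stands in for the authors' 3D Transformer U-net, and the volume size is illustrative:

    import torch
    from monai.networks.nets import UNETR

    # two input channels (T1w + FLAIR), one output channel (synthetic CT)
    model = UNETR(in_channels=2, out_channels=1, img_size=(96, 96, 96))

    t1w = torch.randn(1, 1, 96, 96, 96)
    flair = torch.randn(1, 1, 96, 96, 96)
    synthetic_ct = model(torch.cat([t1w, flair], dim=1))   # [1, 1, 96, 96, 96]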

Results: Our results indicate that dual-modal T1w+FLAIR inputs provide richer detail, particularly in pixel-level predictions, than single-modal synthetic CT. The improvement in morphology was moderate.

Impact: The proposed framework may be used to integrate two or more MRI modalities to improve the performance of CT image synthesis.