MedDet: Generative Adversarial Distillation for Efficient Cervical Disc Herniation Detection

 

1 Southwest Minzu University 2 The Australian National University
3 West China Hospital 4 Sichuan University
5 La Trobe University

 

*Equal Contribution. Corresponding author.
Work done at Southwest Minzu University.

Abstract

Cervical disc herniation (CDH) is a prevalent musculoskeletal disorder that significantly impacts health and requires labor-intensive analysis by experts. Despite advances in automated detection in medical imaging, two challenges hinder the real-world application of these methods. First, their computational complexity and resource demands make real-time use difficult. Second, noise in MRI distorts feature extraction and reduces the effectiveness of existing methods. To address these challenges, we make three contributions. First, we introduce MedDet, which uses multi-teacher single-student knowledge distillation for model compression and efficiency while integrating generative adversarial training to enhance performance. Second, we customize the second-order nmODE to improve the model's resistance to noise in MRI. Third, we conduct comprehensive experiments on the CDH-1848 dataset, achieving up to a 5% improvement in mAP over previous methods. Our approach also delivers more than 5 times faster inference, with approximately 67.8% fewer parameters and 36.9% fewer FLOPs than the teacher model. These advances significantly enhance the performance and efficiency of automated CDH detection and show promising potential for future clinical application.

Overall Method

 


The left subfigure illustrates the overall architecture of our proposed MedDet model. It includes a novel adversarial auxiliary teacher module (AATM) for generative adversarial distillation, together with an adaptive feature alignment (AFA) module and a learnable weighted feature fusion (LWFF) module that fuse features from the teacher networks and align them with those of the student network. It also incorporates a denoising nmODE2 module in the detection head. The right subfigure shows that our method is more efficient than the teacher models in terms of FLOPs, parameter count, and inference speed.


Generative Adversarial Knowledge Distillation

 


Diagram of the adversarial auxiliary teacher module (AATM). FTi and FSi denote the i-th feature maps output by the teacher network and by the student network's FPN, respectively. G denotes the generator.
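As a rough illustration of how an adversarial distillation objective can be set up, the sketch below computes standard GAN losses from discriminator scores. It rests on several assumptions not stated in the caption: that a discriminator D scores feature maps, that G produces features meant to resemble the teacher's FTi, and that the non-saturating binary cross-entropy form is used. The function names are hypothetical, and this is not the AATM implementation.

```python
import math

def bce(prob, target):
    """Binary cross-entropy for a single probability, clamped for stability."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))

def discriminator_loss(d_teacher, d_generated):
    """Assumed D objective: score teacher features as 1, generated features as 0.

    d_teacher   -- D's probabilities for teacher feature maps (e.g. FTi)
    d_generated -- D's probabilities for features produced by G
    """
    real = sum(bce(p, 1.0) for p in d_teacher) / len(d_teacher)
    fake = sum(bce(p, 0.0) for p in d_generated) / len(d_generated)
    return real + fake

def generator_loss(d_generated):
    """Assumed non-saturating G objective: make D score generated features as 1."""
    return sum(bce(p, 1.0) for p in d_generated) / len(d_generated)

if __name__ == "__main__":
    # Hypothetical discriminator outputs, for illustration only.
    print(discriminator_loss([0.9, 0.8], [0.2, 0.3]))
    print(generator_loss([0.2, 0.3]))
```

In a setup like this, the generator loss falls as the generated features become harder for D to tell apart from teacher features. That is the usual mechanism by which adversarial training pushes student-side features toward the teacher's distribution.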


The nmODE2 Block

 


The diagram illustrates the architecture of the nmODE2 block, integrated into the classification and regression heads of the teacher networks.
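The caption does not give the nmODE2 equation, so the following is only a sketch under stated assumptions. It starts from the first-order nmODE form dy/dt = -y + sin^2(y + gamma), where gamma is the block's input features. It then adds a hypothetical damped second-order term, giving y'' + a*y' = -y + sin^2(y + gamma), and integrates with explicit Euler steps. The damping coefficient, step size, and function names are all illustrative choices.

```python
import math

def nmode2_step(y, v, gamma, dt=0.1, damping=1.0):
    """One explicit-Euler step of the ASSUMED second-order system

        y' = v
        v' = -damping * v - y + sin(y + gamma) ** 2

    applied element-wise to lists of floats.
    """
    new_y = [yi + dt * vi for yi, vi in zip(y, v)]
    new_v = [
        vi + dt * (-damping * vi - yi + math.sin(yi + gi) ** 2)
        for yi, vi, gi in zip(y, v, gamma)
    ]
    return new_y, new_v

def nmode2_block(gamma, steps=400, dt=0.1, damping=1.0):
    """Integrate from y = 0, v = 0 and return the final state y."""
    y = [0.0] * len(gamma)
    v = [0.0] * len(gamma)
    for _ in range(steps):
        y, v = nmode2_step(y, v, gamma, dt, damping)
    return y

if __name__ == "__main__":
    gamma = [0.5, 1.0]  # hypothetical input features
    y = nmode2_block(gamma)
    # At a fixed point, y should satisfy y = sin^2(y + gamma).
    print([abs(yi - math.sin(yi + gi) ** 2) for yi, gi in zip(y, gamma)])
```

Under these assumptions the state settles at a fixed point y = sin^2(y + gamma), so the output is bounded regardless of the input magnitude. Bounded dynamics of this kind are one plausible reason an ODE block could dampen noisy features.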


Adaptive Feature Alignment (AFA)

 


The figure illustrates the adaptive feature alignment (AFA). Subfigure (a) shows the channel-wise alignment model, while subfigure (b) shows the height-width (HW) alignment model.
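To make the two alignment directions concrete, here is a minimal sketch on nested lists indexed as feature[channel][row][column]. It assumes channel-wise alignment can be approximated by a 1x1 convolution (a per-pixel linear mix of channels) and HW alignment by nearest-neighbour resizing. Both are common choices rather than the AFA design itself, and the function names are hypothetical.

```python
def align_channels(feat, weight):
    """1x1 convolution: weight[o][c] mixes input channel c into output channel o."""
    height, width = len(feat[0]), len(feat[0][0])
    return [
        [
            [sum(weight[o][c] * feat[c][i][j] for c in range(len(feat)))
             for j in range(width)]
            for i in range(height)
        ]
        for o in range(len(weight))
    ]

def align_hw(feat, out_h, out_w):
    """Nearest-neighbour resize of every channel to (out_h, out_w)."""
    in_h, in_w = len(feat[0]), len(feat[0][0])
    return [
        [
            [channel[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
            for i in range(out_h)
        ]
        for channel in feat
    ]

if __name__ == "__main__":
    student = [[[1.0, 2.0], [3.0, 4.0]]]   # 1 channel, 2x2 (toy values)
    weight = [[1.0], [2.0]]                # project 1 channel to 2 channels
    aligned = align_hw(align_channels(student, weight), 4, 4)
    print(len(aligned), len(aligned[0]), len(aligned[0][0]))
```

After both steps the student feature map has the same channel count and spatial size as the target, which is the precondition for an element-wise distillation loss between teacher and student features.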


Learnable Weighted Feature Fusion (LWFF)

 


The figure illustrates the learnable weighted feature fusion (LWFF) module, where M denotes the multiplication operation and C denotes the concatenation operation.
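The multiply-then-concatenate pattern named in the caption can be sketched as follows. The sketch assumes one learnable scalar logit per teacher, normalised with a softmax, and concatenation along the channel axis. The caption does not confirm the normalisation or the weight granularity, so treat `lwff` and its arguments as hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def lwff(teacher_feats, logits):
    """Scale each teacher's features by its weight (M), then concatenate (C).

    teacher_feats -- one entry per teacher; each is a list of channels,
                     and each channel is a flat list of floats
    logits        -- one learnable scalar per teacher (assumed granularity)
    """
    weights = softmax(logits)
    fused = []
    for weight, feat in zip(weights, teacher_feats):
        fused.extend([[weight * x for x in channel] for channel in feat])
    return fused

if __name__ == "__main__":
    teacher_a = [[2.0, 4.0]]               # 1 channel (toy values)
    teacher_b = [[6.0, 8.0], [1.0, 1.0]]   # 2 channels (toy values)
    print(lwff([teacher_a, teacher_b], [0.0, 0.0]))
```

Because the weights come from trainable logits, gradient descent can shift emphasis toward whichever teacher's features help the student most, instead of fixing the mix by hand.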


CDH-1848 Dataset

 


The figure illustrates the pipeline for patient selection, MRI acquisition, and expert annotation with an inter-rater reliability check.


Comparative Studies

 


The results indicate that our proposed MedDet surpasses other efficient supervised methods, as well as knowledge distillation techniques that use the same student network, GFL (MobileNetV2). This demonstrates MedDet's ability to combine effective feature extraction with knowledge transfer, resulting in better CDH detection performance.


 

 


The table compares our model with the teacher models, showing that our method is significantly lighter and faster while achieving similar performance. This highlights the balance our model strikes between performance and efficiency.
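For readers who want to reproduce the efficiency figures from raw counts, the relative reduction and speedup can be computed as in the sketch below. The input values are placeholder units chosen so the output matches the 67.8% parameter reduction quoted in the abstract. They are not measurements from the paper, and the helper names are hypothetical.

```python
def relative_reduction(teacher_value, student_value):
    """Fraction by which the student is smaller than the teacher."""
    return (teacher_value - student_value) / teacher_value

def speedup(teacher_latency, student_latency):
    """How many times faster the student runs than the teacher."""
    return teacher_latency / student_latency

if __name__ == "__main__":
    # Placeholder units only, NOT parameter counts reported in the paper.
    teacher_params, student_params = 100.0, 32.2
    print(f"{relative_reduction(teacher_params, student_params):.1%}")
```

The same `relative_reduction` helper applies to FLOPs, while `speedup` takes per-image latencies measured on identical hardware.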


Ablation Studies

 


The table shows the results of the ablation study on the proposed architecture components including nmODE2, LWFF, and AATM, highlighting the significant contributions of each customization to the overall performance.


 

 


The table shows the results of different customizations of nmODE2 across various components of the overall architecture, highlighting the impact on performance when adapted to the detection head.


 

 


The table shows the results of different feature alignment and fusion methods, indicating that our proposed LWFF effectively aligns and fuses features for knowledge distillation.


Visualization

 


The visualization shows that MedDet produces more accurate CDH detections than other efficient detection methods.