
A computer-aided determining method for the myometrial infiltration depth of early endometrial cancer on MRI images | BioMedical Engineering OnLine

Patients and data preparation

The Institutional Review Board (IRB) of Fujian Maternity and Child Health Hospital (FMCHH) in China approved this retrospective study, and the requirement for informed consent was waived. A total of 207 patients pathologically diagnosed with early-stage endometrial cancer (EC) who underwent pelvic MRI examination at FMCHH between January 1, 2018, and December 31, 2020, were initially identified using information from the hospital's picture archiving and communication system (PACS). The exclusion criteria were as follows: (1) no final pathologic diagnostic statement; (2) early-stage EC (FIGO stage IA or IB) could not be pathologically confirmed; (3) missing MRI data (no corresponding sagittal T2WI sequence). After exclusions, 154 patients remained in the study (mean age 55.7 ± 9.7 years; 75 stage IA and 79 stage IB). All included patients were pathologically confirmed, as shown in Table 4.

Subsequently, two experienced radiologists visually reviewed the MRI sequences (24 slices per sequence, 3696 MRI images in total) in a consensus manner and applied the following exclusion criteria: (1) presence of artifacts; (2) uterus and tumor not clearly detectable on T2WI images. Radiologists typically select the MRI slice with the maximum tumor diameter as the central slice, together with 1–2 slices anterior and posterior to it, as the objects for analysis. The final experimental dataset comprises 224 MRI slices (101 stage IA images, 123 stage IB images), randomly divided into a training dataset (70%) and a testing dataset (30%). The training dataset has 108 cases (53 stage IA/55 stage IB) including 156 images, and the test dataset has 46 cases (22 stage IA/24 stage IB) including 68 images. A flow diagram of the cohort selection is presented in Fig. 7.

The proposed methods are all based on a dataset composed of the aforementioned MRI images. Since this dataset is relatively small in terms of the number of images, data augmentation techniques, such as random horizontal flipping, random vertical flipping, and random scaling, were employed during the training process to enhance the model’s robustness and prevent overfitting [21].
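The augmentation step can be sketched with plain NumPy as follows; the flip probabilities and scale range below are illustrative assumptions, not values reported in the paper, and a real pipeline would typically use a framework's transform utilities instead of nearest-neighbour resampling.

```python
import numpy as np

def augment(image, rng):
    """Randomly flip and scale a 2-D image array.

    A minimal sketch of the augmentations named above (random horizontal
    flip, random vertical flip, random scaling). The 0.5 probabilities and
    the (0.8, 1.2) zoom range are illustrative assumptions.
    """
    if rng.random() < 0.5:
        image = image[:, ::-1]          # horizontal flip
    if rng.random() < 0.5:
        image = image[::-1, :]          # vertical flip
    scale = rng.uniform(0.8, 1.2)       # random zoom factor
    h, w = image.shape
    # nearest-neighbour resampling back onto the original grid,
    # so the augmented image keeps the input shape
    ys = np.clip((np.arange(h) / scale).astype(int), 0, h - 1)
    xs = np.clip((np.arange(w) / scale).astype(int), 0, w - 1)
    return image[np.ix_(ys, xs)]
```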

Table 4 Clinical and pathological data summaries in training, and independent test group
Fig. 7

A flow diagram of the cohort selection

MRI protocol

All MRI examinations were performed on a 1.5-T MRI scanner (Optima MR360, GE Healthcare) with a phased-array coil. Before the examination, the patient's bowel was emptied using a glycerine enema and the bladder was kept appropriately full (about one-half). To reduce bowel artifacts and motion artifacts caused by significant bowel movements, no additional enemas or medications to slow bowel movement were used. Eating was allowed (food could not contain iron components) and no intramuscular injection of any medication was required. It was ensured that the patient had no contraindications to MRI and no metallic foreign bodies on the body; it was especially important to ask whether the patient had ever had surgery, radiotherapy, or chemotherapy, and whether the patient was currently menstruating or menopausal. The patient's position was feet-first and supine in all cases. The body was kept in line with the bed of the MR scanner so that the scanning site was as close as possible to the center of the main magnetic field and of the coil, with the coil centered on the pubic symphysis. A soft cushion was placed on the lower abdomen to reduce motion artifacts caused by breathing. The patient was also asked to raise both hands (ensuring they did not cross to form a loop), with appropriate support from triangular cushions so that the examination could be completed in a comfortable position. The fat-suppressed fast-spin-echo T2WI (FS FSE T2WI) sagittal sequence was selected for this study. The acquisition parameters were as follows: repetition time/echo time (TR/TE), \(5600-5700/65-70\) ms; bandwidth, 31.25 Hz/pixel; slice thickness, 5 mm; flip angle, 160°; field of view, \(280\,mm\); matrix size, \(320 \times 224\); and image resolution, \(512 \times 512\) pixels.

MRI lesion labeling

Localization of ROIs in all MRI images and segmentation of ROI contours in the cropped images were performed by experienced radiologists (Chen's team). For the object detection model, two rectangular boxes were drawn as dataset labels using labelImg (version 1.8.5): one enclosing the uterus and the other enclosing the lesion structures (Fig. 8); these boxes were considered the ground truth for the object detection model. For the semantic segmentation model, the edge contours of the lesion region and the uterine body were outlined using labelme (version 4.5.7) and used as dataset labels; these two contours were considered the gold standard for the semantic segmentation model (Fig. 8).

Fig. 8

a, b, e, and f show the labels and predictions of the object detection model: a and e are uterus regions; b and f are tumor regions. c and g are images cropped based on the detection results. d and h are labels for the semantic segmentation model (red is the tumor, green is the uterus)

Proposed method

A DL-based multi-stage CAD method is proposed to evaluate the exact MI depth (Fig. 9). An object detection model based on the SSD algorithm performs ROI detection on the original MRI image sequences (Fig. 9a). The MRI images that clearly show the uterus and tumor are selected according to the confidence scores of the detection results (Fig. 9b) and cropped according to the detection boxes (Fig. 9c). The cropped images (Fig. 9d) are fed into a semantic segmentation model based on the Attention U-net network for prediction (Fig. 9e). Then the ellipse fitting algorithm based on UCLGA is employed to generate the UCL on the segmentation map, and the MI depth is obtained as the ratio of the maximum tumor-to-UCL length to the maximum uterus-to-UCL length. According to the FIGO criteria for staging early EC tumors, a tumor infiltration depth of less than 50% of the myometrial thickness is identified as stage IA, and one greater than 50% as stage IB [22]. Finally, an EC MRI image is classified as stage IA when MI is less than 0.5 and as stage IB when MI is greater than 0.5.
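The final classification rule can be expressed as a small function. How a ratio of exactly 0.5 should be handled is not stated in the text, so the tie-break below is an assumption.

```python
def stage_from_mi(mi_depth: float) -> str:
    """Map the predicted myometrial-invasion ratio to a FIGO stage.

    Follows the 50% rule described above: MI < 0.5 -> stage IA,
    MI > 0.5 -> stage IB. Assigning exactly 0.5 to IB is an assumption,
    since the paper does not specify the tie-break.
    """
    return "IA" if mi_depth < 0.5 else "IB"
```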

Fig. 9

The flowchart of the proposed method. a is the original MRI image sequence. The object detection model based on the SSD algorithm is used to detect the ROI (uterus and tumor) (b). c is the optimal image in which the ROI is clearly visible. d is the cropped image, which includes only the ROI. The semantic segmentation model based on Attention U-net is used to accurately predict the uterus (light blue region) and tumor (red region) of the cropped image (e). f shows the ellipse fitting algorithm used to generate the UCL, and R in (i) is the final prediction of the depth of MI

Fig. 10

The architectures used for object detection and semantic segmentation

Object detection model

The task of object detection is to locate instances of a certain class of semantic objects [23]. In this study, the SSD model [24] is employed to detect the bounding box of the ROI in MRI images. The architecture of the model is shown in Fig. 10. SSD is a method for object detection in images using a single deep neural network. SSD extracts features from the image using VGGNet; additional convolutional layers are then added on top of these features to generate feature maps at different scales. These feature maps contain information about objects of different sizes and scales, allowing SSD to detect objects of different sizes. SSD then discretizes the bounding box output space at each location on the feature maps into a set of default boxes with different aspect ratios and scales. Each default box predicts the confidence of its internal object class and the offset relative to the ground-truth box. Finally, the proportion of positive and negative default-box samples is controlled by non-maximum suppression and hard negative mining. For training, model parameters pre-trained on the VOC 2007 dataset were first loaded. The parameters of the first 21 layers of the pre-trained model were then frozen for the first 50 epochs, after which the parameters of the whole network were updated, achieving a higher training speed and better model performance. The original MRI images and the bounding boxes outlined by radiologists were used as input data to train the SSD model; the images were uniformly resized to 512\(\times\)512 before being fed into the object detection model.
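The two-phase freezing schedule above can be sketched schematically. In a real DL framework this would be done by disabling gradient updates on the corresponding parameters; the total layer count below is an arbitrary placeholder, not the SSD network's actual depth.

```python
def trainable_layers(epoch: int, n_layers: int = 23,
                     frozen_layers: int = 21, freeze_epochs: int = 50):
    """Return the indices of layers whose parameters are updated at a
    given epoch.

    Sketch of the two-phase schedule described above: the first 21
    pre-trained layers are frozen for the first 50 epochs, after which
    the whole network is fine-tuned. `n_layers` is a placeholder value.
    """
    if epoch < freeze_epochs:
        return list(range(frozen_layers, n_layers))  # head layers only
    return list(range(n_layers))                     # full fine-tuning
```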

Semantic segmentation model

Semantic segmentation is the ability to segment an unknown image into different parts and objects (e.g., beach, ocean, sun, dog, swimmer). Moreover, segmentation goes deeper than object recognition, because recognition is not necessary for segmentation [25]. In this study, the Attention U-net model [26] is used to segment the uterine and tumor regions of the input images. The Attention U-net is a variant of U-net that retains the original encoder–decoder structure, as shown in Fig. 10. The encoding layers map the input images to a latent representation, or bottleneck, and the decoding layers map this representation back to the original image resolution [27]. To concatenate high- and low-level features, skip connections were added to the encoder–decoder network [21]; the network is further boosted with attention gates to better highlight salient features passed through the skip connections [28]. For training, model parameters pre-trained on the VOC 2007 dataset were first loaded. The parameters of the first 17 layers of the pre-trained model were frozen for the first 50 epochs, after which the parameters of the whole network were updated, achieving a faster training speed and improved model performance. Each original MRI image was cropped according to the uterine region boxed out by the radiologist and then fed into the semantic segmentation model for training. Because the cropped images vary in size, a uniform size is required for semantic segmentation model training and prediction. To resize the images without distortion, this work pads each image with gray bars of pixel value 128 to a uniform size of 256\(\times\)256; the gray bars are cropped out of the final prediction result.
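The gray-bar padding to 256×256 described above can be sketched as follows; the nearest-neighbour resampling and centered placement are illustrative simplifications of whatever interpolation the authors actually used.

```python
import numpy as np

def pad_to_square(image, size=256, fill=128):
    """Letterbox a 2-D image to `size` x `size` without distortion.

    Scales the image so its longer side equals `size` (nearest-neighbour
    resampling, an illustrative simplification), then pads the remainder
    with gray bars of pixel value 128, as described in the text.
    """
    h, w = image.shape
    scale = size / max(h, w)
    nh, nw = max(1, round(h * scale)), max(1, round(w * scale))
    ys = np.clip((np.arange(nh) / scale).astype(int), 0, h - 1)
    xs = np.clip((np.arange(nw) / scale).astype(int), 0, w - 1)
    resized = image[np.ix_(ys, xs)]
    canvas = np.full((size, size), fill, dtype=image.dtype)  # gray bars
    top, left = (size - nh) // 2, (size - nw) // 2           # center it
    canvas[top:top + nh, left:left + nw] = resized
    return canvas
```

At prediction time the inverse operation simply crops the gray bars back out using the same `top`, `left`, `nh`, and `nw` values.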

Optimal slice selection

To remove the need for radiologists to manually select MRI slices that reflect the lesion, the object detection model is employed to automatically select, from the MRI sequence images, slices in which the uterus and tumor are clearly visible. First, the MRI sequence images of an EC patient are fed into the object detection model, and the three images of the sequence with the highest confidence scores for the uterus and tumor (predicted by the object detection model) are selected as the screening results (Fig. 9a, b). The performance of automated slice selection is evaluated using the radiologist's manually selected slices as positive labels. There are no quantitative criteria for radiologists to select the best slice, and usually more than one slice in an MRI sequence clearly visualizes the uterus and tumor (the radiologist selected one best slice in 24 patients and two best slices in the other 22 patients). Three MRI slices were automatically selected by CAD for each patient. CAD1-accuracy is defined as the performance when only the single top-ranked CAD slice must match the radiologist's manually selected slices; similarly, CAD2-accuracy uses the top two CAD slices, and CAD3-accuracy counts a case as correct when any of the three slices suggested by CAD is among the slices manually selected by the radiologist. The implementation source code and experimental data of the module are available at
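The top-3 slice selection can be sketched as below. Combining the uterus and tumor confidences by their minimum is an assumption for illustration, since the paper does not state how the two per-class scores are merged into a single ranking.

```python
def select_top_slices(detections, k=3):
    """Pick the k slices with the highest combined detection confidence.

    `detections` maps slice index -> (uterus_conf, tumor_conf) as returned
    by the object detector. Merging the two scores with min() (both organs
    must be confidently detected) is an illustrative assumption.
    """
    scored = {i: min(u, t) for i, (u, t) in detections.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]
```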

UCL generation algorithm

Determining MI depth with a single algorithm or model is a great challenge due to the diversity of uterine shapes and tumor locations. Therefore, an algorithm that automatically generates the UCL on the semantic segmentation map (Fig. 9) is proposed in order to calculate the MI depth. The UCL generation algorithm is described in Algorithm 1. First, a line is obtained as the virtual UCL. Then, two maximum lines perpendicular to the UCL are determined: one is the maximum thickness of the myometrium relative to the UCL, and the other is the maximum extent of the tumor relative to the UCL. The ratio of the two line lengths equals the depth of MI. The general formula of an ellipse is shown in Eq. (1).

Fitzgibbon et al. proposed a direct least-squares ellipse fit [29], which fits an ellipse to discrete data by minimizing the algebraic distance subject to the constraint \(4ac-b^{2} =1\); it is easy to implement and extremely robust. Here a, b, c, d, e, and f are the fitted ellipse parameters obtained from the set of points (x, y) extracted from the input uterine contour. The algorithm is applied to the uterine contour in the segmentation image, and the major axis of the fitted ellipse is taken as the UCL (Fig. 9f):

$$\begin{aligned} a{x^2} + bxy + c{y^2} + dx + ey + f = 0. \end{aligned}$$
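The constrained fit can be implemented with plain linear algebra. The sketch below is a minimal NumPy version using the numerically stable Halir–Flusser formulation of the same least-squares problem under the constraint \(4ac-b^{2}=1\); it returns the coefficients (a, b, c, d, e, f) of the equation above, from which the major axis (the UCL) can then be extracted.

```python
import numpy as np

def fit_ellipse(x, y):
    """Direct least-squares ellipse fit (Fitzgibbon et al.), in the
    numerically stable Halir-Flusser partitioned form.

    Minimizes the algebraic distance subject to 4ac - b^2 = 1 and
    returns the coefficients (a, b, c, d, e, f) of
    a x^2 + b xy + c y^2 + d x + e y + f = 0.
    """
    D1 = np.column_stack([x * x, x * y, y * y])    # quadratic part
    D2 = np.column_stack([x, y, np.ones_like(x)])  # linear part
    S1, S2, S3 = D1.T @ D1, D1.T @ D2, D2.T @ D2
    T = -np.linalg.solve(S3, S2.T)                 # (d,e,f) in terms of (a,b,c)
    M = S1 + S2 @ T                                # reduced scatter matrix
    M = np.array([M[2] / 2.0, -M[1], M[0] / 2.0])  # premultiply by C1^{-1}
    eigval, eigvec = np.linalg.eig(M)
    eigvec = eigvec.real
    cond = 4 * eigvec[0] * eigvec[2] - eigvec[1] ** 2
    a1 = eigvec[:, cond > 0][:, 0]                 # the unique ellipse solution
    return np.concatenate([a1, T @ a1])
```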


Perpendicular lines to the UCL are drawn at each point along it; for each perpendicular, the ratio of the distance to its intersection with the tumor border to the distance to its intersection with the uterine border is calculated, and the maximum ratio is taken as the MI depth (m, n in Fig. 9f).
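The perpendicular-line search can be sketched on binary segmentation masks as follows. The ray-marching, sampling density, and boundary handling below are simplifications of the paper's procedure, intended only to make the maximum-ratio idea concrete.

```python
import numpy as np

def mi_depth(uterus_mask, tumor_mask, p0, p1, n_samples=100):
    """Estimate MI depth from binary masks: walk along the UCL from p0 to
    p1 (points given as (row, col)), march perpendicular rays on both
    sides, and take the maximum ratio of tumor extent to uterine extent.
    A simplified sketch; not the paper's exact boundary handling.
    """
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    d = p1 - p0
    n = np.array([-d[1], d[0]]) / np.hypot(*d)  # unit normal to the UCL
    h, w = uterus_mask.shape

    def extent(mask, start, direction):
        # number of steps from `start` along `direction` still inside `mask`
        for step in range(1, max(h, w)):
            r, c = (start + step * direction).round().astype(int)
            if not (0 <= r < h and 0 <= c < w) or not mask[r, c]:
                return step - 1
        return max(h, w) - 1

    best = 0.0
    for t in np.linspace(0.0, 1.0, n_samples):
        p = p0 + t * d
        for direction in (n, -n):  # both sides of the UCL
            uterus_ext = extent(uterus_mask, p, direction)
            tumor_ext = extent(tumor_mask, p, direction)
            if uterus_ext > 0:
                best = max(best, tumor_ext / uterus_ext)
    return best
```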

Algorithm 1 The UCL generation algorithm

Validation and statistics

A test dataset containing 68 images from 46 randomly selected patients is used to validate the performance of the CAD method. For a given patient, the sequence of sagittal T2WI images (the number of images varies from 19 to 23) is first fed into the object detection network to select the optimal MRI slices in which the tumor and uterus are clearly visualized. Then, the radiologist-selected slices are cropped according to the detection boxes predicted by the object detection network and fed into the semantic segmentation network to obtain segmentation maps of the uterine and tumor regions. Finally, the UCL is generated using the UCLGA to yield the infiltration depth and the MI classification. Statistical analyses are performed in SPSS (version 26.0, SPSS Inc.) and p-values are obtained by t-test.
