Review. Med Phys. 2020 Jun;47(5):e148-e167. doi: 10.1002/mp.13649.

Machine learning techniques for biomedical image segmentation: An overview of technical aspects and introduction to state-of-art applications

Hyunseok Seo et al. Med Phys. 2020 Jun.

Abstract

In recent years, significant progress has been made in developing more accurate and efficient machine learning algorithms for the segmentation of medical and natural images. In this review article, we highlight the imperative role of machine learning algorithms in enabling efficient and accurate segmentation in the field of medical imaging. We focus on several key studies that apply machine learning methods to biomedical image segmentation. We review classical machine learning algorithms such as Markov random fields, k-means clustering, and random forests. Although such classical learning models are often less accurate than deep-learning techniques, they are often more sample efficient and have a less complex structure. We also review different deep-learning architectures, such as artificial neural networks (ANNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs), and present the segmentation results attained by those learning models that were published in the past 3 years. We highlight the successes and limitations of each machine learning paradigm. In addition, we discuss several challenges related to the training of different machine learning models, and we present some heuristics to address those challenges.

Keywords: deep learning; machine learning; medical image; overview; segmentation.

Conflict of interest statement

“The authors have no conflicts to disclose.”

Figures

Figure 1.
The architecture of the segmentation network based on kernel SVMs, using a filter bank in conjunction with kernel feature selection to generate semantic representations. Random feature maps $\varphi_1, \dots, \varphi_D$ capture the non-linear relationship between the representations and the class labels.
Figure 2.
Visualization of the random feature maps in three dimensions, using a t-SNE plot, for different bandwidth parameters $\gamma \equiv 1/(2\sigma^2)$ of the Gaussian RBF kernel $k_X(x, y) = \exp(-\gamma \|x - y\|_2^2)$. To generate the feature maps, the pre-trained VGG network is used. The red and blue regions correspond to the random feature maps generated by the pixels from each class label in a sampled colonoscopy image, respectively. To enhance the visualization, we have cropped the selected image and retained a balanced number of pixels from each class label. (a): $\gamma = 10^{-6}$, (b): $\gamma = 10^{-3}$, (c): $\gamma = 0.1$, and (d): $\gamma = 1$.
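To make the role of the bandwidth parameter concrete, the following minimal sketch evaluates the Gaussian RBF kernel for the four γ values listed in the caption and compares it with an approximation built from random feature maps. It assumes the feature maps are random Fourier features (frequencies drawn from a Gaussian with variance 2γ, phases drawn uniformly), which the caption does not state. The function names `rbf_kernel` and `make_random_feature_map`, the example points, and the number of features are illustrative choices, not taken from the article.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel k(x, y) = exp(-gamma * ||x - y||_2^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def make_random_feature_map(dim, num_features, gamma, seed=0):
    """Return phi such that dot(phi(x), phi(y)) approximates rbf_kernel(x, y, gamma).

    Assumption (not from the article): random Fourier features, with
    frequencies drawn from N(0, 2 * gamma * I) and phases drawn uniformly
    from [0, 2 * pi).
    """
    rng = random.Random(seed)
    std = math.sqrt(2.0 * gamma)
    freqs = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def phi(x):
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(freqs, phases)
        ]

    return phi


# Two hypothetical 3-dimensional feature vectors.
x = [0.2, -1.0, 0.5]
y = [1.0, 0.3, -0.4]

for gamma in (1e-6, 1e-3, 0.1, 1.0):
    phi = make_random_feature_map(dim=3, num_features=5000, gamma=gamma)
    approx = sum(p * q for p, q in zip(phi(x), phi(y)))
    print(f"gamma={gamma:g}  exact={rbf_kernel(x, y, gamma):.4f}  approx={approx:.4f}")
```

For very small γ the kernel is close to 1 for any pair of points, so the classes are hard to tell apart, while for large γ the kernel of distinct points decays towards 0.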
Figure 3.
Segmentation of angiodysplasia colonoscopy images generated by FCN on sampled test images from the GIANA challenge dataset. Top: the colonoscopy images obtained using Wireless Capsule Endoscopy (WCE). Middle: the heat maps depicting the soft-max output of FCN. Bottom: the heat map of the residual image, computed as the absolute difference between the proposed segmentation and the ground truth. Because it is trained on a small dataset, FCN tends to overfit and does not generalize well to unseen data.
Figure 4.
Segmentation of angiodysplasia colonoscopy images on sampled test images from the GIANA challenge dataset, generated via the kernel SVM using the VGG filter bank with kernel feature selection. The bandwidth $1/(2\sigma^2)$ of the RBF kernel is selected via maximum mean discrepancy optimization. Top: the colonoscopy images obtained using Wireless Capsule Endoscopy (WCE). Middle: the heat maps depicting the soft-max output of the kernel SVM classifier. Bottom: the heat map of the residual image, computed as the absolute difference between the proposed segmentation and the ground truth. Despite being trained on a small dataset, the kernel SVM performs well on the test dataset.
Figure 5.
Comparison of the mean IoU score (MIoU) on the test dataset for FCN (red), the kernel SVM with Mallat’s scattering network as the filter bank (green), and the kernel SVM with a pre-trained VGG network as the filter bank (blue). To tune the parameters of the Gaussian RBF kernel, a two-sample test is performed. Each plot corresponds to the performance of networks trained on a different sample size. (a): 76,800 pixels (1 image), (b): 153,600 pixels (2 images), (c): 1% of the dataset (3 images), (d): 5% of the dataset (15 images).
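For readers unfamiliar with the metric, a minimal sketch of a mean IoU computation on flattened label masks follows. The caption does not say how the mean is taken, so averaging the per-class IoU over the class labels is an assumption here. The helper names `iou` and `mean_iou` and the toy masks are hypothetical.

```python
def iou(pred, truth, label):
    """Intersection over union of the pixels assigned to `label`."""
    inter = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    union = sum(1 for p, t in zip(pred, truth) if p == label or t == label)
    # A class absent from both masks has no defined IoU.
    return inter / union if union else None


def mean_iou(pred, truth, labels):
    """Average the per-class IoU over the classes that occur (assumed convention)."""
    scores = [s for s in (iou(pred, truth, c) for c in labels) if s is not None]
    return sum(scores) / len(scores)


# Toy flattened masks: 1 = lesion pixel, 0 = background pixel.
pred = [0, 0, 1, 1, 1, 0]
truth = [0, 1, 1, 1, 0, 0]

print(iou(pred, truth, 1))            # 0.5 (2 shared lesion pixels out of 4 in the union)
print(mean_iou(pred, truth, (0, 1)))  # 0.5
```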
Figure 6.
The architecture of the artificial neural network (ANN). (a) Mathematical model of a perceptron (node). (b) Multi-layer perceptron (MLP) structure for ANN. Each node in the hidden layer of (b) is described mathematically in (a). (c) An example of back-propagation. The loss is minimized by updating the weight w according to the gradient of the loss function with respect to w, obtained via the chain rule; b is the constant bias. (d) An example of the convolution operation in a CNN. The same kernel weights are reused for every convolution operation that produces an output.
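Panels (a) and (c) can be illustrated with a minimal sketch of a single perceptron and one gradient-descent update of its weights. The sigmoid activation, the squared-error loss, the learning rate, and all numeric values are assumptions chosen for illustration, since the caption does not specify them.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def perceptron(x, w, b):
    """Weighted sum of the inputs plus the bias b, passed through an activation."""
    return sigmoid(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)


def loss(x, w, b, target):
    """Squared-error loss (an assumed choice of loss function)."""
    return 0.5 * (perceptron(x, w, b) - target) ** 2


def grad_w(x, w, b, target):
    """Gradient of the loss with respect to w via the chain rule:
    dL/dw_i = dL/dy * dy/dz * dz/dw_i, with z = w . x + b and y = sigmoid(z).
    """
    y = perceptron(x, w, b)
    dL_dy = y - target
    dy_dz = y * (1.0 - y)
    return [dL_dy * dy_dz * x_i for x_i in x]  # dz/dw_i = x_i


# Hypothetical input, weights, bias, and target.
x, w, b, target = [1.0, -2.0, 0.5], [0.1, 0.4, -0.3], 0.2, 1.0
lr = 0.5  # learning rate (assumed)

# One gradient-descent step on the weights; the bias is held constant.
w_new = [w_i - lr * g_i for w_i, g_i in zip(w, grad_w(x, w, b, target))]

print(f"loss before: {loss(x, w, b, target):.4f}")
print(f"loss after:  {loss(x, w_new, b, target):.4f}")
```

In a multi-layer network the same chain rule is applied layer by layer, propagating the gradient from the output back to each weight.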
Figure 7.
The architecture of the recurrent neural network (RNN).
Figure 8.
Network architecture of the patch-wise CNN for liver/liver-tumor segmentation.
Figure 9.
Network architecture of (a) FCN and (b) U-Net.
Figure 10.
(a) The results of the liver and liver-tumor segmentation. The yellow, purple, red, green, and blue lines correspond to SBBS-CNN, dual-frame U-Net, atrous pyramid pooling, the proposed network, and the ground truth, respectively. (b) and (c) show the contours of the segmentation results in (a).
Figure 11.
Network architecture of a cascaded CNN (an example combining a patch-wise CNN and an FCN) for tumor segmentation. The first network is trained for ROI detection or rough classification, and the second network is further tuned for the final segmentation.
Figure 12.
Descriptions of (a) stride and (b) atrous convolution. The stride is the amount by which the convolution kernel shifts, and the atrous rate is the spacing between kernel elements (weights). (c) Structure of atrous pyramid pooling. Pyramid pooling forms a feature map that contains both local and global context information by applying different sub-region representations, followed by upsampling and concatenation layers.
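The difference between stride and atrous (dilated) convolution can be sketched in one dimension. The function `conv1d`, the example signal, and the kernel are illustrative only. The sketch uses no padding and, as is usual in deep-learning libraries, does not flip the kernel.

```python
def conv1d(signal, kernel, stride=1, dilation=1):
    """1D convolution without padding.

    stride: how far the kernel shifts between consecutive outputs.
    dilation: spacing between the signal samples seen by adjacent kernel weights.
    """
    span = (len(kernel) - 1) * dilation + 1  # receptive field of one output
    return [
        sum(k * signal[start + i * dilation] for i, k in enumerate(kernel))
        for start in range(0, len(signal) - span + 1, stride)
    ]


signal = [1, 2, 4, 7, 11, 16, 22, 29]
kernel = [1, 0, -1]  # the same weights are reused at every position

print(conv1d(signal, kernel))              # [-3, -5, -7, -9, -11, -13]
print(conv1d(signal, kernel, stride=2))    # [-3, -7, -11]
print(conv1d(signal, kernel, dilation=2))  # [-10, -14, -18, -22]
```

A larger stride reduces the number of outputs, whereas a larger dilation widens the receptive field of each output (here from 3 to 5 samples) without adding weights.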
Figure 13.
The network architecture that ranked first in the 2018 BRATS challenge.
Figure 14.
Structure of the Generative Adversarial Network (GAN).
