Acad Radiol. 2008 Oct;15(10):1234-45. doi: 10.1016/j.acra.2008.04.016.

Performance of breast ultrasound computer-aided diagnosis: dependence on image selection


Nicholas P Gruszauskas et al. Acad Radiol. 2008 Oct.

Abstract

Rationale and objectives: The automated classification of sonographic breast lesions is generally accomplished by extracting and quantifying various features from the lesions. The selection of images to be analyzed, however, is usually left to the radiologist. Here we present an analysis of the effect that image selection can have on the performance of a breast ultrasound computer-aided diagnosis system.

Materials and methods: A database of 344 different sonographic lesions was analyzed for this study (219 cysts/benign processes, 125 malignant lesions). The database was collected in an institutional review board-approved, Health Insurance Portability and Accountability Act-compliant manner. Three different image selection protocols were used in the automated classification of each lesion: all images, first image only, and randomly selected images. After image selection, two different protocols were used to classify the lesions: (a) the average feature values were input to the classifier or (b) the classifier outputs were averaged together. Both protocols generated an estimated probability of malignancy. Round-robin analysis was performed using a Bayesian neural network-based classifier. Receiver-operating characteristic analysis was used to evaluate the performance of each protocol. Significance testing of the performance differences was performed via 95% confidence intervals and noninferiority tests.
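
As a concrete illustration, the following is a minimal sketch (not the authors' code) of the two classification protocols described above. It assumes that each image of a lesion has already been reduced to a fixed-length feature vector, and that a hypothetical "classifier" callable maps one feature vector to an estimated probability of malignancy.

import numpy as np

def classify_protocol_a(image_features, classifier):
    """Protocol A: average the per-image feature vectors, then classify once."""
    mean_features = np.mean(image_features, axis=0)   # one feature vector per lesion
    return float(classifier(mean_features))           # estimated P(malignancy) for the lesion

def classify_protocol_b(image_features, classifier):
    """Protocol B: classify each image, then average the classifier outputs."""
    per_image_scores = [classifier(f) for f in image_features]
    return float(np.mean(per_image_scores))           # averaged estimated P(malignancy)

Either protocol yields one lesion-level probability of malignancy, which can then be pooled across the 344 lesions for the round-robin evaluation described above.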

Results: The differences in the area under the receiver-operating characteristic curves were never more than 0.02 for the primary protocols. Noninferiority was demonstrated between these protocols with respect to standard input techniques (all images selected and feature averaging).
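
As a hedged sketch only (the abstract does not specify the authors' exact statistical procedure or noninferiority margin), a comparison of two protocols' areas under the ROC curve could be approached as follows, using a bootstrap 95% confidence interval on the AUC difference; the margin delta mentioned in the final comment is a placeholder assumption.

import numpy as np
from sklearn.metrics import roc_auc_score

def auc_difference_ci(y_true, scores_ref, scores_alt, n_boot=2000, seed=0):
    """Bootstrap 95% CI on AUC(alt) - AUC(ref), resampling over lesions."""
    y_true = np.asarray(y_true)
    scores_ref = np.asarray(scores_ref)
    scores_alt = np.asarray(scores_alt)
    rng = np.random.default_rng(seed)
    n = len(y_true)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)              # resample lesions with replacement
        if len(np.unique(y_true[idx])) < 2:      # AUC needs both classes present
            continue
        diffs.append(roc_auc_score(y_true[idx], scores_alt[idx])
                     - roc_auc_score(y_true[idx], scores_ref[idx]))
    return np.percentile(diffs, [2.5, 97.5])

# Noninferiority of the alternative protocol is suggested if the lower CI bound
# on the AUC difference stays above -delta (e.g., delta = 0.05, assumed here).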

Conclusion: We have demonstrated that our automated lesion classification scheme is robust and performs well when subjected to variations in user input.


Conflict of interest statement

CONFLICTS OF INTEREST: M. L. Giger is a shareholder in and receives research funding from R2 Technology/Hologic (Sunnyvale, CA). In accordance with the University of Chicago Conflict of Interest Policy, investigators publicly disclose any actual or potential significant financial interest that would reasonably appear to be directly and significantly affected by the research activities.

Figures

Figure 1
The distribution of the number of images available per lesion.
Figure 2
Four different images depicting the same physical lesion. In the “all images” view selection protocol, features from all four of the images are extracted and used in analysis (indicated by a solid outline). In the “first image only” view selection protocol, only features from the first image are used in analysis (indicated by the dashed-dotted outline). In the “random images” view selection protocol, only features from a randomly selected group of images are used in analysis (indicated by a dashed outline); in this example, two of the four images were randomly selected via this protocol.
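
For illustration only, a minimal sketch of the three view selection protocols applied to the list of images acquired for one lesion; the subset size drawn by the random protocol is an assumption, since the text does not specify how many images are sampled.

import random

def select_views(images, protocol, rng=random):
    """Return the subset of a lesion's images used for feature extraction."""
    if protocol == "all":
        return list(images)                  # every acquired image
    if protocol == "first":
        return list(images[:1])              # only the first acquired image
    if protocol == "random":
        k = rng.randint(1, len(images))      # assumed: subset size chosen at random
        return rng.sample(list(images), k)   # random subset of the images
    raise ValueError("unknown protocol: " + protocol)
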
Figure 3
A flowchart depicting the two main protocols used to evaluate our Bayesian neural network (BNN) classifier. In protocol A, features from multiple images of the same physical lesion are extracted and averaged together before they are input into the classifier. In protocol B, features from multiple images of the same physical lesion are input into the classifier directly, and the classifier outputs from each of these images are then averaged together.
Figure 4
An example of a lesion with a relatively large difference in estimated probability of malignancy between the two classification protocols. Both images depict the same physical lesion, a biopsy-proven carcinoma. The estimated probabilities of malignancy for the two images individually are 0.8495 and 0.3834, respectively. The estimated probability of malignancy for the lesion is 0.8589 when using classification protocol A (feature averaging) and 0.6165 when using protocol B (classifier output averaging), a difference of 0.24 between the two protocols. The “all images” view selection protocol was used in this example.
Figure 5
An example of a lesion with a relatively small difference in estimated probability of malignancy between the two classification protocols. All three images depict the same physical lesion, an aspiration-proven cyst. The estimated probabilities of malignancy for the three images individually are 0.0791, 0.0388, and 0.0323, respectively. The estimated probability of malignancy for the lesion is 0.0434 when using classification protocol A (feature averaging) and 0.0501 when using protocol B (classifier output averaging), a difference of 0.007 between the two protocols. The “all images” view selection protocol was used in this example.
Figure 6
ROC curves resulting from the round-robin testing of the different view selection protocols when using feature averaging during classification (protocol A) (N=344 for each test).
Figure 7
ROC curves resulting from the round-robin testing of the different view selection protocols when using classifier output averaging during classification (protocol B) (N=344 for each test).
Figure 8
ROC curves resulting from the round-robin testing of the different view selection protocols when no averaging is used (i.e., neither feature nor classifier output averaging) during classification (protocol C) (N=1067 for the “all images” protocol and N=517 for the “random images” protocol).
Figure A1
Histogram depicting the distribution of overlap values between center-point-generated lesion outlines and random-point-generated lesion outlines.
