BMC Bioinformatics. 2007 Mar 30;8:110.
doi: 10.1186/1471-2105-8-110.

Fast automated cell phenotype image classification

Nicholas A Hamilton et al. BMC Bioinformatics.

Abstract

Background: The genomic revolution has led to rapid growth in sequencing of genes and proteins, and attention is now turning to the function of the encoded proteins. In this respect, microscope imaging of a protein's sub-cellular localisation is proving invaluable, and recent advances in automated fluorescent microscopy allow protein localisations to be imaged in high throughput. Hence there is a need for large-scale automated computational techniques to efficiently quantify, distinguish and classify sub-cellular images. While image statistics have proved highly successful in distinguishing localisation, commonly used measures are relatively slow to compute and often require cells to be individually selected from experimental images, which limits both throughput and the range of potential applications. Here we introduce threshold adjacency statistics, the essence of which is to threshold the image and to count the above-threshold pixels that have a given number of adjacent above-threshold pixels. These novel measures are shown to distinguish and classify images of distinct sub-cellular localisation with high speed and accuracy without image cropping.

Results: Threshold adjacency statistics are applied to classification of protein sub-cellular localization images. They are tested on two image sets (available for download), one for which fluorescently tagged proteins are endogenously expressed in 10 sub-cellular locations, and another for which proteins are transfected into 11 locations. For each image set, a support vector machine was trained and tested. Classification accuracies of 94.4% and 86.6% are obtained on the endogenous and transfected sets, respectively. Threshold adjacency statistics are found to provide comparable or higher accuracy than other commonly used statistics while being an order of magnitude faster to calculate. Further, threshold adjacency statistics in combination with Haralick measures give accuracies of 98.2% and 93.2% on the endogenous and transfected sets, respectively.

Conclusion: Threshold adjacency statistics have the potential to greatly extend the scale and range of applications of image statistics in computational image analysis. They remove the need for cropping of individual cells from images, and are an order of magnitude faster to calculate than other commonly used statistics while providing comparable or better classification accuracy, both essential requirements for application to large-scale approaches.


Figures

Figure 1
Distinguishing cell images by thresholding. Images of the endoplasmic reticulum (a) and the microtubule cytoskeleton (b) are thresholded (a' and b') such that pixels with intensity in the range μ-30 to μ+30 are shown in white, where μ is the average pixel intensity of each image. Though images (a) and (b) are texturally and visually similar, images (a') and (b') are more readily distinguished. Image (a') contains more solid white regions, while (b') shows more interior speckling and feathering of edges.
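
The thresholding step described in this caption can be sketched in a few lines of plain Python. This is only an illustration and not the authors' code: the function name `threshold_mask`, the `half_width` parameter, the list-of-lists image representation, and the choice to treat both ends of the range as inclusive are all assumptions made for the example.

```python
def threshold_mask(image, half_width=30):
    """Return a binary mask with 1 where a pixel lies within
    [mu - half_width, mu + half_width], mu being the image mean.

    Assumption: the image is a non-empty list of equal-length rows of
    numeric intensities, and the range endpoints are inclusive.
    """
    pixels = [p for row in image for p in row]
    mu = sum(pixels) / len(pixels)  # average pixel intensity of the image
    lo, hi = mu - half_width, mu + half_width
    return [[1 if lo <= p <= hi else 0 for p in row] for row in image]


# Toy 3x3 "image" (hypothetical values); its mean intensity is 60,
# so pixels in the range 30 to 90 become white (1).
image = [
    [10, 40, 200],
    [50, 60, 70],
    [0, 90, 20],
]
mask = threshold_mask(image)
for row in mask:
    print(row)
```

Pixels far above or below the mean are both excluded here, which follows the range given in the caption rather than a one-sided cut-off.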
Figure 2
Threshold statistics for cell images. Once a cellular image (a) is thresholded (a'), statistics are calculated from the threshold image. For each white pixel, the number of adjacent pixels that are also white is counted. Examples of having zero to eight white neighbours are given in (0)-(8). The first threshold statistic is then the number of white pixels with zero white neighbours, the second is the number with one white neighbour, and so on up to eight. These nine statistics are then normalised by dividing each by the total number of white pixels in the threshold image.
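
As a rough illustration of the counting described in this caption, the nine statistics might be computed as follows. This is a minimal sketch and not the published implementation: the function name `threshold_adjacency_statistics` is hypothetical, positions outside the image are assumed to count as non-white, and an image with no white pixels is assumed to yield nine zeros.

```python
def threshold_adjacency_statistics(mask):
    """Return nine values: the fraction of white pixels that have
    0, 1, ..., 8 white pixels among their eight neighbours.

    Assumptions: `mask` is a list of equal-length rows of 0/1 values,
    and neighbours falling outside the image count as non-white.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    counts = [0] * 9
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue  # only white pixels contribute
            neighbours = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # skip the pixel itself
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and mask[rr][cc]:
                        neighbours += 1
            counts[neighbours] += 1
    total = sum(counts)  # total number of white pixels
    if total == 0:
        return [0.0] * 9  # assumption: no white pixels gives all zeros
    return [k / total for k in counts]


# A plus-shaped threshold image: the centre pixel has four white
# neighbours and each arm has three.
plus = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
stats = threshold_adjacency_statistics(plus)
print(stats)
```

Because each value is divided by the number of white pixels, the nine statistics sum to 1 whenever the threshold image contains at least one white pixel.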
Figure 3
Sample images of the 10 localisation classes of endogenously expressed proteins. (a) Microtubule, (b) Golgi, (c) Plasma membrane, (d) Actin cytoskeleton, (e) Nucleus, (f) Endosome, (g) ER, (h) Mitochondria, (i) Peroxisome, (j) Lysosome. Scale bar 10 μm.
Figure 4
Sample images of the 11 localisation classes of transfected proteins. (a) Microtubule, (b) Golgi, (c) Plasma membrane, (d) Actin cytoskeleton, (e) Nucleus, (f) Endosome, (g) ER, (h) Mitochondria, (i) Peroxisome, (j) Lysosome, (k) Cytoplasm. Scale bar 10 μm.

