Curr Protoc Bioinformatics. 2012 Jun:Chapter 11:11.9.1-11.9.20.
doi: 10.1002/0471250953.bi1109s38.

Using cloud computing infrastructure with CloudBioLinux, CloudMan, and Galaxy

Enis Afgan et al. Curr Protoc Bioinformatics. 2012 Jun.

Abstract

Cloud computing has revolutionized availability and access to computing and storage resources, making it possible to provision a large computational infrastructure with only a few clicks in a Web browser. However, those resources are typically provided in the form of low-level infrastructure components that need to be procured and configured before use. In this unit, we demonstrate how to utilize cloud computing resources to perform open-ended bioinformatic analyses, with fully automated management of the underlying cloud infrastructure. By combining three projects, CloudBioLinux, CloudMan, and Galaxy, into a cohesive unit, we have enabled researchers to gain access to more than 100 preconfigured bioinformatics tools and gigabytes of reference genomes on top of the flexible cloud computing infrastructure. The protocol demonstrates how to set up the available infrastructure and how to use the tools via a graphical desktop interface, a parallel command-line interface, and the Web-based Galaxy interface.

Figures

Figure 1
A snapshot of the BioCloudCentral portal showing all the form fields that are required to instantiate a CloudBioLinux and CloudMan instance.
Figure 2
BioCloudCentral monitor page showing the details about the started instance. This page provides a direct link to the new instance as well as an option to download user data. This user data can be used to restart this same instance from the AWS console by uploading it in the instance wizard request form.
Figure 3
The CloudMan web console used to manage the cluster.
Figure 4
The initial CloudMan cluster configuration box. Here, it is possible to choose from the different cluster types supported by CloudMan. Depending on the cluster type, input may be required.
Figure 5
The main CloudMan interface used to control and manage the cloud cluster. Through this interface it is possible to add and remove nodes from the cluster, monitor the status of cluster services, and manage cluster features such as auto-scaling and instance sharing.
Figure 6
The NX client properties box specifying the IP address of the instance and the choice of GNOME desktop; both are required to establish a successful connection.
Figure 7
The remote CloudBioLinux graphical interface. Via this interface, it is possible to interact with the system as if it were a local workstation; standard Ubuntu menus and tools are available via the point-and-click interface.
Figure 8
ClustalX application on the remote instance with the sample dataset loaded.
Figure 9
A snapshot of the MrBayes block (file SP1_file2.nxs) that needs to be manually adjusted with the results from step 5 of Basic Protocol 2. Append the edited block to the end of file SP1_file1.nxs and save the resulting file as SP1_file3.nxs.
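The file-joining step in the Figure 9 caption could be scripted rather than done by hand. The following is only a minimal sketch, not the protocol's own method: the helper name append_block is hypothetical, and it assumes the block in SP1_file2.nxs has already been edited manually as the caption describes.

```python
from pathlib import Path


def append_block(base_path, block_path, out_path):
    """Write the contents of base_path followed by block_path to out_path.

    Hypothetical helper: it only concatenates text and does not validate
    NEXUS syntax or the contents of the appended block.
    """
    base = Path(base_path).read_text()
    block = Path(block_path).read_text()
    # Make sure the appended block starts on its own line.
    if base and not base.endswith("\n"):
        base += "\n"
    Path(out_path).write_text(base + block)
    return out_path
```

With the three files from the caption in the working directory, the call would be append_block("SP1_file1.nxs", "SP1_file2.nxs", "SP1_file3.nxs"). The input files are left unchanged, so the step can be repeated if the block needs further adjustment.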
Figure 10
Galaxy history view with the two RNA datasets transferred from modENCODE.
Figure 11
The Cuffcompare tool interface within Galaxy with all the options described in Basic Protocol 4, step 6 selected.
Figure 12
List of Gene Ontology terms found overrepresented in the submitted dataset, ordered by their corresponding p-values, as returned by the DAVID tool.
