Skip to content

PowerGraph: A framework for large-scale machine learning and graph computation.

Notifications You must be signed in to change notification settings

foxspy/PowerGraph

Repository files navigation

                         Graphlab
                         --------

=======
License
=======

GraphLab is free software licensed under the Apache 2.0 License. See
license/LICENSE.txt for details.

The GraphLab source distribution contains a modified version of METIS
which is under a different license. See the file 
src/graphlab/extern/metis/LICENSE for further details.

For more information, contact Yucheng Low at ylow@cs.cmu.edu or 
Joseph Gonzalez at jegonzal@cs.cmu.edu .

============
Introduction
============
Designing and implementing efficient and provably correct parallel 
machine learning (ML) algorithms can be very challenging. Existing 
high-level parallel abstractions like MapReduce are often insufficiently 
expressive while low-level tools like MPI and Pthreads leave ML experts 
repeatedly solving the same design challenges. By targeting common 
patterns in ML, we developed GraphLab, which improves upon abstractions 
like MapReduce by compactly expressing asynchronous iterative algorithms 
with sparse computational dependencies while ensuring data consistency 
and achieving a high degree of parallel performance.

For more details on the GraphLab see http://graphlab.ml.cmu.edu


============
Dependencies
============
We require Cmake, Boost and Kyoto Cabinet. If you use distributed
GraphLab, we also require MPI.


CMake
-----
You can obtain CMake from 
 - http://www.cmake.org/cmake/resources/software.html


Boost
-----
We have tested with Boost 1.46. Many earlier versions, and all later
versions should work as well.

To install Boost 1.46 with the minimal set of libraries:

   # download
   wget http://sourceforge.net/projects/boost/files/boost/1.46.1/boost_1_46_1.tar.bz2/download boost.tar.bz2
   cd boost_1_46_1
   
   # configure
   ./bootstrap.sh --with-libraries="filesystem,program_options,system,iostreams"
   
   # compile
   ./bjam --threading=multi --link=static --variant=release | tee bjamlog.txt
   
   # install 
   sudo ./bjam install


Kyoto Cabinet
-------------
We have tested with Kyoto Cabinet 1.2.53. Later versions should work
as well.

To install Kyoto Cabinet 1.2.53: 
   # download
   wget http://fallabs.com/kyotocabinet/pkg/kyotocabinet-1.2.53.tar.gz kyotocabinet.tar.gz
   cd kyotocabinet-1.2.53
   
   # configure
   ./configure
   
   # compile
   make
    
   # install
   sudo make install


MPI
---
We have tested with MPICH2. Open MPI might also work. 
Mac OS X comes with MPI and works out of the box.

To install MPICH2:
   # download
   http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/1.3.2p1/mpich2-1.3.2p1.tar.gz
   cd mpich2-1.3.2p1/
   
   # configure
   ./configure
   
   # compile
   make
   
   # install
   sudo make install
   

===========
Compilation
===========
GraphLab has three dependencies: CMake, Boost and Kyoto Cabinet

If you already have all dependencies satistied,
1a: ./configure

Otherwise, autodetect, install and build boost, CMake and Kyoto Cabinet.
Note that this will install all the dependencies in the deps/ directory
and not install them globally. We STRONGLY recommend installing all the 
dependencies globally. See the "Dependencies" section.

1b: ./configure --bootstrap 


The configure script will create three subdirectories
debug/
release/
profile/
With different configuration profiles.

cd into any of the subdirectories and run "make" will compile the entire 
GraphLab library + demoapps + unit tests. "make install" will install the
graphlab library.


==================
Running Unit Tests
==================
# switch into any of the build subdirectories
1:  cd debug/ or cd release/ or cd profile/

# cd into the tests directory
2:  cd tests/

# run the test script 
3:  ./runtest.sh

If you encounter any failures, please send us an email.


=====================
Writing Your Own Apps
=====================
There are two ways to write your own apps.
1: To work in the GraphLab source tree,
or
2: Install and link against Graphlab

1: Working in the GraphLab Source Tree
---------------------------------------
This is the best option if you just want to try using GraphLab quickly. GraphLab
uses the fairly sophisticated CMake build system enabling you to quickly create
a c++ project without having to write complicated Makefiles. 

1: Create your own sub-directory in the apps/ directory
   for example apps/my_app
   
2: Create a CMakeLists.txt in apps/my_app containing the following lines:

  project(GraphLab) 
  add_graphlab_executable(my_app [List of cpp files space seperated]) 
  target_link_libraries(my_app [Additional libraries space seperated])

  Substituting the right values into the square brackets. For instance:

  project(GraphLab) 
  add_graphlab_executable(my_app my_app.cpp) 
  target_link_libraries(my_app tcmalloc)

  If you have no additional libraries, you can comment out the 
  target_link_libraries line by prepending it with a # character.

3: Running ./configure --bootstrap in the base of the source tree will then
create a debug/ release/ and profile/ directory with different build 
configurations.

4: Running "make" in any of the build directories will compile your program
in the apps/my_app directory.



2: Installing and Linking Against GraphLab
-------------------------------------------
Run the ./configure --bootstrap script to create the build directories. You can
change the default installation location (currently /usr/local) using the
--prefix option. For more options see ./configure

Once you have run the configure script change directories to release/ and then
run make and then make install . This will install the following:

  include/graphlab.hpp
      The primary GraphLab header 
  include/graphlab/...
      The folder containing the headers for the rest of the GraphLab library 
  lib/libgraphlab.a
      The main GraphLab binary 
    
Once you have installed GraphLab you can compile your program by running:

  g++ hello_world.cpp -lboost_program_options -lgraphlab 
  
  
Known Installation Issues
=========================

Boost configuration
--------------------
If running ./configure fails because of Boost libraries are not found, you might
need to declare environment variable BOOST_ROOT which points to the installation
directory of Boost. You can do this using:

env BOOST_ROOT=[location of boost installation] ./configure --bootstrap

Alternatively, you can try using our automatic dependency installing tools by
running

./configure --bootstrap

Please let us know if you have trouble configuring GraphLab on your system.

Mac OS X
--------
on Mac OS X, you also need to include Boost libraries in the list of
dynamic libraries defined in environment variable DYLD_LIBRARY_PATH. Here is an
example:
declare -x DYLD_LIBRARY_PATH=":${BOOST_ROOT}/stage/lib:${DYLD_LIBRARY_PATH}" 

About

PowerGraph: A framework for large-scale machine learning and graph computation.

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

  • C++ 85.5%
  • Python 6.0%
  • CMake 2.8%
  • Java 2.1%
  • JavaScript 1.5%
  • Shell 0.6%
  • Other 1.5%