Source code, data files and utilities related to "Scala for Machine Learning"
Version 0.95e Copyright Patrick Nicolas All rights reserved 2013-2015
The source code provides software developers with a broad overview of the differences between machine learning algorithms. The reader is expected to have a good grasp of the Scala programming language along with some knowledge of basic statistics. Experience in data mining and machine learning is not a prerequisite. The examples are related to investment portfolio management and trading strategies. For readers interested in either the mathematics or the techniques implemented in this library, I strongly recommend the following readings:
- "Machine Learning: A Probabilistic Perspective" K. Murphy
- "The Elements of Statistical Learning" T. Hastie, R. Tibshirani, J. Friedman
Hardware: 2 CPU cores with 4 GB of RAM to build and run the examples on small datasets.
4 CPU cores and 8+ GB of RAM for datasets of size 75,000 or larger and/or with 50 or more features.
Operating system: no specific requirement
Software: JDK 1.7.0_45 or 1.8.0_25, Scala 2.10.3/2.10.4 or 2.11.2, and SBT 0.13+ (see the installation section for deployment).
Directory structure of the source code library for Scala for Machine Learning:
Directory structure of the source code of the examples for Scala for Machine Learning:
Library components for Scala for Machine Learning:
Build script for Scala for Machine Learning:
To build the library and tests: $(ROOT)/sbt clean compile publish-local
The Simple Build Tool (SBT) must be used to build the library from the source code, using the build.sbt file in the root directory: sbt compile publish-local
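The publish-local task copies the compiled library into the local Ivy repository (by default under ~/.ivy2/local), so that other SBT projects on the same machine can depend on it. As a rough illustration only, a separate project could declare such a dependency as sketched below. The project name, organization, artifact name and version shown here are hypothetical placeholders, not the coordinates used by this library; the actual values are the ones declared in the build.sbt file in the root directory.

```scala
// build.sbt of a hypothetical downstream project (not part of this repository).
// ASSUMPTION: the organization, artifact name and version below are placeholders.
// Replace them with the values declared in the library's own build.sbt.
name := "scalaml-client"

// One of the Scala versions listed in the software requirements above
scalaVersion := "2.11.2"

// Resolved from the local Ivy repository populated by `sbt publish-local`.
// The %% operator appends the Scala binary version (e.g. _2.11) to the artifact name.
libraryDependencies += "org.example" %% "scalaml-lib" % "0.1.0-SNAPSHOT"
```

The blank lines between settings are deliberate: early SBT 0.13 releases require them in .sbt files.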
The installation and build workflow is described in the following diagram: