🚀 Trains-Agent Services is now included. For more information, see the Trains-Agent Services section below.
The trains-server is the backend service infrastructure for Trains. It allows multiple users to collaborate and manage their experiments. By default, Trains is set up to work with the Trains demo server, which is open to anyone and resets periodically. In order to host your own server, you will need to launch trains-server and point Trains to it.
trains-server contains the following components:
- The Trains Web-App, a single-page UI for experiment management and browsing
- RESTful API for:
  - Documenting and logging experiment information, statistics and results
  - Querying experiment history, logs and results
- A locally hosted file server for storing images and models, making them easily accessible using the Web-App
You can quickly deploy your trains-server using Docker, AWS EC2 AMI, or Kubernetes.
trains-server has two supported configurations:
- Single IP (domain) with the following open ports:
  - Web application on port 8080
  - API service on port 8008
  - File storage service on port 8081
- Sub-domain configuration with default http/s ports (80 or 443):
  - Web application on sub-domain: app.*.*
  - API service on sub-domain: api.*.*
  - File storage service on sub-domain: files.*.*
The ports 8080/8081/8008 must be available for the trains-server services.
For example, to see if port 8080 is in use:
- Linux or macOS:
  sudo lsof -Pn -i4 | grep :8080 | grep LISTEN
- Windows:
  netstat -an | find /i "8080"
Launch trains-server in any of the following formats:
- Pre-built AWS EC2 AMI
- Pre-built GCP Custom Image
- Pre-built Docker Image
- Kubernetes
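For example, a Docker-based launch on a Linux host typically comes down to fetching the docker-compose configuration and bringing the containers up. The following is a minimal sketch only, assuming docker and docker-compose are already installed and that the data directories and system settings described in the full deployment instructions are already in place:

  # Fetch the trains-server docker-compose configuration
  curl https://raw.githubusercontent.com/allegroai/trains-server/master/docker-compose.yml -o docker-compose.yml

  # Spin up the trains-server containers (Web-App, API server, file server and their databases)
  docker-compose -f docker-compose.yml up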
By default, the Trains client is set up to work with the Trains demo server.
To have the Trains client use your trains-server instead:
- Run the trains-init command for an interactive setup.
- Or, manually edit the ~/trains.conf file, making sure the server settings (api_server, web_server, files_server) are configured correctly, for example:

  api {
      # API server on port 8008
      api_server: "http://localhost:8008"

      # web_server on port 8080
      web_server: "http://localhost:8080"

      # file server on port 8081
      files_server: "http://localhost:8081"
  }
Note: If you have set up trains-server in a sub-domain configuration, there is no need to specify a port number; it is inferred from the http/s scheme.
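For instance, in a sub-domain configuration the same section might look like the following sketch, where example.com is a placeholder standing in for your own domain:

  api {
      # Web-App on the app sub-domain, default https port
      web_server: "https://app.example.com"

      # API service on the api sub-domain
      api_server: "https://api.example.com"

      # file server on the files sub-domain
      files_server: "https://files.example.com"
  }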
After launching the trains-server and configuring the Trains client to use it, you can use Trains in your experiments and view them in the Web-App served by your trains-server, for example at http://localhost:8080.
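As a quick end-to-end check, the following sketch installs the Trains client, points it at your server, and runs an experiment; my_experiment.py is only an illustrative name for any script instrumented with Trains:

  # Install the Trains client and run the interactive setup against your trains-server
  pip install trains
  trains-init

  # Run any Trains-instrumented script, then browse to http://localhost:8080 to view the experiment
  python my_experiment.py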
For more information about the Trains client, see Trains.
As of version 0.15 of trains-server, dockerized deployment includes a Trains-Agent Services container running as part of the docker container collection.
Trains-Agent Services is an extension of Trains-Agent that provides the ability to launch long-lasting jobs that previously had to be executed on local / dedicated machines. It allows a single agent to launch multiple dockers (Tasks) for different use cases, for example: an auto-scaler service (spinning up instances when the need arises and the budget allows), controllers (implementing pipelines and more sophisticated DevOps logic), optimizers (such as hyper-parameter optimization or sweeping), and applications (such as interactive Bokeh apps for increased data transparency).
The Trains-Agent Services container spins up any task enqueued into the dedicated services queue.
Every task launched by Trains-Agent Services is registered as a new node in the system, providing tracking and transparency capabilities.
You can also run Trains-Agent Services manually; see the details in trains-agent services mode.
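As a rough sketch, running the agent in services mode from a shell could look like the following; the exact flags depend on your trains-agent version, so check trains-agent --help and the trains-agent documentation for the authoritative options:

  # Dedicate an agent to the services queue, running tasks in docker containers on a CPU-only machine
  trains-agent daemon --services-mode --queue services --docker --cpu-only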
Note: It is the user's responsibility to make sure the proper tasks are pushed into the services queue. Do not enqueue training / inference tasks into the services queue, as it will put unnecessary load on the server.
trains-server provides a few additional useful features, which can be manually enabled.
To restart the trains-server, you must first stop the containers, and then restart them.
docker-compose down
docker-compose -f docker-compose.yml up
trains-server releases are also reflected in the docker-compose configuration file.
We strongly encourage you to keep your trains-server up to date by upgrading to the current release.
Note: The following upgrade instructions use the Linux OS as an example.
To upgrade your existing trains-server deployment:
- Shut down the docker containers:
  docker-compose down
- We highly recommend backing up your data directory before upgrading. Assuming your data directory is /opt/trains, to archive all data into ~/trains_backup.tgz execute:
  sudo tar czvf ~/trains_backup.tgz /opt/trains/data
  Restore instructions: to restore this example backup, remove the existing data directory and extract the archive from the filesystem root (tar strips the leading / when archiving):
  sudo rm -R /opt/trains/data
  sudo tar -xzf ~/trains_backup.tgz -C /
- Download the latest docker-compose.yml file:
  curl https://raw.githubusercontent.com/allegroai/trains-server/master/docker-compose.yml -o docker-compose.yml
- Configure Trains-Agent Services (not supported on Windows installations). If TRAINS_HOST_IP is not provided, Trains-Agent Services will use the external public address of the trains-server. If TRAINS_AGENT_GIT_USER / TRAINS_AGENT_GIT_PASS are not provided, Trains-Agent Services will not be able to access any private repositories for running service tasks.
  export TRAINS_HOST_IP=server_host_ip_here
  export TRAINS_AGENT_GIT_USER=git_username_here
  export TRAINS_AGENT_GIT_PASS=git_password_here
- Spin up the docker containers; this will automatically pull the latest trains-server build:
  docker-compose -f docker-compose.yml pull
  docker-compose -f docker-compose.yml up
* If something goes wrong along the way, check our FAQ: Common Docker Upgrade Errors.
If you have any questions, see the Trains Server FAQ, or ask on Stack Overflow using the 'trains' tag.
For feature requests or bug reports, please use GitHub issues.
Additionally, you can always find us at trains@allegro.ai
Server Side Public License v1.0
trains-server relies on both MongoDB and ElasticSearch. With the recent changes in both MongoDB's and ElasticSearch's OSS license, we feel it is our responsibility as a member of the community to support the projects we love and cherish. We believe the cause for the license change in both cases is more than just, and chose SSPL because it is the more general and flexible of the two licenses.
This is our way to say - we support you guys!