Major changes leading up to version 0.6.2
This commit brings a major overhaul to the structure of the code, while leaving the
API *mostly* intact.

Major Changes
.............

- Batch system definitions are now fully modular and contained in the `fyrd.batch_systems`
  package. `options.py` has also been moved into this package, which allows any programmer
  to add a new batch system definition to fyrd by just editing the contents of that small
  subpackage.
- Updated the console script to allow running arbitrary shell scripts on the console with
  `fyrd run`, or submitting any number of existing job files using `fyrd sub`. Also added
  new alias scripts, `frun` and `fsub`, for these new modes. Both modes accept the
  `--wait` argument, meaning they will block until the jobs complete.
- Documentation overhauled to update the API docs and add instructions on creating a new
  batch system; these instructions are duplicated in the README within the `batch_systems`
  package folder.
- **Local support temporarily removed**. It didn't work very well and it broke the new
  batch system structure; I hope to add it back shortly.
- Full support for array job parsing for both torque and slurm. We now create one job
  entry for each array job child, instead of one per array job. To manage this, the
  `fyrd.queue.Queue.QueueJob` class was moved to `fyrd.queue.QueueJob` and split to add a
  child class, `fyrd.queue.QueueChild`. Every array job now has one `fyrd.queue.QueueJob`
  entry, plus one `fyrd.queue.QueueChild` entry for each of its children, which are stored
  in the `children` dictionary of the `fyrd.queue.QueueJob` class.
- Added a `get` method to the `fyrd.queue.Queue` class that lets a user fetch outputs from
  a list of jobs; it loops continuously through the jobs so that no job is lost.
- Added `tqdm <https://pypi.python.org/pypi/tqdm>`_ as a requirement and enabled progress
  bars in multi-job `wait` and `get`.
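
A rough sketch of what a multi-job `get` has to do: poll every job in a loop so that
slow jobs are never dropped, while keeping results in submission order. This is a
hypothetical simplification using a stub job class, not fyrd's actual implementation:

```python
# Hypothetical sketch of a multi-job ``get`` loop (not fyrd's real code).
# StubJob stands in for a submitted cluster job that finishes after a few
# status polls.

class StubJob(object):
    def __init__(self, result, polls_needed):
        self._result = result
        self._polls = polls_needed

    def done(self):
        """Return True once the job has been polled enough times."""
        self._polls -= 1
        return self._polls <= 0

    def get(self):
        return self._result


def get_all(jobs):
    """Collect results from every job, looping until none remain pending."""
    results = [None] * len(jobs)
    pending = set(range(len(jobs)))
    while pending:
        # Poll each still-pending job once per pass; finished jobs are
        # harvested immediately, unfinished ones stay in the pending set.
        for i in list(pending):
            if jobs[i].done():
                results[i] = jobs[i].get()
                pending.discard(i)
    return results


jobs = [StubJob(i * 2, polls_needed=(i % 3) + 1) for i in range(5)]
print(get_all(jobs))  # [0, 2, 4, 6, 8]
```

Results come back in submission order even though the jobs finish in a different order.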

Minor Changes
.............

- Updated the documentation to include this changelog, which will only contain change information
  for version 0.6.2a1 onwards.
- Added additional tests to cover the new changes as well as generally increase test suite
  coverage.
- Several small bug fixes

MikeDacre committed Aug 8, 2017
1 parent d35a160 commit d744842
Showing 39 changed files with 2,579 additions and 2,571 deletions.
20 changes: 9 additions & 11 deletions README.rst
@@ -2,9 +2,9 @@
Fyrd
####

One liner script and function submission to torque, slurm, or a local machines
with dependency tracking using python. Uses the same syntax irrespective of
cluster environment!
One liner script and function submission to torque or slurm clusters with
dependency tracking using python. Uses the same syntax irrespective of cluster
environment!

Learn more at https://fyrd.science, https://fyrd.rtfd.com, and
https://github.com/MikeDacre/fyrd
@@ -20,7 +20,7 @@ https://github.com/MikeDacre/fyrd
+---------+----------------------------------------------------+
| License | MIT License, property of Stanford, use as you wish |
+---------+----------------------------------------------------+
| Version | 0.6.1b9 |
| Version | 0.6.2a1 |
+---------+----------------------------------------------------+


@@ -134,9 +134,7 @@ can be done like this:
jobs = []
for i in huge_list:
jobs.append(fyrd.Job(my_function, (i,), profile='small').submit())
results = []
for i in jobs:
results.append(i.get())
results = fyrd.get(jobs)
The results list in this example will contain the function outputs, even if
those outputs are integers, objects, or other Python types. Similarly, shell
@@ -148,10 +146,9 @@ scripts can be run like this:
jobs = []
for i in [i for i in os.listdir('.') if i.endswith('.gz')]:
jobs.append(fyrd.Job(script.format(i), profile='long').submit())
results = []
for i in jobs:
i.wait()
results.append(i.stdout)
results = fyrd.get(jobs)
for i in results:
print(i.stdout)
Results will contain the contents of STDOUT for the submitted script

@@ -295,6 +292,7 @@ This software requires the following external modules:
- `tabulate <https://pypi.python.org/pypi/tabulate>`_ — allows readable printing of help
- `six <https://pypi.python.org/pypi/six>`_ — makes python2/3 cross-compatibility easier
- `tblib <https://pypi.python.org/pypi/tblib>`_ — allows me to pass Tracebacks between nodes
- `tqdm <https://pypi.python.org/pypi/tqdm>`_ — pretty progress bars for multi-job get and wait

Cluster Dependencies
....................
4 changes: 4 additions & 0 deletions bin/frun
@@ -0,0 +1,4 @@
#!/usr/bin/env bash
# Simple alias for `fyrd run`
# Allows the user to run arbitrary shell commands on the cluster
fyrd run "$@"
4 changes: 4 additions & 0 deletions bin/fsub
@@ -0,0 +1,4 @@
#!/usr/bin/env bash
# Simple alias for `fyrd submit`
# Allows the user to submit existing job files on the cluster
fyrd submit "$@"
Binary file modified docs/fyrd_manual.pdf
Binary file not shown.
1 change: 1 addition & 0 deletions docs/sphinx/adding_batch_systems.rst
@@ -0,0 +1 @@
.. include:: ../../fyrd/batch_systems/README.rst
117 changes: 51 additions & 66 deletions docs/sphinx/api.rst
@@ -35,34 +35,35 @@ Methods

.. automethod:: fyrd.queue.Queue.wait

.. automethod:: fyrd.queue.Queue.wait_to_submit
.. automethod:: fyrd.queue.Queue.get

.. automethod:: fyrd.queue.Queue.get_jobs
.. automethod:: fyrd.queue.Queue.wait_to_submit

.. automethod:: fyrd.queue.Queue.update
.. automethod:: fyrd.queue.Queue.test_job_in_queue

.. autoclass:: fyrd.queue.Queue.QueueJob
.. automethod:: fyrd.queue.Queue.get_jobs

.. autoexception:: fyrd.queue.QueueError
.. automethod:: fyrd.queue.Queue.get_user_jobs

fyrd.queue functions
....................
.. automethod:: fyrd.queue.Queue.update

parsers
~~~~~~~
.. automethod:: fyrd.queue.Queue.check_dependencies

.. autofunction:: fyrd.queue.queue_parser
fyrd.queue Jobs
................

.. autofunction:: fyrd.queue.torque_queue_parser
These classes hold information about individual jobs: `QueueJob` describes primary
jobs, and `QueueChild` describes individual array jobs (which are stored in the
`children` attribute of `QueueJob` objects).

.. autofunction:: fyrd.queue.slurm_queue_parser
.. autoclass:: fyrd.queue.QueueJob

utilities
~~~~~~~~~
.. autoclass:: fyrd.queue.QueueChild
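
To make the parent/child layout concrete, here is a hypothetical, self-contained
illustration of how array-job children hang off a parent via a `children` dictionary.
These are simplified stand-ins, not the real `QueueJob`/`QueueChild` classes:

```python
# Simplified stand-ins for the parent/child queue classes (illustrative only).

class ChildSketch(object):
    """One array-job child, keyed by its array index in the parent."""
    def __init__(self, array_index, state):
        self.array_index = array_index
        self.state = state


class ParentSketch(object):
    """The primary job; children live in a dict of array_index -> child."""
    def __init__(self, job_id):
        self.id = job_id
        self.children = {}

    @property
    def state(self):
        # The parent counts as completed only when every child has completed.
        states = {child.state for child in self.children.values()}
        return 'completed' if states == {'completed'} else 'running'


parent = ParentSketch(4242)
parent.children[0] = ChildSketch(0, 'completed')
parent.children[1] = ChildSketch(1, 'running')
print(parent.state)  # running
parent.children[1].state = 'completed'
print(parent.state)  # completed
```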

.. autofunction:: fyrd.queue.get_cluster_environment
fyrd.queue.QueueError
.....................

.. autofunction:: fyrd.queue.check_queue
.. autoexception:: fyrd.queue.QueueError


fyrd.job
@@ -96,12 +97,24 @@ fyrd.job.Job
Methods
~~~~~~~

.. automethod:: fyrd.job.Job.initialize

.. automethod:: fyrd.job.Job.gen_scripts

.. automethod:: fyrd.job.Job.write

.. automethod:: fyrd.job.Job.clean

.. automethod:: fyrd.job.Job.scrub

.. automethod:: fyrd.job.Job.submit

.. automethod:: fyrd.job.Job.resubmit

.. automethod:: fyrd.job.Job.get_keywords

.. automethod:: fyrd.job.Job.set_keywords

.. automethod:: fyrd.job.Job.wait

.. automethod:: fyrd.job.Job.get
@@ -137,8 +150,22 @@ including writing the files. `Function` is actually a child class of `Script`.
:members:
:show-inheritance:

fyrd.options
------------
fyrd.batch_systems
------------------

All batch systems are defined here.

fyrd.batch_systems functions
............................

.. autofunction:: fyrd.batch_systems.get_cluster_environment

.. autofunction:: fyrd.batch_systems.check_queue

.. autofunction:: fyrd.batch_systems.get_batch_system

fyrd.batch_systems.options
..........................

All `keyword arguments </keywords.html>`_ are defined in dictionaries in the
`options.py` file, alongside functions to manage those dictionaries. Of
@@ -171,17 +198,17 @@ an empty string is returned.
whole dictionary of arguments, it explicitly handles arguments that cannot be
managed using a simple string format.

.. autofunction:: fyrd.options.option_help
.. autofunction:: fyrd.batch_systems.options.option_help

.. autofunction:: fyrd.options.sanitize_arguments
.. autofunction:: fyrd.batch_systems.options.sanitize_arguments

.. autofunction:: fyrd.options.split_keywords
.. autofunction:: fyrd.batch_systems.options.split_keywords

.. autofunction:: fyrd.options.check_arguments
.. autofunction:: fyrd.batch_systems.options.check_arguments

.. autofunction:: fyrd.options.options_to_string
.. autofunction:: fyrd.batch_systems.options.options_to_string

.. autofunction:: fyrd.options.option_to_string
.. autofunction:: fyrd.batch_systems.options.option_to_string
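
As an illustration of the dictionary-driven approach, here is a minimal, hypothetical
sketch of turning a keyword dictionary into scheduler directive strings. The flag
templates and the function name are invented for this example, not fyrd's actual
option tables or signatures:

```python
# Hypothetical keyword -> directive templates (illustrative, not fyrd's tables).
OPTION_TEMPLATES = {
    'cores': '--cpus-per-task={}',
    'mem':   '--mem={}',
    'time':  '--time={}',
}


def options_to_string_sketch(kwargs, prefix='#SBATCH'):
    """Convert a keyword dict into one directive line per option."""
    lines = []
    for key in sorted(kwargs):
        template = OPTION_TEMPLATES.get(key)
        if template is None:
            # Unknown keywords are rejected rather than silently dropped.
            raise ValueError('unknown option: {0}'.format(key))
        lines.append('{0} {1}'.format(prefix, template.format(kwargs[key])))
    return '\n'.join(lines)


print(options_to_string_sketch({'cores': 4, 'mem': '8GB'}))
# #SBATCH --cpus-per-task=4
# #SBATCH --mem=8GB
```

Keeping the templates in a plain dictionary is what makes a batch system definition
easy to extend: adding a keyword means adding one dictionary entry, not new code.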

fyrd.conf
---------
@@ -309,48 +336,6 @@ from any directory.
.. autofunction:: fyrd.basic.clean_dir()


fyrd.local
----------

The local queue implementation is based on the multiprocessing library and is
not intended to be used directly, it should always be used via the Job class
because it is somewhat temperamental. The essential idea behind it is that we
can have one JobQueue class that is bound to the parent process, it exclusively
manages a single child thread that runs the `job_runner()` function. The two
process communicate using a `multiprocessing.Queue` object, and pass
`fyrd.local.Job` objects back and forth between them.

The Job objects (different from the Job objects in `job.py`) contain information
about the task to run, including the number of cores required. The job runner
manages a pool of `multiprocessing.Pool` tasks directly, and keeps the total
running cores below the total allowed (default is the system max, can be set
with the threads keyword). It backfills smaller jobs and holds on to larger jobs
until there is enough space free.

This is close to what torque and slurm do, but vastly more crude. It serves as a
stopgap to allow parallel software written for compute clusters to run on a
single machine in a similar fashion, without the need for a pipeline alteration.
The reason I have reimplemented a process pool is that I need dependency
tracking and I need to allow some processes to run on multiple cores (e.g. 6 of
the available 24 on the machine).

The `job_runner()` and `Job` objects should never be accessed except by the
JobQueue. Only one JobQueue should run at a time (not enforced), and by default
it is bound to `fyrd.local.JQUEUE`. That is the interface used by all
other parts of this package.

fyrd.local.JobQueue
...................

.. autoclass:: fyrd.local.JobQueue
:members:
:show-inheritance:

fyrd.local.job_runner
.....................

.. autofunction:: fyrd.local.job_runner

fyrd.run
--------

21 changes: 20 additions & 1 deletion docs/sphinx/basic_usage.rst
@@ -17,7 +17,7 @@ To run with dependency tracking, run:
import fyrd
job1 = fyrd.submit(<command1>)
job2 = fyrd.submit(<command2>, depends=job1)
out = job2.get() # Will block until job completes
out1, out2 = fyrd.get([job1, job2])  # Will block until both jobs complete
The `submit()` function is actually just a wrapper for the
`Job </api.html#fyrd-job-job>`_ class. The same behavior as above can be
@@ -36,6 +36,25 @@ can be called on job initialization. Also note that the object returned by
calling the `submit()` function (as in the first example) is also a `Job`
object, so these two examples can be used fully interchangeably.

Similar wrappers allow you to submit and monitor existing job files, such
as those made by other pipelines:

.. code:: python

    import os
    import fyrd

    jobs = []
    job_dir = os.path.abspath('./jobs/')
    for job in [os.path.join(job_dir, i) for i in os.listdir(job_dir) if i.endswith('sh')]:
        jobs.append(fyrd.submit_file(job))
    fyrd.wait(jobs)  # Will block until every job is completed

This type of thing can also be accomplished using the `console script </console.html>`_:

.. code:: shell

    fyrd run --wait ./jobs/*.sh

Functions
---------

43 changes: 43 additions & 0 deletions docs/sphinx/changelog.rst
@@ -0,0 +1,43 @@
Change Log
==========

Version 0.6.2a1
---------------

This version brings a major overhaul to the structure of the code, while leaving the
API *mostly* intact.

Major Changes
.............

- Batch system definitions are now fully modular and contained in the `fyrd.batch_systems`
  package. `options.py` has also been moved into this package, which allows any programmer
  to add a new batch system definition to fyrd by just editing the contents of that small
  subpackage.
- Updated the console script to allow running arbitrary shell scripts on the console with
  `fyrd run`, or submitting any number of existing job files using `fyrd sub`. Also added
  new alias scripts, `frun` and `fsub`, for these new modes. Both modes accept the
  `--wait` argument, meaning they will block until the jobs complete.
- Documentation overhauled to update the API docs and add instructions on creating a new
  batch system; these instructions are duplicated in the README within the `batch_systems`
  package folder.
- **Local support temporarily removed**. It didn't work very well and it broke the new
  batch system structure; I hope to add it back shortly.
- Full support for array job parsing for both torque and slurm. We now create one job
  entry for each array job child, instead of one per array job. To manage this, the
  `fyrd.queue.Queue.QueueJob` class was moved to `fyrd.queue.QueueJob` and split to add a
  child class, `fyrd.queue.QueueChild`. Every array job now has one `fyrd.queue.QueueJob`
  entry, plus one `fyrd.queue.QueueChild` entry for each of its children, which are stored
  in the `children` dictionary of the `fyrd.queue.QueueJob` class.
- Added a `get` method to the `fyrd.queue.Queue` class that lets a user fetch outputs from
  a list of jobs; it loops continuously through the jobs so that no job is lost.
- Added `tqdm <https://pypi.python.org/pypi/tqdm>`_ as a requirement and enabled progress
  bars in multi-job `wait` and `get`.

Minor Changes
.............

- Updated the documentation to include this changelog, which will only contain change information
for version 0.6.2a1 onwards.
- Added additional tests to cover the new changes as well as generally increase test suite
coverage.
- Several small bug fixes
2 changes: 1 addition & 1 deletion docs/sphinx/conf.py
@@ -9,7 +9,7 @@
copyright = '2016, Michael Dacre <mike.dacre@gmail.com>'
author = 'Michael Dacre <mike.dacre@gmail.com>'
version = '0.6'
release = '0.6.1b9'
release = '0.6.2a1'
language = 'en'

# Add any paths that contain templates here, relative to this directory.