Documentation overhaul roadmap #2105

Open
@darsnack

Description

This issue is a WIP. I am starting from the template discussed on the call; we still need to break this into smaller steps.

Flux documentation restructure

It should surprise no one that Flux's current documentation is not very good. It might surprise people that the "whole suite" of docs does have reasonably good coverage—that speaks to the poor organization of the documentation. Despite being written down, most information is an ordeal to find.

With this in mind, we consolidated a list of steps to improve this situation. These steps have been discussed by maintainers and contributors on various Julia platforms, but this issue will now serve as the central tracking point for the overhaul.

  • Restructure the Flux.jl documentation
    The current doc model was designed at a time when Flux's ethos was to demonstrate how simple neural networks in Julia can be. The current package offers a lot more than a simple list of layers and tracker-based AD. The organization proposed here is a commonly used format (notably by fast.ai in Python and by our own SciML). See below for a description of this format.
  • Deprecate website tutorials and model zoo in favor of documentation
  • Detailed documentation for NNlib.jl, FluxTraining.jl, etc.

New documentation structure

Home / Getting Started: Briefly describe the package and quick installation instructions. Provide links to (a) the entry point for an absolute beginner, (b) the ecosystem page, (c) contributing and bug reporting resources.

Ecosystem: This page is currently way down at the bottom, and it needs to be prominently featured. It should immediately make clear what Flux provides (layers + simple utilities) and what it does not provide or borrows from other packages (datasets, data loading, optimizers, schedules, etc.). A diagram like MLJ's would be useful.

Tutorials: These should be complete, runnable examples that users can use to get started quickly and learn by doing. The model zoo examples should be tutorials. Literate.jl offers the ability to generate markdown for consumption by Documenter.jl. Tutorials should leverage this feature.

  • Beginner: These should be basic tutorials like "taking your first gradient" or "running MNIST." They should rely heavily on the ecosystem packages that a typical beginner would use. For example, use MLDatasets.jl to fetch MNIST instead of rolling your own dataset downloader + type. Writing a data pipeline from scratch provides no value to someone who may not even know what a CNN does. Some existing parts of the Flux docs that fit this area are:
    • "Overview" which should be renamed to "First steps" or "What is machine learning?"
    • "Basics" which should be renamed to "Getting started with neural networks" or "What is a layer?"
    • "Training" which probably should be re-written to be more like a tutorial on understanding the basics of a training loop (i.e. remove the docstrings, etc.)
  • Intermediate: This is where most model zoo examples should end up. Existing pages that belong here are:
  • Advanced: These should be where complex examples with unconventional training schemes end up.
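
To give a sense of the scale a beginner tutorial such as "taking your first gradient" might aim for, here is a minimal sketch. It assumes Flux is installed and uses `gradient`, which Flux re-exports from Zygote.jl; the function `f` is a made-up example and is not taken from any existing page.

```julia
using Flux  # assumes Flux.jl is installed; `gradient` comes from Zygote.jl

# A hypothetical scalar function, chosen only for illustration.
f(x) = 3x^2 + 2x + 1

# `gradient` returns a tuple with one entry per argument of `f`,
# so we take the first element to get df/dx.
df(x) = gradient(f, x)[1]

df(2.0)  # the derivative 6x + 2 evaluated at x = 2
```

A tutorial at this level would stop here and explain the result before introducing layers or data.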

How-tos: Unlike tutorials, which are focused on end-to-end examples, how-tos are focused on small bite-sized examples. They should demonstrate doing a single thing only. Existing pages here could be:

Reference documentation: These pages explain how certain concepts are designed in Flux from first principles. They are pedagogical like tutorials, but they are not focused on end-to-end examples. Instead of bringing together multiple concepts, reference docs explain a single concept in the context of FluxML.

API documentation: Flux currently weaves docstrings in and out of tutorials, which makes API references hard to find. We should have a single category that is nothing but docstrings organized by package and type.

Developer documentation: Flux currently lacks explanations for how things work under the hood. This is a place to explain how all the packages (Functors.jl, NNlib.jl, Zygote.jl, etc.) come together in a gradient call.
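
A developer page could start from one small gradient call and then unpack which package handles each step. The sketch below is only an illustration under stated assumptions: a recent Flux version in which `Dense(2 => 1)` stores its parameters in fields named `weight` and `bias`, Functors.jl available in the environment, and a placeholder input `x` and loss.

```julia
using Flux                # layer definitions; re-exports `gradient` from Zygote.jl
using Functors: fmap      # assumes Functors.jl is available in the environment

model = Dense(2 => 1)     # the layer type is defined in Flux
x = rand(Float32, 2, 3)   # placeholder batch: 2 features, 3 samples

# Zygote differentiates the closure; the gradient mirrors the model's structure.
grads = gradient(m -> sum(m(x)), model)[1]

size(grads.weight)  # same shape as model.weight
size(grads.bias)    # same shape as model.bias

# Functors.jl is what lets generic code walk the model's fields;
# here every array in the layer is converted to Float64.
model64 = fmap(a -> a isa AbstractArray ? Float64.(a) : a, model)
```

The developer docs would then explain each piece in prose, for example why the gradient is a nested structure rather than a flat vector.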
