Bitesize WDL parsing doc [BA-6184] (broadinstitute#5388)
cjllanwarne authored Jan 28, 2020
1 parent 176d54d commit afe67ef
Showing 4 changed files with 284 additions and 2 deletions.
@@ -1 +1 @@
Coming soon.
Coming soon.
38 changes: 37 additions & 1 deletion docs/developers/bitesize/workflowParsing/wdlParsingOverview.md
@@ -1 +1,37 @@
Coming soon.
### WDL Source to WOM Conversion

#### Parsing Flowchart

For the various versions of WDL supported in Cromwell, the conversion to WOM follows
these paths:

![Parsing Flowchart](wdlmap.svg)

#### Process Description

You can think of WDL parsing in Cromwell in terms of the following major steps:

1. Lexing: Converting the raw WDL string into a one-dimensional stream of "tokens".
2. Parsing: Converting the stream of tokens into an abstract syntax tree (AST).
3. Transliteration: Transforming the language-specific AST into a standard set of Scala objects.
4. Import Resolution: Recursively processing any import statements into WOM bundles.
5. Linking: Discovering, resolving and recording all references within the AST and imports.
6. WOM Building: Creating a set of WOM objects.
7. Input Validation: Linking any provided inputs to inputs on the WOM objects.
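
As a rough illustration of how these stages chain together, here is a minimal Scala sketch. It is not Cromwell's code: it handles a made-up one-line language (`workflow <name> <call>...`) rather than WDL, every name in it (`lex`, `parse`, `transliterate`, `link`, `WorkflowElement`) is hypothetical, and a fixed set of known task names stands in for the results of import resolution. Only steps 1, 2, 3 and 5 are modelled; each stage returns an `Either` so that the first failure stops the pipeline.

```scala
object ParsingStagesSketch {
  // All names below are hypothetical and for illustration only; none are Cromwell APIs.
  type Checked[A] = Either[String, A]

  // Step 1 (lexing): raw string -> flat list of tokens.
  def lex(source: String): Checked[List[String]] = {
    val tokens = source.split("\\s+").filter(_.nonEmpty).toList
    if (tokens.isEmpty) Left("empty source") else Right(tokens)
  }

  // Step 2 (parsing): tokens -> a tiny, language-specific AST.
  sealed trait Ast
  final case class Node(name: String, children: List[Ast]) extends Ast
  final case class Leaf(text: String) extends Ast

  def parse(tokens: List[String]): Checked[Ast] = tokens match {
    case "workflow" :: name :: calls =>
      Right(Node("workflow", Leaf(name) :: calls.map(c => Leaf(c))))
    case other =>
      val text = other.mkString(" ")
      Left("expected 'workflow <name> <call>...' but got: " + text)
  }

  // Step 3 (transliteration): AST -> plain Scala case classes.
  final case class WorkflowElement(name: String, calls: List[String])

  def transliterate(ast: Ast): Checked[WorkflowElement] = ast match {
    case Node("workflow", Leaf(name) :: rest) =>
      Right(WorkflowElement(name, rest.collect { case Leaf(c) => c }))
    case _ => Left("unsupported AST shape")
  }

  // Step 5 (linking): every call must resolve to a known task.
  // `knownTasks` stands in for whatever import resolution would provide.
  def link(wf: WorkflowElement, knownTasks: Set[String]): Checked[WorkflowElement] = {
    val missing = wf.calls.filterNot(knownTasks.contains)
    if (missing.isEmpty) Right(wf)
    else Left("unresolved calls: " + missing.mkString(", "))
  }

  // The stages compose left to right; the first Left short-circuits the rest.
  def process(source: String, knownTasks: Set[String]): Checked[WorkflowElement] =
    for {
      tokens  <- lex(source)
      ast     <- parse(tokens)
      element <- transliterate(ast)
      linked  <- link(element, knownTasks)
    } yield linked

  def main(args: Array[String]): Unit = {
    println(process("workflow hello greet", Set("greet")))
    println(process("workflow hello greet", Set.empty))
  }
}
```

Keeping each stage as a separate function with its own input and output type mirrors the list above: a failure in lexing, parsing or linking can be reported with stage-specific context without the later stages ever running.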


#### Intermediate Data Formats

* **WDL Object Model (WDLOM)**:
    * A Scala case class representation of WDL grammar ASTs.
* **Linked inputs**:
    * The original WDL source's WDLOM representation.
    * The WOM bundles produced from its imports.
    * Links from any references to their definitions.
        * This includes custom type references, variable references, task calls and subworkflow calls.
* **WOM Bundle**:
    * A set of tasks, workflows and custom types, and the fully qualified names by which they can be referenced.
    * Expressed in Cromwell's WOM format (the WOM format is the ultimate destination for _all_ languages, including WDL and CWL).
* **Validated WOM Namespace**:
    * The conjunction of a WOM bundle with an input set.
    * The entry point workflow (or sometimes task, in CWL) is known.
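
To make the relationship between these formats more concrete, the following Scala sketch models them as toy case classes and shows import resolution recursing into imported files before linking. It is an illustration under loud assumptions, not Cromwell's implementation: `FileElement`, `Bundle`, `LinkedInputs`, `ValidatedNamespace`, `toBundle`, `link` and `validate` are all hypothetical names, files are looked up in an in-memory map, and import cycles are not detected.

```scala
object IntermediateFormatsSketch {
  // Hypothetical stand-ins for the formats described above; not Cromwell's real classes.

  // WDLOM: a case-class view of one parsed file.
  final case class FileElement(imports: List[String], tasks: List[String], calls: List[String])

  // WOM bundle: the callables that a file makes available.
  final case class Bundle(callables: Set[String])

  // Linked inputs: the file's WDLOM, its imported bundles, and reference -> definition links.
  final case class LinkedInputs(file: FileElement, imported: List[Bundle], links: Map[String, String])

  // Validated WOM namespace: a bundle plus an input set, with a known entry point.
  final case class ValidatedNamespace(bundle: Bundle, entryPoint: String, inputs: Map[String, String])

  // Builds the bundle for `name`, first recursing into each of its imports.
  // Assumption: `files` is an in-memory lookup; there is no cycle detection.
  def toBundle(name: String, files: Map[String, FileElement]): Either[String, Bundle] =
    files.get(name) match {
      case None => Left("no such file: " + name)
      case Some(file) =>
        val importedOrError =
          file.imports.foldLeft[Either[String, List[Bundle]]](Right(Nil)) { (acc, imp) =>
            for { bundles <- acc; bundle <- toBundle(imp, files) } yield bundles :+ bundle
          }
        for {
          imported <- importedOrError
          linked   <- link(file, imported)
        } yield Bundle(linked.file.tasks.toSet)
    }

  // Resolves each call to either a local task or a callable from an imported bundle.
  def link(file: FileElement, imported: List[Bundle]): Either[String, LinkedInputs] = {
    val available: Map[String, String] =
      (imported.flatMap(_.callables).map(c => c -> "imported") ++
        file.tasks.map(t => t -> "local")).toMap
    val missing = file.calls.filterNot(available.contains)
    if (missing.nonEmpty) Left("unresolved calls: " + missing.mkString(", "))
    else Right(LinkedInputs(file, imported, file.calls.map(c => c -> available(c)).toMap))
  }

  // Pairs a bundle with inputs once the entry point is known to exist in it.
  def validate(bundle: Bundle, entryPoint: String, inputs: Map[String, String]): Either[String, ValidatedNamespace] =
    if (bundle.callables.contains(entryPoint)) Right(ValidatedNamespace(bundle, entryPoint, inputs))
    else Left("unknown entry point: " + entryPoint)

  def main(args: Array[String]): Unit = {
    val files = Map(
      "lib.wdl"  -> FileElement(Nil, List("greet"), Nil),
      "main.wdl" -> FileElement(List("lib.wdl"), List("shout"), List("greet", "shout"))
    )
    println(toBundle("main.wdl", files))
  }
}
```

In this sketch a file's bundle exposes only its own tasks, which is a simplification chosen to keep the example short; how imported callables are named and exposed in real WOM bundles is not modelled here.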
62 changes: 62 additions & 0 deletions docs/developers/bitesize/workflowParsing/wdlmap.dot
@@ -0,0 +1,62 @@
digraph "WDL_MAP"
{
compound=true;

# Draft 2 standard path
wdldraft2 -> wdldraft2ast [label="Hermes Parser (draft 2)"]
wdldraft2ast -> wdldraft2model [label="WdlNamespace constructor"]
wdldraft2model -> validatedwomnamespace [label="inputs.json,\noptions.json"]

# 1.0 and Biscayne standard path
wdl1_0 -> wdl1_0ast [label="Hermes Parser (1.0)"]
wdl1_0ast -> wdlom [label=" draft3.ast2wdlom"]
wdlbiscayne -> wdlbiscayneast [label="Hermes Parser (Biscayne)"]
wdlbiscayneast -> wdlom [label=" biscayne.ast2wdlom"]
wdlom -> wdlomandimports
#wdlom -> importedbundles [style=invis] # [label=" recurse to import"]
importedbundles -> wdlomandimports
wdlomandimports -> linkedgraph [label=" draft3.linking"]
#linkedgraph -> fileandlinksandimports
linkedgraph -> wombundle [label=" draft3.wdlom2wom"]
#importedbundles -> fileandlinksandimports
#wdlom -> fileandlinksandimports
#fileandlinksandimports -> wombundle [label=" draft3.wdlom2wom"]
#wombundle -> importablebundle
wombundle -> importedbundles [ style=dotted]
wombundle -> validatedwomnamespace [label="inputs.json,\noptions.json"]

# Upgrade script
# wdldraft2model -> wdlom [label="Draft 2 to 1.0 converter"]
# wdlom -> wdl1_0 [label="1.0 WDL generator"]

# Draft 2 model
wdldraft2 [shape=invtriangle label="WDL Draft 2 file" ];
wdldraft2ast [shape=oval label="WDL Draft 2 AST"];
wdldraft2model [shape=oval label="WDL Draft 2 Namespace"];

# 1.0 model
wdl1_0 [shape=invtriangle label="WDL 1.0 file"];
wdl1_0ast [shape=oval label="WDL 1.0 AST"];
wdlbiscayne [shape=invtriangle label="WDL Biscayne file"];
wdlbiscayneast [shape=oval label="WDL Biscayne AST"];
wdlom [shape=oval label="WDL Object Model"];
wdlomandimports [shape=circle fixedsize=true width=0.3 label="+"] # [shape=oval label="File WDLOM + 'import' WOM Bundles"];
linkedgraph [shape=oval label="Linked inputs\n(with resolved types, value lookups, graph edges)"];
#fileandlinksandimports [shape=oval label="File WDLOM + linking information + 'import' WOM Bundles"];
wombundle [shape=oval label="WOM Bundle"];
importedbundles [shape=oval label="WOM Bundles from imports\n(generated recursively)"];
#importablebundle [shape=oval label="WOM Bundle available for import"];

validatedwomnamespace [shape=oval label="Validated WOM Namespace" peripheries=2];

{rank = same; wdldraft2; wdl1_0; wdlbiscayne}
{rank = same; wdldraft2ast; wdl1_0ast; wdlbiscayneast}
{rank = same; wdldraft2model; wdlom; importedbundles}


#importablebundle -> importedbundles [ style=dotted]

# This "rank + invisible edge" combo is just to force importablebundle to be on the RHS of validatedwomnamespace
#{rank = same; validatedwomnamespace; importablebundle}
#importablebundle -> validatedwomnamespace [style=invis]
}
184 changes: 184 additions & 0 deletions docs/developers/bitesize/workflowParsing/wdlmap.svg