feat: static variable analysis #770

Open
wants to merge 27 commits into
base: master
4f0c2c4
feat: static variable analysis
jg-rp Nov 16, 2024
e7b8559
Accept any iterable from `children`, `arguments`, etc.
jg-rp Nov 17, 2024
a69f33c
Test analysis of standard tags
jg-rp Nov 18, 2024
e5163ba
Merge branch 'harttle:master' into static-analysis-alternate
jg-rp Nov 19, 2024
5705bc6
Use `TagToken.tokenizer` instead of creating a new one
jg-rp Nov 19, 2024
2081083
Test analysis of nested tags
jg-rp Nov 19, 2024
502a80d
Group variables by their root value
jg-rp Nov 19, 2024
5a9d192
Test analysis of nested globals and locals
jg-rp Nov 19, 2024
2cb9a4f
Analyze included and rendered templates WIP
jg-rp Nov 20, 2024
bc6be99
Use existing tokenizer when constructing `Hash`
jg-rp Nov 21, 2024
7f63cef
Improve test coverage
jg-rp Nov 21, 2024
0d1393b
Analyze variables from `layout` and `block` tags
jg-rp Nov 21, 2024
a1972ab
Test analysis of Jekyll style includes
jg-rp Nov 21, 2024
730ab19
Handle variables that start with a nested variable
jg-rp Nov 21, 2024
c0a19e3
Async analysis
jg-rp Nov 22, 2024
1a79437
Test non-standard tag end to end
jg-rp Nov 23, 2024
d9f47d6
Implement convenience analysis methods on the `Liquid` class
jg-rp Nov 23, 2024
67fdbe5
More analysis convenience methods
jg-rp Nov 23, 2024
cde3b5d
Accept string or template array
jg-rp Nov 23, 2024
a3a93cc
Draft static analysis docs
jg-rp Nov 23, 2024
2bf55db
Deduplicate variables names
jg-rp Nov 23, 2024
3ff787d
Fix isolated scope global variable map
jg-rp Dec 5, 2024
5c76035
Coerce variables to strings instead of extending String
jg-rp Dec 5, 2024
9770ff3
Private map instead of extending Map
jg-rp Dec 5, 2024
ad2333e
Fix e2e test
jg-rp Dec 5, 2024
f73f0d1
Tentatively implement analysis of aliased variables
jg-rp Dec 6, 2024
e9b11f4
Fix nested variable segments array
jg-rp Dec 22, 2024
286 changes: 286 additions & 0 deletions docs/source/tutorials/static-analysis.md
@@ -0,0 +1,286 @@
---
Owner

@harttle Dec 23, 2024

I think you also need to change the sidebar and en.yml to make this file visible in the sidebar.

title: Static Template Analysis
---

{% since %}v10.20.0{% endsince %}

{% note info Sync and Async %}
There are synchronous and asynchronous versions of each of the methods demonstrated on this page. See the [Liquid API][liquid-api] for a complete reference.
{% endnote %}

## Variables

Retrieve the names of variables used in a template with `Liquid.variables(template)`. It returns an array of strings, one string for each distinct variable, without its properties.

```javascript
import { Liquid } from 'liquidjs'

const engine = new Liquid()

const template = engine.parse(`\
<p>
  {% assign title = user.title | capitalize %}
  {{ title }} {{ user.first_name | default: user.name }} {{ user.last_name }}
  {% if user.address %}
    {{ user.address.line1 }}
  {% else %}
    {{ user.email_addresses[0] }}
    {% for email in user.email_addresses %}
      - {{ email }}
    {% endfor %}
  {% endif %}
  {{ a[b.c].d }}
</p>
`)

console.log(engine.variablesSync(template))
```

**Output**

```javascript
[ 'user', 'title', 'email', 'a', 'b' ]
```

Notice that variables from tag and filter arguments are included, as well as nested variables like `b` in the example. Alternatively, use `Liquid.fullVariables(template)` to get a list of variables including their properties as strings.

```javascript
// continued from above
engine.fullVariables(template).then(console.log)
```

**Output**

```javascript
[
  'user.title',
  'user.first_name',
  'user.name',
  'user.last_name',
  'user.address',
  'user.address.line1',
  'user.email_addresses[0]',
  'user.email_addresses',
  'title',
  'email',
  'a[b.c].d',
  'b.c'
]
```

Or use `Liquid.variableSegments(template)` to get an array of strings and numbers that make up each variable's path.

Owner

For fullVariables and variableSegments, can we also include an example for nested variables? Otherwise we'll need to mention how nesting is handled in the return values of these two methods.

Owner

For the reference a[b.c].d, I think we have another option: return two references without nesting them. What's inside [] is not important because it's dynamic anyway:

a[].d
b.c

If you adopt this implementation, we'll need to decide how to represent [] in the variableSegments return value. Otherwise these two will be the same:

arr[0].length
arr.length

Owner

Maybe differentiate with nesting like:

['arr', ['length']]
['arr', 'length']

Contributor Author

Using your example, a[b.c].d, variableSegments was incorrectly producing something like this:

[
  [
    'a',
    Variable {
      segments: [ 'b', 'c' ],
      location: { row: 1, col: 6, file: undefined }
    },
    'd'
  ],
  [ 'b', 'c' ]
]

This was not my intention. Now we get the following.

variables

[ 'a', 'b' ]

fullVariables

[
  'a[b.c].d',
  'b.c'
]

variableSegments

[
  [ 'a', [ 'b', 'c' ], 'd' ],
  [ 'b', 'c' ]
]

For arr[0].length and arr.length, we get [ 'arr', 0, 'length' ] and [ 'arr', 'length' ], respectively.

Is that what you had in mind?

Owner

No, I mean maybe we can drop nesting and treat them as different references. I guess what exactly is inside '[]' is not important; we only know that an array or map is accessed. Assuming it is not feasible for static analysis to figure out its value at call time, people won't be able to use that information effectively. Here's my simplified approach:

variables

[ 'a', 'b' ]

fullVariables

[
  'a[].d',
  'b.c'
]

variableSegments

[
  [ 'a', [ 'd' ] ],
  [ 'b', 'c' ]
]

Note that the last one uses a nested array to represent entering into an array, not a nested index.

Contributor Author

Ah, I see. I think your approach should be in addition to variableSegments, as separate method/s. Then users have the option of working with nested variables and array indexes, if they're useful, or your more convenient representation if they're not.

Perhaps normalizedSegments would be a good name?

It seems like we're losing potentially valuable information if we drop nested variables altogether.

Owner

Just another idea to simplify the implementation. If you still think dynamic index info is important, we can keep your current implementation. No need to compromise for my opinion.


```javascript
// continued from above
engine.variableSegments(template).then(console.log)
```

**Output**

```javascript
[
  [ 'user', 'title' ],
  [ 'user', 'first_name' ],
  [ 'user', 'name' ],
  [ 'user', 'last_name' ],
  [ 'user', 'address' ],
  [ 'user', 'address', 'line1' ],
  [ 'user', 'email_addresses', 0 ],
  [ 'user', 'email_addresses' ],
  [ 'title' ],
  [ 'email' ],
  [ 'a', [ 'b', 'c' ], 'd' ],
  [ 'b', 'c' ]
]
```
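
As a rough illustration of how a segments array relates to the string form of a variable, here is a minimal sketch. `segmentsToPath` is a hypothetical helper, not part of LiquidJS, and it assumes every string segment is a simple property name that can be written with dot notation.

```javascript
// Hypothetical helper (not a LiquidJS API): turn a segments array into a
// path string. Assumes string segments are plain property names.
function segmentsToPath (segments) {
  let path = ''
  for (const segment of segments) {
    if (Array.isArray(segment)) {
      // A nested array is a nested variable used as a dynamic index.
      path += `[${segmentsToPath(segment)}]`
    } else if (typeof segment === 'number') {
      // A number is a literal array index.
      path += `[${segment}]`
    } else {
      // A string is a property name; the first one has no leading dot.
      path += path === '' ? segment : `.${segment}`
    }
  }
  return path
}

console.log(segmentsToPath(['user', 'email_addresses', 0]))
console.log(segmentsToPath(['a', ['b', 'c'], 'd']))
```

With the segments from the output above, these two calls should produce `user.email_addresses[0]` and `a[b.c].d`.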

### Global Variables

Notice that `title` and `email` are included in the results of the examples above. Often you'll want to exclude names brought into scope by `{% assign %}` tags, as well as temporary variables like those introduced by a `{% for %}` tag.

To get names that are expected to be _global_, that is, provided by application developers rather than template authors, use the `globalVariables`, `globalFullVariables` or `globalVariableSegments` methods (or their synchronous equivalents) of a `Liquid` class instance.

```javascript
// continued from above
engine.globalVariableSegments(template).then(console.log)
```

**Output**

```javascript
[
  [ 'user', 'title' ],
  [ 'user', 'first_name' ],
  [ 'user', 'name' ],
  [ 'user', 'last_name' ],
  [ 'user', 'address' ],
  [ 'user', 'address', 'line1' ],
  [ 'user', 'email_addresses', 0 ],
  [ 'user', 'email_addresses' ],
  [ 'a', [ 'b', 'c' ], 'd' ],
  [ 'b', 'c' ]
]
```
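
One possible use for global segments is checking render data before rendering. The sketch below is only an illustration under that assumption: `findMissingGlobals` is a hypothetical helper, not a LiquidJS API, and it gives up on a path as soon as it reaches a nested (dynamic) segment, since that can't be resolved without rendering.

```javascript
// Hypothetical helper (not a LiquidJS API): report which global paths
// can't be found in `data`. `segmentsList` has the shape shown above.
function findMissingGlobals (segmentsList, data) {
  const missing = []
  for (const segments of segmentsList) {
    let current = data
    for (const segment of segments) {
      // Nested arrays are dynamic lookups; stop checking this path.
      if (Array.isArray(segment)) break
      if (current === null || current === undefined || !(segment in Object(current))) {
        missing.push(segments)
        break
      }
      current = current[segment]
    }
  }
  return missing
}

const data = {
  user: { title: 'dr', first_name: 'Ada', name: 'ada', last_name: 'L', email_addresses: ['ada@example.com'] },
  a: {},
  b: { c: 'key' }
}

const globalSegments = [
  ['user', 'title'],
  ['user', 'address'],
  ['user', 'address', 'line1'],
  ['user', 'email_addresses', 0],
  ['a', ['b', 'c'], 'd'],
  ['b', 'c']
]

console.log(findMissingGlobals(globalSegments, data))
```

Here only the two `user.address` paths are reported, because `data.user` has no `address` property. Note that a missing value is not always an error: the template above guards `user.address` with an `{% if %}` tag.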

### Partial Templates

By default, LiquidJS will try to load and analyze any included and rendered templates too.

```javascript
import { Liquid } from 'liquidjs'

const footer = `\
<footer>
  <p>&copy; {{ "now" | date: "%Y" }} {{ site_name }}</p>
  <p>{{ site_description }}</p>
</footer>`

const engine = new Liquid({ templates: { footer } })

const template = engine.parse(`\
<body>
  <h1>Hi, {{ you | default: 'World' }}!</h1>
  {% assign some = 'thing' %}
  {% include 'footer' %}
</body>
`)

engine.globalVariables(template).then(console.log)
```

**Output**

```javascript
[ 'you', 'site_name', 'site_description' ]
```

You can disable analysis of partial templates by setting the `partials` option to `false`.

```javascript
// continued from above
engine.globalVariables(template, { partials: false }).then(console.log)
```

**Output**

```javascript
[ 'you' ]
```

If an `{% include %}` tag uses a dynamic template name (one that can't be determined without rendering the template) it will be ignored, even if `partials` is set to `true`.

### Advanced Usage

The examples so far all use convenience methods of the `Liquid` class, intended to cover the most common use cases. Instead, you can work with [analysis results][static-analysis-interface] directly, which expose the row, column and file name for every occurrence of each variable.

This is an example of an object returned from `Liquid.analyze()`, passing it the template from the [Partial Templates](#partial-templates) section above.

```javascript
{
  variables: {
    you: [
      [String (Variable): 'you'] {
        segments: [ 'you' ],
        location: { row: 2, col: 14, file: undefined }
      }
    ],
    site_name: [
      [String (Variable): 'site_name'] {
        segments: [ 'site_name' ],
        location: { row: 2, col: 41, file: 'footer' }
      }
    ],
    site_description: [
      [String (Variable): 'site_description'] {
        segments: [ 'site_description' ],
        location: { row: 3, col: 9, file: 'footer' }
      }
    ]
  },
  globals: {
    you: [
      [String (Variable): 'you'] {
        segments: [ 'you' ],
        location: { row: 2, col: 14, file: undefined }
      }
    ],
    site_name: [
      [String (Variable): 'site_name'] {
        segments: [ 'site_name' ],
        location: { row: 2, col: 41, file: 'footer' }
      }
    ],
    site_description: [
      [String (Variable): 'site_description'] {
        segments: [ 'site_description' ],
        location: { row: 3, col: 9, file: 'footer' }
      }
    ]
  },
  locals: {
    some: [
      [String (Variable): 'some'] {
        segments: [ 'some' ],
        location: { row: 3, col: 13, file: undefined }
      }
    ]
  }
}
```
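
To show how the location data might be consumed, here is a minimal sketch that lists every occurrence of each variable. `listOccurrences` is a hypothetical helper, and the sample input uses plain objects that only mimic the `segments` and `location` properties shown above; the `<template>` placeholder for an undefined file name is also an assumption made for this example.

```javascript
// Hypothetical helper (not a LiquidJS API): format each occurrence of each
// variable as "name file:row:col".
function listOccurrences (variableMap) {
  const lines = []
  for (const [name, occurrences] of Object.entries(variableMap)) {
    for (const { location } of occurrences) {
      // Templates parsed from a string have no file name.
      const file = location.file ?? '<template>'
      lines.push(`${name} ${file}:${location.row}:${location.col}`)
    }
  }
  return lines
}

// Plain-object stand-in for the `globals` property of an analysis result.
const globals = {
  you: [{ segments: ['you'], location: { row: 2, col: 14, file: undefined } }],
  site_name: [{ segments: ['site_name'], location: { row: 2, col: 41, file: 'footer' } }]
}

console.log(listOccurrences(globals).join('\n'))
```

For the sample input this yields one line per occurrence: `you <template>:2:14` and `site_name footer:2:41`.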

### Analyzing Custom Tags

For static analysis to include results from custom tags, those tags must implement some additional methods defined on the [Template interface](/api/interfaces/Template.html). LiquidJS will use the information returned from these methods to traverse the template and report variable usage.

Which methods are required depends on the kind of tag. If it's a block with a start tag, end tag and any amount of Liquid markup in between, it will need to implement the [`children()`](/api/interfaces/Template.html#children) method. `children()` is defined as a generator, so that we can use it in synchronous and asynchronous contexts, just like `render()`. It should return HTML content, output statements and tags that are child nodes of the current tag.

The [`blockScope()`](/api/interfaces/Template.html#blockScope) method is responsible for telling LiquidJS which names will be in scope for the duration of the tag's block. Some of these names could depend on the tag's arguments, and some will be fixed, like `forloop` from the `{% for %}` tag.

Whether a tag is an inline tag or a block tag, if it accepts arguments it should implement [`arguments()`](/api/interfaces/Template.html#arguments), which is responsible for returning the tag's arguments as a sequence of [`Value`](/api/classes/Value.html) instances or tokens of type [`ValueToken`](/api/types/ValueToken.html).

This example demonstrates these methods for a block tag. See LiquidJS's [built-in tags][built-in] for more examples.

```javascript
import { Liquid, Tag, Hash } from 'liquidjs'

class ExampleTag extends Tag {
  args
  templates

  constructor (token, remainTokens, liquid, parser) {
    super(token, remainTokens, liquid)
    this.args = new Hash(token.tokenizer)
    this.templates = []

    const stream = parser.parseStream(remainTokens)
      .on('tag:endexample', () => { stream.stop() })
      .on('template', (tpl) => this.templates.push(tpl))
      .on('end', () => { throw new Error(`tag ${token.getText()} not closed`) })

    stream.start()
  }

  * render (ctx, emitter) {
    const scope = (yield this.args.render(ctx))
    ctx.push(scope)
    yield this.liquid.renderer.renderTemplates(this.templates, ctx, emitter)
    ctx.pop()
  }

  * children () {
    return this.templates
  }

  * arguments () {
    yield * Object.values(this.args.hash).filter((el) => el !== undefined)
  }

  blockScope () {
    return Object.keys(this.args.hash)
  }
}
```
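
To give a feel for how the three methods fit together, the following toy model walks a tree of plain objects and separates global names from names that are in block scope. It is a simplified sketch, not LiquidJS's actual traversal: `collectGlobals` and the node shape (`arguments`, `blockScope` and `children` as plain arrays) are assumptions made for this illustration, and tags like `{% assign %}` that add names to the template scope are not modelled.

```javascript
// Toy model only; not LiquidJS's implementation. Each node is a plain object
// with optional `arguments` (names it reads), `blockScope` (names it brings
// into scope for its block) and `children` (nested nodes).
function collectGlobals (nodes, scope = new Set(), globals = new Set()) {
  for (const node of nodes) {
    // Arguments are looked up in the enclosing scope, before the block starts.
    for (const name of node.arguments ?? []) {
      if (!scope.has(name)) globals.add(name)
    }
    // Names from blockScope are visible to child nodes only.
    const inner = new Set([...scope, ...(node.blockScope ?? [])])
    collectGlobals(node.children ?? [], inner, globals)
  }
  return globals
}

const tree = [
  { arguments: ['you'] },
  {
    // Roughly: {% for email in user.email_addresses %} ... {% endfor %}
    arguments: ['user'],
    blockScope: ['email', 'forloop'],
    children: [{ arguments: ['email'] }, { arguments: ['site_name'] }]
  }
]

console.log([...collectGlobals(tree)])
```

In this model `email` is not reported as a global because the enclosing block declares it, while `you`, `user` and `site_name` are.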

[liquid-api]: /api/classes/Liquid.html
[static-analysis-interface]: /api/interfaces/StaticAnalysis.html
[built-in]: https://github.com/harttle/liquidjs/tree/master/src/tags
4 changes: 2 additions & 2 deletions src/index.ts
@@ -6,9 +6,9 @@ export { Drop } from './drop'
 export { Emitter } from './emitters'
 export { defaultOperators, Operators, evalToken, evalQuotedToken, Expression, isFalsy, isTruthy } from './render'
 export { Context, Scope } from './context'
-export { Value, Hash, Template, FilterImplOptions, Tag, Filter, Output } from './template'
+export { Value, Hash, Template, FilterImplOptions, Tag, Filter, Output, Variable, VariableLocation, VariableSegments, Variables, StaticAnalysis, StaticAnalysisOptions, analyze, analyzeSync, Arguments, PartialScope } from './template'
 export { Token, TopLevelToken, TagToken, ValueToken } from './tokens'
-export { TokenKind, Tokenizer, ParseStream } from './parser'
+export { TokenKind, Tokenizer, ParseStream, Parser } from './parser'
 export { filters } from './filters'
 export * from './tags'
 export { defaultOptions, LiquidOptions } from './liquid-options'