Commit

Merge branch 'develop'
* develop:
  refactor(examples): update markdown demo
  feat(examples): update markdown parser demo/editor
  refactor(hiccup-markdown): update serializer
  feat(hiccup-markdown): update parser types/fns, add docs
  docs(hiccup-markdown): update readme
  feat(examples): update markdown demo w/ new parser
  feat(hiccup-markdown): add metadata support for <hr> tags
  fix(hiccup-markdown): update emoji parse grammar
  refactor(emoji): update emoji index, comments, pkg meta
  feat(hdom): update setAttrib()
  feat(hiccup-markdown): update parser & tag transforms
  feat(hiccup-markdown): update/extend parser (blockquotes, formats)
  feat(strings): add more HTML entities
  feat(hiccup-markdown): update img & link parsers
postspectacular committed Feb 27, 2023
2 parents e84e1c1 + 757a2e3 commit d306ddf
Showing 23 changed files with 1,519 additions and 1,037 deletions.
Binary file modified assets/examples/markdown-parser.jpg
291 changes: 135 additions & 156 deletions examples/markdown/README.md
@@ -1,170 +1,149 @@
# Minimal Markdown parser
# Markdown parser demo

![screenshot](https://raw.githubusercontent.com/thi-ng/umbrella/develop/assets/examples/markdown-parser.jpg)

This project is part of the
[@thi.ng/umbrella](https://github.com/thi-ng/umbrella/) monorepo.

## About

This example is a test environment for the new & minimal
[Markdown](https://en.wikipedia.org/wiki/Markdown) parser & converter to
[hiccup](https://github.com/thi-ng/umbrella/tree/develop/packages/hiccup)
format from the
[@thi.ng/hiccup-markdown](https://github.com/thi-ng/umbrella/tree/develop/packages/hiccup-markdown)
package.

The rest of this file is an excerpt of the relevant parts of that
package's `README.md`...

### Features

The parser itself is not aimed at supporting **all** of Markdown's
quirky syntax features, but will restrict itself to a sane subset of
features and already sports:

| Feature | Comments |
|-------------|-----------------------------------------------------------------------------------------------------|
| Heading | ATX only (`#` line prefix), levels 1-6, then downgrade to paragraph |
| Paragraph | no support for `\` line breaks |
| Blockquote | Respects newlines |
| Format | **bold**, _emphasis_, `code`, ~~strikethrough~~ in paragraphs, headings, lists, blockquotes, tables |
| Link | no support for inline formats in label |
| Image | no image links |
| List | only unordered (`- ` line prefix), no nesting, supports line breaks |
| Table | no support for column alignment |
| Code block | GFM only (triple backtick prefix), w/ optional language hint |
| Horiz. Rule | only dash supported (e.g. `---`), min 3 chars required |

**Note: Because of MD's line break handling and the fact the parser only
consumes single characters from an iterable without knowledge of further
values, the last heading, paragraph, blockquote, list or table requires
an additional newline.**

### Limitations

These MD features (and probably many more) are **not** supported:

- inline HTML
- nested inline formats (e.g. **bold** inside _italic_)
- inline formats within link labels
- image links
- footnotes
- link references
- nested / ordered / numbered / todo lists

Some of these are considered, though currently not high priority...

> "Weeks of coding can **save hours** of planning."
> -- Anonymous
### Other features

- **Functional:** parser entirely built using
[transducers](https://github.com/thi-ng/umbrella/tree/develop/packages/transducers)
(specifically those defined in
[@thi.ng/fsm](https://github.com/thi-ng/umbrella/tree/develop/packages/fsm))
& function composition. Use the parser in a transducer pipeline to
easily apply post-processing of the emitted results
- **Declarative:** parsing rules defined declaratively with only minimal
state/context handling needed
- **No regex:** consumes input character-wise and produces an iterator
of hiccup-style tree nodes, ready to be used with
[@thi.ng/hdom](https://github.com/thi-ng/umbrella/tree/develop/packages/hdom),
[@thi.ng/hiccup](https://github.com/thi-ng/umbrella/tree/develop/packages/hiccup)
or the serializer of this package for back conversion to MD
- **Customizable:** supports custom tag factory functions to override
default behavior / representation of each parsed result element
- **Fast (enough):** parses this markdown file (5.9KB) in ~5ms on MBP2016 / Chrome 71
- **Small:** minified + gzipped ~2.5KB (parser sub-module incl. deps)

See [example source
code](https://github.com/thi-ng/umbrella/tree/develop/examples/markdown/src/)
for reference...

## Parsing & serializing to HTML

```ts
import { iterator } from "@thi.ng/transducers";
import { serialize } from "@thi.ng/hiccup";

import { parse } from "@thi.ng/hiccup-markdown";

const src = `
# Hello world
![screenshot](https://raw.githubusercontent.com/thi-ng/umbrella/develop/assets/examples/Hello world.png)
[This](http://example.com) is a _test_.
`;

// convert to hiccup tree
[...iterator(parse(), src)]
// [ [ 'h1', ' Hello world ' ],
// [ 'p',
// [ 'a', { href: 'http://example.com' }, 'This' ],
// ' is a ',
// [ 'em', 'test' ],
// '. ' ] ]

// or serialize to HTML
serialize(iterator(parse(), src));

// <h1>Hello world</h1><p>
// <a href="http://example.com">This</a> is a <em>test</em>. </p>
```
This example showcases some features of the Markdown parser of the
[@thi.ng/hiccup-markdown][pkghome] package. Depending on the device, the right
or bottom half shows a realtime preview of the source document in the other
half.

![screenshot of markdown editor w/ preview](https://raw.githubusercontent.com/thi-ng/umbrella/develop/assets/examples/markdown-parser.jpg "screenshot")

[Live demo](https://demo.thi.ng/umbrella/markdown/)

## Syntax features & extensions

### Blockquotes

Nested blockquotes are supported and can contain links, images and inline
formatting, but not other block elements (e.g. lists):

> Nesting is supported:
>> "To understand recursion, one must first understand recursion."
>> — Stephen Hawking
>
> Images in blockquotes are ok too:\
> ![foo](https://raw.githubusercontent.com/thi-ng/umbrella/develop/assets/grid-iterators/zcurve2d-small.gif)
>
> etc.
### Code block headers

## Customizing tags

The following interface defines factory functions for all supported
elements. User implementations / overrides can be given to the
`parseMD()` transducer to customize output.

```ts
interface TagFactories {
blockquote(...children: any[]): any[];
code(body: string): any[];
codeblock(lang: string, body: string): any[];
em(body: string): any[];
heading(level, children: any[]): any[];
hr(): any[];
img(src: string, alt: string): any[];
li(children: any[]): any[];
link(href: string, body: string): any[];
list(type: string, items: any[]): any[];
paragraph(children: any[]): any[];
strike(body: string): any[];
strong(body: string): any[];
table(rows: any[]): any[];
td(i: number, children: any[]): any[];
tr(i: number, cells: any[]): any[];
}
Code blocks can have additional space-separated fields in the header, which are
passed to the tag handler (but which we ignore in this example):

```ts tangle:yes export:no
// clever code here
```
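
As a rough illustration of how a handler could make use of such header fields, here is a minimal sketch. The `parseCodeBlockHeader()` helper and the `key:value` reading of each field are assumptions made for this example only; they are not part of the package API.

```ts
// Hypothetical helper (NOT part of @thi.ng/hiccup-markdown):
// splits a code block header into a language hint and extra fields.
interface CodeBlockHeader {
    lang: string;
    fields: Record<string, string>;
}

const parseCodeBlockHeader = (header: string): CodeBlockHeader => {
    // first word is the language hint, the rest are space-separated fields
    const [lang = "", ...rest] = header.trim().split(/\s+/);
    const fields: Record<string, string> = {};
    for (const field of rest) {
        // assumption: fields use a `key:value` form; bare words get an empty value
        const idx = field.indexOf(":");
        if (idx < 0) {
            fields[field] = "";
        } else {
            fields[field.substring(0, idx)] = field.substring(idx + 1);
        }
    }
    return { lang, fields };
};

const header = parseCodeBlockHeader("ts tangle:yes export:no");
```

With the header used above, `header.lang` is `"ts"` and `header.fields` holds the `tangle` and `export` entries.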

Example with custom link elements:
### Custom blocks

This is a non-standard Markdown syntax extension for custom freeform content:

:::info Custom block example
In this example we're parsing the contents of these custom blocks
as _Markdown_ itself, but the overall idea is to enable all sorts
of additional "rich" content (UI components, visualizations, media
players etc.)
:::

:::warn Custom block example
Each block has its own `type` (the 1st word in the block header).
The example handler only supports `info` or `warn` types...
:::

### Emoji names

The familiar `:emoji_name:` syntax can be used to include emojis in body text.
We're using [thi.ng/emoji](https://thi.ng/emoji) for lookups (source data
from [node-emoji](https://raw.githubusercontent.com/omnidan/node-emoji/master/lib/emoji.json)).
Kewl! :sunglasses:
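
The lookup idea can be sketched in a few lines. Both `replaceEmojiNames()` and the tiny `EMOJI` table below are hypothetical stand-ins (the demo uses thi.ng/emoji for the actual data), and the sketch uses a regular expression purely for brevity, which is not necessarily how the parser handles it.

```ts
// Hypothetical mini lookup table; the real demo uses thi.ng/emoji instead
const EMOJI = new Map<string, string>([
    ["sunglasses", "😎"],
    ["wink", "😉"],
    ["white_check_mark", "✅"],
    ["x", "❌"],
]);

// Replaces each known `:emoji_name:` with its emoji,
// leaving unknown names untouched
const replaceEmojiNames = (src: string): string =>
    src.replace(
        /:([a-z0-9_+-]+):/g,
        (match: string, name: string) => EMOJI.get(name) ?? match
    );

const replaced = replaceEmojiNames("Kewl! :sunglasses:");
```

Unknown names are kept verbatim, so text such as `:not_an_emoji:` passes through unchanged.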

### Footnotes

```ts
const tags = {
link: (href, body) => ["a.link.blue", { href }, body]
};
Footnotes are supported, but this statement might need some further
explanation[^1].

serialize(iterator(parse(tags), src));
### Headings

// <h1>Hello world</h1>
// <p><a href="http://example.com" class="link blue">This</a> is a <em>test</em>. </p>
Only ATX-style headings are supported (any level). The parser also provides
content-based, auto-generated slugs/IDs (via
[`slugifyGH()`](https://docs.thi.ng/umbrella/strings/functions/slugifyGH.html "function docs"))
which are passed to the element handler.

For example, here is a [link to this section](#headings) (using ID `#headings`).

### Images

**Alt text for images is required**. `title` attributes (e.g. for tooltips) can
be given in quotes after the image URL. For example:

```markdown
![alt text](url "title text")
```

## Building locally
### Link formats

The following link formats are supported:

1. `[label](target)`
2. `[label](target "title")`
3. `[label][ref-id]` - the reference ID will have to be provided somewhere else in
the document or pre-defined via options given to the parser
4. `[[page name]]` - Wiki-style page reference, non-standard Markdown
5. `[[page name|label]]` - like 4., but with added link label
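
To make the differences between these formats concrete, here is a small sketch which classifies a single link expression. The `classifyLink()` function, the `LinkInfo` type and the regular expressions are assumptions for illustration only and do not reflect the parser's actual grammar or result format.

```ts
// Hypothetical result type, for illustration only
type LinkInfo =
    | { type: "inline"; label: string; target: string; title: string | undefined }
    | { type: "ref"; label: string; id: string }
    | { type: "wiki"; page: string; label: string };

// Classifies a string which consists of exactly one link expression
const classifyLink = (src: string): LinkInfo | undefined => {
    // formats 4 & 5: [[page name]] or [[page name|label]]
    const wiki = /^\[\[([^\]|]+)(?:\|([^\]]+))?\]\]$/.exec(src);
    if (wiki) {
        // without an explicit label, the page name doubles as label
        const [, page = "", label = page] = wiki;
        return { type: "wiki", page, label };
    }
    // formats 1 & 2: [label](target) or [label](target "title")
    const inline = /^\[([^\]]*)\]\(([^\s)]+)(?:\s+"([^"]*)")?\)$/.exec(src);
    if (inline) {
        const [, label = "", target = "", title] = inline;
        return { type: "inline", label, target, title };
    }
    // format 3: [label][ref-id]
    const ref = /^\[([^\]]*)\]\[([^\]]+)\]$/.exec(src);
    if (ref) {
        const [, label = "", id = ""] = ref;
        return { type: "ref", label, id };
    }
    return undefined;
};

const examples = [
    '[label](target "title")',
    "[label][ref-id]",
    "[[page name|label]]",
].map(classifyLink);
```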

### Blocklevel metadata

Arbitrary metadata can be assigned to any blocklevel element:

- blockquotes
- code blocks
- custom blocks
- headings
- horizontal rules
- lists
- paragraphs
- tables

See the package readme for more details. Here's an example of metadata assigned
to a headline:

{{{ some freeform metadata }}}
#### Amazing example headline title

...and another one with a custom block:

{{{
date=2023-02-25
status=done
}}}
:::foo Example block title
Just check out that metadata...
:::
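
Since the metadata body is freeform, its interpretation is up to the application. As a minimal sketch, assuming the handler receives the raw text between `{{{` and `}}}` as a string, the `key=value` lines used above could be turned into an object like so (the `parseMeta()` helper is hypothetical and not part of the package):

```ts
// Hypothetical helper: parses `key=value` lines into a plain object.
// Lines without a `=` (or with an empty key) are ignored.
const parseMeta = (body: string): Record<string, string> => {
    const meta: Record<string, string> = {};
    for (const line of body.split("\n")) {
        const idx = line.indexOf("=");
        if (idx > 0) {
            meta[line.substring(0, idx).trim()] = line.substring(idx + 1).trim();
        }
    }
    return meta;
};

const meta = parseMeta("date=2023-02-25\nstatus=done");
```

For the custom block above, this yields one entry each for `date` and `status`.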

### Tables

| Cells in... | header are treated separately |
|:------------------|:-----------------------------------------------------------------------------------------------|
| Column alignments | :white_check_mark: supported (ignored in this demo though) |
| Inline formats | :white_check_mark: _supported and **nestable**_ |
| Images | ![C-SCAPE](https://raw.githubusercontent.com/thi-ng/umbrella/develop/assets/cellular/hero.png) |
| Links | :white_check_mark: [supported](#link-formats) |
| | |
| Unsupported | :x: no linebreaks |
| | :x: no lists |
| | :x: no blockquotes |

Please refer to the [example build
instructions](https://github.com/thi-ng/umbrella/wiki/Example-build-instructions)
on the wiki.
## Onwards!

## Authors
Please see the [package
readme](https://github.com/thi-ng/umbrella/blob/develop/packages/hiccup-markdown/README.md)
& [API docs](https://docs.thi.ng/umbrella/hiccup-markdown/) for further details.
If you've got any questions, please use the [thi.ng/umbrella discussion
forum](https://github.com/thi-ng/umbrella/discussions) or [issue
tracker](https://github.com/thi-ng/umbrella/issues)...

- Karsten Schmidt
---

## License
[pkghome]: https://thi.ng/hiccup-markdown "package homepage"

© 2018 Karsten Schmidt // Apache Software License 2.0
[^1]: ...or does it really?! :wink: