parse HTML id attribute #44

sknebel · 2018-11-18T13:26:13Z

In a few places, being able to consume the HTML id attribute would be useful.

use cases

- To be able to consume fragment links to identify the relevant microformats object
  - Xray already has a feature like this (which works by preprocessing the HTML, losing some information in the process)
  - Permalinks to feed pages like https://chat.indieweb.org/2018-11-18#t1542536774501700 or https://grapefruit.zegnat.net/2018/04.html#dt201804091942Z
- For following pages with multiple feeds, it's necessary to find the same feed again, while the page author should be free to move elements around on the page
  - Feature requested e.g. by @dshanske
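
To illustrate the first use case, here is a minimal sketch of how a consumer might resolve a fragment link against parser output, assuming the parser exposes the `id` key proposed below. The function names `find_by_id` and `find_for_url` are hypothetical and not part of any existing library.

```python
from urllib.parse import urldefrag


def find_by_id(node, element_id):
    """Recursively search mf2 parser output for the object with this id.

    `node` may be the top-level parse result (with "items") or a single
    microformats object (with "children" and "properties").
    """
    if not isinstance(node, dict):
        return None
    # Only objects with a "type" are microformats objects; e-* values
    # ({"html": ..., "value": ...}) are plain dicts and are skipped.
    if "type" in node and node.get("id") == element_id:
        return node
    candidates = list(node.get("items", [])) + list(node.get("children", []))
    for values in node.get("properties", {}).values():
        candidates.extend(values)
    for candidate in candidates:
        found = find_by_id(candidate, element_id)
        if found is not None:
            return found
    return None


def find_for_url(parsed, url):
    """Return the object a fragment URL points at, or None if there is none."""
    _, fragment = urldefrag(url)
    return find_by_id(parsed, fragment) if fragment else None
```

Because the lookup goes by id rather than by position, the page author can move the element around without breaking the link.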

output format

I'd propose a new 'id' attribute on the microformats object (not a property)
i.e.

<div class="h-feed" id="updates">
<a class="u-author h-card" href="https://example.com">Max Mustermann</a>
<li class="h-entry">[...]</li>
[...]

would produce output like

{
    "items": [
        {
            "type": [ "h-feed"],
            "id": "updates",      <------------------
            "properties": {
                "author": ...
            },
            "children": [
                {
                    "type": [
                        "h-entry"
                    ],
                    ...
}

This format should be completely backwards compatible.

imply `uid`?

In the discussion in IRC and in microformats/php-mf2#206, it was also proposed to automatically imply a uid property based on the document URL and the id as a fragment.

I don't think this is a good idea for a few reasons:

- I'm not confident that this will not interact weirdly with concepts like authorship, representative h-card, etc. uid seems fairly core to the identity of an object, and I'd prefer leaving it to the author.
- For the feed use case, it's not necessarily desirable to use the URL of the resulting document, which would be reflected in the uid, if redirects are involved. Feed consumers should follow HTTP 302/307, but not remember those URLs. As such, the correct thing to remember is not the URL of the resulting document + a fragment, but the URL the redirect was found at + the fragment. The parser cannot construct this, since it isn't aware of that URL.
- EDIT: also, implying the uid could be a problem if the author later adds one, e.g. because they added a dedicated page for the feed that didn't exist before.
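
The redirect point can be sketched as follows. This is only an illustration under assumed URLs: `implied_uid` is a hypothetical helper showing what a parser would have to build from the final document URL, and `subscription_key` is a hypothetical helper for what the consumer should remember instead.

```python
from urllib.parse import urldefrag


def implied_uid(final_document_url, element_id):
    """What a parser could imply: the URL it was handed, plus the id as fragment."""
    base, _ = urldefrag(final_document_url)
    return base + "#" + element_id


def subscription_key(subscribed_url):
    """What the consumer should remember: the URL it was originally given.

    Returns the URL to fetch (fragment stripped) and the fragment used to
    locate the feed in the parsed result.
    """
    fetch_url, fragment = urldefrag(subscribed_url)
    return fetch_url, fragment


# Hypothetical example: the subscribed URL answers with a temporary redirect.
subscribed = "https://example.com/feeds#updates"
final_after_redirect = "https://example.com/2018/feeds.html"

fetch_url, fragment = subscription_key(subscribed)
parser_view = implied_uid(final_after_redirect, fragment)
```

Here `parser_view` is built from the redirect target, so it differs from `subscribed`, the URL the consumer has to keep following.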


sknebel · 2018-11-18T14:51:08Z

spec change proposal

Extend http://microformats.org/wiki/microformats2-parsing#parse_a_document_for_microformats with the new last bullet point:

- else if found, start parsing a new microformat
  - keep track of whether the root class name(s) was from backcompat
  - create a new { } structure with:
    - type: [array of unique microformat "h-*" type(s) on the element sorted alphabetically],
    - properties: { } - to be filled in when that element itself is parsed for microformats properties
    - if the element has a non-empty HTML id attribute:
      - id: string value of the HTML id attribute of the element

EDIT: text clarified that id has to be non-empty (it being empty isn't valid HTML anyways).
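
As a rough sketch of the proposed step (not how mf2py or php-mf2 implement it), the new structure could be built like this. The function `new_microformat` and its inputs are hypothetical: `classes` is the element's list of class names and `attrs` its attribute mapping. Backcompat tracking is left out, and the `h-` prefix check is a simplification of the real root class name rules.

```python
def new_microformat(classes, attrs):
    """Create the { } structure for an element that carries root class names."""
    # Unique "h-*" types, sorted alphabetically.
    types = sorted({name for name in classes if name.startswith("h-")})
    item = {"type": types, "properties": {}}
    # Only emit "id" when the attribute is present and non-empty.
    element_id = attrs.get("id")
    if element_id:
        item["id"] = element_id
    return item
```

Leaving the key out entirely when there is no id keeps the output identical to what existing consumers already see.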

gRegorLove · 2018-11-19T00:46:26Z

Sounds like good reasoning and a reasonable spec update. I'm in favor and can implement in php-mf2 pretty easily.

dshanske · 2018-12-24T22:04:14Z

As a user of the php-mf2 parser in my Parse This library, I would find this useful.

jalcine · 2018-12-24T22:16:29Z

This could help out quite a bit with the Elixir implementation of Microformats2. I do see the potential issue with using u-uid and have been opting to use u-uid in Koype but this would make things more explicit (which is better).

dshanske · 2018-12-31T06:12:25Z

I changed my post-processing of the parser output to take the id, now available in the php-mf2 master branch, and use it to create a URL with a fragment for each feed, which lets me enumerate the feeds individually. That will help me parse them as individual elements should someone request a specific feed.

tantek · 2018-12-31T06:35:46Z

Resolution: proposal accepted.

No objections in above discussion, and positive opinions (👍) from a few implementors on the proposal.

Proposal implementations in the mf2py and php-mf2 parsers, and verification by https://github.com/dshanske that the php-mf2 implementation satisfies the use case for the issue, are sufficient to demonstrate implementability and utility, all as noted/linked in the issue thread.

Editing specification accordingly.

(Originally published at: http://tantek.com/2018/364/t3/)

ref microformats/microformats2-parsing#44

dshanske mentioned this issue Nov 18, 2018

Add optional ID for h-* elements microformats/php-mf2#206

Closed

sknebel mentioned this issue Nov 18, 2018

Implement HTML id parsing proposal microformats/php-mf2#207

Merged

sknebel mentioned this issue Nov 19, 2018

implement id parsing microformats/mf2py#143

Merged

tantek changed the title ~~parse HTML id= property~~ parse HTML id attribute Dec 31, 2018

tantek closed this as completed Dec 31, 2018

sknebel mentioned this issue Jul 24, 2019

URLs with fragments should only have the node identified by the fragment and its descendents parsed #46

Closed

willnorris added a commit to willnorris/microformats that referenced this issue May 21, 2020

add support for HTML IDs attributes

89baa61

ref microformats/microformats2-parsing#44

Zegnat mentioned this issue Sep 13, 2020

No test with nested IDs. microformats/tests#120

Open
