Unicode error with non-ascii Infobox boxterm #138

Open
@baerbock

Description

I would like to extract the infobox of the Bulgarian Railway Line No. 1 article.

import wptools
page = wptools.page('Железопътна линия 1 (България)', lang='bg')
page.get_parse()
page.data['ЖП линия']

which fails with a Unicode error.

Are infoboxes not detected if they are named in an unusual manner (ЖП линия)?
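To narrow down whether the non-ASCII name itself is the problem, it may help to check that a template called ЖП линия can be located in raw wikitext without going through wptools at all. The following is only a minimal sketch: `find_template` is a hypothetical helper (not part of wptools), and the sample wikitext and its parameter names are made up for illustration rather than taken from the actual article.

```python
import re


def find_template(wikitext, boxterm):
    """Return the raw text of the first template whose name starts with
    `boxterm`, or None if no such template is found.

    Hypothetical helper for illustration; this is not how wptools works.
    """
    # Look for "{{", optional whitespace, then the (possibly non-ASCII) name.
    pattern = re.compile(r"\{\{\s*" + re.escape(boxterm), re.IGNORECASE)
    match = pattern.search(wikitext)
    if match is None:
        return None

    # Walk forward and track brace depth so nested templates stay inside.
    depth = 0
    i = match.start()
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                return wikitext[match.start():i]
        else:
            i += 1
    return None  # braces never balanced


# Made-up sample wikitext (assumption, not the real article source).
sample = (
    "Intro text.\n"
    "{{ЖП линия\n"
    "| име = Линия 1\n"
    "| дължина = {{convert|100|km}}\n"
    "}}\n"
    "More text."
)

box = find_template(sample, "ЖП линия")
print(box)
```

If a plain search like this finds the template, the name is matchable as ordinary Unicode text in Python 3, which would suggest the failure comes from how the name is encoded or compared somewhere else in the pipeline.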
