Gutenberg-LD

Gutenberg-LD is a suite of scripts and models that refactor the metadata of the Project Gutenberg digital library, with the aim of turning it into a proper Linked Data dataset.

Features

Reconciliation of blank nodes, resulting in a much smaller dataset (~29% smaller as of March 2020)
Linking with Library of Congress subject headings and classification systems
Structuring of table-of-contents data
Ontology alignment of undocumented Gutenberg terms
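
To illustrate why reconciling blank nodes shrinks the dataset, here is a minimal sketch. It is not the code used by this project: it assumes triples are plain Python tuples, that blank nodes carry the N-Triples `_:` prefix, and that two blank nodes can be merged when they have exactly the same predicate/object pairs. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def is_blank(term):
    """In this sketch, blank nodes are strings with the N-Triples '_:' prefix."""
    return term.startswith("_:")


def reconcile_blank_nodes(triples):
    """Merge blank nodes that share an identical set of (predicate, object) pairs.

    This is a single pass: blank nodes that point at other blank nodes are
    only merged if they point at the very same node.
    """
    # Collect the outgoing description of every blank-node subject.
    descriptions = defaultdict(set)
    for s, p, o in triples:
        if is_blank(s):
            descriptions[s].add((p, o))

    # Map each distinct description to the first blank node that carries it.
    canonical = {}
    replacement = {}
    for node in sorted(descriptions):
        key = frozenset(descriptions[node])
        replacement[node] = canonical.setdefault(key, node)

    # Rewrite subjects and objects; the set drops the now-duplicate triples.
    return {
        (replacement.get(s, s), p, replacement.get(o, o))
        for s, p, o in triples
    }


# Illustrative data only: three books, two of which share the same format description.
sample = [
    ("ebook:1", "dcterms:format", "_:b1"),
    ("_:b1", "rdf:value", '"text/plain"'),
    ("ebook:2", "dcterms:format", "_:b2"),
    ("_:b2", "rdf:value", '"text/plain"'),
    ("ebook:3", "dcterms:format", "_:b3"),
    ("_:b3", "rdf:value", '"text/html"'),
]
merged = reconcile_blank_nodes(sample)
print(len(sample), "->", len(merged))
```

In the sample, `_:b1` and `_:b2` describe the same thing, so the second one collapses into the first and its duplicate description disappears.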

Requirements

You need:

Python 3
An RDF store that supports SPARQL querying and updating over HTTP (e.g. Jena Fuseki, Virtuoso, Blazegraph)
The Project Gutenberg catalog as RDF, available for download at https://www.gutenberg.org/wiki/Gutenberg:Feeds

Usage

Download the metadata set from Project Gutenberg and load it into your RDF store.
cd gutenberg-fixes
In settings.py, set the SPARQL service and the RDF graph name.
python refactor.py bookshelves formats toc (or any subset of the three arguments)
In gutenberg-fixes/queries you can find other SPARQL queries to run yourself.
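
Before running the refactoring scripts, it can help to check that the catalog was actually loaded into the graph you configured. The sketch below is not part of this repository: it uses only the Python standard library to send a query following the SPARQL 1.1 Protocol (a form-encoded POST). The endpoint URL and the graph name are hypothetical placeholders, so substitute the values you put in settings.py.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical values: replace with the service and graph set in settings.py.
ENDPOINT = "http://localhost:3030/gutenberg/sparql"
GRAPH = "http://example.org/graph/gutenberg"


def build_count_request(endpoint, graph):
    """Build a SPARQL 1.1 Protocol POST request that counts the triples in a named graph."""
    query = "SELECT (COUNT(*) AS ?n) WHERE { GRAPH <%s> { ?s ?p ?o } }" % graph
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def count_triples(endpoint, graph):
    """Send the request and return the triple count as an int."""
    with urllib.request.urlopen(build_count_request(endpoint, graph)) as response:
        results = json.load(response)
    return int(results["results"]["bindings"][0]["n"]["value"])
```

Calling `count_triples(ENDPOINT, GRAPH)` against a running store should return a non-zero number once the catalog is loaded. Running it again after refactor.py gives a rough view of how much the dataset changed.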

Licensing

Gutenberg-LD is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.
