Roadmap and rationale

Peter edited this page Nov 20, 2017 · 3 revisions

Presentation and rationale

Some "what-and-why" items that explain the project's motivation. Or jump to the roadmap.

WHAT

  • Transfer standard Datasets to a PostgreSQL database, transforming them into a fast, reliable and compact data representation (CSV lines as JSONb arrays).

  • Offer "standard SQL VIEWs" (standard table names and column names) for the SQL representation of each dataset.
    PS: this also offers a simple, universal data-transformation language (SQL) to describe, in a standard and reproducible way, the data provenance of my datasets (and perhaps of any standard dataset).

  • Do class modeling of all datasets (illustrated below, from sql-unifier/src/Appendix).

  • Build new datasets easily, as mixed datasets derived from standard ones, by knowing or modeling their relationships (SQL JOINs, etc.). See the illustration below.

  • See standard Datasets and Datasets-BR as an "ecosystem of reliable data" that has long-term digital preservation in git repositories, and enhanced access through standard class modeling, standard API interfaces (such as GraphQL with PostGraphQL), or consumption as standard SQL tables.

  • ...
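
To make the first two items concrete, here is a minimal sketch of the idea, not the project's actual code. It turns each CSV line into a JSON array (the value that would go into a JSONb column) and builds the text of a PostgreSQL view that gives each array position a standard column name. The table name dataset_row, its columns dataset_id and c, the view name vw_country and the sample data are all hypothetical, chosen only for illustration.

```python
import csv
import io
import json


def csv_to_json_arrays(csv_text):
    """Split CSV text into (header, rows).

    header is the list of column names from the first line; rows is a list
    of JSON array strings, one per data line, ready for a JSONb column.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [json.dumps(line, ensure_ascii=False) for line in reader]
    return header, rows


def view_ddl(view_name, dataset_id, header, table="dataset_row"):
    """Build the text of a PostgreSQL view naming each array position.

    Assumes a hypothetical table with columns (dataset_id int, c jsonb).
    Identifiers are interpolated without escaping, so this is only safe
    for trusted names; it is a sketch, not production code.
    """
    cols = ",\n  ".join(
        f'c->>{i} AS "{name}"' for i, name in enumerate(header)
    )
    return (
        f"CREATE VIEW {view_name} AS\n"
        f"SELECT\n  {cols}\n"
        f"FROM {table}\n"
        f"WHERE dataset_id = {dataset_id};"
    )


# Hypothetical sample dataset.
sample = "code,name\nBR,Brazil\nPT,Portugal\n"
header, rows = csv_to_json_arrays(sample)
ddl = view_ddl("vw_country", 1, header)
print(rows)
print(ddl)
```

The printed DDL uses the PostgreSQL operator ->> with an integer index, which returns an array element as text. Because every dataset shares one table, a mixed dataset could then be an ordinary SQL JOIN between two such views.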

WHY

  • To be portable: all datasets in one table.

  • To be easy for me: I like SQL, and testing and learning other (non-standard) tools like CSVkit has a cost.

  • To be easy to produce new mixed datasets, in a reliable and standard way.

  • To be easy to use the Datasets in PostgreSQL databases, as an independent, refreshable and flexible SQL-SCHEMA.

  • To be easy to offer the Datasets to my users, through standard APIs.

  • ...

Roadmap and rationale

... next steps? ...

  1. Consensus about a "SQL-kernel" framework to do all basic things.

  2. Test with all

  3. Test with other people and with import/export/analysis tools such as CSVkit or Goodtables.

  4. Test with all datasets of http://github.com/datasets

  5. API: consume the datasets online through the GraphQL interface of the PostGraphQL project... Or others like http://postgrest.com

  6. Expand and finish.