Collect, aggregate, and visualize a data ecosystem's metadata

kostas-theo/marquez

Marquez

Marquez is a core service for the collection, aggregation, and visualization of all metadata within a data ecosystem. It maintains the provenance of how datasets are consumed and produced, provides visibility into job runtime and frequency of dataset access, centralizes dataset lifecycle management, and much more.

Status

This project is under active development at WeWork and Stitch Fix (in collaboration with many other organizations).

Documentation

The Marquez design is being actively updated and is open for comments.

Requirements

  • Java 8 or above
  • PostgreSQL database
  • Gradle 4.9 or above
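
Checking the Java requirement is slightly awkward because Java 8 and earlier report versions as 1.x (for example 1.8.0_292), while later releases report the major version first (for example 11.0.2). A minimal sketch of a helper that normalizes both forms follows; the name java_major is hypothetical and not part of Marquez:

```shell
# Hypothetical helper (not part of Marquez): print the major Java version
# for a version string such as "1.8.0_292" or "11.0.2".
java_major() {
  case "$1" in
    1.*) v=${1#1.}; echo "${v%%.*}" ;;   # legacy scheme: 1.8.0_292 -> 8
    *)   echo "${1%%[.+-]*}" ;;          # modern scheme: 11.0.2 -> 11
  esac
}

# Possible usage, assuming `java` is on the PATH:
#   ver=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
#   [ "$(java_major "$ver")" -ge 8 ] || echo "Java 8 or above is required" >&2
```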

Building

To build the entire project run:

$ ./gradlew shadowJar

The executable JAR can be found under build/libs/.

Configuration

Note: When creating your database, we recommend calling it marquez.

To run Marquez, you will have to define config.yml. The configuration file is used to specify your database connection. Please copy and edit config.example.yml:

$ cp config.example.yml config.yml

Edit the following parameters in the config.yml you created based on your environment:

  DB name (must be created beforehand):         POSTGRESQL_DB_NAME
  DB user:                                      POSTGRESQL_USER
  DB password:                                  POSTGRESQL_PASSWORD
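
Before migrating, it can help to confirm that none of the parameters above were left unedited. The sketch below assumes the three names appear verbatim as placeholder text in the copied file, which may not match your config.example.yml; check_config is a hypothetical helper, not a Marquez command:

```shell
# Hypothetical helper (not part of Marquez): list any placeholder names
# still present in a config file and fail if one is found.
# Assumption: the placeholders appear literally in the file.
check_config() {
  if grep -nE 'POSTGRESQL_(DB_NAME|USER|PASSWORD)' "$1"; then
    echo "unedited placeholders found in $1" >&2
    return 1
  fi
  echo "no placeholders left in $1"
}

# Possible usage:
#   check_config config.yml
```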

Then run the database migration:

$ ./gradlew run --args 'db migrate config.yml'

Running the Application

$ ./gradlew run --args 'server config.yml'

Then browse to the admin interface: http://localhost:8081
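
The server may take a few seconds to start, so a script that depends on it can poll the admin URL first. This is a minimal sketch assuming curl is installed; wait_for_url and its default attempt count and delay are illustrative choices, not part of Marquez:

```shell
# Hypothetical helper (not part of Marquez): poll a URL until it answers
# with a non-error HTTP status, or give up after a number of attempts.
#   $1 = URL, $2 = attempts (default 30), $3 = delay in seconds (default 1)
wait_for_url() {
  url=$1
  attempts=${2:-30}
  delay=${3:-1}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl --silent --fail --output /dev/null "$url"; then
      echo "up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not reachable after $attempts attempts: $url" >&2
  return 1
}

# Possible usage:
#   wait_for_url http://localhost:8081
```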
