Marquez uses Helm to manage deployments onto Kubernetes in a cloud environment. The chart and templates for the HTTP API server and Web UI are maintained in the Marquez repository and can be found in the chart directory. The chart’s base values.yaml file holds the default deployment settings, which you can override as needed.
Note: The Marquez HTTP API server and Web UI images are published to Docker Hub.
The Marquez HTTP API server relies only on PostgreSQL to store dataset, job, and run metadata, which keeps operational overhead minimal. We recommend a cloud-provided database, such as AWS RDS, when deploying Marquez onto Kubernetes.
Figure 1: Minimal Marquez deployment via Docker.
Figure 2: Marquez deployment via Kubernetes.
|Component|Image|Description|
|---|---|---|
|Marquez Web UI|marquezproject/marquez-web|The web UI used to view metadata.|
|Marquez HTTP API|marquezproject/marquez|The core API used to collect metadata using OpenLineage.|
|Database|bitnami/postgresql or cloud-provided|A PostgreSQL instance used to store metadata.|
|Scheduler|User-provided|A scheduler used to run a workflow on a particular schedule (e.g., Airflow).|
|Workflow|User-provided|A workflow using an OpenLineage integration to send lineage metadata to Marquez.|
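
To make the Workflow and Marquez HTTP API rows more concrete, here is a minimal sketch of the kind of request an OpenLineage integration sends. It builds a bare-bones OpenLineage run event and wraps it in a POST to the API's lineage endpoint. The base URL `http://localhost:5000`, the `/api/v1/lineage` path, the `producer` URI, and the `schemaURL` version are assumptions for illustration; `build_run_event` and `build_lineage_request` are hypothetical helper names, not part of any Marquez or OpenLineage client. In practice an OpenLineage integration would normally emit these events for you.

```python
import json
import urllib.request
import uuid
from datetime import datetime, timezone


def build_run_event(namespace: str, job_name: str, event_type: str = "START") -> dict:
    """Build a minimal OpenLineage-style run event as a plain dict."""
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [],
        "outputs": [],
        # Assumed values: replace with your producer URI and spec version.
        "producer": "https://example.com/my-workflow",
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent",
    }


def build_lineage_request(base_url: str, event: dict) -> urllib.request.Request:
    """Wrap an event in a POST request to the (assumed) lineage endpoint."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/lineage",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    event = build_run_event("my-namespace", "my-job")
    request = build_lineage_request("http://localhost:5000", event)
    # Uncomment to actually send the event to a running Marquez instance:
    # urllib.request.urlopen(request)
    print(request.get_method(), request.full_url)
```

The request is only constructed, not sent, so the sketch runs without a live deployment.
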
By default, the Marquez HTTP API does not require any form of authentication or authorization. Our clients do support authentication: when an API key is configured at client instantiation, the client automatically sends it on each request using Bearer authentication.
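
As a rough illustration of what Bearer authentication looks like on the wire, the sketch below attaches an API key to a request as an `Authorization: Bearer <key>` header using only the Python standard library. It is not the Marquez client implementation. `build_api_request` is a hypothetical helper, and the base URL and `/api/v1/namespaces` path are assumptions. Whether the key is actually checked depends on how your deployment is set up.

```python
import urllib.request
from typing import Optional


def build_api_request(
    base_url: str, path: str, api_key: Optional[str] = None
) -> urllib.request.Request:
    """Build a GET request, adding a Bearer token only when a key is given."""
    request = urllib.request.Request(base_url.rstrip("/") + path)
    if api_key:
        # Bearer auth: the key travels in the Authorization header.
        request.add_header("Authorization", f"Bearer {api_key}")
    return request


if __name__ == "__main__":
    # Assumed URL and path, and a placeholder key, for illustration only.
    request = build_api_request(
        "http://localhost:5000", "/api/v1/namespaces", api_key="my-api-key"
    )
    print(request.full_url)
    print(request.get_header("Authorization"))
```

Leaving `api_key` unset produces a request with no `Authorization` header, which matches the API's default unauthenticated behavior.
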
The following guides will help you and your team effectively deploy and manage Marquez in a cloud environment: