Marquez (0.10.4)

License: Apache 2.0

Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem's metadata.

Namespaces

Create a namespace

Creates a new namespace object. A namespace enables the contextual grouping of related jobs and datasets. Namespace names must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and may be at most 1024 characters long. Names are case-insensitive. Note that jobs and datasets are unique within a namespace, but not across namespaces.
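The naming rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of Marquez: the helper names `is_valid_namespace` and `same_namespace` are hypothetical, and the lower bound of one character is an assumption, since the text only states the maximum length.

```python
import re

# Letters, digits, and underscores only, up to 1024 characters.
# The minimum length of 1 is an assumption; the rules above only give a maximum.
_NAMESPACE_RE = re.compile(r"[A-Za-z0-9_]{1,1024}")


def is_valid_namespace(name: str) -> bool:
    """Return True if `name` satisfies the documented namespace rules."""
    return _NAMESPACE_RE.fullmatch(name) is not None


def same_namespace(a: str, b: str) -> bool:
    """Compare two namespace names case-insensitively."""
    return a.lower() == b.lower()


print(is_valid_namespace("wework"))   # valid: letters only
print(is_valid_namespace("we-work"))  # invalid: contains a hyphen
```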

path Parameters
namespace
required
string <= 1024 characters
Example: wework

The name of the namespace.

Request Body schema: application/json
ownerName
required
string

The owner of the namespace.

description
string

The description of the namespace.

Responses

200

OK

put /namespaces/{namespace}

Local API server

http://localhost:5000/api/v1/namespaces/{namespace}

Request samples

Content type
application/json
{
  "ownerName": "dataengineering",
  "description": "A namespace for core jobs and datasets at wework."
}

Response samples

Content type
application/json
{
  "name": "wework",
  "createdAt": "2019-05-09T19:49:24.201Z",
  "updatedAt": "2019-05-09T19:49:24.201Z",
  "ownerName": "dataengineering",
  "description": "A namespace for core jobs and datasets at wework."
}
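As an illustration of the call described above, the sketch below builds the PUT request with Python's standard library. The helper names `build_create_namespace_request` and `send` are hypothetical; the URL and body fields come from this page. The request is only constructed here, because sending it requires a Marquez server running locally.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000/api/v1"  # local API server listed above


def build_create_namespace_request(namespace, owner_name, description=None):
    """Build (but do not send) the PUT request that creates a namespace."""
    body = {"ownerName": owner_name}
    if description is not None:
        body["description"] = description
    url = f"{BASE_URL}/namespaces/{urllib.parse.quote(namespace, safe='')}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request):
    """Send a request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_create_namespace_request(
    "wework",
    "dataengineering",
    "A namespace for core jobs and datasets at wework.",
)
print(req.get_method(), req.full_url)
```

Calling `send(req)` against a running server should return a namespace object shaped like the response sample.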

Retrieve a namespace

Returns a namespace.

path Parameters
namespace
required
string <= 1024 characters
Example: wework

The name of the namespace.

Responses

200

OK

get /namespaces/{namespace}

Local API server

http://localhost:5000/api/v1/namespaces/{namespace}

Response samples

Content type
application/json
{
  "name": "wework",
  "createdAt": "2019-05-09T19:49:24.201Z",
  "updatedAt": "2019-05-09T19:49:24.201Z",
  "ownerName": "dataengineering",
  "description": "A namespace for core jobs and datasets at wework."
}
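A client will usually want the `createdAt` and `updatedAt` strings as real timestamps. The sketch below parses the response sample with the standard library. It assumes that the trailing `Z` denotes UTC and that timestamps always carry fractional seconds, as in the sample; `parse_timestamp` is a hypothetical helper name.

```python
import json
from datetime import datetime, timezone

# The response sample above, copied verbatim.
sample = """
{
  "name": "wework",
  "createdAt": "2019-05-09T19:49:24.201Z",
  "updatedAt": "2019-05-09T19:49:24.201Z",
  "ownerName": "dataengineering",
  "description": "A namespace for core jobs and datasets at wework."
}
"""


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2019-05-09T19:49:24.201Z as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


namespace = json.loads(sample)
created_at = parse_timestamp(namespace["createdAt"])
print(namespace["name"], created_at.isoformat())
```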

List all namespaces

Returns a list of namespaces.

query Parameters
limit
integer

The number of results to return, starting at offset.

offset
integer

The initial position from which to return results.

Responses

200

OK

get /namespaces

Local API server

http://localhost:5000/api/v1/namespaces

Response samples

Content type
application/json
{
  "namespaces": []
}
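The `limit` and `offset` query parameters can be combined to page through a list endpoint. The sketch below only builds the URLs. `build_list_url` and `page_urls` are hypothetical helper names, and treating `offset` as zero-based is an assumption that this page does not confirm.

```python
import urllib.parse

BASE_URL = "http://localhost:5000/api/v1"  # local API server listed above


def build_list_url(resource, limit=None, offset=None):
    """Build the URL for a list endpoint, adding limit/offset only when given."""
    query = {}
    if limit is not None:
        query["limit"] = limit
    if offset is not None:
        query["offset"] = offset
    url = f"{BASE_URL}/{resource}"
    if query:
        url += "?" + urllib.parse.urlencode(query)
    return url


def page_urls(resource, page_size, pages):
    """Yield URLs for consecutive pages of `page_size` results each."""
    for page in range(pages):
        # Assumes offset counts from zero.
        yield build_list_url(resource, limit=page_size, offset=page * page_size)


for url in page_urls("namespaces", page_size=10, pages=3):
    print(url)
```

The same helpers apply to the other list endpoints on this page by changing the `resource` argument.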

Sources

Create a source

Creates a new source object. A source is the physical location of a dataset, such as a table in PostgreSQL or a topic in Kafka. A source enables physical datasets to be grouped by the physical source they belong to.

path Parameters
source
required
string <= 1024 characters
Example: analytics_db

The name of the source.

Request Body schema: application/json
type
required
string
Enum: "MYSQL" "POSTGRESQL" "REDSHIFT" "SNOWFLAKE" "KAFKA"

The type of the source.

connectionUrl
required
string <URL>

The URL to be used to connect to the source.

description
string

The description of the source.

Responses

200

OK

put /sources/{source}

Local API server

http://localhost:5000/api/v1/sources/{source}

Request samples

Content type
application/json
{
  "type": "POSTGRESQL",
  "connectionUrl": "jdbc:postgresql://localhost:5431/reports"
}

Response samples

Content type
application/json
{
  "type": "POSTGRESQL",
  "name": "analytics_db",
  "createdAt": "2019-05-09T19:49:24.201Z",
  "updatedAt": "2019-05-09T19:49:24.201Z",
  "connectionUrl": "jdbc:postgresql://localhost:5431/reports"
}
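Because `type` is restricted to the enum values listed above, a client can reject an unsupported value before contacting the server. The following sketch shows one way to do that. `build_create_source_request` is a hypothetical helper, and the request is built but not sent.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000/api/v1"  # local API server listed above

# Enum values copied from the request body schema above.
SOURCE_TYPES = {"MYSQL", "POSTGRESQL", "REDSHIFT", "SNOWFLAKE", "KAFKA"}


def build_create_source_request(source, source_type, connection_url, description=None):
    """Build (but do not send) the PUT request that creates a source."""
    if source_type not in SOURCE_TYPES:
        raise ValueError(f"unsupported source type: {source_type!r}")
    body = {"type": source_type, "connectionUrl": connection_url}
    if description is not None:
        body["description"] = description
    url = f"{BASE_URL}/sources/{urllib.parse.quote(source, safe='')}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


source_req = build_create_source_request(
    "analytics_db", "POSTGRESQL", "jdbc:postgresql://localhost:5431/reports"
)
print(source_req.get_method(), source_req.full_url)
```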

Get a source

Returns a source.

path Parameters
source
required
string <= 1024 characters
Example: analytics_db

The name of the source.

Responses

200

OK

get /sources/{source}

Local API server

http://localhost:5000/api/v1/sources/{source}

Response samples

Content type
application/json
{
  "type": "POSTGRESQL",
  "name": "analytics_db",
  "createdAt": "2019-05-09T19:49:24.201Z",
  "updatedAt": "2019-05-09T19:49:24.201Z",
  "connectionUrl": "jdbc:postgresql://localhost:5431/reports"
}

List all sources

Returns a list of sources.

query Parameters
limit
integer

The number of results to return, starting at offset.

offset
integer

The initial position from which to return results.

Responses

200

OK

get /sources

Local API server

http://localhost:5000/api/v1/sources

Response samples

Content type
application/json
{
  "sources": []
}

Datasets

Create a dataset

Creates a new dataset object.

path Parameters
namespace
required
string <= 1024 characters
Example: wework

The name of the namespace.

dataset
required
string <= 1024 characters
Example: wedata.room_bookings

The name of the dataset.

Request Body schema: application/json
One of: DB_TABLE, STREAM (the fields for DB_TABLE are listed below)
type
required
enum
Value: "DB_TABLE"

The type of the dataset.

physicalName
required
string

The physical name of the table.

sourceName
required
string

The name of the source associated with the table.

fields
Array of objects

The fields of the table.

tags
array

List of tags.

description
string

The description of the table.

runId
string

The ID associated with the run modifying the table.

Responses

200

OK

put /namespaces/{namespace}/datasets/{dataset}

Local API server

http://localhost:5000/api/v1/namespaces/{namespace}/datasets/{dataset}

Request samples

Content type
application/json
{
  "type": "DB_TABLE",
  "physicalName": "wedata.room_bookings",
  "sourceName": "analytics_db",
  "fields": [],
  "tags": [],
  "description": "All room booking occupancy data."
}

Response samples

Content type
application/json
{
  "type": "DB_TABLE",
  "name": "wedata.room_bookings",
  "physicalName": "wedata.room_bookings",
  "createdAt": "2019-05-09T19:49:24.201Z",
  "updatedAt": "2019-05-09T19:49:24.201Z",
  "sourceName": "analytics_db",
  "fields": [],
  "tags": [],
  "lastModifiedAt": null,
  "description": "All room booking occupancy data."
}
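The DB_TABLE body has three required fields and several optional ones. The sketch below shows one way to assemble it so that optional fields left unset are omitted from the JSON. `build_create_dataset_request` is a hypothetical helper; whether the server treats an omitted field differently from an empty one is not stated on this page.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000/api/v1"  # local API server listed above


def build_create_dataset_request(namespace, dataset, physical_name, source_name,
                                 fields=None, tags=None, description=None, run_id=None):
    """Build (but do not send) the PUT request that creates a DB_TABLE dataset."""
    body = {
        "type": "DB_TABLE",
        "physicalName": physical_name,
        "sourceName": source_name,
    }
    optional = {
        "fields": fields,
        "tags": tags,
        "description": description,
        "runId": run_id,
    }
    # Only include optional fields the caller actually supplied.
    body.update({key: value for key, value in optional.items() if value is not None})
    url = "{}/namespaces/{}/datasets/{}".format(
        BASE_URL,
        urllib.parse.quote(namespace, safe=""),
        urllib.parse.quote(dataset, safe=""),
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


dataset_req = build_create_dataset_request(
    "wework",
    "wedata.room_bookings",
    physical_name="wedata.room_bookings",
    source_name="analytics_db",
    fields=[],
    tags=[],
    description="All room booking occupancy data.",
)
print(dataset_req.get_method(), dataset_req.full_url)
```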

Get a dataset

Returns a dataset.

path Parameters
namespace
required
string <= 1024 characters
Example: wework

The name of the namespace.

dataset
required
string <= 1024 characters
Example: wedata.room_bookings

The name of the dataset.

Responses

200

OK

get /namespaces/{namespace}/datasets/{dataset}

Local API server

http://localhost:5000/api/v1/namespaces/{namespace}/datasets/{dataset}

Response samples

Content type
application/json
{
  "type": "DB_TABLE",
  "name": "wedata.room_bookings",
  "physicalName": "wedata.room_bookings",
  "createdAt": "2019-05-09T19:49:24.201Z",
  "updatedAt": "2019-05-09T19:49:24.201Z",
  "sourceName": "analytics_db",
  "fields": [],
  "tags": [],
  "lastModifiedAt": null,
  "description": "All room booking occupancy data."
}

List all datasets

Returns a list of datasets.

path Parameters
namespace
required
string <= 1024 characters
Example: wework

The name of the namespace.

query Parameters
limit
integer

The number of results to return, starting at offset.

offset
integer

The initial position from which to return results.

Responses

200

OK

get /namespaces/{namespace}/datasets

Local API server

http://localhost:5000/api/v1/namespaces/{namespace}/datasets

Response samples

Content type
application/json
{
  "datasets": []
}

Tag a dataset

Tag an existing dataset.

path Parameters
namespace
required
string <= 1024 characters
Example: wework

The name of the namespace.

dataset
required
string <= 1024 characters
Example: wedata.room_bookings

The name of the dataset.

tag
required
string
Example: SENSITIVE

The name of the tag.

Responses

200

OK

post /namespaces/{namespace}/datasets/{dataset}/tags/{tag}

Local API server

http://localhost:5000/api/v1/namespaces/{namespace}/datasets/{dataset}/tags/{tag}

Response samples

Content type
application/json
{
  "type": "DB_TABLE",
  "name": "wedata.room_bookings",
  "physicalName": "wedata.room_bookings",
  "createdAt": "2019-05-09T19:49:24.201Z",
  "updatedAt": "2019-05-09T19:49:24.201Z",
  "sourceName": "analytics_db",
  "fields": [],
  "tags": [],
  "lastModifiedAt": null,
  "description": "All room booking occupancy data."
}
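The tagging endpoint takes all of its input from the path and has no request body. A minimal sketch of building that POST request follows. `build_tag_dataset_request` is a hypothetical helper, and percent-encoding each path segment is a defensive assumption rather than a documented requirement.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000/api/v1"  # local API server listed above


def build_tag_dataset_request(namespace, dataset, tag):
    """Build (but do not send) the POST request that tags a dataset."""
    # Encode each segment so characters such as "/" cannot alter the path.
    segments = [urllib.parse.quote(part, safe="") for part in (namespace, dataset, tag)]
    url = "{}/namespaces/{}/datasets/{}/tags/{}".format(BASE_URL, *segments)
    return urllib.request.Request(url, method="POST")


tag_req = build_tag_dataset_request("wework", "wedata.room_bookings", "SENSITIVE")
print(tag_req.get_method(), tag_req.full_url)
```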

Tag a field

Tag an existing field of a dataset.

path Parameters
namespace
required
string <= 1024 characters
Example: wework

The name of the namespace.

dataset
required
string <= 1024 characters
Example: wedata.room_bookings

The name of the dataset.

field
required
string
Example: member_id

The name of the field.

tag
required
string
Example: SENSITIVE

The name of the tag.

Responses