PHDI Tabulation Service (1.2.11)


Contact: CDC Public Health Data Infrastructure (dmibuildingblocks@cdc.gov)
URL: https://cdcgov.github.io/phdi-site/
License: Creative Commons Zero v1.0 Universal

Getting Started with the PHDI Tabulation Service

Introduction

The PHDI tabulation service offers a REST API to extract and tabulate data from a FHIR server according to a user-defined schema, and then persist the data in one of several supported formats. More information about this process and writing schemas can be found here.

Running the Tabulation Service

The tabulation service can be run using Docker (or any other OCI container runtime, e.g., Podman), or directly from the Python source code.

Running with Docker (Recommended)

To run the tabulation service with Docker, follow these steps.

Confirm that you have Docker installed by running docker -v. If you do not see a response similar to what is shown below, follow these instructions to install Docker.
```
❯ docker -v
Docker version 20.10.21, build baeda1f
```
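Download a copy of the Docker image from the PHDI repository by running docker pull ghcr.io/cdcgov/phdi/tabulation:latest.
Run the service with docker run -p 8080:8080 -v $(pwd):/code ghcr.io/cdcgov/phdi/tabulation:latest. The image name must match the one that was pulled; use tabulation:latest only if you built the image locally under that tag.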

Congratulations, the tabulation service should now be running on localhost:8080!

Running from Python Source Code

We recommend running the tabulation service from a container, but if that is not feasible for a given use case, it may also be run directly from Python using the steps below.

Ensure that both Git and Python 3.10 or higher are installed.
Clone the PHDI repository with git clone https://github.com/CDCgov/phdi.
Navigate to phdi/containers/tabulation/ inside the cloned repository.
Make a fresh virtual environment with python -m venv .venv.
Activate the virtual environment with source .venv/bin/activate (macOS and Linux), .venv\Scripts\activate (Windows Command Prompt), or .venv\Scripts\Activate.ps1 (Windows PowerShell).
Install all of the Python dependencies for the tabulation service into your virtual environment with pip install -r requirements.txt.
Run the tabulation service on localhost:8080 with python -m uvicorn app.main:app --host 0.0.0.0 --port 8080.

Building the Docker Image

To build the Docker image for the tabulation service from source code instead of downloading it from the PHDI repository, follow these steps.

Ensure that both Git and Docker are installed.
Clone the PHDI repository with git clone https://github.com/CDCgov/phdi.
Navigate to phdi/containers/tabulation/ inside the cloned repository.
Run docker build -t tabulation . (the trailing dot sets the build context to the current directory).

The API

When viewing these docs from the /redoc endpoint on a running instance of the tabulation service or the PHDI website, detailed documentation on the API will be available below.

Health Check

Check service status. If an HTTP 200 status code is returned along with '{"status": "OK"}' then the service is available and running properly.

Responses

Response samples

Content type

application/json

{"status": "OK"}
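The check described above can be sketched in Python as follows. This is a minimal illustration, not part of the service: the function names are hypothetical, and the request path "/" is an assumption, since this page does not state where the health check is served. Confirm the path against the /redoc page of a running instance.

```python
import json
import urllib.request


def is_healthy(status_code: int, body: str) -> bool:
    """Return True only for HTTP 200 with a JSON body equal to {"status": "OK"}."""
    if status_code != 200:
        return False
    try:
        return json.loads(body) == {"status": "OK"}
    except json.JSONDecodeError:
        # A non-JSON body is treated as unhealthy.
        return False


def check_service(base_url: str = "http://localhost:8080", path: str = "/") -> bool:
    """Query a running instance and interpret the response.

    ASSUMPTION: the health check answers GET requests at `path`;
    "/" is a guess and may need to be changed.
    """
    try:
        with urllib.request.urlopen(base_url.rstrip("/") + path, timeout=5) as resp:
            return is_healthy(resp.status, resp.read().decode("utf-8"))
    except OSError:
        # Covers connection failures and HTTP error responses raised by urllib.
        return False
```

Calling check_service() against a local container started as described earlier should return True once the service is up, provided the assumed path is correct.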

Validate Schema Endpoint

Request Body schema: application/json

schema (required): object (Schema). A JSON-formatted PHDI schema.

Responses

Request samples

Payload

Content type

application/json

{"schema": {}}

Response samples

Content type

application/json

null
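As a rough sketch of how a client might prepare a schema validation request, the snippet below builds (but does not send) a POST request carrying the request body shown above. The helper name build_validate_request is hypothetical, and the path "/validate-schema" is an assumption, because this page does not list endpoint paths; check it against a running instance before relying on it.

```python
import json
import urllib.request


def build_validate_request(
    schema: dict,
    base_url: str = "http://localhost:8080",
    path: str = "/validate-schema",  # ASSUMPTION: path is not given on this page
) -> urllib.request.Request:
    """Build a POST request whose JSON body wraps the schema under the "schema" key."""
    body = json.dumps({"schema": schema}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen(req). Keeping construction separate from sending makes it easy to inspect the payload first.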

Tabulate Endpoint

This endpoint extracts and tabulates data from a FHIR server according to a user-defined schema, then persists the result in the output format the user chooses.

Request Body schema: application/json

schema (required): object (Schema). A JSON-formatted PHDI schema.
output_type (required): string (Output Type). Enum: "parquet", "csv", "sql". Method for persisting data after extraction from the FHIR server and tabulation.
fhir_url: string (Fhir Url). The URL of the FHIR server from which data should be extracted; it should end with '/fhir'. If not provided here, it must be set as an environment variable.
cred_manager: string (Cred Manager). Enum: "azure", "gcp". Choose a PHDI credential manager to use for authentication with the FHIR server. May be set here or as an environment variable. If not provided in either place, unauthenticated FHIR server requests will be attempted.
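The field rules above can be captured in a small client-side helper. The sketch below is illustrative only: build_tabulate_payload and the two constant sets are hypothetical names, and checking the '/fhir' suffix before sending is a design choice made here, not documented service behavior. Optional fields are omitted when not supplied, so the service can fall back to its environment variables.

```python
from typing import Optional

# Allowed values copied from the enum lists in the request body description.
OUTPUT_TYPES = {"parquet", "csv", "sql"}
CRED_MANAGERS = {"azure", "gcp"}


def build_tabulate_payload(
    schema: dict,
    output_type: str,
    fhir_url: Optional[str] = None,
    cred_manager: Optional[str] = None,
) -> dict:
    """Assemble a request body for the tabulate endpoint, rejecting invalid values early."""
    if output_type not in OUTPUT_TYPES:
        raise ValueError(f"output_type must be one of {sorted(OUTPUT_TYPES)}")
    payload = {"schema": schema, "output_type": output_type}
    if fhir_url is not None:
        # ASSUMPTION: enforce the documented "should end with '/fhir'" hint on the client.
        if not fhir_url.endswith("/fhir"):
            raise ValueError("fhir_url should end with '/fhir'")
        payload["fhir_url"] = fhir_url
    if cred_manager is not None:
        if cred_manager not in CRED_MANAGERS:
            raise ValueError(f"cred_manager must be one of {sorted(CRED_MANAGERS)}")
        payload["cred_manager"] = cred_manager
    return payload
```

For example, build_tabulate_payload({}, "csv") yields a body with only the two required fields, leaving fhir_url and cred_manager to be resolved from the service's environment.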

Responses

Request samples

Payload

Content type

application/json

{
  "schema": {},
  "output_type": "parquet",
  "fhir_url": "string",
  "cred_manager": "azure"
}

Response samples

Content type

application/json

null