PHDI Message Parser (1.2.11)


Getting Started with the DIBBs Message Parser

Introduction

The PHDI message parser offers a REST API for extracting desired fields from a given message. The service natively supports extracting values from FHIR bundles, and it can also parse HL7v2 (eLR, VXU, ADT, etc.) and CDA (eCR) messages by first using the DIBBs FHIR converter to convert them to FHIR. Fields are extracted using a "parsing schema", which is simply a mapping in key:value format between desired field names (keys) and the FHIRPaths within the bundle that locate their values. In addition, the data type of each value (string, integer, float, boolean, date, timestamp) and whether the value can be null (true, false) must be specified. A simple example of a schema for extracting a patient's first and last name from messages is shown below.

{
  "first_name": {
    "fhir_path": "Bundle.entry.resource.where(resourceType = 'Patient').name.first().given.first()",
    "data_type": "string",
    "nullable": true
  },
  "last_name": {
    "fhir_path": "Bundle.entry.resource.where(resourceType = 'Patient').name.first().family",
    "data_type": "string",
    "nullable": true
  }
}

Using this schema on a message about a patient named John Doe yields a result like this.

{
  "first_name": "John",
  "last_name": "Doe"
}
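Because every entry in a parsing schema carries the same three keys, a schema can also be assembled programmatically. The following is a minimal Python sketch that rebuilds the example schema above. The helper name schema_field and the constant PATIENT_NAME are hypothetical and are not part of the message parser.

```python
import json


def schema_field(fhir_path: str, data_type: str = "string", nullable: bool = True) -> dict:
    """Return one parsing-schema entry (hypothetical helper, not part of the service)."""
    return {"fhir_path": fhir_path, "data_type": data_type, "nullable": nullable}


# FHIRPath prefix shared by both fields in the example schema.
PATIENT_NAME = "Bundle.entry.resource.where(resourceType = 'Patient').name.first()"

parsing_schema = {
    "first_name": schema_field(PATIENT_NAME + ".given.first()"),
    "last_name": schema_field(PATIENT_NAME + ".family"),
}

if __name__ == "__main__":
    # Print the schema as JSON, ready to be sent to the service.
    print(json.dumps(parsing_schema, indent=2))
```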

Nested Data

Sometimes healthcare messages can be large and complex. A single message might contain several lab results that all must be extracted. We could do this by mapping each lab to its own column: "lab_result_1", "lab_result_2", "lab_result_3", and so on. However, this is cumbersome and often a poor solution if the possible number of labs is unknown or very large. To address this, the message parser can return multiple values found in equivalent locations in a FHIR bundle as an array. To do this, add the "secondary_schema" key to the field of a parsing schema that should contain multiple values. The schema below demonstrates extracting a patient's first name and last name, as well as all of their labs.

{
  "first_name": {
    "fhir_path": "Bundle.entry.resource.where(resourceType = 'Patient').name.first().given.first()",
    "data_type": "string",
    "nullable": true
  },
  "last_name": {
    "fhir_path": "Bundle.entry.resource.where(resourceType = 'Patient').name.first().family",
    "data_type": "string",
    "nullable": true
  },
  "labs": {
        "fhir_path": "Bundle.entry.resource.where(resourceType='Observation').where(category.coding.code='laboratory')",
        "data_type": "array",
        "nullable": true,
        "secondary_schema": {
          "test_type": {
              "fhir_path": "Observation.code.coding.display",
              "data_type": "string",
              "nullable": true
          },
          "test_type_code": {
              "fhir_path": "Observation.code.coding.code",
              "data_type": "string",
              "nullable": true
          },
          "test_result": {
              "fhir_path": "Observation.valueString",
              "data_type": "string",
              "nullable": true
          },
          "specimen_collection_date": {
              "fhir_path": "Observation.extension.where(url='http://hl7.org/fhir/R4/specimen.html').extension.where(url='specimen collection time').valueDateTime",
              "data_type": "datetime",
              "nullable": true
          }
        }
    }
}

If this parsing schema is used on a message about a patient named Jane Doe with two labs, the service would return a result like this.

{
  "first_name": "Jane",
  "last_name": "Doe",
  "labs": [
    {
      "test_type": "Campylobacter, NAAT",
      "test_type_code": "82196-7",
      "test_result": "Not Detected",
      "specimen_collection_date": "2023-01-31T18:52:00Z"
    },
    {
      "test_type": "C. Diff Toxin A/B, NAAT",
      "test_type_code": "82197-5",
      "test_result": "Not Detected",
      "specimen_collection_date": "2023-01-31T18:52:00Z"
    }
  ]
}
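Nested schemas are easy to get subtly wrong, so it can help to check their shape before sending them to the service. The sketch below walks a schema recursively, including any "secondary_schema", and reports entries that lack one of the three keys described in the introduction. The function name find_schema_problems is hypothetical, and the checks reflect only what this document states; the service may apply additional validation of its own.

```python
REQUIRED_KEYS = {"fhir_path", "data_type", "nullable"}


def find_schema_problems(schema: dict, prefix: str = "") -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for name, spec in schema.items():
        label = prefix + name
        if not isinstance(spec, dict):
            problems.append(f"{label}: expected an object")
            continue
        missing = REQUIRED_KEYS - set(spec)
        if missing:
            problems.append(f"{label}: missing {sorted(missing)}")
        if "nullable" in spec and not isinstance(spec["nullable"], bool):
            problems.append(f"{label}: 'nullable' must be true or false")
        nested = spec.get("secondary_schema")
        if isinstance(nested, dict):
            # Recurse so that fields inside a secondary schema are checked too.
            problems.extend(find_schema_problems(nested, label + "."))
        elif nested is not None:
            problems.append(f"{label}: 'secondary_schema' must be an object")
    return problems


if __name__ == "__main__":
    labs_schema = {
        "labs": {
            "fhir_path": "Bundle.entry.resource.where(resourceType='Observation')",
            "data_type": "array",
            "nullable": True,
            "secondary_schema": {
                # "nullable" is deliberately omitted to trigger a report.
                "test_result": {"fhir_path": "Observation.valueString", "data_type": "string"},
            },
        }
    }
    for problem in find_schema_problems(labs_schema):
        print(problem)
```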

Running the Message Parser

The message parser can be run using Docker (or any other OCI container runtime, e.g., Podman), or directly from the Python source code.

To run the message parser with Docker, follow these steps.

  1. Confirm that you have Docker installed by running docker -v. If you do not see a response similar to what is shown below, follow these instructions to install Docker.
    ❯ docker -v
    Docker version 20.10.21, build baeda1f
    
  2. Download a copy of the Docker image from the PHDI repository by running docker pull ghcr.io/cdcgov/phdi/message-parser:latest.
  3. Run the service with docker run -p 8080:8080 ghcr.io/cdcgov/phdi/message-parser:latest.

Congratulations, the message parser should now be running on localhost:8080!

Running from Python Source Code

We recommend running the message parser from a container, but if that is not feasible for a given use case, it may also be run directly from Python using the steps below.

  1. Ensure that both Git and Python 3.10 or higher are installed.
  2. Clone the PHDI repository with git clone https://github.com/CDCgov/phdi.
  3. Navigate to /phdi/containers/message-parser/.
  4. Make a fresh virtual environment with python -m venv .venv.
  5. Activate the virtual environment with source .venv/bin/activate (macOS and Linux), .venv\Scripts\activate (Windows Command Prompt), or .venv\Scripts\Activate.ps1 (Windows PowerShell).
  6. Install all of the Python dependencies for the message parser into your virtual environment with pip install -r requirements.txt.
  7. Run the message parser on localhost:8080 with python -m uvicorn app.main:app --host 0.0.0.0 --port 8080.

Building the Docker Image

To build the Docker image for the message parser from source instead of downloading it from the PHDI repository, follow these steps.

  1. Ensure that both Git and Docker are installed.
  2. Clone the PHDI repository with git clone https://github.com/CDCgov/phdi.
  3. Navigate to /phdi/containers/message-parser/.
  4. Run docker build -t message-parser . from that directory (the trailing . sets the build context to the current directory).

The API

When viewing these docs from the /redoc endpoint on a running instance of the message parser or the PHDI website, detailed documentation on the API will be available below.

Health Check

Check service status. If an HTTP 200 status code is returned along with '{"status": "OK"}', then the message parser is available and running properly.

Responses

Response samples

Content type
application/json
null
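The health check lends itself to a small readiness probe. The following Python sketch uses only the standard library and treats the service as healthy exactly when the response matches the description above. The function names are hypothetical, and the root path "/" is an assumption, since this page does not state the path of the health check; the base URL is the localhost:8080 address used in the steps above.

```python
import json
import urllib.request


def is_healthy(status_code: int, body: bytes) -> bool:
    """True when the response matches the healthy case: HTTP 200 and {"status": "OK"}."""
    if status_code != 200:
        return False
    try:
        return json.loads(body) == {"status": "OK"}
    except ValueError:  # body was not valid JSON
        return False


def check_service(base_url: str = "http://localhost:8080", timeout: float = 5.0) -> bool:
    """Query the health check; the path "/" is an assumption, not taken from this page."""
    try:
        with urllib.request.urlopen(base_url + "/", timeout=timeout) as response:
            return is_healthy(response.status, response.read())
    except OSError:  # connection failures, timeouts, and HTTP error statuses
        return False


if __name__ == "__main__":
    print("healthy" if check_service() else "not reachable or not healthy")
```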

Parse Message Endpoint

Extract the desired values from a message. If the message is not already in FHIR format, convert it to FHIR first. You can either provide a parsing schema or the name of a previously loaded parsing schema.

Request Body schema: application/json
message_format
required
string (Message Format)
Enum: "fhir" "hl7v2" "ecr"

The format of the message.

message_type
string (Message Type)
Enum: "ecr" "elr" "vxu"

The type of message that values will be extracted from. Required when 'message_format' is not FHIR.

parsing_schema
object (Parsing Schema)
Default: {}

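A schema describing which fields to extract from the message. This must be a JSON object with key:value pairs that map each desired field name to an object giving its FHIRPath, data type, and nullability, as in the examples in the introduction.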

parsing_schema_name
string (Parsing Schema Name)
Default: ""

The name of a schema that was previously loaded in the service to use to extract fields from the message.

fhir_converter_url
string (Fhir Converter Url)
Default: ""

The URL of an instance of the PHDI FHIR converter. Required when the message is not already in FHIR format.

credential_manager
string (Credential Manager)
Enum: "azure" "gcp"

The type of credential manager to use for authentication with a FHIR converter when conversion to FHIR is required.

include_metadata
string (Include Metadata)
Enum: "true" "false"

Whether to include metadata in the response.

message
required
Message (string) or Message (object) (Message)

The message to be parsed.

Responses

Request samples

Content type
application/json
{
  "message_format": "fhir",
  "message_type": "ecr",
  "parsing_schema": { },
  "parsing_schema_name": "",
  "fhir_converter_url": "",
  "credential_manager": "azure",
  "include_metadata": "true",
  "message": "string"
}

Response samples

Content type
application/json
Example
{
  "message": "Parsing succeeded!",
  "parsed_values": { }
}
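The field descriptions above imply a few rules that are easy to enforce on the client before a request is sent. The Python sketch below assembles a request body and applies them. The function names build_parse_request and post_json are hypothetical. Treating parsing_schema and parsing_schema_name as mutually exclusive is this sketch's reading of "either ... or" and may be stricter than the service itself.

```python
import json
import urllib.request


def build_parse_request(message, message_format="fhir", message_type=None,
                        parsing_schema=None, parsing_schema_name="",
                        fhir_converter_url=""):
    """Assemble a request body, enforcing the rules stated in the field descriptions."""
    # Assumption: exactly one of the two schema options should be supplied.
    if bool(parsing_schema) == bool(parsing_schema_name):
        raise ValueError("provide exactly one of parsing_schema or parsing_schema_name")
    body = {"message_format": message_format, "message": message}
    if message_format != "fhir":
        # Non-FHIR messages must be converted first, so both fields are needed.
        if not message_type or not fhir_converter_url:
            raise ValueError(
                "message_type and fhir_converter_url are required for non-FHIR messages"
            )
        body["message_type"] = message_type
        body["fhir_converter_url"] = fhir_converter_url
    if parsing_schema:
        body["parsing_schema"] = parsing_schema
    else:
        body["parsing_schema_name"] = parsing_schema_name
    return body


def post_json(url, body):
    """POST a JSON body and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a running instance, a call might look like post_json("http://localhost:8080/parse_message", build_parse_request(bundle, parsing_schema=schema)), where bundle and schema are your own FHIR bundle and parsing schema. The /parse_message path is an assumption; confirm it against the OpenAPI specification of your instance.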

Fhir To Phdc Endpoint

Convert a FHIR bundle to a Public Health Document Container (PHDC).

Request Body schema: application/json
phdc_report_type
required
string (Phdc Report Type)
Enum: "case_report" "contact_record" "lab_report" "morbidity_report"

The type of PHDC document the user wants returned to them. The choice of report type should reflect the type of the incoming data and determines which PHDC schema is used when extracting.

message
required
object (Message)

The FHIR bundle to extract from.

Responses

Request samples

Content type
application/json
{
  "phdc_report_type": "case_report",
  "message": { }
}

Response samples

Content type
application/json
{
  "message": "FHIR extraction succeeded!",
  "parsed_values": { }
}
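Since phdc_report_type accepts only the four values listed above, a client can reject typos before making a request. The following is a minimal Python sketch; the class name PhdcReportType and the function name build_phdc_request are hypothetical, and the enum members simply mirror the values documented for this endpoint.

```python
from enum import Enum


class PhdcReportType(str, Enum):
    """The report types documented for this endpoint."""
    CASE_REPORT = "case_report"
    CONTACT_RECORD = "contact_record"
    LAB_REPORT = "lab_report"
    MORBIDITY_REPORT = "morbidity_report"


def build_phdc_request(bundle: dict, report_type: str) -> dict:
    """Build the request body; raises ValueError for an undocumented report type."""
    return {"phdc_report_type": PhdcReportType(report_type).value, "message": bundle}


if __name__ == "__main__":
    print(build_phdc_request({"resourceType": "Bundle", "entry": []}, "lab_report"))
```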

List Schemas

Get a list of all the parsing schemas currently available. Default schemas are ones that are packaged by default with this service. Custom schemas are any additional schemas that users have chosen to upload to this service (this feature is not yet implemented).

Responses

Response samples

Content type
application/json
{
  "default_schemas": [ ],
  "custom_schemas": [ ]
}

Get Schema

Get the schema specified by 'parsing_schema_name'.

path Parameters
parsing_schema_name
required
string (Parsing Schema Name)

Responses

Response samples

Content type
application/json
{
  "message": "Schema found!",
  "parsing_schema": { }
}

Upload Schema

Upload a new parsing schema to the service or update an existing schema.

path Parameters
parsing_schema_name
required
string (Parsing Schema Name)
Request Body schema: application/json
required
object (Parsing Schema)

A JSON formatted parsing schema to upload.

overwrite
boolean (Overwrite)
Default: false

When true, if a schema already exists for the provided name, it will be replaced. When false, no action will be taken and the response will indicate that a schema for the given name already exists. To proceed, submit a new request with a different schema name or set this field to true.

Responses

Request samples

Content type
application/json
{
  "parsing_schema": { },
  "overwrite": false
}

Response samples

Content type
application/json
{
  "message": "Schema uploaded successfully!"
}
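An upload request combines the parsing_schema_name path parameter with the JSON body shown above. The Python sketch below prepares such a request using only the standard library, without sending it. The function name build_upload_request is hypothetical, and both the /schemas/{parsing_schema_name} path and the PUT method are assumptions, since this page does not list them; check the OpenAPI specification of your instance before relying on either.

```python
import json
import urllib.parse
import urllib.request


def build_upload_request(base_url, parsing_schema_name, parsing_schema, overwrite=False):
    """Prepare an upload request (path and HTTP method are assumptions)."""
    # Percent-encode the name so it is safe to use as a single path segment.
    name = urllib.parse.quote(parsing_schema_name, safe="")
    body = {"parsing_schema": parsing_schema, "overwrite": overwrite}
    return urllib.request.Request(
        f"{base_url}/schemas/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    request = build_upload_request("http://localhost:8080", "my schema.json", {})
    print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen sends it. Leaving overwrite at its default of false means an existing schema with the same name is not replaced, as described above.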