The PHDI message parser offers a REST API for extracting desired fields from a given message. The service natively supports extracting values from FHIR bundles, and it can also parse HL7v2 (eLR, VXU, ADT, etc.) and CDA (eCR) messages by first using the DIBBs FHIR converter to convert them to FHIR. Fields are extracted using a "parsing schema", which is simply a mapping in key:value format between desired field names (keys) and the FHIRPaths within the bundle that locate the values. In addition, the data type of each value (string, integer, float, boolean, date, timestamp) and whether the value can be null (true, false) must be specified. A simple example of a schema for extracting a patient's first and last name from messages is shown below.
{
  "first_name": {
    "fhir_path": "Bundle.entry.resource.where(resourceType = 'Patient').name.first().given.first()",
    "data_type": "string",
    "nullable": true
  },
  "last_name": {
    "fhir_path": "Bundle.entry.resource.where(resourceType = 'Patient').name.first().family",
    "data_type": "string",
    "nullable": true
  }
}
Using this schema on a message about a patient named John Doe yields a result like this.
{
  "first_name": "John",
  "last_name": "Doe"
}
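Before sending a parsing schema to the service, it can help to check its shape locally. The following is a minimal sketch, not the service's own validation logic: it only checks that each field carries the three keys described above ("fhir_path", "data_type", "nullable") with plausible Python types. The function name find_schema_problems is hypothetical.

```python
# Hypothetical local check; the service may apply stricter or different rules.
REQUIRED_KEYS = {"fhir_path": str, "data_type": str, "nullable": bool}


def find_schema_problems(schema: dict) -> list:
    """Return human-readable problems; an empty list means the shape looks right."""
    problems = []
    for field_name, spec in schema.items():
        if not isinstance(spec, dict):
            problems.append(f"{field_name}: expected an object")
            continue
        for key, expected_type in REQUIRED_KEYS.items():
            if key not in spec:
                problems.append(f"{field_name}: missing '{key}'")
            elif not isinstance(spec[key], expected_type):
                problems.append(
                    f"{field_name}: '{key}' should be {expected_type.__name__}"
                )
    return problems


name_schema = {
    "first_name": {
        "fhir_path": "Bundle.entry.resource.where(resourceType = 'Patient').name.first().given.first()",
        "data_type": "string",
        "nullable": True,
    },
    "last_name": {
        "fhir_path": "Bundle.entry.resource.where(resourceType = 'Patient').name.first().family",
        "data_type": "string",
        "nullable": True,
    },
}

print(find_schema_problems(name_schema))
```

A check like this catches missing keys early, but it does not evaluate the FHIRPath expressions themselves.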
Sometimes healthcare messages can be large and complex. A single message might contain several lab results that all must be extracted. We could do this by mapping each lab to its own column, "lab_result_1", "lab_result_2", "lab_result_3", and so on. However, this is cumbersome and often a poor solution if the possible number of labs is unknown or very large. To address this, the message parser can return multiple values found in equivalent locations in a FHIR bundle as an array. To do this, add the "secondary_schema" key to the field of a parsing schema that should contain multiple values. The schema below demonstrates extracting a patient's first name and last name, as well as all of their labs.
{
  "first_name": {
    "fhir_path": "Bundle.entry.resource.where(resourceType = 'Patient').name.first().given.first()",
    "data_type": "string",
    "nullable": true
  },
  "last_name": {
    "fhir_path": "Bundle.entry.resource.where(resourceType = 'Patient').name.first().family",
    "data_type": "string",
    "nullable": true
  },
  "labs": {
    "fhir_path": "Bundle.entry.resource.where(resourceType='Observation').where(category.coding.code='laboratory')",
    "data_type": "array",
    "nullable": true,
    "secondary_schema": {
      "test_type": {
        "fhir_path": "Observation.code.coding.display",
        "data_type": "string",
        "nullable": true
      },
      "test_type_code": {
        "fhir_path": "Observation.code.coding.code",
        "data_type": "string",
        "nullable": true
      },
      "test_result": {
        "fhir_path": "Observation.valueString",
        "data_type": "string",
        "nullable": true
      },
      "specimen_collection_date": {
        "fhir_path": "Observation.extension.where(url='http://hl7.org/fhir/R4/specimen.html').extension.where(url='specimen collection time').valueDateTime",
        "data_type": "datetime",
        "nullable": true
      }
    }
  }
}
If this parsing schema is used on a message about a patient named Jane Doe with two labs, the service would return a result like this.
{
  "first_name": "Jane",
  "last_name": "Doe",
  "labs": [
    {
      "test_type": "Campylobacter, NAAT",
      "test_type_code": "82196-7",
      "test_result": "Not Detected",
      "specimen_collection_date": "2023-01-31T18:52:00Z"
    },
    {
      "test_type": "C. Diff Toxin A/B, NAAT",
      "test_type_code": "82197-5",
      "test_result": "Not Detected",
      "specimen_collection_date": "2023-01-31T18:52:00Z"
    }
  ]
}
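A result with a nested array is often easier to load into a table once it is flattened to one row per lab. The sketch below is one possible client-side approach, assuming the result has the shape shown above; flatten_labs is a hypothetical helper and is not part of the message parser.

```python
def flatten_labs(parsed: dict, array_field: str = "labs") -> list:
    """Return one flat row per array element, repeating the top-level fields.

    Note: a result with no labs produces no rows, so the patient-level
    fields are dropped in that case.
    """
    parent = {key: value for key, value in parsed.items() if key != array_field}
    labs = parsed.get(array_field) or []
    return [{**parent, **lab} for lab in labs]


result = {
    "first_name": "Jane",
    "last_name": "Doe",
    "labs": [
        {
            "test_type": "Campylobacter, NAAT",
            "test_type_code": "82196-7",
            "test_result": "Not Detected",
            "specimen_collection_date": "2023-01-31T18:52:00Z",
        },
        {
            "test_type": "C. Diff Toxin A/B, NAAT",
            "test_type_code": "82197-5",
            "test_result": "Not Detected",
            "specimen_collection_date": "2023-01-31T18:52:00Z",
        },
    ],
}

rows = flatten_labs(result)
for row in rows:
    print(row["last_name"], row["test_type_code"], row["test_result"])
```

Each row repeats the patient fields, which trades some redundancy for a shape that loads directly into a flat table.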
The message parser can be run using Docker (or any other OCI container runtime, e.g., Podman), or directly from the Python source code.
To run the message parser with Docker, follow these steps.
1. Confirm that Docker is installed by running docker -v. If you do not see a response similar to what is shown below, follow the Docker installation instructions.
❯ docker -v
Docker version 20.10.21, build baeda1f
2. Download the Docker image from the PHDI repository: docker pull ghcr.io/cdcgov/phdi/message-parser:latest
3. Run the service: docker run -p 8080:8080 ghcr.io/cdcgov/phdi/message-parser:latest
Congratulations, the message parser should now be running on localhost:8080!
We recommend running the message parser from a container, but if that is not feasible for a given use case, it may also be run directly from Python using the steps below.
1. Clone the PHDI repository: git clone https://github.com/CDCgov/phdi
2. Navigate to ./phdi/containers/message-parser/
3. Create a fresh virtual environment: python -m venv .venv
4. Activate the virtual environment: source .venv/bin/activate (macOS and Linux), .venv\Scripts\activate (Windows Command Prompt), or .venv\Scripts\Activate.ps1 (Windows PowerShell)
5. Install the Python dependencies into your virtual environment: pip install -r requirements.txt
6. Run the message parser on localhost:8080: python -m uvicorn app.main:app --host 0.0.0.0 --port 8080
To build the Docker image for the message parser from source instead of downloading it from the PHDI repository, follow these steps.
1. Clone the PHDI repository: git clone https://github.com/CDCgov/phdi
2. Navigate to ./phdi/containers/message-parser/
3. Build the image: docker build -t message-parser .
When viewing these docs from the /redoc endpoint on a running instance of the message parser or the PHDI website, detailed documentation on the API will be available below.
Extract the desired values from a message. If the message is not already in FHIR format, convert it to FHIR first. You can either provide a parsing schema or the name of a previously loaded parsing schema.
message_format (required) | string, one of "fhir", "hl7v2", "ecr" | The format of the message.
message_type | string, one of "ecr", "elr", "vxu" | The type of message that values will be extracted from. Required when message_format is not "fhir".
parsing_schema | object, default {} | A schema describing which fields to extract from the message. This must be a JSON object with key:value pairs of the form
parsing_schema_name | string, default "" | The name of a schema that was previously loaded in the service to use to extract fields from the message.
fhir_converter_url | string, default "" | The URL of an instance of the PHDI FHIR converter. Required when the message is not already in FHIR format.
credential_manager | string, one of "azure", "gcp" | The type of credential manager to use for authentication with a FHIR converter when conversion to FHIR is required.
include_metadata | string, one of "true", "false" | Whether to include metadata in the response.
message (required) | string or object | The message to be parsed.
Request sample:
{
  "message_format": "fhir",
  "message_type": "ecr",
  "parsing_schema": {},
  "parsing_schema_name": "",
  "fhir_converter_url": "",
  "credential_manager": "azure",
  "include_metadata": "true",
  "message": "string"
}
Response sample:
{
  "message": "Parsing succeeded!",
  "parsed_values": {
    "last_name": "DOE",
    "first_name": "JANE"
  }
}
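The request fields above carry a few conditional rules (for example, message_type and fhir_converter_url are needed when the message is not FHIR). The sketch below shows one way a client might assemble the request body and enforce those rules before sending. It rests on several assumptions: the helper names are hypothetical, the endpoint path /parse_message does not appear in this text, and treating parsing_schema and parsing_schema_name as mutually exclusive is a reading of "either ... or" rather than confirmed service behavior.

```python
import json
import urllib.request


def build_parse_payload(message, message_format="fhir", message_type=None,
                        parsing_schema=None, parsing_schema_name=None,
                        fhir_converter_url=None) -> dict:
    """Assemble a request body using the field names documented above."""
    # Assumption: exactly one of the two schema options is supplied.
    if (parsing_schema is None) == (parsing_schema_name is None):
        raise ValueError("provide exactly one of parsing_schema or parsing_schema_name")
    payload = {"message_format": message_format, "message": message}
    if message_format != "fhir":
        # Documented as required when the message is not already FHIR.
        if not message_type or not fhir_converter_url:
            raise ValueError("non-FHIR messages need message_type and fhir_converter_url")
        payload["message_type"] = message_type
        payload["fhir_converter_url"] = fhir_converter_url
    if parsing_schema is not None:
        payload["parsing_schema"] = parsing_schema
    else:
        payload["parsing_schema_name"] = parsing_schema_name
    return payload


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON response (needs a running service)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_parse_payload(
    message={"resourceType": "Bundle", "entry": []},
    parsing_schema_name="ecr.json",
)
print(json.dumps(payload, indent=2))
# With a running instance (path is an assumption):
# post_json("http://localhost:8080/parse_message", payload)
```

Validating on the client side keeps obviously incomplete requests from reaching the service, though the service remains the authority on what it accepts.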
Convert a FHIR bundle to a Public Health Document Container (PHDC).
phdc_report_type (required) | string, one of "case_report", "contact_record", "lab_report", "morbidity_report" | The type of PHDC document the user wants returned to them. The choice of report type should reflect the type of the incoming data and determines which PHDC schema is used when extracting.
message (required) | object | The FHIR bundle to extract from.
Request sample:
{
  "phdc_report_type": "case_report",
  "message": {}
}
Response sample:
{
  "message": "FHIR extraction succeeded!",
  "parsed_values": {
    "first_name": "JOHN",
    "last_name": "DOE",
    "labs": [
      {
        "test_type": "Blood culture",
        "test_result_code_display": "Staphylococcus aureus",
        "ordering_provider": "Western Pennsylvania Medical General",
        "requesting_organization_contact_person": "Dr. Totally Real Doctor, M.D."
      }
    ]
  }
}
Get a list of all the parsing schemas currently available. Default schemas are those packaged with this service. Custom schemas are any additional schemas that users have chosen to upload to this service (this feature is not yet implemented).
Response sample:
{
  "default_schemas": [
    "ecr.json",
    "test_schema.json"
  ],
  "custom_schemas": []
}
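A client that only needs to know whether a schema name is usable can merge the two lists from this response. This is a small sketch assuming the response shape shown above; all_schema_names is a hypothetical helper.

```python
def all_schema_names(listing: dict) -> list:
    """Merge default and custom schema names into one sorted, de-duplicated list."""
    names = set(listing.get("default_schemas", []))
    names |= set(listing.get("custom_schemas", []))
    return sorted(names)


listing = {
    "default_schemas": ["ecr.json", "test_schema.json"],
    "custom_schemas": [],
}

available = all_schema_names(listing)
print(available)
print("ecr.json" in available)
```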
Get the schema specified by 'parsing_schema_name'.
parsing_schema_name (required) | string
Response sample:
{
  "message": "Schema found!",
  "parsing_schema": {
    "first_name": "Bundle.entry.resource.where(resourceType = 'Patient').name.first().given.first()",
    "last_name": "Bundle.entry.resource.where(resourceType = 'Patient').name.first().family"
  }
}
Upload a new parsing schema to the service or update an existing schema.
parsing_schema_name (required) | string
parsing_schema (required) | object | A JSON formatted parsing schema to upload.
overwrite | boolean, default false | When
Request sample:
{
  "parsing_schema": {
    "property1": {
      "fhir_path": "string",
      "data_type": "string",
      "nullable": true,
      "secondary_schema": {
        "property1": {
          "fhir_path": "string",
          "data_type": "string",
          "nullable": true,
          "secondary_schema": {
            "property1": {
              "fhir_path": "string",
              "data_type": "string",
              "nullable": true
            },
            "property2": {
              "fhir_path": "string",
              "data_type": "string",
              "nullable": true
            }
          }
        },
        "property2": {
          "fhir_path": "string",
          "data_type": "string",
          "nullable": true,
          "secondary_schema": {
            "property1": {
              "fhir_path": "string",
              "data_type": "string",
              "nullable": true
            },
            "property2": {
              "fhir_path": "string",
              "data_type": "string",
              "nullable": true
            }
          }
        }
      }
    },
    "property2": {
      "fhir_path": "string",
      "data_type": "string",
      "nullable": true,
      "secondary_schema": {
        "property1": {
          "fhir_path": "string",
          "data_type": "string",
          "nullable": true,
          "secondary_schema": {
            "property1": {
              "fhir_path": "string",
              "data_type": "string",
              "nullable": true
            },
            "property2": {
              "fhir_path": "string",
              "data_type": "string",
              "nullable": true
            }
          }
        },
        "property2": {
          "fhir_path": "string",
          "data_type": "string",
          "nullable": true,
          "secondary_schema": {
            "property1": {
              "fhir_path": "string",
              "data_type": "string",
              "nullable": true
            },
            "property2": {
              "fhir_path": "string",
              "data_type": "string",
              "nullable": true
            }
          }
        }
      }
    }
  },
  "overwrite": false
}
Response sample:
{
  "message": "Schema uploaded successfully!"
}
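An upload request wraps the schema in a body with the "parsing_schema" and "overwrite" fields shown above. The sketch below builds such a request with the Python standard library without sending it. The HTTP method (PUT) and the path /schemas/{parsing_schema_name} are assumptions, since neither appears in this text, and build_upload_request is a hypothetical helper.

```python
import json
import urllib.request
from urllib.parse import quote


def build_upload_request(base_url: str, schema_name: str, parsing_schema: dict,
                         overwrite: bool = False) -> urllib.request.Request:
    """Build (but do not send) a schema upload request."""
    # Assumed route; percent-encode the name so it is safe as a path segment.
    url = f"{base_url.rstrip('/')}/schemas/{quote(schema_name, safe='')}"
    body = json.dumps({"parsing_schema": parsing_schema, "overwrite": overwrite})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",  # assumption, see the note above
    )


request = build_upload_request(
    "http://localhost:8080",
    "names.json",
    {
        "last_name": {
            "fhir_path": "Bundle.entry.resource.where(resourceType = 'Patient').name.first().family",
            "data_type": "string",
            "nullable": True,
        }
    },
)
print(request.get_method(), request.full_url)
# To send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Leaving overwrite at its default of false is the cautious choice when it is unclear whether a schema with the same name already exists.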