- A Tool for Automating the Curation of Medical Concepts derived from Coding Lists (ACMC)
- Jakub J. Dylag 1, Roberta Chiovoloni 3, Ashley Akbari 3, Simon D. Fraser 2, Michael J. Boniface 1
- Citation
- Introduction
- Requirements
- Installation
- Getting Started
- Install Clinically Assured NHS TRUD Code Mappings
- Install OMOP Vocabularies
- Example
- Usage
- General Syntax
- Global Options
- Commands
- TRUD Command
- OMOP Command
- PHEN Command
- License
- Support
- Contributing
- Acknowledgements
- License


A Tool for Automating the Curation of Medical Concepts derived from Coding Lists (ACMC)
Jakub J. Dylag 1, Roberta Chiovoloni 3, Ashley Akbari 3, Simon D. Fraser 2, Michael J. Boniface 1
1 Digital Health and Biomedical Engineering, School of Electronics and Computer Science, Faculty of Engineering and Physical Sciences, University of Southampton
2 School of Primary Care Population Sciences and Medical Education, University of Southampton
3 Population Data Science, Swansea University Medical School, Faculty of Medicine, Health & Life Science, Swansea University
Correspondence to: Jakub J. Dylag, Digital Health and Biomedical Engineering, School of Electronics and Computer Science, Faculty of Engineering and Physical Sciences, University of Southampton, J.J.Dylag@soton.ac.uk
Citation
Dylag JJ, Chiovoloni R, Akbari A, Fraser SD, Boniface MJ. A Tool for Automating the Curation of Medical Concepts derived from Coding Lists. GitLab [Internet]. May 2024. Available from: https://git.soton.ac.uk/meldb/concepts-processing
Introduction
This tool automates the verification, translation and organisation of medical coding lists defining phenotypes for inclusion criteria in cohort analysis. By processing externally sourced clinical inclusion criteria into actionable code lists, this tool ensures consistent and efficient curation of cohort definitions. These code lists can be subsequently used by data providers to construct study cohorts.
Requirements
- Python 3.9 or higher
Installation
To install the acmc package, run:
pip install acmc
Once installed, you'll be ready to use the acmc tool along with the associated vocabularies.
Getting Started
Install Clinically Assured NHS TRUD Code Mappings
- Register at TRUD
  Register your account with TRUD at NHS TRUD.
- Subscribe and Accept Licenses
  Subscribe to the following data files:
  After subscribing, you'll receive an API key once your request is approved (usually within 24 hours).
- Get TRUD API Key
  Copy your API key from NHS TRUD Account Management and store it securely.
- Add the TRUD API key as an environment variable
  To set the environment variable, run:
  On macOS/Linux (applies to the current session only):
  export ACMC_TRUD_API_KEY="your_api_key_here"
  On Windows (Command Prompt or PowerShell; setx stores the variable persistently and it takes effect in newly opened terminals, not the current one):
  setx ACMC_TRUD_API_KEY "your_api_key_here"
- Download and Install TRUD Resources
  Run the following acmc command to download and process the TRUD resources:
  acmc trud install
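Before running acmc trud install, it can help to confirm that the key is actually visible to new processes. The following is a minimal sketch, not part of acmc: the helper name trud_key_status is hypothetical, and only the variable name ACMC_TRUD_API_KEY comes from the steps above.

```python
import os

# Variable name taken from the setup steps above.
TRUD_KEY_VAR = "ACMC_TRUD_API_KEY"


def trud_key_status(env=None):
    """Return a short message saying whether the TRUD API key looks usable.

    Hypothetical helper for illustration; acmc itself may check differently.
    """
    env = os.environ if env is None else env
    key = env.get(TRUD_KEY_VAR, "").strip()
    if not key:
        return f"{TRUD_KEY_VAR} is not set; set it before running 'acmc trud install'"
    if key == "your_api_key_here":
        return f"{TRUD_KEY_VAR} still holds the placeholder value"
    # Report only the length so the key itself is never printed.
    return f"{TRUD_KEY_VAR} is set ({len(key)} characters)"


if __name__ == "__main__":
    print(trud_key_status())
```

Run the script in the same terminal you intend to use for acmc; on Windows, open a new terminal after setx first.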
Install OMOP Vocabularies
- Register with OHDSI Athena
- Download vocabularies from OHDSI Athena
  - Required vocabularies include:
    - SNOMED
    - ICD9CM
    - Readv2
    - ATC
    - OPCS4
    - HES Specialty
    - ICD10CM
    - dm+d
    - UK Biobank
    - NHS Ethnic Category
    - NHS Place of Service
  - You will be notified by email with a vocabularies version number and a link to download a zip file of OMOP database tables in CSV format. The subject will be "OHDSI Standardized Vocabularies. Your download link" from pallas@ohdsi.org. The email reads as follows:
Content of your package
Vocabularies release version: v20240830
acmc-omop Vocabularies:
SNOMED - Systematic Nomenclature of Medicine - Clinical Terms (IHTSDO)
ICD9CM - International Classification of Diseases, Ninth Revision, Clinical Modification, Volume 1 and 2 (NCHS)
Read - NHS UK Read Codes Version 2 (HSCIC)
ATC - WHO Anatomic Therapeutic Chemical Classification
OPCS4 - OPCS Classification of Interventions and Procedures version 4 (NHS)
HES Specialty - Hospital Episode Statistics Specialty (NHS)
ICD10CM - International Classification of Diseases, Tenth Revision, Clinical Modification (NCHS)
dm+d - Dictionary of Medicines and Devices (NHS)
UK Biobank - UK Biobank (UK Biobank)
NHS Ethnic Category - NHS Ethnic Category
NHS Place of Service - NHS Admission Source and Discharge Destination
Installation of the OHDSI Standardized Vocabularies
Please execute the following process:
Click on this link to download the zip file. Typical file sizes, depending on the number of vocabularies selected, are between 30 and 1500 MB.
Unpack.
Reconstitute CPT-4. See below for details.
If needed, create the tables.
Load the unpacked files into the tables.
- Download the OMOP zip file onto your computer and note the path to the file.
- Install OMOP vocabularies
  Run the following acmc command to create a local OMOP database from the OMOP zip file with a specific version:
  acmc omop install -f <path to downloaded OMOP zip file> -v <release version from email>
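Since the Athena download is described as a zip file of OMOP tables in CSV format, a quick look inside the archive can catch an incomplete or wrong download before installation. This is a minimal sketch using only the Python standard library; the function name list_csv_tables is hypothetical and the check is not something acmc is known to perform.

```python
import zipfile
from pathlib import Path


def list_csv_tables(zip_path):
    """Return the sorted names of the CSV files found inside the archive.

    Hypothetical helper for illustration; it only inspects the zip file
    and does not load anything into a database.
    """
    path = Path(zip_path)
    if not path.is_file():
        raise FileNotFoundError(f"OMOP zip file not found: {path}")
    with zipfile.ZipFile(path) as archive:
        return sorted(
            name for name in archive.namelist() if name.lower().endswith(".csv")
        )
```

Call it with the path noted in the previous step, for example list_csv_tables("<path to downloaded OMOP zip file>"), and confirm the list is not empty before running acmc omop install.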
Example
Follow these steps to initialize and manage a phenotype using acmc. In this example, we use a source concept code list for the concept set Abdominal Pain created from ClinicalCodes.org. The source concept codes are in read2 format. We generate a versioned phenotype for read2 and then translate it to snomed as another version.
- Initialize a phenotype in the workspace
  Use the following acmc command to initialize the phenotype in a local Git repository:
  acmc phen init
- Copy example medical code lists to the phenotype codes directory
  From the command prompt, copy the medical code lists in /examples/codes to the phenotype codes directory:
  cp -r ./examples/codes/* ./workspace/phen/codes
  - Download res176-abdominal-pain.csv
  - Alternatively, place your code lists in ./workspace/phen/codes.
- Copy the example phenotype configuration file to the phenotype directory
  From the command prompt, copy the example phenotype configuration file /examples/config.json to the phenotype directory:
  cp -r ./examples/config.json ./workspace/phen
  - Download config.json
  - Alternatively, place your own config.json file in ./workspace/phen.
- Validate the phenotype configuration
  Use the following acmc command to validate the phenotype configuration and ensure it is correct:
  acmc phen validate
  Expected output: once the command is executed, you should see output similar to this:
  [INFO] - Validating phenotype: <path>/concepts-processing/workspace/phen
  [INFO] - Phenotype validated successfully
- Publish phenotype at an initial version
  Use the following acmc command to publish the phenotype at an initial version:
  acmc phen publish
- Generate phenotype in SNOMED code format
  Generate the phenotype in snomed format:
  acmc phen map -t snomed
- Get a copy of the previous version from the repo
  Use the following acmc command to retrieve a copy of the previous version (v1.0.3) from the repository:
  acmc phen copy -v v1.0.3
- Compare the previous version v1.0.3 with the latest version
  Use the following acmc command to compare the previous version (v1.0.3) with the latest version in the repository:
  acmc phen diff -old ./workspace/v1.0.3/
- Publish the phenotype at the next version
  Use the following acmc command to publish the phenotype at the next version:
  acmc phen publish
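The steps above can be scripted once the phenotype has been initialized and the code lists and config.json are in place. The following is a rough sketch, not part of acmc: the names WORKFLOW and run_workflow are hypothetical, the command strings are copied from the walkthrough, and the default is a dry run that only prints each command.

```python
import shlex
import subprocess

# Commands copied from the walkthrough, starting after the init and copy steps.
WORKFLOW = [
    "acmc phen validate",
    "acmc phen publish",
    "acmc phen map -t snomed",
    "acmc phen copy -v v1.0.3",
    "acmc phen diff -old ./workspace/v1.0.3/",
    "acmc phen publish",
]


def run_workflow(commands=WORKFLOW, dry_run=True):
    """Print each command and, when dry_run is False, execute it as well.

    Returns the argument lists in the order they were handled.
    """
    handled = []
    for command in commands:
        argv = shlex.split(command)
        print("$", command)
        if not dry_run:
            # check=True raises CalledProcessError at the first failing step.
            subprocess.run(argv, check=True)
        handled.append(argv)
    return handled
```

Calling run_workflow() prints the six commands without running them; run_workflow(dry_run=False) requires acmc to be installed and on the PATH, and the version v1.0.3 must exist in your repository.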
Usage
The acmc command-line tool provides various commands to interact with TRUD, OMOP, and Phenotype data. Below are the usage details for each command.
General Syntax
acmc [OPTIONS] COMMAND [SUBCOMMAND] [ARGUMENTS]
Where:
- [OPTIONS] are global options that apply to all commands (e.g., --debug, --version).
- [COMMAND] is the top-level command (e.g., trud, omop, phen).
- [SUBCOMMAND] refers to the specific operation within the command (e.g., install, validate).
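To make the OPTIONS / COMMAND / SUBCOMMAND layering concrete, here is an illustrative sketch of how such a syntax can be modelled with nested subparsers in Python's argparse. It is not acmc's actual implementation: build_parser and LAYOUT are hypothetical names, the command and subcommand names are those listed in this README, and per-subcommand arguments are omitted.

```python
import argparse

# Top-level commands and their subcommands, as listed in this README.
LAYOUT = {
    "trud": ["install"],
    "omop": ["install", "clear", "delete"],
    "phen": ["init", "validate", "map", "publish", "copy", "diff"],
}


def build_parser():
    """Build a parser that accepts: acmc [OPTIONS] COMMAND SUBCOMMAND."""
    parser = argparse.ArgumentParser(prog="acmc")
    # Global option, given before the command.
    parser.add_argument("--debug", action="store_true", help="verbose logging")
    commands = parser.add_subparsers(dest="command", required=True)
    for name, subcommands in LAYOUT.items():
        command = commands.add_parser(name)
        operations = command.add_subparsers(dest="subcommand", required=True)
        for operation in subcommands:
            operations.add_parser(operation)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--debug", "phen", "validate"])
    print(args.debug, args.command, args.subcommand)
```

In this layout a global option such as --debug must appear before the command name, which matches the order shown in the general syntax line.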
Global Options
- --version: Display the acmc tool version number.
- --debug: Enable debug mode for more verbose logging.
Commands
TRUD Command
The trud command is used for installing NHS TRUD vocabularies.
- Install TRUD
  Install clinically assured TRUD medical code mappings:
  acmc trud install
OMOP Command
The omop command is used for installing OMOP vocabularies.
- Install OMOP
  Install vocabularies in a local OMOP database:
  acmc omop install -d <OMOP_DIRECTORY_PATH> -v <OMOP_VERSION>
  - -d, --omop-dir: (Optional) Directory path to extracted OMOP downloads (the default is ./build/omop).
  - -v, --version: OMOP vocabularies release version.
- Clear OMOP
  Clear data from the local OMOP database:
  acmc omop clear
- Delete OMOP
  Delete the local OMOP database:
  acmc omop delete
PHEN Command
The phen command is used for phenotype-related operations.
- Initialize Phenotype
  Initialize a phenotype directory locally or from a remote git repository:
  acmc phen init -d <PHENOTYPE_DIRECTORY> -r <REMOTE_URL>
  - -d, --phen-dir: (Optional) Directory to write phenotype configuration (the default is ./build/phen).
  - -r, --remote_url: (Optional) URL to a remote git repository.
- Validate Phenotype
  Validate the phenotype configuration:
  acmc phen validate -d <PHENOTYPE_DIRECTORY>
  - -d, --phen-dir: (Optional) Directory of phenotype configuration (the default is ./build/phen).
- Map Phenotype
  Process phenotype mapping and specify the target coding and output format:
  acmc phen map -d <PHENOTYPE_DIRECTORY> -t <TARGET_CODING> -o <OUTPUT_FORMAT>
  - -t, --target-coding: Specify the target coding (e.g., read2, read3, icd10, snomed, opcs4).
  - -d, --phen-dir: (Optional) Directory of phenotype configuration (the default is ./build/phen).
  - -o, --output: Output format(s) (csv, omop, or both); the default is csv.
- Publish Phenotype Configuration
  Publish a phenotype configuration, committing all changes and tagging with a new version number. If the phenotype has been initialised from a remote git URL, then the commit and new version tag will be pushed to the remote repo:
  acmc phen publish -d <PHENOTYPE_DIRECTORY>
  - -d, --phen-dir: (Optional) Directory of phenotype configuration (the default is ./build/phen).
- Copy Phenotype Configuration
  Copy a phenotype configuration from a source directory to a target directory at a specific version. This is used to compare versions of phenotypes with the acmc phen diff command:
  acmc phen copy -d <PHENOTYPE_DIRECTORY> -td <TARGET_DIRECTORY> -v <PHENOTYPE_VERSION>
  - -d, --phen-dir: (Optional) Directory of phenotype configuration (the default is ./build/phen).
  - -td, --target-dir: (Optional) Directory to copy the phenotype configuration to (the default is ./build).
  - -v, --version: The phenotype version to copy.
- Compare Phenotype Configurations
  Compare a new phenotype version with a previous version of the phenotype:
  acmc phen diff -d <NEW_PHENOTYPE_DIRECTORY> -old <OLD_PHENOTYPE_DIRECTORY>
  - -d, --phen-dir: (Optional) Directory of current phenotype configuration (the default is ./build/phen).
  - -old, --phen-dir-old: (Required) Directory of the old phenotype version.
License
MIT License
Support
For issues, open an issue in the repository.
Contributing
Please contact the corresponding author, Jakub Dylag, at J.J.Dylag@soton.ac.uk.
Acknowledgements
This project was developed in the context of the MELD-B project, which is funded by the UK National Institute of Health Research under grant agreement NIHR203988.
License
This work is licensed under the Apache License, Version 2.0.