Commit 4541b584 authored by mjbonifa

updated readme

parent 8e356322
......@@ -84,47 +84,69 @@ $env:ACMC_GITLAB_PAT="your_personal_access_token"
$env:ACMC_GITHUB_PAT="your_personal_access_token"
```
## Requirements
- Python 3.9 or higher
## Installation
**1. Set Up Conda Environment**
To install the `acmc` package, simply run:
```bash
pip install acmc
```
Once installed, you'll be ready to use the `acmc` tool along with the associated vocabularies.
## Getting Started
### Install Clinically Assured NHS TRUD Code Mappings
ACMC requires Python, and the environment is maintained using conda.
1. **Register at TRUD**: Access clinically assured terminology mappings at [NHS TRUD](https://isd.digital.nhs.uk/trud/user/guest/group/0/account/form).
* Ensure you have conda installed, e.g. following instructions for miniconda from [https://docs.conda.io/en/latest/miniconda.html](https://docs.conda.io/en/latest/miniconda.html).
* Create environment: `conda env create -f conda.yaml`
* Activate environment: `conda activate acmc`
2. **Subscribe and Accept Licenses**: Subscribe to the following data files:
**2. Register at TRUD** to access clinically assured terminology mappings [NHS TRUD](https://isd.digital.nhs.uk/trud/user/guest/group/0/account/form)
- [NHS Read Browser](https://isd.digital.nhs.uk/trud/users/guest/filters/2/categories/9/items/8/releases)
- [NHS Data Migration](https://isd.digital.nhs.uk/trud/users/guest/filters/0/categories/8/items/9/releases)
- [ICD10 Edition 5 XML](https://isd.digital.nhs.uk/trud/users/guest/filters/0/categories/28/items/259/releases)
- [OPCS-4.10 Data Files](https://isd.digital.nhs.uk/trud/users/guest/filters/0/categories/10/items/119/releases)
**3. Subscribe and accept the following licenses**
After subscribing, you'll receive an API key once your request is approved (usually within 24 hours).
ACMC uses clinically assured medical terminologies provided by the NHS. The datafiles are downloaded automatically but you need to register, request subscription and obtain an API key.
4. **Get TRUD API KEY**: Copy your API key from [NHS TRUD Account Management](https://isd.digital.nhs.uk/trud/users/authenticated/filters/0/account/manage) and store it securely.
* [NHS Read Browser](https://isd.digital.nhs.uk/trud/users/guest/filters/2/categories/9/items/8/releases)
* [NHS Data Migration](https://isd.digital.nhs.uk/trud/users/guest/filters/0/categories/8/items/9/releases)
* [ICD10 Edition 5 XML](https://isd.digital.nhs.uk/trud/users/guest/filters/0/categories/28/items/259/releases)
* [OPCS-4.10 Data Files](https://isd.digital.nhs.uk/trud/users/guest/filters/0/categories/10/items/119/releases)
<!-- - [BNF/Snomed Mapping data.xlsx](https://www.nhsbsa.nhs.uk/prescription-data/understanding-our-data/bnf-snomed-mapping) -->
5. **Add the TRUD API key as an environment variable**
Each data file has a "Subscribe" link that takes you to the licence. Under "Tell us about your subscription request", summarise why you need access to the data, e.g. for a specific research project. Your subscription stays in the "pending" state until it is approved, which usually happens within 24 hours.
To set the environment variable temporarily (for the current session), run:
**4. Get TRUD API Key**
On macOS/Linux:
Go to your [NHS TRUD Account Management](https://isd.digital.nhs.uk/trud/users/authenticated/filters/0/account/manage) page and copy your API key to a safe place, e.g. a personal key store. The API key is required by ACMC tools to download TRUD resources.
```bash
export ACMC_TRUD_API_KEY="your_api_key_here"
```
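On Windows (Command Prompt or PowerShell; note that `setx` stores the variable persistently and it only becomes visible in newly opened sessions):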
```bash
setx ACMC_TRUD_API_KEY "your_api_key_here"
```
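As a rough illustration of how a tool could pick up this variable, the Python sketch below reads `ACMC_TRUD_API_KEY` from the environment and fails early when it is missing. The helper name `get_trud_api_key` is hypothetical and is not taken from the acmc code base; only the variable name comes from the instructions above.

```python
import os

ENV_VAR = "ACMC_TRUD_API_KEY"  # variable name used in the README instructions


def get_trud_api_key(environ=os.environ):
    """Return the TRUD API key, or raise if it is unset or blank.

    Hypothetical helper for illustration only; not part of acmc.
    """
    key = environ.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{ENV_VAR} is not set; define it before installing TRUD resources")
    return key
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.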
**5. Download and install TRUD resources**
4. **Download and Install TRUD Resources**:
Execute the following script to download, install and process TRUD resources
Run the following acmc command to download and process the TRUD resources:
`python acmc.py trud install --key <API_KEY>`.
```bash
acmc trud install
```
Processed resources will be saved in the `build/maps/processed/` directory.
Processed TRUD resources are saved as `.parquet` files in the `build/maps/processed/` directory.
*Note: NHS TRUD provides one-way mappings. To reverse mappings, duplicate the `.parquet` file and reverse the filename (e.g., `read2_code_to_snomed_code.parquet` to `snomed_code_to_read2_code.parquet`).*
*Note: NHS TRUD defines one-way mappings and does <b>NOT ADVISE</b> reversing the mappings. If you still wish to reverse these into two-way mappings, duplicate the given `.parquet` table and reverse the filename (e.g. `read2_code_to_snomed_code.parquet` to `snomed_code_to_read2_code.parquet`)*
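If you do decide to reverse a mapping despite the advice above, the duplication step could be scripted roughly as follows. This is a minimal sketch that assumes filenames follow the `<source>_to_<target>.parquet` pattern shown in the note; the function names are hypothetical, and the sketch only copies and renames the file, exactly as the note describes, without touching the table contents.

```python
import shutil
from pathlib import Path


def reversed_mapping_name(filename):
    """Turn 'a_to_b.parquet' into 'b_to_a.parquet' (assumed naming pattern)."""
    path = Path(filename)
    source, sep, target = path.stem.partition("_to_")
    if not sep or not source or not target:
        raise ValueError(f"unexpected mapping filename: {filename}")
    return f"{target}_to_{source}{path.suffix}"


def duplicate_reversed(mapping_file):
    """Copy a mapping file next to itself under the reversed name."""
    src = Path(mapping_file)
    dst = src.with_name(reversed_mapping_name(src.name))
    shutil.copyfile(src, dst)  # contents are duplicated unchanged
    return dst
```

For example, `duplicate_reversed("build/maps/processed/read2_code_to_snomed_code.parquet")` would write `snomed_code_to_read2_code.parquet` into the same directory.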
### Install OMOP Vocabularies
**6. Optional: Install OMOP Database:**
1. Register with [Athena](https://athena.ohdsi.org/auth/login)
ACMC optionally supports outputting coding lists in a structured OMOP database. To do this you will need to register with [Athena](https://athena.ohdsi.org/auth/login?forceSSO=true) and then download the following vocabularies manually from [Athena OHDSI](https://athena.ohdsi.org/vocabulary/list).
2. Download vocabularies [Athena OHDSI](https://athena.ohdsi.org/vocabulary/list).
* Required vocabularies include:
* 1) SNOMED
......@@ -139,13 +161,21 @@ ACMC optionally supports outputting coding lists in structured OMOP database. To
* 154) NHS Ethnic Category
* 155) NHS Place of Service
The vocabularies will not be available immediately; you will be notified by email when they are ready. This process cannot be automated due to the way that Athena delivers vocabularies for download.
You will be notified by email with a vocabularies version number and a link to download a zip file of OMOP database tables in CSV format.
* Un-zip the downloaded folder and copy its path.
3. Un-zip the OMOP file
* Install vocabularies using the following command:
Create a directory where you want the OMOP CSV tables to be stored; the default, relative to the current working directory, is `./build/omop`.
`python acmc.py omop install -f <Path to extracted OMOP downloads folder>`
Unzip the OMOP files into that directory.
5. Install OMOP vocabularies
Run the following acmc command to create a local OMOP database from the download:
```bash
acmc omop install -d <Directory path to extracted OMOP downloads> -v <release version from email>
```
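The unzip step that precedes this command can also be scripted. The sketch below is an assumption about how you might do it with the Python standard library; `extract_omop_zip` is a hypothetical helper, and the only detail taken from the README is the default directory `./build/omop`.

```python
import zipfile
from pathlib import Path


def extract_omop_zip(zip_path, omop_dir="./build/omop"):
    """Extract the Athena download into omop_dir and list the CSV tables found.

    Hypothetical helper for illustration; not part of acmc.
    """
    target = Path(omop_dir)
    target.mkdir(parents=True, exist_ok=True)  # create the directory if needed
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return sorted(p.name for p in target.glob("*.csv"))
```

The returned list of CSV filenames gives a quick check that the tables landed in the directory you then pass to `acmc omop install -d`.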
## Defining phenotypes
......@@ -283,101 +313,147 @@ Need to split column into multiple columns, so only one code type per column.
*<b>Large code lists</b> with numerous phenotypes (e.g. Ho et al.) require a lot of JSON to be generated. See the "Ho generate JSON" section in process_codes_WP.ipynb for example code to generate it.*
## Usage - ACMC Command-Line Tool
## Usage
The tool follows a structured command system:
The `acmc` command-line tool provides various commands to interact with TRUD, OMOP, and Phenotype data. Below are the usage details for each command.
### General Syntax
```bash
python acmc.py <command> <subcommand> [options]
acmc [OPTIONS] COMMAND [SUBCOMMAND] [ARGUMENTS]
```
### Available Commands
- **`trud`** – Manage TRUD components
- **`omop`** – Manage OMOP codes and database
- **`map`** – Process mapping configurations
Where:
- `[OPTIONS]` are global options that apply to all commands (e.g., `--debug`).
- `[COMMAND]` is the top-level command (e.g., `trud`, `omop`, `phen`).
- `[SUBCOMMAND]` refers to the specific operation within the command (e.g., `install`, `validate`).
---
### Global Options
- `--debug`: Enable debug mode for more verbose logging.
### Commands
#### TRUD Command
The `trud` command is used for installing NHS TRUD vocabularies.
- **Install TRUD**
Install clinically assured TRUD medical code mappings:
## TRUD Command
### Install TRUD Components
```bash
acmc trud install -k <TRUD_API_KEY>
acmc trud install
```
**Options:**
- `-k, --api-key` _(required)_ – TRUD API key
---
#### OMOP Command
The `omop` command is used for installing OMOP vocabularies.
- **Install OMOP**
Install vocabularies in a local OMOP database:
## OMOP Commands
### Install OMOP Codes
```bash
acmc omop install -f <OMOP_FOLDER_PATH>
acmc omop install -d <OMOP_DIRECTORY_PATH> -v <OMOP_VERSION>
```
**Options:**
- `-f, --omop-folder` _(required)_ – Path to extracted OMOP downloads folder
### Clear OMOP Data
- `-d`, `--omop-dir`: (Optional) Directory path to extracted OMOP downloads, default is `./build/omop`
- `-v`, `--version`: OMOP vocabularies release version.
- **Clear OMOP**
Clear data from the local OMOP database:
```bash
acmc omop clear
```
_Removes OMOP data from the database._
### Delete OMOP Database
- **Delete OMOP**
Delete the local OMOP database:
```bash
acmc omop delete
```
_Deletes the entire OMOP database._
---
#### PHEN Command
The `phen` command is used for phenotype-related operations.
- **Initialize Phenotype**
Initialize a phenotype directory locally or from a remote git repository:
```bash
acmc phen init -d <PHENOTYPE_DIRECTORY> -r <REMOTE_URL>
```
- `-d`, `--phen-dir`: (Optional) Directory to write phenotype configuration (the default is ./build/phen).
- `-r`, `--remote_url`: (Optional) URL to a remote git repository.
- **Validate Phenotype**
Validate the phenotype configuration:
## MAP Commands
### Process Phenotype Configuration
```bash
acmc map process -c <CONFIG_FILE> -s <SOURCE_CODES_DIR> -o <OUTPUT_DIR> -t <TARGET_CODING> [options]
acmc phen validate -d <PHENOTYPE_DIRECTORY>
```
**Required Options:**
- `-c, --config-file` – Path to the phenotype configuration file
- `-s, --source-codes-dir` – Root directory of source codes
- `-o, --output-dir` – Directory for CSV or OMOP database output
- `-t, --target-coding` – Target coding system _(choices: read2, read3, icd10, snomed, opcs4)_
- `-d`, `--phen-dir`: (Optional) Directory of phenotype configuration (the default is ./build/phen).
- **Map Phenotype**
**Optional Flags:**
- `-tr, --translate` – Enable code translation (default: disabled)
- `-v, --verify` – Enable code verification (default: disabled)
Process phenotype mapping and specify the target coding and output format:
**Optional Arguments:**
- `-l, --error-log` – Filepath to save error log (default: `error.csv`)
```bash
acmc phen map -d <PHENOTYPE_DIRECTORY> -t <TARGET_CODING> -o <OUTPUT_FORMAT>
```
- `-t`, `--target-coding`: Specify the target coding (e.g., `read2`, `read3`, `icd10`, `snomed`, `opcs4`).
- `-d`, `--phen-dir`: (Optional) Directory of phenotype configuration (the default is ./build/phen).
- `-o`, `--output`: Output format(s) (`csv`, `omop`, or both), default is 'csv'.
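To see how `-o` accepts one format or both, the following standalone sketch mirrors the `-t` and `-o` argument definitions that this commit adds to the `phen map` parser. It is a cut-down illustration rather than the full acmc parser: the `prog` name and the `parse` wrapper are added here for convenience.

```python
import argparse

# Standalone parser reproducing only the -t and -o arguments of "phen map"
parser = argparse.ArgumentParser(prog="acmc phen map")
parser.add_argument("-t", "--target-coding",
                    required=True,
                    choices=["read2", "read3", "icd10", "snomed", "opcs4"],
                    help="Specify the target coding")
parser.add_argument("-o", "--output",
                    choices=["csv", "omop"],
                    nargs="+",        # one or more values after -o
                    default=["csv"],  # used when -o is omitted
                    help="Specify output format(s): 'csv', 'omop', or both")


def parse(argv):
    """Parse a list of command-line tokens and return the namespace."""
    return parser.parse_args(argv)
```

With this definition, `parse(["-t", "read2"]).output` falls back to the CSV default, while `parse(["-t", "snomed", "-o", "csv", "omop"]).output` collects both formats into a list.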
- **Publish Phenotype Configuration**
---
Publish a phenotype configuration, committing all changes and tagging with a new version number. If the phenotype has been initialised from a remote git URL, then the commit and new version tag will be pushed to the remote repo:
## Examples
### Install TRUD Components
```bash
acmc trud install -k my-trud-api-key
acmc phen publish -d <PHENOTYPE_DIRECTORY>
```
### Install OMOP Codes
- `-d`, `--phen-dir`: (Optional) Directory of phenotype configuration (the default is ./build/phen).
- **Copy Phenotype Configuration**
Copy a phenotype configuration from a source directory to a target directory at a specific version. This is used when wanting to compare versions of phenotypes using the `acmc phen diff` command:
```bash
acmc omop install -f /path/to/omop
acmc phen copy -d <PHENOTYPE_DIRECTORY> -td <TARGET_DIRECTORY> -v <PHENOTYPE_VERSION>
```
### Process Mapping Configuration with Read2 Target Coding
- `-d`, `--phen-dir`: (Optional) Directory of phenotype configuration (the default is ./build/phen).
- `-td`, `--target-dir`: (Optional) Directory to copy the phenotype configuration to (the default is ./build).
- `-v`, `--version`: The phenotype version to copy.
- **Compare Phenotype Configurations**
Compare a new phenotype version with a previous version of the phenotype:
```bash
acmc map process -c config.json -s /data/source -o /data/output -t read2 --translate --verify
acmc phen diff -d <NEW_PHENOTYPE_DIRECTORY> -old <OLD_PHENOTYPE_DIRECTORY>
```
- `-d`, `--phen-dir`: (Optional) Directory of current phenotype configuration (the default is ./build/phen).
- `-old`, `--phen-dir-old`: (Required) Directory of old phenotype version.
## License
MIT License
## Support
For issues, open a ticket in the repository or contact support@example.com.
## Contributing
### Commit to GitLab
......
......@@ -110,13 +110,31 @@ def main():
# phen map
phen_map_parser = phen_subparsers.add_parser("map", help="Process phen mapping")
phen_map_parser.add_argument("-d", "--phen-dir", type=str, default=str(phen.DEFAULT_PHEN_PATH.resolve()), help="Phenotype directory")
phen_map_parser.add_argument("-t", "--target-coding", required=True, choices=['read2', 'read3', 'icd10', 'snomed', 'opcs4'], help="Specify the target coding (read2, read3, icd10, snomed, opcs4)")
phen_map_parser.add_argument("-d",
"--phen-dir",
type=str,
default=str(phen.DEFAULT_PHEN_PATH.resolve()),
help="Phenotype directory")
phen_map_parser.add_argument("-t",
"--target-coding",
required=True,
choices=['read2', 'read3', 'icd10', 'snomed', 'opcs4'],
help="Specify the target coding (read2, read3, icd10, snomed, opcs4)")
phen_map_parser.add_argument("-o",
"--output",
choices=["csv", "omop"],
nargs="+", # allows one or more values
default=["csv"], # default to CSV if not specified
help="Specify output format(s): 'csv', 'omop', or both (default: csv)")
phen_map_parser.set_defaults(func=phen_map)
# phen publish
phen_publish_parser = phen_subparsers.add_parser("publish", help="Publish phenotype configuration")
phen_publish_parser.add_argument("-d", "--phen-dir", type=str, default=str(phen.DEFAULT_PHEN_PATH.resolve()), help="Phenotype directory")
phen_publish_parser.add_argument("-d",
"--phen-dir",
type=str,
default=str(phen.DEFAULT_PHEN_PATH.resolve()),
help="Phenotype directory")
phen_publish_parser.set_defaults(func=phen_publish)
# phen copy
......
......@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
[project]
name = "acmc"
version = "0.0.1"
version = "0.0.2"
authors = [
{ name = "Jakub Dylag", email = "j.j.dylag@soton.ac.uk" },
{ name = "Michael Boniface", email = "m.j.boniface@soton.ac.uk" }
......