diff --git a/README.md b/README.md
index c2675584ceca18992db4046fc3ba626d95c9ac84..aa00cae3a243dd7ef05f434c46823f35e9195fef 100644
--- a/README.md
+++ b/README.md
@@ -24,12 +24,12 @@ This tool automates the verification, translation and organisation of medical co
 ## Methods

 ### Workflow Overview
-1. Approved MELD-B concepts are outlined in a CSV spreadsheet (e.g., `PHEN_summary_working.csv`).
+1. Approved concept sets are outlined in a CSV spreadsheet (e.g., `PHEN_summary_working.csv`).
 2. Imported code lists in the `/src` directory are validated against NHS TRUD-registered codes.
-3. Mappings from imported code lists to outputted MELD-B concepts are defined in the `PHEN_assign_v3.json` file.
+3. Mappings from imported code lists to outputted concept sets are defined in the `PHEN_assign_v3.json` file.
    - See "JSON Phenotype Mapping" section for more details
 4. The process is executed via command-line. Refer to the "Usage" section for execution instructions.
-5. Outputted concept code lists are saved to the `/concepts` Git repository, with all changes tracked.
+5. Outputted concept set code lists are saved to the `/concepts` Git repository, with all changes tracked.
 6. The code lists can be exported to SAIL or any other Data Bank.

 ### Supported Medical Coding Standards
@@ -88,7 +88,7 @@ Processed tables will be saved as `.parquet` files in the `maps/processed/` dire

 ## Configuration

-The JSON configuration file specifies how input codes are grouped into **concept sets**, which are collections of related codes used for defining phenotypes or other data subsets. The configuration is divided into two main components: the `"concept_sets"` object and the `"codes"` object. The `"codes"` objects specifies the inputted codes; their filepaths, column names and code types, as well as any formatting actions that maybe be neccessary. The `"concept_sets"` object defines the concept groups each of the inputted codes will be assigned to. All files must be formatted as shown below.
+The JSON configuration file specifies how input codes are grouped into **concept sets**, which are collections of related codes used for defining phenotypes or other data subsets. The configuration is divided into two main components: the `"concept_sets"` object and the `"codes"` object. The `"codes"` object specifies the inputted codes: their file paths, column names and code types, as well as any formatting actions that may be necessary. The `"concept_sets"` object defines the groupings that the inputted codes will be assigned to. All files must be formatted as shown below.
 ```json
 {
     "concept_sets": {