Commit 31a8ef93 authored by mjbonifa's avatar mjbonifa

docs: added directory docs and yaml config doc file

parent 33c28bbf
@@ -13,7 +13,7 @@
- [Change Log](./changelog.md)
- [Troubleshooting](./troubleshooting.md)
### Overview
## Overview
### Supported Medical Coding Standards
@@ -39,5 +39,51 @@ The tool supports verification and mapping across diagnostic coding formats belo
*Note: NHS TRUD provides one-way mappings. To reverse mappings, duplicate the `.parquet` file and reverse the filename (e.g., `read2_code_to_snomed_code.parquet` to `snomed_code_to_read2_code.parquet`).*
## Phenotype Definition
### **Phenotype Directory Structure**
```
workspace/ # Default workspace directory
├── phen/ # Default phenotype directory
│ ├── codes/ # Phenotype source concept code lists directory
│ ├── concept-set/ # Processed phenotype concept sets in CSV format
│ ├── map/ # Process mapping from source to target code types
│ │ ├── errors/ # Errors recorded during mapping
│ ├── omop/ # Processed phenotype concept sets in OMOP database CSV files
│ ├── config.yaml # Phenotype configuration file
│ ├── vocab_versions.yaml # Versions file for vocabularies used to generate concept sets
```
### **Configuration File**
The phenotype configuration is stored in `config.yaml`, in the root of the phenotype directory. The file is in YAML format.
#### **Root Element**
- `phenotype`: **(object)** The root element containing all phenotype-related concept sets and metadata.
#### **Phenotype Attributes**
- `version`: **(string)** Specifies the version of the phenotype definition.
- `omop`: **(object)** Metadata related to OMOP vocabulary.
  - `vocabulary_id`: **(string)** Identifier for the vocabulary.
  - `vocabulary_name`: **(string)** Human-readable name of the vocabulary.
  - `vocabulary_reference`: **(string, URL)** A reference URL for the vocabulary source.
#### **Concept Sets**
- `concept_sets`: **(array)** A list of concept set definitions, where each item has the following attributes:
  - `name`: **(string)** Unique name of the concept set.
  - `file`: **(object)** Contains file-related metadata.
    - `path`: **(string, file path)** Path to the source concepts coding list file, relative to `<phen_directory>/codes`.
    - `columns`: **(object)** Key-value pairs mapping column names in the file to coding list types.
    - `category`: **(optional, string)** A categorical identifier for processing files containing multiple concept sets.
    - `actions`: **(optional, object)** Additional transformations on data.
      - `divide_col`: **(string)** Specifies a column name in the source concept file to group on.
  - `metadata`: **(object)** Reserved for additional metadata.
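
#### **Example**
The attributes described above could be combined into a `config.yaml` along the following lines. This is a minimal sketch, not a file taken from the project: every value (version number, vocabulary identifiers, URL, concept set name, file path, column names, and category) is a hypothetical placeholder, and the direction of the `columns` mapping (column name as key, coding list type as value) is an assumption based on the wording above that should be checked against the tool.

```yaml
# Hypothetical config.yaml; all values are illustrative placeholders.
phenotype:
  version: "0.0.1"
  omop:
    vocabulary_id: "EXAMPLE"
    vocabulary_name: "Example phenotype vocabulary"
    vocabulary_reference: "https://example.org/phenotypes/example"
  concept_sets:
    - name: "EXAMPLE_CONCEPT_SET"
      file:
        # Resolved relative to <phen_directory>/codes
        path: "example/concepts.csv"
        columns:
          # Assumed direction: column name in the file -> coding list type
          code: "read2"
        # Optional: used when one file contains several concept sets
        category: "example_category"
        # Optional: group on a column of the source concept file
        actions:
          divide_col: "example_group_column"
      # Reserved for additional metadata
      metadata: {}
```

With the default layout, this file would sit at `workspace/phen/config.yaml` and the referenced code list at `workspace/phen/codes/example/concepts.csv`.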