(docs) testing mkdocs and readthedocs integration

Commit e350060b authored 3 months ago by mjbonifa
--- a/docs/index.html
+++ b/docs/index.html
--- a/docs/index.md
+++ b/docs/index.md
-# ACMC Documentation
+# User Guide

-## Contents
+- [Phenotype Workflow](#phenotype-workflow)
+- [Phenotype Definition](#phenotype-definition)
+- [Version Control](#version-control)

-  - [Installation](./installation.md)
-  - [User Guide](./user-guide.md)
-  - [Usage](./usage.md)
-  - Tutorials
-    - [Example 1 - Basic local phenotype](./tutorials/example1.md)
-    - [Example 2 - More complex local phenotype](./tutorials/example2.md)
-    - [Example 3 - Using a remote git repository](./tutorials/example3.md)
-  - [API Reference](./api/acmc.html)
-  - [Change Log](./changelog.md)
-  - [Troubleshooting](./troubleshooting.md)
+## Phenotype Workflow

+ACMC has a five-step workflow for creating a phenotype: initialise, validate, map, publish and export.
+
+### Step 1: Initialise
+
+The `initialise` step creates a phenotype directory within the acmc workspace. The outcome will be a directory with all
+required subdirectories and files; see [directory structure](#phenotype-directory-structure).
+
+```bash
+acmc phen init
+```
+
+### Step 2: Validate
+
+The `validate` step checks that the phenotype configuration is valid, including verification of the
+configuration file against its schema and consistency between concept sets and source concept coding lists. The outcome will be notifications of the validity of the phenotype configuration.
+
+```bash
+acmc phen validate
+```
+
+### Step 3: Map
+
+The `map` step performs code translations between source and target coding types. The outcome will be concept sets defined for the target coding types, stored in CSV files.
+
+```bash
+acmc phen map
+```
+
+### Step 4: Publish
+
+The `publish` step commits the phenotype to a git repo and increments the version number. The outcome will be a published phenotype at the next version number.
+
+```bash
+acmc phen publish
+```
+
+### Step 5: Export
+
+The `export` step creates an OMOP database for the phenotype. The outcome will be an OMOP database, including concept sets for all target coding types, exported as CSV files.
+
+```bash
+acmc phen export
+```
+
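The five commands above can be chained in a small driver script. The following is a minimal sketch, not part of ACMC: `WORKFLOW_STEPS` and `run_workflow` are hypothetical names, and it assumes that each `acmc` command exits with a non-zero status on failure so the sequence can stop early.

```python
import subprocess

# The five workflow commands, in the order given in the guide.
WORKFLOW_STEPS = [
    ["acmc", "phen", "init"],
    ["acmc", "phen", "validate"],
    ["acmc", "phen", "map"],
    ["acmc", "phen", "publish"],
    ["acmc", "phen", "export"],
]


def run_workflow(runner=subprocess.run):
    """Run each step in order and return the commands that completed.

    `runner` is injectable so the sequence can be exercised without
    acmc installed; by default it is subprocess.run.
    """
    completed = []
    for command in WORKFLOW_STEPS:
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit status (assumed here to signal a failed step).
        runner(command, check=True)
        completed.append(" ".join(command))
    return completed
```

Calling `run_workflow()` with no arguments runs the real commands in the current workspace, so it requires `acmc` to be on the `PATH`.
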
+## Phenotype Definition
+
+### **Phenotype directory structure**
+
+```
+workspace/                          # Default workspace directory
+├── phen/                           # Default phenotype directory
+│   ├── concepts/                   # Phenotype source concept code lists directory
+│   ├── concept-sets/               # Processed phenotype concept sets
+│   │   ├── csv/                    # Processed phenotype concept sets in ACMC CSV format
+│   │   ├── omop/                   # Processed phenotype concept sets in OMOP CDM database exported as CSV files
+│   ├── map/                        # Process mapping from source to target code types
+│   │   ├── errors/                 # Errors recorded during mapping process
+│   ├── config.yaml                 # Phenotype configuration file
+│   ├── vocab_versions.yaml         # Versions file for vocabularies used to generate concept sets
+``` 
+
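The layout above can be checked programmatically. The sketch below is illustrative only: `missing_entries` is a hypothetical helper, not an ACMC function, and it assumes every entry in the tree is expected to exist, which may not hold before the `map` and `export` steps have run.

```python
from pathlib import Path

# Entries taken from the directory tree in the guide, relative to the
# phenotype directory (workspace/phen by default).
EXPECTED_DIRS = [
    "concepts",
    "concept-sets/csv",
    "concept-sets/omop",
    "map/errors",
]
EXPECTED_FILES = ["config.yaml", "vocab_versions.yaml"]


def missing_entries(phen_dir):
    """Return the expected directories and files absent from phen_dir."""
    root = Path(phen_dir)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

An empty list from `missing_entries("workspace/phen")` means the directory matches the documented structure.
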
+### **Configuration File**
+
+Phenotype configuration is stored in `config.yaml` in the root of the phenotype directory. The file is in YAML format.
+
+#### **Root Phenotype Element**  
+- `phenotype`: **(object)** The root element containing all phenotype-related concept sets and metadata.
+
+#### **Phenotype Attributes**  
+- `version`: **(string)** Specifies the version of the phenotype definition.  
+- `omop`: **(object)** Metadata related to OMOP vocabulary.  
+  - `vocabulary_id`: **(string)** Identifier for the vocabulary.  
+  - `vocabulary_name`: **(string)** Human-readable name of the vocabulary.  
+  - `vocabulary_reference`: **(string, URL)** A reference URL for the vocabulary source.  
+
+#### **Concept Sets**  
+- `concept_sets`: **(array)** A list of concept set definitions, where each item has the following attributes:  
+  - `name`: **(string)** Unique name of the concept set.  
+  - `file`: **(object)** Contains file-related metadata.  
+    - `path`: **(string, file path)** Relative path to the source concepts coding list file, relative to `<phen_directory>/concepts`
+    - `columns`: **(object)** Key-value pairs mapping column names in the file to coding list types 
+  - `category`: **(optional, string)** A categorical identifier for processing files containing multiple concept sets.  
+  - `actions`: **(optional, object)** Additional transformations on data.  
+    - `divide_col`: **(string)** Specifies a column name in the source concept file to group on.  
+  - `metadata`: **(object)** Reserved for additional metadata.
+ 
+## **Version Control**
+
+ACMC uses [Git](https://git-scm.com/) to support versioning of phenotypes. Git is a version control system that tracks changes in documents such as source coding lists, coding maps or configuration files. Using Git allows ACMC to track versions and changes.
+
+When a phenotype is initialised, the directory is configured as a Git repository. ACMC then provides a simplified interaction with Git through a specific workflow using ACMC commands, including integration with remote Git services such as [GitLab](https://about.gitlab.com/) or [GitHub](https://github.com/).
+
+ACMC does not currently support merging contributions from multiple collaborators on a phenotype through ACMC commands. This has to be done using existing Git tools. 
+
+### **Version Numbers**
+
+ACMC uses [semantic versioning](https://semver.org/) to version phenotypes. Semantic versioning uses three numbers, MAJOR.MINOR.PATCH, where each number is incremented depending on the significance of the change. Although semantic versioning is designed for software, the idea of major, minor and patch changes is retained for the phenotype as follows:
+
+- MAJOR version when you make changes to concept sets
+- MINOR version when you make changes to coding list concepts
+- PATCH version when you make other minor changes such as documentation
+
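The increment rules above can be illustrated with a short function. This is a sketch of standard semantic-version arithmetic, not ACMC's own implementation: `bump_version` is a hypothetical name, and resetting the lower-order numbers to zero on a major or minor increment follows the semver convention and is assumed to apply here.

```python
def bump_version(version, increment="patch"):
    """Return the next MAJOR.MINOR.PATCH string for the given increment."""
    major, minor, patch = (int(part) for part in version.split("."))
    if increment == "major":
        # Changes to concept sets: bump MAJOR, reset MINOR and PATCH.
        return f"{major + 1}.0.0"
    if increment == "minor":
        # Changes to coding list concepts: bump MINOR, reset PATCH.
        return f"{major}.{minor + 1}.0"
    if increment == "patch":
        # Other minor changes, such as documentation.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown increment: {increment!r}")
```

For example, `bump_version("1.2.3", "minor")` gives `"1.3.0"`, and a major increment on `"0.0.1"` gives `"1.0.0"`.
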
+### **Workflows**
+
+ACMC supports local and remote repositories.
+
+#### **Local Workflow**
+
+A local phenotype is only stored within a directory on a filesystem. The following command will create a git repository with the initial phenotype directory structure and make a commit to the git repository.
+
+```bash
+acmc phen init
+```
+
+You can then configure your phenotype and generate maps to other coding types as required. When you are ready to publish a version of your phenotype, run the following command:
+
+```bash
+acmc phen publish
+```
+
+This will commit the changes to the git repository and generate a new version number. If this is the first publish, the initial version will be `0.0.1`. You can tell ACMC how to increment the version using the `-i` argument with either major, minor or patch. The default is a patch change, i.e. incrementing the patch number. The following command will create a major release `1.0.0`:
+
+```bash
+acmc phen publish -i major
+```
+
+#### **Remote Workflow**
+
+A remote phenotype is stored on a central server that others can access remotely. Common services include GitHub and GitLab (public or private). You can connect your local phenotype to a remote repository during initialisation or publication. When connecting to a remote repository, it is strongly recommended that you connect to an empty repo without any previous commits. Do not initialise it with a readme.md file, which is often the default. If the repo already has commits, you will need to resolve the conflicts manually before ACMC will work.
+
+To initialise a phenotype with a remote Git repository, use the following command, replacing the Git URL with the URL of your remote repo.
+
+```bash
+acmc phen init -r https://git.soton.ac.uk/meldb/remote-phenotype.git
+```
+
+If you have a local phenotype and later want to connect it to a remote repository, you can do this when you publish:
+
+```bash
+acmc phen publish -r https://git.soton.ac.uk/meldb/remote-phenotype.git
+```
+
+#### **Fork Remote Workflow**
+
+If there is an existing published remote phenotype that you want to use as a starting point, you can fork the upstream repo and create a new phenotype. The following command creates a fork of the remote repo in a local directory:
+
+```bash
+acmc fork -u https://git.soton.ac.uk/meldb/forked-phenotype.git -v 1.0.0
+```
+
+If you want to fork the repo and connect this to a remote repo you can run the following command. 
+
+```bash
+acmc fork -u https://git.soton.ac.uk/meldb/forked-phenotype.git -v 1.0.0 -r https://git.soton.ac.uk/meldb/remote-phenotype.git
+```
+
+Alternatively you can connect the remote repo later when you publish 
+
+```bash
+acmc phen publish -r https://git.soton.ac.uk/meldb/remote-phenotype.git
+```
+
+### Supported Medical Coding Standards
+
+The tool supports verification and mapping across the medical coding formats below:
+
+| Medical Code  | Verification | Translation to                    |
+|---------------|--------------|-----------------------------------|
+| Readv2        | NHS TRUD     | Readv3, SNOMED, ICD10, OPCS4, ATC |
+| Readv3 (CTV3) | NHS TRUD     | Readv3, SNOMED, ICD10, OPCS4      |
+| ICD10         | NHS TRUD     | None                              |
+| SNOMED        | NHS TRUD     | None                              |
+| OPCS4         | NHS TRUD     | None                              |
+| ATC           | None         | None                              |
+
- [**Read V2:**](https://digital.nhs.uk/services/terminology-and-classifications/read-codes) NHS clinical terminology standard used in primary care and replaced by SNOMED-CT in 2018; still supported by some data providers because it remains widely used in primary care, e.g. [SAIL Databank](https://saildatabank.com/)
- [**SNOMED-CT:**](https://icd.who.int/browse10/2019/en) international standard for clinical terminology for Electronic Healthcare Records adopted by the NHS in 2018; mappings to Read codes are partially provided by [Clinical Practice Research Datalink (CPRD)](https://www.cprd.com/) and [NHS Technology Reference Update Distribution (TRUD)](https://isd.digital.nhs.uk/trud).
+- [**ICD-10:**](https://icd.who.int/browse10/2019/en) International Classification of Diseases (ICD) is a medical classification list from the World Health Organization (WHO) and widely used in hospital settings, e.g. Hospital Episode Statistics (HES).
+- [**ATC Codes:**](https://www.who.int/tools/atc-ddd-toolkit/atc-classification) Anatomical Therapeutic Chemical (ATC) Classification is a drug classification list from the World Health Organization (WHO).
+
+## Notes
+
+   Processed resources will be saved in the `build/maps/processed/` directory.
+
+*Note: NHS TRUD provides one-way mappings. To reverse mappings, duplicate the `.parquet` file and reverse the filename (e.g., `read2_code_to_snomed_code.parquet` to `snomed_code_to_read2_code.parquet`).*
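
The renaming described in the note can be scripted. This is a minimal sketch under stated assumptions: `reversed_map_name` and `duplicate_reversed` are hypothetical helpers, the file name is assumed to follow the `<source>_to_<target>` pattern shown in the example, and the file contents are copied unchanged, as the note describes.

```python
import shutil
from pathlib import Path


def reversed_map_name(filename):
    """Swap the two sides of a '<source>_to_<target>' file name."""
    path = Path(filename)
    source, separator, target = path.stem.partition("_to_")
    if not (source and separator and target):
        raise ValueError(f"not a '<source>_to_<target>' name: {filename!r}")
    return f"{target}_to_{source}{path.suffix}"


def duplicate_reversed(map_file):
    """Copy map_file next to itself under the reversed name."""
    src = Path(map_file)
    dst = src.with_name(reversed_map_name(src.name))
    shutil.copyfile(src, dst)  # contents are duplicated as-is
    return dst
```

For the example in the note, `reversed_map_name("read2_code_to_snomed_code.parquet")` returns `"snomed_code_to_read2_code.parquet"`.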
--- a/docs/installation.html
+++ b/docs/installation.html
--- a/docs/user-guide.md
+++ b/docs/user-guide.md
-# User Guide
-
- [Phenotype Workflow](#phenotype-workflow)
- [Phenotype Definition](#phenotype-definition)
- [Version Control](#version-control)
-
-## Phenotype Workflow
-
-ACMC has a five step workflow to create a phenotype including steps initialise, validate, map, publish and export
-
-### Step 1: Initialise
-
-The `initialise` step creates a phenotype directory within the acmc workspace. The outcome will be a directory with all
-required subdirectories and files, see [directory structure](#phenotype-directory-structure)
-
-```bash
-acmc phen init
-```
-
-### Step 2: Validate
-
-The `validate` step checks that the phenotype configuration is valid including verification of the 
-configuration file according to schema and consistency between concept sets and source concept coding lists. The outcome will be notifications of the validity of the phenotype configuration.
-
-```bash
-acmc phen validate
-```
-
-### Step 3: Map
-
-The `map` step performs code translations between source and target coding types. The outcome will be a concept sets defined for the target coding types stored in CSV files.
-
-```bash
-acmc phen map
-```
-
-### Step 4: Publish
-
-The `publish` step commits the phenotype to a git repo and increments the version number. The outcome will be a published phenotype at the next version number
-
-```bash
-acmc phen publish
-```
-
-### Step 5: Export
-
-The `export` step creates an OMOP database for the phenotype. The outcome will be an OMOP database including concept sets for all target coding types exported as CSV files
-
-```bash
-acmc phen export
-```
-
-## Phenotype Definition
-
-### **Phenotype directory structure**
-
-```
-workspace/                          # Default workspace directory
-├── phen/                           # Default phenotype directory
-│   ├── concepts/                   # Phenotype source concept code lists directory
-│   ├── concept-sets/               # Processed phenotype concept sets
-│   │   ├── csv/                    # Processed phenotype concept sets in ACMC CSV format
-│   │   ├── omop/                   # Processed phenotype concept sets in OMOP CDM database exported as CSV files
-│   ├── map/                        # Process mapping from source to target code types
-│   │   ├── errors/                 # Errors recorded during mapping process
-│   ├── config.yaml                 # Phenotype configuration file
-│   ├── vocab_versions.yaml         # Versions file for vocabularies used to generate concept sets
-``` 
-
-### **Configuration File**
-
-Phenotype configuration is stored in the root of the phenotype directory in `config.yaml`. The file is yaml format. 
-
-#### **Root Phenotype Element**  
- `phenotype`: **(object)** The root element containing all phenotype-related concept sets and metadata.
-
-#### **Phenotype Attributes**  
- `version`: **(string)** Specifies the version of the phenotype definition.  
- `omop`: **(object)** Metadata related to OMOP vocabulary.  
-  - `vocabulary_id`: **(string)** Identifier for the vocabulary.  
-  - `vocabulary_name`: **(string)** Human-readable name of the vocabulary.  
-  - `vocabulary_reference`: **(string, URL)** A reference URL for the vocabulary source.  
-
-#### **Concept Sets**  
- `concept_sets`: **(array)** A list of concept set definitions, where each item has the following attributes:  
-  - `name`: **(string)** Unique name of the concept set.  
-  - `file`: **(object)** Contains file-related metadata.  
-    - `path`: **(string, file path)** Relative path to the source concepts coding list file, relative to `<phen_directory>/concepts`
-    - `columns`: **(object)** Key-value pairs mapping column names in the file to coding list types 
-  - `category` **(optional, string)** A categorical identifier for processing files containing multiple concept sets.  
-  - `actions` **(optional, object)** Additional transformations on data.  
-    - `divide_col`: **(string)** Specifies a column name in the source concept file to group on.  
-  - `metadata`: **(object)** Reserved for additional metadata.
- 
-## **Version Control**
-
-ACMC uses [Git](https://git-scm.com/) to support versioning of phenotypes. Git is a version control system that track changes in documents such as source coding lists, coding maps or configuration files. Using git allows ACMC to track versions and changes. 
-
-When a phenotype is initialised the directory is configured as a Git repository. ACMC then provides a simplified interaction with Git through a specific workflow using ACMC commands including integrate with remote Git services such as [GitLab](https://about.gitlab.com/) or [GitHub](https://github.com/).
-
-ACMC does not currently support merging contributions from multiple collaborators on a phenotype through ACMC commands. This has to be done using existing Git tools. 
-
-### **Version Numbers**
-
-ACMC uses [semantic versioning](https://semver.org/) to version phenotypes. Semantic versioning uses three numbers MAJOR.MINOR.PATCH where each number is incremented depending on the significance of the change. Although semantic versioning is designed for sofware the idea of major, minor and patch changes is retained for the phenotype as per the following 
-
- MAJOR version when you make changes to concept sets
- MINOR version when you make changes to coding list concepts
- PATCH version when you make other minor changes such as documentation
-
-### **Workflows**
-
-ACMC supports local and remote repositories
-
-#### **Local Workflow**
-
-A local phenotype is only stored within a directory on a filesystem. The following command will create a git repository with the initial phenotype directory structure and make a commit to the git repository.
-
-```bash
-acmc phen init
-```
-
-You can then configure your phenotype and generate maps to other coding types as required. When you are finished and happy to publish a version of your phenotype, you run the following command
-
-```bash
-acmc phen publish
-```
-
-This will commit the changes to the git repository and generate a new version number. If this is the first publish the initial version will be `0.0.1`. You can tell ACMC how to increment the version using the `-i` argument with either major, minor or patch. The defaul is a patch change, i.e. incrementing the patch number. Using the following command will create a major release `1.0.0`
-
-```bash
-acmc phen publish -i major
-```
-
-#### **Remote Workflow**
-
-A remote phenotype is stored on a central server that can be accessed remotely by others. Common central services include GitHub or GitLab (public or private). You can connect your local phenotype to a remote repository during initialisation or publication. When connecting to a remote repository it is important and recommended that you connect to an empty repo without any previous commits. Do not initialise it with a readme.md file, which is often the default. If there are commits you will need to resolve the conflicts manually before ACMC will work. 
-
-To initiatise a phenotype with a remote Git repository using the following command replacing the git URL with the URL to your remote repo.
-
-```bash
-acmc phen init -r https://git.soton.ac.uk/meldb/remote-phenotype.git
-```
-
-If you have a local phenotype and later want to connect it to a remote phenotype you can do this when it's published
-
-```bash
-acmc phen publish -r https://git.soton.ac.uk/meldb/remote-phenotype.git
-```
-
-#### **Fork Remote Workflow**
-
-If there is an existing published remote phenotype that you want to use as a starting point you can fork the upstream repo and create a new phenotype. To do this you can run the following command to create a fork of the remote repo in a local directory
-
-```bash
-acmc fork -u https://git.soton.ac.uk/meldb/forked-phenotype.git -v 1.0.0
-```
-
-If you want to fork the repo and connect this to a remote repo you can run the following command. 
-
-```bash
-acmc fork -u https://git.soton.ac.uk/meldb/forked-phenotype.git -v 1.0.0 -r https://git.soton.ac.uk/meldb/remote-phenotype.git
-```
-
-Alternatively you can connect the remote repo later when you publish 
-
-```bash
-acmc phen publish -r https://git.soton.ac.uk/meldb/remote-phenotype.git
-```
-
-### Supported Medical Coding Standards
-
-The tool supports verification and mapping across diagnostic coding formats below:
-
-| Medical Code  | Verification | Translation to                    |
-|---------------|--------------|-----------------------------------|
-| Readv2        | NHS TRUD     | Readv3, SNOMED, ICD10, OPCS4, ATC |
-| Readv3 (CTV3) | NHS TRUD     | Readv3, SNOMED, ICD10, OPCS4      |
-| ICD10         | NHS TRUD     | None                              |
-| SNOMED        | NHS TRUD     | None                              |
-| OPCS4         | NHS TRUD     | None                              |
-| ATC           | None         | None                              |
-
- [**Read V2:**](https://digital.nhs.uk/services/terminology-and-classifications/read-codes) NHS clinical terminology standard used in primary care and replaced by SNOMED-CT in 2018; Still supported by some data providers as widely used in primary care, e.g. [SAIL Databank](https://saildatabank.com/)
- [**SNOMED-CT:**](https://icd.who.int/browse10/2019/en) international standard for clinical terminology for Electronic Healthcare Records adopted by the NHS in 2018; Mappings to Read codes are partially provided by [Clinical Research Practice Database (CPRD)](https://www.cprd.com/) and [NHS Technology Reference Update Distribution (TRUD)](https://isd.digital.nhs.uk/trud).
- [**ICD-10:**](https://icd.who.int/browse10/2019/en) International Classification of Diseases (ICD) is a medical classification list from the World Health Organization (WHO) and widely used in hospital settings, e.g. Hospital Episode Statistics (HES).
- [**ATC Codes:**](https://www.who.int/tools/atc-ddd-toolkit/atc-classification) Anatomical Therapeutic Chemical (ATC) Classification is a drug classification list from the World Health Organization (WHO)
-
-## Notes
-
-   Processed resources will be saved in the `build/maps/processed/` directory.
-
-*Note: NHS TRUD provides one-way mappings. To reverse mappings, duplicate the `.parquet` file and reverse the filename (e.g., `read2_code_to_snomed_code.parquet` to `snomed_code_to_read2_code.parquet`).*
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -7,11 +7,10 @@ theme:
 nav:
  - Home: index.html
  - Installation: installation.html
-  - User Guide: user-guide.md
-  - Usage: usage.md
  - Tutorial 1 Basic local phenotype: tutorials/example1.md
  - Tutorial 2 More complex local phenotype: tutorials/example2.md
  - Tutorial 3 Using a remote git repository: tutorials/example3.md
+  - Command Line Reference: usage.md  
  - API Reference: api/acmc.html  
  - Change Log: changelog.md
  - Troubleshooting: troubleshooting.md