Commit 76f7b571 authored by Leon-Charles Tranchevent

Updated README file

parent 9527d599
@@ -14,14 +14,14 @@ This repository contains the code necessary to run the analyses described in the
 This project focuses on a meta-analysis of transcriptomics datasets of Parkinson's disease patients and controls in order to identify variations associated with both disease status and biological sex. These variations are then further investigated through functional enrichment and regulatory network analyses.
 ## Content
-The workflow is split in eight sequential steps, each one is associated with a corresponding folder. There is an additional folder for the configuration files (*e.g.*, to indicate where to find the data and to define the parameters of the analyses). Each step is briefly described below but each associated folder also contain its own README file.
+The workflow is split into eight sequential steps, each one is associated with a corresponding folder. There is an additional folder for the configuration files (*e.g.*, to indicate where to find the data and to define the parameters of the analyses). Each step is briefly described below, but each associated folder also contains its own README file.
 1. The quality control of the raw expression data is performed.
-2. The raw expression data is preprocessed and another quality control is performed afterwards.
+2. The raw expression data is preprocessed and a final quality control is performed afterwards.
 3. The clinical annotations are investigated in order to identify whether missing values can be predicted.
-4. The processed data and associated clinical annotations are prepared taking into account the observations from the previous steps (*i.e.*, samples to remove because of the quality control, predicted clinical values to add).
-5. For each dataset, two differential expression analyses are performed using respectively only the male samples and only the female samples. For both analyses, patients and controls are compared so that the models identify the genes that are differentially expressed between female patients and female controls (or between male patients and male controls).
-6. The meta-analyses are performed by integrating the results of the differential expression analyses across datasets (but again separately for each sex). By comparing the male and female results, the female-specific, male-specific and sex-dimorphic genes are then defined.
+4. The processed data and associated clinical annotations are prepared taking into account the observations from the previous steps (*i.e.*, samples to remove because they did not pass quality control filters, predicted clinical values to add).
+5. For each dataset, two differential expression analyses are performed using only the male samples or only the female samples, respectively. For both analyses, patients and controls are compared so that the models identify the genes that are differentially expressed between female patients and female controls (or between male patients and male controls).
+6. The meta-analyses are performed by integrating the results of the differential expression analyses across datasets (again separately for each sex). By comparing the male and female results, the female-specific, male-specific and sex-dimorphic genes are then defined.
 7. Functional enrichment of the meta-analysis results is performed.
 8. Regulatory networks around the key differentially expressed genes are reconstructed.
@@ -29,10 +29,7 @@ The workflow is split in eight sequential steps, each one is associated with a c
 The datasets used in our study have been extracted from the [Gene Expression Omnibus](https://www.ncbi.nlm.nih.gov/geo/). The code can be used to analyze other datasets as long as the raw data and the associated clinical data is available. The configuration of the meta-analysis (*i.e.*, which datasets to include) can be found in the configuration folder `Confs/`.
 ## Requirements
-The code consists of R and bash scripts. In addition, makefiles are used to illustrate how the scripts were exactly used in our meta-analysis. This project relies on various R and BioConductor packages (see the full list in the file `Confs/packages`). It also relies on the ArrayUtils set of functions which repository can be found [here](https://git-r3lab.uni.lu/bds/geneder/arrayutils).
+The code consists of R and bash scripts. In addition, makefiles are used to illustrate how the scripts were exactly used in our meta-analysis. This project relies on various R and BioConductor packages (see the full list in the file `Confs/packages`). It also relies on the ArrayUtils set of functions for which the repository can be found [here](https://git-r3lab.uni.lu/bds/geneder/arrayutils).
 ## License
-The code is available under the GNU General Public License (GPLv3).
-## Citation
-If you found this code useful, please cite our article: **Systems level meta-analysis of disease-associated molecular gender differences in Parkinson’s disease**, Tranchevent LC., Halder R. and Glaab E., *manuscript submitted*
+The code is available under the GNU General Public License (GPLv3).
\ No newline at end of file