diff --git a/README.md b/README.md
index 4c64070..f2d2fe9 100644
--- a/README.md
+++ b/README.md
@@ -4,38 +4,44 @@
 * R (packages: edgeR)
 * Python (packages: pandas)
 
-## Running step wise
+## RNASeq Pipeline:
+Following series of steps show how to run the DDR method on RNASeq data.
 ### Step0 Preprocess
+Step0 should be used to format the count table such that the input table has the gene expression table with first n columns as group1 samples (Triple negative:TN) and remaining columns as samples from group2 (other:OT). Rows represent the ENSG ids.
 ```
 Rscript step0_preprocess.R
 ```
 
 ### Step1 Calculate Stats
-#### Based on the source of input file, enter RNASeq or microarray as input argument.
+In this step, the count table is normalized and the covariance, standard deviation, mean and MFC are being calculated.
 ```
 Rscript step1_calculateStats.R RNASeq
-Rscript step1_calculateStats.R microarray
 ```
 
 ### Step2 Find Reference set
-#### Based on the source of input file, enter RNASeq or microarray as input argument.
-
+In this step, reference set of genes are being determined. The output file 'ref_cpm.csv' stores expression level of these reference genes.
 ```
 Rscript step2_findRef.R RNASeq
-Rscript step2_findRef.R microarray
 ```
 
 ### Step3 Find overlap using Fisher's Exact test
 #### Enter number of samples in first group as first input argument.
-#### Based on the source of input file, enter RNASeq or microarray as second input argument.
-
 Since the example dataset has 115 samples in group1.
 ```
 Rscript step3_overlapFisher.R 115 RNASeq
-Rscript step3_overlapFisher.R 115 microarray
-
 ```
 
+## Output files
+* final_out.csv: Normalized count data
+* ref_cpm.csv: Expression of reference genes
+* overlap_test_fdr_1_[RNASeq|microarray].csv or : Differentially expressed genes with fdr < 0.1
+* overlap_test_fdr_05_[RNASeq|microarray].csv: Differentially expressed genes with fdr < 0.05
+
+## Pipeline for microarray data. Note that the steps are similar to for RNASeq data.
+Rscript step0_preprocess_microarray.R
+Rscript step1_calculateStats.R microarray
+Rscript step2_findRef.R microarray
+Rscript step3_overlapFisher.R 115 microarray
 
 ## Running all the steps together
 ```
@@ -45,9 +51,3 @@ If you have another python version that has pandas:
 ```
 Rscript DDR_Ref.R PYTHON_PATH
 ```
-
-## Output files
-* final_out.csv: Normalized count data
-* ref_cpm.csv: Expression of reference genes
-* overlap_test_fdr_1_[RNASeq|microarray].csv or : Differentially expressed genes with fdr < 0.1
-* overlap_test_fdr_05_[RNASeq|microarray].csv: Differentially expressed genes with fdr < 0.05
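
The step-wise commands documented in this change differ between the two pipelines only in the Step0 script name and the data-type argument. The sketch below is not part of the patch. It prints the four commands for a given data type and group1 sample count, using only the script names that appear in the diff. The function name `ddr_commands` and its argument order are hypothetical, chosen for illustration.

```shell
#!/bin/sh
# Dry-run helper (hypothetical, not part of the repository):
# prints the step-wise DDR commands for one data type.
#   $1 = data type: RNASeq or microarray
#   $2 = number of samples in group1 (115 in the example dataset)
ddr_commands() {
    data_type=$1
    n_group1=$2

    # Step0 is the only step that uses a different script per data type.
    case "$data_type" in
        RNASeq)     echo "Rscript step0_preprocess.R" ;;
        microarray) echo "Rscript step0_preprocess_microarray.R" ;;
        *)          echo "unknown data type: $data_type" >&2
                    return 1 ;;
    esac

    # Steps 1 to 3 share one script each and take the data type as an argument.
    echo "Rscript step1_calculateStats.R $data_type"
    echo "Rscript step2_findRef.R $data_type"
    echo "Rscript step3_overlapFisher.R $n_group1 $data_type"
}

# Example: list the microarray commands for the 115-sample example dataset.
ddr_commands microarray 115
```

Printing the commands instead of running them keeps the sketch safe to try outside the repository. Piping the output to `sh` from the directory that holds the scripts would run the steps in order, assuming R and edgeR are installed.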