Q) My datasets of interest aren't included in DEE. What should I do?

  1. A) First, confirm that the species and accession number combination is correct. Then check that the sequence data is available from the SRA FTP site and that the study's raw data has been released and is not under embargo. If the dataset was added only recently, you have a few options. You can run the Docker or Singularity image on your own server or in the cloud; the instructions are on GitHub here. Alternatively, contact us and we'll have it added.

Q) How were the QC metrics generated and what do they mean?

  1. A) We have provided an explanation of each of the QC metrics on the GitHub page here.

Q) Can I get the TPM values from kallisto?

  1. A) Unfortunately, the abundance.tsv files from kallisto were occupying too much disk space, so they are no longer hosted on the webserver. We still have copies elsewhere and can arrange access by other means if necessary.

Q) How do I collapse transcript counts to the gene level?

  1. A) You can use the getDEE2 R package, documented here. It performs a simple aggregation, summing the counts of all isoforms into their parent gene.
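
For readers working outside R, the sum-of-isoforms idea can be illustrated with a minimal Python sketch. The function name `collapse_to_genes`, the transcript-to-gene map, and all identifiers below are hypothetical and chosen only for illustration; this is not the getDEE2 implementation.

```python
from collections import defaultdict

def collapse_to_genes(tx_counts, tx2gene):
    """Sum transcript-level counts into gene-level counts.

    tx_counts: dict of transcript ID -> {sample ID: count}
    tx2gene:   dict of transcript ID -> parent gene ID
    """
    gene_counts = defaultdict(lambda: defaultdict(float))
    for tx, per_sample in tx_counts.items():
        gene = tx2gene[tx]  # raises KeyError if a transcript has no parent gene
        for sample, count in per_sample.items():
            gene_counts[gene][sample] += count
    # Convert nested defaultdicts to plain dicts for easier inspection
    return {gene: dict(samples) for gene, samples in gene_counts.items()}

# Hypothetical transcript counts for two runs
tx_counts = {
    "tx1": {"SRR0000001": 10.0, "SRR0000002": 4.0},
    "tx2": {"SRR0000001": 5.0, "SRR0000002": 1.0},
    "tx3": {"SRR0000001": 7.0, "SRR0000002": 0.0},
}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

gene_counts = collapse_to_genes(tx_counts, tx2gene)
print(gene_counts)
```

Here `tx1` and `tx2` both belong to `geneA`, so their counts are added together per run, while `geneB` keeps the counts of its single transcript.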

Q) How do I open zip compressed files?

  1. A) WinZip can uncompress .zip files and is available here for free. Decompression tools are also available from the Apple and Android app stores for mobile devices.
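
If you prefer to script the step, zip archives can also be unpacked with Python's standard `zipfile` module. The sketch below is one possible approach; the function name `extract_zip` and the file names in the usage comment are hypothetical.

```python
import zipfile
from pathlib import Path

def extract_zip(zip_path, dest_dir):
    """Extract every member of a .zip archive into dest_dir.

    Returns the list of member names found in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        archive.extractall(dest)
    return names

# Usage (hypothetical file names):
# extract_zip("my_download.zip", "dee2_data")
```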

Q) What tools are recommended for analyzing RNA-seq count data?

  1. A) There are many options, but we recommend Degust for new users.

Q) How do I load the data into R?

  1. A) We have written an R/Bioconductor package that interfaces with the webpage and loads data into R. For more information, read the documentation here.

Q) The webserver is limited to 500 datasets but my project of interest contains more than that - can I still use DEE2?

  1. A) Yes, you have three options. The best is to check whether the project has already been packaged: for convenience, we have packaged all projects with >200 runs that have been completely processed by DEE2 into zip files containing the expression data. These are accessible here. Second, if the project is not included there, you can download it in chunks by submitting 500 run accessions at a time. Third, you can download the full bulk data and filter as necessary.
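
For the chunked-download option, the list of run accessions first needs to be split into batches of at most 500. A minimal Python sketch of that splitting step follows; the function name `chunk_accessions` and the accession numbers are hypothetical, and submitting each batch to the webserver is left to you.

```python
def chunk_accessions(accessions, size=500):
    """Split a list of run accessions into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    return [accessions[i:i + size] for i in range(0, len(accessions), size)]

# Hypothetical project with 1200 runs
runs = [f"SRR{i:07d}" for i in range(1, 1201)]
batches = chunk_accessions(runs)
print([len(batch) for batch in batches])  # prints [500, 500, 200]
```

Each batch can then be submitted separately and the resulting tables combined afterwards.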

Q) Can I load the data directly into my Galaxy history?

  1. A) DEE data can be uploaded manually into Galaxy; we will be investigating direct integration in future.

Q) My DEE data contains multiple technical replicates. How can I aggregate the counts?

  1. A) The getDEE2 R package has a function called srx_agg() which can do this. Refer to the package manual for details. Alternatives include awk or even spreadsheet software (e.g. Excel). If using Excel or other spreadsheet software, be wary of its default behaviour of converting gene names and accession numbers to dates and scientific notation.
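
As another scripted alternative, the aggregation can be sketched in Python. The example below sums run-level counts into experiment-level counts using only the standard library; the table contents, the run-to-experiment map, and the function name `aggregate_replicates` are hypothetical, and the sketch does not reproduce the internals of srx_agg(). Gene names are read as plain text, so identifiers such as MARCH1 are not converted to dates.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-separated count table: rows are genes, columns are runs
counts_tsv = (
    "gene\tSRR0000001\tSRR0000002\tSRR0000003\n"
    "MARCH1\t3\t4\t10\n"
    "SEPT2\t1\t0\t6\n"
)

# Hypothetical map from each run (SRR) to its experiment (SRX)
run2exp = {
    "SRR0000001": "SRX0000001",
    "SRR0000002": "SRX0000001",
    "SRR0000003": "SRX0000002",
}

def aggregate_replicates(tsv_text, run2exp):
    """Sum counts of runs that belong to the same experiment, gene by gene."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    result = {}
    for row in reader:
        gene = row.pop("gene")  # kept as a string, never reinterpreted
        totals = defaultdict(int)
        for run, value in row.items():
            totals[run2exp[run]] += int(value)
        result[gene] = dict(totals)
    return result

agg = aggregate_replicates(counts_tsv, run2exp)
print(agg)
```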

See our YouTube clip

Get more help

Contact us by email (mark.ziemann[at]