Plant Bioinformatics Vol. 3 RNA-Seq Tutorial
Goal
This tutorial is an online version of an RNA-Seq tutorial developed for Vol. 3 of Plant Bioinformatics. It walks through an end-to-end RNA-Seq analysis using Kallisto and Sleuth, and emphasizes the reproducibility features of the CyVerse platforms.
Tutorial Maintainer(s)
Contact the maintainer listed below if this guide needs fixing. You can also email learning@CyVerse.org.
Maintainer | Institution | Contact |
---|---|---|
Jason Williams | CyVerse / Cold Spring Harbor Laboratory | williams@cshl.edu |
Prerequisites
Downloads, access, and services
To complete this tutorial, you will need access to the following services and software:
Prerequisite | Preparation/Notes | Link/Download |
---|---|---|
CyVerse account | You will need a CyVerse account to complete this exercise | CyVerse User Portal |
Spreadsheet software | Software for editing spreadsheet data | User provided (e.g., Excel, Google Sheets, or OpenOffice) |
Cyberduck (optional) | Standalone software for upload/download to Data Store | Download Cyberduck |
Platform(s)
We will use the following CyVerse platform(s):
Platform | Interface | Link | Platform Tour |
---|---|---|---|
Discovery Environment and Data Store | Web/Point-and-click | Discovery Environment | Discovery Environment guide |
Application(s) used
Discovery Environment App(s):
App name | Version | Description | App link | Notes/other links |
---|---|---|---|---|
sra-tools prefetch | SRA tools version 2.8.10, CyVerse App version 0.1 | Utility for downloading data from NCBI Sequence Read Archive | sra-tools prefetch app | SRA Tools Documentation |
sra-tools vdb-validate | SRA tools version 2.8.10, CyVerse App version 0.1 | Utility for validating downloaded data from NCBI Sequence Read Archive | sra-tools vdb-validate app | SRA Tools Documentation |
sra-tools fasterq-dump | SRA tools version 2.8.10, CyVerse App version 0.1 | Convert SRA files to FastQ format | sra-tools fasterq-dump app | SRA Tools Documentation |
FastQC | 0.11.5 | Quality control reports for FastQ files | FastQC app | FastQC Documentation |
Kallisto | 0.43.1 | RNA-Seq quantification by pseudoalignment | Kallisto app | Kallisto Documentation |
Sleuth | 0.30.0 | VICE application of RStudio with Sleuth and related R packages | Sleuth app | Sleuth Documentation |
Input and example data
To complete this tutorial, you will need to have the following inputs prepared:
Input File(s) | Format | Preparation/Notes | Example Data |
---|---|---|---|
SRA files (from NCBI Sequence Read Archive) or FastQ files | ".sra", ".fastq" | This tutorial starts with importing data from the SRA. If you already have FastQ files, you could instead start at "Organize files, validate import, and extract to FastQ format". | CyVerse Data Commons |
SRA Metadata, custom metadata | ".csv" | This tutorial uses metadata provided by the SRA. If using your own FastQ files, you may create and use any metadata you wish. | CyVerse Data Commons 2 |
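Before starting, it can help to check which runs and sample groups a ".csv" metadata file describes. The following is a minimal sketch in Python, not part of the tutorial's workflow. The column names (`Run`, `BioProject`, `treatment`), the helper `runs_by_group`, and the run accessions are assumptions made for illustration; check the header of your own metadata file and adjust accordingly.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of an SRA-style metadata table.
# The accessions and column names are placeholders, not real tutorial data.
EXAMPLE_CSV = """Run,BioProject,treatment
SRR0000001,PRJNA553702,control
SRR0000002,PRJNA553702,melatonin
SRR0000003,PRJNA553702,auxin
SRR0000004,PRJNA000000,control
"""


def runs_by_group(handle, bioproject, group_column="treatment"):
    """Return {group value: [run accessions]} for a single BioProject."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        # Skip rows that belong to other projects
        if row.get("BioProject") != bioproject:
            continue
        groups[row.get(group_column, "unknown")].append(row["Run"])
    return dict(groups)


# Read from an in-memory string here; for a real file, pass
# open("your_metadata.csv", newline="") instead.
groups = runs_by_group(io.StringIO(EXAMPLE_CSV), "PRJNA553702")
for treatment, runs in sorted(groups.items()):
    print(treatment, runs)
```

Grouping the runs this way gives a quick sanity check that every condition has the replicates you expect before any data is downloaded.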
Sample data
The data for this tutorial come from Zia et al. (2019), who used RNA-Seq to explore overlap in signaling pathways in Arabidopsis treated with the hormones melatonin and auxin. Although these hormones have similar chemical structures (both are indoles), the study identified distinct signaling pathways and changes in gene expression. The dataset is available on the SRA under BioProject PRJNA553702. We will attempt to replicate the RNA-Seq portion of this analysis.
Zia, S. F., Berkowitz, O., Bedon, F., Whelan, J., Franks, A. E., & Plummer, K. M. (2019). Direct comparison of Arabidopsis gene expression reveals different responses to melatonin versus auxin. BMC Plant Biology. https://doi.org/10.1186/s12870-019-2158-3