Plant Bioinformatics Vol 3 RNA-Seq Tutorial

Goal

This tutorial is an online version of an RNA-Seq tutorial developed for Vol. 3 of Plant Bioinformatics. It walks through an end-to-end RNA-Seq analysis using Kallisto and Sleuth, and emphasizes the reproducibility features of the CyVerse platforms.


Tutorial Maintainer(s)

Contact the maintainer listed below if this guide needs fixing. You can also email learning@CyVerse.org.

Maintainer | Institution | Contact
Jason Williams | CyVerse / Cold Spring Harbor Laboratory | williams@cshl.edu

Prerequisites

Downloads, access, and services

To complete this tutorial, you will need access to the following services and software:

Prerequisite | Preparation/Notes | Link/Download
CyVerse account | You will need a CyVerse account to complete this exercise | CyVerse User Portal
Spreadsheet software | Software for editing spreadsheet data | User provided (e.g. Excel, Google Sheets, or Open Office)
Cyberduck (optional) | Standalone software for upload/download to Data Store | Download Cyberduck

Platform(s)

We will use the following CyVerse platform(s):

Platform | Interface | Link | Platform Tour
Discovery Environment and Data Store | Web/Point-and-click | Discovery Environment | Discovery Environment guide

Application(s) used

Discovery Environment App(s):

App name | Version | Description | App link | Notes/other links
sra-tools prefetch | SRA tools version 2.8.10, CyVerse App version 0.1 | Utility for downloading data from NCBI Sequence Read Archive | sra-tools prefetch app | SRA Tools Documentation
sra-tools vdb-validate | SRA tools version 2.8.10, CyVerse App version 0.1 | Utility for validating downloaded data from NCBI Sequence Read Archive | sra-tools vdb-validate app | SRA Tools Documentation
sra-tools fasterq-dump | SRA tools version 2.8.10, CyVerse App version 0.1 | Convert SRA files to FastQ format | sra-tools fasterq-dump app | SRA Tools Documentation
FastQC | 0.11.5 | Quality control reports for FastQ files | FastQC app | FastQC Documentation
Kallisto | 0.43.1 | RNA-Seq quantification by pseudoalignment | Kallisto app | Kallisto Documentation
Sleuth | 0.30.0 | VICE application of RStudio with Sleuth and related R packages | Sleuth app | Sleuth Documentation
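
The Discovery Environment apps listed above wrap command-line tools, so it can help to see the order in which they run. The following is a minimal sketch, not the tutorial's own method: it only builds the command lines for one run and prints them, without executing anything. The accession `SRR0000000`, the file names `transcripts.fa` and `transcripts.idx`, the output folder name, and the bootstrap count are placeholders, and the sketch assumes paired-end reads. Sleuth is omitted because it runs interactively in RStudio rather than as a single command.

```python
from typing import List


def build_commands(
    accession: str,
    transcriptome: str = "transcripts.fa",  # placeholder file name
    index: str = "transcripts.idx",         # placeholder file name
    bootstraps: int = 100,                  # assumed value, not from the tutorial
) -> List[List[str]]:
    """Return the command lines, in order, for one SRA run. Nothing is executed."""
    # Assumption: paired-end data, so fasterq-dump writes two FastQ files.
    r1 = f"{accession}_1.fastq"
    r2 = f"{accession}_2.fastq"
    return [
        ["prefetch", accession],                       # download the .sra file
        ["vdb-validate", accession],                   # check the download
        ["fasterq-dump", "--split-files", accession],  # convert to FastQ
        ["fastqc", r1, r2],                            # quality control reports
        ["kallisto", "index", "-i", index, transcriptome],
        ["kallisto", "quant", "-i", index, "-o", f"{accession}_quant",
         "-b", str(bootstraps), r1, r2],
    ]


for cmd in build_commands("SRR0000000"):
    print(" ".join(cmd))
```

Single-end reads would need different `kallisto quant` options, so treat the last command as illustrative only.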

Input and example data

To complete this tutorial, you will need to have the following inputs prepared:

Input File(s) | Format | Preparation/Notes | Example Data
SRA files (from NCBI Sequence Read Archive) or FastQ files | ".sra", ".fastq" | This tutorial starts with importing data from the SRA. You could also start at "Organize files, validate import, and extract to FastQ format". | CyVerse Data Commons
SRA Metadata, custom metadata | ".csv" | This tutorial uses metadata provided from the SRA. If using your own FastQ files, you may create and use any metadata you wish. | CyVerse Data Commons 2
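
To illustrate how a metadata ".csv" file can feed the later analysis, here is a minimal sketch that turns run metadata into a sample table with `sample`, `condition`, and `path` columns, the general shape Sleuth expects. Everything specific here is an assumption for illustration: the accessions are made up, the `treatment` column name is hypothetical (check the header of your own file), and `kallisto_results` is a placeholder folder name.

```python
import csv
import io
from typing import Dict, List

# Made-up example rows; real SRA metadata has many more columns.
EXAMPLE_CSV = """Run,treatment
SRR0000001,control
SRR0000002,control
SRR0000003,melatonin
SRR0000004,auxin
"""


def sample_table(
    csv_text: str,
    run_col: str = "Run",              # run accession column
    condition_col: str = "treatment",  # hypothetical column name
    results_dir: str = "kallisto_results",  # placeholder folder name
) -> List[Dict[str, str]]:
    """Map each run to its condition and the folder holding its Kallisto output."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        run = record[run_col].strip()
        rows.append({
            "sample": run,
            "condition": record[condition_col].strip(),
            "path": f"{results_dir}/{run}",
        })
    return rows


for row in sample_table(EXAMPLE_CSV):
    print(row["sample"], row["condition"], row["path"])
```

If you use your own FastQ files, the same idea applies: any column that distinguishes your experimental groups can serve as the condition.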

Sample data

The data for this tutorial comes from Zia et al. (2019), which used RNA-Seq to explore overlap in signaling pathways in Arabidopsis treated with the hormones melatonin and auxin. Although these hormones have similar chemical structures (both are indoles), the study identified distinct signaling pathways and changes in gene expression. The dataset is available on the SRA under BioProject PRJNA553702. We will attempt to replicate the RNA-Seq portion of this analysis.

Zia, S. F., Berkowitz, O., Bedon, F., Whelan, J., Franks, A. E., & Plummer, K. M. (2019). Direct comparison of Arabidopsis gene expression reveals different responses to melatonin versus auxin. BMC Plant Biology. https://doi.org/10.1186/s12870-019-2158-3


Last update: 2022-11-28