reubenthomas 2023-05-25 06:19:13 -07:00
commit 7e70e4f48c
11 changed files with 128 additions and 11 deletions

Binary file not shown.

Binary file not shown.

@ -2,12 +2,13 @@
 [Link to wiki page](https://github.com/gladstone-institutes/Bioinformatics-Workshops/wiki/Introduction-to-RNA-Seq-Analysis)
 ### Description of files
-- Single_read.fastq (fastq file with a single read to understand the fastq file format)
-- Bacteria_GATTACA_L001_R1_001.fastq (single-end small practice data with 100k reads for the demo in the workshop)
-- Adapter_Sequence.fasta (fasta file with adapter sequence for demo with cutadapt)
-- rDNA_sequence.fasta (fasta file with the reference genome sequence for demo with STAR aligner)
-- rDNA.gtf (GTF file with the annotations for demo with featureCounts)
-- all_steps_wynton.sh (shell script for running all the analysis steps on UCSF Wynton command-line interface using the practice data provided)
-- steps_on_wynton_session1.txt (text file with steps used on wynton to download/uplaod files, change directories etc in session 1)
-- steps_on_wynton_session2.txt (text file with steps used on wynton to download/uplaod files, change directories etc in session 2)
-- all_steps_docker_desktop.sh (shell script for running all the analysis steps using Docker Desktop and the practice data provided)
+1. Single_read.fastq (fastq file with a single read to understand the fastq file format)
+2. Bacteria_GATTACA_L001_R1_001.fastq (single-end small practice data with 100k reads for the demo in the workshop)
+3. Adapter_Sequence.fasta (fasta file with adapter sequence for demo with cutadapt)
+4. rDNA_sequence.fasta (fasta file with the reference genome sequence for demo with STAR aligner)
+5. rDNA.gtf (GTF file with the annotations for demo with featureCounts)
+6. all_steps_wynton.sh (shell script for running all the analysis steps on UCSF Wynton command-line interface using the practice data provided)
+7. steps_on_wynton_part1.txt (text file with steps used on wynton to set up the folders, upload the data and create a singularity container)
+8. steps_on_wynton_part2.txt (text file with steps used on wynton to run the bulk RNA-seq analysis using the demo files)
+9. all_steps_docker_desktop_mac.sh (commands for running all the analysis steps using Docker Desktop and the demo files on macOS)
+10. all_steps_docker_desktop_windows.sh (commands for running all the analysis steps using Docker Desktop and the demo files on Windows)

@ -0,0 +1,56 @@
#!/bin/bash
# This script should be run on your local laptop or computer
# Please make sure you have installed Docker Desktop using the instructions at: https://www.docker.com/products/docker-desktop/
# Before running this script, do the following
## 1. Open Docker Desktop and make sure it is running
## 2. Download the workshop materials at https://github.com/gladstone-institutes/Bioinformatics-Workshops/raw/master/intro-rna-seq/Intro_to_RNA-seq_data_analysis.zip?raw=true
## 3. Unzip the workshop materials in the Downloads folder
## 4. Open a new terminal window
#check if docker is running
#the below command will print out the docker version if docker is running
docker --version
#get the docker image from https://hub.docker.com/r/nfcore/rnaseq/
#we will be using this image for all the analyses
docker pull nfcore/rnaseq
#check if the docker image was downloaded successfully
docker images
#spin up a docker container
#change the path below to the path on your computer where the downloaded materials are unzipped
##for M1 chip mac, use the below command
docker run --platform linux/amd64 --name rna_bash --rm -it -v ~/Downloads/Intro_to_RNA-seq_data_analysis:/home nfcore/rnaseq bash
##for all other computers, use the below command
docker run --name rna_bash --rm -it -v ~/Downloads/Intro_to_RNA-seq_data_analysis:/home nfcore/rnaseq bash
#a new prompt will appear and the container is now active
#the /home directory in the container should have all the contents of the downloaded materials folder
#go to the /home directory in the docker container
cd /home
#run fastqc
fastqc Bacteria_GATTACA_L001_R1_001.fastq
#trim the reads using cutadapt
cutadapt -a file:Adapter_Sequence.fasta -o trimmed.fastq Bacteria_GATTACA_L001_R1_001.fastq
#run fastqc on the trimmed reads
fastqc trimmed.fastq
#create a new folder
mkdir star_index
#create the STAR index
STAR --runMode genomeGenerate --genomeDir ./star_index --genomeFastaFiles rDNA_sequence.fasta --genomeSAindexNbases 3
#align the trimmed reads with STAR
#by default, STAR writes the alignments to Aligned.out.sam in the current directory
STAR --genomeDir ./star_index --readFilesIn ./trimmed.fastq
#generate the read count matrix using featureCounts
featureCounts -a rDNA.gtf -t CDS -o counts.txt Aligned.out.sam
# END #
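
The two `docker run` variants in this script differ only in the `--platform linux/amd64` flag. A minimal sketch of choosing the flag automatically is given below. It assumes that `uname -m` prints `arm64` on Apple silicon Macs, and `platform_flag` is a hypothetical helper name that is not part of the workshop materials.

```shell
#!/bin/bash
# platform_flag is a hypothetical helper, not part of the workshop script.
# It prints the extra docker flag for Apple silicon and nothing otherwise.
platform_flag() {
  # $1: machine architecture, as printed by `uname -m`
  case "$1" in
    arm64) echo "--platform linux/amd64" ;;
    *) echo "" ;;
  esac
}

# Usage on the host (not inside the container). It is left commented out
# so that sourcing this snippet does not start a container:
# docker run $(platform_flag "$(uname -m)") --name rna_bash --rm -it \
#   -v ~/Downloads/Intro_to_RNA-seq_data_analysis:/home nfcore/rnaseq bash
```

The unquoted `$(platform_flag ...)` in the usage line is deliberate: it lets the shell split the flag into two arguments, or drop it entirely when the helper prints nothing.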

@ -0,0 +1,60 @@
#!/bin/bash
# This script should be run on your local laptop or computer
# Please make sure you have installed Docker Desktop using the instructions at: https://www.docker.com/products/docker-desktop/
# Before running this script, do the following
## 1. Open Docker Desktop and make sure it is running
## If Docker Desktop gives a pop-up that it requires a newer WSL kernel version, open a new command prompt window and run the below command:
## $ wsl --update
## Restart Docker Desktop
## 2. Download the workshop materials at https://github.com/gladstone-institutes/Bioinformatics-Workshops/raw/master/intro-rna-seq/Intro_to_RNA-seq_data_analysis.zip?raw=true
## 3. Unzip the workshop materials in the Downloads folder
## 4. Open a new command prompt window
##run the below commands in the command prompt window
#check if docker is running
#the below command will print out the docker version if docker is running
docker --version
#get the docker image from https://hub.docker.com/r/nfcore/rnaseq/
#we will be using this image for all the analyses
docker pull nfcore/rnaseq
#check if the docker image was downloaded successfully
docker images
#spin up a docker container
#change the path below to the path on your computer where the downloaded materials are unzipped
docker run --name rna_bash --rm -it -v C:\Users\ayushi.agrawal\Downloads\Intro_to_RNA-seq_data_analysis\Intro_to_RNA-seq_data_analysis:/home nfcore/rnaseq bash
#a new prompt will appear and the container is now active
#the /home directory in the container should have all the contents of the downloaded materials folder
#go to the /home directory in the docker container
cd /home
#run fastqc
fastqc Bacteria_GATTACA_L001_R1_001.fastq
#trim the reads using cutadapt
cutadapt -a file:Adapter_Sequence.fasta -o trimmed.fastq Bacteria_GATTACA_L001_R1_001.fastq
#run fastqc on the trimmed reads
fastqc trimmed.fastq
#STAR does not work on Windows, so we will use another aligner, HISAT2, for the demo
#create a new folder
mkdir hisat2_index
#create the hisat2 index
hisat2-build -p 7 rDNA_sequence.fasta hisat2_index/rDNA_
#run hisat2 for the trimmed reads
#HISAT2 and STAR are different aligners with different defaults, scoring algorithms, etc. This might result in different outputs from the two aligners.
hisat2 -p 7 -x hisat2_index/rDNA_ -U ./trimmed.fastq -S hisat2_aligned_out.sam
#generate the read count matrix using featureCounts
featureCounts -a rDNA.gtf -t CDS -o counts.txt hisat2_aligned_out.sam
# END #
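
As a quick sanity check on the FASTQ files used in the steps above, the number of reads can be counted from the shell. The sketch below assumes uncompressed FASTQ files with exactly four lines per read (no wrapped sequence lines) and that `awk` is available in the container. `count_reads` is a hypothetical helper name, not part of the workshop materials.

```shell
#!/bin/bash
# count_reads is a hypothetical helper, not part of the workshop script.
# It prints the number of reads in an uncompressed FASTQ file,
# assuming each read occupies exactly four lines.
count_reads() {
  # $1: path to the FASTQ file
  awk 'END { print int(NR / 4) }' "$1"
}

# Report the read count for the raw and trimmed files if they exist
# in the current directory.
for f in Bacteria_GATTACA_L001_R1_001.fastq trimmed.fastq; do
  if [ -f "$f" ]; then
    echo "$f: $(count_reads "$f") reads"
  fi
done
```

Because the `cutadapt` command in this script sets no length filter, the two counts should normally match. A mismatch would suggest a truncated or incomplete file.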

@ -9,8 +9,8 @@
 #enter your wynton password when prompted and hit enter
 #once you are logged in to wynton,
-#list the contents of the home diretory or ~
-#the uploaded folder Intro_to_RNA-seq_data_analysis shoudl appear in the result
+#list the contents of the home directory or ~
+#the uploaded folder Intro_to_RNA-seq_data_analysis should appear in the result
 [alice@log2 ~]$ ls
 #login to the development node