mirror of
https://github.com/gladstone-institutes/Bioinformatics-Workshops.git
synced 2025-11-30 09:45:43 -08:00
closes #20
parent
094d41cd0d
commit
fd2fc5b190
6 changed files with 1040 additions and 1502 deletions
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
|
|
@ -1,7 +1,8 @@
|
||||||
---
|
---
|
||||||
title: "Introduction to R Data Analysis - Part 1"
|
title: "Introduction to R Data Analysis"
|
||||||
|
subtitle: "Part 1"
|
||||||
author: "Natalie Elphick"
|
author: "Natalie Elphick"
|
||||||
date: "January 22nd, 2024"
|
date: "May 20th, 2024"
|
||||||
knit: (function(input, ...) {
|
knit: (function(input, ...) {
|
||||||
rmarkdown::render(
|
rmarkdown::render(
|
||||||
input,
|
input,
|
||||||
|
|
@ -16,6 +17,7 @@ output:
|
||||||
|
|
||||||
```{r, setup, include=FALSE}
|
```{r, setup, include=FALSE}
|
||||||
library(tidyverse)
|
library(tidyverse)
|
||||||
|
knitr::opts_chunk$set(comment = "")
|
||||||
```
|
```
|
||||||
|
|
||||||
##
|
##
|
||||||
|
|
@ -25,10 +27,10 @@ library(tidyverse)
|
||||||
## Introductions
|
## Introductions
|
||||||
|
|
||||||
**Natalie Elphick**
|
**Natalie Elphick**
|
||||||
Bioinformatician I
|
Bioinformatician I
|
||||||
|
|
||||||
**Michela Traglia (TA)**
|
**Yihang Xin (Online TA)**
|
||||||
Senior Statistician
|
Software Engineer III
|
||||||
|
|
||||||
|
|
||||||
## Poll 1
|
## Poll 1
|
||||||
|
|
@ -36,7 +38,7 @@ Senior Statistician
|
||||||
**What is your level of experience with coding/data analysis?**
|
**What is your level of experience with coding/data analysis?**
|
||||||
|
|
||||||
1. I know another data analysis programming language (Python, Matlab etc.)
|
1. I know another data analysis programming language (Python, Matlab etc.)
|
||||||
2. I can use Excel to do linear regression
|
2. I can use Excel
|
||||||
3. I know some R
|
3. I know some R
|
||||||
4. All of the above
|
4. All of the above
|
||||||
5. None of the above
|
5. None of the above
|
||||||
|
|
@ -50,8 +52,8 @@ Senior Statistician
|
||||||
1. What is R and why should you use it?
|
1. What is R and why should you use it?
|
||||||
2. The RStudio interface
|
2. The RStudio interface
|
||||||
3. File types
|
3. File types
|
||||||
4. Error messages
|
4. Variables
|
||||||
5. Variables
|
5. Error and warning messages
|
||||||
6. Types & data structures
|
6. Types & data structures
|
||||||
7. Math and logic operations
|
7. Math and logic operations
|
||||||
8. Functions and packages
|
8. Functions and packages
|
||||||
|
|
@ -86,9 +88,9 @@ functionality
|
||||||
# RStudio
|
# RStudio
|
||||||
|
|
||||||
## RStudio
|
## RStudio
|
||||||
- RStudio is an integrated development
|
- RStudio is an integrated development
|
||||||
environment (IDE)
|
environment (IDE)
|
||||||
- It makes R code easier to write by providing a
|
- It is an app that makes R code easier to write by providing a
|
||||||
feature rich graphical user interface (GUI)
|
feature rich graphical user interface (GUI)
|
||||||
|
|
||||||
<br>
|
<br>
|
||||||
|
|
@ -131,8 +133,6 @@ feature rich graphical user interface (GUI)
|
||||||
## Variable definition
|
## Variable definition
|
||||||
- Variables store information that is referenced and manipulated
|
- Variables store information that is referenced and manipulated
|
||||||
in a computer program
|
in a computer program
|
||||||
- In contrast to the mathematical definition of a variable,
|
|
||||||
variables in computer science are _mutable_
|
|
||||||
- There are 3 ways to define variables in R, but one is preferred:
|
- There are 3 ways to define variables in R, but one is preferred:
|
||||||
```{r}
|
```{r}
|
||||||
x <- 1 # Preferred way
|
x <- 1 # Preferred way
|
||||||
|
|
@ -141,7 +141,59 @@ x = 1
|
||||||
print(x)
|
print(x)
|
||||||
```
|
```
|
||||||
|
|
||||||
## Variable naming
|
## Example
|
||||||
|
- Run the following in the R console:
|
||||||
|
|
||||||
|
```{r}
|
||||||
|
x <- 1
|
||||||
|
y <- 4
|
||||||
|
z <- y
|
||||||
|
x + y + z
|
||||||
|
```
|
||||||
|
|
||||||
|
# Error and Warning Messages
|
||||||
|
|
||||||
|
## Errors
|
||||||
|
- **Errors**: Stop the execution of your code and must be fixed for the code to run successfully
|
||||||
|
|
||||||
|
```{r, eval=FALSE}
|
||||||
|
x <- 5
|
||||||
|
y <- 10
|
||||||
|
z <- x + a
|
||||||
|
```
|
||||||
|
|
||||||
|
```{r,echo=FALSE}
|
||||||
|
message("Error: object 'a' not found")
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
## Common Errors
|
||||||
|
|
||||||
|
- **Syntax Error:** Invalid R code syntax (e.g. misplaced parentheses)
|
||||||
|
```{r,echo=FALSE}
|
||||||
|
message('Error: unexpected ")"')
|
||||||
|
```
|
||||||
|
|
||||||
|
- **Object not found:** This variable is not defined (e.g. misspelled variables)
|
||||||
|
|
||||||
|
```{r,echo=FALSE}
|
||||||
|
message("Error: object 'a' not found")
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
See this [article](https://statsandr.com/blog/top-10-errors-in-r/) for more common errors and how to fix them.
|
||||||
|
|
||||||
|
## Warnings
|
||||||
|
- **Warnings**: Do not stop the execution of your code, but indicate potential issues that you should be aware of and might need to address
|
||||||
|
|
||||||
|
```{r}
|
||||||
|
a <- c(1, 2, 3, 4, 5)
|
||||||
|
b <- c(6, 7, 8, 9)
|
||||||
|
result <- a + b
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
## Variable Naming
|
||||||
|
|
||||||
- Variable names must start with a letter and can contain
|
- Variable names must start with a letter and can contain
|
||||||
underscores and periods
|
underscores and periods
|
||||||
|
|
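Note on the "Errors" and "Warnings" slides added in the hunk above: the following is a minimal base-R sketch, not part of the commit, of how the two conditions behave at run time. Only `a`, `b`, and `result` come from the slide; `undefined_value`, `warning_text`, `error_text`, and the handler code are illustrative assumptions.

```r
# Illustrative sketch only; not part of the workshop materials.
a <- c(1, 2, 3, 4, 5)
b <- c(6, 7, 8, 9)

# A warning does not stop execution: the shorter vector is recycled
# and `result` is still assigned.
warning_text <- NULL
result <- withCallingHandlers(
  a + b,
  warning = function(w) {
    warning_text <<- conditionMessage(w)  # keep the message for inspection
    invokeRestart("muffleWarning")        # continue without printing it
  }
)
print(result)
print(warning_text)

# An error stops execution: the assignment to `z` never happens.
# `undefined_value` is a hypothetical name that is deliberately not defined.
error_text <- tryCatch(
  {
    z <- a + undefined_value
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(error_text)
```

Here `result` still holds five values because the first element of `b` is reused for the fifth element of `a`, which is exactly the situation the warning is flagging.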
@ -176,7 +228,7 @@ DogBreeds <- c("Labrador Retriever", "Akita", "Bulldog")
|
||||||
|
|
||||||
## Data Types
|
## Data Types
|
||||||
- Integer
|
- Integer
|
||||||
- Whole numbers (in R denote with L ex. 1L,2L)
|
- Whole numbers (in R denoted with L ex. 1L,2L)
|
||||||
- Numeric
|
- Numeric
|
||||||
- Decimal numbers
|
- Decimal numbers
|
||||||
- Logical
|
- Logical
|
||||||
|
|
@ -191,7 +243,7 @@ DogBreeds <- c("Labrador Retriever", "Akita", "Bulldog")
|
||||||
**Which of these is not the correct data type for the value?**
|
**Which of these is not the correct data type for the value?**
|
||||||
|
|
||||||
1. 1.5 - Numeric
|
1. 1.5 - Numeric
|
||||||
2. "Labrador Retriever" - Character
|
2. "1" - Character
|
||||||
3. NA - Logical
|
3. NA - Logical
|
||||||
4. 1 - Integer
|
4. 1 - Integer
|
||||||
|
|
||||||
|
|
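Note on the data types poll in the hunk above: a minimal base-R sketch, not part of the commit, for checking each poll option in the console. The `values` and `types` names are illustrative assumptions; `class()`, `deparse()`, and `is.integer()` are base R functions.

```r
# Illustrative sketch only; not part of the workshop materials.
# The poll values, plus 1L for comparison with the unsuffixed 1.
values <- list(1.5, "1", NA, 1, 1L)

# class() reports the type name R uses for each value.
types <- vapply(values, class, character(1))

for (i in seq_along(values)) {
  # deparse() shows the value as it would be typed in the console.
  cat(deparse(values[[i]]), "->", types[i],
      "| is.integer:", is.integer(values[[i]]), "\n")
}
```

A number typed without the `L` suffix is stored as numeric, which is why the poll option "1 - Integer" is the mismatched one.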
@ -227,6 +279,8 @@ DogBreeds <- c("Labrador Retriever", "Akita", "Bulldog")
|
||||||
countdown::countdown(minutes = 10,
|
countdown::countdown(minutes = 10,
|
||||||
seconds = 0,
|
seconds = 0,
|
||||||
color_border = "black",
|
color_border = "black",
|
||||||
|
color_running_background = "#47d193",
|
||||||
|
color_finished_background = "#a3184e",
|
||||||
padding = "50px",
|
padding = "50px",
|
||||||
margin = "5%",
|
margin = "5%",
|
||||||
font_size = "5em",
|
font_size = "5em",
|
||||||
|
|
@ -297,6 +351,7 @@ x & !y
|
||||||
execution of code
|
execution of code
|
||||||
|
|
||||||
```{r}
|
```{r}
|
||||||
|
dog_breeds <- c("Labrador Retriever", "Akita", "Bulldog")
|
||||||
if ("Akita" %in% dog_breeds) {
|
if ("Akita" %in% dog_breeds) {
|
||||||
print("dog_breeds already contains Akita")
|
print("dog_breeds already contains Akita")
|
||||||
} else {
|
} else {
|
||||||
|
|
@ -379,11 +434,17 @@ library(ggplot2) # Makes all of the ggplot2 functions available
|
||||||
- The tidyverse is a collection of commonly used data analysis
|
- The tidyverse is a collection of commonly used data analysis
|
||||||
packages
|
packages
|
||||||
- Learning curve is less steep
|
- Learning curve is less steep
|
||||||
- Lots of useful packages for data analysis
|
- Lots of useful packages for cleaning and "wrangling" data into the correct format
|
||||||
|
|
||||||
##
|
## Why use Tidyverse Packages?
|
||||||
|
|
||||||

|
- Most of the work in data analysis is getting data into the correct format to create outputs
|
||||||
|
- The tidyverse collection of packages simplifies this process
|
||||||
|
- Intuitive syntax
|
||||||
|
- Comprehensive (data manipulation, cleaning, modeling and graphics)
|
||||||
|
- Consistent data structure
|
||||||
|
- Strong community support
|
||||||
|
|
||||||
|
|
||||||
# End of Part 1
|
# End of Part 1
|
||||||
|
|
||||||
|
|
@ -392,11 +453,8 @@ packages
|
||||||
|
|
||||||
## Upcoming Workshops
|
## Upcoming Workshops
|
||||||
|
|
||||||
1. [Introduction to Statistics, Experimental Design, and Hypothesis Testing](https://gladstone.org/index.php/events/introduction-statistics-experimental-design-and-hypothesis-testing-0)
|
[Single Cell ATAC-Seq Data Analysis Part 2](https://gladstone.org/events/single-cell-atac-seq-data-analysis-part-2-1)
|
||||||
- Jan 25, 2024 (Session 1 - 10am–12pm) (Session 2 - 1pm–3pm)
|
|
||||||
- Jan 26, 2024 (Session 3 - 10am–12pm)
|
|
||||||
|
|
||||||
2. [Intermediate RNA-Seq Analysis Using R](https://gladstone.org/index.php/events/intermediate-rna-seq-analysis-using-r-4)
|
- Check [this link](https://gladstone.org/events?series=data-science-training-program) at the end of the summer for our fall workshop schedule
|
||||||
- Feb 1, 2024 (9:30am-12:00pm)
|
|
||||||
|
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -1,7 +1,8 @@
|
||||||
---
|
---
|
||||||
title: "Introduction to R Data Analysis - Part 2"
|
title: "Introduction to R Data Analysis"
|
||||||
|
subtitle: "Part 2"
|
||||||
author: "Natalie Elphick"
|
author: "Natalie Elphick"
|
||||||
date: "January 23rd, 2024"
|
date: "May 21st, 2024"
|
||||||
knit: (function(input, ...) {
|
knit: (function(input, ...) {
|
||||||
rmarkdown::render(
|
rmarkdown::render(
|
||||||
input,
|
input,
|
||||||
|
|
@ -19,6 +20,7 @@ library(kableExtra)
|
||||||
library(tidyverse)
|
library(tidyverse)
|
||||||
library(readxl)
|
library(readxl)
|
||||||
theme_set(theme_grey(base_size = 16))
|
theme_set(theme_grey(base_size = 16))
|
||||||
|
knitr::opts_chunk$set(comment = "")
|
||||||
```
|
```
|
||||||
|
|
||||||
##
|
##
|
||||||
|
|
@ -29,7 +31,10 @@ theme_set(theme_grey(base_size = 16))
|
||||||
**Natalie Elphick**
|
**Natalie Elphick**
|
||||||
Bioinformatician I
|
Bioinformatician I
|
||||||
|
|
||||||
**Yihang Xin (TA)**
|
**Michela Traglia (In Person TA)**
|
||||||
|
Senior Statistician
|
||||||
|
|
||||||
|
**Yihang Xin (Online TA)**
|
||||||
Software Engineer III
|
Software Engineer III
|
||||||
|
|
||||||
# Schedule
|
# Schedule
|
||||||
|
|
@ -46,11 +51,11 @@ Software Engineer III
|
||||||
- The tidyverse packages work well together because they share
|
- The tidyverse packages work well together because they share
|
||||||
common data representations and design principles
|
common data representations and design principles
|
||||||
- Rows = observations, columns = variables
|
- Rows = observations, columns = variables
|
||||||
- [ggplot2](), for data visualization.
|
- [ggplot2](https://ggplot2.tidyverse.org/), for data visualization.
|
||||||
- [dplyr](), for data manipulation.
|
- [dplyr](https://dplyr.tidyverse.org/), for data manipulation.
|
||||||
- [tidyr](), for data tidying.
|
- [tidyr](https://tidyr.tidyverse.org/), for data tidying.
|
||||||
- [readr](), for data import.
|
- [readr](https://readr.tidyverse.org/), for data import.
|
||||||
- [purrr](), for iteration.
|
- [purrr](https://purrr.tidyverse.org/), for iteration.
|
||||||
- and more..
|
- and more..
|
||||||
|
|
||||||
## dplyr
|
## dplyr
|
||||||
|
|
@ -67,66 +72,38 @@ common data representations and design principles
|
||||||
|
|
||||||
## Example Dataframe
|
## Example Dataframe
|
||||||
- mpg is a dataframe built into the ggplot2 package
|
- mpg is a dataframe built into the ggplot2 package
|
||||||
```{r, eval = FALSE}
|
```{r}
|
||||||
head(mpg)
|
head(mpg)
|
||||||
```
|
```
|
||||||
|
|
||||||
```{r, echo = FALSE}
|
|
||||||
head(mpg) |>
|
|
||||||
kable() |>
|
|
||||||
kable_styling("striped") |>
|
|
||||||
scroll_box(width = "100%")
|
|
||||||
```
|
|
||||||
|
|
||||||
## Select Columns
|
## Select Columns
|
||||||
|
|
||||||
```{r, eval = FALSE}
|
```{r}
|
||||||
select(.data = mpg,
|
select(.data = mpg,
|
||||||
year, cty, hwy, manufacturer)
|
year, cty, hwy, manufacturer)
|
||||||
```
|
```
|
||||||
|
|
||||||
```{r, echo = FALSE}
|
|
||||||
select(.data = mpg,
|
|
||||||
year, cty, hwy, manufacturer) |>
|
|
||||||
head() |>
|
|
||||||
kable() |>
|
|
||||||
kable_styling("striped") |>
|
|
||||||
scroll_box(width = "100%")
|
|
||||||
```
|
|
||||||
|
|
||||||
|
|
||||||
## Filter Rows
|
## Filter Rows
|
||||||
|
|
||||||
|
|
||||||
```{r, eval = FALSE}
|
```{r}
|
||||||
filter(.data = mpg,
|
filter(.data = mpg,
|
||||||
year == 2008)
|
year == 2008)
|
||||||
```
|
```
|
||||||
|
|
||||||
```{r, echo = FALSE}
|
|
||||||
filter(.data = mpg,
|
|
||||||
year == 2008) |>
|
|
||||||
head() |>
|
|
||||||
kable() |>
|
|
||||||
kable_styling("striped") |>
|
|
||||||
scroll_box(width = "100%")
|
|
||||||
```
|
|
||||||
## Arrange Rows
|
## Arrange Rows
|
||||||
|
|
||||||
- desc() is used to arrange rows in descending order, the default is ascending
|
- desc() is used to arrange rows in descending order, the default is ascending
|
||||||
```{r, eval = FALSE}
|
```{r}
|
||||||
arrange(.data = mpg,
|
arrange(.data = mpg,
|
||||||
desc(cyl))
|
desc(cty))
|
||||||
```
|
```
|
||||||
|
|
||||||
```{r, echo = FALSE}
|
|
||||||
arrange(.data = mpg,
|
|
||||||
desc(cyl)) |>
|
|
||||||
head(n = 3) |>
|
|
||||||
kable() |>
|
|
||||||
kable_styling("striped") |>
|
|
||||||
scroll_box(width = "100%")
|
|
||||||
```
|
|
||||||
## Summarising data
|
## Summarising data
|
||||||
- The dplyr **summarise()** function computes a table of
|
- The dplyr **summarise()** function computes a table of
|
||||||
summaries for a data frame
|
summaries for a data frame
|
||||||
|
|
@ -136,6 +113,9 @@ variable(s)
|
||||||
different categorical groupings
|
different categorical groupings
|
||||||
|
|
||||||
## Group and Summarise
|
## Group and Summarise
|
||||||
|
|
||||||
|
- Get the mean and median city mileage within manufacturer
|
||||||
|
|
||||||
```{r, eval = FALSE}
|
```{r, eval = FALSE}
|
||||||
summarise(group_by(.data = mpg,
|
summarise(group_by(.data = mpg,
|
||||||
manufacturer),
|
manufacturer),
|
||||||
|
|
@ -144,37 +124,27 @@ summarise(group_by(.data = mpg,
|
||||||
```
|
```
|
||||||
|
|
||||||
```{r, echo = FALSE}
|
```{r, echo = FALSE}
|
||||||
summarise(group_by(.data = mpg,
|
summarise(.data = group_by(.data = mpg,
|
||||||
manufacturer),
|
manufacturer),
|
||||||
mean_cty = mean(cty),
|
mean_cty = mean(cty),
|
||||||
median_cty = median(cty)) |>
|
median_cty = median(cty)) |>
|
||||||
head() |>
|
head(10)
|
||||||
kable() |>
|
|
||||||
kable_styling("striped") |>
|
|
||||||
scroll_box(width = "100%")
|
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
## The pipe operator |>
|
## The pipe operator |>
|
||||||
- Allows "chaining" of function calls to make code more readable
|
- Allows "chaining" of function calls to make code more readable
|
||||||
```{r, eval = FALSE}
|
```{r}
|
||||||
mpg |>
|
|
||||||
group_by(manufacturer) |>
|
|
||||||
summarise(mean_cty = mean(cty),
|
|
||||||
median_cty = median(cty))
|
|
||||||
```
|
|
||||||
|
|
||||||
```{r, echo = FALSE}
|
|
||||||
mpg |>
|
mpg |>
|
||||||
group_by(manufacturer) |>
|
group_by(manufacturer) |>
|
||||||
summarise(mean_cty = mean(cty),
|
summarise(mean_cty = mean(cty),
|
||||||
median_cty = median(cty)) |>
|
median_cty = median(cty)) |>
|
||||||
head(n = 4) |>
|
head(5)
|
||||||
kable() |>
|
|
||||||
kable_styling("striped") |>
|
|
||||||
scroll_box(width = "100%")
|
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
# Plotting
|
# Plotting
|
||||||
|
|
||||||
## ggplot2
|
## ggplot2
|
||||||
|
|
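Note on the "Group and Summarise" and pipe operator slides in the hunk above: a minimal sketch, not part of the commit, comparing the nested call with the piped one. It assumes the dplyr and ggplot2 packages are installed and R is version 4.1 or later (needed for the native `|>` pipe); the `nested` and `piped` names are illustrative.

```r
# Illustrative sketch only; assumes dplyr and ggplot2 are installed.
library(dplyr)

mpg <- ggplot2::mpg  # example dataframe used on the slides

# Nested form: read from the inside out.
nested <- summarise(group_by(mpg, manufacturer),
                    mean_cty = mean(cty),
                    median_cty = median(cty))

# Piped form: read top to bottom. `x |> f(y)` is evaluated as `f(x, y)`.
piped <- mpg |>
  group_by(manufacturer) |>
  summarise(mean_cty = mean(cty),
            median_cty = median(cty))

# Compare the two results as plain data frames.
all.equal(as.data.frame(nested), as.data.frame(piped))
```

Both forms make the same function calls in the same order; the pipe only changes how the code reads.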
@ -204,11 +174,10 @@ ggplot(data = mpg, # Input dataframe
|
||||||
## Adding and Modifying Layers
|
## Adding and Modifying Layers
|
||||||
|
|
||||||
```{r, fig.dim=c(10,4)}
|
```{r, fig.dim=c(10,4)}
|
||||||
ggplot(data = mpg,
|
ggplot(data = mpg,
|
||||||
mapping = aes(x = class, y = cty, fill = class)) +
|
mapping = aes(x = cty, y = hwy)) +
|
||||||
geom_violin() +
|
geom_point() +
|
||||||
geom_boxplot(width = 0.1,
|
geom_smooth(formula = y ~ x, method = "lm")
|
||||||
fill = "white")
|
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -221,6 +190,8 @@ ggplot(data = mpg,
|
||||||
countdown::countdown(minutes = 10,
|
countdown::countdown(minutes = 10,
|
||||||
seconds = 0,
|
seconds = 0,
|
||||||
color_border = "black",
|
color_border = "black",
|
||||||
|
color_running_background = "#47d193",
|
||||||
|
color_finished_background = "#a3184e",
|
||||||
padding = "50px",
|
padding = "50px",
|
||||||
margin = "5%",
|
margin = "5%",
|
||||||
font_size = "5em",
|
font_size = "5em",
|
||||||
|
|
@ -234,8 +205,8 @@ countdown::countdown(minutes = 10,
|
||||||
|
|
||||||
## Dataset Description
|
## Dataset Description
|
||||||
- PanTHERIA
|
- PanTHERIA
|
||||||
- A global species-level data set of key life-history, ecological and geographical traits of all known extant and recently extinct mammals compiled from the literature
|
- A global species-level data set of key traits of all known extant and recently extinct mammals compiled from literature
|
||||||
- Macroecological and macroevolutionary research projects
|
- Used in macroecological and macroevolutionary research projects
|
||||||
- Data is organized by taxonomic rank
|
- Data is organized by taxonomic rank
|
||||||
|
|
||||||
## Taxonomic Rank
|
## Taxonomic Rank
|
||||||
|
|
@ -252,10 +223,20 @@ read_xlsx("Intro_to_R_workshop_materials/PanTHERIA.xlsx") |>
|
||||||
scroll_box(width = "100%")
|
scroll_box(width = "100%")
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## Hands-on Analysis
|
||||||
|
|
||||||
|
- We will read in the data and explore if the trophic level has a significant impact on the adult body mass of mammals
|
||||||
|
|
||||||
|
|
||||||
|
Steps:
|
||||||
|
1. Combine and clean the data
|
||||||
|
2. Visualize adult body mass by trophic level
|
||||||
|
3. Check for overrepresented groups
|
||||||
|
4. Fit a simple linear model
|
||||||
|
|
||||||
## Hands-on Analysis
|
## Hands-on Analysis
|
||||||
- Open part_2.Rmd
|
- Open part_2.Rmd
|
||||||
|
- If you just want to follow along and not run code, open part2_filled_out.html
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -263,63 +244,34 @@ read_xlsx("Intro_to_R_workshop_materials/PanTHERIA.xlsx") |>
|
||||||
|
|
||||||
## General Tips
|
## General Tips
|
||||||
|
|
||||||
|
- Follow any relevant institutional guidelines on using LLMs
|
||||||
- Always confirm ChatGPT's outputs are correct
|
- Always confirm ChatGPT's outputs are correct
|
||||||
- Provide as much detail as possible about the problem in the 1st prompt
|
- Provide as much detail as possible about the problem in the 1st prompt
|
||||||
- Use separate chats for separate tasks/projects
|
- Use separate chats for separate tasks/projects
|
||||||
- Try the 'Custom Instructions' function that adds additional information to every prompt
|
- Try the 'Custom Instructions' function
|
||||||
- Can visit webpages (GPT 4 only), which can help get more specific answers
|
|
||||||
|
|
||||||
## Code Tips
|
## Code Tips
|
||||||
|
|
||||||
- Commented R code yields better responses in my experience
|
- Commented R code yields better responses
|
||||||
- Provide the code and error message in the same prompt
|
- Provide the code and error message in the same prompt
|
||||||
- ChatGPT can work well to convert syntax and improve your code:
|
- ChatGPT can work well to convert syntax and improve your code:
|
||||||
- "Turn this loop into a function : [your code]"
|
- "Turn this loop into a function : [your code]"
|
||||||
- "Is there a better way to do this : [your code]"
|
- "Is there a better way to do this : [your code]"
|
||||||
- Check out the file: `example_code/1_convert_syntax_example.R` for an example use case
|
- Check out the file: `example_code/1_convert_syntax_example.R` for an example use case
|
||||||
|
|
||||||
# Finding R Packages
|
|
||||||
|
|
||||||
## Key Questions
|
|
||||||
|
|
||||||
- What assay was the package designed for?
|
|
||||||
- When was the last release?
|
|
||||||
- Is it maintained (frequent updates)?
|
|
||||||
- Does it work on all operating systems?
|
|
||||||
- Are other people using it? (citations)
|
|
||||||
- Do they respond to github issues?
|
|
||||||
- Is there a benchmarking paper?
|
|
||||||
|
|
||||||
## BioConductor and CRAN
|
|
||||||
|
|
||||||
- Both of these have stringent requirements for packages they host (eg. for BioConductor they have to run on all major operating systems)
|
|
||||||
|
|
||||||
- Prefer BioConductor packages if available over CRAN
|
|
||||||
|
|
||||||
- Prefer CRAN packages over ones only hosted on GitHub
|
|
||||||
|
|
||||||
## Start with the Assay
|
|
||||||
|
|
||||||
- Click [here](https://www.bioconductor.org/packages/release/BiocViews.html#___Sequencing) to go to BioC views
|
|
||||||
- Pick the assay you want to analyse
|
|
||||||
- Pick the type of analysis you want to do
|
|
||||||
- Find a package that does it
|
|
||||||
- Find benchmarking papers to narrow the list of packages down
|
|
||||||
- Find the vignette on the package page and refer to the manual for any questions not covered by it
|
|
||||||
|
|
||||||
|
|
||||||
# Additional Resources
|
# Additional Resources
|
||||||
|
|
||||||
## R
|
## R
|
||||||
|
|
||||||
- [R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/how-to-read-this-book.html) : Excellent R markdown reference
|
|
||||||
|
|
||||||
- [R for Data Science](https://r4ds.hadley.nz/)
|
- [R for Data Science](https://r4ds.hadley.nz/)
|
||||||
|
- [Top 10 R Errors and How to Fix them](https://statsandr.com/blog/top-10-errors-in-r/)
|
||||||
|
- [R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/how-to-read-this-book.html) : Excellent R markdown reference
|
||||||
- [ggplot2: elegant graphics for data analysis](https://ggplot2-book.org/)
|
- [ggplot2: elegant graphics for data analysis](https://ggplot2-book.org/)
|
||||||
|
|
||||||
- [Advanced R](https://adv-r.hadley.nz/)
|
- [Advanced R](https://adv-r.hadley.nz/)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
## Statistics
|
## Statistics
|
||||||
|
|
||||||
- [Data Analysis in R](https://bookdown.org/steve_midway/DAR) : This book has more statistics details than *R for Data Science*
|
- [Data Analysis in R](https://bookdown.org/steve_midway/DAR) : This book has more statistics details than *R for Data Science*
|
||||||
|
|
@ -346,10 +298,9 @@ read_xlsx("Intro_to_R_workshop_materials/PanTHERIA.xlsx") |>
|
||||||
|
|
||||||
## Upcoming Workshops
|
## Upcoming Workshops
|
||||||
|
|
||||||
1. [Introduction to Statistics, Experimental Design, and Hypothesis Testing](https://gladstone.org/index.php/events/introduction-statistics-experimental-design-and-hypothesis-testing-0)
|
[Single Cell ATAC-Seq Data Analysis Part 2](https://gladstone.org/events/single-cell-atac-seq-data-analysis-part-2-1)
|
||||||
- Jan 25, 2024 (Session 1 - 10am–12pm) (Session 2 - 1pm–3pm)
|
|
||||||
- Jan 26, 2024 (Session 3 - 10am–12pm)
|
- Check [this link](https://gladstone.org/events?series=data-science-training-program) at the end of the summer for our fall workshop schedule
|
||||||
|
|
||||||
|
|
||||||
2. [Intermediate RNA-Seq Analysis Using R](https://gladstone.org/index.php/events/intermediate-rna-seq-analysis-using-r-4)
|
|
||||||
- Feb 1, 2024 (9:30am-12:00pm)
|
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -22,7 +22,7 @@
|
||||||
},
|
},
|
||||||
"MASS": {
|
"MASS": {
|
||||||
"Package": "MASS",
|
"Package": "MASS",
|
||||||
"Version": "7.3-60",
|
"Version": "7.3-60.0.1",
|
||||||
"Source": "Repository",
|
"Source": "Repository",
|
||||||
"Repository": "CRAN",
|
"Repository": "CRAN",
|
||||||
"Requirements": [
|
"Requirements": [
|
||||||
|
|
@ -33,11 +33,11 @@
|
||||||
"stats",
|
"stats",
|
||||||
"utils"
|
"utils"
|
||||||
],
|
],
|
||||||
"Hash": "a56a6365b3fa73293ea8d084be0d9bb0"
|
"Hash": "b765b28387acc8ec9e9c1530713cb19c"
|
||||||
},
|
},
|
||||||
"Matrix": {
|
"Matrix": {
|
||||||
"Package": "Matrix",
|
"Package": "Matrix",
|
||||||
"Version": "1.6-1.1",
|
"Version": "1.6-5",
|
||||||
"Source": "Repository",
|
"Source": "Repository",
|
||||||
"Repository": "CRAN",
|
"Repository": "CRAN",
|
||||||
"Requirements": [
|
"Requirements": [
|
||||||
|
|
@ -50,7 +50,7 @@
|
||||||
"stats",
|
"stats",
|
||||||
"utils"
|
"utils"
|
||||||
],
|
],
|
||||||
"Hash": "1a00d4828f33a9d690806e98bd17150c"
|
"Hash": "8c7115cd3a0e048bda2a7cd110549f7a"
|
||||||
},
|
},
|
||||||
"R6": {
|
"R6": {
|
||||||
"Package": "R6",
|
"Package": "R6",
|
||||||
|
|
@ -872,7 +872,7 @@
|
||||||
},
|
},
|
||||||
"lattice": {
|
"lattice": {
|
||||||
"Package": "lattice",
|
"Package": "lattice",
|
||||||
"Version": "0.21-9",
|
"Version": "0.22-6",
|
||||||
"Source": "Repository",
|
"Source": "Repository",
|
||||||
"Repository": "CRAN",
|
"Repository": "CRAN",
|
||||||
"Requirements": [
|
"Requirements": [
|
||||||
|
|
@ -883,7 +883,7 @@
|
||||||
"stats",
|
"stats",
|
||||||
"utils"
|
"utils"
|
||||||
],
|
],
|
||||||
"Hash": "5558c61e0136e247252f5f952cdaad6a"
|
"Hash": "cc5ac1ba4c238c7ca9fa6a87ca11a7e2"
|
||||||
},
|
},
|
||||||
"learnr": {
|
"learnr": {
|
||||||
"Package": "learnr",
|
"Package": "learnr",
|
||||||
|
|
@ -977,7 +977,7 @@
|
||||||
},
|
},
|
||||||
"mgcv": {
|
"mgcv": {
|
||||||
"Package": "mgcv",
|
"Package": "mgcv",
|
||||||
"Version": "1.9-0",
|
"Version": "1.9-1",
|
||||||
"Source": "Repository",
|
"Source": "Repository",
|
||||||
"Repository": "CRAN",
|
"Repository": "CRAN",
|
||||||
"Requirements": [
|
"Requirements": [
|
||||||
|
|
@ -990,7 +990,7 @@
|
||||||
"stats",
|
"stats",
|
||||||
"utils"
|
"utils"
|
||||||
],
|
],
|
||||||
"Hash": "086028ca0460d0c368028d3bda58f31b"
|
"Hash": "110ee9d83b496279960e162ac97764ce"
|
||||||
},
|
},
|
||||||
"mime": {
|
"mime": {
|
||||||
"Package": "mime",
|
"Package": "mime",
|
||||||
|
|
@ -1033,7 +1033,7 @@
|
||||||
},
|
},
|
||||||
"nlme": {
|
"nlme": {
|
||||||
"Package": "nlme",
|
"Package": "nlme",
|
||||||
"Version": "3.1-163",
|
"Version": "3.1-164",
|
||||||
"Source": "Repository",
|
"Source": "Repository",
|
||||||
"Repository": "CRAN",
|
"Repository": "CRAN",
|
||||||
"Requirements": [
|
"Requirements": [
|
||||||
|
|
@ -1043,7 +1043,7 @@
|
||||||
"stats",
|
"stats",
|
||||||
"utils"
|
"utils"
|
||||||
],
|
],
|
||||||
"Hash": "8d1938040a05566f4f7a14af4feadd6b"
|
"Hash": "a623a2239e642806158bc4dc3f51565d"
|
||||||
},
|
},
|
||||||
"openssl": {
|
"openssl": {
|
||||||
"Package": "openssl",
|
"Package": "openssl",
|
||||||
|
|
|
||||||
|
|
@ -129,4 +129,14 @@ small {
|
||||||
.big-picture img{
|
.big-picture img{
|
||||||
max-width: 70%;
|
max-width: 70%;
|
||||||
border: 1px solid black !important;
|
border: 1px solid black !important;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/* Change link color to sky blue */
|
||||||
|
.reveal a {
|
||||||
|
color: #0c74dc;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Change link color to magenta on hover */
|
||||||
|
.reveal a:hover {
|
||||||
|
color: #9c0366 !important;
|
||||||
|
}
|
||||||
|
|
|
||||||