diff --git a/hypothesis-testing/HypothesisTesting_GladstoneBIonformaticsCore.Rmd b/hypothesis-testing/HypothesisTesting_GladstoneBIonformaticsCore.Rmd
index c4b4ef5..26dc673 100644
--- a/hypothesis-testing/HypothesisTesting_GladstoneBIonformaticsCore.Rmd
+++ b/hypothesis-testing/HypothesisTesting_GladstoneBIonformaticsCore.Rmd
@@ -37,7 +37,7 @@ After looking numerically at the data, let us look at it visually. We will plot
 ```{r}
 ## load the library to be used for plotting
 suppressMessages(library(ggplot2))
-ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot()
+ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot() + geom_jitter(width=0.1)
 ```
 
 ## One-sided, one sample t-test
@@ -58,6 +58,7 @@ LinSeedWeights <- filter(chickwts, feed =="linseed")$weight
 print(LinSeedWeights)
 ##let us again visualize this
 boxplot(LinSeedWeights)
+abline(h=200, lty=2, col="red")
 ```
 
 Now, we will run the one-sample, one-sided t-test.
@@ -84,7 +85,7 @@ SubChickWts$feed <- droplevels(SubChickWts$feed)
 str(SubChickWts)
 
 ##let us plot this again
-ggplot(SubChickWts, aes(x=feed, y=weight)) + geom_boxplot()
+ggplot(SubChickWts, aes(x=feed, y=weight)) + geom_boxplot() + geom_jitter(width=0.1)
 ```
 
 The distribution of chick weights fed soybean appears to have slightly higher than those fed with linseed. We will compute the mean weights of the checks fed with each kind of feed.
@@ -137,7 +138,7 @@ wilcox.test(weight ~ feed, data=SubChickWts)
 ## Statistical power estimates or Sample Size calculations
 We were unable to reject the hypothesis that linseed and soybean feed kept the mean weights of the chicks the same. Let us visualize these data again,
 ```{r}
-ggplot(SubChickWts, aes(x=feed, y=weight)) + geom_boxplot()
+ggplot(SubChickWts, aes(x=feed, y=weight)) + geom_boxplot()+ geom_jitter(width=0.1)
 ```
 It however appears that the soybean feed does increase the mean weight over the linseed feed. If the increase is true then we need more samples to conclude that the soybean feed does increase the mean chick weight in a statistically significant manner.
 
@@ -151,7 +152,7 @@ To do that we can perform something called a statistical power analyses. We will
 
 2. The effect size that you want to have the statistical power to estimate
 
-3. At what Type I error will you be making claims of statistical significance. This is a number between 0 and 1 (typically 0.05) and represents the fraction of times (when you repeat the same experiment over and over again) when you will claim significance when in fact your null hypothesis is true (there is no differernce in the mean weights).
+3. At what Type I error will you be making claims of statistical significance. This is a number between 0 and 1 (typically 0.05) and represents the fraction of times (when you repeat the same experiment over and over again) when you will claim significance when in fact your null hypothesis is true (there is no difference in the mean weights).
 
 4. What is the desired statistical power? This is a number between 0 and 1 and represents the fraction of times (when you repeat the same experiment over and over) you want to claim significance at the chosen Type I error, when there is really a difference as captured by the effect size.
 
@@ -172,10 +173,10 @@ The results says we need to have at least 60 chicks in each feed group to have a
 
 We will now go back to looking at the distribution of the weights of chicks fed all the diets and not just the above two one. Our null hypothesis is that the mean chick weights is same for all the 6 feeds. Let us visualize the data again,
 ```{r}
-ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot()
+ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot()+ geom_jitter(width=0.1)
 ```
 
-The appropriate test statistic to use here is called the F-statistic, its sampling distribution is called the F-distribution. While the t-distribution captures the sampling distribution of the scaled sample mean or scaled difference of sample means, the F-distribution captures the proportion of variance between all observations within a feed group due to variance in the mean chick weights between feed groups, i.e.,
+The appropriate test statistic to use here is called the F-statistic, its sampling distribution is called the F-distribution. While the t-distribution captures the sampling distribution of the scaled sample mean or scaled difference of sample means, the F-distribution captures the ratio of variance in the mean chick weights between feed groups versus variance between all observations within a feed group , i.e.,
 
 $$F = \frac{between\ feed\ group\ weight\ variance}{within\ feed\ group\ weight\ variance}$$
 So, intuitively when the mean chick weights are not different between the different feed groups, the variance between these mean weights should be similar to variances of weights within a feed group. That is, under the null hypothesis F will hover around 1. Note, when we say, "within a feed group", we don't specify which particular feed group. This should suggest to you the requirement of the assumption that within feed groups variances are same across all groups.
@@ -187,7 +188,7 @@ summary(AmodelFit)
 ```
 
 The significance above suggests that there are feeds resulting in differing mean chick weights.
-We don't get information on which pairs are really different from each other. To get this information, we will perform multiple pairwise tests using Tukey's posthoc tests.
+We don't get information on which pairs are really different from each other. To get this information, we will perform multiple pairwise tests using Tukey's post-hoc tests.
 
 ### Multiple testing
 ```{r}
@@ -296,7 +297,7 @@ There are always assumptions to check for. We will visually attempt to test the
 
 2. Normality of residuals (differences between observations and their predictions using the linear model).
 
-3. Homogenity of variances across the fitted/predicted values of distance
+3. Homogeneity of variances across the fitted/predicted values of distance
 
 4. Influence of outliers on slope estimates
 
@@ -309,7 +310,7 @@ plot(lmFit)
 
 We will now perform a linear model version of the one-way ANOVA test we ran above,
 ```{r}
-ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot()
+ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot() + geom_jitter(width = 0.1)
 lmFit <- lm(weight ~ feed, chickwts)
 print(levels(chickwts$feed))
 summary(lmFit)
@@ -334,7 +335,7 @@ Let us visualize the data now,
 
 
 ```{r}
-ggplot(ToothGrowth, aes(x=dose, y=len, color=supp)) + geom_boxplot()
+ggplot(ToothGrowth, aes(x=dose, y=len, color=supp)) + geom_boxplot()
 ```
 
 We will formulate a linear model to estimate the effects of _dose_ and _supp_.
diff --git a/hypothesis-testing/HypothesisTesting_GladstoneBIonformaticsCore.html b/hypothesis-testing/HypothesisTesting_GladstoneBIonformaticsCore.html
index 242ef07..6c7b5f0 100644
--- a/hypothesis-testing/HypothesisTesting_GladstoneBIonformaticsCore.html
+++ b/hypothesis-testing/HypothesisTesting_GladstoneBIonformaticsCore.html
@@ -205,8 +205,8 @@ str(chickwts)
## load the library to be used for plotting
suppressMessages(library(ggplot2))
-ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot()
-## [1] 309 229 181 141 260 203 148 169 213 257 244 271
##let us again visualize this
-boxplot(LinSeedWeights)
-Now, we will run the one-sample, one-sided t-test.
t.test(LinSeedWeights, mu=200, alternative = "greater")
##
@@ -258,8 +259,8 @@ str(SubChickWts)
## $ weight: num 309 229 181 141 260 203 148 169 213 257 ...
## $ feed : Factor w/ 2 levels "linseed","soybean": 1 1 1 1 1 1 1 1 1 1 ...
##let us plot this again
-ggplot(SubChickWts, aes(x=feed, y=weight)) + geom_boxplot()
-The distribution of chick weights fed soybean appears to have slightly higher than those fed with linseed. We will compute the mean weights of the checks fed with each kind of feed.
##mean of wight with linseed feed
print(mean(SubChickWts$weight[SubChickWts$feed == "linseed"]))
@@ -341,14 +342,14 @@ wilcox.test(weight ~ feed, data=SubChickWts)
We were unable to reject the hypothesis that linseed and soybean feed kept the mean weights of the chicks the same. Let us visualize these data again,
-ggplot(SubChickWts, aes(x=feed, y=weight)) + geom_boxplot()
- It however appears that the soybean feed does increase the mean weight over the linseed feed. If the increase is true then we need more samples to conclude that the soybean feed does increase the mean chick weight in a statistically significant manner.
+ggplot(SubChickWts, aes(x=feed, y=weight)) + geom_boxplot()+ geom_jitter(width=0.1)
+ It however appears that the soybean feed does increase the mean weight over the linseed feed. If the increase is true then we need more samples to conclude that the soybean feed does increase the mean chick weight in a statistically significant manner.
How many more samples would we need?
To do that we can perform something called a statistical power analyses. We will use a library in R called pwr that will help us with these analyses. Before doing any power analyses, you need to know a bunch of things.
The sampling distribution of the test statistic (t-statistic)
The effect size that you want to have the statistical power to estimate
-At what Type I error will you be making claims of statistical significance. This is a number between 0 and 1 (typically 0.05) and represents the fraction of times (when you repeat the same experiment over and over again) when you will claim significance when in fact your null hypothesis is true (there is no differernce in the mean weights).
+At what Type I error will you be making claims of statistical significance. This is a number between 0 and 1 (typically 0.05) and represents the fraction of times (when you repeat the same experiment over and over again) when you will claim significance when in fact your null hypothesis is true (there is no difference in the mean weights).
What is the desired statistical power? This is a number between 0 and 1 and represents the fraction of times (when you repeat the same experiment over and over) you want to claim significance at the chosen Type I error, when there is really a difference as captured by the effect size.
suppressMessages(library(pwr))
@@ -375,9 +376,9 @@ pwr.t.test(d=d, power = 0.8, sig.level = 0.05)
We will now go back to looking at the distribution of the weights of chicks fed all the diets and not just the above two one. Our null hypothesis is that the mean chick weights is same for all the 6 feeds. Let us visualize the data again,
-ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot()
-The appropriate test statistic to use here is called the F-statistic, its sampling distribution is called the F-distribution. While the t-distribution captures the sampling distribution of the scaled sample mean or scaled difference of sample means, the F-distribution captures the proportion of variance between all observations within a feed group due to variance in the mean chick weights between feed groups, i.e.,
+ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot()+ geom_jitter(width=0.1)
+The appropriate test statistic to use here is called the F-statistic, its sampling distribution is called the F-distribution. While the t-distribution captures the sampling distribution of the scaled sample mean or scaled difference of sample means, the F-distribution captures the ratio of variance in the mean chick weights between feed groups versus variance between all observations within a feed group , i.e.,
\[F = \frac{between\ feed\ group\ weight\ variance}{within\ feed\ group\ weight\ variance}\] So, intuitively when the mean chick weights are not different between the different feed groups, the variance between these mean weights should be similar to variances of weights within a feed group. That is, under the null hypothesis F will hover around 1. Note, when we say, “within a feed group”, we don’t specify which particular feed group. This should suggest to you the requirement of the assumption that within feed groups variances are same across all groups.
We will now run the ANOVA analyses as follows:
AmodelFit <- aov(weight ~ feed, data=chickwts)
@@ -387,7 +388,7 @@ summary(AmodelFit)
## Residuals 65 195556 3009
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-The significance above suggests that there are feeds resulting in differing mean chick weights. We don’t get information on which pairs are really different from each other. To get this information, we will perform multiple pairwise tests using Tukey’s posthoc tests.
+The significance above suggests that there are feeds resulting in differing mean chick weights. We don’t get information on which pairs are really different from each other. To get this information, we will perform multiple pairwise tests using Tukey’s post-hoc tests.
TukeyHSD(AmodelFit,ordered = TRUE)
@@ -569,7 +570,7 @@ summary(lmFit)
Suitability of a linear (as opposed to some non-linear) model for these data.
Normality of residuals (differences between observations and their predictions using the linear model).
-Homogenity of variances across the fitted/predicted values of distance
+Homogeneity of variances across the fitted/predicted values of distance
Influence of outliers on slope estimates
plot(lmFit)
@@ -577,8 +578,8 @@ summary(lmFit)
We will now perform a linear model version of the one-way ANOVA test we ran above,
-ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot()
+ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot() + geom_jitter(width = 0.1)
+lmFit <- lm(weight ~ feed, chickwts)
print(levels(chickwts$feed))
## [1] "casein" "horsebean" "linseed" "meatmeal" "soybean" "sunflower"
@@ -632,7 +633,7 @@ str(ToothGrowth)
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: Factor w/ 3 levels "0.5","1","2": 1 1 1 1 1 1 1 1 1 1 ...
Let us visualize the data now,
-ggplot(ToothGrowth, aes(x=dose, y=len, color=supp)) + geom_boxplot()
+ggplot(ToothGrowth, aes(x=dose, y=len, color=supp)) + geom_boxplot()
We will formulate a linear model to estimate the effects of dose and supp.
lmFit <- lm(len ~ dose + supp + dose:supp, ToothGrowth)