diff --git a/intro-experimental-design-hypothesis-testing/IntroStatsExpDesignHT_011723_demo_day3.html b/intro-experimental-design-hypothesis-testing/IntroStatsExpDesignHT_011723_demo_day3.html
index 8b0b802..d714b18 100644
--- a/intro-experimental-design-hypothesis-testing/IntroStatsExpDesignHT_011723_demo_day3.html
+++ b/intro-experimental-design-hypothesis-testing/IntroStatsExpDesignHT_011723_demo_day3.html
@@ -395,7 +395,7 @@ diet/feed they were on
## load the library to be used for plotting
suppressMessages(library(ggplot2))
ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot() + geom_jitter(width=0.1)
-##let us plot this again
ggplot(SubChickWts, aes(x=feed, y=weight)) + geom_boxplot() + geom_jitter(width=0.1)
-The weights of chicks fed soybean appear to be slightly higher than those of chicks fed linseed. We will compute the mean weights of the chicks fed with each kind of feed.
@@ -584,7 +584,7 @@ wilcox.test(weight ~ feed, data=SubChickWts)
kept the mean weights of the chicks the same. Let us visualize these data again,
ggplot(SubChickWts, aes(x=feed, y=weight)) + geom_boxplot() + geom_jitter(width=0.1)
-
+
However, it appears that the soybean feed does increase the mean weight
over the linseed feed. If the increase is real, then we need more samples
to conclude that the soybean feed does increase the mean chick weight in
@@ -642,7 +642,7 @@ chicks fed all the diets and not just the above two. Our null
hypothesis is that the mean chick weight is the same for all 6 feeds.
Let us visualize the data again,
ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot()+ geom_jitter(width=0.1)
-The appropriate test statistic to use here is called the F-statistic; its sampling distribution is called the F-distribution. While the t-distribution captures the sampling distribution of the scaled sample
@@ -842,17 +842,8 @@ method controls for False-Discovery Rate, or the expected (over repeated experiments) fraction of false positives among the rejected hypotheses. The Holm-Sidak method is more conservative while the BH method has more tolerance for false positives
-## load the required library
-suppressMessages(library(multtest))
-MCor <- mt.rawp2adjp(pValueFeedComparisons, proc = c("SidakSD", "BH"))
-MCorResults <- cbind(pValueFeedComparisons[MCor$index], MCor$adjp)
-head(MCorResults)
-##                                rawp      SidakSD           BH
-## horsebean 7.210250e-07 7.210250e-07 3.605120e-06 3.605125e-06
-## linseed   2.606217e-04 2.606217e-04 1.042079e-03 6.515541e-04
-## soybean   3.521263e-03 3.521263e-03 1.052664e-02 5.868772e-03
-## meatmeal  9.866222e-02 9.866222e-02 1.875902e-01 1.233278e-01
-## sunflower 8.215118e-01 8.215118e-01 8.215118e-01 8.215118e-01
+MCor <- p.adjust(pValueFeedComparisons, method = c("holm"))
+MCorResults <- data.frame(pValueFeedComparisons, adj.p=p.adjust(pValueFeedComparisons, method = "holm"))
You will see the results in the SidakSD and BH columns as adjusted p-values
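As a rough sketch of how the two kinds of correction differ, the raw p-values printed in the `rawp` column of the output above can be passed to base R's `p.adjust`. Note that `p.adjust` provides `"holm"` (Holm-Bonferroni), not the step-down Sidak procedure, so the Holm column will be close to, but not identical to, the SidakSD values. The names `rawP` and `adjusted` are made up for this illustration.

```r
# Raw p-values copied from the rawp column of the output above
rawP <- c(horsebean = 7.210250e-07, linseed = 2.606217e-04,
          soybean   = 3.521263e-03, meatmeal = 9.866222e-02,
          sunflower = 8.215118e-01)

# Holm controls the family-wise error rate;
# BH controls the false-discovery rate
adjusted <- data.frame(rawp = rawP,
                       holm = p.adjust(rawP, method = "holm"),
                       BH   = p.adjust(rawP, method = "BH"))
print(adjusted)
```

Comparing the two adjusted columns row by row shows where the more conservative family-wise correction inflates a p-value more than the false-discovery-rate correction does.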
Influence of outliers on slope estimates
See https://data.library.virginia.edu/diagnostic-plots/ for
+an elaboration of what each of these plots means
plot(lmFit)
We will now perform a linear model version of the one-way ANOVA test we ran above,
ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot() + geom_jitter(width = 0.1)
-lmFit <- lm(weight ~ feed, chickwts)
print(levels(chickwts$feed))
## [1] "casein" "horsebean" "linseed" "meatmeal" "soybean" "sunflower"
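One way to check that the linear model really is the one-way ANOVA in another form is to compare the F-statistic from both routes. This is only a sketch: it assumes the earlier ANOVA could be reproduced with `aov`, and `aovFit`, `fFromLm` and `fFromAov` are names introduced here for illustration.

```r
# chickwts ships with base R (datasets package)
lmFit  <- lm(weight ~ feed, data = chickwts)
aovFit <- aov(weight ~ feed, data = chickwts)

# F-statistic for the feed term from each route
fFromLm  <- anova(lmFit)["feed", "F value"]
fFromAov <- summary(aovFit)[[1]][1, "F value"]
print(c(lm = fFromLm, aov = fFromAov))

# With default treatment contrasts, the intercept is the mean weight
# of the first factor level (casein)
print(coef(lmFit)["(Intercept)"])
print(mean(chickwts$weight[chickwts$feed == "casein"]))
```

Because casein is the first level printed above, the remaining coefficients are differences in mean weight between each feed and casein.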
@@ -1004,17 +997,11 @@ summary(lmFit)
We will go over the interpretation of the estimates from this linear model fit. Note that we have estimated both the main effects and the interaction effects.
+Note, per our discussion in class yesterday, the variable
+supp can be considered to moderate the influence of the dose
+of vitamin C on tooth growth.
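To make the moderation idea concrete, here is a minimal sketch. It assumes the model was fit on R's built-in `ToothGrowth` data with the formula `len ~ supp * dose` and with `dose` kept numeric; the fit shown in class may have been specified differently, and `tgFit`, `slopeOJ` and `slopeVC` are hypothetical names.

```r
# Assumption: built-in ToothGrowth data (len, supp, dose), dose kept numeric
tgFit <- lm(len ~ supp * dose, data = ToothGrowth)
b <- coef(tgFit)

# Slope of tooth length on dose within each supplement type
slopeOJ <- b[["dose"]]                       # OJ is the reference level
slopeVC <- b[["dose"]] + b[["suppVC:dose"]]  # interaction shifts the slope for VC
print(c(OJ = slopeOJ, VC = slopeVC))
```

If the interaction coefficient were zero, the two slopes would coincide and `supp` would shift only the intercept; a non-zero interaction is what lets `supp` change how strongly dose relates to tooth length.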