From 82a173485a7e8196e17539ce8db460a525177629 Mon Sep 17 00:00:00 2001
From: reubenthomas
Date: Tue, 17 Jan 2023 19:20:13 -0800
Subject: [PATCH] mod

---
 ...ntroStatsExpDesignHT_011723_demo_day3.html | 37 ++++++-------------
 1 file changed, 12 insertions(+), 25 deletions(-)

diff --git a/intro-experimental-design-hypothesis-testing/IntroStatsExpDesignHT_011723_demo_day3.html b/intro-experimental-design-hypothesis-testing/IntroStatsExpDesignHT_011723_demo_day3.html
index 8b0b802..d714b18 100644
--- a/intro-experimental-design-hypothesis-testing/IntroStatsExpDesignHT_011723_demo_day3.html
+++ b/intro-experimental-design-hypothesis-testing/IntroStatsExpDesignHT_011723_demo_day3.html
@@ -395,7 +395,7 @@ diet/feed they were on

## load the library to be used for plotting
 suppressMessages(library(ggplot2))
 ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot() + geom_jitter(width=0.1)
-

+

One-sided, one sample t-test

@@ -476,7 +476,7 @@ str(SubChickWts) ## $ feed : Factor w/ 2 levels "linseed","soybean": 1 1 1 1 1 1 1 1 1 1 ...
##let us plot this again
 ggplot(SubChickWts, aes(x=feed, y=weight)) + geom_boxplot() + geom_jitter(width=0.1)
-

+

The weights of chicks fed soybean appear to be slightly higher than the weights of those fed linseed. We will compute the mean weights of the chicks fed with each kind of feed.
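A minimal sketch of the group-mean computation described above. It assumes `SubChickWts` is the linseed/soybean subset of the built-in `chickwts` data, as the `str(SubChickWts)` output suggests; the subset is rebuilt here only so the snippet runs on its own, and `groupMeans` is a name introduced for illustration.

```r
# Assumption: SubChickWts holds only the linseed and soybean rows of chickwts.
SubChickWts <- droplevels(subset(chickwts, feed %in% c("linseed", "soybean")))

# Mean chick weight for each feed
groupMeans <- tapply(SubChickWts$weight, SubChickWts$feed, mean)
print(groupMeans)

# Observed difference in means (soybean minus linseed)
print(unname(groupMeans["soybean"] - groupMeans["linseed"]))
```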

@@ -584,7 +584,7 @@ wilcox.test(weight ~ feed, data=SubChickWts)
kept the mean weights of the chicks the same. Let us visualize these data again,

ggplot(SubChickWts, aes(x=feed, y=weight)) + geom_boxplot()+ geom_jitter(width=0.1)
-

+

It appears, however, that the soybean feed does increase the mean weight over the linseed feed. If the increase is real, then we need more samples to conclude that the soybean feed does increase the mean chick weight in
@@ -642,7 +642,7 @@ chicks fed all the diets and not just the above two.
Our null hypothesis is that the mean chick weight is the same for all 6 feeds. Let us visualize the data again,

ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot()+ geom_jitter(width=0.1)
-

+

The appropriate test statistic to use here is called the F-statistic; its sampling distribution is called the F-distribution. While the t-distribution captures the sampling distribution of the scaled sample
@@ -842,17 +842,8 @@ method controls for the False Discovery Rate, or the expected (over repeated
experiments) fraction of false positives among the rejected hypotheses. The Holm-Sidak method is more conservative, while the BH method has more tolerance for false positives.

-
##load the required library
-suppressMessages(library(multtest))
-MCor <- mt.rawp2adjp(pValueFeedComparisons, proc = c("SidakSD", "BH"))
-MCorResults <- cbind(pValueFeedComparisons[MCor$index], MCor$adjp)
-head(MCorResults)
-
##                                rawp      SidakSD           BH
-## horsebean 7.210250e-07 7.210250e-07 3.605120e-06 3.605125e-06
-## linseed   2.606217e-04 2.606217e-04 1.042079e-03 6.515541e-04
-## soybean   3.521263e-03 3.521263e-03 1.052664e-02 5.868772e-03
-## meatmeal  9.866222e-02 9.866222e-02 1.875902e-01 1.233278e-01
-## sunflower 8.215118e-01 8.215118e-01 8.215118e-01 8.215118e-01
+
MCor <- p.adjust(pValueFeedComparisons, method = c("holm"))
+MCorResults <- data.frame(pValueFeedComparisons, adj.p=p.adjust(pValueFeedComparisons, method = "holm"))

You will see the results in the SidakSD and BH columns as adjusted p-values.
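To make the contrast between the two kinds of correction concrete, here is a hedged sketch that applies both adjustments to the same raw p-values with base R's `p.adjust`. Note that `method = "holm"` is Holm's Bonferroni-based step-down procedure, not the Sidak step-down variant. The raw p-values below come from Welch t-tests of each feed against casein, which is an assumption made so the snippet is self-contained; the handout's own `pValueFeedComparisons` may have been computed differently. The names `feeds`, `rawP` and `adjusted` are introduced for illustration.

```r
# Assumption: raw p-values are from Welch t-tests of each feed versus casein.
feeds <- setdiff(levels(chickwts$feed), "casein")
rawP <- sapply(feeds, function(f) {
  sub <- droplevels(subset(chickwts, feed %in% c("casein", f)))
  t.test(weight ~ feed, data = sub)$p.value
})

# Adjust the same raw p-values two ways
adjusted <- data.frame(
  raw  = rawP,
  holm = p.adjust(rawP, method = "holm"),  # controls family-wise error rate
  BH   = p.adjust(rawP, method = "BH")     # controls false discovery rate
)

# Sort by raw p-value to make the two adjustments easy to compare
print(adjusted[order(adjusted$raw), ])
```

For any set of p-values the BH-adjusted value is never larger than the Holm-adjusted one, which is the sense in which BH is the less conservative of the two.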

@@ -910,6 +901,8 @@ their predictions using the linear model).

distance

  • Influence of outliers on slope estimates

  • +

    See https://data.library.virginia.edu/diagnostic-plots/ for +an elaboration of what each of these plots means

    plot(lmFit)

    @@ -917,7 +910,7 @@ distance

    We will now perform a linear model version of the one-way ANOVA test we ran above,

    ggplot(chickwts, aes(x=feed, y=weight)) + geom_boxplot() + geom_jitter(width = 0.1)
    -

    +

    lmFit <- lm(weight ~ feed, chickwts)
     print(levels(chickwts$feed))
    ## [1] "casein"    "horsebean" "linseed"   "meatmeal"  "soybean"   "sunflower"
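As a small check on the claim that this linear model is a version of the one-way ANOVA, the sketch below fits both and compares their F-statistics. It also shows that, under R's default treatment contrasts, the intercept is the mean weight of the first factor level (casein). The names `aovFit`, `fLm`, `fAov` and `caseinMean` are introduced here for illustration and are not part of the handout.

```r
# Same model fitted two ways on the built-in chickwts data
lmFit  <- lm(weight ~ feed, data = chickwts)
aovFit <- aov(weight ~ feed, data = chickwts)

# F-statistic for the feed term from each fit
fLm  <- anova(lmFit)["feed", "F value"]
fAov <- summary(aovFit)[[1]][1, "F value"]
print(c(lm = fLm, aov = fAov))

# With default treatment contrasts the intercept is the mean of the
# reference level, which is the first level of feed ("casein")
caseinMean <- mean(chickwts$weight[chickwts$feed == "casein"])
print(c(intercept = unname(coef(lmFit)["(Intercept)"]), caseinMean = caseinMean))
```

The remaining coefficients are then differences between each feed's mean weight and the casein mean.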
    @@ -1004,17 +997,11 @@ summary(lmFit)

    We will go over the interpretation of the estimates from this linear model fit. Note, we have estimated both the main effects and also interaction effects.

    +

    Note, per our discussion in class yesterday, the variable +supp can be considered to moderate the influence of the dose +of vitamin C on tooth growth.
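One way to see the moderation is to read the dose slope for each supplement off an interaction model. The sketch below assumes the model is `len ~ supp * dose` on the built-in `ToothGrowth` data with `dose` treated as numeric; the handout's actual model formula is not shown in this excerpt, so treat the formula and the names `tgFit`, `slopeOJ` and `slopeVC` as illustrative.

```r
# Assumption: main effects plus interaction, with dose as a numeric covariate
tgFit <- lm(len ~ supp * dose, data = ToothGrowth)
coefs <- coef(tgFit)

# OJ is the reference level of supp, so its dose slope is the "dose" coefficient
slopeOJ <- unname(coefs["dose"])

# For VC the interaction coefficient is added to the reference slope
slopeVC <- unname(coefs["dose"] + coefs["suppVC:dose"])

print(c(OJ = slopeOJ, VC = slopeVC))
```

A non-zero `suppVC:dose` coefficient means the effect of dose on tooth length differs between the two supplements, which is what moderation by `supp` refers to.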

    -
    -

    Clustered data

    -
    -

    Paired t-test

    -
    -
    -
    -

    2x2 tables

    -