Problems 3-1 through 3-3, Analysis of variance
We will provide solutions for problems 3-1, 3-2 and 3-3 in this tutorial.
Problem statement
Problem 3-1
The tensile strength of portland cement is being studied. Four different mixing techniques can be used economically. A completely randomized experiment was conducted and the following data collected.
Mixing Technique | Tensile Strength (lb/in²) | | | |
---|---|---|---|---|
1 | 3129 | 3000 | 2865 | 2890 |
2 | 3200 | 3300 | 2975 | 3150 |
3 | 2800 | 2900 | 2985 | 3050 |
4 | 2600 | 2700 | 2600 | 2765 |
- (a) Test the hypothesis that mixing techniques affect the strength of the cement. Use α = 0.05.
- (b) Construct a graphical display as described in section 3-5.3 to compare the mean tensile strengths for the four mixing techniques. What are your conclusions?
- (c) Use the Fisher LSD method with α = 0.05 to make comparisons between pairs of means.
- (d) Construct a normal probability plot of the residuals. What conclusion would you draw about the validity of the normality assumption?
- (e) Plot the residuals vs. the predicted tensile strength. Comment on the plot.
- (f) Prepare a scatter plot of the results to aid the interpretation of the results of this experiment.
Problem 3-2
- (a) Rework part (b) of problem 3-1 using Tukey's test with α = 0.05. Do you get the same conclusions from Tukey's test that you did from the graphical procedure and/or the Fisher LSD method?
- (b) Explain the difference between the Tukey and Fisher procedures.
Problem 3-3
Reconsider the experiment in problem 3-1. Find a 95% confidence interval on the mean tensile strength of the portland cement produced by each of the four mixing techniques. Also find a 95% confidence interval on the difference in means for techniques 1 and 3. Does this aid you in interpreting the results of the experiment?
Solution
In these problems, we are given a data set with $a = 4$ subsets, each containing $n = 4$ values, for a total of $N = an = 16$ data points. Each subset contains measurements of the tensile strength of cement samples that were produced with a different mixing technique, or treatment. We define the mean over all data points as the grand mean, and the mean of the data points within a given treatment as the treatment mean. To compare these data subsets, it is useful to think of each data point $y_{ij}$ as the sum of the grand mean $\mu$, the $i$th treatment effect $\tau_i$ (the deviation of the $i$th treatment mean from the grand mean), and a random error $\epsilon_{ij}$ specific to the $j$th data point in the $i$th treatment (refer to figure 1):

$$y_{ij} = \mu + \tau_i + \epsilon_{ij}$$
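Since we are given a finite set of data, we must approximate these means by calculating sample means. The grand sample mean is given by:

$$\bar{y}_{..} = \frac{1}{N}\sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}$$

The sample treatment means are given by:

$$\bar{y}_{i.} = \frac{1}{n}\sum_{j=1}^{n} y_{ij}$$

The dot indicates that you are summing over the index it replaces, and the bar indicates that the sum has been divided by the number of terms.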
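As a quick check of the definitions above, the sample means can be computed directly from the data in the problem statement. This is a minimal sketch; the variable names (`data`, `treatment_means`, `grand_mean`) are illustrative choices, not part of the original solution.

```python
# Tensile strength data (lb/in^2) from the problem statement,
# keyed by mixing technique.
data = {
    1: [3129, 3000, 2865, 2890],
    2: [3200, 3300, 2975, 3150],
    3: [2800, 2900, 2985, 3050],
    4: [2600, 2700, 2600, 2765],
}

a = len(data)        # number of treatments
n = len(data[1])     # observations per treatment
N = a * n            # total number of observations

# Treatment means: average over j within each treatment i.
treatment_means = {i: sum(ys) / n for i, ys in data.items()}

# Grand mean: average over every observation.
grand_mean = sum(sum(ys) for ys in data.values()) / N

for i, m in treatment_means.items():
    print(f"treatment {i}: mean = {m:.2f}")
print(f"grand mean = {grand_mean:.4f}")
```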
Section 3-1 (A): Hypothesis testing
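We would like to know if one of our data subsets is significantly different from the others, as this may indicate that one of our manufacturing techniques is superior (or inferior) to the others. To compare our data subsets we are interested in whether most of the error is within the treatments ($\epsilon$) or between the treatments ($\tau$). If most of the error is between the treatment means, then we can claim there are significant differences between them. If there is too much error within the treatments, we cannot claim that they are significantly different (see figure x). Mathematically, we can approximate the error between the treatment means with the treatment sum of squares

$$SS_{Treatments} = n\sum_{i=1}^{a}\left(\bar{y}_{i.} - \bar{y}_{..}\right)^2$$

where $\bar{y}_{i.}$ is the $i$th treatment mean, $\bar{y}_{..}$ is the grand mean, $a$ is the number of treatments, and $n$ is the number of observations per treatment.

To approximate the error within the treatments, it is easiest to subtract the error between the means from the total error, $SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n}\left(y_{ij} - \bar{y}_{..}\right)^2$:

$$SS_E = SS_T - SS_{Treatments}$$

We are interested in the ratio of the two mean squares, where each sum of squares is divided by its degrees of freedom and $N = an$ is the total number of observations:

$$F_0 = \frac{SS_{Treatments}/(a-1)}{SS_E/(N-a)} = \frac{MS_{Treatments}}{MS_E}$$

To determine whether or not there are significant differences between our treatments, we will compare $F_0$ to $F_{\alpha,a-1,N-a}$ from the F distribution. In Excel this value can be found using the function FINV(α,a−1,N−a); in R it can be found using qf(1−α,a−1,N−a). If $F_0 > F_{\alpha,a-1,N-a}$, which is the case here (12.728 > 3.490), then the error between treatment means is large enough compared to the error within treatments to conclude that there is a significant difference between at least one treatment and the others.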
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | $F_0$ | $F_{0.05,3,12}$ |
---|---|---|---|---|---|
Mixing Technique | 489740 | 3 | 163247 | 12.728 | 3.490 |
Error | 153908 | 12 | 12826 | ||
Total | 643648 | 15 |
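The entries of the ANOVA table can be reproduced with a few lines of plain Python. The sketch below follows the sums of squares described in this section; the critical value 3.490 is copied from the table rather than computed, since the Python standard library has no F quantile function, and all variable names are illustrative.

```python
# Tensile strength data (lb/in^2); one row per mixing technique.
data = [
    [3129, 3000, 2865, 2890],
    [3200, 3300, 2975, 3150],
    [2800, 2900, 2985, 3050],
    [2600, 2700, 2600, 2765],
]

a = len(data)        # number of treatments
n = len(data[0])     # observations per treatment
N = a * n            # total number of observations

grand_mean = sum(y for row in data for y in row) / N
means = [sum(row) / n for row in data]

# Between-treatment and total sums of squares; the within-treatment
# (error) sum of squares is obtained by subtraction.
ss_treatments = n * sum((m - grand_mean) ** 2 for m in means)
ss_total = sum((y - grand_mean) ** 2 for row in data for y in row)
ss_error = ss_total - ss_treatments

# Mean squares: each sum of squares divided by its degrees of freedom.
ms_treatments = ss_treatments / (a - 1)
ms_error = ss_error / (N - a)
f0 = ms_treatments / ms_error

# Critical value F_{0.05,3,12}, copied from the ANOVA table (not computed here).
F_CRIT = 3.490
reject_equal_means = f0 > F_CRIT

print(f"SS_Treatments = {ss_treatments:.0f}, SS_E = {ss_error:.0f}")
print(f"MS_Treatments = {ms_treatments:.0f}, MS_E = {ms_error:.0f}")
print(f"F0 = {f0:.3f}, reject equal means: {reject_equal_means}")
```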
Section 3-1 (B): Graphical display to compare mean tensile strengths
A relatively simple way to visualize the treatment means, and to judge qualitatively whether or not they are statistically equal, is to plot the four averages on the same graph as a scaled t distribution. We need to know what mean and standard deviation to use for this distribution. For the mean, we will simply use the grand mean, and we will approximate the standard deviation with $\sqrt{MS_E/n}$, which is about 56.6 lb/in² for our data. This approximation for the standard deviation relies on $MS_E$, which does not take into account the differences between the treatment means. It assumes that the treatment means are all equal. If they are not statistically equivalent, it will be obvious when we plot the treatment means on the same plot as the distribution.
Looking at the plot, we see that for our data it is unlikely that all of the treatment means come from the plotted distribution. The two treatment means under the tails of the t distribution (treatments 2 and 4) appear to be significantly different from those under the center (treatments 1 and 3).
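To see numerically where each treatment mean falls on such a plot, one can express its distance from the grand mean in units of the scale factor $\sqrt{MS_E/n}$. The sketch below does this; the standardized position is an illustrative aid rather than a formal test, and the variable names are hypothetical.

```python
import math

# Tensile strength data (lb/in^2); one row per mixing technique.
data = [
    [3129, 3000, 2865, 2890],
    [3200, 3300, 2975, 3150],
    [2800, 2900, 2985, 3050],
    [2600, 2700, 2600, 2765],
]

a = len(data)
n = len(data[0])
N = a * n

grand_mean = sum(y for row in data for y in row) / N
means = [sum(row) / n for row in data]

# MS_E: pooled within-treatment variance (sum of squared residuals / (N - a)).
ss_error = sum((y - m) ** 2 for row, m in zip(data, means) for y in row)
ms_error = ss_error / (N - a)

# Scale factor of the reference t distribution.
scale = math.sqrt(ms_error / n)

# Position of each treatment mean relative to the grand mean, in scale units.
positions = [(m - grand_mean) / scale for m in means]

print(f"scale factor = {scale:.2f}")
for i, p in enumerate(positions, start=1):
    print(f"treatment {i}: {p:+.2f} scale units from the grand mean")
```

Treatments whose positions are several scale units from zero sit far out in the tails of the reference distribution.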
Section 3-1 (C): Fisher LSD comparisons
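Fisher LSD comparisons allow each pair of treatment means to be compared. This is done using a t-test as we did in problem 2.11 B (solution: Excel, R), but replacing the pooled variance $S_p^2$ with $MS_E$:

$$t_0 = \frac{\bar{y}_{i.} - \bar{y}_{j.}}{\sqrt{2MS_E/n}}$$

Solving for $\lvert\bar{y}_{i.} - \bar{y}_{j.}\rvert$ yields:

$$\lvert\bar{y}_{i.} - \bar{y}_{j.}\rvert = \lvert t_0\rvert\sqrt{\frac{2MS_E}{n}}$$

We will compare this to a theoretical value called the least significant difference:

$$\mathrm{LSD} = t_{\alpha/2,N-a}\sqrt{\frac{2MS_E}{n}}$$

If $\lvert\bar{y}_{i.} - \bar{y}_{j.}\rvert > \mathrm{LSD}$ then the treatment means $\bar{y}_{i.}$ and $\bar{y}_{j.}$ are significantly different.
In Excel, you can calculate LSD using TINV(α,N-a)*sqrt(2*MSe/n), where MSe is $MS_E$. In R, the equivalent command is qt(1-α/2,N-a)*sqrt(2*MSe/n). For our data, LSD ≈ 174.5.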
The following table shows the absolute differences between each pair of treatment means. A pair is considered significantly different when its difference exceeds LSD. Here that is every pair except treatments 1 and 3.
Treatment | 2 | 3 | 4 |
---|---|---|---|
1 | 185.25 | 37.25 | 304.75 |
2 | | 222.50 | 490.00 |
3 | | | 267.50 |
To apply the Fisher LSD method to our data, we compare LSD to each of the numbers in the table above. For example, we see that 185.25 > 174.5, so there is a statistically significant difference between treatments 1 and 2, whereas 37.25 < 174.5, so treatments 1 and 3 are not significantly different.
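The pairwise comparison can be sketched in Python as follows. The critical value $t_{0.025,12} = 2.179$ is hard-coded from a standard t table (an assumption, since the standard library has no t quantile function and the source only gives the Excel and R commands); the other names are illustrative.

```python
import math
from itertools import combinations

# Tensile strength data (lb/in^2), keyed by mixing technique.
data = {
    1: [3129, 3000, 2865, 2890],
    2: [3200, 3300, 2975, 3150],
    3: [2800, 2900, 2985, 3050],
    4: [2600, 2700, 2600, 2765],
}

a = len(data)
n = len(data[1])
N = a * n

means = {i: sum(ys) / n for i, ys in data.items()}

# MS_E: sum of squared residuals divided by the error degrees of freedom.
ms_error = sum((y - means[i]) ** 2 for i, ys in data.items() for y in ys) / (N - a)

# Assumed table value for t_{alpha/2, N-a} with alpha = 0.05 and N - a = 12.
T_CRIT = 2.179

lsd = T_CRIT * math.sqrt(2 * ms_error / n)

# A pair is flagged when its absolute mean difference exceeds LSD.
significant = {}
for i, j in combinations(sorted(means), 2):
    diff = abs(means[i] - means[j])
    significant[(i, j)] = diff > lsd
    print(f"{i} vs {j}: |diff| = {diff:.2f}, significant: {significant[(i, j)]}")

print(f"LSD = {lsd:.1f}")
```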
Section 3-1 (D): Normal probability plot
We have been assuming that our data are normally distributed (Gaussian), and that it is therefore valid to do t-tests. To be sure, we should check our normality assumption by creating a normal probability plot. This is done by plotting the sorted residuals against the corresponding values from a standard normal (z) distribution. Residuals are calculated by subtracting the corresponding treatment mean from each data point, and must be sorted before they are used to make the plot. The values we seek from the z-distribution are obtained with NORMSINV(percent) in Excel, or qnorm(percent) in R. In these commands, percent is a cumulative probability that runs from 1/(dof+1) to dof/(dof+1), where dof is the degrees of freedom. These commands return z-distribution values that represent ideal residual positions. If the resulting plot is roughly linear, then the normality assumption is reasonable.
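The coordinates of such a plot can be generated with the Python standard library, where `statistics.NormalDist().inv_cdf` plays the role of NORMSINV or qnorm. This is a sketch under one stated assumption: the cumulative probabilities are taken as k/(N+1) for k = 1, ..., N, a common plotting-position convention that gives one probability per residual and may differ from the exact values used for the original figure. The correlation coefficient at the end is only a rough numerical stand-in for judging linearity by eye.

```python
from statistics import NormalDist

# Tensile strength data (lb/in^2); one row per mixing technique.
data = [
    [3129, 3000, 2865, 2890],
    [3200, 3300, 2975, 3150],
    [2800, 2900, 2985, 3050],
    [2600, 2700, 2600, 2765],
]

# Residual = observation minus its own treatment mean, then sorted.
residuals = []
for row in data:
    mean = sum(row) / len(row)
    residuals.extend(y - mean for y in row)
residuals.sort()

N = len(residuals)

# Assumed plotting positions k/(N+1), one per sorted residual.
probs = [k / (N + 1) for k in range(1, N + 1)]
z_scores = [NormalDist().inv_cdf(p) for p in probs]

# Points of the normal probability plot: (z value, sorted residual).
points = list(zip(z_scores, residuals))

# Pearson correlation between z values and residuals (close to 1 if linear).
mean_r = sum(residuals) / N
mean_z = sum(z_scores) / N
cov = sum((r - mean_r) * (z - mean_z) for z, r in points)
var_r = sum((r - mean_r) ** 2 for r in residuals)
var_z = sum((z - mean_z) ** 2 for z in z_scores)
corr = cov / (var_r * var_z) ** 0.5

for z, r in points:
    print(f"z = {z:+.3f}, residual = {r:+.2f}")
print(f"correlation = {corr:.3f}")
```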
Section 3-1 (E): Plot of residuals vs. predicted tensile strength
As an estimate of the tensile strength for each treatment, we use the treatment mean. The plot of residuals vs. their treatment means gives an indication of the relative sizes of errors between (x-axis) and within (y-axis) treatments.
Section 3-1 (F): Plot of all data
Section 3-2 (A): Tukey test
Tukey's test is similar to the Fisher LSD comparisons in that it allows pairs of treatment means to be compared. However, instead of using the t-statistic, Tukey's test uses the Studentized range statistic q:

$$q = \frac{\lvert\bar{y}_{i.} - \bar{y}_{j.}\rvert}{\sqrt{MS_E/n}}$$

Solving for $\lvert\bar{y}_{i.} - \bar{y}_{j.}\rvert$ yields:

$$\lvert\bar{y}_{i.} - \bar{y}_{j.}\rvert = q\sqrt{\frac{MS_E}{n}}$$

We will compare this with the theoretical value:

$$T_\alpha = q_\alpha(a, N-a)\sqrt{\frac{MS_E}{n}}$$

This can be calculated in R using qtukey(1-alpha,a,N-a)*sqrt(MSe/n) where MSe is $MS_E$. If $\lvert\bar{y}_{i.} - \bar{y}_{j.}\rvert > T_\alpha$, there is a significant difference between the two treatments.
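We now compare this statistic, $T_{0.05} \approx 237.8$, to the differences between the treatment means. Only the three pairs that involve treatment 4 (304.75, 490.00 and 267.50) are large enough to be considered significantly different. The differences for treatments 1 and 2 (185.25) and for treatments 2 and 3 (222.50), which exceed the Fisher LSD of 174.5, fall short of the Tukey value.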
$\lvert\bar{y}_{i.} - \bar{y}_{j.}\rvert$ | 2 | 3 | 4 |
---|---|---|---|
1 | 185.25 | 37.25 | 304.75 |
2 | | 222.50 | 490.00 |
3 | | | 267.50 |
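The Tukey comparison can be sketched the same way. The Studentized range value $q_{0.05}(4, 12) = 4.20$ is hard-coded from a standard table (an assumption; in R it would come from qtukey as described above), and the variable names are illustrative.

```python
import math
from itertools import combinations

# Tensile strength data (lb/in^2), keyed by mixing technique.
data = {
    1: [3129, 3000, 2865, 2890],
    2: [3200, 3300, 2975, 3150],
    3: [2800, 2900, 2985, 3050],
    4: [2600, 2700, 2600, 2765],
}

a = len(data)
n = len(data[1])
N = a * n

means = {i: sum(ys) / n for i, ys in data.items()}
ms_error = sum((y - means[i]) ** 2 for i, ys in data.items() for y in ys) / (N - a)

# Assumed table value for q_alpha(a, N - a) with alpha = 0.05, a = 4, N - a = 12.
Q_CRIT = 4.20

t_alpha = Q_CRIT * math.sqrt(ms_error / n)

# Pairs whose absolute mean difference exceeds the Tukey critical difference.
tukey_significant = [
    (i, j)
    for i, j in combinations(sorted(means), 2)
    if abs(means[i] - means[j]) > t_alpha
]

print(f"T_alpha = {t_alpha:.1f}")
print("significantly different pairs:", tukey_significant)
```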
Section 3-2 (B): Difference between Tukey and Fisher procedures
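The Fisher procedure uses the t-statistic to compare pairs of treatment means, while the Tukey test uses the Studentized range statistic. One consequence of this is that the Fisher procedure controls the error rate α for each individual pairwise comparison, whereas the Tukey test controls the overall (experimentwise) error rate across all pairwise comparisons. The Tukey test is therefore the more conservative of the two: it requires a larger difference between two treatment means before declaring them significantly different.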
Section 3-3: Confidence intervals
We want to find a 95% confidence interval on the mean tensile strength for each mixing technique. The upper bound of each confidence interval is the treatment mean plus a margin based on the t distribution, $\bar{y}_{i.} + t_{\alpha/2,N-a}\sqrt{MS_E/n}$. The lower bound is $\bar{y}_{i.} - t_{\alpha/2,N-a}\sqrt{MS_E/n}$. This margin is smaller than the least significant difference by a factor of $\sqrt{2}$, because it concerns a single mean rather than the difference of two means. With $\alpha = 0.05$ the margin is about 123.4, which gives us a 95% confidence interval (see figure 7).
Treatment | lower bound | treatment mean | upper bound |
---|---|---|---|
Treatment 1 | 2848 | 2971 | 3094 |
Treatment 2 | 3033 | 3156 | 3280 |
Treatment 3 | 2810 | 2933 | 3057 |
Treatment 4 | 2543 | 2666 | 2790 |
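The bounds in the table can be reproduced with the short sketch below, which uses the single-mean margin $t_{\alpha/2,N-a}\sqrt{MS_E/n}$. As before, $t_{0.025,12} = 2.179$ is an assumed table value and the variable names are illustrative.

```python
import math

# Tensile strength data (lb/in^2), keyed by mixing technique.
data = {
    1: [3129, 3000, 2865, 2890],
    2: [3200, 3300, 2975, 3150],
    3: [2800, 2900, 2985, 3050],
    4: [2600, 2700, 2600, 2765],
}

a = len(data)
n = len(data[1])
N = a * n

means = {i: sum(ys) / n for i, ys in data.items()}
ms_error = sum((y - means[i]) ** 2 for i, ys in data.items() for y in ys) / (N - a)

# Assumed table value for t_{0.025,12}.
T_CRIT = 2.179

# Half-width of the 95% confidence interval on a single treatment mean.
margin = T_CRIT * math.sqrt(ms_error / n)

intervals = {i: (m - margin, m + margin) for i, m in means.items()}

for i, (lo, hi) in intervals.items():
    print(f"treatment {i}: {lo:.0f} <= mean <= {hi:.0f}")
```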
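To find the confidence interval on the difference between two treatment means, we subtract to get the difference $\bar{y}_{i.} - \bar{y}_{j.}$, and then add and subtract the least significant difference, $\mathrm{LSD} = t_{\alpha/2,N-a}\sqrt{2MS_E/n} \approx 174.5$, to obtain the bounds (see figure 8). For techniques 1 and 3 the interval runs from about -137 to 211. Because it contains zero, we cannot conclude that these two techniques differ in mean tensile strength.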
Comparison | lower bound | difference in means | upper bound |
---|---|---|---|
Treatment 1 - 2 | -359 | -185 | -10 |
Treatment 1 - 3 | -137 | 37 | 211 |
Treatment 1 - 4 | 130 | 304 | 479 |
Treatment 2 - 3 | 48 | 222 | 396 |
Treatment 2 - 4 | 315 | 490 | 664 |
Treatment 3 - 4 | 93 | 267 | 441 |
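A sketch of the same calculation for the differences in means is given below. It again assumes the table value $t_{0.025,12} = 2.179$, and it reports unrounded bounds, so the last digit may differ slightly from the table above.

```python
import math
from itertools import combinations

# Tensile strength data (lb/in^2), keyed by mixing technique.
data = {
    1: [3129, 3000, 2865, 2890],
    2: [3200, 3300, 2975, 3150],
    3: [2800, 2900, 2985, 3050],
    4: [2600, 2700, 2600, 2765],
}

a = len(data)
n = len(data[1])
N = a * n

means = {i: sum(ys) / n for i, ys in data.items()}
ms_error = sum((y - means[i]) ** 2 for i, ys in data.items() for y in ys) / (N - a)

# Assumed table value for t_{0.025,12}.
T_CRIT = 2.179

# Half-width of the 95% interval on a difference of two means (equal to LSD).
half_width = T_CRIT * math.sqrt(2 * ms_error / n)

diff_intervals = {}
for i, j in combinations(sorted(means), 2):
    diff = means[i] - means[j]
    diff_intervals[(i, j)] = (diff - half_width, diff + half_width)

for (i, j), (lo, hi) in diff_intervals.items():
    contains_zero = lo < 0 < hi
    print(f"{i} - {j}: ({lo:.1f}, {hi:.1f}), contains zero: {contains_zero}")
```

An interval that contains zero corresponds to a pair that the Fisher LSD method does not flag as significantly different.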