Scenario: A research team is running an experiment to assess whether the post-test ratings of students differ significantly between the control and treatment groups. The number of participants in each group was determined via sample size estimation using power and mean, and participants were randomly assigned to the groups.
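
The scenario does not report the planning values behind the sample size, so the following is only a sketch of how such an estimate and a random assignment could be done in base R. The expected difference, standard deviation, power, and seed are all hypothetical and are not taken from the study.

```r
# Hypothetical planning values; the source does not report the ones it used.
expected_difference <- 3   # assumed difference in mean post-test rating
expected_sd         <- 4   # assumed common standard deviation

# Solve for the number of participants per group in a two-sample design.
plan <- power.t.test(delta = expected_difference, sd = expected_sd,
                     sig.level = 0.05, power = 0.80,
                     type = "two.sample", alternative = "two.sided")
n_per_group <- ceiling(plan$n)

# Randomly assign 2 * n_per_group participants to the two groups.
set.seed(2024)  # arbitrary seed, for reproducibility only
assignment <- sample(rep(c("Cntrl", "Tx"), each = n_per_group))
table(assignment)
```

Shuffling a balanced vector of labels, rather than drawing each label independently, guarantees equal group sizes.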

library(xlsx)
data <- read.xlsx("Experiment1.xlsx", sheetName = "Sheet1")

Check the first and last 6 rows and the structure of the data, then generate a summary

head(data)
##     Pre  Post Group
## 1  2.03 17.23 Cntrl
## 2  4.02 16.04 Cntrl
## 3 14.34 19.22 Cntrl
## 4 15.55 19.45 Cntrl
## 5  2.05 18.53 Cntrl
## 6 11.07 21.00 Cntrl
tail(data)
##      Pre  Post Group
## 69 17.32 19.67    Tx
## 70  8.95  6.86    Tx
## 71  8.34 17.63    Tx
## 72 17.52 18.84    Tx
## 73  5.21 15.11    Tx
## 74  3.73 19.07    Tx
str(data)
## 'data.frame':    74 obs. of  3 variables:
##  $ Pre  : num  2.03 4.02 14.34 15.55 2.05 ...
##  $ Post : num  17.2 16 19.2 19.4 18.5 ...
##  $ Group: chr  "Cntrl" "Cntrl" "Cntrl" "Cntrl" ...
summary(data)
##       Pre             Post          Group
##  Min.   : 0.37   Min.   : 5.28   Length:74
##  1st Qu.: 5.48   1st Qu.:16.05   Class :character
##  Median :11.67   Median :18.20   Mode  :character
##  Mean   :10.93   Mean   :16.92
##  3rd Qu.:15.74   3rd Qu.:19.16
##  Max.   :19.77   Max.   :21.04

Create subsets by test administration and groups

#Retrieve Pretest ratings from Control Group
CntrlPre <- data[data$Group == "Cntrl",]$Pre
#Retrieve Pretest ratings from Treatment Group
TxPre <- data[data$Group == "Tx",]$Pre
#Retrieve PostTest ratings from Control Group
CntrlPost <- data[data$Group == "Cntrl",]$Post
#Retrieve PostTest ratings from Treatment Group
TxPost <- data[data$Group == "Tx",]$Post

Test for Normality with Anderson-Darling

library(nortest)
nordata <- cbind(CntrlPre, CntrlPost, TxPre, TxPost)
apply(nordata, 2, function(x) ad.test(x))
## $CntrlPre
##
##  Anderson-Darling normality test
##
## data:  x
## A = 0.75109, p-value = 0.04608
##
##
## $CntrlPost
##
##  Anderson-Darling normality test
##
## data:  x
## A = 2.4165, p-value = 3.078e-06
##
##
## $TxPre
##
##  Anderson-Darling normality test
##
## data:  x
## A = 0.83981, p-value = 0.02754
##
##
## $TxPost
##
##  Anderson-Darling normality test
##
## data:  x
## A = 1.9828, p-value = 3.726e-05

Generate Probability Density Function Plots

par(mfrow=c(2,2))
apply(nordata, 2, function(x) plot(density(x), col = "firebrick"))
## NULL
# Based on both the test statistics and the density plots, the data do not appear to be approximately normally distributed.

Generate Descriptive Statistics

library(psych)
describeBy(data$Pre, data$Group) # The median is used as the measure of central tendency, given the results of the normality tests
##
##  Descriptive statistics by group
## group: Cntrl
##    vars  n  mean   sd median trimmed  mad  min  max range  skew kurtosis   se
## X1    1 37 11.48 5.92  11.65   11.68 8.35 1.37 19.7 18.33 -0.25    -1.33 0.97
## ------------------------------------------------------------
## group: Tx
##    vars  n  mean   sd median trimmed  mad  min   max range  skew kurtosis   se
## X1    1 37 10.37 5.79  11.69   10.35 7.99 0.37 19.77  19.4 -0.05    -1.43 0.95
describeBy(data$Post, data$Group) # The median is used as the measure of central tendency, given the results of the normality tests
##
##  Descriptive statistics by group
## group: Cntrl
##    vars  n  mean   sd median trimmed  mad  min   max range  skew kurtosis   se
## X1    1 37 17.23 3.52  19.05   17.67 1.44 5.28 21.04 15.76 -1.47     1.77 0.58
## ------------------------------------------------------------
## group: Tx
##    vars  n  mean   sd median trimmed  mad  min   max range  skew kurtosis   se
## X1    1 37 16.62 3.17  17.63   17.03 1.96 6.86 20.83 13.97 -1.32      1.1 0.52
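
As a cross-check on the normality tests and the robust summaries above, the same ideas can be sketched in base R alone. The data below are simulated to be left-skewed and are not the study data. `shapiro.test` is used here as a swapped-in alternative to `nortest::ad.test`, and the column names simply mirror the original data frame.

```r
set.seed(1)  # arbitrary seed; the data are simulated, not the study data
sim <- data.frame(
  Group = rep(c("Cntrl", "Tx"), each = 37),
  Post  = 21 - rexp(74, rate = 1 / 4)  # left-skewed, capped below 21
)

# Shapiro-Wilk p-value per group (base R alternative to Anderson-Darling)
tapply(sim$Post, sim$Group, function(x) shapiro.test(x)$p.value)

# Median and MAD per group: summaries that are robust to skewness
med <- tapply(sim$Post, sim$Group, median)
spread <- tapply(sim$Post, sim$Group, mad)
rbind(median = med, mad = spread)
```

When the distribution is skewed, as in the post-test ratings, the median and MAD are less affected by the long tail than the mean and standard deviation.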

Test for Sig. Diff. in Pretest ratings of Control and Treatment

wilcox.test(CntrlPre, jitter(TxPre), alternative = "two.sided", paired = FALSE) # There is no significant difference in the pretest ratings between the Control and Treatment groups (p = 0.4323), which is what random assignment in an experimental design is expected to produce.
##
##  Wilcoxon rank sum exact test
##
## data:  CntrlPre and jitter(TxPre)
## W = 758, p-value = 0.4323
## alternative hypothesis: true location shift is not equal to 0
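
Because `jitter()` adds random noise to break ties, the statistic and p-value above can change slightly from run to run. The sketch below, on simulated data rather than the study data, shows two ways this could be handled: fixing the seed before jittering, or keeping the ties and asking for the normal approximation. The seed values and variable names are arbitrary.

```r
# Simulated ratings; y is rounded to 1 decimal so that it contains ties
set.seed(42)
x <- runif(37, 0, 20)
y <- round(runif(37, 0, 20), 1)

# Option 1: fix the seed so the jittered test is reproducible
set.seed(123); p1 <- wilcox.test(x, jitter(y))$p.value
set.seed(123); p2 <- wilcox.test(x, jitter(y))$p.value
identical(p1, p2)

# Option 2: keep the ties and request the normal approximation explicitly
res_approx <- wilcox.test(x, y, exact = FALSE, correct = TRUE)
res_approx$p.value
```

Option 2 avoids altering the data at all, at the cost of an approximate rather than exact p-value.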

Test for Sig. Diff. in Pretest and Posttest ratings of Control Group

wilcox.test(CntrlPre, jitter(CntrlPost), alternative = "two.sided", paired = TRUE) # The significant difference between the pretest and posttest ratings of the Control Group (p = 3.456e-08) indicates that the increase in posttest ratings is unlikely to be due to chance alone and may be attributed to other factors. This is intriguing, because the change occurred in the Control Group, which did not receive the treatment.
##
##  Wilcoxon signed rank exact test
##
## data:  CntrlPre and jitter(CntrlPost)
## V = 31, p-value = 3.456e-08
## alternative hypothesis: true location shift is not equal to 0
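
To read the `V` statistic: it is the sum of the ranks of the absolute differences for the pairs where the first argument exceeds the second. With 37 pairs it ranges from 0 to 37 × 38 / 2 = 703, so a value near 0 means that almost all pretest ratings were lower than their posttest counterparts. The sketch below reproduces `V` by hand on simulated data (not the study data), assuming no zero differences and no tied absolute differences.

```r
set.seed(7)  # simulated continuous ratings, so no zero or tied differences
pre  <- runif(37, 0, 20)
post <- rnorm(37, mean = 17, sd = 3)

# Signed-rank statistic by hand: rank |differences|, sum ranks where d > 0
d <- pre - post
V_manual <- sum(rank(abs(d))[d > 0])

# Same statistic from wilcox.test
res <- wilcox.test(pre, post, alternative = "two.sided", paired = TRUE)
all.equal(V_manual, unname(res$statistic))
```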

Test for Sig. Diff. in Pretest and Posttest ratings of Treatment Group

wilcox.test(TxPre, jitter(TxPost), alternative = "two.sided", paired = TRUE) # There is also a significant difference between the pretest and posttest ratings of the Treatment Group (p = 1.315e-08), which on its own would suggest that the change is related to the treatment rather than chance. However, this interpretation is questionable, because ratings also increased in the Control Group, where no treatment was introduced.
##
##  Wilcoxon signed rank exact test
##
## data:  TxPre and jitter(TxPost)
## V = 25, p-value = 1.315e-08
## alternative hypothesis: true location shift is not equal to 0
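
A p-value alone does not say how large the pre-to-post change is. As a sketch on simulated data (not the study data), `wilcox.test` can also return an estimate of the typical paired difference and a confidence interval through its `conf.int` argument. The 95% level is an assumption, since the source does not state one.

```r
set.seed(11)  # simulated ratings, not the study data
pre  <- runif(37, 0, 20)
post <- rnorm(37, mean = 17, sd = 3)

# Paired signed-rank test with a (pseudo)median estimate of pre - post
res <- wilcox.test(pre, post, alternative = "two.sided", paired = TRUE,
                   conf.int = TRUE, conf.level = 0.95)
res$estimate   # negative when posttest ratings tend to exceed pretest ratings
res$conf.int
```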

Test for Sig. Diff. in PostTest ratings of Control and Treatment Group

wilcox.test(CntrlPost, jitter(TxPost), alternative = "two.sided", paired = FALSE) # No significant difference is observed between the posttest ratings of the Control and Treatment groups (p = 0.08972). This is consistent with the earlier observation that ratings increased in both groups, including the one that received no treatment.
##
##  Wilcoxon rank sum exact test
##
## data:  CntrlPost and jitter(TxPost)
## W = 842, p-value = 0.08972
## alternative hypothesis: true location shift is not equal to 0
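
Since both groups improved, one possible follow-up, which is not part of the original analysis, is to compare the gain scores (Post minus Pre) between the groups directly. The sketch below uses simulated data with the same column names as the original data frame. The `Gain` column is a hypothetical addition.

```r
set.seed(99)  # simulated data, not the study data
sim <- data.frame(
  Group = rep(c("Cntrl", "Tx"), each = 37),
  Pre   = runif(74, 0, 20),
  Post  = rnorm(74, mean = 17, sd = 3)
)

# Gain score per participant
sim$Gain <- sim$Post - sim$Pre

# Rank sum test on the gains: did one group improve more than the other?
res <- wilcox.test(Gain ~ Group, data = sim, alternative = "two.sided")
res$p.value
```

A non-significant result here would point in the same direction as the posttest comparison above: the improvement is not specific to the treatment.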