Comparison of User Ratings between Traditional and System Setup

Scenario: The company is planning to replace its traditional setup with a new system. Before deciding, a team was formed to assess whether such a move would lead to a marked improvement in user satisfaction ratings. The team will test whether the post-test ratings of users differ significantly between the control (traditional) and treatment (system) groups. The number of users in each group was determined through sample size estimation based on statistical power and the expected means, and randomization was used in assigning users to the groups.
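The sample size estimation and random assignment described in the scenario could look roughly like the sketch below. The planning values (`delta`, `sd`, the 80% power target) are hypothetical placeholders and are not the figures the team actually used; `power.t.test()` from base R is only one possible way to size a two-group comparison of means.

```r
# Hypothetical planning values; none of these come from the study itself.
set.seed(123)                      # fixed seed so the assignment is reproducible

planned <- power.t.test(
  delta = 3,                       # assumed smallest difference in mean rating worth detecting
  sd = 4.5,                        # assumed standard deviation of the ratings
  sig.level = 0.05,
  power = 0.80,
  type = "two.sample"
)
n_per_group <- ceiling(planned$n)  # round up to whole users per group

# Randomly assign users to two equally sized groups
assignment <- sample(rep(c("system", "traditional"), each = n_per_group))
table(assignment)
```

Shuffling a balanced vector of labels, rather than drawing each label independently, guarantees equal group sizes.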

Open the data set

library(xlsx)
data <- read.xlsx("Experiment2.xlsx", sheetName = "Sheet1") # select the sheet by name

Check the first and last six rows, inspect the data structure, and generate a summary

head(data)

##   NA.       Pre     Post  Group
## 1   1 10.221375 19.27253 system
## 2   2 11.915686 18.81053 system
## 3   3 18.337294 22.71338 system
## 4   4 14.803235 16.46261 system
## 5   5  7.123360 18.42337 system
## 6   6  6.090328 20.80002 system

tail(data)

##    NA.      Pre     Post       Group
## 79  79 12.55211 13.88570 traditional
## 80  80 15.78812 14.64949 traditional
## 81  81 15.17438 23.67276 traditional
## 82  82 12.83547 22.57924 traditional
## 83  83 10.67936 11.44996 traditional
## 84  84 13.41753 21.68929 traditional

str(data)

## 'data.frame':    84 obs. of  4 variables:
##  $ NA.  : chr  "1" "2" "3" "4" ...
##  $ Pre  : num  10.22 11.92 18.34 14.8 7.12 ...
##  $ Post : num  19.3 18.8 22.7 16.5 18.4 ...
##  $ Group: chr  "system" "system" "system" "system" ...

summary(data)

##      NA.                 Pre              Post          Group          
##  Length:84          Min.   : 3.940   Min.   : 4.30   Length:84         
##  Class :character   1st Qu.: 7.916   1st Qu.:12.91   Class :character  
##  Mode  :character   Median :10.728   Median :16.83   Mode  :character  
##                     Mean   :10.647   Mean   :16.37                     
##                     3rd Qu.:12.623   3rd Qu.:20.07                     
##                     Max.   :18.354   Max.   :24.62

Clean the data by removing the irrelevant first column (the row index, imported as NA.)

data <- data[,-1]

Subset data by test administration (Pre or Post test) and Group (Traditional or New System)

SysPre <- data[data$Group == "system",]$Pre
SysPost <- data[data$Group == "system",]$Post
TradPre <- data[data$Group == "traditional",]$Pre
TradPost <- data[data$Group == "traditional",]$Post
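As an alternative to the four separate subsetting calls, the same vectors can be obtained by splitting the data frame once by group. The sketch below uses a small made-up data frame (`toy`) rather than the real file, so the values are illustrative only.

```r
# Toy data with the same column layout as the cleaned data set (values are made up)
toy <- data.frame(
  Pre   = c(10.2, 11.9, 12.6, 15.8),
  Post  = c(19.3, 18.8, 13.9, 14.6),
  Group = c("system", "system", "traditional", "traditional")
)

# split() returns a named list with one data frame per group
by_group <- split(toy[, c("Pre", "Post")], toy$Group)

by_group$system$Post      # post-test ratings of the system group
by_group$traditional$Pre  # pretest ratings of the traditional group
```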

Conduct normality testing for the data for each group and test administration

library(nortest)
nordata <- data.frame(cbind(SysPre, SysPost, TradPre, TradPost))
nortest <- apply(nordata, 2, function(x) ad.test(x))
nortest # All four p-values exceed 0.05, so the tests do not reject normality for any subset

## $SysPre
## 
##  Anderson-Darling normality test
## 
## data:  x
## A = 0.24322, p-value = 0.7516
## 
## 
## $SysPost
## 
##  Anderson-Darling normality test
## 
## data:  x
## A = 0.63055, p-value = 0.09373
## 
## 
## $TradPre
## 
##  Anderson-Darling normality test
## 
## data:  x
## A = 0.4832, p-value = 0.2178
## 
## 
## $TradPost
## 
##  Anderson-Darling normality test
## 
## data:  x
## A = 0.32375, p-value = 0.5146
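A normality test result is often cross-checked with a second test and a Q-Q plot. The sketch below swaps in the Shapiro-Wilk test from base R (not the Anderson-Darling test used above) and runs it on simulated columns, so neither the data nor the resulting p-values relate to the study.

```r
set.seed(1)

# Simulated columns: one roughly normal, one right-skewed (not the study data)
sim <- data.frame(
  normal_like = rnorm(42, mean = 10, sd = 3),
  skewed      = rexp(42, rate = 0.1)
)

# Shapiro-Wilk p-value for each column
p_values <- sapply(sim, function(x) shapiro.test(x)$p.value)
round(p_values, 4)

# Q-Q plots: points far from the reference line indicate departures from normality
par(mfrow = c(1, 2))
for (nm in names(sim)) {
  qqnorm(sim[[nm]], main = nm)
  qqline(sim[[nm]])
}
```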

Generate Probability Density Function Plots

par(mfrow=c(2,2))
for (nm in names(nordata)) plot(density(nordata[[nm]]), main = nm, col = "darkorchid") # title each panel with its subset name

Generate Boxplots

par(mfrow=c(2,2))
for (nm in names(nordata)) boxplot(nordata[[nm]], main = nm, col = "steelblue") # title each panel with its subset name

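Judging from the statistics and the plots, some subsets appear not to be approximately normally distributed, even though the Anderson-Darling tests do not reject normality at the 0.05 level. The remaining steps therefore rely on the median and on nonparametric tests.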

Generate Descriptive Statistics

library(psych)
describeBy(data$Pre, data$Group) # The median is used as the measure of central tendency, given the normality assessment above

## 
##  Descriptive statistics by group 
## group: system
##    vars  n  mean   sd median trimmed  mad  min   max range skew kurtosis   se
## X1    1 42 10.61 3.61  10.56   10.51 3.58 3.94 18.35 14.41 0.22    -0.46 0.56
## ------------------------------------------------------------ 
## group: traditional
##    vars  n  mean   sd median trimmed  mad  min   max range  skew kurtosis   se
## X1    1 42 10.69 2.84  11.23   10.66 3.06 5.19 15.84 10.65 -0.08    -1.07 0.44

describeBy(data$Post, data$Group)

## 
##  Descriptive statistics by group 
## group: system
##    vars  n  mean   sd median trimmed  mad  min  max range  skew kurtosis   se
## X1    1 42 16.23 4.76  16.35   16.37 6.19 7.43 23.9 16.47 -0.14    -1.23 0.73
## ------------------------------------------------------------ 
## group: traditional
##    vars  n mean   sd median trimmed  mad min   max range  skew kurtosis   se
## X1    1 42 16.5 4.48  17.01   16.65 5.34 4.3 24.62 20.32 -0.32    -0.35 0.69
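If only the median and spread per group are needed, base R can produce them without an extra package. The sketch below uses a made-up data frame (`toy`), so the numbers are illustrative and do not reproduce the tables above.

```r
# Toy post-test ratings for two groups (values are made up)
toy <- data.frame(
  Post  = c(19.3, 18.8, 22.7, 16.5, 13.9, 14.6, 23.7, 22.6),
  Group = rep(c("system", "traditional"), each = 4)
)

# Median and interquartile range of Post within each group
med    <- tapply(toy$Post, toy$Group, median)
spread <- tapply(toy$Post, toy$Group, IQR)

data.frame(median = med, IQR = spread)
```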

Test for a significant difference in pretest ratings between System and Traditional (unpaired)

wilcox.test(SysPre, TradPre, paired = FALSE) # No significant difference in pretest ratings between System and Traditional

## 
##  Wilcoxon rank sum exact test
## 
## data:  SysPre and TradPre
## W = 851, p-value = 0.7861
## alternative hypothesis: true location shift is not equal to 0
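To make the W statistic in the output less opaque: it is the sum of the ranks of the first sample in the pooled data, minus the smallest value that sum could take. The sketch below checks this on two short made-up vectors, which are not taken from the study.

```r
# Two small made-up samples without ties
x <- c(10.2, 11.9, 18.3, 14.8)
y <- c(12.6, 15.8, 15.2, 12.8)

# Rank all observations together, then sum the ranks belonging to x
r <- rank(c(x, y))
rank_sum_x <- sum(r[seq_along(x)])

# Subtract the minimum possible rank sum, n1 * (n1 + 1) / 2
W_manual <- rank_sum_x - length(x) * (length(x) + 1) / 2

# Same statistic as reported by wilcox.test()
W_test <- unname(wilcox.test(x, y)$statistic)
c(manual = W_manual, wilcox.test = W_test)
```

Equivalently, W counts the pairs in which a value from `x` exceeds a value from `y`.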

Test for a significant difference between pretest and post-test ratings in the System group (paired)

wilcox.test(SysPost, SysPre, paired = TRUE) # Significant difference: post-test ratings tend to be higher than pretest ratings, by about 3.88 to 7.37 (95% confidence interval, which wilcox.test reports only when conf.int = TRUE)

## 
##  Wilcoxon signed rank exact test
## 
## data:  SysPost and SysPre
## V = 844, p-value = 4.107e-08
## alternative hypothesis: true location shift is not equal to 0

Test for a significant difference between pretest and post-test ratings in the Traditional group (paired)

wilcox.test(TradPost, TradPre, paired = TRUE) # Significant difference: post-test ratings tend to be higher than pretest ratings, by about 4.68 to 6.96 (95% confidence interval, which wilcox.test reports only when conf.int = TRUE)

## 
##  Wilcoxon signed rank exact test
## 
## data:  TradPost and TradPre
## V = 882, p-value = 2.033e-10
## alternative hypothesis: true location shift is not equal to 0
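The confidence intervals quoted in the comments on the paired tests are not part of the default output. A minimal sketch of how such an interval can be requested is given below, using simulated paired ratings; the resulting numbers will not match the study.

```r
set.seed(42)

# Simulated paired ratings for 42 users (not the study data)
pre  <- rnorm(42, mean = 10.6, sd = 3)
post <- pre + rnorm(42, mean = 5.5, sd = 4)

# conf.int = TRUE adds the pseudomedian of the paired differences and its interval
res <- wilcox.test(post, pre, paired = TRUE, conf.int = TRUE, conf.level = 0.95)

res$estimate   # estimated location shift (pseudomedian of post - pre)
res$conf.int   # 95% confidence interval for that shift
```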

Test for a significant difference in post-test ratings between System and Traditional (unpaired)

wilcox.test(SysPost, TradPost, paired = FALSE) # No significant difference in post-test ratings between System and Traditional

## 
##  Wilcoxon rank sum exact test
## 
## data:  SysPost and TradPost
## W = 847, p-value = 0.7588
## alternative hypothesis: true location shift is not equal to 0

The analysis shows that the difference in post-test user ratings between the system and traditional setups is not statistically significant, so any observed difference may simply be due to chance rather than to the introduction of the treatment. Although ratings in the system group increased significantly from pretest to post-test, the same increase is observed in the traditional group, which suggests that factors other than the treatment are driving the rise in user ratings. Given these results, investing in the new system may not be a good decision, as user ratings do not appear to differ much between the two setups.
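One way to probe this conclusion more directly, which was not part of the analysis above, is to compare the per-user gain (Post minus Pre) between the two groups. The sketch below uses simulated data in which both groups gain the same amount on average by construction; the column names mirror the data set, but the values are not the study's.

```r
set.seed(7)

# Simulated data: 42 users per group, identical average gain in both groups
sim <- data.frame(
  Group = rep(c("system", "traditional"), each = 42),
  Pre   = rnorm(84, mean = 10.6, sd = 3)
)
sim$Post <- sim$Pre + rnorm(84, mean = 5.7, sd = 4)
sim$Gain <- sim$Post - sim$Pre

# Unpaired comparison of gains between the two groups
gain_test <- wilcox.test(Gain ~ Group, data = sim, conf.int = TRUE)

gain_test$p.value    # a large value is consistent with similar gains in both groups
gain_test$conf.int   # interval for the difference in location of the gains
```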