Analysis of Covariance (ANCOVA)

name

Walden University

RSCH 8250

instructor

Analysis of Covariance (ANCOVA)

**Introduction**

According to Morrow (2009), ANCOVA is an extension of analysis of variance in which an additional variable, called a covariate, is added to the equation. The covariate statistically controls for the effect of that variable and also increases the sensitivity of the statistical test (Morrow, 2009). In the Smart Alex task, the researchers want to see the effect of two different therapies (cruel-to-be-kind therapy and psychodynamic therapy) on stalking behavior. The independent variable is the therapy approach, and the dependent variable is stalking behavior after therapy (posttest). The covariate is pretest stalking behavior (Morrow, 2009).

**Assumptions**

There are six assumptions of analysis of covariance mentioned by Dr. Morrow, and they are as follows:

**Outliers**

An outlier is an observation point that is distant from other observations. It may be due to variability in measurements or experimental errors, and it can greatly impact the result of an analysis.

**Homogeneity of Regression**

This assumption states that the relationship between the covariate and the outcome is the same in every group. In other words, the regression slope relating the outcome to the covariate does not differ across groups; when it does differ, this is called heterogeneity of regression (Field, 2013).

**Normality of dependent variable**

The values of the dependent variable and the covariate should be normally distributed. A histogram is used to check whether the distributions are normal.

**Homogeneity of variances**

This means the variance of the outcome (dependent) variable should be equal across the groups of the independent variable, and the covariate should likewise have equal variances across those groups.

**Multi-collinearity**

If there is more than one covariate, the covariates should not be highly correlated with each other, meaning the absolute value of the correlation between them should not be greater than 0.8.
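The 0.8 rule above can be sketched in Python. This is only an illustration under stated assumptions: the covariate scores are invented (the Stalker data has a single covariate, so no such check is needed here), and the helper names `pearson_r` and `too_collinear` are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def too_collinear(x, y, threshold=0.8):
    """True when |r| exceeds the threshold (0.8 is the cutoff named in the text)."""
    return abs(pearson_r(x, y)) > threshold


# Hypothetical covariate scores, invented purely for illustration.
covariate_a = [1, 2, 3, 4, 5, 6]
covariate_b = [2, 4, 5, 4, 5, 7]
covariate_c = [3, 1, 4, 1, 5, 2]

print(too_collinear(covariate_a, covariate_b))
print(too_collinear(covariate_a, covariate_c))
```

A pair of covariates flagged by such a check would usually lead to dropping or combining one of them before running the ANCOVA.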

**Missing data**

Missing data can substantially lower the power of an analysis of covariance, so they should be dealt with before running the analysis. There are no missing data in this case, as seen in the frequency table.

**Whether the Assumptions have been met**

**Assumption: Outliers**

The output below shows the descriptive statistics and frequency tables used to check for outliers.

[DataSet1] C:UsersKudeeDesktopSPSS-NEWStalker.sav

Statistic | | Time Spent Stalking Before Therapy (hours per week) | Time Spent Stalking After Therapy (hours per week) |
---|---|---|---|
N | Valid | 50 | 50 |
N | Missing | 0 | 0 |
Mean | | 65.22 | 58.40 |
Std. Error of Mean | | 1.507 | 1.929 |
Median | | 64.50 | 61.00 |
Mode | | 57a | 61 |
Std. Deviation | | 10.655 | 13.641 |
Variance | | 113.522 | 186.082 |
Skewness | | .254 | -1.344 |
Std. Error of Skewness | | .337 | .337 |
Kurtosis | | -.657 | 3.150 |
Std. Error of Kurtosis | | .662 | .662 |
Minimum | | 46 | 11 |
Maximum | | 89 | 80 |
Percentiles | 25 | 57.00 | 54.75 |
Percentiles | 50 | 64.50 | 61.00 |
Percentiles | 75 | 73.50 | 64.00 |

Time Spent Stalking Before Therapy (hours per week)

Value | Frequency | Percent | Valid Percent | Cumulative Percent |
---|---|---|---|---|
46 | 1 | 2.0 | 2.0 | 2.0 |
47 | 1 | 2.0 | 2.0 | 4.0 |
50 | 1 | 2.0 | 2.0 | 6.0 |
51 | 1 | 2.0 | 2.0 | 8.0 |
52 | 2 | 4.0 | 4.0 | 12.0 |
53 | 3 | 6.0 | 6.0 | 18.0 |
54 | 1 | 2.0 | 2.0 | 20.0 |
57 | 4 | 8.0 | 8.0 | 28.0 |
58 | 1 | 2.0 | 2.0 | 30.0 |
59 | 1 | 2.0 | 2.0 | 32.0 |
60 | 3 | 6.0 | 6.0 | 38.0 |
61 | 2 | 4.0 | 4.0 | 42.0 |
62 | 2 | 4.0 | 4.0 | 46.0 |
63 | 1 | 2.0 | 2.0 | 48.0 |
64 | 1 | 2.0 | 2.0 | 50.0 |
65 | 1 | 2.0 | 2.0 | 52.0 |
66 | 4 | 8.0 | 8.0 | 60.0 |
68 | 1 | 2.0 | 2.0 | 62.0 |
71 | 3 | 6.0 | 6.0 | 68.0 |
72 | 3 | 6.0 | 6.0 | 74.0 |
73 | 1 | 2.0 | 2.0 | 76.0 |
75 | 4 | 8.0 | 8.0 | 84.0 |
77 | 2 | 4.0 | 4.0 | 88.0 |
79 | 2 | 4.0 | 4.0 | 92.0 |
80 | 1 | 2.0 | 2.0 | 94.0 |
85 | 1 | 2.0 | 2.0 | 96.0 |
87 | 1 | 2.0 | 2.0 | 98.0 |
89 | 1 | 2.0 | 2.0 | 100.0 |
Total | 50 | 100.0 | 100.0 | |

Time Spent Stalking After Therapy (hours per week)

Value | Frequency | Percent | Valid Percent | Cumulative Percent |
---|---|---|---|---|
11 | 1 | 2.0 | 2.0 | 2.0 |
18 | 1 | 2.0 | 2.0 | 4.0 |
34 | 1 | 2.0 | 2.0 | 6.0 |
35 | 1 | 2.0 | 2.0 | 8.0 |
40 | 1 | 2.0 | 2.0 | 10.0 |
46 | 1 | 2.0 | 2.0 | 12.0 |
47 | 2 | 4.0 | 4.0 | 16.0 |
50 | 2 | 4.0 | 4.0 | 20.0 |
52 | 1 | 2.0 | 2.0 | 22.0 |
54 | 1 | 2.0 | 2.0 | 24.0 |
55 | 4 | 8.0 | 8.0 | 32.0 |
56 | 2 | 4.0 | 4.0 | 36.0 |
58 | 1 | 2.0 | 2.0 | 38.0 |
59 | 1 | 2.0 | 2.0 | 40.0 |
60 | 1 | 2.0 | 2.0 | 42.0 |
61 | 7 | 14.0 | 14.0 | 56.0 |
62 | 5 | 10.0 | 10.0 | 66.0 |
63 | 2 | 4.0 | 4.0 | 70.0 |
64 | 4 | 8.0 | 8.0 | 78.0 |
65 | 2 | 4.0 | 4.0 | 82.0 |
70 | 2 | 4.0 | 4.0 | 86.0 |
71 | 1 | 2.0 | 2.0 | 88.0 |
74 | 1 | 2.0 | 2.0 | 90.0 |
78 | 3 | 6.0 | 6.0 | 96.0 |
79 | 1 | 2.0 | 2.0 | 98.0 |
80 | 1 | 2.0 | 2.0 | 100.0 |
Total | 50 | 100.0 | 100.0 | |

**Explanation:** Outliers are checked using the interquartile range (IQR) of the pretest scores:
- IQR = Q3 - Q1 = 73.50 - 57.00 = 16.5
- Upper cutoff: 73.50 + 16.5 = 90, so any value above 90 is an outlier
- Lower cutoff: 57.00 - 16.5 = 40.5, so any value below 40.5 is an outlier

Looking at the data for time spent stalking after therapy, five values fall below 40.5 (11, 18, 34, 35, and 40). These are outliers, meaning the first assumption has been violated.
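The cutoff arithmetic above can be sketched in Python. This is a minimal illustration, assuming the rule used in this paper of one IQR beyond each quartile (many texts use 1.5 × IQR instead) and the pretest quartiles from the statistics output. The function name `iqr_cutoffs` and its `multiplier` parameter are hypothetical, and the posttest scores are transcribed from the frequency table.

```python
def iqr_cutoffs(q1, q3, multiplier=1.0):
    """Return (lower, upper) outlier cutoffs from two quartiles.

    multiplier=1.0 mirrors the rule used in this paper (an assumption
    of this sketch); 1.5 is the more common convention.
    """
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr


# Posttest scores as (value, frequency), transcribed from the frequency table.
posttest_counts = [
    (11, 1), (18, 1), (34, 1), (35, 1), (40, 1), (46, 1), (47, 2), (50, 2),
    (52, 1), (54, 1), (55, 4), (56, 2), (58, 1), (59, 1), (60, 1), (61, 7),
    (62, 5), (63, 2), (64, 4), (65, 2), (70, 2), (71, 1), (74, 1), (78, 3),
    (79, 1), (80, 1),
]
posttest = [value for value, count in posttest_counts for _ in range(count)]

# Quartiles of the pretest scores, as reported in the statistics output.
lower, upper = iqr_cutoffs(q1=57.00, q3=73.50)
flagged = [score for score in posttest if score < lower or score > upper]

print(f"cutoffs: {lower} to {upper}")
print(f"flagged posttest scores: {flagged}")
```

Passing `multiplier=1.5`, or the posttest quartiles instead of the pretest ones, would move the cutoffs and may change which scores are flagged.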

Output Created | | 27-SEP-2015 19:21:28 |
---|---|---|
Input | Data | C:UsersKudeeDesktopSPSS-NEWStalker.sav |
Input | Active Dataset | DataSet1 |
Input | Filter | |
Input | Weight | |
Input | Split File | |
Input | N of Rows in Working Data File | 50 |
Missing Value Handling | Definition of Missing | User-defined missing values are treated as missing. |
Missing Value Handling | Cases Used | Statistics are based on all cases with valid data. |
Syntax | | FREQUENCIES VARIABLES=stalk1 stalk2 /NTILES=4 /NTILES=10 /PERCENTILES=25.0 50.0 75.0 /STATISTICS=STDDEV VARIANCE MINIMUM MAXIMUM SEMEAN MEAN MEDIAN MODE SUM SKEWNESS SESKEW KURTOSIS SEKURT /ORDER=ANALYSIS. |
Resources | Processor Time | 00:00:00.02 |
Resources | Elapsed Time | 00:00:00.04 |

Outliers should be deleted or transformed, but dealing with them remains a challenge: deleting cases reduces the sample size, and transforming scores can make the results difficult to interpret (Laureate Education, Inc., 2009). The assumption of no outliers is not met because several outliers are present.

**Assumption: Homogeneity of Regression**

**Univariate Analysis of Variance**

Output Created | | 27-SEP-2015 10:03:29 |
---|---|---|
Input | Data | C:UsersKudeeDesktopSPSS-NEWStalker.sav |
Input | Active Dataset | DataSet1 |
Input | Filter | |
Input | Weight | |
Input | Split File | |
Input | N of Rows in Working Data File | 50 |
Missing Value Handling | Definition of Missing | User-defined missing values are treated as missing. |
Missing Value Handling | Cases Used | Statistics are based on all cases with valid data for all variables in the model. |
Syntax | | UNIANOVA stalk2 BY group WITH stalk1 /METHOD=SSTYPE(3) /INTERCEPT=INCLUDE /PRINT=OPOWER ETASQ HOMOGENEITY DESCRIPTIVE /CRITERIA=ALPHA(.05) /DESIGN=group stalk1. |
Resources | Processor Time | 00:00:00.08 |
Resources | Elapsed Time | 00:00:00.82 |

Between-Subjects Factors

Group | Value Label | N |
---|---|---|
1 | Cruel to be Kind Therapy | 25 |
2 | Psychodyshamic Therapy | 25 |

Descriptive Statistics (dependent variable: stalk2)

Group | Mean | Std. Deviation | N |
---|---|---|---|
Cruel to be Kind Therapy | 54.96 | 16.331 | 25 |
Psychodyshamic Therapy | 61.84 | 9.410 | 25 |
Total | 58.40 | 13.641 | 50 |

Levene's Test of Equality of Error Variances

F | df1 | df2 | Sig. |
---|---|---|---|
7.189 | 1 | 48 | .010 |

Tests of Between-Subjects Effects

Source | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared | Noncent. Parameter | Observed Powerb |
---|---|---|---|---|---|---|---|---|
Corrected Model | 5006.278a | 2 | 2503.139 | 28.613 | .000 | .549 | 57.225 | 1.000 |
Intercept | .086 | 1 | .086 | .001 | .975 | .000 | .001 | .050 |
Group | 480.265 | 1 | 480.265 | 5.490 | .023 | .105 | 5.490 | .631 |
stalk1 | 4414.598 | 1 | 4414.598 | 50.462 | .000 | .518 | 50.462 | 1.000 |
Error | 4111.722 | 47 | 87.483 | | | | | |
Total | 179646.000 | 50 | | | | | | |
Corrected Total | 9118.000 | 49 | | | | | | |

**Explanation:**

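To test the assumption of homogeneity of regression slopes, the analysis was run in SPSS to produce the Tests of Between-Subjects Effects table. The covariate was significant, F(1, 47) = 50.462, p < .001, which shows a strong relationship between the covariate (pretest stalking behavior) and the outcome (posttest stalking behavior). This F value is the main effect of the covariate; a direct test of the assumption requires the group × stalk1 interaction term, which is not part of the model specified above (/DESIGN=group stalk1). The conclusion drawn here, that the assumption has been violated, should therefore be treated as tentative.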
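The F ratios and partial eta squared values in the Tests of Between-Subjects Effects table can be reproduced from the sums of squares. The sketch below assumes only the standard definitions (F is the effect mean square divided by the error mean square, and partial eta squared is the effect sum of squares divided by the sum of the effect and error sums of squares). The function names are hypothetical, and the inputs are copied from the table.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = mean square of the effect / mean square of the error."""
    return (ss_effect / df_effect) / (ss_error / df_error)


def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared = SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)


# Type III sums of squares and df from the Tests of Between-Subjects Effects table.
SS_ERROR, DF_ERROR = 4111.722, 47
effects = {"group": (480.265, 1), "stalk1": (4414.598, 1)}

results = {}
for name, (ss, df) in effects.items():
    results[name] = (
        f_ratio(ss, df, SS_ERROR, DF_ERROR),
        partial_eta_squared(ss, SS_ERROR),
    )
    f_value, eta = results[name]
    print(f"{name}: F({df}, {DF_ERROR}) = {f_value:.3f}, partial eta squared = {eta:.3f}")
```

Both F ratios use the error degrees of freedom (47) as the denominator df.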

**Assumption: Normality of dependent variables**

GRAPH

/HISTOGRAM(NORMAL)=stalk1

/PANEL ROWVAR=group ROWOP=CROSS.

GRAPH

/HISTOGRAM(NORMAL)=stalk2

/PANEL ROWVAR=group ROWOP=CROSS.

Based on the histograms produced by the syntax above, this assumption is treated as met, since the hours spent stalking appear approximately normally distributed in both the pretest and posttest variables.
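As a numeric complement to the histograms, the skewness and kurtosis values from the statistics output can be divided by their standard errors. This sketch is not part of the original analysis: the ±1.96 cutoff is an assumed rule of thumb, the function name `z_score` is hypothetical, and the row labels are inferred from the order of the SPSS output.

```python
def z_score(statistic, standard_error):
    """Ratio of a skewness or kurtosis statistic to its standard error."""
    return statistic / standard_error


# Skewness, kurtosis, and their standard errors from the statistics output.
reported = {
    "pretest (stalk1)": {"skewness": (0.254, 0.337), "kurtosis": (-0.657, 0.662)},
    "posttest (stalk2)": {"skewness": (-1.344, 0.337), "kurtosis": (3.150, 0.662)},
}

CUTOFF = 1.96  # assumed rule-of-thumb threshold, not taken from the paper

z_values = {}
for variable, stats in reported.items():
    for name, (value, se) in stats.items():
        z = z_score(value, se)
        z_values[(variable, name)] = z
        flag = "exceeds" if abs(z) > CUTOFF else "within"
        print(f"{variable} {name}: z = {z:.2f} ({flag} +/-{CUTOFF})")
```

A ratio well outside the cutoff would suggest a departure from normality worth checking against the corresponding histogram.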

**Assumption: Homogeneity of variances**

Levene's test of equality of error variances, shown below, indicates that the variances are not equal, F(1, 48) = 7.189, p = .010; hence the assumption of equal variances has been violated.

F | df1 | df2 | Sig. |
---|---|---|---|
7.189 | 1 | 48 | .010 |
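As a rough descriptive complement to Levene's test (not a replacement for it), the ratio of the larger to the smaller group variance can be computed from the posttest standard deviations in the descriptive statistics output. The function name `variance_ratio` is hypothetical.

```python
def variance_ratio(sd_a, sd_b):
    """Larger group variance divided by the smaller one."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)


# Posttest standard deviations from the descriptive statistics output.
sd_cruel_to_be_kind = 16.331
sd_psychodyshamic = 9.410

ratio = variance_ratio(sd_cruel_to_be_kind, sd_psychodyshamic)
print(f"variance ratio = {ratio:.2f}")
```

A ratio close to 1 would indicate similar spread in the two groups; the further it is from 1, the more the group variances differ.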

The assumption of no multicollinearity is met because the analysis has only one covariate. The assumption of no missing data is met because there are no missing data.

**The Null and Alternative hypothesis**

*Null Hypothesis*: H0: µ1 = µ2

Stalking behavior before the intervention is independent of stalking behavior after the intervention.

*Alternate Hypothesis*: H1: µ1 ≠ µ2

Stalking behavior before the intervention is dependent on stalking behavior after the intervention.

Syntax-

UNIANOVA stalk1 BY group WITH stalk2

/CONTRAST(group)=Simple(1)

/METHOD=SSTYPE(3)

/INTERCEPT=INCLUDE

/EMMEANS=TABLES(group) WITH(stalk2=MEAN) COMPARE ADJ(SIDAK)

/PRINT=HOMOGENEITY DESCRIPTIVE PARAMETER

/CRITERIA=ALPHA(.05)

/DESIGN=stalk2.

**Univariate Analysis of Variance**

[DataSet1] C:UsersKudeeDesktopSPSS-NEWStalker.sav

Value Label | N | ||

Group | 1 | Cruel to be Kind Therapy | 25 |
---|---|---|---|

2 | Psychodyshamic Therapy | 25 |

Group | Statistic | Value | Bootstrap Bias | Bootstrap Std. Error | Bootstrap 95% CI Lower | Bootstrap 95% CI Upper |
---|---|---|---|---|---|---|
Cruel to be Kind Therapy | Mean | 64.84 | .03 | 2.15 | 60.50 | 68.86 |
Cruel to be Kind Therapy | Std. Deviation | 10.680 | -.279 | .999 | 8.338 | 12.367 |
Cruel to be Kind Therapy | N | 25 | 0 | 3 | 18 | 32 |
Psychodyshamic Therapy | Mean | 65.60 | -.01 | 2.15 | 61.63 | 69.83 |
Psychodyshamic Therapy | Std. Deviation | 10.836 | -.316 | 1.408 | 7.742 | 13.118 |
Psychodyshamic Therapy | N | 25 | 0 | 3 | 18 | 32 |
Total | Mean | 65.22 | .01 | 1.53 | 62.10 | 68.34 |
Total | Std. Deviation | 10.655 | -.151 | .871 | 8.643 | 12.185 |
Total | N | 50 | 0 | 0 | 50 | 50 |

F | df1 | df2 | Sig. |
---|---|---|---|

.398 | 1 | 48 | .531 |

Source | Type III Sum of Squares | df | Mean Square | F | Sig. |
---|---|---|---|---|---|

Corrected Model | 2761.166a | 1 | 2761.166 | 47.310 | .000 |

Intercept | 2777.500 | 1 | 2777.500 | 47.590 | .000 |

stalk2 | 2761.166 | 1 | 2761.166 | 47.310 | .000 |

Error | 2801.414 | 48 | 58.363 | ||

Total | 218245.000 | 50 | |||

Corrected Total | 5562.580 | 49 | |||

Parameter | B | Std. Error | t | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
---|---|---|---|---|---|---|
Intercept | 33.083 | 4.796 | 6.899 | .000 | 23.441 | 42.725 |
stalk2 | .550 | .080 | 6.878 | .000 | .389 | .711 |

Parameter | B | Bootstrap Bias | Bootstrap Std. Error | Bootstrap Sig. (2-tailed) | Bootstrap 95% CI Lower | Bootstrap 95% CI Upper |
---|---|---|---|---|---|---|
Intercept | 33.083 | -1.572 | 7.675 | .003 | 14.174 | 44.067 |
stalk2 | .550 | .025 | .125 | .004 | .366 | .850 |

**Estimated Marginal Means**

Group | Mean | Std. Error | 95% CI Lower Bound | 95% CI Upper Bound | Bootstrap Bias | Bootstrap Std. Error | Bootstrap 95% CI Lower | Bootstrap 95% CI Upper |
---|---|---|---|---|---|---|---|---|
Cruel to be Kind Therapy | 65.220a | 1.080 | 63.048 | 67.392 | .010 | 1.526 | 62.100 | 68.339 |
Psychodyshamic Therapy | 65.220a | 1.080 | 63.048 | 67.392 | .010 | 1.526 | 62.100 | 68.339 |

(I) Group | (J) Group | Mean Difference (I-J) | Std. Error | Sig.a | 95% CI for Difference Lower Bound | 95% CI for Difference Upper Bound |
---|---|---|---|---|---|---|
Cruel to be Kind Therapy | Psychodyshamic Therapy | .000 | .000 | . | .000 | .000 |
Psychodyshamic Therapy | Cruel to be Kind Therapy | .000 | .000 | . | .000 | .000 |

**Table of the result in APA format**

**Table 1:**

*Summary of Stalking-Type Behavior at Posttest by Therapy Group*

Group | n | Mean | SD |
---|---|---|---|
Cruel to be Kind Therapy | 25 | 54.96a | 16.331 |
Psychodyshamic Therapy | 25 | 61.84b | 9.410 |

**Table 2:**

*Therapy Differences in Stalking Behaviors After Controlling for Pretest Stalking Behaviors*

Source | df | F | Partial η² |
---|---|---|---|
Pretest Stalking Behaviors (covariate) | 1 | 50.46* | .518 |
Therapy | 1 | 5.49** | .105 |
Error | 47 | | |

*p < .001. **p < .05.

**Report the results using correct APA format**

Pretest stalking behavior was a significant covariate in this analysis of covariance, F(1, 47) = 50.46, p < .001, partial η² = .52 (Morrow, 2009). That is, the covariate significantly adjusted the scores of the dependent variable, posttest stalking behavior (Morrow, 2009). After controlling for pretest stalking behavior, there was a significant difference in posttest stalking behavior between the two interventions, F(1, 47) = 5.49, p = .023, partial η² = .11. Those in cruel-to-be-kind therapy (M = 54.96, SD = 16.33) engaged in less stalking behavior after therapy than those in the alternative therapy (M = 61.84, SD = 9.41). However, this result should be interpreted with caution because some of the assumptions were violated and several outliers were identified (Morrow, 2009).

**Sample size using G-Power**

**F tests** - Variance: Test of equality (two sample case)

**Analysis:** A priori: Compute required sample size

**Input:**
- Tail(s) = Two
- Ratio var1/var0 = 0.5
- α err prob = 0.05
- Power (1-β err prob) = 0.80
- Allocation ratio N2/N1 = 2

**Output:**
- Lower critical F = 0.623735
- Upper critical F = 1.672053
- Numerator df = 48
- Denominator df = 98
- Sample size group 1 = 49
- Sample size group 2 = 99

Using G*Power set for an F test of equality of variances with a variance ratio of 0.50, alpha of 0.05, power of 0.80, and an allocation ratio (N2/N1) of 2, the calculated sample size is 49 for group 1 and 99 for group 2. The sample size for ANCOVA, or for a within-subjects test with a covariate, has to be large in order to increase the power of the analysis and yield reliable results. In this case, samples of 49 and 99 are large. One advantage of a large sample is that heterogeneity of the covariate does not affect the result much, and the analysis is less sensitive to violations of the homogeneity of variance assumption. Generalizability and reliability also increase with a large sample size (Laureate Education, Inc., 2009).

References

Field, A. (2013). *Discovering statistics using IBM SPSS statistics.* SAGE Publications Ltd.

Laureate Education, Inc. (Executive Producer). (2009). *ANCOVA*.

Sage Publications. (2013). *Andy Field's datasets* [Data files]. Available from the *Discovering Statistics Using IBM SPSS Statistics* companion website: http://www.sagepub.com/field4e/study/datasets.htm

IBM PASW (formerly SPSS) Statistical Software version 21.

G*Power statistical software 3.1.9.2.