Graduate Statistics Topic 5: Benchmark: Correlation and Regression Project

PSY-520 Graduate Statistics

Topic 5 – Benchmark – Correlation and Regression Project

Directions: Use the following information to complete the questions below. While APA format is not required for the body of this assignment, solid academic writing is expected, and documentation of sources should be presented using APA formatting guidelines, which can be found in the APA Style Guide, located in the Student Success Center.

Player # Player's Age (x) Batting Average (y) XY X^2 Y^2
1 26 338 8788 676 114244
2 24 318 7632 576 101124
3 33 318 10494 1089 101124
4 33 316 10428 1089 99856
5 25 315 7875 625 99225
6 41 315 12915 1681 99225
7 24 312 7488 576 97344
8 29 307 8903 841 94249
9 34 304 10336 1156 92416
10 28 302 8456 784 91204
11 28 301 8428 784 90601
12 37 300 11100 1369 90000
13 34 298 10132 1156 88804
14 32 296 9472 1024 87616
15 39 295 11505 1521 87025
16 24 294 7056 576 86436
17 24 294 7056 576 86436
18 30 293 8790 900 85849
19 38 289 10982 1444 83521
20 34 288 9792 1156 82944
21 36 287 10332 1296 82369
22 32 287 9184 1024 82369
23 33 286 9438 1089 81796
24 31 285 8835 961 81225
25 28 284 7952 784 80656
26 31 284 8804 961 80656
27 29 278 8062 841 77284
28 26 276 7176 676 76176
29 29 275 7975 841 75625
30 22 274 6028 484 75076
31 24 274 6576 576 75076
32 31 273 8463 961 74529
33 23 271 6233 529 73441
34 29 271 7859 841 73441
35 26 270 7020 676 72900
36 29 268 7772 841 71824
37 37 268 9916 1369 71824
38 26 267 6942 676 71289
39 25 267 6675 625 71289
r=.144 ∑x=1164 ∑y=11338 ∑xy=338870 ∑x^2=35650 ∑y^2=3308088
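
As a cross-check on the column totals, the sums and Pearson's r can be recomputed from the raw x and y columns. The following is a minimal sketch in plain Python, not the SPSS procedure used for the assignment. The variable names are illustrative, and player 13's average is entered as 298 on the assumption that the row's XY (10132) and Y^2 (88804) entries are the correct ones.

```python
from math import sqrt

# Ages (x) and batting averages (y, in thousandths) for players 1-39, from the table.
# Assumption: player 13's average is 298, the value implied by that row's XY and Y^2.
ages = [26, 24, 33, 33, 25, 41, 24, 29, 34, 28, 28, 37, 34, 32, 39, 24, 24, 30, 38, 34,
        36, 32, 33, 31, 28, 31, 29, 26, 29, 22, 24, 31, 23, 29, 26, 29, 37, 26, 25]
averages = [338, 318, 318, 316, 315, 315, 312, 307, 304, 302, 301, 300, 298, 296, 295,
            294, 294, 293, 289, 288, 287, 287, 286, 285, 284, 284, 278, 276, 275, 274,
            274, 273, 271, 271, 270, 268, 268, 267, 267]

n = len(ages)
sum_x = sum(ages)
sum_y = sum(averages)
sum_xy = sum(x * y for x, y in zip(ages, averages))
sum_x2 = sum(x * x for x in ages)
sum_y2 = sum(y * y for y in averages)

# Computational formula for Pearson's correlation coefficient.
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(r, 3))
```

Working from the raw columns rather than the printed products means a typo in a single XY or Y^2 cell cannot carry through to r.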
           
Coefficients
Model         B        Std. Error  Beta  t       Sig.
1 (Constant)  275.146  17.817            15.443  .000
  Age         .522     .589        .144  .885    .382
(B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.)
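
The coefficient table can be reproduced by hand from the summary sums with the usual least-squares formulas. The sketch below is illustrative only: it assumes a simple regression of batting average on age, uses hypothetical variable names, and takes ∑x^2 and ∑y^2 as recomputed from the x and y columns of the data table.

```python
from math import sqrt

# Summary sums for the 39 players.
# Assumption: sum_x2 and sum_y2 are recomputed from the x and y data columns.
n = 39
sum_x, sum_y, sum_xy = 1164, 11338, 338870
sum_x2, sum_y2 = 35650, 3308088

mean_x = sum_x / n
mean_y = sum_y / n

# Corrected sums of squares and cross-products.
sxx = sum_x2 - sum_x ** 2 / n
syy = sum_y2 - sum_y ** 2 / n
sxy = sum_xy - sum_x * sum_y / n

# Unstandardized coefficients (column B).
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual standard deviation with n - 2 degrees of freedom.
sse = syy - slope * sxy
s = sqrt(sse / (n - 2))

# Standard errors and t statistics.
se_slope = s / sqrt(sxx)
se_intercept = s * sqrt(1 / n + mean_x ** 2 / sxx)
t_slope = slope / se_slope
t_intercept = intercept / se_intercept

# Standardized coefficient; in simple regression it equals Pearson's r.
beta = slope * sqrt(sxx / syy)

print(round(intercept, 3), round(se_intercept, 3), round(t_intercept, 3))
print(round(slope, 3), round(se_slope, 3), round(beta, 3), round(t_slope, 3))
```

Under this model, the predicted batting average for a player of a given age is the intercept plus the slope times that age.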

-The R value of the linear model is 0.144, a weak positive correlation between age and batting average. R square is the proportion of variation in the DV (batting average) that the model explains: 0.144² ≈ 0.021, so age accounts for only about 2.1% of the variation.

  • Select at least three variables that you believe have a linear relationship.
    • Specify which variable is dependent and which are independent.
    • Independent variable: MLB players' ages
    • Dependent variable: Batting averages in 2016
    • Third variable: Player #, used to order the players (from highest to lowest batting average) to help identify outliers
  • Collect the data for these variables and describe your data collection technique and why it was appropriate as well as why the sample size was best.
  • -Data were collected from a website that listed the top 2016 MLB players by age and batting average. The sample contains 39 players, which represents the top batters of 2016 rather than the entire MLB player population.
    • Submit the data collected by submitting the SPSS data file with your submission.
    • Frequency Table:
  • Find the Correlation coefficient for each of the possible pairings of dependent and independent variables and describe the relationship in terms of strength and direction.
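  • -The constant (275.146) is the predicted batting average (y) when age is 0. The Sig. value is .000 for the constant and .382 for the independent variable (IV), age. Sig. is the p-value, not Pearson's correlation coefficient (r); r for age and batting average is 0.144, a weak positive relationship. Sig. ranges from 0 to 1, and the smaller it is, the less likely a correlation this large would be observed if no true relationship existed. At .382, the correlation is not statistically significant.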
  • Find a linear model of the relationship between the three (or more) variables of interest.
Model Summary
Model  R     R Square  Adjusted R Square  Std. Error of the Estimate
1      .873  .762      .749               874.779

-A linear relationship is best represented by a straight regression line. Based on the data collected, the relationship between MLB players' ages and batting averages is positive but weak (r = 0.144), and it is not statistically significant (Sig. = .382).
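
The significance judgment rests on the t statistic for age (.885) and its degrees of freedom (n - 2 = 37). As a rough illustration of where a Sig. value comes from, the sketch below integrates the Student's t density numerically with Simpson's rule. The function names are hypothetical, and this is not how SPSS computes the value; it is only a standard-library approximation whose result can be compared with the Sig. of .382 reported for age.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value from Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area = total * h / 3  # probability between 0 and |t|
    return 1 - 2 * area


# t for age from the coefficient table; 39 players give 37 degrees of freedom.
p_value = two_tailed_p(0.885, 39 - 2)
print(round(p_value, 3))
```

A p-value well above the conventional .05 cutoff means the data do not provide evidence of a linear relationship between age and batting average in this sample.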

  • Explain the validity of the model.

Reference

Witte, R. S., & Witte, J. S. (2015). Statistics (10th ed.). Hoboken, NJ: Wiley.
