Descriptive Statistics Data Analysis

PSY 325: Statistics for the Behavioral & Social Sciences

Introduction:

Descriptive statistics describe the main features of a set of data in quantitative terms. Below is the data set representing the first twenty days of sales of a new item for a bakery, as noted by the bookkeeper. This report performs a descriptive statistics analysis on the assigned data using a descriptive statistics calculator, evaluates the measures of central tendency, and checks whether the data contain potential outliers. Understanding which descriptive statistics best summarize the data set gives us a clearer view of why this type of analysis is suited to this set of data.

Entering Data Set:

1 2 3 4 6 7 8 8 8 10 11 11 12 13 16 18 22 24 26 30

Statistic | Value |
---|---|
Minimum | 1 |
Maximum | 30 |
Range | 29 |
Count | 20 |
Sum | 240 |
Mean | 12 |
Median | 10.5 |
Mode | 8 |
Standard Deviation | 8.26533662 |
Variance | 68.3157895 |
Mid-Range | 15.5 |
Quartile Q1 | 6.5 |
Quartile Q2 | 10.5 |
Quartile Q3 | 17 |
Interquartile Range | 10.5 |
Sum of Squares | 1298 |
Mean Absolute Deviation | 6.5 |
Root Mean Square | 14.4533733 |
Standard Error of the Mean | 1.84816545 |
Skewness | 0.696282313 |
Kurtosis | 2.40979604 |
Coefficient of Variation | 0.688778052 |
Relative Standard Deviation | 68.8778052% |

Value | Frequency | Frequency % |
---|---|---|
1 | 1 | 5.00 |
2 | 1 | 5.00 |
3 | 1 | 5.00 |
4 | 1 | 5.00 |
6 | 1 | 5.00 |
7 | 1 | 5.00 |
8 | 3 | 15.00 |
10 | 1 | 5.00 |
11 | 2 | 10.00 |
12 | 1 | 5.00 |
13 | 1 | 5.00 |
16 | 1 | 5.00 |
18 | 1 | 5.00 |
22 | 1 | 5.00 |
24 | 1 | 5.00 |
26 | 1 | 5.00 |
30 | 1 | 5.00 |

A measure of central tendency is a single value that describes a set of data by identifying the central location within the collection. The mean, median, and mode are all valid measures of central tendency, but each is more appropriate under different conditions, and some summarize a given data set better than others (Statistics.laerd.com).

In this data analysis report, the mode appears to be the most appropriate measure of central tendency, given the way the data are recorded in this example.

The question here is: Are the mean, median, and mode close to the same value? If not, what does this tell you about the numbers in the set (Week 1 Guidance, 2019)? Here the mean is 12, the median is 10.5, and the mode is 8, so the three measures are not the same. When a data set has an even number of values, there are two middle numbers. With 20 data values, the middle values are at positions 10 and 11 (the values 10 and 11), and the median is their average, 10.5. I would think that the mean, and to a much lesser extent the median, can be influenced by outliers. A mean larger than the median, which in turn is larger than the mode, also means that the data may not follow a typical bell-shaped curve distribution. Another common issue with the mode is that it will not provide a good measure of central tendency when the most common value is far away from the rest of the data in the data set. The mode in this data set is 8. It appears three times in the data set and is the only number that does. Therefore, the mode for this data set is 8; there is only one mode.
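The comparison of the three measures can be reproduced with a few lines of Python. This is a minimal sketch rather than the calculator's actual procedure; the helper names `median` and `modes` are hypothetical. It shows how the median is taken as the average of the two middle values when the count is even, and how the mode is found by counting occurrences.

```python
from collections import Counter

# First twenty days of sales, as listed in the data set above.
sales = [1, 2, 3, 4, 6, 7, 8, 8, 8, 10, 11, 11, 12, 13, 16, 18, 22, 24, 26, 30]


def median(data):
    """Middle value for an odd count, average of the two middle values for an even count."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes(data):
    """Return every value that shares the highest frequency."""
    counts = Counter(data)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)


mean_value = sum(sales) / len(sales)
median_value = median(sales)
mode_values = modes(sales)

print("mean:", mean_value)
print("median:", median_value)
print("mode(s):", mode_values)
```

Returning a list from `modes` is a deliberate choice in this sketch, since a data set can have more than one mode; for these data the list holds a single value.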

When I calculated the quartiles with the calculator, the values of Q1 and Q3 gave me an IQR of 10.5. When I did it manually, I got an IQR of 3.5. Since Q3 − Q1 = 17 − 6.5 = 10.5, the calculator's IQR is the one consistent with the quartiles. Multiplying Q1 and Q3 by the calculator's IQR gave a lower value of 68.25 (6.5 × 10.5) and an upper value of 178.5 (17 × 10.5). Multiplying Q1 and Q3 by my manual IQR gave a lower value of 22.75 (6.5 × 3.5) and an upper value of 59.5 (17 × 3.5).

When testing for outliers by looking at the data table, I saw none. Checking with the interquartile calculator, the interquartile range is 10.5. Any outliers in this data set would be less than -25 or greater than 48.5. Potential outliers would lie between -25 and -9.25, inclusive, or between 32.75 and 48.5, inclusive. Because every value in the data set falls between 1 and 30, I found no outliers and no potential outliers.
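The limits quoted above follow from the quartiles: -9.25 and 32.75 are Q1 − 1.5 × IQR and Q3 + 1.5 × IQR, while -25 and 48.5 are Q1 − 3 × IQR and Q3 + 3 × IQR. The sketch below recomputes them. The helper names are hypothetical, and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which is an assumption chosen because it reproduces the reported Q1 = 6.5 and Q3 = 17; other quartile conventions can give slightly different values.

```python
# First twenty days of sales, as listed in the data set above.
sales = [1, 2, 3, 4, 6, 7, 8, 8, 8, 10, 11, 11, 12, 13, 16, 18, 22, 24, 26, 30]


def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves (assumed convention)."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[len(ordered) - half:]  # skips the middle value when the count is odd
    return median(lower), median(upper)


def fences(q1, q3, multiplier):
    """Lower and upper limits at a given multiple of the IQR."""
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr


q1, q3 = quartiles(sales)
inner_low, inner_high = fences(q1, q3, 1.5)  # limits for potential outliers
outer_low, outer_high = fences(q1, q3, 3.0)  # limits for outliers

outliers = [x for x in sales if x < outer_low or x > outer_high]
potential = [
    x for x in sales
    if (outer_low <= x <= inner_low) or (inner_high <= x <= outer_high)
]

print("Q1, Q3, IQR:", q1, q3, q3 - q1)
print("inner limits:", inner_low, inner_high)
print("outer limits:", outer_low, outer_high)
print("outliers:", outliers)
print("potential outliers:", potential)
```

Both lists come back empty for these data, which agrees with the conclusion that the set has no outliers and no potential outliers.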

I used the method from section 2.4, which uses Q1, Q3, and the IQR to set the lower and upper limits for outliers (Tanner, 2016). This method, called the IQR method, is related to the five-number summary. It subtracts a multiple of the IQR from the 1st quartile, Q1, and adds the same multiple to the 3rd quartile, Q3; values beyond those limits are flagged. Even when all the data are sampled from a Gaussian distribution (so no true outliers are present), the method still has some chance of flagging one or more values as outliers. My conclusion was the same. I found no outliers.

The descriptive statistics that best summarize this data set are the measures of central tendency. There are three measures of central tendency: the mode, the median, and the mean. Each identifies a value around which the data points in a distribution cluster. Each measure indicates what is typical in a set of data, and each one relies on a different standard for what counts as typical. As a result, the three statistics complement each other (Tanner, 2016). The three categories of descriptive statistics that I picked for this data set are **Central Tendency**, **Dispersion or Variation**, and **Measures of Frequency**. I chose them because together they describe how a bakery's sales of a new item typically behave over the first 20 days. Central tendency shows the typical level of sales, and the frequency counts show which sales figure repeats most often, which is the mode.

**Conclusion**

In this report, we evaluated the measures of central tendency. The mode is the most appropriate measure for this type of data. Through the calculations, we concluded that the data set contains no outliers that would distort the mean, median, or mode. The interquartile range was calculated two ways, with the calculator and manually, with clarification as to which result fits these statistics. We used two different approaches to determine whether there are outliers, and we compared them to see whether they reach the same conclusion, as researchers in statistical studies sometimes do. In the end, we explained why the descriptive statistics we chose fit this data set. The distribution is roughly mound-shaped but skewed right (skewness of about 0.70), with a range of 29 and all values inside the outlier limits. The frequency information at the end of the chart shows the number of occurrences for each data value in the data set.

References

Tanner, D. (2016). Statistics for the behavioral & social sciences (2nd ed.). Retrieved from https://content.ashford.edu.