## QNT 561 Signature Assignment

**Part 1: Preliminary analysis**

The objective of this study is to determine the average census for hospitals in the United States, the types of hospital ownership, the proportions of general medical and psychiatric hospitals, and the average number of births and of personnel employed in an average hospital in the United States. The questions addressed in the study are:

i) What is the average census for an average hospital in the US?

ii) What are the main types of hospital ownership in the US?

iii) What are the proportions of general medical and psychiatric hospitals?

iv) What is the average number of births in an average hospital in the US?

v) What is the average number of personnel employed by an average hospital in the US?

The population in the study comprises all general medical and psychiatric hospitals in seven selected regions of the United States: South, Northeast, Midwest, Southwest, Rocky Mountain, California and Northwest. The sample comprises 200 hospitals selected at random from these seven regions (Hardy & Bryman, 2009).

The geographical region of a hospital is qualitative data. Control is also qualitative, as it shows the type of ownership of each hospital. Similarly, service is qualitative: it shows the type of hospital, general medical or psychiatric. Census, births and personnel are quantitative data, as they are numerical counts. Geographical region, control and service are all at the nominal level of measurement because they are simply names with no particular order. Census, births and personnel are all at the ratio level of measurement because each has a meaningful zero (Hardy & Bryman, 2009).

**Part 2: Descriptive Statistics**

From the data on geographical region, the distribution of the hospitals across the seven regions is as follows: South 56, Northeast 30, Midwest 60, Southwest 3, Rocky Mountain 20, California 19 and Northwest 12. From the data on control, the ownership of the hospitals in the study is as follows: government, nonfederal 51; non-government, not-for-profit 86; for-profit 45; and federal government 18. From the service data, 168 hospitals in the sample were general medical hospitals and 32 were psychiatric hospitals.
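The frequency counts above can be turned into sample proportions with a few lines of Python. This is only a minimal sketch: the dictionary names and the helper `proportions` are illustrative choices, and the counts are the control and service counts reported in this section.

```python
# Counts as reported in this section (sample of 200 hospitals).
control_counts = {
    "government, nonfederal": 51,
    "non-government, not-for-profit": 86,
    "for-profit": 45,
    "federal government": 18,
}
service_counts = {"general medical": 168, "psychiatric": 32}


def proportions(counts):
    """Convert a mapping of category -> count into category -> sample proportion."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


control_share = proportions(control_counts)
service_share = proportions(service_counts)

for category, share in service_share.items():
    print(f"{category}: {share:.2%}")
```

Checking that each set of counts sums to the sample size of 200 is a quick way to catch a transcription error in a frequency table.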

Using Microsoft Excel, the descriptive statistics for census, births and personnel can be determined.

For census: mean = 144.095, median = 102.5, mode = 28, standard deviation = 149.566 and variance ≈ 22,370. The five-number summary for census is thus {min = 2, first quartile = 47.75, median = 102.5, third quartile = 183.25, max = 1106}.

For births: mean = 874.045, median = 480, mode = 0, range = 5699, standard deviation = 1063.666 and variance = 1131384.556. The five-number summary for births is thus {min = 0, first quartile = 0, median = 480, third quartile = 1309.25, max = 5699}.

For personnel: mean = 861.5, median = 589.5, mode = 328, range = 4037, standard deviation = 821.597 and variance = 675021.618. The five-number summary for personnel is thus {min = 50, first quartile = 314, median = 589.5, third quartile = 1095.25, max = 4087} (Shi & McLarty, 2009).
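The same summary measures can be reproduced outside Excel. The sketch below uses Python's standard `statistics` module; the helper `describe` and the list `sample` are hypothetical, and the values in `sample` are illustrative only, not the hospital data. It assumes the inclusive quartile convention, which is understood to match Excel's QUARTILE function.

```python
import statistics


def describe(values):
    """Return the summary measures reported in this section for one variable."""
    # method="inclusive" interpolates between the minimum and maximum,
    # treating them as the 0th and 100th percentiles.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": median,
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),        # sample SD (n - 1 denominator)
        "variance": statistics.variance(values),  # sample variance
        "five_number": (min(values), q1, median, q3, max(values)),
    }


# Hypothetical illustrative values, NOT the hospital data set.
sample = [2, 4, 4, 5, 7, 9, 10, 12, 15]
summary = describe(sample)

for name, value in summary.items():
    print(name, value)
```

Note that the sample standard deviation is the square root of the sample variance, which gives a quick consistency check on any reported pair of values.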

Outliers in the data are identified by first determining the interquartile range (IQR), which is the difference between the third quartile and the first quartile. The IQR is then used to define a lower bound and an upper bound. The lower bound is found by subtracting 1.5 times the IQR from the first quartile. The upper bound is found by adding 1.5 times the IQR to the third quartile. Any value that lies below the lower bound or above the upper bound is considered an outlier (MacRae, Welford & MacRae, 2011).

From the census data, IQR = 181.75 - 47.75 = 134, 1.5×IQR = 201

Lower bound = 47.75 - 201 = -153.25, upper bound = 181.75 + 201 = 382.75

The outliers in the census data are 461, 414, 460, 390, 418, 456, 1106, 516, 395, 923, 416, 797 and 523.

For the births data, IQR = 1309.25 - 0 = 1309.25, 1.5×IQR = 1963.88

Lower bound = 0 - 1963.88 = -1963.88, upper bound = 1309.25 + 1963.88 = 3273.13

The outliers in the births data are 3810, 3966, 3714, 3968, 3655, 5699, 3346, 3311 and 4207.

For the personnel data, IQR = 1095.25 - 314 = 781.25, 1.5×IQR = 1171.88

Lower bound = 314 - 1171.88 = -857.88, upper bound = 1095.25 + 1171.88 = 2267.13

The outliers in the personnel data are 2310, 3694, 3486, 3301, 3928, 2581, 2534, 2620, 3123, 2745, 4087, 3012, 3090, 2312 and 3516 (MacRae, Welford & MacRae, 2011).
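The 1.5×IQR rule applied above can be sketched in Python as follows. The function names `iqr_bounds` and `find_outliers` are illustrative, the quartiles are the personnel quartiles reported in this section, and the list `candidates` mixes a few reported personnel values with one hypothetical non-outlier (2200) for contrast.

```python
def iqr_bounds(q1, q3, k=1.5):
    """Return (lower bound, upper bound) for the k x IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3, k=1.5):
    """Return the values lying strictly outside the k x IQR bounds."""
    lower, upper = iqr_bounds(q1, q3, k)
    return [v for v in values if v < lower or v > upper]


# Personnel quartiles as reported in this section.
personnel_q1, personnel_q3 = 314, 1095.25
lower, upper = iqr_bounds(personnel_q1, personnel_q3)

# 50, 2310 and 4087 appear in this section; 2200 is a hypothetical value.
candidates = [50, 2200, 2310, 4087]
flagged = find_outliers(candidates, personnel_q1, personnel_q3)

print(f"bounds: [{lower}, {upper}]")
print("flagged:", flagged)
```

Because every count is non-negative, the negative lower bounds can never be crossed, so all outliers in these three variables lie above the upper bound.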

The data for census, births and personnel can be presented on scatter plots, where the outliers can clearly be observed.

[Figures: scatter plots of census, births and personnel]

**Part 3: Inferential statistics.**

1) To construct a confidence interval for the average census for hospitals, the Student's t distribution is used since the population standard deviation is unknown. The confidence interval for the average census is thus given by:

μ = x̄ ± t(s/√n), where x̄ is the sample mean, s is the sample standard deviation, n is the sample size and t is the critical value obtained from the Student's t tables for the stated confidence level and (n-1) degrees of freedom.

For the 90% confidence level, α = 0.10 and degrees of freedom = (n-1) = 199, t = 1.6525

Therefore, μ = 144.095 ± 1.6525(149.566/√200)

= 144.095 ± 17.477

The confidence interval is [126.618, 161.572] (Schi & Tao, 2008).

For the 95% confidence level, α = 0.05 and degrees of freedom = (n-1) = 199, t = 1.972

Therefore, μ = 144.095 ± 1.972(149.566/√200)

= 144.095 ± 20.856

The confidence interval is [123.239, 164.951] (Schi & Tao, 2008).

It can be observed that the margin of error increases as the confidence level increases. This is because a higher confidence level has a larger critical t-value, and the margin of error is directly proportional to that value. For the 90% confidence level, the margin of error is 17.477, while for the 95% confidence level it is 20.856. Hence, an increase in the confidence level widens the confidence interval. However, the point estimate of the mean remains unchanged (Frederic, 2009).
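The two intervals can be checked with a short Python sketch. The function `t_interval` is an illustrative helper, not part of any library; the critical values 1.6525 and 1.972 are taken from the t tables as quoted above rather than computed, since the Python standard library has no Student's t quantile function.

```python
import math


def t_interval(mean, sd, n, t_crit):
    """Return (lower, upper, margin) for the interval mean +/- t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin, margin


# Sample statistics for census as used in this part.
census_mean, census_sd, n = 144.095, 149.566, 200

# Critical values for 199 degrees of freedom, read from t tables.
low90, high90, margin90 = t_interval(census_mean, census_sd, n, t_crit=1.6525)
low95, high95, margin95 = t_interval(census_mean, census_sd, n, t_crit=1.972)

print(f"90%: [{low90:.3f}, {high90:.3f}], margin {margin90:.3f}")
print(f"95%: [{low95:.3f}, {high95:.3f}], margin {margin95:.3f}")
```

Only `t_crit` differs between the two calls, which makes it easy to see that the wider interval comes entirely from the larger critical value.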

2) From the hospital database, the sample proportion of hospitals that are ‘general medical’ is determined by dividing the number of general medical hospitals in the sample by the total number of hospitals in the sample. There are 168 general medical hospitals in the sample of 200 hospitals. Hence, the sample proportion for general medical hospitals is 168/200 = 0.84.

The confidence interval for the population proportion is given by the interval [p̂ ± Z√(p̂q̂/n)] where p̂ is the sample proportion, q̂= (1-p̂), Z is the z-score value corresponding to the given confidence level and n is the sample size. At 95% level of confidence, Z = 1.96, p̂= 0.84, q̂= 0.16 and n= 200.

The confidence interval for the population proportion is thus given by;

0.84 ± 1.96 √[(0.84×0.16)/200]

= 0.84 ± 0.05 = [0.79, 0.89]

The point estimate for the population proportion is the sample proportion, so in this case the point estimate is 0.84. The margin of error of the interval is given by the term Z√(p̂q̂/n), which here is 0.05 (Frederic, 2009).
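The proportion interval can be sketched in Python using only the standard library. The helper `proportion_interval` is an illustrative name; it uses the normal-approximation formula given above and obtains Z from `statistics.NormalDist`, which returns approximately 1.96 for a 95% confidence level.

```python
import math
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Return (p_hat, margin, (lower, upper)) for a normal-approximation interval."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, (p_hat - margin, p_hat + margin)


# 168 general medical hospitals out of 200 sampled.
p_hat, margin, (low, high) = proportion_interval(168, 200)

print(f"point estimate: {p_hat:.2f}")
print(f"margin of error: {margin:.2f}")
print(f"95% interval: [{low:.2f}, {high:.2f}]")
```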

3) To test the claim that the average hospital in the United States averages more than 700 births per year, the first step is to state the null and alternative hypotheses. The null hypothesis contains the equality. Hence, in this case the hypotheses are as follows:

H0: μ ≤ 700

H1: μ > 700 (claim)

Since the population variance is unknown, the test statistic is given by:

t = (x̄-μ0)/(s/√n), where x̄ is the sample mean, μ0 is the hypothesized value (700), s is the sample standard deviation and n is the sample size. For the births, x̄ = 874.045, s = 1063.666 and n = 200. Substituting these values:

t = (874.045-700)/(1063.666/√200) = 174.045/75.213

t = 2.314

From the Student's t tables, with α = 0.01 and (n-1) = 199 degrees of freedom, the critical value t0, the 100(1-α)th percentile, is 2.3452.

The null hypothesis is rejected if t ≥ t0. Since 2.314 < 2.3452, that is, t < t0, we fail to reject H0. The conclusion is that there is not enough evidence to support the claim that the average hospital in the United States averages more than 700 births per year (Lehmann & Romano, 2010).

4) To test whether hospitals in the United States employ fewer than 900 personnel on average, start by formulating the hypotheses. The statement that hospitals in the US employ fewer than 900 personnel represents the claim. Therefore, the hypotheses are as follows:

H0: μ ≥ 900

H1: μ < 900 (claim)

Since the population variance is unknown, it is appropriate to use the t-test. The test statistic is given by:

t = (x̄-μ0)/(s/√n). For the personnel, x̄ = 861.5, s = 821.597 and n = 200. μ0 is the hypothesized value, which in this case is 900. The test statistic is therefore:

t = (861.5-900)/(821.597/√200) = -38.5/58.096

t = -0.6627

From the Student's t tables, with α = 0.10 and (n-1) = 199 degrees of freedom, the critical value t0, the 100(1-α)th percentile, is 1.2858.

The null hypothesis is rejected if t ≤ -t0. Since -0.6627 > -1.2858, that is, t > -t0, we fail to reject H0. The conclusion is that there is not enough evidence to support the claim that hospitals in the United States employ fewer than 900 personnel (Lehmann & Romano, 2010).
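Tests 3 and 4 use the same statistic and differ only in the direction of the rejection region, so both can be sketched with one pair of helpers. The names `t_statistic` and `reject_null` are illustrative, and the critical values 2.3452 and 1.2858 are the table values quoted above rather than computed ones.

```python
import math


def t_statistic(sample_mean, mu0, sd, n):
    """One-sample t statistic: (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (sd / math.sqrt(n))


def reject_null(t, t_crit, tail):
    """Decision rule for a one-tailed test with a positive critical value t_crit."""
    if tail == "right":   # H1: mu > mu0
        return t >= t_crit
    if tail == "left":    # H1: mu < mu0
        return t <= -t_crit
    raise ValueError("tail must be 'right' or 'left'")


# Test 3: births, H1: mu > 700, alpha = 0.01, critical value from t tables.
t_births = t_statistic(874.045, 700, 1063.666, 200)
births_rejected = reject_null(t_births, t_crit=2.3452, tail="right")

# Test 4: personnel, H1: mu < 900, alpha = 0.10, critical value from t tables.
t_personnel = t_statistic(861.5, 900, 821.597, 200)
personnel_rejected = reject_null(t_personnel, t_crit=1.2858, tail="left")

print(f"births: t = {t_births:.3f}, reject H0: {births_rejected}")
print(f"personnel: t = {t_personnel:.4f}, reject H0: {personnel_rejected}")
```

The births statistic falls just short of its critical value, so that decision is sensitive to the chosen significance level, whereas the personnel statistic is far from its rejection region.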

**Part 4: Conclusions**

From the above data analysis, an estimated 84% of hospitals in the United States are general medical hospitals (95% confidence interval 79% to 89%) and about 16% are psychiatric hospitals. In the sample, the Midwest region has the most hospitals (60), while the Southwest region has the fewest (3). The largest group of hospitals in the sample is non-government, not-for-profit (86), while the smallest group is controlled by the federal government (18).

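From the results of the hypothesis tests, there is not enough evidence at the 0.01 significance level to conclude that the average hospital in the US averages more than 700 births per year. Likewise, at the 0.10 significance level there is not enough evidence to conclude that hospitals in the US employ fewer than 900 personnel. Failing to reject a null hypothesis does not prove it, so these results do not establish that the true averages are at most 700 births or at least 900 personnel. If the variance of the population were known, it would help in making more certain conclusions.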

**References**

Frederic, P. M. A. F. V. J. M. (2009). Confidence interval. Place of publication not identified: Vdm Pub. House.

Hardy, M. A., & Bryman, A. (2009). Handbook of data analysis. Los Angeles: SAGE.

Lehmann, E. L., & Romano, J. P. (2010). Testing statistical hypotheses. New York: Springer.

MacRae, S., Welford, H., & MacRae, S. (2011). Descriptive statistics. Moseley: Stat Basics.

Schi, N.-Z., & Tao, J. (2008). Statistical hypothesis testing: Theory and methods. Singapore: World Scientific.
