THE SCENARIO: You are the Director of Admissions for a large business school. Students seeking admission must take the GMAT (Graduate Management Admission Test). You want to gather some inferential statistics about your students’ GPAs. You also wish to determine whether a student’s GMAT score is useful for predicting the student’s GPA at graduation.

THE DATA FILE: The GMAT scores and GPAs at graduation of 20 randomly selected students are contained in the file GMAT.xlsx, located in Session 8. For linear regression, the *X*-variable is the GMAT score, and the *Y*-variable is GPA.

INSTRUCTIONS: Answer all the questions below. All calculations must be performed with Excel or PHStat. Attach Excel or PHStat output where indicated. You will receive zero credit for any answer lacking the required Excel or PHStat output.

**ROUND OFF ALL CALCULATIONS TO AT LEAST FOUR DECIMAL PLACES.** Highlight the cells with output where decimal places need setting. Then use the “Increase Decimal” tool on Excel’s Home/Number menu. If you have problems obtaining the required decimal places, contact me.

1. Find the mean and standard deviation of the **sample GPA**: [4 POINTS]

PASTE EXCEL DESCRIPTIVE STATISTICS BELOW. [IF YOU FORGOT HOW TO DO THIS, SEE TOPIC 2 CHAPTER 3 EXAMPLES.]

Sample Mean: ____________

Sample Standard deviation: _____________

2. Assume that the population is normally distributed, but the population standard deviation is not known. Use your **sample** data to find a 95% confidence interval for the true mean GPA of all students in the MBA program: [6 POINTS]

PASTE PHSTAT OUTPUT BELOW:

State the margin of error of the confidence interval: ______________

3. Assume that the population standard deviation is 0.30 and assume the population is approximately normally distributed. Find the sample size that would be required to determine a 95% confidence interval for the true mean GPA if we want to be within 0.10 of the true mean. That is, we want the margin of error, *e*, to not exceed 0.10. [4 POINTS]
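Optional self-check (this does not replace the required PHStat output): the sketch below assumes the usual known-sigma formula, n = (z·σ / e)², rounded up to the next whole number. The function name and the input values are hypothetical and are deliberately not the values in this question.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma: float, margin: float, confidence: float = 0.95) -> int:
    """Smallest n with z * sigma / sqrt(n) <= margin (sigma assumed known)."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Solve n >= (z * sigma / margin)^2 and always round up.
    return math.ceil((z * sigma / margin) ** 2)


# Hypothetical inputs, not the assignment's values.
print(sample_size_for_mean(sigma=0.50, margin=0.20))  # 25
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.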

PASTE PHSTAT OUTPUT BELOW:

4. Using your **sample** data, test the hypothesis below at the alpha = 0.01 significance level. You may assume that the population standard deviation is not known and that the population is approximately normally distributed. [14 POINTS]

(a) Is there sufficient evidence to conclude that the mean GPA for all students is **more than** 3.2?

PASTE PHSTAT OUTPUT BELOW:

(b) Test the hypothesis again, changing alpha to 0.05 but not changing anything else.

PASTE PHSTAT OUTPUT BELOW:

Now mark all of the following statements about the two hypothesis tests either T(TRUE) or F(FALSE).

_________ The *p*-value is the probability that the null hypothesis will be rejected.

_________ The second test has a smaller “reject” region than the first.

_________ The test statistic measures the distance between the mean being tested and the sample mean.

_________ The null hypothesis will be rejected provided alpha exceeds the *p*-value.

_________ The critical value is the boundary between the “reject” region and the “do not reject” region.

_________ The *p*-value is the probability of getting a test statistic equal to or more extreme than the sample result, if the null hypothesis is true.

5. Suppose it is known that 8 out of the 20 students in the sample are women. [8 POINTS]

(a) Find a 95% confidence interval for the true **proportion** of all MBA students who are women.

PASTE PHSTAT OUTPUT BELOW:

(b) What is your opinion of the **precision** of this confidence interval? Give a reason for your answer.
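Optional self-check (this does not replace the required PHStat output): the sketch below assumes the standard textbook normal-approximation interval for a proportion, p̂ ± z·sqrt(p̂(1 − p̂)/n). Whether PHStat agrees to every decimal place is an assumption. The function name and the counts are hypothetical and are deliberately not the values in this question.

```python
import math
from statistics import NormalDist


def proportion_interval(successes: int, n: int, confidence: float = 0.95):
    """Normal-approximation interval for a proportion: (lower, upper, margin)."""
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Margin of error: critical value times the standard error of p_hat.
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin


# Hypothetical counts, not the assignment's values.
lower, upper, margin = proportion_interval(successes=30, n=80)
print(f"{lower:.4f} to {upper:.4f} (margin of error {margin:.4f})")
```

The margin of error is half the width of the interval, so it is a natural number to look at when judging precision.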

6. Assume that the population proportion is 0.45, and find the sample size that would be required to determine a 95% confidence interval if we want to be within 0.05 of the true proportion of women MBA students. That is, we want the margin of error, *e*, to not exceed 0.05. [4 POINTS]
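Optional self-check (this does not replace the required PHStat output): the sketch below assumes the usual formula for a proportion, n = z²·p(1 − p) / e², rounded up to the next whole number. The function name and the input values are hypothetical and are deliberately not the values in this question.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p: float, margin: float, confidence: float = 0.95) -> int:
    """Smallest n with z * sqrt(p * (1 - p) / n) <= margin."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Solve n >= z^2 * p * (1 - p) / margin^2 and always round up.
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Hypothetical inputs, not the assignment's values.
print(sample_size_for_proportion(p=0.30, margin=0.04))  # 505
```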

PASTE PHSTAT OUTPUT BELOW:

**LINEAR REGRESSION – Use the sample to complete this section. Remember, the X variable is GMAT SCORE, and the Y variable is GPA**

7. PASTE A SCATTER PLOT BELOW: [4 POINTS]

8. Perform the regression analysis using PHSTAT and PASTE THE PRINTOUT BELOW: [4 POINTS]

**NOTE: BEFORE YOU COPY THE PRINTOUT, CHANGE THE FORMAT OF THE P-VALUE FOR GMAT TO SCIENTIFIC NOTATION. HIGHLIGHT THE CELL, THEN ON THE EXCEL HOME/NUMBER MENU, SELECT “SCIENTIFIC” FROM THE DROP-DOWN BOX.**

9. Using the regression output from Question 8, fill in the following. [10 POINTS]

i. The regression equation is: ____________________________________

ii. The slope of the equation is: ___________________________________

iii. The *y*-intercept of the equation is: ________________________________

iv. The standard error of the estimate is: ____________________________

v. The coefficient of determination is: _____________________________

10. Using the Excel printout from Question 8, test the hypothesis that there is **no** linear relationship between *X* and *Y*. Test at the alpha = 0.05 significance level. [8 POINTS]

i. State the null hypothesis: _______________________

ii. State the alternative hypothesis: ___________________

iii. *p*-value: __________________________________

iv. Test result and reason for test result: ________________________________

11. Interpretation. [6 POINTS]

(a) What does the *y*-intercept of this regression equation represent?

(b) State the exact meaning of the slope in this regression equation.

(c) Predict the GPA of a student with a GMAT score of 600. _______________

12. [12 POINTS]

(a) PASTE A RESIDUAL PLOT BELOW:

(b) From the residual plot, do you think that the two regression assumptions listed below are satisfied? Give the reason for your conclusion.

Linearity: ___________________________________

Reason: ____________________________________

Equal Variance: ______________________________

Reason: ____________________________________

13. [8 POINTS]

(a) PASTE A NORMAL PROBABILITY PLOT OF RESIDUALS BELOW:

(b) From the normal probability plot, do you think the normality assumption for regression is satisfied? Give the reason for your conclusion.

14. Determine 95% confidence and prediction intervals for *X* = 600. [4 POINTS]

PASTE PHSTAT OUTPUT BELOW:

15. Discuss this model. How good do you think the model is for predicting GPA? Give reasons for your answer. Then state at least two other possible independent variables that you think would be useful for predicting GPA. [4 POINTS]

This final project is due in Session 15, April 22, 2021. Submit this Microsoft Word file in the given format, either online or in class. The project is worth 26% of your overall course grade and will not be returned.