8210 Week 10 Dis

GSS (2014) Outputs

Figure 1

Model Summary (b)

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .643 (a)   .413       .412                17.2493                      1.907

a. Predictors: (Constant), Male, HIGHEST YEAR OF SCHOOL COMPLETED, RESPONDENT INCOME IN CONSTANT DOLLARS

b. Dependent Variable: R’s socioeconomic index (2010)
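As a quick arithmetic check, R, R Square, and Adjusted R Square in Figure 1 can be recomputed from the sums of squares reported in the ANOVA table (Figure 2). The Python sketch below is only a sanity check and is not part of the SPSS output; the variable names are illustrative, and n = 1523 cases and k = 3 predictors are read off the tables.

```python
import math

# Values copied from the ANOVA table (Figure 2)
ss_regression = 318344.890
ss_total = 770306.410
n = 1523  # number of cases (Total df of 1522, plus 1)
k = 3     # number of predictors

# R Square is the share of total variation explained by the model
r_square = ss_regression / ss_total

# Adjusted R Square penalizes R Square for the number of predictors
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

# R is the square root of R Square
r = math.sqrt(r_square)

print(round(r, 3), round(r_square, 3), round(adj_r_square, 3))
```

The rounded results should agree with the .643, .413, and .412 shown in Figure 1.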

Figure 2

ANOVA (a)

Model 1      Sum of Squares   df     Mean Square   F         Sig.
Regression   318344.890       3      106114.963    356.642   <.001 (b)
Residual     451961.519       1519   297.539
Total        770306.410       1522

a. Dependent Variable: R’s socioeconomic index (2010)

b. Predictors: (Constant), Male, HIGHEST YEAR OF SCHOOL COMPLETED, RESPONDENT INCOME IN CONSTANT DOLLARS
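The F statistic in Figure 2 is the ratio of the two mean squares, and the Std. Error of the Estimate in Figure 1 is the square root of the residual mean square. A minimal Python sketch of that arithmetic, using only the numbers printed in the table (variable names are illustrative):

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table (Figure 2)
ss_regression, df_regression = 318344.890, 3
ss_residual, df_residual = 451961.519, 1519

# Mean square = sum of squares / degrees of freedom
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F compares explained variance per predictor to unexplained variance per residual df
f_stat = ms_regression / ms_residual

# The standard error of the estimate is the square root of the residual mean square
std_error_estimate = math.sqrt(ms_residual)

print(round(ms_regression, 3), round(ms_residual, 3))
print(round(f_stat, 2), round(std_error_estimate, 3))
```

Small differences in the last decimal place relative to the SPSS output can arise because the inputs here are already rounded.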

Figure 3

Coefficients (a)

Model 1                                 B        Std. Error   Beta    t        Sig.    Tolerance   VIF
(Constant)                              -7.865   2.261                -3.479   <.001
HIGHEST YEAR OF SCHOOL COMPLETED        3.414    .160         .453    21.300   <.001   .854        1.171
RESPONDENT INCOME IN CONSTANT DOLLARS   .000     .000         .325    14.848   <.001   .806        1.241
Male                                    -.762    .923         -.017   -.826    .409    .918        1.089

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient; Tolerance and VIF are the collinearity statistics.

a. Dependent Variable: R’s socioeconomic index (2010)
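Two columns in Figure 3 follow directly from others: t is B divided by its standard error, and VIF is the reciprocal of Tolerance. The Python sketch below recomputes both from the printed values. The dictionary layout and names are illustrative, and the income predictor is left out of the t check because its B and Std. Error are displayed as .000 after rounding.

```python
# Printed values from the Coefficients table (Figure 3)
b_and_se = {
    "(Constant)": (-7.865, 2.261),
    "HIGHEST YEAR OF SCHOOL COMPLETED": (3.414, 0.160),
    "Male": (-0.762, 0.923),
}
tolerance = {
    "HIGHEST YEAR OF SCHOOL COMPLETED": 0.854,
    "RESPONDENT INCOME IN CONSTANT DOLLARS": 0.806,
    "Male": 0.918,
}

# t = B / Std. Error (approximate here, because the inputs are rounded)
t_values = {name: b / se for name, (b, se) in b_and_se.items()}

# VIF = 1 / Tolerance
vif = {name: 1 / tol for name, tol in tolerance.items()}

for name, t in t_values.items():
    print(f"t   {name}: {t:.2f}")
for name, v in vif.items():
    print(f"VIF {name}: {v:.3f}")
```

The recomputed t for years of schooling comes out slightly above the reported 21.300 because the table shows B and Std. Error to only three decimals.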

Figure 4

Collinearity Diagnostics (a)

Model 1. The last four columns are the variance proportions for each term.

Dimension   Eigenvalue   Condition Index   (Constant)   HIGHEST YEAR OF SCHOOL COMPLETED   RESPONDENT INCOME IN CONSTANT DOLLARS   Male
1           3.240        1.000             .00          .00                                .03                                     .03
2           .402         2.840             .01          .01                                .04                                     .91
3           .339         3.090             .02          .01                                .83                                     .00
4           .019         13.052            .97          .98                                .10                                     .06

a. Dependent Variable: R’s socioeconomic index (2010)
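The Condition Index column in Figure 4 is the square root of the largest eigenvalue divided by each dimension's eigenvalue. A short Python sketch of that calculation from the printed eigenvalues (names are illustrative):

```python
import math

# Eigenvalues for dimensions 1 to 4, from Figure 4
eigenvalues = [3.240, 0.402, 0.339, 0.019]

largest = max(eigenvalues)

# Condition index for each dimension: sqrt(largest eigenvalue / eigenvalue)
condition_indices = [math.sqrt(largest / ev) for ev in eigenvalues]

for dimension, ci in enumerate(condition_indices, start=1):
    print(dimension, round(ci, 2))
```

The values match the table to about two decimals; the fourth dimension differs slightly in the third decimal because the eigenvalue .019 is itself rounded.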

Figure 5

Residuals Statistics (a)

                                    Minimum    Maximum   Mean     Std. Deviation   N
Predicted Value                     -8.546     95.164    47.584   14.4624          1523
Std. Predicted Value                -3.881     3.290     .000     1.000            1523
Standard Error of Predicted Value   .622       2.361     .843     .267             1523
Adjusted Predicted Value            -9.394     95.486    47.585   14.4844          1523
Residual                            -58.9884   71.9041   .0000    17.2323          1523
Std. Residual                       -3.420     4.169     .000     .999             1523
Stud. Residual                      -3.437     4.208     .000     1.001            1523
Deleted Residual                    -59.5739   73.2768   -.0010   17.2864          1523
Stud. Deleted Residual              -3.449     4.231     .000     1.001            1523
Mahal. Distance                     .982       27.513    2.998    3.333            1523
Cook’s Distance                     .000       .085      .001     .003             1523
Centered Leverage Value             .001       .018      .002     .002             1523

a. Dependent Variable: R’s socioeconomic index (2010)

Figure 6

[Image: histogram]

Figure 7

[Image: scatter chart]

Respond to at least one of your colleagues’ posts in 100 words and provide a constructive comment on their assessment of diagnostics.

1. Were all assumptions tested for?

2. Are there some violations that the model might be robust against? Why or why not?

3. Explain and provide any additional resources (i.e., web links, articles, etc.) to provide your colleague with addressing diagnostic issues.


Terell Johnson

Using the General Social Survey (2014) dataset, two independent variables (highest year of school completed and respondent income in constant dollars) and a dependent variable (socioeconomic index) were chosen for a multiple regression analysis. In addition, a dummy-coded categorical variable (male) was included for recoding practice and interpretation.

RQ: To what extent do highest year of school completed, respondent income in constant dollars, and male (sex) predict socioeconomic index?

H0: Highest year of school completed, respondent income in constant dollars, and male (sex) do not predict socioeconomic index.

Ha: Highest year of school completed, respondent income in constant dollars, and male (sex) do predict socioeconomic index.

As displayed in the model summary (Figure 1), the multiple correlation is moderately strong (R = .643), and the model explains about 41% of the variance in socioeconomic index (R Square = .413). According to Walden University (2016m), the Durbin-Watson statistic ranges from 0 to 4.0, and values below 1.0 or above 3.0 are considered dangerous because they indicate serious serial correlation in the model. For this analysis, the Durbin-Watson statistic is 1.907, which is close to 2.0 and suggests that the residuals are not serially correlated. The ANOVA (Figure 2) test of overall model significance gives F(3, 1519) = 356.642, p < .001, which is below the .05 threshold, so the model is statistically significant and R Square can be interpreted. The coefficients summary (Figure 3) shows variance inflation factor (VIF) values from 1.089 to 1.241 for the predictor variables, well below the general rule of 10, so the assumption that multicollinearity is not an issue is met. Figure 3 also shows that highest year of school completed and respondent income are significant predictors (p < .001), whereas the male dummy variable is not (B = -.762, p = .409). Cook’s Distance (Figure 5) ranges from a minimum of .000 to a maximum of .085, which is below the general rule of 1.0 and indicates that no case has undue influence. The histogram (Figure 6) displays an approximately normal distribution of standardized residuals, and the scatterplot (Figure 7) shows no pattern (such as a funnel or cone shape), which is consistent with homoscedasticity and a linear relationship. Based on this analysis, the null hypothesis is rejected.
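The rules of thumb cited in the paragraph above (Durbin-Watson between 1.0 and 3.0, VIF below 10, Cook’s Distance below 1.0) can be written as a small screening function. This is a minimal Python sketch under those stated cutoffs only; the function name and dictionary keys are hypothetical, and the cutoffs are conventions rather than formal tests.

```python
# Diagnostic values read from Figures 1, 3, and 5
diagnostics = {
    "durbin_watson": 1.907,
    "max_vif": 1.241,
    "max_cooks_distance": 0.085,
}

def screen_diagnostics(d):
    """Apply the rule-of-thumb cutoffs cited in the post (hypothetical helper)."""
    return {
        # Values below 1.0 or above 3.0 would suggest serious serial correlation
        "serial_correlation_ok": 1.0 <= d["durbin_watson"] <= 3.0,
        # VIF of 10 or more would suggest problematic multicollinearity
        "multicollinearity_ok": d["max_vif"] < 10,
        # Cook's Distance of 1.0 or more would suggest an unduly influential case
        "influence_ok": d["max_cooks_distance"] < 1.0,
    }

results = screen_diagnostics(diagnostics)
for check, passed in results.items():
    print(check, passed)
```

Normality and homoscedasticity are judged from the plots in Figures 6 and 7, so they are not covered by this numeric screen.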

Reference

Walden University, LLC. (Producer). (2016m). Regression diagnostics and model evaluation [Video file]. Baltimore, MD: Author.


Check attachment for Module pictures

8210 Week 10 Discussion:

Estimating Models Using Dummy Variables

You have had plenty of opportunity to interpret coefficients for metric variables in regression models. Using and interpreting categorical variables takes just a little bit of extra practice. In this Discussion, you will have the opportunity to practice how to recode categorical variables so they can be used in a regression model and how to properly interpret the coefficients. Additionally, you will gain some practice in running diagnostics and identifying any potential problems with the model.

To prepare for this Discussion:

1. Review Warner’s Chapter 12 and Chapter 2 of the Wagner course text and the media program found in this week’s Learning Resources and consider the use of dummy variables.

2. Create a research question using the General Social Survey dataset that can be answered by multiple regression. Using the SPSS software, choose a categorical variable to dummy code as one of your predictor variables.
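The course work is done in SPSS, but the logic of dummy coding is the same in any tool: one category becomes 1 and the reference category becomes 0. The Python sketch below illustrates this for a sex variable. It assumes the source codes are 1 = male and 2 = female, which should be confirmed against the dataset's codebook; the function name and sample values are hypothetical.

```python
def dummy_code_male(sex_code):
    """Return 1 for male, 0 for female, and None for missing or unknown codes.

    Assumes source coding 1 = male, 2 = female (verify in the codebook).
    """
    if sex_code == 1:
        return 1
    if sex_code == 2:
        return 0
    return None

# Hypothetical sample of raw codes, including one missing value
sample_sex = [1, 2, 2, 1, None]
male = [dummy_code_male(code) for code in sample_sex]

print(male)  # [1, 0, 0, 1, None]
```

With this coding, the coefficient on the dummy variable is the estimated difference in the outcome between the group coded 1 and the reference group coded 0, holding the other predictors constant.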


Assignment Task Part 1

Estimate a multiple regression model that answers your research question. Post your response to the following:

1. What is your research question?

2. Interpret the coefficients for the model, specifically commenting on the dummy variable.

3. Run diagnostics for the regression model. Does the model meet all of the assumptions? Be sure to comment on which assumptions were not met and the possible implications. Is there any possible remedy for one of the assumption violations?

Be sure to support your Main Post and Response Post with reference to the week’s Learning Resources and other scholarly evidence in APA Style.


Assignment Task Part 2

Respond to at least one of your colleagues’ posts in 125 words and provide a constructive comment on their assessment of diagnostics.

1. Were all assumptions tested for?

2. Are there some violations that the model might be robust against? Why or why not?

3. Explain and provide any additional resources (i.e., web links, articles, etc.) to provide your colleague with addressing diagnostic issues.

Learning Resources

Required Readings

Wagner, III, W. E. (2020). Using IBM® SPSS® statistics for research methods and social science statistics (7th ed.). Thousand Oaks, CA: Sage Publications.

9. Chapter 2, “Transforming Variables” 

10. Chapter 11, “Editing Output” (previously read in Weeks 2, 3, 4, 5, 6, 7, 8, and 9)

https://methods.sagepub.com/book/regression-diagnostics/n5.xml

Check attachments for other documents.