MAKE SURE TO SAVE YOUR SPSS/PSPP OUTPUT AS A PDF AND TURN IT IN WITH YOUR RESPONSES, and be sure to properly label your responses.
For those who have trouble reading a cross tab, SEE the attachment at the end of this assignment (outputN.pdf). You have to click on the assignment to see the attachment.
1) You will report the information listed below for both the GSS 2012 & GSS 2018 data sets: Examine the relationship between attitudes toward the level of national assistance for childcare (NATCHILD) and the sex of the respondent (SEX). Fill in the following information:
Make a prediction
What percent of Americans believe we spend too little for childcare? ________________
Do you think men and women vary on their perspectives on this issue?
YES NO
Now perform the analysis and report the requested information. (Use a cross tab, and remember that the chi-square test determines whether there is a significant difference in the percentages between groups/categories that could not be credited to chance.)
2012/2018
Percentage of men (out of only men) stating the current level is too little ___________/_____________
Percentage of women (out of only women) stating the current level is too little _________/_______________
Chi Square significance level __________/______________
Make sure you are reporting the level of significance (NOT the chi-square score) as stated in the PSPP output.
Is the relationship statistically significant YES NO (2012)/YES NO (2018)
A) How would you interpret this result beyond just reference to the level of significance? (What do the findings suggest? Interpret.) Do not just restate the statistics without any interpretation, and include mention of the 2008 findings in your discussion. Be sure to expand on the findings reported, and do not include anything about missing data.
2) You will report the information listed below for the GSS 2018 data set: Examine the relationship between race (RACE) and whether a person supports the death penalty for murder (CAPPUN), whether one ever approves of police striking a citizen (POLHITOK), and the belief that Whites are hurt by affirmative action (DISCAFF). Fill in the following information:
Make a prediction
What percent of Americans support the death penalty? __________
Do you think Whites and Blacks vary on their perspectives on this issue (CAPPUN)? YES NO
Do you think Whites and Blacks vary on their perspectives on this issue (POLHITOK)? YES NO
Now perform the analysis and find:
A) CAPPUN BY RACE
Percentage of whites (out of only whites) that disapprove:________________________
Percentage of African Americans (out of only African Americans) that disapprove ________________________
Percent of Others (out of only Others) that disapprove ________________________
Chi Square significance level ________________________
Is the relationship statistically significant?
B) POLHITOK BY RACE
Percentage of whites (out of only whites) that say no:________________________
Percentage of African Americans (out of only African Americans) that say no ________________________
Percent of Others (out of only Others) that say no ________________________
Chi Square significance level ________________________
Is the relationship statistically significant?
C) DISCAFF BY RACE
Make a prediction
What percent of Americans believe whites are hurt by affirmative action? ___________
Do you think Whites and Blacks vary on their perspectives on this issue? YES NO
Now perform the analysis and find:
Percentage of Whites (out of only whites) saying Somewhat likely ______________________
Percentage of Blacks (out of only African Americans) saying Somewhat likely ______________________
Percentage of others (out of only Others) saying Somewhat likely ______________________
Chi Square significance level _______________________
Report the level of significance (NOT the chi-square score) as stated in the PSPP output.
Is the relationship statistically significant YES NO
A) How would you interpret this result beyond just reference to the level of significance? Do not just restate the statistics without any interpretation. What patterns do you observe in the findings (beyond what is asked for above)? Be specific.
3) You will report the information listed below for the GSS 2018 data sets: Examine the relationship between general happiness (HAPPY) and marital status (MAR1). Fill in the following information:
Make a prediction
What percent of Americans are very happy? ________________
Who is the happiest? People who are (underline one):
Married Divorced Never married
Now perform the analysis and find:
Percentage of married individuals (out of only married people) who are very happy ________________________
Percentage of divorced individuals (out of only divorced people) who are very happy ________________________
Percentage of never married individuals (out of only never married people) who are very happy ________________________
Chi Square significance level ________________________
Report the level of significance (NOT the chi-square score) as stated in the SPSS/PSPP output.
Is the relationship statistically significant YES NO
A) How would you interpret this result beyond just reference to the level of significance? Do not just restate the statistics without any interpretation. What patterns do you observe in the findings (beyond what is asked for above)? Be specific.
4) Before performing any analysis, make a prediction: What percent of Americans view their health as excellent, good, fair, or poor? Then, using the GSS 2018 data, run a frequency on HEALTH and also create a pie chart. Describe your findings below and explain the extent to which your predictions conformed to the findings. What do these findings suggest about health in America? Exclude missing cases.
Predicted Percent
Excellent ___________
Good ___________
Fair ___________
Poor ___________
How close were your predictions? If they differed, what do you think was the reason?
Valid Percent
Excellent ___________
Good ___________
Fair ___________
Poor ___________
Provide a PIE CHART and a BAR GRAPH:
What do these findings suggest about the state of health in America? What are the policy implications, given perceptions versus the reality of health in America?
5) Using the GSS 2018 data, examine the relationship between sex (SEX) and the belief that a woman will not get a job or promotion over a man (DISCAFFW).
Fill in the following information:
Make a prediction
What percent of Americans believe women are less likely to get a job or promoted over a man:
___________
Do you think males and females vary on their perspectives on this issue? YES NO
Now perform the analysis and find:
Percentage of men (out of only men) saying Very Likely _____________
Percentage of women (out of only women) saying Very Likely ___________
Chi Square significance level __________________
Is the relationship statistically significant YES NO
A) How would you interpret this result beyond just reference to the level of significance? Do not just restate the statistics without any interpretation. What patterns do you observe in the findings (beyond what is asked for above)? Compare with the 2012 findings from the discussion. Be specific.
IMPORTANT: After you have submitted your assignment, the answers will be made available below. If any of your responses are incorrect, resubmit your corrected responses, along with explanations of what you did incorrectly, no later than two days after the due date.
CHAPTER 3: CHI SQUARE DISTRIBUTIONS
The next statistical procedure is the chi-square distribution, which is used to determine the level
of statistical significance for relationships between variables at the nominal and ordinal levels of
measurement. Specifically, the Pearson Chi-Square tests whether a particular pattern of
group frequencies is likely due to chance alone. Since the Pearson Chi-Square evaluates two
variables, a significant Chi-Square value tells two things: 1) the pattern of frequencies is
significantly different from a random pattern AND 2) the values are significantly associated
with each other. We will also look at measures of association appropriate for nominal and
ordinal data: lambda and gamma.
ANALYZE
DESCRIPTIVE STATISTICS>CROSSTABS (click)
Graphic 3.1
A dialogue box will open with the list of variables and two areas to add the variables (row and
column).
Graphic 3.2
We are going to look at the relationship between ABANY and SEX. Click on the first variable so
that it is highlighted, then type ABANY. This should bring you down to "Abortion if woman wants
for any reason." Highlight it and move it to Row by clicking on the arrow.
Graphic 3.3
Then go back to the top and repeat, but type SEX. It will go to "Respondents Sex." Highlight it and
move it to Column.
Graphic 3.4
Click on Statistics, and a new dialogue box will open. In the lower-right hand corner next to the
help button is something that looks like a dog-eared page. You can place your cursor on it, right-
click and drag, and the dialogue box will expand, revealing more choices. We are doing a
Chi-Square, so make sure that the Chisq box is checked. We are going to include a measure
of association as well.
But wait! How do I know which measure of association to choose? The following discussion
was adapted from an exercise prepared by Ed Nelson at the Social Science Research and
Instructional Lab: There are many measures of association to choose from. We're going to limit
our discussion to those measures that PSPP will compute. When choosing a measure of
association, we'll start by considering the level of measurement of the two variables (see chapter
2 for a review of levels of measurement).
If one or both of the variables are nominal, then choose one of these measures.
  o Contingency Coefficient [automatically calculated by PSPP]
  o Phi and Cramer's V
  o Lambda
If both of the variables are ordinal, then choose from this list.
  o Gamma
  o Somers' d
  o Kendall's tau-b
  o Kendall's tau-c
Dichotomies should be treated as ordinal. Most variables can be recoded into dichotomies
(also known as dummy variables, where each case is coded either 1 or 0). For example,
marital status can be recoded into married (1) or not married (0). Race can be recoded as
white (1) or nonwhite (0). All dichotomies should be considered ordinal.
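The selection rule above can be written out as a small lookup. This is only a sketch in Python, not PSPP syntax; the function name `suggest_measures` and its arguments are hypothetical and simply restate the two lists and the dichotomy rule from the text.

```python
NOMINAL = "nominal"
ORDINAL = "ordinal"

def suggest_measures(level_a, level_b, dichotomy_a=False, dichotomy_b=False):
    """Return the candidate measures of association for two variables.

    level_a, level_b: "nominal" or "ordinal".
    dichotomy_a, dichotomy_b: True if the variable has only two categories.
    """
    for level in (level_a, level_b):
        if level not in (NOMINAL, ORDINAL):
            raise ValueError(f"unsupported level of measurement: {level}")
    # Rule from the text: every dichotomy is treated as ordinal.
    a = ORDINAL if dichotomy_a else level_a
    b = ORDINAL if dichotomy_b else level_b
    if a == ORDINAL and b == ORDINAL:
        return ["Gamma", "Somers' d", "Kendall's tau-b", "Kendall's tau-c"]
    # Otherwise at least one variable is nominal.
    return ["Contingency Coefficient", "Phi and Cramer's V", "Lambda"]

# SEX and ABANY are both two-category variables.
print(suggest_measures(NOMINAL, NOMINAL, dichotomy_a=True, dichotomy_b=True))
```

For example, a three-category nominal variable such as RACE crossed with an ordinal variable would return the nominal list, because one of the two variables is nominal and is not a dichotomy.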
For this exercise, both variables are ordinal: SEX and ABANY (YES, NO) are both dichotomies
and as such are treated as ordinal.
Therefore, you can choose any of the four ordinal measures listed. We are choosing Gamma, so
make sure to click on Gamma. Since the GSS 2008 treats these as nominal variables, also choose
Lambda. Then click Continue, then click OK.
Graphic 3.5
The following output (Graphic 3.6) will appear
Graphic 3.6
Looking at the various parts will help better understand what is portrayed in the output. First is a
listing of the syntax as shown in Graphic 3.7:
Graphic 3.7
Next is a summary of the valid and missing cases or observations. You normally would not
report the missing cases, so you go with the Valid Cases rather than the Total when reporting the
N (Number of Observations). For this question, the N equals 1298.
Graphic 3.8
Graphic 3.9 displays a wealth of information, but it means little if you do not understand it. To
know what each value stands for, first look at the top of the table for a guide. The following appears
at the top: ABORTION IF WOMAN WANTS FOR ANY REASON * RESPONDENTS SEX
[count, row %, column %, total %]. This last part is important because it indicates what each
value within a given cell stands for.
Graphic 3.9
As an example, look at the row YES.
Starting with the column for MALE, the first value, which is the count, is 262. This is the
number of respondents who said YES and are MALE.
The next value in the MALE column is the row percent. The value 47.64% is the percent of
those who responded YES who are male, out of EVERYONE who responded YES.
This value could be used to make comparisons across responses, for example, to compare
the makeup of those who responded NO with those who responded YES.
The third value in the MALE column is the column percent. The value 44.56% is the
percent of males who responded YES, out of ONLY MALE respondents. This is the value
one would use to make comparisons across groups, in this case MALE
and FEMALE. IMPORTANT: Remember which variable you entered as the row and which as
the column so that you know which is the categorical response (in this case: NO, YES) and
which is the GROUP (in this case: MALE, FEMALE).
The last value, 20.18%, is the percent of respondents who are MALE and responded YES, out
of ALL (TOTAL) respondents.
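The three percentages in a cell differ only in their denominator. As a minimal sketch in Python (not PSPP syntax), the YES/MALE cell can be recomputed from its count. Only the count of 262 and the N of 1298 are stated in the text; the YES row total (550) and the MALE column total (588) are assumptions back-calculated from the reported percentages, and `pct` is a hypothetical helper.

```python
# Count stated in the text for respondents who are MALE and said YES.
yes_male = 262
# Valid N stated in the text.
n_valid = 1298
# ASSUMED margins, back-calculated from the reported percentages:
yes_total = 550    # 262 / 0.4764, everyone who said YES
male_total = 588   # 262 / 0.4456, all MALE respondents

def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to 2 decimals."""
    return round(100 * part / whole, 2)

row_pct = pct(yes_male, yes_total)     # out of everyone who said YES
col_pct = pct(yes_male, male_total)    # out of only MALE respondents
total_pct = pct(yes_male, n_valid)     # out of all valid respondents

print(row_pct, col_pct, total_pct)  # 47.64 44.56 20.18
```

The column percent is the one to report when a question asks for a percentage "out of only men" or "out of only women," provided the grouping variable was entered as the column.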
Regardless of whether or not there is a significant difference as indicated by the Pearson
Chi-Square (Graphic 3.10), Graphic 3.9 provides valuable descriptive information.
Moving on to Graphic 3.10, the focus will be only on the top statistic: Chi-Square. The
measures of association (Lambda and Gamma) were also included.
Graphic 3.10
The results of various tests are provided in the first of the three tables. If you did not click on
Gamma and Lambda, this would be the only table you would see. The Pearson Chi-Square
tests whether a particular pattern of group frequencies is likely due to chance alone. Recall from
the beginning of the chapter that a significant Pearson Chi-Square value tells us two things: 1)
the pattern of frequencies is significantly different from a random pattern AND 2) the values
are significantly associated with each other. This association can also be tested with gamma,
lambda, and the other measures listed above.
Please note that most academic journals consider a significance level of .05 or lower to be
significant (corresponding to a 95% confidence level). Some journals also allow researchers to
indicate when test statistics reach a significance level of .1 or lower. This can vary by journal
and discipline.
Back to our example: The Pearson Chi-Square value or score is 2.10. When statisticians
calculated these scores by hand, it was necessary to look up critical values to see if this value was
significant. Statistical software such as PSPP, SPSS, SAS, STATA, and many others calculates
the significance level for you, so this is unnecessary. The significance level is listed under
Asymp. (Asymptotic) Sig. (2-tailed) and is .147. Because .147 is greater than .05, the result is
not significant. Similarly, neither Gamma nor Lambda is significant (based on the Chi-Square),
suggesting that the pattern of frequencies is not significantly different from what could be
produced by chance and, as indicated by all three test statistics, that these two variables are not
significantly associated with each other.
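To see where a score of 2.10 and a significance level of .147 can come from, the Pearson Chi-Square can be sketched by hand in Python. This is an illustration, not PSPP's own procedure: the cell counts other than 262 are assumptions reconstructed from the percentages reported for Graphic 3.9, `pearson_chi_square` is a hypothetical helper, and the `math.erfc` shortcut for the significance level holds only for one degree of freedom (a 2x2 table).

```python
import math

# Rows: YES, NO. Columns: MALE, FEMALE.
# Only 262 is stated in the text; 288, 326 and 422 are ASSUMED values
# reconstructed from the reported row %, column % and N = 1298.
observed = [
    [262, 288],
    [326, 422],
]

def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected over every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    return chi2

chi2 = pearson_chi_square(observed)
# For 1 degree of freedom, the upper-tail probability of the
# chi-square distribution equals erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
significant = p_value <= 0.05

print(f"{chi2:.2f} {p_value:.3f} {significant}")  # 2.10 0.147 False
```

The value to report on the assignment is the second number (the significance level), not the first (the chi-square score).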
Central tendency and spread 2
Data visualization to determine measures of central tendency and spread
7/10
Generate measures of central tendency and spread
Statistics: Per Capita Income, 2015
N Valid               51
N Missing             0
Mean                  46929.3529
Std. Error of Mean    1096.95234
Median                45002.0000
Mode                  35444.00 (a)
Std. Deviation        7833.80661
Variance              61368526.073
Range                 36052.00
Minimum               35444.00
Maximum               71496.00
Sum                   2393397.00
Percentile 25         40998.0000
Percentile 50         45002.0000
Percentile 75         51146.0000
a. Multiple modes exist. The smallest value is shown.
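The entries in a Statistics table are tied to one another, which gives a quick way to check output that has been copied by hand. The following is a minimal sketch in Python that uses only the values reported above for Per Capita Income; the variable names are illustrative, not SPSS/PSPP names.

```python
import math

# Values copied from the Per Capita Income, 2015 table.
n = 51
mean = 46929.3529
std_error = 1096.95234
std_dev = 7833.80661
variance = 61368526.073
minimum = 35444.00
maximum = 71496.00
total = 2393397.00
value_range = 36052.00

# Mean is the sum divided by the number of valid cases.
mean_check = total / n
# Standard error of the mean is the standard deviation over sqrt(N).
se_check = std_dev / math.sqrt(n)
# Variance is the standard deviation squared.
variance_check = std_dev ** 2
# Range is the maximum minus the minimum.
range_check = maximum - minimum

print(round(mean_check, 4), round(se_check, 5),
      round(variance_check, 1), range_check)
```

If any recomputed value disagrees with the table by more than rounding error, the table was most likely transcribed incorrectly.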
B. CRS63 (Prisoners under Sentence of Death: 2008) Which measure of central tendency is the preferred measure and why?
Statistics: Prisoners Under Sentence of Death: 2008
N Valid               37
N Missing             14
Mean                  85.297
Std. Error of Mean    22.4623
Median                35.000
Mode                  2.0
Std. Deviation        136.6327
Variance              18668.492
Range                 669.0
Minimum               .0
Maximum               669.0
Sum                   3156.0
Percentile 25         8.000
Percentile 50         35.000
Percentile 75         96.000
C. HrtDRT17 (Heart Disease Mortality Rate: 2017) Which measure of central tendency is the preferred measure and why?
Statistics: Heart Disease Mortality Rate, 2017
N Valid               50
N Missing             1
Mean                  165.5100
Std. Error of Mean    4.02777
Median                157.7500
Mode                  119.10 (a)
Std. Deviation        28.48063
Variance              811.146
Range                 118.10
Minimum               119.10
Maximum               237.20
Sum                   8275.50
Percentile 25         145.2250
Percentile 50         157.7500
Percentile 75         183.9500
a. Multiple modes exist. The smallest value is shown.
Using the GSS2018 dataset, what is the proper graph (scatterplot, histogram, bar chart)? Make sure to provide it, and report what you think is the correct measure of central tendency. Why?
A. REALRINC (R’s Income in constant $)
Statistics: R’s income in constant $
N Valid                   1363
N Missing                 985
Mean                      24994.19
Std. Error of Mean        782.298
Median                    17025.00
Mode                      20430
Std. Deviation            28881.511
Variance                  834141678.466
Skewness                  2.968
Std. Error of Skewness    .066
Kurtosis                  10.128
Std. Error of Kurtosis    .132
Range                     150824
Minimum                   227
Maximum                   151051
Sum                       34067080
The correct measure of central tendency is the mean because the data is evenly distributed. Visualization can be done using a bar graph.
Because the data is so spread out, use the median.
B. PARTNERS5 (How many sex partners R had in last 5 years)
Statistics: How many sex partners R had in last 5 years
N Valid                   1388
N Missing                 960
Mean                      1.73
Std. Error of Mean        .050
Median                    1.00
Mode                      1
Std. Deviation            1.856
Variance                  3.443
Skewness                  2.053
Std. Error of Skewness    .066
Kurtosis                  4.385
Std. Error of Kurtosis    .131
Range                     9
Minimum                   0
Maximum                   9
Sum                       2406
The correct measure of central tendency is the mode because the data is not evenly distributed. Visualization can be done using a histogram.
No, given this is ratio-level data, use either the mean or the median. The skew and spread suggest using the median.
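The corrections for REALRINC and PARTNERS5 both turn on skew. As a rough sketch in Python, the choice between mean and median for interval/ratio data can be tied to the reported skewness. The cutoff of 1.0 for "strongly skewed" is an assumed rule of thumb, not something stated in the assignment, and `preferred_center` is a hypothetical helper.

```python
def preferred_center(mean, median, skewness, threshold=1.0):
    """Suggest 'median' for strongly skewed interval/ratio data, else 'mean'.

    threshold is an ASSUMED rule-of-thumb cutoff on |skewness|.
    """
    if abs(skewness) > threshold:
        return "median"
    return "mean"

def skew_direction(mean, median):
    """A mean above the median points to a right (positive) skew."""
    if mean > median:
        return "right"
    if mean < median:
        return "left"
    return "none"

# Values copied from the two Statistics tables above.
realrinc = {"mean": 24994.19, "median": 17025.00, "skewness": 2.968}
partners5 = {"mean": 1.73, "median": 1.00, "skewness": 2.053}

for name, stats in (("REALRINC", realrinc), ("PARTNERS5", partners5)):
    print(name,
          skew_direction(stats["mean"], stats["median"]),
          preferred_center(**stats))
```

In both cases the mean sits well above the median, so a few large values are pulling the mean upward and the median is the safer summary.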
C. RELITEN (Strength of Affiliation)
Statistics: Strength of affiliation
N Valid               2314
N Missing             34
Mean                  2.17
Std. Error of Mean    .024
Median                2.00
Mode                  2
Std. Deviation        1.140
Variance              1.300
Range                 3
Minimum               1
Maximum               4
Sum                   5011
The correct measure of central tendency in this case is the mean. It can be used to show where the distribution of strength of affiliation is centered. Data visualization can be done using a bar graph because it gives a clear representation of the data.
Use the median and a bar graph because this is ordinal-level data. Given there is an even number of categories, the mode might make more sense, as it shows which level of affiliation is most frequently selected.
3. Using the GSS2018 dataset, what is the proper graph (scatterplot, histogram, bar chart; make sure to provide it) for:
A. HAPPY (General Happiness) by SEX (Respondent’s Sex)
The proper graph to use in this case is a histogram because it gives a more precise data visualization.
No, this is ordinal-level data, so use the bar graph.
B. ANCESTRS (Believe in Supernatural Power of Deceased Ancestors) by SEX (Respondent’s Sex)
The proper graph to use in this case is a histogram because it gives a more precise data visualization.
No, use a bar graph because this is ordinal-level data.
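The corrections in this section follow one pattern: the level of measurement decides both the graph and the measure of central tendency. That pattern can be sketched as a lookup in Python. The ordinal and interval/ratio entries restate the corrections given above; the nominal entry is an assumption based on common convention rather than on this assignment, and `suggest_graph_and_center` is a hypothetical helper.

```python
def suggest_graph_and_center(level):
    """Map a level of measurement to (graph type, candidate measures of center)."""
    guide = {
        # ASSUMED from common convention; not covered in this assignment.
        "nominal": ("bar chart", ["mode"]),
        # From the corrections for RELITEN, HAPPY and ANCESTRS.
        "ordinal": ("bar chart", ["median", "mode"]),
        # From the corrections for REALRINC and PARTNERS5.
        "interval/ratio": ("histogram", ["mean", "median"]),
    }
    if level not in guide:
        raise ValueError(f"unknown level of measurement: {level}")
    return guide[level]

graph, centers = suggest_graph_and_center("ordinal")
print(graph, centers)
```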