Chi Square Test
1 Categorical Dependent Variable
1 (2-level) Categorical Independent Variable with cell values > 5

A Chi Square test measures whether there is a significant association between
a categorical independent variable (here, one with two levels) and a
categorical dependent variable.
In ‘Users’ Conceptions of Risks and Harms on the Web: A Comparative Study’ [2],
Friedman et al. study the types of concerns Americans
have about using the internet. The research team tests whether the concerns
vary depending on the type of community the person resides in. The team studies
views of individuals from a small town in Maine, a suburban professional
community in New Jersey, and a high technology community in California.
The researchers perform a Chi Square test to measure whether subjects from the
high-tech community are more likely to have internet information concerns than
individuals from the small town are. The research team defines information
concerns as concerns about content legitimacy, privacy, security, spam, and
others. The team reports statistically significant results as follows:
| | Rural (n = 24) | High-tech (n = 24) |
| Have information concerns | 13 | 22 |
| Don’t have information concerns | 11 | 2 |
The table
above appears to illustrate a valid application of the Chi Square test. The
categorical independent variable, community type, is compared against a
categorical dependent variable, concern about internet information. Thus, the
data fit the variable types the Chi Square test requires. Chi Square is not
the correct statistical test, however, because it is applicable only when each
of the cell values is > 5, and the ‘High-tech’/‘Don’t have information
concerns’ cell has a value of 2. The Fisher Exact test would have been the
appropriate test because it does not have a minimum cell value requirement. A
quick run of the Fisher Exact test shows the study results are still
statistically significant when it is applied.
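The "quick run" of the Fisher Exact test can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the procedure used in the study: the function name `fisher_exact_two_sided` is hypothetical, and the two-sided p value is computed under one common convention (summing the probabilities of all tables, with the same row and column totals, that are no more likely than the observed one).

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher Exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1, n = a + c, a + b + c + d

    def weight(x):
        # Number of ways to obtain a table with x in the top-left cell
        # while keeping every row and column total fixed.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    # Integer weights keep the "no more likely than observed" comparison exact.
    extreme = sum(weight(x) for x in range(lowest, highest + 1)
                  if weight(x) <= observed)
    return extreme / comb(n, col1)

# Rows: have / don't have information concerns; columns: rural / high-tech.
p_value = fisher_exact_two_sided(13, 22, 11, 2)
print(f"Fisher Exact two-sided p = {p_value:.4f}")
```

Under these assumptions the p value falls below 0.05, which agrees with the statement that the result remains statistically significant.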
The research
team could have analyzed the data in a slightly different way by using a
two-sample independent t-test. The team could have compared whether the
percentage of rural residents who had internet information concerns varied
significantly from the percentage of high-tech residents with internet
information concerns. The research team also could have used one-way ANOVA to
find whether there were significantly different internet information concerns
across all 3 types of communities they studied. Using one-way ANOVA, the 3
community types would have served as the independent variable, and internet
information concerns as the dependent variable.
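As a rough illustration of the t-test alternative, each resident can be coded as 1 (has information concerns) or 0 (does not), so that each group mean is the proportion with concerns. The sketch below assumes a pooled-variance (equal variance) two-sample t-test; the 0/1 coding and the helper name `pooled_t` are assumptions made for illustration, not details from the study. It stops at the t statistic and degrees of freedom, since the Python standard library has no t distribution from which to read a p value.

```python
from math import sqrt
from statistics import mean, variance

# 0/1 indicators rebuilt from the reported counts (n = 24 per group).
rural = [1] * 13 + [0] * 11
high_tech = [1] * 22 + [0] * 2

def pooled_t(x, y):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t-test."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    # Pooled estimate of the common variance from the two sample variances.
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    std_err = sqrt(pooled_var * (1 / nx + 1 / ny))
    return (mean(y) - mean(x)) / std_err, df

t_stat, df = pooled_t(rural, high_tech)
print(f"t = {t_stat:.3f}, df = {df}")
```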
Values to report:
· Chi square value
· degrees of freedom
· p value
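For reference, the three values listed above can be computed for the table in this example with a short Python sketch. It assumes the Pearson Chi Square statistic without a continuity correction, and the function name `chi_square_2x2` is hypothetical. The p value shortcut used here is valid only for 1 degree of freedom, which is the case for any 2 x 2 table.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Return (chi square value, degrees of freedom, p value) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    # With df = 1, the upper-tail chi square probability is erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, df, p_value

# Rows: have / don't have information concerns; columns: rural / high-tech.
chi2, df, p_value = chi_square_2x2([[13, 22], [11, 2]])
print(f"Chi square = {chi2:.3f}, df = {df}, p = {p_value:.4f}")
```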