Chi Square Test

 

1 Categorical Dependent Variable

1 (2-level) Categorical Independent Variable

with cell values > 5

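A Chi Square test measures whether there is a significant association between a categorical independent variable and a categorical dependent variable, that is, whether the distribution of the dependent variable differs across the levels of the independent variable.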

 

In ‘Users’ Conceptions of Risks and Harms on the Web: A Comparative Study’ [2], Friedman et al. study types of concerns Americans have about using the internet. The research team tests whether the concerns vary depending on the type of community the person resides in. The team studies views of individuals from a small town in Maine, a suburban professional community in New Jersey, and a high technology community in California.

 

The researchers perform a Chi Square test to determine whether subjects from the high-tech community have more internet information concerns than individuals from the small town do. The research team defines information concerns as concerns about content legitimacy, privacy, security, spam, and others. The team reports statistically significant results as follows:

 

 

                                  Rural (n = 24)    High-tech (n = 24)
Have information concerns         13                22
Don’t have information concerns   11                2

 

The table above appears to illustrate a valid application of the Chi Square test. The categorical independent variable, community type, is compared against a categorical dependent variable, concern about internet information, so the variable types fit the Chi Square requirements. Chi Square is not the correct statistical test, however, because it is applicable only when each of the cell values is > 5, and the ‘High-tech’/‘Don’t have information concerns’ cell has a value of 2. The Fisher Exact test would have been the appropriate test because it does not have a minimum cell value requirement. A quick run of the Fisher Exact test shows that the study results are still statistically significant when it is applied.
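The quick Fisher Exact check mentioned above can be sketched in a few lines of Python. This is a minimal illustration that uses only the standard library and the four counts from the table; the function name fisher_exact_two_sided is a hypothetical helper, and the two-sided rule used here (sum the probabilities of all tables no more likely than the observed one) is an assumption, since the text does not say which variant was run.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper written for illustration only.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)   # smallest feasible top-left count
    hi = min(row1, col1)       # largest feasible top-left count
    p_obs = prob(a)
    # Add up every table that is no more probable than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))

# Rows: have concerns / don't have concerns; columns: rural / high-tech.
p_value = fisher_exact_two_sided(13, 22, 11, 2)
print(f"Fisher exact two-sided p = {p_value:.4f}")
```

Under these assumptions the p value comes out well below 0.05, which agrees with the statement that the result remains significant. A statistics package such as SciPy (scipy.stats.fisher_exact) offers the same test ready-made.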

 

The research team could have analyzed the data in a slightly different way by using a 2-sample independent t-test, comparing whether the percentage of rural residents who had internet information concerns differed significantly from the percentage of high-tech residents with the same concerns. The research team also could have used one-way ANOVA to test whether internet information concerns differed significantly across all 3 types of communities they studied. Using one-way ANOVA, community type (with its 3 levels) would have served as the independent variable, and internet information concerns as the dependent variable.

 

Values to report:

·        Chi square value

·        degrees of freedom

·        p value
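As a rough illustration of where the three reported values come from, the sketch below computes them for a 2 x 2 table of counts, using the counts from the community table purely as sample input. The function name chi_square_2x2 is a hypothetical helper, and the calculation assumes the plain Pearson statistic with no continuity correction; the study does not say which variant it used.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson Chi Square for a 2x2 table of counts (no continuity correction).

    Hypothetical helper written for illustration only.
    Returns (chi_square_value, degrees_of_freedom, p_value).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)  # equals 1 for a 2x2 table
    # With 1 degree of freedom the upper-tail probability of the
    # Chi Square distribution reduces to erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, dof, p_value

# Rows: have concerns / don't have concerns; columns: rural / high-tech.
chi2, dof, p = chi_square_2x2([[13, 22], [11, 2]])
print(f"Chi Square = {chi2:.2f}, df = {dof}, p = {p:.4f}")
```

The closed-form p value only holds for one degree of freedom, so a larger table would need a general Chi Square distribution function, for example from a statistics library.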