Linear Discriminant Analysis
1 Categorical Dependent Variable
1 or more Continuous Independent Variables with normal distribution.
Linear discriminant analysis (LDA) classifies a sample object into one of two or more categories based on measured properties of the object. It tests whether the object attributes measured in an experiment predict how the objects are categorized.
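To make the classification step concrete, here is a minimal sketch of two-category LDA with two continuous attributes, written in plain Python. It assumes both categories share one covariance matrix and estimates the category priors from the group sizes. The function names and the data points are hypothetical and chosen only for illustration.

```python
from math import log

def mean_vector(rows):
    """Per-attribute mean of a list of (x1, x2) observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def pooled_covariance(groups, means):
    """Pooled within-category 2x2 covariance (LDA assumes it is shared)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, mu in zip(groups, means):
        for r in rows:
            d = [r[0] - mu[0], r[1] - mu[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    dof = sum(len(g) for g in groups) - len(groups)
    return [[s[a][b] / dof for b in range(2)] for a in range(2)]

def fit_lda(group0, group1):
    """Return (w, b) so that w.x + b > 0 predicts category 1."""
    mu0, mu1 = mean_vector(group0), mean_vector(group1)
    s = pooled_covariance([group0, group1], [mu0, mu1])
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [mu1[0] - mu0[0], mu1[1] - mu0[1]]
    # Direction of the linear boundary: inverse covariance times mean difference.
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(mu0[0] + mu1[0]) / 2, (mu0[1] + mu1[1]) / 2]
    # Offset: boundary passes through the midpoint, shifted by the log prior ratio.
    b = -(w[0] * mid[0] + w[1] * mid[1]) + log(len(group1) / len(group0))
    return w, b

def predict(model, x):
    w, b = model
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# Hypothetical measurements: two attributes per object, two known categories.
group0 = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0)]
group1 = [(6.0, 7.0), (7.0, 6.0), (7.0, 8.0), (8.0, 7.0)]
model = fit_lda(group0, group1)
print(predict(model, (3.0, 3.0)), predict(model, (6.0, 5.0)))
```

With more than two attributes the same idea applies, but a linear algebra library would normally replace the hand-written 2x2 inverse.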
In ‘Empirically Validated Web Page Design Metrics’, Ivory et al. seek to identify characteristics of award-winning web pages. The researchers identify 11 easily measurable web page characteristics they believe contribute to web page usability. The metrics focus on text presentation, link availability, page size, graphic content and use of color. They apply the 11 metrics to web pages entered in the Webby Awards, a competition where design experts evaluate web sites for usability. The purpose of the experiment is to determine whether a web page’s first-round Webby Award competition score correlates with the 11 design metrics.
The research team divided web pages from the Webby Awards into two categories based on the score each page earned in the first round of the Webby competition. Category 1 included web pages with scores in the top 33% of the competition, and category 2 included web pages with scores in the bottom 67% of the competition. Next, using an automated tool, the research team measured each web page in the competition against the 11 design metrics identified earlier. They then predicted whether each web page belonged to the Webby Award ‘top 33%’ or ‘bottom 67%’ category based on its values for the 11 metrics. Overall, their automated tool was 67% accurate in categorizing the web pages.
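The two bookkeeping steps in this procedure, turning a continuous score into two categories and then scoring the predictions, can be sketched as follows. This is an illustration only: the scores and predictions are made-up values, not data from the study, and the helper names are hypothetical.

```python
def label_top_fraction(scores, fraction=1/3):
    """Return 1 for scores in the top `fraction`, 2 for the rest.

    Ties at the cutoff are all placed in category 1, which is an
    assumption of this sketch.
    """
    ranked = sorted(scores, reverse=True)
    n_top = max(1, round(len(scores) * fraction))
    cutoff = ranked[n_top - 1]
    return [1 if s >= cutoff else 2 for s in scores]

def accuracy(predicted, actual):
    """Fraction of objects whose predicted category matches the actual one."""
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)

# Hypothetical first-round scores for nine pages.
scores = [9.1, 4.2, 7.5, 3.3, 8.8, 5.0, 6.1, 2.7, 4.9]
actual = label_top_fraction(scores)

# Hypothetical categories predicted by a classifier for the same pages.
predicted = [1, 2, 2, 2, 1, 2, 1, 2, 2]
print(actual, accuracy(predicted, actual))
```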
In this example, the research team demonstrates a valid use of linear discriminant analysis. First, the independent variables, the 11 design metrics, had continuous numeric values, which satisfies the independent variable type needed to apply LDA. Second, the dependent variable, Webby Award score, once made into a dichotomous categorical variable for the experiment, satisfied the LDA dependent variable requirement. LDA is not commonly used in HCI experiments, so its use here was an innovative experimental design tactic and shows creativity on the part of the research team.
As alternatives, the research team could have used either quadratic discriminant analysis (QDA) or logistic regression. QDA takes the same input parameters as LDA and returns the same kind of result, a predicted category for each object. It is not guaranteed to produce the same classifications, however: QDA fits a separate covariance matrix to each category, so it separates the categories with quadratic rather than linear boundaries. Which of the two to use depends on whether the categories can reasonably be assumed to share a covariance structure, as well as on preference and the availability of software to support the analysis. Logistic regression takes the same input parameters and also returns a predicted category, but it does not assume that the independent variables are normally distributed. Logistic regression is preferable in unequal grouping conditions. For example, if the researchers expect only 0.01 of the cases to be categorized in group 1 and 0.99 to be categorized in group 2, logistic regression would be the better choice because it is better suited to unequal groupings than LDA and QDA are.[i]
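For reference, the relationship between LDA and QDA can be seen in their standard textbook discriminant functions. The symbols here are not from the original text: x is the vector of independent variables, \mu_k and \pi_k are the mean vector and prior probability of category k, \Sigma is the covariance matrix shared by all categories in LDA, and \Sigma_k is the covariance matrix of category k in QDA. In both methods an object is assigned to the category with the largest \delta_k(x).

```latex
\begin{align*}
\delta_k^{\mathrm{LDA}}(x) &= x^{\top}\Sigma^{-1}\mu_k
  - \tfrac{1}{2}\,\mu_k^{\top}\Sigma^{-1}\mu_k + \log \pi_k \\
\delta_k^{\mathrm{QDA}}(x) &= -\tfrac{1}{2}\log\lvert\Sigma_k\rvert
  - \tfrac{1}{2}\,(x-\mu_k)^{\top}\Sigma_k^{-1}(x-\mu_k) + \log \pi_k
\end{align*}
```

The LDA function is linear in x, while the QDA function contains a term that is quadratic in x. When every \Sigma_k equals the same \Sigma, the quadratic terms are identical across categories and the two rules give the same classification.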
[i] From personal communication with Michael Peascoe at the University of Minnesota Statistical Consulting Clinic.