
English or languish - Probing the ramifications of Hong Kong's language policy

Multiple Discriminant Analysis (MDA)


- General points of interest
- Classification matrices
- Cutting score determination
- Chance models
- Classification matrices
- Measures of the discriminant function's statistical significance are generally of little value on their own: the function may still fail to discriminate well between the groups even when the distance between their centroids is statistically significant. How well the discriminant function classifies individual group members is a better test of its utility.
*Hit-ratios*- A hit-ratio is the percentage of correctly classified observations. In so far as hit-ratios tell how much of the total variation in the dependent variable is accounted for by the discriminant function, they are somewhat analogous to R-square values in regression analysis. Accordingly, the F-statistic for regression analysis and the Chi-square statistic for discriminant analysis are analogous.
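As a minimal sketch of the definition above, the hit-ratio can be computed directly from the known and predicted group labels. The function name and the sample labels are hypothetical, chosen only for illustration.

```python
def hit_ratio(actual, predicted):
    """Percentage of observations whose predicted group equals the known group."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    hits = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * hits / len(actual)


# Hypothetical labels for eight observations (illustrative data only).
actual = ["A", "A", "A", "B", "B", "B", "B", "A"]
predicted = ["A", "A", "B", "B", "B", "A", "B", "A"]

ratio = hit_ratio(actual, predicted)
print(f"hit-ratio: {ratio:.1f}%")
```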
*Cutting scores (Critical Z-values)*- A cutting score is the decision rule for determining an individual observation's group membership. If the groups are of equal size, the cutting score is the midpoint between the centroids of the groups. When the groups are of different sizes, the centroids must be weighted by group size.
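The two cases can be sketched for a two-group analysis as follows. The page does not give the weighting formula, so the weighted version below assumes one widely used form, (N_A * Z_B + N_B * Z_A) / (N_A + N_B), where Z_A and Z_B are the group centroids and N_A and N_B the group sizes. All names and numbers are illustrative.

```python
def equal_size_cutting_score(z_a, z_b):
    """Midpoint between the two group centroids (groups of equal size)."""
    return (z_a + z_b) / 2


def weighted_cutting_score(z_a, z_b, n_a, n_b):
    """Size-weighted cutting score (assumed form, see the note above).

    Each centroid is weighted by the size of the *other* group, so the
    cut moves toward the centroid of the smaller group.
    """
    return (n_a * z_b + n_b * z_a) / (n_a + n_b)


def assign_group(z, cut, z_a, z_b):
    """Assign an observation to the group whose centroid lies on its side of the cut."""
    a_is_lower = z_a < z_b
    if z < cut:
        return "A" if a_is_lower else "B"
    return "B" if a_is_lower else "A"


# Hypothetical centroids and group sizes.
z_a, z_b = -1.0, 2.0
midpoint = equal_size_cutting_score(z_a, z_b)
weighted = weighted_cutting_score(z_a, z_b, n_a=30, n_b=10)
print(midpoint, weighted)
print(assign_group(1.0, midpoint, z_a, z_b), assign_group(1.0, weighted, z_a, z_b))
```

With equal group sizes the weighted form reduces to the midpoint, which is a quick sanity check on the assumed formula.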
*Optimal cutting scores*- Weighted cutting scores that do not take into account the cost of misclassification are not optimal, unless the cost of misclassification is the same for both groups.
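To illustrate why unequal costs move the cut, the sketch below searches for the cutting score that minimizes total misclassification cost on a set of sample discriminant scores. This is an empirical stand-in rather than a formula from this page; it assumes group A lies lower on the discriminant axis, and the scores and costs are hypothetical.

```python
def total_cost(cut, scores_a, scores_b, cost_a_as_b, cost_b_as_a):
    """Total cost when observations with a score above `cut` are assigned to B."""
    a_as_b = sum(z > cut for z in scores_a)   # A members wrongly assigned to B
    b_as_a = sum(z <= cut for z in scores_b)  # B members wrongly assigned to A
    return a_as_b * cost_a_as_b + b_as_a * cost_b_as_a


def cost_minimizing_cut(scores_a, scores_b, cost_a_as_b, cost_b_as_a):
    """Try the midpoints between neighbouring scores and keep the cheapest cut."""
    points = sorted(set(scores_a) | set(scores_b))
    candidates = [(lo + hi) / 2 for lo, hi in zip(points, points[1:])]
    return min(
        candidates,
        key=lambda c: total_cost(c, scores_a, scores_b, cost_a_as_b, cost_b_as_a),
    )


# Hypothetical discriminant scores for the two groups.
scores_a = [-2.0, -1.0, 0.0, 1.0]
scores_b = [0.5, 0.8, 2.5, 3.0]

equal_cost_cut = cost_minimizing_cut(scores_a, scores_b, 1, 1)
# Misclassifying an A as a B is now five times as costly as the reverse.
unequal_cost_cut = cost_minimizing_cut(scores_a, scores_b, 5, 1)
print(equal_cost_cut, unequal_cost_cut)
```

When misclassifying a member of A becomes more expensive, the chosen cut moves away from A's scores, so fewer A members are placed in B at the price of more errors on B.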
*Procedure* (see cluster analysis, research design issues)
After the sample has been split and the discriminant function determined, the observations of the hold-out sample are classified by the function and placed into a table that compares their estimated membership with the known membership.
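A minimal sketch of such a table, assuming the hold-out observations have already been scored and assigned to groups by the discriminant function. The labels are hypothetical; rows hold the known membership and columns the estimated membership.

```python
from collections import Counter


def classification_matrix(actual, predicted, groups):
    """Rows are known groups, columns are groups assigned by the function."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in groups] for a in groups]


# Hypothetical hold-out sample of eight observations.
groups = ["A", "B"]
actual = ["A", "A", "A", "A", "B", "B", "B", "B"]
predicted = ["A", "A", "A", "B", "B", "B", "A", "B"]

matrix = classification_matrix(actual, predicted, groups)
for group, row in zip(groups, matrix):
    print(group, row)

# Correct classifications sit on the diagonal of the matrix.
correct = sum(matrix[i][i] for i in range(len(groups)))
overall = 100.0 * correct / len(actual)
print(f"correctly classified: {overall:.1f}%")
```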
- A t-test is employed to determine the level of significance of the discriminant function's ability to classify the observations correctly. For a two-group analysis with equal sample size the following formula is employed:
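The formula itself is not reproduced on this page. A form commonly given in multivariate analysis textbooks for the two-group, equal-size case is t = (p - 0.5) / sqrt(0.5 * (1 - 0.5) / N), where p is the proportion correctly classified and N is the sample size. The sketch below assumes that form; the function name and the numbers are illustrative only.

```python
import math


def classification_t_statistic(p_correct, n):
    """t statistic for classification accuracy against a 50% chance rate.

    Assumed form: t = (p - 0.5) / sqrt(0.5 * (1 - 0.5) / n), for two
    groups of equal size.
    """
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be a proportion between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    standard_error = math.sqrt(0.5 * (1 - 0.5) / n)
    return (p_correct - 0.5) / standard_error


# Hypothetical result: 75% of 100 observations correctly classified.
t_value = classification_t_statistic(0.75, 100)
print(round(t_value, 3))
```

The resulting value would then be compared with the critical t-value at the chosen significance level.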
*Chance models*- a few rules of thumb
In general, the discriminant function should be able to predict the proper group for each observation better than chance. How much better depends on two things: the cost of generating the discriminant function (namely, the cost of analysis) and the actual value derived from accurately predicting group membership.
Two common rules of thumb in this regard are the
- Maximum chance criterion, and
- Proportional chance criterion
- Interpretation of the discriminant function becomes useful only if both the statistical significance of the function and a satisfactory level of classification accuracy are achieved.
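The two chance criteria named above can be sketched as follows, using their usual definitions, which this page does not spell out. The maximum chance criterion is the share of the largest group (the hit-ratio obtained by assigning every observation to that group), and the proportional chance criterion is the sum of the squared group proportions, which for two groups is p^2 + (1 - p)^2. The group sizes are hypothetical.

```python
def maximum_chance(group_sizes):
    """Share of the largest group."""
    total = sum(group_sizes)
    return max(group_sizes) / total


def proportional_chance(group_sizes):
    """Sum of squared group proportions."""
    total = sum(group_sizes)
    return sum((n / total) ** 2 for n in group_sizes)


# Hypothetical two-group sample with 30 and 10 observations.
sizes = [30, 10]
c_max = maximum_chance(sizes)
c_pro = proportional_chance(sizes)
print(f"maximum chance: {c_max:.3f}, proportional chance: {c_pro:.3f}")
```

With equal group sizes both criteria equal 0.5 in the two-group case. They diverge as the groups become more unequal, and the maximum chance criterion is then the harder benchmark to beat.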