English or languish - Probing the ramifications
of Hong Kong's language policy
Key Features

  • Variables
    • Dependent variable - nonmetric categorical (2 or more groups)
    • Independent variables - metric

  • Objectives
    • Determine if statistically significant differences exist between the average score profiles of two or more a priori defined groups.
    • Determine the combination of independent variables that best discriminates among two or more a priori defined groups of the dependent variable.
    • Determine which independent variables account most for the differences in average score profiles for each group.

  • Statistical procedure - Maximize the between-group variance and minimize the within-group variance. When the variance between groups is large relative to the variance within groups, good discrimination has been achieved.

  • Null hypothesis - The group means of two or more groups are statistically identical.
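
The variance criterion described above can be illustrated with a minimal sketch. The data, the group names and the helper names (`scores`, `variance_ratio`) are hypothetical and chosen only for illustration. The ratio is computed on sums of squares, which is one common way to express the between-group versus within-group comparison.

```python
from statistics import mean

def scores(rows, weights):
    """Discriminant score Z = W1*X1 + ... + Wn*Xn for each observation."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]

def variance_ratio(group_a, group_b, weights):
    """Between-group sum of squares of Z divided by within-group sum of squares of Z."""
    za, zb = scores(group_a, weights), scores(group_b, weights)
    ma, mb = mean(za), mean(zb)
    grand = mean(za + zb)
    between = len(za) * (ma - grand) ** 2 + len(zb) * (mb - grand) ** 2
    within = sum((z - ma) ** 2 for z in za) + sum((z - mb) ** 2 for z in zb)
    return between / within

# Hypothetical observations (X1, X2): the groups differ on X1 but hardly on X2.
group_a = [(2.0, 5.0), (3.0, 6.0), (2.5, 4.0), (3.5, 5.5)]
group_b = [(6.0, 5.0), (7.0, 4.5), (6.5, 6.0), (7.5, 5.5)]

ratio_x1 = variance_ratio(group_a, group_b, (1.0, 0.0))  # weight only X1
ratio_x2 = variance_ratio(group_a, group_b, (0.0, 1.0))  # weight only X2
print(ratio_x1, ratio_x2)
```

In this made-up data the weights that rely on X1 give a far larger ratio than the weights that rely on X2, which is what "good discrimination" means in the sense used above.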

Key Terms

  • The discriminant function

    Z = W1·X1 + W2·X2 + ... + Wn·Xn
    • Z = discriminant scores
    • W = discriminant weights
    • X = independent variables

  • Centroid - The mean of a statistically determined group of the dependent variable. It is the average discriminant score for a particular group.
  • Cutting score - the criterion against which individual observations are classified according to their discriminant scores
  • Group - those observations that fall into a specific category of the dependent variable
  • Hit ratio - the percentage of statistical observations correctly classified by the discriminant function.
  • Split sample approach

    • Analysis sample - the sample used for developing the discriminant function
    • Hold-out (or validation) sample - the sample used to validate the discriminant function
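
The key terms above can be tied together in a short sketch. Everything specific here is an assumption made for illustration: the weights are not estimated but simply given, the data are invented, the function has no constant term, and the cutting score is taken as the midpoint of the two centroids, which is a common choice when the groups are of equal size.

```python
from statistics import mean

def discriminant_score(row, weights):
    """Z = W1*X1 + W2*X2 + ... + Wn*Xn for one observation."""
    return sum(w * x for w, x in zip(weights, row))

weights = (1.0, -0.2)  # hypothetical discriminant weights

# Analysis sample: observations grouped by category of the dependent variable.
analysis = {
    "A": [(2.0, 5.0), (3.0, 6.0), (2.5, 4.0), (3.5, 5.5)],
    "B": [(6.0, 5.0), (7.0, 4.5), (6.5, 6.0), (7.5, 5.5)],
}

# Centroid: the average discriminant score of each group.
centroids = {
    group: mean(discriminant_score(row, weights) for row in rows)
    for group, rows in analysis.items()
}

# Cutting score: midpoint of the centroids (assumes equal group sizes).
cutting_score = (centroids["A"] + centroids["B"]) / 2

def classify(row):
    """Assign the group whose centroid lies on the same side of the cutting score."""
    low, high = sorted(centroids, key=centroids.get)
    return low if discriminant_score(row, weights) < cutting_score else high

# Hold-out sample: (observation, actual group) pairs, also hypothetical.
holdout = [((2.8, 5.0), "A"), ((3.0, 5.0), "A"), ((6.8, 5.2), "B"), ((4.0, 5.0), "B")]

# Hit ratio: percentage of hold-out observations classified correctly.
hits = sum(classify(row) == label for row, label in holdout)
hit_ratio = 100.0 * hits / len(holdout)
print(centroids, cutting_score, hit_ratio)
```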


Assumptions

  • Multivariate normality
  • No prior probabilities with regard to group identity
  • When the sample size is large, violations of these assumptions have little adverse effect.

3-Step Analysis

  • Step 1 - Derivation of the discriminant function

    After the dependent variable is selected from among all variables, the sample data are split into analysis and hold-out samples, and the statistical procedure is run against only the data from the analysis sample.

  • Step 2 - Validation

    Once the discriminant function has been estimated in Step 1, it is examined for statistical significance. If it is statistically significant, its power to classify correctly is tested using the observations in the hold-out sample.

  • Step 3 - Interpretation
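
The steps above can be sketched end to end for the simplest case of two groups and two independent variables. This is only an illustration under stated assumptions: the data are simulated, the weights are derived with Fisher's two-group rule (inverse of the pooled within-group scatter times the difference in group means), the significance test of Step 2 is omitted, and all function names are hypothetical. The final print shows the weights, whose relative sizes are one thing a reader might inspect when both variables are on comparable scales.

```python
import random
from statistics import mean

def make_group(rng, center, n):
    """Simulate n observations (X1, X2) scattered around a group center."""
    return [(rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)) for _ in range(n)]

def split(rows, rng, holdout_fraction=0.4):
    """Randomly split one group's rows into (analysis, hold-out) samples."""
    rows = rows[:]
    rng.shuffle(rows)
    k = int(len(rows) * holdout_fraction)
    return rows[k:], rows[:k]

def column_means(rows):
    return [mean(col) for col in zip(*rows)]

def pooled_within_scatter(groups):
    """2x2 matrix of within-group sums of squares and cross-products."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in groups:
        m = column_means(rows)
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    return s

def fisher_weights(group_a, group_b):
    """W = S^-1 (mean_b - mean_a), using the explicit 2x2 matrix inverse."""
    s = pooled_within_scatter([group_a, group_b])
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    ma, mb = column_means(group_a), column_means(group_b)
    d = (mb[0] - ma[0], mb[1] - ma[1])
    w1 = (s[1][1] * d[0] - s[0][1] * d[1]) / det
    w2 = (-s[1][0] * d[0] + s[0][0] * d[1]) / det
    return (w1, w2)

def score(row, weights):
    return weights[0] * row[0] + weights[1] * row[1]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Step 1: split each group, then derive the function from the analysis sample only.
a_analysis, a_holdout = split(make_group(rng, (2.0, 5.0), 50), rng)
b_analysis, b_holdout = split(make_group(rng, (6.0, 5.0), 50), rng)
weights = fisher_weights(a_analysis, b_analysis)

centroid_a = mean(score(r, weights) for r in a_analysis)
centroid_b = mean(score(r, weights) for r in b_analysis)
cutting = (centroid_a + centroid_b) / 2  # midpoint, since the groups are equal in size

# Step 2: classify the hold-out observations and compute the hit ratio.
hits = sum(score(r, weights) < cutting for r in a_holdout)
hits += sum(score(r, weights) >= cutting for r in b_holdout)
hit_ratio = 100.0 * hits / (len(a_holdout) + len(b_holdout))

print("weights:", weights)
print("hit ratio (%):", hit_ratio)
```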

