English or languish - Probing the ramifications
of Hong Kong's language policy
Regression Analysis

 Quality Assurance

  • Goodness of fit

    • Linearity - Examination of error terms with respect to each independent variable

      • Plot error terms against each independent variable.
      • Partitioning the error terms into pure error and lack-of-fit error
        • Requires multiple observations of the dependent variable for each value of the independent variables
      • F - statistic
        Compares the lack-of-fit error with the pure error, each divided by its degrees of freedom.
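
The partitioning step can be sketched for the simplest case, a straight line fitted to one independent variable with repeated observations at each value. This is a minimal illustration, not the project's own procedure: the function name `lack_of_fit_f` and the data are hypothetical, and the split into "pure error" and "lack of fit" follows the usual textbook lack-of-fit test.

```python
from collections import defaultdict

def lack_of_fit_f(x, y):
    """Split the residual sum of squares of a straight-line fit into
    lack-of-fit and pure error, and return both plus their F ratio."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Ordinary least squares for y = a + b * x.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x

    # Group the repeated observations of y by the value of x.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    c = len(groups)  # number of distinct x values

    ss_pure = 0.0  # spread of y around its own group mean
    ss_lof = 0.0   # distance from each group mean to the fitted line
    for xi, ys in groups.items():
        group_mean = sum(ys) / len(ys)
        ss_pure += sum((yi - group_mean) ** 2 for yi in ys)
        ss_lof += len(ys) * (group_mean - (a + b * xi)) ** 2

    df_lof = c - 2   # distinct x values minus two fitted parameters
    df_pure = n - c  # observations minus distinct x values
    f_ratio = (ss_lof / df_lof) / (ss_pure / df_pure)
    return ss_lof, ss_pure, f_ratio

# Hypothetical data: three observations of y at each of four x values,
# taken from a curved relationship so that a straight line fits poorly.
x = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
y = [1.1, 0.9, 1.0, 4.2, 3.8, 4.0, 9.1, 8.9, 9.0, 16.2, 15.8, 16.0]
ss_lof, ss_pure, f_ratio = lack_of_fit_f(x, y)
print(ss_lof, ss_pure, f_ratio)
```

For this made-up data the lack-of-fit error dwarfs the pure error (an F ratio of about 240), which is the pattern that signals a non-linear relationship.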

    • Constant variance of the error term - Plot error terms against predicted values of the dependent variable.

    • Independence of error terms - Plot error terms against time.

    • Normality of error term distribution.

      • Construct a histogram of error terms
      • Plot the cumulative standardized residuals against the cumulative normal distribution; normally distributed residuals fall along a straight line.
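
The points for such a normal probability plot can be computed as follows. This is a sketch under stated assumptions: the residuals are hypothetical, the function name `normal_probability_points` is invented for illustration, and the plotting position (i - 0.5) / n is only one of several conventions in use.

```python
from statistics import NormalDist, mean, pstdev

def normal_probability_points(residuals):
    """Return (theoretical normal quantile, standardized residual) pairs.
    If the residuals are roughly normal, the pairs lie near a straight line."""
    n = len(residuals)
    centre = mean(residuals)
    spread = pstdev(residuals)
    # Standardize, then sort so the i-th value is the i-th smallest residual.
    standardized = sorted((r - centre) / spread for r in residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, z in enumerate(standardized, start=1):
        # Assumed plotting position: cumulative share of residuals up to rank i.
        p = (i - 0.5) / n
        points.append((std_normal.inv_cdf(p), z))
    return points

# Hypothetical residuals from a fitted regression.
residuals = [-1.2, 0.4, 0.3, -0.5, 1.1, 0.0, -0.2, 0.6, -0.9, 0.4]
points = normal_probability_points(residuals)
for theoretical, observed in points:
    print(round(theoretical, 3), round(observed, 3))
```

Plotting the second value of each pair against the first gives the chart described above; marked curvature or isolated points far from the line suggest non-normal error terms.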

    • Addition of other variables - Regress the error terms on each candidate additional variable.

  • Statistical significance

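    • t - tests against the null hypothesis (coefficient equal to zero) for individual independent variables.
    • F - test for the model as a whole - the ratio of explained variation to unexplained variation, each divided by its degrees of freedom.
    • Coefficient of determination (R-squared) - explained variation as a proportion of total variation.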
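
For a regression with a single independent variable, the t statistic, the F ratio and the coefficient of determination can all be computed from the same sums of squares. The sketch below uses hypothetical data and an invented function name, `simple_regression_tests`; it stops at the test statistics and leaves the comparison against critical values to a statistical table or package.

```python
import math

def simple_regression_tests(x, y):
    """Fit y = intercept + slope * x and return the slope, its t statistic,
    the overall F ratio and R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    fitted = [intercept + slope * xi for xi in x]
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_explained = sum((fi - mean_y) ** 2 for fi in fitted)
    ss_unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

    # Mean square error: unexplained variation per degree of freedom (n - 2).
    ms_error = ss_unexplained / (n - 2)
    return {
        # t test of the null hypothesis that the slope is zero.
        "slope": slope,
        "t": slope / math.sqrt(ms_error / sxx),
        # F ratio: explained mean square (1 df) over unexplained mean square.
        "f": (ss_explained / 1) / ms_error,
        # R-squared: explained share of total variation.
        "r_squared": ss_explained / ss_total,
    }

# Hypothetical sample.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
result = simple_regression_tests(x, y)
print(result)
```

With only one independent variable the F ratio equals the square of the t statistic, so the two tests agree; with several independent variables they answer different questions (each coefficient separately versus the equation as a whole).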

  • Predictive value

    • Predict only within the range of the sample data used to estimate the model

  • Strength of association - regression coefficients

    • Beta coefficients - Coefficients resulting from standardized data that permit direct comparison of the relative importance of each independent variable on the dependent variable. Beta values provide useful comparisons only when

      • multicollinearity is small
      • the number and kind of variables remain the same
      • the range of values of each variable does not change.
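
A small worked case shows why standardized coefficients are easier to compare than raw ones. The sketch below handles two independent variables only, using the closed-form expression for beta coefficients in terms of simple correlations. The data and the names `pearson` and `beta_coefficients` are hypothetical.

```python
import math

def pearson(u, v):
    """Simple (Pearson) correlation between two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def beta_coefficients(x1, x2, y):
    """Beta coefficients for y regressed on two standardized predictors."""
    r_y1 = pearson(y, x1)
    r_y2 = pearson(y, x2)
    r_12 = pearson(x1, x2)
    # The denominator approaches zero as multicollinearity grows,
    # which makes the beta values unstable.
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2, r_12

# Hypothetical data: x1 and x2 are measured on very different scales
# and are uncorrelated with each other.
x1 = [1, 2, 3, 4, 1, 2, 3, 4]
x2 = [10, 10, 10, 10, 20, 20, 20, 20]
y = [2 * a + 0.5 * b for a, b in zip(x1, x2)]  # raw coefficients 2 and 0.5
beta1, beta2, r_12 = beta_coefficients(x1, x2, y)
print(beta1, beta2, r_12)
```

The raw coefficients (2 and 0.5) suggest that x1 matters four times as much as x2, but the beta values come out at about 0.67 and 0.75: once the different scales are removed, x2 has the slightly larger effect.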

  • Selecting the best predictive model - Stepwise regression analysis

    • Backward elimination - a process of elimination that starts with all candidate independent variables in the regression equation and removes, one at a time, those that contribute least, leaving a small number of independent variables with the greatest predictive power.

    • Stepwise forward estimation - a process by which additional independent variables are added to the regression equation based on their ability to account for the unexplained error left by the independent variables already in the equation.
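
Forward estimation can be sketched as a loop that repeatedly adds whichever remaining variable removes the most unexplained error, stopping when the improvement is too small. Everything below is illustrative: the helper names (`solve`, `residual_ss`, `forward_selection`), the data, and the entry threshold `f_to_enter=4.0` are assumptions, and statistical packages apply their own entry and removal rules.

```python
def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - s) / m[r][r]
    return coef

def residual_ss(columns, y):
    """Unexplained sum of squares for y regressed on an intercept
    plus the given columns (least squares via the normal equations)."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    coef = solve(xtx, xty)
    return sum((yi - sum(c * v for c, v in zip(coef, row))) ** 2
               for row, yi in zip(design, y))

def forward_selection(candidates, y, f_to_enter=4.0):
    """candidates maps a variable name to its list of values.
    f_to_enter is an assumed partial-F threshold for adding a variable."""
    selected = []
    remaining = dict(candidates)
    n = len(y)
    current_ss = residual_ss([], y)  # intercept-only model
    while remaining:
        chosen = [candidates[name] for name in selected]
        # Unexplained error left after adding each remaining variable in turn.
        trial = {name: residual_ss(chosen + [col], y)
                 for name, col in remaining.items()}
        best = min(trial, key=trial.get)
        df_resid = n - (len(selected) + 2)  # intercept + current + new variable
        if df_resid <= 0:
            break
        partial_f = (current_ss - trial[best]) / (trial[best] / df_resid)
        if partial_f < f_to_enter:
            break
        selected.append(best)
        current_ss = trial[best]
        del remaining[best]
    return selected

# Hypothetical data: y depends on x1 and x2 plus a little noise; x3 is unrelated.
x1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x2 = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
x3 = [2, 7, 1, 8, 2, 8, 1, 8, 2, 8]
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.2, -0.15, 0.05, -0.05, -0.05]
y = [3 * a + 2 * b + e for a, b, e in zip(x1, x2, noise)]
selected = forward_selection({"x1": x1, "x2": x2, "x3": x3}, y)
print(selected)
```

Here x1 enters first because it accounts for the most variation on its own, and x2 enters second because it accounts for most of what x1 leaves unexplained. Backward elimination runs the same comparison in reverse, starting from the full equation and dropping the weakest variable at each step.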

  • Dummy variables