











English or languish: Probing the ramifications of Hong Kong's language policy



Factor Analysis
Principal Component and Common Factor Analysis





Key Features
 Variables
 Metric (interval or ratio)  Many
 Dummy variables
 Sample Size  As the number of respondents to the
HKLNA study is expected to be large, sample size is not likely
to be a problem for this procedure. As a general
rule, the number of observations should be four to five times
the number of variables.
 Objective  Data reduction and summarization
 General Uses
Factor analysis is an interdependence technique
that examines the interrelationships among a large number of
variables in an effort to determine underlying dimensions (factors).
Factor analysis can be used for the following purposes:
 R-analysis  Identifies a set of underlying dimensions
among a large number of variables.
 Q-analysis  Condenses a large number of observations into distinctly
different groups whose shared characteristics describe the population
from which the observations are drawn.
 Identify key variables among a large number of variables
for the purpose of further analysis using other, often predictive,
statistical analyses (surrogate variable selection).
 Create an entirely new, but less numerous, set of variables
to replace the original set for the purpose of further analysis
using other statistical techniques.
 Statistical Procedure  Although it is
a very useful statistical procedure, factor analysis requires
a large number of decisions in order to obtain meaningful results.
For this reason a decision tree
and corresponding explanation
have been created. There are two major approaches to factor
analysis:
 Principal component analysis
 Common factor analysis




Key Terms
 Communality  The communality
of a variable is the proportion of the variable's total variance
explained by all of the factors on which it loads. It is the sum
of the squared loadings for that variable on all factors.
$H_i^2 = a_{i1}^2 + a_{i2}^2 + \dots + a_{iP}^2$
The square root of the communality for each variable, $\sqrt{H_i^2}$,
is the length of the variable's vector of factor loadings in factor
space.
Subtracting the communality of a variable from one yields that
variable's uniqueness (unique variance).
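
The communality and uniqueness definitions above can be checked numerically. The sketch below assumes a small, made-up loading matrix (4 variables, 2 factors); the numbers are illustrative only and do not come from the HKLNA study.

```python
import numpy as np

# Hypothetical factor loadings a_ij: rows are variables, columns are factors.
loadings = np.array([
    [0.8, 0.3],
    [0.7, 0.4],
    [0.2, 0.9],
    [0.1, 0.6],
])

# Communality H_i^2: sum of squared loadings across each row (variable).
communality = (loadings ** 2).sum(axis=1)

# Length of each variable's loading vector in factor space.
vector_length = np.sqrt(communality)

# Uniqueness U_i = 1 - H_i^2.
uniqueness = 1.0 - communality

print(communality)
print(uniqueness)
```

Each communality and its matching uniqueness sum to one, which is the decomposition of a standardized variable's total variance into common and unique parts.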
 Eigenvalues  The sum of the column of squared loadings for
each factor (written $EV_j$ or $\lambda_j$):
 $EV_j = a_{1j}^2 + a_{2j}^2 + \dots + a_{Nj}^2$
 Eigenvalues are the roots of the characteristic equation
for the correlation matrix. There is one eigenvalue for each
factor.
 $\sum_j \lambda_j = \sum_i H_i^2$
for j = 1, 2, ..., P and i = 1, 2, ..., N
 Dividing eigenvalues by either the number of variables (component
analysis) or the sum of the communalities (common factor analysis)
and multiplying by 100 yields the percent of variation (see percent of variance) which a single factor takes
into account.
 Although the eigensums are the same for both the rotated
and unrotated factor solutions, the eigenvalues themselves have
different meanings. In the initial, unrotated solution the vertical
sum of the squared loadings tells us something about the relative
importance of each factor. This is not true for the rotated version!
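
As a rough illustration of the eigenvalue properties listed above, the following sketch runs a full principal component solution on a small, invented correlation matrix. It assumes NumPy's `numpy.linalg.eigh` for the eigendecomposition; the matrix `R` is hypothetical.

```python
import numpy as np

# Hypothetical correlation matrix for three variables.
R = np.array([
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.4],
    [0.3, 0.4, 1.0],
])

# Roots of the characteristic equation of R, sorted largest first.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Unrotated component loadings: eigenvectors scaled by sqrt(eigenvalue).
loadings = eigvecs * np.sqrt(eigvals)

# Column sums of squared loadings reproduce the eigenvalues.
column_ss = (loadings ** 2).sum(axis=0)

# Row sums give the communalities (all 1 when every component is kept).
communalities = (loadings ** 2).sum(axis=1)

# Component analysis: divide by the number of variables, multiply by 100.
percent_of_variance = 100.0 * eigvals / R.shape[0]

print(eigvals, percent_of_variance)
```

Because all three components are retained here, the eigenvalues sum to the number of variables and every communality equals one; dropping components would lower both sums by the same amount.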
 Factor Loading
 $a_{ij}$  A factor loading is the correlation
between an original variable (or observation when performing
Q-analysis) and a particular factor.
 $a_{ij}^2$  The square of a factor loading
is the fraction of variation that a variable shares in
common with a particular factor.
 Factor Matrix  The tabulated
numerical output of a factor solution. A factor matrix generally
includes a list of factors with their associated variable factor
loadings. Communalities and eigen values are also provided.
 Factor Pattern Matrix  This matrix contains the coefficients
of the linear combination of factors
that determines the standardized value of each variable
for any given observation:
$Z_i = a_{i1}F_1 + a_{i2}F_2 + \dots + a_{iP}F_P + \varepsilon_i Y_i$
where $Z_i$ = the standardized value of the i-th variable ($X_i$),
where $a_{ij}$ = a regression coefficient for the i-th variable and the j-th factor,
where $F_j$ = the factor score for the j-th factor, and
where $\varepsilon_i Y_i$ = the error term.
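
The linear model above can be made concrete with a short sketch. It assumes a principal component solution on simulated data (the data, seed, and variable names are invented for illustration): with every component retained the standardized values are reproduced exactly, and with fewer factors the leftover plays the role of the error term.

```python
import numpy as np

# Simulated data: 50 observations on 3 variables (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:, 1] += 0.8 * X[:, 0]  # make two variables correlated

# Standardize each variable (mean 0, variance 1, population formula).
Z = (X - X.mean(axis=0)) / X.std(axis=0)
R = (Z.T @ Z) / Z.shape[0]  # correlation matrix

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

A = eigvecs * np.sqrt(eigvals)      # loadings a_ij
F = Z @ eigvecs / np.sqrt(eigvals)  # standardized component scores F_j

# All components kept: Z_i = a_i1 F_1 + ... + a_iP F_P with no error.
Z_full = F @ A.T

# Only two factors kept: what is left over is the error term.
residual = Z - F[:, :2] @ A[:, :2].T
```

In this sketch the mean squared residual of each variable equals one minus its two-factor communality, which is the uniqueness defined elsewhere in this glossary.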
 Factor Scores  Composite measures that reflect the importance
of each factor relative to individual observations.
 Factor solutions  A factor
solution is simply the set of factors that result from a factor
analysis experiment. Factor solutions fall into two major types:
 Orthogonal Factor Solutions, and
 Oblique Factor Solutions
Orthogonal factor solutions yield factors that are
statistically independent and can be used with other statistical
procedures that must satisfy assumptions of statistical independence.
Oblique factor solutions are solutions for which factors
are correlated. They are usually a better reflection of the underlying
reality which the researcher seeks to describe.
 Percent of Trace  The part of total model variance taken into
account by a single factor.
$PV_j = [(a_{1j}^2 + a_{2j}^2 + \dots + a_{Nj}^2) / T] \cdot 100$
where j = the j-th factor
where T = the trace of the correlation matrix
Caution: The percents of trace
for rotated and unrotated common factor solutions are
derived differently from those of principal component analysis.
For common factor analysis the trace for rotated solutions is
the same as that for unrotated solutions. Whereas the summed
communalities for unrotated solutions are equal to the trace
of the correlation matrix, their sum for rotated solutions is
not.
 Total Percent of Trace
 Summing across the percents of trace for each factor yields
the total percent of trace. The total percent of trace is an
indicator of how well a particular factor solution accounts for
the variance of all the variables. If the variables are all very
different the total percent of trace will be low. If the variables
are similar the total percent of trace will be high.
 Percent of Variance  The part of total variance taken into account
by a single factor.
$PV_j = (\lambda_j / N) \cdot 100$
where N = the total number of variables or total model variance.
For common factor analysis the percent of trace and percent
of variance differ: the first measures the proportion
of common variance and the second the proportion of total variance
taken into account by the factor in question.
 Percent of Total Variance (PTV)  The common variance explained
by all factors as a percentage of total variance.
$PTV = PV_1 + PV_2 + \dots + PV_P$
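
To show how percent of trace, percent of variance, and PTV relate, here is a minimal sketch for a common factor solution. The loading matrix is hypothetical, and the trace is taken to be the sum of the communalities, following the definition of trace given for common factor analysis in this glossary.

```python
import numpy as np

# Hypothetical common factor loadings: 4 variables, 2 factors.
loadings = np.array([
    [0.8, 0.3],
    [0.7, 0.4],
    [0.2, 0.9],
    [0.1, 0.6],
])
n_variables = loadings.shape[0]

# Column sums of squared loadings (one per factor).
eigen_sums = (loadings ** 2).sum(axis=0)

# Common factor analysis: trace = sum of the communalities.
trace_common = (loadings ** 2).sum()

# Percent of trace: share of common variance per factor.
percent_of_trace = 100.0 * eigen_sums / trace_common

# Percent of variance: share of total variance per factor.
percent_of_variance = 100.0 * eigen_sums / n_variables

# Percent of total variance explained by all factors together.
ptv = percent_of_variance.sum()

print(percent_of_trace, percent_of_variance, ptv)
```

The percents of trace always total 100 because they partition the common variance, whereas PTV falls short of 100 by the share of variance that is unique to individual variables.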
 Principal Diagonal  The principal diagonal of the correlation
matrix, whose elements are defined differently according to the
factoring procedure employed.
 Principal component analysis  the elements of the
principal diagonal are equal to one. In other words the variable
correlates exactly with itself and all variance associated with
that variable, both systematic and unsystematic, is included
in the analysis.
 Common factor analysis  the elements of the principal
diagonal are not equal to one; rather, they are equal to the
communalities associated with the variables of the original
unrotated solution.
 Sources of Variance
 Common variance  the variance that a single variable
shares in common with one or more of the other variables of the
analysis.
 Unique variance  the sum of both the systematic and
unsystematic variance that is common to no other variable.
 Specific variance  the systematic variance specific
to a particular variable and not shared with other variables.
 Error variance  the unsystematic variance specific
to a particular variable.
 Total variance  For any given variable the total
variance associated with that variable is equal to the sum of
its common variance and unique variance, or alternatively
total variance = common variance + unique variance
where unique variance = specific variance + error variance
 Surrogate variable  The
variable which loads most heavily on a factor and is used to represent
that factor in subsequent analysis. Surrogate variables can be
utilized in lieu of factor scores.
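
Surrogate variable selection can be sketched in a few lines. The loading matrix and variable names below are hypothetical, and "loads most heavily" is interpreted here as the largest absolute loading, which is an assumption rather than something the text specifies.

```python
import numpy as np

# Hypothetical variable names and loadings (4 variables, 2 factors).
names = ["v1", "v2", "v3", "v4"]
loadings = np.array([
    [0.8, 0.3],
    [0.7, 0.4],
    [0.2, 0.9],
    [0.1, 0.6],
])

# For each factor (column), find the variable with the largest
# absolute loading and use it as that factor's surrogate.
surrogate_index = np.abs(loadings).argmax(axis=0)
surrogates = [names[i] for i in surrogate_index]

print(surrogates)
```

Using the absolute value keeps a strongly negative loading eligible, since the sign only indicates the direction of the relationship with the factor.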
 Trace  The sum of the elements
of the principal diagonal of the correlation matrix.
 Principal component analysis  the trace equals the total
number of variables included in the analysis
 Common factor analysis  the trace equals the sum of the
communalities of all variables of the initial unrotated factor
solution.
 Uniqueness  The unique
variance of a variable with respect to other variables of a factor
analysis is obtained as follows:
$U_i = 1 - H_i^2$
See Sources of Variance for further discussion.




Orthogonal Rotation
There are several orthogonal rotation techniques, including:
 Quartimax  The Quartimax approach seeks to simplify the
rows of the factor matrix. The goal is to obtain factor loadings
in which each variable loads high on only one factor and low
on all others. As a result many variables tend to load heavily
on single factors.
 Varimax  The Varimax approach seeks to simplify the columns
of the factor matrix. Thus, for any one factor all variables tend
to load either very high or very low.
 Equimax  The Equimax approach seeks a balance between row
and column simplification.
A thorough analysis might employ them all.
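
As an illustration of column simplification, the sketch below implements one common iterative formulation of varimax (repeated singular value decompositions of a gradient matrix). It is not taken from the source; the function name `varimax`, its parameters, and the test loadings are all assumptions made for this example.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix toward the varimax criterion."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = (rotated ** 2).sum(axis=0)
        # Gradient of the varimax criterion with respect to the rotation.
        gradient = loadings.T @ (rotated ** 3 - rotated * col_ss / n_vars)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return (loadings ** 2).var(axis=0).sum()

# Hypothetical simple structure, deliberately blurred by a 30 degree turn.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.0, 0.9], [0.0, 0.6]])
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
```

Because the rotation is orthogonal, each variable's communality and the total of the column sums of squared loadings are unchanged; only the way that total is split among the factors shifts, which is why the eigenvalues of a rotated solution no longer rank the factors by importance.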




Factor Extraction
Criteria
There are several different criteria commonly employed to
determine the number of factors to extract. These include the







