Statistical Definition of Semantic Axes
Besides the algebraic interpretation, the principal components also admit an
equivalent statistical interpretation. Let us develop it in terms of the
characteristics of the phenomena under study.
Consider the elementary case in which only two characteristics, $x$ and $y$,
are studied in $N$ objects. The two $N$-dimensional vectors of their values
uniquely determine a sample from the corresponding two-dimensional general
population. Each object is therefore located in the semantic space as a point
with coordinates $(x_i, y_i)$.
Let the distribution of the characteristics over the objects obey the normal
law, with the two-dimensional probability density $f(x, y)$:
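For reference, the standard bivariate normal density has the form sketched
below. The notation is assumed here rather than taken from the text: $m_x$,
$m_y$ are the means, $\sigma_x$, $\sigma_y$ the standard deviations and $\rho$
the correlation coefficient of the two characteristics.

```latex
\[
f(x, y) = \frac{1}{2\pi \sigma_x \sigma_y \sqrt{1-\rho^2}}
\exp\left\{ -\frac{1}{2(1-\rho^2)} \left[
\frac{(x-m_x)^2}{\sigma_x^2}
- \frac{2\rho\,(x-m_x)(y-m_y)}{\sigma_x \sigma_y}
+ \frac{(y-m_y)^2}{\sigma_y^2} \right] \right\}
\]
```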

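Let us single out the subset of all objects that fall into the domain bounded
by the condition $f(x, y) \ge c$, where $f(x, y)$ is the probability density
and $c$ is a fixed level. The corresponding ellipse of concentration (see
Fig. 3) is the level curve given by the equation $f(x, y) = c$.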

Hence we obtain a second-degree equation in general form, in which $x$ and
$y$ henceforth denote the values of the two characteristics normalized to
their standard deviations.
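As a sketch under assumed notation, setting a bivariate normal density equal
to a constant and passing to normalized variables yields an equation of the
kind below. Here $\rho$ is the correlation coefficient, $k$ is a non-negative
constant fixed by the chosen density level, and $a_{11}$, $a_{12}$, $a_{22}$
are placeholder names for the coefficients of the general form; none of these
symbols is taken from the text.

```latex
% x and y stand for the normalized values (x - m_x)/\sigma_x, (y - m_y)/\sigma_y
\[
x^2 - 2\rho\, x y + y^2 = k
\]
% the same equation as a special case of the general second-degree form
\[
a_{11} x^2 + 2 a_{12}\, x y + a_{22} y^2 = k,
\qquad a_{11} = a_{22} = 1, \quad a_{12} = -\rho
\]
```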

Fig. 3. Ellipse of concentration obtained by sectioning the “hill” of the
normal distribution density at a fixed “altitude”.
The quadratic form can be written in matrix notation as follows:
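In assumed notation, with $a_{11}$, $a_{12}$, $a_{22}$ as placeholder
coefficients of a general quadratic form in $x$ and $y$ (for the ellipse of
two normalized characteristics one would have $a_{11} = a_{22} = 1$ and
$a_{12} = -\rho$), the matrix notation can be sketched as:

```latex
\[
\begin{pmatrix} x & y \end{pmatrix}
\begin{pmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
= a_{11} x^2 + 2 a_{12}\, x y + a_{22} y^2
\]
```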

The left-hand side of the equation does not change if the signs of $x$ and
$y$ are reversed simultaneously. Consequently, the points of the graph of the
quadratic form come in pairs symmetric about the origin of coordinates; that
is, the second-order curve has a centre of symmetry located at the origin.
Let us reduce the quadratic form to its canonical form. To do so, we rotate
the coordinate axes $x$ and $y$ so that the term containing the product of
the two coordinates vanishes in the new coordinates $x'$ and $y'$.
The inverse transformation is given by the inverse matrix. For an orthogonal
transformation, however, the transposed matrix coincides with the inverse
matrix; therefore, the relation between the old and new coordinates is
expressed by the formulae:
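A sketch of the rotation formulae in assumed notation: $\alpha$ is the
rotation angle (taken counter-clockwise here), $x'$, $y'$ are the new
coordinates and $T$ is a label introduced for the rotation matrix. These
symbols are assumptions, not taken from the text.

```latex
% new coordinates through the old ones
\[
x' = x\cos\alpha + y\sin\alpha, \qquad
y' = -x\sin\alpha + y\cos\alpha
\]
% old coordinates through the new ones (inverse = transposed matrix)
\[
x = x'\cos\alpha - y'\sin\alpha, \qquad
y = x'\sin\alpha + y'\cos\alpha
\]
\[
T = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix},
\qquad T^{-1} = T^{\mathsf{T}}
\]
```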

To locate the new axes $x'$ and $y'$, let us plot a unit vector along the new
x-axis (see Fig. 4). Its projections (coordinates) onto the old axes equal
$\cos\alpha$ and $\sin\alpha$, where $\alpha$ is the rotation angle of the
$x$ and $y$ axes.

Fig. 4. Rotation of the coordinate axes by the angle $\alpha$.
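The unit vector that determines the direction of the new x-axis is therefore
$(\cos\alpha, \sin\alpha)$. By analogy, the unit vector that determines the
direction of the new y-axis is $(-\sin\alpha, \cos\alpha)$. The components of
these unit vectors, which serve as the coefficients of the transformation,
possess the following properties: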
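The properties meant here are presumably the usual relations for a rotation.
In the sketch below the labels $l_1, m_1$ and $l_2, m_2$ for the components
of the two unit vectors are introduced for illustration only, and $\alpha$ is
the assumed rotation angle.

```latex
\[
(l_1, m_1) = (\cos\alpha, \sin\alpha), \qquad
(l_2, m_2) = (-\sin\alpha, \cos\alpha)
\]
\[
l_1^2 + m_1^2 = 1, \qquad l_2^2 + m_2^2 = 1, \qquad
l_1 l_2 + m_1 m_2 = 0, \qquad
\begin{vmatrix} l_1 & l_2 \\ m_1 & m_2 \end{vmatrix} = 1
\]
```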

The latter means that the rotation of the axes leaves the scale unchanged.
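Thus, to reduce the quadratic form to its canonical form, one should pass to
new coordinates in which the middle (cross-product) term is absent, so that
only the squares of $x'$ and $y'$ remain, with coefficients $\lambda_1$ and
$\lambda_2$. Both sides of this identity can be rewritten in matrix form: the
left-hand side with the matrix $A$ of the quadratic form in the old
coordinates, the right-hand side with the diagonal matrix $\Lambda$ of the
coefficients $\lambda_1$ and $\lambda_2$ in the new coordinates. Expressing
the old coordinates through the new ones by the rotation matrix $T$, whose
columns are the two unit vectors, we obtain $T^{\mathsf{T}} A T = \Lambda$.
Since $T^{\mathsf{T}} = T^{-1}$ for orthogonal matrices, multiplying both
sides on the left by $T$ gives $A T = T \Lambda$. Multiplying out the
matrices on the right and on the left and comparing the corresponding matrix
elements, we arrive at linear equations for the components of the unit
vectors.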
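Therefore, the same system of two homogeneous linear equations has as its
roots the components of the first unit vector for $\lambda = \lambda_1$ and
the components of the second unit vector for $\lambda = \lambda_2$. Thus, to
calculate the components of both unit vectors, it is necessary to solve this
system for each of the two values of $\lambda$. The unit vectors specify the
new directions of the axes $x'$ and $y'$.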
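Written out under assumed notation, the system in question has the following
shape. The coefficients $a_{11}$, $a_{12}$, $a_{22}$ of the quadratic form
and the components $l$, $m$ of a unit vector are placeholder names introduced
here.

```latex
\[
\begin{cases}
(a_{11} - \lambda)\, l + a_{12}\, m = 0 \\
a_{12}\, l + (a_{22} - \lambda)\, m = 0
\end{cases}
\qquad l^2 + m^2 = 1
\]
```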
For the system to have a nontrivial solution, its determinant must equal
zero. Therefore, $\lambda$ is found from the following characteristic
equation:
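A sketch of this condition in assumed notation, where $a_{11}$, $a_{12}$,
$a_{22}$ are placeholder coefficients of the quadratic form and $D$ is a
label introduced for the discriminant:

```latex
\[
\begin{vmatrix} a_{11} - \lambda & a_{12} \\ a_{12} & a_{22} - \lambda \end{vmatrix} = 0
\quad\Longrightarrow\quad
\lambda^2 - (a_{11} + a_{22})\,\lambda + (a_{11} a_{22} - a_{12}^2) = 0
\]
\[
D = (a_{11} + a_{22})^2 - 4\,(a_{11} a_{22} - a_{12}^2)
  = (a_{11} - a_{22})^2 + 4 a_{12}^2 \ge 0,
\qquad
\lambda_{1,2} = \frac{(a_{11} + a_{22}) \pm \sqrt{D}}{2}
\]
```

For two normalized characteristics with $a_{11} = a_{22} = 1$ and
$a_{12} = -\rho$, this gives the pair of roots $1 + \rho$ and $1 - \rho$.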

Since the discriminant of this equation is non-negative, $\lambda_1$ and
$\lambda_2$ are real numbers. They are called characteristic numbers and are
the coefficients of the squared unknowns in the quadratic form after its
reduction to canonical form. If $\lambda_1$ and $\lambda_2$ are nonzero and
have the same sign, the quadratic form is called elliptic.
Let the characteristic numbers be distinct, $\lambda_1 \ne \lambda_2$.
Substituting $\lambda_1$ and $\lambda_2$ into the system of equations, we
obtain the two directions of the new coordinate axes $x'$ and $y'$, in which
the quadratic form assumes its canonical form. Since replacing $x'$ by $-x'$
or $y'$ by $-y'$ in the new frame of reference does not alter the quadratic
form, the ellipse is symmetric about the $x'$ and $y'$ coordinate axes; that
is, the coordinate axes run along the principal directions of the ellipse
(Fig. 5).

Fig. 5. Rotation of the axes that brings the quadratic form to its canonical
form.
If $\lambda_1 = \lambda_2$, substituting these values into the system turns
it into an identity, which any pair of principal directions satisfies: the
ellipse degenerates into a circle.

Thus, reducing the quadratic form to its canonical form amounts to solving
the characteristic equation, i.e. calculating the eigenvalues and
eigenvectors. The eigenvectors rotate the coordinate axes into the directions
of the major and minor semiaxes of the ellipse, that is, toward the maximum
of the total variance of $x$ and $y$.
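The two-dimensional reduction can be tried numerically. The following is a
minimal sketch in Python, assuming a quadratic form written as
a11*x^2 + 2*a12*x*y + a22*y^2; the function names `canonical_form_2x2` and
`rotate_form` and the coefficient names are hypothetical and serve only this
illustration.

```python
import math


def canonical_form_2x2(a11, a12, a22):
    """Reduce a11*x^2 + 2*a12*x*y + a22*y^2 to lam1*x'^2 + lam2*y'^2.

    Returns (lam1, lam2, alpha) with lam1 >= lam2, where alpha is the
    rotation angle (radians) of the new x'-axis relative to the old x-axis.
    """
    half_trace = (a11 + a22) / 2.0
    # Half the square root of the discriminant (a11 - a22)^2 + 4*a12^2.
    radius = math.hypot((a11 - a22) / 2.0, a12)
    lam1 = half_trace + radius
    lam2 = half_trace - radius
    # Angle that removes the cross term: tan(2*alpha) = 2*a12 / (a11 - a22).
    alpha = 0.5 * math.atan2(2.0 * a12, a11 - a22)
    return lam1, lam2, alpha


def rotate_form(a11, a12, a22, alpha):
    """Coefficients (b11, b12, b22) of the same form in axes rotated by alpha."""
    c, s = math.cos(alpha), math.sin(alpha)
    b11 = a11 * c * c + 2.0 * a12 * c * s + a22 * s * s
    b12 = (a22 - a11) * c * s + a12 * (c * c - s * s)
    b22 = a11 * s * s - 2.0 * a12 * c * s + a22 * c * c
    return b11, b12, b22


# Assumed example: two normalized characteristics with correlation 0.6,
# i.e. the form x^2 - 2*0.6*x*y + y^2.
lam1, lam2, alpha = canonical_form_2x2(1.0, -0.6, 1.0)
b11, b12, b22 = rotate_form(1.0, -0.6, 1.0, alpha)
```

For this assumed example the characteristic numbers are 1.6 and 0.4, the
rotation angle is -pi/4, and the cross-term coefficient `b12` vanishes up to
rounding. The semiaxis along each new axis is inversely proportional to the
square root of its characteristic number, so the longer semiaxis belongs to
the smaller one.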
Performing the linear transformation discussed above in $n$-dimensional
space, we obtain hyperellipsoids whose principal axes coincide, after
reduction to canonical form, with the principal directions. All the principal
directions are mutually perpendicular. Every eigenvalue $\lambda_i$ has a
corresponding eigenvector that coincides with the $i$-th principal direction.
The procedure discussed is easily generalized to the case of $n$ variables.
In this case the probability density is written as follows:
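For reference, the standard $n$-dimensional normal density can be sketched as
below. The notation is assumed here: $\mathbf{x}$ is the vector of the $n$
variables, $\mathbf{m}$ the vector of their means and $\Sigma$ the covariance
matrix.

```latex
\[
f(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}\, |\Sigma|^{1/2}}
\exp\left\{ -\tfrac{1}{2}\,
(\mathbf{x} - \mathbf{m})^{\mathsf{T}}\, \Sigma^{-1}\, (\mathbf{x} - \mathbf{m})
\right\}
\]
```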

Here $\Sigma$ is the covariance matrix.
After normalizing each variable to its standard deviation, we obtain:

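Let $\rho_{ij}$ be the correlation coefficient of the $i$-th and $j$-th
variables and $R$ the correlation matrix composed of these coefficients. By
analogy with the two-dimensional case, the exponent of the density is then
proportional to $Q$, where $Q$ is a quadratic form in the normalized
variables.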
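A sketch of the normalized density under assumed notation: $z_i$ are the
normalized variables, $m_i$ and $\sigma_i$ the mean and standard deviation of
the $i$-th variable, $\sigma_{ij}$ the covariances, $R = (\rho_{ij})$ the
correlation matrix and $Q$ the quadratic form. These symbols are assumptions
made here for illustration.

```latex
\[
z_i = \frac{x_i - m_i}{\sigma_i}, \qquad
\rho_{ij} = \frac{\sigma_{ij}}{\sigma_i \sigma_j}
\]
\[
f(\mathbf{z}) = \frac{1}{(2\pi)^{n/2}\, |R|^{1/2}}
\exp\left\{ -\tfrac{1}{2}\, Q(\mathbf{z}) \right\},
\qquad
Q(\mathbf{z}) = \mathbf{z}^{\mathsf{T}} R^{-1} \mathbf{z}
\]
```

The matrices $R$ and $R^{-1}$ have the same eigenvectors and reciprocal
eigenvalues, so the principal directions can be found from either of them.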
Consequently, the reduction of the quadratic form to its canonical form again
amounts to solving a characteristic equation, here that of the correlation
matrix $R$: $\det(R - \lambda I) = 0$.
Thus, the columns of the matrix composed of the eigenvectors determine the
weights of the characteristics in the factors, while its rows give the
decomposition of each individual characteristic into factors. This matrix
specifies the linear transformation to a new frame of reference that
coincides with the principal directions.
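The text does not say how the eigenvalues and eigenvectors are computed for
more than two variables. As one possible illustration, the sketch below uses
the cyclic Jacobi rotation method (a technique chosen here, not taken from
the text), which repeats the plane rotation of the two-dimensional case for
every pair of axes until all cross terms vanish. The function names
`jacobi_eigen` and `principal_axes` and the example matrix are hypothetical.

```python
import math


def jacobi_eigen(matrix, tol=1e-20, max_sweeps=50):
    """Eigenvalues and eigenvectors of a symmetric matrix by Jacobi rotations.

    Returns (values, t): values[i] is an eigenvalue and column i of t is the
    corresponding unit eigenvector.
    """
    n = len(matrix)
    a = [list(row) for row in matrix]          # working copy
    t = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        # Sum of squared cross terms; stop once they are negligible.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Small rotation angle that zeroes the cross term a[p][q].
                diff = a[p][p] - a[q][q]
                if diff == 0.0:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):             # columns p, q: A <- A G
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp + s * akq
                    a[k][q] = -s * akp + c * akq
                for k in range(n):             # rows p, q: A <- G^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk + s * aqk
                    a[q][k] = -s * apk + c * aqk
                for k in range(n):             # accumulate rotations: T <- T G
                    tkp, tkq = t[k][p], t[k][q]
                    t[k][p] = c * tkp + s * tkq
                    t[k][q] = -s * tkp + c * tkq
    return [a[i][i] for i in range(n)], t


def principal_axes(corr):
    """Eigenvalues (descending) and matching eigenvectors of a correlation matrix."""
    values, t = jacobi_eigen(corr)
    n = len(values)
    order = sorted(range(n), key=lambda i: -values[i])
    vectors = [[t[k][i] for k in range(n)] for i in order]
    return [values[i] for i in order], vectors


# Assumed example: two normalized characteristics with correlation 0.6.
lams, vecs = principal_axes([[1.0, 0.6], [0.6, 1.0]])
```

For the assumed 2x2 correlation matrix the eigenvalues come out as 1.6 and
0.4, and the first eigenvector points along the diagonal, with components of
equal magnitude. Each entry of `vectors` is one eigenvector, that is, one
column of the transformation matrix.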