Friday, May 28, 2010

LITERATURE REVIEW PART 2/3 - Principal Component Analysis Approach

2.3 Principal Component Analysis Approach


Principal Component Analysis (PCA), also known as the Karhunen-Loeve transformation or eigenspace projection, is a well-known statistical technique for identifying patterns in data: it highlights similarities and differences between patterns. Since patterns can be hard to find in data of high dimension, where the luxury of graphical representation is not available, PCA is a powerful tool for extracting them. The other main advantage of PCA is that, once the patterns are found, the data can be compressed by reducing the number of dimensions with little loss of information [Smith, 2002].
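As a rough illustration of these steps, the short NumPy sketch below centres a small made-up 2-D data set, forms its covariance matrix, and takes the eigenvectors of that matrix as the principal components. The data values and variable names are purely illustrative and are not taken from [Smith, 2002].

import numpy as np

# small illustrative 2-D data set (values made up for demonstration)
data = np.array([[2.0, 1.9],
                 [0.5, 0.6],
                 [1.4, 1.6],
                 [3.1, 2.8],
                 [2.3, 2.4]])

# centre the data and form the covariance matrix
centred = data - data.mean(axis=0)
cov = np.cov(centred, rowvar=False)

# the eigenvectors of the covariance matrix are the principal components;
# eigh is used because the covariance matrix is symmetric
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]           # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]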


PCA is most easily illustrated with Figure 2.2(b), which shows data with a fairly strong pattern: the covariance matrix reveals that both variables increase together. On top of the data, both eigenvectors of the covariance matrix are plotted as diagonal lines. Being eigenvectors, they are perpendicular to each other, and, most importantly, they provide important information about the patterns in the data. The first eigenvector passes through the middle of the points, like a line of best fit, and shows how the two sets of data are related along that line (Figure 2.2(a)). The second eigenvector carries less important information: it describes how far the points lie off to either side of the main line. Thus each eigenvector of the covariance matrix can be represented as a line through the data.




Figure 2.2: Example of PCA reconstruction [Smith, 2002]




Once the eigenvectors have been found, a feature vector can be formed from them and combined with the original data. Figure 2.2(c) shows the data re-plotted using both eigenvectors; this plot is essentially the original data rotated so that the eigenvectors become the axes. The alternative transformation keeps only the eigenvector with the largest eigenvalue, which in this example leaves a single dimension (Figure 2.2(d)); this reduced data set corresponds exactly to the first coordinate of the data expressed using both eigenvectors. Essentially, PCA transforms the original data so that it is expressed in terms of the patterns between the variables, where the patterns are the lines that most closely describe the relationships in the data. This is useful because each new data point is a combination of contributions from those lines. The original x and y values do not state directly how each point relates to the rest of the data. With the new values obtained using both eigenvectors, the data is expressed relative to the eigenvector axes rather than the usual ones. Using only the eigenvector associated with the largest eigenvalue removes the contribution due to the smaller eigenvector and leaves the data expressed in terms of the dominant one [Smith, 2002].
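To make the dimension-reduction step concrete, the following sketch (same illustrative data and NumPy assumption as above) keeps only the eigenvector with the largest eigenvalue, projects the centred data onto it, and maps the result back to the original axes:

import numpy as np

data = np.array([[2.0, 1.9], [0.5, 0.6], [1.4, 1.6], [3.1, 2.8], [2.3, 2.4]])
centred = data - data.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(centred, rowvar=False))

# feature vector built from the eigenvector with the largest eigenvalue only
top = eigvecs[:, [np.argmax(eigvals)]]      # shape (2, 1)

projected = centred @ top                   # single-dimension data, cf. Figure 2.2(d)
restored = projected @ top.T + data.mean(axis=0)
# `restored` lies on the line of the dominant eigenvector: the contribution
# of the smaller eigenvector has been removed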




2.3.1 Eigenvalue and Eigenvector Problems


As noted in the previous section, eigenvectors and eigenvalues are the key quantities needed to perform PCA. Eigenvalue and eigenvector problems arise frequently in linear transformations, where they enable, for example, a structural engineer to determine the stability of a structure or a numerical analyst to establish the convergence of an iterative algorithm [Heath, 2002].
 
Besides, an eigenvector of a matrix determines a direction in which the effect of the matrix is particularly simple: the matrix expands or shrinks any vector lying in that direction by a scalar multiple, and the expansion or contraction factor is given by the corresponding eigenvalue, $\lambda$. Thus, eigenvalues and eigenvectors provide a means of understanding the complicated behavior of a general transformation by decomposing it into simpler actions.
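A small numerical example (chosen here for illustration, not taken from the cited sources) makes this concrete: for

                   $A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}$ and $x = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$,   $Ax = \begin{pmatrix} 4 \\ 4 \end{pmatrix} = 4x$,

so $x$ is an eigenvector of $A$, and the matrix simply stretches every vector along that direction by the factor $\lambda = 4$.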


Conventionally, an $n \times n$ matrix $A$ is said to have an eigenvector $x$ with corresponding eigenvalue $\lambda$ if


                   $Ax = \lambda x$                             (2.1)
Evidently, from equation (2.1),
                   $(A - \lambda I)x = 0$                        (2.2)


If $x \ne 0$, then $\det(A - \lambda I) = 0$, which, when expanded out, is an $n$th-degree polynomial in $\lambda$ whose roots are the eigenvalues. This determinant is practical to solve directly only when the dimension of the matrix is small [AB Rahman, 2002]. Hence, the proposed work implements Jacobi’s numerical method to solve the problem.
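For the same illustrative $2 \times 2$ matrix used above, the expansion is easy to carry out by hand:

                   $\det(A - \lambda I) = \begin{vmatrix} 3-\lambda & 1 \\ 1 & 3-\lambda \end{vmatrix} = (3-\lambda)^2 - 1 = \lambda^2 - 6\lambda + 8 = 0$,

whose roots $\lambda = 2$ and $\lambda = 4$ are the eigenvalues. For matrices of realistic size, however, expanding and solving this polynomial directly is impractical, which is why an iterative method is used instead.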


Jacobi’s method is an easily understood and reliable [Heath, 2002] algorithm for finding all eigenpairs of a symmetric matrix; it produces uniformly accurate results [Demmel, 1989]. A solution is guaranteed for all real symmetric matrices when this method is used. The restriction to symmetric matrices is not severe, since many practical problems in applied mathematics and engineering involve symmetric matrices. From a theoretical viewpoint, the method embodies techniques that are found in more sophisticated algorithms [Mathews, J.H.].
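Before the general construction, it may help to see the elementary building block on the same illustrative $2 \times 2$ matrix used earlier (a minimal NumPy sketch with made-up values, not code from the cited references): a single plane rotation already diagonalises a $2 \times 2$ symmetric matrix, and the series of transformations in the next subsection simply chains such rotations for larger matrices.

import numpy as np

# one plane rotation applied to a 2x2 symmetric matrix (illustrative values)
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
theta = 0.5 * np.arctan2(2.0 * A[0, 1], A[0, 0] - A[1, 1])
c, s = np.cos(theta), np.sin(theta)
R1 = np.array([[c, -s],
               [s,  c]])
D1 = R1.T @ A @ R1          # diagonal: diag(4, 2) up to rounding error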


2.3.2 The Jacobi Series of Transformations


Start with the real symmetric matrix $A$ and construct a sequence of orthogonal matrices $R_j$ as follows [Mathews, J.H.]:


                  $D_0 = A, \quad D_j = R_j' D_{j-1} R_j \quad \text{for } j = 1, 2, 3, \ldots$         (2.3)


The sequence $R_j$ is constructed so that
           
                    $\lim\limits_{j \to \infty} D_j = D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ (2.4)


In practice, the process stops when all off-diagonal elements are close to zero. Then
               
                    $D_n \approx D$                          (2.5)


The construction produces


                    $D_n = R_n' R_{n-1}' \cdots R_1' A R_1 R_2 \cdots R_{n-1} R_n$      (2.6)


If $R$ is defined by


                     $R = R_1 R_2 \cdots R_{n-1} R_n$                       (2.7)


Then $R^{-1} A R = D_n$, which implies that


                     $AR = R D_n \approx R \, \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$            (2.8)


Let the columns of $R$ be denoted by the vectors $X_1, X_2, \ldots, X_n$. Then $R$ can be expressed as a row vector of column vectors:


                      $R = [X_1 \; X_2 \; \cdots \; X_n]$                          (2.9)


The columns of the product in (2.8) now take on the form


                       $[A X_1 \; A X_2 \; \cdots \; A X_n] \approx [\lambda_1 X_1 \; \lambda_2 X_2 \; \cdots \; \lambda_n X_n]$   (2.10)


From (2.9) and (2.10), the vector $X_j$, which is the $j$th column of $R$, is an eigenvector that corresponds to the eigenvalue $\lambda_j$.
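The series of transformations in equations (2.3) to (2.10) can be sketched directly in NumPy. The function below is a minimal illustrative implementation of the classical Jacobi iteration (the function and variable names belong to this sketch, not to the cited sources): at each step it annihilates the largest off-diagonal element of $D_j$ with a plane rotation $R_j$ and accumulates the product $R$ whose columns approximate the eigenvectors.

import numpy as np

def jacobi_eigen(A, tol=1e-10, max_rotations=1000):
    """Classical Jacobi iteration for a real symmetric matrix A.
    Returns the approximate eigenvalues and the orthogonal matrix R
    whose columns are the corresponding eigenvectors, as in (2.10)."""
    D = np.array(A, dtype=float)        # D_0 = A
    n = D.shape[0]
    R = np.eye(n)                       # accumulates the product of rotations, cf. (2.7)
    for _ in range(max_rotations):
        off = np.abs(D - np.diag(np.diag(D)))
        p, q = np.unravel_index(np.argmax(off), off.shape)
        if off[p, q] < tol:             # all off-diagonal elements close to zero (2.5)
            break
        # plane rotation chosen so that the (p, q) entry of the new D is zero
        theta = 0.5 * np.arctan2(2.0 * D[p, q], D[p, p] - D[q, q])
        c, s = np.cos(theta), np.sin(theta)
        J = np.eye(n)
        J[p, p] = c;  J[q, q] = c
        J[p, q] = -s; J[q, p] = s
        D = J.T @ D @ J                 # D_j = R_j' D_{j-1} R_j, cf. (2.3)
        R = R @ J
    return np.diag(D), R

# illustrative use on a small symmetric matrix
A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])
eigenvalues, R = jacobi_eigen(A)
# each column X_j of R satisfies A @ X_j ≈ eigenvalues[j] * X_j, as equation (2.10) states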
