Explain Principal Component Analysis (PCA).

1 Answer

To begin with, this is one of the most important machine learning interview questions.

In the real world, we deal with multi-dimensional data, and both visualization and computation become harder as the number of dimensions grows. In such a scenario, we may have to reduce the number of dimensions so that the data is easier to analyze and visualize. We do this by:

 Removing irrelevant dimensions

 Keeping only the most relevant dimensions

This is where we use Principal Component Analysis (PCA).

The goal of Principal Component Analysis is to find a new set of uncorrelated (orthogonal) dimensions, called principal components, and to rank them by how much of the data's variance each one captures.

The Mechanism of PCA:

Center the data, then compute the covariance matrix of the data objects

Compute the eigenvectors and eigenvalues of the covariance matrix, and sort the eigenvectors by eigenvalue in descending order

To get the new dimensions, select the first N eigenvectors (N ≤ n)

Finally, project the original n-dimensional data objects onto these N dimensions
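
The steps above can be sketched in a few lines of NumPy. This is only a minimal illustration, not a production implementation: the function name `pca`, the parameter `n_components` (the N above), and the random sample data are all assumptions made for this example.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the top n_components principal components."""
    # Center each feature so variance is measured around the mean.
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the features (columns are the variables).
    cov = np.cov(X_centered, rowvar=False)
    # eigh is intended for symmetric matrices; eigenvalues come back ascending.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # Re-order so the direction with the largest variance comes first.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Keep the first n_components eigenvectors as the new axes.
    components = eigenvectors[:, :n_components]
    # Project the centered data onto the new axes.
    return X_centered @ components, eigenvalues, components

# Hypothetical sample data: 200 objects with 3 correlated features.
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.5, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

X_reduced, eigenvalues, components = pca(X, n_components=2)
print(X_reduced.shape)  # (200, 2)
```

In practice, a library routine such as scikit-learn's `PCA` class is usually preferred; the sketch only mirrors the four steps listed here.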

Example: The two graphs below show the same data points (objects) and two directions, one ‘green’ and the other ‘yellow.’ Graph 2 is obtained by rotating Graph 1 so that the x-axis and y-axis represent the ‘green’ and ‘yellow’ directions, respectively.

[Figure: Output from PCA]

After rotating the data points, we can see that the green direction (x-axis) is the one along which the data varies the most; in other words, it gives us the line that best fits the data points.
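
To make the rotation concrete, here is a small sketch on made-up 2-dimensional data (the data and all variable names are assumptions for illustration). It re-expresses the points in the eigenvector axes and checks that the first new axis carries most of the variance and that the new coordinates are uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical 2-D data: y roughly follows x, plus a little noise.
x = rng.normal(size=500)
y = 0.8 * x + 0.2 * rng.normal(size=500)
points = np.column_stack([x, y])
centered = points - points.mean(axis=0)

# Eigen-decomposition of the covariance matrix, largest eigenvalue first.
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(centered, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Change of axes: the new x-axis is the first eigenvector ("green"),
# the new y-axis is the second ("yellow"). Depending on eigenvector signs,
# this is a rotation, possibly combined with a flip of one axis.
rotated = centered @ eigenvectors

variance_along_axes = rotated.var(axis=0, ddof=1)
correlation = np.corrcoef(rotated, rowvar=False)[0, 1]
print(variance_along_axes)  # the first entry is the larger one
print(correlation)          # numerically very close to zero
```

The variances along the new axes equal the eigenvalues, which is why sorting by eigenvalue ranks the directions by importance.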

Here, we are representing 2-dimensional data, but in real life the data would be multi-dimensional and complex. So, after measuring the importance of each direction, we can reduce the number of dimensions by dropping the less significant ‘directions.’
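
One common way to decide how many directions to keep is to look at the share of total variance each one explains and keep just enough to pass a chosen threshold. The sketch below assumes a 95% threshold and uses made-up data; the helper names `explained_variance_ratio` and `components_needed` are hypothetical, not part of any library.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance captured by each principal direction."""
    X_centered = X - X.mean(axis=0)
    # eigvalsh returns ascending eigenvalues; reverse to get descending order.
    eigenvalues = np.linalg.eigvalsh(np.cov(X_centered, rowvar=False))[::-1]
    return eigenvalues / eigenvalues.sum()

def components_needed(X, threshold=0.95):
    """Smallest number of directions whose cumulative share reaches threshold."""
    cumulative = np.cumsum(explained_variance_ratio(X))
    # First index where the cumulative share is >= threshold, as a count.
    count = int(np.searchsorted(cumulative, threshold)) + 1
    # Guard against floating-point round-off when threshold is close to 1.
    return min(count, len(cumulative))

# Hypothetical data: 5 observed features driven by 2 underlying factors.
rng = np.random.default_rng(2)
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(300, 5))

ratio = explained_variance_ratio(X)
k = components_needed(X, threshold=0.95)
print(ratio)
print(k)
```

Because the data here is generated from two factors plus tiny noise, at most two directions are needed to reach the threshold; the remaining three can be dropped with almost no loss of information.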

...