in Data Science by (31.6k points)
How can outlier values be treated?

2 Answers

0 votes
by (23.9k points)

Outliers can be identified with univariate analysis or with a graphical method such as a box plot. If there are only a few outliers, they can be assessed individually. If there are many, they can be capped: values below the 1st percentile are replaced with the 1st percentile value, and values above the 99th percentile are replaced with the 99th percentile value.
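A minimal sketch of the percentile substitution described above, using only the Python standard library. The function name `cap_at_percentiles` and the sample data are made up for illustration, and the "inclusive" percentile definition is just one of several in common use, so exact cut-off values may differ in other tools.

```python
import statistics

def cap_at_percentiles(values, lower_pct=1, upper_pct=99):
    """Replace values outside the given percentiles with the percentile values."""
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    # Values inside [low, high] are left unchanged
    return [min(max(v, low), high) for v in values]

# Hypothetical data: 99 ordinary values and one extreme value
data = list(range(1, 100)) + [10_000]
capped = cap_at_percentiles(data)

print(max(data), max(capped))  # the extreme value is pulled down to the 99th percentile
```

Capping keeps every row in the data set, which can matter when the other columns of those rows are still useful.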

Not all extreme values are outliers. The most common ways to treat outlier values are:

1. Change the value to bring it within a range.

2. Remove the value.
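The two treatments can be compared side by side. The sketch below assumes the 1.5 × IQR rule to define the acceptable range; the answer does not say how the range should be chosen, so this rule, the helper name `iqr_bounds`, and the sample data are illustrative assumptions.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Return (low, high) fences: k * IQR below Q1 and above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical data with one point far from the rest
data = [2, 3, 4, 5, 5, 6, 7, 8, 9, 100]
low, high = iqr_bounds(data)

# Option 1: change the value so it falls within the range
clipped = [min(max(v, low), high) for v in data]

# Option 2: remove values outside the range
removed = [v for v in data if low <= v <= high]

print(clipped)
print(removed)
```

Clipping preserves the number of observations, while removal shrinks the data set but does not invent a value the observation never had.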

0 votes
by (14.6k points)

You can drop an outlier when it is a garbage value.

Example: height of an adult = abc ft. This cannot be true, because a height cannot be a string value. In this case, the value can be removed.

Outliers with extreme values can also be removed. For example, if all the data points are clustered between 0 and 10 but one point lies at 100, that point can be removed.
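A small sketch of dropping garbage values like the one in the height example. The helper `parse_height` and the raw values are hypothetical; the idea is simply to keep whatever parses as a finite number and discard the rest.

```python
import math

def parse_height(raw):
    """Return the height as a float, or None if the raw value is not a usable number."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return None  # e.g. "abc" or a missing value
    # float() also accepts "nan" and "inf", which are not valid heights
    return value if math.isfinite(value) else None

# Hypothetical raw column: heights in feet, as read from a file
raw_heights = ["5.8", "6.1", "abc", "5.4", None]

heights = [h for h in map(parse_height, raw_heights) if h is not None]
print(heights)
```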

If you cannot drop outliers, you can try the following:

Try a different model. Points that a linear model flags as outliers may be fit well by a nonlinear model, so make sure you are choosing an appropriate model.

Try normalizing or transforming the data. A transform such as a logarithm pulls the extreme data points closer to the range of the other values.
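One way to read this advice is as a log transform, which compresses large values much more than small ones. That reading is an assumption: the answer does not name a specific method, and plain min-max scaling would keep the relative gaps between points unchanged. The data and the helper `max_to_median` below are made up for illustration.

```python
import math
import statistics

# Hypothetical data: one point far from the rest
data = [1, 2, 3, 5, 8, 10, 100]

# log1p(v) = log(1 + v); it is defined for v > -1, so zeros are safe
logged = [math.log1p(v) for v in data]

def max_to_median(values):
    """How many times larger the largest value is than the median."""
    return max(values) / statistics.median(values)

print(max_to_median(data))    # ratio on the raw scale
print(max_to_median(logged))  # ratio after the transform (smaller)
```

The transform changes the scale of the variable, so model coefficients and predictions have to be interpreted on the log scale afterwards.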

You can use algorithms that are less affected by outliers; an example would be random forests. 
