You are given a data set consisting of variables with more than 30 percent missing values. How will you deal with them?

1 Answer

Missing values can be handled in the following ways:

If the data set is large enough that plenty of data remains afterwards, we can simply remove the rows that contain missing values. This is the quickest approach; the remaining rows are then used for the analysis or model.

For smaller data sets, we can substitute each missing value with the mean of the remaining values in that column, using a pandas DataFrame in Python. For example, df.mean() computes the column means and df.fillna(df.mean()) fills the missing values with them.
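The two approaches described above can be sketched with pandas as follows. This is a minimal illustration rather than a recommended pipeline: the toy DataFrame and its column names (age, income, city) are made up for the example, and mean imputation is applied only to numeric columns.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, np.nan, 40.0],
    "income": [50.0, 60.0, np.nan, 80.0, 90.0],
    "city": ["A", "B", None, "A", "B"],
})

# Fraction of missing values in each column.
missing_share = df.isna().mean()

# Approach 1: drop every row that has at least one missing value.
dropped = df.dropna()

# Approach 2: fill numeric columns with their own column mean.
numeric_cols = df.select_dtypes(include="number").columns
filled = df.copy()
filled[numeric_cols] = filled[numeric_cols].fillna(filled[numeric_cols].mean())

print(missing_share)
print(dropped)
print(filled)
```

Note that the non-numeric city column is left untouched by the mean imputation, so it would need a separate strategy, such as filling with the most frequent value.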

For the given points, how will you calculate the Euclidean distance in Python?

plot1 = [1,3]

plot2 = [2,5]

The Euclidean distance can be calculated as follows:

from math import sqrt

euclidean_distance = sqrt((plot1[0] - plot2[0])**2 + (plot1[1] - plot2[1])**2)
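As a self-contained sketch of the same calculation, the version below works for points of any dimension and cross-checks the result against math.dist from the standard library (available from Python 3.8). The variable names manual and builtin are chosen only for this example.

```python
import math

plot1 = [1, 3]
plot2 = [2, 5]

# Explicit formula: square root of the sum of squared coordinate differences.
manual = math.sqrt(sum((a - b) ** 2 for a, b in zip(plot1, plot2)))

# Standard-library helper (Python 3.8+) that computes the same quantity.
builtin = math.dist(plot1, plot2)

print(manual, builtin)
```

For these two points the squared differences are 1 and 4, so both values equal the square root of 5.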
