You are given a data set consisting of variables with more than 30 percent missing values. How will you deal with them?

1 Answer

The following are ways to handle missing data values:

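If the data set is large, we can simply remove the rows with missing values. This is the quickest approach, and the remaining complete rows are then used for analysis or prediction. Keep in mind that when a variable has more than 30 percent missing values, this discards a large share of the rows, so it is only practical when enough data remains.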
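Row removal can be sketched with pandas as follows. The DataFrame, its column names, and its values are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, 40.0],
    "income": [50000.0, 62000.0, np.nan, 58000.0],
})

# Fraction of missing values per column, handy for spotting
# variables above a threshold such as 30 percent.
missing_share = df.isna().mean()

# Drop every row that contains at least one missing value.
df_complete = df.dropna()
```

Checking missing_share first shows how many rows the drop will cost before committing to it.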

For smaller data sets, we can substitute missing values with the mean of the remaining values using a pandas DataFrame in Python. One way is to compute the column means with df.mean() and pass the result to fillna, as in df.fillna(df.mean()).
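A minimal sketch of mean imputation, again on a made-up DataFrame (the column names and values are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, 40.0],
    "income": [50000.0, 62000.0, np.nan, 58000.0],
})

# Column-wise means; missing values are skipped by default.
column_means = df.mean()

# Replace each missing value with the mean of its own column.
df_filled = df.fillna(column_means)
```

Passing the Series of means to fillna fills each column with its own mean, so no loop over columns is needed.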

For the given points, how will you calculate the Euclidean distance in Python?

plot1 = [1,3]

plot2 = [2,5]

The Euclidean distance can be calculated as follows:

from math import sqrt

euclidean_distance = sqrt( (plot1[0]-plot2[0])**2 + (plot1[1]-plot2[1])**2 )
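The same calculation can be written as a runnable sketch that works for points of any dimension. The helper name euclidean_distance_nd is hypothetical, chosen only for this example.

```python
import math

plot1 = [1, 3]
plot2 = [2, 5]

def euclidean_distance_nd(p, q):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

distance = euclidean_distance_nd(plot1, plot2)

# On Python 3.8 and later, the standard library offers the same
# computation directly.
distance_builtin = math.dist(plot1, plot2)
```

For these two points the squared differences are 1 and 4, so the distance is the square root of 5.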
