• Home
  • Recent Q&A
  • Java
  • Cloud
  • JavaScript
  • Python
  • SQL
  • PHP
  • HTML
  • C++
  • Data Science
  • DBMS
  • Devops
  • Hadoop
  • Machine Learning
in Data Science by
Q:
You are given a data set consisting of variables with more than 30 percent missing values. How will you deal with them?

1 Answer

0 votes
by
The following are ways to handle missing data values:

If the data set is large, we can just simply remove the rows with missing data values. It is the quickest way; we use the rest of the data to predict the values.

For smaller data sets, we can substitute missing values with the mean or average of the rest of the data using the pandas' data frame in python. There are different ways to do so, such as df.mean(), df.fillna(mean).

10. For the given points, how will you calculate the Euclidean distance in Python?

plot1 = [1,3]

plot2 = [2,5]

The Euclidean distance can be calculated as follows:

euclidean_distance = sqrt( (plot1[0]-plot2[0])**2 + (plot1[1]-plot2[1])**2 )

Related questions

0 votes
asked Dec 22, 2021 in Image Classification by rajeshsharma
0 votes
asked Mar 2, 2021 in LISP by SakshiSharma
...