Q:
Why Data Science?

Data is the oil of today's world. With the right tools, technologies, and algorithms, we can convert data into a distinctive business advantage.

Data Science can help you to detect fraud using advanced machine learning algorithms

It helps you to prevent any significant monetary losses

It allows you to build intelligent capabilities into machines

You can perform sentiment analysis to gauge customer brand loyalty

It enables you to make better and faster decisions

Helps you to recommend the right product to the right customer to enhance your business

Evolution of Data Science (figure)

Data Science Components

Statistics:

Statistics is the most critical building block of Data Science. It is the science of collecting and analyzing large quantities of numerical data to extract useful insights.
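As a small illustration of summarizing numerical data, here is a minimal sketch using Python's standard statistics module. The list of daily order counts and the variable names are made up for the example.

```python
import statistics

# Hypothetical sample: number of orders per day over one week.
daily_orders = [120, 135, 128, 410, 131, 125, 129]

mean_orders = statistics.mean(daily_orders)
median_orders = statistics.median(daily_orders)
spread = statistics.stdev(daily_orders)  # sample standard deviation

# A mean far above the median hints at an outlier (the 410 here).
print(f"mean={mean_orders:.1f} median={median_orders} stdev={spread:.1f}")
```

Comparing the mean with the median is a quick way to spot skewed data before any deeper analysis.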

Visualization:

Visualization techniques help you present huge amounts of data as easy-to-understand, digestible visuals.

Machine Learning:

Machine Learning explores the building and study of algorithms that learn from data in order to make predictions about unseen or future data.
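To make "learning to predict unseen data" concrete, the sketch below fits a straight line to a few known points by ordinary least squares and then predicts a value it has not seen. This is only one of many possible learning algorithms, and the spend/sales figures are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: advertising spend (x) versus sales (y).
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(spend, sales)

# Predict sales for a spend level that was not in the training data.
predicted_sales = slope * 5.0 + intercept
print(predicted_sales)
```

The model is "learned" in the sense that the slope and intercept come from the data rather than being set by hand.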

Deep Learning:

Deep Learning is a newer area of machine learning research in which multi-layered neural networks learn the features and representation to use from the data itself, rather than having them specified by hand.

Data Science Process

The Data Science process consists of the following steps:

1. Discovery:

The discovery step involves acquiring data from all the identified internal and external sources that can help you answer the business question.

The data can be:

Logs from web servers

Data gathered from social media

Census datasets

Data streamed from online sources using APIs
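As one example of acquiring data from the sources listed above, here is a minimal sketch that turns web server log lines into structured records. It assumes the logs follow the Common Log Format; the sample lines, the pattern, and the helper name parse_log_lines are illustrative, not part of any particular tool.

```python
import re
from collections import Counter

# Assumed layout: Common Log Format (ip, identity, user, [time], "request", status, size).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

# Hypothetical sample lines, including one that does not match the format.
sample_lines = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /products/42 HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:55:40 +0000] "GET /missing HTTP/1.1" 404 153',
    'this line is malformed and will be skipped',
]

def parse_log_lines(lines):
    """Return one dict per line that matches the assumed log format."""
    records = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            records.append(match.groupdict())
    return records

records = parse_log_lines(sample_lines)
status_counts = Counter(record["status"] for record in records)
print(status_counts)
```

Lines that do not match are silently skipped here; in a real project you would probably want to count or log them instead.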

2. Preparation:

Data can have many inconsistencies, such as missing values, blank columns, and incorrect data formats, all of which need to be cleaned. You need to process, explore, and condition the data before modeling. The cleaner your data, the better your predictions.
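The cleaning described above can be sketched in plain Python. The example below drops a column that is blank in every row, converts text to numbers, and fills missing numeric values with the column mean. Mean imputation is just one assumed strategy, and the rows, column names, and helper functions are hypothetical.

```python
# Hypothetical raw records, as they might come out of a CSV file.
raw_rows = [
    {"customer": "A", "age": "34", "spend": "120.50", "notes": ""},
    {"customer": "B", "age": "", "spend": "80", "notes": ""},
    {"customer": "C", "age": "41", "spend": "n/a", "notes": ""},
]

def to_number(text):
    """Return a float, or None when the text is blank or not numeric."""
    try:
        return float(text)
    except ValueError:
        return None

def clean(rows, numeric_columns):
    # Drop columns that are blank in every row.
    kept = [c for c in rows[0] if any(r[c].strip() for r in rows)]
    cleaned = []
    for row in rows:
        new_row = {c: row[c] for c in kept}
        for c in numeric_columns:
            new_row[c] = to_number(row[c])
        cleaned.append(new_row)
    # Fill missing numeric values with the mean of the known values.
    # Assumes each numeric column has at least one valid value.
    for c in numeric_columns:
        known = [r[c] for r in cleaned if r[c] is not None]
        fill = sum(known) / len(known)
        for r in cleaned:
            if r[c] is None:
                r[c] = fill
    return cleaned

cleaned_rows = clean(raw_rows, ["age", "spend"])
print(cleaned_rows)
```

Whether to fill, drop, or flag missing values depends on the business question, so treat the choice made here as a placeholder.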

3. Model Planning:

In this stage, you determine the methods and techniques to use for drawing out the relationships between input variables. Model planning relies on statistical formulas and visualization tools. SQL Analysis Services, R, and SAS/ACCESS are some of the tools used for this purpose.

4. Model Building:

In this step, the actual model building process starts. Here, the data scientist splits the dataset into training and testing sets. Techniques like association, classification, and clustering are applied to the training set. Once prepared, the model is tested against the testing set.
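A minimal sketch of this train-then-test workflow follows. It uses a nearest-centroid classifier written from scratch as a stand-in for whichever classification technique a project actually uses; the dataset, the 75/25 split, and all function names are assumptions made for illustration.

```python
import random

def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and split it into train and test lists."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def fit_centroids(train):
    """Compute the mean feature vector (centroid) of each label."""
    sums, counts = {}, {}
    for features, label in train:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the features."""
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Hypothetical dataset: two well-separated groups of 2-D points.
samples = ([((x, x + 1.0), "low") for x in range(0, 8)]
           + [((x, x + 1.0), "high") for x in range(20, 28)])

train, test = train_test_split(samples)
centroids = fit_centroids(train)

# Evaluate only on the held-out test set, never on the training data.
correct = sum(predict(centroids, features) == label for features, label in test)
accuracy = correct / len(test)
print(f"test accuracy: {accuracy:.2f}")
```

Keeping the test set out of the fitting step is what makes the reported accuracy a fair estimate of how the model behaves on new data.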

5. Operationalize:

In this stage, you deliver the final baselined model with reports, code, and technical documents. The model is deployed into a real-time production environment after thorough testing.

6. Communicate Results:

In this stage, the key findings are communicated to all stakeholders. This helps you decide whether the project is a success or a failure, based on the results from the model.

Data Science Job Roles

The most prominent data science job titles are:

Data Scientist

Data Engineer

Data Analyst

Statistician

Data Architect

Data Administrator

Business Analyst

Data/Analytics Manager

Here is what these roles entail in more detail:

Data Scientist:

Role:

A Data Scientist is a professional who manages enormous amounts of data and derives compelling business insights from it using various tools, techniques, methodologies, and algorithms.

Languages:

R, SAS, Python, SQL, Hive, Matlab, Pig, Spark

Data Engineer:

Role:

A data engineer works with large amounts of data. He or she develops, constructs, tests, and maintains architectures such as large-scale processing systems and databases.

Languages:

SQL, Hive, R, SAS, Matlab, Python, Java, Ruby, C++, and Perl

Data Analyst:

Role:

A data analyst is responsible for mining vast amounts of data. He or she looks for relationships, patterns, and trends in the data, and then delivers compelling reports and visualizations that help the business make the most viable decisions.

Languages:

R, Python, HTML, JS, C, C++, SQL

Statistician:

Role:

The statistician collects, analyzes, and interprets qualitative and quantitative data using statistical theories and methods.

Languages:

SQL, R, Matlab, Tableau, Python, Perl, Spark, and Hive

Data Administrator:

Role:

The data administrator ensures that the database is accessible to all relevant users. He or she also makes sure that it is performing correctly and is kept safe from hacking.

Languages:

Ruby on Rails, SQL, Java, C#, and Python

Business Analyst:

Role:

This professional works to improve business processes. He or she acts as an intermediary between the business executive team and the IT department.

Languages:

SQL, Tableau, Power BI, and Python