Job Submission and Monitoring overview in MapReduce
Jan 8, 2020 in Big Data | Hadoop
Q: Job Submission and Monitoring overview in MapReduce
1 Answer
Job is the primary interface by which a user job interacts with the ResourceManager.
Job provides facilities to submit jobs, track their progress, access component-tasks’ reports and logs, get the MapReduce cluster’s status information and so on.
The job submission process involves:
Checking the input and output specifications of the job.
Computing the InputSplit values for the job.
Setting up the requisite accounting information for the DistributedCache of the job, if necessary.
Copying the job’s jar and configuration to the MapReduce system directory on the FileSystem.
Submitting the job to the ResourceManager and optionally monitoring its status.
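The steps above can be sketched as a minimal driver class. This assumes the Hadoop MapReduce client libraries (org.apache.hadoop.mapreduce) are on the classpath; MyMapper, MyReducer, and the input/output paths are hypothetical placeholders, not part of the original answer:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SubmitExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "submit-example");
        job.setJarByClass(SubmitExample.class);   // job jar is copied to the MapReduce system directory
        job.setMapperClass(MyMapper.class);       // hypothetical mapper class
        job.setReducerClass(MyReducer.class);     // hypothetical reducer class
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Input spec is checked and InputSplit values are computed from this path
        FileInputFormat.addInputPath(job, new Path("/user/example/input"));
        // Output spec is checked: this directory must not already exist
        FileOutputFormat.setOutputPath(job, new Path("/user/example/output"));
        // Submits the job to the ResourceManager and monitors its status
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```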
Job history files are also logged to the user-specified directories mapreduce.jobhistory.intermediate-done-dir and mapreduce.jobhistory.done-dir, which default to the job output directory.
Users can view a summary of the history logs in the specified directory with the following command:
$ mapred job -history output.jhist
This command prints job details, along with details of failed and killed tips. More details about the job, such as successful tasks and the task attempts made for each task, can be viewed with:
$ mapred job -history all output.jhist
Normally the user uses Job to create the application, describe various facets of the job, submit the job, and monitor its progress.
Job Control
Users may need to chain MapReduce jobs to accomplish complex tasks which cannot be done via a single MapReduce job. This is fairly easy since the output of a job typically goes to the distributed file-system, and that output, in turn, can be used as the input for the next job.
However, this also means that the onus on ensuring jobs are complete (success/failure) lies squarely on the clients. In such cases, the various job-control options are:
Job.submit() : Submit the job to the cluster and return immediately.
Job.waitForCompletion(boolean) : Submit the job to the cluster and wait for it to finish.
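The chaining pattern described above can be sketched as follows. This is a hedged sketch under the same classpath assumption as before: mapper/reducer setup is omitted, and the intermediate path name is hypothetical. The first job blocks via waitForCompletion, its output directory feeds the second job, and the second job is submitted asynchronously:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path intermediate = new Path("/user/example/stage1-out"); // hypothetical path

        Job first = Job.getInstance(conf, "stage-1");
        FileOutputFormat.setOutputPath(first, intermediate);
        // Blocking submission: the client is responsible for checking success/failure
        if (!first.waitForCompletion(true)) {
            System.exit(1); // abort the chain if stage 1 fails
        }

        Job second = Job.getInstance(conf, "stage-2");
        FileInputFormat.addInputPath(second, intermediate); // stage 1 output is stage 2 input
        second.submit(); // non-blocking: returns immediately; poll second.isComplete() to monitor
    }
}
```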