As we know, Spark runs on a master-slave architecture. Let's walk through the process step by step.
1. The moment we submit a Spark job in cluster mode, the spark-submit utility interacts with the Resource Manager to start the Application Master (a minimal example follows this list).
2. The Spark driver program then runs inside the Application Master container and has no dependency on the client machine; even if we turn off the client machine, the Spark job stays up and running.
3. The Spark driver program further interacts with the Resource Manager to request containers in which to process the data.
4. The Resource Manager then allocates the containers, and the Spark driver program starts executors on all the allocated containers and assigns tasks for them to run (see the second sketch after this list).
5. The executors interact directly with the Spark driver program. Once the tasks finish on each executor, the containers along with the tasks are released, and the output is collected by the Spark driver program.
6. Here the container in which the Application Master runs acts as the master node, and the containers in which the executor processes run their tasks are called the slave nodes.
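To make this concrete, below is a minimal sketch of a word-count driver program and the kind of spark-submit command used to launch it in cluster mode. The class name, JAR path, and input path are placeholders for illustration, not taken from a real project.

// Hypothetical submit command (class name, JAR, and input path are placeholders):
//
//   spark-submit --master yarn --deploy-mode cluster \
//     --class com.example.WordCount \
//     hdfs:///apps/wordcount.jar hdfs:///data/input.txt

package com.example

import org.apache.spark.sql.SparkSession

object WordCount {
  def main(args: Array[String]): Unit = {
    // In cluster mode, this main method runs inside the Application Master
    // container, not on the client machine that invoked spark-submit.
    val spark = SparkSession.builder()
      .appName("WordCount")
      .getOrCreate()

    val counts = spark.sparkContext
      .textFile(args(0))               // input is read in partitions across the executors
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // The action below triggers tasks on the executors; the results flow back
    // to the driver program running in the Application Master container.
    counts.take(10).foreach(println)

    spark.stop()                       // releases the executors and their containers
  }
}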
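How many executor containers the driver requests from the Resource Manager, and how large they are, comes from the job's configuration. Below is a small sketch with assumed, illustrative values; the property keys are standard Spark settings and are more commonly passed as --num-executors, --executor-memory, and --executor-cores on spark-submit.

import org.apache.spark.sql.SparkSession

object ResourceDemo {
  def main(args: Array[String]): Unit = {
    // Placeholder resource settings; adjust to the data volume and cluster capacity.
    val spark = SparkSession.builder()
      .appName("ResourceDemo")
      .config("spark.executor.instances", "4")  // executor containers requested from the Resource Manager
      .config("spark.executor.memory", "2g")    // memory per executor container
      .config("spark.executor.cores", "2")      // CPU cores per executor container
      .getOrCreate()

    spark.stop()
  }
}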