Tuesday, April 28, 2020

Apache Spark Architecture and Processing in Brief

As we know, Spark runs on a master-slave architecture.
Let's walk through the process step by step:
1. The moment we submit a Spark job in cluster mode, the spark-submit utility interacts with the Resource Manager to start the Application Master (a sample spark-submit command is shown after this list).
2. The Spark driver program runs inside the Application Master container and has no dependency on the client machine; even if we turn off the client machine, the Spark job stays up and running.
3. The Spark driver program then interacts with the Resource Manager to request containers to process the data.
4. The Resource Manager allocates the containers, and the Spark driver program starts executors on all the allocated containers and assigns tasks for them to run (see the driver sketch after this list).
5. Executors interact directly with the Spark driver program; once the tasks finish on each executor, the containers are released along with the tasks, and the output is collected by the Spark driver program.
6. Here the container where the Application Master runs acts as the master node, and the containers where the executor processes run the tasks are called the slave nodes.
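To make step 1 concrete, here is roughly what a cluster-mode submission on YARN looks like. The class name, jar path, and resource numbers below are placeholders for illustration; the flags themselves are standard spark-submit options:

    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --class com.example.WordCount \
      --num-executors 4 \
      --executor-cores 2 \
      --executor-memory 4g \
      /path/to/wordcount.jar /input/path /output/path

Because --deploy-mode is cluster, the driver runs inside the Application Master container on the cluster (step 2), so the client machine can disconnect once the job has been accepted.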
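To see steps 3 to 5 in code, here is a minimal word-count driver sketch in Scala. It is an illustrative example, not from any particular application; the comments map each part onto the steps above:

    import org.apache.spark.sql.SparkSession

    object WordCount {
      def main(args: Array[String]): Unit = {
        // Runs in the driver, i.e. inside the Application Master container (step 2)
        val spark = SparkSession.builder.appName("WordCount").getOrCreate()
        val sc = spark.sparkContext

        // Transformations are lazy: the driver only builds the execution plan here
        val counts = sc.textFile(args(0))
          .flatMap(line => line.split("\\s+"))
          .map(word => (word, 1))
          .reduceByKey(_ + _)

        // The action triggers execution: the driver requests containers from the
        // Resource Manager, launches executors, and assigns them tasks (steps 3-5)
        counts.saveAsTextFile(args(1))

        // Executors finish and the containers are released (step 5)
        spark.stop()
      }
    }

Note that the heavy lifting (reading splits, mapping, reducing) happens on the executors in the slave containers, while the driver only coordinates the tasks and collects the final result.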
