# Deploy Spark On Yarn

## Client

```bash
docker service create \
  --name spark-client \
  --hostname spark-client \
  --network swarm-net \
  --replicas 1 \
  --detach true \
  --mount type=bind,source=/etc/localtime,target=/etc/localtime \
  newnius/spark:2.3.1-yarn
```

## Validate installation

#### spark-submit PI

```bash
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class org.apache.spark.examples.JavaSparkPi \
  ./examples/jars/spark-examples*.jar 100
```
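
For reference, the JavaSparkPi example estimates π by Monte Carlo sampling: it draws random points in the unit square and counts how many land inside the quarter circle. A local `awk` sketch of the same idea (the sample size is arbitrary):

```shell
awk 'BEGIN {
  srand(42); n = 200000; inside = 0
  for (i = 0; i < n; i++) {
    x = rand(); y = rand()
    if (x*x + y*y <= 1) inside++   # point falls inside the quarter circle
  }
  print 4 * inside / n             # converges to pi as n grows
}'
```

The Spark job distributes the same sampling loop across the cluster; the reported result should hover around 3.14.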

#### spark-shell HDFS wordcount

Run `spark-shell --master yarn` to start an interactive shell, then paste the following:

```scala
// Read all files under the HDFS input directory
val lines = sc.textFile("hdfs://hadoop-master:8020/user/root/input")

// Split each line into words
val words = lines.flatMap(_.split("\\s+"))

// Count the occurrences of each word
val wc = words.map(word => (word, 1)).reduceByKey(_ + _)
wc.collect()

// Count the total number of words
val cnt = words.map(word => 1).reduce(_ + _)
```
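
The same word count can be sanity-checked locally on a small sample with standard Unix tools; the file name below is just an illustration:

```shell
# Create a tiny sample input
printf 'hello world\nhello spark\n' > sample.txt

# flatMap(split) + map(word, 1) + reduceByKey(_ + _), Unix style:
# one word per line, then count duplicates
tr -s '[:space:]' '\n' < sample.txt | sort | uniq -c

# words.map(_ => 1).reduce(_ + _): total word count
wc -w < sample.txt
```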

## Browse the web UI

In Spark on YARN mode, Spark jobs appear in the YARN web UI (the ResourceManager UI) rather than in a standalone Spark master UI.

## Custom configuration

To persist data or modify the configuration files, refer to the following script.

The `/config/hadoop` path is where the replacement conf files go; you don't have to put all the files there, only the ones you want to change.

```bash
docker service create \
  --name spark-client \
  --hostname spark-client \
  --network swarm-net \
  --replicas 1 \
  --detach true \
  --mount type=bind,source=/etc/localtime,target=/etc/localtime \
  --mount type=bind,source=/data/hadoop/config,target=/config/hadoop \
  newnius/spark:2.3.1-yarn
```
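
For example, to override the default filesystem address, you could place a minimal `core-site.xml` under `/data/hadoop/config` on the host. This is a sketch, not a full configuration; the value must match your own HDFS master, and `hadoop-master:8020` here simply follows the wordcount example above:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Point clients at the HDFS namenode; hostname and port are assumptions -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-master:8020</value>
  </property>
</configuration>
```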