Deploy Spark On Yarn

Client

docker service create \
	--name spark-client \
	--hostname spark-client \
	--network swarm-net \
	--replicas 1 \
	--detach=true \
	newnius/spark:2.2.1-yarn
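
Before validating, it can help to confirm that the service task is running and to find the container to work in. The following is a minimal sketch, not part of the original setup: the helper name client_container_id is hypothetical, and it assumes the task was scheduled on the node where the commands are run.

```shell
# Print the id of the local container backing the spark-client service, if any.
# Prints nothing when docker is missing or the task runs on another swarm node.
client_container_id() {
  command -v docker >/dev/null 2>&1 || return 0
  docker ps --quiet --filter name=spark-client | head -n 1
}

cid=$(client_container_id)
if [ -n "$cid" ]; then
  # The task's current state should read "Running".
  docker service ps spark-client
  echo "open a shell with: docker exec -it $cid bash"
else
  echo "no spark-client container on this node"
fi
```

If the task landed on a different node, docker service ps spark-client (run on a manager) shows which one.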

Validate installation

spark-submit PI

spark-submit \
	--master yarn \
	--deploy-mode cluster \
	--class org.apache.spark.examples.JavaSparkPi \
	./examples/jars/spark-examples*.jar 100
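
With --deploy-mode cluster the driver runs inside a YARN container, so the "Pi is roughly ..." line is written to the container logs rather than to the terminal. A sketch of one way to fetch it, assuming YARN log aggregation is enabled on the cluster; APP_ID is a placeholder for the application id that spark-submit reports, and extract_pi is a hypothetical helper.

```shell
# Keep only the result line that JavaSparkPi prints on the driver's stdout.
extract_pi() {
  grep -o 'Pi is roughly [0-9.]*'
}

# APP_ID is a placeholder; export the real id reported by spark-submit first.
if command -v yarn >/dev/null 2>&1 && [ -n "${APP_ID:-}" ]; then
  yarn logs -applicationId "$APP_ID" | extract_pi
fi
```

Without log aggregation, the same output can usually be found through the application's container log links in the YARN web UI.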

spark-shell HDFS wordcount

Run spark-shell --master yarn to start an interactive shell on the cluster.

val lines = sc.textFile("hdfs://hadoop-master:8020/user/root/input")

val words = lines.flatMap(_.split("\\s+"))

val wc = words.map(word => (word, 1)).reduceByKey(_ + _)

wc.collect()

val cnt = words.map(word => 1).reduce(_ + _)
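
The wordcount above reads from hdfs://hadoop-master:8020/user/root/input, so that directory needs some text in it. The sketch below, meant to be run from the container's shell before starting spark-shell, creates a small sample file, computes the counts locally for comparison with wc.collect() and cnt, and uploads the file when the hdfs CLI is available. The sample contents and the name sample.txt are assumptions for illustration.

```shell
# Create a small sample input (hypothetical contents).
sample=$(mktemp)
printf 'hello spark\nhello yarn\n' > "$sample"

# Expected per-word counts, for comparison with wc.collect().
# Like split("\\s+"), tr -s treats a run of whitespace as one separator.
expected=$(tr -s '[:space:]' '\n' < "$sample" | sort | uniq -c | awk '{print $2, $1}')
echo "$expected"

# Expected total number of words, for comparison with cnt.
total=$(wc -w < "$sample" | tr -d ' ')
echo "total: $total"

# Upload only when the hdfs CLI is on the PATH (i.e. inside the cluster).
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -mkdir -p /user/root/input
  hdfs dfs -put -f "$sample" /user/root/input/sample.txt
fi
```

If the input directory holds only this file, wc.collect() should contain the same three pairs (in any order) and cnt should equal the local total.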

Browse the web UI

In Spark on YARN mode, Spark jobs are listed as applications in the YARN web UI.