update spark

spark/2.3.1-yarn/README.md

# Deploy Spark On Yarn

## Client

```bash
docker service create \
  --name spark-client \
  --hostname spark-client \
  --network swarm-net \
  --replicas 1 \
  --detach=true \
  newnius/spark:2.3.1-yarn
```
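
To run the commands in the next section you need a shell inside the client container. A minimal sketch, assuming the single replica is scheduled on the node you are logged in to (otherwise run it on the node that hosts the task):

```bash
# Find the container backing the spark-client service and open a shell in it
docker exec -it $(docker ps -q -f name=spark-client) bash
```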

## Validate installation

#### spark-submit PI

```bash
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class org.apache.spark.examples.JavaSparkPi \
  ./examples/jars/spark-examples*.jar 100
```
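
In `cluster` deploy mode the driver runs inside YARN, so the computed value of Pi ends up in the application logs rather than in your terminal. A sketch of how to check it, assuming the Hadoop CLI is on the PATH and log aggregation is enabled; the application id below is a placeholder for the one printed by `spark-submit`:

```bash
# List recently finished applications to find the applicationId
yarn application -list -appStates FINISHED

# Fetch the aggregated logs and look for the driver's output
yarn logs -applicationId application_1234567890123_0001 | grep "Pi is roughly"
```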

#### spark-shell HDFS wordcount
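
The wordcount below reads from `hdfs://hadoop-master:8020/user/root/input`, so that directory should contain at least one text file. A minimal way to seed it, assuming the Hadoop client tools and configuration are available in the container (adjust the namenode address and path to your cluster):

```bash
# Upload a small sample file to the input directory in HDFS
echo "hello spark hello yarn" > /tmp/sample.txt
hdfs dfs -mkdir -p /user/root/input
hdfs dfs -put -f /tmp/sample.txt /user/root/input/
```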

Run `spark-shell --master yarn` to open an interactive shell, then paste the following:

```scala
// Read the input from HDFS and split each line into words
val lines = sc.textFile("hdfs://hadoop-master:8020/user/root/input")
val words = lines.flatMap(_.split("\\s+"))

// Count occurrences per word and fetch the result to the driver
val wc = words.map(word => (word, 1)).reduceByKey(_ + _)
wc.collect()

// Total number of words
val cnt = words.map(word => 1).reduce(_ + _)
```

## Browse the web UI

In Spark on YARN mode there is no standalone Spark master UI; submitted applications show up in the YARN ResourceManager web UI (port 8088 by default), and the Tracking UI link of a running application leads to its Spark web UI.
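
If the web UI ports are not published on the swarm nodes, the UI is still reachable from any container on `swarm-net`. A quick sanity check, assuming the ResourceManager runs in the `hadoop-master` service on its default web port (both are assumptions, matching the address used in the wordcount above):

```bash
# Query the ResourceManager REST API for the list of applications
curl -s http://hadoop-master:8088/ws/v1/cluster/apps | head
```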