# Spark on YARN

## Create a Spark cluster in swarm mode

`--hostname` requires Docker 1.13 or higher.
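
The services below all join an overlay network called `swarm-net`. If it does not exist yet, a minimal sketch for creating it first (the network name is taken from the commands below; adjust to your setup):

```bash
# Create the overlay network that the Spark services attach to
docker network create --driver overlay swarm-net
```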

```bash
docker service create \
--name spark-master \
--hostname spark-master \
--detach true \
--network swarm-net \
--replicas 1 \
--endpoint-mode dnsrr \
newnius/spark:2.2.1
```

```bash
docker service create \
--name spark-slave1 \
--hostname spark-slave1 \
--detach true \
--network swarm-net \
--replicas 1 \
--endpoint-mode dnsrr \
newnius/spark:2.2.1
```

```bash
docker service create \
--name spark-slave2 \
--hostname spark-slave2 \
--detach true \
--network swarm-net \
--replicas 1 \
--endpoint-mode dnsrr \
newnius/spark:2.2.1
```
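
Once the three services are created, you can check that each one has converged to 1/1 replicas with the standard Swarm commands (shown here as a quick sanity check):

```bash
# List the Spark services and their replica counts
docker service ls --filter name=spark

# Show where the master task is scheduled and its current state
docker service ps spark-master
```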

## Init && Test

On the first deployment, format HDFS before using the cluster.
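
The commands in the following subsections are run inside the master container. One way to get a shell there, assuming you are on the node where the master task is running, is:

```bash
# Open a shell in the running spark-master container on this node
docker exec -it $(docker ps -q -f name=spark-master) bash
```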

### Stop HDFS (in master)

```bash
sbin/stop-dfs.sh
```

### Format HDFS (in master)

```bash
bin/hadoop namenode -format
```

### Start HDFS (in master)

```bash
sbin/start-dfs.sh
```
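
After starting HDFS you can confirm that the NameNode and DataNodes came up, for example with the standard Hadoop report (run inside the master container, from the same Hadoop directory as the commands above):

```bash
# Summarise HDFS capacity and list the live DataNodes
bin/hdfs dfsadmin -report
```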

### Run Hello World

```bash
spark-submit \
--master yarn \
--deploy-mode cluster \
--class org.apache.spark.examples.JavaSparkPi \
./examples/jars/spark-examples_2.11-2.2.1.jar 100
```
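
In cluster deploy mode the driver output (including the computed value of Pi) ends up in the YARN application logs rather than in your terminal. A sketch of how to look it up with the standard YARN CLI, assuming log aggregation is enabled and the commands are run from the Hadoop directory like the HDFS commands above:

```bash
# List finished applications to find the application ID
bin/yarn application -list -appStates FINISHED

# Fetch the aggregated logs for one application (substitute your own ID)
bin/yarn logs -applicationId application_1234567890123_0001
```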

### UI

YARN: spark-master:8088

HDFS: spark-master:50070

_These addresses resolve only inside the `swarm-net` overlay network, so a proxy is needed to reach the web UIs from outside, e.g. [newnius/docker-proxy](https://hub.docker.com/r/newnius/docker-proxy/)._
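
One possible way to wire this up is to run the proxy as another service on the same network and publish its port on the swarm. The sketch below is an assumption about how such a proxy could be deployed; the container port (8080 here) and any other options depend on the proxy image you choose, so check its documentation before using it:

```bash
# Hypothetical proxy service on swarm-net; 8080 is an assumed container port
docker service create \
--name proxy \
--network swarm-net \
--publish 8080:8080 \
--replicas 1 \
newnius/docker-proxy
```

With the proxy published, point your browser at the published port (as an HTTP or SOCKS proxy, depending on what the image provides) and open spark-master:8088 or spark-master:50070 through it.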

## Customized config

```bash
docker service create \
--name spark-master \
--hostname spark-master \
--detach true \
--network swarm-net \
--replicas 1 \
--mount type=bind,source=/mnt/data/spark/hdfs/master,target=/tmp/hadoop-root \
--mount type=bind,source=/mnt/data/spark/logs/master,target=/usr/local/hadoop/logs \
--mount type=bind,source=/mnt/data/spark/config/hadoop,target=/mnt/config/hadoop \
--mount type=bind,source=/mnt/data/spark/config/spark,target=/mnt/config/spark \
--endpoint-mode dnsrr \
newnius/spark:2.2.1
```

You don't need to put all the files in these directories; only add the files you want to replace.
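
For example, to override just one Hadoop configuration file on the master's host while leaving everything else at the image defaults (the file name and its content are purely illustrative):

```bash
# Prepare the bind-mounted config directories on the swarm node
mkdir -p /mnt/data/spark/config/hadoop /mnt/data/spark/config/spark

# Drop in only the file to be replaced, e.g. a customized yarn-site.xml
cp my-yarn-site.xml /mnt/data/spark/config/hadoop/yarn-site.xml
```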