# Hadoop 2.7.4 based on Alpine
## Create a Hadoop cluster in swarm mode
`--hostname` requires Docker 1.13 or higher.
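The service commands in this section attach to a network named `swarm-net`, which has to exist before the services are created. A minimal sketch of creating it only if it is missing is shown here; the `ensure_network` helper and the `DOCKER` override variable are hypothetical names introduced for illustration, and the plain overlay driver with no extra options is an assumption about how the network was set up.

```bash
# Sketch only: DOCKER and ensure_network are illustrative names, not part of the image.
DOCKER="${DOCKER:-docker}"

# Create an overlay network with the given name unless it already exists.
ensure_network() {
  name="$1"
  if ! "$DOCKER" network inspect "$name" >/dev/null 2>&1; then
    "$DOCKER" network create --driver overlay "$name"
  fi
}
```

Run `ensure_network swarm-net` on a swarm manager node before creating the services.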
```bash
docker service create \
--name hadoop-master \
--hostname hadoop-master \
--network swarm-net \
--replicas 1 \
--endpoint-mode dnsrr \
newnius/hadoop
```
```bash
docker service create \
--name hadoop-slave1 \
--hostname hadoop-slave1 \
--network swarm-net \
--replicas 1 \
--endpoint-mode dnsrr \
newnius/hadoop
```
```bash
docker service create \
--name hadoop-slave2 \
--hostname hadoop-slave2 \
--network swarm-net \
--replicas 1 \
--endpoint-mode dnsrr \
newnius/hadoop
```
```bash
docker service create \
--name hadoop-slave3 \
--hostname hadoop-slave3 \
--network swarm-net \
--replicas 1 \
--endpoint-mode dnsrr \
newnius/hadoop
```
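The three slave services differ only in their index, so they can also be created in a loop. This is a sketch under that assumption, reusing exactly the flags from the commands above; `create_slaves` and the `DOCKER` override variable are hypothetical names added for illustration.

```bash
# Sketch only: DOCKER and create_slaves are illustrative names.
# Set DOCKER=echo to preview the commands without running them.
DOCKER="${DOCKER:-docker}"

# Create hadoop-slave1 .. hadoop-slaveN, one single-replica service each.
create_slaves() {
  count="$1"
  for i in $(seq 1 "$count"); do
    "$DOCKER" service create \
      --name "hadoop-slave$i" \
      --hostname "hadoop-slave$i" \
      --network swarm-net \
      --replicas 1 \
      --endpoint-mode dnsrr \
      newnius/hadoop
  done
}
```

`DOCKER=echo create_slaves 3` prints the three commands, and `create_slaves 3` runs them. Each slave stays a separate service with one replica so that it keeps a fixed, predictable hostname.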
## Init && Test
On the first deployment, format HDFS before using it: stop the cluster, format the NameNode, then start the cluster again.
### Stop the cluster (on master)
```bash
sbin/stop-dfs.sh
```
### Format HDFS (on master)
```bash
# `hadoop namenode` is deprecated in Hadoop 2.x; `hdfs namenode` is the current form
bin/hdfs namenode -format
```
### Start the cluster (on master)
```bash
sbin/start-dfs.sh
```
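The stop, format, and start steps can be chained so that a failure in one step prevents the next from running. This is a sketch, not part of the image: `init_hdfs` is a hypothetical helper, and it assumes `HADOOP_HOME` points at the Hadoop install directory (defaulting to the current directory, as the relative paths above imply). Formatting erases existing HDFS metadata, so it is meant for the first deployment only.

```bash
# Sketch only: init_hdfs is an illustrative helper.
# Assumes HADOOP_HOME is the Hadoop install directory inside the master container.
HADOOP_HOME="${HADOOP_HOME:-.}"

# Stop HDFS, format the NameNode, then start HDFS again.
# Each step runs only if the previous one succeeded.
init_hdfs() {
  "$HADOOP_HOME/sbin/stop-dfs.sh" &&
  "$HADOOP_HOME/bin/hdfs" namenode -format &&
  "$HADOOP_HOME/sbin/start-dfs.sh"
}
```

Run `init_hdfs` once inside the master container.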
### Run a test job
```bash
# prepare input data
bin/hdfs dfs -mkdir -p /user/root/input
bin/hdfs dfs -put etc/hadoop/* /user/root/input
```
```bash
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.4.jar grep input output 'dfs[a-z.]+'
```
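The job writes its results to the relative HDFS path `output`, which resolves against the current user's HDFS home directory (`/user/root` when running as root, matching the input path above). A minimal sketch for reading the results and for clearing them before a rerun follows; the helper names and the `HDFS` override variable are hypothetical, and the job is expected to refuse to start if `output` already exists.

```bash
# Sketch only: HDFS, show_grep_results and reset_output are illustrative names.
HDFS="${HDFS:-bin/hdfs}"

# Print the matches found by the grep example job.
# The glob is quoted so that HDFS, not the local shell, expands it.
show_grep_results() {
  "$HDFS" dfs -cat 'output/*'
}

# Remove the previous output directory so the job can be run again.
reset_output() {
  "$HDFS" dfs -rm -r -f output
}
```

Call `show_grep_results` after the job finishes, and `reset_output` before running it a second time.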
### Monitor the cluster in a browser
- YARN: hadoop-master:8088
- HDFS: hadoop-master:50070

_Proxy needed: newnius/docker-proxy_