We recently deployed DolphinScheduler on Docker and Kubernetes by following the official guide.
We ran into several problems that kept the service from working, and this wiki records them together with the fixes.
The latest version at the time of writing is 2.0.1. Versions 2.0.0 and 2.0.1 differ in a few places, which are described in detail below.
1. Docker
The official guide is here.
We follow the second way in the guide: start via specifying the existing PostgreSQL and ZooKeeper services.
We have already deployed PostgreSQL and ZooKeeper. Because of licensing issues with the MySQL driver, the default driver is PostgreSQL. So we first start Docker with the default image, and build a new MySQL image afterwards. The steps below need the following two services; replace xx.xx.xx.xx with your own IP.
- PostgreSQL xx.xx.xx.xx:5432 user: user password: password
- ZooKeeper xx.xx.xx.xx:2181
Step 1:
docker pull dolphinscheduler.docker.scarf.sh/apache/dolphinscheduler:2.0.1
Step 2:
docker run -d --name dolphinscheduler \
-e DATABASE_HOST="xx.xx.xx.xx" -e DATABASE_PORT="5432" -e DATABASE_DATABASE="dolphinscheduler" \
-e DATABASE_USERNAME="user" -e DATABASE_PASSWORD="password" \
-e ZOOKEEPER_QUORUM="xx.xx.xx.xx:2181" \
-e REGISTRY_SERVERS="xx.xx.xx.xx:2181" \
-p 12345:12345 \
apache/dolphinscheduler:2.0.1 all
The Docker container is now running.
Attention, please!
The docker run command must include this line:
-e REGISTRY_SERVERS="xx.xx.xx.xx:2181" \
ZOOKEEPER_QUORUM is only used to test the ZooKeeper connection, while REGISTRY_SERVERS is what actually sets the ZooKeeper address the services use. This looks like a bug: the registry configuration was probably renamed from zookeeper to registry, but some of the configuration that depends on it was not updated.
Without this line, /opt/dolphinscheduler/conf/registry.properties inside the container still points to 127.0.0.1:2181. Version 2.0.0 has the same bug.
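A quick way to confirm the override took effect is to read the registry address back from the container. The sketch below is only an illustration: the helper names registry_servers and registry_is_external are made up here, and it assumes the property key is spelled registry.servers, which is how we saw it in the 2.0.x default file.

```shell
#!/bin/sh
# Print the value of registry.servers from a properties stream on stdin.
# Assumption: the key is named "registry.servers".
registry_servers() {
    sed -n 's/^[[:space:]]*registry\.servers[[:space:]]*=[[:space:]]*//p' | head -n 1
}

# Return 0 when the registry address is set and is not the 127.0.0.1 default.
registry_is_external() {
    servers=$(registry_servers)
    [ -n "$servers" ] && [ "${servers#127.0.0.1}" = "$servers" ]
}

# Usage against the running container (container name taken from Step 2):
#   docker exec dolphinscheduler cat /opt/dolphinscheduler/conf/registry.properties \
#       | registry_is_external && echo "registry override applied" \
#       || echo "still using the default registry"
```

If the check reports the default registry, the container was most likely started without REGISTRY_SERVERS.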
Version 2.0.0 has another bug: the image build omits datasource.properties.tpl and some other .tpl files, so the datasource configuration in the container cannot be changed and always keeps the default values.
2. Kubernetes
The official guide is here.
The Kubernetes deployment is based on the Docker image, with one extra step. The configuration is stored in values.yaml, and at helm install time _helpers.tpl injects it into the YAML files under templates, which are then used to create the Kubernetes pods, ConfigMaps, PVCs, and so on. The pods that start are all Docker containers, so they behave as described in the Docker section.
We have already created a namespace named dolphinscheduler.
Step 1:
wget https://dlcdn.apache.org/dolphinscheduler/2.0.1/apache-dolphinscheduler-2.0.1-src.tar.gz
Step 2:
tar -zxvf apache-dolphinscheduler-2.0.1-src.tar.gz
Step 3:
cd apache-dolphinscheduler-2.0.1-src/docker/kubernetes/dolphinscheduler
Step 4:
helm repo add bitnami https://charts.bitnami.com/bitnami
Step 5:
helm dependency update .
Step 6:
helm install dolphinscheduler . -n dolphinscheduler
Following these steps produces an error:
Error: INSTALLATION FAILED: template: dolphinscheduler/templates/statefulset-dolphinscheduler-worker.yaml:72:16: executing "dolphinscheduler/templates/statefulset-dolphinscheduler-worker.yaml" at <include "dolphinscheduler.zookeeper.env_vars" .>: error calling include: template: no template "dolphinscheduler.zookeeper.env_vars" associated with template "gotpl"
Both 2.0.0 and 2.0.1 have this bug. It will be fixed in 2.0.2.
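The error means that a template includes a named template that is not defined anywhere in the chart. As a rough pre-check before installing, one can look for the definition in _helpers.tpl. The helper name has_named_template below is hypothetical, and the path templates/_helpers.tpl assumes the usual Helm chart layout.

```shell
#!/bin/sh
# Return 0 when FILE contains a {{ define "NAME" }} block for the given name.
# grep -F treats the dots in the template name literally.
has_named_template() {
    name=$1
    file=$2
    grep -qF "define \"$name\"" "$file"
}

# Usage from the chart directory (Step 3):
#   has_named_template dolphinscheduler.zookeeper.env_vars templates/_helpers.tpl \
#       || echo "dolphinscheduler.zookeeper.env_vars is not defined"
#
# Rendering the chart locally surfaces the same error without touching the cluster:
#   helm template dolphinscheduler . -n dolphinscheduler > /dev/null
```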
The Kubernetes deployment has some other bugs as well. The solutions are below:
1. Change values.yaml
The PostgreSQL (with username root, password root and database dolphinscheduler) and ZooKeeper services start by default, so first of all we need to change the datasource and ZooKeeper configuration. The configuration file is values.yaml.
Modify postgresql enabled to false in values.yaml.
Modify externalDatabase (especially modify host, username and password) in values.yaml:
externalDatabase:
  type: "postgresql"
  driver: "org.postgresql.Driver"
  host: ""
  port: ""
  username: ""
  password: ""
  database: "dolphinscheduler"
  params: "characterEncoding=utf8"
Modify zookeeper enabled to false in values.yaml.
Modify zookeeper in values.yaml:
externalZookeeper:
  zookeeperQuorum: "xx.xx.xx.xx:2181"
  zookeeperRoot: "/dolphinscheduler"
Add registry config in values.yaml:
externalRegistry:
  registryPluginName: "zookeeper"
  registryServers: "xx.xx.xx.xx:2181"
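Instead of editing the chart's values.yaml in place, the same settings can be kept in a separate override file and passed to helm with -f, which merges it over the chart defaults. This is only a sketch: the function name write_override_values and the file name my-values.yaml are made up, and the user/password/5432 values are the placeholders from the Docker section, not real credentials.

```shell
#!/bin/sh
# Write an override values file with the settings described above.
# $1 = output file, $2 = database host, $3 = ZooKeeper address (host:port)
write_override_values() {
    out=$1
    db_host=$2
    registry=$3
    cat > "$out" <<EOF
postgresql:
  enabled: false
zookeeper:
  enabled: false
externalDatabase:
  host: "$db_host"
  port: "5432"
  username: "user"
  password: "password"
externalZookeeper:
  zookeeperQuorum: "$registry"
externalRegistry:
  registryPluginName: "zookeeper"
  registryServers: "$registry"
EOF
}

# Usage from the chart directory:
#   write_override_values my-values.yaml xx.xx.xx.xx xx.xx.xx.xx:2181
#   helm install dolphinscheduler . -n dolphinscheduler -f my-values.yaml
```

Keys that the override file does not mention, such as externalDatabase.driver, keep the values from the chart's own values.yaml.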
2. Add dolphinscheduler.zookeeper.env_vars to _helpers.tpl
Add zookeeper.env_vars config in _helpers.tpl:
{{- define "dolphinscheduler.zookeeper.env_vars" -}}
- name: REGISTRY_PLUGIN_NAME
  {{- if .Values.zookeeper.enabled }}
  value: "zookeeper"
  {{- else }}
  value: {{ .Values.externalRegistry.registryPluginName }}
  {{- end }}
- name: REGISTRY_SERVERS
  {{- if .Values.zookeeper.enabled }}
  value: {{ template "dolphinscheduler.zookeeper.quorum" . }}
  {{- else }}
  value: {{ .Values.externalRegistry.registryServers }}
  {{- end }}
{{- end -}}
3. Change zookeeper.env_vars to registry.env_vars
Follow this commit to change the Kubernetes YAML files. It has the same effect as step 2, so doing either step 2 or step 3 is enough.
After that, the Kubernetes pods run normally.
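To check that the pods really are running, the output of kubectl get pods can be filtered for anything whose status is not Running. A minimal sketch, assuming the default column layout (NAME, READY, STATUS, RESTARTS, AGE); the helper name pods_not_running is hypothetical.

```shell
#!/bin/sh
# Read `kubectl get pods --no-headers` output on stdin and print the names
# of pods whose STATUS column (third field) is not "Running".
pods_not_running() {
    awk '$3 != "Running" { print $1 }'
}

# Usage:
#   kubectl get pods -n dolphinscheduler --no-headers | pods_not_running
```

An empty result means every pod in the namespace reports Running. Pods that finished normally with status Completed are also listed, so read the result with that in mind.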
To build the MySQL driver image, just follow the official guide.