Greetings, and welcome to the first edition of What the, Kubernetes!
Today's topics: CVE-2017-1002101, init-containers and YOU!
The context
Upgrading a cluster instance group from v1.7.13 to v1.7.14 introduced me to the first-run attempt at solving the problem outlined in the CVE.
The solution to the vulnerability (for the most part affecting untrusted, multi-tenant clusters) involved forcing all configMap and secret bind-mounts to read-only.
The tool we'll use in this post is Helm 2.7.2.
The Problem
A CI component that had been running successfully on v1.7.13 began crash-looping after the upgrade to v1.7.14:
$ kubectl get pod -l app=docker-ci
NAME READY STATUS RESTARTS AGE
docker-ci-4050235671-n487p 0/1 CrashLoopBackOff 4 1m
$ kubectl logs docker-ci-4050235671-n487p
[...]
time="2018-03-26T04:57:21Z" level=info msg="containerd successfully booted in 0.016248s" module=containerd
Error starting daemon: Error saving key file: open /etc/docker/.tmp-key.json853378281: read-only file system
Whether or not the problem presents depends on the workload: it affects docker and minio but not gitlab-ci-runner. The explanation is simple: it depends on the programmer who wrote the code. chdir(2): it's a syscall, not a law.
Contrary to kube issue #58720, most temp file writes to one of these mounts are performed by programs during initialization and are gone before initialization is complete.
Regardless, since I've written exactly zero lines of Kubernetes code, I will leave public declaration of opinion on the quality of this fix to others in favor of presenting a solution that can help mitigate the results of this change.
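If you want to see the new behavior for yourself, exec into any pod that mounts a configMap or secret and try to write to the mount; the pod name and path below are illustrative:

$ kubectl exec -it <pod> -- touch /etc/docker/scratch
touch: /etc/docker/scratch: Read-only file system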
The Solution
- initContainer
  - Mount a shared emptyDir volume on /etc/docker
  - Mount the configMap/secret to an alternate directory (/etc/docker_)
  - Copy the contents to the expected/configured location
  - exit
- runtime container
  - Mount the shared emptyDir volume on /etc/docker
  - Initialize dockerd normally

(The same pattern, expressed as a raw pod spec, is sketched just below.)
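A minimal sketch of that pod spec, with the volume names, image, and copy command borrowed from the chart developed later in this post (the pod name is hypothetical, and the annotation-vs-spec wrinkle covered below is ignored for the moment):

apiVersion: v1
kind: Pod
metadata:
  name: docker-init-demo
spec:
  volumes:
    - name: docker-config              # the configMap kube now mounts read-only
      configMap:
        name: docker-config
    - name: docker-config-directory    # writable scratch shared by both containers
      emptyDir: {}
  initContainers:
    - name: config
      image: docker:18-dind
      command: ["/bin/sh", "-c"]
      args: ["cp /etc/docker_/config.json /etc/docker/"]
      volumeMounts:
        - name: docker-config
          mountPath: /etc/docker_
        - name: docker-config-directory
          mountPath: /etc/docker
  containers:
    - name: docker
      image: docker:18-dind
      command: ["/usr/local/bin/dockerd"]
      args: ["--config-file=/etc/docker/config.json"]
      securityContext:
        privileged: true
      volumeMounts:
        - name: docker-config-directory
          mountPath: /etc/docker

The only coupling between the two containers is the docker-config-directory emptyDir; anything dockerd scribbles during startup lands there instead of on the read-only configMap mount.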
One functional loss is incurred with this method: the files in the target directory will not be magically updated. A mounted configMap or secret will eventually reflect changes made to the source configMap/secret. Replicating such functionality could be done with a side-car container (a runtime container, rather than an init-container) that would monitor the API event bus for changes to the configMap of interest, copying new data to the shared volume when necessary.
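A sketch of that side-car idea, simplified: rather than watching the API directly, the container below polls the read-only mount (which kubelet does keep up to date) and re-copies on change. This is an assumption on my part, not part of the chart presented in this post; it would sit alongside the main container under containers:.

- name: config-sync
  image: docker:18-dind                # any image with a shell, cmp, and cp will do
  command: ["/bin/sh", "-c"]
  args:
    - |
      # refresh the writable copy whenever kubelet updates the configMap mount
      while true; do
        if ! cmp -s /etc/docker_/config.json /etc/docker/config.json; then
          cp /etc/docker_/config.json /etc/docker/
        fi
        sleep 30
      done
  volumeMounts:
    - name: docker-config
      mountPath: /etc/docker_
    - name: docker-config-directory
      mountPath: /etc/docker

(Whether the running daemon notices the new file is a separate question.)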
The Hiccup
Init-containers aren't new. They have, however, had a rough start. When deploying to a v1.5-v1.7 cluster, it is necessary to use the pod.beta.kubernetes.io/init-containers annotation to avoid issue #45627. Post-v1.8, it all becomes just another part of .spec.template.spec.initContainers[].
What the...
NOTE: if you are uninterested in the why, this subsection can be skipped.
I'm not sure when init-containers entered the codebase. The feature graduated to beta status in kube v1.5 and ostensibly to GA status in v1.6.
- beta feature annotation: .spec.template.metadata.annotations["pod.beta.kubernetes.io/init-containers"]
- GA spec path: .spec.template.spec.initContainers[]
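Concretely, the same single init-container expressed in both forms, trimmed to the fields that matter; both snippets live under the Deployment's .spec.template, and the image/command are taken from the chart below:

# beta annotation form
metadata:
  annotations:
    pod.beta.kubernetes.io/init-containers: |
      [{"name": "config",
        "image": "docker:18-dind",
        "command": ["/bin/sh", "-c"],
        "args": ["cp /etc/docker_/config.json /etc/docker/"]}]

# GA spec form
spec:
  initContainers:
    - name: config
      image: docker:18-dind
      command: ["/bin/sh", "-c"]
      args: ["cp /etc/docker_/config.json /etc/docker/"]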
The theory goes as follows:
- v1.5: use the beta annotation
- v1.6 through v1.7: GA/beta deprecation phase; either the annotation or the spec form is valid
- v1.8: full GA; annotation is removed.
The full story: issue #45627
The reality of how init-containers are processed on v1.6-v1.7:
- Initial Deployment Received
  - Does the deployment define .spec.template.spec.initContainers[]?
    - No
      - Does the deployment define the beta init-containers annotation?
        - Yes
          - ingest the annotation JSON
          - sync the ingested data to ...spec.initContainers[]
        - No (well, then)
    - Yes
      - Does the deployment define the beta init-containers annotation?
        - Yes
          - Are the ...spec.initContainers[] sync'd with the beta annotation?
            - Yes
              - Capital! Nothing to see here. Carry on!
            - No
              - Hrmph! We know what's really making the wheels turn here!
              - ...spec.initContainers[] dropped into /dev/null
              - re-synchronize ...spec.initContainers[] to reflect the beta annotation
Summary: from kube v1.6 through v1.7 the spec definition never has primacy with the scheduler except on the initial Deployment. When such a manifest is received, kube converts the ...spec.initContainers[] structure into a JSON string and stores it as a beta annotation value. On subsequent updates, modifications to ...spec.initContainers[] not only have no effect, they are overwritten with the existing (deserialized) annotation structure. The only way around this situation is to use only the annotation form on Deployment updates. The API will entice you toward ...spec.initContainers[] by deserializing your annotation value into its GA spec location. Be strong! Until v1.8, define init-containers as if you were still on v1.5--pretend ...spec.initContainers[] doesn't exist until then!
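A quick way to see which form your cluster is actually honouring is to dump the live object and look for the annotation (the deployment name here is illustrative):

$ kubectl get deployment docker-demo -o yaml | grep -A2 "pod.beta.kubernetes.io/init-containers"

If the annotation is present and your edits to .spec.template.spec.initContainers[] keep evaporating, you are living the scenario described above.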
The Helm Chart
So we need a method that will allow for gradual cut-over to v1.8 without having to manage separate charts.
The application used for this demonstration is Docker. dockerd is one of those binaries that uses its config directory for pre-initialization scratch space.
Configuration
The fix for the read-only configuration path can be seen towards the bottom of ./values.yaml (the Volumes and VolumeMounts blocks), with the InitCommands key that drives our init-container named templates sitting just above Env.
NOTE: The following chart files have been pruned for this post (the unpruned versions are not reproduced here). All indentation is at 2-space increments. Look for any lines in the chart containing the term indent and adjust them if you adapt this to a different indentation interval.
values.yaml
2
3 Image: docker
4 ImageTag: &itag "18-dind"
5
6 deploymentEnvironment: &env demo
7
8 Plug: docker
9
10 NodeSelectors: []
11
12 InitCommands:
13 -
14 name: config
15 command: cp /etc/docker_/config.json /etc/docker/
16
17
18 Env:
19 -
20 name: DOCKER_HOST
21 value: localhost:49152
22 -
23 name: IMAGE_TAG
24 value: *itag
52
53 # Volumes
54 Volumes:
55 -
56 name: docker-config
57 configMap:
58 name: docker-config
59 items:
60 -
61 key: config
62 path: config.json
63 mode: 0600
64 -
65 name: docker-config-directory
66 emptyDir: {}
67
68
69
70 VolumeMounts:
71 -
72 name: docker-config
73 mountPath: /etc/docker_
74 -
75 name: docker-config-directory
76 mountPath: /etc/docker
Q: Dear Stephen: Why are your init-container commands listed in values.yaml?
A: I am glad you asked! As template markup gets thicker, readability decreases. Having critical aspects of a deployment hidden within a tangle of unrelated symbols and formatting has the danger of obscuring what the target workload is. I've been meaning to work out the gotpl incantations to make this happen and this series seemed to be a perfect reason to do it!
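It also means that adding another init step later is a values-only change: each entry in InitCommands becomes its own init-container. For example (the second entry here is hypothetical, not part of the demo chart):

InitCommands:
  -
    name: config
    command: cp /etc/docker_/config.json /etc/docker/
  -
    name: perms
    command: chmod 0700 /etc/docker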
./settings/one
1
2 {
3 "log-driver": "gcplogs",
4 "group": "root",
5 "iptables": true,
6 "ip-masq": true
7 }
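The chart's configmap.yaml template (referenced by the chksum/config annotation in the Deployment below) is not reproduced in this post. A minimal sketch of how ./settings/one could be wired into the docker-config configMap, assuming Helm's .Files.Get and not necessarily matching the chart's actual file:

./templates/configmap.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: docker-config
  namespace: {{ .Release.Namespace }}
data:
  config: |
{{ .Files.Get "settings/one" | indent 4 }}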
Named Templates (doc link)
op.ed time
The following is the meat of the presented solution. It involves Helm. Helm is a templating utility that is working its way towards fulfilling its stated goal of being a package manager for Kubernetes.
Because, um... well.. Kubernetes and uh... Golang, Helm, unsurprisingly, uses Go templates. If its notably inelegant appearance displeases you, well, the large pile of sand is over there. And here is your mallet. And you were born with the other critical piece to that puzzle. Go for it!
For everyone else, without further ado, third party plugins or wrapper scripts, I give you...
The Meat (or salty, smokey-flavored tempeh)
NOTE: the filenames prefixed with an underscore signal to helm that the contents are not Kube manifests.
First up: the InitMethod template. See this if you are unfamiliar with prefix (Polish) notation, which the comparisons on lines 8 and 11 use.
Within this wee mess we have a thing that, when included in another template, will emit a term (annotation or spec) indicative of the form supported by the target kube cluster.
./templates/_helpers.yaml
1 {{/* vim: set filetype=sls sw=2 ts=2: */}}
2
3
4 {{- define "InitMethod" -}}
5 {{- $major := .Capabilities.KubeVersion.Major -}}
6 {{- $minor_ := ( splitList "+" .Capabilities.KubeVersion.Minor ) -}}
7 {{- $minor := index $minor_ 0 -}}
8 {{- if and (lt (int $major) 2) (lt (int $minor) 8) }}
9 {{- printf "annotation" -}}
10 {{- else -}}
11 {{- if and (eq (int $major) 1) (ge (int $minor) 8) }}
12 {{- printf "spec" -}}
13 {{- end -}} {{/* else if */}}
14 {{- end -}} {{/* if */}}
15 {{- end -}} {{/* define */}}
Once your eyes are able to blur past the template markup, it is quite straightforward:
- InitMethod
  - What version of Kubernetes are we talking to?
    - less than v1.8: use the annotation form
    - v1.8 and beyond: use the spec form
NOTE: GKE decided to augment the kube minor version with a "+". Lines 6-7 are required to deal with this anomaly.
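To make that concrete, here is roughly how the values flow through InitMethod on a GKE v1.8.x cluster (the values are illustrative):

.Capabilities.KubeVersion.Major  =>  "1"
.Capabilities.KubeVersion.Minor  =>  "8+"        (GKE's augmented form)
splitList "+" "8+"               =>  ["8", ""]
index ["8", ""] 0                =>  "8"
int "8"                          =>  8            (compared against 8 on lines 8 and 11)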
./templates/_init-containers.yaml
1 {{/* vim: set filetype=sls sw=2 ts=2: */}}
2
3 {{- define "InitSpec" }}
4 {{- if eq (include "InitMethod" .) "spec" }}
5 {{- $env := .Values.Env }}
6 {{- $volumes := .Values.VolumeMounts }}
7 {{- $image := ( printf "%s:%s" .Values.Image .Values.ImageTag ) }}
8 initContainers:
9 {{- range .Values.InitCommands }}
10 -
11 name: {{ .name }}
12 image: {{ $image }}
13 command: ["/bin/sh", "-c"]
14 args:
15 - {{ .command | quote }}
16 env:
17 {{ toYaml $env | indent 8 }}
18 volumeMounts:
19 {{ toYaml $volumes | indent 8 }}
20 {{- end }} {{/* range */}}
21 {{- end }} {{/* if */}}
22 {{- end }} {{/* define */}}
23
24
25
26 {{- define "InitAnnotation" }}
27 {{- if eq (include "InitMethod" .) "annotation" }}
28 {{- $env := .Values.Env }}
29 {{- $volumes := .Values.VolumeMounts }}
30 {{- $image := ( printf "%s:%s" .Values.Image .Values.ImageTag ) }}
31 pod.beta.kubernetes.io/init-containers: |
32 [
33 {{- range $ic_index, $ic := .Values.InitCommands }}
34 {{- if $ic_index }},{{end}}
35 {
36 "name": {{ .name | quote }},
37 "image": {{ $image | quote }},
38 "command": ["/bin/sh", "-c"],
39 "args": [ {{ .command | quote }} ],
40 "env":
41 [
42 {{- range $ev_index, $ev := $env }}
43 {{- if $ev_index}},{{end}}
44 {{ toJson $ev | indent 12 }}
45 {{- end }}
46 ],
47 "volumeMounts":
48 [
49 {{- range $vm_index, $vm := $volumes }}
50 {{- if $vm_index }},{{end}}
51 {{ toJson $vm | indent 12 }}
52 {{- end }}
53 ]
54 }
55 {{- end }}
56 ]
57 {{- end }}
58 {{- end }}
59
60 {{- define "InitContainers" }}
61 {{- if eq ( include "InitMethod" . ) "annotation" }}
62 {{- include "InitAnnotation" . }}
63 {{- end }}
64 {{- if eq ( include "InitMethod" . ) "spec" }}
65 {{- include "InitSpec" . }}
66 {{- end }}
67 {{- end }}
See lines 4 & 27 for how InitMethod is called.
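For reference, with the values.yaml above, the InitAnnotation template renders to roughly the following (env and volumeMounts elided):

pod.beta.kubernetes.io/init-containers: |
  [
    {
      "name": "config",
      "image": "docker:18-dind",
      "command": ["/bin/sh", "-c"],
      "args": [ "cp /etc/docker_/config.json /etc/docker/" ],
      "env": [ ... ],
      "volumeMounts": [ ... ]
    }
  ]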
The Cheese (or congealed soy paste cheese analog)
- InitSpec
- InitAnnotation
These are the templates that inject the appropriate init-containers definition when included by a Deployment manifest. They can be included as part of a Chart's boilerplate (if one is so inclined), as they add nothing to the manifest's structure if no init-container commands are defined to drive them.
Encapsulating the if/else logic within the helper templates allows a Deployment manifest template to get away with only two template-related statements. Only the form that provides full functionality on the target cluster actually renders anything; therefore, the desire for a chart that is version-agnostic (viz. init-containers) is satisfied.
templates/deployment.yaml
1 apiVersion: extensions/v1beta1
2 kind: Deployment
3 metadata:
4 name: {{.Values.Plug}}-{{.Values.deploymentEnvironment}}
5 namespace: {{.Release.Namespace}}
6 labels:
7 app: {{.Values.Plug}}
8 env: {{.Values.deploymentEnvironment}}
9 imageTag: {{.Values.ImageTag | quote }}
10 heritage: {{.Release.Service | quote }}
11 release: {{ .Release.Name | quote }}
12 chart: {{.Chart.Name}}-{{.Chart.Version}}
13 spec:
14 selector:
15 matchLabels:
16 app: {{.Values.Plug}}-{{.Values.deploymentEnvironment}}
17 env: {{.Values.deploymentEnvironment}}
18 imageTag: {{.Values.ImageTag | quote }}
19 release: {{ .Release.Name | quote }}
20 template:
21 metadata:
22 labels:
23 app: {{.Values.Plug}}-{{.Values.deploymentEnvironment}}
24 env: {{.Values.deploymentEnvironment}}
25 imageTag: {{.Values.ImageTag | quote }}
26 release: {{ .Release.Name | quote }}
27 annotations:
28 chksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum | quote }}
29 {{- include "InitAnnotation" . | indent 8 }}
30 spec:
31 {{- include "InitSpec" . | indent 6 }}
32 {{- if .Values.NodeSelectors }}
33 nodeSelector:
34 {{- toYaml .Values.NodeSelectors | indent 10 }}
35 {{- end }}
36 volumes:
37 {{ toYaml .Values.Volumes | indent 8 }}
38 containers:
39 -
40 name: docker
41 image: {{.Values.Image}}:{{.Values.ImageTag}}
42 command:
43 - /usr/local/bin/dockerd
44 args:
45 - --config-file=/etc/docker/config.json
46 - -H
47 - 0.0.0.0:49152
48 - --dns
49 - 8.8.8.8
50 - --insecure-registry
51 - registry--ci.ci
52 securityContext:
53 privileged: true
54 ports:
55 -
56 protocol: TCP
57 containerPort: 49152
58 volumeMounts:
59 {{ toYaml .Values.VolumeMounts | indent 12 }}
60 env:
61 {{ toYaml .Values.Env | indent 12 }}
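Once the release is installed (or upgraded), a couple of quick checks confirm the init-container did its job; the label comes from Plug and deploymentEnvironment in values.yaml:

# the pod should pass briefly through Init:0/1 and settle into Running
$ kubectl get pod -l app=docker-demo

# confirm the init-container and its mounts were wired in as expected
$ kubectl describe pod -l app=docker-demo | grep -A6 "Init Containers"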
Conclusion
The original target example for this post was to demonstrate a Django and Celery application. Once I encountered CVE-2017-1002101 I decided to refocus my initial foray towards this simpler and much more immediate problem domain. This example doesn't have complex requirements for making it work. As long as the init-container image has a functioning cp binary it will fit the bill--a valid argument can be made that the post-initial-deployment functionality of this init-container has no effect on the long-term viability of that Deployment (as long as the command is entered correctly the first time).
Because Python runtimes (e.g., Django, Celery, Gunicorn) directly consume application code, it is critical that those runtimes' environments are always in sync. The next post will cover such a deployment and will include the methods demonstrated today.
Thanks for reading!
init-containers CVE-2017-1002101 initContainers kubernetes configMap gotpl k8s sub-path