Bug 1093132 - kubeadm not running on Tumbleweed
Status: CONFIRMED
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Containers
Version: Current
Hardware: Other
OS: Other
Priority: P2 - High
Severity: Normal
Target Milestone: ---
Assigned To: Rafael Fernández López
QA Contact: E-mail List
Depends on:
Blocks:
Reported: 2018-05-14 12:23 UTC by Panagiotis Georgiadis
Modified: 2018-08-16 12:39 UTC (History)
CC: 7 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
kubelet logs (5.76 MB, application/x-gzip)
2018-05-28 13:50 UTC, Panagiotis Georgiadis

Description Panagiotis Georgiadis 2018-05-14 12:23:40 UTC
This seems to be related to 'cri-tools'

How to reproduce:

Install the following packages:
> kubernetes-client kubernetes-kubelet kubernetes-kubeadm docker-kubic cri-tools

Run:
> kubeadm init

PS: If you get any error related to a port that is already in use, stop the kubelet first (systemctl stop kubelet), reset kubeadm (kubeadm reset), and try again.

The error you will get is:

> ERROR CRI: unable to check if the container runtime at "/var/run/dockershim.sock" is running: exit status 1
Comment 1 Valentin Rothberg 2018-05-14 12:31:26 UTC
(In reply to Panagiotis Georgiadis from comment #0)
> This seems to be related to 'cri-tools'

I don't see how cri-tools (i.e., crictl and critest) could cause this error. To me it looks more like a K8s-related issue as crictl shouldn't be called in the reproducing steps. What makes you draw the conclusion that it could be crictl?
Comment 2 Panagiotis Georgiadis 2018-05-14 12:41:14 UTC
If you do the exact same procedure but remove the 'cri-tools' package, then you will see this:

> [WARNING FileExisting-crictl]: crictl not found in system path

As you can see, during the pre-flight checks, kubeadm looks for the 'crictl' binary. If it is present on the system (which is the case when the cri-tools package is installed), kubeadm tries to use it, and it fails with this message:

> ERROR CRI: unable to check if the container runtime at "/var/run/dockershim.sock" is running: exit status 1

After removing the package and trying again, I end up with a different error:

> [markmaster] Will mark node ultron as master by adding a label and a taint
> error marking master: timed out waiting for the condition.

As a result, I conclude that the 'ERROR CRI' is triggered when 'cri-tools' package is installed.
Comment 3 Valentin Rothberg 2018-05-14 12:49:45 UTC
(In reply to Panagiotis Georgiadis from comment #2)
[...]
> As a result, I conclude that the 'ERROR CRI' is triggered when 'cri-tools'
> package is installed.

Ouch, thanks for the details. I completely agree with you. Can you check if `/etc/crictl.yaml` points to dockershim.sock? In the meantime, I am looking at the code to see the error conditions.
Comment 4 Valentin Rothberg 2018-05-14 12:51:27 UTC
(In reply to Valentin Rothberg from comment #3)
> Ouch, thanks for the details. I completely agree with you. Can you check if
> `/etc/crictl.yaml` points to dockershim.sock? In the meantime, I am looking
> at the code to see the error conditions.

If no such config is provided, the runtime endpoint defaults to "unix:///var/run/dockershim.sock" (which is also fine).
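For reference, that built-in default is equivalent to this minimal /etc/crictl.yaml (a sketch; as noted above, the file is optional and absent here):

```yaml
# Equivalent of crictl's default when /etc/crictl.yaml is absent:
# point crictl at the dockershim socket that kubelet creates.
runtime-endpoint: unix:///var/run/dockershim.sock
```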
Comment 5 Panagiotis Georgiadis 2018-05-14 13:00:29 UTC
(In reply to Valentin Rothberg from comment #3)
> Ouch, thanks for the details. I completely agree with you. Can you check if
> `/etc/crictl.yaml` points to dockershim.sock? 

Seems that these files are not present in the system:

> # ls -l /etc/crictl.yaml
> ls: cannot access '/etc/crictl.yaml': No such file or directory

> # ls -l /var/run/dockershim.sock
> ls: cannot access '/var/run/dockershim.sock': No such file or directory


Only the docker.sock exists:

> # ls -l /var/run/docker.sock
> srw-rw---- 1 root docker 0 May  9 14:02 /var/run/docker.sock

(In reply to Valentin Rothberg from comment #3)
> at the code to see the error conditions.

Maybe this https://git.io/vpHur will be helpful. I am not familiar with the Kubernetes code base :/
Comment 6 Valentin Rothberg 2018-05-14 13:05:06 UTC
It looks like kubelet isn't running. Can you do a `systemctl start kubelet` and check again? That will start the dockershim.sock as well.
Comment 7 Panagiotis Georgiadis 2018-05-14 13:13:06 UTC
(In reply to Valentin Rothberg from comment #6)
> It looks like that kubelet isn't running. Can you do a `systemctl start
> kubelet` and check again? That will start the dockershim.sock as well.

Indeed, 'dockershim.sock' is created as soon as kubelet starts:

# systemctl start kubelet
# ls -l /var/run/*.sock
srw-rw---- 1 root docker 0 May  9 14:02 /var/run/docker.sock
srwxr-xr-x 1 root root   0 May 14 15:07 /var/run/dockershim.sock <--
srw-rw-rw- 1 root root   0 May  8 17:02 /var/run/rpcbind.sock

However, now I get this:

> # systemctl start kubelet
> # kubeadm init
> [init] Using Kubernetes version: v1.9.7
> [init] Using Authorization modes: [Node RBAC]
>	[ERROR Port-10250]: Port 10250 is in use
> [preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

The 10250 port seems to be already in use.
Looks like 'kubeadm' wants to be the one to start kubelet, or there's a bug in kubelet.

FYI: newer versions of kubelet and kubeadm in Ubuntu don't have that problem (I tested it last week). So maybe we need a version bump (?)
A pastebin of that can be found at: https://pastebin.com/raw/SdLztTHj
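To see what is actually holding the kubelet port before retrying, a quick check (a sketch; `ss` is from iproute2, and `fuser 10250/tcp` from psmisc works as well):

```shell
# Show the listener on the kubelet port, if any; prints a note when the port is free
ss -tlnp 2>/dev/null | grep ':10250 ' || echo "port 10250 is free"
```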
Comment 8 Panagiotis Georgiadis 2018-05-14 13:24:09 UTC
Taken from: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/

> The kubelet is now restarting every few seconds, as it waits in a crashloop for kubeadm to tell it what to do. This crashloop is expected and normal, please proceed with the next step and the kubelet will start running normally.

So I guess that kubeadm expects kubelet to be already running before 'kubeadm init' is issued. In that case we have a bug with the port that is in use.

However, when googling for that, most people say that 'kubeadm reset' fixes the problem with the port. Doing so, one of the steps is to stop kubelet:

> [reset] Stopping the kubelet service.

And as soon as the kubelet is stopped, I end up with the CRI-ERROR.

To sum up:
----------
kubelet running: Port error
kubelet not running: CRI error

Seems I'm stuck :/
Comment 9 Valentin Rothberg 2018-05-14 13:54:53 UTC
(In reply to Panagiotis Georgiadis from comment #8)
[...]
> To sum up:
> ----------
> kubelet running: Port error
> kubelet not running: CRI error
> 
> Seems I'm stuck :/

I can reproduce it in a Tumbleweed VM. The odd thing is that kubelet is actually using port 10250 (checked via `fuser 10250/tcp`), but maybe kubeadm requires some more switches for kubelet.

Richard was looking into kubeadm lately. Richard, do you know what's going on? (set needinfo)
Comment 10 Valentin Rothberg 2018-05-15 05:59:28 UTC
I've tried on CaaSP node, where it was initially complaining about ports 10250 and 2379 being used. What I did was `kubeadm reset`, `systemctl stop kubelet`, `systemctl stop etcd`, `rm -rf /var/lib/etcd/*` and `kubeadm init` ran way further until erroring out with:

"error marking master: timed out waiting for the condition"
Comment 11 Valentin Rothberg 2018-05-15 06:04:10 UTC
@Panos: Can you try `kubeadm init --ignore-preflight-errors=cri` to ignore the "dockershim.sock running" error? I have no idea why the error log occurs (as dockershim is running), but with ignoring it I come as far as on a CaaSP node (i.e., to the master marking timeout).
Comment 12 Panagiotis Georgiadis 2018-05-15 09:29:15 UTC
(In reply to Valentin Rothberg from comment #11)
> @Panos: Can you try `kubeadm init --ignore-preflight-errors=cri` to ignore
> the "dockershim.sock running" error? I have no idea why the error log occurs
> (as dockershim is running), but with ignoring it I come as far as on a CaaSP
> node (i.e., to the master marking timeout).

Given that I have manually stopped kubelet beforehand (otherwise I run into the 'port is in use' message), the CRI ERROR turns into a simple warning, so the process does not exit.

>  [WARNING CRI]: unable to check if the container runtime at "/var/run/dockershim.sock" is running: exit status 1

However, 'kubeadm reset' fails to remove the containers using 'crictl'; fortunately it falls back to docker, so the 'reset' functionality works:

> [reset] Cleaning up running containers using crictl with socket /var/run/dockershim.sock
> [reset] Failed to stop the running containers using crictl. Trying using docker instead.

The rest of the log:

> [preflight] Starting the kubelet service
> [certificates] Generated ca certificate and key.
> [certificates] Generated apiserver certificate and key.
> [certificates] apiserver serving cert is signed for DNS names [ultron kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.160.5.162]
> [certificates] Generated apiserver-kubelet-client certificate and key.
> [certificates] Generated sa key and public key.
> [certificates] Generated front-proxy-ca certificate and key.
> [certificates] Generated front-proxy-client certificate and key.
> [certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
> [kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
> [kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
> [kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
> [kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"
> [controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
> [controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
> [controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
> [etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
> [init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
> [init] This might take a minute or longer if the control plane images have to be pulled.
> [apiclient] All control plane components are healthy after 18.002374 seconds
> [uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
> [markmaster] Will mark node ultron as master by adding a label and a taint
> error marking master: timed out waiting for the condition

Looking at the logs of kubelet:

> hyperkube[11237]: I0515 10:44:20.757152   11237 prober.go:111] Liveness probe for "kube-apiserver-ultron_kube-system(2493d5406e150ab99897ddd64627789c):kube-apiserver" failed (failure): HTTP probe failed with statuscode: 500

If you run the command using --dry-run=true you will find what kubeadm tries to apply:

> [markmaster] Will mark node ultron.suse.de as master by adding a label and a taint
> [dryrun] Would perform action GET on resource "nodes" in API group "core/v1"
> [dryrun] Resource name: "ultron.suse.de"
> [dryrun] Would perform action PATCH on resource "nodes" in API group "core/v1"
> [dryrun] Resource name: "ultron.suse.de"
> [dryrun] Attached patch:
> {"metadata":{"labels":{"node-role.kubernetes.io/master":""}},"spec":{"taints":[{"effect":"NoSchedule","key":"node-role.kubernetes.io/master"}]}}
> [markmaster] Master ultron.suse.de tainted and labelled with key/value: node-role.kubernetes.io/master=""

Also look at:

[1] https://github.com/kubernetes/kops/issues/4390
[2] https://github.com/kubernetes/kubeadm/issues/584
Comment 13 Valentin Rothberg 2018-05-28 06:52:35 UTC
@Panos: There have been several fixes to our Salt code in the past two weeks, which could solve your issue. Would you mind checking whether it's working now? The main issue we've seen so far is that the kubelet does not start, or fails quickly after having started, while Salt still assumes that everything's working fine.
Comment 14 Panagiotis Georgiadis 2018-05-28 13:32:26 UTC
(In reply to Valentin Rothberg from comment #13)
> @Panos: There have been several fixes to our Salt code in the past two
> weeks, which could solve your issue. Would you mind checking it's working
> now? The main issues we've seen so far is that the kubelet does not start or
> fails quickly after having started while Salt still assumes that
> everything's working fine.

I don't see how the Salt code is related to this, but anyway I've updated the packages on my TW:

> - docker-kubic 17.09.1_ce-5.1
> + docker-kubic 17.09.1_ce-10.1

The rest of the packages (that is: kubernetes-client kubernetes-kubelet kubernetes-kubeadm docker-kubic cri-tools) didn't have any newer version. Anyway, I tested it again and the results are the same.
Comment 15 Valentin Rothberg 2018-05-28 13:37:36 UTC
(In reply to Panagiotis Georgiadis from comment #14)
[...]
> The rest of the packages (that is: kubernetes-client kubernetes-kubelet
> kubernetes-kubeadm docker-kubic cri-tools) didn't have any newer version.
> Anyway, I tested it again and the results are the same.

That's unfortunate.  Can you provide the logs from kubelet to have a closer look?
Comment 16 Panagiotis Georgiadis 2018-05-28 13:50:44 UTC
Created attachment 771566 [details]
kubelet logs
Comment 17 Flavio Castelli 2018-05-29 06:52:55 UTC
(In reply to Panagiotis Georgiadis from comment #14)
> I don't see how the salt code is related to this but anyway I've updated the
> packages in my TW:

To provide some context, the dockershim socket is created by kubelet and is required by the crictl tool.

If the kubelet service doesn't start, there won't be any dockershim socket, hence our orchestration will fail.

We had some issues inside of our Salt states that led to errors similar to the ones you have seen.
Comment 18 Panagiotis Georgiadis 2018-06-06 11:27:14 UTC
JFYI: https://github.com/kubernetes/kubernetes/pull/64706 got merged today. This might fix the 'timeout' issue we are facing in kubeadm.
Comment 19 Joaquín Rivera 2018-06-14 06:52:40 UTC
There is a problem with packages, kubernetes-kubeadm-1.9.7-1.1.x86_64 conflicts with kubernetes-master in TW: https://openqa.opensuse.org/tests/690639?machine=64bit-4G-HD40G&test=MicroOS-plain&version=Tumbleweed&flavor=DVD&limit_previous=50&arch=x86_64&distri=kubic#step/kubeadm/6
Is it related/fallout from previous work, or should I create another bug?
Comment 20 Panagiotis Georgiadis 2018-06-14 07:36:25 UTC
(In reply to Joaquín Rivera from comment #19)
> There is a problem with packages, kubernetes-kubeadm-1.9.7-1.1.x86_64
> conflicts with kubernetes-master in TW:
> https://openqa.opensuse.org/tests/690639?machine=64bit-4G-HD40G&test=MicroOS-
> plain&version=Tumbleweed&flavor=DVD&limit_previous=50&arch=x86_64&distri=kubi
> c#step/kubeadm/6
> Is it related/fallout from previous work or should create another bug?

Please create another bug :)
Comment 23 David Cassany 2018-07-19 15:12:42 UTC
Panagiotis,

This seems to be a Kubernetes issue. Kubernetes has been updated to v1.10; is it possible for you to verify whether the bug is still present?

Thanks
Comment 24 Valentin Rothberg 2018-07-30 12:29:05 UTC
I'm having another look with the latest K8s in Tumbleweed.
Comment 26 Valentin Rothberg 2018-07-30 13:23:14 UTC
> JFYI: https://github.com/kubernetes/kubernetes/pull/64706 got merged today. This might fix the 'timeout' issue we are facing in kubeadm.

The fix made it into Kubernetes 1.11. I had a quick look at the source and the patches don't apply to 1.10.x (currently in Tumbleweed). I suggest that we update K8s soon in TW/Kubic rather than backporting it.
Comment 27 Panagiotis Georgiadis 2018-08-09 10:58:15 UTC
I was on vacation. k8s 1.10 still gives the same error. As Valentin says, we'll wait for 1.11 and test it there. Let's see ...
Comment 28 Rafael Fernández López 2018-08-13 20:51:10 UTC
SR open: https://build.opensuse.org/request/show/629006
Comment 29 Panagiotis Georgiadis 2018-08-14 00:34:20 UTC
I tested the SR and it indeed fixes the problem! Now kubeadm can run, but still not without major problems. Although the cluster is up, there's a new problem with systemd [1]. The fix is documented in the PR, and I would suggest Richard include this configuration in the kubernetes package.


> cat /etc/sysconfig/kubelet
> KUBELET_EXTRA_ARGS="--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice"


Apart from that, the most important problem now is that CoreDNS (which is the default in kubeadm since 1.11) is not starting. I tracked down the issue and found that we are missing a lot of CNI binaries. Actually we ship just one (?). I did the following experiments:

1) Weave-net plugin

> We need: /opt/cni/bin/loopback (I fetched it from upstream)


2) Flannel

> # If you use Flannel instead of Weave-net, you need more CNI binaries than just loopback.
> # These are: bridge, flannel, host-local, loopback, portmap.
> # All of these are included in the compressed *.tgz archive.

I managed to fetch them from upstream, but I also had to put them in the right directory first, since kubeadm expects them in '/opt/cni/bin':

> # cat /var/lib/kubelet/kubeadm-flags.env
> KUBELET_KUBEADM_ARGS=--cgroup-driver=cgroupfs --cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni

In this regard, please fix the location of the binaries for the CNI package.

> cd /opt/cni/bin
> curl -L -O https://github.com/containernetworking/plugins/releases/download/v0.7.1/cni-plugins-amd64-v0.7.1.tgz
> tar -xf cni-plugins-amd64-v0.7.1.tgz

After doing that, just restart the kubelet and CoreDNS will start running.
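The fetch-and-unpack steps above can be double-checked with a small sketch that verifies the plugin list from this comment is actually in the directory kubeadm expects:

```shell
# Check that the CNI binaries flannel needs exist in /opt/cni/bin
for p in bridge flannel host-local loopback portmap; do
    if [ -x "/opt/cni/bin/$p" ]; then echo "$p: ok"; else echo "$p: MISSING"; fi
done
```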

Do you want me to create a new bug for this CNI issue?


[1] https://github.com/kubernetes/kubernetes/issues/56850
Comment 30 Valentin Rothberg 2018-08-14 05:59:04 UTC
(In reply to Panagiotis Georgiadis from comment #29)
[...]
> 2) Flannel
> 
> > # instead of Weavenet, then you need more CNI binary rather than just loopback.
> > # These are: (bridge, flannel, host-local, loopback, portmap).
> > # All of these are included in the compressed *.tgz archive.
> 
> I managed to fetch them from upstream, but I also had to put them in the
> right directory first, since kubeadm expects them into '/opt/cni/bin'
> 
> > # cat /var/lib/kubelet/kubeadm-flags.env
> > KUBELET_KUBEADM_ARGS=--cgroup-driver=cgroupfs --cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni
> 
> In this regard, please fixing the location of the binaries for the CNI
> package.
> 
> > cd /opt/cni/bin
> > curl -L -O https://github.com/containernetworking/plugins/releases/download/v0.7.1/cni-plugins-amd64-v0.7.1.tgz
> > tar -xf cni-plugins-amd64-v0.7.1.tgz
> 
> After doing that, just restart the kubelet and CoreDNS will start running.
> 
> Do you want me to create a new bug for this CNI issue?

Thanks for the detailed summary, Panos.

I am a bit surprised though as we have cni-plugins but we install them under `/usr/lib/cni` (including flannel) as openSUSE doesn't use /opt. Can't we use `-cni-bin-dir=/usr/lib/cni` for kubelet?
Comment 31 Richard Brown 2018-08-14 07:30:00 UTC
(In reply to Valentin Rothberg from comment #30)

> > Do you want me to create a new bug for this CNI issue?
> 
> Thanks for the detailed summary, Panos.
> 
> I am a bit surprised though as we have cni-plugins but we install them under
> `/usr/lib/cni` (including flannel) as openSUSE doesn't use /opt. Can't we
> use `-cni-bin-dir=/usr/lib/cni` for kubelet?

I think we can, and I think we should

I also think Panos' observation of the cgroup settings needs me to re-read what upstream have in their recommended settings - it looks to me like upstream k8s might have set it wrong for openSUSE, and if that's the case I'll submit the fix there before patching it in our packages
Comment 32 Rafael Fernández López 2018-08-14 08:08:29 UTC
(In reply to Panagiotis Georgiadis from comment #29)
[...]
> Do you want me to create a new bug for this CNI issue?

It is expected for CoreDNS to not work until the CNI plugin has been deployed. Have you tried to install the CNI plugin directly with `kubectl`?

# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.10.0/Documentation/kube-flannel.yml

(Note that you need [as you specify] to set `--pod-network-cidr` in the `kubeadm init` command in order for flannel to work correctly).

I'm not sure if I'm misunderstanding, but we should try to avoid using packages from the distribution/image/build where possible, and CNI has always been kind of a special case in this regard (a DaemonSet is deployed that installs the CNI binary from within the container right onto the host, on all machines).
Comment 33 Panagiotis Georgiadis 2018-08-14 10:08:36 UTC
I was not aware of the 'cni-plugins' package. In that case:

1. Make cni-plugins a dependency of kubernetes
2. Tell kubelet to use our plugin path instead of /opt/

Then it should work.
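A quick way to tell which situation a machine is in (a sketch; /usr/lib/cni is where the openSUSE cni-plugins package installs its binaries, per comment #30):

```shell
# List the packaged CNI plugins if installed, otherwise say so
ls /usr/lib/cni 2>/dev/null || echo "cni-plugins not installed"
```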

(In reply to Valentin Rothberg from comment #30)

> I am a bit surprised though as we have cni-plugins but we install them under
> `/usr/lib/cni` (including flannel) as openSUSE doesn't use /opt. Can't we
> use `-cni-bin-dir=/usr/lib/cni` for kubelet?

I guess we can. We need to find out how kubeadm generates '/var/lib/kubelet/kubeadm-flags.env'.

Reading /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf:

> # Note: This dropin only works with kubeadm and kubelet v1.11+
> [Service]
> Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
> Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
> # This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
> EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
> # This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
> # the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
> EnvironmentFile=-/etc/sysconfig/kubelet
> ExecStart=
> ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS


I've tried to put this info into /etc/sysconfig/kubelet:

ultron:/ # cat /etc/sysconfig/kubelet 

> KUBELET_EXTRA_ARGS=""
> KUBELET_KUBEADM_ARGS="--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice --cni-bin-dir=/usr/lib/cni"

But even after 'kubeadm reset' and 'kubeadm init', the env file "/var/lib/kubelet/kubeadm-flags.env" gets regenerated with the same old wrong parameters. If I edit the file on the fly, everything is good. But this is not a good solution. We have to find out how kubeadm can generate it correctly.
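The manual on-the-fly edit described here amounts to the following (a sketch on a scratch copy, since kubeadm rewrites the real /var/lib/kubelet/kubeadm-flags.env on every init; the file contents are taken from this report):

```shell
# Recreate the generated flags file (contents from this bug report) on a scratch path,
# then apply the by-hand fix: point --cni-bin-dir at the packaged plugin directory.
cat > /tmp/kubeadm-flags.env <<'EOF'
KUBELET_KUBEADM_ARGS=--cgroup-driver=cgroupfs --cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni
EOF
sed -i 's|--cni-bin-dir=/opt/cni/bin|--cni-bin-dir=/usr/lib/cni|' /tmp/kubeadm-flags.env
cat /tmp/kubeadm-flags.env
```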
Comment 34 Rafael Fernández López 2018-08-14 10:50:13 UTC
(In reply to Panagiotis Georgiadis from comment #33)
> I was not aware about 'cni-plugins' package. In that case:
> 
> 1. Make cni-plugins a dependency of kubernetes
> 2. We need to tell kubelet to use our path of the plugins instead of using
> /opt/
[...]

`kubeadm` is not enough to get a final working cluster with CNI. The idea would be to use kubeadm to deploy the initial node (when bootstrapping), but then we have to deploy CNI on it (this installs the CNI plugin on the host from a container image); I see these as different steps, not something completely controlled by `kubeadm`.

I would avoid making kubernetes depend on `cni-plugins`; that is a package that comes baked into the image/ISO. The idea is to depend as little as possible on packages baked into our image/ISO and instead use a registry (registry.opensuse.org) to distribute these kinds of pieces.

We have the concept of an `init-container` (or `bootstrap-container`) that, when run with the "bootstrap" action, will perform some actions, call `kubeadm init` with the right parameters (possibly using the Go API), and then deploy CNI on that one-node cluster, which will presumably deploy a DaemonSet that installs the chosen CNI plugin on the host.

The point is that `kubeadm` won't set up everything from beginning to end; we still need another piece that wraps `kubeadm` and deploys more things on top of it (CNI, the new velum, haproxy [or whatever we'll use for HA between nodes], ...).
Comment 35 Richard Brown 2018-08-14 11:30:21 UTC
(In reply to Panagiotis Georgiadis from comment #33)
 
> I've tried to put these info into: /etc/sysconfig/kubelet 
> 
> ultron:/ # cat /etc/sysconfig/kubelet 
> 
> > KUBELET_EXTRA_ARGS=""
> > KUBELET_KUBEADM_ARGS="--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice --cni-bin-dir=/usr/lib/cni"
> 
> It even after 'kubeadem reset' and 'init' the env file
> "/var/lib/kubelet/kubeadm-flags.env" gets regenerates with the same old
> wrong parameters. If I edit the file on-the-fly, everything is good. But
> this is not a good solution. We have to find how kubeadm can generate it
> correctly.

You can't just add your own parameters to /etc/sysconfig/kubelet and expect them to be read properly.

/etc/sysconfig/kubelet is ONLY for KUBELET_EXTRA_ARGS. Please use it appropriately.
Comment 36 Panagiotis Georgiadis 2018-08-14 13:35:22 UTC
(In reply to Richard Brown from comment #35)
> You can't just go adding your own parameters into /etc/sysconfig/kubelet and
> expect it to be read properly
> 
> /etc/sysconfig/kubelet is ONLY for KUBELET_EXTRA_ARGS.. please use it
> appropriately.

One does not simply walk into Mordor :D

I am very much afraid that this specific path is hardcoded somewhere in kubeadm, or there's a bug. Even if I change it, kubeadm keeps regenerating the file and I end up with two paths: mine and the wrong one.

I created an upstream bug: https://github.com/kubernetes/kubernetes/issues/67390
Comment 37 Panagiotis Georgiadis 2018-08-16 12:38:05 UTC
If we use:

> --cri-socket /var/run/crio/crio.sock

Everything is better :)

https://pastebin.com/raw/bJB0zJLD

My only problem is that the kubelet logs are full of this:

> Failed to create summary reader for "/system.slice/localkube.service": none of the resources are being tracked.

Apart from that I'm fine with the way it is right now.
Comment 38 Panagiotis Georgiadis 2018-08-16 12:39:48 UTC
(In reply to Panagiotis Georgiadis from comment #29)
> > cat /etc/sysconfig/kubelet
> > KUBELET_EXTRA_ARGS="--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice"

Also this is not needed unless we use docker.