
Running kubectl requires using sudo kubectl #513


Merged
merged 2 commits into lima-vm:master on Jan 3, 2022

Conversation

@afbjorklund (Member) commented Jan 1, 2022

A bit of a mismatch between k3s and k8s, which needs better documentation...

Make it easier to set up the user config, and make sudo kubectl work as well. That is:

running kubectl directly on the guest is supposed to require sudo

running kubectl remotely from the host is configured to not use sudo

(a sketch of the guest-side setup follows the error examples below)


k3s, when not using sudo

WARN[0000] Unable to read /etc/rancher/k3s/k3s.yaml, please start server with --write-kubeconfig-mode to modify kube config permissions 
error: error loading config file "/etc/rancher/k3s/k3s.yaml": open /etc/rancher/k3s/k3s.yaml: permission denied

k8s, without $KUBECONFIG

The connection to the server localhost:8080 was refused - did you specify the right host or port?
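
A minimal sketch of the guest-side setup, assuming the kubeadm paths that show up later in this thread (not necessarily exactly what the provisioning script ends up doing):

# make "sudo kubectl" work on the guest, as in the provisioning log further down
sudo mkdir -p /root/.kube
sudo cp -f /etc/kubernetes/admin.conf /root/.kube/config

# what kubeadm itself suggests for running kubectl as a regular user
mkdir -p "$HOME/.kube"
sudo cp -i /etc/kubernetes/admin.conf "$HOME/.kube/config"
sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config"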

@AkihiroSuda (Member)

DCO check failing

@jandubois (Member) previously approved these changes Jan 2, 2022 and left a comment

I haven't tested it myself, but otherwise LGTM.

@afbjorklund (Member, Author)

I wanted it to fail when not using sudo. Previously we had a model where docker was set-uid (through the docker group).

In the new model, both nerdctl and docker run rootless, unless you explicitly ask them to run as root (using sudo).

So use the same model for Kubernetes.

In the future, it might also run in fakeroot.
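
To illustrate the parallel inside the guest (a rough sketch; whether rootless containerd is enabled depends on the instance template):

nerdctl ps                # rootless: the current user's containerd
sudo nerdctl ps           # rootful: the system containerd
sudo kubectl get nodes    # Kubernetes follows the rootful side, so sudo is expected here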

@jandubois (Member)

Change looks good to me, but the config doesn't actually work (tried twice):

[  111.958492] cloud-init[1420]: error execution phase preflight: [preflight] Some fatal errors occurred:
[  111.958744] cloud-init[1420]:        [ERROR ImagePull]: failed to pull image k8s.gcr.io/etcd:3.5.1-0: output: time="2022-01-03T03:07:01Z" level=fatal msg="pulling image: rpc error: code = Unknown desc = failed to pull and unpack image \"k8s.gcr.io/etcd:3.5.1-0\": failed to resolve reference \"k8s.gcr.io/etcd:3.5.1-0\": failed to do request: Head \"https://k8s.gcr.io/v2/etcd/manifests/3.5.1-0\": dial tcp 142.250.114.82:443: connect: network is unreachable"
[  111.959415] cloud-init[1420]: , error: exit status 1
[  111.960081] cloud-init[1420]:        [ERROR ImagePull]: failed to pull image k8s.gcr.io/coredns/coredns:v1.8.6: output: time="2022-01-03T03:07:01Z" level=fatal msg="pulling image: rpc error: code = Unknown desc = failed to pull and unpack image \"k8s.gcr.io/coredns/coredns:v1.8.6\": failed to resolve reference \"k8s.gcr.io/coredns/coredns:v1.8.6\": failed to do request: Head \"https://k8s.gcr.io/v2/coredns/coredns/manifests/v1.8.6\": dial tcp 142.250.114.82:443: connect: network is unreachable"
[  111.960459] cloud-init[1420]: , error: exit status 1

Don't have time to look closer right now; please confirm that things work as-is for you!

@afbjorklund (Member, Author) commented Jan 3, 2022

Could it be a temporary network issue? We could pull the images explicitly, to make it more obvious?

[   82.826629] cloud-init[1509]: [init] Using Kubernetes version: v1.23.1
[   82.826866] cloud-init[1509]: [preflight] Running pre-flight checks
[   82.917051] cloud-init[1509]: [preflight] Pulling images required for setting up a Kubernetes cluster
[   82.917153] cloud-init[1509]: [preflight] This might take a minute or two, depending on the speed of your internet connection
[   82.917435] cloud-init[1509]: [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'

Something that I do interactively is to run them with xargs (or parallel) to get some progress output.
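
For example (a sketch; -P just runs the pulls in parallel, and a serial version with its output is shown further down):

sudo kubeadm config images list | xargs -P 4 -n 1 sudo crictl pull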

[  143.637714] cloud-init[1509]: Your Kubernetes control-plane has initialized successfully!
[  143.637808] cloud-init[1509]: To start using your cluster, you need to run the following as a regular user:
[  143.638178] cloud-init[1509]:   mkdir -p $HOME/.kube
[  143.638519] cloud-init[1509]:   sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[  143.638884] cloud-init[1509]:   sudo chown $(id -u):$(id -g) $HOME/.kube/config
[  143.639238] cloud-init[1509]: Alternatively, if you are the root user, you can run:
[  143.639587] cloud-init[1509]:   export KUBECONFIG=/etc/kubernetes/admin.conf
[  143.639905] cloud-init[1509]: You should now deploy a pod network to the cluster.
[  143.640319] cloud-init[1509]: Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
[  143.640743] cloud-init[1509]:   https://kubernetes.io/docs/concepts/cluster-administration/addons/
[  143.641034] cloud-init[1509]: Then you can join any number of worker nodes by running the following on each as root:
[  143.641316] cloud-init[1509]: kubeadm join 192.168.5.15:6443 --token eupt69.1nzxhoz82yzjxswy \
[  143.641597] cloud-init[1509]: 	--discovery-token-ca-cert-hash sha256:9cea4afd4bf1b1bf2c0c8f106db93e0eb73c1a9e2e08562520fc8e436a27e54f
[  143.649869] cloud-init[1509]: + kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/v0.14.0/Documentation/kube-flannel.yml
[  144.746203] cloud-init[1509]: Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
[  144.750733] cloud-init[1509]: podsecuritypolicy.policy/psp.flannel.unprivileged created
[  144.755456] cloud-init[1509]: clusterrole.rbac.authorization.k8s.io/flannel created
[  144.760234] cloud-init[1509]: clusterrolebinding.rbac.authorization.k8s.io/flannel created
[  144.765909] cloud-init[1509]: serviceaccount/flannel created
[  144.771691] cloud-init[1509]: configmap/kube-flannel-cfg created
[  144.782799] cloud-init[1509]: daemonset.apps/kube-flannel-ds created
[  144.785344] cloud-init[1509]: + kubectl taint nodes --all node-role.kubernetes.io/master-
[  144.837817] cloud-init[1509]: node/lima-k8s untainted
[  144.839504] cloud-init[1509]: + sed -e '/server:/ s/192.168.5.15/127.0.0.1/' -i /etc/kubernetes/admin.conf
[  144.841402] cloud-init[1509]: + mkdir -p /root/.kube
[  144.842487] cloud-init[1509]: + cp -f /etc/kubernetes/admin.conf /root/.kube/config
[  144.845090] cloud-init[1509]: LIMA| Exiting with code 0

Or cache the images locally, like previously discussed.

anders@lima-k8s:~$ sudo kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.23.1
k8s.gcr.io/kube-controller-manager:v1.23.1
k8s.gcr.io/kube-scheduler:v1.23.1
k8s.gcr.io/kube-proxy:v1.23.1
k8s.gcr.io/pause:3.6
k8s.gcr.io/etcd:3.5.1-0
k8s.gcr.io/coredns/coredns:v1.8.6
anders@lima-k8s:~$ sudo kubeadm config images list | xargs -n 1 sudo crictl pull
Image is up to date for sha256:b6d7abedde39968d56e9f53aaeea02a4fe6413497c4dedf091868eae09dcc320
Image is up to date for sha256:f51846a4fd28801f333d9a13e4a77a96bd52f06e587ba664c2914f015c38e5d1
Image is up to date for sha256:71d575efe62835f4882115d409a676dd24102215eee650bf23b9cf42af0e7c05
Image is up to date for sha256:b46c42588d5116766d0eb259ff372e7c1e3ecc41a842b0c18a8842083e34d62e
Image is up to date for sha256:6270bb605e12e581514ada5fd5b3216f727db55dc87d5889c790e4c760683fee
Image is up to date for sha256:25f8c7f3da61c2a810effe5fa779cf80ca171afb0adf94c7cb51eb9a8546629d
Image is up to date for sha256:a4ca41631cc7ac19ce1be3ebf0314ac5f47af7c711f17066006db82ee3b75b03
anders@lima-k8s:~$ sudo kubeadm version --output short
v1.23.1
anders@lima-k8s:~$ sudo kubeadm config images list > images.txt
anders@lima-k8s:~$ xargs sudo ctr -n k8s.io images export images.tar < images.txt
anders@lima-k8s:~$ du -hs images.tar 
216M	images.tar
anders@lima-k8s:~$ sudo ctr -n k8s.io images import images.tar 
unpacking k8s.gcr.io/kube-apiserver:v1.23.1 (sha256:f54681a71cce62cbc1b13ebb3dbf1d880f849112789811f98b6aebd2caa2f255)...done
unpacking k8s.gcr.io/kube-controller-manager:v1.23.1 (sha256:a7ed87380108a2d811f0d392a3fe87546c85bc366e0d1e024dfa74eb14468604)...done
unpacking k8s.gcr.io/kube-scheduler:v1.23.1 (sha256:8be4eb1593cf9ff2d91b44596633b7815a3753696031a1eb4273d1b39427fa8c)...done
unpacking k8s.gcr.io/kube-proxy:v1.23.1 (sha256:e40f3a28721588affcf187f3f246d1e078157dabe274003eaa2957a83f7170c8)...done
unpacking k8s.gcr.io/pause:3.6 (sha256:3d380ca8864549e74af4b29c10f9cb0956236dfb01c40ca076fb6c37253234db)...done
unpacking k8s.gcr.io/etcd:3.5.1-0 (sha256:64b9ea357325d5db9f8a723dcf503b5a449177b17ac87d69481e126bb724c263)...done
unpacking k8s.gcr.io/coredns/coredns:v1.8.6 (sha256:5b6ec0d6de9baaf3e92d0f66cd96a25b9edbce8716f5f15dcd1a616b3abd590e)...done

Other flags:

      --image-repository string     Choose a container registry to pull control plane images from (default "k8s.gcr.io")
      --kubernetes-version string   Choose a specific Kubernetes version for the control plane. (default "stable-1")
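
For example, the two flags could be combined like this (a sketch; the mirror registry is only a placeholder):

sudo kubeadm config images pull --image-repository registry.example.com/k8s --kubernetes-version v1.23.1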

@jandubois (Member)

I still get the error from the provisioning script:

+ kubeadm config images pull
[config/images] Pulled k8s.gcr.io/kube-apiserver:v1.23.1
[config/images] Pulled k8s.gcr.io/kube-controller-manager:v1.23.1
[config/images] Pulled k8s.gcr.io/kube-scheduler:v1.23.1
[config/images] Pulled k8s.gcr.io/kube-proxy:v1.23.1
[config/images] Pulled k8s.gcr.io/pause:3.6
failed to pull image "k8s.gcr.io/etcd:3.5.1-0": output: time="2022-01-03T17:25:59Z" level=fatal msg="pulling image: rpc error: code = Unknown desc = failed to pull and unpack image \"k8s.gcr.io/etcd:3.5.1-0\": failed to resolve reference \"k8s.gcr.io/etcd:3.5.1-0\": failed to do request: Head \"https://k8s.gcr.io/v2/etcd/manifests/3.5.1-0\": dial tcp 74.125.135.82:443: connect: network is unreachable"
, error: exit status 1
To see the stack trace of this error execute with --v=5 or higher

I could pull the image later manually, but of course the earlier error aborted the provisioning.

root@lima-k8s:/var/log# kubeadm config images pull
[config/images] Pulled k8s.gcr.io/kube-apiserver:v1.23.1
[config/images] Pulled k8s.gcr.io/kube-controller-manager:v1.23.1
[config/images] Pulled k8s.gcr.io/kube-scheduler:v1.23.1
[config/images] Pulled k8s.gcr.io/kube-proxy:v1.23.1
[config/images] Pulled k8s.gcr.io/pause:3.6
[config/images] Pulled k8s.gcr.io/etcd:3.5.1-0
[config/images] Pulled k8s.gcr.io/coredns/coredns:v1.8.6

No time to debug right now; will look more later...

@afbjorklund (Member, Author)

At least it makes the network issue more clear, even if unrelated to this PR.

@jandubois (Member) left a comment

LGTM

@jandubois (Member)

> At least it makes the network issue more clear, even if unrelated to this PR.

Yes, I get the same error even with the old scripts, so no reason not to merge this.

@jandubois merged commit 351c4ae into lima-vm:master on Jan 3, 2022