2023-12-23
# CKS Practice Questions
Contents:

- Pre Setup
- Question 1: Contexts
- Question 2: Runtime Security with Falco
- Question 3: Apiserver Security
- Question 4: Pod Security Standard
- Question 5: CIS Benchmark
- Question 6: Verify Platform Binaries
- Question 7: Open Policy Agent
- Question 8: Secure Kubernetes Dashboard
- Question 9: AppArmor Profile
- Question 10: Container Runtime Sandbox gVisor
- Question 11: Secrets in ETCD
- Question 12: Hack Secrets
- Question 13: Restrict access to Metadata Server
- Question 14: Syscall Activity
- Question 15: Configure TLS on Ingress
- Question 16: Docker Image Attack Surface
- Question 17: Audit Log Policy
- Question 18: Investigate Break-in via Audit Log
- Question 19: Immutable Root FileSystem
- Question 20: Update Kubernetes
- Question 21: Image Vulnerability Scanning
- Question 22: Manual Static Security Analysis

## Pre Setup

Once you've gained access to your terminal it might be wise to spend ~1 minute to set up your environment. You could set these:

```bash
alias k=kubectl                         # will already be pre-configured

export do="--dry-run=client -o yaml"    # k create deploy nginx --image=nginx $do

export now="--force --grace-period 0"   # k delete pod x $now
```

The following settings will already be configured in `~/.vimrc` in your real exam environment. But it can never hurt to be able to type these down:

```
set tabstop=2
set expandtab
set shiftwidth=2
```

More setup suggestions are in the tips section.

## Question 1: Contexts

### Task

Task weight: 1%

You have access to multiple clusters from your main terminal through `kubectl` contexts. Write all context names into `/opt/course/1/contexts`, one per line.

From the kubeconfig extract the certificate of user `restricted@infra-prod` and write it decoded to `/opt/course/1/cert`.
### Solution 1

```bash
kubectl config get-contexts -o name > /opt/course/1/contexts

kubectl config view --raw -o jsonpath='{.users[?(@.name == "restricted@infra-prod")].user.client-certificate-data}' | base64 -d > /opt/course/1/cert
```

### Solution 2

Maybe the fastest way is just to run:

```bash
k config get-contexts            # copy by hand
k config get-contexts -o name > /opt/course/1/contexts
```

Or using jsonpath:

```bash
k config view -o jsonpath="{.contexts[*].name}"
k config view -o jsonpath="{.contexts[*].name}" | tr " " "\n"   # new lines
k config view -o jsonpath="{.contexts[*].name}" | tr " " "\n" > /opt/course/1/contexts
```

The content could then look like:

```
# /opt/course/1/contexts
gianna@infra-prod
infra-prod
restricted@infra-prod
workload-prod
workload-stage
```

For the certificate we could just run

```bash
k config view --raw
```

and copy it manually. Or we do:

```bash
k config view --raw -ojsonpath="{.users[2].user.client-certificate-data}" | base64 -d > /opt/course/1/cert
```

Or even:

```bash
k config view --raw -ojsonpath="{.users[?(@.name == 'restricted@infra-prod')].user.client-certificate-data}" | base64 -d > /opt/course/1/cert
```

```
# /opt/course/1/cert
-----BEGIN CERTIFICATE-----
MIIDHzCCAgegAwIBAgIQN5Qe/Rj/PhaqckEI23LPnjANBgkqhkiG9w0BAQsFADAV
MRMwEQYDVQQDEwprdWJlcm5ldGVzMB4XDTIwMDkyNjIwNTUwNFoXDTIxMDkyNjIw
NTUwNFowKjETMBEGA1UEChMKcmVzdHJpY3RlZDETMBEGA1UEAxMKcmVzdHJpY3Rl
ZDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAL/Jaf/QQdijyJTWIDij
qa5p4oAh+xDBX3jR9R0G5DkmPU/FgXjxej3rTwHJbuxg7qjTuqQbf9Fb2AHcVtwH
gUjC12ODUDE+nVtap+hCe8OLHZwH7BGFWWscgInZOZW2IATK/YdqyQL5OKpQpFkx
iAknVZmPa2DTZ8FoyRESboFSTZj6y+JVA7ot0pM09jnxswstal9GZLeqioqfFGY6
YBO/Dg4DDsbKhqfUwJVT6Ur3ELsktZIMTRS5By4Xz18798eBiFAHvgJGq1TTwuPM
EhBfwYwgYbalL8DSHeFrelLBKgciwUKjr1lolnnuc1vhkX1peV1J3xrf6o2KkyMc
lY0CAwEAAaNWMFQwDgYDVR0PAQH/BAQDAgWgMBMGA1UdJQQMMAoGCCsGAQUFBwMC
MAwGA1UdEwEB/wQCMAAwHwYDVR0jBBgwFoAUPrspZIWR7YMN8vT5DF3s/LvpxPQw
DQYJKoZIhvcNAQELBQADggEBAIDq0Zt77gXI1s+uW46zBw4mIWgAlBLl2QqCuwmV
kd86eH5bD0FCtWlb6vGdcKPdFccHh8Z6z2LjjLu6UoiGUdIJaALhbYNJiXXi/7cf
M7sqNOxpxQ5X5hyvOBYD1W7d/EzPHV/lcbXPUDYFHNqBYs842LWSTlPQioDpupXp
FFUQPxsenNXDa4TbmaRvnK2jka0yXcqdiXuIteZZovp/IgNkfmx2Ld4/Q+Xlnscf
CFtWbjRa/0W/3EW/ghQ7xtC7bgcOHJesoiTZPCZ+dfKuUfH6d1qxgj6Jwt0HtyEf
QTQSc66BdMLnw5DMObs4lXDo2YE6LvMrySdXm/S7img5YzU=
-----END CERTIFICATE-----
```

## Question 2: Runtime Security with Falco

### Task

Task weight: 4%

Use context: `kubectl config use-context workload-prod`

Falco is installed with default configuration on node cluster1-node1. Connect using `ssh cluster1-node1`. Use it to:

1. Find a Pod running image nginx which creates unwanted package management processes inside its container.
2. Find a Pod running image httpd which modifies /etc/passwd.

Save the Falco logs for case 1 under /opt/course/2/falco.log in format:

```
time-with-nanoseconds,container-id,container-name,user-name
```

No other information should be in any line. Collect the logs for at least 30 seconds.

Afterwards remove the threats (both 1 and 2) by scaling the replicas of the Deployments that control the offending Pods down to 0.

### Solution

#### Use Falco as service

First we can investigate the Falco config a little:

```
➜ ssh cluster1-node1

➜ root@cluster1-node1:~# service falco status
● falco.service - LSB: Falco syscall activity monitoring agent
   Loaded: loaded (/etc/init.d/falco; generated)
   Active: active (running) since Sat 2020-10-10 06:36:15 UTC; 2h 1min ago
...

➜ root@cluster1-node1:~# cd /etc/falco

➜ root@cluster1-node1:/etc/falco# ls
falco.yaml  falco_rules.local.yaml  falco_rules.yaml  k8s_audit_rules.yaml  rules.available  rules.d
```

This is the default configuration. If we look into falco.yaml we can see:

```yaml
# /etc/falco/falco.yaml
...
# Where security notifications should go.
# Multiple outputs can be enabled.

syslog_output:
  enabled: true
...
```
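As an aside, syslog is only one of several Falco outputs. The default falco.yaml also ships a file output section that could be enabled instead; the excerpt below is a sketch (the filename here is illustrative, the stock default points elsewhere):

```yaml
# /etc/falco/falco.yaml (excerpt, sketch)
file_output:
  enabled: true                         # default is false
  keep_alive: false
  filename: /var/log/falco-events.txt   # illustrative path
```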
This means that Falco is writing into syslog, hence we can do:

```
➜ root@cluster1-node1:~# cat /var/log/syslog | grep falco
Sep 15 08:44:04 ubuntu2004 falco: Falco version 0.29.1 (driver version 17f5df52a7d9ed6bb12d3b1768460def8439936d)
Sep 15 08:44:04 ubuntu2004 falco: Falco initialized with configuration file /etc/falco/falco.yaml
Sep 15 08:44:04 ubuntu2004 falco: Loading rules from file /etc/falco/falco_rules.yaml:
...
```

Yep, quite some action going on in there. Let's investigate the first offending Pod:

```
➜ root@cluster1-node1:~# cat /var/log/syslog | grep falco | grep nginx | grep process
Sep 16 06:23:47 ubuntu2004 falco: 06:23:47.376241377: Error Package management process launched in container (user=root user_loginuid=-1 command=apk container_id=7a5ea6a080d1 container_name=nginx image=docker.io/library/nginx:1.19.2-alpine)
...

➜ root@cluster1-node1:~# crictl ps -id 7a5ea6a080d1
CONTAINER ID    IMAGE           NAME    ...   POD ID
7a5ea6a080d1b   6f715d38cfe0e   nginx   ...   7a864406b9794

➜ root@cluster1-node1:~# crictl pods -id 7a864406b9794
POD ID          ...   NAME                      NAMESPACE   ...
7a864406b9794   ...   webapi-6cfddcd6f4-ftxg4   team-blue   ...
```

The first Pod is webapi-6cfddcd6f4-ftxg4 in Namespace team-blue.

```
➜ root@cluster1-node1:~# cat /var/log/syslog | grep falco | grep httpd | grep passwd
Sep 16 06:23:48 ubuntu2004 falco: 06:23:48.830962378: Error File below /etc opened for writing (user=root user_loginuid=-1 command=sed -i $d /etc/passwd parent=sh pcmdline=sh -c echo hacker >> /etc/passwd; sed -i '$d' /etc/passwd; true file=/etc/passwdngFmAl program=sed gparent=<NA> ggparent=<NA> gggparent=<NA> container_id=b1339d5cc2de image=docker.io/library/httpd)

➜ root@cluster1-node1:~# crictl ps -id b1339d5cc2de
CONTAINER ID    IMAGE           NAME    ...   POD ID
b1339d5cc2dee   f6b40f9f8ad71   httpd   ...   595af943c3245

➜ root@cluster1-node1:~# crictl pods -id 595af943c3245
POD ID          ...   NAME                             NAMESPACE     ...
595af943c3245   ...   rating-service-68cbdf7b7-v2p6g   team-purple   ...
```

The second Pod is rating-service-68cbdf7b7-v2p6g in Namespace team-purple.
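The container-id lookup above can also be scripted. Here is a small self-contained sketch that pulls the container_id out of a Falco syslog line; the sample line is copied from the output above, so no live system is needed:

```shell
# Extract the container_id field from a Falco syslog line.
# On the node, the result could then be passed to: crictl ps -id "$cid"
line='Sep 16 06:23:47 ubuntu2004 falco: 06:23:47.376241377: Error Package management process launched in container (user=root user_loginuid=-1 command=apk container_id=7a5ea6a080d1 container_name=nginx image=docker.io/library/nginx:1.19.2-alpine)'
cid=$(printf '%s\n' "$line" | grep -o 'container_id=[0-9a-f]*' | cut -d= -f2)
echo "$cid"    # 7a5ea6a080d1
```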
#### Eliminate offending Pods

The logs from before should allow us to find and "eliminate" the offending Pods:

```
➜ k get pod -A | grep webapi
team-blue     webapi-6cfddcd6f4-ftxg4   1/1   Running

➜ k -n team-blue scale deploy webapi --replicas 0
deployment.apps/webapi scaled

➜ k get pod -A | grep rating-service
team-purple   rating-service-68cbdf7b7-v2p6g   1/1   Running

➜ k -n team-purple scale deploy rating-service --replicas 0
deployment.apps/rating-service scaled
```

#### Use Falco from command line

We can also use Falco directly from the command line, but only if the service is disabled:

```
➜ root@cluster1-node1:~# service falco stop

➜ root@cluster1-node1:~# falco
Thu Sep 16 06:33:11 2021: Falco version 0.29.1 (driver version 17f5df52a7d9ed6bb12d3b1768460def8439936d)
Thu Sep 16 06:33:11 2021: Falco initialized with configuration file /etc/falco/falco.yaml
Thu Sep 16 06:33:11 2021: Loading rules from file /etc/falco/falco_rules.yaml:
Thu Sep 16 06:33:11 2021: Loading rules from file /etc/falco/falco_rules.local.yaml:
Thu Sep 16 06:33:11 2021: Loading rules from file /etc/falco/k8s_audit_rules.yaml:
Thu Sep 16 06:33:12 2021: Starting internal webserver, listening on port 8765
06:33:17.382603204: Error Package management process launched in container (user=root user_loginuid=-1 command=apk container_id=7a5ea6a080d1 container_name=nginx image=docker.io/library/nginx:1.19.2-alpine)
...
```

We can see that rule files are loaded and logs printed afterwards.

#### Create logs in correct format

The task requires us to store logs for "unwanted package management processes" in format time,container-id,container-name,user-name. The output from falco shows entries for "Error Package management process launched" in a default format. Let's find the proper file that contains the rule and change it:

```
➜ root@cluster1-node1:~# cd /etc/falco/

➜ root@cluster1-node1:/etc/falco# grep -r "Package management process launched" .
./falco_rules.yaml:      Package management process launched in container (user=%user.name user_loginuid=%user.loginuid

➜ root@cluster1-node1:/etc/falco# cp falco_rules.yaml falco_rules.yaml_ori

➜ root@cluster1-node1:/etc/falco# vim falco_rules.yaml
```

Find the rule which looks like this:

```yaml
# Container is supposed to be immutable. Package management should be done in building the image.
- rule: Launch Package Management Process in Container
  desc: Package management process ran inside container
  condition: >
    spawned_process
    and container
    and user.name != "_apt"
    and package_mgmt_procs
    and not package_mgmt_ancestor_procs
    and not user_known_package_manager_in_container
  output: >
    Package management process launched in container (user=%user.name user_loginuid=%user.loginuid
    command=%proc.cmdline container_id=%container.id container_name=%container.name image=%container.image.repository:%container.image.tag)
  priority: ERROR
  tags: [process, mitre_persistence]
```

It should be changed into the required format (overriding this rule in the local rules file would be even cleaner):

```yaml
# Container is supposed to be immutable. Package management should be done in building the image.
- rule: Launch Package Management Process in Container
  desc: Package management process ran inside container
  condition: >
    spawned_process
    and container
    and user.name != "_apt"
    and package_mgmt_procs
    and not package_mgmt_ancestor_procs
    and not user_known_package_manager_in_container
  output: >
    Package management process launched in container
    %evt.time,%container.id,%container.name,%user.name
  priority: ERROR
  tags: [process, mitre_persistence]
```

For all available fields we can check https://falco.org/docs/rules/supported-fields, which should be allowed to open during the exam.
Next we check the logs in our adjusted format:

```
➜ root@cluster1-node1:/etc/falco# falco | grep "Package management"
06:38:28.077150666: Error Package management process launched in container 06:38:28.077150666,090aad374a0a,nginx,root
06:38:33.058263010: Error Package management process launched in container 06:38:33.058263010,090aad374a0a,nginx,root
06:38:38.068693625: Error Package management process launched in container 06:38:38.068693625,090aad374a0a,nginx,root
06:38:43.066159360: Error Package management process launched in container 06:38:43.066159360,090aad374a0a,nginx,root
06:38:48.059792139: Error Package management process launched in container 06:38:48.059792139,090aad374a0a,nginx,root
06:38:53.063328933: Error Package management process launched in container 06:38:53.063328933,090aad374a0a,nginx,root
```

This looks much better. Copy & paste the output into file /opt/course/2/falco.log on your main terminal. The content should be cleaned like this:

```
# /opt/course/2/falco.log
06:38:28.077150666,090aad374a0a,nginx,root
06:38:33.058263010,090aad374a0a,nginx,root
06:38:38.068693625,090aad374a0a,nginx,root
06:38:43.066159360,090aad374a0a,nginx,root
06:38:48.059792139,090aad374a0a,nginx,root
06:38:53.063328933,090aad374a0a,nginx,root
06:38:58.070912841,090aad374a0a,nginx,root
06:39:03.069592140,090aad374a0a,nginx,root
06:39:08.064805371,090aad374a0a,nginx,root
06:39:13.078109098,090aad374a0a,nginx,root
06:39:18.065077287,090aad374a0a,nginx,root
06:39:23.061012151,090aad374a0a,nginx,root
```

For a few entries it should be fast to just clean it up manually. If there are larger amounts of entries, first redirect the raw logs to /opt/course/2/falco.log.dirty, then the following command is faster:

```bash
cat /opt/course/2/falco.log.dirty | cut -d" " -f 9 > /opt/course/2/falco.log
```

#### Local falco rules

There is also a file /etc/falco/falco_rules.local.yaml in which we can override existing default rules. This is a much cleaner solution for production.
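As a sketch of that cleaner approach, the same rule could be re-declared in the local file instead of editing falco_rules.yaml in place. This assumes the default load order, where falco_rules.local.yaml is read after falco_rules.yaml so a rule with the same name replaces the shipped one:

```yaml
# /etc/falco/falco_rules.local.yaml (sketch)
- rule: Launch Package Management Process in Container
  desc: Package management process ran inside container
  condition: >
    spawned_process
    and container
    and user.name != "_apt"
    and package_mgmt_procs
    and not package_mgmt_ancestor_procs
    and not user_known_package_manager_in_container
  output: >
    Package management process launched in container
    %evt.time,%container.id,%container.name,%user.name
  priority: ERROR
  tags: [process, mitre_persistence]
```

This keeps the shipped falco_rules.yaml untouched.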
Choose the faster way for you in the exam if nothing is specified in the task.

## Question 3: Apiserver Security

### Task

Task weight: 3%

Use context: `kubectl config use-context workload-prod`

You received a list from the DevSecOps team which performed a security investigation of the k8s cluster1 (workload-prod). The list states the following about the apiserver setup:

- Accessible through a NodePort Service

Change the apiserver setup so that it is:

- Only accessible through a ClusterIP Service

### Solution

In order to modify the parameters for the apiserver, we first ssh into the master node and check which parameters the apiserver process is running with:

```
➜ ssh cluster1-controlplane1

➜ root@cluster1-controlplane1:~# ps aux | grep kube-apiserver
root 27622 7.4 15.3 1105924 311788 ? Ssl 10:31 11:03 kube-apiserver --advertise-address=192.168.100.11 --allow-privileged=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key --etcd-servers=https://127.0.0.1:2379 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --kubernetes-service-node-port=31000 --proxy-client-cert- ...
```

We notice the following argument:

```
--kubernetes-service-node-port=31000
```

We can also check the Service and see it's of type NodePort:

```
➜ root@cluster1-controlplane1:~# kubectl get svc
NAME         TYPE       CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
kubernetes   NodePort   10.96.0.1    <none>        443:31000/TCP   5d2h
```

The apiserver runs as a static Pod, so we can edit the manifest.
But before we do this we also create a copy in case we mess things up:

```
➜ root@cluster1-controlplane1:~# cp /etc/kubernetes/manifests/kube-apiserver.yaml ~/3_kube-apiserver.yaml

➜ root@cluster1-controlplane1:~# vim /etc/kubernetes/manifests/kube-apiserver.yaml
```

We should remove the insecure settings:

```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 192.168.100.11:6443
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=192.168.100.11
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-bootstrap-token-auth=true
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    # comment this line out, or set it to 0
    # - --kubernetes-service-node-port=31000   # delete or set to 0
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
...
```

Once the changes are made, give the apiserver some time to start up again. Check the apiserver's Pod status and the process parameters:

```
➜ root@cluster1-controlplane1:~# kubectl -n kube-system get pod | grep apiserver
kube-apiserver-cluster1-controlplane1   1/1   Running   0   38s

➜ root@cluster1-controlplane1:~# ps aux | grep kube-apiserver | grep node-port
```

The apiserver got restarted without the insecure settings.
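To double-check that the flag is really gone from the manifest, a grep is enough. A self-contained sketch on a hypothetical temp file follows (the real path would be /etc/kubernetes/manifests/kube-apiserver.yaml; the sample content is illustrative):

```shell
# Write a sample of the edited manifest to a temp file and check that the
# NodePort flag no longer appears in it.
m=$(mktemp)
cat > "$m" <<'EOF'
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
EOF
if grep -q 'kubernetes-service-node-port' "$m"; then
  echo "flag still present"
else
  echo "flag removed"
fi
rm -f "$m"
```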
However, the Service kubernetes will still be of type NodePort:

```
➜ root@cluster1-controlplane1:~# kubectl get svc
NAME         TYPE       CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
kubernetes   NodePort   10.96.0.1    <none>        443:31000/TCP   5d3h
```

We need to delete the Service for the changes to take effect:

```
➜ root@cluster1-controlplane1:~# kubectl delete svc kubernetes
service "kubernetes" deleted
```

After a few seconds:

```
➜ root@cluster1-controlplane1:~# kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   6s
```

This should satisfy the DevSecOps team.

## Question 4: Pod Security Standard

### Task

Task weight: 8%

Use context: `kubectl config use-context workload-prod`

There is Deployment container-host-hacker in Namespace team-red which mounts /run/containerd as a hostPath volume on the Node where it's running. This means that the Pod can access various data about other containers running on the same Node.

To prevent this, configure Namespace team-red to enforce the baseline Pod Security Standard. Once completed, delete the Pod of the Deployment mentioned above.

Check the ReplicaSet events and write the event/log lines containing the reason why the Pod isn't recreated into /opt/course/4/logs.

### Solution

Making Namespaces use Pod Security Standards works via labels. We can simply edit it:

```bash
k edit ns team-red
```

Now we configure the requested label:

```yaml
# kubectl edit namespace team-red
apiVersion: v1
kind: Namespace
metadata:
  labels:
    kubernetes.io/metadata.name: team-red
    pod-security.kubernetes.io/enforce: baseline   # add
  name: team-red
...
```

This should already be enough for the default Pod Security Admission Controller to pick up on that change. Let's test it by deleting the Pod to see whether it'll be recreated or fail; it should fail!
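As an aside, instead of kubectl edit the same label could be applied declaratively; a sketch of what the Namespace manifest could look like (the label is the one from above, the rest of the manifest is illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-red
  labels:
    kubernetes.io/metadata.name: team-red
    pod-security.kubernetes.io/enforce: baseline
```

Applying this with `kubectl apply -f` would have the same effect as the edit.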
```
➜ k -n team-red get pod
NAME                                    READY   STATUS    RESTARTS   AGE
container-host-hacker-dbf989777-wm8fc   1/1     Running   0          115s

➜ k -n team-red delete pod container-host-hacker-dbf989777-wm8fc
pod "container-host-hacker-dbf989777-wm8fc" deleted

➜ k -n team-red get pod
No resources found in team-red namespace.
```

Usually the ReplicaSet of a Deployment would recreate the Pod if deleted, but here we see this doesn't happen. Let's check why:

```
➜ k -n team-red get rs
NAME                              DESIRED   CURRENT   READY   AGE
container-host-hacker-dbf989777   1         0         0       5m25s

➜ k -n team-red describe rs container-host-hacker-dbf989777
Name:           container-host-hacker-dbf989777
Namespace:      team-red
...
Events:
  Type     Reason        Age                   From                   Message
  ----     ------        ----                  ----                   -------
  ...
  Warning  FailedCreate  2m41s                 replicaset-controller  Error creating: pods "container-host-hacker-dbf989777-bjwgv" is forbidden: violates PodSecurity "baseline:latest": hostPath volumes (volume "containerdata")
  Warning  FailedCreate  2m2s (x9 over 2m40s)  replicaset-controller  (combined from similar events): Error creating: pods "container-host-hacker-dbf989777-kjfpn" is forbidden: violates PodSecurity "baseline:latest": hostPath volumes (volume "containerdata")
```

There we go! Finally we write the reason into the requested file so that Mr Scoring will be happy too!

```
# /opt/course/4/logs
Warning  FailedCreate  2m2s (x9 over 2m40s)  replicaset-controller  (combined from similar events): Error creating: pods "container-host-hacker-dbf989777-kjfpn" is forbidden: violates PodSecurity "baseline:latest": hostPath volumes (volume "containerdata")
```

Pod Security Standards can give a great base level of security! But when one finds themselves wanting to adjust the levels like baseline or restricted more deeply, this isn't possible, and 3rd-party solutions like OPA could be looked at.

## Question 5: CIS Benchmark

### Task

Task weight: 3%

Use context: `kubectl config use-context infra-prod`

You're asked to evaluate specific settings of cluster2 against the CIS Benchmark recommendations.
Use the tool kube-bench, which is already installed on the nodes. Connect using `ssh cluster2-controlplane1` and `ssh cluster2-node1`.

On the master node ensure (correct if necessary) that the CIS recommendations are set for:

1. The `--profiling` argument of the kube-controller-manager
2. The ownership of directory /var/lib/etcd

On the worker node ensure (correct if necessary) that the CIS recommendations are set for:

3. The permissions of the kubelet configuration /var/lib/kubelet/config.yaml
4. The `--client-ca-file` argument of the kubelet

### Solution

#### Number 1

First we ssh into the master node and run kube-bench against the master components:

```
➜ ssh cluster2-controlplane1

➜ root@cluster2-controlplane1:~# kube-bench run --targets=master
...
== Summary ==
41 checks PASS
13 checks FAIL
11 checks WARN
0 checks INFO
```

We see some passes, fails and warnings. Let's check the required task (1) of the controller manager:

```
➜ root@cluster2-controlplane1:~# kube-bench run --targets=master | grep kube-controller -A 3
1.3.1 Edit the Controller Manager pod specification file /etc/kubernetes/manifests/kube-controller-manager.yaml
on the master node and set the --terminated-pod-gc-threshold to an appropriate threshold,
for example: --terminated-pod-gc-threshold=10
--
1.3.2 Edit the Controller Manager pod specification file /etc/kubernetes/manifests/kube-controller-manager.yaml
on the master node and set the below parameter.
--profiling=false

1.3.6 Edit the Controller Manager pod specification file /etc/kubernetes/manifests/kube-controller-manager.yaml
on the master node and set the --feature-gates parameter to include RotateKubeletServerCertificate=true.
--feature-gates=RotateKubeletServerCertificate=true
```

There we see 1.3.2, which suggests setting --profiling=false, so we obey:

```
➜ root@cluster2-controlplane1:~# vim /etc/kubernetes/manifests/kube-controller-manager.yaml
```

Edit the corresponding line:

```yaml
# /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.244.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    - --node-cidr-mask-size=24
    - --port=0
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true
    - --profiling=false   # add
...
```
We wait for the Pod to restart, then run kube-bench again to check if the problem was solved:

```
➜ root@cluster2-controlplane1:~# kube-bench run --targets=master | grep kube-controller -A 3
1.3.1 Edit the Controller Manager pod specification file /etc/kubernetes/manifests/kube-controller-manager.yaml
on the master node and set the --terminated-pod-gc-threshold to an appropriate threshold,
for example: --terminated-pod-gc-threshold=10
--
1.3.6 Edit the Controller Manager pod specification file /etc/kubernetes/manifests/kube-controller-manager.yaml
on the master node and set the --feature-gates parameter to include RotateKubeletServerCertificate=true.
--feature-gates=RotateKubeletServerCertificate=true
```

Problem solved and 1.3.2 is passing:

```
➜ root@cluster2-controlplane1:~# kube-bench run --targets=master | grep 1.3.2
[PASS] 1.3.2 Ensure that the --profiling argument is set to false (Scored)
```

#### Number 2

The next task (2) is to check the ownership of directory /var/lib/etcd, so we first have a look:

```
➜ root@cluster2-controlplane1:~# ls -lh /var/lib | grep etcd
drwx------  3 root root 4.0K Sep 11 20:08 etcd
```

Looks like user root and group root. It's also possible to check using:

```
➜ root@cluster2-controlplane1:~# stat -c %U:%G /var/lib/etcd
root:root
```

But what does kube-bench have to say about this?

```
➜ root@cluster2-controlplane1:~# kube-bench run --targets=master | grep "/var/lib/etcd" -B5
1.1.12 On the etcd server node, get the etcd data directory, passed as an argument --data-dir,
from the below command:
ps -ef | grep etcd
Run the below command (based on the etcd data directory found above).
For example, chown etcd:etcd /var/lib/etcd
```

To comply we run the following:

```
➜ root@cluster2-controlplane1:~# chown etcd:etcd /var/lib/etcd

➜ root@cluster2-controlplane1:~# ls -lh /var/lib | grep etcd
drwx------  3 etcd etcd 4.0K Sep 11 20:08 etcd
```

This looks better. We run kube-bench again and make sure test 1.1.12 is passing.
```
➜ root@cluster2-controlplane1:~# kube-bench run --targets=master | grep 1.1.12
[PASS] 1.1.12 Ensure that the etcd data directory ownership is set to etcd:etcd (Scored)
```

Done.

#### Number 3

To continue with number (3), we'll head to the worker node and ensure that the kubelet configuration file has the minimum necessary permissions as recommended:

```
➜ ssh cluster2-node1

➜ root@cluster2-node1:~# kube-bench run --targets=node
...
== Summary ==
13 checks PASS
10 checks FAIL
2 checks WARN
0 checks INFO
```

Also here some passes, fails and warnings. We check the permission level of the kubelet config file:

```
➜ root@cluster2-node1:~# stat -c %a /var/lib/kubelet/config.yaml
777
```

777 is a highly permissive access level and not recommended by the kube-bench guidelines:

```
➜ root@cluster2-node1:~# kube-bench run --targets=node | grep /var/lib/kubelet/config.yaml -B2
4.1.9 Run the following command (using the config file location identified in the Audit step)
chmod 644 /var/lib/kubelet/config.yaml
```

We obey and set the recommended permissions:

```
➜ root@cluster2-node1:~# chmod 644 /var/lib/kubelet/config.yaml

➜ root@cluster2-node1:~# stat -c %a /var/lib/kubelet/config.yaml
644
```

And check that test 4.1.9 is passing:

```
➜ root@cluster2-node1:~# kube-bench run --targets=node | grep 4.1.9
[PASS] 4.1.9 Ensure that the kubelet configuration file has permissions set to 644 or more restrictive (Scored)
```

#### Number 4

Finally, for number (4), let's check whether the `--client-ca-file` argument for the kubelet is set properly according to the kube-bench recommendations:

```
➜ root@cluster2-node1:~# kube-bench run --targets=node | grep client-ca-file
[PASS] 4.2.3 Ensure that the --client-ca-file argument is set as appropriate (Automated)
```

This looks passing with 4.2.3. The other checks are about the file that the parameter points to and can be ignored here.

To further investigate we run the following command to locate the kubelet config file, and open it:

```
➜ root@cluster2-node1:~# ps -ef | grep kubelet
root  5157     1  2 20:28 ?        00:03:22 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.2
root 19940 11901  0 22:38 pts/0    00:00:00 grep --color=auto kubelet

➜ root@cluster2-node1:~# vim /var/lib/kubelet/config.yaml
```

```yaml
# /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
...
```

The clientCAFile points to the location of the certificate, which is correct.

## Question 6: Verify Platform Binaries

### Task

Task weight: 2%

(can be solved in any kubectl context)

There are four Kubernetes server binaries located at /opt/course/6/binaries. You're provided with the following verified sha512 values for these:

```
kube-apiserver
f417c0555bc0167355589dd1afe23be9bf909bf98312b1025f12015d1b58a1c62c9908c0067a7764fa35efdac7016a9efa8711a44425dd6692906a7c283f032c

kube-controller-manager
60100cc725e91fe1a949e1b2d0474237844b5862556e25c2c655a33boa8225855ec5ee22fa4927e6c46a60d43a7c4403a27268f96fbb726307d1608b44f38a60

kube-proxy
52f9d8ad045f8eee1d689619ef8ceef2d86d50c75a6a332653240d7ba5b2a114aca056d9e513984ade24358c9662714973c1960c62a5cb37dd375631c8a614c6

kubelet
4be40f2440619e990897cf956c32800dc96c2c983bf64519854a3309fa5aa21827991559f9c44595098e27e6f2ee4d64a3fdec6baba8a177881f20e3ec61e26c
```

Delete those binaries that don't match the sha512 values above.
### Solution

We check the directory:

```
➜ cd /opt/course/6/binaries

➜ ls
kube-apiserver  kube-controller-manager  kube-proxy  kubelet
```

To generate the sha512 sum of a binary we do:

```
➜ sha512sum kube-apiserver
f417c0555bc0167355589dd1afe23be9bf909bf98312b1025f12015d1b58a1c62c9908c0067a7764fa35efdac7016a9efa8711a44425dd6692906a7c283f032c  kube-apiserver
```

Looking good, next:

```
➜ sha512sum kube-controller-manager
60100cc725e91fe1a949e1b2d0474237844b5862556e25c2c655a33b0a8225855ec5ee22fa4927e6c46a60d43a7c4403a27268f96fbb726307d1608b44f38a60  kube-controller-manager
```

Okay, next:

```
➜ sha512sum kube-proxy
52f9d8ad045f8eee1d689619ef8ceef2d86d50c75a6a332653240d7ba5b2a114aca056d9e513984ade24358c9662714973c1960c62a5cb37dd375631c8a614c6  kube-proxy
```

Also good, and finally:

```
➜ sha512sum kubelet
7b720598e6a3483b45c537b57d759e3e82bc5c53b3274f681792f62e941019cde3d51a7f9b55158abf3810d506146bc0aa7cf97b36f27f341028a54431b335be  kubelet
```

Catch! Binary kubelet has a different hash!

But did we actually compare everything properly before? Let's have a closer look at kube-controller-manager again:

```
➜ sha512sum kube-controller-manager > compare

➜ vim compare
```

Edit the file so it contains only the provided hash and the generated one, one per line:

```
# ./compare
60100cc725e91fe1a949e1b2d0474237844b5862556e25c2c655a33b0a8225855ec5ee22fa4927e6c46a60d43a7c4403a27268f96fbb726307d1608b44f38a60
60100cc725e91fe1a949e1b2d0474237844b5862556e25c2c655a33boa8225855ec5ee22fa4927e6c46a60d43a7c4403a27268f96fbb726307d1608b44f38a60
```

Looks right at a first glance, but if we do:

```
➜ cat compare | uniq
60100cc725e91fe1a949e1b2d0474237844b5862556e25c2c655a33b0a8225855ec5ee22fa4927e6c46a60d43a7c4403a27268f96fbb726307d1608b44f38a60
60100cc725e91fe1a949e1b2d0474237844b5862556e25c2c655a33boa8225855ec5ee22fa4927e6c46a60d43a7c4403a27268f96fbb726307d1608b44f38a60
```

This shows they are different, by just one character actually (a letter 'o' where the real hash has a digit '0').
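Instead of comparing hashes by eye (or with uniq), sha512sum can do the comparison itself via its -c flag. A self-contained sketch on a throwaway file follows; with the real binaries in /opt/course/6/binaries you would paste the provided hash instead of computing it:

```shell
# Create a sample file, compute its hash, then let sha512sum verify it.
# Note the two spaces between hash and filename, which the checksum
# format requires.
f=$(mktemp)
printf 'sample-binary-content' > "$f"
expected=$(sha512sum "$f" | cut -d" " -f1)
echo "$expected  $f" | sha512sum -c -    # prints "<file>: OK" on a match
rm -f "$f"
```

A mismatching hash would instead produce "<file>: FAILED" and a non-zero exit status, so the bad binaries stand out immediately.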
To complete the task we do:

```bash
rm kubelet kube-controller-manager
```

## Question 7: Open Policy Agent

### Task

Task weight: 6%

Use context: `kubectl config use-context infra-prod`

The Open Policy Agent and Gatekeeper have been installed to, among other things, enforce blacklisting of certain image registries. Alter the existing constraint and/or template to also blacklist images from very-bad-registry.com.

Test it by creating a single Pod using image very-bad-registry.com/image in Namespace default; it shouldn't work.

You can also verify your changes by looking at the existing Deployment untrusted in Namespace default, which uses an image from the new untrusted source. The OPA constraint should throw violation messages for this one.

### Solution

We look at the existing OPA constraints; these are implemented using CRDs by Gatekeeper:

```
➜ k get crd
NAME                                                 CREATED AT
blacklistimages.constraints.gatekeeper.sh            2020-09-14T19:29:31Z
configs.config.gatekeeper.sh                         2020-09-14T19:29:04Z
constraintpodstatuses.status.gatekeeper.sh           2020-09-14T19:29:05Z
constrainttemplatepodstatuses.status.gatekeeper.sh   2020-09-14T19:29:05Z
constrainttemplates.templates.gatekeeper.sh          2020-09-14T19:29:05Z
requiredlabels.constraints.gatekeeper.sh             2020-09-14T19:29:31Z
```

So we can do:

```
➜ k get constraint
NAME                                                           AGE
blacklistimages.constraints.gatekeeper.sh/pod-trusted-images   10m

NAME                                                                  AGE
requiredlabels.constraints.gatekeeper.sh/namespace-mandatory-labels   10m
```

and then look at the one that is probably about blacklisting images:

```bash
k edit blacklistimages pod-trusted-images
```

```yaml
# kubectl edit blacklistimages pod-trusted-images
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: BlacklistImages
metadata:
...
spec:
  match:
    kinds:
    - apiGroups:
      - ""
      kinds:
      - Pod
```

It looks like this constraint simply applies the template to all Pods, no arguments passed. So we edit the template:

```bash
k edit constrainttemplates blacklistimages
```

```yaml
# kubectl edit constrainttemplates blacklistimages
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
...
spec:
  crd:
    spec:
      names:
        kind: BlacklistImages
  targets:
  - rego: |
      package k8strustedimages

      images {
        image := input.review.object.spec.containers[_].image
        not startswith(image, "docker-fake.io/")
        not startswith(image, "google-gcr-fake.com/")
        not startswith(image, "very-bad-registry.com/")   # ADD THIS LINE
      }

      violation[{"msg": msg}] {
        not images
        msg := "not trusted image!"
      }
    target: admission.k8s.gatekeeper.sh
```

We simply have to add another line. After editing we try to create a Pod of the bad image:

```
➜ k run opa-test --image=very-bad-registry.com/image
Error from server ([denied by pod-trusted-images] not trusted image!): admission webhook "validation.gatekeeper.sh" denied the request: [denied by pod-trusted-images] not trusted image!
```

Nice! After some time we can also see that Pods of the existing Deployment "untrusted" will be listed as violators:

```
➜ k describe blacklistimages pod-trusted-images
...
  Total Violations:  2
  Violations:
    Enforcement Action:  deny
    Kind:                Namespace
    Message:             you must provide labels: {"security-level"}
    Name:                sidecar-injector
    Enforcement Action:  deny
    Kind:                Pod
    Message:             not trusted image!
    Name:                untrusted-68c4944d48-tfsnb
    Namespace:           default
Events:                  <none>
```

Great, OPA fights bad registries!

## Question 8: Secure Kubernetes Dashboard

### Task

Task weight: 3%

Use context: `kubectl config use-context workload-prod`

The Kubernetes Dashboard is installed in Namespace kubernetes-dashboard and is configured to:

- Allow users to "skip login"
- Allow insecure access (HTTP without authentication)
- Allow basic authentication
- Allow access from outside the cluster

You are asked to make it more secure by:

- Denying users the ability to "skip login"
- Denying insecure access, enforcing HTTPS (self-signed certificates are ok for now)
- Adding the --auto-generate-certificates argument
- Enforcing authentication using a token (with the possibility to use RBAC)
- Allowing only cluster-internal access

### Solution

Head to https://github.com/kubernetes/dashboard/tree/master/docs to find documentation about the dashboard.
This link is not on the allowed list of URLs during the real exam. This means you should be provided with all information necessary in case of a task like this. First we have a look in Namespace kubernetes-dashboard: ➜ k -n kubernetes-dashboard get pod,svc NAME READY STATUS RESTARTS AGE pod/dashboard-metrics-scraper-7b59f7d4df-fbpd9 1/1 Running 0 24m pod/kubernetes-dashboard-6d8cd5dd84-w7wr2 1/1 Running 0 24m NAME TYPE ... PORT(S) AGE service/dashboard-metrics-scraper ClusterIP ... 8000/TCP 24m service/kubernetes-dashboard NodePort ... 9090:32520/TCP,443:31206/TCP 24m We can see one running Pod and a NodePort Service exposing it. Let's try to connect to it via a NodePort, we can use the IP of any Node: (your port might be different) ➜ k get node -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP ... cluster1-controlplane1 Ready master 37m v1.28.2 192.168.100.11 ... cluster1-node1 Ready <none> 36m v1.28.2 192.168.100.12 ... cluster1-node2 Ready <none> 34m v1.28.2 192.168.100.13 ... ➜ curl http://192.168.100.11:32520 <!-- Copyright 2017 The Kubernetes Authors. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 The dashboard is not secured because it allows insecure HTTP access without authentication and is exposed externally. It is loaded with a few parameters making it insecure, let's fix this.
First we create a backup in case we need to undo something: k -n kubernetes-dashboard get deploy kubernetes-dashboard -oyaml > 8_deploy_kubernetes-dashboard.yaml Then: k -n kubernetes-dashboard edit deploy kubernetes-dashboard The changes to make are : template: spec: containers: - args: - --namespace=kubernetes-dashboard - --authentication-mode=token # change or delete, "token" is default - --auto-generate-certificates # add #- --enable-skip-login=true # delete or set to false #- --enable-insecure-login # delete image: kubernetesui/dashboard:v2.0.3 imagePullPolicy: Always name: kubernetes-dashboard Next, we'll have to deal with the NodePort Service: k -n kubernetes-dashboard get svc kubernetes-dashboard -o yaml > 8_svc_kubernetes-dashboard.yaml # backup k -n kubernetes-dashboard edit svc kubernetes-dashboard And make the following changes: spec: clusterIP: 10.107.176.19 externalTrafficPolicy: Cluster # delete internalTrafficPolicy: Cluster ports: - name: http nodePort: 32513 # delete port: 9090 protocol: TCP targetPort: 9090 - name: https nodePort: 32441 # delete port: 443 protocol: TCP targetPort: 8443 selector: k8s-app: kubernetes-dashboard sessionAffinity: None type: ClusterIP # change or delete status: loadBalancer: {} Let's confirm the changes, we can do that even without having a browser: ➜ k run tmp --image=nginx:1.19.2 --restart=Never --rm -it -- bash If you don't see a command prompt, try pressing enter. root@tmp:/# curl http://kubernetes-dashboard.kubernetes-dashboard:9090 curl: (7) Failed to connect to kubernetes-dashboard.kubernetes-dashboard port 9090: Connection refused ➜ root@tmp:/# curl https://kubernetes-dashboard.kubernetes-dashboard curl: (60) SSL certificate problem: self signed certificate More details here: https://curl.haxx.se/docs/sslcerts.html curl failed to verify the legitimacy of the server and therefore could not establish a secure connection to it. 
To learn more about this situation and how to fix it, please visit the web page mentioned above. ➜ root@tmp:/# curl https://kubernetes-dashboard.kubernetes-dashboard -k <!-- Copyright 2017 The Kubernetes Authors. We see that insecure access is disabled and HTTPS works (using a self signed certificate for now). Let's also check the remote access: (your port might be different) ➜ curl http://192.168.100.11:32520 curl: (7) Failed to connect to 192.168.100.11 port 32520: Connection refused ➜ k -n kubernetes-dashboard get svc NAME TYPE CLUSTER-IP ... PORT(S) dashboard-metrics-scraper ClusterIP 10.111.171.247 ... 8000/TCP kubernetes-dashboard ClusterIP 10.100.118.128 ... 9090/TCP,443/TCP Much better. 第九题: AppArmor Profile Task weight: 3% Use context: kubectl config use-context workload-prod Some containers need to run more securely and restricted. There is an existing AppArmor profile located at /opt/course/9/profile for this. Install the AppArmor profile on Node cluster1-node1. Connect using ssh cluster1-node1. Add label security=apparmor to the Node Create a Deployment named apparmor in Namespace default with: One replica of image nginx:1.19.2 NodeSelector for security=apparmor Single container named c1 with the AppArmor profile enabled The Pod might not run properly with the profile enabled. Write the logs of the Pod into /opt/course/9/logs so another team can work on getting the application running. 解答 Reference: https://kubernetes.io/docs/tutorials/clusters/apparmor Part 1 First we have a look at the provided profile: vim /opt/course/9/profile # /opt/course/9/profile #include <tunables/global> profile very-secure flags=(attach_disconnected) { #include <abstractions/base> file, # Deny all file writes. deny /** w, } Very simple profile named very-secure which denies all file writes. Next we copy it onto the Node: ➜ scp /opt/course/9/profile cluster1-node1:~/ Warning: Permanently added the ECDSA host key for IP address '192.168.100.12' to the list of known hosts.
profile 100% 161 329.9KB/s 00:00 ➜ ssh cluster1-node1 ➜ root@cluster1-node1:~# ls profile And install it: ➜ root@cluster1-node1:~# apparmor_parser -q ./profile Verify it has been installed: ➜ root@cluster1-node1:~# apparmor_status apparmor module is loaded. 17 profiles are loaded. 17 profiles are in enforce mode. /sbin/dhclient ... man_filter man_groff very-secure 0 profiles are in complain mode. 56 processes have profiles defined. 56 processes are in enforce mode. ... 0 processes are in complain mode. 0 processes are unconfined but have a profile defined. There we see among many others the very-secure one, which is the name of the profile specified in /opt/course/9/profile. Part 2 We label the Node: k label -h # show examples k label node cluster1-node1 security=apparmor Part 3 Now we can go ahead and create the Deployment which uses the profile. k create deploy apparmor --image=nginx:1.19.2 $do > 9_deploy.yaml vim 9_deploy.yaml # 9_deploy.yaml apiVersion: apps/v1 kind: Deployment metadata: creationTimestamp: null labels: app: apparmor name: apparmor namespace: default spec: replicas: 1 selector: matchLabels: app: apparmor strategy: {} template: metadata: creationTimestamp: null labels: app: apparmor annotations: # add container.apparmor.security.beta.kubernetes.io/c1: localhost/very-secure # add spec: nodeSelector: # add security: apparmor # add containers: - image: nginx:1.19.2 name: c1 # change resources: {} k -f 9_deploy.yaml create What's the damage? ➜ k get pod -owide | grep apparmor apparmor-85c65645dc-jbch8 0/1 CrashLoopBackOff ...
cluster1-node1 ➜ k logs apparmor-85c65645dc-w852p /docker-entrypoint.sh: 13: /docker-entrypoint.sh: cannot create /dev/null: Permission denied /docker-entrypoint.sh: No files found in /docker-entrypoint.d/, skipping configuration 2021/09/15 11:51:57 [emerg] 1#1: mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied) nginx: [emerg] mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied) This looks alright, the Pod is running on cluster1-node1 because of the nodeSelector. The AppArmor profile simply denies all filesystem writes, but Nginx needs to write into some locations to run, hence the errors. It looks like our profile is running but we can confirm this as well by inspecting the container: ➜ ssh cluster1-node1 ➜ root@cluster1-node1:~# crictl pods | grep apparmor be5c0aecee7c7 4 minutes ago Ready apparmor-85c65645dc-jbch8 ... ➜ root@cluster1-node1:~# crictl ps -a | grep be5c0aecee7c7 e4d91cbdf72fb ... Exited c1 6 be5c0aecee7c7 ➜ root@cluster1-node1:~# crictl inspect e4d91cbdf72fb | grep -i profile "apparmor_profile": "localhost/very-secure", "apparmorProfile": "very-secure", First we find the Pod by its name and get the pod-id. Next we use crictl ps -a to also show stopped containers. Then crictl inspect shows that the container is using our AppArmor profile. Note to be fast between ps and inspect, as K8s will restart the Pod periodically when in error state. To complete the task we write the logs into the required location: k logs apparmor-85c65645dc-jbch8 > /opt/course/9/logs Fixing the errors is the job of another team, lucky us. 第十题: Container Runtime Sandbox gVisor 问题 Task weight: 4% Use context: kubectl config use-context workload-prod Team purple wants to run some of their workloads more securely. Worker node cluster1-node2 has container engine containerd already installed and it's configured to support the runsc/gvisor runtime. Create a RuntimeClass named gvisor with handler runsc.
Create a Pod that uses the RuntimeClass. The Pod should be in Namespace team-purple, named gvisor-test and of image nginx:1.19.2. Make sure the Pod runs on cluster1-node2. Write the dmesg output of the successfully started Pod into /opt/course/10/gvisor-test-dmesg. 解答 We check the nodes and we can see that all are using containerd: ➜ k get node -o wide NAME STATUS ROLES ... CONTAINER-RUNTIME cluster1-controlplane1 Ready control-plane ... containerd://1.5.2 cluster1-node1 Ready <none> ... containerd://1.5.2 cluster1-node2 Ready <none> ... containerd://1.5.2 But just one has containerd configured to work with runsc/gvisor runtime which is cluster1-node2. (Optionally) we ssh into the worker node and check if containerd+runsc is configured: ➜ ssh cluster1-node2 ➜ root@cluster1-node2:~# runsc --version runsc version release-20201130.0 spec: 1.0.1-dev ➜ root@cluster1-node2:~# cat /etc/containerd/config.toml | grep runsc [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runsc] runtime_type = "io.containerd.runsc.v1" Now we best head to the k8s docs for RuntimeClasses https://kubernetes.io/docs/concepts/containers/runtime-class, steal an example and create the gvisor one: vim 10_rtc.yaml # 10_rtc.yaml apiVersion: node.k8s.io/v1 kind: RuntimeClass metadata: name: gvisor handler: runsc k -f 10_rtc.yaml create And the required Pod: k -n team-purple run gvisor-test --image=nginx:1.19.2 $do > 10_pod.yaml vim 10_pod.yaml # 10_pod.yaml apiVersion: v1 kind: Pod metadata: creationTimestamp: null labels: run: gvisor-test name: gvisor-test namespace: team-purple spec: nodeName: cluster1-node2 # add runtimeClassName: gvisor # add containers: - image: nginx:1.19.2 name: gvisor-test resources: {} dnsPolicy: ClusterFirst restartPolicy: Always status: {} k -f 10_pod.yaml create After creating the pod we should check if it's running and if it uses the gvisor sandbox: ➜ k -n team-purple get pod gvisor-test NAME READY STATUS RESTARTS AGE gvisor-test 1/1 Running 0 30s ➜ k -n 
team-purple exec gvisor-test -- dmesg [ 0.000000] Starting gVisor... [ 0.417740] Checking naughty and nice process list... [ 0.623721] Waiting for children... [ 0.902192] Gathering forks... [ 1.258087] Committing treasure map to memory... [ 1.653149] Generating random numbers by fair dice roll... [ 1.918386] Creating cloned children... [ 2.137450] Digging up root... [ 2.369841] Forking spaghetti code... [ 2.840216] Rewriting operating system in Javascript... [ 2.956226] Creating bureaucratic processes... [ 3.329981] Ready! Looking good. And as required we finally write the dmesg output into the file: k -n team-purple exec gvisor-test > /opt/course/10/gvisor-test-dmesg -- dmesg 第十一题: Secrets in ETCD 问题 Task weight: 7% Use context: kubectl config use-context workload-prod There is an existing Secret called database-access in Namespace team-green. Read the complete Secret content directly from ETCD (using etcdctl) and store it into /opt/course/11/etcd-secret-content. Write the plain and decoded Secret's value of key "pass" into /opt/course/11/database-password. 解答 Let's try to get the Secret value directly from ETCD, which will work since it isn't encrypted. First, we ssh into the master node where ETCD is running in this setup and check if etcdctl is installed and list its options: ➜ ssh cluster1-controlplane1 ➜ root@cluster1-controlplane1:~# etcdctl NAME: etcdctl - A simple command line client for etcd. WARNING: Environment variable ETCDCTL_API is not set; defaults to etcdctl v2. Set environment variable ETCDCTL_API=3 to use v3 API or ETCDCTL_API=2 to use v2 API. USAGE: etcdctl [global options] command [command options] [arguments...] ... --cert-file value identify HTTPS client using this SSL certificate file --key-file value identify HTTPS client using this SSL key file --ca-file value verify certificates of HTTPS-enabled servers using this CA bundle ... Among others we see arguments to identify ourselves.
The apiserver connects to ETCD, so we can run the following command to get the path of the necessary .crt and .key files: cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep etcd The output is as follows: - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key - --etcd-servers=https://127.0.0.1:2379 # optional since we're on same node With this information we query ETCD for the secret value: ➜ root@cluster1-controlplane1:~# ETCDCTL_API=3 etcdctl \ --cert /etc/kubernetes/pki/apiserver-etcd-client.crt \ --key /etc/kubernetes/pki/apiserver-etcd-client.key \ --cacert /etc/kubernetes/pki/etcd/ca.crt get /registry/secrets/team-green/database-access ETCD in Kubernetes stores data under /registry/{type}/{namespace}/{name}. This is how we came to look for /registry/secrets/team-green/database-access. There is also an example on a page in the k8s documentation which you could save as a bookmark to access quickly during the exam. The task requires us to store the output on our terminal. For this we can simply copy & paste the content into a new file on our terminal: # /opt/course/11/etcd-secret-content /registry/secrets/team-green/database-access k8s v1Secret database-access team-green"*$3e0acd78-709d-4f07-bdac-d5193d0f2aa32bB 0kubectl.kubernetes.io/last-applied-configuration{"apiVersion":"v1","data":{"pass":"Y29uZmlkZW50aWFs"},"kind":"Secret","metadata":{"annotations":{},"name":"database-access","namespace":"team-green"}} z kubectl-client-side-applyUpdatevFieldsV1: {"f:data":{".":{},"f:pass":{}},"f:metadata":{"f:annotations":{".":{},"f:kubectl.kubernetes.io/last-applied-configuration":{}}},"f:type":{}} pass confidentialOpaque" We're also required to store the plain and "decrypted" database password.
For this we can copy the base64-encoded value from the ETCD output and run on our terminal: ➜ echo Y29uZmlkZW50aWFs | base64 -d > /opt/course/11/database-password ➜ cat /opt/course/11/database-password confidential 第十二题: Hack Secrets 问题 Task weight: 8% Use context: kubectl config use-context restricted@infra-prod You're asked to investigate a possible permission escape in Namespace restricted. The context authenticates as user restricted which has only limited permissions and shouldn't be able to read Secret values. Try to find the password-key values of the Secrets secret1, secret2 and secret3 in Namespace restricted. Write the decoded plaintext values into files /opt/course/12/secret1, /opt/course/12/secret2 and /opt/course/12/secret3. 解答 First we should explore the boundaries, we can try: ➜ k -n restricted get role,rolebinding,clusterrole,clusterrolebinding Error from server (Forbidden): roles.rbac.authorization.k8s.io is forbidden: User "restricted" cannot list resource "roles" in API group "rbac.authorization.k8s.io" in the namespace "restricted" Error from server (Forbidden): rolebindings.rbac.authorization.k8s.io is forbidden: User "restricted" cannot list resource "rolebindings" in API group "rbac.authorization.k8s.io" in the namespace "restricted" Error from server (Forbidden): clusterroles.rbac.authorization.k8s.io is forbidden: User "restricted" cannot list resource "clusterroles" in API group "rbac.authorization.k8s.io" at the cluster scope Error from server (Forbidden): clusterrolebindings.rbac.authorization.k8s.io is forbidden: User "restricted" cannot list resource "clusterrolebindings" in API group "rbac.authorization.k8s.io" at the cluster scope But no permissions to view RBAC resources. 
So we try the obvious: ➜ k -n restricted get secret Error from server (Forbidden): secrets is forbidden: User "restricted" cannot list resource "secrets" in API group "" in the namespace "restricted" ➜ k -n restricted get secret -o yaml apiVersion: v1 items: [] kind: List metadata: resourceVersion: "" selfLink: "" Error from server (Forbidden): secrets is forbidden: User "restricted" cannot list resource "secrets" in API group "" in the namespace "restricted" We're not allowed to get or list any Secrets. What can we see though? ➜ k -n restricted get all NAME READY STATUS RESTARTS AGE pod1-fd5d64b9c-pcx6q 1/1 Running 0 37s pod2-6494f7699b-4hks5 1/1 Running 0 37s pod3-748b48594-24s76 1/1 Running 0 37s Error from server (Forbidden): replicationcontrollers is forbidden: User "restricted" cannot list resource "replicationcontrollers" in API group "" in the namespace "restricted" Error from server (Forbidden): services is forbidden: User "restricted" cannot list resource "services" in API group "" in the namespace "restricted" ... There are some Pods, let's check these out regarding Secret access: k -n restricted get pod -o yaml | grep -i secret This output provides us with enough information to do: ➜ k -n restricted exec pod1-fd5d64b9c-pcx6q -- cat /etc/secret-volume/password you-are ➜ echo you-are > /opt/course/12/secret1 And for the second Secret: ➜ k -n restricted exec pod2-6494f7699b-4hks5 -- env | grep PASS PASSWORD=an-amazing ➜ echo an-amazing > /opt/course/12/secret2 None of the Pods seem to mount secret3 though. Can we create or edit existing Pods to mount secret3? ➜ k -n restricted run test --image=nginx Error from server (Forbidden): pods is forbidden: User "restricted" cannot create resource "pods" in API group "" in the namespace "restricted" ➜ k -n restricted delete pod pod1 Error from server (Forbidden): pods "pod1" is forbidden: User "restricted" cannot delete resource "pods" in API group "" in the namespace "restricted" Doesn't look like it.
But the Pods seem to be able to access the Secrets, we can try to use a Pod's ServiceAccount to access the third Secret. We can actually see (like using k -n restricted get pod -o yaml | grep automountServiceAccountToken) that only Pod pod3-* has the ServiceAccount token mounted: ➜ k -n restricted exec -it pod3-748b48594-24s76 -- sh / # mount | grep serviceaccount tmpfs on /run/secrets/kubernetes.io/serviceaccount type tmpfs (ro,relatime) / # ls /run/secrets/kubernetes.io/serviceaccount ca.crt namespace token NOTE: You should have knowledge about ServiceAccounts and how they work with Pods like described in the docs We can see all necessary information to contact the apiserver manually: / # curl https://kubernetes.default/api/v1/namespaces/restricted/secrets -H "Authorization: Bearer $(cat /run/secrets/kubernetes.io/serviceaccount/token)" -k ... { "metadata": { "name": "secret3", "namespace": "restricted", ... } ] }, "data": { "password": "cEVuRXRSYVRpT24tdEVzVGVSCg==" }, "type": "Opaque" } ... Let's decode it and write it into the requested location: ➜ echo cEVuRXRSYVRpT24tdEVzVGVSCg== | base64 -d pEnEtRaTiOn-tEsTeR ➜ echo cEVuRXRSYVRpT24tdEVzVGVSCg== | base64 -d > /opt/course/12/secret3 This will give us: # /opt/course/12/secret1 you-are # /opt/course/12/secret2 an-amazing # /opt/course/12/secret3 pEnEtRaTiOn-tEsTeR We hacked all Secrets! It can be tricky to get RBAC right and secure. NOTE: One thing to consider is that giving the permission to "list" Secrets will also allow the user to read the Secret values, for example using kubectl get secrets -o yaml, even without the "get" permission set. 第十三题: Restrict access to Metadata Server 问题 Task weight: 7% Use context: kubectl config use-context infra-prod There is a metadata service available at http://192.168.100.21:32000 on which Nodes can reach sensitive data, like cloud credentials for initialisation. By default, all Pods in the cluster also have access to this endpoint.
The DevSecOps team has asked you to restrict access to this metadata server. In Namespace metadata-access: Create a NetworkPolicy named metadata-deny which prevents egress to 192.168.100.21 for all Pods but still allows access to everything else Create a NetworkPolicy named metadata-allow which allows Pods having label role: metadata-accessor to access endpoint 192.168.100.21 There are existing Pods in the target Namespace with which you can test your policies, but don't change their labels. 解答 There was a famous hack at Shopify which was based on information revealed via the metadata server for nodes. Check the Pods in the Namespace metadata-access and their labels: ➜ k -n metadata-access get pods --show-labels NAME ... LABELS pod1-7d67b4ff9-xrcd7 ... app=pod1,pod-template-hash=7d67b4ff9 pod2-7b6fc66944-2hc7n ... app=pod2,pod-template-hash=7b6fc66944 pod3-7dc879bd59-hkgrr ... app=pod3,role=metadata-accessor,pod-template-hash=7dc879bd59 There are three Pods in the Namespace and one of them has the label role=metadata-accessor. Check access to the metadata server from the Pods: ➜ k exec -it -n metadata-access pod1-7d67b4ff9-xrcd7 -- curl http://192.168.100.21:32000 metadata server ➜ k exec -it -n metadata-access pod2-7b6fc66944-2hc7n -- curl http://192.168.100.21:32000 metadata server ➜ k exec -it -n metadata-access pod3-7dc879bd59-hkgrr -- curl http://192.168.100.21:32000 metadata server All three are able to access the metadata server. To restrict the access, we create a NetworkPolicy to deny access to the specific IP. vim 13_metadata-deny.yaml # 13_metadata-deny.yaml apiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: name: metadata-deny namespace: metadata-access spec: podSelector: {} policyTypes: - Egress egress: - to: - ipBlock: cidr: 0.0.0.0/0 except: - 192.168.100.21/32 k -f 13_metadata-deny.yaml apply NOTE: You should know about general default-deny K8s NetworkPolicies.
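The allow-everything-except-one-IP logic of this policy can be toy-modelled in plain shell to make the decision explicit. This is a hypothetical helper for illustration only: it does an exact string match on a single /32 address, not real CIDR arithmetic, and models a Pod *without* the role=metadata-accessor label:

```shell
# Returns success (0) if egress to $1 is allowed for an unlabeled Pod,
# mimicking metadata-deny: cidr 0.0.0.0/0 with except 192.168.100.21/32.
egress_allowed() {
  case "$1" in
    192.168.100.21) return 1 ;;  # blocked by the except entry
    *)              return 0 ;;  # everything else matches 0.0.0.0/0
  esac
}

egress_allowed 1.2.3.4        && echo "1.2.3.4: allowed"
egress_allowed 192.168.100.21 || echo "192.168.100.21: denied"
```

This prints "1.2.3.4: allowed" and "192.168.100.21: denied", matching the curl behaviour observed from the unlabeled Pods.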
Verify that access to the metadata server has been blocked, but other endpoints are still accessible: ➜ k exec -it -n metadata-access pod1-7d67b4ff9-xrcd7 -- curl http://192.168.100.21:32000 curl: (28) Failed to connect to 192.168.100.21 port 32000: Operation timed out command terminated with exit code 28 ➜ kubectl exec -it -n metadata-access pod1-7d67b4ff9-xrcd7 -- curl -I https://kubernetes.io HTTP/2 200 cache-control: public, max-age=0, must-revalidate content-type: text/html; charset=UTF-8 date: Mon, 14 Sep 2020 15:39:39 GMT etag: "b46e429397e5f1fecf48c10a533f5cd8-ssl" strict-transport-security: max-age=31536000 age: 13 content-length: 22252 server: Netlify x-nf-request-id: 1d94a1d1-6bac-4a98-b065-346f661f1db1-393998290 Similarly, verify for the other two Pods. Now create another NetworkPolicy that allows access to the metadata server from Pods with label role=metadata-accessor. vim 13_metadata-allow.yaml # 13_metadata-allow.yaml apiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: name: metadata-allow namespace: metadata-access spec: podSelector: matchLabels: role: metadata-accessor policyTypes: - Egress egress: - to: - ipBlock: cidr: 192.168.100.21/32 k -f 13_metadata-allow.yaml apply Verify that the required Pod has access to the metadata endpoint and the others do not: ➜ k -n metadata-access exec pod3-7dc879bd59-hkgrr -- curl http://192.168.100.21:32000 metadata server ➜ k -n metadata-access exec pod2-7b6fc66944-9ngzr -- curl http://192.168.100.21:32000 ^Ccommand terminated with exit code 130 It only works for the Pod having the label. With this we implemented the required security restrictions. If a Pod doesn't have a matching NetworkPolicy then all traffic is allowed from and to it. Once a Pod has a matching NP then the contained rules are additive.
This means that for Pods having label metadata-accessor the rules will be combined to: # merged policies into one for pods with label metadata-accessor spec: podSelector: {} policyTypes: - Egress egress: - to: # first rule - ipBlock: # condition 1 cidr: 0.0.0.0/0 except: - 192.168.100.21/32 - to: # second rule - ipBlock: # condition 1 cidr: 192.168.100.21/32 We can see that the merged NP contains two separate rules with one condition each. We could read it as: Allow outgoing traffic if: (destination is 0.0.0.0/0 but not 192.168.100.21/32) OR (destination is 192.168.100.21/32) Hence it allows Pods with label metadata-accessor to access everything. 第十四题: Syscall Activity 问题 Task weight: 4% Use context: kubectl config use-context workload-prod There are Pods in Namespace team-yellow. A security investigation noticed that some processes running in these Pods are using the Syscall kill, which is forbidden by a Team Yellow internal policy. Find the offending Pod(s) and remove these by reducing the replicas of the parent Deployment to 0. 解答 Syscalls are used by processes running in Userspace to communicate with the Linux Kernel. There are many available syscalls: https://man7.org/linux/man-pages/man2/syscalls.2.html. It makes sense to restrict these for container processes and Docker/Containerd already restrict some by default, like the reboot Syscall. Restricting even more is possible for example using Seccomp or AppArmor. But for this task we should simply find out which binary process executes a specific Syscall. Processes in containers are simply run on the same Linux operating system, but isolated. That's why we first check on which nodes the Pods are running: ➜ k -n team-yellow get pod -owide NAME ... NODE NOMINATED NODE ... collector1-7585cc58cb-n5rtd 1/1 ... cluster1-node1 <none> ... collector1-7585cc58cb-vdlp9 1/1 ... cluster1-node1 <none> ... collector2-8556679d96-z7g7c 1/1 ... cluster1-node1 <none> ... collector3-8b58fdc88-pjg24 1/1 ... 
cluster1-node1 <none> ... collector3-8b58fdc88-s9ltc 1/1 ... cluster1-node1 <none> ... All on cluster1-node1, hence we ssh into it and find the processes for the first Deployment collector1. ➜ ssh cluster1-node1 ➜ root@cluster1-node1:~# crictl pods --name collector1 POD ID CREATED STATE NAME ... 21aacb8f4ca8d 17 minutes ago Ready collector1-7585cc58cb-vdlp9 ... 186631e40104d 17 minutes ago Ready collector1-7585cc58cb-n5rtd ... ➜ root@cluster1-node1:~# crictl ps --pod 21aacb8f4ca8d CONTAINER ID IMAGE CREATED ... POD ID 9ea02422f8660 5d867958e04e1 12 minutes ago ... 21aacb8f4ca8d ➜ root@cluster1-node1:~# crictl inspect 9ea02422f8660 | grep args -A1 "args": [ "./collector1-process" Using crictl pods we first searched for the Pods of Deployment collector1, which has two replicas. We then took one pod-id to find its containers using crictl ps. And finally we used crictl inspect to find the process name, which is collector1-process. We can find the process PIDs (two because there are two Pods): ➜ root@cluster1-node1:~# ps aux | grep collector1-process root 35039 0.0 0.1 702208 1044 ? Ssl 13:37 0:00 ./collector1-process root 35059 0.0 0.1 702208 1044 ? Ssl 13:37 0:00 ./collector1-process Using the PIDs we can call strace to find Syscalls: ➜ root@cluster1-node1:~# strace -p 35039 strace: Process 35039 attached futex(0x4d7e68, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 kill(666, SIGTERM) = -1 ESRCH (No such process) epoll_pwait(3, [], 128, 999, NULL, 1) = 0 kill(666, SIGTERM) = -1 ESRCH (No such process) epoll_pwait(3, [], 128, 999, NULL, 1) = 0 kill(666, SIGTERM) = -1 ESRCH (No such process) epoll_pwait(3, ^Cstrace: Process 35039 detached <detached ...> ... First try and already a catch! We see it uses the forbidden Syscall by calling kill(666, SIGTERM). Next let's check the Deployment collector2 processes: ➜ root@cluster1-node1:~# ps aux | grep collector2-process root 35375 0.0 0.0 702216 604 ?
Ssl 13:37 0:00 ./collector2-process ➜ root@cluster1-node1:~# strace -p 35375 strace: Process 35375 attached futex(0x4d9e68, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 futex(0x4d9e68, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 futex(0x4d9e68, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 futex(0x4d9e68, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 ... Looks alright. What about collector3: ➜ root@cluster1-node1:~# ps aux | grep collector3-process root 35155 0.0 0.1 702472 1040 ? Ssl 13:37 0:00 ./collector3-process root 35241 0.0 0.1 702472 1044 ? Ssl 13:37 0:00 ./collector3-process ➜ root@cluster1-node1:~# strace -p 35155 strace: Process 35155 attached futex(0x4d9e68, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 futex(0x4d9e68, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 futex(0x4d9e68, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 epoll_pwait(3, [], 128, 999, NULL, 1) = 0 epoll_pwait(3, [], 128, 999, NULL, 1) = 0 ... Also nothing about the forbidden Syscall. So we finalise the task: k -n team-yellow scale deploy collector1 --replicas 0 And the world is a bit safer again. 第十五题: Configure TLS on Ingress 问题 Task weight: 4% Use context: kubectl config use-context workload-prod In Namespace team-pink there is an existing Nginx Ingress resources named secure which accepts two paths /app and /api which point to different ClusterIP Services. From your main terminal you can connect to it using for example: HTTP: curl -v http://secure-ingress.test:31080/app HTTPS: curl -kv https://secure-ingress.test:31443/app Right now it uses a default generated TLS certificate by the Nginx Ingress Controller. You're asked to instead use the key and certificate provided at /opt/course/15/tls.key and /opt/course/15/tls.crt. As it's a self-signed certificate you need to use curl -k when connecting to it. 
解答 Investigate We can get the IP address of the Ingress and we see it's the same one to which secure-ingress.test is pointing: ➜ k -n team-pink get ing secure NAME CLASS HOSTS ADDRESS PORTS AGE secure <none> secure-ingress.test 192.168.100.12 80 7m11s ➜ ping secure-ingress.test PING cluster1-node1 (192.168.100.12) 56(84) bytes of data. 64 bytes from cluster1-node1 (192.168.100.12): icmp_seq=1 ttl=64 time=0.316 ms Now, let's try to access the paths /app and /api via HTTP: ➜ curl http://secure-ingress.test:31080/app This is the backend APP! ➜ curl http://secure-ingress.test:31080/api This is the API Server! What about HTTPS? ➜ curl https://secure-ingress.test:31443/api curl: (60) SSL certificate problem: unable to get local issuer certificate More details here: https://curl.haxx.se/docs/sslcerts.html curl failed to verify the legitimacy of the server and therefore could not establish a secure connection to it. To learn more about this situation and how to fix it, please visit the web page mentioned above. ➜ curl -k https://secure-ingress.test:31443/api This is the API Server! HTTPS seems to be already working if we accept self-signed certificates using -k. But what kind of certificate is used by the server? ➜ curl -kv https://secure-ingress.test:31443/api ... * Server certificate: * subject: O=Acme Co; CN=Kubernetes Ingress Controller Fake Certificate * start date: Sep 28 12:28:35 2020 GMT * expire date: Sep 28 12:28:35 2021 GMT * issuer: O=Acme Co; CN=Kubernetes Ingress Controller Fake Certificate * SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway. ... It seems to be "Kubernetes Ingress Controller Fake Certificate".
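Besides curl -kv, openssl can print the subject of a certificate file directly, which is handy for checking the provided tls.crt before creating the Secret. As a self-contained sketch we generate a throwaway self-signed certificate (the CN and file paths here are made up for the demo) and inspect it:

```shell
# Create a throwaway self-signed cert, then print its subject -- the same
# field curl -v showed for the Ingress controller's fake certificate.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout /tmp/demo.key -out /tmp/demo.crt \
  -subj "/CN=secure-ingress.test/O=team-pink" 2>/dev/null

openssl x509 -in /tmp/demo.crt -noout -subject -enddate
```

Running the same x509 command against /opt/course/15/tls.crt would show whether the provided certificate matches the expected host.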
Implement own TLS certificate

First, let us generate a Secret using the provided key and certificate:

```shell
➜ cd /opt/course/15

➜ :/opt/course/15$ ls
tls.crt  tls.key

➜ :/opt/course/15$ k -n team-pink create secret tls tls-secret --key tls.key --cert tls.crt
secret/tls-secret created
```

Now, we configure the Ingress to make use of this Secret:

```shell
➜ k -n team-pink get ing secure -oyaml > 15_ing_bak.yaml

➜ k -n team-pink edit ing secure
```

```yaml
# kubectl -n team-pink edit ing secure
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
  ...
  generation: 1
  name: secure
  namespace: team-pink
  ...
spec:
  tls:                            # add
    - hosts:                      # add
      - secure-ingress.test       # add
      secretName: tls-secret      # add
  rules:
  - host: secure-ingress.test
    http:
      paths:
      - backend:
          service:
            name: secure-app
            port: 80
        path: /app
        pathType: ImplementationSpecific
      - backend:
          service:
            name: secure-api
            port: 80
        path: /api
        pathType: ImplementationSpecific
...
```

After adding the changes we check the Ingress resource again:

```shell
➜ k -n team-pink get ing
NAME     CLASS    HOSTS                 ADDRESS          PORTS     AGE
secure   <none>   secure-ingress.test   192.168.100.12   80, 443   25m
```

It now actually lists port 443 for HTTPS. To verify:

```shell
➜ curl -k https://secure-ingress.test:31443/api
This is the API Server!

➜ curl -kv https://secure-ingress.test:31443/api
...
* Server certificate:
*  subject: CN=secure-ingress.test; O=secure-ingress.test
*  start date: Sep 25 18:22:10 2020 GMT
*  expire date: Sep 20 18:22:10 2040 GMT
*  issuer: CN=secure-ingress.test; O=secure-ingress.test
*  SSL certificate verify result: self signed certificate (18), continuing anyway.
...
```

We can see that the provided certificate is now being used by the Ingress for TLS termination.

第十六题: Docker Image Attack Surface

问题

Task weight: 7%

Use context: kubectl config use-context workload-prod

There is a Deployment image-verify in Namespace team-blue which runs image registry.killer.sh:5000/image-verify:v1.
DevSecOps has asked you to improve this image by:

- Changing the base image to alpine:3.12
- Not installing curl
- Updating nginx to use the version constraint >=1.18.0
- Running the main process as user myuser

Do not add any new lines to the Dockerfile, just edit existing ones. The file is located at /opt/course/16/image/Dockerfile. Tag your version as v2. You can build, tag and push using:

```shell
cd /opt/course/16/image
podman build -t registry.killer.sh:5000/image-verify:v2 .
podman run registry.killer.sh:5000/image-verify:v2 # to test your changes
podman push registry.killer.sh:5000/image-verify:v2
```

Make the Deployment use your updated image tag v2.

解答

We should have a look at the Docker Image first:

```shell
cd /opt/course/16/image
cp Dockerfile Dockerfile.bak
vim Dockerfile
```

```dockerfile
# /opt/course/16/image/Dockerfile
FROM alpine:3.4
RUN apk update && apk add vim curl nginx=1.10.3-r0
RUN addgroup -S myuser && adduser -S myuser -G myuser
COPY ./run.sh run.sh
RUN ["chmod", "+x", "./run.sh"]
USER root
ENTRYPOINT ["/bin/sh", "./run.sh"]
```

A very simple Dockerfile which executes a script run.sh:

```shell
# /opt/course/16/image/run.sh
while true; do date; id; echo; sleep 1; done
```

So it only outputs the current date and credential information in a loop. We can see that output in the existing Deployment image-verify:

```shell
➜ k -n team-blue logs -f -l id=image-verify
Fri Sep 25 20:59:12 UTC 2020
uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video)
```

We see it's running as root.
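One pitfall when editing the RUN line: inside a shell, an unquoted `>` in `nginx>=1.18.0` is parsed as output redirection, so the version constraint silently never reaches apk. A self-contained sketch of the effect (no apk needed, `set --` just stands in for any command receiving the argument):

```shell
# Unquoted: the shell splits 'nginx>=1.18.0' into the word 'nginx' plus a
# redirect that creates a file literally named '=1.18.0'
set -- nginx>=1.18.0
echo "arguments seen: $*"     # arguments seen: nginx
ls =1.18.0 && rm =1.18.0      # the stray redirect target exists

# Quoted: the full constraint survives as one argument
set -- "nginx>=1.18.0"
echo "arguments seen: $*"     # arguments seen: nginx>=1.18.0
```

Hence quoting the constraint ("nginx>=1.18.0") in the Dockerfile's RUN instruction is the safe form.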
Next we update the Dockerfile according to the requirements (note the quotes around the version constraint: without them the shell would interpret > as a redirect):

```dockerfile
# /opt/course/16/image/Dockerfile

# change
FROM alpine:3.12

# change
RUN apk update && apk add vim "nginx>=1.18.0"

RUN addgroup -S myuser && adduser -S myuser -G myuser
COPY ./run.sh run.sh
RUN ["chmod", "+x", "./run.sh"]

# change
USER myuser

ENTRYPOINT ["/bin/sh", "./run.sh"]
```

Then we build the new image:

```shell
➜ :/opt/course/16/image$ podman build -t registry.killer.sh:5000/image-verify:v2 .
...
STEP 7/7: ENTRYPOINT ["/bin/sh", "./run.sh"]
COMMIT registry.killer.sh:5000/image-verify:v2
--> ceb8989101b
Successfully tagged registry.killer.sh:5000/image-verify:v2
ceb8989101bccd9f6b9c3b4c6c75f6c3561f19a5b784edd1f1a36fa0fb34a9df
```

We can then test our changes by running the container locally:

```shell
➜ :/opt/course/16/image$ podman run registry.killer.sh:5000/image-verify:v2
Thu Sep 16 06:01:47 UTC 2021
uid=101(myuser) gid=102(myuser) groups=102(myuser)

Thu Sep 16 06:01:48 UTC 2021
uid=101(myuser) gid=102(myuser) groups=102(myuser)

Thu Sep 16 06:01:49 UTC 2021
uid=101(myuser) gid=102(myuser) groups=102(myuser)
```

Looking good, so we push:

```shell
➜ :/opt/course/16/image$ podman push registry.killer.sh:5000/image-verify:v2
Getting image source signatures
Copying blob cd0853834d88 done
Copying blob 5298d0709c3e skipped: already exists
Copying blob e6688e911f15 done
Copying blob dbc406096645 skipped: already exists
Copying blob 98895ed393d9 done
Copying config ceb8989101 done
Writing manifest to image destination
Storing signatures
```

And we update the Deployment to use the new image:

```shell
k -n team-blue edit deploy image-verify
```

```yaml
# kubectl -n team-blue edit deploy image-verify
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  ...
  template:
    ...
```
```yaml
    spec:
      containers:
      - image: registry.killer.sh:5000/image-verify:v2 # change
```

And afterwards we can verify our changes by looking at the Pod logs:

```shell
➜ k -n team-blue logs -f -l id=image-verify
Fri Sep 25 21:06:55 UTC 2020
uid=101(myuser) gid=102(myuser) groups=102(myuser)
```

Also to verify our changes even further:

```shell
➜ k -n team-blue exec image-verify-55fbcd4c9b-x2flc -- curl
OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"curl\": executable file not found in $PATH": unknown
command terminated with exit code 126

➜ k -n team-blue exec image-verify-55fbcd4c9b-x2flc -- nginx -v
nginx version: nginx/1.18.0
```

Another task solved.

第十七题: Audit Log Policy

问题

Task weight: 7%

Use context: kubectl config use-context infra-prod

Audit Logging has been enabled in the cluster with an Audit Policy located at /etc/kubernetes/audit/policy.yaml on cluster2-controlplane1.

Change the configuration so that only one backup of the logs is stored. Alter the Policy in a way that it only stores logs:

- From Secret resources, level Metadata
- From "system:nodes" userGroups, level RequestResponse

After you altered the Policy make sure to empty the log file so it only contains entries according to your changes, like using truncate -s 0 /etc/kubernetes/audit/logs/audit.log.

NOTE: You can use jq to render json more readable.
```shell
cat data.json | jq
```

解答

First we check the apiserver configuration and change it as requested:

```shell
➜ ssh cluster2-controlplane1

➜ root@cluster2-controlplane1:~# cp /etc/kubernetes/manifests/kube-apiserver.yaml ~/17_kube-apiserver.yaml # backup

➜ root@cluster2-controlplane1:~# vim /etc/kubernetes/manifests/kube-apiserver.yaml
```

```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --audit-policy-file=/etc/kubernetes/audit/policy.yaml
    - --audit-log-path=/etc/kubernetes/audit/logs/audit.log
    - --audit-log-maxsize=5
    - --audit-log-maxbackup=1    # CHANGE
    - --advertise-address=192.168.100.21
    - --allow-privileged=true
...
```

NOTE: You should know how to enable Audit Logging completely yourself as described in the docs. Feel free to try this in another cluster in this environment.

Now we look at the existing Policy:

```shell
➜ root@cluster2-controlplane1:~# vim /etc/kubernetes/audit/policy.yaml
```

```yaml
# /etc/kubernetes/audit/policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
```

We can see that this simple Policy logs everything on Metadata level. So we change it to the requirements:

```yaml
# /etc/kubernetes/audit/policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:

# log Secret resources audits, level Metadata
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets"]

# log node related audits, level RequestResponse
- level: RequestResponse
  userGroups: ["system:nodes"]

# for everything else don't log anything
- level: None
```

After saving the changes we have to restart the apiserver:

```shell
➜ root@cluster2-controlplane1:~# cd /etc/kubernetes/manifests/

➜ root@cluster2-controlplane1:/etc/kubernetes/manifests# mv kube-apiserver.yaml ..
```
➜ root@cluster2-controlplane1:/etc/kubernetes/manifests# watch crictl ps # wait for apiserver gone ➜ root@cluster2-controlplane1:/etc/kubernetes/manifests# truncate -s 0 /etc/kubernetes/audit/logs/audit.log ➜ root@cluster2-controlplane1:/etc/kubernetes/manifests# mv ../kube-apiserver.yaml . Once the apiserver is running again we can check the new logs and scroll through some entries: cat audit.log | tail | jq { "kind": "Event", "apiVersion": "audit.k8s.io/v1", "level": "Metadata", "auditID": "e598dc9e-fc8b-4213-aee3-0719499ab1bd", "stage": "RequestReceived", "requestURI": "...", "verb": "watch", "user": { "username": "system:serviceaccount:gatekeeper-system:gatekeeper-admin", "uid": "79870838-75a8-479b-ad42-4b7b75bd17a3", "groups": [ "system:serviceaccounts", "system:serviceaccounts:gatekeeper-system", "system:authenticated" ] }, "sourceIPs": [ "192.168.102.21" ], "userAgent": "manager/v0.0.0 (linux/amd64) kubernetes/$Format", "objectRef": { "resource": "secrets", "apiVersion": "v1" }, "requestReceivedTimestamp": "2020-09-27T20:01:36.238911Z", "stageTimestamp": "2020-09-27T20:01:36.238911Z", "annotations": { "authentication.k8s.io/legacy-token": "..." } } Above we logged a watch action by OPA Gatekeeper for Secrets, level Metadata. 
{ "kind": "Event", "apiVersion": "audit.k8s.io/v1", "level": "RequestResponse", "auditID": "c90e53ed-b0cf-4cc4-889a-f1204dd39267", "stage": "ResponseComplete", "requestURI": "...", "verb": "list", "user": { "username": "system:node:cluster2-controlplane1", "groups": [ "system:nodes", "system:authenticated" ] }, "sourceIPs": [ "192.168.100.21" ], "userAgent": "kubelet/v1.19.1 (linux/amd64) kubernetes/206bcad", "objectRef": { "resource": "configmaps", "namespace": "kube-system", "name": "kube-proxy", "apiVersion": "v1" }, "responseStatus": { "metadata": {}, "code": 200 }, "responseObject": { "kind": "ConfigMapList", "apiVersion": "v1", "metadata": { "selfLink": "/api/v1/namespaces/kube-system/configmaps", "resourceVersion": "83409" }, "items": [ { "metadata": { "name": "kube-proxy", "namespace": "kube-system", "selfLink": "/api/v1/namespaces/kube-system/configmaps/kube-proxy", "uid": "0f1c3950-430a-4543-83e4-3f9c87a478b8", "resourceVersion": "232", "creationTimestamp": "2020-09-26T20:59:50Z", "labels": { "app": "kube-proxy" }, "annotations": { "kubeadm.kubernetes.io/component-config.hash": "..." }, "managedFields": [ { ... } ] }, ... } ] }, "requestReceivedTimestamp": "2020-09-27T20:01:36.223781Z", "stageTimestamp": "2020-09-27T20:01:36.225470Z", "annotations": { "authorization.k8s.io/decision": "allow", "authorization.k8s.io/reason": "" } } And in the one above we logged a list action by system:nodes for a ConfigMaps, level RequestResponse. 
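To eyeball whether the new Policy behaves as intended, a small jq filter can summarise level and user per event (jq is assumed available, as used elsewhere in this walkthrough; a single sample event stands in for audit.log):

```shell
# Print "<level> <username>" per audit event; piping the real file through
# this and then 'sort | uniq -c' gives a quick per-user histogram by level
echo '{"level":"RequestResponse","user":{"username":"system:node:cluster2-controlplane1"}}' \
  | jq -r '[.level, .user.username] | join(" ")'
```

Against the real log: `jq -r '[.level, .user.username] | join(" ")' audit.log | sort | uniq -c`.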
Because all JSON entries are written on a single line in the file, we can also run some simple verifications of our Policy:

```shell
# shows Secret entries
cat audit.log | grep '"resource":"secrets"' | wc -l

# confirms Secret entries are only of level Metadata
cat audit.log | grep '"resource":"secrets"' | grep -v '"level":"Metadata"' | wc -l

# shows RequestResponse level entries
cat audit.log | grep '"level":"RequestResponse"' | wc -l

# confirms RequestResponse level entries are only for system:nodes
cat audit.log | grep '"level":"RequestResponse"' | grep -v "system:nodes" | wc -l
```

Looks like our job is done.

第十八题: Investigate Break-in via Audit Log

问题

Task weight: 4%

Use context: kubectl config use-context infra-prod

Namespace security contains five Secrets of type Opaque which can be considered highly confidential. The latest Incident-Prevention-Investigation revealed that ServiceAccount p.auster had too broad access to the cluster for some time. This SA should've never had access to any Secrets in that Namespace.

Find out which Secrets in Namespace security this SA did access by looking at the Audit Logs under /opt/course/18/audit.log. Change the password to any new string of only those Secrets that were accessed by this SA.

NOTE: You can use jq to render json more readable. cat data.json | jq

解答

First we look at the Secrets this is about:

```shell
➜ k -n security get secret | grep Opaque
kubeadmin-token   Opaque   1   37m
mysql-admin       Opaque   1   37m
postgres001       Opaque   1   37m
postgres002       Opaque   1   37m
vault-token       Opaque   1   37m
```

Next we investigate the Audit Log file:

```shell
➜ cd /opt/course/18

➜ :/opt/course/18$ ls -lh
total 7.1M
-rw-r--r-- 1 k8s k8s 7.5M Sep 24 21:31 audit.log

➜ :/opt/course/18$ cat audit.log | wc -l
4451
```

Audit Logs can be huge and it's common to limit their size with an Audit Policy and to transfer the data into systems like Elasticsearch. In this case we have a simple JSON export, but it already contains 4451 lines.
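Because the log is JSON-lines, rough frequency counts are one grep away, which helps decide where to dig first. A self-contained sketch (three sample events stand in for audit.log):

```shell
# Count events per verb in a JSON-lines audit log, most frequent first
printf '%s\n' '{"verb":"get"}' '{"verb":"get"}' '{"verb":"list"}' \
  | grep -o '"verb":"[a-z]*"' | sort | uniq -c | sort -rn
```

The same pipeline on the real file would be `grep -o '"verb":"[a-z]*"' audit.log | sort | uniq -c | sort -rn`, and swapping the pattern for `"username":"[^"]*"` profiles users instead.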
We should try to filter the file down to relevant information: ➜ :/opt/course/18$ cat audit.log | grep "p.auster" | wc -l 28 Not too bad, only 28 logs for ServiceAccount p.auster. ➜ :/opt/course/18$ cat audit.log | grep "p.auster" | grep Secret | wc -l 2 And only 2 logs related to Secrets... ➜ :/opt/course/18$ cat audit.log | grep "p.auster" | grep Secret | grep list | wc -l 0 ➜ :/opt/course/18$ cat audit.log | grep "p.auster" | grep Secret | grep get | wc -l 2 No list actions, which is good, but 2 get actions, so we check these out: cat audit.log | grep "p.auster" | grep Secret | grep get | jq { "kind": "Event", "apiVersion": "audit.k8s.io/v1", "level": "RequestResponse", "auditID": "74fd9e03-abea-4df1-b3d0-9cfeff9ad97a", "stage": "ResponseComplete", "requestURI": "/api/v1/namespaces/security/secrets/vault-token", "verb": "get", "user": { "username": "system:serviceaccount:security:p.auster", "uid": "29ecb107-c0e8-4f2d-816a-b16f4391999c", "groups": [ "system:serviceaccounts", "system:serviceaccounts:security", "system:authenticated" ] }, ... "userAgent": "curl/7.64.0", "objectRef": { "resource": "secrets", "namespace": "security", "name": "vault-token", "apiVersion": "v1" }, ... } { "kind": "Event", "apiVersion": "audit.k8s.io/v1", "level": "RequestResponse", "auditID": "aed6caf9-5af0-4872-8f09-ad55974bb5e0", "stage": "ResponseComplete", "requestURI": "/api/v1/namespaces/security/secrets/mysql-admin", "verb": "get", "user": { "username": "system:serviceaccount:security:p.auster", "uid": "29ecb107-c0e8-4f2d-816a-b16f4391999c", "groups": [ "system:serviceaccounts", "system:serviceaccounts:security", "system:authenticated" ] }, ... "userAgent": "curl/7.64.0", "objectRef": { "resource": "secrets", "namespace": "security", "name": "mysql-admin", "apiVersion": "v1" }, ... } There we see that Secrets vault-token and mysql-admin were accessed by p.auster. Hence we change the passwords for those. 
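The same question can also be answered in one step with jq instead of chained greps; a sketch under the assumption that jq is installed (a single sample event stands in for /opt/course/18/audit.log):

```shell
# Select get/list requests on Secrets made by the SA and print the Secret name
SAMPLE='{"verb":"get","user":{"username":"system:serviceaccount:security:p.auster"},"objectRef":{"resource":"secrets","name":"vault-token"}}'
echo "$SAMPLE" | jq -r 'select(
    .user.username == "system:serviceaccount:security:p.auster"
    and .objectRef.resource == "secrets"
    and (.verb == "get" or .verb == "list"))
  | .objectRef.name'
```

Against the real log the same filter plus `| sort -u` lists each accessed Secret exactly once.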
```shell
➜ echo new-vault-pass | base64
bmV3LXZhdWx0LXBhc3MK

➜ k -n security edit secret vault-token

➜ echo new-mysql-pass | base64
bmV3LW15c3FsLXBhc3MK

➜ k -n security edit secret mysql-admin
```

Note that echo appends a trailing newline which becomes part of the encoded value; use echo -n if the password should not contain it.

Audit Logs ftw. By running cat audit.log | grep "p.auster" | grep Secret | grep password we can see that passwords end up in the Audit Logs, because RequestResponse level stores the complete content of Secrets. It's never a good idea to reveal passwords in logs; in this case it would probably be sufficient to store only Metadata level information for Secrets, which can be controlled via an Audit Policy.

第十九题: Immutable Root FileSystem

问题:

Task weight: 2%

Use context: kubectl config use-context workload-prod

The Deployment immutable-deployment in Namespace team-purple should run immutable, it's created from file /opt/course/19/immutable-deployment.yaml. Even after a successful break-in, it shouldn't be possible for an attacker to modify the filesystem of the running container.

Modify the Deployment in a way that no processes inside the container can modify the local filesystem, only the /tmp directory should be writeable. Don't modify the Docker image. Save the updated YAML under /opt/course/19/immutable-deployment-new.yaml and update the running Deployment.

解答

Processes in containers can write to the local filesystem by default, which increases the attack surface in case a process gets hijacked. Preventing applications from writing to disk, or allowing writes only to certain directories, can mitigate this risk. If there is for example a bug in Nginx that allows an attacker to override any file inside the container, this only works if the Nginx process itself can write to the filesystem in the first place. Making the root filesystem read-only can be done in the Docker image itself or in a Pod declaration.
Let us first check the Deployment immutable-deployment in Namespace team-purple:

```shell
➜ k -n team-purple edit deploy -o yaml
```

```yaml
# kubectl -n team-purple edit deploy -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: team-purple
  name: immutable-deployment
  labels:
    app: immutable-deployment
  ...
spec:
  replicas: 1
  selector:
    matchLabels:
      app: immutable-deployment
  template:
    metadata:
      labels:
        app: immutable-deployment
    spec:
      containers:
      - image: busybox:1.32.0
        command: ['sh', '-c', 'tail -f /dev/null']
        imagePullPolicy: IfNotPresent
        name: busybox
      restartPolicy: Always
...
```

The container has write access to the root filesystem, as no existing SecurityContext defines restrictions for the Pod or its containers. And based on the task we're not allowed to alter the Docker image. So we modify the YAML manifest to include the required changes:

```shell
cp /opt/course/19/immutable-deployment.yaml /opt/course/19/immutable-deployment-new.yaml
vim /opt/course/19/immutable-deployment-new.yaml
```

```yaml
# /opt/course/19/immutable-deployment-new.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: team-purple
  name: immutable-deployment
  labels:
    app: immutable-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: immutable-deployment
  template:
    metadata:
      labels:
        app: immutable-deployment
    spec:
      containers:
      - image: busybox:1.32.0
        command: ['sh', '-c', 'tail -f /dev/null']
        imagePullPolicy: IfNotPresent
        name: busybox
        securityContext:                  # add
          readOnlyRootFilesystem: true    # add
        volumeMounts:                     # add
        - mountPath: /tmp                 # add
          name: temp-vol                  # add
      volumes:                            # add
      - name: temp-vol                    # add
        emptyDir: {}                      # add
      restartPolicy: Always
```

SecurityContexts can be set on Pod or container level; here the latter was asked. Enforcing readOnlyRootFilesystem: true renders the root filesystem read-only. We can then allow some directories to be writable by using an emptyDir volume.
Once the changes are made, let us update the Deployment: ➜ k delete -f /opt/course/19/immutable-deployment-new.yaml deployment.apps "immutable-deployment" deleted ➜ k create -f /opt/course/19/immutable-deployment-new.yaml deployment.apps/immutable-deployment created We can verify if the required changes are propagated: ➜ k -n team-purple exec immutable-deployment-5b7ff8d464-j2nrj -- touch /abc.txt touch: /abc.txt: Read-only file system command terminated with exit code 1 ➜ k -n team-purple exec immutable-deployment-5b7ff8d464-j2nrj -- touch /var/abc.txt touch: /var/abc.txt: Read-only file system command terminated with exit code 1 ➜ k -n team-purple exec immutable-deployment-5b7ff8d464-j2nrj -- touch /etc/abc.txt touch: /etc/abc.txt: Read-only file system command terminated with exit code 1 ➜ k -n team-purple exec immutable-deployment-5b7ff8d464-j2nrj -- touch /tmp/abc.txt ➜ k -n team-purple exec immutable-deployment-5b7ff8d464-j2nrj -- ls /tmp abc.txt The Deployment has been updated so that the container's file system is read-only, and the updated YAML has been placed under the required location. Sweet! 第二十题: Update Kubernetes 问题 Task weight: 8% Use context: kubectl config use-context workload-stage The cluster is running Kubernetes 1.27.6, update it to 1.28.2. Use apt package manager and kubeadm for this. Use ssh cluster3-controlplane1 and ssh cluster3-node1 to connect to the instances. 
解决

Process summary:

1. Upgrade the controlplane node first:
   - drain the node (evict its Pods)
   - upgrade kubeadm
   - run the upgrade plan and apply it
   - upgrade kubelet and kubectl
   - uncordon the node
2. Then upgrade the worker nodes (all of them):
   - drain the node (evict its Pods)
   - upgrade kubeadm
   - run the node upgrade
   - upgrade kubelet and kubectl
   - uncordon the node

Let's have a look at the current versions:

```shell
➜ k get node
cluster3-controlplane1   Ready   control-plane   96m   v1.27.6
cluster3-node1           Ready   <none>          91m   v1.27.6
```

Control Plane Master Components

First we should update the control plane components running on the master node, so we drain it:

```shell
➜ k drain cluster3-controlplane1 --ignore-daemonsets
```

Next we ssh into it and check versions:

```shell
➜ ssh cluster3-controlplane1

➜ root@cluster3-controlplane1:~# kubelet --version
Kubernetes v1.27.6

➜ root@cluster3-controlplane1:~# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"28", GitVersion:"v1.28.2", GitCommit:"89a4ea3e1e4ddd7f7572286090359983e0387b2f", GitTreeState:"clean", BuildDate:"2023-09-13T09:34:32Z", GoVersion:"go1.20.8", Compiler:"gc", Platform:"linux/amd64"}
```

We see above that kubeadm is already installed in the required version. Otherwise we would need to install it:

```shell
# not necessary because here kubeadm is already installed in correct version
apt-mark unhold kubeadm
apt-mark hold kubectl kubelet
apt install kubeadm=1.28.2-00
apt-mark hold kubeadm
```

Check what kubeadm has available as an upgrade plan:

```shell
➜ root@cluster3-controlplane1:~# kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
```
[upgrade] Running cluster health checks [upgrade] Fetching available versions to upgrade to [upgrade/versions] Cluster version: v1.27.6 [upgrade/versions] kubeadm version: v1.28.2 [upgrade/versions] Target version: v1.28.2 [upgrade/versions] Latest version in the v1.27 series: v1.27.6 Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply': COMPONENT CURRENT TARGET kubelet 2 x v1.27.6 v1.28.2 Upgrade to the latest stable version: COMPONENT CURRENT TARGET kube-apiserver v1.27.6 v1.28.2 kube-controller-manager v1.27.6 v1.28.2 kube-scheduler v1.27.6 v1.28.2 kube-proxy v1.27.6 v1.28.2 CoreDNS v1.10.1 v1.10.1 etcd 3.5.7-0 3.5.9-0 You can now apply the upgrade by executing the following command: kubeadm upgrade apply v1.28.2 _____________________________________________________________________ The table below shows the current state of component configs as understood by this version of kubeadm. Configs that have a "yes" mark in the "MANUAL UPGRADE REQUIRED" column require manual config upgrade or resetting to kubeadm defaults before a successful upgrade can be performed. The version to manually upgrade to is denoted in the "PREFERRED VERSION" column. API GROUP CURRENT VERSION PREFERRED VERSION MANUAL UPGRADE REQUIRED kubeproxy.config.k8s.io v1alpha1 v1alpha1 no kubelet.config.k8s.io v1beta1 v1beta1 no _____________________________________________________________________ And we apply to the required version: ➜ root@cluster3-controlplane1:~# kubeadm upgrade apply v1.28.2 [upgrade/config] Making sure the configuration is correct: [upgrade/config] Reading configuration from the cluster... [upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml' [preflight] Running pre-flight checks. 
[upgrade] Running cluster health checks [upgrade/version] You have chosen to change the cluster version to "v1.28.2" [upgrade/versions] Cluster version: v1.27.6 [upgrade/versions] kubeadm version: v1.28.2 [upgrade] Are you sure you want to proceed? [y/N]: y [upgrade/prepull] Pulling images required for setting up a Kubernetes cluster ... [bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster [addons] Applied essential addon: CoreDNS [addons] Applied essential addon: kube-proxy [upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.28.2". Enjoy! [upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so. Next we can check if our required version was installed correctly: ➜ root@cluster3-controlplane1:~# kubeadm upgrade plan [upgrade/config] Making sure the configuration is correct: [upgrade/config] Reading configuration from the cluster... [upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml' [preflight] Running pre-flight checks. [upgrade] Running cluster health checks [upgrade] Fetching available versions to upgrade to [upgrade/versions] Cluster version: v1.28.2 [upgrade/versions] kubeadm version: v1.28.2 [upgrade/versions] Target version: v1.28.2 [upgrade/versions] Latest version in the v1.28 series: v1.28.2 Control Plane kubelet and kubectl Now we have to upgrade kubelet and kubectl: ➜ root@cluster3-controlplane1:~# apt update Hit:1 http://ppa.launchpad.net/rmescandon/yq/ubuntu focal InRelease Hit:3 http://us.archive.ubuntu.com/ubuntu bionic InRelease Hit:2 https://packages.cloud.google.com/apt kubernetes-xenial InRelease Reading package lists... Done Building dependency tree Reading state information... Done 2 packages can be upgraded. Run 'apt list --upgradable' to see them. 
➜ root@cluster3-controlplane1:~# apt-mark unhold kubelet kubectl kubelet was already not hold. kubectl was already not hold. ➜ root@cluster3-controlplane1:~# apt install kubelet=1.28.2-00 kubectl=1.28.2-00 Reading package lists... Done Building dependency tree Reading state information... Done The following packages will be upgraded: kubectl kubelet 2 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. Need to get 29.8 MB of archives. After this operation, 5,194 kB of additional disk space will be used. Get:1 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubectl amd64 1.28.2-00 [10.3 MB] Get:2 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubelet amd64 1.28.2-00 [19.5 MB] Fetched 29.8 MB in 2s (17.3 MB/s) (Reading database ... 112527 files and directories currently installed.) Preparing to unpack .../kubectl_1.28.2-00_amd64.deb ... Unpacking kubectl (1.28.2-00) over (1.27.6-00) ... Preparing to unpack .../kubelet_1.28.2-00_amd64.deb ... Unpacking kubelet (1.28.2-00) over (1.27.6-00) ... Setting up kubectl (1.28.2-00) ... Setting up kubelet (1.28.2-00) ... ➜ root@cluster3-controlplane1:~# apt-mark hold kubelet kubectl kubelet set on hold. kubectl set on hold. ➜ root@cluster3-controlplane1:~# service kubelet restart ➜ root@cluster3-controlplane1:~# service kubelet status ● kubelet.service - kubelet: The Kubernetes Node Agent Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled) Drop-In: /etc/systemd/system/kubelet.service.d └─10-kubeadm.conf Active: active (running) since Tue 2023-09-26 12:52:25 UTC; 3s ago Docs: https://kubernetes.io/docs/home/ Main PID: 34030 (kubelet) Tasks: 11 (limit: 1066) Memory: 34.4M CGroup: /system.slice/kubelet.service ... 
➜ root@cluster3-controlplane1:~# kubectl get node NAME STATUS ROLES AGE VERSION cluster3-controlplane1 Ready,SchedulingDisabled control-plane 150m v1.28.2 cluster3-node1 Ready <none> 143m v1.27.6 Done, and uncordon: ➜ k uncordon cluster3-controlplane1 node/cluster3-controlplane1 uncordoned Data Plane ➜ k get node NAME STATUS ROLES AGE VERSION cluster3-controlplane1 Ready control-plane 150m v1.28.2 cluster3-node1 Ready <none> 143m v1.27.6 Our data plane consist of one single worker node, so let's update it. First thing is we should drain it: k drain cluster3-node1 --ignore-daemonsets Next we ssh into it and upgrade kubeadm to the wanted version, or check if already done: ➜ ssh cluster3-node1 ➜ root@cluster3-node1:~# apt update Hit:1 http://ppa.launchpad.net/rmescandon/yq/ubuntu focal InRelease Hit:3 http://us.archive.ubuntu.com/ubuntu bionic InRelease Get:2 https://packages.cloud.google.com/apt kubernetes-xenial InRelease [8,993 B] Fetched 8,993 B in 1s (17.8 kB/s) Reading package lists... Done Building dependency tree Reading state information... Done 3 packages can be upgraded. Run 'apt list --upgradable' to see them. ➜ root@cluster3-node1:~# apt-mark unhold kubeadm kubeadm was already not hold. ➜ root@cluster3-node1:~# apt-mark hold kubectl kubelet kubectl set on hold. kubelet set on hold. ➜ root@cluster3-node1:~# apt install kubeadm=1.28.2-00 Reading package lists... Done Building dependency tree Reading state information... Done The following packages will be upgraded: kubeadm 1 upgraded, 0 newly installed, 0 to remove and 2 not upgraded. Need to get 10.3 MB of archives. After this operation, 2,589 kB of additional disk space will be used. Get:1 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubeadm amd64 1.28.2-00 [10.3 MB] Fetched 10.3 MB in 1s (19.1 MB/s) (Reading database ... 112527 files and directories currently installed.) Preparing to unpack .../kubeadm_1.28.2-00_amd64.deb ... Unpacking kubeadm (1.28.2-00) over (1.27.6-00) ... 
Setting up kubeadm (1.28.2-00) ... ➜ root@cluster3-node1:~# apt-mark hold kubeadm kubeadm set on hold. ➜ root@cluster3-node1:~# kubeadm upgrade node [upgrade] Reading configuration from the cluster... [upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml' [preflight] Running pre-flight checks [preflight] Skipping prepull. Not a control plane node. [upgrade] Skipping phase. Not a control plane node. [upgrade] Backing up kubelet config file to /etc/kubernetes/tmp/kubeadm-kubelet-config1123040998/config.yaml [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml" [upgrade] The configuration for this node was successfully updated! [upgrade] Now you should go ahead and upgrade the kubelet package using your package manager. Now we follow that kubeadm told us in the last line and upgrade kubelet (and kubectl): ➜ root@cluster3-node1:~# apt-mark unhold kubectl kubelet Canceled hold on kubectl. Canceled hold on kubelet. ➜ root@cluster3-node1:~# apt install kubelet=1.28.2-00 kubectl=1.28.2-00 Reading package lists... Done Building dependency tree Reading state information... Done The following packages will be upgraded: kubectl kubelet 2 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. Need to get 29.8 MB of archives. After this operation, 5,194 kB of additional disk space will be used. Get:1 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubectl amd64 1.28.2-00 [10.3 MB] Get:2 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubelet amd64 1.28.2-00 [19.5 MB] Fetched 29.8 MB in 2s (14.5 MB/s) (Reading database ... 112527 files and directories currently installed.) Preparing to unpack .../kubectl_1.28.2-00_amd64.deb ... Unpacking kubectl (1.28.2-00) over (1.27.6-00) ... Preparing to unpack .../kubelet_1.28.2-00_amd64.deb ... Unpacking kubelet (1.28.2-00) over (1.27.6-00) ... Setting up kubectl (1.28.2-00) ... Setting up kubelet (1.28.2-00) ... 
```shell
➜ root@cluster3-node1:~# service kubelet restart

➜ root@cluster3-node1:~# service kubelet status
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Tue 2023-09-26 12:56:19 UTC; 4s ago
       Docs: https://kubernetes.io/docs/home/
   Main PID: 34075 (kubelet)
      Tasks: 9 (limit: 1066)
     Memory: 26.4M
     CGroup: /system.slice/kubelet.service
...
```

Looking good, what does the node status say?

```shell
➜ k get node
NAME                     STATUS                     ROLES           AGE    VERSION
cluster3-controlplane1   Ready                      control-plane   154m   v1.28.2
cluster3-node1           Ready,SchedulingDisabled   <none>          147m   v1.28.2
```

Beautiful, let's make it schedulable again:

```shell
➜ k uncordon cluster3-node1
node/cluster3-node1 uncordoned

➜ k get node
NAME                     STATUS   ROLES           AGE    VERSION
cluster3-controlplane1   Ready    control-plane   154m   v1.28.2
cluster3-node1           Ready    <none>          147m   v1.28.2
```

We're up to date.

第二十一题: Image Vulnerability Scanning

问题

Task weight: 2% (can be solved in any kubectl context)

The Vulnerability Scanner trivy is installed on your main terminal. Use it to scan the following images for known CVEs:

- nginx:1.16.1-alpine
- k8s.gcr.io/kube-apiserver:v1.18.0
- k8s.gcr.io/kube-controller-manager:v1.18.0
- docker.io/weaveworks/weave-kube:2.7.0

Write all images that don't contain the vulnerabilities CVE-2020-10878 or CVE-2020-1967 into /opt/course/21/good-images.

解答

The tool trivy is very simple to use: it compares images against public vulnerability databases.

```shell
➜ trivy nginx:1.16.1-alpine
2020-10-09T20:59:39.198Z INFO Need to update DB
2020-10-09T20:59:39.198Z INFO Downloading DB... 18.81 MiB / 18.81 MiB [-------------------------------------
2020-10-09T20:59:45.499Z INFO Detecting Alpine vulnerabilities...
```
nginx:1.16.1-alpine (alpine 3.10.4)
===================================
Total: 7 (UNKNOWN: 0, LOW: 0, MEDIUM: 7, HIGH: 0, CRITICAL: 0)
+---------------+------------------+----------+-------------------
|    LIBRARY    | VULNERABILITY ID | SEVERITY | INSTALLED VERSION
+---------------+------------------+----------+-------------------
| libcrypto1.1  | CVE-2020-1967    | MEDIUM   | 1.1.1d-r2
...

To solve the task we can run:

➜ trivy nginx:1.16.1-alpine | grep -E 'CVE-2020-10878|CVE-2020-1967'
| libcrypto1.1  | CVE-2020-1967 | MEDIUM |
| libssl1.1     | CVE-2020-1967 |

➜ trivy k8s.gcr.io/kube-apiserver:v1.18.0 | grep -E 'CVE-2020-10878|CVE-2020-1967'
| perl-base | CVE-2020-10878 | HIGH

➜ trivy k8s.gcr.io/kube-controller-manager:v1.18.0 | grep -E 'CVE-2020-10878|CVE-2020-1967'
| perl-base | CVE-2020-10878 | HIGH

➜ trivy docker.io/weaveworks/weave-kube:2.7.0 | grep -E 'CVE-2020-10878|CVE-2020-1967'
➜

The only image without any of the two CVEs is docker.io/weaveworks/weave-kube:2.7.0, hence our answer will be:

# /opt/course/21/good-images
docker.io/weaveworks/weave-kube:2.7.0

Question 22: Manual Static Security Analysis

Question

Task weight: 3%

(can be solved in any kubectl context)

The Release Engineering Team has shared some YAML manifests and Dockerfiles with you to review. The files are located under /opt/course/22/files.

As a container security expert, you are asked to perform a manual static analysis and find out possible security issues with respect to unwanted credential exposure. Running processes as root is of no concern in this task.

Write the filenames which have issues into /opt/course/22/security-issues.

NOTE: In the Dockerfiles and YAML manifests, assume that the referred files, folders, secrets and volume mounts are present. Disregard syntax or logic errors.

Solution

We check location /opt/course/22/files and list the files:

➜ ls -la /opt/course/22/files
total 48
drwxr-xr-x 2 k8s k8s 4096 Sep 16 19:08 .
drwxr-xr-x 3 k8s k8s 4096 Sep 16 19:08 ..
-rw-r--r-- 1 k8s k8s  692 Sep 16 19:08 Dockerfile-go
-rw-r--r-- 1 k8s k8s  897 Sep 16 19:08 Dockerfile-mysql
-rw-r--r-- 1 k8s k8s  743 Sep 16 19:08 Dockerfile-py
-rw-r--r-- 1 k8s k8s  341 Sep 16 19:08 deployment-nginx.yaml
-rw-r--r-- 1 k8s k8s  705 Sep 16 19:08 deployment-redis.yaml
-rw-r--r-- 1 k8s k8s  392 Sep 16 19:08 pod-nginx.yaml
-rw-r--r-- 1 k8s k8s  228 Sep 16 19:08 pv-manual.yaml
-rw-r--r-- 1 k8s k8s  188 Sep 16 19:08 pvc-manual.yaml
-rw-r--r-- 1 k8s k8s  211 Sep 16 19:08 sc-local.yaml
-rw-r--r-- 1 k8s k8s  902 Sep 16 19:08 statefulset-nginx.yaml

We have 3 Dockerfiles and 7 Kubernetes resource YAML manifests. Next we should go over each to find security issues with the way credentials have been used.

NOTE: You should be comfortable with Docker Best Practices and the Kubernetes Configuration Best Practices.

While navigating through the files we might notice:

Number 1

File Dockerfile-mysql might look innocent at first glance. It copies a file secret-token over, uses it and deletes it afterwards. But because of the way Docker works, every RUN, COPY and ADD command creates a new layer, and every layer is persisted in the image.

This means that even though the file secret-token gets deleted in layer Z, it's still included in the image through layers X and Y. In this case it would be better to use, for example, variables passed to Docker.

# /opt/course/22/files/Dockerfile-mysql
FROM ubuntu

# Add MySQL configuration
COPY my.cnf /etc/mysql/conf.d/my.cnf
COPY mysqld_charset.cnf /etc/mysql/conf.d/mysqld_charset.cnf

RUN apt-get update && \
    apt-get -yq install mysql-server-5.6 &&

# Add MySQL scripts
COPY import_sql.sh /import_sql.sh
COPY run.sh /run.sh

# Configure credentials
COPY secret-token .
# LAYER X
RUN /etc/register.sh ./secret-token
# LAYER Y
RUN rm ./secret-token # delete secret token again
# LAYER Z

EXPOSE 3306
CMD ["/run.sh"]

So we do:

echo Dockerfile-mysql >> /opt/course/22/security-issues

Number 2

The file deployment-redis.yaml is fetching credentials from a Secret named mysecret and writes these into environment variables. So far so good, but in the command of the container it's echoing them, which means they can be read directly by any user having access to the logs.

# /opt/course/22/files/deployment-redis.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: mycontainer
        image: redis
        command: ["/bin/sh"]
        args:
        - "-c"
        - "echo $SECRET_USERNAME && echo $SECRET_PASSWORD && docker-entrypoint.sh" # NOT GOOD
        env:
        - name: SECRET_USERNAME
          valueFrom:
            secretKeyRef:
              name: mysecret
              key: username
        - name: SECRET_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysecret
              key: password

Credentials in logs are never a good idea, hence we do:

echo deployment-redis.yaml >> /opt/course/22/security-issues

Number 3

In file statefulset-nginx.yaml, the password is directly exposed in the environment variable definition of the container.

# /opt/course/22/files/statefulset-nginx.yaml
...
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        env:
        - name: Username
          value: Administrator
        - name: Password
          value: MyDiReCtP@sSw0rd # NOT GOOD
        ports:
        - containerPort: 80
          name: web
..

This should instead be injected via a Secret. So we do:

echo statefulset-nginx.yaml >> /opt/course/22/security-issues

➜ cat /opt/course/22/security-issues
Dockerfile-mysql
deployment-redis.yaml
statefulset-nginx.yaml
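A crude grep pre-filter can speed up such a manual review pass (it is no replacement for actually reading each file). The sketch below is hypothetical and works on demo copies created in a temp directory, not on the real /opt/course/22/files; the pattern only catches the two smells found above, secrets echoed into logs and plain-text password values:

```shell
# Hypothetical pre-filter for a manual credential-exposure review.
# Demo files only; the real task files are under /opt/course/22/files.
demo=$(mktemp -d)

cat > "$demo/statefulset-demo.yaml" << 'EOF'
        env:
        - name: Password
          value: MyDiReCtP@sSw0rd
EOF

cat > "$demo/deployment-demo.yaml" << 'EOF'
        args:
        - "-c"
        - "echo $SECRET_USERNAME && echo $SECRET_PASSWORD && docker-entrypoint.sh"
EOF

cat > "$demo/deployment-ok.yaml" << 'EOF'
        env:
        - name: SECRET_PASSWORD
          valueFrom:
            secretKeyRef: {name: mysecret, key: password}
EOF

# -l prints only the names of matching files
grep -lE 'echo \$SECRET|name: Password' "$demo"/*.yaml
```

Only the two bad manifests are listed; the one using secretKeyRef passes the filter and would still be reviewed by hand.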
2023年12月23日
2023-11-26
Kubernetes安装metrics-server服务(指标采集服务)
Kubernetes安装metrics-server服务(指标采集服务) 创建文件metrics-server.yaml 安装 Kubernetes安装metrics-server服务(指标采集服务) 创建文件metrics-server.yaml cat << EOF > metrics-server.yaml apiVersion: v1 kind: ServiceAccount metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: k8s-app: metrics-server rbac.authorization.k8s.io/aggregate-to-admin: "true" rbac.authorization.k8s.io/aggregate-to-edit: "true" rbac.authorization.k8s.io/aggregate-to-view: "true" name: system:aggregated-metrics-reader rules: - apiGroups: - metrics.k8s.io resources: - pods - nodes verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: k8s-app: metrics-server name: system:metrics-server rules: - apiGroups: - "" resources: - nodes/metrics verbs: - get - apiGroups: - "" resources: - pods - nodes verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: labels: k8s-app: metrics-server name: metrics-server-auth-reader namespace: kube-system roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: extension-apiserver-authentication-reader subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: k8s-app: metrics-server name: metrics-server:system:auth-delegator roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:auth-delegator subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: k8s-app: metrics-server name: system:metrics-server roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:metrics-server subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: v1 kind: Service metadata: labels: k8s-app: metrics-server name: 
metrics-server namespace: kube-system spec: ports: - name: https port: 443 protocol: TCP targetPort: https selector: k8s-app: metrics-server --- apiVersion: apps/v1 kind: Deployment metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system spec: selector: matchLabels: k8s-app: metrics-server strategy: rollingUpdate: maxUnavailable: 0 template: metadata: labels: k8s-app: metrics-server spec: containers: - args: - --cert-dir=/tmp - --secure-port=4443 - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname - --kubelet-use-node-status-port - --metric-resolution=15s image: registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.3 imagePullPolicy: IfNotPresent livenessProbe: failureThreshold: 3 httpGet: path: /livez port: https scheme: HTTPS periodSeconds: 10 name: metrics-server ports: - containerPort: 4443 name: https protocol: TCP readinessProbe: failureThreshold: 3 httpGet: path: /readyz port: https scheme: HTTPS initialDelaySeconds: 20 periodSeconds: 10 resources: requests: cpu: 100m memory: 200Mi securityContext: allowPrivilegeEscalation: false readOnlyRootFilesystem: true runAsNonRoot: true runAsUser: 1000 volumeMounts: - mountPath: /tmp name: tmp-dir nodeSelector: kubernetes.io/os: linux priorityClassName: system-cluster-critical serviceAccountName: metrics-server volumes: - emptyDir: {} name: tmp-dir --- apiVersion: apiregistration.k8s.io/v1 kind: APIService metadata: labels: k8s-app: metrics-server name: v1beta1.metrics.k8s.io spec: group: metrics.k8s.io groupPriorityMinimum: 100 insecureSkipTLSVerify: true service: name: metrics-server namespace: kube-system version: v1beta1 versionPriority: 100 EOF 安装 kubectl apply -f metrics-server.yaml
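After applying the manifest, the installation can be verified once the metrics-server Pod is ready. The following is a hedged sketch (not part of the original article): it assumes kubectl is configured and the cluster is reachable, and otherwise just prints a notice and exits cleanly:

```shell
# Verification sketch for metrics-server; assumes a reachable cluster,
# otherwise it only reports that the checks were skipped.
verify_metrics_server() {
  if command -v kubectl >/dev/null 2>&1 && kubectl get --raw /readyz >/dev/null 2>&1; then
    # The APIService should eventually report AVAILABLE=True
    kubectl get apiservice v1beta1.metrics.k8s.io
    # Metrics become queryable after one or two scrape intervals (15s here)
    kubectl top nodes
    kubectl top pods -n kube-system
  else
    echo "no reachable cluster; skipping metrics-server verification"
  fi
}

verify_metrics_server
```

If `kubectl top nodes` still fails after a minute or two, the metrics-server Pod logs in kube-system are the first place to look.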
2023年11月26日
2023-11-26
Ubuntu22.04中安装Kubernetes1.27高可用(Docker作为容器运行时)
Kubernetes1.27版本安装记录过程
安装前准备
集群规划
前提准备
开始安装
1. 安装容器运行时(所有节点执行)
安装前准备
更新apt 并 安装必要工具包
添加Docker官方GPG key
设置apt库
安装Docker
安装24.0.0版本Docker
验证是否成功
手动设置Docker使用systemd cgroup驱动
启动并设置开机启动
查看Docker的cgroup驱动是systemd
安装cri-dockerd
下载cri-docker版本:0.3.2
安装deb软件包
添加infra容器镜像配置
启动并设置开机启动
2. 安装 kubeadm、kubelet 和 kubectl (所有节点执行)
3. 配置cgroup驱动程序(所有节点执行)
4. 配置负载均衡(控制平面节点执行)
说明
安装KeepAlive(k8s静态Pod方式安装)
创建/etc/keepalived/keepalived.conf配置文件(主: k8s-master01节点)
创建/etc/keepalived/keepalived.conf配置文件(备: k8s-master02、k8s-master03节点)
创建心跳检测文件(主备都执行: k8s-master01、k8s-master02、k8s-master03)
创建Keepalive Pod yaml文件(主备都执行: k8s-master01、k8s-master02、k8s-master03)
安装HAProxy(k8s静态Pod方式安装)
创建HAProxy配置文件 (k8s-master01、k8s-master02、k8s-master03执行)
创建HAProxy Pod 需要的yaml文件(k8s-master01、k8s-master02、k8s-master03执行)
针对Keepalive和HAProxy的说明
5. 初始化Master节点(控制平面节点执行)
参考界面
查看kubeadm init命令默认配置文件 (参考)
下载镜像
手动安装1.27.1版本需要的镜像
启动初始化
配置用户可以使用kubectl 命令
验证kubectl 命令是否可用
安装容器网络(CNI)插件
移除控制平面污点
7. Worker节点加入集群(所有Worker节点执行,必须使用root用户执行)
8. 高可用Master主节点加入集群(必须使用Root用户执行)
9. 可选操作
(可选)从控制平面节点以外的计算机控制集群
(可选)将 API 服务器代理到本地主机

Kubernetes1.27版本安装记录过程

安装参考: https://kubernetes.io/zh-cn/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

安装前准备

集群规划

系统: Ubuntu22.04

root@k8s-master01:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.2 LTS
Release:        22.04
Codename:       jammy

安装工具:kubeadm

集群分布: 3主节点 + 2个工作节点

容器运行时: 选用Docker

因为1.24版本在k8s主干分支中移除了 dockershim,并将其交给docker维护,后将其改名为cri-dockerd,因此我们需要安装:

安装Docker
安装cri-dockerd

采用堆叠的方式创建高可用集群(etcd和控制平面其他组件在同一个节点上):

集群节点图:

高可用参考: https://developer.aliyun.com/article/853054

前提准备

节点之中不可以有重复的主机名、MAC 地址或 product_uuid。请参见这里了解更多详细信息。

主机名验证

# 查看主机名
hostname
# 修改为k8s-master (替换成每个节点主机名)
hostnamectl hostname k8s-master

Mac地址验证

使用命令 ip link 或 ifconfig -a 来获取网络接口的 MAC 地址

Product_uuid验证

sudo cat /sys/class/dmi/id/product_uuid

配置主机名映射

配置文件:/etc/hosts

192.168.0.18 k8s-master
192.168.0.19 k8s-master01
192.168.0.20 k8s-master02
192.168.0.21 k8s-master03
192.168.0.22 k8s-slave01
192.168.0.23 k8s-slave02

192.168.0.18 k8s-master 这一条是为了配置高可用集群准备的。

关闭防火墙

ufw disable

转发 IPv4 并让 iptables 看到桥接流量

执行下述指令:

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

# 设置所需的 sysctl 参数,参数在重新启动后保持不变
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

# 应用 sysctl 参数而不重新启动
sudo sysctl --system

通过运行以下指令确认 br_netfilter 和 overlay 模块被加载:

lsmod | grep br_netfilter
lsmod | grep overlay

通过运行以下指令确认 net.bridge.bridge-nf-call-iptables、net.bridge.bridge-nf-call-ip6tables 和 net.ipv4.ip_forward 系统变量在你的 sysctl 配置中被设置为 1:

sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward

一台兼容的 Linux 主机。Kubernetes 项目为基于 Debian 和 Red Hat 的 Linux 发行版以及一些不提供包管理器的发行版提供通用的指令。

每台机器 2 GB 或更多的 RAM(如果少于这个数字将会影响你应用的运行内存)。

CPU 2 核心及以上。

集群中的所有机器的网络彼此均能相互连接(公网和内网都可以)。

开启机器上的某些端口。请参见这里了解更多详细信息。

nc 127.0.0.1 6443 无输出,表示正常

禁用交换分区。为了保证 kubelet 正常工作,你必须禁用交换分区。 例如,sudo swapoff -a 将暂时禁用交换分区。要使此更改在重启后保持不变,请确保在如 /etc/fstab、systemd.swap 等配置文件中禁用交换分区,具体取决于你的系统如何配置。

#关闭swap
swapoff -a
sed -ri 's/.*swap.*/#&/' /etc/fstab

查看主机cgroup版本

stat -fc %T /sys/fs/cgroup/

对于 cgroup v1,输出为 tmpfs。

对于 cgroup
v2,输出为 cgroup2fs。 cgroup v2 具有以下要求: 操作系统发行版启用 cgroup v2 Linux 内核为 5.8 或更高版本 容器运行时支持 cgroup v2。例如: containerd v1.4 和更高版本 cri-o v1.20 和更高版本 kubelet 和容器运行时被配置为使用 systemd cgroup 驱动 参考: https://kubernetes.io/zh-cn/docs/concepts/architecture/cgroups/ 开始安装 1. 安装容器运行时(所有节点执行) https://kubernetes.io/zh-cn/docs/setup/production-environment/container-runtimes/#docker 安装前准备 更新apt 并 安装必要工具包 sudo apt-get remove docker docker-engine docker.io containerd runc sudo apt-get update sudo apt-get install ca-certificates curl gnupg 添加Docker官方GPG key sudo install -m 0755 -d /etc/apt/keyrings curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg sudo chmod a+r /etc/apt/keyrings/docker.gpg 设置apt库 echo \ "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \ "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \ sudo tee /etc/apt/sources.list.d/docker.list > /dev/null 安装Docker 安装24.0.0版本Docker sudo apt-get update VERSION_STRING=5:24.0.0-1~ubuntu.22.04~jammy sudo apt-get install docker-ce=$VERSION_STRING docker-ce-cli=$VERSION_STRING containerd.io docker-buildx-plugin docker-compose-plugin -y 验证是否成功 root@k8s-master01:~# docker version Client: Docker Engine - Community Version: 24.0.0 API version: 1.43 Go version: go1.20.4 Git commit: 98fdcd7 Built: Mon May 15 18:49:22 2023 OS/Arch: linux/amd64 Context: default Server: Docker Engine - Community Engine: Version: 24.0.0 API version: 1.43 (minimum version 1.12) Go version: go1.20.4 Git commit: 1331b8c Built: Mon May 15 18:49:22 2023 OS/Arch: linux/amd64 Experimental: false containerd: Version: 1.6.21 GitCommit: 3dce8eb055cbb6872793272b4f20ed16117344f8 runc: Version: 1.1.7 GitCommit: v1.1.7-0-g860f061 docker-init: Version: 0.19.0 GitCommit: de40ad0 手动设置Docker使用systemd cgroup驱动 修改: /etc/docker/daemon.json文件,添加如下内容 保存重启docker cat > /etc/docker/daemon.json << EOF { "registry-mirrors": 
["https://84bkfzte.mirror.aliyuncs.com"], "exec-opts": ["native.cgroupdriver=systemd"] } EOF systemctl daemon-reload && systemctl restart docker 启动并设置开机启动 systemctl enable docker --now 查看Docker的cgroup驱动是systemd root@k8s-master01:~# docker info | grep -i cgroup Cgroup Driver: systemd Cgroup Version: 2 cgroupns 安装cri-dockerd 下载cri-docker版本:0.3.2 https://github.com/Mirantis/cri-dockerd/releases/tag/v0.3.2 安装deb软件包 sudo dpkg -i ./cri-dockerd_0.3.2.3-0.ubuntu-jammy_amd64.deb 添加infra容器镜像配置 修改镜像地址为国内,否则kubelet拉取不了镜像导致启动失败 sudo sed -i 's|ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd://|ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.9|' /usr/lib/systemd/system/cri-docker.service 修改以后内容如下: cat /usr/lib/systemd/system/cri-docker.service ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.9 启动并设置开机启动 systemctl daemon-reload systemctl enable cri-docker --now 2. 
安装 kubeadm、kubelet 和 kubectl (所有节点执行) 你需要在每台机器上安装以下的软件包: kubeadm:用来初始化集群的指令。 kubelet:在集群中的每个节点上用来启动 Pod 和容器等。 kubectl:用来与集群通信的命令行工具。 更新 apt 包索引并安装使用 Kubernetes apt 仓库所需要的包, 并配置阿里apt源: sudo apt-get update sudo apt-get install -y apt-transport-https ca-certificates curl curl -fsSL https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo gpg --dearmor -o /usr/share/keyrings/kubernetes-archive-keyring.gpg echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list 更新 apt 包索引,安装 kubelet、kubeadm 和 kubectl,并锁定其版本: sudo apt-get update sudo apt-get install -y kubelet=1.27.1-00 kubeadm=1.27.1-00 kubectl=1.27.1-00 sudo apt-mark hold kubelet kubeadm kubectl 查看对应版本 root@k8s-master01:~# kubectl version -o yaml clientVersion: buildDate: "2023-04-14T13:21:19Z" compiler: gc gitCommit: 4c9411232e10168d7b050c49a1b59f6df9d7ea4b gitTreeState: clean gitVersion: v1.27.1 goVersion: go1.20.3 major: "1" minor: "27" platform: linux/amd64 kustomizeVersion: v5.0.1 The connection to the server localhost:8080 was refused - did you specify the right host or port? root@k8s-master01:~# kubelet --version Kubernetes v1.27.1 root@k8s-master01:~# kubeadm version kubeadm version: &version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.1", GitCommit:"4c9411232e10168d7b050c49a1b59f6df9d7ea4b", GitTreeState:"clean", BuildDate:"2023-04-14T13:20:04Z", GoVersion:"go1.20.3", Compiler:"gc", Platform:"linux/amd64"} root@k8s-master01:~# 注意:kubelet 现在每隔几秒就会重启,因为它陷入了一个等待 kubeadm 指令的死循环(正常现象,初始化好主节点就好了)。 3. 
配置cgroup驱动程序(所有节点执行)

配置kubelet cgroup驱动

https://kubernetes.io/zh-cn/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/#%E6%9B%B4%E6%96%B0%E6%89%80%E6%9C%89%E8%8A%82%E7%82%B9%E7%9A%84-cgroup-%E9%A9%B1%E5%8A%A8

警告: 你需要确保容器运行时和 kubelet 所使用的是相同的 cgroup 驱动,否则 kubelet 进程会失败。

如上,我们都使用 systemd。新建文件并写入内容:

root@k8s-master01:~# cat << EOF >> /etc/default/kubelet
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd"
EOF
root@k8s-master01:~# cat /etc/default/kubelet
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd"
root@k8s-master01:~# systemctl restart kubelet

4. 配置负载均衡(控制平面节点执行)

参考页面:https://github.com/kubernetes/kubeadm/blob/main/docs/ha-considerations.md#options-for-software-load-balancing

说明

我们使用KeepAlive和HAProxy完成负载均衡高可用的配置。

安装KeepAlive(k8s静态Pod方式安装)

创建/etc/keepalived/keepalived.conf配置文件(主: k8s-master01节点)

sudo mkdir -p /etc/keepalived
sudo cat << EOF > /etc/keepalived/keepalived.conf
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
  # 指定验证KeepAlive是否存活脚本位置
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}
vrrp_instance VI_1 {
    ! 指定MASTER 或者 BACKUP, 这里我们使用k8s-master01节点作为MASTER
    state MASTER
    ! 网卡名称
    interface enp0s3
    ! 指定router_id,集群该值必须相同,这里指定为:51
    virtual_router_id 51
    ! 优先级: MASTER节点优先级要高于BACKUP节点,我们MASTER节点配置:101, BACKUP节点设置:100
    priority 101
    authentication {
        auth_type PASS
        ! 验证密码: 集群该值必须相同,这里指定为:42
        auth_pass 42
    }
    virtual_ipaddress {
        ! 虚拟IP地址: 该地址将作为KeepAlive对外暴露的地址,指定的IP必须是你集群所在的网络里面没有被使用的IP地址,这里指定:192.168.0.18
        ! 同时该地址也是将要指定kubeadm init 命令 --control-plane-endpoint 参数中的,至于端口, 需要在HAProxy里面指定
        192.168.0.18
    }
    track_script {
        check_apiserver
    }
}
EOF
cat /etc/keepalived/keepalived.conf

创建/etc/keepalived/keepalived.conf配置文件(备: k8s-master02、k8s-master03节点)

sudo mkdir -p /etc/keepalived
sudo cat << EOF > /etc/keepalived/keepalived.conf
! /etc/keepalived/keepalived.conf
!
Configuration File for keepalived global_defs { router_id LVS_DEVEL } vrrp_script check_apiserver { # 指定验证KeepAlive是否存活脚本位置 script "/etc/keepalived/check_apiserver.sh" interval 3 weight -2 fall 10 rise 2 } vrrp_instance VI_1 { ! 指定MASTER 或者 BACKUP, 这里我们使用k8s-master01节点作为MASTER state BACKUP ! 网卡名称 interface enp0s3 ! 指定router_id,集群该值必须相同,这里指定为:51 virtual_router_id 51 ! 优先级: MASTER节点优先级要高于BACKUP节点,我们MASTER节点配置:101, BACKUP节点设置:100 priority 100 authentication { auth_type PASS ! 验证密码: 集群该值必须相同,这里指定为:42 auth_pass 42 } virtual_ipaddress { ! 虚拟IP地址: 改地址将作为KeepAlive对外暴露的地址,指定的IP必须是你集群所在的网络里面没有被使用的IP地址,这里指定:192.168.0.18 ! 同时改地址也是将要指定kubeadm init 命令 --control-plane-endpoint 参数中的,至于端口, 需要在HAProxy里面指定 192.168.0.18 } track_script { check_apiserver } } EOF cat /etc/keepalived/keepalived.conf 创建心跳检测文件(主备都执行: k8s-master01、k8s-master02、k8s-master03) sudo mkdir -p /etc/keepalived sudo cat << EOF > /etc/keepalived/check_apiserver.sh # /etc/keepalived/check_apiserver.sh #!/bin/sh errorExit() { echo "*** $*" 1>&2 exit 1 } curl --silent --max-time 2 --insecure https://localhost:8443/ -o /dev/null || errorExit "Error GET https://localhost:8443/" if ip addr | grep -q 192.168.0.18; then curl --silent --max-time 2 --insecure https://192.168.0.18:8443/ -o /dev/null || errorExit "Error GET https://192.168.0.18:8443/" fi EOF cat /etc/keepalived/check_apiserver.sh 创建Keepalive Pod yaml文件(主备都执行: k8s-master01、k8s-master02、k8s-master03) 文件名: /etc/kubernetes/manifests/keepalived.yaml sudo cat << EOF > /etc/kubernetes/manifests/keepalived.yaml apiVersion: v1 kind: Pod metadata: creationTimestamp: null name: keepalived namespace: kube-system spec: containers: - image: osixia/keepalived:2.0.17 name: keepalived resources: {} securityContext: capabilities: add: - NET_ADMIN - NET_BROADCAST - NET_RAW volumeMounts: - mountPath: /usr/local/etc/keepalived/keepalived.conf name: config - mountPath: /etc/keepalived/check_apiserver.sh name: check hostNetwork: true volumes: - hostPath: path: 
/etc/keepalived/keepalived.conf name: config - hostPath: path: /etc/keepalived/check_apiserver.sh name: check status: {} EOF cat /etc/kubernetes/manifests/keepalived.yaml 安装HAProxy(k8s静态Pod方式安装) 说明: 由于现在没有进行kubeadm init 操作,因此现在kubelet组件启动不了,因此想要看到效果需要等到kubeadm init 指定完成以后。 创建HAProxy配置文件 (k8s-master01、k8s-master02、k8s-master03执行) sudo mkdir -p /etc/haproxy sudo cat << EOF > /etc/haproxy/haproxy.cfg # /etc/haproxy/haproxy.cfg #--------------------------------------------------------------------- # Global settings #--------------------------------------------------------------------- global log /dev/log local0 log /dev/log local1 notice daemon #--------------------------------------------------------------------- # common defaults that all the 'listen' and 'backend' sections will # use if not designated in their block #--------------------------------------------------------------------- defaults mode http log global option httplog option dontlognull option http-server-close option forwardfor except 127.0.0.0/8 option redispatch retries 1 timeout http-request 10s timeout queue 20s timeout connect 5s timeout client 20s timeout server 20s timeout http-keep-alive 10s timeout check 10s #--------------------------------------------------------------------- # apiserver frontend which proxys to the control plane nodes #--------------------------------------------------------------------- frontend apiserver # 指定负载均衡绑定的地址和端口,这里的端口需要和/etc/keepalived/check_apiserver.sh文件中监控的端口相同 bind *:8443 mode tcp option tcplog default_backend apiserver #--------------------------------------------------------------------- # round robin balancing for apiserver #--------------------------------------------------------------------- backend apiserver option httpchk GET /healthz http-check expect status 200 mode tcp option ssl-hello-chk balance roundrobin # HAProxy负载均衡器代理的后端节点 server k8s-master01 192.168.0.19:6443 check server k8s-master02 192.168.0.20:6443 check server k8s-master03 
192.168.0.21:6443 check # [...] EOF cat /etc/haproxy/haproxy.cfg 创建HAProxy Pod 需要的yaml文件(k8s-master01、k8s-master02、k8s-master03执行) sudo cat << EOF > /etc/kubernetes/manifests/haproxy.yaml apiVersion: v1 kind: Pod metadata: name: haproxy namespace: kube-system spec: containers: - image: haproxy:2.1.4 name: haproxy livenessProbe: failureThreshold: 8 httpGet: host: localhost path: /healthz # 指定HAProxy代理的端口,该端口必须和/etc/haproxy/haproxy.cfg配置的端口相同 port: 8443 scheme: HTTPS volumeMounts: - mountPath: /usr/local/etc/haproxy/haproxy.cfg name: haproxyconf readOnly: true hostNetwork: true volumes: - hostPath: path: /etc/haproxy/haproxy.cfg type: FileOrCreate name: haproxyconf status: {} EOF cat /etc/kubernetes/manifests/haproxy.yaml 针对Keepalive和HAProxy的说明 我们配置负载均衡使用8443端口而没有使用默认的6443端口是因为我们Keepalive和HAProxy都部署在主节点上,而主节点上也部署Kubernetes的api-server组件,而api-server主键已经占用了 6443端口,因此,我们这里配置了8443端口。 当使用Keepalived和HAProxy组合作为高可用负载均衡时,可以构建一个可靠的架构来提供高可用性和负载均衡功能。下面是一个简化的架构图,以可视化方式展示Keepalived和HAProxy的组合: +----------------------+ | Load Balancer | | (HAProxy) | +----------------------+ | | | | +--------+ +--------+ | | +--------------+ +--------------+ | Backend 1 | | Backend 2 | +--------------+ +--------------+ 在上面的架构图中,有以下组件: Load Balancer (HAProxy):负责接收客户端请求并将其转发到后端服务器。HAProxy是一种高性能的负载均衡器,能够根据不同的负载均衡算法将请求分发到后端服务器,以实现负载均衡和高可用性。 Backend 1 和 Backend 2:这些是真实的后端服务器,用于处理客户端请求。可以有多个后端服务器,以实现负载均衡和高可用性。这些后端服务器可以是应用服务器、数据库服务器等。 Keepalived:用于实现高可用性的组件。Keepalived监测Load Balancer节点的可用性,并在主节点发生故障时将其切换到备份节点。Keepalived使用虚拟IP地址(VIP)来提供无缝的故障转移和高可用性。 在这个架构中,客户端请求首先到达Load Balancer(HAProxy),然后根据负载均衡算法选择一个后端服务器进行处理。如果Load Balancer节点出现故障,Keepalived会自动检测到,并将主节点的VIP切换到备份节点,以确保服务的持续可用性。 访问HAProxy的入口是Keepalived提供的虚拟IP(VIP)。Keepalived会将虚拟IP绑定到主节点上,以便客户端可以通过该虚拟IP与负载均衡器通信。 在高可用负载均衡架构中,客户端不需要直接连接到单个负载均衡器节点。相反,客户端将请求发送到虚拟IP地址,该虚拟IP地址由Keepalived管理并绑定到当前的主节点上。通过这种方式,客户端可以无需关心主节点和备份节点之间的切换,而始终通过虚拟IP与负载均衡器通信。 当Keepalived检测到主节点故障时,它会自动将虚拟IP迁移到备份节点上,以实现无缝的故障转移。这样,客户端可以继续使用相同的虚拟IP与负载均衡器通信,无需感知主备节点的切换。 
总结起来,访问HAProxy的入口就是通过Keepalived提供的虚拟IP。客户端可以使用该虚拟IP来连接负载均衡器,并由负载均衡器将请求转发到后端服务器。 5. 初始化Master节点(控制平面节点执行) 参考界面 kubeadm init 命令运行过程: https://kubernetes.io/zh-cn/docs/reference/setup-tools/kubeadm/kubeadm-init/#custom-images kubeadm config print 打印kubeadm join 或者 kubeadm init 命令默认值: https://kubernetes.io/zh-cn/docs/reference/setup-tools/kubeadm/kubeadm-config/#cmd-config-print 查看kubeadm init命令默认配置文件 (参考) 这里输出的yaml格式不正确: 正确格式参考: https://kubernetes.io/docs/reference/config-api/kubeadm-config.v1beta3/ root@k8s-master01:~# kubeadm config print init-defaults | tee kubeadm-config.yaml apiVersion: kubeadm.k8s.io/v1beta3 bootstrapTokens: - groups: - system:bootstrappers:kubeadm:default-node-token token: abcdef.0123456789abcdef ttl: 24h0m0s usages: - signing - authentication kind: InitConfiguration localAPIEndpoint: advertiseAddress: 1.2.3.4 bindPort: 6443 nodeRegistration: criSocket: unix:///var/run/containerd/containerd.sock imagePullPolicy: IfNotPresent name: node taints: null --- apiServer: timeoutForControlPlane: 4m0s apiVersion: kubeadm.k8s.io/v1beta3 certificatesDir: /etc/kubernetes/pki clusterName: kubernetes controllerManager: {} dns: {} etcd: local: dataDir: /var/lib/etcd imageRepository: registry.k8s.io kind: ClusterConfiguration kubernetesVersion: 1.27.0 networking: dnsDomain: cluster.local serviceSubnet: 10.96.0.0/12 scheduler: {} 下载镜像 手动安装1.27.1版本需要的镜像 # 注意: 这里 EOF 是使用单引号括起来的,不适用单引号,后面的脚本文件会执行,导致结果错误 sudo cat << 'EOF' > download_image.sh #!/bin/bash # Kubernetes 安装的版本 KUBERNETES_VERSION=$(kubeadm version | grep -oP 'GitVersion:"v\K[^"]+') # 阿里Kubernetes官方镜像库 AILI_KUBERNETES_REGISTRY="registry.cn-hangzhou.aliyuncs.com/google_containers" echo "KUBERNETES_VERSION => ${KUBERNETES_VERSION}" echo "AILI_KUBERNETES_REGISTRY => ${AILI_KUBERNETES_REGISTRY}" # 下载并重命名镜像 function download_and_tag_image() { # 官方镜像全称: registry.k8s.io/xxx/xxx:xxx # 比如: registry.k8s.io/kube-proxy:v1.27.1 local full_official_image=$1 local ali_image ali_image=$(echo "$full_official_image" | sed -E 
"s|(.*/)(.*)|$AILI_KUBERNETES_REGISTRY/\2|") echo "downloading image => $ali_image" echo "downloading image => $ali_image" sudo docker pull "$ali_image" # 重命名镜像 echo "rename image $ali_image to $full_official_image" sudo docker tag "$ali_image" "$full_official_image" } # 官方镜像列表 OFFICIAL_IMAGE_LIST=$(kubeadm config images list --kubernetes-version "$KUBERNETES_VERSION" 2>/dev/null | grep "$OFFICIAL_KUBERNETES_REGISTRY") for official_image in $OFFICIAL_IMAGE_LIST; do download_and_tag_image "$official_image" done EOF cat download_image.sh sudo chmod u+x ./download_image.sh && ./download_image.sh 启动初始化 root@k8s-master01:~# kubeadm init \ --apiserver-advertise-address=192.168.0.19 \ --kubernetes-version v1.27.1 \ --service-cidr=10.96.0.0/12 \ --pod-network-cidr=10.244.0.0/16 \ --cri-socket=unix:///var/run/cri-dockerd.sock \ --control-plane-endpoint=k8s-master:8443 \ --upload-certs [init] Using Kubernetes version: v1.27.1 [preflight] Running pre-flight checks [preflight] Pulling images required for setting up a Kubernetes cluster [preflight] This might take a minute or two, depending on the speed of your internet connection [preflight] You can also perform this action in beforehand using 'kubeadm config images pull' W0520 20:12:40.038258 30744 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.1, falling back to the nearest etcd version (3.5.7-0) W0520 20:12:40.293262 30744 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.6" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image. 
[certs] Using certificateDir folder "/etc/kubernetes/pki" [certs] Generating "ca" certificate and key [certs] Generating "apiserver" certificate and key [certs] apiserver serving cert is signed for DNS names [k8s-master k8s-master01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.19] [certs] Generating "apiserver-kubelet-client" certificate and key [certs] Generating "front-proxy-ca" certificate and key [certs] Generating "front-proxy-client" certificate and key [certs] Generating "etcd/ca" certificate and key [certs] Generating "etcd/server" certificate and key [certs] etcd/server serving cert is signed for DNS names [k8s-master01 localhost] and IPs [192.168.0.19 127.0.0.1 ::1] [certs] Generating "etcd/peer" certificate and key [certs] etcd/peer serving cert is signed for DNS names [k8s-master01 localhost] and IPs [192.168.0.19 127.0.0.1 ::1] [certs] Generating "etcd/healthcheck-client" certificate and key [certs] Generating "apiserver-etcd-client" certificate and key [certs] Generating "sa" key and public key [kubeconfig] Using kubeconfig folder "/etc/kubernetes" W0520 20:12:42.465414 30744 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address [kubeconfig] Writing "admin.conf" kubeconfig file W0520 20:12:42.555482 30744 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address [kubeconfig] Writing "kubelet.conf" kubeconfig file W0520 20:12:42.934781 30744 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address [kubeconfig] Writing "controller-manager.conf" kubeconfig file W0520 20:12:43.058171 30744 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address [kubeconfig] Writing "scheduler.conf" kubeconfig file [kubelet-start] Writing 
kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
W0520 20:12:43.628841 30744 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.1, falling back to the nearest etcd version (3.5.7-0)
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 9.011713 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key: 3636bc7d84515aeb36ca79597792b07cc64a888ebdea9221ab68a5bae93ac947
[mark-control-plane] Marking the node k8s-master01 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node k8s-master01 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: th5i1f.fnzc9v0yb6z3aok8
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
W0520 20:12:54.340208 30744 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

    export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

    kubeadm join k8s-master:8443 --token th5i1f.fnzc9v0yb6z3aok8 \
        --discovery-token-ca-cert-hash sha256:25357bff7f44a787886222dc9439916ab271dc5af5d5bbef274288fdd8e245b4 \
        --control-plane --certificate-key 3636bc7d84515aeb36ca79597792b07cc64a888ebdea9221ab68a5bae93ac947

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:

    kubeadm join k8s-master:8443 --token th5i1f.fnzc9v0yb6z3aok8 \
        --discovery-token-ca-cert-hash sha256:25357bff7f44a787886222dc9439916ab271dc5af5d5bbef274288fdd8e245b4

The flags passed to kubeadm init above:

--apiserver-advertise-address=192.168.0.20 : the address the API server on this control-plane node advertises (must be set to the master node's IP).
--cri-socket=unix:///var/run/cri-dockerd.sock : the path of the CRI socket. Two runtimes are installed here, containerd and cri-dockerd, and we pick cri-dockerd. Note: the same socket must also be specified on kubeadm join.
--control-plane-endpoint=k8s-master:8443 : the shared endpoint for all control-plane nodes (required for an HA setup). Every cluster node needs an /etc/hosts entry mapping k8s-master to the virtual IP configured in Keepalived; port 8443 is the one configured earlier in HAProxy. kubeadm cannot convert a single control-plane cluster created without --control-plane-endpoint into a highly available cluster afterwards.
--service-cidr=10.96.0.0/12 : the Service network CIDR.
--pod-network-cidr=10.244.0.0/16 : the Pod network CIDR.
--kubernetes-version 1.27.0 : the Kubernetes version to install.
--upload-certs : upload the control-plane certificates; recommended for HA clusters. If you do not specify it, you can distribute the certificates manually (see the certificate distribution section), or re-upload them after the cluster is installed:

    sudo kubeadm init phase upload-certs --upload-certs

This step may fail; if it does, download the image registry.k8s.io/pause:3.6 manually:

cat << 'EOF' > download.sh
#!/bin/bash
# List of official image names, passed as arguments
OFFICIAL_IMAGE_LIST=("$@")
# Ali mirror of the official Kubernetes image registry
AILI_KUBERNETES_REGISTRY="registry.cn-hangzhou.aliyuncs.com/google_containers"
echo "AILI_KUBERNETES_REGISTRY => ${AILI_KUBERNETES_REGISTRY}"

# Download an image from the mirror, then re-tag it with its official name
function download_and_tag_image() {
    # Full official image name: registry.k8s.io/xxx/xxx:xxx
    # e.g. registry.k8s.io/kube-proxy:v1.27.1
    local full_official_image=$1
    local ali_image
    ali_image=$(echo "$full_official_image" | sed -E "s|(.*/)(.*)|$AILI_KUBERNETES_REGISTRY/\2|")
    echo "downloading image => $ali_image"
    sudo docker pull "$ali_image"
    # Re-tag the image with its official name
    echo "rename image $ali_image to $full_official_image"
    sudo docker tag "$ali_image" "$full_official_image"
}

for official_image in "${OFFICIAL_IMAGE_LIST[@]}"; do
    download_and_tag_image "$official_image"
done
EOF
sudo chmod u+x ./download.sh && ./download.sh registry.k8s.io/pause:3.6

You may also hit the error "Nameserver limits exceeded":

root@k8s-master01:/etc/kubernetes# systemctl status kubelet
●
kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Tue 2023-05-30 00:34:59 CST; 3s ago
       Docs: https://kubernetes.io/docs/home/
   Main PID: 7672 (kubelet)
      Tasks: 13 (limit: 13832)
     Memory: 25.7M
        CPU: 588ms
     CGroup: /system.slice/kubelet.service
             └─7672 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --containe>

May 30 00:35:01 k8s-master01 kubelet[7672]: E0530 00:35:01.033186 7672 dns.go:158] "Nameserver limits exceeded" err="Nameserver limits were exceeded, some nameservers have been omitted, >

Fixing "Nameserver limits exceeded" (see: https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/#known-issues):

root@k8s-master01:~# cat /etc/resolv.conf
# This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8).
# Do not edit.
#
# This file might be symlinked as /etc/resolv.conf. If you're looking at
# /etc/resolv.conf and seeing this text, you have followed the symlink.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs should typically not access this file directly, but only
# through the symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a
# different way, replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

## Note: this file has only three entries, so nothing needs changing here
nameserver 127.0.0.53
options edns0 trust-ad
search .

root@k8s-master01:~# cat /etc/systemd/resolved.conf
# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it under the
# terms of the GNU Lesser General Public License as published by the Free
# Software Foundation; either version 2.1 of the License, or (at your option)
# any later version.
#
# Entries in this file show the compile time defaults. Local configuration
# should be created by either modifying this file, or by creating "drop-ins" in
# the resolved.conf.d/ subdirectory. The latter is generally recommended.
# Defaults can be restored by simply deleting this file and all drop-ins.
#
# Use 'systemd-analyze cat-config systemd/resolved.conf' to display the full config.
#
# See resolved.conf(5) for details.

## Note: this file configures no nameservers; leave it unchanged
[Resolve]
# Some examples of DNS servers which may be used for DNS= and FallbackDNS=:
# Cloudflare: 1.1.1.1#cloudflare-dns.com 1.0.0.1#cloudflare-dns.com 2606:4700:4700::1111#cloudflare-dns.com 2606:4700:4700::1001#cloudflare-dns.com
# Google:     8.8.8.8#dns.google 8.8.4.4#dns.google 2001:4860:4860::8888#dns.google 2001:4860:4860::8844#dns.google
# Quad9:      9.9.9.9#dns.quad9.net 149.112.112.112#dns.quad9.net 2620:fe::fe#dns.quad9.net 2620:fe::9#dns.quad9.net
#DNS=
#FallbackDNS=
#Domains=
#DNSSEC=no
#DNSOverTLS=no
#MulticastDNS=no
#LLMNR=no
#Cache=no-negative
#CacheFromLocalhost=no
#DNSStubListener=yes
#DNSStubListenerExtra=
#ReadEtcHosts=yes
#ResolveUnicastSingleLabel=no

root@k8s-master01:~# cat /run/systemd/resolve/resolv.conf
# This is /run/systemd/resolve/resolv.conf managed by man:systemd-resolved(8).
# Do not edit.
#
# This file might be symlinked as /etc/resolv.conf. If you're looking at
# /etc/resolv.conf and seeing this text, you have followed the symlink.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs should typically not access this file directly, but only
# through the symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a
# different way, replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

# This file has several nameserver entries; we comment out two of them.
# Since we do not use IPv6, the IPv6 resolvers are the ones commented out:
nameserver 192.168.0.1
nameserver 192.168.1.1
# commented out
# nameserver fe80::1%2
# Too many DNS servers configured, the following entries may be ignored.
# commented out
# nameserver 240c::6666
search .

root@k8s-master01:~# vim /run/systemd/resolve/resolv.conf

# Restart the kubelet
root@k8s-master01:~# systemctl restart kubelet

# After resetting kubeadm, re-run the kubeadm init command above to resolve the issue
root@k8s-master01:~# kubeadm reset \
    --cri-socket=unix:///var/run/cri-dockerd.sock
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0530 00:45:44.483361 37269 reset.go:106] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: configmaps "kubeadm-config" not found
W0530 00:45:44.483476 37269 preflight.go:56] [reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0530 00:45:46.625805 37269 removeetcdmember.go:106] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Deleted contents of the etcd data directory: /var/lib/etcd
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of directories: [/etc/kubernetes/manifests /var/lib/kubelet /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]

The reset process does not clean CNI configuration.
To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar) to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.

Important notes: to reconfigure an already created cluster, see "Reconfiguring a kubeadm cluster". To run kubeadm init again, you must first tear down the existing cluster.

Configure a user to run kubectl commands

Run as a non-root user (root works too):

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

As root (if you ran the commands above, this is not needed):

    export KUBECONFIG=/etc/kubernetes/admin.conf

Verify that the kubectl command works:

root@k8s-master01:~# kubectl get nodes
NAME         STATUS     ROLES           AGE   VERSION
k8s-master   NotReady   control-plane   21m   v1.27.1

Install a container network (CNI) plugin

A container network plugin is required so that Pods can reach each other. Note: only one Pod network (container network plugin) can be installed per cluster. You must deploy a Container Network Interface (CNI) based Pod network add-on so that your Pods can communicate with each other. Cluster DNS (CoreDNS) will not start up before a network is installed.

There are many network plugins to choose from: https://kubernetes.io/zh-cn/docs/concepts/cluster-administration/addons/#networking-and-network-policy

We choose Calico:

root@k8s-master01:~# kubectl apply -f https://projectcalico.docs.tigera.io/archive/v3.25/manifests/calico.yaml
poddisruptionbudget.policy/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
serviceaccount/calico-node created
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/caliconodestatuses.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipreservations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
deployment.apps/calico-kube-controllers created

Check whether the installation is complete:

root@k8s-master01:~# kubectl get pods -A
NAMESPACE     NAME                                       READY   STATUS     RESTARTS   AGE
kube-system   calico-kube-controllers-6c99c8747f-dmwg7   0/1     Pending    0          118s
kube-system   calico-node-t5bc7                          0/1     Init:0/3   0          118s
kube-system   coredns-5d78c9869d-gm5vt                   0/1     Pending    0          33m
kube-system   coredns-5d78c9869d-xgkbj                   0/1     Pending    0          33m
kube-system   etcd-k8s-master                            1/1     Running    0          34m
kube-system   kube-apiserver-k8s-master                  1/1     Running    0          34m
kube-system   kube-controller-manager-k8s-master         1/1     Running    0          34m
kube-system   kube-proxy-d26m7                           1/1     Running    0          33m
kube-system   kube-scheduler-k8s-master                  1/1     Running    0          34m

root@k8s-master01:~# kubectl get pod -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS      AGE
kube-system   calico-kube-controllers-6c99c8747f-cfqg9   1/1     Running   0             41s
kube-system   calico-node-rczss                          1/1     Running   0             41s
kube-system   coredns-5d78c9869d-gm5vt                   1/1     Running   0             80m
kube-system   coredns-5d78c9869d-xgkbj                   1/1     Running   0             80m
kube-system   etcd-k8s-master                            1/1     Running   1 (10m ago)   80m
kube-system   kube-apiserver-k8s-master                  1/1     Running   1 (10m ago)   80m
kube-system   kube-controller-manager-k8s-master         1/1     Running   1 (10m ago)   80m
kube-system   kube-proxy-d26m7                           1/1     Running   1 (10m ago)   80m
kube-system   kube-scheduler-k8s-master                  1/1     Running   1 (10m ago)   80m

This can take ten minutes or more. The installation is complete once all calico Pods are Running and READY 1/1; note that once calico is up, the coredns Pods also move to the Running state.

Remove the control-plane taint

By default the control plane is tainted and Pods will not be scheduled onto it. If you want Pods to be scheduled on the control plane, remove the taint with:

root@k8s-master01:~# kubectl taint nodes --all node-role.kubernetes.io/control-plane-
node/k8s-master untainted

7. Join the worker nodes to the cluster (run on every worker node, as root)

Here that means the two machines k8s-slave01 and k8s-slave02:

root@k8s-slave01:~# kubeadm join k8s-master:8443 --token th5i1f.fnzc9v0yb6z3aok8 \
    --discovery-token-ca-cert-hash sha256:25357bff7f44a787886222dc9439916ab271dc5af5d5bbef274288fdd8e245b4 \
    --cri-socket=unix:///var/run/cri-dockerd.sock
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

If after waiting a while the nodes are still NotReady, run the following on the master node to investigate:

root@k8s-master01:~# kubectl get node
NAME           STATUS     ROLES           AGE     VERSION
k8s-master01   Ready      control-plane   27m     v1.27.1
k8s-master02   Ready      control-plane   10m     v1.27.1
k8s-master03   Ready      control-plane   9m45s   v1.27.1
k8s-slave01    NotReady   <none>          49s     v1.27.1
k8s-slave02    NotReady   <none>          44s     v1.27.1

root@k8s-master01:~# kubectl get pod -A
NAMESPACE     NAME                                       READY   STATUS                  RESTARTS      AGE
kube-system   calico-kube-controllers-6c99c8747f-fslqk   1/1     Running                 0             23m
kube-system   calico-node-6wgk4                          0/1     Init:Error              4 (76s ago)   2m37s
kube-system   calico-node-grb97                          1/1     Running                 0             11m
kube-system   calico-node-ltczv                          0/1     Init:CrashLoopBackOff   4 (56s ago)   2m42s
kube-system   calico-node-pffcg                          1/1     Running                 0             12m
kube-system   calico-node-vtcqg                          1/1     Running                 0             23m
kube-system   coredns-5d78c9869d-m5zgd                   1/1     Running                 0             28m
kube-system   coredns-5d78c9869d-mnxzj                   1/1     Running                 0             28m
kube-system   etcd-k8s-master01                          1/1     Running                 0             29m
kube-system   etcd-k8s-master02                          1/1     Running                 0             12m
kube-system   etcd-k8s-master03                          1/1     Running                 0             11m
kube-system   haproxy-k8s-master01                       1/1     Running                 0             29m
kube-system   haproxy-k8s-master02                       1/1     Running                 0             12m
kube-system   haproxy-k8s-master03                       1/1     Running                 0             11m
kube-system   keepalived-k8s-master01                    1/1     Running                 0             29m
kube-system   keepalived-k8s-master02                    1/1     Running                 0             12m
kube-system   keepalived-k8s-master03                    1/1     Running                 0             10m
kube-system   kube-apiserver-k8s-master01                1/1     Running                 0             29m
kube-system   kube-apiserver-k8s-master02                1/1     Running                 0             12m
kube-system   kube-apiserver-k8s-master03                1/1     Running                 1 (11m ago)   11m
kube-system   kube-controller-manager-k8s-master01       1/1     Running                 1 (12m ago)   29m
kube-system   kube-controller-manager-k8s-master02       1/1     Running                 0             12m
kube-system   kube-controller-manager-k8s-master03       1/1     Running                 0             10m
kube-system   kube-proxy-lmw7g                           1/1     Running                 0             11m
kube-system   kube-proxy-mb8hx                           0/1     ErrImagePull            0             2m42s
kube-system   kube-proxy-nvx8b                           0/1     ImagePullBackOff        0             2m37s
kube-system   kube-proxy-phvcm                           1/1     Running                 0             28m
kube-system   kube-proxy-psst7                           1/1     Running                 0             12m
kube-system   kube-scheduler-k8s-master01                1/1     Running                 1 (12m ago)   29m
kube-system   kube-scheduler-k8s-master02                1/1     Running                 0             12m
kube-system   kube-scheduler-k8s-master03                1/1     Running                 0             10m

# The kube-proxy image pulls are failing; look at the Pod details
root@k8s-master01:~# kubectl describe pod kube-proxy-mb8hx -n kube-system
# The Events show the image could not be pulled; download it manually on the two worker nodes
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  3m57s                default-scheduler  Successfully assigned kube-system/kube-proxy-mb8hx to k8s-slave01
  Warning  Failed     3m24s                kubelet            Failed to pull image "registry.k8s.io/kube-proxy:v1.27.1": rpc error: code = Unknown desc = Error response from daemon: Head "https://asia-east1-docker.pkg.dev/v2/k8s-artifacts-prod/images/kube-proxy/manifests/v1.27.1": dial tcp 64.233.187.82:443: i/o timeout
  Normal   Pulling    62s (x4 over 3m54s)  kubelet            Pulling image "registry.k8s.io/kube-proxy:v1.27.1"
  Warning  Failed     31s (x4 over 3m24s)  kubelet            Error: ErrImagePull
  Warning  Failed     31s (x3 over 2m42s)  kubelet            Failed to pull image "registry.k8s.io/kube-proxy:v1.27.1": rpc error: code = Unknown desc = Error response from daemon: Head "https://asia-east1-docker.pkg.dev/v2/k8s-artifacts-prod/images/kube-proxy/manifests/v1.27.1": dial tcp 64.233.188.82:443: i/o timeout
  Normal   BackOff    5s (x6 over 3m23s)   kubelet            Back-off pulling image "registry.k8s.io/kube-proxy:v1.27.1"
  Warning  Failed     5s (x6 over 3m23s)   kubelet            Error: ImagePullBackOff

# Download the image on node k8s-slave01
root@k8s-slave01:~# bash download.sh registry.k8s.io/kube-proxy:v1.27.1
AILI_KUBERNETES_REGISTRY => registry.cn-hangzhou.aliyuncs.com/google_containers
downloading image => registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.27.1
v1.27.1: Pulling from google_containers/kube-proxy
b6425c1785a5: Pull complete
5730c7a042b6: Pull complete
Digest: sha256:958ddb03a4d4d7a567d3563c759a05f3e95aa42ca8af2964aa76867aafc43610
Status: Downloaded newer image for registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.27.1
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.27.1
rename image registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.27.1 to registry.k8s.io/kube-proxy:v1.27.1

# Download the image on node k8s-slave02
root@k8s-slave02:~# bash download.sh registry.k8s.io/kube-proxy:v1.27.1
AILI_KUBERNETES_REGISTRY => registry.cn-hangzhou.aliyuncs.com/google_containers
downloading image => registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.27.1
v1.27.1: Pulling from google_containers/kube-proxy
b6425c1785a5: Pull complete
5730c7a042b6: Pull complete
Digest: sha256:958ddb03a4d4d7a567d3563c759a05f3e95aa42ca8af2964aa76867aafc43610
Status: Downloaded newer image for registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.27.1
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.27.1
rename image registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.27.1 to registry.k8s.io/kube-proxy:v1.27.1
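The rename that download.sh performs is just a string transform on the image name, so it can be sanity-checked in isolation before pulling anything. A minimal sketch of the same sed mapping (to_mirror is a hypothetical helper name, not part of the original script):

```shell
#!/bin/bash
# Reproduce the official-image -> Ali-mirror name mapping used by download.sh,
# without pulling any images.
MIRROR="registry.cn-hangzhou.aliyuncs.com/google_containers"

to_mirror() {
    # Keep only the trailing "name:tag" component and prefix the mirror registry,
    # e.g. registry.k8s.io/kube-proxy:v1.27.1
    #   -> registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.27.1
    echo "$1" | sed -E "s|(.*/)(.*)|${MIRROR}/\2|"
}

to_mirror "registry.k8s.io/kube-proxy:v1.27.1"
to_mirror "registry.k8s.io/pause:3.6"
```

Because the `(.*/)` group is greedy, everything up to the last slash is stripped, so only the final name:tag component survives regardless of how deep the original registry path is.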
# kube-proxy is now Running and READY
root@k8s-master01:~# kubectl get pod -A
NAMESPACE     NAME                                       READY   STATUS                  RESTARTS      AGE
kube-system   calico-kube-controllers-6c99c8747f-fslqk   1/1     Running                 0             28m
kube-system   calico-node-6wgk4                          0/1     Init:CrashLoopBackOff   6 (57s ago)   7m38s
kube-system   calico-node-grb97                          1/1     Running                 0             16m
kube-system   calico-node-ltczv                          0/1     Init:CrashLoopBackOff   6 (79s ago)   7m43s
kube-system   calico-node-pffcg                          1/1     Running                 0             17m
kube-system   calico-node-vtcqg                          1/1     Running                 0             28m
kube-system   coredns-5d78c9869d-m5zgd                   1/1     Running                 0             33m
kube-system   coredns-5d78c9869d-mnxzj                   1/1     Running                 0             33m
kube-system   etcd-k8s-master01                          1/1     Running                 0             34m
kube-system   etcd-k8s-master02                          1/1     Running                 0             17m
kube-system   etcd-k8s-master03                          1/1     Running                 0             16m
kube-system   haproxy-k8s-master01                       1/1     Running                 0             34m
kube-system   haproxy-k8s-master02                       1/1     Running                 0             17m
kube-system   haproxy-k8s-master03                       1/1     Running                 0             16m
kube-system   keepalived-k8s-master01                    1/1     Running                 0             34m
kube-system   keepalived-k8s-master02                    1/1     Running                 0             17m
kube-system   keepalived-k8s-master03                    1/1     Running                 0             15m
kube-system   kube-apiserver-k8s-master01                1/1     Running                 0             34m
kube-system   kube-apiserver-k8s-master02                1/1     Running                 0             17m
kube-system   kube-apiserver-k8s-master03                1/1     Running                 1 (16m ago)   16m
kube-system   kube-controller-manager-k8s-master01       1/1     Running                 1 (17m ago)   34m
kube-system   kube-controller-manager-k8s-master02       1/1     Running                 0             17m
kube-system   kube-controller-manager-k8s-master03       1/1     Running                 0             15m
kube-system   kube-proxy-lmw7g                           1/1     Running                 0             16m
kube-system   kube-proxy-mb8hx                           1/1     Running                 0             7m43s
kube-system   kube-proxy-nvx8b                           1/1     Running                 0             7m38s
kube-system   kube-proxy-phvcm                           1/1     Running                 0             33m
kube-system   kube-proxy-psst7                           1/1     Running                 0             17m
kube-system   kube-scheduler-k8s-master01                1/1     Running                 1 (17m ago)   34m
kube-system   kube-scheduler-k8s-master02                1/1     Running                 0             17m
kube-system   kube-scheduler-k8s-master03                1/1     Running                 0             15m

# The Calico network plugin Pods on the new nodes are still not Running; check why
root@k8s-master01:~# kubectl describe pod calico-node-ltczv -n kube-system
# The Events show the image has been downloaded, but the container is in Back-off; restart it
Events:
  Type     Reason     Age                     From               Message
  ----     ------     ----                    ----               -------
  Normal   Scheduled  9m38s                   default-scheduler  Successfully assigned kube-system/calico-node-ltczv to k8s-slave01
  Normal   Pulled     9m35s                   kubelet            Container image "docker.io/calico/cni:v3.25.0" already present on machine
  Normal   Created    9m35s                   kubelet            Created container upgrade-ipam
  Normal   Started    9m35s                   kubelet            Started container upgrade-ipam
  Normal   Pulled     7m55s (x5 over 9m35s)   kubelet            Container image "docker.io/calico/cni:v3.25.0" already present on machine
  Normal   Created    7m55s (x5 over 9m35s)   kubelet            Created container install-cni
  Normal   Started    7m54s (x5 over 9m34s)   kubelet            Started container install-cni
  Warning  BackOff    4m32s (x22 over 9m27s)  kubelet            Back-off restarting failed container install-cni in pod calico-node-ltczv_kube-system(c89e2e76-5045-4474-af93-9b839e1d2206)

# Restart the Pods controlled by the calico-node DaemonSet
root@k8s-master01:~# kubectl -n kube-system rollout restart DaemonSet/calico-node
daemonset.apps/calico-node restarted

root@k8s-master01:~# kubectl get pod -A
NAMESPACE     NAME                                       READY   STATUS     RESTARTS      AGE
kube-system   calico-kube-controllers-6c99c8747f-fslqk   1/1     Running    0             33m
kube-system   calico-node-bbfx2                          1/1     Running    0             15s
kube-system   calico-node-cf55q                          0/1     Init:2/3   0             4s
kube-system   calico-node-ltczv                          1/1     Running    0             12m
kube-system   calico-node-pffcg                          1/1     Running    0             22m
kube-system   calico-node-vtcqg                          1/1     Running    0             33m
kube-system   coredns-5d78c9869d-m5zgd                   1/1     Running    0             38m
kube-system   coredns-5d78c9869d-mnxzj                   1/1     Running    0             38m
kube-system   etcd-k8s-master01                          1/1     Running    0             39m
kube-system   etcd-k8s-master02                          1/1     Running    0             22m
kube-system   etcd-k8s-master03                          1/1     Running    0             21m
kube-system   haproxy-k8s-master01                       1/1     Running    0             39m
kube-system   haproxy-k8s-master02                       1/1     Running    0             22m
kube-system   haproxy-k8s-master03                       1/1     Running    0             21m
kube-system   keepalived-k8s-master01                    1/1     Running    0             39m
kube-system   keepalived-k8s-master02                    1/1     Running    0             22m
kube-system   keepalived-k8s-master03                    1/1     Running    0             20m
kube-system   kube-apiserver-k8s-master01                1/1     Running    0             39m
kube-system   kube-apiserver-k8s-master02                1/1     Running    0             22m
kube-system   kube-apiserver-k8s-master03                1/1     Running    1 (21m ago)   21m
kube-system   kube-controller-manager-k8s-master01       1/1     Running    1 (22m ago)   39m
kube-system   kube-controller-manager-k8s-master02       1/1     Running    0             22m
kube-system   kube-controller-manager-k8s-master03       1/1     Running    0             20m
kube-system   kube-proxy-lmw7g                           1/1     Running    0             21m
kube-system   kube-proxy-mb8hx                           1/1     Running    0             12m
kube-system   kube-proxy-nvx8b                           1/1     Running    0             12m
kube-system   kube-proxy-phvcm                           1/1     Running    0             38m
kube-system   kube-proxy-psst7                           1/1     Running    0             22m
kube-system   kube-scheduler-k8s-master01                1/1     Running    1 (22m ago)   39m
kube-system   kube-scheduler-k8s-master02                1/1     Running    0             22m
kube-system   kube-scheduler-k8s-master03                1/1     Running    0             20m

# After a short while everything is up
root@k8s-master01:~# kubectl get pod -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS      AGE
kube-system   calico-kube-controllers-6c99c8747f-fslqk   1/1     Running   0             34m
kube-system   calico-node-82c5z                          1/1     Running   0             48s
kube-system   calico-node-9vrzk                          1/1     Running   0             37s
kube-system   calico-node-bbfx2                          1/1     Running   0             69s
kube-system   calico-node-cf55q                          1/1     Running   0             58s
kube-system   calico-node-scrp4                          1/1     Running   0             26s
kube-system   coredns-5d78c9869d-m5zgd                   1/1     Running   0             39m
kube-system   coredns-5d78c9869d-mnxzj                   1/1     Running   0             39m
kube-system   etcd-k8s-master01                          1/1     Running   0             40m
kube-system   etcd-k8s-master02                          1/1     Running   0             23m
kube-system   etcd-k8s-master03                          1/1     Running   0             22m
kube-system   haproxy-k8s-master01                       1/1     Running   0             40m
kube-system   haproxy-k8s-master02                       1/1     Running   0             23m
kube-system   haproxy-k8s-master03                       1/1     Running   0             22m
kube-system   keepalived-k8s-master01                    1/1     Running   0             40m
kube-system   keepalived-k8s-master02                    1/1     Running   0             23m
kube-system   keepalived-k8s-master03                    1/1     Running   0             20m
kube-system   kube-apiserver-k8s-master01                1/1     Running   0             40m
kube-system   kube-apiserver-k8s-master02                1/1     Running   0             23m
kube-system   kube-apiserver-k8s-master03                1/1     Running   1 (22m ago)   22m
kube-system   kube-controller-manager-k8s-master01       1/1     Running   1 (23m ago)   40m
kube-system   kube-controller-manager-k8s-master02       1/1     Running   0             23m
kube-system   kube-controller-manager-k8s-master03       1/1     Running   0             21m
kube-system   kube-proxy-lmw7g                           1/1     Running   0             22m
kube-system   kube-proxy-mb8hx                           1/1     Running   0             13m
kube-system   kube-proxy-nvx8b                           1/1     Running   0             13m
kube-system   kube-proxy-phvcm                           1/1     Running   0             39m
kube-system   kube-proxy-psst7                           1/1     Running   0             23m
kube-system   kube-scheduler-k8s-master01                1/1     Running   1 (23m ago)   40m
kube-system   kube-scheduler-k8s-master02                1/1     Running   0             23m
kube-system   kube-scheduler-k8s-master03                1/1     Running   0             21m

# All nodes are in the Ready state
root@k8s-master01:~# kubectl get node
NAME           STATUS   ROLES           AGE   VERSION
k8s-master01   Ready    control-plane   41m   v1.27.1
k8s-master02   Ready    control-plane   24m   v1.27.1
k8s-master03   Ready    control-plane   24m   v1.27.1
k8s-slave01    Ready    <none>          15m   v1.27.1
k8s-slave02    Ready    <none>          15m   v1.27.1

You may still run into image download failures and similar issues along the way.

8. Join the additional HA master nodes to the cluster (run as root)

Reference: https://kubernetes.io/zh-cn/docs/setup/production-environment/tools/kubeadm/high-availability/

If --upload-certs was not specified when running kubeadm init, you can re-run the certificate upload phase now:

root@k8s-master01:~# kubeadm init phase upload-certs --upload-certs
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key: c82900d92a026aa6f6498b41ea70c9602e052c88eaca3e019d99b297af43230e

Do not forget that by default the decryption key used with --certificate-key expires after two hours.

In our setup these are the machines root@k8s-master02 and root@k8s-master03, using the certificate key generated by the command above, c82900d92a026aa6f6498b41ea70c9602e052c88eaca3e019d99b297af43230e:

root@k8s-master02:~# kubeadm join k8s-master:8443 --token th5i1f.fnzc9v0yb6z3aok8 \
    --discovery-token-ca-cert-hash sha256:25357bff7f44a787886222dc9439916ab271dc5af5d5bbef274288fdd8e245b4 \
    --control-plane \
    --certificate-key 3636bc7d84515aeb36ca79597792b07cc64a888ebdea9221ab68a5bae93ac947 \
    --cri-socket=unix:///var/run/cri-dockerd.sock
[preflight] Running pre-flight checks
[preflight] Reading configuration from the
cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0520 20:29:27.062790 14892 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.1, falling back to the nearest etcd version (3.5.7-0)
W0520 20:29:27.337990 14892 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.6" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image.
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[download-certs] Saving the certificates to the folder: "/etc/kubernetes/pki"
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master k8s-master02 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.20]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master02 localhost] and IPs [192.168.0.20 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master02 localhost] and IPs [192.168.0.20 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
W0520 20:29:29.131836 14892 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
W0520 20:29:29.206366 14892 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
W0520 20:29:29.479200 14892 endpoint.go:57] [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
W0520 20:29:31.931154 14892 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.1, falling back to the nearest etcd version (3.5.7-0)
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
The 'update-status' phase is deprecated and will be removed in a future release.
Currently it performs no operation [mark-control-plane] Marking the node k8s-master02 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers] [mark-control-plane] Marking the node k8s-master02 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule] This node has joined the cluster and a new control plane instance was created: * Certificate signing request was sent to apiserver and approval was received. * The Kubelet was informed of the new secure connection details. * Control plane label and taint were applied to the new node. * The Kubernetes control plane instances scaled up. * A new etcd member was added to the local/stacked etcd cluster. To start administering your cluster from this node, you need to run the following as a regular user: mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config Run 'kubectl get nodes' to see this node join the cluster. 配置节点kubectl命令 mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config 查看节点是否正常 (到此基本集群安装完成) root@k8s-master03:~# kubectl get nodes NAME STATUS ROLES AGE VERSION k8s-master01 Ready control-plane 18m v1.27.1 k8s-master02 Ready control-plane 90s v1.27.1 k8s-master03 Ready control-plane 37s v1.27.1 9. 可选操作 (可选)从控制平面节点以外的计算机控制集群 为了使 kubectl 在其他计算机(例如笔记本电脑)上与你的集群通信, 你需要将管理员 kubeconfig 文件从控制平面节点复制到工作站,如下所示: scp root@<control-plane-host>:/etc/kubernetes/admin.conf . 
kubectl --kubeconfig ./admin.conf get nodes

Note: The example above assumes SSH access is enabled for root. If that is not the case, you can use scp to copy admin.conf to another user that is allowed to access the machine.

The admin.conf file gives its holder superuser privileges over the cluster, so it should be handled with care. For normal users it is recommended to generate a unique credential carrying only the privileges you grant. You can do this with the kubeadm kubeconfig user --client-name <CN> command (in older releases this was kubeadm alpha kubeconfig user). The command prints a KubeConfig file to STDOUT, which you should save to a file and distribute to the user. After that, grant privileges with kubectl create (cluster)rolebinding.

(Optional) Proxying the API server to localhost

If you want to connect to the API server from outside the cluster, you can use kubectl proxy:

scp root@<control-plane-host>:/etc/kubernetes/admin.conf .
kubectl --kubeconfig ./admin.conf proxy

You can now access the API server locally at http://localhost:8001/api/v1.
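As a sketch of the normal-user workflow described above: the client name developer, the binding name developer-view, the view ClusterRole, and the kubeadm-config.yaml path are illustrative assumptions, not values from this guide, and the exact kubeadm flags vary by release (newer versions require a --config file).

```shell
# Run on a control-plane node. Generate a kubeconfig for a normal user;
# "developer" and "kubeadm-config.yaml" are example values (assumptions).
kubeadm kubeconfig user --client-name developer \
  --config kubeadm-config.yaml > developer.conf

# Grant that user read-only access cluster-wide via the built-in
# "view" ClusterRole (the binding name is arbitrary).
kubectl create clusterrolebinding developer-view \
  --clusterrole=view --user=developer

# Distribute developer.conf to the user, who then connects with it:
kubectl --kubeconfig developer.conf get pods -A
```

Unlike admin.conf, this credential only carries the privileges bound to it, so revoking the RoleBinding/ClusterRoleBinding is enough to restrict the user again.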
Published 2023-11-26