Introduction
Kubernetes (K8s) is the de facto standard for container orchestration in the cloud-native era and a core technology for deploying and managing modern applications. Whether you are an individual developer, an operations engineer, or an enterprise architect, mastering Kubernetes matters. This guide lays out a systematic learning path from beginner to advanced, with plenty of hands-on material to help you build a complete body of knowledge.
Part 1: Kubernetes Fundamentals and Core Components
1.1 What is Kubernetes?
Kubernetes is an open-source container orchestration platform, originally designed at Google and now maintained by the Cloud Native Computing Foundation (CNCF). It automates the deployment, scaling, and management of containerized applications.
Key advantages:
- Automated deployment: no need to configure each container by hand
- Elastic scaling: the number of application instances adjusts automatically with load
- Self-healing: failed containers are restarted and unavailable nodes replaced automatically
- Service discovery and load balancing: IP addresses and DNS names are assigned automatically
1.2 Kubernetes Component Architecture
A Kubernetes cluster consists of a control plane and worker nodes:
┌─────────────────────────────────────────────────────────┐
│ Kubernetes Cluster │
├─────────────────────────────────────────────────────────┤
│ Control Plane (Master Nodes) │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ API Server │ │ Scheduler │ │ Controller │ │
│ │ │ │ │ │ Manager │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ etcd (distributed key-value store) │ │
│ └─────────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────────┤
│ Worker Nodes │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Node 1 │ │ Node 2 │ │ Node 3 │ │
│ │ ┌─────────┐ │ │ ┌─────────┐ │ │ ┌─────────┐ │ │
│ │ │ Kubelet │ │ │ │ Kubelet │ │ │ │ Kubelet │ │ │
│ │ └─────────┘ │ │ └─────────┘ │ │ └─────────┘ │ │
│ │ ┌─────────┐ │ │ ┌─────────┐ │ │ ┌─────────┐ │ │
│ │ │ Kube- │ │ │ │ Kube- │ │ │ │ Kube- │ │ │
│ │ │ proxy │ │ │ │ proxy │ │ │ │ proxy │ │ │
│ │ └─────────┘ │ │ └─────────┘ │ │ └─────────┘ │ │
│ │ ┌─────────┐ │ │ ┌─────────┐ │ │ ┌─────────┐ │ │
│ │ │ Pods │ │ │ │ Pods │ │ │ │ Pods │ │ │
│ │ └─────────┘ │ │ └─────────┘ │ │ └─────────┘ │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────┘
Component overview:
- API Server: the cluster's front end; all operations go through it
- etcd: distributed key-value store holding the cluster state
- Scheduler: decides which node each Pod runs on
- Controller Manager: runs control loops that drive actual state toward desired state
- Kubelet: the node agent; manages Pods and their containers
- Kube-proxy: network proxy implementing Service routing and load balancing
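The Controller Manager's behavior can be illustrated with a minimal reconciliation-loop sketch (hypothetical function, not the actual controller code): it compares desired state with observed state and computes the corrective actions needed to converge them.

```python
# Minimal sketch of a controller reconciliation step (illustrative only):
# compare desired vs. observed replica counts and emit corrective actions.
def reconcile(desired_replicas: int, observed_replicas: int) -> list:
    if observed_replicas < desired_replicas:
        return ["create-pod"] * (desired_replicas - observed_replicas)
    if observed_replicas > desired_replicas:
        return ["delete-pod"] * (observed_replicas - desired_replicas)
    return []  # already converged; nothing to do

print(reconcile(3, 1))  # two pods missing -> ['create-pod', 'create-pod']
print(reconcile(2, 2))  # converged -> []
```

Real controllers run this loop continuously against events from the API Server, which is why Kubernetes recovers automatically from failures.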
1.3 Core Concepts in Detail
1.3.1 Pod: the Smallest Deployable Unit
A Pod is the smallest schedulable unit in Kubernetes and holds one or more tightly coupled containers.
Example: a simple Pod
# nginx-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
spec:
  containers:
  - name: nginx-container
    image: nginx:1.21
    ports:
    - containerPort: 80
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
Creating and inspecting the Pod:
# Create the Pod
kubectl apply -f nginx-pod.yaml
# Check Pod status
kubectl get pods
# Show Pod details
kubectl describe pod nginx-pod
# View Pod logs
kubectl logs nginx-pod
1.3.2 Deployment: Declarative Rollout Management
A Deployment provides declarative updates for Pods, with rolling updates and rollback.
Example: a Deployment
# nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
Common Deployment commands:
# Create the Deployment
kubectl apply -f nginx-deployment.yaml
# Check Deployment status
kubectl get deployments
# Check the Pods it manages
kubectl get pods -l app=nginx
# Scale the replica count
kubectl scale deployment nginx-deployment --replicas=5
# View rollout history
kubectl rollout history deployment nginx-deployment
# Roll back to the previous revision
kubectl rollout undo deployment nginx-deployment
1.3.3 Service: Service Discovery and Load Balancing
A Service gives a set of Pods a stable network endpoint.
Example: a Service
# nginx-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: ClusterIP  # one of: ClusterIP, NodePort, LoadBalancer, ExternalName
Service types:
- ClusterIP: the default; reachable only from inside the cluster
- NodePort: exposes the Service on a static port of every node's IP
- LoadBalancer: provisions a cloud provider load balancer
- ExternalName: maps the Service to an external DNS name
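The way a Service finds its Pods can be sketched in a few lines: a Pod is selected when the Service's label selector is a subset of the Pod's labels (an illustrative model, not the actual kube-proxy code).

```python
# Sketch of Service-to-Pod selection: a Pod matches when every key/value
# pair in the Service selector is present in the Pod's labels.
def matches(selector: dict, pod_labels: dict) -> bool:
    return all(pod_labels.get(k) == v for k, v in selector.items())

selector = {"app": "nginx"}
assert matches(selector, {"app": "nginx", "tier": "web"})  # extra labels are fine
assert not matches(selector, {"app": "redis"})             # value mismatch
```

This subset semantics is why Pods can carry additional labels (tier, version, etc.) without falling out of a Service.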
1.3.4 ConfigMap and Secret: Configuration Management
ConfigMap example:
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  database.url: "postgres://localhost:5432/mydb"
  app.port: "8080"
  log.level: "info"
Secret example (values are Base64-encoded):
# secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
data:
  db-password: cGFzc3dvcmQxMjM=   # Base64 of "password123"
  api-key: YXBpa2V5MTIzNDU2Nzg5   # Base64 of "apikey123456789"
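Note that the values in a Secret's data field are plain Base64, not encryption; anyone who can read the Secret can decode them. The encodings in the example above can be reproduced directly:

```python
import base64

# Secret "data" values are Base64-encoded, not encrypted.
encoded = base64.b64encode(b"password123").decode()
print(encoded)  # cGFzc3dvcmQxMjM= -- matches the db-password value above

# Decoding works the same way in reverse:
assert base64.b64decode("YXBpa2V5MTIzNDU2Nzg5") == b"apikey123456789"
```

On the command line the equivalent is `echo -n 'password123' | base64`; for real protection, enable encryption at rest or use an external secret manager.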
Using the ConfigMap and Secret in a Pod:
# pod-with-config.yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
spec:
  containers:
  - name: app-container
    image: myapp:1.0
    env:
    - name: DATABASE_URL
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: database.url
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: app-secret
          key: db-password
    volumeMounts:
    - name: config-volume
      mountPath: /etc/app-config
  volumes:
  - name: config-volume
    configMap:
      name: app-config
Part 2: Installing and Configuring Kubernetes
2.1 Local Development Environment
2.1.1 Minikube (recommended for beginners)
Minikube is the officially recommended tool for running Kubernetes locally.
Install Minikube:
# macOS
brew install minikube
# Linux
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube
# Windows
# Download the installer: https://minikube.sigs.k8s.io/docs/start/
Start a Minikube cluster:
# Start the cluster (Docker driver)
minikube start --driver=docker
# Check cluster status
minikube status
# Show cluster info
kubectl cluster-info
# Open the dashboard
minikube dashboard
Common Minikube commands:
# Stop the cluster
minikube stop
# Delete the cluster
minikube delete
# View cluster logs
minikube logs
# SSH into the cluster node
minikube ssh
# Run kubectl via minikube
minikube kubectl -- get pods
2.1.2 Kind (Kubernetes in Docker)
Kind is another popular local Kubernetes tool, well suited to CI/CD environments.
Install Kind:
# macOS
brew install kind
# Linux
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.20.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
# Windows
# Download the binary: https://kind.sigs.k8s.io/docs/user/quick-start/
Create a Kind cluster:
# Create a cluster
kind create cluster --name my-cluster
# Check the nodes
kubectl get nodes
# Delete the cluster
kind delete cluster --name my-cluster
# Write a multi-node cluster config
cat > kind-config.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
EOF
# Create a cluster from the config file
kind create cluster --config kind-config.yaml
2.2 Production Cluster Deployment
2.2.1 kubeadm (recommended)
kubeadm is the official Kubernetes cluster bootstrap tool.
System requirements:
- 2GB+ RAM
- 2+ CPU cores
- Swap disabled
- Network connectivity between nodes
Deployment steps:
1. Install a container runtime and kubeadm
# Install Docker
sudo apt-get update
sudo apt-get install -y docker.io
sudo systemctl enable docker
sudo systemctl start docker
# Install kubeadm, kubelet and kubectl. The legacy apt.kubernetes.io repo is
# deprecated; use the community repo at pkgs.k8s.io (replace v1.30 with your version)
sudo apt-get update && sudo apt-get install -y apt-transport-https curl gpg
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
2. Initialize the control plane node
# Initialize the cluster (Pod network CIDR 10.244.0.0/16, matching Flannel's default)
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
# Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
3. Install a Pod network add-on (Flannel)
# Download the Flannel manifest
curl -O https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
# Apply it
kubectl apply -f kube-flannel.yml
# Check node status
kubectl get nodes
4. Join worker nodes
# On the control plane node, print the join command
kubeadm token create --print-join-command
# Run the join command on each worker node
sudo kubeadm join <control-plane-ip>:6443 --token <token> --discovery-token-ca-cert-hash <hash>
2.2.2 Managed Kubernetes Services
Major cloud offerings:
- AWS EKS: Amazon Elastic Kubernetes Service
- Google GKE: Google Kubernetes Engine
- Azure AKS: Azure Kubernetes Service
- Alibaba Cloud ACK: Container Service for Kubernetes
- Tencent Cloud TKE: Tencent Kubernetes Engine
Example with GKE:
# Install the Google Cloud SDK, then configure gcloud
gcloud auth login
gcloud config set project your-project-id
# Create a GKE cluster
gcloud container clusters create my-cluster \
  --zone us-central1-a \
  --num-nodes 3 \
  --machine-type e2-medium \
  --enable-autoscaling \
  --min-nodes 1 \
  --max-nodes 10
# Fetch cluster credentials
gcloud container clusters get-credentials my-cluster --zone us-central1-a
# Verify
kubectl get nodes
Part 3: Advanced Concepts in Practice
3.1 StatefulSet: Stateful Applications
A StatefulSet manages stateful applications, providing stable network identities and persistent storage.
Example: deploying MySQL
# mysql-statefulset.yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  ports:
  - port: 3306
  selector:
    app: mysql
  clusterIP: None   # headless Service, required for stable per-Pod DNS
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql"
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        ports:
        - containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "rootpassword"   # use a Secret in production
        volumeMounts:
        - name: mysql-persistent-storage
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: mysql-persistent-storage
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
StatefulSet properties:
- Stable network identity: Pod names follow <statefulset-name>-<ordinal>, e.g. mysql-0, mysql-1
- Stable persistent storage: each Pod gets its own PersistentVolumeClaim
- Ordered deployment and scaling: Pods are created and deleted in order
- Ordered rolling updates: Pods are updated in order
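The stable identities above can be enumerated with a short sketch: each Pod gets an ordinal name, and the headless Service gives each one a predictable DNS entry (format per the Kubernetes docs; the "default" namespace is assumed here for illustration).

```python
# Sketch of the stable identities a StatefulSet provides: ordinal Pod names
# plus per-Pod DNS names under the headless Service.
def stateful_identities(name, service, replicas, namespace="default"):
    return [
        (f"{name}-{i}", f"{name}-{i}.{service}.{namespace}.svc.cluster.local")
        for i in range(replicas)
    ]

for pod, dns in stateful_identities("mysql", "mysql", 3):
    print(pod, dns)
# mysql-0 mysql-0.mysql.default.svc.cluster.local
# mysql-1 mysql-1.mysql.default.svc.cluster.local
# mysql-2 mysql-2.mysql.default.svc.cluster.local
```

These predictable names are what let, say, a MySQL replica point at mysql-0 as its primary regardless of rescheduling.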
3.2 DaemonSet: Per-Node Workloads
A DaemonSet ensures every node (or a selected set of nodes) runs one copy of a Pod.
Example: deploying a log collector
# fluentd-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      tolerations:
      - key: node-role.kubernetes.io/control-plane   # "master" on older clusters
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.16.2-debian-elasticsearch7-1.0
        env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: "elasticsearch"
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
3.3 Job and CronJob: Batch Workloads
3.3.1 Job: Run-to-Completion Tasks
# job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processing-job
spec:
  completions: 5     # total number of Pods that must complete
  parallelism: 2     # Pods running in parallel
  backoffLimit: 4    # retries before the Job is marked failed
  template:
    spec:
      containers:
      - name: processor
        image: busybox
        command: ["sh", "-c", "echo Processing data... && sleep 10"]
      restartPolicy: Never
3.3.2 CronJob: Scheduled Tasks
# cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-job
spec:
  schedule: "0 2 * * *"         # every day at 02:00
  concurrencyPolicy: Forbid     # no concurrent runs
  startingDeadlineSeconds: 600  # give up if not started within 10 minutes
  jobTemplate:
    spec:
      completions: 1
      parallelism: 1
      template:
        spec:
          containers:
          - name: backup
            image: alpine
            command: ["/bin/sh", "-c", "echo \"Backup started at $(date)\" && tar -czf /backup.tar.gz /data"]
            volumeMounts:
            - name: data-volume
              mountPath: /data
          restartPolicy: OnFailure
          volumes:
          - name: data-volume
            persistentVolumeClaim:
              claimName: data-pvc
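The schedule string uses the standard five cron fields. A minimal reading of them (a sketch, not a full cron parser; ranges, steps, and lists are not handled):

```python
# Read the five cron fields of a CronJob schedule:
# minute, hour, day-of-month, month, day-of-week.
def parse_schedule(schedule: str) -> dict:
    fields = schedule.split()
    assert len(fields) == 5, "a cron schedule has exactly five fields"
    names = ["minute", "hour", "day_of_month", "month", "day_of_week"]
    return dict(zip(names, fields))

spec = parse_schedule("0 2 * * *")
print(spec["minute"], spec["hour"])  # 0 2 -> runs at 02:00 every day
```

CronJob times are interpreted in the kube-controller-manager's time zone unless a timeZone field is set on newer clusters.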
3.4 Horizontal Pod Autoscaler (HPA)
The HPA adjusts the number of Pods automatically based on CPU utilization or other metrics.
Example: an HPA
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
Enable the Metrics Server:
# Install the Metrics Server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
# Verify
kubectl top nodes
kubectl top pods
Check HPA status:
kubectl get hpa
kubectl describe hpa nginx-hpa
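The scaling decision itself follows the formula from the Kubernetes docs: desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), clamped to the min/max bounds. A worked sketch:

```python
import math

# HPA scaling rule: scale proportionally to how far the observed metric
# is from the target, then clamp to [minReplicas, maxReplicas].
def desired_replicas(current, current_util, target_util, min_r, max_r):
    desired = math.ceil(current * current_util / target_util)
    return max(min_r, min(max_r, desired))

# 3 replicas at 80% average CPU against a 50% target -> ceil(4.8) = 5
print(desired_replicas(3, 80, 50, min_r=2, max_r=10))  # 5
# Nearly idle Pods still respect minReplicas:
print(desired_replicas(3, 10, 50, min_r=2, max_r=10))  # 2
```

With the two metrics in the example above, the HPA computes a desired count for each and takes the larger one.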
Part 4: Networking and Storage
4.1 The Kubernetes Network Model
Kubernetes networking follows these principles:
- Pod-to-Pod communication: all Pods can reach each other directly, without NAT
- Service abstraction: stable access endpoints in front of Pods
- DNS: automatic in-cluster name resolution
4.1.1 Service Types in Detail
ClusterIP example:
# clusterip-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: myapp
  ports:
  - protocol: TCP
    port: 80
    targetPort: 9376
  type: ClusterIP
NodePort example:
# nodeport-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-nodeport-service
spec:
  type: NodePort
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080   # optional; allocated automatically if omitted
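An explicitly chosen nodePort must fall inside the cluster's service node port range, which defaults to 30000-32767 (configurable via the API server's --service-node-port-range flag). A quick validation sketch:

```python
# Validate a NodePort against the default service node port range.
def valid_node_port(port: int, low: int = 30000, high: int = 32767) -> bool:
    return low <= port <= high

assert valid_node_port(30080)      # the example above is in range
assert not valid_node_port(8080)   # ordinary ports are rejected by the API server
```

This is why requests for familiar ports like 8080 fail unless the range has been widened.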
LoadBalancer example (cloud environments):
# loadbalancer-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-loadbalancer-service
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 80
4.1.2 Ingress: HTTP/HTTPS Routing
An Ingress is the entry point for external HTTP(S) traffic into cluster Services.
Install an Ingress controller (NGINX Ingress):
# Install the NGINX Ingress Controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/cloud/deploy.yaml
# Verify
kubectl get pods -n ingress-nginx
Ingress example:
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: "myapp.example.com"
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp-service
            port:
              number: 80
  - host: "api.example.com"
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 8080
4.2 Storage
4.2.1 PersistentVolume (PV) and PersistentVolumeClaim (PVC)
PV example:
# pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-10gi   # resource names must be lowercase DNS labels
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: /mnt/data
PVC example:
# pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: manual
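The conditions under which this claim binds to the PV above can be sketched as a predicate: matching storageClassName, compatible access modes, and sufficient capacity (illustrative; the real volume binder applies additional rules such as node affinity).

```python
# Sketch of PV/PVC binding: class names must match, the claim's access
# modes must be supported, and the PV must be large enough.
def can_bind(pv: dict, pvc: dict) -> bool:
    return (
        pv["storageClassName"] == pvc["storageClassName"]
        and set(pvc["accessModes"]) <= set(pv["accessModes"])
        and pv["capacityGi"] >= pvc["requestGi"]
    )

pv = {"storageClassName": "manual", "accessModes": ["ReadWriteOnce"], "capacityGi": 10}
pvc = {"storageClassName": "manual", "accessModes": ["ReadWriteOnce"], "requestGi": 5}
print(can_bind(pv, pvc))  # True: the 10Gi manual PV satisfies the 5Gi claim
```

Note that a claim can bind to a larger PV than it requested; the excess capacity is simply tied up.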
4.2.2 StorageClass: Dynamic Provisioning
StorageClass example:
# storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: kubernetes.io/aws-ebs   # adjust for your cloud provider
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
A PVC using the StorageClass:
# pvc-with-storageclass.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: fast-ssd
Part 5: Security and Best Practices
5.1 RBAC (Role-Based Access Control)
RBAC is the core authorization mechanism in Kubernetes.
Role and RoleBinding example:
# role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
# rolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
ClusterRole and ClusterRoleBinding:
# clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: secret-reader
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "watch", "list"]
---
# clusterrolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: read-secrets-global
subjects:
- kind: Group
  name: "manager"
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io
5.2 Pod Security Standards
PodSecurityPolicy (PSP) was removed in Kubernetes 1.25; use the built-in Pod Security Admission instead. Note that the pod-security.kubernetes.io labels apply to a Namespace, not to individual Pods:
# pod-security-standard.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: secure-ns
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
  namespace: secure-ns
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
  containers:
  - name: app
    image: nginx:1.21
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      readOnlyRootFilesystem: true
5.3 NetworkPolicy
A NetworkPolicy controls network traffic between Pods.
Example: restricting Pod-to-Pod traffic
# networkpolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: default
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# allow-frontend-to-backend.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
5.4 Security Best Practices
- Least privilege: grant service accounts only the permissions they need
- Image security: pull from trusted registries and scan for vulnerabilities regularly
- Secret management: consider an external secret manager (e.g. HashiCorp Vault)
- Audit logging: enable Kubernetes audit logs
- Pod Security Standards: restrict Pod privileges with the built-in standards
Part 6: Monitoring and Logging
6.1 Prometheus + Grafana Monitoring Stack
Install the Prometheus Operator:
# Install with Helm
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --set grafana.adminPassword="admin123"
Custom metrics scraping example:
# service-monitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp-monitor
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: myapp
  endpoints:
  - port: web
    interval: 30s
    path: /metrics
6.2 Log Collection
EFK stack (Elasticsearch + Fluentd + Kibana):
1. Install Elasticsearch:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/logging-elasticsearch/es-deployment.yaml
2. Install Fluentd (DaemonSet):
kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/logging-elasticsearch/fluentd-es-ds.yaml
3. Install Kibana:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/logging-elasticsearch/kibana-deployment.yaml
Loki + Promtail + Grafana:
# Install Loki with Helm
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install loki grafana/loki-stack \
  --namespace logging \
  --create-namespace \
  --set promtail.enabled=true \
  --set grafana.enabled=true
Part 7: Advanced Topics
7.1 Custom Resource Definitions (CRDs)
CRDs let you extend the Kubernetes API with your own resource types.
Example: defining a custom resource
# crd-definition.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              engine:
                type: string
              version:
                type: string
              replicas:
                type: integer
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database
    shortNames:
    - db
Using the CRD:
# database-instance.yaml
apiVersion: example.com/v1
kind: Database
metadata:
  name: my-database
spec:
  engine: postgresql
  version: "13"
  replicas: 3
7.2 The Operator Pattern
An Operator is a Kubernetes controller that manages an application through custom resources.
Operator SDK example:
# Install the Operator SDK
# macOS
brew install operator-sdk
# Scaffold an Operator project
operator-sdk init --domain example.com --repo github.com/example/my-operator
# Create an API
operator-sdk create api --group myapp --version v1 --kind MyApp --resource --controller
# Generate code and manifests
make generate
make manifests
# Build the Operator image
make docker-build IMG=my-operator:1.0
# Deploy the Operator
make deploy IMG=my-operator:1.0
7.3 Service Mesh
Installing Istio:
# Download Istio
curl -L https://istio.io/downloadIstio | sh -
cd istio-*
export PATH=$PWD/bin:$PATH
# Install Istio
istioctl install --set profile=demo -y
# Verify
kubectl get pods -n istio-system
Istio VirtualService example (a 50/50 traffic split between two subsets):
# virtualservice.yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 50
    - destination:
        host: reviews
        subset: v2
      weight: 50
Part 8: Hands-On Projects and Learning Path
8.1 Recommended Projects
Project 1: Microservices application
Architecture:
- Frontend: React application
- Backend: Node.js API service
- Data: PostgreSQL plus a Redis cache
- Monitoring: Prometheus + Grafana
- Logging: EFK stack
Deployment steps:
1. Create a namespace
2. Deploy the databases (StatefulSet)
3. Deploy the backend (Deployment + Service)
4. Deploy the frontend (Deployment + Service)
5. Configure Ingress routing
6. Set up HPA autoscaling
7. Wire up monitoring and logging
Project 2: CI/CD pipeline
Toolchain:
- GitLab/GitHub Actions
- Jenkins
- ArgoCD (GitOps)
- Harbor (image registry)
Pipeline example:
# .gitlab-ci.yml
stages:
  - build
  - test
  - deploy
build:
  stage: build
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
test:
  stage: test
  script:
    - docker run --rm $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA npm test
deploy:
  stage: deploy
  script:
    - kubectl set image deployment/myapp myapp=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  only:
    - main
8.2 Suggested Learning Path
Stage 1: Fundamentals (1-2 weeks)
- Container basics: core Docker commands and concepts
- Kubernetes core concepts: Pod, Service, Deployment
- Local environment: Minikube or Kind
- Basic operations: the kubectl command line
Recommended resources:
- Official docs: https://kubernetes.io/docs/home/
- Kubernetes in Action (book)
- Official tutorials: https://kubernetes.io/docs/tutorials/
Stage 2: Intermediate (2-4 weeks)
- Advanced workload types: StatefulSet, DaemonSet, Job, CronJob
- Configuration: ConfigMap, Secret, environment variables
- Storage: PV, PVC, StorageClass
- Networking: Service types, Ingress, NetworkPolicy
Recommended resources:
- Kubernetes: Up and Running (book)
- The official Kubernetes blog
- Community documentation
Stage 3: Production practice (4-8 weeks)
- Cluster deployment: kubeadm, managed cloud clusters
- Monitoring and logging: Prometheus, Grafana, EFK
- Security: RBAC, Pod security, network policies
- CI/CD integration: GitOps, ArgoCD
Recommended resources:
- Kubernetes Patterns (book)
- Official CNCF courses
- Certification courses (CKA/CKAD/CKS)
Stage 4: Advanced topics (ongoing)
- Operator development: Operator SDK
- Service mesh: Istio, Linkerd
- Multi-cluster management: Kubernetes Federation
- Serverless: Knative, OpenFaaS
8.3 Certification Exams
CKA (Certified Kubernetes Administrator):
- Duration: 2 hours
- Format: performance-based, hands-on tasks
- Focus: cluster administration, troubleshooting, workload configuration
CKAD (Certified Kubernetes Application Developer):
- Duration: 2 hours
- Format: performance-based, hands-on tasks
- Focus: application deployment, configuration management, debugging
CKS (Certified Kubernetes Security Specialist):
- Duration: 2 hours
- Format: performance-based, hands-on tasks
- Focus: security hardening, vulnerability management, runtime security
Study resources:
- Official exam guide: https://www.cncf.io/certification/cka/
- KodeKloud courses
- Killer.sh exam simulator
Part 9: The Kubernetes Ecosystem
9.1 Common Tools
Command-line tools:
- kubectl: the official Kubernetes CLI
- k9s: a terminal UI for Kubernetes
- helm: the Kubernetes package manager
- kubectx: quick switching between clusters and namespaces
- stern: multi-Pod log tailing
Development tools:
- Skaffold: local Kubernetes development workflow
- Telepresence: connects local development to a cluster
- DevSpace: cloud-native development platform
Deployment tools:
- Helm: package management
- Kustomize: native configuration management
- FluxCD: GitOps tooling
- ArgoCD: GitOps tooling
Monitoring tools:
- Prometheus: metrics and monitoring
- Grafana: dashboards and visualization
- Jaeger: distributed tracing
- Loki: log aggregation
9.2 Helm in Practice
Installing Helm:
# macOS
brew install helm
# Linux
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
Creating a Helm chart:
# Create a chart
helm create mychart
# Chart structure
mychart/
├── Chart.yaml        # chart metadata
├── values.yaml       # default values
├── templates/        # template files
│   ├── deployment.yaml
│   ├── service.yaml
│   └── ingress.yaml
└── charts/           # chart dependencies
Chart.yaml example:
apiVersion: v2
name: mychart
description: A Helm chart for Kubernetes
type: application
version: 0.1.0
appVersion: "1.0.0"
values.yaml example:
replicaCount: 1
image:
  repository: nginx
  tag: "1.21"
  pullPolicy: IfNotPresent
service:
  type: ClusterIP
  port: 80
ingress:
  enabled: false
  className: ""
  annotations: {}
  hosts:
    - host: chart-example.local
      paths:
        - path: /
          pathType: ImplementationSpecific
Deploying a chart:
# Add a repository
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
# Install a chart
helm install my-release bitnami/nginx
# Install with overridden values
helm install my-release bitnami/nginx \
  --set service.type=NodePort \
  --set service.nodePort=30080
# Install with a values file
helm install my-release bitnami/nginx -f custom-values.yaml
# Upgrade
helm upgrade my-release bitnami/nginx --set image.tag=1.22
# Roll back to revision 1
helm rollback my-release 1
# Uninstall
helm uninstall my-release
9.3 Kustomize in Practice
Basic Kustomize layout:
myapp/
├── base/
│   ├── deployment.yaml
│   ├── service.yaml
│   └── kustomization.yaml
└── overlays/
    ├── development/
    │   ├── kustomization.yaml
    │   └── replica-count.yaml
    ├── staging/
    │   ├── kustomization.yaml
    │   └── replica-count.yaml
    └── production/
        ├── kustomization.yaml
        └── replica-count.yaml
base/kustomization.yaml:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
overlays/development/kustomization.yaml:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base   # "bases:" is deprecated in current Kustomize
patchesStrategicMerge:
  - replica-count.yaml
overlays/development/replica-count.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 1
Applying Kustomize configurations:
# Apply the development overlay
kubectl apply -k overlays/development
# Apply the production overlay
kubectl apply -k overlays/production
# Render the YAML
kustomize build overlays/development
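Conceptually, a strategic-merge patch overlays the patch file onto the base manifest: patch fields override base fields, recursing into nested mappings. A rough sketch (real strategic merge also understands list merge keys, which are omitted here):

```python
# Rough model of a strategic-merge patch: recursively overlay the patch
# mapping onto the base mapping.
def merge(base: dict, patch: dict) -> dict:
    out = dict(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)
        else:
            out[key] = value
    return out

base = {"spec": {"replicas": 3, "template": {"spec": {"containers": []}}}}
patch = {"spec": {"replicas": 1}}
print(merge(base, patch)["spec"]["replicas"])  # 1 -- the development override
```

This is why the replica-count.yaml overlay only needs to repeat the fields it changes, plus enough metadata to identify the target object.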
Part 10: Troubleshooting and Best Practices
10.1 Common Troubleshooting Commands
# Check cluster state
kubectl cluster-info
kubectl get nodes
kubectl get pods --all-namespaces
# Inspect a Pod
kubectl describe pod <pod-name> -n <namespace>
# View Pod logs
kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -c <container-name> -n <namespace>
# List events
kubectl get events --sort-by='.lastTimestamp'
# Check resource usage
kubectl top pods
kubectl top nodes
# Open a shell in a Pod
kubectl exec -it <pod-name> -- /bin/bash
# Port forwarding
kubectl port-forward pod/<pod-name> 8080:80
# Debug networking
kubectl run -it --rm --image=nicolaka/netshoot debug-pod -- /bin/bash
10.2 Common Problems and Solutions
Problem 1: Pod stuck in Pending
Likely causes:
- Insufficient resources (CPU/memory)
- Node selector mismatch
- Affinity/anti-affinity constraints
- Storage volume not ready
Diagnosis:
kubectl describe pod <pod-name>
kubectl get events --sort-by='.lastTimestamp'
Problem 2: Service unreachable
Likely causes:
- Port mismatch
- Wrong label selector
- Network policy blocking traffic
- Node networking issues
Diagnosis:
kubectl get svc <service-name>
kubectl get endpoints <service-name>
kubectl run -it --rm --image=busybox test-pod -- nslookup <service-name>
Problem 3: Image pull failures
Likely causes:
- Image missing or name misspelled
- Private registry authentication failure
- Network problems
- Image tag does not exist
Diagnosis:
kubectl describe pod <pod-name>
kubectl get events --field-selector reason=Failed
10.3 Best Practices
1. Resource management
# Set resource requests and limits on every container
resources:
  requests:
    memory: "64Mi"
    cpu: "250m"
  limits:
    memory: "128Mi"
    cpu: "500m"
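The quantity strings above read as follows: "250m" CPU means 250 millicores, i.e. 0.25 of a core, and "64Mi" means 64 mebibytes, i.e. 64 * 1024^2 bytes. A small sketch covering just these two suffixes:

```python
# Interpret the Kubernetes resource quantities used in this guide:
# "Nm" CPU = N millicores; "NMi" memory = N mebibytes.
def parse_cpu(q: str) -> float:
    return int(q[:-1]) / 1000 if q.endswith("m") else float(q)

def parse_mem_bytes(q: str) -> int:
    assert q.endswith("Mi"), "only the Mi suffix is handled in this sketch"
    return int(q[:-2]) * 1024 * 1024

print(parse_cpu("250m"))        # 0.25 cores
print(parse_mem_bytes("64Mi"))  # 67108864 bytes
```

Requests drive scheduling decisions; limits cap what the container may use (exceeding a memory limit gets the container OOM-killed, exceeding a CPU limit throttles it).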
2. Health checks
# Readiness and liveness probes
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 20
3. Image management
- Use specific tags; avoid latest
- Scan images for vulnerabilities regularly
- Use multi-stage builds to keep images small
4. Configuration management
- Manage configuration with ConfigMaps and Secrets
- Avoid hard-coded configuration
- Inject settings via environment variables or volume mounts
5. Security
- Apply the principle of least privilege
- Keep Kubernetes versions up to date
- Enable RBAC and network policies
- Enforce Pod Security Standards
Part 11: Learning Resources
11.1 Official Resources
- Kubernetes documentation: https://kubernetes.io/docs/home/
- Kubernetes on GitHub: https://github.com/kubernetes/kubernetes
- Official CNCF training: https://www.cncf.io/training/
- Kubernetes blog: https://kubernetes.io/blog/
11.2 Books
Kubernetes in Action - Marko Lukša
- Beginner-friendly, comprehensive coverage of Kubernetes concepts
Kubernetes: Up and Running - Brendan Burns, Joe Beda, Kelsey Hightower
- Written by founding contributors; authoritative
Kubernetes Patterns - Bilgin Ibryam, Roland Huß
- Design patterns and best practices for Kubernetes
Kubernetes Security - Liz Rice, Michael Hausenblas
- Focused on Kubernetes security
11.3 Online Courses
- Official tutorials: https://kubernetes.io/docs/tutorials/
- KodeKloud Kubernetes courses: https://kodekloud.com/
- Udemy: Kubernetes for the Absolute Beginners
- Pluralsight: Kubernetes Fundamentals
11.4 Community
- Kubernetes Slack: https://kubernetes.slack.com/
- Kubernetes Forum: https://discuss.kubernetes.io/
- Kubernetes Meetups: https://www.meetup.com/
- Kubernetes China community: https://www.kubernetes.org.cn/
11.5 Hands-On Labs
- Katacoda (discontinued in 2022): https://www.katacoda.com/courses/kubernetes
- Play with Kubernetes: https://labs.play-with-kubernetes.com/
- Kubernetes Basics interactive tutorial: https://kubernetes.io/docs/tutorials/kubernetes-basics/
Part 12: Keep Learning
12.1 Follow Kubernetes Development
- Release cadence: a new Kubernetes version ships roughly every three to four months
- CNCF projects: track the broader CNCF ecosystem
- Cloud-native techniques: service mesh, serverless, GitOps, and more
12.2 Get Involved in the Community
- Contribute code: to Kubernetes or related projects
- Write: share what you learn and build
- Attend conferences: KubeCon and other cloud-native events
- Join a SIG: Special Interest Groups
12.3 Build Your Own Knowledge System
- Map the concepts: organize how Kubernetes ideas relate to each other
- Keep building: continuously deploy real applications
- Record troubleshooting: document and analyze the problems you hit
- Teach others: explaining Kubernetes deepens your own understanding
Conclusion
Kubernetes is the core technology of the cloud-native era; the learning curve is steep, but the payoff is substantial. With systematic study and hands-on practice you can master this powerful orchestration platform. This guide lays out a complete path from beginner to advanced, but remember that practice is the best teacher. As you work through it:
- Get hands-on: don't just read the docs; run things
- Go step by step: start with simple applications and add complexity gradually
- Join the community: exchange ideas with other learners and experts
- Keep learning: the ecosystem moves fast, so stay curious
Good luck on your Kubernetes journey!
