Kubernetes leader election source code analysis

1. leader election

Leader election lets multiple replicas compete for a resource lock so that only a single instance is active at a time. In Kubernetes, a ConfigMap, Lease, or Endpoints object can serve as the resource lock (newer client-go releases favor the Lease-based lock).

1.1 Example

Reading the leader election code directly can be disorienting, so let's start from an example and see how leader election actually runs.

// lease.go
package main

import (
    "context"
    "flag"
    "os"
    "os/signal"
    "syscall"
    "time"

    "github.com/google/uuid"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    clientset "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
    "k8s.io/client-go/tools/leaderelection"
    "k8s.io/client-go/tools/leaderelection/resourcelock"
    "k8s.io/klog/v2"
)

func buildConfig(kubeconfig string) (*rest.Config, error) {
    ...
}

func main() {
    klog.InitFlags(nil)

    var kubeconfig string
    var leaseLockName string
    var leaseLockNamespace string
    var id string

    flag.StringVar(&kubeconfig, "kubeconfig", "", "absolute path to the kubeconfig file")
    flag.StringVar(&id, "id", uuid.New().String(), "the holder identity name")
    flag.StringVar(&leaseLockName, "lease-lock-name", "kube-scheduler", "the lease lock resource name")
    flag.StringVar(&leaseLockNamespace, "lease-lock-namespace", "kube-system", "the lease lock resource namespace")
    flag.Parse()

    if leaseLockName == "" {
        klog.Fatal("unable to get lease lock resource name (missing lease-lock-name flag).")
    }
    if leaseLockNamespace == "" {
        klog.Fatal("unable to get lease lock resource namespace (missing lease-lock-namespace flag).")
    }

    config, err := buildConfig(kubeconfig)
    if err != nil {
        klog.Fatal(err)
    }
    client := clientset.NewForConfigOrDie(config)

    run := func(ctx context.Context) {
        klog.Info("Controller loop...")
    }

    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    ch := make(chan os.Signal, 1)
    signal.Notify(ch, os.Interrupt, syscall.SIGTERM)
    go func() {
        <-ch
        klog.Info("Received termination, signaling shutdown")
        cancel()
    }()

    lock := &resourcelock.LeaseLock{
        LeaseMeta: metav1.ObjectMeta{
            Name:      leaseLockName,
            Namespace: leaseLockNamespace,
        },
        Client: client.CoordinationV1(),
        LockConfig: resourcelock.ResourceLockConfig{
            Identity: id,
        },
    }

    leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
        Lock:            lock,
        ReleaseOnCancel: true,
        LeaseDuration:   60 * time.Second,
        RenewDeadline:   15 * time.Second,
        RetryPeriod:     5 * time.Second,
        Callbacks: leaderelection.LeaderCallbacks{
            OnStartedLeading: func(ctx context.Context) {
                // we're notified when we start - this is where you would
                // usually put your code
                run(ctx)
            },
            OnStoppedLeading: func() {
                // we can do cleanup here
                klog.Infof("leader lost: %s", id)
                os.Exit(0)
            },
            OnNewLeader: func(identity string) {
                // we're notified when new leader elected
                if identity == id {
                    // I just got the lock
                    return
                }
                klog.Infof("new leader elected: %s", identity)
            },
        },
    })
}

Run a process with id 1 to compete for the lease:

# go run lease.go --kubeconfig=/root/.kube/config -id=1 -lease-lock-name=example -lease-lock-namespace=default 
I0222 09:44:40.389093 4054932 leaderelection.go:250] attempting to acquire leader lease default/example...
I0222 09:44:40.400607 4054932 leaderelection.go:260] successfully acquired lease default/example
I0222 09:44:40.402998 4054932 lease.go:87] Controller loop...

As you can see, the process with id 1 (process 1 for short) acquires the lease and then enters its run loop.

Then run a process with id 2 (process 2 for short) to compete for the lease:

# go run lease.go --kubeconfig=/root/.kube/config -id=2 -lease-lock-name=example -lease-lock-namespace=default
I0222 10:26:53.162990 4070458 leaderelection.go:250] attempting to acquire leader lease default/example...
I0222 10:26:53.170046 4070458 lease.go:151] new leader elected: 1

Process 2 cannot acquire the lease, because process 1 is currently the leader and the lease has been neither released nor expired.

Kill process 1 (note: this id is not the process's PID) and check whether process 2 can now acquire the lease:

# go run lease.go --kubeconfig=/root/.kube/config -id=2 -lease-lock-name=example -lease-lock-namespace=default
I0222 10:26:53.162990 4070458 leaderelection.go:250] attempting to acquire leader lease default/example...
I0222 10:26:53.170046 4070458 lease.go:151] new leader elected: 1
I0222 10:31:54.704714 4070458 leaderelection.go:260] successfully acquired lease default/example
I0222 10:31:54.704931 4070458 lease.go:87] Controller loop...

Process 2 eventually acquires the lease. It first has to wait for process 1's lease to expire, which can take up to LeaseDuration after the last successful renewal.

1.2 Source code analysis

1.2.1 The Lease resource

The Lease acts as the resource lock for leader election; when multiple replicas compete for the lease, they are really performing Get, Create, and Update operations on it. Here is how a Lease is represented in Kubernetes:

# kubectl describe lease example -n default
Name:         example
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  coordination.k8s.io/v1
Kind:         Lease
Metadata:
  ...
Spec:
  Acquire Time:            2024-02-22T08:31:54.696415Z
  Holder Identity:         2
  Lease Duration Seconds:  60
  Lease Transitions:       5
  Renew Time:              2024-02-22T08:51:47.060020Z
Events:                    <none>

The Spec section contains five fields:

  • Acquire Time: the time the lease was first acquired.
  • Holder Identity: the identity of the current lease holder.
  • Lease Duration Seconds: how long the lease remains valid before it expires.
  • Lease Transitions: how many times the lease has changed holders.
  • Renew Time: the time the holder last renewed the lease. The holder must keep renewing the lease, otherwise other replicas will take over once it expires.

The details of the competition between replicas are recorded in these fields. The Lease object is stored in etcd, whose Raft-based consensus makes updates to the lease atomic and unique, so only one replica can hold the lease at a time.
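
These Spec fields correspond one-to-one with client-go's LeaderElectionRecord, which is the structure the election code reads and writes. Abridged from k8s.io/client-go/tools/leaderelection/resourcelock/interface.go (field details may vary slightly across client-go versions):

// k8s.io/client-go/tools/leaderelection/resourcelock/interface.go (abridged)
type LeaderElectionRecord struct {
    HolderIdentity       string      `json:"holderIdentity"`
    LeaseDurationSeconds int         `json:"leaseDurationSeconds"`
    AcquireTime          metav1.Time `json:"acquireTime"`
    RenewTime            metav1.Time `json:"renewTime"`
    LeaderTransitions    int         `json:"leaderTransitions"`
}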

1.2.2 Lease acquisition state diagram

The state diagram for multiple replicas competing for the lease is shown below; reading the source code alongside the diagram makes things much clearer.

[State transition diagram: acquiring, renewing, and taking over the lease]
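
In words: every RetryPeriod each replica looks at the lock and decides whether it can create it, renew it, or take it over. The sketch below is a minimal, self-contained Go approximation of that decision (illustrative names and types only, not client-go code):

package main

import (
    "fmt"
    "time"
)

// record mirrors the lease fields that matter for the decision.
type record struct {
    holder       string
    renewTime    time.Time
    leaseSeconds int
}

// tryAcquire reports whether candidate "me" may create, renew, or take over the lock r.
func tryAcquire(me string, r *record, now time.Time) bool {
    if r == nil {
        // No lock exists yet: create it and become leader.
        return true
    }
    expiry := r.renewTime.Add(time.Duration(r.leaseSeconds) * time.Second)
    if r.holder != me && now.Before(expiry) {
        // Someone else holds a live lease: stay a follower and retry later.
        return false
    }
    // Either we already hold the lock (renew it) or the lease expired (take it over).
    return true
}

func main() {
    now := time.Now()
    stale := &record{holder: "1", renewTime: now.Add(-90 * time.Second), leaseSeconds: 60}
    fresh := &record{holder: "1", renewTime: now, leaseSeconds: 60}
    fmt.Println(tryAcquire("2", nil, now))   // true: no lock, create it
    fmt.Println(tryAcquire("2", fresh, now)) // false: held by "1" and not expired
    fmt.Println(tryAcquire("2", stale, now)) // true: lease expired, take over
    fmt.Println(tryAcquire("1", fresh, now)) // true: we are the holder, renew
}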

1.2.3 Creating the lease

Let's start the analysis from creating the lease.

First, delete the existing Lease resource:

# kubectl get lease
NAME      HOLDER   AGE
example   2        23h
# kubectl delete lease example
lease.coordination.k8s.io "example" deleted

Then run lease.go under a debugger and step into leaderelection.RunOrDie to see what the function does.

// k8s.io/client-go/tools/leaderelection/leaderelection.go
func RunOrDie(ctx context.Context, lec LeaderElectionConfig) {
    // create the leader elector instance le
    le, err := NewLeaderElector(lec)
    if err != nil {
        panic(err)
    }
    ...
    // run the leader elector instance
    le.Run(ctx)
}

Step into LeaderElector.Run:

// k8s.io/client-go/tools/leaderelection/leaderelection.go
func (le *LeaderElector) Run(ctx context.Context) {
    ...
    if !le.acquire(ctx) {
        return // ctx signalled done
    }
    ...
    le.renew(ctx)
}

LeaderElector.Run calls two important functions, LeaderElector.acquire and LeaderElector.renew. LeaderElector.acquire is responsible for acquiring (competing for) the lock; LeaderElector.renew is responsible for renewing it so that the leader keeps holding the lock.

Step into LeaderElector.acquire:

// k8s.io/client-go/tools/leaderelection/leaderelection.go
func (le *LeaderElector) acquire(ctx context.Context) bool {
    ...
    succeeded := false
    desc := le.config.Lock.Describe()
    klog.Infof("attempting to acquire leader lease %v...", desc)
    wait.JitterUntil(func() {
        // the key call: tryAcquireOrRenew acquires the lock, or refreshes it if already held
        succeeded = le.tryAcquireOrRenew(ctx)
        le.maybeReportTransition()
        if !succeeded {
            klog.V(4).Infof("failed to acquire lease %v", desc)
            return
        }
        le.config.Lock.RecordEvent("became leader")
        le.metrics.leaderOn(le.config.Name)
        klog.Infof("successfully acquired lease %v", desc)
        cancel()
    }, le.config.RetryPeriod, JitterFactor, true, ctx.Done())
    return succeeded
}
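
acquire keeps invoking tryAcquireOrRenew every RetryPeriod with some jitter until it succeeds or the context is cancelled; once it succeeds, cancel() stops the loop. For intuition, here is a tiny standalone use of wait.JitterUntil (assuming k8s.io/apimachinery is available; closing the channel plays the role of cancel()):

package main

import (
    "fmt"
    "time"

    "k8s.io/apimachinery/pkg/util/wait"
)

func main() {
    stop := make(chan struct{})
    attempts := 0

    go wait.JitterUntil(func() {
        attempts++
        fmt.Println("trying to acquire, attempt", attempts)
        if attempts == 3 {
            close(stop) // in leader election, cancel() stops the loop after acquisition
        }
    }, 1*time.Second, 1.2, true, stop)

    <-stop
}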

The key part of LeaderElector.acquire is LeaderElector.tryAcquireOrRenew:

// k8s.io/client-go/tools/leaderelection/leaderelection.go
func (le *LeaderElector) tryAcquireOrRenew(ctx context.Context) bool {
    now := metav1.NewTime(le.clock.Now())
    leaderElectionRecord := rl.LeaderElectionRecord{
        HolderIdentity:       le.config.Lock.Identity(),
        LeaseDurationSeconds: int(le.config.LeaseDuration / time.Second),
        RenewTime:            now,
        AcquireTime:          now,
    }

    // 1. obtain or create the ElectionRecord
    oldLeaderElectionRecord, oldLeaderElectionRawRecord, err := le.config.Lock.Get(ctx)
    if err != nil {
        if !errors.IsNotFound(err) {
            klog.Errorf("error retrieving resource lock %v: %v", le.config.Lock.Describe(), err)
            return false
        }
        if err = le.config.Lock.Create(ctx, leaderElectionRecord); err != nil {
            klog.Errorf("error initially creating leader election record: %v", err)
            return false
        }
        le.setObservedRecord(&leaderElectionRecord)
        return true
    }
    ...
    return true
}

LeaderElector.tryAcquireOrRenew implements the logic of the state transition diagram; here we only focus on the lock-creation path.

First, LeaderElector.config.Lock.Get uses client-go to fetch the lock information stored in etcd.

// k8s.io/client-go/tools/leaderelection/resourcelock/leaselock.go
func (ll *LeaseLock) Get(ctx context.Context) (*LeaderElectionRecord, []byte, error) {
    // get the lock (the Lease object)
    lease, err := ll.Client.Leases(ll.LeaseMeta.Namespace).Get(ctx, ll.LeaseMeta.Name, metav1.GetOptions{})
    if err != nil {
        return nil, nil, err
    }
    ...
    return record, recordByte, nil
}

No lock is found, so LeaderElector.config.Lock.Create is called to create one.

// k8s.io/client-go/tools/leaderelection/resourcelock/leaselock.go
func (ll *LeaseLock) Create(ctx context.Context, ler LeaderElectionRecord) error {
    var err error
    ll.lease, err = ll.Client.Leases(ll.LeaseMeta.Namespace).Create(ctx, &coordinationv1.Lease{
        ObjectMeta: metav1.ObjectMeta{
            Name:      ll.LeaseMeta.Name,
            Namespace: ll.LeaseMeta.Namespace,
        },
        Spec: LeaderElectionRecordToLeaseSpec(&ler),
    }, metav1.CreateOptions{})
    return err
}
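
LeaderElectionRecordToLeaseSpec is what maps the in-memory record onto the Spec fields shown by kubectl describe. Roughly, as an abridged sketch of the same resourcelock package (exact pointer and time handling may differ between client-go versions):

// k8s.io/client-go/tools/leaderelection/resourcelock/leaselock.go (abridged sketch)
func LeaderElectionRecordToLeaseSpec(ler *LeaderElectionRecord) coordinationv1.LeaseSpec {
    leaseDurationSeconds := int32(ler.LeaseDurationSeconds)
    leaseTransitions := int32(ler.LeaderTransitions)
    return coordinationv1.LeaseSpec{
        HolderIdentity:       &ler.HolderIdentity,
        LeaseDurationSeconds: &leaseDurationSeconds,
        AcquireTime:          &metav1.MicroTime{Time: ler.AcquireTime.Time},
        RenewTime:            &metav1.MicroTime{Time: ler.RenewTime.Time},
        LeaseTransitions:     &leaseTransitions,
    }
}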

Inspect the newly created lock:

# kubectl describe lease example -n default
Name:         example
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  coordination.k8s.io/v1
Kind:         Lease
Metadata:
  ...
Spec:
  Acquire Time:            2024-02-23T05:42:07.781552Z
  Holder Identity:         1
  Lease Duration Seconds:  60
  Lease Transitions:       0
  Renew Time:              2024-02-23T05:42:07.781552Z

1.2.4 Updating the lease

Next, let's follow the lease-update path in the state transition diagram. The scenario: the Lease resource already exists and has expired, and its holder is the process with id 1.

Step into LeaderElector.tryAcquireOrRenew:

func (le *LeaderElector) tryAcquireOrRenew(ctx context.Context) bool {
    ...
    // 1. obtain or create the ElectionRecord
    oldLeaderElectionRecord, oldLeaderElectionRawRecord, err := le.config.Lock.Get(ctx)
    if err != nil {
        ...
    }

    // 2. Record obtained, check the Identity & Time
    if !bytes.Equal(le.observedRawRecord, oldLeaderElectionRawRecord) {
        le.setObservedRecord(oldLeaderElectionRecord)
        le.observedRawRecord = oldLeaderElectionRawRecord
    }
    if len(oldLeaderElectionRecord.HolderIdentity) > 0 &&
        le.observedTime.Add(time.Second*time.Duration(oldLeaderElectionRecord.LeaseDurationSeconds)).After(now.Time) &&
        !le.IsLeader() {
        klog.V(4).Infof("lock is held by %v and has not yet expired", oldLeaderElectionRecord.HolderIdentity)
        return false
    }

    // 3. We're going to try to update. The leaderElectionRecord is set to it's default
    // here. Let's correct it before updating.
    if le.IsLeader() {
        leaderElectionRecord.AcquireTime = oldLeaderElectionRecord.AcquireTime
        leaderElectionRecord.LeaderTransitions = oldLeaderElectionRecord.LeaderTransitions
    } else {
        leaderElectionRecord.LeaderTransitions = oldLeaderElectionRecord.LeaderTransitions + 1
    }

    // update the lock itself
    if err = le.config.Lock.Update(ctx, leaderElectionRecord); err != nil {
        klog.Errorf("Failed to update lock: %v", err)
        return false
    }
    ...
}

In LeaderElector.tryAcquireOrRenew, the code fetches the lock, records the observed lock information, and checks whether the lock's holder is the current requester. If it is, the existing AcquireTime and LeaderTransitions are carried over. Finally, LeaderElector.config.Lock.Update updates the lock:

func (ll *LeaseLock) Update(ctx context.Context, ler LeaderElectionRecord) error {
    ...
    lease, err := ll.Client.Leases(ll.LeaseMeta.Namespace).Update(ctx, ll.lease, metav1.UpdateOptions{})
    if err != nil {
        return err
    }
    ll.lease = lease
    return nil
}

Inspect the updated lease information:

# kubectl describe lease example -n default
Name:         example
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  coordination.k8s.io/v1
Kind:         Lease
Metadata:
  ...
Spec:
  Acquire Time:            2024-02-23T05:42:07.781552Z
  Holder Identity:         1
  Lease Duration Seconds:  60
  Lease Transitions:       0
  Renew Time:              2024-02-23T06:00:52.802923Z

As you can see, the holder has updated Renew Time, while Lease Transitions stays at 0 because the holder did not change.

1.2.5 renew lease

After a holder acquires the lease, it must renew it continuously to keep holding it. Renewal happens right after the lock is acquired; the implementation is as follows:

func (le *LeaderElector) Run(ctx context.Context) {
    ...
    if !le.acquire(ctx) {
        return // ctx signalled done
    }
    ...
    le.renew(ctx)
}

func (le *LeaderElector) renew(ctx context.Context) {
    ...
    wait.Until(func() {
        ...
        err := wait.PollImmediateUntil(le.config.RetryPeriod, func() (bool, error) {
            // acquire and renew the lease
            return le.tryAcquireOrRenew(timeoutCtx), nil
        }, timeoutCtx.Done())
        le.maybeReportTransition()
        desc := le.config.Lock.Describe()
        if err == nil {
            klog.V(5).Infof("successfully renewed lease %v", desc)
            return
        }
        le.metrics.leaderOff(le.config.Name)
        klog.Infof("failed to renew lease %v: %v", desc, err)
        cancel()
    }, le.config.RetryPeriod, ctx.Done())

    // if we hold the lease, give it up
    if le.config.ReleaseOnCancel {
        le.release()
    }
}

Renewing the lease also goes through LeaderElector.tryAcquireOrRenew to update the lease record, so we won't repeat the details here.
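
It is worth relating the three timing parameters from the example at this point: the leader retries renewal every RetryPeriod and must succeed within RenewDeadline, while followers only take over once LeaseDuration has passed since the renewal they last observed. A rough back-of-the-envelope sketch using the example's values (an approximation that ignores clock skew and API latency):

package main

import (
    "fmt"
    "time"
)

func main() {
    // Values from the example above; they are configuration choices, not constants.
    leaseDuration := 60 * time.Second // followers wait at least this long before taking over
    renewDeadline := 15 * time.Second // leader steps down if it cannot renew within this window
    retryPeriod := 5 * time.Second    // interval between acquire/renew attempts

    // Approximate worst-case failover after the leader dies: the observed lease stays
    // valid for leaseDuration, plus up to one retryPeriod before a follower retries.
    fmt.Println("approx. worst-case failover:", leaseDuration+retryPeriod)

    // A healthy leader must complete a renew within renewDeadline, retrying every retryPeriod.
    fmt.Println("renew attempts per deadline:", int(renewDeadline/retryPeriod))
}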

2. kube-scheduler and leader election

In kube-scheduler, scheduling and leader election are wired together in kube-scheduler's Run function:

func Run(ctx context.Context, cc *schedulerserverconfig.CompletedConfig, sched *scheduler.Scheduler) error {
    ...
    // If leader election is enabled, runCommand via LeaderElector until done and exit.
    // configure leader election
    if cc.LeaderElection != nil {
        cc.LeaderElection.Callbacks = leaderelection.LeaderCallbacks{
            // OnStartedLeading is invoked once this replica acquires the lease
            OnStartedLeading: func(ctx context.Context) {
                ...
                // run the scheduler's scheduling loop as the leader
                sched.Run(ctx)
            },
            OnStoppedLeading: func() {
                ...
            },
        }
        leaderElector, err := leaderelection.NewLeaderElector(*cc.LeaderElection)
        if err != nil {
            return fmt.Errorf("couldn't create leader elector: %v", err)
        }
        // run leader election to compete for the lease
        leaderElector.Run(ctx)

        return fmt.Errorf("lost lease")
    }

    // Leader election is disabled, so runCommand inline until done.
    close(waitingForLeader)
    sched.Run(ctx)
    return fmt.Errorf("finished without leader elect")
}
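
On a cluster where kube-scheduler runs with leader election enabled, you can check which replica is currently the leader by reading its Lease, which by default is named kube-scheduler in the kube-system namespace. A small sketch, assuming a kubeconfig at the default location:

package main

import (
    "context"
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    // Build a client from ~/.kube/config (assumption: a local kubeconfig is available).
    config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        panic(err)
    }
    client := kubernetes.NewForConfigOrDie(config)

    // Read the scheduler's Lease and print its holder.
    lease, err := client.CoordinationV1().Leases("kube-system").Get(context.TODO(), "kube-scheduler", metav1.GetOptions{})
    if err != nil {
        panic(err)
    }
    if lease.Spec.HolderIdentity != nil {
        fmt.Println("current kube-scheduler leader:", *lease.Spec.HolderIdentity)
    }
}

This prints the same Holder Identity field that kubectl describe lease kube-scheduler -n kube-system shows.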

3. Summary

This article analyzed the leader election implementation from the source code perspective, and showed how kube-scheduler combines with leader election to run multiple replicas while keeping only a single active instance.
