Preemption in Go: an introduction
In my post about the sync.Pool
internals, I briefly touched on preemption, but I didn’t explain what it really is about. Let’s fix that.
If you are in France, like me, you just completed your income tax declaration. If the go runtime is the tax administration, cpu-time is money, and preemption would be a letter of reminder. Or a bailiff banging on your door if you’ve been naughty.
What is preemption?
Preemption is one of the ways the go runtime ensures fairness. It distributes the time between all goroutines, so that no goroutine can take all the cpu resources and starve the others. Not only other goroutines could starve, but scheduling could also be delayed, so this could derail into serious problems.
Preemption as we know it today was first introduced in go 1.2, back in 2013. Before that, the go runtime was not preemptive, meaning that it would not enforce time distribution fairness. The whole stuff would rely on the goroutines to play by the rules. It was a time where we could trust each others, we didn’t need to lock our doors, and children could go out all day without parents knowing what they were up to. Ah, the good old days… Where was I again? Oh yes, the go runtime.
So a goroutine could suck up all the cpu-time, starve the others, and delay new ones, until it explicitly gave up control, such as by calling runtime.Gosched()
or during I/O operations.
And here’s preemption coming to the rescue. It occurs at so-called “safe-points”, which are places in the code where we know for sure that we can take over control flow from a goroutine without getting into trouble. What I mean by that is that we don’t want to preempt goroutines anytime, but only at places where it is easy to stop them - think about it - what if I suddenly stopped you while taking a dump? It would have been easier to have stopped you before, or after, but not in the middle of it. Not sure about the analogy, but you got it.
The first step towards preemption is using blocked safe-points. These safe-points happens either during descheduling, or when a goroutine is blocked on synchronization primitives, or on a system call. The goroutine is just hanging there doing nothing, so of course it can be preempted. But this is not enough! What if a goroutine does not stop processing stuff, such as crunching numbers?
Synchronous preemption
Synchronous preemption happens when a goroutine checks for a preemption request: the go runtime communicates its intent to preempt a goroutine, the goroutine “sees” the request and bails out. But there’s something interesting in how this is implemented.
As you may already know, a function begins with a prologue, and ends with an epilogue. These two parts are added by the compiler to do various stuff, preparing and finalizing the function call. In the prologue, there’s a stack bound check, which is a complicated way of saying that we check that there’s enough space on the stack for this call.
Let’s take a look at this engineering masterpiece:
func main() {
if sum(10, 10) != 20 {
panic("earth must be flat")
}
}
func sum(a, b int) int {
var s int
if a > 0 {
s++
a--
}
if b > 0 {
s++
b--
}
if s == 0 {
return 0
}
return s + sum(a, b)
}
That’s how the cool kids do it. Now, let’s have a look at what the compiler spits out when confronted with this level of quality, using go tool compile -S
:
00000 CMPQ SP, 16(R14)
00000 TEXT main.sum(SB), ABIInternal, $32-16
00004 PCDATA $0, $-2
00004 JLS 89
00006 PCDATA $0, $-1
00006 PUSHQ BP
00007 MOVQ SP, BP
00010 SUBQ $24, SP
00014 FUNCDATA $0, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
00014 FUNCDATA $1, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
00014 FUNCDATA $5, main.sum.arginfo1(SB)
00014 FUNCDATA $6, main.sum.argliveinfo(SB)
00014 PCDATA $3, $1
00014 LEAQ -1(AX), CX
00018 LEAQ -1(BX), DX
00022 TESTQ AX, AX
00025 SETGT SIB
00029 CMOVQGT CX, AX
00033 MOVBLZX SIB, CX
00037 LEAQ 1(CX), SI
00041 TESTQ BX, BX
00044 CMOVQGT SI, CX
00048 CMOVQGT DX, BX
00052 TESTQ CX, CX
00055 JNE 65
00057 XORL AX, AX
00059 ADDQ $24, SP
00063 POPQ BP
00064 RET
00065 MOVQ CX, main.s+16(SP)
00070 PCDATA $1, $0
00070 CALL main.sum(SB)
00075 MOVQ main.s+16(SP), CX
00080 ADDQ CX, AX
00083 ADDQ $24, SP
00087 POPQ BP
00088 RET
00089 NOP
00089 PCDATA $1, $-1
00089 PCDATA $0, $-2
00089 MOVQ AX, 8(SP)
00094 MOVQ BX, 16(SP)
00099 CALL runtime.morestack_noctxt(SB)
00104 PCDATA $0, $-1
00104 MOVQ 8(SP), AX
00109 MOVQ 16(SP), BX
00114 JMP 0
The first instruction CMPQ SP, 16(R14)
compares the stack pointer with the stack guard. Then at 00004
, there’s a conditional jump JLS 89
which says that we should jump at 89 if the stack pointer was greater than or equal to the stack guard, which would indicate that there’s not enough space in the stack for this call.
Then, between 00089
and 00099
, we save AX
and BX
onto the stack so that we do not lose them later on, and call runtime.morestack_noctxt
.
This runtime.morestack_noctxt
is more or less the same as runtime.morestack
which is implemented using assembly, and in the end, it calls runtime.newstack
- just trust me on this.
runtime.newstack
’s job is to allocate a bigger stack, and relocate the old one onto it. Let’s have a quick look at it:
func newstack() {
[...]
stackguard0 := atomic.Loaduintptr(&gp.stackguard0)
preempt := stackguard0 == stackPreempt
if preempt {
if !canPreemptM(thisg.m) {
// Let the goroutine keep running for now.
// gp->preempt is set, so it will be preempted next time.
gp.stackguard0 = gp.stack.lo + stackGuard
gogo(&gp.sched) // never return
}
}
[...]
if preempt {
[...]
// Act like goroutine called runtime.Gosched.
gopreempt_m(gp) // never return
}
[...]
}
And here it is: newstack
contains code related to preemption. It checks if the stack guard was equal to stackPreempt
, which is 0xfffffade
, in which case it fixes the stack guard and preempt right now, calling gopreempt_m
, which will act as if the goroutine itself called runtime.Gosched()
.
This is how the runtime forces preemption at the next function prologue: it poisons the stackguard of the goroutine it wants to preempt, and waits for the next function prologue to fail, as SP
is always greater than 0xfffffade
. Using that trick, we ensure that the goroutine is at a safe-point, in a function prologue. It also does not cost a dime: should the runtime not have preempted the goroutine, the stack bound check would have happened anyway.
But what if the sum
function was crunching number in a tight loop, maybe with a couple inlined functions and other optimizations? There would not be any function prologue to check against the stack guard. The stack bound check technique is a letter of reminder - now we need to call the bailiff.
Asynchronous preemption
Asynchronous preemption is a more recent addition to the go runtime. It was introduced in go 1.14, in 2020, to fix tight loop preemption, which was getting worse with compiler optimizations getting better. It works by sending a SIGURG
signal to the thread where the goroutine is running:
// preemptM sends a preemption request to mp. This request may be
// handled asynchronously and may be coalesced with other requests to
// the M. When the request is received, if the running G or P are
// marked for preemption and the goroutine is at an asynchronous
// safe-point, it will preempt the goroutine. It always atomically
// increments mp.preemptGen after handling a preemption request.
func preemptM(mp *m) {
// On Darwin, don't try to preempt threads during exec.
// Issue #41702.
if GOOS == "darwin" || GOOS == "ios" {
execLock.rlock()
}
if mp.signalPending.CompareAndSwap(0, 1) {
if GOOS == "darwin" || GOOS == "ios" {
pendingPreemptSignals.Add(1)
}
// If multiple threads are preempting the same M, it may send many
// signals to the same M such that it hardly make progress, causing
// live-lock problem. Apparently this could happen on darwin. See
// issue #37741.
// Only send a signal if there isn't already one pending.
signalM(mp, sigPreempt) // sigPreempt is SIGURG
}
if GOOS == "darwin" || GOOS == "ios" {
execLock.runlock()
}
}
On the other side, in the thread, a signal handler listens for it:
func sighandler(sig uint32, info *siginfo, ctxt unsafe.Pointer, gp *g) {
[...]
if sig == sigPreempt && debug.asyncpreemptoff == 0 && !delayedSignal {
// Might be a preemption signal.
doSigPreempt(gp, c)
// Even if this was definitely a preemption signal, it
// may have been coalesced with another signal, so we
// still let it through to the application.
}
[...]
}
// doSigPreempt handles a preemption signal on gp.
func doSigPreempt(gp *g, ctxt *sigctxt) {
// Check if this G wants to be preempted and is safe to
// preempt.
if wantAsyncPreempt(gp) {
if ok, newpc := isAsyncSafePoint(gp, ctxt.sigpc(), ctxt.sigsp(), ctxt.siglr()); ok {
// Adjust the PC and inject a call to asyncPreempt.
ctxt.pushCall(abi.FuncPCABI0(asyncPreempt), newpc)
}
}
// Acknowledge the preemption.
gp.m.preemptGen.Add(1)
gp.m.signalPending.Store(0)
if GOOS == "darwin" || GOOS == "ios" {
pendingPreemptSignals.Add(-1)
}
}
Here, doSigPreempt
checks if we are at an asynchronous safe-point, and forces the goroutine to get into a call to asyncPreempt
. Here is asyncPreempt
:
// asyncPreempt saves all user registers and calls asyncPreempt2.
//
// When stack scanning encounters an asyncPreempt frame, it scans that
// frame and its parent frame conservatively.
//
// asyncPreempt is implemented in assembly.
func asyncPreempt()
//go:nosplit
func asyncPreempt2() {
gp := getg()
gp.asyncSafePoint = true
if gp.preemptStop {
mcall(preemptPark)
} else {
mcall(gopreempt_m)
}
gp.asyncSafePoint = false
}
As said in the comments, asyncPreempt
spills all registers to save the context, and gets into asyncPreempt2
, which in turn goes into gopreempt_m
, that will act as if the goroutine just called runtime.Gosched()
.
If we simplify things a little, we can say that when the runtime requests synchronous preemption and poisons the stack guard, it also prepares for asynchronous preemption. The goroutine has around 10 microseconds to get preempted, or the SIGURG signal will be sent to its thread.
I hope this cleared things up a little. Now go have fun!