Bounds checking in Go

Posted on Nov 15, 2023

When accessing an element in a slice, you might go past its length. This is a typical off-by-one error, and fortunately there is a safeguard for this: bounds checking. It protects you from buffer overflow attacks and tells you there’s something wrong with your code - so thank you, bounds checking. Before accessing the element, the index is compared with the slice’s length to make sure it is within bounds. If it is not, runtime.panicIndex() is called!

a := []int{1,2,3}
_ = a[3] // out of bounds

I am simplifying things here: there’s actually two types of bounds checks. The one I described above watches after indexes, but slicing is also covered.

Bounds checking elimination

In later phases of compilation, there’s an optimization to remove unecessary bounds checks: if a check covers multiple operations following it, there’s no need to recheck.

a := []int{1,2,3}
_ = a[2] // bound check for 2
_ = a[1] // ...no need to bound check here

That optimization is called Bounds Checking Elimination, or BCE. The Go compiler is quite smart about it, and will remove a lot of checks to speed up the execution and size of the binary.

BCE can be disabled using -gcflags=-B - but think before doing this.

Early bounds check

You can hint the compiler and force a bounds check early. But first, let’s consider the following code:

package main

import "fmt"

func main() {
    s := sum([]int{1,2,3})
    fmt.Printf("1+2+3=%d\n", s)
}

func sum(a []int) int {
    var s int
    for i := 0; i <= 2; i++ {
        s += a[i] // Found IsInBounds
    }
    return s
}

Passing -gcflags="-d=ssa/check_bce/debug=1" to go build will print out where bounds checks are happening:

❯ go build -gcflags="-d=ssa/check_bce/debug=1" ./cmd/b
# playground/bce/cmd/b
cmd/b/a.go:13:9: Found IsInBounds

Let’s dive in the assembly for that program, obtained with the -S flag:

00000 TEXT    main.sum(SB), ABIInternal, $32-24
00000 MOVD    16(g), R16
00004 PCDATA  $0, $-2
00004 CMP     R16, RSP
00008 BLS     100
00012 PCDATA  $0, $-1
00012 MOVD.W  R30, -32(RSP)
00016 MOVD    R29, -8(RSP)
00020 SUB     $8, RSP, R29
00024 MOVD    R0, main.a(FP)
00028 FUNCDATA        $0, gclocals·wgcWObbY2HYnK2SU/U22lA==(SB)
00028 FUNCDATA        $1, gclocals·J5F+7Qw7O7ve2QcWC7DpeQ==(SB)
00028 FUNCDATA        $5, main.sum.arginfo1(SB)
00028 FUNCDATA        $6, main.sum.argliveinfo(SB)
00028 PCDATA  $3, $1
00028 MOVD    ZR, R2
00032 MOVD    ZR, R3
00036 JMP     52
00040 MOVD    (R0)(R2<<3), R4
00044 ADD     $1, R2, R2     # Here is our i <= 2 loop
00048 ADD     R3, R4, R3
00052 CMP     $2, R2
00056 BGT     72             # And here's where we jump out of the loop when finished
00060 CMP     R2, R1         # Here is the bound check
00064 BHI     40             # If the bound check passes, jump to the start of the loop
00068 JMP     88             # Out of bounds! jump to 88, we'll start to panic at 92
00072 MOVD    R3, R0
00076 LDP     -8(RSP), (R29, R30)
00080 ADD     $32, RSP
00084 RET     (R30)
00088 MOVD    R2, R0
00092 PCDATA  $1, $1
00092 CALL    runtime.panicIndex(SB)
00096 HINT    $0
00100 NOP
00100 PCDATA  $1, $-1
00100 PCDATA  $0, $-2
00100 MOVD    R0, 8(RSP)
00104 MOVD    R1, 16(RSP)
00108 MOVD    R2, 24(RSP)
00112 MOVD    R30, R3
00116 CALL    runtime.morestack_noctxt(SB)
00120 MOVD    8(RSP), R0
00124 MOVD    16(RSP), R1
00128 MOVD    24(RSP), R2
00132 PCDATA  $0, $-1
00132 JMP     0

As we can see here, the bound check is executed at each iteration - it is in fact a condition to continue looping.

Now let’s give a hint to the compiler: we’ll be forcing an early bounds check. This way, the compiler can optimize out the bounds check in the loop.

package main

import "fmt"

func main() {
    s := sum([]int{1,2,3})
    fmt.Printf("1+2+3=%d\n", s)
}

func sum(a []int) int {
    _ = a[2] // early bounds check - Found IsInBounds

    var s int
    for i := 0; i <= 2; i++ {
        s += a[i]
    }
    return s
}
❯ go build -gcflags="-d=ssa/check_bce/debug=1" ./cmd/b
# playground/bce/cmd/b
cmd/b/a.go:11:7: Found IsInBounds

This now compiles to:

00000 TEXT    main.sum(SB), ABIInternal, $32-24
00000 MOVD    16(g), R16
00004 PCDATA  $0, $-2
00004 CMP     R16, RSP
00008 BLS     96
00012 PCDATA  $0, $-1
00012 MOVD.W  R30, -32(RSP)
00016 MOVD    R29, -8(RSP)
00020 SUB     $8, RSP, R29
00024 MOVD    R0, main.a(FP)
00028 FUNCDATA        $0, gclocals·wgcWObbY2HYnK2SU/U22lA==(SB)
00028 FUNCDATA        $1, gclocals·J5F+7Qw7O7ve2QcWC7DpeQ==(SB)
00028 FUNCDATA        $5, main.sum.arginfo1(SB)
00028 FUNCDATA        $6, main.sum.argliveinfo(SB)
00028 PCDATA  $3, $1
00028 CMP     $2, R1         # Here is our early bound check
00032 BLS     84             # If it does not pass, then jump to 84
00036 MOVD    ZR, R1
00040 MOVD    ZR, R2
00044 JMP     60
00048 MOVD    (R0)(R1<<3), R3
00052 ADD     $1, R1, R1     # Here is the start of our i <= 2 loop
00056 ADD     R2, R3, R2
00060 CMP     $2, R1
00064 BLE     48             # Continue loop
00068 MOVD    R2, R0
00072 LDP     -8(RSP), (R29, R30)
00076 ADD     $32, RSP
00080 RET     (R30)
00084 MOVD    $2, R0
00088 PCDATA  $1, $1
00088 CALL    runtime.panicIndex(SB)
00092 HINT    $0
00096 NOP
00096 PCDATA  $1, $-1
00096 PCDATA  $0, $-2
00096 MOVD    R0, 8(RSP)
00100 MOVD    R1, 16(RSP)
00104 MOVD    R2, 24(RSP)
00108 MOVD    R30, R3
00112 CALL    runtime.morestack_noctxt(SB)
00116 MOVD    8(RSP), R0
00120 MOVD    16(RSP), R1
00124 MOVD    24(RSP), R2
00128 PCDATA  $0, $-1
00128 JMP     0

See? the early bounds check hinted the compiler that we just needed one bound check for all the the function body. Now, the loop is free of bounds checking, as we already checked for index 2, and as we stay lower or equal.

Now what about this?

package main

import "fmt"

func main() {
	s := sum([]int{1, 2, 3})
	fmt.Printf("1+2+3=%d\n", s)
}

func sum(a []int) int {
	_ = a[2] // Found IsInBounds

	var s int
	for i := 0; i < 3; i++ { // we're using i<3 instead of i<=2
		s += a[i] // Found IsInBounds
	}
	return s
}

It should be the same, right? The loop comparation and conditional jump may change, but the result should stay the same.

❯ go build -gcflags="-d=ssa/check_bce/debug=1" ./cmd/b
# playground/bce/cmd/b
cmd/b/a.go:11:7: Found IsInBounds
cmd/b/a.go:15:9: Found IsInBounds

It’s even worse! Let’s look at the assembly:

00000 TEXT    main.sum(SB), ABIInternal, $32-24
00000 MOVD    16(g), R16
00004 PCDATA  $0, $-2
00004 CMP     R16, RSP
00008 BLS     116
00012 PCDATA  $0, $-1
00012 MOVD.W  R30, -32(RSP)
00016 MOVD    R29, -8(RSP)
00020 SUB     $8, RSP, R29
00024 MOVD    R0, main.a(FP)
00028 FUNCDATA        $0, gclocals·wgcWObbY2HYnK2SU/U22lA==(SB)
00028 FUNCDATA        $1, gclocals·J5F+7Qw7O7ve2QcWC7DpeQ==(SB)
00028 FUNCDATA        $5, main.sum.arginfo1(SB)
00028 FUNCDATA        $6, main.sum.argliveinfo(SB)
00028 PCDATA  $3, $1
00028 CMP     $2, R1             # Here's our early bounds check
00032 BLS     104
00036 MOVD    ZR, R2
00040 MOVD    ZR, R3
00044 JMP     60
00048 MOVD    (R0)(R2<<3), R4
00052 ADD     $1, R2, R2         # Here's our loop
00056 ADD     R3, R4, R3
00060 CMP     $3, R2
00064 BGE     80                 # Here's were we jump out of the loop
00068 CMP     R2, R1             # Bound check 
00072 BHI     48                 # If it passes, continue the loop
00076 JMP     96                 # Out of bound, start to panic
00080 MOVD    R3, R0
00084 LDP     -8(RSP), (R29, R30)
00088 ADD     $32, RSP
00092 RET     (R30)
00096 MOVD    R2, R0
00100 PCDATA  $1, $1
00100 CALL    runtime.panicIndex(SB)
00104 MOVD    $2, R0
00108 CALL    runtime.panicIndex(SB)
00112 HINT    $0
00116 NOP
00116 PCDATA  $1, $-1
00116 PCDATA  $0, $-2
00116 MOVD    R0, 8(RSP)
00120 MOVD    R1, 16(RSP)
00124 MOVD    R2, 24(RSP)
00128 MOVD    R30, R3
00132 CALL    runtime.morestack_noctxt(SB)
00136 MOVD    8(RSP), R0
00140 MOVD    16(RSP), R1
00144 MOVD    24(RSP), R2
00148 PCDATA  $0, $-1
00148 JMP     0

The bounds check at each loop iteration reappeared. Also, we’re doing that early bounds check - so we’re actually doing even more of it! So, what happened here? The compiler has seen 3 in the conditional part of the loop, which is bigger than 2, our initial bounds check. To be better safe than sorry, it did not eliminate the bounds checks.

As Go developers, appreciating the balance between safety and performance is key. While it’s tempting to dive into micro-optimizations like manually tweaking bounds checks, it’s often more productive to rely on the compiler’s intelligence and focus on writing clear, maintainable code. In most cases, Go’s compiler will do an excellent job of optimizing your code without sacrificing safety.

Remember, the ultimate goal is to write efficient, robust software, and understanding concepts like bounds checking and their elimination is just one part of that journey. As you continue to develop in Go, keep these insights in mind to help you make informed decisions about your code’s safety and performance. Have fun!

All fields requesting personally identifiable information (PII) are optional. Any personal data you provide will be kept private and is used solely for moderation purposes. Your information is stored securely on a server in Switzerland and will not be used directly, shared, or disclosed to third parties. By submitting your personal data, you consent to its indefinite storage under these terms. Should you wish to have your data removed or have any privacy concerns, please contact me using the information provided in the 'About' section.