By morgan in gnovm — Sep 25, 2024

gno will have to make floating points deterministic

floating point arithmetic sometimes yields slightly different results on different hardware. this will be a problem we'll have to solve in gno.land to avoid chain forks.

those uninitiated in the conundrums of floating points might think that, because floating points are covered by a standard, there is no way that results can possibly differ; and that trying to talk about "floating point determinism" in the first place is an insult to such a wonderful and perfectly-created system, the best to represent the set of real numbers in the digital world.

however, one of the fundamental shifts in perspective that i think one must take in order to have a deeper understanding of the art of programming is that software is not the platform; hardware is.

i claim no credit for this idea, this is entirely taken from Mike Acton's talk on data oriented programming. linked is the specific part of the talk where he says this; but really, watch it, it's good.

i recommend reading two blog posts that explain the problems with floating point determinism; the first is a "linkdump" on Gaffer on Games, which points to many different resources; while the second one (linked by the first) is on Yossi Kreinin's blog, "Consistency: how to defeat the purpose of IEEE floating point". this last one in particular is very good; it points out some strategies to circumvent these problems.

this blog post will try to delve a bit into specifically how this problem comes to light in gno, what will likely be our temporary solution for mainnet and what we can do in the future about it.

how gno handles floating points

right now, gno has a "restricted" subset of operations used for floating points: the common binary operations + - / * %, and conversions among floats and integers. floating-point constants are handled at the software level, using an arbitrary-precision decimal (software) library. constants are in general evaluated during preprocess, the "compilation" stage of the gnovm, and are a different story. you can inspect the code for op_binary and for values_conversions to see the specific operations that happen during execution.

in general, gno doesn't fuse add and multiply, and the small subset of operations needed leaves little room for the go compiler to perform complex optimizations. so, when reading yosefk's article, it might seem that the main pain would be x86-specific, like calculations carried out with 80 bits of precision. however, it turns out that there are also many cases on "newer" architectures (like arm64) which may not even have been well-known at the time of that article, like this issue, where converting large floats into integers yields 0x7fffffff_ffffffff on arm64 instead of 0x80000000_00000000 (as on amd64).
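to make the two pitfalls above concrete, here is a minimal sketch in plain go (not gnovm code; the function names are made up for illustration). it shows how a fused multiply-add can give a different answer from a separately rounded multiply and add, and how an out-of-range float-to-integer conversion is left implementation-dependent by the go spec. the last printed value is deliberately not annotated, because it depends on the machine that runs it.

```go
package main

import (
	"fmt"
	"math"
)

// unfused computes x*y + z with two roundings. Per the Go spec, the
// explicit float64(...) conversion rounds the product and prevents the
// compiler from fusing the multiply and the add.
func unfused(x, y, z float64) float64 {
	return float64(x*y) + z
}

// fused computes x*y + z with a single rounding, as a hardware FMA
// instruction would.
func fused(x, y, z float64) float64 {
	return math.FMA(x, y, z)
}

// toInt64 converts a float64 to an int64. In-range values are truncated
// toward zero; for out-of-range values the Go spec says the result is
// implementation-dependent.
func toInt64(f float64) int64 {
	return int64(f)
}

func main() {
	x := 1 + math.Ldexp(1, -27)    // 1 + 2^-27
	z := -(1 + math.Ldexp(1, -26)) // -(1 + 2^-26)

	// The exact product x*x is 1 + 2^-26 + 2^-54; the last term is lost
	// when the product is rounded to float64 before the addition.
	fmt.Println(unfused(x, x, z)) // 0
	fmt.Println(fused(x, x, z))   // 2^-54, the term the unfused version lost

	// 1e19 does not fit in an int64, so the bits printed here may differ
	// between architectures.
	fmt.Printf("%#x\n", uint64(toInt64(1e19)))
}
```

if two validators disagreed on either of these results, they would compute different state from the same transaction.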

ultimately, floating points are a very good piece of technology that every chipmaker is trying to optimise (for good reason) and where there are many edge cases where the answer "but it's a standard" is simply not satisfactory. it's very plausible that if we were to have some part of the validators run on amd64, and the other on arm64 or even i686, the chain would eventually fork. :'( that's bad!

so, we need a way to make these inconsistencies not come to light in our running chain. the problem is complex, though: so we'll have to cut corners in order to discover...

how we'll "solve" this for mainnet

our first mainnet launch will be heavily centralized, and validators will primarily be picked from the existing contributors to the gno project. this is a temporary phase; we intend to launch mainnet soon so we can reach more and more contributors. however, in the first phase gnot will be "locked": essentially, it cannot be sent to other users or used for anything other than gas fees, with the exception of a likely faucet that can be used to distribute funds to developers who want to build on gno.land.

for its first phase, mainnet will be a space to develop and help shape gno.land. it's a way to keep on developing and iterating without creating huge financial problems, and to easily coordinate all upgrades with a small set of trusted builders of the project while things are still in flux and we're waiting for the dust to settle.

what has this to do with floating points?

a similar situation has arisen recently, as we now automatically format all code that is added on-chain. in it, moul has said:

However, enforcing specific build (GOVERSION) and runtime (GOOS, GOARCH) constraints within the same BFT network is, in my opinion, a reasonable approach in the meantime.

these are temporary constraints. eventually, also for the formatter, we'll likely maintain our own formatter, forked from go/fmt, whose upgrades we control. and similarly, for floating points, we can eventually look to have these implemented in software.
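as a rough illustration of what such an interim constraint could look like, here is a sketch of a startup check that compares the running binary's platform against a pinned one. everything here is hypothetical: the `platform` type, `checkPlatform`, and the pinned values are assumptions made for the example, not something gno.land implements or has decided on.

```go
package main

import (
	"fmt"
	"runtime"
)

// platform groups the build and runtime properties that a network could
// agree to pin. (Hypothetical type, for illustration only.)
type platform struct {
	goos, goarch, goversion string
}

// checkPlatform returns an error when got differs from want in any field.
func checkPlatform(got, want platform) error {
	if got != want {
		return fmt.Errorf("platform mismatch: running %+v, network expects %+v", got, want)
	}
	return nil
}

func main() {
	// Assumed pinned values; a real network would choose its own.
	want := platform{goos: "linux", goarch: "amd64", goversion: "go1.22.0"}
	got := platform{
		goos:      runtime.GOOS,
		goarch:    runtime.GOARCH,
		goversion: runtime.Version(),
	}
	if err := checkPlatform(got, want); err != nil {
		// A validator would likely refuse to start here; the sketch only
		// reports the mismatch.
		fmt.Println("warning:", err)
		return
	}
	fmt.Println("platform matches the pinned configuration")
}
```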

how we can solve this, eventually

writing this post has demonstrated how useful it is to think things through in order to write them down, and to do the research needed to write them out.

and the reason why i say this... is this file in the go runtime package: softfloat64.go.

Could This 600-LOC File Actually Solve All Our Hurdles With Floating Points? (probably, yes)

i was ready to write how we'll likely need to make a floating point implementation and then try to battle test it to see how much it deviates from the hardware floating point results; but it seems like the go team had to try and do this, anyway. so we're likely to use some derivation of this file for our software implementation. more research needed, though.
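the "battle testing" mentioned above could be sketched as a differential test: run the same operation in software and in hardware, then compare the raw bits. softfloat64.go is internal to the go runtime and can't be imported, so this sketch uses math/big with 53 bits of precision as a stand-in software implementation; that substitution, and the names `softAdd` and `sameBits`, are assumptions made for the example. it only handles finite inputs, since big.Float has no NaN.

```go
package main

import (
	"fmt"
	"math"
	"math/big"
)

// softAdd adds two finite float64 values in software. big.Float is used
// as a stand-in for a real softfloat implementation: with 53 bits of
// precision and the default round-to-nearest-even mode, the sum is
// rounded the same way an IEEE 754 double addition is.
func softAdd(a, b float64) float64 {
	x := new(big.Float).SetFloat64(a) // SetFloat64 panics on NaN
	y := new(big.Float).SetFloat64(b)
	z := new(big.Float).SetPrec(53).Add(x, y)
	f, _ := z.Float64()
	return f
}

// sameBits reports whether a and b have identical bit patterns, which is
// stricter than ==: it distinguishes +0 from -0.
func sameBits(a, b float64) bool {
	return math.Float64bits(a) == math.Float64bits(b)
}

func main() {
	inputs := [][2]float64{{0.1, 0.2}, {1e16, 1}, {1.5, -1.5}, {3, 1e-20}}
	for _, in := range inputs {
		hw := in[0] + in[1]         // hardware addition
		sw := softAdd(in[0], in[1]) // software addition
		fmt.Printf("%v + %v: hw=%#x sw=%#x same=%v\n",
			in[0], in[1], math.Float64bits(hw), math.Float64bits(sw), sameBits(hw, sw))
	}
}
```

a real test would run something like this over a large set of random and edge-case inputs, for every operation gno supports, on every architecture validators might use.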

hopefully it still was a fun ride into the intricacies of floats!
