making tests run faster
#3119 improves the state of GnoVM tests, greatly speeding them up with some smart caching
over the past week, i've been engaged in one of those seemingly small efforts which end up taking a huge amount of time to get right. what i was trying to achieve was simply to re-organise how we run the gnovm tests; this is the current situation on master:
- there are two systems for running filetests: the one in tests/file.go and the one in pkg/gnolang/eval_test.go
- the former doesn't count towards code coverage, but the latter does
- on the other hand, the latter lacks many features which are otherwise supported in filetests
- filetests are very slow. in the above video, you can see on the right what this looked like before my changes; you can spot that many individual tests run remarkably quickly even though the whole run is slow, possibly indicating some inefficiency going on.
- as an end result: running all the tests on the CI is very slow, taking up to 20 minutes (!!!)
- furthermore, the code in tests/imports and tests/file is an endless amount of spaghetti code held together by duct tape. it was due for an overhaul, just to keep our codebase somewhat sane.
i set out to re-organize all the code for running tests into a new package, gnovm/pkg/test, and to have it be called within the gnovm for testing and code coverage. and the results are good. in this post, we'll delve into a couple of optimizations that made this improvement possible.
do note that these improvements carry over into normal gno test executions; here's an asciinema recording. roughly, all tests in examples/ now run in 50 seconds (with 30 of those spent on one big outlier, p/demo/diff), against the previous ~6 minutes and 30 seconds.
using "store transactions" to speed up realm filetests
one of the slowest parts of the whole dance around filetests was for tests specifically testing behaviour happening in realms; i.e., tests which start with a // PKGPATH: gno.land/r/... directive. here's the code that ran when we went through filetests (bz is the source of the filetest):
```go
// save package using realm crawl procedure.
memPkg := &gnovm.MemPackage{
	Name: string(pkgName),
	Path: pkgPath,
	Files: []*gnovm.MemFile{
		{
			Name: "main.gno", // dontcare
			Body: string(bz),
		},
	},
}
// run decls and init functions.
m.RunMemPackage(memPkg, true)
// reconstruct machine and clear store cache.
// whether package is realm or not, since non-realm
// may call realm packages too.
store.ClearCache()
m.PreprocessAllFilesAndSaveBlockNodes()
pv2 := store.GetPackage(pkgPath, false)
m.SetActivePackage(pv2)
gno.EnableDebug()
if rops != "" {
	// clear store.opslog from init function(s),
	// and PreprocessAllFilesAndSaveBlockNodes().
	store.SetLogStoreOps(true) // resets.
}
m.RunMain()
```
to summarize what's going on:

- we're constructing a package with a single file, and "running" it (RunMemPackage parses the MemPackage and runs its initialization functions)
- then we're clearing the store cache and calling PreprocessAllFilesAndSaveBlockNodes (!!!!)
- finally, we're re-getting the package from the store and calling main()
you might not necessarily know what the intermediary step translates to; but let me tell you: it's slow. PreprocessAllFilesAndSaveBlockNodes, a method i actually call "pafas" as an acronym because its name is way too long, is a crucial part of reconstructing the gnovm store when we re-start the chain, run in the keeper initialisation. it is only run there (aside from the filetest execution), and for good reason. what does it do? it takes all the packages in the store and runs Preprocess on all of them, so that the gnovm store's internal cacheNodes can be recovered.
essentially, it re-compiles all of the packages in the store at that time. which, even for a normal test with a few imports, can take a whole lot of time, because some packages even in the standard library take a while to compile; take, for instance, unicode.
the reason it does this, essentially, is to simulate a "real-life" situation where many of the objects within the realms are RefValues (this post can help you understand what a RefValue is). but running pafas here is like tearing down and rebuilding your house to clean your bedroom.
it's a lot of unnecessary work that we don't even do on-chain, where we also need to be able to execute code and roll back the store to a previous state when there's a problem. there were actually problems related to this, which i solved in #2319, but that's a story for another time. this is the updated code:
```go
// save package using realm crawl procedure.
memPkg := &gnovm.MemPackage{
	Name: string(pkgName),
	Path: pkgPath,
	Files: []*gnovm.MemFile{
		{
			Name: filename,
			Body: string(content),
		},
	},
}
orig, tx := m.Store, m.Store.BeginTransaction(nil, nil)
m.Store = tx
// run decls and init functions.
m.RunMemPackage(memPkg, true)
// clear store cache and reconstruct machine from committed info
// (mimicking on-chain behaviour).
tx.Write()
m.Store = orig
pv2 := m.Store.GetPackage(pkgPath, false)
m.SetActivePackage(pv2)
gno.EnableDebug()
// clear store.opslog from init function(s).
m.Store.SetLogStoreOps(true) // resets.
m.RunStatement(gno.S(gno.Call(gno.X("main"))))
```
using BeginTransaction, we can create a new gno store without touching the objects of the upper store. this allows us to do something similar to what was done previously, but with significantly less effort.
however, there is a second trick we needed to get the full performance improvements:
caching the imports
the other important improvement concerns how imports are processed.
previously, all tests were executed with a fresh store. this is not a major problem when you run a small filetest doing a bit of arithmetic and printing it out; however, it becomes a problem for any filetest doing any non-trivial import.
it's a similar problem to the one above, where we preprocessed the code of each package on the execution of each individual filetest. however here, the problem is that for each individual filetest, we essentially re-load entire packages from the standard libraries, performing essentially the same computation each time.
so, the solution there is to do basically what we've always done on-chain since, like, forever: load the packages into the store, and re-use the store.
```go
// Eagerly load imports.
// This is executed using opts.Store, rather than the transaction store;
// it allows us to only have to load the imports once (and re-use the cached
// versions). Running the tests in separate "transactions" means that they
// don't get the parent store dirty.
if err := LoadImports(opts.Store, filename, content); err != nil {
	return runResult{Error: err.Error()}
}
// imports loaded - reset stdout.
opts.stdout.Reset()
```
this is executed shortly before the other code i showed you. opts.Store is a gno store which is common to all of the filetests; LoadImports simply performs GetPackage on all of the imports:
```go
func LoadImports(store gno.Store, filename string, content []byte) (err error) {
	defer [...]
	fl, err := parser.ParseFile(token.NewFileSet(), filename, content, parser.ImportsOnly)
	if err != nil {
		return fmt.Errorf("parse failure: %w", err)
	}
	for _, imp := range fl.Imports {
		impPath, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return fmt.Errorf("unexpected invalid import path: %v", impPath)
		}
		if gno.IsRealmPath(impPath) {
			// Don't eagerly load realms.
			// Realms persist state and can change the state of other realms in initialization.
			continue
		}
		pkg := store.GetPackage(impPath, true)
		if pkg == nil {
			return fmt.Errorf("package not found: %v", impPath)
		}
	}
	return nil
}
```
as you might notice, there is an important distinction: we don't eagerly load realms. this ensures that if a realm changes another realm in its initialization, the change is not persisted into the shared store, so the tests truly run in isolation.
these improvements, combined, brought the speed-up in execution i showed you above. i'm finishing up some other improvements and ironing out some bugs; but this should be exciting, and give us a much more tangible "benchmark" for the performance of the gnovm.
although, what i'm looking forward to the most is for the vm tests to run in as little as 2 minutes (instead of the current 20).