Collisions between the personal and the commercial in open source have become familiar, as when an irate independent developer pulled a widely used package entirely in 2016, bringing down hundreds of projects. But we don't expect a commercial enterprise to suddenly discover that a package it has long relied on forbids any use at all, and does so for a very good reason.
This was the realization for dapr, a “portable, serverless, event-driven runtime,” with 14k stars on GitHub, when an issue was opened last week about its dependency bouk/monkey. Monkey's license, in its entirety, reads:
I do not give anyone permissions to use this tool for any purpose. Don’t use it.
I’m not interested in changing this license. Please don’t ask.
It raises the question: do even corporate entities read the licenses of packages they use? While Dapr was called out, Arduino, Salesforce, and hundreds of other projects also reference Monkey.
Monkey is popular because it fills a real need. It was created by Bouke van der Bijl, a programmer based in Amsterdam, who was eager to add monkey patching to Go. The feature is useful for tests where, say, a repo that talks to a database is swapped out for a component returning test data. The lack of this feature in Go is a testament to the language's resistance to adding the trendy latest features that have bloated languages like Python and C#.
However, adding monkey patching to Go is not a reasonable goal while respecting anything close to "best practices." Like van der Bijl's previous project, Gonerics—which added generics to Go—Monkey asks why this much-desired feature is not already part of the language, and then adds it by any means necessary.
The appeal of van der Bijl's work is in how far he is willing to go to get the language to work the way he wants. For Monkey, this is achieved with assembly-level fuckery predicated on the insight that running Go programs can modify their own binaries.
Let's say our program is calling function x(). Monkey overwrites the first instructions of x() with a JMP to function y(). Now the body of x() is effectively replaced by y(), which will run to completion and return to the caller. Ordinarily, Go's safety checks would prevent this, but the "unsafe" package allows us to bypass them. The OS has its own safety checks to prevent writing to the loaded binary, but these are evaded with syscall.Mprotect. Of course, with these checks turned off, there is no checking of type safety, and there is a very real possibility of memory corruption.
This, from his blog post, is the heart of the program:
package main

import (
	"syscall"
	"unsafe"
)

func a() int { return 1 }
func b() int { return 2 }

func getPage(p uintptr) []byte {
	return (*(*[0xFFFFFF]byte)(unsafe.Pointer(p & ^uintptr(syscall.Getpagesize()-1))))[:syscall.Getpagesize()]
}

func rawMemoryAccess(b uintptr) []byte {
	return (*(*[0xFF]byte)(unsafe.Pointer(b)))[:]
}

func assembleJump(f func() int) []byte {
	funcVal := *(*uintptr)(unsafe.Pointer(&f))
	return []byte{
		0x48, 0xC7, 0xC2,
		byte(funcVal >> 0),
		byte(funcVal >> 8),
		byte(funcVal >> 16),
		byte(funcVal >> 24), // MOV rdx, funcVal
		0xFF, 0x22, // JMP rdx
	}
}

func replace(orig, replacement func() int) {
	bytes := assembleJump(replacement)
	functionLocation := **(**uintptr)(unsafe.Pointer(&orig))
	window := rawMemoryAccess(functionLocation)

	page := getPage(functionLocation)
	syscall.Mprotect(page, syscall.PROT_READ|syscall.PROT_WRITE|syscall.PROT_EXEC)

	copy(window, bytes)
}

func main() {
	replace(a, b)
	print(a())
}
Comments on his project on Reddit include "This is admirable work in the pursuit of evil" and "I'm pretty sure anyone using this in production code is going straight to hell," as shared in his Hacking Go Internals lecture at Coding Serbia 2015. And yet, perhaps Monkey's wide usage shows it is more stable than van der Bijl had expected.
I spoke with van der Bijl about his work.
» Monkey and Gonerics are both satirical projects that ask whether feature xxx can be made possible in Go—and then actually produce them, despite the risks and extreme work-arounds necessary. Could you tell me how you came up with these ideas?
Before I used Go I used more dynamic languages like Python and Ruby, where you can pass any value anywhere and change class definitions or functions at runtime, whenever you want. Go—and most other statically typed languages like C++, Rust—don't allow you to do this. One of the things that I believe strongly about computers and that drives the way I look at them is that they are completely human-made, they are a world unto themselves that we created. In the end, a CPU is just a pile of very smart sand. If you think about that and take it to its logical conclusion you will realize that a programming language might have an official set of features, but it also has another set of possible features that can be made to happen if you combine the features of the language in the right way.
Take Gonerics, for example. It is widely known that Go does not have support for generics or function templating, though they're working on it. You cannot generate code at compile time. But I got thinking about the features that Go supports and realized that actually no, that is not correct. Go supports downloading packages from the internet from arbitrary sources, and you can create a server that will generate code for you depending on the URL that is requested. So you actually are able to dynamically generate code.
The same thing with Monkey: you have a function and you want to overwrite it with another function. I believe I ended up implementing it this way because of a chat I had with a colleague, who pointed me to the concept of hooking (https://en.wikipedia.org/wiki/Hooking), which is quite common in the Windows world. With this in mind I realized that actually yes, you can overwrite a function with another in basically any language, because you can always modify the memory directly and redirect a function to another.
» As a long-time Go programmer, how do you feel about the balance of what is included in the Go language vs. other languages that are constantly adding new features (C#, etc)?
Go has struck quite an interesting balance by having powerful but simple concurrency features built into the syntax (channels, the 'go' statement) but also lacking type system-level features that many take for granted like function templating. I can appreciate that Go is a very stable language that has been considered in detail. A program I wrote seven years ago still runs unmodified in the latest version for example, while I can't trust some Rust or Ruby code I wrote back then to do the same. The simplicity of the language also makes it easy to jump in and start reading other people's code.
The lack of expressiveness is something I run into sometimes; doing anything with databases and an 'ORM' is often frustrating because of this lack of 'power'. But it does make Go reliable and trustworthy in my mind.
» For dapr, it seems like Monkey was used to swap in test versions of functions. Is this something that is currently awkward to do in Go? Does Monkey make this easier (even if it comes with all kinds of risks)?
If you come from a dynamic language like Ruby or Python you might be inclined to just overwrite a function or object if you are testing the behavior of some code that relies on it. The right way to do this in Go, however, is to pass a value that implements an interface, or to have a function variable. People want to overwrite the time.Now function to pin the time to a specific moment, for example; instead you could have a now func() time.Time variable that you set to time.Now, but then overwrite in your tests. This allows you to achieve the same thing without overwriting a function globally.
Monkey does make this easier, of course, because it doesn't require you to re-architect your own or other people's code.
» Out of the 465 projects depending on it, have the uses of the library been similar to dapr’s?
Yes, people mostly want to use it for writing tests.
» In your conference presentation, you mention that garbage collection would cause programs to crash, should it run at the wrong moment. Wouldn’t these projects using Monkey randomly have their tests fail when that happens?
IIRC, I said that based on something one of the Go maintainers commented on a Reddit thread, but I believe they since deleted it, so I'm not sure if it was accurate.
The trickiest thing with Monkey is that you have to disable inlining, that's something that took a while to figure out. And it means you have to set that flag whenever you build or test code that relies on it, which already says you're doing something wrong—one of the big reasons Go has great tooling is because there's so little to configure. So if there's this flag you have to set everywhere all of a sudden, then that's not good.
» What was it like to first see the HN thread about Monkey and watch the discussion unfolding? You've now archived the project, was that in response to the thread?
I was laughing out loud; people's reactions to it are exactly why I created the project in the first place. I never intended to use it or for it to be used; I only created it to see how people would react and for my own enjoyment. And it has certainly been a great success in that regard. It is almost a piece of modern art in that way: the project is completely irrelevant at this point, and only the reaction to it matters.
I archived the project sometime last year—and still people keep using it! The license was added at one point because someone made an issue asking for one, so I added the license to make clear that it's not a project that should be seriously used. Then people kept asking me to change the license, so I added a line that makes clear that I am not interested in changing. Then people still kept asking, so I removed 'Issues' from the GitHub repository and archived it.
If anyone actually wants to do monkey patching in Go, they can use this project which is being kept up to date and has a permissive license.
But, just to spell it out in case it still isn't clear: you shouldn't!