Copy-and-patch is a technique for reducing the amount of effort it takes to write a JIT by leaning on an existing AOT compiler's code generator. Instead of generating machine code yourself, you can get LLVM (or another compiler) to generate a small snippet of code for each operation in your internal IR. Then codegen is simply a matter of copying the precompiled snippet and patching up the references.
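To make the "copy, then patch" step concrete, here is a minimal sketch. It assumes hand-written x86-64 byte templates standing in for what an AOT compiler would actually emit; the STENCILS table, the op names, and the emit helper are all made up for illustration and are not taken from any real JIT. It only builds the byte buffer and does not map or execute it.

```python
import struct

# Hypothetical "stencils": precompiled x86-64 snippets, each paired with the
# offset of a 4-byte hole (or None if there is nothing to patch).
# In a real copy-and-patch system an AOT compiler such as LLVM produces these
# and the hole positions come from relocation records; here they are
# hand-written for illustration.
STENCILS = {
    "load_const": (bytes([0xB8, 0, 0, 0, 0]), 1),  # mov eax, imm32
    "add_const": (bytes([0x05, 0, 0, 0, 0]), 1),   # add eax, imm32
    "ret": (bytes([0xC3]), None),                  # ret
}


def emit(program):
    """Copy each op's stencil into the output and patch its hole, if any."""
    out = bytearray()
    for op, operand in program:
        template, hole = STENCILS[op]
        start = len(out)
        out += template  # copy the precompiled snippet
        if hole is not None:
            # patch: write the operand as a little-endian 32-bit immediate
            struct.pack_into("<i", out, start + hole, operand)
    return bytes(out)


code = emit([("load_const", 40), ("add_const", 2), ("ret", None)])
print(code.hex())
```

Note that emit never chooses instructions or allocates registers. All of that was decided when the stencils were built, which is where both the speed of codegen and the loss of control come from.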
The more resources are poured into a JIT, the less likely it is to use copy-and-patch. You get more control and flexibility by doing codegen yourself.
My understanding is that Cranelift's e-graphs take care of selecting the best patch (by examining many options in parallel), but fundamentally it is still copy-and-patch.
Could you elaborate on "fundamentally it is still copy-and-patch"? From what I can recall, when I first read about copy-and-patch, a not-uncommon comparison was against Cranelift, which to me implies that the two take different approaches. I don't recall any discussion of Cranelift using the technique, either, so your claim that it's at the heart of Cranelift is new information to me. Has Cranelift adopted copy-and-patch (maybe for a specific compilation stage?) in the meantime?
Interesting point about Cranelift! I've been following its development for a while, and it seems like there's always something new popping up. That connection with e-graphs adds a neat layer of complexity. It's kinda wild to think about how optimization strategies can vary so much yet still be rooted in similar ideas.
I wonder if there's a place for copy-and-patch within Cranelift at some level, maybe for specific sequences or operations? I had a similar experience trying to streamline some code generation tasks and found that even small optimizations could lead to surprisingly big performance gains.
I think it's cool how different teams tackle the same challenges from different angles, like how CPython's JIT works, for instance. It really makes you appreciate the depth of creativity in the community. Do you think there are other JITs out there that are using these techniques in ways we haven't seen yet? Or maybe there are trade-offs between speed and optimization that some projects have to weigh more heavily than others?
https://cranelift.dev/