Author uses a lot of odd, confusing terminology and brings CPU baggage to the GPU, creating the worst of both worlds. Shader hacks, CPU-bound partitioning, and choosing the Greek letter alpha as your accumulator in a graphics article? Oh my.
Yeah, I have high hopes for Vello to take off. I could throw away lots of hacks and caching and whatnot if I could do fast vector rendering reliably on the GPU.
I think Rive also does vector rendering on the GPU
Skia is definitely not a good example at all. Skia started as a CPU renderer, and added GPU rendering later, which heavily relies on caching. Vello, for example, takes a completely different approach compared to Skia.
NV path rendering is a joke. Nvidia thought that ALL graphics would be rendered on the GPU within 2 years of making that presentation; it took 2 decades, and 2D CPU renderers still shine.
Right. The question is: does Skia grow its broad and useful toolkit with an eye toward further GPU optimization? Or does Vello (broadened, and perhaps burdened, by Rust and the shader-obsessive crowd) grow a broad and useful API?
There's also the issue of just how many billions of line segments you really need to draw every 1/120th of a second at 8K resolution, but I'll leave those discussions to dark-gray Discord forums rendered by Skia in a browser.
> There's also the issue of just how many billions of line segments you really need to draw every 1/120th of a second at 8K resolution
IMO, one of the biggest benefits of a high-performance renderer would be power savings (very important for laptops and phones). If I can do the same work at half the power, then by all means I'd be happy to deal with the complications the GPU brings. AFAIK, though, nobody really cares about that, and even efforts like Vello just target fps gains, which correlate with reduced power consumption, but only indirectly.
Adding power draw into the mix is pretty interesting. Just because a GPU can render something 2x faster in a particular test doesn't mean it consumed 50% less power, especially when we talk about dedicated GPUs whose power draw can run to hundreds of watts.
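With made-up but illustrative numbers: energy is power times time, so a 300 W discrete GPU that finishes a frame in 4 ms spends 300 x 0.004 = 1.2 J on it, while a 30 W CPU taking 20 ms for the same frame spends 30 x 0.02 = 0.6 J. The GPU is 5x faster yet burns twice the energy per frame. A fair benchmark would have to report joules per frame, not just frames per second.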
Historically, 2D rendering on the CPU was pretty much single-threaded. Skia is single-threaded, Cairo too, Qt mostly (it offloads gradient rendering to threads, but that's painfully slow for small gradients, worse than single-threaded), AGG is single-threaded, etc...
In the end, only Blend2D, Blaze, and now Vello can use multiple threads on the CPU, so CPU vs GPU comparisons can finally be made more fairly - and power draw is definitely a nice property for a benchmark to report. BTW, Blend2D was probably the first library to offer multi-threaded rendering on the CPU (just an option you pass to the rendering context, same API).
As far as I know, nobody has done good benchmarking between CPU and GPU 2D renderers - it's very hard to make a completely unbiased comparison, and you would be surprised how good the CPU is in this mix. A modern CPU core consumes maybe a few watts, and you can render to a 4K framebuffer with that single core. Put text rendering into the mix and the numbers start to get very interesting. GPU memory allocation should be counted too, because rendering fonts on the GPU means pre-processing them as well, etc...
2D is just very hard. On the CPU and the GPU you end up solving slightly different problems, but doing either right is an insane amount of work, research, and experimentation.
On my Apple M1 Pro, the Vello CPU renderer is competitive with the GPU renderers on simple scenes, but falls behind on more complex ones, and it especially seems to struggle with large raster images. This is also without a glyph cache, which isn't implemented yet (so it re-rasterizes every glyph every time, although there is a hinting cache). It depends on multi-threading being enabled and can consume a largish portion of all cores while it runs. Skia raster (CPU) gets similar numbers, which is quite impressive if that is single-threaded.
I think Vello CPU will always struggle with raster images, because it does a bounds check for every pixel fetched from a source image. They have described this behavior somewhere in the Vello PRs.
The obsession with memory safety just doesn't pay off in some cases - if you can batch 64 pixels at once with SIMD, that just cannot be compared to a per-pixel processor with a branch in the hot path.
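A minimal sketch of the difference (not Vello's actual code; names and layout are made up). The first version checks bounds per fetched pixel, so every iteration carries a branch; the second hoists the check out of the loop, leaving a branch-free inner copy the compiler can vectorize:

```rust
/// Per-pixel bounds check: a branch on every fetch blocks SIMD batching.
fn copy_span_checked(src: &[u32], src_w: i32, y: i32, x0: i32, dst: &mut [u32]) {
    for (i, d) in dst.iter_mut().enumerate() {
        let x = x0 + i as i32;
        // Branch in the hot path: out-of-bounds pixels become transparent.
        *d = if x >= 0 && x < src_w {
            src[(y * src_w + x) as usize]
        } else {
            0
        };
    }
}

/// Hoisted bounds check: clamp the span once, then do a branch-free
/// bulk copy that can be batched (e.g. 64 pixels at a time) by SIMD.
fn copy_span_clamped(src: &[u32], src_w: i32, y: i32, x0: i32, dst: &mut [u32]) {
    let start = x0.clamp(0, src_w);
    let end = (x0 + dst.len() as i32).clamp(0, src_w);
    let row = &src[(y * src_w) as usize..][..src_w as usize];
    dst.fill(0); // out-of-range edges filled once, outside the hot loop
    let off = (start - x0) as usize;
    let n = (end - start) as usize;
    dst[off..off + n].copy_from_slice(&row[start as usize..end as usize]);
}

fn main() {
    let src = vec![0xffu32; 16]; // 4x4 source image
    let mut dst = vec![0u32; 8];
    copy_span_clamped(&src, 4, 1, -2, &mut dst); // span starts off-image
    assert_eq!(&dst[..4], &[0, 0, 0xff, 0xff]);
}
```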
It's an argument you can make in any performance effort. But I think the "let's save power using GPUs" ship sailed even before Microsoft started buying nuclear reactors to power them.
So what is the right way that Skia uses? Why is there still discussion on how to do vector graphics on the GPU right if Skia's approach is good enough?
The major unsolved problem is real-time, high-quality text rendering on the GPU. Skia just renders fonts on the CPU with all kinds of hacks ( https://skia.org/docs/dev/design/raster_tragedy/ ) and then draws them as textures.
Ideally, we want as much as possible rendered on the GPU, including glyph layout. This is not at all trivial, especially for complex scripts like Devanagari.
In a perfect world, we could create a 3D cube, have the renderer put text on one of its faces, and have it render perfectly as the cube rotates.
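For context, a rough sketch of the texture-atlas approach described above (hypothetical types, not Skia's API): glyphs are rasterized on the CPU once per (glyph, size, subpixel offset) and cached in a texture, which is exactly why arbitrary 3D transforms hurt - the cached bitmap gets resampled.

```rust
use std::collections::HashMap;

// Hypothetical cache key: which bitmap a glyph rasterizes to depends on
// size and (quantized) subpixel position, not just the glyph id.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct GlyphKey {
    glyph_id: u32,
    size_px: u32,   // rasterized size, quantized
    subpixel_x: u8, // quantized subpixel offset (e.g. quarter pixels)
}

#[derive(Clone, Copy)]
struct AtlasSlot { x: u16, y: u16, w: u16, h: u16 }

struct GlyphCache {
    atlas: Vec<u8>, // stand-in for a GPU texture
    slots: HashMap<GlyphKey, AtlasSlot>,
}

impl GlyphCache {
    /// On a miss, rasterize on the CPU and upload into the atlas;
    /// on a hit, just reuse the existing texture region.
    fn get_or_insert(&mut self, key: GlyphKey) -> AtlasSlot {
        if let Some(&slot) = self.slots.get(&key) {
            return slot;
        }
        let slot = self.rasterize_and_upload(key);
        self.slots.insert(key, slot);
        slot
    }

    fn rasterize_and_upload(&mut self, _key: GlyphKey) -> AtlasSlot {
        // Placeholder: a real renderer would run the CPU rasterizer here
        // and copy the coverage bitmap into a free atlas region.
        AtlasSlot { x: 0, y: 0, w: 0, h: 0 }
    }
}

fn main() {
    let mut cache = GlyphCache { atlas: vec![0; 1024 * 1024], slots: HashMap::new() };
    let key = GlyphKey { glyph_id: 42, size_px: 16, subpixel_x: 0 };
    let _ = cache.get_or_insert(key); // miss: rasterize + upload
    let _ = cache.get_or_insert(key); // hit: reuse atlas region
}
```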
I know lots of broke-ass people who manage to travel and have a cup of coffee while there. It's choices, not privilege. Author of the piece sure is insufferable, though.
Buffett's restraint was legendary and his transparency even more so.
Bill Gates also initially dismissed him, thinking he had nothing to learn.
General Electric also tried to "make a number go up" and effed up the insurance part despite having Buffett as a model and putting 10,000 people through their custom management training facility every year.
Almost. A modem sometimes had a phone jack as well as a coupler, for those cases when the handset was hardwired into the phone and the phone was hardwired into the wall.
We tapped where we could and we were happy. Bonus points if the rotary phone had a lock on it and you dialed out by pulsing the hangup switch.
Often, one could dial out by pulsing the hook switch on any phone - rotary dialing just interrupts the line once per pulse, N pulses for digit N (ten for zero), so tapping the hook at the right rhythm works too. Ask me how I know. That was such a fun discovery! I did it frequently from many different phones.
The Fourier Transform isn't even Fourier's deepest insight. Unless we're now ranking scientific discoveries based on whether or not they get a post every weekend on HN.
The FFT is nifty but that's FINO. The Google boys also had a few O(N^2) to O(N log N) moments. Those seemed to move the needle a bit as well.
But even if we restrict to "things that made Nano Banana Pro possible" Shannon and Turing leapfrog Fourier.
It's really getting in the way of all the daily AI opinion pieces I come here to read.
More seriously, there are tens of thousands of people who come to HN. If Fourier stuff gets upvoted, it's because people find it informative. I happen to know the theory, but I wouldn't gatekeep.
I remember it didn’t work out well for Randolph and Mortimer. Sam may pull it out, though, if he just sells the DRAM now while the market is still hot.
Instead of shoehorning it into an arbitrary symbol salad by gimping its generality, I prefer the one that makes a statement: "What does it mean to apply inversion partially?"
"Transpose this MIDI file down a third" requires neither a specialized data format nor fancy prompt engineering. ChatGPT asked: "A) Major third up (+4 semitones) or B) Minor third up (+3 semitones)" then did it.
I still don't understand how this or the top level comment are related to the post.
I also don't get how you can claim we don't have to 'punch holes in cards to help the machine "think"', and also mention a MIDI file in your next comment. MIDI is much closer to punch cards than the proposed file format in the post.
NV_path_rendering solved this in 2011. https://developer.nvidia.com/nv-path-rendering
It never became a standard but was a compile-time option in Skia for a long time. Skia of course solved this the right way.
https://skia.org/