> It's exceedingly rare to see any sort of global mutable state
I know a bit of Rust, so you don't need to explain in detail. How do you use a local cache or a DB connection pool in Rust? Both of them, IMO, are legitimate use cases for global mutable state.
Why does that have to be global? You can still pass it around. If you don't want to clobber registers, you can still put it in a struct. I don't imagine you are trying to avoid the overhead of dereferencing a pointer.
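To make "put it in a struct" concrete, here is a minimal sketch. `App`, `lookup`, and `expensive_compute` are hypothetical names invented for illustration, and the uppercase transform only stands in for real work such as a database query.

```rust
use std::collections::HashMap;

/// Hypothetical application state that owns the cache instead of a global.
pub struct App {
    cache: HashMap<String, String>,
}

impl App {
    pub fn new() -> Self {
        App { cache: HashMap::new() }
    }

    /// Returns the value for `key`, computing and storing it on a cache miss.
    pub fn lookup(&mut self, key: &str) -> String {
        self.cache
            .entry(key.to_string())
            .or_insert_with(|| expensive_compute(key))
            .clone()
    }

    /// Number of entries currently held in the cache.
    pub fn cached_entries(&self) -> usize {
        self.cache.len()
    }
}

/// Placeholder for the real work (assumption: any pure function of the key).
fn expensive_compute(key: &str) -> String {
    key.to_uppercase()
}
```

Callers that need the cache take `&mut App` (or a smaller struct holding just the cache), so the dependency shows up in the signature rather than being hidden in a static.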
I think a better example might be logging. How is this typically solved in Rust? Do you have to pass a Logger reference to every function that potentially wants to log something? (In C++ you would typically have functions/macros that operate on a global logger instance.)
In Rust you typically use the "log" crate, which also has a global logger instance [0]. There is also "tracing" which uses thread local storage.
As another comment said, global state is allowed. It just has to be proven thread-safe via Rust's Send and Sync traits, and 'static lifetime. I've used things like LazyLock and ArcSwap to achieve this in the past.
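For reference, a minimal sketch of the `LazyLock` approach using only the standard library (`std::sync::LazyLock` needs Rust 1.80 or newer). `CACHE`, `cached_len`, and `cache_size` are hypothetical names, and caching a string length is just a stand-in for something expensive.

```rust
use std::collections::HashMap;
use std::sync::{LazyLock, Mutex};

// Global cache, initialized on first access. `Mutex<HashMap<..>>` is
// `Send + Sync`, which is what allows it to live in a `static`.
static CACHE: LazyLock<Mutex<HashMap<String, u64>>> =
    LazyLock::new(|| Mutex::new(HashMap::new()));

/// Returns the cached value for `key`, computing it on the first call.
pub fn cached_len(key: &str) -> u64 {
    let mut cache = CACHE.lock().unwrap();
    *cache
        .entry(key.to_string())
        .or_insert_with(|| key.len() as u64)
}

/// Number of entries currently in the global cache.
pub fn cache_size() -> usize {
    CACHE.lock().unwrap().len()
}
```

No `unsafe` is needed: the mutex provides the interior mutability, and the compiler checks that the static's type is thread-safe.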
I wonder what the advantage of passing it around is when it makes the argument list longer. The only advantage that I can see is that it emphasizes that this function does something with the cache.
There's at least one thing Zig does better than Rust: the Zig compiler for Windows can be downloaded, unzipped, and used without admin rights. Rust needs MSVC, which cannot be installed without admin rights. It is said that Rust on Windows can use cygwin, but I could not make it work even with AI help.
cygwin is a POSIX-emulating library intended for porting POSIX-only programs to Windows.
That is: when compiling for cygwin, you'd use the cygwin POSIX APIs instead of the Windows APIs. So anything compiled with cygwin won't be a normal Windows program.
There's no reason to use cygwin with Rust, since Rust has native Windows support. The only reason to use x86_64-pc-cygwin is if you would need your program to use a C library that is not available for Windows, but is available for cygwin.
If you don't want to/can't use the MSVC linker, the usual alternative is Rust's `x86_64-pc-windows-gnu` toolchain.
Is it true that a message from a queue will disappear after it is consumed successfully? If so, how do you currently make Kafka topics work as queues?
It "disappears" in the sense that the Consumer-Group that read/committed that message (event) will never see it again. It doesn't "disappear" in the sense that a new Consumer-Group can be started in a way that will get that message, or you can reset your Consumer-Group's offset to re-consume it.
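A toy in-memory model of that behaviour, as a rough sketch. This is not the Kafka client API: `ToyTopic` and its methods are made up for illustration, and it models a single partition with one consumer per group.

```rust
use std::collections::HashMap;

/// Toy stand-in for a single-partition topic (not a real Kafka type).
pub struct ToyTopic {
    /// Append-only log; consuming a message never removes it.
    log: Vec<String>,
    /// Consumer group name -> next offset that group will read.
    committed: HashMap<String, usize>,
}

impl ToyTopic {
    pub fn new() -> Self {
        ToyTopic { log: Vec::new(), committed: HashMap::new() }
    }

    pub fn publish(&mut self, msg: &str) {
        self.log.push(msg.to_string());
    }

    /// Returns the next unread message for `group` and commits the new offset.
    pub fn poll_and_commit(&mut self, group: &str) -> Option<String> {
        let offset = self.committed.entry(group.to_string()).or_insert(0);
        let msg = self.log.get(*offset).cloned()?;
        *offset += 1;
        Some(msg)
    }

    /// Rewinds (or fast-forwards) a group's committed offset.
    pub fn reset_offset(&mut self, group: &str, offset: usize) {
        self.committed.insert(group.to_string(), offset);
    }

    /// Number of messages still stored in the log.
    pub fn retained(&self) -> usize {
        self.log.len()
    }
}
```

Once a group has committed past a message, polling returns `None` for it, while a new group name or a call to `reset_offset` makes the same message visible again. The log length never shrinks in this model.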
Think about this for a second. Kafka offsets are a thing, consumer groups are a thing. It's trivial to ensure that only one message is delivered to only one consumer if that's what you want. Consumer groups track their offset and then commit the offset, the message stays in Kafka but it won't be read again.
This IMO is better behaviour than RabbitMQ since you can always re-read messages once they have been processed, whereas generally with MQ systems the message is then marked for deletion and asynchronously deleted.
> It's trivial to ensure that only one message is delivered to only one consumer if that's what you want. Consumer groups track their offset and then commit the offset, the message stays in Kafka but it won't be read again. This IMO is better behaviour than RabbitMQ
The trivial solution is to use Kafka. They're clearly saying that Kafka makes it trivial, not that it's trivial to solve from scratch.
What the parent poster described isn’t what makes Kafka’s “exactly once” semantics work. That relies on an idempotency token associated with each publication, which turns “at-least-once” delivery into effectively “exactly once” via deduplication.
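A rough sketch of the deduplication idea, assuming the token is a (producer id, sequence number) pair and that each producer sends sequence numbers in increasing order. `DedupLog` is a hypothetical toy type, not Kafka's broker code.

```rust
use std::collections::HashMap;

/// Toy log that drops retried publications (not a real Kafka type).
pub struct DedupLog {
    log: Vec<String>,
    /// Producer id -> highest sequence number already appended.
    last_seq: HashMap<u64, u64>,
}

impl DedupLog {
    pub fn new() -> Self {
        DedupLog { log: Vec::new(), last_seq: HashMap::new() }
    }

    /// Appends `msg` unless this producer already delivered `seq` or a later
    /// sequence number. Returns `true` if the message was appended.
    pub fn append(&mut self, producer_id: u64, seq: u64, msg: &str) -> bool {
        let is_duplicate = self
            .last_seq
            .get(&producer_id)
            .map_or(false, |&last| seq <= last);
        if is_duplicate {
            return false;
        }
        self.last_seq.insert(producer_id, seq);
        self.log.push(msg.to_string());
        true
    }

    /// Number of messages actually stored.
    pub fn appended(&self) -> usize {
        self.log.len()
    }
}
```

A producer that times out and resends with the same sequence number gets `false` back on the retry, so at-least-once sending ends up stored once.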
> better behaviour than RabbitMQ since you can always re-read messages once they have been processed
I can imagine it: a 1 billion dollar transaction accidentally gets processed by ten thousand client nodes due to a client app synchronization bug, the company rethinks its dumb data dumper server strategy... news at 11.
No, it's not needed. We plan to remove Google Guava from the Fory Java dependency. Our philosophy is that the core should have as few dependencies as possible for maintainability and minimal footprint.
So, you want a place to store many files in a short period of time and when there's a new file, somebody must be notified?
Have you ever thought of using a PostgreSQL DB (also on AWS) to store those files and using CDC to publish messages about those files to a Kafka topic? With your original approach, we need three AWS services: S3, Lambda, and SQS. With this approach, we need two: PostgreSQL and Kafka. I'm not sure how well this method works though :-)
Like, put the video blobs themselves in Postgres data columns? Does putting very large (relative to what you normally put in Postgres) files in pg work well? Genuine question, I do not know; I've been considering it too and am hesitant about it.
Have you done this? I can google or ask an AI for the max size that Postgres will allow, sure. I have googled in the past for whether this actually works well, and most of the advice I found leaned against it in real-world scenarios.
So if you have experience with this and it did work well, I'm curious to hear about it! That's why I asked whether it worked well, not about the maximum size Postgres allows for various data types.
If you have no experience with it, but are just posting advice based on what AI tells you about max sizes of data allowed by pg that I can get from the same source too, then okay, fair enough, and certainly no need to give me any more of that!