Rerun: Visualize Multimodal Data over Time

jamessb · on Aug 28, 2024

The Software Engineering Daily podcast had an interview with Emil Ernerfeldt (co-founder and CTO of rerun) a couple of weeks ago: https://softwareengineeringdaily.com/2024/08/07/creating-gui...

dimatura · on Aug 28, 2024

Seems like it's aiming to be a more general alternative to tools like ROS rosbags, pretty widely used in robotics. Sounds like a good idea to me -- having a nice tool to create and visualize multimodal logs can be pretty useful outside of robotics.

Foxglove (https://foxglove.dev/) is an alternative specifically aimed at robotics. It was originally a fork of Cruise webviz (https://webviz.io/), iirc, which came out of the ROS ecosystem. The format for ROS logs (rosbags) has evolved a bit - from a custom format, to a format based on sqlite, to a new format that is intended to be more general and compatible with various serialization formats, MCAP (https://mcap.dev/).

amacneil · on Aug 28, 2024

Foxglove ceo here, thanks for the shout out!

Spot on with our history, most of the early Foxglove team came from Cruise where we built Webviz among other tools & infrastructure. Since then we have rewritten nearly 100% of the Webviz code.

MCAP is a multimodal logging file format we developed based on lessons learned from Cruise and other robotics startups, it has now taken on a life of its own and is the default logging format in ROS.

In terms of differences between Foxglove and Rerun, Rerun appears to focus more on local visualization whereas Foxglove offers more of a complete observability platform for local visualization plus cloud data workflows (upload logs from robot, organization & search, cloud visualization). I'm always happy to see more innovation in the robotics dev tools space though, the industry is growing quickly and pie will be very large so plenty of room for more alternatives.

nikonp · on Aug 28, 2024

Rerun ceo here, thanks for jumping into the discussion here Adrian!

First of, want to really highlight that MCAP is a great container format for robotics message recordings. We (like most of the robotics community) see it as the natural evolution of the rosbag format (co-created a while back by one of our team members).

To add onto some of the differences between Rerun and Foxglove, the Rerun open source project is focused on a viewer that runs completely client side. It's written end-to-end in Rust for maximum speed and portability. That means it can run fully natively for maximum speed (e.g. utilizing native threads and rendering APIs), and in the browser via wasm. You can even embed Rerun visualization in your own web apps and use it inline in python notebooks.

I would also highlight the SDK, which lets you log / send data easily from your own code without up front declaring a message schema. This makes it much easier to also use for quick debugging etc (although some of the largest companies in the world use it to build large sets of different internal visualization heavy tools as well).

On the data model side we've focused on developing what you might label a language or semantic data model. We believe this gives users significantly more power and control than "only" orienting around robotics messages. I think that's why you'll see Rerun being used in a lot more contexts than viewing Robotics messages (although that's a big use case as well). In order to really make use of that data model (with high performance) we've had to develop a new database query engine and a lot of other craziness.

We do have a cloud data platform in the works that leans heavily on our semantic model and query engine. It will have a different take to Foxglove's, with a much broader view of the full data lifecycle for embodied AI.

pzo · on Aug 28, 2024

Rerun is great. I wish they prioritize rerun_sdk build for iOS and/or Android - so that you can log remotely from mobile devices. Serializing and streaming images, depthmaps, sensors data in own code is a pain and rerun has done great work with that.

A little worrying for me that rerun seems getting more complicated and verbose and API changes frequently. The whole vizualization code can clutter algorithm/code that is begin debugged.

If rerun reading this maybe they can consider:

- integrating inkeep [0] "ask ai" (I found bun.sh use it and was useful for me)

- creating docset for Rerun for Dash.app [1]

[0] - https://inkeep.com/

[1] - https://kapeli.com/dash

nikonp · on Aug 28, 2024

Glad to hear you like Rerun and thanks for the suggestions!

> A little worrying for me that rerun seems getting more complicated and verbose and API changes frequently.

We'll unfortunately continue to change some APIs over the next few releases. We're working towards stabilizing the format soon and want to make sure it's in a form that will last.

Do you have any examples of changes you found to be more complicated? We've been trying to maintain the approach of providing a very simple high level API but then exposing more lower level APIs over time for more power and/or control. Would love to learn where we could be doing better there.

> The whole vizualization code can clutter algorithm/code that is begin debugged.

This is a bummer to hear for sure. One of the motivating experiences behind starting Rerun was how much system bloat home grown debugging and visualization infrastructure can bring. Any chance you could share examples that are less clean than ideal? (either here, discord, github or dm on any platform is fine)

pzo · on Aug 28, 2024

I think previous points logging was less verbose. Now I have:

rr.log(f"aruco/board_distance_to_plane_normalized_dots", rr.SeriesPoint(color=(0, 0, 255), marker_size=1.5)) rr.log(f"aruco/board_distance_to_plane_normalized_dots", rr.Scalar(peak.value))

I think before I could have just one line recording. Not a big deal if logging only one sensor but my code is logging a lot of sensors and variables and then overall there is a lot of rerun code around. Maybe some dedicated VSCode or Jetbrains plugin that would allow hide/unhide all rerun code would be also good workaround.

Regarding reason for integration with inkeep or docset is to make for AI better index up-to-date documentation so that it can provide less hallucinated response with older API

nikonp · on Aug 28, 2024

Thanks!

Just so you know you can log rr.SeriesPoint(color=(0, 0, 255), marker_size=1.5) in the same call as rr.Scalar(peak.value) if you want. You could also skip logging the rr.SeriesPoint altogether and include it in your blueprint (as a component override or default). That way you can more clearly separate styling from data. Either way I hear you on the added complexity. At the end of the day it came down to trading off simplicity vs expressiveness on this one.

Totally hear you on the "make it easier to have AI that doesn't hallucinate" thing. We should definitely do something in that area. Just haven't managed to get the cycles in to do so yet

pzo · on Aug 28, 2024

Thanks! I will have a look. Another idea - I wish there was some more high level visualization API similar like supervision [0] but instead for Rerun and instead of implemented in python to implemented with rust with autogenerated bindings (so that can be used on native/mobile as well).

So that you can easily log common models result like mediapipe pose/face/handlandmark, object detectors etc. with just few lines.

[0] - https://github.com/roboflow/supervision

sebastos · on Aug 28, 2024

I’m really interested in buying into Rerun over Foxglove; I like that it’s committed to stay open source and the lack of requirement for an account, but it also seems like the underlying data model is very strong. But MCAP + Foxglove’s killer app is h264 video stored inside the log! Rerun guys, if you’re listening, I would strongly recommend prioritizing the rework in your backlog needed to allow logs to hold encoded data of that sort. It’s a dealbreaker right now - once you get that working, the spice will flow.

Foxglove guys - I’m getting good value out of your product on a personal license. No disrespect! Love what you guys have done, and understand the close-sourcing the viewer wasn’t an easy decision.

nikonp · on Aug 28, 2024

Really glad to hear you like what we've been building so far! I personally appreciate the note on the strong data model in particular since that's taken up a lot of focus and effort.

In that case it will make you glad to hear that we are currently working on support for encoded video! The new (time) column oriented APIs that came in the latest 0.18 release was (among other things) a building block for video by allowing users to send data that extends over time to Rerun in a single call. You can expect something on the encoded video front within a release or two.

sebastos · on Aug 28, 2024

Exciting to hear, thanks for the information! You guys rock!

sgu999 · on Aug 28, 2024

Rerun is a fantastic tool. I've made my team adopt it to visualise all of our (Rust) inference pipeline, it's been instrumental to improve our system.

I'm very impressed that not only they develop rerun, but they are mostly building the entire UI from scratch through egui. Very serious syndrome of not invented here but the result is astonishing! Their blog also has some very interesting posts on the code.

nikonp · on Aug 28, 2024

Awesome to hear it's working out for you!

jenadine · on Aug 28, 2024

What existing UI could they have used?

nikonp · on Aug 28, 2024

Emil had already started egui long before we started Rerun. The immediate mode paradigm of egui also fits very well with how we wanted to archetype Rerun. In addition to that the GUI framework story in Rust is still quite immature which means you often need to be able to make changes to the actual GUI framework if you want to create a "cutting edge" UI heavy product in Rust. Not surprisingly you therefore see other Rust based apps build their own GUI frameworks for that same reason. There are some promising initiatives in the Rust world but we'd still make the same decision to build on egui today if we were to start from scratch.

vietjtnguyen · on Aug 28, 2024

Really excited to see Rerun get fleshed out as time goes on. I've used it at $WORK in spare time to visualize multi-agent telemetry and event information for remote operated robotic systems.

To me the most useful feature was actually just the timeline view. Being able to see all of the events and telemetry updates from all agents and correlate them to some line plots was already a huge help. I think Rerun should actually make the timeline view just another "view" [0] so that I can have additional timeline views with specific filters. Seeing the discrete pips in the timeline for different events and channel updates is really helpful.

Also we already have infrastructure to log and for live telemetry streaming so we don't integrate with the SDK directly in the deployed software and instead just have an adaptation layer to take logs or live data and push it through the Rerun SDK. I generally feel it's better to separate the data generation and logging from the data visualization, though I could understand directly integrating the SDK as a fast solution if the infrastructure is not already there. I hope the use case as the data offload/logging for a system doesn't become a focus sink for the Rerun team.

That said, I do appreciate the simplicity of the Rerun SDK API design. At first I wanted to try Foxglove but it seems pretty tied to the supported formats (which we weren't outputting for heritage reasons) and my eyes glazed over when I saw how much work (I perceived) it would take to integrate with their custom websocket interface. This was all spare time exploration so I basically dropped it once I saw how much code there was in the example [1].

[0]: https://rerun.io/docs/reference/types/views

[1]: https://github.com/foxglove/ws-protocol/tree/main/python/src...

nikonp · on Aug 28, 2024

Thanks for sharing and that makes me really happy to hear. Keeping Rerun simple to use and easy to get started with has been a goal of ours from the beginning.

> I think Rerun should actually make the timeline view just another "view"

This is something we've talked about on and off internally and mostly all agree it's the right thing to do at some point. I also think there is a lot of things we could do to increase the usefulness of the timeline view even further so glad you're getting alot out of it already.

> Also we already have infrastructure to log and for live telemetry streaming so we don't integrate with the SDK directly in the deployed software and instead just have an adaptation layer to take logs or live data and push it through the Rerun SDK.

This is actually quite a common case and we don't have the ambition to make the Rerun SDK be the perfect logging library for all usecases. We're trying to do more instead to make it easier write efficient adapters that run elsewhere in your infra for cases where using the logging SDK isn't the best fit. The recent `rr.send_columns` API is in that direction but there will be more in the future

samsartor · on Aug 28, 2024

I've been using rerun for the past year or so instead of tensorboard/wandb/etc for logging my ML training runs. I really like being able to just throw enormous amounts of arbitrary data back to my own laptop without worrying about cloud storage!

I will say the (not so recent) API change to "components" and "archetypes" hasn't clicked for me yet. Obviously I could sit down and figure it out. But it would be a lot nicer if the API could just ingest whatever damn types I throw at it, rather than raising an error 20-minutes in because I wasn't polite enough. The old API felt basically magic.

nikonp · on Aug 28, 2024

Rerun CEO here. Awesome to hear you're enjoying using Rerun for monitoring ML training runs!

> I will say the (not so recent) API change to "components" and "archetypes" hasn't clicked for me yet. Obviously I could sit down and figure it out.

Are you referring to the move from the `rr.log_image("path/to/my_image", ...)` style API to the `rr.log("path/to/my_image", rr.Image(...))` style API that came in 0.9 (roughly a year ago)? Our intention was that if you stick to the higher level Archetype APIs, the two should be equivalent in terms of "magically handle any data". The component level APIs are intended to give more control when you need it (for instance to improve performance), but aren't required.

> But it would be a lot nicer if the API could just ingest whatever damn types I throw at it, rather than raising an error 20-minutes in because I wasn't polite enough.

Our intension is for this to never happen either, but obviously "any data" is a very broad set of things so there may be some mistakes on our side. If you're not running the SDK in strict mode, any thrown exception from the Python SDK should be seen as a bug. Really sorry to hear that happened, I agree that sucks. Would love any details you can share on when any aspect of Rerun is annoying (either here, on discord, or github)!

samsartor · on Aug 28, 2024

Overall you guys do an amazing job! I'm just griping XD

And actually you are right, it didn't throw at all, just logged a nice warning! Despite me being annoyed that I had to go fix my mistake, the API has caused me 10x fewer issues than any other python lib I've touched

mpalmer · on Aug 28, 2024

Totally get why most examples are about video processing / motion tracking / point clouds; source-available 3D viz is underserved.

But I'm curious if there's any non-trivial examples of using Rerun to produce animated dataviz where the data is not already spatial, just arbitrary non-image-related data that is mapped into 2d or 3d coordinate spaces for various analysis. Like a stream of log data for a web service, etc.

nikonp · on Aug 28, 2024

That's totally doable but we haven't put together any examples of that kind other than perhaps this one https://rerun.io/examples/generative-vision/llm_embedding_ne..., which uses UMAP to project some text embeddings vectors to 3D

upbeat_general · on Aug 28, 2024

I’ve been using this for visualizing some computer vision data/model prediction and it has been great.

The remote client/server streaming has a bit to go, but just the option to use it in a browser is a game-changer. All the previous web-based tools that I’ve come across have been either very limited, or require significant manual setup.

nikonp · on Aug 28, 2024

Super glad to hear that!

What are some things you're missing on the client/server streaming? I'm aware that there is much to do but would love to hear your specific issues

upbeat_general · on Aug 28, 2024

Honestly most of the things I would like food are very oriented toward research/sharing use-cases, which I realize that is distinct from a lot of your users, but can mean:

- Working with large remote datasets/clusters - Needing to share visualizations with others [both interactive, and pre-recorded/controlled] - Being able to easy switch between N runs to look at data/results over time

Concretely:

- Currently it seems you need to forward multiple ports, and for each new session you need to re-connect. My memory is foggy but I found it was a lot of hassle to just connect a thin-desktop client to a remote backend.

- The support for baked-html is great, but it has limits (namely I was hitting some size limitations) and it would be cool to have a server that can read rr files from disk and stream them on-demand [e.g., point to a directory and let the user select]

- Generally speaking, I found the workflow difficult to integrate to the typical ML setup. Obviously not a priority for rerun but I think a little improvement [like the point above] could help a lot.

- A built-in screen recording feature would be great. I spent a lot of time screen recording and re-sharing clips.

Again, these are mainly nitpicks, love the tool overall and have recommended it to several people.

nikonp · on Aug 28, 2024

Thanks a lot I really appreciate you writing this out! I'd say most of your requests are things we'd like to address in the future. Built in screen recording is a common request and something I think makes sense for the open source project. A lot of the other requests require a central service of some kind to achieve high performance and/or a smooth user experience so those fall into the bucket of what will go into our commercial offering.

kowale · on Aug 28, 2024

Sorry, I wanted to love it, but it's way too slow compared to a simple Three.js app. Native version is a bit faster, but still laggy, though I only tried the build from Nixpkgs, might be the issue. The web builds I tried were all upstream though :(

nikonp · on Aug 28, 2024

Sorry to hear that. Mind sharing what it was you were trying to do? Most users find Rerun to be very fast when trying to do the things it was built for.

123gyat · on Aug 28, 2024

So basically like braindance scrolling from cyberpunk 2077 for multimodal data. Awesome

bewestphal · on Aug 28, 2024

Using this for computer vision and sensor debugging. It’s brilliant.

Having worked on developing such UI’s before in house it’s very nice to outsource and focus on building my app with Reruns insanely smooth rendering.

nikonp · on Aug 28, 2024

Awesome to hear you're enjoying it! Anything we could do to make you love it even more?

bewestphal · on Aug 28, 2024

For my team it would be a module for visualizing GPS sensor lat/long readings on maps.

nikonp · on Aug 28, 2024

Yeah that makes sense, we hear that a lot and it's on our roadmap. It won't come in the next release unfortunately but I'd love to have it land within the next couple releases after that

a-dub · on Aug 28, 2024

how is support for high cardinality (10s to 100s) high sample rate (1-10 khz) data these days? last time i looked, going over a few kilohertz with a handful of channels seemed to push it past usability.

nikonp · on Aug 28, 2024

When was the last time you tried it? This was the focus of the 0.18 release, which improved the situation quite significantly (see https://rerun.io/blog/column-chunks for details)

a-dub · on Aug 28, 2024

wow, this looks very promising! are the cardinality limitations also improved? (ie, if i throw 500 timeseries at it, does the ui handle things gracefully?)

also, how's api stability looking these days? it looks like none of my 0.16 test scripts want to run against 0.18?

nikonp · on Aug 28, 2024

The many entities performance is still not where we'd like it to be. We've made multiple improvements over the last releases so worth kicking the tires again but it isn't solved in a fundamental way yet. There is still some low hanging fruit available to speed things up for many common cases of high cardinality by expanding our APIs slightly I believe. Those can come sooner that a bigger architecture update so hopefully that will be enough for your use case

sebastos · on Aug 28, 2024

What I love about you guys is that when somebody asks for a feature, you never come back with “sure, maybe we’ll bolt that on”, you come back with “We agree! That’s why we completely rebuilt the thing from scratch to handle that legitimate and central requirement! And still we’re not satisfied!” It’s awesome!

john_minsk · on Aug 28, 2024

Super interesting! Thank you!