Grok-1.5 Vision Preview (x.ai)
86 points by runesoerensen on April 13, 2024 | 22 comments


I think a lot of people are sleeping on xAI for two reasons - Twitter & Tesla FSD data.

I've seen numerous talks recently by AI leaders discussing how the next level of LLMs has to "understand the physical world" (an area where Tesla FSD is a leader; see Elon's response to Sora - https://www.news18.com/tech/elon-musk-claims-tesla-has-been-...). On top of that, Twitter being the "town square" of the internet gives them an immense advantage. I wouldn't be surprised if the Twitter API was completely cut off one day.


Imagine a baby who learned everything about language and human expression from Twitter. Of course they won't only use Twitter. But it's quite hard to see Twitter data as a significant competitive advantage unless you want to train the biggest jackass in the universe.


It’s more about having live, up-to-date data.


Having up-to-date data in a world where models are updated to data as of 6 months ago doesn't seem that relevant, especially when that up-to-date data is who Elon is calling a pedo today.


You're thinking of LLMs today. When compute price drops another 99% and these models can re-train almost instantly, that "live data" will be invaluable. This is all coming from a talk I watched from Nvidia's CEO.


Interesting that they built their own benchmark and that it primarily features images from vehicles. Tesla overlap?

> ... we are introducing a new benchmark, RealWorldQA. This benchmark is designed to evaluate basic real-world spatial understanding capabilities of multimodal models. While many of the examples in the current benchmark are relatively easy for humans, they often pose a challenge for frontier models.

> The initial release of the RealWorldQA consists of over 700 images, with a question and easily verifiable answer for each image. The dataset consists of anonymized images taken from vehicles, in addition to other real-world images ... RealWorldQA is released under CC BY-ND 4.0.

Will be interesting to see the feedback once someone has a chance to look into the dataset (https://data.x.ai/realworldqa.zip).

Side note — I'm very impressed with their "Explaining a meme" example.


OpenAI and others seem to be focused on a brute force approach towards achieving AGI. Others, such as Yann LeCun, have advocated for intelligence that can build understanding of the world similar to how humans learn. This seems like an approach in that direction?


> This seems like an approach in that direction?

That makes it seem so? Aren’t they trying to catch up with OpenAI using exactly the same models and training methods that OpenAI uses?


This is what Elon wanted for OpenAI before he left the board according to the emails. He wanted them to become for-profit under Tesla, probably to assist with self-driving.


Looks like Optimus is an intended use case.


Optimus for indoor data collection and model deployment, Tesla for outdoor.


> Optimus for indoor data collection

I suspect they will work very hard to motivate a great number of people to accept a disclaimer for this. It will be interesting to see how they apply the pressure.


Hold on, I think an Optimus is knocking on my door right now.


If those comparisons to other models are remotely accurate, that is a pretty impressive feat to catch up that quickly.


[flagged]


My reason to consider using this is that he’s the only significantly wealthy person fighting for equal AI access, when everyone else is building closed models and pushing for regulatory capture. I think he’s honest in recognizing the threat of having a few tech companies or a few countries controlling our world through AI. Google’s Gemini was a preview of how the few big players might abuse their power and inject the bias of their employees into the AI-influenced world this is becoming. I am not a tech oligarch but a random person, so choice and competition seem good to me?


Sure, I see your point, but I would question his motivations for making that move. It doesn't seem very authentic to the open source ethos to open source your weights right after you sued someone for not doing exactly that. I think he's very good at framing his motivations as "for the common good." One should be highly skeptical of anyone who hoards wealth and also tries to come off that way; SBF is a great example. One should also be highly skeptical of "Open Source" projects that do this; OpenAI is a great example, along with thousands of other projects that use "open source" as a way to market their commercial products. This is America, baby: there is no such thing as a free lunch, and Musk is going to get that lunch money.


[flagged]


A cursory CTRL+F for "based" yields nothing, at least. A glimmer of hope.


Ask Google’s Gemini and Sergey “Bing” about AI-generated image damage control.


If that's a concern, then you can leave "fun mode" disabled.

I think many people are interested in a less aligned model.

I think people are also interested in a less Silicon Valley-aligned model, since that alignment does get in the way. For example, Gemini gives a refusal for "please provide a list of common right leaning news orgs" and censors Fox News links, which is sometimes the only place you can find critical viewpoints on certain topics.


Is it a less aligned model or a... cringe aligned model? Or something? I can get uncensored models through ollama. They don't try and fail to be cool in the way x formerly known as Twitter grok does, at least as I've seen on x, formerly known as Twitter.


Don’t feed the trolls


To the contrary, I’m inclined to trust it more.



