
Tried to play a free-for-all card game with blank cards with friends in a bar thirty years ago. It was too far out for the group. Writing the rules for a game during your own turn is pretty great. But if there isn't an improv idea of "and then" among the group the game won't work. It's certainly not about winning :)

Not sure if this has been posted (I see stephenwoo has mentioned him further down), but it's a breakdown of how sugary foods damage the body, particularly fructose.

It's 16 years old and covers about 30 years of previous research.

https://www.youtube.com/watch?v=dBnniua6-oM


I haven't had a _terrible_ UI experience with Win 11 that Apple hasn't put me through already. But it took away my sideways toolbar. I don't click anything that loads Edge, like "Show me more from the web" type links, so I don't see ads. I use Firefox and Thunderbird.

The telemetry all the way through the operating system sucks ethically. But I'm invested in and familiar with Windows and Office. Not being able to make Copilot disappear is annoying.

However, all my games and software that work on Windows won't necessarily work on Linux. I am not interested in making a political stand and giving up abilities and features I currently have.

So, for my own use-case, Win 11 it is.

Clearly not an endorsement, just a data-point.


Yeah, I think it's just a matter of what you know. I recently got a Mac for work and the UI is horrid; I have no idea how people put up with it. It's like it was designed by people who never actually had to use it afterwards. But clearly it works for some people, so I strongly suspect it's just personal bias here.


> However, all my games and software that work on Windows won't necessarily work on linux

Unless you specifically know what won't work: there's a solid chance that your games and software will work just fine on Linux.


Maybe I'm still in denial about the benefit of AI code design, but when I have an initial set of requirements for a program, the design begins. That is just a set of unanswered questions that I address with slowly growing code and documents. Then the final documents and code match the answers to all the questions that rose from the answers of previous questions. More importantly, I know how the code answers them, and someone else can learn from the documentation.

Since the invention of "velocity" I feel like much of the industry treats code and programmers like tissues: wipe your nose and throw it away. Now we have AI-based automatic tissue dispensers, and Weizenbaum's gripe about programmers creating work for themselves other than solving the requirements of the actual problems continues.


You gain experience getting interactions with other agencies optimised by dealing with them yourself. If the AI you rely on fails, you are dead in the water. And I'm speaking as a fairly resilient 50 year old with plenty of hands-on experience, but concerned for the next generation. I know generational concern has existed since the invention of writing, and the world hasn't fallen apart, so what do I know? :)


The Jupiter Ace was unreal, but only from a computer science perspective. You had to know a lot to program in Forth, the fundamental language of that white, Spectrum-looking dish of a PC, in spite of a manual that read like HGTTG. Critically, it didn't reward you from the start of your programming journey like Logo or BASIC did, and it didn't have the games of the ZX Spectrum. I knew a person who tried to import and sell them in Australia. When I was young, he gave me one for free as the business had failed. RIP IM, and thanks for the unit!

https://80sheaven.com/jupiter-ace-computer/

Second Edition Manual: https://jupiter-ace.co.uk/downloads/JA-Manual-Second-Edition...


A CorelDraw version from the 1990s I used had an honest progress bar. Sometimes it went backwards, but by the time it got to the end, it was truly finished.


A question: Does anyone know how well AI does at generating performant SQL against years-old production databases? In terms of execution speed, locking, accuracy, etc.?

I see the promise for green-field projects.


It's very hit or miss. Claude does OK-ish, others less so. You have to explicitly state the DB and version, otherwise it will assume you have access to functions / features that may not exist. Even then, they'll often make subtle mistakes that you're not going to catch unless you already have good knowledge of your RDBMS. For example, at my work we're currently doing query review, and devs have created an AI recommendation script to aid in this. It recommended that we create a composite index on something like `(user_id, id)` for a query. We have MySQL. If you don't know (the AI didn't, clearly), MySQL implicitly has a copy of the PK in every secondary index, so while it would quite happily make that index for you, it would end up being `(user_id, id, id)` and would thus be 3x the size it needed to be.
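A sketch of that pitfall, with hypothetical table and column names (assumes MySQL/InnoDB, where every secondary index implicitly carries the primary key):

```sql
-- Hypothetical schema for illustration only.
CREATE TABLE orders (
    id      BIGINT PRIMARY KEY,
    user_id BIGINT NOT NULL,
    note    TEXT
);

-- What the AI suggested: explicitly appending the PK. Per the comment
-- above, InnoDB already stores `id` in the secondary index, so this
-- ends up carrying it redundantly.
CREATE INDEX idx_orders_user_id ON orders (user_id, id);

-- What was actually needed: index the column alone; queries ordered
-- or filtered by (user_id, id) are still served, because `id` rides
-- along implicitly as the clustered key.
CREATE INDEX idx_orders_user ON orders (user_id);
```

In practice you would create one index or the other, not both; the point is that the shorter definition gives the same query plans with a smaller index.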


Can someone please answer these questions because I still think AI stinks of a false promise of determinable accuracy:

Do you need an expert to verify if the answer from AI is correct? How is time saved refining prompts instead of SQL? Is it typing time? How can you know the results are correct if you aren't able to do it yourself? Why should a junior (sorcerer's apprentice) be trusted in charge of using AI?

No matter the domain, from art to code to business rules, you still need an expert to verify the results. Would they (and their company) be in a better place to design a solution to a problem themselves, knowing their own assumptions? Or just check off a list of happy-path results without a FULL knowledge of the underlying design?

This is not just a change from hand-crafting to line-production; it's a change from deterministic problem-solving to near-enough-is-good-enough, sold as the new truth in problem-solving. It smells wrong.


I can bring data here:

We recently did the first speed run where Louie.ai beat teams of professional cybersecurity analysts in an open competition, Splunk's annual Boss of the SOC. Think writing queries, wrangling Python, and scanning through 100+ log sources to answer frustratingly sloppy database questions:

- We get 100% correct on the basic stuff in the first half, where most people take 5-15 minutes per question, and 50% correct in the second half, where most people take 15-45+ minutes per question and most teams time out.

- ... Louie does a median 2-3min per question irrespective of the expected difficulty, so about 10X faster than a team of 5 (wall clock), and 30X less work (person hours). Louie isn't burnt out at the end ;-)

- This doesn't happen out-of-the-box with frontier models, including fancy reasoning ones. Likewise, letting the typical tool here burn tokens until it finds an answer would cost more than a new hire, which is why we measure as a speed run vs a deceptively uncapped auto-solve count.

- The frontier models DO have good intuition, understand many errors, and for popular languages DO generate good text2query. We are generally happy with OpenAI, for example, so it's more about how Louie and the operator use it.

- We found we had to add in key context and strategies. You see a bit of this in Claude Code and Cursor, except those are quite generic, so they would have failed here as well. Intuitively, in coding you want to use types/lint/tests; database work has similar but different issues. But there is a lot more by domain, in my experience, and expecting tools to just work is unlikely, so having domain-relevant patterns baked in, and extensible, is key, and so are learning loops.

A bit more on louie's speed run here: https://www.linkedin.com/posts/leo-meyerovich-09649219_genai...

This is our first attempt at the speed run. I expect Louie to improve: my answers represent the current floor, not the ceiling of where things are (dizzyingly) going. Happy to answer any other q's where data might help!


is a competition/speed run a realistic example?


Splunk Boss of the SOC is the realistic test; it is one of the best cyber ranges. Think effectively 30+ hours of tricky querying across 100+ real log source types (tables) with a variety of recorded cyber incidents - OS logs, AWS logs, alerting systems, etc. As I mentioned, the AI has to seriously look at the data too, typically several queries deep for the right answer, with a lot of rabbit holes before then - answers can't just skate by on schema. I recommend folks look at the questions and decide for themselves what this signifies. I personally gained a lot of respect for the team that created the competition.

The speed run formulation for all those same questions helps measure real-world quality vs cost trade-offs. I don't find uncapped solve rates to be relevant to most scenarios. If we allowed infinite time, yes we would have scored even higher... But if our users also ran it that way, it would bankrupt them.

If anyone is in the industry, there are surprisingly few open tests here. That is another part of why we did BOTS. IMO sunlight here brings progress, and I would love to chat with others on doing more open benchmarks!


My 2 cents, building a tool in this space...

> Do you need an expert to verify if the answer from AI is correct?

If the underlying data has a quality issue that is not obvious to a human, the AI will miss it too. Otherwise, the AI will correct it for you. But I would argue that it's highly probable your expert would have missed it too... So, no, it's not a silver bullet yet, and the AI model often lacks the context that humans have, and the capacity to take a step back.

> How is it time saved refining prompts instead of SQL?

I wouldn't call that "prompting". It's just a chat. I'm at least ~10x faster (for reasonably complex & interesting queries).


Same reason as why it's harder to solve a sudoku than it is to verify its correctness.


I should have made my post clearer :)

There isn't one perfect solution to SQL queries against complex systems.

A sudoku has one solution.

A reasonably well-optimised solution is what good use of SQL tries to achieve, and it can be the difference between a total lock-up and a fast-running script that keeps the rest of a complex system from falling over.


The number of solutions doesn't matter though. You can easily design a sudoku game that has multiple solutions, but it's still easier to verify a given solution than to solve it from scratch.

It's not even about whether or not the number of solutions is limited. A math problem can have an unlimited number of proofs (if we allow arbitrarily long proofs), but it's still easier to verify one than to come up with one.

Of course writing SQL isn't necessarily comparable to sudoku. But the difference, in the context of verifiability, is definitely not "SQL has no single solution."
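The asymmetry is easy to make concrete: checking a completed sudoku is a single linear scan, while solving one generally requires search and backtracking. A minimal verifier, assuming nothing beyond the standard 9x9 rules:

```python
def is_valid_solution(grid):
    """Check a completed 9x9 sudoku: every row, column, and 3x3 box
    must contain the digits 1-9 exactly once."""
    target = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r + dr][c + dc] for dr in range(3) for dc in range(3)}
        for r in range(0, 9, 3)
        for c in range(0, 9, 3)
    ]
    # Every row, column, and box must equal {1..9} exactly.
    return all(group == target for group in rows + cols + boxes)
```

Verification touches each of the 81 cells a constant number of times regardless of how the grid was produced; a solver has to explore a space of candidate placements. The same shape of argument applies to checking a proof versus finding one.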


If the current state of software is any indication, experts don't care much about optimization either.


I've recently started asking the free version of chat-gpt questions on how I might do various things, and it's working great for me - but also my questions come from a POV of having existing "domain knowledge".

So for example, I was mucking around with ffmpeg and mkv files, and instead of searching for the answer to my thought-bubble (which I doubt would have been "quick" or "productive" on google), I straight up asked it what I wanted to know;

  > are there any features for mkv files like what ffmpeg does when making mp4 files with the option `--movflags faststart`?
And it gave me a great answer!

  (...the answer happened to be based upon our prior conversation of av1 encoding, and so it told me about increasing the I-frame frequency).
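For reference, the ffmpeg side of that looks roughly like this (file names are placeholders; `-movflags +faststart` is an MP4-only remux option, while for MKV the usual lever is the keyframe/GOP interval):

```shell
# MP4: move the moov atom to the front so playback can start before
# the whole file arrives (remux only, no re-encode):
ffmpeg -i input.mp4 -c copy -movflags +faststart output.mp4

# MKV has no faststart equivalent; more frequent keyframes make
# seeking snappier instead. E.g. with AV1, a keyframe every 48 frames:
ffmpeg -i input.mp4 -c:v libaom-av1 -crf 30 -g 48 -c:a copy output.mkv
```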
Another example from today - I was trying to build mp4v2 but ran into drama because I don't want to take the easy road and install all the programs needed to "build" (I've taken to doing my hobby-coding as if I'm on a corporate PC without admin rights (Windows)). I also don't know about "cmake" and stuff, but I went and downloaded the portable zip and moved the exe to my `%user-path%/tools/` folder, but it gave an error. I did a quick search, but the google results were grim, so I went to chat-gpt. I said;

  > I'm trying to build this project off github, but I don't have cmake installed because I can't, so I'm using a portable version. It's giving me this error though: [*error*]
And the aforementioned error was pretty generic, but chat-gpt still gave a fantastic response along the lines of;

  >  Ok, first off, you must not have all the files that cmake.exe needs in the same folder, so to fix do ..[stuff, including explicit powershell commands to set PATH variables, as I had told it I was using powershell before].
  >  And once cmake is fixed, you still need [this and that].
  >  For [this], and because you want portable, here's how to setup Ninja [...]
  >  For [that], and even though you said you dont want to install things, you might consider ..[MSVC instructions].
  >  If not, you can ..[mingw-w64 instructions].


[Going to give myself a self-reply here, but what-ev's. This is how I talk to chat-gpt, FYI]... So I happened to be shopping for a cheap used car recently, and we have these ~15 year old Ford SUV's in Aus that are comfortable, but heavy and thirsty. Also, they come in AWD and RWD versions. So I had a thought bubble about using an AWD "gearbox" in a RWD vehicle whilst connecting an electric motor to the AWD front "output", so that it could work as an assist. Here was my first question to chat-gpt about it;

  > I'm wondering if it would be beneficial to add an electric-assist motor to an existing petrol vehicle. There are some 2010 era SUV's that have relatively uneconomical petrol engines, which may be good candidates. That is because some of them are RWD, whilst some are AWD. The AWD gearbox and transfer case could be fitted to the RWD, leaving the transfers front "output" unconnected. Could an electric motor then be connected to this shaft, hence making it an input?
It gave a decent answer, but it was focused on the "front diff" and "front driveshaft" and stuff like that. It hadn't quite grasped what I was implying, although it knew what it was talking about! It brought up various things that I knew were relevant (the "domain knowledge" aspect), so I brought some of those things in my reply (like about the viscous coupling and torque split);

  > I mentioned the AWD gearbox+transfer into a RWD-only vehicle, thus keeping it RWD only. Thus both petrol+electric would be "driving" at the same time, but I imagine the electric would reduce the effort required from the petrol. The transfer case is a simple "differential" type, without any control or viscous couplings or anything - just simple gear ratio differences that normally torque-split 35% to the front and 65% to the rear. So I imagine the open-differential would handle the 2 different input speeds could "combine" to 1 output?
That was enough to "fix" its answer (see below). And IMO, it was a good answer!

I'm posting this because I read a thread on here yesterday/2-days-ago about people struggling with their AI's context/conversation getting "poisoned" (their word). So whilst I don't use AI that much, I also haven't had issues with it, and maybe that's because of the way I converse with it?

---------

"Edit": Well, the conversation was too long for HN, so I put it here - https://gist.github.com/neRok00/53e97988e1a3e41f3a688a75fe3b...


Not sure if anyone here saw the movie Clueless, but a great quote was, "That guy is such a Monet. From a distance he looks great, but up close he's a real mess."

