
I just use a mix of Cerebras Code for lots of fast/simpler edits and refactoring, and Codex or Claude Code for more complex debugging or planning and implementing new features; it works pretty well. Then again, I move around so many tokens that doing everything with just one provider would need either their top-of-the-line subscriptions or paying a lot per-token some months. And then there's the fact that a single model (even SOTA) can never solve all problems; sometimes I also need to pull out Gemini (3 is especially good) or others.

Idk how I feel about this; my experience is the opposite.

Most Spring (+Boot) projects have been far more painful to work with: numerous abstractions and deep, complex relationships between them, just to support every technology and integration under the sun. Hard to work with, hard to debug (especially with proxies and how the DI works, alongside the nonsensical rules about how @Transactional works, e.g. how it silently does nothing if you call a transactional method from within the same component/service), sometimes performs like shit or leaks memory, and migrations between major versions are a pain. We just spent multiple months moving a project from an old Spring version to a more recent version of Spring Boot. It's a pain.
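
For anyone who hasn't been bitten by the self-invocation thing, here's a minimal sketch of the pitfall (class/method names made up for illustration):

  import org.springframework.stereotype.Service;
  import org.springframework.transaction.annotation.Transactional;

  @Service
  public class OrderService {
      public void processBatch() {
          // This call goes through "this", not the Spring proxy, so the
          // @Transactional advice below is silently skipped: no transaction
          // is started for saveOrder() here.
          saveOrder();
      }

      @Transactional
      public void saveOrder() {
          // Only transactional when invoked through the proxy, i.e. when
          // ANOTHER bean calls orderService.saveOrder().
      }
  }

The usual workarounds (moving the method into another bean, self-injection, AspectJ weaving) are all extra complexity for what looks like a plain method call.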

Compare that to Dropwizard: it is fairly stable and most updates are maintenance related. It uses a bunch of idiomatic packages from the ecosystem: Jetty, Jersey, Jackson, Logback, Hibernate/JDBI3, supports Liquibase or Flyway, and there's also validation and views or whatever you want, but none of it is shoved down your throat. The docs are practical and more manageable simply because it does LESS overall; the code is simpler and there are far fewer surprises. And because the logic is mostly just plain Java and there's no castle-in-the-sky bullshit with annotations, if you want to swap out HK2 for Dagger (if you still want DI), you can - which I think is good, because Dagger does a lot at compile time and avoids the whole Spring runtime injection can of worms altogether.
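
For context on how little ceremony there is, a minimal Dropwizard service looks roughly like this (a sketch from memory, using Dropwizard 4-style packages; check the docs for your version):

  import io.dropwizard.core.Application;
  import io.dropwizard.core.Configuration;
  import io.dropwizard.core.setup.Environment;
  import jakarta.ws.rs.GET;
  import jakarta.ws.rs.Path;
  import jakarta.ws.rs.Produces;
  import jakarta.ws.rs.core.MediaType;

  public class HelloApp extends Application<Configuration> {
      public static void main(String[] args) throws Exception {
          new HelloApp().run(args); // e.g. "server config.yml"
      }

      @Override
      public void run(Configuration config, Environment env) {
          // Plain Jersey/JAX-RS registration, no classpath scanning magic
          env.jersey().register(new HelloResource());
      }

      @Path("/hello")
      @Produces(MediaType.TEXT_PLAIN)
      public static class HelloResource {
          @GET
          public String hello() {
              return "hello";
          }
      }
  }

Everything is an explicit object you construct and register yourself, which is also why swapping out the DI framework is even possible.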

The size of a framework doesn't instantly make it good or bad, but oftentimes the bigger ones will be more difficult to reason about, and sometimes the number of abstractions gets out of hand.


> It's not like you can afford an average decent mid/upper range GPU these days thanks to the AI bros.

I mean, Nvidia was greedy even before then, and AMD just did “Nvidia - 50 USD” or thereabouts.

Intel Arc tried shaking up the entry level (retailers spit on that MSRP though) but sadly didn’t make that big of a splash, despite the daily experience being okay (I have the B580). Who knows, maybe their B770 will provide an okay mid-range experience that doesn’t feel like being robbed.

Over here, to get an Nvidia 5060 Ti 16 GB I'd have to pay over 500 EUR, which is fucking bullshit, so I don’t.


The Intel–Nvidia collaboration has just received the green light from the competition authority, with Nvidia purchasing a 4% stake.

Nvidia is expected to sell GPU intellectual property to Intel at a bargain for the entry-level segment, making it unprofitable for Intel to develop a competitive product range of its own. That way, Intel would never build up the competence and infrastructure internally to eventually eat into Nvidia’s market share in the higher segments.


> Intel Arc tried shaking up the entry level (retailers spit on that MSRP though) but sadly didn’t make that big of a splash

The Intel Arc B60 probably would have made a splash if they had actually produced any of the damn things. 24 GB of VRAM at a low price would have been huge for the AI crowd, and there was a lot of excitement, and then Intel just didn't offer them for sale.

The company is too screwed up to take advantage of any opportunities.


Hmm, duopolies don't work, you say? I doubt a third player will make any difference (see the memory manufacturers). Then again, looking at market share, Nvidia is a monopoly in practice.

The bad part is that everyone wants to be on the AI money circle line train (see the various money-flow images available) and thus everything caters to that. At this point I'd rather have Nvidia and AMD quit the GPU business and focus on "ai" only, so that a new competitor can enter the business and cater to the niche applications like consumer GPUs.


I found that Cache-Control with no-cache worked pretty well, EXCEPT that Apache2 would fail to return 304 when also compressing some of the resources: https://stackoverflow.com/questions/896974/apache-is-not-sen...

I think setting FileETag None solved it. With that setup, the browser won't use stale JS/CSS/whatever bundles, instead always validating them against the server; but when it already has the correct asset from an earlier download, it gets a 304 and avoids re-downloading a lot of stuff. Pretty simple, and works well for low-traffic setups.
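
Roughly what that looks like in the Apache config (a sketch; the FilesMatch pattern is just illustrative, and Header needs mod_headers enabled):

  <FilesMatch "\.(js|css)$">
    Header set Cache-Control "no-cache"
  </FilesMatch>
  FileETag None

If I remember right, the underlying problem is that mod_deflate alters the ETag (appends a -gzip suffix), so If-None-Match never matches the compressed response; with ETags disabled, validation falls back to Last-Modified and the 304s work again.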

It was surprisingly easy to mess this up and end up with out-of-date translation bundles cached in the browser.

(nothing against other web servers, Apache2 was just a good fit for other reasons)


RooCode and KiloCode also have an Orchestrator mode that can create sub-tasks, and you can specify which model to use for what - and since the sub-tasks report their results back after finishing (implement X, fix Y), the context of the more expensive model doesn’t get as polluted. Probably one of the most user-friendly ways to do that.

A simpler approach without sub-tasks would be to just use the smart model for Ask/Plan/whatever mode and the dumb but cheap one for Code mode, so the smart model can also review the results and suggest improvements or fixes.


Here are my own stats, for comparison: https://news.ycombinator.com/item?id=46216192

Essentially migrating codebases, implementing features, plus all of the reading of existing code and the writing of tests and various automation scripts needed to make sure the code changes are okay. Over 95% of those tokens are reads, since there's often a need for a lot of consistency and iteration.

It works pretty well if you’re not limited by a tight budget.


Idk why the gap is so big; surely a bunch of people would also pay $50 a month across multiple vendors for a medium amount of tokens.

Indeed, I would consider switching to Codex completely if a) they had a $100 or $50 membership and b) they really worked on improving the CLI tool a lot more; it's about 4-6 months behind Claude Code.

In regards to the review part:

What helps me is keeping around my TODO.txt month by month, as well as a lot of screenshots and images of the things I find relevant for sharing in stand-ups, meetings and such (as well as presentations).

So if I need to review the past month/year (e.g. when I want to update my CV/site or catch up with management), it’s just a matter of going through a bunch of text and images without a lot of unnecessary fluff, like digging through Jira - that’s maybe only worth it if I want to estimate the approximate time/effort spent on particular stuff, based on the amount of activity there.

Alongside that, it’s also nice to document the stuff that was particularly good, or all the ways the software broke (and what broke how often), as well as the stuff that pissed me off and made me want to quit (sometimes people/mindsets, sometimes tangible code or practices).

When the default is just going with the flow, not documenting anything and doing no self-reflection, every improvement upon that helps.


GPT was actually pretty good for this use case until 5.2 kneecapped its long-term memory, and now it's more aggressive about pruning (very annoying, as wide recall now has to be explicitly invoked).

Very interesting! Do you organize screenshots and images by day and by topic?

Currently not really, at least not for the weekly status meetings.

Typically I'll have a folder with a bunch of numbered files in the order that I want to talk about them, since it's easier to just quickly share my screen and run through them when I want to let others know what I've done, for example along the lines of:

  01-migrate-gulp-grunt-to-vite.png
  02-vue-prebuild-script-check-unused-translations.png
  03-java-add-compile-memory-limit-ide.png
  04-server-update-python-for-ansible.png
  ...
If I need them for something like a yearly performance review, then I'll probably do a pass where I group them into named folders and write a doc loosely following those topics, given that I might work on similar improvements and fixes across more than just one week. Pretty low friction day to day, and it still works when I need more structure.

We should just have some standard for crawlable archived versions of pages, with no back end or DB interaction behind them: e.g. if there's a reverse proxy, whatever it outputs is archived, and the archived version never passes any call on to the origin. Same for translating the output of any dynamic JS into fully static HTML. Then add some proof-of-work that works without JS and is a web standard (e.g. server sends a header, client sends the correct response, gets access to the archive), mainstream a culture of low-cost hosting for such archives, and you're done - also make sure this sort of feature is enabled in the most basic configuration of all web servers and such, logged separately.

Obviously such a thing will never happen, because the web and its culture went in a different direction. But if it were a mainstream thing, you'd get easy-to-consume archives (also for regular archival and data hoarding) and the "live" versions of sites wouldn't have their logs bogged down by stupid spam.

Or if PoW were a proper web standard with no JS, then people who want to tell AI and other crawlers to fuck off could at least make it uneconomical to crawl their stuff en masse. In my view, proof of work that works through headers should, in the current day and age, be as ubiquitous as TLS.
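
To sketch what I mean (the header names and difficulty scheme here are entirely made up): the server sends a random challenge plus a difficulty, the client brute-forces a nonce so that SHA-256(challenge + nonce) starts with that many zero bits, and sends it back in a header. Server-side verification is a single hash:

  import java.nio.charset.StandardCharsets;
  import java.security.MessageDigest;

  public class PowCheck {
      // True if sha256(challenge + nonce) starts with `difficulty` zero bits,
      // e.g. challenge from a hypothetical "X-PoW-Challenge" response header
      // and nonce from the client's "X-PoW-Nonce" request header.
      static boolean verify(String challenge, String nonce, int difficulty)
              throws Exception {
          byte[] hash = MessageDigest.getInstance("SHA-256")
                  .digest((challenge + nonce).getBytes(StandardCharsets.UTF_8));
          for (int i = 0; i < difficulty; i++) {
              if (((hash[i / 8] >> (7 - (i % 8))) & 1) != 0) return false;
          }
          return true;
      }
  }

Checking costs one hash, while finding the nonce costs around 2^difficulty hashes on average - exactly the asymmetry you'd want against mass crawling.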


My experience: even for the run-of-the-mill stuff, local models are often insufficient, and where they would be sufficient, there is a lack of viable software.

For example, simple tasks CAN be handled by Devstral 24B or Qwen3 30B A3B, but they often fail at tool use (especially the quantized versions) and you find yourself wanting something bigger, at which point the speed drops a bunch. Even something like zAI GLM 4.6 (through Cerebras, as an example of a bigger cloud model) is not good enough for certain kinds of refactoring or certain kinds of scripts.

So either you use local smaller models that are hit or miss, or you need a LOT of expensive hardware locally, or you just pay for Claude Code, or OpenAI Codex, or Google Gemini, or something like that. Even Cerebras Code that gives me a lot of tokens per day isn't enough for all tasks, so you most likely will need a mix - but running stuff locally can sometimes decrease the costs.

For autocomplete, the one thing where local models would be a nearly perfect fit, there just isn't good software: Continue.dev autocomplete sucks and is buggy (with Ollama), there don't seem to be VSC plugins good enough to replace Copilot (e.g. with those smart edits, where you change one thing in a file but similar changes are needed 10, 25 and 50 lines down), and many aren't even trying - KiloCode had some vendor-locked garbage with no Ollama support, and Cline and RooCode aren't even attempting autocomplete.

And not every model out there (like Qwen3) supports FIM properly, so for a bit I had to use Qwen2.5 Coder, meh. Then when some plugins do come out, they're all pretty new and you don't know what supply-chain risks you're dealing with. It's the one use case where local models could be good, but... they just aren't.
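
For reference, FIM means the model completes the gap between a prefix and a suffix using special tokens; for Qwen2.5 Coder the prompt looks something like this, if I remember the template right (check the model card before relying on it):

  <|fim_prefix|>{code before the cursor}<|fim_suffix|>{code after the cursor}<|fim_middle|>

The model then generates whatever belongs at the cursor; a model without these tokens trained in will just ramble instead of filling the hole.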

For all of the billions going into AI, someone should have paid a team of devs to create something that is both open (any provider) and doesn't fucking suck. Ollama is cool for the ease of use. Cline/RooCode/KiloCode are cool for chat and agentic development. OpenCode is a bit hit or miss in my experience (copied lines getting pasted individually), but I appreciate the thought. The rest is lacking.


Have you tried llama.vscode [0]? I use the vim equivalent, llama.vim [1] with Qwen3 Coder 30B and personally feel that it's better than Copilot. I have hot keys that allow me to quickly switch between the two and find myself always going back to local.

[0] https://github.com/ggml-org/llama.vscode

[1] https://github.com/ggml-org/llama.vim

