
One (narrow) way to make reviewing a large contribution — one written with significant aid from an LLM — easier is to jump on a call with the reviewer, explain what the change is, and answer their questions on why it is necessary and what it brings to the table. This first pass is useful for a few reasons:

1. It shifts the cognitive load from the reviewer to the author, because the author now has to do an elevator pitch. This works sort of like a "rubber duck": one would likely have to think through these questions up front.

2. In my experience this is much faster than a lonesome review with no live input from the author on the many choices they made.

Do this as a first pass, then have the reviewer give a go/no-go with optional comments on design, code quality, etc.


> jump on a call with the reviewer

Have you ever done that with new contributors to open source projects? Things there tend to be asynchronous, but maybe it's a practice I've just not encountered in that context.


I've done that when contributing to strangers' repos, though not necessarily open source ones. I believe the practice is quite undervalued, for the reasons I listed.

In addition, 1:1 contact can speed things up immensely in such situations, because most activity on a change happens very soon after it is first posted, and that initial voluminous back-and-forth is much faster over a call than typed out on a GitHub PR.


From my own experience: as one grows through their 30s, or probably much older, into what you mentioned ("money, houses, kids, friends"), these ads pretty much stop targeting you effectively anyway, because one's priorities shift and you care more about other things than what the attention economy is all about. IOW, these ads are all about the people who have attention to spare.

Has anyone had a good experience with humanlayer's management system/process?

Just their git-based thought-management system works pretty well for me TBH. https://www.humanlayer.dev/


The general process feels very much like having kids over for a birthday party, except you have to get them all to play nice and you have no idea what each kid was conditioned on by their parents. Generally it would all work fine: all the kids know how the party progresses and what their roles are, if any.

But imagine how hard it would be if these kids had short-term memory only and did not know what to focus on except what you tell them. You literally have to say "Here is A-Z; pay attention to 'X' only and go do your thing". Add in other managers for this party (a caterer, clowns, your spouse) and they also have to give the same instructions, plus remember and communicate what the other managers have done. No one has really solved for this.

This is what it felt like in 2025 to code with LLMs on non-trivial projects, with somewhat of an improvement as the year went by. But I am not sure much progress was made on fixing the process part of the problem.


There was a time when, if you edited documentation in VSCode with Copilot on, it would complete internal user and project names whenever it encountered a path in some random LLM project we were building. I could find people and their projects just by googling the username and contextual keywords.

We all had a lot of laughs with tab autocomplete and wondered in anticipation what ridiculous stuff it would throw up next.


One thing that is interesting to think about: given a skill, which is just "pre-context", how can it be _evolved_ to create prompts given _my_ context? e.g. here is the web-artifacts-builder skill from the desktop app:

```
web-artifacts-builder

Suite of tools for creating elaborate, multi-component claude.ai HTML artifacts using modern frontend web technologies (React, Tailwind CSS, shadcn/ui). Use for complex artifacts requiring state management, routing, or shadcn/ui components - not for simple single-file HTML/JSX artifacts.
```

Say I want to build a landing page with some relatively static content. I don't know it yet, but it's just gonna be Bootstrap CSS, no SPA/React(ish); it'll be fine as a templated server-side thing. But I don't know how to express this in words. Could the skill _evolve_ based on what my preferences are and what is possible for a relative novice to grok and construct?

This is a simple example, but it could extend to, say, using sqlite+litestream instead of postgres, or gradient-boosted trees instead of an expensive transformer-based classifier.
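
Hypothetically, an evolved version of that skill might carry learned preferences alongside the stock description. Something like (entirely made up):

```
web-artifacts-builder (evolved)

Suite of tools for creating claude.ai HTML artifacts. Learned user
preferences: prefers templated server-side pages with Bootstrap CSS over
SPA/React for mostly-static content; prefers sqlite+litestream over
postgres for small deployments; keep the stack simple enough for a
relative novice to grok and construct.
```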


Isn't at least part of that GH issue something that https://docs.boundaryml.com/guide/introduction/what-is-baml is also trying to solve? LLM calls should be functions with defined input and output schemas. That was their starting point.
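
To illustrate the premise (this is not BAML's actual DSL, just a hypothetical Python sketch of the "LLM call as a typed function" idea):

```python
import json
from dataclasses import dataclass

@dataclass
class Invoice:
    vendor: str
    total_cents: int

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call; returns the model's raw text.
    return '{"vendor": "ACME", "total_cents": 1299}'

def extract_invoice(raw_text: str) -> Invoice:
    # The function boundary: typed input in, schema-validated output out.
    raw = call_llm(f"Extract vendor and total from:\n{raw_text}\nReturn JSON.")
    data = json.loads(raw)
    return Invoice(vendor=data["vendor"], total_cents=int(data["total_cents"]))

print(extract_invoice("ACME Corp ... Total: $12.99"))
```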

IIUC their most recent arc focuses on prompt optimization[0], where you optimize — via DSPy and an optimization algorithm, GEPA [1] — with relative weights on things like errors, token usage, and complexity.
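
I haven't verified the exact API, but the weighted objective is conceptually something like this sketch (field names and weights are made up):

```python
# Hypothetical sketch of the kind of weighted objective such an optimizer
# minimizes; not BAML's or GEPA's real interface.
WEIGHTS = {"errors": 0.6, "tokens": 0.3, "complexity": 0.1}

def composite_score(result: dict) -> float:
    # Lower is better: trade task errors off against token cost and prompt bloat.
    return (WEIGHTS["errors"] * result["error_rate"]            # fraction of failed evals
            + WEIGHTS["tokens"] * result["tokens_used"] / 1e4   # normalized token cost
            + WEIGHTS["complexity"] * result["prompt_chars"] / 1e3)

# An optimizer like GEPA mutates the prompt and keeps candidates that
# improve this score on a held-out eval set.
print(composite_score({"error_rate": 0.12, "tokens_used": 5200, "prompt_chars": 1800}))
```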

[0] https://docs.boundaryml.com/guide/baml-advanced/prompt-optim...

[1] https://github.com/gepa-ai/gepa?tab=readme-ov-file


I gave Opus an "incorrect" research task (using this slash command[1]) in my REST server: research whether SQLite + the Litestream VFS can be used to create read replicas for the REST service itself. This is obviously a dangerous use of the VFS[2], and of a system like SQLite in general (speaking stale-reads- and isolation-wise). Ofc it happily went ahead and used Django's DB router feature, implementing `allow_relation` to return true if `obj._state.db` was the `replica` or the `default` master DB.
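
Roughly the shape of what it generated (reconstructed from memory; details are illustrative), assuming settings.DATABASES has "default" and "replica" entries:

```python
class ReplicaRouter:
    """Django DB router Claude produced, more or less (a reconstruction)."""

    def db_for_read(self, model, **hints):
        return "replica"   # every read hits the Litestream-backed replica

    def db_for_write(self, model, **hints):
        return "default"   # writes only ever go to the primary

    def allow_relation(self, obj1, obj2, **hints):
        # Allows relations whenever both objects live on the replica or the
        # primary -- blind to the fact that replica reads can be stale.
        dbs = {"default", "replica"}
        return obj1._state.db in dbs and obj2._state.db in dbs
```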

Now Claude had access to this link[2], and it got the data into the research prompt using the web searcher. But that's not the point. Any junior worth their salt — distributed systems 101 — would have caught what was obvious here; this was a failure to pay attention to the _right_ thing. While there are ideas on prompt optimization out there [3][4], how many tokens a model can burn thinking about these things to come up with an optimal prompt, and corrections to it, is a very hard problem to solve.

[1] https://github.com/humanlayer/humanlayer/blob/main/.claude/c...

[2] https://litestream.io/guides/vfs/#when-to-use-the-vfs

[3] https://docs.boundaryml.com/guide/baml-advanced/prompt-optim...

[4] https://github.com/gepa-ai/gepa


I'm not sure a junior would immediately understand the risks of what you described. Even if they did well in dist sys 101 last year.


Really nice. We should have this as an add-on to https://app.codecrafters.io/courses/sqlite/overview It can probably teach one a lot about the value of good replication and data formats.

If you are not familiar with data systems, have a read of DDIA (Designing Data-Intensive Applications), Chapter 3, especially the part on building a database from the ground up. It almost starts with something like "What's the simplest key-value store?": `echo` (O(1) write to the end of a file, super fast) and `grep` (O(n) read, slow), and then builds up all the way to LSM-trees and B-trees. It will make a lot more sense why this project preserves so many of those ideas.
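
DDIA's version is two lines of shell; the same idea in Python:

```python
# DDIA's echo/grep key-value store, transliterated to Python.
def db_set(key: str, value: str) -> None:
    with open("database", "a") as f:      # O(1): append to end of file
        f.write(f"{key},{value}\n")

def db_get(key: str) -> str | None:
    value = None
    try:
        with open("database") as f:
            for line in f:                # O(n): scan everything; last write wins
                k, _, v = line.rstrip("\n").partition(",")
                if k == key:
                    value = v
    except FileNotFoundError:
        pass
    return value

db_set("user:42", "alice")
db_set("user:42", "alicia")               # updates are just more appends
print(db_get("user:42"))                  # -> alicia
```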


Nicely done. I think from a product perspective it is interesting that:

- Humans really value authentic experiences, and IRL experiences even more so. People's words about a restaurant matter more to me than the star rating.

- There is only one reason to go somewhere: the 4.5-star reason. But there are ten different reasons not to go: too far, not my cuisine, too expensive for my taste. So the context is what really matters.

- Small is better. Product-wise, scale is always a problem, because the needs of the product end up discriminating against a large minority. You need it to be decentralized and organic, with communities that are quirky.

All of this is, somehow, anathema to Google Maps' or Yelp's algorithms. But I don't understand why they are _so_ bad — just try searching for 'salad' and be amazed at how they recommend a white-tablecloth restaurant in the same breath as Chipotle.

There are many millions who would use the product _more_ if it were personalized. Yet somehow it's not.


> People's words about a restaurant matter more than the star rating to me.

I find that both offer an incredibly poor signal. I can usually get a much better idea of the quality of the place by looking at pictures of the food (especially the ones submitted by normal users right after their plate arrives at the table). It's more time consuming to scroll through pictures manually than to look at the stars, but I'm convinced it's a much better way to find quality.

Maybe that could be a good angle for this kind of tool. At least until this process becomes more popular and the restaurants try to game that too by using dishonest photography.

