Hacker News | anilgulecha's comments

Has anyone implemented a Pi-like system for a team? Basically, consolidate all shared knowledge and skills, and work through it on the things the team is building together.

Basically, a Pi instance with an SSO frontend and data separation.

If no one has, I have a good mind to go after this over a weekend.


There is a thing called Mercury that seems very promising. Check https://taoofmac.com/space/ai/agentic/pi for a list of pi-related things I'm tracking.

I have created a separate knowledge base in Markdown, synced to a git repo. Agents can read and write it using MCP. Works fine!
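The commenter doesn't share their setup, but the git side of such a knowledge base can be sketched in a few commands. This is a minimal illustration only: the MCP server wiring is assumed to exist separately, and all file and identity names here are made up.

```shell
# A Markdown knowledge base in a git repo that agents read and write.
# Only the git side is shown; the MCP server that exposes it is assumed.
mkdir -p kb && cd kb
git init -q
git config user.email "agent@example.com"   # local identity for agent commits
git config user.name "kb-agent"

echo "# Deploy runbook" > deploy.md         # an agent writes a note
git add deploy.md
git commit -q -m "agent: add deploy runbook"

git log --oneline                           # every agent write is an auditable commit
```

Because each write is a commit, other team members (or agents) can pull regularly and get the full history, which addresses the "how is knowledge continuously updated" question below.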

And others pull regularly from the pool? How are knowledge and skills continuously updated? I was thinking these necessarily need to be server-side (like the main project under discussion) for it to be non-clunky for many users, but potentially git could work?

Like, let's take a company example: GitLab. If an agent had the whole GitLab handbook, it would be very useful to just ask the agent what to do, and how, in a given situation. The modern Pi agents can help build such a handbook, with data fed in from all across the company.


But TypeScript is already in every model's training data, and needs no additional work.

>If you ask me, no court should have ever rendered a judgement on whether AI output as a category is legal or copyrightable, because none of it is sourced. The judgement simply cannot be made, and AI output should be treated like a forgery unless and until proven otherwise.

Guilty until proven innocent will satisfy the author's LLM-specific point of contention, but it is hardly a good principle.


You are missing the author's point. He literally said no court should have rendered a judgement; that's the exact opposite of guilty until proven innocent. Guilty means a court has made a judgement.

He is proposing not to make a judgement at all. If the AI company claims something, they have to prove it, as in science. Any claim is treated as just that: a claim. The trick is to not claim anything at all, and let the users come to the conclusion on their own that it's magic. And it's true that LLMs by design cannot cite sources. Thus they cannot, by design, tell you whether they made something up with no regard for it making sense or working, whether they copy-pasted something that either works or is crap, or whether they somehow created something new that is fantastic.

All we ever see are the success stories. The success after the n-th try and tweaking of the prompt and the process of handling your agents the right way. The hidden cost is out there, barely hidden.

This ambiguity is benefiting the AI companies, and they are exploiting it to the maximum: going as far as illegally obtaining pirated intellectual property from an entity that is banned in many countries at one end of their pipeline, and selling it as the biggest thing ever at the other. And yes, all the doomsday stories of AI taking over the world are part of the marketing hype.


sure, no "court" should render it, but then

>AI output should be treated like a forgery

Who's passing this judgement? The author? Civil society?


A forgery isn't a subjective assessment. A forgery is an intentionally inaccurate claim about the origin of something. If the by-line claims it was made by someone who didn't make it, it's a forgery, no matter how good a copy it is judged to be.

This is precedent-setting. In this case the rewrite was in the same language, but if there's a Python GPL project, and its tests (spec) were used to rewrite the specs in Rust, and then an implementation in Rust, can the second project legally be MIT, or any other license?

If yes, this in effect allows a path around GPL requirements. An MIT-licensed Linux would be out within the next 1-2 years.


It's very important to understand how it was done. The GPL survives the "compile" step: the result is still GPL. The clean-room process uses two teams, separated by a specification. So you would have to:

1. Generate a specification of what the system does.

2. Pass it to another, "clean" system.

3. The second, clean system implements based solely on the specification, with no information about the original.

That third step is the hardest, especially for well-known projects.
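The separation described above can be sketched as two directories with only the spec crossing the boundary. This is a hypothetical illustration, not legal advice: the `echo` lines stand in for agent or team output, and the file names are invented.

```shell
# Hypothetical clean-room layout: only the specification file ever
# crosses from the "dirty" side (which saw the original) to the "clean" side.
mkdir -p dirty clean
echo "original GPL source here" > dirty/orig.py

# Step 1: the team/agent with access to the original writes a prose spec.
# (placeholder for actual spec-writing work)
echo "add(a, b) returns the sum of a and b" > dirty/spec.md

# Step 2: the spec, and nothing else, is handed across the boundary.
cp dirty/spec.md clean/spec.md

# Step 3: a second team/agent, given only clean/, implements from the spec.
ls clean   # contains spec.md only; no original source
```

The auditable property is exactly what `ls clean` shows: nothing from the original codebase is present on the implementing side, and the hand-off can be logged as chain of custody.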


So what if a frontier-model company trains two models, one on 50% of the world's open source projects and the second on the other 50% (or ten models with 90-10 splits)?

Then the model that is familiar with the code can write specs. The model that does not have knowledge of the project can implement them.

Would that be a proper clean room implementation?

Seems like a pretty evil, profitable product: "rewrite any code base with an inconvenient license into your proprietary version, legally".


LLM training is unnecessary for what we're discussing; merely LLM use: original code -> specs as facts -> specs to tests -> tests to new code.

It is hard to prove that the model doesn't recognize the tests and reproduce the memorized code. It's not a clean room.

1. claude-code outputs the tests as text.

2. The text is dumped into a file.

3. Another claude-code run converts this into tests in the target language, and implements the app that passes them.

Step 3 is no longer hard; look at all the reimplementations, from ccc to the rewrites popping up. They all have a well-defined test suite as a common theme. So much so that the tldraw author raised a (joke) issue to remove the tests from the project.


> but if there's a python GPL project, and it's tests (spec) were used to rewrite specs in rust, and then an implementation in rust, can the second project be legally MIT, or any other?

Isn't that what https://github.com/uutils/coreutils is? GNU coreutils spec and test suite, used to produce a rust MIT implementation. (Granted, by humans AFAIK)


Treating an AI-assisted rewrite as a legal bypass of the GPL is wishful thinking. A defensible path is a documented clean-room reimplementation, where a team that never saw the GPL source writes independent specs and tests, and a separate team implements from those specs using black-box characterization and differential testing, while you document the chain of custody.

AI muddies the waters because large models trained on public repos can reproduce GPL snippets verbatim, so prompting with tests that mirror the original risks contamination, and a court could find substantial similarity. To reduce risk, use black-box fuzzing and property-based tools, have humans review and scrub model outputs, run similarity scans, and budget for legal review before calling anything MIT.


I'm somewhat confused about how it actually muddies the waters: any person could have read the source code beforehand and then either lied about it or forgotten it.

Our knowledge of what the person or the model actually retains of the original source is inherently incomplete, while the entire premise requires full knowledge that nothing remains.


No, GPL still holds even if you transform the source code from one language to another language.

That's why I carved it out to just the specs. If they can be read as "facts", then the new code is not derived but arrived at through TDD.

The thesis I propose is that tests are more akin to facts, or can be stated as facts, and facts are not copyrightable. That's what makes this case interesting.


I assumed that "tests" refers to a program too, which in this example is likely GPL. Thus GPL would already attach to an AI rewrite of the GPL test code.

If "tests" means a proper specification, say an IETF RFC for a protocol, then that would be different.


Yes, I hadn't specified that in my original comment. But in the SOTA LLM world, the code/text boundary is so blurry as to be non-existent.

No package manager is. But of the ones that are installed by users, npm is probably the most popular.

What about pip? It's either installed or immediately available on many OSes.

pip might be, but it was historically super inconsistent (at least in my experience). Is it `pip install`? `python3 -m pip install`? Maybe `pip3 install`? Yeah, Ubuntu did a lot of damage to pip here. npm always worked because you had to install it, and it didn't have a transition phase from Python 2 being in the OS by default.

`pip install` either doesn't work out of the box or risks clobbering system files, though.

System pip with sudo usually unleashes Zalgo; I'd rather curl | bash, but npm is fine too. It's just about meeting people where they're at, and in the AI age many devs have npm.
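The ambiguity the thread is describing can be shown concretely: `pip`, `pip3`, and `python3 -m pip` may each point at a different interpreter, while the module form is tied to exactly the interpreter you name. A minimal check (assuming a `python3` with the bundled pip module):

```shell
# The unambiguous form: the pip belonging to this exact python3.
python3 -m pip --version

# Safer install form discussed above: per-user site-packages, no sudo,
# no clobbering of system-managed files (shown, not run):
#   python3 -m pip install --user somepackage
```

On distributions that mark the system Python as externally managed, even `--user` installs may be refused, which is part of why tools like uv exist.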

If you build for the web, no matter what your backend is (Python, Go, Rust, Java, C#), your frontend will almost certainly have some JS, so you likely need npm.


This is about eight years old, and the Python situation has mostly gotten worse since: https://xkcd.com/1987/

Python packaging/envs is solved now by uv. It's not merely promising, or used only by people in the know, like the last two trendy Python package managers were. I was a big-time Python hater, since it was a PITA to support as a devtools guy, but now it's trivial. uv just works; it won.

I'm not a Python dev, but I see a bit of its ecosystem. How does uv compare with conda or venv? I thought JS had a monopoly on competing package managers.

What? It’s much much better now, you can just use uv. Yeah, it’s yet another package manager, but it does it well.

Or go up a rung or two on the abstraction ladder, and use mise to manage all the things (node, npm, python, etc).

My 2c:

1) Reflect daily, and inspect your feelings. Most of the negative and positive sentiments about AI arise from how it impacts your identity ("I'm a great programmer", "I build complicated systems easily"). Doing an RCA on your thoughts is like debugging.

2) List the things you can control, and the things you cannot. "I cannot stop the launch of the new model." .. "I control my usage of these models." .. "My family needs me to do this, and I can." .. "I can do this in my team."

3) Fully accept both of the above. It's a process.

4) Finally, you can then see what are the new identities and new things you can do in this disrupted new world, and you can begin to focus on those.

I think these also model the stages of dealing with trauma, because both require acceptance to truly figure out the next steps in a positive way.


It's not an intern, because you can speak to it at a much higher level of abstraction, one that in the old world you could only use with an architect.

In the new world, this has become the potential expectation at the intern level, which means: forget leetcode; learn to deal with higher-level architecture concepts and practice them.


Please explain this in more detail; I don't understand. The level of abstraction seems straightforward, even for an intern. Properly understanding this level of abstraction in context, without lengthy explanations, is a completely different matter. Beginners struggle with this, just as AI systems struggle with it (or rather, find it impossible). Of course, I'm talking about the conceptual, ontological context here, not the prior textual "LLM context".


Prior to LLMs, the concept of "Open Source" could coexist with "Free Software": one was a more pragmatic view of how to develop software, the other a political-activist position on how the code powering our world should be.

AI has laid bare the difference.

Open Source is significantly impacted. Business models based on it are affected. And those who were not taking the political position find that they may not prefer the state of the world.

Free software finds itself, at worst, a bit annoyed (it needs to figure out the slop problem) and, at best, with an ally in AI: the amount of free software being built right now for people to use is very high.


I’ve seen different opinions. Can LLM-generated software be licensed under the GPL?


Your question has nothing to do with the GPL. If your concern is that the code may count as derivative work of existing code then you also can't use that code in a proprietary way, under any license. But that probably only applies if the LLM regurgitated a substantial amount of copyrighted code into your codebase.


Fair; that was an example instance. People interested in “Free software” rather than “open source” seem to often favor the GPL, though other licensing options also count as “free software”.

But in any case, the question really refers to, can the LLM-generated software be copyrighted? If not, it can’t be put under any particular license.


Is your concern the potential for plagiarism or the lack of creative input from the human? If the latter, it would depend on how much intellectual input was needed from the human to steer the model, iterate on the solution etc.


If it can't be copyrighted, then no: licenses rely on the copyright holder's right to grant the license. But that would also mean it'd be essentially public domain. I'm not sure there's really a settled legal opinion on this yet. IIRC, it can't be patented.


Can you link to them?

The way the world currently works, code created by someone (using AI) is treated as if it were authored by that someone. This holds across companies and FOSS. I think it's going to settle into this pattern.


Very fresh take on APIs. With the proliferation of compute types (container, lambda, Cloudflare Worker, VM, offline-first apps), there's something to be said for a common interface.

Kudos. Will explore more.


Thanks for the kind words! Let me know your thoughts if you end up exploring.


Imo, it's theoretically allowed, as a hobby, but not really as a practice. This is what the blog post is about.


There are literally still programmers who make their living writing assembly code by hand for embedded systems.

