I have something against opt-out analytics over TCP/IP or UDP/IP, period, because they aren't anonymized: they include an IP address by virtue of the protocol.
But I only posted that original complaint about the email/username (I'm not the person you responded to initially).
They're not anonymous, they're just pseudonymous. It's incredibly easy to collect pieces of data A through Z that, on their own, are anonymous but, all together, are not. It's also incredibly easy to collect data that you think is generic but is actually not.
Do you query the screen size? I have bad news for you. But all of this is beside the point: when that data is exfiltrated to a third-party service, you have no idea how it's being used. You have a piece of paper, if you're lucky, telling you the privacy policy, which is usually "you have no privacy, dumbass".
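To make the "A through Z" point concrete, here is a rough sketch of how fields that each look generic can single out a row once they are combined. The rows, field names, and the `unique_rows` helper are all made up for illustration; nothing here comes from a real product.

```python
from collections import Counter

# Hypothetical analytics rows: no name, no email, no account ID.
events = [
    {"screen": "1920x1080", "tz": "UTC-5", "lang": "en-US", "os": "Windows"},
    {"screen": "1920x1080", "tz": "UTC-5", "lang": "en-US", "os": "Windows"},
    {"screen": "1920x1080", "tz": "UTC+1", "lang": "de-DE", "os": "Windows"},
    {"screen": "2560x1440", "tz": "UTC-5", "lang": "en-US", "os": "macOS"},
    {"screen": "3440x1440", "tz": "UTC+9", "lang": "ja-JP", "os": "Linux"},
    {"screen": "1920x1080", "tz": "UTC+1", "lang": "en-GB", "os": "Linux"},
]


def unique_rows(rows, fields):
    """Count rows whose combination of `fields` appears exactly once."""
    def key(row):
        return tuple(row[f] for f in fields)

    combos = Counter(key(r) for r in rows)
    return sum(1 for r in rows if combos[key(r)] == 1)


if __name__ == "__main__":
    # One field at a time, then every field together.
    for fields in (["os"], ["screen"], ["screen", "tz", "lang", "os"]):
        print(fields, unique_rows(events, fields), "of", len(events), "rows are unique")
```

In this toy data the OS alone isolates one row, while all four fields together isolate four of the six. Real fingerprinting works with many more attributes than this.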
Even if data appears completely anonymous to humans, it can be ingested by machine learning algorithms that can spot patterns and de-anonymize the data.
I mean, we have companies whose entire business model is "how do we string together bits of data and tie it to real-world identity?": namely Google. Turns out it's remarkably easy when you have your hands in a lot of different pots. Collect a little anonymous data here, a little there, and boom: now you know that Billy Joe who lives on First Street loves to go to Walmart at 1 AM and buy Ben and Jerry's ice cream in a moment of weakness.
"Not breaking things they like" is a very low bar for building a great product.
To be honest, building things this way seems like such a competitive disadvantage that I don't see how it could ever work at scale. Certainly all the big players are using analytics. If we shake our heads at the little players doing the same, we're just going to widen the moat.
Spying on your users does not give better feedback than simply asking your users (surveys, focus groups) and responding to the considered comments you receive. Spying and trying to infer intent is such a low bar to improve upon.
Companies blow money on bad ideas all the time. Middle managers love analytics because it lets them win internal arguments, not because it actually solves problems.
It is not an either/or. Surveys are almost always ignored. Micro improvements cannot be done with just surveys and asking users. Often users do not know how to describe a problem. Product analytics, if anonymized and opt-out, gives a pretty good picture of intent, especially in B2B software.
Any complex dataset has enough revealing information to make deanonymization possible. To truly muddy the waters enough to make such attempts impossible would require injecting so much noise that the analytics become useless to learn from.
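For a feel of that trade-off, here is a minimal sketch using one standard way of injecting noise, the Laplace mechanism from differential privacy. That technique is my choice for illustration, not something named above. The count of 40, the epsilon values, and the helper names are assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    """Release a count with Laplace noise of scale 1/epsilon.

    Assumes sensitivity 1: one user changes the count by at most 1.
    Smaller epsilon means stronger privacy and more noise.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_abs_error(true_count, epsilon, trials=2000, seed=0):
    """Average distance between the released count and the true count."""
    rng = random.Random(seed)
    total = sum(
        abs(noisy_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    true_count = 40  # hypothetical: 40 users clicked some button this week
    for epsilon in (1.0, 0.1, 0.01):
        print(f"epsilon={epsilon}: mean abs error {mean_abs_error(true_count, epsilon):.1f}")
```

The expected error is about 1/epsilon, so at epsilon 0.01 the noise is on the order of 100 and swamps a count of 40. Whether that makes the analytics "useless" depends on how large the real counts are and how much privacy is wanted.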
Sure, but that is broader than product analytics and applies to all data collection. The word I should have used is "pseudonymize". The goal of capturing product analytics is not to deanonymize but to understand usage trends and bottlenecks.
Pseudonymous is not what is wanted here, though. For your spying on my usage to be acceptable, it would have to be truly anonymous. Pseudonymous means that instead of you putting "HN user adastra22" in your database for everything I do, you instead use "fffa366bc5d3". So any human being looking at the database record won't immediately see that it is me.
But in any sufficiently complex real-world database, it is a trivial step to map these pseudonymous tags to actual users, and thereby undo the obfuscation. It provides no actual privacy protection.
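A toy sketch of what that mapping step can look like: line up the pseudonymous records with some other dataset that carries real names. Everything below is invented for illustration (the timestamps, the second tag, the `someone_else` account, the `link_tags` helper, and the assumption that an attacker has public post times to compare against).

```python
# Hypothetical pseudonymized analytics log: (tag, minute-resolution timestamp).
analytics = [
    ("fffa366bc5d3", 1000), ("fffa366bc5d3", 1450), ("fffa366bc5d3", 2210),
    ("0b1e77a9c2d4", 1005), ("0b1e77a9c2d4", 1900),
]

# Hypothetical auxiliary data: public post times per username.
public_posts = {
    "adastra22": [1001, 1452, 2211],
    "someone_else": [1003, 1899],
}


def match_score(times, post_times, window):
    """Count analytics events that fall within `window` minutes of a post."""
    return sum(any(abs(t - p) <= window for p in post_times) for t in times)


def link_tags(analytics, public_posts, window=5):
    """Guess a username for each tag by best timestamp overlap."""
    by_tag = {}
    for tag, t in analytics:
        by_tag.setdefault(tag, []).append(t)
    return {
        tag: max(public_posts, key=lambda u: match_score(times, public_posts[u], window))
        for tag, times in by_tag.items()
    }


if __name__ == "__main__":
    for tag, user in link_tags(analytics, public_posts).items():
        print(tag, "->", user)
```

Note that nothing here reverses the tag itself, so swapping in a salted hash or a random ID would not change the result. The linkage comes from the behavioural data attached to the tag.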
Isn't that an argument against any piece of ethics? Am I missing something or are you arguing that gaining an advantage by being a bad actor means you shouldn't be a good actor because then you'd be at a disadvantage?
I get that I am generalizing from your original narrow scope, so correct me if I'm wrong and you mean that THIS bad thing is fine but other bad things are still bad.
I also think this would be better as an MCP tool / resource. Let the model operate and query it as needed.