The complaint is mostly about JSON parsers, not the JSON spec itself, which as other comments have said works quite well in practice.
In .NET land the most popular JSON library is Newtonsoft which plays fast and loose with the JSON spec and makes a lot of default assumptions. A couple of years ago Microsoft rolled out their own first-party JSON library, which is a lot stricter and requires the developer to explicitly configure the parser instead of making assumptions.
But some developers still instinctively reach for Newtonsoft, even when working on brand new projects. In practice, this sort of ingrained habit is a far greater problem than the JSON spec itself.
The author says that Protobuf makes all of JSON's problems go away. I don't know enough about Protobuf to confirm or deny, but I know that it's not tested 10 trillion times a day like JSON is.
Perhaps not, but it might be close: essentially every message sent between every component of every service on every machine in every datacenter Google operates gets encoded as a protobuf. Engineering at Google consists of populating protobufs to send to other services, which reply with protobufs containing the data you need to stuff into more protobufs, which you can then send on to further different services.
We can, in other words, safely conclude that protobuf scales.
ROT13 also scales. Anything that uses minimal compute can be embarrassingly parallelizable.
I think it is more interesting how protobuf is used. I am not experienced with it, but roughly it seems like you design a schema for tightly packing data. That means there is more work in accepting a protobuf, and you can't just add arbitrary fields and let old versions of the SDKs ignore them. I think!
This is good and bad. It is the schema vs. schemaless and compile-time vs. runtime types debate.
There is also the efficiency of protobuf vs slightly more verbose JSON.
The complaint that many things are implementation-defined is a (valid) complaint about the spec, though.
JSON is essentially just a specification for some syntax. When you actually want a full and well-defined serialization/deserialization format, this can lead to fun surprises with JSON.
The argument that it is tested 10 trillion times a day is a fallacy (being the most popular option today doesn't make it the best option forever).
And furthermore, in this context "tested" is a bit of an overstatement, I would say "used" many gazillions of times a day.
It is used so much across the board: frontend, backend, CI, and config.
Despite the heavy usage, it hasn't been canned, and I have never heard anyone at work complain about JSON for any reason.
You could say it is network effects but if it were crap you could replace it much more easily than say moving from Python to Java or whatever. Especially for internal microservice stuff and perhaps front end.
JSON and the tooling is basically solid. It is a non-concern.
Using is testing. The whole point of automated UI tests is to mimic usage.
A random click on a website might not be a great test, but it is a test nonetheless. And if it fails enough times, someone somewhere will be held accountable.
We reach for Newtonsoft because .NET also has a Python 2/3 problem: not everyone can be on .NET vLatest, or has the budget to rewrite code to use other parsers.
I wonder if Pjmlp would become happier if he were to stop working with companies using ancient stacks for no reason whatsoever. Luckily, companies like that are becoming a minority. You do not have to concern yourself with .NET Framework if you do not want to, under a free market economy :)
Also, there was no Python 2/3 moment - it's not comparable and there were no breaking changes in the language. You can't be serious if you think these are equivalent.
Not everyone has the freedom to choose their employers, or consulting customers.
All companies whose main business isn't selling software have reasons to use ancient stacks: that isn't the money maker, what is deployed in production works, and there isn't budget around for a full rewrite.
Also Microsoft could set the example, some of those products are theirs.
Yes I am serious, library code also counts.
If your identity weren't so tied to .NET, maybe you would see the transition mess a bit more impartially. It is no accident that they have now introduced an AI-based migration tool at Ignite and are seeking companies willing to try it out.
For legacy code that's already using a lot of Newtonsoft, I understand. For new code, developers should default to the first-party option. System.Text.Json is supported by .NET versions as old as Framework 4.6.1.
If the problem is that all the parsers implement a different interpretation of the spec such that you get the kinds of incompatibilities the article lists, that all means that the spec is faulty.
The spec might not be perfect, but some of the most popular parsers play fast and loose with the rules. (Looking at you, Newton "I permit //comments in JSON" Soft.)
Newtonsoft.JSON is no longer the standard choice. System.Text.Json was first introduced 5 years ago and is also provided as a package to support targets as old as .NET Framework 4.6.1.
I appreciate the blog post and I learned a bunch from it!
But the quote that comes to mind is: "People who say it cannot be done should not interrupt those who are doing it"
It's quite a stretch to say that JSON is broken and can't be fixed. JSON works exceptionally well in practice and any review of it that fails to acknowledge this is, IMO, very lacking in perspective.
If you look at the statistics, you’ll find that casual, condomless sex is very common. Despite the risks of STI transmission, it often pans out okay: most STIs are readily curable, while HIV has a low transmission rate amongst heterosexual couples (about 0.08% to 0.2% per sexual act) while PrEP is pretty common for homosexual men, and HSV has fairly low odds of transmission outside of an active outbreak (annual odds of transmission amongst couples: about 11–17% for couples with a male source partner and 3–4% with a female partner). Consequently, many people who have casual, condomless sex with many, many partners will end up just fine.
Is it reckless? Well, the connotation of “reckless” is fairly negative, and I don’t see why I should get to judge what two consenting adults decide is an acceptable risk vs reward for themselves (at least, so long as I’m not involved).
However, when we pivot back to the domain of software/engineering: when software design choices are made via bandwagon fallacy, claiming that “it usually works out okay”, I do find that to be reckless. You may be fine with such a carefree approach, but it isn’t really fair to other engineers, users, and stakeholders.
It’s not that I believe JSON (or similarly pitfall-laden technology) should be strictly avoided, but rather that the risks and failure modes should be given serious consideration, rather than minimized or outright dismissed. In terms of the STI analogy, it’s perfectly reasonable for two individuals to be aware of the risks, and depending on their appetite for said risk, agree upon the inclusive/exclusive nature of their relationship, as well as whether they exchange test results; what would be ridiculous is to pretend that the risk of infection is zero.
There would be zero value in the article minimizing the design flaws (and resulting footguns) in JSON. The insistence that the article should do so is about as bizarre as someone responding to an article on safer sex by insisting that the article be amended with a note that “… but raw-dogging strangers isn’t too risky anyway, so, like, YOLO”.
I'm not sure if I should be impressed by your detailed knowledge of STI statistics or unnerved by your attempt to analogize unprotected sex to use of JSON. I'll split the difference and say, "what?"
Hey, if you like living your life uninformed of the risks of your actions (or maybe it’s that the risks are irrelevant to you, if you’re not getting any?), you do you.
I don’t know how I can help you understand the utility of analogies (you recognized the rhetoric as such, yet simultaneously seem stuck on the fact that analogies don’t establish a literal connection). However, I’m not sure if that’s actually the problem, or if you’re feigning confusion in an attempt to offend me. Granted, it would seem an odd choice for you to claim incompetence as part of an attempted insult.
Regardless, I’d be happy to help you in any way I can.
They aren't claiming incompetence and you're being rude.
Ironically there is some projection here about playing dumb since you're pretending you don't understand why getting and spreading sexually transmitted disease is a poor analogy to "using json".
A good analogy needs more than just a slight parallel - it should have lots of significant parallels (and few significant misses) that help you think about the shape of the topic being discussed.
You can hate JSON, but pretending it is similar to infectious disease in cause, usage, value, solution, or expression makes the analogy really ineffective at bringing people to your side. It's really hard to see any useful similarities, which makes it seem like the whole point is just fluffed-up "json bad" and "people not me are dumb".
If you're actually happy to help in any way you can then stop being patronizing and either walk back your overly incendiary first shots or take a good faith attempt to clarify after someone pushes back on an intentionally extreme example.
> They aren't claiming incompetence and you're being rude.
No they weren't.
> Ironically there is some projection here about playing dumb since you're pretending you don't understand why getting and spreading sexually transmitted disease is a poor analogy to "using json".
I don't understand why it would be a poor analogy, and I'm not pretending. (And I hope I'm not too dumb for real.) Please explain.
> You can hate json but pretending it is similar to infectious disease in cause, usage, value, solution, or expression makes it really ineffective at bringing people to your side since it's really hard to see any useful similarities
The obvious similarity I saw was "this has risks. Not huge risks, the sky isn't falling, but definitely not zero either, so it's useful to be informed about and explicitly acknowledge and weigh the risks." Am I dumb for seeing this similarity? Or for not finding it rude?
For someone who wrote such an overwrought analogy, you should know that when you make an analogy, you can't analogize something that bakes in a conclusion on the thing up for debate.
The problem with your analogy is that you chose something with serious undeniable consequences: STIs. The analogy doesn't work because the person above doesn't grant the severity of consequences of using JSON.
So if you like health analogies, replace STIs with the common cold and explain how scary it is even though we've had the common cold hundreds of times and are well aware of its severity.
I feel like many of the points are complaints about the parsing side of JSON, not the format itself.
You can argue that a format is useless when "everyone" parses it "wrong" but no specification on this earth is free of that.
We use a lot of JSON in our API space and it works fine (so far), which leads me to think that the OP is complaining about something that does not fit their use-case.
Firing people for choosing something that does not fit "your" use-case seems like a wild take.
It is indeed a play on the saying “Nobody ever got fired for buying IBM”. What wasn’t explained, though, is the meaning behind the expression. It’s an expression used to mock use of the bandwagon fallacy and a general lack of critical thought in the decision process.
If you’re responsible for choosing a vendor, the easy out is to pick one of the largest vendors available: if it’s what everyone else is doing, then you can wave away any personal accountability for any resulting failures by claiming that “it’s what everyone else is doing, so it must have been the right choice, and therefore the failure that occurred must have been inherent to the problem domain, rather than something that could have been avoided if I personally chose better”.
The phrase now gets used to mock such bandwagon behaviors: a CTO completely unfamiliar with Kubernetes, yet choosing it “because everyone else is using it”; an engineer picking a serialization format based on popularity while never having read the spec; etc.
The article isn’t suggesting that anyone actually be fired. It is, however, critical of people choosing JSON due to familiarity/popularity, without any critical thought involved in choice (and bandwagon fallacy does not count as critical thought).
Sure. But then they added "maybe they should be". Yeah, it's a clickbait title not backed up by the content, but I think it's fair to criticize the words the author chose to write.
Go can actually decode numbers into an “arbitrary precision” raw token (json.Number, which is just the raw digits held as a string). You can turn this on at the decoder level or for a particular field. I often do this for more fine-grained control of number decoding without needing to drop to the tokenizer.
I agree with all of the points, except the last. Protobuf is a nice transfer format but it's horrible once a human wants to inspect or copy/paste a payload.
Exactly. And this is an extremely important detail. It makes your development process go from using glass tubes where you see everything as it is sent/received to black boxes of binary data.
JSON is self describing, you can mostly just call the API, read the JSON response and figure it out from there.
Ignoring that detail is ignoring the elephant in the room.
This article reeks of being written by someone with maybe a year or two of experience that's read someone else's ill-informed rants about JSON and is rehashing it.
IMO, the only valid complaint is that the JSON spec can't handle Inf or NaN, but I'd argue that if you're finding yourself dealing with Inf or NaN, then you likely have a bug elsewhere, and JSON is merely exposing it.
If you're storing credit card numbers, license keys, barcode IDs, etc. as integers, you're already doing it wrong, and what serialization format you're using is completely irrelevant.
Totally concur with your last paragraph. But, considering that the author is apparently one of the people behind the competing format mentioned, protobuf, your first paragraph feels way off.
JSON can't be streamed: wtf? If you have a data connection that can be broken, every data format is incomplete until the last ack.
There is absolutely nothing stopping you from parsing JSON as it comes in. An open array that hasn't reached the closing array character yet? How is that different from a stream of data elements?
With the Jackson API, I was easily able to parse incoming text into JSON tokens and extract data as it came in. I did this for certain corner cases where incoming JSON documents were possibly larger than could practically fit into memory.
Sure, you might want a rollback function in case there is a break in the data transfer, but how would that be different with any other streamed data format?
The guy is complaining a lot about number formats. If you care about them that much, just put your numbers in strings, and make the parsing format part of the standard API interface.
This guy simply appears to be in love with binary formats and strong typing. If that floats his boat, by all means do it.
So much of the web relies on JSON, and it pretty much seems to work. The guy helpfully indicated some corner cases in its use. But come on, it's not fatally flawed.
Much as JSON is without a doubt far from perfect, I think most of the article is rather meh.
Protobuf won't stop people from trying to put license keys, card numbers, or other IDs in a 64-bit float or int. All it'll do is make someone be able to smugly point to a certain person and say "you're stupid, you did this incorrectly as per the spec". The extra assurance could even make things worse, e.g. using 64-bit ints for credit card numbers and having something break that interprets it as a signed int (or perhaps become unusable if 20-digit numbers happen), or have code that converts it to a float anyway, devolving back to 53-bit precision. The "fix" for both formats for normal people is still exactly the same - "please please please don't put things that aren't strictly counts of things or imprecision-tolerant real numbers in number types".
Unpaired surrogates are just a question of which component fails, if any! An arbitrary user input containing an unpaired surrogate would likely be better off quietly transformed (or preserved) than becoming a DoS vector.
Lack of NaNs, infinities, and compact ways to encode arbitrary bytes are the things I personally find the most unfortunate. The special constants are especially weird as null/false/true exist, but byte strings do have the problem that not all languages have a reasonable distinct thing for such to map to.
A funky side-note is that JSON being so tolerant/imprecise (and as such most suppliers not pushing it to edge-cases) means that crappy parsers/producers aren't typically particularly problematic, though this perhaps isn't really a thing to judge data formats by.
I've worked with json schema and have mixed opinions- it was extremely cumbersome to do some things that should be simple, and very simple to do things that should be extremely cumbersome.
I agree that XSD was one of the only redeeming qualities of XML- the schema schema itself was fairly well thought-through.
Personally I prefer the IDL-style languages- that would include CORBA (just the schema and encoding), protobufs, and the ur-IDL that Sun RPC was based on, XDR. Protobufs have gone through some weird evolution, though- IIRC proto3 did a full 180 on required/optional fields.
But the libraries aren’t broken, they follow the spec. The spec is vague enough that compliant implementations can be mutually incompatible, that’s the main criticism as far as I can see.