If you JSON-decode material containing a value that contains a backslashed "n" to indicate a newline, at what point should you replace it with a true newline?
Here’s an artificial example:
import Foundation

let dict = ["key": "value\\nvalue"]  // the value holds a backslash followed by n, not a newline
let json = try! JSONEncoder().encode(dict)
let result = try! JSONDecoder().decode([String:String].self, from: json)
print(result["key"])
That outputs "value\\nvalue", but for purposes of display in my app, I then call replacingOccurrences to change "\\n" (meaning backslashed n) into "\n" (meaning newline) so that an actual newline appears in my interface.
Now, I’m a little surprised that JSONDecoder isn’t already doing this for me. Just as it has a configurable policy for decoding a date string value into a date, I would expect it at the least to have a configurable policy for decoding a string value into a string. But it doesn’t.
My question is: what do people do about this sort of situation, as a general rule? Dealing with it on a case by case basis, as I’m doing in my app, feels wrong; in reality, these JSON data are coming from the server and I want all JSON HTTP response bodies to be treated in this way.
2 Answers
It looks like the server is sending 5c5c6e (i.e. backslash-backslash-n, or \\n). That’s valid JSON, but it doesn’t mean "newline." It means a literal backslash followed by the letter n. If the server means to send a newline, that’s mis-encoded. It needs to be 5c6e, "backslash-n." Sure, you can fix it on the client side, but there’s no "normal" way to do that because it’s just wrong.

The right way to fix that is to fix it on the server side. You can double-unescape the strings on the client side, but that’s ambiguous and I don’t recommend it unless there’s no better way. Repeatedly unescaping tends to mess things up when actual backslashes show up in the string.
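To make the byte-level difference concrete, here is a minimal sketch, assuming two hypothetical payloads built from raw bytes so that Swift's own string escaping does not get in the way. The helper name decodeValue and the sample strings are illustrative only, not anything from the original app or server.

```swift
import Foundation

// Hypothetical payloads, spelled out as raw bytes.
// {"key":"a\nb"}  : 5c 6e inside the string is the JSON escape for a newline.
let correct = Data([0x7b, 0x22, 0x6b, 0x65, 0x79, 0x22, 0x3a, 0x22,
                    0x61, 0x5c, 0x6e, 0x62, 0x22, 0x7d])
// {"key":"a\\nb"} : 5c 5c 6e is an escaped backslash followed by the letter n.
let doubled = Data([0x7b, 0x22, 0x6b, 0x65, 0x79, 0x22, 0x3a, 0x22,
                    0x61, 0x5c, 0x5c, 0x6e, 0x62, 0x22, 0x7d])

// Illustrative helper: decode the payload and pull out the single value.
func decodeValue(_ data: Data) throws -> String {
    let dict = try JSONDecoder().decode([String: String].self, from: data)
    return dict["key"] ?? ""
}

let good = try decodeValue(correct)  // a, newline, b (3 characters)
let bad = try decodeValue(doubled)   // a, backslash, n, b (4 characters)

// Why client-side double-unescaping is ambiguous: a string that
// legitimately contains backslash + n gets corrupted by the same fix.
let path = "C:\\new"  // C, colon, backslash, n, e, w
let mangled = path.replacingOccurrences(of: "\\n", with: "\n")
// mangled is now C, colon, newline, e, w
```

The decoder does exactly what each payload asks for; the second payload simply asks for a backslash. The last two lines show the kind of damage a blanket replacement can do once real backslashes appear in the data.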
"Literal newlines" are not allowed in JSON strings, in that the byte 0a is not allowed between quotation marks. Putting that into JSONLint should fail. But 5c6e (backslash-n, or \n) is allowed, and is the correct way to do it.

This is (likely) a backend problem. But JSONEncoder and JSONDecoder do properly escape and unescape characters.

In terms of how to proceed, tell your boss that the raw JSON payload should be 5c6e, not 5c5c6e. Anything else is incorrect (or, at best, bad design).

Beyond that, we can’t be more specific. That having been said, there are two likely sources of this server bug:
1. the actual server database/model contains 5c6e rather than 0a, and the JSON encoding is (as it should) converting that initial 5c to 5c5c, thus resulting in 5c5c6e in the final raw JSON payload; or
2. the database contains 0a, and the backend devs are unaware that standard JSON encoding routines will properly escape/convert this to 5c6e for them and are manually (whether intentionally or not) converting it themselves to 5c6e, and again the server’s JSON encoder is (correctly) converting it to 5c5c6e.

The first scenario is likely what is going on, but we don’t have enough information to diagnose it further. You will likely need to have some backend dev look at hex representations of what is actually in the database to figure out where the problem rests.
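Both scenarios come down to the same thing: the string handed to the JSON encoder already contains backslash + n instead of a real newline. A rough sketch of how that shows up in the raw bytes, using Swift's JSONEncoder as a stand-in for whatever encoder the server actually uses (the helper names hex and encodedHex are made up for this illustration):

```swift
import Foundation

// Illustrative helper: space-separated hex dump of the raw payload bytes.
func hex(_ data: Data) -> String {
    data.map { String(format: "%02x", $0) }.joined(separator: " ")
}

// Encode a one-element array so the payload is just ["..."].
func encodedHex(_ value: String) throws -> String {
    try hex(JSONEncoder().encode([value]))
}

// Stored value holds a real newline (byte 0a):
// the encoder escapes it, so the payload contains 5c 6e.
let fromNewline = try encodedHex("a\nb")

// Stored value already holds backslash + n (bytes 5c 6e), whether from
// bad data or manual pre-escaping: the encoder escapes the backslash,
// so the payload contains 5c 5c 6e.
let fromBackslashN = try encodedHex("a\\nb")

print(fromNewline)
print(fromBackslashN)
```

Comparing a hex dump like this against what is actually stored in the database is one way to tell which side introduced the extra backslash.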
For what it’s worth, if the first scenario applies, you may need to go back further in the process to figure out how the 5c6e got into the database/model in the first place (if that is indeed the case).

We should recognize that it might not be a server bug at all, but rather that some client app over-escaped the original input. E.g., maybe it had a 0a newline character, added escapes itself, and provided 5c6e to its JSON encoder, resulting in sending the server 5c5c6e in the raw JSON, and the server dutifully unescaped it and stored 5c6e in the database/model.

Bottom line, you have to determine what the server really has in its model/database, and figure out whether it was garbage-in-garbage-out or whether the database is correct and there is a bug in the server’s JSON generation.