visual bug when previewing "æ"

Posted: Fri Nov 25, 2022 9:57 am
by Haganham
Previewing this https://www.nationstates.net/page=dispatch/id=1801077
in an RMB post causes a bug where "æ" displays as "Ã¦"

Posted: Fri Nov 25, 2022 10:04 am
by Barbaria
/notamod I've tested this and can confirm this happens.

Posted: Fri Nov 25, 2022 2:23 pm
by Wormfodder Delivery
Seems to be browser-based, as I just checked the RMB post and it displayed normally there. I'm using Opera currently.
Edit: Never mind, it also displays that way on Opera when previewed.

Posted: Fri Nov 25, 2022 3:34 pm
by Trotterdam
EDIT: I completely missed the point, and so everything in this post is irrelevant. Ignore me.

I assume you're referring to this post?

So while poking around, I noticed something interesting. The Antiquity theme includes this line in the header:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
This is, according to the HTML 4 standard, the correct way to indicate character encoding. Meanwhile, the Century and Rift themes include this line instead:
<meta charset="iso-8859-1">
This is a new notation defined in the HTML 5 standard that was not present in HTML 4, but most browsers should support HTML 5 nowadays. However, the HTML 5 standard also specifies:
The charset attribute specifies the character encoding used by the document. This is a character encoding declaration. If the attribute is present, its value must be an ASCII case-insensitive match for the string "utf-8".
The Encoding standard requires use of the UTF-8 character encoding and requires use of the "utf-8" encoding label to identify it. Those requirements necessitate that the document's character encoding declaration, if it exists, specifies an encoding label using an ASCII case-insensitive match for "utf-8". Regardless of whether a character encoding declaration is present or not, the actual character encoding used to encode the document must be UTF-8.
Which just leaves me wondering what the heck? Why bother adding a new way of specifying the character encoding that allows only one possible value? Was there some intermediate form of the standard in which other encodings were permitted? Regardless, websites which use non-UTF-8 character encodings remain common on the internet, so practical browsers (including mine - the post displays correctly for me) will make some effort to implement them regardless of what the HTML standard says.
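
For illustration, here is a minimal Python sketch that pulls the declared charset out of either notation. The function name `declared_charset` and the regular expression are assumptions made up for this example. Real browsers follow the far more involved encoding-sniffing rules in the HTML standard, so treat this as a rough reading aid rather than how any browser works.

```python
import re

# Illustrative pattern only: finds "charset=<label>" inside a <meta> tag,
# whether it is a standalone attribute (HTML 5 form) or embedded in the
# content attribute (HTML 4 form).
_CHARSET_RE = re.compile(
    r'<meta[^>]*?charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)

def declared_charset(html_head: str):
    """Return the lowercased charset label from a <meta> tag, or None."""
    match = _CHARSET_RE.search(html_head)
    return match.group(1).lower() if match else None

# The two header lines quoted above.
html4_style = '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
html5_style = '<meta charset="iso-8859-1">'

print(declared_charset(html4_style))
print(declared_charset(html5_style))
```

Both notations yield the same label here, which fits the observation that the themes differ only in how they declare the encoding, not in which encoding they declare.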

In any case, examination of the page in a hex editor shows that the "æ" is, in fact, encoded in its ISO-8859-1 representation. This makes it even weirder that it displays as "Ã¦" for you, since that is what it would display if a UTF-8-encoded character in the page were misinterpreted by your browser as ISO-8859-1, which clearly cannot be the case (I checked the page, and nowhere in it does the UTF-8 encoding of "æ" appear).
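
To make the byte-level argument concrete, here is a small Python sketch. It shows the two encodings of "æ", what happens when the UTF-8 bytes are read as ISO-8859-1, and a crude stand-in for the hex-editor check. The helper `contains_utf8_form` and the sample fragment are hypothetical and not taken from the actual page.

```python
char = "æ"

latin1_bytes = char.encode("iso-8859-1")  # one byte: 0xE6
utf8_bytes = char.encode("utf-8")         # two bytes: 0xC3 0xA6

# Reading the UTF-8 bytes as if they were ISO-8859-1 turns one character
# into two: 0xC3 becomes "Ã" and 0xA6 becomes "¦".
mojibake = utf8_bytes.decode("iso-8859-1")

def contains_utf8_form(page_bytes: bytes, text: str) -> bool:
    """Report whether the raw bytes contain the UTF-8 encoding of text."""
    return text.encode("utf-8") in page_bytes

# Hypothetical page fragment, encoded the way the site header declares.
page_bytes = "<p>Encyclopædia</p>".encode("iso-8859-1")

print(latin1_bytes.hex(), utf8_bytes.hex(), mojibake)
print(contains_utf8_form(page_bytes, char))
```

A page that really is ISO-8859-1 throughout fails the `contains_utf8_form` check, which is the same conclusion the hex editor gave.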

My best guess, in light of this evidence, is that you have some sort of browser script that reads the page, modifies it, and inserts the modified text back into the page, but mangles the character encoding during the "insert the modified text back into the page" stage. Are you using any NationStates browser plugins? If so, try turning them off and seeing which one causes the problem.

Or did I get the wrong link? If so, please link to the actual page that has the problem.

Posted: Fri Nov 25, 2022 5:32 pm
by Racoda
Trotterdam wrote:I assume you're referring to this post?

[...]

Or did I get the wrong link? If so, please link to the actual page that has the problem.

Haganham wrote:Previewing this https://www.nationstates.net/page=dispatch/id=1801077
in an RMB post causes a bug where "æ" displays as "Ã¦"

I think the previewing part is relevant here. Indeed, the post itself looks fine, but when writing an RMB post with a link to the dispatch and clicking on preview, it bugs out.

Posted: Fri Nov 25, 2022 8:12 pm
by Phydios
Trotterdam wrote:My best guess, in light of this evidence, is that you have some sort of browser script that reads the page, modifies it, and inserts the modified text back into the page, but mangles the character encoding during the "insert the modified text back into the page" stage. Are you using any NationStates browser plugins? If so, try turning them off and seeing which one causes the problem.

I doubt this is a browser issue. I replicated it (as described by Racoda) on iPhone Safari just fine, and I certainly don't have any scripts or plugins running.

Posted: Sat Nov 26, 2022 1:32 am
by Trotterdam
Racoda wrote:I think the previewing part is relevant here. Indeed, the post itself looks fine, but when writing an RMB post with a link to the dispatch and clicking on preview, it bugs out.
Oh, right. Missed that part. Ignore me, then.

In that case, it's probably an issue with JavaScript/AJAX using a different encoding than the main site. But using JavaScript for such a thing is poor design anyway.

EDIT: Wait, does it happen only in the factbook preview, and not when previewing a post which has "æ" typed in it normally? In that case, it must be something server-side, with the factbook code and RMB code not communicating properly.
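
If that guess is right, the failure could look something like the Python sketch below. One component hands over UTF-8 bytes and another pastes them unchanged into a page served as ISO-8859-1. Everything in it is hypothetical (the function names, the preview markup, and the idea that transcoding is the right fix), since nobody in this thread can see the site code.

```python
PAGE_ENCODING = "iso-8859-1"  # what the page header declares

def embed_naively(fragment_utf8: bytes) -> bytes:
    """Buggy variant: pastes UTF-8 bytes into the page unchanged."""
    return b"<div class='preview'>" + fragment_utf8 + b"</div>"

def embed_transcoded(fragment_utf8: bytes) -> bytes:
    """Decode the fragment as UTF-8, then re-encode for the page.

    Characters the page encoding cannot represent are emitted as
    numeric character references instead of raising an error.
    """
    text = fragment_utf8.decode("utf-8")
    body = text.encode(PAGE_ENCODING, errors="xmlcharrefreplace")
    return b"<div class='preview'>" + body + b"</div>"

def as_browser_sees(page_bytes: bytes) -> str:
    """Decode the page bytes with the declared page encoding."""
    return page_bytes.decode(PAGE_ENCODING)

# Hypothetical dispatch title arriving from the other component as UTF-8.
fragment = "Encyclopædia".encode("utf-8")

print(as_browser_sees(embed_naively(fragment)))
print(as_browser_sees(embed_transcoded(fragment)))
```

The naive variant reproduces the reported symptom, while the transcoding variant shows the text intact. Whether the real cause is a missing transcoding step like this one is an open question until the backlog item is investigated.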

Posted: Sat Nov 26, 2022 5:24 am
by Roavin
Bug has been entered into our backlog; thanks for the report!