[Q] Sending Unicode characters such as ❖ over dispatch API
Posted:
Sun Apr 19, 2020 2:09 pm
by Bowzin
I'm working on an automated dispatch posting tool, however I am having an issue where any unicode characters I am sending to the API are coming through as boxes and other weird characters. I am not encoding them in any form, in fact when I send them via cURL to my own web pages, the ❖ comes through. Is there some server side encoding that we aren't able to get around?
Posted:
Sun Apr 19, 2020 4:15 pm
by Frisbeeteria
There are a large number of Unicode character encodings: UTF-8 and many others. Since this site started in 2002 and has been added to on an irregular basis ever since, we do not have a single UTF standard across the entire site. It would take a massive rebuild of the older portions to make it uniformly compliant, and [violet] decided there were better uses for her time.
In any given segment or element of the site, you will simply have to see what works and what doesn't. Sorry.
Posted:
Sun Apr 19, 2020 5:05 pm
by Bowzin
Hmm... it's just weird considering ❖ works everywhere I've tried, including dispatches, but I can't send it over the API. Oh well, thanks for the response.
Posted:
Sun Apr 19, 2020 6:17 pm
by [violet]
This is now fixed. I hadn't added UTF8 support to the API because until recently you couldn't upload any content to it. But now you can post Dispatches, so it's needed.
There still may be a few oddities because, as Fris says, our character encoding is a bit of a mess. But the API should now be consistent with the rest of the site.
With this change, I have also bumped the API version number to 11. If you require the old method (i.e. no UTF8 support), you should request version 10 or earlier via the API's "v" parameter:
https://www.nationstates.net/pages/api.html#versions
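As a rough illustration of pinning the API version through the "v" parameter, here is a minimal Python sketch. The endpoint is the one used in the curl tests in this thread; the helper name `build_api_url`, the nation name, and the `q=name` query are placeholders chosen for the example, not something confirmed in this thread.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the curl commands posted in this thread.
API_URL = "https://www.nationstates.net/cgi-bin/api.cgi"


def build_api_url(params, version=None):
    """Return an API URL, optionally pinned to a version via the "v" parameter.

    build_api_url is a hypothetical helper written for this example.
    """
    query = dict(params)
    if version is not None:
        query["v"] = str(version)
    return API_URL + "?" + urlencode(query)


# Placeholder nation and query; version 10 requests the pre-UTF-8 behaviour.
legacy_url = build_api_url({"nation": "example_nation", "q": "name"}, version=10)
print(legacy_url)
```

Leaving `version` unset sends no "v" parameter at all, so the request gets whatever the current default is.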
Posted:
Sun Apr 19, 2020 6:19 pm
by Bowzin
Thanks <3
Posted:
Sun Apr 19, 2020 7:35 pm
by Bowzin
So I am still having issues, except now it's just posting ?'s.
There is definitely a chance this is on my end right now, but I thought I'd put it out there while I troubleshoot, just to see.
EDIT: Pretty sure I am sending the UTF-8 characters properly. Still trying the diamond thing and getting ?'s. I guess it's progress from the boxes, but something is still up.
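One way to check the client side is to look at the exact bytes being produced for ❖ (U+2756). The Python sketch below does that. It also shows how a "?" typically appears when a character is forced into a single-byte encoding that cannot represent it. Whether the API did anything like this is an assumption, not something established in this thread, and the function names are made up for the example.

```python
def utf8_bytes(text):
    """Return the UTF-8 bytes a client would send for this text."""
    return text.encode("utf-8")


def lossy_single_byte(text):
    """Force text into Latin-1, replacing unrepresentable characters with '?'.

    This only mimics a common failure mode; it is an assumption about the
    symptom, not a description of what the server actually does.
    """
    return text.encode("latin-1", errors="replace")


sample = "Test: \u2756"  # "Test: ❖"
print(utf8_bytes(sample))         # b'Test: \xe2\x9d\x96'
print(lossy_single_byte(sample))  # b'Test: ?'
```

If the first line shows the three bytes E2 9D 96 for the diamond, the text is valid UTF-8 before it leaves the client.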
Posted:
Sun Apr 19, 2020 10:44 pm
by [violet]
Post an example of what's not working, if you can.
Posted:
Sun Apr 19, 2020 11:55 pm
by Bowzin
Posted:
Wed Apr 22, 2020 10:13 pm
by Bowzin
Any updates or anything else you need me to do?
Posted:
Wed Apr 22, 2020 10:20 pm
by [violet]
At the moment I don't know what you're attempting to post. Can you please do this:
1. Create a dispatch with your desired text via the regular website. So this presumably looks right. (If it doesn't, it isn't an API issue.)
2. Create a duplicate dispatch with the exact same text via the API. This presumably looks wrong.
Posted:
Thu Apr 23, 2020 12:36 am
by Racoda
(Not OP)
I did a few tests from the command line/curl.
Code:
curl -H "X-Pin: ####" -A "CLI test" "https://www.nationstates.net/cgi-bin/api.cgi" --data "nation=rsca&c=dispatch&dispatch=add&title=U2756%20UrlEncoded&category=1&subcategory=105&mode=execute&token=0123456abcdef" --data-urlencode "text=Test: ❖"
Publishing a dispatch with ❖ results in the character becoming a question mark:
Test: ?
Code:
curl -H "X-Pin: ####" -A "CLI test" "https://www.nationstates.net/cgi-bin/api.cgi" --data "nation=rsca&c=dispatch&dispatch=add&title=U2756%20escaped%20UrlEncoded&category=1&subcategory=105&mode=execute&token=0123456abcdef" --data-urlencode "text=Test: &#10070;"
However, escaping ❖ as an HTML numeric character reference (e.g. &#10070;) does work (bug? feature?): the result is
Test: ❖
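To make the difference between the two requests concrete, this Python sketch builds a form-encoded `text` field for both variants. It assumes the escaped form is the decimal numeric character reference `&#10070;` (U+2756 in decimal). `form_body` is a hypothetical helper, and the output may differ from curl's `--data-urlencode` in small details.

```python
from urllib.parse import urlencode


def form_body(text):
    """Form-encode a single "text" field (hypothetical helper for this example)."""
    return urlencode({"text": text})


# Raw character: sent as the percent-encoded UTF-8 bytes of U+2756.
raw_body = form_body("Test: \u2756")
# Escaped variant: plain ASCII, so no byte in the request depends on UTF-8 handling.
escaped_body = form_body("Test: &#10070;")

print(raw_body)      # text=Test%3A+%E2%9D%96
print(escaped_body)  # text=Test%3A+%26%2310070%3B
```

The escaped request carries only ASCII, which would explain why it survives even if some step on the receiving side mishandles multi-byte input.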
Posted:
Thu Apr 23, 2020 1:20 am
by Bowzin
hmmm...I'll give that a shot
The original dispatch that triggered this:
https://www.nationstates.net/page=dispatch/id=1345798
Here's the API posted version:
https://www.nationstates.net/page=dispatch/id=1347428
Posted:
Sun Apr 26, 2020 11:56 pm
by Bowzin
Any updates on this? Let me know if you need anything else. Escaping the characters to their numeric value isn't easy to do when they're in a huge block of text with PHP.
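For the numeric-escape workaround, converting a whole block of text does not need a per-character lookup. In this Python sketch (the function name `escape_non_ascii` is made up for the example) every non-ASCII character becomes a decimal numeric character reference and ASCII is left alone. PHP's `mb_encode_numericentity` appears to serve a similar purpose, though that is untested here.

```python
import html


def escape_non_ascii(text):
    """Replace each non-ASCII character with a decimal reference like &#10070;.

    ASCII characters, including BBCode tags, pass through unchanged.
    """
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


dispatch_text = "[b]Rules[/b] \u2756 caf\u00e9"  # "[b]Rules[/b] ❖ café"
escaped = escape_non_ascii(dispatch_text)
print(escaped)  # [b]Rules[/b] &#10070; caf&#233;

# Round trip: unescaping restores the original text.
print(html.unescape(escaped) == dispatch_text)  # True
```

A literal "&" already in the text is not escaped by this approach, so text that contains entity-like sequences on purpose would need extra care.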