PostPosted: Mon Jan 09, 2012 8:19 pm
by [violet]
New!

Data from nation "Analysis" pages is now available from the API. See the documentation for examples. You can get any nation's score on any of the scales by using the shard "censusscore-N", where N is the numeric ID of the census.

Also added a few world shards, like "censusmedian".
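
For readers who want to try the "censusscore-N" shard, here is a minimal sketch of how a request URL might be built. The endpoint address, the "nation" and "q" parameter names, and joining shards with "+" are assumptions based on how the API is commonly called, so check the API documentation before relying on them. The census IDs passed in are placeholders; nothing is sent over the network.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm against the API documentation.
API_ENDPOINT = "https://www.nationstates.net/cgi-bin/api.cgi"


def census_url(nation, census_ids):
    """Build a URL requesting one censusscore-N shard per census ID."""
    shards = "+".join(f"censusscore-{n}" for n in census_ids)
    # safe="+" stops urlencode from escaping the "+" that separates shards.
    query = urlencode({"nation": nation, "q": shards}, safe="+")
    return f"{API_ENDPOINT}?{query}"


# Placeholder IDs, only to show the shape of the URL.
print(census_url("testlandia", [1, 2]))
```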

PostPosted: Mon Jan 09, 2012 8:24 pm
by [violet]
Now let's talk about Daily XML Dumps. I've done some profiling, and these are a large part of why our daily updates are taking so long to complete. Now we have API shards, I'm wondering how far we can strip back the Dumps.

Could regular users of these Daily Dumps please let me know which data they actually use from the XML, and which they don't?

Now we have shards, can the Dumps stop including (for example) Causes of Death?

PostPosted: Tue Jan 10, 2012 4:40 am
by New South Hell
[violet] wrote:New!

Data from nation "Analysis" pages is now available from the API. See the documentation for examples. You can get any nation's score on any of the scales by using the shard "censusscore-N", where N is the numeric ID of the census.

Also added a few world shards, like "censusmedian".


The documentation is not clear on some things. My best guess is that in "censusscore-N", N is to be replaced with an integer, and that the integer for the current census may be obtained from the world censusid. I am guessing that the N's are in order by their appearance in the Analysis tab, starting at 0 for Civil Rights (which might be a surprise to the naive). If so, this is a potential problem if there are ever more added, because if you keep them alphabetized, then adding a new one will move the numbers of ones later in the alphabet, and gum up any analysis based on knowing the previous numbers. The most practical way around this, I think, is to publish a table of the numbers, and agree not to change existing ones. (The numbers could be API-level-linked, I guess. Ugh!)

I request that the documentation be updated to contain the correct answers to the questions I've speculated upon. I request either an N to analysis table or a description of the alternate technique that will be used should new analyses be added. I request that there be at least one example within the doc for making a call to obtain analysis data.

And one question: is there a limit on the number of censusscore-N's that can be specified in a single call? If not, is there a reasonable upper bound to avoid straining the server and/or a revised maximum rate of usage if one wants a lot of them for a bunch of nations?

Thank you for implementing this! :bow:

- nsh -

PostPosted: Tue Jan 10, 2012 6:19 am
by New South Hell
[violet] wrote:Now let's talk about Daily XML Dumps. I've done some profiling, and these are a large part of why our daily updates are taking so long to complete. Now we have API shards, I'm wondering how far we can strip back the Dumps.

Could regular users of these Daily Dumps please let me know which data they actually use from the XML, and which they don't?

Now we have shards, can the Dumps stop including (for example) Causes of Death?


I make occasional use of the daily dump. Items of interest to me in the nation dump are as follows (pasted directly from code):

'<NAME>','<CIVILRIGHTS>','<ECONOMY>','<POLITICALFREEDOM>',
'<REGION>','<POPULATION>','<TAX>','<MAJORINDUSTRY>',
'<COMMERCE>','<CATEGORY>','<ANIMAL>','<CURRENCY>',
'<GOVTPRIORITY>','<DEFENCE>','<ADMINISTRATION>',
'<CAPITAL>','<LEADER>','<RELIGION>','<TYPE>',
'<NATION>', '<FREEDOMSCORES>'

I have in the past used the cause of death information, but I have no expectation of ever doing so again.

Of the items not in this list, I can foresee wanting to make use of other <GOVT> subentries and <PUBLICSECTOR>.

I'm assuming that the region dump is not a performance issue, but if you want a list of what I use there, let me know.

(Of course, the daily dump wasn't created for folks like me - I presume that the calculators have a bigger list.)

- nsh -
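
A minimal sketch of pulling only selected fields out of the nation dump, in the spirit of the tag list above. It assumes the dump is a gzip-compressed XML file whose root contains one <NATION> element per nation, with the listed tags as direct children; that layout and the sample values are assumptions for illustration, and a tiny in-memory stand-in is used instead of the real nations.xml.gz.

```python
import gzip
import io
import xml.etree.ElementTree as ET

# A few of the tags from the list above; extend as needed.
WANTED = ("NAME", "REGION", "POPULATION", "CATEGORY")


def iter_nations(fileobj, wanted=WANTED):
    """Yield one dict per <NATION>, keeping only the wanted child tags."""
    for _event, elem in ET.iterparse(fileobj, events=("end",)):
        if elem.tag == "NATION":
            yield {tag: elem.findtext(tag) for tag in wanted}
            elem.clear()  # drop the parsed subtree to keep memory use low


# Hypothetical stand-in for the dump; the real layout may differ.
SAMPLE = b"""<NATIONS>
<NATION><NAME>Testlandia</NAME><REGION>Testregionia</REGION>
<POPULATION>100</POPULATION><CATEGORY>Example</CATEGORY><DEATHS/></NATION>
</NATIONS>"""

with gzip.open(io.BytesIO(gzip.compress(SAMPLE))) as fh:
    rows = list(iter_nations(fh))
print(rows)
```

Streaming with iterparse means unused elements such as <DEATHS> are simply skipped rather than held in memory.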

PostPosted: Tue Jan 10, 2012 6:52 am
by The Most Glorious Hack
New South Hell wrote:My best guess is that in "censusscore-N", N is to be replaced with an integer, and that the integer for the current census may be obtained from the world censusid. I am guessing that the N's are in order by their appearance in the Analysis tab, starting at 0 for Civil Rights (which might be a surprise to the naive).

How many people using shards could be considered naive?

PostPosted: Tue Jan 10, 2012 7:09 am
by Ballotonia
What I use daily:

regions.xml.gz: <NAME>, <NUMNATIONS>, <DELEGATE>, <FLAG>
nations.xml.gz: <NAME>, <UNSTATUS>, <FLAG>

How about constructing the nations.xml.gz file after the update has completed? Just provide a way we can obtain the date of the file before pulling it in its entirety, and I don't think it really matters when exactly it gets constructed or when it becomes available.

Does the update keep the XML files open and write to them throughout the update, or does it continually open/close them and append?

Ballotonia

PostPosted: Tue Jan 10, 2012 9:39 am
by Snefaldia
This is probably a silly question but I'm not a techy type- has anyone used this information to create apps like NSeconomy or NSdossier? Or other sorts of calculators and services? I'm really wondering what the potential would be for regular players to make use of that data, since probably many like me have no idea how to look at it properly or what we might use it for. :D

PostPosted: Tue Jan 10, 2012 11:36 am
by Unibot II
Snefaldia wrote:This is probably a silly question but I'm not a techy type- has anyone used this information to create apps like NSeconomy or NSdossier? Or other sorts of calculators and services? I'm really wondering what the potential would be for regular players to make use of that data, since probably many like me have no idea how to look at it properly or what we might use it for. :D


NSEconomy, NSDossier and NStracker all use this information in some form.

PostPosted: Tue Jan 10, 2012 12:04 pm
by Snefaldia
Yes, but they're a tad old hat. I meant anything new, now that there is unprecedented access to the API.

PostPosted: Tue Jan 10, 2012 5:33 pm
by Solm
Snefaldia wrote:Yes, but they're a tad old hat. I meant anything new, now that there is unprecedented access to the API.


I know that NStracker is continually updated with the new API data, and I believe NSDossier is as well. Just because they were created a long time ago doesn't mean they aren't updating themselves with the new information.

PostPosted: Tue Jan 10, 2012 5:53 pm
by Snefaldia
Okay, maybe I phrased this wrong: I mean to ask, are there any new calculators being used? There are a lot of people in this thread looking at the data, and I wondered if any were using it to provide a service to NSers, outside of the existing and commonly-used calculators.

PostPosted: Tue Jan 10, 2012 6:01 pm
by NewTexas
[violet] wrote:Now let's talk about Daily XML Dumps. I've done some profiling, and these are a large part of why our daily updates are taking so long to complete. Now we have API shards, I'm wondering how far we can strip back the Dumps.

Could regular users of these Daily Dumps please let me know which data they actually use from the XML, and which they don't?

Now we have shards, can the Dumps stop including (for example) Causes of Death?


:o

Between the eight or nine pages in our NSDossier Suite (aka NSSuite), we use every single element of both Nations & Regions with, ironically enough, the exception of Causes of Death. Initially you had the Death by [national animal] using the actual national animal and it was unusable without parsing that into a more database-friendly format. We were not that motivated and subsequently never made anything that uses that set of data.

So, we would really like to keep the Nation & Region Daily dumps as is with the exception of Causes of Death - it can go.

:ugeek:

PS: Presumably you are looking at ways to speed up processing. We know of an alternative that would help you and radically change a lot of dynamics of the invader/defender game. Another online game we play - The Kingdom of Loathing - utilizes a daily forced log off of everyone for approximately 15-45 minutes where they back up things, make code changes and process daily activities. It is not that painful and we would love to see what it would do to the invader/defender game.

Or, you could implement Ballotonia's suggestion. Fork off the API process and have two "updates" of sorts: an active one that does whatever the "main" update does, and a second one that just builds the feeds and can start after the main one is done.

Just a thought. :idea:

PostPosted: Tue Jan 10, 2012 9:59 pm
by Assorro
Indeed, Causes of Death was found to be unusable for us as well.

The data dumps have become an essential tool and NSTracker utilizes all other elements found within both files. Presuming that you perform both tasks at the same time it does appear reasonable to suggest scheduling them separately. At least as a test.

This does present us with some interesting possibilities, Violet, and considerable food for thought.

PostPosted: Tue Jan 10, 2012 10:36 pm
by [violet]
Thanks for the responses.

The #1 source of load for the daily update is loading all 90,000 active nations. The #2 source is generating the daily dump. So while we could speed up the daily update by breaking out #2, that would require doing #1 twice: once for the daily update and once for the data dump. That's no good.

Causes of Death is a fairly high load generator, so it's good to know it's largely unneeded in the dumps. I'll look at removing it from the next API version.

NewTexas wrote:Between the eight or nine pages in our NSDossier Suite (aka NSSuite), we use every single element of both Nations & Regions with, ironically enough, the exception of Causes of Death.

Just to make sure: you require every field but Causes of Death to be in the daily dump, is that right? I.e. it would be problematic for you if any of the other fields were available only via shards, not the dump.

PostPosted: Tue Jan 10, 2012 10:44 pm
by [violet]
New South Hell wrote:The documentation is not clear on some things. My best guess is that in "censusscore-N", N is to be replaced with an integer

Yep. If you click the "censusscore-N" tag in the API documentation, it takes you to an example.

I am guessing that the N's are in order by their appearance in the Analysis tab

This is incorrect, and any new World Census scales won't change the IDs of existing ones. You can see which census IDs match which scales by browsing a random nation (like so) and looking at the URL. In this example, "Environmental Beauty" is revealed to be ID #63.

And one question: is there a limit on the number of censusscore-N's that can be specified in a single call? If not, is there a reasonable upper bound to avoid straining the server and/or a revised maximum rate of usage if one wants a lot of them for a bunch of nations?

There's no limit, and from our perspective there doesn't need to be one. But you may encounter difficulty if you use an extremely long URL because not all clients/proxies/servers support those.
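
Since the practical constraint is URL length rather than shard count, one option is to split a long list of census IDs across several requests. The sketch below assumes the endpoint address and the "+" separator between shards; the 2000-character ceiling is a conservative guess, not a documented limit, and the nation name is assumed to be URL-safe already.

```python
# Assumed endpoint; confirm against the API documentation.
API_ENDPOINT = "https://www.nationstates.net/cgi-bin/api.cgi"
MAX_URL_LENGTH = 2000  # conservative guess, not an official limit


def chunked_census_urls(nation, census_ids, max_len=MAX_URL_LENGTH):
    """Split census IDs across as many URLs as needed to stay under max_len."""
    base = f"{API_ENDPOINT}?nation={nation}&q="
    urls, current = [], []
    for n in census_ids:
        candidate = current + [f"censusscore-{n}"]
        # Start a new URL when adding this shard would exceed the ceiling.
        if current and len(base + "+".join(candidate)) > max_len:
            urls.append(base + "+".join(current))
            candidate = [f"censusscore-{n}"]
        current = candidate
    if current:
        urls.append(base + "+".join(current))
    return urls


# An artificially low ceiling, just to show the splitting.
for url in chunked_census_urls("testlandia", range(5), max_len=100):
    print(url)
```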

PostPosted: Tue Jan 10, 2012 11:03 pm
by Assorro
Every other element is used to compile various statistics from our own resources on regions and nations alike following our own updates. If not available, this information would have to be directly lifted from the shards which could potentially mean an increase in server requests and IP bans which we've striven to reduce. The dump has become our closest friend in a way. He can lose the tie (Causes of Death) but we do like him just the way he is. :)

PostPosted: Tue Jan 10, 2012 11:32 pm
by [violet]
Assorro wrote:an increase in server requests and IP bans which we've striven to reduce.

While I've got you here: Absolution scripts are still smashing us, and not complying with API rules (e.g. no UserAgent). It would be SUPER if you could fix some of the obvious errors, e.g. cases where you request the exact same page multiple times per second.
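
For script authors, a rough sketch of a client that addresses both complaints: it sends a User-Agent header on every request and never fetches the same URL twice in quick succession. The class name, the one-second gap and the 60-second cache lifetime are illustrative assumptions, not values taken from the API rules.

```python
import time
import urllib.request


class PoliteClient:
    """Hypothetical helper: identifies itself, spaces requests, caches repeats."""

    def __init__(self, user_agent, min_interval=1.0, cache_ttl=60.0,
                 fetch=None, clock=time.monotonic, sleep=time.sleep):
        self._user_agent = user_agent
        self._min_interval = min_interval  # assumed gap between requests
        self._cache_ttl = cache_ttl        # assumed lifetime of a cached page
        self._fetch = fetch or self._http_fetch
        self._clock = clock
        self._sleep = sleep
        self._last_request = float("-inf")
        self._cache = {}

    def _http_fetch(self, url):
        request = urllib.request.Request(
            url, headers={"User-Agent": self._user_agent})
        with urllib.request.urlopen(request) as response:
            return response.read()

    def get(self, url):
        now = self._clock()
        cached = self._cache.get(url)
        if cached and now - cached[0] < self._cache_ttl:
            return cached[1]  # same page requested again: reuse it
        wait = self._last_request + self._min_interval - now
        if wait > 0:
            self._sleep(wait)  # keep a gap between consecutive requests
        body = self._fetch(url)
        self._last_request = self._clock()
        self._cache[url] = (self._last_request, body)
        return body
```

Creating one client per script, for example PoliteClient("MyScript (contact: me@example.com)"), and routing every request through its get method keeps the throttling in one place.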

PostPosted: Wed Jan 11, 2012 5:11 am
by New South Hell
[violet] wrote:
I am guessing that the N's are in order by their appearance in the Analysis tab

This is incorrect, and any new World Census scales won't change the IDs of existing ones. You can see which census IDs match which scales by browsing a random nation (like so) and looking at the URL. In this example, "Environmental Beauty" is revealed to be ID #63.


I humbly suggest that the information above be placed in the API documentation. Now that I've been clued in, I don't require it, but I remember my frustration when I first started scripting with being unable to find key information about it which was only in long-dead Forum threads. I suppose that skilled search would have revealed those threads to me but, at the time, I didn't know exactly what I was looking for.

Much thanks.

PostPosted: Wed Jan 11, 2012 5:52 am
by NewTexas
[violet] wrote:Thanks for the responses.

8<-- snip


Just to make sure: you require every field but Causes of Death to be in the daily dump, is that right? I.e. it would be problematic for you if any of the other fields were available only via shards, not the dump.


Yes, we would like to keep every element except <DEATHS>.

Like Assorro, we are concerned about:

Assorro wrote:an increase in server requests and IP bans which we've striven to reduce.


:ugeek:

PostPosted: Wed Jan 11, 2012 9:11 pm
by [violet]
API Version bump!

API has gone from version 2 to 3. Differences:
  • Causes of Death is no longer included in the Standard Nation API
  • RMB messages are no longer optionally available via the Standard Region API

These items are still available via shards, or by specifying the older API version with v=2.
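
As a small illustration of pinning the older version, the sketch below adds v=2 to a Standard Nation API request. The endpoint address and the "nation" parameter name are assumptions; only the "v" parameter comes from the announcement above.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm against the API documentation.
API_ENDPOINT = "https://www.nationstates.net/cgi-bin/api.cgi"


def standard_nation_url(nation, version=None):
    """Build a Standard Nation API URL, optionally pinned to an API version."""
    params = {"nation": nation}
    if version is not None:
        params["v"] = version  # e.g. 2 to keep Causes of Death in the response
    return f"{API_ENDPOINT}?{urlencode(params)}"


print(standard_nation_url("testlandia", version=2))
```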

PostPosted: Mon Jan 16, 2012 8:30 pm
by Coffee and Crack
Would it be possible to get a shard that contains the regional happenings of a region? Currently I'm pinging each Feeder once every two minutes. I've been maintaining a database of all the nations that have joined NationStates and what region they have gone to, from 2008-12-04 until now, and would like to update it to use the API so I can get more accurate information.

Thanks.

PostPosted: Mon Jan 16, 2012 9:00 pm
by Frisbeeteria
Coffee and Crack wrote:I've been maintaining a database of all the nations that have joined NationStates and what region they have gone to

The "newnations" shard will give you the nation names with a whole lot less access, and other queries should get you their regions easily enough. The last 100 nations is usually about a two hour supply, though it can be up to six hours overnight and in slack times.

Newbies don't necessarily stay in their first regions all that long, and puppets get moved enough to be confusing. I'm not seeing why getting the initial region data would be valuable. Not saying [violet] won't give it - just wondering.

PostPosted: Mon Jan 16, 2012 9:06 pm
by Coffee and Crack
It gives good metrics on the success of the recruiting efforts of various regions without too much effort. It's easy for me to pull up data to see what regions have been doing well in recruitment and what have died down.

My latest interest in the data is that I'm working on a stock market for my region that bases stock prices on various regions around the globe and I would like to give my "investors" as much information as possible. If a region is trending up in recruitment, it might be a good time to buy. If they haven't been getting many new nations it might be wise to sell.

PostPosted: Mon Jan 16, 2012 10:25 pm
by [violet]
That sounds nifty. What format would make sense? Because the API wouldn't normally contain hyperlinks. E.g. it would turn an entry like:

"8 seconds ago: [flag] Testlandia departed this region for NationStates."


into something like:

<LOG time="123456789">testlandia departed this region for nationstates.</LOG>


... which seems like it might be difficult to parse.

PostPosted: Mon Jan 16, 2012 11:08 pm
by Coffee and Crack
Even in its current raw form, that would be a great help, similar to how messages in the current API are handled. The only thing I'd lose compared with its current format is the nation's prefix (which can provide some interesting data, but I'd be willing to count it as an acceptable loss, and it isn't in scope for my current endeavor).


Or maybe...

<ENTRY>
<TIMESTAMP>123456789</TIMESTAMP>
<NATION>Testlandia</NATION>
<ACTION>departed</ACTION>
<DATA>NationStates</DATA>
</ENTRY>
<ENTRY>
<TIMESTAMP>123456789</TIMESTAMP>
<NATION>Testlandia2</NATION>
<ACTION>cease to exist</ACTION>
<DATA></DATA>
</ENTRY>
<ENTRY>
<TIMESTAMP>123456789</TIMESTAMP>
<NATION>Foundlandia2</NATION>
<ACTION>WFEUpdate</ACTION>
<DATA></DATA>
</ENTRY>
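
To show how a structured format like the one proposed above could be consumed, here is a minimal parsing sketch. The <ENTRY> layout is only the suggestion from this post, not an existing shard, and the wrapping <HAPPENINGS> root element is a hypothetical name added so the fragment parses as a single document.

```python
import xml.etree.ElementTree as ET


def parse_entries(fragment):
    """Turn a run of proposed <ENTRY> elements into a list of dicts."""
    # Wrap the fragment in a hypothetical root so it is well-formed XML.
    root = ET.fromstring(f"<HAPPENINGS>{fragment}</HAPPENINGS>")
    return [
        {
            "timestamp": int(entry.findtext("TIMESTAMP")),
            "nation": entry.findtext("NATION"),
            "action": entry.findtext("ACTION"),
            "data": entry.findtext("DATA") or None,  # empty <DATA> becomes None
        }
        for entry in root.iter("ENTRY")
    ]


SAMPLE = """
<ENTRY>
<TIMESTAMP>123456789</TIMESTAMP>
<NATION>Testlandia</NATION>
<ACTION>departed</ACTION>
<DATA>NationStates</DATA>
</ENTRY>
<ENTRY>
<TIMESTAMP>123456789</TIMESTAMP>
<NATION>Foundlandia2</NATION>
<ACTION>WFEUpdate</ACTION>
<DATA></DATA>
</ENTRY>
"""

entries = parse_entries(SAMPLE)
print(entries)
```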