[violet] wrote: Now let's talk about Daily XML Dumps. I've done some profiling, and these are a large part of why our daily updates are taking so long to complete. Now we have API shards, I'm wondering how far we can strip back the Dumps.
Could regular users of these Daily Dumps please let me know which data they actually use from the XML, and which they don't?
Now we have shards, can the Dumps stop including (for example) Causes of Death?
I make occasional use of the daily dump. Items of interest to me in the nation dump are as follows (pasted directly from code):
'<NAME>','<CIVILRIGHTS>','<ECONOMY>','<POLITICALFREEDOM>',
'<REGION>','<POPULATION>','<TAX>','<MAJORINDUSTRY>',
'<COMMERCE>','<CATEGORY>','<ANIMAL>','<CURRENCY>',
'<GOVTPRIORITY>','<DEFENCE>','<ADMINISTRATION>',
'<CAPITAL>','<LEADER>','<RELIGION>','<TYPE>',
'<NATION>', '<FREEDOMSCORES>'
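For anyone wondering how a list like this gets used, here is a rough sketch in Python (not my actual code) of pulling just these fields out of the nation dump with a streaming parser. It assumes the dump is one big file of <NATION> records, that some of the wanted tags sit inside containers such as <GOVT> and <FREEDOMSCORES>, and that the file is gzipped under a name like nations.xml.gz. The function names and the sample values are made up for illustration.

```python
import gzip
import io
import xml.etree.ElementTree as ET

# Tags of interest, copied from the list above with the angle brackets dropped.
# NATION is left out because it is treated as the record element itself.
WANTED = {
    "NAME", "CIVILRIGHTS", "ECONOMY", "POLITICALFREEDOM",
    "REGION", "POPULATION", "TAX", "MAJORINDUSTRY",
    "COMMERCE", "CATEGORY", "ANIMAL", "CURRENCY",
    "GOVTPRIORITY", "DEFENCE", "ADMINISTRATION",
    "CAPITAL", "LEADER", "RELIGION", "TYPE",
    "FREEDOMSCORES",
}


def extract(nation, wanted=WANTED):
    """Return {path: text} for wanted leaf elements under one NATION element.

    Keys are slash-separated paths relative to NATION, so a tag that appears
    both at the top level and inside a container does not overwrite itself.
    """
    out = {}

    def walk(elem, path, keep):
        for child in elem:
            child_path = f"{path}/{child.tag}" if path else child.tag
            # A wanted container (e.g. FREEDOMSCORES) keeps all of its leaves.
            child_keep = keep or child.tag in wanted
            if len(child) > 0:
                walk(child, child_path, child_keep)
            elif child_keep:
                out[child_path] = (child.text or "").strip()

    walk(nation, "", False)
    return out


def iter_nations(source):
    """Yield one dict per NATION; source is a filename or binary file object."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "NATION":
            yield extract(elem)
            elem.clear()  # drop the record's children to keep memory down


# Invented sample data in the assumed layout, for demonstration only.
SAMPLE = b"""<NATIONS>
<NATION>
<NAME>Testlandia</NAME>
<REGION>Testregionia</REGION>
<MOTTO>ignored</MOTTO>
<FREEDOMSCORES><CIVILRIGHTS>50</CIVILRIGHTS></FREEDOMSCORES>
<GOVT><DEFENCE>10.5</DEFENCE><WELFARE>3.0</WELFARE></GOVT>
</NATION>
</NATIONS>"""

if __name__ == "__main__":
    for row in iter_nations(io.BytesIO(SAMPLE)):
        print(row)
    # For a real file (hypothetical name):
    # with gzip.open("nations.xml.gz", "rb") as fh:
    #     for row in iter_nations(fh):
    #         ...
```

Everything outside the wanted set is skipped as it streams past, which is the reason the size of the rest of the dump matters little to a consumer like this.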
I have in the past used the cause of death information, but I have no expectation of ever doing so again.
Of the items not in this list, I can foresee wanting to make use of other <GOVT> subentries and <PUBLICSECTOR>.
I'm assuming that the region dump is not a performance issue, but if you want a list of what I use there, let me know.
(Of course, the daily dump wasn't created for folks like me - I presume that the calculators have a bigger list.)
- nsh -