Reading Excel XML in Python

2007-08-19

Python is one of the new weapons of choice at one of my current contracts. I'm a big believer in removing as many chances for error in a project as possible. That means for example where as some teams will have their designers edit an Excel spreadsheet .xls file and then require them to also export to a .csv file for the game, I'd prefer if they just save to .xml at all times.

They lose no features because Excel will save all the data, formatting, formulas etc to XML and Windows will even directly open the .xml files to Excel when double clicked so as far as the user is concerned nothing changes. In fact once they create the file Excel will load and save to .XML so there is no need to remember to save as .XML, it just happens. Automated things like that mean no errors.

When using CSV files there is always the issue that the designer may or may not have remembered to export so they change a value, save the file and forgetting to export they spend a bunch of time trying to figure out why their changes are not showing up in the game. Or, worse, when you save AS CSV in Excel, excel then defaults to saving that file in CSV mode. So, if the designer makes any changes after that and forgets to resave the file specifically in .xls mode then all their changes will be lost. Sure they can reload the .csv file but any formulas, formatting, colors, comments etc will all be lost.

If instead they use XML files then they don't have to remember the step of saving to CVS. They don't have to worry about saving back to XLS as well and they never have to worry about losing data since that issue will never come up.

I know some teams that write Excel macros to do this. That's a huge waste of time. Those macros generally have to be propogated to all their Excel files. Those macros often have to be tweaked for different versions of Excel. It becomes even more of a pain at localization time because the developers don't usually control the machines of the people editing the files in other countries and macros etc are often disabled. I've seen no end of problems dealing with Excel files that had macros in them so no, macros are not the solution.

It's much simpler just to parse the XML. As you can see below it's very little code. You can easily add it to your tool pipeline or, looking at the example for reference, just make your game load and parse the XML file directly, at least during development.

Anyway, here is code to read an excel XML file (right click and pick Save Target As). I posted some perl to do the similar things before if perl is more your thing.

I tried Pythonifying the code so it's pretty simple to use. The following are a few simple examples

xl = excelreader.ParseExcelXMLFile("someExcelFile.xml")

print "Cell B4 in first sheet = ", xl[0][4][2]
print "Cell G5 in first sheet = ", xl[0]['G5']
print "names of sheets        = ", xl.keys()
print "Cell A7 in Sheet1      = ", xl['Sheet1'][7][1] # assumes there is a sheet called "Sheet1"
print "all cells in all sheets"
for worksheet in xl:
    for row in worksheet.keys():
       for column in row.keys():
          print row, column, worksheet[row][column]

The library also pulls out style info which is useful if you want to know the colors (say for generating corresponding materials in Maya) or the font names and sizes (say for auto−generated localized textures)

print xl[0].GetCellStyle(7,2)  # style info for cell B7 in first sheet
print xl[0].GetCellStyle('D3') # style info for cell D4 in first sheet
Comments
An Open Letter to EA - Open Source EASTL!
Professional Game Development Guidelines