Python is one of the new weapons of choice at one of my current contracts.
I’m a big believer in removing as many chances for error in a project as
possible. That means for example where as some teams will have their designers
edit an Excel spreadsheet .xls file and then require them to also export to a
.csv file for the game, I’d prefer if they just save to .xml at all times.
They lose no features because Excel will save all the data, formatting, formulas
etc to XML and Windows will even directly open the .xml files to Excel when
double clicked so as far as the user is concerned nothing changes. In fact once
they create the file Excel will load and save to .XML so there is no need to
remember to save as .XML, it just happens. Automated things like that mean no
errors.
When using CSV files there is always the issue that the designer may or may not
have remembered to export so they change a value, save the file and forgetting
to export they spend a bunch of time trying to figure out why their changes are
not showing up in the game. Or, worse, when you save AS CSV in Excel, excel then
defaults to saving that file in CSV mode. So, if the designer makes any changes
after that and forgets to resave the file specifically in .xls mode then all
their changes will be lost. Sure they can reload the .csv file but any formulas,
formatting, colors, comments etc will all be lost.
If instead they use XML files then they don’t have to remember the step of
saving to CVS. They don’t have to worry about saving back to XLS as well and
they never have to worry about losing data since that issue will never come up.
I know some teams that write Excel macros to do this. That’s a huge waste of
time. Those macros generally have to be propogated to all their Excel files.
Those macros often have to be tweaked for different versions of Excel. It
becomes even more of a pain at localization time because the developers don’t
usually control the machines of the people editing the files in other countries
and macros etc are often disabled. I’ve seen no end of problems dealing with
Excel files that had macros in them so no, macros are not the solution.
It’s much simpler just to parse the XML. As you can see below it’s very little
code. You can easily add it to your tool pipeline or, looking at the example for
reference, just make your game load and parse the XML file directly, at least
during development.
Anyway, here is code to read an excel XML file (right click and pick Save Target As).
perl is more your thing.
I tried Pythonifying the code so it’s pretty simple to use. The following are a few simple examples
import excelreader
xl = excelreader.ParseExcelXMLFile("someExcelFile.xml")
print "Cell B4 in first sheet = ", xl[0][4][2]
print "Cell G5 in first sheet = ", xl[0]['G5']
print "names of sheets = ", xl.keys()
print "Cell A7 in Sheet1 = ", xl['Sheet1'][7][1] # assumes there is a sheet called "Sheet1"
print "all cells in all sheets"
for worksheet in xl:
for row in worksheet.keys():
for column in row.keys():
print row, column, worksheet[row][column]
The library also pulls out style info which is useful if you want to know the colors (say for generating corresponding materials in Maya) or the font names and sizes (say for auto-generated localized textures)
print xl[0].GetCellStyle(7,2) # style info for cell B7 in first sheet
print xl[0].GetCellStyle('D3') # style info for cell D4 in first sheet
This is extremely helpful – thanks! couple things:
- got a license for this? (or public domain?)
- i had to make a minor change, between lines 418 and 419:
self.mergeAcross = int(self.mergeAcross)
For whatever reason, this var was a unicode string instead of an int. Probably related to different versions of excel’s xml output, or platform (I’m on OS X 10.5, python 2.5, XML came from windows excel 2003)… whatever, minor.
I’d email you directly regarding a license, but couldn’t find contact info on your many sites. Feel free to email me.
Again, THANK YOU! You saved me hours of boilerplate code!
thanks for the fix
the license is New BSD. It’s in the comments at the top of the file.
Thanks for code but there some troubles with this script:
characters function of XML Handler returns only last part of cell if it has called for 2 or more times for one particular cell (e.g. you have “<” combination in XML cell (or ‘<‘ symbol in Excel row before import to XML)).
I’m not so good at Python but i suppose than following would help:
1.
Instead lines 416-417 (In class ExcelToSparseArrayHandler, func characters):
“self.currentRow[self.currentColumnNum]["content"]=content
self.inCell = False”
Insert : ”
try:
self.currentRow[self.currentColumnNum]["content"]+=content
except:
self.currentRow[self.currentColumnNum]["content"]=content
”
2.
Add: ”
def end_Cell(self):
self.inCell = False
“