I've written a script that changes values for a certain date in an excel sheet. When I create a new excel file using copy
, it copies everything over correctly except for the year portion of the date. For example, it will go from 4/5/2012 to 4/5/2008. It appears that all dates go back four years. Here's the code:
def exceledit():
#open excel sheet
import xlrd, xlwt, xlutils
import datetime
from xlutils.copy import copy
print 'Opening excel sheet...'
book = xlrd.open_workbook('test.xls', on_demand=True, formatting_info=True)
print 'Creating and editing new excel sheet...'
wbook = copy(book)
print 'Done creating new excel sheet'
sh = book.sheet_by_index(0)
#iterate through dates in excel sheet
for colnum in range(sh.ncols):
date = sh.cell_value(3, colnum+4)
#if xlrd finds a date
if date:
#grab date data
year, month, day, hour, minute, second = xlrd.xldate_as_tuple(date\
, book.datemode)
#if dates are within the month currently being edited
if month == 04:
#format excel date information to work with parkh dict
format = str(month) + "/" + str(day) + "/" + str(year)
print 'Editing ' + format
#clear cells to eliminate old information
wbook.get_sheet(0).write(6, colnum+6, "")
wbook.get_sheet(0).write(5, colnum+6, "")
wbook.get_sheet(0).write(7, colnum+6, "")
#iterate through hour segments for that day
for x in parkh[format]:
#if regular hours, insert in "HOURS" row
if x[0] == 'Park Hours':
wbook.get_sheet(0).write(6, colnum+6, x[1])
#if extra magic hours, insert in respective row
if x[0] == 'Extra Magic Hours':
#insert in morning row
if int(x[1][0:1]) in range(2,9):
wbook.get_sheet(0).write(5, colnum+6, x[1])
#insert in evening row
else:
wbook.get_sheet(0).write(7, colnum+6, x[1])
if month == 05:
break
print 'Done editing. Now saving...'
wbook.save('new.xls')
print 'new.xls saved'
Any idea why it might be changing the year? I haven't seen anyone else have the same issue elsewhere.
You evidently have an input Excel file that is using the 1904 date system. Your immediate problem is that xlutils
doesn't support copying these files correctly. Fortunately the fix is a one-liner and you can even do it yourself in your script after you have done the copy:
wbook = copy(book)
wbook.dates_1904 = book.datemode
This works for copying dates because xlwt
supports writing the record that specifies what datemode is in use.
WARNING Any new date values that you write into the file by using Worksheet.write() will NOT be written correctly, as xlwt
unfortunately ignores the dates_1904
setting when converting datetime.date
and datetime.datetime
objects to the Excel magic floats.
I have written and tested a fix which replaces the whole body of the Row.__excel_date_dt
method. It will be committed to the new xlwt
repository on github
soonish. In the meantime here is the code if you are in dire need:
def __excel_date_dt(self, date):
adj = False
if isinstance(date, dt.date):
if self.__parent_wb.dates_1904:
epoch_tuple = (1904, 1, 1)
else:
epoch_tuple = (1899, 12, 31)
adj = True
if isinstance(date, dt.datetime):
epoch = dt.datetime(*epoch_tuple)
else:
epoch = dt.date(*epoch_tuple)
else: # it's a datetime.time instance
epoch = dt.datetime(1900, 1, 1)
date = dt.datetime.combine(epoch, date)
delta = date - epoch
xldate = delta.days + delta.seconds / 86400.0
# Add a day for Excel's missing leap day in 1900
if adj and xldate > 59:
xldate += 1
return xldate
WARNING Trying to convert your file to the 1900 system by opening it in Excel, unticking the 1904 config item and saving the file DOESN'T WORK -- dates will be 4 years out of whack.
What appears to work properly is this:
open the file in Excel, save it as XML Spreadsheet 2003 (*.xml)
...
this format records dates in text format e.g.
1999-12-31T23:59:59.999
open the XML file in a text editor, find the line that reads
<Date1904/>
... yes, the XML is human-readable straight out of the
box ... and delete it, then save the xml file
open the changed XML file in Excel, all your data and formatting
should be preserved, the only difference should be that the pesky
1904 box is not ticked anymore. You can then save it as an XLS file
I have experienced this with Excel workbooks, not even using Python. When the same file gets passed around between Windows and Mac, weird things can happen with the dates (though this is not typical). Excel for Windows defaults to the so-called "1900" date system, while Excel for Mac defaults to the "1904" date system. (These are the starting years for the respective systems.)
Though the xlrd documentation and source code strongly recommend you stick with the datemode
value that is encoded in the workbook (as you have done), I think it's worth a try explicitly giving the "other" mode to xldate_as_tuple
to see if it fixes your problem:
year, month, day, hour, minute, second = xlrd.xldate_as_tuple(date,
1 - book.datemode)
datemode
will be 0
for 1900 mode, 1
for 1904 mode; 1 - datemode
flips this.
Incidentally, you can break lines without the backslash as long as you are breaking after a comma in a parenthesized expression.