Friday, September 2, 2011

Reconciling File Times Between Unix and Windows

Not too long ago I made some enhancements to AnyBackup that required comparing hash keys generated (in part) from files' last modified times. Along the way I discovered an oddity about Windows that, despite years on the platform, I'd never known: file metadata has a resolution of 2 seconds. Don't believe me? Take a closer look. What this means is that the modified time (in seconds since the epoch) can never be odd.

It also means that when you copy a file from Linux (which tracks meta times accurately to 1/100 of a second) to Windows, the time is rounded up or down accordingly. The oddness that ensues is that when you look at the Windows copy of the file it will (sometimes) show a one-second difference compared to the Linux copy. (It all depends on rounding.)

My gut reaction was to just divide the times by 100, taking the two least significant digits out of play, but that lowers precision and doesn't guarantee that you'll avoid the problem. (Imagine your Unix modified time is 1699999999; in Windows this will become 1700000000 -- oh the imprecision!)
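
To make that failure concrete, here is a minimal sketch of the divide-by-100 idea applied to the two timestamps from the example. The helper name coarse_bucket is made up for illustration and is not part of AnyBackup.

```python
def coarse_bucket(mtime, divisor=100):
    """Drop the two least significant digits of a whole-second timestamp.

    Hypothetical helper, used only to illustrate the rejected approach.
    """
    return int(mtime) // divisor


unix_mtime = 1699999999      # modified time as seen on the Linux side
windows_mtime = 1700000000   # the same file's time after the copy to Windows

print(coarse_bucket(unix_mtime))     # 16999999
print(coarse_bucket(windows_mtime))  # 17000000
```

The two copies are one second apart, yet they still land in different buckets because the second happens to straddle a multiple of 100.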

When you get the modified time of a Linux file (say, through a Samba share), it'll invariably have two digits to the right of the decimal point. (At least when doing so via something like Python, not from a Windows property box.) If you convert it to an integer to remove these, the number will be rounded up or down accordingly. Instead, I decided to do something like the following:

  1. Round down (regardless of the two digits to the right of the decimal)
  2. Convert to integer
  3. Check if the number is odd (modulus 2)
  4. If it is odd, add 1
So going back to our initial example, say your Linux file comes back with a modified time of 1699999999.42:
  1. 1699999999.00 (Round down)
  2. 1699999999 (Convert to int)
  3. Not even (1699999999 % 2 = 1)
  4. 1700000000 (Add one)
  5. Voila, it matches the new Windows copy
(Yes, the conversion to an integer isn't really necessary, but we're dealing with whole numbers already anyway, so why not?)

The above steps ensure that you'll end up with a Windows-compatible view of the modified time. So what does this look like in Python code? See below:

 import math
 import os

 mtime = int(math.floor(os.path.getmtime(fileLocation)))  # whole seconds, rounded down
 if mtime % 2:  # odd: bump up to the next even second
   mtime += 1
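
For trying the steps out without a real file, the same logic can be wrapped in a small function that takes the raw timestamp directly. This is only a sketch: the name to_windows_mtime is an assumption for illustration, not something taken from AnyBackup.

```python
import math


def to_windows_mtime(mtime):
    """Round a modified time (seconds since the epoch) to an even whole second.

    Hypothetical helper: floors the value, then bumps odd seconds up by one.
    """
    whole = int(math.floor(mtime))  # steps 1 and 2: round down, convert to int
    if whole % 2:                   # step 3: is the whole second odd?
        whole += 1                  # step 4: move up to the next even second
    return whole


print(to_windows_mtime(1699999999.42))  # 1700000000
print(to_windows_mtime(1700000000.00))  # 1700000000
print(to_windows_mtime(1700000000.99))  # 1700000000
```

Note that an even second keeps its value no matter what the fractional part is, while an odd second always moves up. Whether that agrees with what Windows stores in every case is not something this sketch checks.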

AnyBackup 0.9.3 Released

A hasty follow-up to 0.9.2, 0.9.3 comes with some critical bug fixes.

Change list:

  • Issue 49 - Added additional test case for testing the skip list
  • Issue 62 - remote indexing ignoring skip list
  • Issue 63 - Improve remote index property interaction
  • Issue 64 - setName is accessed directly during indexing
  • Issue 65 - Modified rounding time differences
  • Issue 66 - UTF-16 encoded file names
  • Issue 67 - Refreshing multiple drives including remote drives only indexes remote drives if remote indexing is confirmed
  • Fix to reconcile Linux's < 1 second file time resolution and Windows's 2 second time resolution (i.e. modified times in Windows can only move in deltas of 2 seconds)