Tuesday, February 22, 2011

AnyBackup Beta 2 -- Now with 100% more Python!

Following up on Beta 1, I've now finished Beta 2 of AnyBackup! What's different? Well, I ported the entire project from Perl to Python -- not because I think there's any great functional benefit to one over the other, but because it's something I needed to pick up for work anyway. The main benefit of this release is that pretty much all the standing issues have been resolved; see below. I'll be looking into setting up a Google Code page for AnyBackup in the near future, since it seems like a much cleaner approach than updating my blog + MediaFire account all the time.

Note: I work on this program in my spare time, primarily to solve my own backup needs; I release it for others to use since I figure others may have similar needs that AnyBackup can fulfill. That said, this is beta software, and even if it weren't, I could make no guarantee that it will always back up 100% of your data or that no data loss will occur. There may be bugs that I don't know about or haven't hit. So basically: buyer beware, use at your own risk, and I can't be held responsible for any issues that arise. If you do hit issues, don't hesitate to report them!

Changes:
  • Now written in Python instead of Perl (see explanation above)
  • File comparisons now made against file sizes (in KB) and directory paths in addition to file names
    • See the sketch after this list, and the issues below, for more details
  • File object revised to store directory root and directory path separately
    • This allowed me to get rid of some ugly regular expression hacks going on in the backup and indexing methods
  • When backing up or restoring files, AnyBackup now maintains a pending size change per drive (the running total of file deletions and additions) and makes sure any additional adds will fit; this keeps a drive from running out of space mid-run and causing copies to fail
  • Icon added to show whether a drive is currently connected or not (+ and green for connected, x and red for disconnected)
  • Drive background color in the drive list now changes based on free space (>15% free: green, 10-15%: yellow, <10%: red)
  • Drive free/total space added to drive list (in GB)
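To make the new comparison concrete, here's a minimal Python sketch of the idea (illustrative only -- not the actual AnyBackup code): a file's identity is its directory path relative to the drive root, plus its name, plus its size rounded down to whole KB.

import os

def file_key(root, full_path):
    # Identity = (directory relative to drive root, file name, size in KB).
    # Matching on all three catches renames, moves, and most content
    # changes; since size is rounded to whole KB, sub-KB edits can slip by.
    rel_dir = os.path.dirname(os.path.relpath(full_path, root))
    name = os.path.basename(full_path)
    size_kb = os.path.getsize(full_path) // 1024
    return (rel_dir, name, size_kb)

A content file and a backup file then count as "the same" only if their keys are equal.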
Known Issues:

  • If a backup volume is full but there is new content to back up that prefers this full volume (due to the backup logic based on parent folders), AnyBackup will still choose the full volume as the destination and then throw an error that there isn't enough space on it.
    • Follow up -- my solution for now: a drive is still preferred based on the parent-folder logic, but AnyBackup builds a pending write total and checks whether all the files to be added will actually fit; if not, it grabs the drive with the most free space and backs the files up there instead (see the sketch after this list). In the next release I'll add a property file that lets you turn this sticky-files feature on or off. (For now it's on -- with it off, files will always be placed on the drive with the most free space at the time of choosing.)
  • For some reason when running a new backup, AnyBackup will often leave a few old files (that is, files that are no longer found on your content drives -- meaning they've been deleted / renamed / written to) -- this normally has no negative repercussions, but I'll figure out what's causing this eventually. 
  • Some people are having issues launching the exe-packaged version of AnyBackup. I'm not sure what's causing the issue; I've been using Cava Packager to create the exes + installers, and it runs perfectly on my test machines. I'll put up an archive of the raw Perl later for those that have issues -- the downside being that you'll need a valid Perl install with all the packages I use.
    • Since this is now Python packaged with py2exe, I'm assuming this will no longer be an issue.
  • File comparisons were made only against file names, so if you changed/wrote to a file, or had duplicate file names in multiple directories, this could lead to inconsistencies such as backing up only one file instead of several, or not backing up the updated version of a file.
    • I augmented this naive comparison; it now checks file sizes and directory names in addition to file names. The only possible complication I see at this point is that the size comparison is done in KB rather than bytes. I don't think this will cause issues, but feel free to let me know if it bites you on small files with minute changes -- that's the only situation where I really see it mattering. If I get reports I'll look into changing this.
  • Minor issue with the most-free-drive logic: it needs to be updated to incorporate the new pending write total, otherwise it may grab not the drive with the most free space, but the drive with the most free space before any operations have taken place (i.e. if we have 10 GB pending to a drive which has 8 GB more free than any other volume, it should no longer be considered the most free drive -- but right now, it will be)
  • The CLI interface for the app, backupWrapper.py, isn't guaranteed to work yet, and its features are definitely not all built out (restore definitely does not work, and backup will not remove old files from backup volumes). What does work I haven't tested, so buyer beware when using it in this release. In the next release I will finish building out its features and also package it as an exe alongside the GUI.
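For the curious, here's a rough Python sketch of the selection logic described above (names are illustrative, not the real source; drives are assumed to expose a free_bytes attribute): prefer the sticky drive that already holds the parent folder, but count pending writes against free space before committing to any drive.

def pick_drive(drives, pending, preferred=None, file_size=0):
    # pending maps a drive to the bytes queued for it but not yet written.
    def effective_free(drive):
        return drive.free_bytes - pending.get(drive, 0)

    # Sticky-files behavior: keep new files next to their parent folder,
    # but only if they'll actually fit once pending writes are counted.
    if preferred is not None and effective_free(preferred) >= file_size:
        choice = preferred
    else:
        # Fall back to the most-free drive -- measured by *effective*
        # free space, which is the fix the last issue above calls for.
        choice = max(drives, key=effective_free)
        if effective_free(choice) < file_size:
            raise RuntimeError("no backup volume has enough free space")

    pending[choice] = pending.get(choice, 0) + file_size
    return choice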
I'm only going to add one new screenshot to show off the (very small) UI changes to the main window. Despite the complete port of the app to Python, it's still using wxWidgets, which is pretty much identical across languages.


Download:
(As before, this code is released under the GPL license!)
Update: Google Code project created! You can find it here.

    Thursday, February 17, 2011

    Dynamically Convert A Raid5 Array to Raid6

    Transferred from my old blog:
    If you have mdadm version 3.1.1 or above, it is now capable of changing the raid level of an array. If you cannot find a copy of the latest version for your distro, you can compile from source via mdadm's git repo, git://neil.brown.name/mdadm
    The below assumes that you have one spare drive ready to add to your array (/dev/sdb), that /dev/md0 is the raid5 array you would like to move to raid6, and that /dev/md0 starts with 4 raid devices.
    Use the below commands:
    mdadm --add /dev/md0 /dev/sdb
    mdadm --grow /dev/md0 --level=6 --raid-devices=5
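    The level change kicks off a reshape that runs in the background and can take many hours; you can keep an eye on its progress with:
    cat /proc/mdstat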
    Once this completes, you should have a fully functioning raid6 array. Enjoy your dual parity.
    Further, you can also change the chunk size dynamically while you're at it. The default chunk size of mdadm (which I believe they plan to raise in future versions) is a paltry 64k; you'd be much better off with something in the 256-512k range. To change the chunk size of an array, use the following:
    mdadm --grow /dev/md0 --chunk=512
    I've seen several references now using --chunk-size, so it's possible that in future versions this may be the correct flag instead of --chunk -- just something to be aware of. Also, upping your chunk size to 512 may not be possible depending on the total size of your array. It's possible that mdadm will spit out an error stating that the total array size is not divisible by 512, in which case you'll have to settle for something smaller (i.e. try 256 or 128).

    Tuesday, February 15, 2011

    wxPerl ListCtrl

    When writing AnyBackup I got up close and personal with the beast that is wxPerl and its almost ridiculous reliance on its parent wxWidgets documentation. One of the most frustrating steps I encountered was trying to wrap my head around just how ListCtrl works. At first I started with the much simpler (and user-friendly) ListBox, which does exactly what you think it does: it lets you display a list of strings, and you can attach data objects to those strings which are accessible via click events. Very handy when you want to attach an object ref to each string. ListCtrl... well, it isn't so nice.

    It's not extremely apparent upon first glance, and if you don't read the documentation carefully you may, as I did, stumble on blindly expecting to be able to hack in ListCtrl in ListBox's place. But this is simply not the case. To start with, you can't attach an object reference to an item; you can only assign a Long id. This makes it possible to store data, but you've now added overhead and additional data objects to keep track of in your code -- and if you're using multiple ListCtrl objects, as I did, forget about it!

    Abstraction is the key here, so I created a wrapper for ListCtrl which makes it act much like ListBox, but with all the pretty options that come along with it, such as icons, colors, etc.

    The best way to learn is by example, I feel, so here you'll find the wrapper class that I wrote for AnyBackup.

    Now this wrapper has some parts that are quite specific to my application, so don't expect to be able to just drop it in sans modification to your program, but it can give you a good idea of what you need to do to make ListCtrl be what you want!

    Let's walk through the main functions and explain their... well, function. (A bare-bones wxPython sketch of the same idea follows the list.)
    •  GetSelectedData - this function does exactly what you think it'd do: it looks for the currently selected row, gets the Long id for that row, and finally grabs the previously set object reference from the storage hash. Voila -- you've transparently gotten an object ref for your selected row without ever dealing with the long id!
      • Note: This example wrapper specifies a ListCtrl which only accepts single selections, you could change this option, but then this function would have to be modified to return an array of object references.
    • populateDrives - this function is a great example of what you'd need to modify. It takes an array of object references and grabs their names for display strings in the ListCtrl and then stashes the object refs themselves in the storage hash, mapping them with the long id.
    • DeleteAllItems -- this behaves just like ListCtrl's DeleteAllItems: it deletes all the items from your contained ListCtrl object and clears out the storage hash and support variables in your easyListCtrl object
    • InsertItem - this is where a lot of the magic happens: it takes a string (item) and an object ref (data), maps the data to an id, and then adds the string to the ListCtrl object
    • GetData - this allows you to pass an arbitrary item (string) and get its stored data back -- should that item exist, anyway.
    • adjust - this function exists purely for adjusting the width of a ListCtrl object after deleting or adding items. The problem is that ListCtrl doesn't take care of this for you! If you just add items and leave it to its own devices, you'll end up with a ListCtrl whose strangely narrow columns cut off your strings, which is most likely not what you were going for. This function automatically resizes the column to the length of the longest row, so no data will be cut off, and the ListCtrl object will be wrapped, if needed, in a scrollbox.
    • getListCtrl - this allows you to grab the wrapped ListCtrl object, important when it comes time to add the object to a frame and/or sizer
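    Here's that sketch -- since wxWidgets is near-identical across languages, wxPython syntax serves fine. The wrapper below is illustrative, not my actual class:

    import wx

    class EasyListCtrl:
        def __init__(self, parent):
            self.list = wx.ListCtrl(parent, style=wx.LC_REPORT | wx.LC_SINGLE_SEL)
            self.list.InsertColumn(0, "Name")
            self.data = {}      # long id -> object ref
            self.next_id = 0

        def InsertItem(self, item, data):
            # Map a fresh long id to the object ref, then attach the id
            # to the new row; the caller never sees the id at all.
            row = self.list.InsertStringItem(self.list.GetItemCount(), item)
            self.list.SetItemData(row, self.next_id)
            self.data[self.next_id] = data
            self.next_id += 1

        def GetSelectedData(self):
            # Selected row -> long id -> stored object ref.
            row = self.list.GetFirstSelected()
            if row == -1:
                return None
            return self.data[self.list.GetItemData(row)]

        def DeleteAllItems(self):
            self.list.DeleteAllItems()
            self.data.clear()
            self.next_id = 0

        def adjust(self):
            # ListCtrl won't resize columns on its own; size to longest row.
            self.list.SetColumnWidth(0, wx.LIST_AUTOSIZE)

        def getListCtrl(self):
            return self.list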

    Mount a CIFS share via fstab

    Have a CIFS share you want to mount on every boot? Tired of manually mounting? Don't like the hack-around of a startup script doing this for you? It's pretty easy to add this to your fstab file.
    //<ip.address>/<share_name>       /mount/point       cifs       credentials=/path/to/.credentials_file,_netdev,iocharset=utf8,uid=<numeric user id>,gid=<numeric group id>,file_mode=0750,dir_mode=0750,rw     0 0
    So let's go over some explanations for the above fstab line, at least for the parts that aren't self explanatory.
    • _netdev -- this option tells the system not to try mounting this share until the network is up -- probably a good idea, since a CIFS share is, naturally, mounted over the network!
    • iocharset -- the right value might differ depending on the language your files are named in; play with this if you use a different character set (i.e. Cyrillic)
    • uid and gid -- setting these determines who appears as the owner when you browse to the mount point; if you don't set them, root will appear to own the mount point and all the files in it, which is most likely not what you want! To find your user and group ids, type "id <username>" in the terminal and it will print the numeric id for your username and all the groups you're a member of
    • file_mode and dir_mode -- these determine the permissions of the mount point locally, and they work just like any other permissions on a unix system: the owner permissions are relative to the uid you set, and the group permissions are relative to the gid you set. This means that if you use the above example (5 for the group permissions) and then, as a user other than uid who isn't a member of gid, you try to write to the share, you'll get permission denied.
      • Note: even if you set permissions here to give you write access but the share source doesn't give you write permissions, you still won't be able to write to the share!
    • credentials -- this is a nice way to avoid putting your username and password directly in your fstab! Instead of using a credentials file you could put username=SomeUser,password=SomePassword right into the options and it would work, but that would expose your password to anyone with viewing rights to the fstab file. That may not matter in the scheme of things for a private server, but it's still poor practice. (A note on securing this file follows the list.)
      • The contents of a credentials file are very simple; consider the below
        • username=SomeUser
        • password=SomePassword
      • Note: CIFS can be very sensitive to extra whitespace, so make sure each line ends with the end of your username/password and not a space, and make sure that your file ends on the password line and not a blank line. This is a good place to check if your mount is failing and you're sure your login information is correct. Also, do not use quotation marks; they will cause login failures.
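    One last tip, standard practice rather than anything CIFS-specific: since the credentials file holds a plaintext password, lock its permissions down so only its owner can read it, and use mount -a to test your new fstab entry without rebooting:
    chmod 600 /path/to/.credentials_file
    sudo mount -a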

    Upgrading a Fedora Install

    After making a right mess of things installing some new packages, I decided to do an offline upgrade of my Fedora 11 server to Fedora 14, and I hit a few snags along the way that I'll detail.

    First off, as you may or may not know, you can only upgrade two versions ahead. So to get from 11 to 14, I had to go first from 11 to 13 and then from 13 to 14. This isn't a big deal, but it will add some time to your upgrade process! Just make sure you have both DVDs handy.

    An interesting issue I ran into is that despite upgrading, I still had many, many fc11 packages installed, and what's more, they were preventing me from upgrading or installing new fc14 packages via yum! There are some commands that make this easy to clean up, though. Note: if you're upgrading one release ahead, i.e. fc13 -> fc14, it can be completely normal to see fc13 packages, since some packages don't get updated right away.

    That said, there is a handy little command that lets you find orphaned packages installed on your system, that is, packages which are no longer in any repository your system looks at. (Note: This also includes any programs which you got from outside your repositories -- Handbrake for example, or Subsonic.) This command is:
    • package-cleanup --orphans
    If the output only lists a few items, you can clean up manually; I chose to remove packages with the following command:
    • rpm -e --nodeps <package name>
    -e tells rpm to remove the package, and --nodeps means to remove only the package, not the other packages that depend on it.

    I had over 100 fc11 packages leftover, so I used this little bash loop to get rid of them:
    • for i in `sudo package-cleanup --orphans | grep fc11`; do sudo rpm -e --nodeps $i; done
    The grep fc11 above strips out the beginning lines (loading plugins, extra repos, etc.), which are not valid package names, as well as packages which are release independent (again, things like Handbrake, Subsonic, etc.).

    Even after doing the above, when installing packages via yum I was getting error messages (though my installs would still go through) about certain packages missing dependencies. The only solution I could find was to patiently look up the libraries (most of the errors were about .so files) on Google, find out which packages they belonged to, and then run a yum install for said package. It took about 5-10 packages before the errors finally cleared up.
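    In hindsight, yum can do that library-to-package lookup for you; yum provides (a.k.a. whatprovides) maps a file back to the package that owns it. A hedged example, with a made-up library name:
    • yum provides "*/libexample.so.1"
    Then a plain yum install of whatever package it reports should clear the error.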

    Something else worth noting is that I hit some major weirdness in my Perl install. I had to upgrade all my Perl packages manually via yum after the upgrade. The main issue, however, was that namespace, namespace::clean, and namespace::autoclean were not installed, and this led to all sorts of issues attempting to run scripts which depended explicitly on namespace, and also attempting to install and compile packages via CPAN.

    So be prepared for a little work when attempting a Fedora upgrade (especially from three releases back!), but on the whole I'd say it was less effort than a new install from scratch, as it saved me from having to worry about saving off all my custom software, scripts, and configs and then restoring them afterward.

    Monday, February 14, 2011

    Minesweeper Applet

    From my old blog:

    Almost four years ago now I had to create a version of Minesweeper using Swing for a software programming course in college. For fun later I converted it to an applet and hosted it up on my website. What you see below is the final product. The obvious drawbacks being it isn't at all configurable (either the size or the difficulty -- number of mines) but it was enough to get an A, so it was enough for me. :)

    Greyhole Improvements

    Since I made the move from Raid6 to Greyhole I decided to get involved in the project's development. Up until this point it's all been developed through the tireless efforts of one man, Guillaume Boudreau. I already made a commit which made it into the latest release of the project, 0.9.0, allowing you to find orphaned files on storage volumes. The main benefit is that you can add prepopulated volumes to the pool without bulk quantities of files making the long journey from remote share, to landing zone, and finally to one or more storage volumes.

    Next up, now that I know my way around the system quite a bit better, I did a bit of an overhaul of the fsck framework of Greyhole. It now supports being passed storage volume paths as well as share paths, so you can pinpoint exactly what you want to fsck. This is all part of Issue 50 on the Google Code page. I hope to keep cranking out improvements to the project -- it's the first time I've officially committed code!

    Greyhole

    I recently made the jump from raid6 (like raid5, but with two parity blocks per stripe instead of one) to Greyhole (a JBOD pooling application which uses Samba as an access point). Why leave the mature bosom of parity-striped raid? Well, for my situation, it couldn't have made more sense.

    What does Greyhole buy you?
    • Instantly add new volumes to your storage pool
      • When I wanted to add a drive to my array, it could take 24-40 hours! And that was just the reshape; I then had to expand the file system, etc. It was a big time sink, and I'd find myself putting it off as long as possible just because I didn't want to deal with it.
    • Recycle Bin
      • If you've ever accidentally (or retrospectively regretfully) deleted some data, and lamented the lack of a recycle bin in linux or on mounted network shares, you'll probably appreciate the extra layer of security that Greyhole provides. 
      • Whenever a delete command is issued via the samba interface, Greyhole takes this file and moves it into its "Attic" where it remains until you either empty the "Attic" or delete the file again from the Recycle Bin share.
    • Independently formatted drives
      • If one drive dies, only the data that was on that drive is inaccessible
      • If you remove a drive from your greyhole pool, it is a completely normal, accessible drive which can be mounted and read from as easily as any other drive
    • You can "check out" drives from the storage pool
      • You can notify the Greyhole daemon that a drive is going to be missing, and it will wait on recreating file copies (if you're using Greyhole's file persistence)
    • Selective data redundancy
      • You can set up, per share, how many copies of your files you want persisted (a sample config line follows this list).
        • So if you say you want two copies of your personal documents, Greyhole will make sure every one of your documents has two copies on two different volumes. If one drive dies, another still has your data, and Greyhole will automatically persist a new copy of each affected file to replace the missing one.
    • Single storage point
      • Just like raid5/6 and LVM your storage point appears to be one big, convenient location, but with a couple perks
        • If you're like me and keep your backups separate from Greyhole, you don't need to persist multiple copies of files, and since it's not raid, you don't lose any space to parity. This means you get 100% of your drive space in a single storage point!
        • LVM can also offer you a single storage point, but it combines your drives into a virtual volume and spreads a single file system across them -- if one drive dies in the LVM volume, your file system is lost, or at least in a bad way.
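    As promised above, a sample of the selective redundancy setting: if memory serves, it lives in /etc/greyhole.conf and looks something like the below (the share name is an example -- check the sample conf that ships with Greyhole):
    num_copies[Documents] = 2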
    What doesn't it do?
    • No performance gains.
      • Unlike raid5, you'll get no read boosts: since each file exists wholly on an individual volume (and isn't striped across multiple disks), you get the bandwidth one drive can offer, not three or more.
      • Unlike raid0, you won't be writing to or reading from multiple drives either, so again you're left with the read/write bandwidth of a single drive
    • Recreating symlinks / missing file copies after a volume dies can take time
      • Unlike raid1, you don't have an immediate, up-to-date copy waiting to be swapped into place. If you're using multiple file copies and a volume dies, it might be a little while before the symlinks to those files and the extra copies are restored and accessible, so it isn't good for a situation that calls for data to always be available regardless of the circumstances
    • Greyhole does not work with the native system fs
      • To capture file operations, Greyhole is completely reliant on Samba; if you access your files outside of Samba, Greyhole has no way of knowing what file operations have taken place
      • The workaround is to mount your Samba share locally on the linux machine, but it is definitely a limitation
    In the end, Greyhole is a clever way to simulate a single storage pool: it grabs file operations through Samba and persists those operations to file copies on a number of pooled storage volumes, creating the illusion of a central share point by symlinking to files across the various pool volumes. Despite being a relatively new program (it's been out for download for around a year now), it's a fairly stable product with little chance of data loss.

    Update 7/20/2011: I've been using Greyhole now for close to half a year and I'm more than satisfied with it. What actually drew me to Greyhole was simply the flexibility. No more degraded arrays, no more 24+ hour reshaping / recovering. I've lost no data, I've had no issues with data disappearing due to my sata controller resetting ports as it likes to do, and I've added drives to my pool with a quick and casual ease which raid5 could only dream of. I still manage my backups with my own AnyBackup program separate from Greyhole's built-in file redundancy, but that is due to a different use case on my part, not a lack of functionality on Greyhole's.

    Destroying an mdadm Array

    Or how to banish the ghost of raid5's past.
    I recently made the jump from raid6 to greyhole. I'll cover why in another post, but the primary point here is that I took my array offline using the following command:
    • mdadm -S /dev/md0 (where md0 can be whatever your raid device's name is)
    And then I formatted each of the raid drives to be an independently formatted disk with an ext4 partition. I then added them to my greyhole pool and life was grand. That is, until I rebooted and, lo and behold, found my lurid past haunting me: md0 had risen from the dead and was claiming most of the drives back as spares! Not only that, but it attempted to make them raid members again and hid the previously created partitions! (By this I mean I only saw /dev/sdb without the accompanying partition /dev/sdb1.) My data is 100% backed up, so I was not afraid of data loss, but it was rather annoying to see, potentially, 75% of my newly transferred data swallowed by a jilted raid array. Fear not! By opening each disk in fdisk and writing the partition table, I was able to retrieve my data partitions intact.

    Literally all I did above to recover my raid-altered drives was
    • fdisk /dev/sdX
    • w
    I then checked to see if the raid superblocks were still present on my previous raid volumes even though they'd been formatted, and to my surprise they were! You can view the superblock information of any volume by running the following command:
    • mdadm -E /dev/sdX
    If there is no superblock you'll just get a line spit out that says "mdadm: No md superblock detected on /dev/sdX"; otherwise you'll get a multiline printout of the drive's superblock information.

    You can vanquish these peskily pervasive blocks by issuing the command:
    • mdadm --zero-superblock /dev/sdX

    One other thing to check is that /etc/mdadm.conf is empty, as information about your raid device can reside here as well. After taking care of the superblocks and your mdadm.conf file you can be relatively sure that your array has been laid to rest for good.

    CIFS share mount command hanging

    I ran into an issue today when attempting to mount a Samba share via CIFS on a linux client: the mount command was taking two or three minutes to complete. Here's what the fix was in my situation. Apparently CIFS will first attempt to make a connection via port 445; in my iptables config I have that port dropped, and eventually CIFS falls back to port 139. If you specify port=139 in the mount options, it goes to the correct port the first time around and completes immediately.
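    For reference, a one-off mount with that option looks like the below (server, share, and paths are placeholders):
    mount -t cifs //server/share /mnt/share -o port=139,credentials=/path/to/.credentials_file
    The same port=139 option can go in the options column of an fstab entry like the one from my earlier CIFS post.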

    Sunday, February 13, 2011

    Restore drives that have been erroneously marked as failed in a Raid 5/6 array

    Transferred from my old blog.
    Scenario:

    You have a raid5/6 array (/dev/md0) in which one or more drives have been marked as failed. For instance, I had a motherboard problem in my server recently which would cause my esata controller to spontaneously reset ports and knock 3-4 drives off my array at a time, putting the array in a failed state. All is lost, yes? No! Since in my scenario the array immediately fails upon having more than two drives disappear (this is a raid6 array), no data has changed on the actual file system, and you can use the following command to force-reassemble the array. If possible, mdadm will up the event count on the "failed" drive(s) and clear the faulty flag. NOTE: Be careful with this. If a given drive was knocked out of your array before a modification to the array (i.e. a write or a reshape), this can cause massive, non-recoverable data corruption. Only do this if you are SURE that your array contents have not been changed/written to between the time the drive(s) were removed from your array and now.

    If you have filled out mdadm.conf with your array and corresponding drives:

    mdadm -Af /dev/md0

    If you do not have mdadm.conf filled out and rely on mdadm auto assembling your array upon start up, use the following:

    mdadm -Af /dev/md0 <devices>

    Where <devices> is replaced by the drives that make up your array, for example:

    mdadm -Af /dev/md0 /dev/sd[b-d] /dev/sd[h-o]

    Silicon Image 3132 mdadm Array Problems

    Transferred from my old blog.
    I ran into an issue this weekend where my raid6 array absolutely refused to rebuild (I was adding in a new spare since another drive had legitimately died); it kept marking multiple drives as failed and setting the array as degraded. Looking at /var/log/messages I saw it was peppered with sections of errors like the below:

    Nov 12 20:48:41 articles kernel: ata9.03: status: { DRDY DF }
    Nov 12 20:48:41 articles kernel: ata9.03: cmd 60/40:d0:10:9c:22/00:00:02:00:00/40 tag 26 ncq 32768 in
    Nov 12 20:48:41 articles kernel: res 60/40:d0:10:9c:22/00:00:02:00:00/40 Emask 0x9 (media error)
    Nov 12 20:48:41 articles kernel: ata9.03: status: { DRDY DF }
    Nov 12 20:48:41 articles kernel: ata9.03: error: { UNC }
    Nov 12 20:48:41 articles kernel: ata9.03: cmd 60/c0:e8:50:93:22/00:00:02:00:00/40 tag 29 ncq 98304 in
    Nov 12 20:48:41 articles kernel: res 55/35:00:00:00:00/00:00:00:d0:55/00 Emask 0x81 (invalid argument)
    Nov 12 20:48:41 articles kernel: ata9.03: status: { DRDY ERR }
    Nov 12 20:48:41 articles kernel: ata9.03: error: { IDNF ABRT }
    Nov 12 20:48:41 articles kernel: ata9.04: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
    Nov 12 20:48:41 articles kernel: ata9.05: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
    Nov 12 20:48:41 articles kernel: ata9.15: hard resetting link
    Nov 12 20:48:41 articles kernel: ata9: controller in dubious state, performing PORT_RST

    This seems to be a common problem with port multipliers on Linux, or at the very least a common problem with the Silicon Image controllers on Linux -- and there seem to be precious few controllers outside of Silicon Image. (Side note -- I'm running the latest firmware available for the controller card.) I still see these error messages in my system logs under heavy read/write conditions, but they no longer result in drives being marked as failed and getting kicked out of the array.

    Here's what I did to get mdadm to stop marking drives as failed after the Silicon Image controller performed port resets. Most modern SATA drives support a feature called NCQ (Native Command Queuing), which takes up to 32 read/write commands in queue and optimizes their order based on the physical location of the read head, minimizing the movement needed to complete all requests and thus (ideally) speeding up performance and extending drive lifetime. This only leads to increased performance under certain (usually server-oriented) load conditions. I set the NCQ depth to 1 for all the drives in my array, which effectively disables NCQ -- i.e. there is only one item in queue at any given time, so no optimization takes place. Again, this did not get rid of the port resets, but it did stop the drives from getting marked as failed by mdadm.

    Here's my theory why. NCQ queues up to 32 read/write commands. Whenever the sil24 driver decides the controller is in a bad state (which it will do when it has three or more devices with outstanding commands, I believe -- I read mention of this in a post by one of the devs), it performs a port reset, which results in the drives powering off and on again. When this happens they drop their queue (this is speculation -- but I would be very surprised if the queue survived a power cycle), which means up to 32 pending reads/writes go bye-bye. In that case data isn't written to or read from the drive as requested, which could end up causing mdadm to assume the drive is faulty. This is only a theory, but regardless of its correctness, this action did fix my array problems.

    To set the NCQ depth to 1:
    echo 1 > /sys/block/sdX/device/queue_depth

    To turn NCQ back on:
    echo 31 > /sys/block/sdX/device/queue_depth

    Where X above is the device letter. (Note you need root privileges to run the above commands, so become root or use sudo.)
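    If you have a pile of drives in the array, a quick loop in a root shell saves some typing (the device names below are examples -- substitute your own):
    for d in sdb sdc sdd sde; do echo 1 > /sys/block/${d}/device/queue_depth; done
    Note that because of the redirect, sticking sudo in front of the echo won't work as expected; run it from an actual root shell or wrap it in sudo sh -c '...'.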

    My server is using:

    * Fedora 11 -- latest available kernel (aging, yes, but not yet worth the effort to upgrade)
    * Silicon Image 3132 based PCIe eSATA controller
    * Sans Digital TR8M 8-bay eSATA tower (using a port multiplier to convert 8 SATA ports to 2 eSATA ports)

    AnyBackup Beta 1

    I've been working on this side project for a little while now; it's called AnyBackup (as the title may have indicated). It's written in Perl with wxWidgets as the GUI toolkit. Despite being written in Perl, it is by no means platform independent -- it's very much Windows-only for the time being. I built this program because there just doesn't seem to be a good solution out there for backing up multiple volumes to multiple volumes, especially to dynamic drives -- that is, drives that may or may not be connected and may or may not retain the same drive letter, a very real concern when your backup drives are hooked up via USB! AnyBackup identifies your drives via a combination of the volume name (a volume must be named before being added; AnyBackup will refuse to add unnamed volumes) and the volume serial number. Bottom line: it's fairly stable and mostly works. I have not had it eat any data, I use it to keep my content drives in sync with my backup drives, and I've even used it to restore some missing data. There are a few areas which need more intelligent handling, as you can still get into a few situations with full drives that require manual intervention to resolve.

    Features
    • Backup any number / sized volumes to any number / sized volumes
    • Restore missing files or backup new files / delete old files
    • Search through files on any of your content and backup drives
    • An easy-to-navigate GUI for browsing the indexed content of your drives
    • Backup volumes don't all have to be connected at once
      • The idea here is that if you're connecting your backup drives via usb, you can connect them one at a time, AnyBackup will prompt you for the specific volume it needs (this is ideal if you're using a USB/eSATA dock to connect your backup drives)
    • Allows you to specify a list of file extensions you wish to back up. If you only wanted to back up mp3s, for instance, you could go to Edit -> Edit Lists (or Ctrl+E), select valid extensions from the drop down, and add "mp3" to the list. To index all files, you could add ".*" to the list of valid extensions.
    • Allows you to specify regular expressions for files to avoid indexing
    • Will semi-intelligently backup content and attempt to cluster content together
      • By this I mean: if you had a folder for your favorite band, say The Beatles, with sub folders for all the albums (i.e. The White Album, Abbey Road, etc.) which then contain audio files, then when backing up each audio file AnyBackup will look to see if a backup volume already contains its parent folder (i.e. The White Album) or its parent's parent (The Beatles); if a backup volume already contains either, it will choose that volume as the destination. (A rough sketch of this lookup follows the list.)
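    A rough Python sketch of that lookup (illustrative only; backup volumes are assumed to expose a contains_dir check against their index):

    import os

    def preferred_volume(volumes, rel_dir):
        # rel_dir is the file's directory relative to its content drive,
        # e.g. "The Beatles/The White Album".
        parent = rel_dir
        grandparent = os.path.dirname(rel_dir)
        for vol in volumes:
            if vol.contains_dir(parent) or vol.contains_dir(grandparent):
                return vol   # cluster new files with their siblings
        return None          # caller falls back to the normal drive choice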
    Known Issues
    • If a backup volume is full but there is new content to back up that prefers this full volume (due to the above backup logic based on parent folders), AnyBackup will still choose the full volume as the destination and then throw an error that there isn't enough space on it.
      • The workaround is obvious: move files off the full volume to a less full volume, then refresh both backup volumes so AnyBackup is aware of the changes.
      • I'm leaning towards a solution that takes the parent folder, moves it to a different volume with sufficient space, and then copies the new content there too (obviously after checking that the new destination drive has enough space for the parent folder size + the new content size)
    • For some reason when running a new backup, AnyBackup will often leave a few old files (that is, files that are no longer found on your content drives -- meaning they've been deleted / renamed / written to) -- this normally has no negative repercussions, but I'll figure out what's causing this eventually. 
    • If you have multiple network drives mounted to the same share but at different levels (i.e. X: is mounted to \\server\share\dir1 and Z: is mounted to \\server\share\dir2), AnyBackup has no way of differentiating between the two, as they'll have the same volume name and volume serial number; it will refuse to add X when Z is already added in the above example. There are no plans to change this -- my suggestion is to create separate shares so that, at the very least, the volume name will differ between the two.
    • Some people are having issues launching the exe-packaged version of AnyBackup. I'm not sure what's causing the issue; I've been using Cava Packager to create the exes + installers, and it runs perfectly on my test machines. I'll put up an archive of the raw Perl later for those that have issues -- the downside being that you'll need a valid Perl install with all the packages I use.
    • AnyBackup will create folders on backup drives even if there are no valid files inside them -- basically, any folder present on the content drives will make it to one of the backup volumes, even if it contains no valid file types. I actually prefer this, but I may add an option to toggle the behavior later.
    Enough talk, pictures now!

    Without further ado, I give you the download information.

    Download:
    AnyBackup Beta 1 -- Win32 Packaged EXE 
    AnyBackup Beta 1 -- Perl Source 
    (Please note: when using the Perl source above, you'll need to install several additional Perl packages: Win32::API, Win32::DriveInfo, Cava::Packager -- this one needs to be installed via the Cava Packager program -- and finally wxPerl, which can take a while to install!)

    Also note, I have not had a chance to add the GPL license to the above source, but please note that ALL code posted here is released under the GPL license! The GPL license can be viewed here for those that are interested!
