Thursday, April 14, 2011

Restoring Deleted Files in Greyhole And Terminology Explained

Greyhole has a lot of interesting terms whose meaning isn't immediately obvious. I also see a lot of people asking how they can restore deleted files in Greyhole. Well, let's get to it!


Update 7/20/2011: I submitted a change to my forked Greyhole GitHub repo, which gboudreau merged into the main Greyhole git repo. This change simplifies all the terminology, so I've updated the guide below to show the new terms alongside their old-world counterparts. These new terms will be live in 1.0.0! Everything written like (This) refers to Greyhole 1.0.0+.

First let's get a list of terms together.

  • Tombstone (Metadata File)
  • Attic (Trash)
  • Graveyard (Metadata Store)
None of these make much sense right away (well, they do if you understand the thought process behind them, but that can take time!), so let's go through and analyze each item. I'll put them through the layman's translator for you!

Tombstone (Metadata File)
Tombstone (Metadata File) -- "A file containing metadata about a file in your Greyhole pool."

  • Tombstones (Metadata files) are automatically created and stored for every file that is written to your Greyhole pool
  • Tombstones (Metadata files) are stored in a collection called a graveyard (metadata store) (we'll get to this later)
  • Every drive in your pool has its own collection of Tombstones (Metadata Files)
  • Tombstones (Metadata Files) mirror the structure of your share
    • You have a file in your share stored at /path/to/sharename/folder/file
    • Let's say Greyhole moves this file to drive sdb1 which is mounted at /mnt/hdd1
    • The Tombstone (Metadata File) for file will be created at /mnt/hdd1/gh/.gh_graveyard/sharename/folder/file (/mnt/hdd1/gh/.gh_metastore/sharename/folder/file)
  • If you have a share set to save multiple copies of a file, there will be a Tombstone (Metadata File) created on each drive that contains a copy
  • If you have only one copy of files per share, you will actually have two Tombstones (Metadata Files).
    • One will be on the drive that contains the file
    • The other will be in a backup graveyard (metadata store) -- this is so you know what files have gone missing if a drive dies!
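
The path mapping described above can be sketched in a few lines of Python. This is only an illustration of the directory layout from this post, not Greyhole's own code; the function name metadata_path and its parameters are made up for the example.

```python
from pathlib import PurePosixPath

# Directory names taken from this post: the old name and the 1.0.0+ name.
OLD_STORE = ".gh_graveyard"
NEW_STORE = ".gh_metastore"


def metadata_path(pool_drive, share_name, relative_file, old_names=False):
    """Return where the Tombstone (Metadata File) for a share file would live.

    pool_drive    -- mount point of the pool drive, e.g. /mnt/hdd1
    share_name    -- name of the Greyhole share
    relative_file -- file path relative to the share root, e.g. folder/file
    old_names     -- True for pre-1.0.0 naming (.gh_graveyard)
    """
    store = OLD_STORE if old_names else NEW_STORE
    return PurePosixPath(pool_drive) / "gh" / store / share_name / relative_file


# The example from the list above, in new and old naming.
print(metadata_path("/mnt/hdd1", "sharename", "folder/file"))
print(metadata_path("/mnt/hdd1", "sharename", "folder/file", old_names=True))
```

Swapping the store name is the only difference between the two naming schemes; the rest of the path is the same.
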
Graveyard (Metadata Store)
Graveyard (Metadata Store) -- "A storage pool drive's collection of tombstones (metadata files)"

  • Every drive in your Greyhole pool has a Graveyard (Metadata Store).
  • The Graveyard's (Metadata Store's) location is /path/to/pool/drive/gh/.gh_graveyard (/path/to/pool/drive/gh/.gh_metastore)
  • The directory structure inside .gh_graveyard (.gh_metastore) mirrors that of your share. The only difference is that the files it contains are not your files but metadata about them: the Tombstones (Metadata Files). You'll notice that they are small and contain just a little bit of text (see the definition above for more about Tombstones (Metadata Files))
    • There may also be a .gh_graveyard_backup (.gh_metastore_backup) folder on pool drives, which contains Tombstones (Metadata Files) for files stored on other drives when a share keeps only one copy of each file
Attic (Trash)
Attic (Trash) -- "Greyhole's recycling bin"

  • Whenever Greyhole gets into a situation where it would delete a file, Greyhole moves the file into the Attic (Trash) instead.
    • If you do a delete, Greyhole moves the file to the Attic (Trash).
      • Note: If you have a program that creates temporary files when opening a file (like Word or Vim, etc.) and then deletes those temporary files, you'll end up with files in your Attic (Trash) that you don't necessarily recognize. (See below for how to access files in your Attic (Trash).)
    • If you have >1 copies of files per share and you write to a file, the out-of-date copies (those that weren't modified) are sent to the Attic (Trash).
  • Each drive has its own Attic (Trash) folder.
    • The Attic (Trash) folder is at /path/to/pool/drive/gh/.gh_attic (/path/to/pool/drive/gh/.gh_trash)
      • The folder structure for an Attic (Trash), like a Graveyard (Metadata Store), mirrors that of your share, but, unlike a Graveyard (Metadata Store), the files inside an Attic (Trash) are real files.
  • To get to the files in the Attic (Trash) you can either browse to the path above for each of your Greyhole drives or you can set up a Greyhole Recycle Bin Share
    • You can create a special share with one of the following names in Samba: 'Greyhole Attic', 'Greyhole Trash', 'Greyhole Recycle Bin'
    • Create the above share like you would any other Greyhole share (that is, use the vfs objects and dfree command settings)
    • When Greyhole sees this in your Samba config, it will create symlinks, in the share path you specify, to every file that lands in the Attics (Trashes) after the share is created. Older files in the Attic (Trash) must be accessed via the paths above.
      • This won't take effect until after the Greyhole service has been restarted, so remember to do this after making changes to your Samba or Greyhole configs!
    • From this share you can copy your deleted files back to the pool or delete them.
      • Files deleted from the Attic / Recycle Bin share are deleted permanently.
  • Having deleted files move to the Attic (Trash) is the default behavior. If you do not want this to happen, you can change the delete_moves_to_attic (delete_moves_to_trash) property in greyhole.conf (either globally or per share).
    • If you set this property to "no", Greyhole will permanently delete all files; they will never be moved to the Attic (Trash).
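
Browsing each drive's Attic (Trash) by hand gets tedious with several drives, so here is a small Python sketch that lists what is sitting in each one and, optionally, copies a file back out. It relies only on the folder layout described above. The drive list, the function names list_trash and restore, and the .gh_trash default are assumptions for illustration; none of this is part of Greyhole.

```python
import shutil
from pathlib import Path


def list_trash(pool_drives, trash_dir=".gh_trash"):
    """Yield (drive, share-relative path) for every real file in each Trash.

    pool_drives -- mount points of your pool drives (hypothetical values)
    trash_dir   -- ".gh_trash" on 1.0.0+, ".gh_attic" on older versions
    """
    for drive in pool_drives:
        trash_root = Path(drive) / "gh" / trash_dir
        if not trash_root.is_dir():
            continue  # this drive has no Trash folder yet
        for path in sorted(trash_root.rglob("*")):
            if path.is_file():
                # The Trash mirrors the share layout: sharename/folder/file
                yield drive, path.relative_to(trash_root)


def restore(drive, relative_path, destination, trash_dir=".gh_trash"):
    """Copy one trashed file into `destination`, leaving the Trash untouched."""
    source = Path(drive) / "gh" / trash_dir / relative_path
    target = Path(destination) / Path(relative_path).name
    shutil.copy2(source, target)
    return target


if __name__ == "__main__":
    # Example mount points; replace them with your own pool drives.
    for drive, rel in list_trash(["/mnt/hdd1", "/mnt/hdd2"]):
        print(f"{drive}: {rel}")
```

Because restore copies rather than moves, the file stays in the Attic (Trash) until you remove it yourself. For destination, I would point it at the share as you normally access it, so the restored copy is written to the pool like any other file.
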



4 comments:

  1. Hey! Great article.
    But I'm pretty sure this is wrong: "When Greyhole sees this in your Samba config it will create (during your next fsck [...]"
    fsck will not scan the attics to create symlinks in a new Attic share.
    When you create an Attic share, only files deleted after will appear in it.
    Again, good one.
    Cheers.

  2. Updated, I probably should have checked the code before making that assumption. :)

  3. Does greyhole run as root?

    I notice examples typically have a gh directory under the mounted drive directory. Is this necessary for greyhole to work?

  4. Yes, Greyhole runs as root. That directory structure is required for Greyhole to work. The idea is that you can use the mount for things other than Greyhole if you desire (I do); Greyhole only cares about files in the gh folder.
