Linux and DMAPI - Data Migration Application Interface
Written by Doug Bonnell | Posted on: 02.03.2007 at 06:22am | Section: Tutorials
There's a rather arcane extension to the Linux kernel that, in conjunction with the XFS filesystem, allows for 'hierarchical data storage' or 'data migration'. This means you can have a mounted filesystem where the files appear to be there, but the ones that haven't been accessed for a while are actually stored offline, say on other RAID systems, on network shares, tape or DVD libraries, etc.
The DMAPI extension allows the kernel to communicate a number of filesystem events to a program or programs in user space. DMAPI also supplies a number of 'behind the scenes' file manipulation functions to the user space programs. Let's do a walk through to see how this all works:
1. Start the 'event monitor' daemon. It sets up a communications session with the kernel's DMAPI layer and waits for filesystem events.
2. Mount an XFS filesystem with 'dmapi' as one of the mount options.
3. Copy a 4GB video file to the XFS filesystem. The monitor daemon will see a 'create event' and a number of 'write events'.
4. Now use another program to 'punch' the video file. This program uses one of the 'behind the scenes' functions to truncate the file's data. All of the original file attributes, such as size, creation timestamp, etc., remain intact on the XFS filesystem.
Each file has special DMAPI attributes called 'extents' that indicate how much of the file's data is actually present on disk and how much of the file remains 'virtual'. From the DMAPI perspective, each file is a virtual file whose size is held in a 64-bit value, so it can be up to 2^64 bytes.
5. After punching, if you do a 'du -h' on your XFS filesystem, the file no longer consumes 4GB. The minimum 'punched' filesize depends on the block size of the filesystem; for XFS it is typically 4KB.
6. If you perform a read operation on the punched file that stays within its punched size (say 4KB), then there is no need to recover the file. If you read beyond that punched size, then the daemon invokes a program to restore the file from the place it was archived. Your read will block until the file is restored.
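The effect described in steps 4 through 6 can be imitated on an ordinary Linux filesystem, without DMAPI, to see what 'punched' looks like from the outside. The sketch below is only an illustration under stated assumptions: it uses plain truncate calls rather than any DMAPI function, the 4KB block size is assumed, and the helper names (emulate_punch, disk_usage, needs_restore) are made up for this example. Unlike a real DMAPI punch, this imitation does change the file's modification time.

```python
import os
import tempfile

BLOCK = 4096  # assumed block size, matching the 4KB figure in step 5


def emulate_punch(path, keep_bytes=BLOCK):
    """Drop the data past keep_bytes but keep the apparent file size.

    Imitation only: a real migration tool would call into the DMAPI
    library instead of using truncate.
    """
    size = os.path.getsize(path)
    os.truncate(path, keep_bytes)  # release the blocks past keep_bytes
    os.truncate(path, size)        # grow back to the old size as a hole
    return size


def disk_usage(path):
    """Bytes actually allocated on disk, which is what 'du' looks at."""
    return os.stat(path).st_blocks * 512  # Linux counts 512-byte units


def needs_restore(offset, length, resident_bytes=BLOCK):
    """True if a read reaches past the part of the file still on disk."""
    return offset + length > resident_bytes


if __name__ == "__main__":
    # A small stand-in for the 4GB video file from the walkthrough.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(8 * 1024 * 1024))
    emulate_punch(f.name)
    print("apparent size:", os.path.getsize(f.name))
    print("allocated    :", disk_usage(f.name))
    print("read 0..4095 needs restore:", needs_restore(0, 4096))
    print("read at 1MB needs restore :", needs_restore(1024 * 1024, 4096))
    os.remove(f.name)
```

In a real DMAPI setup the decision made by needs_restore happens in the kernel, which raises a read event to the daemon; here it is just arithmetic to make the rule in step 6 concrete.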
By using a set of user space programs, it's fairly straightforward to determine the age of files, migrate them to offline storage to conserve online disk space, and restore the files if access is required at some future point. All of this is transparent to any applications that access the files under the filesystem with DMAPI support. The archives and migrated files can be tracked using MySQL or some other database.
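As a rough illustration of the 'determine the age of files' step, here is a minimal sketch of how a user space program might pick migration candidates by last access time. Nothing in it is DMAPI-specific; the function name find_stale_files and the 30-day threshold in the usage note are assumptions made for this example. Access times are only meaningful if the filesystem records them, so a 'noatime' mount would defeat this approach.

```python
import os
import stat
import time


def find_stale_files(root, max_idle_days, now=None):
    """List (path, size) for regular files not accessed in max_idle_days.

    Results are sorted largest first, since migrating big idle files
    frees the most online disk space.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat so symlinks are not followed
            if stat.S_ISREG(st.st_mode) and st.st_atime < cutoff:
                stale.append((path, st.st_size))
    stale.sort(key=lambda item: item[1], reverse=True)
    return stale
```

Calling find_stale_files("/mnt/xfs", 30) would return the files under that (hypothetical) mount point that have been idle for more than 30 days; each one could then be copied to the archive, recorded in the tracking database and punched.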