Wednesday, March 5, 2014

Daily Blog #255: RHEL Forensics Part 4: More on mlocate.db

Hello Reader,
    Today we have something special, a guest post from Hal Pomeranz. One of the best parts of sharing research and information is when others come back and extend their work for the benefit of us all. Today Hal Pomeranz has kindly not only shared back his work in extending the idea of recovering mlocate database's from unallocated space, he's written a tool to do so! I'll be testing Hal's tool on my test image and post those results tomorrow. In the mean time enjoy this very well written post!

If you want to contact Hal go to http://deer-run.com/~hal if you want train with Hal he is an excellent instructor with SANS http://www.sans.org/instructors/hal-pomeranz. Hal is an awesome forensicator and community resource who also is willing to 1099 to those of you like myself that run labs that are looking to extend our capabilities.

You can download the new tool has has made here: https://mega.co.nz/#!TsoTlSjR!BMz7BQqOhCeLGumWu51kaw4v_VFxd6UT3lyqu-ljUdc

With all that said, here is today's guest post from Hal. 


Hal Pomeranz, Deer Run Associates

One of the nice things about our little DFIR community is how researchers build off of each other’s work.  I put together a tool for parsing mlocate.db files for a case I was working on.  David Cowen and I had a conversation about it on his Forensic Lunch. David had some questions about whether we could find previous copies of the mlocate.db in unallocated blocks, and wrote several blog posts on the subject here on HECF blog.  David’s work prompted me to do a little work of my own, and he was kind enough to let me share my findings on his blog.

Hunting mlocate.db Files

In Part 3 of the series, David suggested using block group information to find mlocate.db data in disk blocks.  My thought was, since the mlocate.db files have such a clear start of file signature, we could use the sigfind tool from the Sleuthkit to find mlocate.db files more quickly.
# sigfind -b 4096 006D6C6F /dev/mapper/RD-var
Block size: 4096  Offset: 0  Signature: 6D6C6F
Block: 100736 (-)
Block: 141568 (+40832)
Block: 183808 (+42240)
Block: 232192 (+48384)
Block: 269312 (+37120)
Here I’m running sigfind against my own /var partition.  006D6C6F” is “mlo” in hex, the first four bytes of a mlocate.db file (sigfind only allows a max of 4-byte signatures).  I’m telling sigfind to look for this signature at the start of each 4K block (“-b 4096”).  As you can see, sigfind actually located five different candidate blocks. 
What was interesting to me was the blocks are in multiple different block groups in the file system.  As David suggested in Part 3, the EXT file system normally tries to place files in the same block group as their parent directory.  But when the block group fills up, the file can be placed elsewhere on disk.
I wanted to make sure that these were all legitimate hits and not false-positives.  So I used a couple of other Sleuthkit tools and a little Command-Line Kung Fu:
# for b in 100736 141568 183808 232192 269312; do
    echo ===== $b;
    blkstat /dev/mapper/RD-var $b | grep Allocated;
    blkcat -h /dev/mapper/RD-var $b | head -1;
done
===== 100736
Not Allocated
0    006d6c6f 63617465 00000127 00010000 .mlo cate ...' ....
===== 141568
Not Allocated
0    006d6c6f 63617465 00000127 00010000 .mlo cate ...' ....
===== 183808
Not Allocated
0    006d6c6f 63617465 00000127 00010000 .mlo cate ...' ....
===== 232192
Not Allocated
0    006d6c6f 63617465 00000127 00010000 .mlo cate ...' ....
===== 269312
Allocated
0    006d6c6f 63617465 00000127 00010000 .mlo cate ...' ....
The blkcat output is showing us that these all look like mlocate.db files.  Since block 269312 is “Allocated”, that must be the current mlocate.db file, while the others are previous copies we may be able to recover.

Options for Recovering Deleted Files

Let’s review our options for recovering deleted data in older Linux EXT file systems:
·         For EXT2, use ifind from the Sleuthkit to find the inode that points to the unallocated block that the mlocate.db signature sits in.  Then use icat to recover the file by inode number.

·         For EXT3, the block pointer information in the inode gets zeroed out.  My frib tool uses metadata in the indirect blocks of the file to recover the data (actually, we could use frib in EXT2 as well).
Unfortunately, I’m dealing with an EXT4 file system here, and things are much harder.  Like EXT3, much of the EXT4 inode gets zeroed out when the file is unlinked.  But EXT4 uses extents for addressing blocks, so we don’t have the indirect block metadata to leverage with a tool like frib. You’re left with trying to “carve” the blocks out of unallocated.
However, our carving is going to run into a snag pretty quickly.  Take a look at the istat output from the current mlocate.db file:
# ls -i /var/lib/mlocate/mlocate.db
566 /var/lib/mlocate/mlocate.db
# istat /dev/mapper/RD-var 566
inode: 566
Allocated
[…]

Direct Blocks:
269312 269313 269314 269315 269316 269317 269318 269319
[…]
270328 270329 270330 270331 270332 270333 270334 270335
291840 291841 291842 291843 291844 291845 291846 291847
[…]
The first 1024 blocks of the file are contiguous from block 269312 through 270335. But then there’s a clear gap and we’re starting a new extent with block 291840.  If we had to carve this file, we’d have a significant problem because the file is fragmented.  And unfortunately, all of the mlocate.db files I’ve examined in my testing contained multiple extents.
We could certainly get useful information from the start of the file:
# blkcat /dev/mapper/RD-var 232192 1024 >var-232192-1024
# mlocate-time var-232192-1024
/etc 2014-03-01 09:47:12
/etc/.java 2013-05-09 16:15:05
/etc/.java/.systemPrefs    2013-05-09 16:15:05
/etc/ConsoleKit 2013-05-09 16:15:05
/etc/ConsoleKit/run-seat.d 2013-05-09 16:15:05
/etc/ConsoleKit/run-session.d   2013-05-09 16:15:05
/etc/ConsoleKit/seats.d    2014-01-31 11:11:35
[…]
/home/hal/SANS/framework/data/msfweb/vendor/rails/actionpack/test/fixtures/addresses/.svn/prop-base 2013-05-09 18:52:29
/home/hal/SANS/framework/data/msfweb/vendor/rails/actionpack/test/fixtures/addresses/.svn/props     2013-05-09 18:52:29
/home/hal/SANS/framework/data/msfweb/vendor/rails/actionpack/test/fixtures/addresses/.svn/text-base 2013-05-09 18:52:30
/home/hal/SANS/framework/data/msfweb/vendor/rails/0.556:avahi-daemon
0.564:bluetooth
0.584:ufw
0.586:smbd
[…]
I use blkcat here to dump out 1024 blocks from one of the unallocated mlocate.db signatures, and then hit it with my mlocate-time tool.  Things go great for quite a while, but then we clearly run off the end of the extent and into some unrelated data.  I was able to pull back almost 6,000 individual file entries from this chunk of data, but the current mlocate.db file on my system has over 35,000 entries.

Looking for Another Signature

The fragmentation issue got me wondering if there was some signature I could use to find the other fragments of the file elsewhere on disk.  Here’s a relevant quote from the mlocate.db(5) manual page (emphasis mine):
The  rest  of  the  file until EOF describes directories and their contents.  Each directory starts with a header: 8 bytes for directory time (seconds) in  big  endian, 4 bytes for directory time (nanoseconds) in big endian (0 if unknown, less than 1,000,000,000),  4  bytes  padding, and  a  NUL-terminated  path name of the the directory.  Directory contents, a sequence of file entries sorted by name, follow.
Examining several mlocate.db files, the 4 bytes of padding are nulls, and the directory pathname begins with a slash (“/”).  So “000000002f” is a 5-byte signature we could use to look for directory entries in mlocate.db file fragments.
sigfind doesn’t help us here, because it wants to look for a signature at the start of a block or at a specific block offset.  Since I needed to look for the signature anywhere in a block, I threw together a quick and dirty Perl script for finding our signature in a disk image.  I haven’t done a significant amount of testing, but early indications are:
·         False positives can be a problem—a series of nulls followed by a slash is unfortunately common in data in a typical Linux file system.  In order to combat this, I’ve added a threshold value that requires a block to have a minimum of 6 instances of our signature before being reported as a possible mlocate.db chunk (the threshold value is configurable on the command-line).

·         False-negatives are also an issue.  If a directory contains a large number of files (think /usr/lib), then the directory contents may span multiple blocks.  That means one or two blocks with no instances of our “start of directory entry” signature, even though those blocks actually are part of a mlocate.db fragment.
That being said, the script does do a reasonable job of finding groups of blocks that are part of fragmented mlocate.db files.  With a little manual analyst intervention, it appears that it would be possible to reconstitute a deleted mlocate.db from an EXT4 file system, assuming none of the original blocks had been overwritten.
Frankly, our signature could be a little better too.  It’s not just “four nulls followed by a slash”, it’s “four nulls followed by a Linux path specification”.  Using a regular expression for this would likely reduce the false-positives problem plaguing the current script.

Wrapping Up (For Now)

I’ve had fun getting a deeper understanding of mlocate.db files and some of the challenges in trying to recover this artifact from unallocated.  But I still see some open questions. Can we improve the fidelity of our file signature to eliminate false-positives?  And given that there are chunks of at least five different mlocate.db files scattered around this file system, would we be able to put the correct chunks back together to recover the original file(s)?  Perhaps David or somebody else in the community would like to tackle these issues.