Wednesday, February 25, 2015

Automating DFIR - How to series on programming libtsk with python Part 8

Hello Reader,
            Welcome to part 8 of the Automating DFIR series. If this is the post you're starting with... Stop! You need to read all the prior parts or you will be really, really lost. There is a lot going on here, and the better you understand it the more sense it will all make, allowing you to don your wizard robe and hat of DFIR Wizardry! Catch up on the parts you missed below:

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image

Now that we have that out of the way for the new people, let's get back into the good stuff. One of the most common things we need to do, other than pulling a file out of an image, is to hash a file or files contained within an image. Let's start simply by just adding the hashing library that Python provides and generating some hashes for our $MFTs. Then we'll work on making our script a bit more useful with some more command line options and much better intelligence.

So to start with, as you may be expecting by now, we need to import a library. In Python 2.7 the hashing library that comes with Python is called hashlib. So the first thing we need to do is import it as follows:

import hashlib

You can read the documentation on hashlib here: https://docs.python.org/2/library/hashlib.html#module-hashlib

Next we need to create a hashlib object to generate, store and print our hashes. Hashlib supports many different types of hashing and as of this post it supports md5, sha1, sha224, sha256, sha384, and sha512. When we create our hashlib object we do so while also choosing which of the hashing algorithms our object will use. For each hash algorithm you want to use you'll need a separate object. Let's start with MD5 by making the object and storing it in a variable called md5hash:

md5hash = hashlib.md5()

If we wanted to make a sha1 hashlib object we would just change the function at the end we are calling to sha1() and store it in a new variable, as follows:

sha1hash = hashlib.sha1()

If we then want to generate a hash we call the update function built into our hashlib object. When you call the update function you need to give it something to hash. You can give it a string or a variable that contains data to be hashed. In our program I am giving the object the contents of the $MFT we read from the image, whose data is stored in the variable filedata. So the whole call looks like this:

md5hash.update(filedata)

Similarly the call to generate the sha1 hash looks like this:

sha1hash.update(filedata)

We've now generated md5 and sha1 hashes for the $MFT we've extracted from the image. Now we need to get the hashes printed so our user can see them. To do this we use the hexdigest() method built into our hashlib object. The hexdigest method takes no arguments; it just returns, as a string of hex digits, the hash of all the data we have passed to update so far. In this version of DFIR Wizard! we are just going to print out the hash value to our command prompt window. It looks like this:

print "MD5 Hash",md5hash.hexdigest()
print "SHA1 Hash",sha1hash.hexdigest()
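
As a quick sanity check of the calls above, here is a minimal sketch that hashes a small known value instead of the $MFT. The variable sample_data is a stand-in I've made up for filedata, and the digests in the comments are the well known MD5 and SHA1 test values for the three bytes abc. I've used print() with a single argument so the sketch runs the same under Python 2.7 and Python 3.

```python
import hashlib

# sample_data is a hypothetical stand-in for the filedata read from the image
sample_data = b"abc"

md5hash = hashlib.md5()
md5hash.update(sample_data)

sha1hash = hashlib.sha1()
sha1hash.update(sample_data)

# hexdigest() returns the hash as a hex string; printing it is up to us
md5_hex = md5hash.hexdigest()
sha1_hex = sha1hash.hexdigest()
print("MD5 Hash %s" % md5_hex)    # 900150983cd24fb0d6963f7d28e17f72
print("SHA1 Hash %s" % sha1_hex)  # a9993e364706816aba3e25717850c26c9cd0d89d
```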

Taken all together the program looks like this:
#!/usr/bin/python
# Sample program or step 6 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
import pyewf
import argparse
import hashlib     
class ewf_Img_Info(pytsk3.Img_Info):
  def __init__(self, ewf_handle):
    self._ewf_handle = ewf_handle
    super(ewf_Img_Info, self).__init__(
        url="", type=pytsk3.TSK_IMG_TYPE_EXTERNAL)
  def close(self):
    self._ewf_handle.close()
  def read(self, offset, size):
    self._ewf_handle.seek(offset)
    return self._ewf_handle.read(size)
  def get_size(self):
    return self._ewf_handle.get_media_size()
argparser = argparse.ArgumentParser(description='Extract the $MFT from all of the NTFS partitions of an E01')
argparser.add_argument(
        '-i', '--image',
        dest='imagefile',
        action="store",
        type=str,
        default=None,
        required=True,
        help='E01 to extract from'
    )
args = argparser.parse_args()
filenames = pyewf.glob(args.imagefile)
ewf_handle = pyewf.handle()
ewf_handle.open(filenames)
imagehandle = ewf_Img_Info(ewf_handle)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
  if 'NTFS' in partition.desc:
    filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
    fileobject = filesystemObject.open("/$MFT")
    print "File Inode:",fileobject.info.meta.addr
    print "File Name:",fileobject.info.name.name
    print "File Creation Time:",datetime.datetime.fromtimestamp(fileobject.info.meta.crtime).strftime('%Y-%m-%d %H:%M:%S')
    outFileName = str(partition.addr)+fileobject.info.name.name
    print outFileName
    # 'wb' keeps Windows from translating line endings in the binary $MFT data
    outfile = open(outFileName, 'wb')
    filedata = fileobject.read_random(0,fileobject.info.meta.size)
    md5hash = hashlib.md5()
    md5hash.update(filedata)
    print "MD5 Hash",md5hash.hexdigest()
    sha1hash = hashlib.sha1()
    sha1hash.update(filedata)
    print "SHA1 Hash",sha1hash.hexdigest()
    outfile.write(filedata)
    outfile.close()

Pretty easy right? When I run it I get the following:
C:\Users\dave\Desktop>python dfirwizard-v7.py -i SSFCC-Level5.E01
0 Primary Table (#0) 0s(0) 1
1 Unallocated 0s(0) 8064
2 NTFS (0x07) 8064s(4128768) 61759616
File Inode: 0
File Name: $MFT
File Creation Time: 2014-09-12 12:20:52
2$MFT
MD5 Hash d91df0fb48c36f77a7a9c65870761beb
SHA1 Hash 26213c341902111a68464020965e5fb50108730c
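
If the 8064s(4128768) part of the NTFS line looks cryptic, it is just the partition's starting sector followed by the same position in bytes. The short sketch below redoes that arithmetic with the values copied from the output above; SECTOR_SIZE is a name I've made up for the 512 that is hard coded in our script.

```python
SECTOR_SIZE = 512  # bytes per sector, the value hard coded in our script

# Starting sector copied from the NTFS line of the output above
start_sector = 8064

# This is the same value we hand to pytsk3.FS_Info as the offset
start_offset = start_sector * SECTOR_SIZE
print("%ss(%s)" % (start_sector, start_offset))  # 8064s(4128768)
```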

Now the neat thing about the update method built into hashlib is that it will just keep adding the data you pass in to the hash. So if you wanted to buffer your reads to prevent too much memory usage, which we will do in a future part in the series, you can just pass the chunks to update as you read them and still get a hash for the complete file.
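
To see that behavior for yourself, here is a minimal sketch, not the buffered version the series will build later. The helper md5_in_chunks and its chunk_size parameter are names I've made up, and io.BytesIO stands in for a file read out of an image. It hashes the same data once in a single call and once in 64 byte chunks, then compares the two results.

```python
import hashlib
import io

def md5_in_chunks(stream, chunk_size=1024 * 1024):
    """Hash a file-like object a chunk at a time (hypothetical helper)."""
    md5hash = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Each call to update() adds more data to the same running hash
        md5hash.update(chunk)
    return md5hash.hexdigest()

data = b"DFIR Wizard! " * 1000
one_shot = hashlib.md5(data).hexdigest()
chunked = md5_in_chunks(io.BytesIO(data), chunk_size=64)
print(one_shot == chunked)  # True
```

Only one chunk is held in memory at a time, which is the whole point once the files get large.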

You can get the source of this version of DFIR Wizard on the series github here: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v7.py

In the next part we are going to go into how to recurse through an entire image and hash all the files within it, building bigger as our DFIR Wizard program continues to grow.


Tuesday, February 24, 2015

Automating DFIR - How to series on programming libtsk with python Part 7

Hello Reader,
           We've reached the seventh part of our series and we are starting to ramp up on our dfir wizardry! If you are just starting here, don't. Please start at Part 1 unless you already have a lot of experience both with Python AND with pytsk/pyewf.

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files

When last I left you we extracted files from an E01 image. This is great but our script is still hard coding in the name of the image to access, so it's time to start adding command line arguments and parsing to make our DFIR Wizard program more flexible. Python makes this easy with a standard library named 'argparse', which not only allows us to handle advanced command line arguments but also generates usage instructions for us. If you want to read the documentation on argparse go here: https://docs.python.org/2.7/library/argparse.html

We start by importing the argparse library, importing libraries is something you should be very familiar with by now.

import argparse

Next we need to create an argparse object by calling the ArgumentParser function, providing a description of what our program does and then storing the result in a variable named argparser. The description of our program we place into the description parameter we pass in will be used in the auto-generation of help and usage instructions. The more you put, the more the user knows. In the end the argparse object creation will look like this:

argparser = argparse.ArgumentParser(description='Extract the $MFT from all of the NTFS partitions of an E01')

For every argument we want to pass in from the command line and handle we need to use the add_argument function that we invoke from our argparser object. There are a lot of options you can choose from when you use add_argument and we are going to take advantage of a couple of them. See below for how this will look and then we will go into what the parameters mean:

argparser.add_argument( '-i', '--image', dest='imagefile', action="store", type=str, default=None, required=True, help='E01 to extract from' )
The first thing you specify is what, if any, option you want to identify this argument as. For our script we are defining this argument to be set if the user passes in -i or the long form --image. Next we need to tell add_argument what variable to store the parsed argument into; we are going to put it into a variable named imagefile. Next we need to tell the parser what to do with data passed to this argument, and in this case we want to store it. Next we need to tell the parser what type of variable we want to store this as, and in this case we want to store it as a string. Next we can set a default value if we would like, but we don't have a default value for our forensic image file name. The next parameter is important as it specifies whether the argument is required for the program to continue executing. If you don't provide a required argument then the program will print the usage instructions generated by argparse and then quit. The last option we are specifying here is the help option, whose text will be printed in the usage information.
Now that we have defined our argument we have to tell argparse to parse the arguments that have been passed in by calling the parse_args function from our argparser object and then assign the object it returns of parsed arguments to a variable. In our program we have called that variable args and we will use it to access any arguments parsed.
args = argparser.parse_args()
One more change and we are done. We need to change our code to remove the hardcoded image name and replace it with args.imagefile, which contains the name of the image the user specified with -i or --image.
filenames = pyewf.glob(args.imagefile)
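
If you want to try the parser without an image handy, here is a small sketch that reuses the exact add_argument call from above and feeds parse_args a list instead of the real command line. The variable names short_form and long_form are mine; passing a list to parse_args is a standard argparse feature that is handy for testing.

```python
import argparse

argparser = argparse.ArgumentParser(
    description='Extract the $MFT from all of the NTFS partitions of an E01')
argparser.add_argument(
    '-i', '--image',
    dest='imagefile',
    action="store",
    type=str,
    default=None,
    required=True,
    help='E01 to extract from'
)

# Passing a list to parse_args() stands in for what the user would type
short_form = argparser.parse_args(['-i', 'SSFCC-Level5.E01'])
long_form = argparser.parse_args(['--image', 'SSFCC-Level5.E01'])

# Either spelling ends up in args.imagefile because of dest='imagefile'
print(short_form.imagefile)
print(long_form.imagefile)
```

Calling parse_args with an empty list instead would print the generated usage text and exit, because the argument is marked as required.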
That's it! We can specify at execution what E01 image file we want to extract the $MFT from. The final program looks like this: 
#!/usr/bin/python
# Sample program or step 6 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
import pyewf
import argparse
class ewf_Img_Info(pytsk3.Img_Info):
  def __init__(self, ewf_handle):
    self._ewf_handle = ewf_handle
    super(ewf_Img_Info, self).__init__(
        url="", type=pytsk3.TSK_IMG_TYPE_EXTERNAL)
  def close(self):
    self._ewf_handle.close()
  def read(self, offset, size):
    self._ewf_handle.seek(offset)
    return self._ewf_handle.read(size)
  def get_size(self):
    return self._ewf_handle.get_media_size()
argparser = argparse.ArgumentParser(description='Extract the $MFT from all of the NTFS partitions of an E01')
argparser.add_argument(
        '-i', '--image',
        dest='imagefile',
        action="store",
        type=str,
        default=None,
        required=True,
        help='E01 to extract from'
    )
args = argparser.parse_args()
filenames = pyewf.glob(args.imagefile)
ewf_handle = pyewf.handle()
ewf_handle.open(filenames)
imagehandle = ewf_Img_Info(ewf_handle)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
  if 'NTFS' in partition.desc:
    filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
    fileobject = filesystemObject.open("/$MFT")
    print "File Inode:",fileobject.info.meta.addr
    print "File Name:",fileobject.info.name.name
    print "File Creation Time:",datetime.datetime.fromtimestamp(fileobject.info.meta.crtime).strftime('%Y-%m-%d %H:%M:%S')
    outFileName = str(partition.addr)+fileobject.info.name.name
    print outFileName
    # 'wb' keeps Windows from translating line endings in the binary $MFT data
    outfile = open(outFileName, 'wb')
    filedata = fileobject.read_random(0,fileobject.info.meta.size)
    outfile.write(filedata)
    outfile.close()
The final program can be downloaded from our series Github here: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v6.py

 In future posts we'll change our program so that the user can specify a live disk, raw image or other image types but one step at a time. In part 8 we will start hashing files, which is something all of us have to do at one time or another. 

Automating DFIR - How to series on programming libtsk with python Part 6

Hello Reader,
         I really hope you've read all the prior posts in this series because it just keeps building from here! Here are the previous parts if you need to refer to them, each contains the knowledge needed to understand what we talk about in this post.

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator

What you'll need for this part:

1. Download the sample E01 located here: https://mega.co.nz/#!uhgQzZpL!F9aoPo6pZ_m9cKpYoK5ND_NY26GjCg7YKS60StVrl98

What you should already have installed at this point (from part1):

1. Python 2.7 32bit
2. pytsk
3. pyewf

Accessing an E01 image and extracting a file

The time has arrived. Now that we've explained a lot of python constructs and knowledge for creating these DFIR Automation scripts, we are prepared to talk about libewf and its python binding pyewf.

Now libtsk and its python wrapper pytsk are no slouch. They can open the following image formats:

  • Single Raw Images
  • Split Raw Images
  • Single VMDK's but not their snapshots. Full VMDK support is available from pyvmdk
  • Single VHD, full VHD support is available from pyvhd
  • Live disks

With pyewf library we can access the following image formats:

  • Single E01 or Expert Witness Format images
  • Split E01 image
  • Compressed Raw Images (aka smart format .S01)
  • Single non encrypted Ex01 v1 images
  • Split non encrypted Ex01 v1 images
  • non encrypted Lx01 v1 images
  • L01 images
There is another library for AFF image access but that is a topic for another post.

So the first thing we need to do is import the pyewf library using the import command you should be very familiar with by now

import pyewf

The next thing we need to do is gather up all the possible parts of the image we want to load. Most examiners create multi-part images in the process of their forensic imaging. The multi-part imaging preference first came around because we wanted to archive images to media such as CDs or DVDs, but there is another nice thing about multi-part images. You can hash all the parts of a multi-part image to verify the image consistency on a per part basis rather than having to hash the contents of the image itself. This is especially useful when you are copying the image to a new drive and you want to know which image segment didn't copy over correctly. Comparing the segment hashes will let you replace a single bad segment rather than copying over the whole image again.
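
Here is a rough sketch of that per segment check. Nothing in it comes from pyewf: hash_segments is a helper name I've made up, and two tiny temporary files named demo.E01 and demo.E02 stand in for real image segments. It hashes every segment, simulates a bad copy of the second one, and then reports which segment no longer matches.

```python
import hashlib
import os
import tempfile

def hash_segments(segment_paths):
    """Return {file name: md5 hex digest} for each segment (hypothetical helper)."""
    hashes = {}
    for path in segment_paths:
        md5hash = hashlib.md5()
        with open(path, 'rb') as segment:
            # Fine for a demo; real segments should be read in chunks
            md5hash.update(segment.read())
        hashes[os.path.basename(path)] = md5hash.hexdigest()
    return hashes

# Two tiny made-up files stand in for real image segments
workdir = tempfile.mkdtemp()
segments = []
for name, content in (("demo.E01", b"first part"), ("demo.E02", b"second part")):
    path = os.path.join(workdir, name)
    with open(path, 'wb') as handle:
        handle.write(content)
    segments.append(path)

original = hash_segments(segments)

# Simulate a bad copy of the second segment, then compare hash by hash
with open(segments[1], 'wb') as handle:
    handle.write(b"second pArt")
copied = hash_segments(segments)

bad = sorted(name for name in original if original[name] != copied[name])
print(bad)  # ['demo.E02']
```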

pyewf gives us a handy method to gather up all the sequential parts of a multi-part image with a single function named glob. The glob function will take the file name given and then following the rules for how multi-part image extensions are sequentially named (E01-EZZ) it will load the full list into an array that is returned. 

filenames = pyewf.glob("SSFCC-Level5.E01")
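
To give a feel for the naming rule glob follows, here is a sketch of the extension sequence as I understand it: E01 through E99 first, then EAA through EZZ. The function ewf_extensions is my own illustration and not part of pyewf. It only generates names and makes no attempt to look at what is on disk, so use pyewf.glob for real work.

```python
import string

def ewf_extensions():
    """List segment extensions from E01 to EZZ (illustration only, not pyewf)."""
    # Numbered segments come first: E01, E02, ... E99
    extensions = ["E%02d" % number for number in range(1, 100)]
    # Then the two letter suffixes: EAA, EAB, ... EZZ
    letters = string.ascii_uppercase
    extensions += ["E" + first + second for first in letters for second in letters]
    return extensions

extensions = ewf_extensions()
print(extensions[:3])     # ['E01', 'E02', 'E03']
print(extensions[98:101]) # ['E99', 'EAA', 'EAB']
print(extensions[-1])     # EZZ
```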



Here you can see we are storing the result of glob for our example E01 in a variable named filenames. Our example image is a single image segment but our code will work for both single and split image files. The next thing we need to do is open up a handle to our image. We do this first by creating a handle object using handle() and storing it in a variable. In the code below we are calling handle and storing the result in the variable ewf_handle.


ewf_handle = pyewf.handle()

Next we need to use this new object to open up our image. We use the open function contained within the handle object to do so. In the code below we are calling the open method stored within ewf_handle on the filenames variable we made. 

ewf_handle.open(filenames)

Now when we used pytsk we next needed to create an Img_Info object, and we still do. However, since pytsk cannot read the formats pyewf supports on its own, we are going to hand it our pyewf handle through a wrapper. We are calling ewf_Img_Info here, passing in our ewf_handle object and storing the result in our imagehandle variable as seen below:

imagehandle = ewf_Img_Info(ewf_handle)

Now ewf_Img_Info is not something provided by the pyewf library. Instead it's a new class we are creating in our program that is based on pytsk's Img_Info class but extends it to handle the pyewf supported formats. So to do this we need to create a class named ewf_Img_Info and declare that it inherits from pytsk's Img_Info class.

class ewf_Img_Info(pytsk3.Img_Info):

The class specified in () after ewf_Img_Info is the class we are inheriting from. Next we need to create a constructor for our class so that when we invoke ewf_Img_Info it can prepare a libtsk compatible object that we can use going forward. It looks something like this:

def __init__(self, ewf_handle):
    self._ewf_handle = ewf_handle
    super(ewf_Img_Info, self).__init__(
        url="", type=pytsk3.TSK_IMG_TYPE_EXTERNAL)

Note the class we are building here is taken from the example provided at: https://github.com/libyal/libewf/wiki/Development 


So the first thing we are doing is creating our constructor function named __init__, which takes as parameters itself and the ewf_handle object. The next thing we are doing is taking a reference to ewf_handle and storing it within the class as self._ewf_handle. Lastly we are using a neat function called super to call the constructor of the class we are inheriting from. The call looks like super(ewf_Img_Info, self).__init__(parameters to pass in), where __init__ is the constructor of the parent class, which in this case is pytsk3.Img_Info. super can reach any method of the parent class, which becomes important when we get to method overriding below. When we call the parent constructor we need to pass it two arguments, the url and the type. The url is set to "" and in this case the type is TSK_IMG_TYPE_EXTERNAL. For a full list of image types you can pass into Img_Info go here and look at the enumerations section: http://www.sleuthkit.org/sleuthkit/docs/api-docs/tsk__img_8h.html
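
If super is new to you, here is the same constructor pattern boiled down so it runs without pytsk installed. The classes Parent and Child and the string "external" are stand-ins I've made up; Parent plays the part of pytsk3.Img_Info and Child plays the part of ewf_Img_Info.

```python
class Parent(object):
    """Hypothetical stand-in for pytsk3.Img_Info."""
    def __init__(self, url, type):
        self.url = url
        self.type = type

class Child(Parent):
    """Hypothetical stand-in for ewf_Img_Info."""
    def __init__(self, handle):
        # Keep our own reference first, just like self._ewf_handle
        self._handle = handle
        # Then let the parent class finish building the object
        super(Child, self).__init__(url="", type="external")

child = Child(handle="pretend handle")
print(child.type)                 # external
print(isinstance(child, Parent))  # True
```

Because the Child object is also a Parent, anything written to accept a Parent will accept it, which is why the rest of our pytsk code keeps working.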

Great, now our constructor has built an Img_Info object that is based on the pytsk Img_Info class, so the object will be compatible with all the other pytsk functions we've called before. Now that we have an Img_Info object you might think we are done, but we need to do one more thing. We need to override the functions provided by Img_Info for closing the handle to the image, reading data from an image and getting the size of the media contained within the image. If we used the base pytsk Img_Info functions they would fail as they do not understand the image formats that pyewf handles for us. So to override just those functions we just need to declare them within our new ewf_Img_Info class as follows:

  def close(self):
    self._ewf_handle.close()

The close function defined above will call the ewf_handle object's close method instead of Img_Info's close method.

  def read(self, offset, size):
    self._ewf_handle.seek(offset)
    return self._ewf_handle.read(size)

The read function defined above takes the offset of where to start reading and the total amount to read like the standard Img_Info method would, but it does the reading using the ewf_handle object's version of seek and read. It then returns the data read by ewf_handle's read function to whomever called it. 

  def get_size(self):
    return self._ewf_handle.get_media_size()

The get_size function is using get_media_size from the ewf_handle object to return the total size of the media rather than the standard Img_Info get_size method. 
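
To watch the overrides do their job without pytsk or pyewf installed, here is a sketch built entirely from stand-ins. BaseImage, WrappedImage and first_bytes are names I've made up; BaseImage plays the part of pytsk3.Img_Info, WrappedImage plays the part of ewf_Img_Info, and an io.BytesIO object plays the part of the pyewf handle.

```python
import io

class BaseImage(object):
    """Hypothetical stand-in for pytsk3.Img_Info, so this runs without pytsk3."""
    def read(self, offset, size):
        raise NotImplementedError("base class cannot read this format")
    def get_size(self):
        raise NotImplementedError("base class cannot size this format")
    def first_bytes(self, count):
        # Library-side code only ever talks to read(); it never sees the handle
        return self.read(0, count)

class WrappedImage(BaseImage):
    """Plays the role of ewf_Img_Info around any seekable handle."""
    def __init__(self, handle):
        self._handle = handle
    def close(self):
        self._handle.close()
    def read(self, offset, size):
        self._handle.seek(offset)
        return self._handle.read(size)
    def get_size(self):
        return len(self._handle.getvalue())

# io.BytesIO stands in for the pyewf handle
image = WrappedImage(io.BytesIO(b"0123456789ABCDEF"))
chunk = image.read(10, 4)    # our override: seek to offset 10, read 4 bytes
head = image.first_bytes(3)  # base class code ends up in our read()
size = image.get_size()
image.close()
print(size)
```

The call to first_bytes is the interesting one: it is defined on the base class, yet it returns real data because the read it calls is the one we overrode.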

There we go. With those functions now defined we've created a pytsk compatible object that we can now pass into the rest of our code from part 3 as if we were dealing with a natively supported pytsk image format. The complete code follows:

#!/usr/bin/python
# Sample program or step 5 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
import pyewf
     
class ewf_Img_Info(pytsk3.Img_Info):
  def __init__(self, ewf_handle):
    self._ewf_handle = ewf_handle
    super(ewf_Img_Info, self).__init__(
        url="", type=pytsk3.TSK_IMG_TYPE_EXTERNAL)
  def close(self):
    self._ewf_handle.close()
  def read(self, offset, size):
    self._ewf_handle.seek(offset)
    return self._ewf_handle.read(size)
  def get_size(self):
    return self._ewf_handle.get_media_size()

filenames = pyewf.glob("SSFCC-Level5.E01")
ewf_handle = pyewf.handle()
ewf_handle.open(filenames)
imagehandle = ewf_Img_Info(ewf_handle)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
  if 'NTFS' in partition.desc:
    filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
    fileobject = filesystemObject.open("/$MFT")
    print "File Inode:",fileobject.info.meta.addr
    print "File Name:",fileobject.info.name.name
    print "File Creation Time:",datetime.datetime.fromtimestamp(fileobject.info.meta.crtime).strftime('%Y-%m-%d %H:%M:%S')
    outFileName = str(partition.addr)+fileobject.info.name.name
    print outFileName
    # 'wb' keeps Windows from translating line endings in the binary $MFT data
    outfile = open(outFileName, 'wb')
    filedata = fileobject.read_random(0,fileobject.info.meta.size)
    outfile.write(filedata)
    outfile.close()
You can grab this code from our series Github here: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v5.py

You may have noticed I dropped the code we added in part 5 for automatic elevation. That is because we do not need to run as administrator to access an image file, and if you don't need administrative privileges your code shouldn't run with them. In part 7 of our series we will cover taking in command line parameters so you don't have to hard code in your image file names after which we will move on to hashing and recursing through file systems. 

Automating DFIR - How to series on programming libtsk with python Part 5

Hello Reader,
              This is part 5 of a planned 24 part series. If you haven't read the prior parts I would highly recommend you do to understand how we got to this point!

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable

In this post, before continuing on to accessing an E01 image which is a bit more complicated, let's make our lives a little bit easier. It's always a pain when you forget to open an administrative command prompt to run your script, and in future posts when we get to GUIs it's easy to forget to right click and run your script as administrator (or with sudo). So instead let's have our code do it for us. Now I can't take credit for this code. Like most good programmers I turn to Google for answers, which most frequently leads to stackoverflow.com. On stackoverflow I found a series of threads which offered solutions to the problem of elevating a python script, and in testing I found the following thread to offer the best solution: http://stackoverflow.com/questions/19672352/how-to-run-python-script-with-elevated-privilage-on-windows

So let's look at what changes we need to make to our DFIR Wizard program to do this. By this I mean check to see if our script is running as administrator/root and, if it's not, try to relaunch it that way (if the account has permissions to do so).

#!/usr/bin/python
# Sample program or step 3 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
import admin
if not admin.isUserAdmin():
   admin.runAsAdmin()
   sys.exit()
imagefile = "\\\\.\\PhysicalDrive0"
imagehandle = pytsk3.Img_Info(imagefile)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
  if 'NTFS' in partition.desc:
    filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
    fileobject = filesystemObject.open("/$MFT")
    print "File Inode:",fileobject.info.meta.addr
    print "File Name:",fileobject.info.name.name
    print "File Creation Time:",datetime.datetime.fromtimestamp(fileobject.info.meta.crtime).strftime('%Y-%m-%d %H:%M:%S')
    outFileName = str(partition.addr)+fileobject.info.name.name
    print outFileName
    # 'wb' keeps Windows from translating line endings in the binary $MFT data
    outfile = open(outFileName, 'wb')
    filedata = fileobject.read_random(0,fileobject.info.meta.size)
    outfile.write(filedata)
    outfile.close()

The parts of the code that I have changed are the 'import admin' line and the three lines of the 'if' block that follow it. You can see the first thing we are doing is importing a new module. In this case that module is called 'admin'. Now before you try to install admin with pip you should know that admin is actually a new python script we are going to create. Rather than embed the admin functions, which are quite lengthy, in our main script we are going to import the functions it provides and use them.

After importing the admin module we are doing a test using the 'if' conditional statement. We are testing the result of calling the admin function 'isUserAdmin' for the negative. In other words, if 'isUserAdmin' returns true, meaning we are running as administrator, then the 'not' makes the condition false and the script will skip the next two lines of code and just continue on executing. However if the 'isUserAdmin' function comes back saying the process is not currently running as the administrator, then the 'not' turns that false into true and thus the two lines of code indented after the 'if' statement will execute.

So let's talk about those two lines after the 'if' statement. The first line after the 'if' statement is calling the 'runAsAdmin' function provided by the admin module we imported. This function will start a new process of our python program as administrator for us, and when it executes again as administrator our program will skip over this if statement and run the rest of our DFIR Wizard program. The next line tells the copy of our program that is not running as administrator to stop running, as the rest of the script requires us to run as administrator to execute correctly. The sys library provides this 'exit' function that tells our program to quit, and it will only quit after spawning off the new administrative version of our python script. That's all the modifications we are going to make to our main script, dfirwizard-v4.py.
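
If the double negative of 'if not' is hard to keep straight, this tiny sketch lays out both cases. The function choose_action is something I've made up purely for illustration; it returns a description of what the script would do rather than actually relaunching or exiting.

```python
def choose_action(is_admin):
    """Mirror the 'if not admin.isUserAdmin()' test (hypothetical helper)."""
    if not is_admin:
        # Same branch as admin.runAsAdmin() followed by sys.exit()
        return "relaunch elevated, then exit"
    return "carry on with the rest of the script"

for is_admin in (True, False):
    print("%s: %s" % (is_admin, choose_action(is_admin)))
```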

Now, let's take a look at the new python script we are making called admin.py.

#!/usr/bin/env python
# -*- coding: utf-8; mode: python; py-indent-offset: 4; indent-tabs-mode: nil -*-
# vim: fileencoding=utf-8 tabstop=4 expandtab shiftwidth=4
# (C) COPYRIGHT © Preston Landers 2010
# Released under the same license as Python 2.6.5

import sys, os, traceback, types
def isUserAdmin():
    if os.name == 'nt':
        import ctypes
        # WARNING: requires Windows XP SP2 or higher!
        try:
            return ctypes.windll.shell32.IsUserAnAdmin()
        except:
            traceback.print_exc()
            print "Admin check failed, assuming not an admin."
            return False
    elif os.name == 'posix':
        # Check for root on Posix
        return os.getuid() == 0
    else:
        raise RuntimeError, "Unsupported operating system for this module: %s" % (os.name,)
def runAsAdmin(cmdLine=None, wait=True):
    if os.name != 'nt':
        raise RuntimeError, "This function is only implemented on Windows."
    import win32api, win32con, win32event, win32process
    from win32com.shell.shell import ShellExecuteEx
    from win32com.shell import shellcon
    python_exe = sys.executable
    if cmdLine is None:
        cmdLine = [python_exe] + sys.argv
    elif type(cmdLine) not in (types.TupleType,types.ListType):
        raise ValueError, "cmdLine is not a sequence."
    cmd = '"%s"' % (cmdLine[0],)
    # XXX TODO: isn't there a function or something we can call to massage command line params?
    params = " ".join(['"%s"' % (x,) for x in cmdLine[1:]])
    cmdDir = ''
    showCmd = win32con.SW_SHOWNORMAL
    #showCmd = win32con.SW_HIDE
    lpVerb = 'runas'  # causes UAC elevation prompt.
    # print "Running", cmd, params
    # ShellExecute() doesn't seem to allow us to fetch the PID or handle
    # of the process, so we can't get anything useful from it. Therefore
    # the more complex ShellExecuteEx() must be used.
    # procHandle = win32api.ShellExecute(0, lpVerb, cmd, params, cmdDir, showCmd)
    procInfo = ShellExecuteEx(nShow=showCmd,
                              fMask=shellcon.SEE_MASK_NOCLOSEPROCESS,
                              lpVerb=lpVerb,
                              lpFile=cmd,
                              lpParameters=params)
    if wait:
        procHandle = procInfo['hProcess']  
        obj = win32event.WaitForSingleObject(procHandle, win32event.INFINITE)
        rc = win32process.GetExitCodeProcess(procHandle)
        #print "Process handle %s returned code %s" % (procHandle, rc)
    else:
        rc = None
    return rc

if __name__ == "__main__":
    sys.exit(test())
Now there is a lot of code to get through here to understand what's going on. Let's break it down in chunks:

import sys, os, traceback, types

Here we are importing the 4 libraries needed for the admin module to work: sys, os, traceback and types.
Next we are going to define the first function the admin module provides, the 'isUserAdmin' function.

def isUserAdmin():
    if os.name == 'nt':
        import ctypes
        # WARNING: requires Windows XP SP2 or higher!
        try:
            return ctypes.windll.shell32.IsUserAnAdmin()
        except:
            traceback.print_exc()
            print "Admin check failed, assuming not an admin."
            return False
    elif os.name == 'posix':
        # Check for root on Posix
        return os.getuid() == 0
    else:
        raise RuntimeError, "Unsupported operating system for this module: %s" % (os.name,)
The first thing we are doing is determining what operating system our program is running on by checking the contents of the variable 'name' from the os library we imported in the first line. We are testing to see if the operating system name is defined as 'nt', which means windows. To see the full list of operating system names returned go here: https://docs.python.org/2/library/os.html#os.name

If we are running under windows we are going to import another library called 'ctypes' via the import ctypes line. The ctypes library is going to give us access to several windows internal functions, but it can do way more than that. The functions we will call are made available by the windows api, also called the win32 api, which lets programmers request information about the windows session they are currently running under. To learn more about the ctypes library go here: https://docs.python.org/2/library/ctypes.html?highlight=ctypes#module-ctypes

Next we see a new python construct called 'try' and 'except'. This allows us to attempt to execute a function and, in the event that it raises an error (an exception), define how to handle it. Our try line is calling the following function ctypes.windll.shell32.IsUserAnAdmin(), for more information about this win32 api function go here: https://msdn.microsoft.com/en-us/library/windows/desktop/bb776463%28v=vs.85%29.aspx. Look at what we are doing with the ctypes library to call this function. We are using ctypes to reach the win32 api through windll, which loads shell32, which in turn provides the IsUserAnAdmin function. Using this syntax we can call any other shell32 function, of which there are many! For a much longer read on the shell32 library go here: https://msdn.microsoft.com/en-us/library/windows/desktop/bb773177(v=vs.85).aspx

Now if we are running as administrator the call will return true, and if we are not it will return false; either way we hand that answer straight back to the caller. The except clause only runs if the call itself fails (raises an exception), in which case we print the traceback, print to the console that the admin check failed and that we are assuming we are not an admin, and then return false. The next elif or 'else if' will check to see if we are running on a 'posix' operating system. Posix is what os.name returns on most unix operating systems such as BSD, OSX and Linux. If we are running under a Posix operating system we check whether we are running as root, or uid 0, and return true or false. Otherwise we have an 'else' clause at the end that raises an error stating we don't know how to handle any non windows, non posix operating system.
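To make the try/except flow easier to see, here is a minimal sketch of the same check. It is my own trimmed restatement and not the admin.py code itself: the name is_user_admin and the bool() conversion are assumptions added for illustration, and the syntax is kept neutral so it also runs on Python 3.

```python
import os


def is_user_admin():
    """Return True if the current process has admin/root rights."""
    if os.name == 'nt':
        import ctypes
        try:
            # IsUserAnAdmin returns a non-zero value for an administrator
            return bool(ctypes.windll.shell32.IsUserAnAdmin())
        except Exception:
            # The call itself failed, so fall back to "not an admin"
            return False
    elif os.name == 'posix':
        # uid 0 is root on Posix systems
        return os.getuid() == 0
    raise RuntimeError("Unsupported operating system: %s" % os.name)


is_admin = is_user_admin()
```

The except branch is only reached when the Windows call cannot be made at all; a normal "not an administrator" answer comes back through the return inside the try.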

We are now done with this function! Let's move on to the next function 

def runAsAdmin(cmdLine=None, wait=True):
    if os.name != 'nt':
        raise RuntimeError, "This function is only implemented on Windows."
    import win32api, win32con, win32event, win32process
    from win32com.shell.shell import ShellExecuteEx
    from win32com.shell import shellcon
    python_exe = sys.executable
    if cmdLine is None:
        cmdLine = [python_exe] + sys.argv
    elif type(cmdLine) not in (types.TupleType,types.ListType):
        raise ValueError, "cmdLine is not a sequence."
    cmd = '"%s"' % (cmdLine[0],)
    # XXX TODO: isn't there a function or something we can call to massage command line params?
    params = " ".join(['"%s"' % (x,) for x in cmdLine[1:]])
    cmdDir = ''
    showCmd = win32con.SW_SHOWNORMAL
    #showCmd = win32con.SW_HIDE
    lpVerb = 'runas'  # causes UAC elevation prompt.
    # print "Running", cmd, params
    # ShellExecute() doesn't seem to allow us to fetch the PID or handle
    # of the process, so we can't get anything useful from it. Therefore
    # the more complex ShellExecuteEx() must be used.
    # procHandle = win32api.ShellExecute(0, lpVerb, cmd, params, cmdDir, showCmd)
    procInfo = ShellExecuteEx(nShow=showCmd,
                              fMask=shellcon.SEE_MASK_NOCLOSEPROCESS,
                              lpVerb=lpVerb,
                              lpFile=cmd,
                              lpParameters=params)
    if wait:
        procHandle = procInfo['hProcess']  
        obj = win32event.WaitForSingleObject(procHandle, win32event.INFINITE)
        rc = win32process.GetExitCodeProcess(procHandle)
        #print "Process handle %s returned code %s" % (procHandle, rc)
    else:
        rc = None
    return rc
This is not a short function so let's break it down in chunks again. Our function is being defined here with two parameters: cmdLine and wait. Both are given default values in case no value is passed in. In other words, if we call this function with a specific cmdLine value the function will use it, otherwise if nothing is passed in (as in our program) it will use the default value of None. Next we make sure that we are running under Windows, as this function won't work on another operating system as it's currently written. We do this with a check of os.name again, this time with the != or 'not equals' operator. If our operating system is not Windows (returned as nt here) then the function will raise an error stating it won't work!

Next we need to import more libraries! We are importing the win32api, win32con, win32event and win32process libraries, all of which come from the pywin32 package. Then, from the win32com.shell package, we are importing two names into our local namespace. Importing into our local namespace means we don't have to call something with the full library path, instead we can call it by name alone. We are bringing in the function ShellExecuteEx and the constants module shellcon.

Now it's time to figure out which python interpreter we are going to run as administrator with the following line, python_exe = sys.executable. We are assigning the path of the currently running python interpreter, taken from sys.executable, to the variable python_exe.

Let's look at the next chunk of code:

if cmdLine is None:
    cmdLine = [python_exe] + sys.argv
elif type(cmdLine) not in (types.TupleType,types.ListType):
    raise ValueError, "cmdLine is not a sequence."
cmd = '"%s"' % (cmdLine[0],)

If we didn't pass in a value for cmdLine and its default value of None is applied, then we set cmdLine to a list made up of the path of the python executable we captured earlier followed by the command line arguments passed into our currently running script, which sys provides in the variable argv.

If our cmdLine variable contains a value we passed in but it is not a list or a tuple (a sequence of values) then we raise an error. In either of the remaining cases cmdLine is now a sequence, so we take its first element, the program to run, wrap it in double quotes and assign it to cmd.

This is followed by:
params = " ".join(['"%s"' % (x,) for x in cmdLine[1:]])
cmdDir = ''

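Here we are wrapping each of the remaining command line arguments in double quotes and joining them with spaces into a variable called params. We also set cmdDir, the directory we want our program to run in, to an empty string. Note that in this code cmdDir only appears in the commented out ShellExecute call further down; it is not passed to ShellExecuteEx.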
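If you want to see what cmd and params end up holding, here is a small sketch of just that step pulled out into its own function. The function name build_command and the sample arguments (including the file name "my image.vhd") are made up for illustration, and isinstance is swapped in for the types check so the sketch also runs on Python 3.

```python
import sys


def build_command(cmd_line=None):
    """Split a command sequence into a quoted program and a quoted parameter string."""
    if cmd_line is None:
        # Default: re-run the current interpreter with the current arguments
        cmd_line = [sys.executable] + sys.argv
    elif not isinstance(cmd_line, (tuple, list)):
        raise ValueError("cmd_line is not a sequence.")
    # The first element is the program to run
    cmd = '"%s"' % (cmd_line[0],)
    # Everything after it becomes one space separated parameter string
    params = " ".join(['"%s"' % (x,) for x in cmd_line[1:]])
    return cmd, params


cmd, params = build_command(["python.exe", "dfirwizard-v4.py", "my image.vhd"])
# cmd is '"python.exe"' and params is '"dfirwizard-v4.py" "my image.vhd"'
```

Quoting every argument is what keeps a path with a space in it, like the made up image name here, from being split into two arguments.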

Next we set the option of whether to show the console window of the running application. If this was a GUI program we would want to hide this, but since this is currently a command line program we want to show it. We are using the win32con library constants SW_SHOWNORMAL and SW_HIDE to set that value, which will be passed into the administrative process we create.
showCmd = win32con.SW_SHOWNORMAL
#showCmd = win32con.SW_HIDE

Next we need to make sure we set the right flag for an elevated process if we are in a UAC aware environment. We do this by setting the lpVerb variable equal to 'runas', which causes the UAC elevation prompt:
lpVerb = 'runas'  # causes UAC elevation prompt.

Now we are ready to create our administrative process using the ShellExecuteEx function, passing in the variables we set in the lines prior. We store the result, a dictionary of information about the new process, in a variable named procInfo. We are passing one additional constant value here from shellcon, SEE_MASK_NOCLOSEPROCESS, which asks ShellExecuteEx to hand back a handle to the new process (the 'hProcess' entry) so that we can wait on it afterwards.

procInfo = ShellExecuteEx(nShow=showCmd,
                          fMask=shellcon.SEE_MASK_NOCLOSEPROCESS,
                          lpVerb=lpVerb,
                          lpFile=cmd,
                          lpParameters=params)

For more information on the ShellExecuteEx function go here: http://www.pinvoke.net/default.aspx/shell32/ShellExecuteEx.html

Lastly we need to check if we passed in a wait flag (true or false):
if wait:
    procHandle = procInfo['hProcess']
    obj = win32event.WaitForSingleObject(procHandle, win32event.INFINITE)
    rc = win32process.GetExitCodeProcess(procHandle)
    #print "Process handle %s returned code %s" % (procHandle, rc)
else:
    rc = None
return rc

If wait is true, which it is by default, then we wait for our administrative process to finish and set the variable rc to the exit code of that process. If we are not waiting to get the return state then we set rc to the value None. Lastly we return the value of the variable rc to the program that called this function to begin with.
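The wait and return code pattern is not specific to the win32 calls. As a rough cross platform stand-in (it does not elevate anything, and run_and_wait is a name I made up), the same idea with the standard subprocess library looks like this:

```python
import subprocess
import sys


def run_and_wait(args, wait=True):
    """Start a child process; return its exit code if wait is True, else None."""
    child = subprocess.Popen(args)
    if wait:
        # Block until the child finishes, then collect its exit code
        rc = child.wait()
    else:
        rc = None
    return rc


# A child python process that deliberately exits with code 3
exit_code = run_and_wait([sys.executable, "-c", "import sys; sys.exit(3)"])
```

Here child.wait() plays the role that WaitForSingleObject followed by GetExitCodeProcess plays in the admin code.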

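Lastly we have the standard check that runs the module's test function when admin.py is executed directly as a script rather than imported: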

if __name__ == "__main__":
    sys.exit(test())

That was a lot to get through! But now that we have it we can reuse it every time we want to make sure our executable runs as administrator rather than having to restart it manually ourselves. If you notice in my code I put an exit after the administrative process returns, this is because if we let our script continue to run in the original non administrative process it will throw an error and likely just confuse the user.

If you want to grab these two files do it from the series github:
dfirwizard-v4.py: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v4.py
admin.py: https://github.com/dlcowen/dfirwizard/blob/master/admin.py

In the next part we will access an E01 image!

Monday, February 23, 2015

Automating DFIR - How to series on programming libtsk with python Part 4

Hello Reader,

Make sure you have read the prior posts in this series before continuing:
Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system

             In the previous post we modified our DFIR Wizard program to run against a live system. Now this is great, but wait... what do we do if the live system we want to run it against does not have python installed?!

While we could install python and the pytsk library on every system we want to access, that's not the best idea for several reasons:
1. Larger impact to forensic evidence
2. Production systems tend to be quite restrictive on new program installs
3. Unintended side effects on shared libraries
4. Internal politics


So what if we could take our python script and turn it into a standalone executable? Well you can! I am going to cover the most widely used and known program to do this, py2exe.

Things you'll need to follow along with this post:


1. Py2Exe, download it from http://sourceforge.net/projects/py2exe/files/latest/download?source=files. If you are doing this on OSX or Linux you will most likely already have python installed; if not, look into py2app for OSX and pyInstaller for cross platform support.
2. To have your program run on systems without the required msvcr90.dll you'll also need to grab the redistributable package Microsoft provides, you can download it here http://www.microsoft.com/downloads/en/details.aspx?FamilyID=9b2da534-3e03-4391-8a4d-074b9f2bc1bf&displaylang=en

Turning your python program into a windows executable


So let's turn dfirwizard-v3.py into dfirwizard-v3.exe!

The first thing you'll need to do is install py2exe, when you are done continue on to the next step.
The second thing you'll need to do is create a new python script in the same directory where dfirwizard-v3.py is called setup.py.

The contents of setup.py will be as follows:
from distutils.core import setup
import py2exe
setup(console=['dfirwizard-v3.py'])
We are doing three things in this script. The first thing we are doing is importing a single function from the distutils.core package into our local namespace. This means when we call setup we don't have to say distutils.core.setup, instead we can just write setup. Distutils, or distribution utilities, is a library made for creating packages of python code for installing, packaging or distributing to other python users.

The second thing we are doing is importing the py2exe library into our program. The last thing we are doing is calling the setup function and passing in the argument console, which contains the name of the python script we want to turn into a windows console executable. There are other options available to make windows GUIs and windows services but we can talk about that in another post.

To actually get py2exe going we now need to run the setup.py program using the following command:
python setup.py py2exe

It will create a directory called dist under the directory you ran the script in, and contained in that directory will be 9 files that contain all the python libraries we need to run DFIR Wizard on a system without python installed. If everything worked right, one of the files in the dist directory will be dfirwizard-v3.exe.


Now if you noticed above under 'Things you'll need', the second point talks about a package from Microsoft that contains the file msvcr90.dll. This Visual C++ runtime dll is needed for our py2exe created executable to run on a system where python is not installed. Many systems have this library installed, my system has 29 occurrences of it, but just in case you should make sure to include it in your dist directory before you start deploying this package out to other systems. You can do this by copying the dll into your dist directory under the directory 'Microsoft.VC90.CRT'. For more about troubleshooting py2exe and dealing with systems that don't have msvcr90.dll read this: http://py2exe.org/index.cgi/Tutorial

Once you have the dll in place you can zip up the whole dist directory and push out your executable and libraries wherever it needs to be run. Just remember that at this stage our windows executable has to be run from an administrative command prompt or executed remotely with a run as administrator option. You also might want to consider powershell remoting as it won't expose your credentials to the remote system but that is a post for another day.

Making your program a single file for deployment


As an alternative to py2exe you could also try pyInstaller which allows you to bundle all of the files above into one executable. It has a lot of other features as well but that's the one that may interest you the most when pushing this out for remote execution.

First install pyInstaller with pip:
pip install pyinstaller
Next install pywin32 which is needed for the make one file option
http://sourceforge.net/projects/pywin32/files/pywin32/Build%20214/pywin32-214.win32-py2.7.exe/download


Next run pyinstaller as follows:
\python27\scripts\pyinstaller -F dfirwizard-v3.py

Where -F is telling pyinstaller to generate just one executable that all the rest of the libraries will be extracted from at run time. Your executable dfirwizard-v3.exe will be located in the dist directory under the directory where dfirwizard-v3.py is located. When run as administrator your single executable will now unpack itself and run on any system you want. An added value here is that you don't have to worry about including the C++ dlls!

Now this will execute slower at first than the py2exe version above but it will only require one file to be pushed/executed. 

In the next part let's talk about how to get our program to elevate its own privileges if they are available to the logged in user and then move onto accessing E01 images. 

To grab the setup.py used in this post get it from the series Github here: https://github.com/dlcowen/dfirwizard/blob/master/setup.py


Saturday, February 21, 2015

Automating DFIR - How to series on programming libtsk with python Part 3

Hello Reader,
      Before you read any further make sure you have read part 1 and 2 as we are not going back over what we've already done.

Part 1
Part 2

In the last post in this series we extracted a file from a forensic image, which is something you can start using to extract all sorts of data without having to load another tool. We are going to take a sideways step away from forensic images for one post and show how versatile this library is.

So here is a question to make you think, what is the difference between a raw forensic image (some call this a dd image) of a hard drive and a live hard drive itself?

The answer: the live hard drive is changing, but otherwise all of the structures are the same.

This means the same library (pytsk) we are using to access a forensic image can be used to access the hard drive of a live running system! If you are investigating a system, for whatever reason, and you want to:

A. Get access to locked files
B. Access files the operating system won't show you
C. Extract data without changing the metadata
D. Carve a live system
E. Write your own imaging program
F. Anything else you can dream up!

Then this is the way to do it! Many commercial vendors have offered solutions for doing this as 'triage' products and for the most part we can replicate all of their collection efforts using free and open source libraries.

Do I have your interest? Then let's go!

Let's start with our first program from part 1 and change one thing: the value of the imagefile variable.

#!/usr/bin/python
# Sample program or step 1 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
imagefile = "\\\\.\\PhysicalDrive0"
imagehandle = pytsk3.Img_Info(imagefile)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
Note: For this to run you must execute it as administrator/root on your system as you are accessing a raw physical disk. To do this in windows right click cmd.exe and say run as administrator. To do this in Linux or OSX make sure to run your script with sudo. 

If you were to run this on your own system right now it would print the partition table of the first drive in sequence. This code is interesting as the path I've given to a physical disk is in the windows style for accessing raw devices, for Linux or OSX make sure to do a fdisk -l to find the path to the physical disks on your system.

Now let's look at that code again and let this sink in. We didn't change any of the logic to make the same program work on a live system that worked on a forensic image. The only thing we changed was the name of the file it was going to open and access like a forensic image, and all the rest of our code worked!

That means that we can write one program that can operate on both live systems and forensic images which is pretty awesome if you stop and think about it. 

Now let's move forward from our sidestep and extend our previous file extraction example to iterate through all the partitions on a live system and extract the $MFT from each of them. For this example we are looking for $MFT files so we only want to extract data from NTFS partitions on our live systems. To do this we need to add a line of code right after we print our partition table to check whether the partition description displayed to us contains the word NTFS. We can do this with an 'if' statement and the operator 'in'.

if 'NTFS' in partition.desc:
So we are testing if our variable 'partition.desc' contains the word 'NTFS'. If it does then the next part of the code will execute; if it does not then it will execute any other conditionals relating to the if (else if and else) and then move on. Since for this example we are only testing for NTFS our code will just loop to the next partition if it is not NTFS.
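Here is the 'in' test on its own so you can try it without an image. The description strings are sample values I made up in the style of the partition table output, not output from a real disk.

```python
# Made-up sample descriptions, in the style of the partition table output
descriptions = ["Primary Table (#0)", "Unallocated", "NTFS (0x07)"]

ntfs_only = []
for desc in descriptions:
    if 'NTFS' in desc:
        # Only runs when the substring 'NTFS' appears somewhere in desc
        ntfs_only.append(desc)
    # Anything that does not match is skipped and the loop moves on
```

Keep in mind that 'in' on strings is a case sensitive substring test, so 'ntfs' would not match here.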

The next thing we need to do is to take out the hardcoded offset to the beginning of our partition that is in the FS_Info method we called in the previous example from part 2. We need to replace the hard coded offset we used before with the offset of whatever partition we just checked for containing NTFS. So instead of

filesystemObject = pytsk3.FS_Info(imagehandle, offset=65536)

we are going to replace 65536 with the same variable we are printing out in the partition table, partition.start, which gives us the number of sectors into the disk we are looking at where this partition begins. Then we need to take that from sectors to absolute offset by multiplying the number of sectors by the sector size. I am going to assume here that you have a sector size of 512 (if you don't, change 512 to whatever your sector size is), so I am going to multiply the number contained in partition.start by 512 to get the absolute offset where this partition begins. Lastly I wrap the multiplication in a () to make it clear that it should be evaluated before the result is passed in. The end result looks like this:

filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
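If you want to sanity check the arithmetic, here is a tiny sketch. The helper name sector_to_offset is mine and the 512 default is the same assumption made above; the value 128 is the starting sector of the NTFS partition in the part 2 example image, which was printed as 128s(65536).

```python
def sector_to_offset(start_sector, sector_size=512):
    """Convert a partition's starting sector into an absolute byte offset."""
    return start_sector * sector_size


offset = sector_to_offset(128)               # 128 sectors * 512 bytes = 65536
offset_4k = sector_to_offset(128, 4096)      # same start on a 4096 byte sector disk
```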
Great! Now we are iterating through and accessing all the NTFS partitions on our live system! But wait, if we are doing this for multiple partitions we will be overwriting the contents of each prior partition with the next file when we write out the data. The next change we have to make is changing the name of our output file to identify the partition number the file came from and the name of the file we are extracting. To do this I am first making a new variable called outFileName which will contain the filename of the file we are writing out to. In order to combine the partition number and the filename together I am going to use the '+' operator, which will combine two strings together and return the combination. There is only one small thing left to fix: partition.addr, which stores our partition number, is not stored as a string, it is stored as an integer (a number). So in order to append it to a string using the '+' operator I need to tell python to treat it or 'cast it' as a string using the str() function. The end result looks like this:

outFileName = str(partition.addr)+fileobject.info.name.name
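To see why the str() call matters, here is a short sketch with plain values standing in for partition.addr and the file name. The helper make_out_name is a made up name for illustration.

```python
def make_out_name(partition_addr, file_name):
    """Prefix the file name with the partition number it came from."""
    return str(partition_addr) + file_name


name = make_out_name(2, "$MFT")

# Without str(), python refuses to add an integer to a string
try:
    2 + "$MFT"
    failed = False
except TypeError:
    failed = True
```

With the cast in place name holds "2$MFT", and without it the '+' raises a TypeError.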
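I now change our open command that opens our output file for writing to use this variable instead of the hard coded file name we had before. I am also opening the file in binary mode ('wb'), since the $MFT is binary data and text mode on Windows would alter line ending bytes as they are written:

outfile = open(outFileName, 'wb')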

and lastly since we are going to be reusing this file handle it would be wise to close it 

outfile.close()
and that's it! 
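As an aside, python can close the file for you. Here is a minimal sketch using a with block and binary mode; the bytes written are a stand-in I made up, not a real $MFT, and the temporary directory is only there so the example can run anywhere.

```python
import os
import tempfile

data = b"FILE0\x00\r\n\x00"   # stand-in bytes, not a real $MFT

out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "2$MFT")

# 'wb' writes the bytes exactly as given; the with block closes the file
with open(out_path, 'wb') as outfile:
    outfile.write(data)

# Read the file back to confirm nothing was changed on the way out
with open(out_path, 'rb') as infile:
    round_trip = infile.read()
```

Once the with block ends the file handle is closed even if the write raised an error, which matters when you are looping over many partitions.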

We now have a program that will iterate through all the partitions on the 1st disk in your live system and extract the $MFTs from every NTFS partition to uniquely named files! The finished code looks like this:

#!/usr/bin/python
# Sample program or step 3 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
imagefile = "\\\\.\\PhysicalDrive0"
imagehandle = pytsk3.Img_Info(imagefile)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
  if 'NTFS' in partition.desc:
    filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
    fileobject = filesystemObject.open("/$MFT")
    print "File Inode:",fileobject.info.meta.addr
    print "File Name:",fileobject.info.name.name
    print "File Creation Time:",datetime.datetime.fromtimestamp(fileobject.info.meta.crtime).strftime('%Y-%m-%d %H:%M:%S')
    outFileName = str(partition.addr)+fileobject.info.name.name
    print outFileName
    outfile = open(outFileName, 'wb')
    filedata = fileobject.read_random(0,fileobject.info.meta.size)
    outfile.write(filedata)
    outfile.close()

You can download this post's code at the series Github here: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v3.py
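One thing to keep in mind is that the code above reads the whole file into memory with a single read_random call, and an $MFT on a large volume can be big. Below is a hedged sketch of reading in chunks instead. FakeFile is a hypothetical stand-in I wrote so the example runs without an image; it only mimics the read_random(offset, length) call used above, and copy_in_chunks and its chunk size are my own choices.

```python
import io


class FakeFile(object):
    """Hypothetical stand-in exposing read_random(offset, length)."""
    def __init__(self, data):
        self._data = data
        self.size = len(data)

    def read_random(self, offset, length):
        return self._data[offset:offset + length]


def copy_in_chunks(fileobject, size, outfile, chunk_size=1024 * 1024):
    """Write size bytes from fileobject to outfile, chunk_size bytes at a time."""
    offset = 0
    while offset < size:
        to_read = min(chunk_size, size - offset)
        data = fileobject.read_random(offset, to_read)
        if not data:
            break  # nothing more could be read
        outfile.write(data)
        offset += len(data)
    return offset


source = FakeFile(b"abcdefghij")
sink = io.BytesIO()
copied = copy_in_chunks(source, source.size, sink, chunk_size=4)
```

In the real program you would pass fileobject and fileobject.info.meta.size in place of the fake object and its size, and the open output file in place of the BytesIO buffer.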

If you wanted to do this against a forensic image just change the path stored in the variable imagefile back to the forensic image you want it to extract from and all the rest of the code remains the same. In the next part in this series we will look at how to specify where to extract from and how to turn your python script into a standalone executable so you don't have to have python on the system you want to run your program on.

Thursday, February 19, 2015

Automating DFIR - How to series on programming libtsk with python Part 2

Hello Reader,
    In our last post, part 1, we printed out a partition table from a forensic image. Now you might have noticed that the image we are using for these first posts is a VHD and not an E01. VHD and other raw image formats are supported directly by pytsk, so for our first couple of posts it will be easier to work with them. Once we get beyond the basics I'll bring pyewf and its corresponding libewf libraries into our code examples and we will begin using all different types of image formats in our DFIR Wizardry, even getting into shadow copy access and more! So stick with me if you want to go step by step. If you don't want to go step by step and you have the programming experience to leap ahead then I would suggest jumping straight to the sample code that comes with the following projects:

pytsk sample code: https://github.com/py4n6/pytsk/tree/master/samples
libewf sample code: https://github.com/libyal/libewf/wiki/Development
dfvfs sample code:  https://github.com/log2timeline/dfvfs/tree/master/examples

Now with that out of the way let's move on to the next step of our DFIR Wizard program. In the first post, we can call that DFIRW v1, we accessed an image and printed out the partition table. Now let's extend our example and extract a file from the image.

First let's remember where we left off, here is the code we last worked with:
#!/usr/bin/python
# Sample program or step 1 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
imagefile = "Stage2.vhd"
imagehandle = pytsk3.Img_Info(imagefile)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len

Now to access a file stored in a forensic image we already have the main thing we need, a python libtsk object that provides functions to access the volume stored within the forensic image. Next we need an object that will give us access to the file system on a volume we choose. In the case of this example image there is only one valid file system and that is the second partition on the disk which is NTFS.

If you remember from the previous post our NTFS partition info looked like this:

2 NTFS (0x07) 128s(65536) 1042432

This becomes important in this next step because we need to tell libtsk where our file system is that we want to open. To do that we need  to call a new function called FS_Info which takes two important pieces of information to work, the name of the variable that is storing our image object we made already and the offset to where our file system begins on the partition we want to examine. If you remember from the last post we said that the value 65536 was the absolute offset to the beginning of the NTFS partition so we already have the information we need! Let's add that on to our program.

#!/usr/bin/python
# Sample program or step 2 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
imagefile = "Stage2.vhd"
imagehandle = pytsk3.Img_Info(imagefile)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
filesystemObject = pytsk3.FS_Info(imagehandle, offset=65536)

You can see we've added one new line to the bottom of our program. We've made a new variable called filesystemObject which, because we passed in the offset to our NTFS partition, now gives us access to the underlying filesystem contained within it! That's great you say, but how do we actually access a file? Well in later examples I'll show how to recurse through a file system to search, find, hash and all sorts of other good things, but to begin with let's just grab something. One of the files you can always expect on an NTFS drive is the master file table, which goes by the name of $MFT. So let's grab that!

#!/usr/bin/python
# Sample program or step 2 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
imagefile = "Stage2.vhd"
imagehandle = pytsk3.Img_Info(imagefile)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
filesystemObject = pytsk3.FS_Info(imagehandle, offset=65536)
fileobject = filesystemObject.open("/$MFT")

Great, now we have a new variable called fileobject which contains our access to all the things libtsk can tell us about the file $MFT located at the root of the file system. If you want to play with this program later you can change /$MFT to the full path of any other file you want. For now though let's focus on the $MFT, which is a useful file in its own right, and many times we want to extract it and parse it with external parsers to get at some of the more obscure metadata.

Let's start by gathering some information about the $MFT file like:
  • What is its inode number
    • libtsk has a metadata structure we can use to provide this. It's stored in the info.meta.addr value which in our code we would fully reference it as fileobject.info.meta.addr
  • What is the file name, in case in the future we are accessing files by inode
    • libtsk has a separate structure for file names than it does for metadata. While NTFS combines the storage of these two structures into one location (the MFT), many other file systems don't; instead they store the file name in the directory that links to the file. The filename is stored in info.name.name and we would fully reference it as fileobject.info.name.name
  • What is the creation time
    • Creation time and all the other time stamps of a file are always important to us. The value that libtsk returns to us is the time in epoch (Stored UTC). The creation timestamp is stored in info.meta.crtime and we would fully reference it as fileobject.info.meta.crtime
For a full list of what metadata properties you can access go here: http://www.sleuthkit.org/sleuthkit/docs/api-docs/structTSK__FS__META.html

So let's add in some code to print out all this useful information about $MFT in our image:
#!/usr/bin/python
# Sample program or step 2 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
imagefile = "Stage2.vhd"
imagehandle = pytsk3.Img_Info(imagefile)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
filesystemObject = pytsk3.FS_Info(imagehandle, offset=65536)
fileobject = filesystemObject.open("/$MFT")
print "File Inode:",fileobject.info.meta.addr
print "File Name:",fileobject.info.name.name
print "File Creation Time:", fileobject.info.meta.crtime

Awesome! Now we can access a file stored in an image and print out information about it! However, you'll notice if you run this example that the creation time printed isn't what you expect. The timestamp value being returned here is in epoch form, meaning the number of seconds that have passed since midnight 1/1/1970 UTC. So we need some help from the standard python libraries in getting this epoch timestamp into a human readable timestamp. The python standard library datetime will do just that for us! Inside of the datetime library is a function called 'fromtimestamp' which when combined with 'strftime' will allow us to convert our epoch value into a human readable timestamp of our liking! That's right, you can make the timestamp show up in any format you want to match American, European and other database specific timestamp formats.

To add in the datetime library we need to add a new import statement near the beginning of our program, import datetime. Then we need to use the two functions we talked about above, which we reference with the library name first (datetime), then the class within the library (datetime.datetime), followed by the function we want to call (fromtimestamp) and then how we want the timestamp printed (strftime). All combined you get the program as you see below:

#!/usr/bin/python
# Sample program or step 2 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
imagefile = "Stage2.vhd"
imagehandle = pytsk3.Img_Info(imagefile)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
filesystemObject = pytsk3.FS_Info(imagehandle, offset=65536)
fileobject = filesystemObject.open("/$MFT")
print "File Inode:",fileobject.info.meta.addr
print "File Name:",fileobject.info.name.name
print "File Creation Time:",datetime.datetime.fromtimestamp(fileobject.info.meta.crtime).strftime('%Y-%m-%d %H:%M:%S')

So to break this down further, instead of just printing the epoch value we are now printing the human readable value of the creation timestamp. We are doing this conversion with the datetime library by calling it as datetime.datetime.fromtimestamp. We are passing fromtimestamp the full reference to our selected file's creation timestamp (fileobject.info.meta.crtime) and then we are appending onto this a string formatting command. strftime, or string format time, is allowing us to control how the timestamp will be printed. Here we are passing %Y for the full four digit year, %m for the two digit month, %d for the two digit day and then the 24 hour version of the time with %H for hour, %M for minute and %S for seconds. You can change the ordering any way you want to make the timestamp format fit your needs. Keep in mind that fromtimestamp returns the time in the local time zone of the machine running the script, not in UTC.

For the full list of timestamp formatting codes go here: http://strftime.org/
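To make the formatting codes a bit more concrete, here is a minimal sketch that prints one epoch value in a few different layouts. The epoch value and the helper name format_epoch are made up for illustration; in the script above the value would come from fileobject.info.meta.crtime. Note that this sketch swaps fromtimestamp for adding the seconds onto 1/1/1970, so the result stays in UTC no matter what time zone the analysis machine is set to.

```python
import datetime

def format_epoch(epoch, fmt):
    """Format an epoch value (seconds since 1970-01-01 UTC) as a UTC string.

    Hypothetical helper for illustration; it adds the seconds to the epoch
    start instead of calling fromtimestamp, which would use local time.
    """
    utc_time = datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=epoch)
    return utc_time.strftime(fmt)

# Made-up epoch value standing in for fileobject.info.meta.crtime
epoch = 1424869509

iso_style = format_epoch(epoch, '%Y-%m-%d %H:%M:%S')  # year first
us_style = format_epoch(epoch, '%m/%d/%Y %H:%M:%S')   # month first
eu_style = format_epoch(epoch, '%d/%m/%Y %H:%M:%S')   # day first

print(iso_style)
print(us_style)
print(eu_style)
```

All three strings describe the same moment; only the order of the strftime codes changes.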

OK, now for the next part which many of you have been waiting for: how do I get a file out of this image? Would you believe me if I said all we need is three more lines of code? Well you should!

#!/usr/bin/python
# Sample program or step 2 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
imagefile = "Stage2.vhd"
imagehandle = pytsk3.Img_Info(imagefile)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
filesystemObject = pytsk3.FS_Info(imagehandle, offset=65536)
fileobject = filesystemObject.open("/$MFT")
print "File Inode:",fileobject.info.meta.addr
print "File Name:",fileobject.info.name.name
print "File Creation Time:",datetime.datetime.fromtimestamp(fileobject.info.meta.crtime).strftime('%Y-%m-%d %H:%M:%S')
outfile = open('DFIRWizard-output', 'wb')
filedata = fileobject.read_random(0,fileobject.info.meta.size)
outfile.write(filedata)

We are opening a file for writing, which I called DFIRWizard-output, using the open function and letting Python know we want to write binary data to this file using the 'wb' flag. Plain 'w' opens the file in text mode, which on Windows translates newline bytes on the way out and would corrupt the $MFT we are extracting. We are storing the file handle for writing to this file in the variable outfile. The final line then looks like outfile = open('DFIRWizard-output', 'wb')

To read the contents of the file we use the read_random function, which takes two parameters: the offset from the start of the file where we want to start reading and how many bytes of data we want to read. We are then reading in the contents of the $MFT from the beginning (0) to the end (fileobject.info.meta.size is the size of the file in bytes) and storing the data read into a variable called filedata. Now when you are working with large files or lots of files this isn't the best way to read the data, since the whole file is held in memory at once. You'll likely want to buffer it and do reads and writes in a loop, but to keep it simple we are making it one line. The final line then looks like filedata = fileobject.read_random(0,fileobject.info.meta.size)
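If you are curious what the buffered version might look like, here is a rough sketch of a read and write loop. The helper name copy_in_chunks, the 1 MB default chunk size, and the fake_read_random stand-in are all my own assumptions for illustration; the only thing taken from the script above is that the read function accepts an offset and a length, the way read_random does.

```python
import io

def copy_in_chunks(read_at, total_size, outfile, chunk_size=1024 * 1024):
    """Copy total_size bytes to outfile, calling read_at(offset, length)
    for one chunk at a time. Returns the number of bytes copied.
    Hypothetical helper, not part of pytsk3.
    """
    offset = 0
    while offset < total_size:
        # Never ask for more than what is left in the file
        length = min(chunk_size, total_size - offset)
        data = read_at(offset, length)
        if not data:
            break  # stop instead of looping forever on an empty read
        outfile.write(data)
        offset += len(data)
    return offset

# Stand-in data and reader so the sketch runs without an image file
source = b"FILE0" * 1000

def fake_read_random(offset, length):
    return source[offset:offset + length]

out = io.BytesIO()
copied = copy_in_chunks(fake_read_random, len(source), out, chunk_size=512)
print(copied)
```

Against a real image the call would presumably be copy_in_chunks(fileobject.read_random, fileobject.info.meta.size, outfile), so only one chunk sits in memory at a time instead of the whole file.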

Last, we are writing the data we just read into filedata into an output file 'DFIRWizard-output' using the write method that is available to all file objects. We do this by calling the write method on the file object (outfile) and passing it the variable we want to write to the file (filedata). So when all is said and done it looks like outfile.write(filedata)
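One variation worth considering for the write step is sketched below: opening the output in binary mode and using a with block so the file gets closed even if something goes wrong. The helper name write_bytes, the temporary directory, and the stand-in bytes are assumptions for illustration only; in the script above filedata would hold what read_random returned.

```python
import os
import tempfile

def write_bytes(path, data):
    """Write data to path in binary mode and return the size on disk.
    Hypothetical helper for illustration.
    """
    # 'wb' writes the bytes unchanged; the with block closes the file for us
    with open(path, 'wb') as outfile:
        outfile.write(data)
    return os.path.getsize(path)

# Stand-in for the data read out of the image
filedata = b"FILE0\r\n\x00\x01" * 4

outpath = os.path.join(tempfile.mkdtemp(), 'DFIRWizard-output')
written = write_bytes(outpath, filedata)
print(written)
```

Checking the size on disk against the size you expected to write is a cheap sanity test that the extracted copy is complete.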

That's it! Our second version of DFIR Wizard is done! You can try this yourself or download my version from the series GitHub at: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v2.py

In the third part of this series we will show how to do the same thing against a live system!