Python and Reverse Engineering
Before I get into this post, I should give you a little background on what I do day-to-day. In a typical week I do a wide range of work, most of it revolving around reverse engineering, exploit development, vulnerability analysis, penetration testing, and so on. The nature of my work (and that of many other researchers in my shoes) creates a very diverse workload, with each task having different requirements and environments. With that in mind, Python is my language of choice. I have yet to hit a limitation in Python that I haven’t been able to work around.
This past weekend I was talking to an acquaintance of mine about reverse engineering, exploit and tool development, and similar subjects. It was an interesting talk until I told him that 99% of the time I use Python for everything, with the other 1% being ASM (shellcode). That statement alone flipped a nice conversation into me being told that I was wrong and that I must be an idiot, because it was not possible to use a language such as Python for what we were talking about. He then followed that up by basically saying Python was ‘stupid’ and a waste of time. There are a few things that piss me off, and two of them are people telling me I am an idiot and people bashing Python. After I finished explaining to him how wrong he was, I got the idea to write this post and hopefully enlighten someone to the joy Python can be when reverse engineering.
First and foremost, I need to point out two reasons why Python and reverse engineering work so well together: Pedram Amini (aka Don Amini) and Ero Carrera, who are godsends to pyfreaks everywhere. Their code alone makes up most of the tools that I use, not to mention all the code I have written that is based on theirs. All of these are must-haves:
“PaiMei is written entirely in python and exposes at the highest level a debugger, a graph based binary abstraction and a set of utilities for accomplishing various repetitive tasks. The framework can essentially be thought of as a reverse engineer’s swiss army knife and has already been proven effective for a wide range of both static and dynamic tasks such as: fuzzer assistance, code coverage tracking, data flow tracking and more.”
Basically, PaiMei is THE reverse engineering framework and something you should not be without. As an added bonus it includes PyDbg, a Python win32 debugging interface; again, it’s a must-have. Matasano also has a nice write-up about it, and Pedram has a collection of scripts that are worth checking out.
“Sulley is a fuzzer development and fuzz testing framework consisting of multiple extensible components. Sulley (IMHO) exceeds the capabilities of most previously published fuzzing technologies, commercial and public domain. The goal of the framework is to simplify not only data representation but to simplify data transmission and target monitoring as well. Sulley is affectionately named after the creature from Monsters Inc., because, well, he is fuzzy.”
While not strictly a reverse engineering tool, Sulley is a very powerful fuzzer and a must-have.
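To give a feel for the "data representation" idea a fuzzing framework automates, here is a minimal sketch in plain Python. It is not Sulley's API: the REQUEST layout, the mutations() helper and the test_cases() generator are all hypothetical names made up for illustration, and a real framework adds transmission and target monitoring on top.

```python
import random

# Hypothetical field spec (NOT Sulley's API): (name, default bytes, fuzzable?)
REQUEST = [
    ("verb", b"GET", False),
    ("space", b" ", False),
    ("path", b"/index.html", True),
    ("version", b" HTTP/1.1\r\n\r\n", False),
]


def mutations(value, rng):
    """Yield a handful of classic mutations of a single field value."""
    yield b""                    # empty field
    yield value * 100            # heavy repetition
    yield b"A" * 5000            # long string
    yield b"%n%s%x" * 10         # format-string tokens
    flipped = bytearray(value)
    if flipped:
        # Flip every bit of one randomly chosen byte.
        flipped[rng.randrange(len(flipped))] ^= 0xFF
    yield bytes(flipped)


def test_cases(spec, seed=0):
    """Mutate one fuzzable field at a time, keeping the others at default."""
    rng = random.Random(seed)    # seeded so a crashing case can be replayed
    for index, (name, default, fuzzable) in enumerate(spec):
        if not fuzzable:
            continue
        for mutated in mutations(default, rng):
            parts = [field_default for _, field_default, _ in spec]
            parts[index] = mutated
            yield name, b"".join(parts)


if __name__ == "__main__":
    for field, payload in test_cases(REQUEST):
        print(field, len(payload), payload[:40])
```

Mutating one field at a time while holding the rest at their defaults keeps each test case valid enough to reach the parsing code you actually care about.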
Ero has written a slew of great tools, and it would take way too long to go into detail about each, but these are the ones I use frequently:
pydasm – Python interface to libdasm.
pefile – Python module to read and modify PE files.
ida2sql – IDA plugin that exports disassembly information from IDA into SQL.
pydot – Python interface to Graphviz’s Dot language.
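As a rough idea of the kind of work a module like pefile takes off your hands, the sketch below walks the start of a PE file using nothing but the standard library. This is not pefile's API: parse_pe_headers() and build_sample() are hypothetical helpers written for this example, the sample is a synthetic header built in memory rather than a real binary, and only a few header fields are read.

```python
import struct

COFF_HEADER = "<HHIIIHH"     # Machine, NumberOfSections, ..., Characteristics
SECTION_HEADER = "<8sIIII"   # Name, VirtualSize, VirtualAddress, raw size/ptr
SECTION_SIZE = 40            # each section table entry is 40 bytes


def parse_pe_headers(data):
    """Return the machine type and section list of a PE image (sketch)."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C points to the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")

    coff = e_lfanew + 4
    machine, nsections, _, _, _, opt_size, _ = struct.unpack_from(
        COFF_HEADER, data, coff)

    # The section table follows the 20-byte COFF header and optional header.
    table = coff + struct.calcsize(COFF_HEADER) + opt_size
    sections = []
    for i in range(nsections):
        name, vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            SECTION_HEADER, data, table + i * SECTION_SIZE)
        sections.append({
            "name": name.rstrip(b"\x00").decode("ascii", "replace"),
            "virtual_size": vsize,
            "virtual_address": vaddr,
            "raw_size": raw_size,
            "raw_pointer": raw_ptr,
        })
    return {"machine": machine, "sections": sections}


def build_sample():
    """Build a tiny synthetic PE header with one .text section (test data)."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)           # e_lfanew
    coff = struct.pack(COFF_HEADER, 0x014C, 1, 0, 0, 0, 0, 0)  # i386, 1 section
    text = struct.pack("<8sIIIIIIHHI", b".text", 0x1000, 0x1000,
                       0x200, 0x200, 0, 0, 0, 0, 0x60000020)
    return bytes(dos) + b"PE\x00\x00" + coff + text


if __name__ == "__main__":
    info = parse_pe_headers(build_sample())
    print(hex(info["machine"]), [s["name"] for s in info["sections"]])
```

A real parser also has to cope with the optional header, data directories, imports, exports and deliberately malformed files, which is exactly why a dedicated module is worth having.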
pyEmu is a Python-based x86 32-bit emulator. On its own it’s a very powerful tool to have, but when combined with the tools above, some very impressive code can be written extremely quickly. You should also check out his repository over on OpenRCE.org.
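If the idea of emulating x86 in Python sounds far-fetched, the toy below shows the basic fetch-decode-execute loop. It is only a sketch and has nothing to do with pyEmu's actual API: emulate() is a hypothetical function that understands a handful of 32-bit register-only opcodes (mov r32, imm32; inc; dec; add r/m32, r32 in register form; nop; ret) and ignores flags and memory entirely.

```python
import struct

# 32-bit general purpose registers in x86 encoding order.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
MASK = 0xFFFFFFFF


def emulate(code, regs=None):
    """Run a tiny subset of 32-bit x86 and return the final register state.

    Toy sketch: flags, memory and the stack are not modelled.
    """
    state = dict.fromkeys(REGS, 0)
    state.update(regs or {})
    eip = 0
    while eip < len(code):
        op = code[eip]
        if 0xB8 <= op <= 0xBF:                 # mov r32, imm32
            (imm,) = struct.unpack_from("<I", code, eip + 1)
            state[REGS[op - 0xB8]] = imm
            eip += 5
        elif 0x40 <= op <= 0x47:               # inc r32
            reg = REGS[op - 0x40]
            state[reg] = (state[reg] + 1) & MASK
            eip += 1
        elif 0x48 <= op <= 0x4F:               # dec r32
            reg = REGS[op - 0x48]
            state[reg] = (state[reg] - 1) & MASK
            eip += 1
        elif op == 0x01:                       # add r/m32, r32
            modrm = code[eip + 1]
            if modrm >> 6 != 3:
                raise NotImplementedError("memory operands not supported")
            src = REGS[(modrm >> 3) & 7]       # reg field is the source
            dst = REGS[modrm & 7]              # r/m field is the destination
            state[dst] = (state[dst] + state[src]) & MASK
            eip += 2
        elif op == 0x90:                       # nop
            eip += 1
        elif op == 0xC3:                       # ret: stop emulating
            break
        else:
            raise NotImplementedError("opcode 0x%02x" % op)
    return state


if __name__ == "__main__":
    # mov eax, 5; mov ecx, 7; add eax, ecx; inc eax; ret
    program = bytes.fromhex("b805000000b90700000001c840c3")
    print(emulate(program)["eax"])
```

Being able to poke at register state from ordinary Python like this is what makes an emulator so easy to combine with a disassembler or a PE parser.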
“IDAPython is an IDA Pro plugin that integrates the python programming language, allowing scripts to run in IDA Pro. These programs have access to IDA Plugin API, IDC and all modules available for python. The power of IDA Pro and python provides a platform for easy prototyping of reverse engineering and other research tools.”
“Immunity Debugger is a powerful new way to write exploits, analyze malware, and reverse engineer binary files. It builds on a solid user interface with function graphing, the industry’s first heap analysis tool built specifically for heap creation, and a large and well supported python API for easy extensibility.”
Although ImmDbg is not built on pure Python, I think it deserves a mention. Think of it as the bastard child of OllyDbg and Python.
All of these are great pieces of code, but what really makes them outstanding is using them together. Keep in mind this is in no way a complete list, just the tools that I use the most and that I think show how the flexibility of Python can make a task as complex as reverse engineering manageable. If you think I missed something, let me know and I will include it (maybe).