Introduction to iOS Binary Patching (Part 1)

Part of my job as a forensic scientist is to hack applications. When working some high profile cases, it’s not always that simple to extract data right off of the file system; this is especially true if the data is encrypted or obfuscated in some way. In such cases, it’s sometimes easier to clone the file system of a device and perform what some would call “forensic hacking”; there are often many flaws within an application that can be exploited to convince the application to unroll its own data. We also perform a number of red-team pen-tests for financial/banking, government, and other customers working with sensitive data, where we (under contract) attack the application (and sometimes the servers) in an attempt to test the system’s overall security. More often than not, we find serious vulnerabilities in the applications we test. In the time I’ve spent doing this, I’ve seen a number of applications whose encryption implementations have been riddled with holes, allowing me to attack the implementation rather than the encryption itself (which is much harder).

There are a number of different ways to manipulate an iOS application. I wrote about some of them in my last book, Hacking and Securing iOS Applications. The most popular (and expedient) method involves using tools such as Cycript or a debugger to manipulate the Objective-C runtime, which I demonstrated in my talk at Black Hat 2012 (slides). This is very easy to do, as the entire runtime funnels through only a handful of runtime C functions. It’s quite simple to hijack an application’s program flow, create your own objects, or invoke methods within an application. Oftentimes, tinkering with the runtime is more than enough to get what you want out of an application. The worst example of security I demonstrated in my book was one application that simply decrypted and loaded all of its data with a single call to an application’s login function, [ OneSafeAppDelegate userIsLogged: ]. Manipulating the runtime will only get you so far, though. Tools like Cycript only work well at a method level. If you’re trying to override some logic inside of a method, you’ll need to resort to a debugger. Debugging an application gives you more control, but is also an interactive process; you’ll need to repeat your process every time you want to manipulate the application (or write some fancy scripts to do it). Developers are also getting a little trickier today in implementing jailbreak detection and counter-debugging techniques, meaning you’ll have to fight through some additional layers just to get into the application.

This is where binary patching comes in handy. One of the benefits to binary patching is that the changes to the application logic can be made permanent within the binary. By changing the program code itself, you’re effectively rewriting the application. It also lets you get down to a machine instruction level and manipulate registers, arguments, comparison operations, and other granular logic. Binary patching has been used historically to break applications’ anti-piracy mechanisms, but is also quite useful in the fields of forensic research as well as penetration testing. If I can find a way to patch an application to give me access to certain evidence that it wouldn’t before, then I can copy that binary back to the original device (if necessary) to extract a copy of the evidence for a case, or provide the investigator with a device that has a permanently modified version of the application they can use for a specific purpose. For our pen-testing clients, I can provide a copy of their own modified binary, accompanied by a report demonstrating how their application was compromised, and how they can strengthen the security for what will hopefully be a more solid production release.

Patching an iOS binary typically requires that it be signed again, as modifying the binary invalidates its code signature, a security mechanism added by Apple. Most people who attack applications, however, copy the application over to a device with a jailbreak installed already (which typically disables this signing), or will jailbreak the device it’s on. Some jailbreaks don’t completely disable the signing, requiring the use of a tool like ldid, or codesign_allocate to sign it. It’s also possible to take a binary written by one developer, patch it, then sign it again using a different signature. This is one of the more underhanded techniques a malicious hacker might use to get the application running on a non-jailbroken device – or one of the more forensically useful techniques :)

There are a number of tools you can use for patching an iOS binary. You can do it for free, or you can spend upwards of $5,000 for good tools. This article will demonstrate the basics of how the process works, and also reference a couple of tools that will make your life a lot easier. Much of what’s in this article can also be used to patch other executables (such as x86 desktop OS X apps), as they use the same Mach-O file format (the instructions will be different, however).

Let’s take the following simple program, which calls another function and then evaluates the result. This kind of function could be used anywhere in an application. It could be for the purpose of testing if a device has a jailbreak, ensuring an account has sufficient funds to perform a transfer, or (hopefully not) authenticating a user’s password. This is a pretty vanilla example:

#include <stdlib.h>

int test_condition() {
    return 0xff;
}

int main() {
    int result = test_condition();
    if (result) {
        exit(0);
    }
}

The code is very simple. It calls the test_condition() function, which returns 0xff (255). It then evaluates the result, and exits if it is non-zero. This is the typical kind of response you can expect from an application with some form of jailbreak detection; such a function may test for the existence of a specific file or perform another test to determine if the device is rooted.

To compile this for iOS, you can use llvm-gcc on the command-line. Just make sure you’ve installed Xcode and the command-line tools.

$ export SDK_PATH=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS6.1.sdk
$ export ARM_PATH=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/
$ ${ARM_PATH}/arm-apple-darwin10-llvm-gcc-4.2 -o test1 test1.c -isysroot ${SDK_PATH}

Now let’s take a look at the relevant portions of machine code. To do this, use the otool command.

$ otool -tV test1
_main:
  00002fa8 e92d4080 push {r7, lr}
  00002fac e1a0700d mov r7, sp
  00002fb0 e24dd00c sub sp, sp, #12
 -00002fb4 e59f0034 ldr r0, [pc, #52]
| 00002fb8 e58d0000 str r0, [sp]
| 00002fbc ebfffff0 bl _test_condition <- test_condition()
| 00002fc0 e58d0004 str r0, [sp, #4]
| 00002fc4 e59d0004 ldr r0, [sp, #4] <- result (0xff)
| 00002fc8 e59d1000 ldr r1, [sp] <- comparison (0)
| 00002fcc e1500001 cmp r0, r1 <- subtract the two
| 00002fd0 1a000000 bne 0x2fd8 <- go if nonzero
| 00002fd4 ea000001 b 0x2fe0 <- otherwise, go here
| 00002fd8 e3a00000 mov r0, #0 <- load exit value
| 00002fdc eb000004 bl 0x2ff4 @ symbol stub for: _exit
| 00002fe0 e5170004 ldr r0, [r7, #-4]
| 00002fe4 e1a0d007 mov sp, r7
| 00002fe8 e8bd4080 pop {r7, lr}
| 00002fec e12fff1e bx lr
 -00002ff0 00000000 andeq r0, r0, r0

The output is pretty straightforward assembly, because we haven’t compiled with any optimizations, or with thumb. Before the test_condition function is called, a value of zero is stored on the stack. This is the value we’re going to eventually compare the result of the test function to. At 0x2fbc, the main function branches to the test_condition function. This would look like a “call” on an x86 machine. Upon returning, a comparison is performed at 0x2fcc between the return value of test_condition (stored in r0), and the comparison value of zero (loaded into r1). In plain terms, you can think of the instruction as “compare 0xff and zero”, and the next instruction as “what to do if…” In reality, the comparison instruction subtracts the two register values and sets the condition flags based on the result. The bne instruction (branch if not equal) immediately follows the comparison, and causes a branch (jump) to 0x2fd8 if the result of the cmp operation is non-zero (e.g. return code != 0). At location 0x2fd8, we see r0 loaded with an exit code, and exit invoked. Right after the bne instruction, at 0x2fd4, is another branch (to 0x2fe0), which is where the program jumps when the test_condition function returns zero.

Now let’s take a look at our test function.

_test_condition:
 00002f84 e24dd008 sub sp, sp, #8
 - 00002f88 e59f0014 ldr r0, [pc, #20] <- load 0xff into r0
| 00002f8c e58d0000 str r0, [sp]
| 00002f90 e59d0000 ldr r0, [sp]
| 00002f94 e58d0004 str r0, [sp, #4]
| 00002f98 e59d0004 ldr r0, [sp, #4]
| 00002f9c e28dd008 add sp, sp, #8
| 00002fa0 e12fff1e bx lr <- return
 - 00002fa4 000000ff .long 0x000000ff <- our data (0xff)

This too is pretty straightforward, with the exception of the compiler being stupid. The instruction at 0x2f88 loads the value at the memory address [program counter + 20] into r0. This value is our data, stored directly in the binary as 0xff. A few instructions later, at 0x2fa0, it branches to lr (link register), which typically contains the return address of the caller. The extra crud added in by the compiler is virtually useless. It first allocates 8 bytes on the stack, reads and writes to it, then discards the stack before it returns.

Something worth noting: ldr statements from both main and test_condition functions refer to the program counter (pc), and add an offset. At the time these are executed, the program counter is equal to the address of the instruction being executed + 8 bytes (two instructions’ worth). On binaries compiled with -mthumb, the program counter is equal to the address of the instruction + 4 bytes (also two instructions’ worth for 16-bit thumb instructions). In our example of test_condition, pc + #20 = 0x2f88 + 8 + 20 = 0x2fa4 (we didn’t compile with thumb).
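The pc arithmetic above can be sketched in C. This is only an illustration for ARM-mode (non-thumb) code; the function name arm_ldr_literal_address is made up for this example and is not part of otool or any other tool. It takes the address and 32-bit word of an instruction as printed by otool and returns the address the ldr reads from.

```c
#include <stdint.h>

/* Address read by an ARM-mode (non-thumb) "ldr rX, [pc, #imm]".
 * In ARM mode the pc reads as the instruction's own address + 8.
 * Returns 0 if the word is not a pc-relative ldr.
 * Hypothetical helper written for this article. */
uint32_t arm_ldr_literal_address(uint32_t insn_addr, uint32_t insn_word)
{
    /* Match a word load with immediate offset from pc:
     * bits 27-24 = 0101, B = 0, W = 0, L = 1, Rn = pc (15). */
    if ((insn_word & 0x0f7f0000u) != 0x051f0000u)
        return 0;

    uint32_t imm12 = insn_word & 0xfffu;   /* unsigned byte offset */
    uint32_t pc = insn_addr + 8;           /* pc as the instruction sees it */

    /* Bit 23 (U) selects whether the offset is added or subtracted. */
    return (insn_word & 0x00800000u) ? pc + imm12 : pc - imm12;
}
```

Feeding it the ldr from test_condition (address 0x2f88, word 0xe59f0014) gives 0x2fa4, the location of the .long holding 0xff.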

This is all fine and good, and it’s relatively easy to gain an understanding of how we can attack this code. The following possibilities immediately come to mind.

  • Because our test value is hard-coded, we could change 0xff in the binary to 0x00.
  • We could modify the test_condition function to return 0.
  • We could flip the logic around in the main function.
  • We could nop the first branch (to exit), letting the program branch where we want it.

There are several other ways to attack this code as well, but these four are probably the ones you’re going to find feasible in most attacks against program logic like this. We will explore each.

Translating File Offsets

Before we can patch the binary in any way, we need to know where to patch in the binary. While the otool command gives us offsets, these offsets don’t reflect the actual file position on disk. They actually represent offsets within memory when the binary is loaded. In order to translate this to an actual file offset, we need to map the file offset to the memory offset.

To create this mapping, the otool command again comes in handy. The Mach-O binary includes a set of load commands. These load commands tell the dynamic linker where in virtual memory to map each of the segments within the binary file. The code for your binary can be found in the __text section of the binary. To list out the different load commands for each section, use otool with the -l flag.

$ otool -l test1
Section
 sectname __text
 segname __TEXT
 addr 0x00002f0c
 size 0x000000e8
 offset 7948
 align 2^2 (4)
 reloff 0
 nreloc 0
 flags 0x80000400
 reserved1 0
 reserved2 0
...

The offset within the file is displayed as 7948. The start address for the __text section in memory is 0x2f0c. Now let’s go back to our disassembly and look at where the value 0xff was stored in memory:

00002fa4 000000ff .long 0x000000ff <- our data (0xff)

If you ran this program with a debugger attached, you’d see our value right where you’d expect it:

# gdb -q ./test1
 Reading symbols for shared libraries . done
 (gdb) b main
 Breakpoint 1 at 0x2fb4
 (gdb) r
 Starting program: /private/var/root/test1
 Reading symbols for shared libraries + done
 Reading symbols for shared libraries ............................ done
Breakpoint 1, 0x00002fb4 in main ()
 (gdb) x/w 0x2fa4
 0x2fa4 <test_condition+32>: 0x000000ff
 (gdb)

And you could even change it to your liking:

(gdb) set *(char *) 0x2fa4 = 0
(gdb) x/w 0x2fa4
 0x2fa4 <test_condition+32>: 0x00000000

To calculate the offset of address 0x2fa4 within the file, however, you’ll need to do a tiny bit of math. Subtract the start address for the section from the address of our data, and then add the section’s file offset to get the file position. So:

File Offset = (0x2fa4 - 0x2f0c) + 7948
= 152 + 7948
= 8100 (0x1fa4)

$ hexdump -s 0x1fa4 -n 4 test1
0001fa4 ff 00 00 00

And there we go. Since iOS devices are little endian, multi-byte values are stored least significant byte first, so 0x000000ff appears on disk as ff 00 00 00.
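The address-to-file-offset math is easy to get wrong by hand, so here is a minimal sketch of it in C. The function name section_file_offset is an assumption made for this example; its three section parameters correspond to the addr, size, and offset fields printed by otool -l.

```c
#include <stdint.h>

/* Map a virtual address inside a section to its position in the file.
 * sect_addr, sect_size and sect_offset are the addr, size and offset
 * fields printed by "otool -l". Returns -1 if addr lies outside the
 * section. Hypothetical helper written for this article. */
int64_t section_file_offset(uint32_t addr, uint32_t sect_addr,
                            uint32_t sect_size, uint32_t sect_offset)
{
    if (addr < sect_addr || addr - sect_addr >= sect_size)
        return -1;
    return (int64_t)sect_offset + (addr - sect_addr);
}
```

With the __text values from test1 (addr 0x2f0c, size 0xe8, offset 7948), address 0x2fa4 maps to 8100 (0x1fa4), matching the calculation above. The bounds check is there so that an address belonging to some other section is rejected rather than silently mapped to the wrong place.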

Edit this in a hex editor at file offset 0x1fa4, then save.

$ hexdump -s 0x1fa4 -n 4 test1
 0001fa4 00 00 00 00

Now, when the application runs, it will load this value instead of the 0xff that was there before. The compare operation will result in a zero value, instead of non-zero like before. When this happens, the ‘bne’ instruction will not branch, so the program will then execute the second branch in our function. Congratulations! You’ve just patched a logic check out of your binary!
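If you would rather script the edit than use a hex editor, the same overwrite can be sketched with standard C file I/O. The helper name patch_bytes is made up for this example, and you should only ever run something like it against a copy of the binary.

```c
#include <stdio.h>

/* Overwrite len bytes at file_offset in an existing file, in place.
 * Returns 0 on success, -1 on any I/O error.
 * Hypothetical helper written for this article; work on a copy of
 * the binary, never on your only original. */
int patch_bytes(const char *path, long file_offset,
                const unsigned char *bytes, size_t len)
{
    FILE *fp = fopen(path, "r+b");   /* update mode: does not truncate */
    if (fp == NULL)
        return -1;

    int ok = fseek(fp, file_offset, SEEK_SET) == 0 &&
             fwrite(bytes, 1, len, fp) == len;

    if (fclose(fp) != 0)             /* fclose flushes; check it too */
        ok = 0;
    return ok ? 0 : -1;
}
```

For the patch in this section, you would call it with the path to your copy of test1, the offset 0x1fa4, and a single zero byte, then confirm the result with hexdump as shown above.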

Manipulating Return Values

You’re now familiar with how to make actual patches on disk, and can calculate the file offset of data based on its address within the section. Most applications don’t perform tests on hard-coded data, however, and so knowing how to change a static value isn’t going to get you very far.

The second technique we discussed for attacking this logic is to modify the test_condition function to return zero. Regardless of how it gets its values, we want the function to always return zero so that the application won’t suddenly exit. Let’s have a look at the test_condition function again:

_test_condition:
 00002f84 e24dd008 sub sp, sp, #8
 - 00002f88 e59f0014 ldr r0, [pc, #20] <- load 0xff into r0
| 00002f8c e58d0000 str r0, [sp]
| 00002f90 e59d0000 ldr r0, [sp]
| 00002f94 e58d0004 str r0, [sp, #4]
| 00002f98 e59d0004 ldr r0, [sp, #4]
| 00002f9c e28dd008 add sp, sp, #8
| 00002fa0 e12fff1e bx lr <- return
 - 00002fa4 000000ff .long 0x000000ff

It’s standard calling convention on an ARM processor to store the function return value in r0, and this is what the function does at 0x2f88. As we discussed, the rest of the stack play is merely stupidity on the compiler’s part. At 0x2f84, 8 bytes are allocated on the stack, which are later reclaimed at 0x2f9c when the stack is discarded. We need to make sure we don’t break this either: we need to make sure we either discard the stack, or never allocate on it in the first place by overwriting the sub instruction. What we would like our hacked test_condition function to do is to load the r0 register with zero, and then return by branching to lr:

mov r0, 0
bx lr

So we know the instructions, but you can’t just write that into the executable file. The next step is to determine the hexadecimal encoding for these instructions. To do this, I like to use llvm-mc. This comes from the llvm compiler, which can be freely downloaded from the llvm project’s website at llvm.org. Using the llvm-mc command, you can easily get the encodings for any instructions you care to use.

$ echo "mov r0, 0; bx lr" | llvm-mc -assemble -triple=armv7 -show-encoding
 .text
 mov r0, #0 @ encoding: [0x00,0x00,0xa0,0xe3]
 bx lr @ encoding: [0x1e,0xff,0x2f,0xe1]

You can also display the encodings for programs compiled with thumb:

$ echo "mov r0, 0; bx lr" | llvm-mc -assemble -triple=thumbv7 -show-encoding
 .text
 mov.w r0, #0 @ encoding: [0x4f,0xf0,0x00,0x00]
 bx lr @ encoding: [0x70,0x47]

Let’s overwrite the test_condition function with our new instructions. You are already familiar with how to map to a file offset. The test_condition function begins at 0x2f84. Using the same calculation as before:

File Offset = (0x2f84 - 0x2f0c) + 7948
= 120 + 7948
= 8068 (0x1f84)

$ hexdump -s 0x1f84 -n 8 test1
 0001f84 08 d0 4d e2 14 00 9f e5

You can see the eight bytes at 0x1f84 perfectly match the instructions in the disassembly at that location, except that the bytes of each instruction have been reversed for little endian.

00002f84 e24dd008 sub sp, sp, #8
00002f88 e59f0014 ldr r0, [pc, #20]

Using a hex editor, carefully overwrite these bytes with the instructions they’re being replaced with. The rest of the function doesn’t really matter, because we plan on returning before any of it ever gets executed. What matters is that there’s room to fit in our own replacement code.

NOTE: The encoding provided by llvm-mc has already accounted for endian, so you won’t need to flip the bytes around. Convenient!
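If you start from the 32-bit words that otool prints instead of llvm-mc’s byte lists, you do have to flip them yourself. A small sketch of that conversion follows; the function name arm_word_to_le_bytes is an assumption made for this example.

```c
#include <stdint.h>

/* Write a 32-bit ARM instruction word the way it appears in a
 * little-endian file: least significant byte first.
 * Hypothetical helper written for this article. */
void arm_word_to_le_bytes(uint32_t word, unsigned char out[4])
{
    out[0] = (unsigned char)(word & 0xff);
    out[1] = (unsigned char)((word >> 8) & 0xff);
    out[2] = (unsigned char)((word >> 16) & 0xff);
    out[3] = (unsigned char)((word >> 24) & 0xff);
}
```

Passing in 0xe3a00000 (mov r0, #0) fills the array with 00 00 a0 e3, and 0xe12fff1e (bx lr) gives 1e ff 2f e1, the same bytes llvm-mc reported.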

Your binary should now look like this:

$ hexdump -s 0x1f84 -n 8 test1
 0001f84 00 00 a0 e3 1e ff 2f e1

If you disassemble the program again using otool, the first two instructions of your test_condition function should now match our replacement instructions.

_test_condition:
 00002f84 e3a00000 mov r0, #0
 00002f88 e12fff1e bx lr

Again, we don’t care about the rest as it never gets executed. Now when the program runs, the test function will immediately load zero into the r0 register and return. The comparison will then find the two values equal, which will cause the bne instruction (leading to a premature exit) to fall through, and the program will continue along the path as if the test has passed!

Logic Inversion

Manipulating return values is especially useful when there are a number of different calls to the same test, and you want the value to always return the same. In some cases, however, you may find that you only want to patch out the result of some calls to the function, and allow other calls to the function to receive the intended return code. In cases like this, it can make more sense to change the logic of the caller. For example, our original program’s source code stated:

if (result) {
    exit(0);
}

This is demonstrated by the ‘cmp’ and ‘bne’ instructions, which compare the return value to zero, and then branch if the results are not equal:

if (result) {
    exit(0);
}

00002fcc e1500001 cmp r0, r1 <- subtract the two
00002fd0 1a000000 bne 0x2fd8 <- go if nonzero

The bne instruction has an opposite counterpart: beq. The beq instruction (branch if equal) causes the program to branch when the two compared values are equal, that is, when the subtraction performed by cmp results in zero. Here, that translates to:

if (! result) {
    exit(0);
}

In other words, if we know the test will fail, we can bypass the test by changing the comparison that happens after the test. This can be done using the same hex editing techniques as you’ve already learned, except changing the bne instruction.

NOTE: The way bne and beq work is that they check the z flag (zero flag), which gets set or cleared during a cmp operation.

Determining the correct instruction is a little tricky here. That 0x2fd8 you see is actually an absolute address the disassembler calculated for you; the instruction itself only encodes an offset relative to the program counter. The instruction is actually ‘bne 0‘, because the program counter is already two instructions ahead of you (0x2fd0 + 8 = 0x2fd8). You can confirm this using llvm-mc:

$ echo 'bne 0' | llvm-mc -assemble -triple=armv7 -show-encoding
 .text
 bne #0 @ encoding: [0x00,0x00,0x00,0x1a]
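To see how the disassembler arrives at an address like 0x2fd8, here is a rough sketch of the calculation for ARM-mode branches (b, bl, and their conditional forms such as bne and beq). The function name arm_branch_target is made up for this example, and it assumes the caller has already confirmed the word is one of those branch instructions.

```c
#include <stdint.h>

/* Destination of an ARM-mode b/bl (or conditional form such as
 * bne/beq) located at insn_addr. The low 24 bits hold a signed
 * offset counted in 4-byte words, relative to pc (insn_addr + 8).
 * The caller must pass an actual branch word.
 * Hypothetical helper written for this article. */
uint32_t arm_branch_target(uint32_t insn_addr, uint32_t insn_word)
{
    int32_t imm24 = (int32_t)(insn_word & 0x00ffffffu);
    if (imm24 & 0x00800000)      /* sign-extend from 24 bits */
        imm24 -= 0x01000000;

    /* Unsigned addition wraps, so negative offsets work as expected. */
    return insn_addr + 8 + (uint32_t)(imm24 * 4);
}
```

Using the words from the disassembly of main: 0x1a000000 at 0x2fd0 resolves to 0x2fd8, 0xea000001 at 0x2fd4 resolves to 0x2fe0, and the backward call 0xebfffff0 at 0x2fbc resolves to 0x2f84, the start of test_condition.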

To flip the logic around on this, change bne to beq:

$ echo 'beq 0' | llvm-mc -assemble -triple=armv7 -show-encoding
 .text
 beq #0 @ encoding: [0x00,0x00,0x00,0x0a]

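Now calculate the correct file offset to overwrite the 0x1a with 0x0a. The bne instruction sits at 0x2fd0, which maps to file offset 0x1fd0; because the bytes are stored in little endian order, the 0x1a is the last of the four, at 0x1fd3. Disassemble again to confirm the logic has changed: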

00002fcc e1500001 cmp r0, r1
00002fd0 0a000000 beq 0x2fd8 <- Ha ha! h4x0r3d j00!

At this point, your code will compare the result with zero, and branch to a premature exit only when the two are equal, the opposite of what it did before.
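The bne/beq swap generalizes. In ARM mode, the condition lives in the top four bits of the instruction word, and the conditions come in opposite pairs (eq/ne, ge/lt, gt/le, and so on) that differ only in the lowest of those bits. A sketch of the flip, with a function name (arm_invert_condition) invented for this example:

```c
#include <stdint.h>

/* Return the ARM-mode instruction word with its condition inverted
 * (ne <-> eq, ge <-> lt, ...). The condition is the top four bits,
 * and opposite conditions differ only in the lowest of them.
 * Words with condition 0xe ("always") or 0xf have no opposite and
 * are returned unchanged.
 * Hypothetical helper written for this article. */
uint32_t arm_invert_condition(uint32_t insn_word)
{
    uint32_t cond = insn_word >> 28;
    if (cond >= 0xe)
        return insn_word;
    return insn_word ^ 0x10000000u;
}
```

Applied to 0x1a000000 (bne) this yields 0x0a000000 (beq), which is exactly the one-byte change made above. This only covers non-thumb code; thumb encodes conditions differently.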

Deleting instructions with nop

The nop instruction is short for “no operation”. It’s the equivalent of replacing an instruction with whitespace. On ARM, the nop instruction takes four bytes:

nop @ encoding: [0x00,0xf0,0x20,0xe3]

On thumb, it takes two:

nop @ encoding: [0x00,0xbf]

One last example of how to attack this binary is to patch the ‘bne‘ instruction so that it is replaced with a nop. This will prevent any and all chance that the program will jump to its premature exit. Because another branch exists immediately after the bne, the program will continue execution as if the test had passed. To do this, simply take the bytes containing the bne instruction:

00002fcc e1500001 cmp r0, r1 <- subtract the two
00002fd0 1a000000 bne 0x2fd8 <- go if nonzero
00002fd4 ea000001 b 0x2fe0 <- otherwise, go here

And replace them with a nop:

00002fcc e1500001 cmp r0, r1
00002fd0 e320f000 nop <- Ha ha!
00002fd4 ea000001 b 0x2fe0

$ hexdump -C -s 0x1fd0 -n 4 test1
 00001fd0 00 f0 20 e3

You’ve now completely obliterated the instruction!

Exercises:

A few exercises for the reader:

1. Instead of nop’ing the bne instruction, try overwriting it with the branch instruction that follows it.

2. Patch this binary so that 0xff is returned, but it is compared with the value 0xff, so that the test passes.

3. Try a different type of attack: nop the exit statement. Fortunately, the rest of the program code immediately follows the exit – but what would happen if it didn’t?

A few other things

1. This primer focused on a binary with only one architecture. For binaries with multiple architectures, you’ll need to disassemble and patch all relevant architectures within the binary. Use the otool -f command to display the universal binary header. This will give you the file offsets for each of the architectures present in the binary.

$ otool -f my_universal_binary
 Fat headers
 fat_magic 0xcafebabe
 nfat_arch 2
 architecture 0
 cputype 16777223
 cpusubtype 3
 capabilities 0x80
 offset 4096 <- Add this offset
 size 767344
 align 2^12 (4096)
 architecture 1
 cputype 7
 cpusubtype 3
 capabilities 0x0
 offset 774144 <- And later on this one
 size 606112
 align 2^12 (4096)

As you edit each architecture, be sure to add the file offset of the architecture in with your existing file offset calculations. Remember to always verify the hex matches that of the disassembly before you attempt to modify.
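If you want to pull those per-architecture offsets out programmatically rather than reading them off otool -f, the universal header is simple enough to parse by hand. The sketch below assumes the classic 32-bit fat header layout (an 8-byte header followed by 20-byte entries, all big endian); the function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit value (fat headers are stored big endian). */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Fetch the file offset of architecture number `index` from the first
 * bytes of a universal binary held in buf.
 * Returns 0 on success, -1 if the header is missing or too short.
 * Hypothetical helper written for this article. */
int fat_arch_file_offset(const unsigned char *buf, size_t len,
                         uint32_t index, uint32_t *offset_out)
{
    if (len < 8 || read_be32(buf) != 0xcafebabeu)
        return -1;
    if (index >= read_be32(buf + 4))      /* nfat_arch */
        return -1;
    if (index >= (len - 8) / 20)          /* entry must fit inside buf */
        return -1;

    /* Each 20-byte entry: cputype, cpusubtype, offset, size, align. */
    size_t entry = 8 + (size_t)index * 20;
    *offset_out = read_be32(buf + entry + 8);
    return 0;
}
```

The offset it returns is what you add to the section-relative file offset calculated earlier, since the offsets inside each architecture’s Mach-O header are relative to the start of that architecture’s slice.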

2. For $60, there is a great tool named Hopper, which is a disassembler, pseudocode generator (decompiler), and binary patcher all in one. You can use Hopper’s Modify -> Assemble Instruction feature to simply replace any existing instructions with new ones, and then use the File -> Produce New Executable feature to write a new binary out. This will save you hours of manual labor… but it’s important to learn how to do it by hand, which is why this primer was written.

3. If you get into serious penetration testing, consider a licensed copy of IDA (The Interactive Disassembler). The starter edition is a mere $500 and it is well worth the money. Free demo copies are available on their website at https://www.hex-rays.com/products/ida/support/download_demo.shtml