Twitter: @s4tan
Sojobo GitHub project: https://github.com/enkomio/Sojobo
Sojobo is a new binary analysis framework written in .NET and based on B2R2. I created this project for learning purposes and to make my work easier during malware analysis.
B2R2
A couple of months ago a new binary analysis framework named B2R2 was released ([01, 02]); it also won the "BAR 2019 Best Paper Award" ([03]). It immediately caught my attention since it is fully developed in F# on .NET Core and doesn't need any external libraries. This was a big plus for me since I love F# and I have always had issues with the most common binary analysis frameworks (they need a specific library version, or the Python binding doesn't work with the latest version, or they are supposed to run only on Linux).
B2R2 is a framework with an academic origin (a very rare case, since academics are often reluctant to release working source code) and the developer is very responsive (and kind) on GitHub. It supports various CPU architectures and implements a new IR (LowUIR) which is very simple to understand. It all sounds very promising :)
Unfortunately, as the B2R2 main developer wrote ([04]), it is a frontend framework and at the moment no backend implementation is provided. They are also considering building a business around a backend framework, and for now they are unsure when they will release it.
While waiting for that code to be released, I decided to write a backend of my own :)
Using Sojobo
Sojobo lets you emulate a 32-bit PE binary and interact with the emulation. It implements a Sandbox class that can be used to emulate a given binary. In the following section we will see how to write a simple generic unpacker.
Implementing a generic unpacker
As a first example I tried to write a tool that dumps a dynamically allocated memory region that is later executed. My purpose was to write a generic unpacker (as a POC, of course) by following the principles described in the paper "Automatic Static Unpacking of Malware Binaries" ([05]). Tools of this kind are pretty common among malware analysts; a new one was released recently ([06]).
You can find the source code of the test program in the GitHub repository; I'll paste it here for convenience:
#include <stdint.h>
#include <Windows.h>

// Copies the routine located between the `code` and `start` labels into buffer.
void copy_code(void *buffer)
{
    __asm
    {
        jmp start
    code:
        push ebp
        mov ebp, esp
        xor eax, eax
        mov edx, 1
        mov ecx, DWORD PTR [ebp+8]
    l:
        xadd eax, edx
        loop l
        mov esp, ebp
        pop ebp
        ret
    start:
        mov esi, code;
        mov edi, buffer;
        mov ecx, start;
        sub ecx, code;
        rep movsb
    }
}

int main()
{
    uint32_t ret_val = 0;
    void *fibonacci = VirtualAlloc(NULL, 0x1000, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
    copy_code(fibonacci);
    // call the freshly copied code as a function taking one argument
    ret_val = ((uint32_t (*)(uint32_t))fibonacci)(6);
    // with MEM_RELEASE the size argument must be 0, otherwise the call fails
    VirtualFree(fibonacci, 0, MEM_RELEASE);
    return 0;
}
As you can see, the code allocates a new memory region, invokes a function that copies some code into it and then executes it. I tried to mimic malware that unpacks the real payload in memory and then executes it. My goal is to dump that code.
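To make clear what the copied routine computes, here is a minimal Python model of its register updates. It is only a sketch: the function name run_copied_routine is mine, and the handling of n < 1 is a simplification (on real hardware the loop instruction would wrap around).

```python
MASK32 = 0xFFFFFFFF

def run_copied_routine(n: int) -> int:
    """Model the copied x86 routine and return the final value of EAX."""
    if n < 1:
        # On real hardware `loop` with ECX = 0 wraps around and iterates 2**32 times;
        # this sketch simply refuses that case.
        raise ValueError("n must be >= 1")
    eax, edx, ecx = 0, 1, n  # xor eax, eax / mov edx, 1 / mov ecx, [ebp+8]
    while True:
        # xadd eax, edx: EAX receives EAX + EDX, EDX receives the old EAX
        eax, edx = (eax + edx) & MASK32, eax
        ecx -= 1  # loop l: decrement ECX, repeat while it is non-zero
        if ecx == 0:
            return eax

print(run_copied_routine(6))  # 8
```

With the argument 6 used in main, the routine returns 8, the sixth Fibonacci number, which explains the name of the fibonacci pointer.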
To do that I'll follow a simple principle (described in the referenced paper): if a memory region that was previously written to is executed, I dump it to disk. With Sojobo I subscribed an event handler that is invoked each time memory is accessed. I can now step through the process and check whether a region that was previously written is now being executed.
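The write-then-execute principle can be sketched independently of any emulator. The class below is a hypothetical illustration in Python (none of these names come from Sojobo): it remembers written pages at a page granularity that I assume to be 0x1000 bytes, and reports a page the first time the program counter lands in it.

```python
PAGE_SIZE = 0x1000  # assumed page granularity

class WriteThenExecuteTracker:
    """Flags code that runs from memory the program wrote earlier."""

    def __init__(self):
        self._written_pages = set()
        self._reported_pages = set()

    def on_write(self, address, size=1):
        # Remember every page touched by the write.
        first = address // PAGE_SIZE
        last = (address + size - 1) // PAGE_SIZE
        self._written_pages.update(range(first, last + 1))

    def on_execute(self, pc):
        # Return the page base the first time a written page is executed, else None.
        page = pc // PAGE_SIZE
        if page in self._written_pages and page not in self._reported_pages:
            self._reported_pages.add(page)
            return page * PAGE_SIZE
        return None

tracker = WriteThenExecuteTracker()
tracker.on_write(0x20000, 0x14)        # the unpacking stub copies 0x14 bytes
print(tracker.on_execute(0x401000))    # None: original code, never written
print(hex(tracker.on_execute(0x20000)))  # 0x20000: written earlier, now executed
```

An emulator would call on_write from its memory-access event and on_execute from its step event, then dump the reported page.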
One of the first issues was emulating the invocation of external functions (like VirtualAlloc). With Sojobo you can easily emulate such calls by following a given coding convention (I'm a fan of the convention over configuration paradigm [07]). But don't worry: Sojobo already implements emulation for some functions and I plan to support many more.
That said, the solution to our problem is the following (the code is also on GitHub):
namespace ES.EndToEndTests

open System
open System.IO
open System.Collections.Generic
open B2R2
open ES.Sojobo.Model
open ES.Sojobo

module DumpDynamicMemory =
    let private _memoryRegions = new List<MemoryRegion>()
    let mutable private _memoryDumped = false

    let private memoryAccessedHandler(operation: MemoryAccessOperation) =
        match operation with
        | Read address -> ()
        | Write(address, value) -> ()
        | Allocate memRegion -> _memoryRegions.Add(memRegion)
        | Free memRegion -> ()

    let private writeDisassembly(activeProcess: IProcessContainer) =
        let text = Utility.formatCurrentInstruction(activeProcess)
        Console.WriteLine(text)

    let private identifyUnpackedCode(activeProcess: IProcessContainer) =
        if not _memoryDumped then
            let pc = activeProcess.GetProgramCounter().Value |> BitVector.toUInt32
            _memoryRegions
            |> Seq.tryFind(fun memRegion ->
                pc >= uint32 memRegion.BaseAddress &&
                pc < uint32 memRegion.BaseAddress + uint32 memRegion.Content.Length
            )
            |> Option.iter(fun memRegion ->
                // a previously allocated region now is being executed, maybe unpacked code!
                let filename = String.Format("mem_{0}.bin", memRegion.BaseAddress)
                File.WriteAllBytes(filename, memRegion.Content)
                Console.WriteLine("[+] Dynamic code dumped to: {0}!", filename)
                _memoryDumped <- true
            )

    let private step(activeProcess: IProcessContainer) =
        writeDisassembly(activeProcess)
        identifyUnpackedCode(activeProcess)

    let private getTestFile() =
        ["Release"; "Debug"]
        |> Seq.map(fun dir -> Path.Combine("..", "..", "..", dir, "RunShellcodeWithVirtualAlloc.exe"))
        |> Seq.tryFind(File.Exists)

    let ``dump dynamically executed memory``() =
        let sandbox = new Win32Sandbox()
        let exe =
            match getTestFile() with
            | Some exe -> exe
            | None ->
                Console.WriteLine("RunShellcodeWithVirtualAlloc.exe not found, please compile it first!")
                Environment.Exit(1)
                String.Empty
        sandbox.Load(exe)

        // setup handlers
        let proc = sandbox.GetRunningProcess()
        proc.Memory.MemoryAccess.Add(memoryAccessedHandler)
        proc.Step.Add(step)

        // print imported function
        proc.GetImportedFunctions()
        |> Seq.iter(fun symbol ->
            Console.WriteLine(
                "Import: [0x{0}] {1} ({2}) from {3}",
                symbol.Address.ToString("X"),
                symbol.Name,
                symbol.Kind,
                symbol.LibraryName
            )
        )

        // run the sample
        sandbox.Run()
The code is quite simple: each time a memory region is allocated I add it to a list. For each executed instruction I check whether EIP falls within one of the previously allocated regions and, if so, I dump the region content to disk. If we run the code, a new file is written to disk; disassembling it gives the following code:
L_00000000: push ebp
L_00000001: mov ebp, esp
L_00000003: xor eax, eax
L_00000005: mov edx, 0x1
L_0000000A: mov ecx, [ebp+0x8]
L_0000000D: xadd eax, edx
L_00000010: loop 0xd
L_00000012: pop ebp
L_00000013: ret
A real-world sample: emulating KPOT v2.0 and dumping the deobfuscated strings
Let's try Sojobo on a real-world case. Recently, Proofpoint published an article about a new KPOT version ([08]). We will consider the sample with SHA256: 67f8302a2fd28d15f62d6d20d748bfe350334e5353cbdef112bd1f8231b5599d.
The KPOT sample is included in the GitHub repository too; I took precautions so that it is not executed by mistake (it is XORed, base64 encoded and has a corrupted PE header).
Our goal is to dump the strings once they are decrypted. The function in charge of the decryption is at address 0x0040C8F5; when it returns, EAX holds the length of the string and EDI points to the decrypted buffer. We can then read the memory content and print it.
Sojobo tries to emulate the most common functions; in particular, it emulates GetLastError by returning 0 (success). If we take a look at the KPOT code we spot the following:
.text:004103BB call ds:LoadUserProfileW
.text:004103C1 test eax, eax
.text:004103C3 jnz short loc_4103D0
.text:004103C5 call ds:GetLastError
.text:004103CB cmp eax, 57h ; 'W'
.text:004103CE jz short loc_4103D5
.text:004103D0 jmp near ptr loc_4103D0+1 ; Jump to garbage
Basically, if the GetLastError code is different from 0x57 the process crashes (it jumps into garbage data). So we have to override the default GetLastError definition to force it to return 0x57. This is done by creating a class named Kernel32 with a function named GetLastError that accepts an ISandbox object as its first parameter. See the source in the GitHub repository for the implementation details. Then we add our assembly to the Sandbox so that our implementation is taken into account. Finally, as before, we set up a process step handler, which contains the following code:
private static void ProcessStep(Object sender, IProcessContainer process)
{
    var ip = process.GetProgramCounter().ToInt32();
    if (ip == _retAddresDecryptString)
    {
        // read registers value
        var decryptedBufferAddress = process.GetRegister("EDI").ToUInt64();
        var bufferLength = process.GetRegister("EAX").ToInt32();

        // read decrypted string
        var decryptedBuffer = process.Memory.ReadMemory(decryptedBufferAddress, bufferLength);
        var decryptedString = Encoding.UTF8.GetString(decryptedBuffer);
        Console.WriteLine("[+] {0}", decryptedString);
    }
}
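The naming convention used for the GetLastError override (class name as library, method name as exported function) can be illustrated with a small Python sketch. This is not how Sojobo is implemented; HookRegistry and its methods are hypothetical names that only show the idea of resolving emulated functions by name, with a user-supplied class taking precedence over a default.

```python
class HookRegistry:
    """Maps (library name, function name) to a Python callable."""

    def __init__(self):
        self._hooks = {}

    def add(self, library, function, handler):
        self._hooks[(library.lower(), function)] = handler

    def add_library(self, cls):
        # Convention: the class name is the library, each public callable is an export.
        for name in vars(cls):
            if not name.startswith("_"):
                handler = getattr(cls, name)
                if callable(handler):
                    self.add(cls.__name__, name, handler)

    def call(self, library, function, sandbox):
        return self._hooks[(library.lower(), function)](sandbox)


class Kernel32:
    @staticmethod
    def GetLastError(sandbox):
        return 0x57  # ERROR_INVALID_PARAMETER, the value the sample expects


registry = HookRegistry()
registry.add("kernel32", "GetLastError", lambda sandbox: 0)  # stand-in for a built-in default
print(hex(registry.call("kernel32", "GetLastError", None)))  # 0x0
registry.add_library(Kernel32)  # the user class replaces the default
print(hex(registry.call("kernel32", "GetLastError", None)))  # 0x57
```

No configuration file is involved: registering the class is enough, because the names alone say which import it replaces.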
By reversing the sample we know that the decryption function ends at address 0x0040C928, so when this point is reached we can dump the decrypted string by reading the EAX and EDI register values and then the process memory. Below is an example run:
-=[ Start Emulation ]=-
[+] wininet.dll
[+] winhttp.dll
[+] ws2_32.dll
[+] user32.dll
[+] shell32.dll
[+] advapi32.dll
[+] dnsapi.dll
[+] netapi32.dll
[+] gdi32.dll
[+] gdiplus.dll
[+] oleaut32.dll
[+] ole32.dll
[+] shlwapi.dll
[+] userenv.dll
[+] urlmon.dll
[+] crypt32.dll
[+] mpr.dll
-=[ Emulation Completed ]=-
Of course, that list is by no means exhaustive. The next sections explain why.
Is it really so simple and smooth?
I would love to say yes, but there are still some limitations (that I already plan to address). The output above was obtained by emulating the KPOT function in charge of loading the DLLs that it actually uses. Before that code we have the following:
.text:00406966 64 A1 30 00 00 00 mov eax, large fs:30h ; read PEB
.text:0040696C 8B 40 18 mov eax, [eax+18h] ; read Heap
.text:0040696F C3 retn
Basically, it reads the heap base address from the PEB. One solution would be to place some fake values there, but that is not a good solution in the long term (KPOT resolves function addresses by walking the EAT). So I defined PEB and TEB structures and wrote them to the process memory (I also correctly initialized the FS register). I have also implemented a serialization algorithm that allows us to "read" typed objects from memory (instead of just a bunch of raw bytes). This is very handy if we want to customize some complex structure (like the PEB in this case). A later section takes advantage of this feature.
The second problem is that KPOT tries to resolve function addresses by walking the Ldr field. It also uses the Ldr field to find the base address of Kernel32, which is done by the following code:
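The idea of reading a typed object instead of raw bytes can be sketched in Python with the struct module. This is an illustration only, not Sojobo's serializer: Peb32Head, read_peb_head and all addresses are hypothetical, and only the first 0x1C bytes of the 32-bit PEB are modelled, which is enough to reach the ProcessHeap field at offset 0x18 that the code above reads.

```python
import struct
from dataclasses import dataclass, astuple

@dataclass
class Peb32Head:
    """First 0x1C bytes of a 32-bit PEB."""
    inherited_address_space: int       # 0x00
    read_image_file_exec_options: int  # 0x01
    being_debugged: int                # 0x02
    bit_field: int                     # 0x03
    mutant: int                        # 0x04
    image_base_address: int            # 0x08
    ldr: int                           # 0x0C
    process_parameters: int            # 0x10
    subsystem_data: int                # 0x14
    process_heap: int                  # 0x18

PEB32_HEAD = struct.Struct("<4B6I")  # little endian: four bytes, six dwords

def read_peb_head(memory, region_base, address):
    # Turn the raw bytes at `address` into a typed object.
    return Peb32Head(*PEB32_HEAD.unpack_from(memory, address - region_base))

def write_peb_head(memory, region_base, address, peb):
    # Serialize the (possibly customized) object back into memory.
    PEB32_HEAD.pack_into(memory, address - region_base, *astuple(peb))

REGION_BASE = 0x7FFDF000  # made-up address of the emulated PEB
memory = bytearray(0x1000)
PEB32_HEAD.pack_into(memory, 0, 0, 0, 1, 0,
                     0xFFFFFFFF, 0x00400000, 0x00251E90, 0x00020000, 0, 0x00150000)

peb = read_peb_head(memory, REGION_BASE, REGION_BASE)
print(hex(peb.process_heap))  # 0x150000
peb.being_debugged = 0        # customize one field and write the structure back
write_peb_head(memory, REGION_BASE, REGION_BASE, peb)
```

Working on named fields makes customization a one-line change, where raw bytes would require remembering every offset.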
.text:00406936 get_Kernel32_base_via_Ldr proc near
.text:00406936 64 A1 30 00 00 00 mov eax, large fs:30h ; read PEB
.text:0040693C 8B 40 0C mov eax, [eax+0Ch] ; read Ldr
.text:0040693F 8B 40 0C mov eax, [eax+0Ch] ; read InLoadOrderModuleList
.text:00406942 8B 00 mov eax, [eax] ; follow Flink from the first entry (the executable) to ntdll
.text:00406944 8B 00 mov eax, [eax] ; follow Flink from ntdll to kernel32
.text:00406946 8B 40 18 mov eax, [eax+18h] ; read DllBase
.text:00406949 C3 retn
.text:00406949 get_Kernel32_base_via_Ldr endp
Even in this case you can just fake this value and write the LDR_DATA_TABLE_ENTRY structure back to memory, but very soon you will discover that this strategy fails (in fact, in our test the emulation raises an exception).
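To show how many linked structures the routine above depends on, here is a Python sketch of the same pointer walk over a toy memory made of dwords. All addresses and module bases are invented for the example; only the offsets (PEB+0x0C, Ldr+0x0C, entry+0x18) come from the disassembly. It is not meant to reproduce the exception mentioned above, only to show that a single faked value is not enough.

```python
def kernel32_base_via_ldr(mem, peb):
    """Follow the same five reads as the routine; mem maps address -> dword."""
    ldr = mem[peb + 0x0C]      # PEB.Ldr
    entry = mem[ldr + 0x0C]    # InLoadOrderModuleList.Flink: the executable
    entry = mem[entry]         # next entry: ntdll
    entry = mem[entry]         # next entry: kernel32
    return mem[entry + 0x18]   # LDR_DATA_TABLE_ENTRY.DllBase

# Invented layout for the demonstration.
PEB, LDR = 0x7FFDF000, 0x00251E90
E_EXE, E_NTDLL, E_K32 = 0x00251EC8, 0x00251F28, 0x00252020

mem = {
    PEB + 0x0C: LDR,
    LDR + 0x0C: E_EXE,
    E_EXE: E_NTDLL, E_EXE + 0x18: 0x00400000,
    E_NTDLL: E_K32, E_NTDLL + 0x18: 0x77000000,
    E_K32: LDR + 0x0C, E_K32 + 0x18: 0x76000000,  # the list is circular
}
print(hex(kernel32_base_via_ldr(mem, PEB)))  # 0x76000000

# With only part of the chain faked, the walk hits memory that was never set up.
partial = {PEB + 0x0C: LDR, LDR + 0x0C: E_EXE}
try:
    kernel32_base_via_ldr(partial, PEB)
except KeyError as missing:
    print("unmapped read at", hex(missing.args[0]))
```

Every entry in the list must be coherent with its neighbours, which is why patching one field at a time does not scale.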
Dumping all strings from KPOT v2.0 (for real)
The previous section introduced a feature that allows us to read objects from the process memory. In this section we will see how to dump all encrypted strings in a very easy way. As Proofpoint explains, all strings are encrypted with a very simple algorithm and stored in a struct that has the following layout:
public class EncryptedString
{
    public UInt16 EncryptionKey;
    public UInt16 StringLength;
    public UInt32 Buffer;

    public String Decrypt(IProcessContainer process)
    {
        var buffer = process.Memory.ReadMemory(this.Buffer, this.StringLength);
        var stringContent = new StringBuilder();
        foreach(var b in buffer)
        {
            stringContent.Append((Char)(b ^ this.EncryptionKey));
        }
        return stringContent.ToString();
    }
}
It would be very useful to read an EncryptedString object from the process memory instead of a raw byte array (as done by the Proofpoint Python script). With Sojobo you can do that, and the code to print all the decrypted strings is as simple as this:
private static void DecryptStrings(IProcessContainer process)
{
    Console.WriteLine("-=[ Start Dump All Strings ]=-");

    // encrypted strings
    var encryptedStringsStartAddress = 0x00401288UL;
    var encryptedStringsEndAddress = 0x00401838UL;

    var currentOffset = encryptedStringsStartAddress;
    while (currentOffset < encryptedStringsEndAddress)
    {
        var encryptedString = process.Memory.ReadMemory<EncryptedString>(currentOffset);
        var decryptedString = encryptedString.Decrypt(process);
        Console.WriteLine("[+] {0}", decryptedString);

        // go to the next string
        currentOffset += 8UL;
    }

    Console.WriteLine("-=[ Dump All Strings Completed ]=-");
}
In the GitHub repository you can find the full source code (to dump all strings, pass --strings as the first argument). The result is the same as the one provided by Proofpoint (but with cleaner code :P).
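The record layout and the XOR loop can also be tried without an emulator. The Python sketch below mirrors the 8-byte record (16-bit key, 16-bit length, 32-bit pointer) and the per-byte XOR of the C# listing on a tiny synthetic image; the key 0x5A, the offsets and the function name decrypt_record are made up for the example and are not taken from the real sample.

```python
import struct

RECORD = struct.Struct("<HHI")  # key, length, buffer pointer: 8 bytes in total

def decrypt_record(image, image_base, record_address):
    """Decode one record of a flat memory image and return the plain string."""
    key, length, pointer = RECORD.unpack_from(image, record_address - image_base)
    start = pointer - image_base
    encrypted = image[start:start + length]
    # Same operation as the C# Decrypt method: XOR every byte with the key.
    return "".join(chr(b ^ key) for b in encrypted)

# Build a synthetic image holding a single encrypted string.
IMAGE_BASE = 0x00400000
image = bytearray(0x100)
key = 0x5A
plaintext = b"wininet.dll"
image[0x40:0x40 + len(plaintext)] = bytes(b ^ key for b in plaintext)
RECORD.pack_into(image, 0x10, key, len(plaintext), IMAGE_BASE + 0x40)

print(decrypt_record(image, IMAGE_BASE, IMAGE_BASE + 0x10))  # wininet.dll
```

Since each record is 8 bytes, the range 0x00401288 to 0x00401838 walked by DecryptStrings covers (0x00401838 - 0x00401288) / 8 = 182 records.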
Conclusion and future development
Sojobo is still in its infancy but it can already be used for some initial analysis. In future releases I'm going to add more emulated functions and the possibility to map other files into the process address space. By mapping external files (like Kernel32 or Ntdll) we can overcome problems related to indirect referencing (like in the case above) while still keeping control over how functions are emulated.
References
[01] B2R2: Building an Efficient Front-End for Binary Analysis - https://www.reddit.com/r/ReverseEngineering/comments/aultc1/b2r2_building_an_efficient_frontend_for_binary/
[02] B2R2: Building an Efficient Front-End for Binary Analysis (PDF) - https://ruoyuwang.me/bar2019/pdfs/bar2019-final51.pdf
[03] NDSS Workshop on Binary Analysis Research (BAR) 2019 - https://ruoyuwang.me/bar2019/
[04] Symbolic Execution component #question - https://github.com/B2R2-org/B2R2/issues/9
[05] Automatic Static Unpacking of Malware Binaries - https://www.researchgate.net/publication/221200507_Automatic_Static_Unpacking_of_Malware_Binaries
[06] MwEmu: Malware analysis emulator written in Python 3 (based on Unicorn) - ALPHA version - https://www.reddit.com/r/Malware/comments/bkb0p9/mwemu_malware_analysis_emulator_written_in_python/
[07] Convention over configuration - https://en.wikipedia.org/wiki/Convention_over_configuration
[08] New KPOT v2.0 stealer brings zero persistence and in-memory features to silently steal credentials - https://www.proofpoint.com/us/threat-insight/post/new-kpot-v20-stealer-brings-zero-persistence-and-memory-features-silently-steal