Saturday, May 15, 2021

Alan - A post exploitation framework


Twitter: @s4tan
Download: GitHub (use this repo to report issues)
Documentation: https://github.com/enkomio/AlanFramework/tree/main/doc

I decided to dedicate a bit of my free time to developing a new project: Alan, a post-exploitation framework. Red-team activity is not my main job, but I like this field and, as a malware analyst, I analyze a lot of programs that have a very similar intent.

The Alan concept is simple: the operator creates an agent file that is executed on the compromised host and receives commands from a server under the control of the operator. The goal of the project is to provide a framework whose primary target is red-team activities. I implemented the agent in C/Assembly and the backend in F# (with .NET Core, which ensures that it runs on various operating systems).

Alan was designed by considering the weaknesses and missing features that I found in some of the currently available red-teaming tools. For example, many tools claim that the traffic with the server is encrypted, but they embed the key inside the request or, in other cases, the key can be retrieved if the binary is available for reversing (too often I found a key generation algorithm based on a seed that can be easily computed).

A post-exploitation tool

Alan supports a good number of features that allow the operator to further compromise the target after the initial exploitation. The Alan agent can be deployed in various formats, such as Executable, DLL, PowerShell and Shellcode. Below you can find a video that shows how to create an agent and interact with it by launching a command shell on the remote host.



The agent can be easily customized, and flexibility is a key feature of Alan. The agent profile can be updated at runtime, which means that you can change the server address or even the communication protocol! The video below shows how to change the agent profile at runtime by specifying a different server port and moving from HTTP to HTTPS.



Security Operation

Being caught by a blue team is something to avoid if you don't want to lose access to your target. Unfortunately, network traffic is something that cannot be hidden. Alan encrypts the network traffic in a strong way, but even encrypted requests might look suspicious. To avoid raising alerts, the operator can increase the delay between two requests or customize the requests and server responses to look like normal HTTP traffic from a known application. The video below shows the following features:
  • Create a PowerShell agent
  • Migrate to the notepad.exe process. When the migration is completed, the Fiddler window shows that the process sending requests has become notepad.exe
  • A command shell is executed and the original PowerShell agent file is deleted. Once the file is deleted there is no trace of the agent on disk, and execution happens only in memory. Then, the operator downloads some files to their system
  • The HTTP network traffic is inspected. The agent network traffic looks like normal traffic to an nginx server with a default installation




In the next release I'll implement additional features and harden the code a bit to avoid easy detection by AVs ;)

Friday, June 12, 2020

Deobfuscating C++ ADVobfuscator with Sojobo and the B2R2 binary analysis framework


Twitter: @s4tan
GitHub code: https://github.com/enkomio/Sojobo/tree/master/Src/Tools/ADVDeobfuscator

A new obfuscation tool named ADVobfuscator was presented at Black Hat Europe 2014 in Amsterdam. It is based on C++11 metaprogramming. The paper describes in depth how strings and function calls are obfuscated.

ADVobfuscator demonstrates how to use the C++11/14 language to generate obfuscated code at compile time, without using any external tool or a custom compiler.
Compile-time obfuscators (like this one) are quite annoying to analyze, since it is not easy to write a generic deobfuscator based on code pattern recognition. In fact, the resulting binary code depends on the compiler, the flags used and so on. This results in a series of corner cases that must be handled to correctly deobfuscate the code. The worst part is that the handling of these corner cases might not be reusable for a different sample compiled with a different compiler or with different flags. My idea to solve this problem is to write a deobfuscator based on flags extracted through the execution of generic heuristics. In this way, I can abstract the analysis from the code details.

Another interesting aspect of ADVobfuscator is that it was recently used to protect a malware sample that was analyzed in this very interesting blog post. In particular, section "3. Latest variant of Team9 loader" contains a reference to the string deobfuscation process.

In this blog post, I'll focus on the string obfuscation part, by writing a utility that is able to decode the obfuscated strings. The deobfuscation utility uses the B2R2 binary analysis framework to statically analyze the binary, and Sojobo to emulate the code.

The sample that I'll consider has SHA256 hash value: aaa9268b4a80f75eeb58b61cbd745523b1823d5adf54c615ad9ddf6b8fa0e806. It was used in a demo during my talk at HackInBo Safe Edition and can be downloaded from my GitHub repository.

Identify obfuscated strings

This is probably the most annoying part. We can't rely on specific code patterns, since the code might change depending on the compiler used. My idea was to abstract this concept and try to identify interesting points by using a series of heuristics. ADVobfuscator uses various methods to obfuscate strings; some of them are reported below:

1400012AA    movdqa  xmm0, cs:xmmword_140023520          ; load obfuscated buffer
1400012B2    movdqu  [rbp+57h+var_90], xmm0
1400012B7    mov     rcx, r14
1400012BA
1400012BA loc_1400012BA:; CODE XREF: sub_1400011F4+D4↓j
1400012BA    mov     al, byte ptr [rbp+57h+var_90]
1400012BD    xor     byte ptr [rbp+rcx+57h+var_90+1], al ; deobfuscation
1400012C1    add     rcx, r15                            ; Increase counter
1400012C4    cmp     rcx, 0Eh                            ; check size
1400012C8    jb      short loc_1400012BA
1400012CA    mov     byte ptr [rbp+57h+var_90+0Fh], r14b ; set null byte
Unfortunately, not all deobfuscation tasks are implemented as in-line code. In some cases a function is invoked, as reported below.

140006161    movdqa  xmm0, cs:xmmword_140023800          ; load obfuscated buffer
140006169    lea     rcx, [rbp+var_30]                   ; pointer to the obfuscated buffer
14000616D    xor     eax, eax
14000616F    mov     [rbp+var_20], 627A6844h
140006176    movdqu  [rbp+var_30], xmm0
14000617B    mov     byte ptr [rbp+var_1C], al
14000617E    call    sub_140003684                       ; call deobfuscation function
140006183    mov     r9d, r15d
..............
140003684 sub_140003684   proc near  ; CODE XREF: sub_140005EB8+2C6↓p
140003684    lea     rax, [rcx+1]                        ; skip first byte, which is used as key
140003688    mov     r9d, 13h                            ; set string size
14000368E    mov     r8, rax                             ; pointer to buffer to decode
140003691
140003691 loc_140003691:; CODE XREF: sub_140003684+19↓j
140003691    mov     dl, [rcx]                           ; read XOR key
140003693    xor     [r8], dl                            ; deobfuscation
140003696    inc     r8                                  ; increment buffer pointer
140003699    sub     r9, 1                               ; decrement counter and check for termination
14000369D    jnz     short loc_140003691
14000369F    mov     [rcx+14h], r9b
1400036A3    retn
1400036A3 sub_140003684   endp
In the latter case, it is possible to see that the string size is hardcoded inside the function body and not passed as an input parameter. This means that there are a lot of functions like this one that differ only in minor details (like the string size).
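Both listings follow the same scheme: the first byte of the buffer is the XOR key and the bytes after it are the encoded string. A minimal Python sketch of that logic is shown here; the function name and the sample buffer are made up for illustration and are not taken from the analyzed binary.

```python
def deobfuscate(buffer: bytes, size: int) -> bytes:
    """Decode a buffer laid out as in the listings above:
    byte 0 is the XOR key, the next `size` bytes are the encoded string."""
    key = buffer[0]
    return bytes(b ^ key for b in buffer[1:1 + size])


# Hypothetical test buffer: key 0x5A followed by "kernel32.dll" XORed with it.
key = 0x5A
encoded = bytes([key]) + bytes(c ^ key for c in b"kernel32.dll")
print(deobfuscate(encoded, 12))
```

The size has to be supplied by the caller because, as noted above, the binary hardcodes it in each decoding routine.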

Heuristics definition

As said, my main idea is to analyze all the functions that the B2R2 framework is able to identify and extract the flags that are based on the heuristics that I created. You can find all the defined heuristics in the associated source code. An excerpt from that list is presented below:
The heuristics above are used for the following tasks:
  • Identify all functions that deobfuscate a string. This task is useful to cover the case where the deobfuscation process is deferred to another function.
  • Identify the start of the code in charge of the deobfuscation.
  • Identify the address of the buffer that will be deobfuscated. This is done by identifying the deobfuscation operation.
  • Identify the end of the code in charge of the deobfuscation.
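To give an idea of what such a heuristic can look like, here is a toy Python sketch that flags a backward conditional jump whose body contains a XOR writing to memory. It is only an illustration: the `Insn` tuple, the `find_xor_loop` function and the simplified textual operands are assumptions of this sketch, and it does not reproduce the heuristics implemented in the tool or the B2R2 API.

```python
from typing import List, NamedTuple, Optional, Tuple


class Insn(NamedTuple):
    addr: int
    mnemonic: str
    operands: str  # jump targets are assumed to be plain hex addresses


def find_xor_loop(insns: List[Insn]) -> Optional[Tuple[int, int, int]]:
    """Return (loop_start, xor_address, loop_end) for the first backward
    conditional jump whose body contains a XOR with a memory destination."""
    index_of = {insn.addr: n for n, insn in enumerate(insns)}
    for n, insn in enumerate(insns):
        if insn.mnemonic not in ("jb", "jnz", "jne", "jl"):
            continue
        target = int(insn.operands, 16)
        if target not in index_of or target > insn.addr:
            continue  # not a backward jump inside this function
        for body in insns[index_of[target]:n]:
            destination = body.operands.split(",")[0]
            if body.mnemonic == "xor" and "[" in destination:
                return target, body.addr, insn.addr
    return None


# The decoding function shown earlier, transcribed by hand.
FUNC = [
    Insn(0x140003684, "lea", "rax, [rcx+1]"),
    Insn(0x140003688, "mov", "r9d, 13h"),
    Insn(0x14000368E, "mov", "r8, rax"),
    Insn(0x140003691, "mov", "dl, [rcx]"),
    Insn(0x140003693, "xor", "[r8], dl"),
    Insn(0x140003696, "inc", "r8"),
    Insn(0x140003699, "sub", "r9, 1"),
    Insn(0x14000369D, "jnz", "140003691"),
    Insn(0x14000369F, "mov", "[rcx+14h], r9b"),
    Insn(0x1400036A3, "retn", ""),
]
print([hex(a) for a in find_xor_loop(FUNC)])
```

A register-only XOR such as `xor eax, eax` is ignored because its destination is not a memory operand.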

Emulation

At this point I have the following information: the functions that run a deobfuscation task and the related chunk of code in charge of this task. The final step is to emulate this code and read the deobfuscated string from memory. Before running the emulation, one more step is necessary. The heuristics might miss some important information, like the register that is used to increment the counter (in one of the examples above we can see that r15 is used to increment the counter).

To address this problem I used two strategies. In the first strategy, I walk backwards from the identified start address and verify whether each preceding instruction is safe to emulate. If so, I move the start address back. The second strategy analyzes the instructions that should be emulated and, if it finds an operation that adds two registers, sets the value of the source register inside the emulator to 1.
These strategies seem to be good enough to catch possible register initialization code.
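The second strategy can be sketched in a few lines of Python. This is a simplified illustration and not the code used by the tool: the instruction representation, the register set and the `preset_increment_registers` name are assumptions of this sketch.

```python
GP_REGISTERS = {"rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", "rsp"} | {
    "r%d" % n for n in range(8, 16)
}


def preset_increment_registers(insns, state):
    """Return a copy of `state` where the source register of every
    `add reg, reg` instruction is forced to 1, so that a counter
    incremented through that register actually advances."""
    patched = dict(state)
    for mnemonic, operands in insns:
        if mnemonic != "add" or len(operands) != 2:
            continue
        dst, src = operands
        if dst in GP_REGISTERS and src in GP_REGISTERS and dst != src:
            patched[src] = 1
    return patched


# The in-line decoding loop shown earlier: r15 is the increment register.
loop = [
    ("mov", ["al", "byte ptr [rbp+57h+var_90]"]),
    ("xor", ["byte ptr [rbp+rcx+57h+var_90+1]", "al"]),
    ("add", ["rcx", "r15"]),
    ("cmp", ["rcx", "0Eh"]),
    ("jb", ["loc_1400012BA"]),
]
state = preset_increment_registers(loop, {"rcx": 0, "r15": 0})
print(state)
```

Without this step an emulator that starts at the loop with r15 set to 0 would never increase rcx, and the loop would not terminate.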

We can now run the emulator and read the decrypted string from memory.

Result

A short video of the deobfuscation tool running on the considered sample is shown below. With this information it shouldn't be too difficult to patch the original file and NOP the deobfuscation operations. I tested it on various samples and it seems to work properly. If you find any errors just send me a message on Twitter.


Tuesday, December 17, 2019

Writing a packer in Sacara

Twitter: @s4tan

GitHub project: https://github.com/enkomio/sacara

Release: https://github.com/enkomio/sacara/releases/latest

Sacara packer script: https://gist.github.com/enkomio/35b14084c1422db6740b5ed98cdb2db7

The Sacara project

It has been a while since I last wrote a blog post, and recently I had the opportunity to work on the Sacara project again. I really like this project since it allows me to:
  • write code in x86 Assembly, since all the VM code is implemented in x86 Assembly
  • better understand how to implement a simple programming language (you can find the Sacara grammar file here)
  • learn how to write an assembler.
My previous post on Sacara is more than a year old, so it is a good time to see if the project has any issues that can be resolved. After trying to write a new script, I immediately realized what the biggest defect was: writing a program was quite a pain due to the awkward syntax. It is time to improve this aspect with some syntactic sugar :).

I released a new version, 2.4, which adds some directives and other features that make writing a new script easier, leaving the VM core almost untouched.

Writing a simple packer

As in my latest post about Sacara, I'll try to see how the AV industry behaves when analyzing a malicious program. I'll write a simple program that executes a Sacara script whose purpose is to decode and run malicious content.

This is the typical behavior of a packer. In my case I'll just decode and run the embedded content as if it were shellcode. In general, a packer correctly maps the PE in memory and executes it by locating the Original Entry Point (OEP), but I'll leave this aspect out.

This time I'll create a C project and link the Sacara static library, so that we end up with just one binary (and not a bunch of files as in the .NET test project from my previous post).

For my test I wanted to use real malware, so I looked for some good stuff on VT. In the end I decided to use an unpacked Cobalt Strike payload with SHA-1: 83a490496a7ea9562d6e3eb9a12a224fe50875e7. This is a perfect fit for my case, since all Cobalt Strike modules are packed in a way that allows them to be executed as shellcode starting from the DOS header (the same happens with the Metasploit meterpreter_loader).

The overall design is quite simple: I'll embed the encoded malicious content as a PE resource and use a Sacara script to decode and execute it. The Sacara script will be embedded as a PE resource too. The task done by the C code is dead simple: read the needed resources, allocate a memory region and run the script, providing the input buffer.

Implementation

I'll focus on the Sacara script, you can read the source code of the C code from the repository. The tasks done by the script are:
  1. Decode the password used to encrypt the malicious code
  2. RC4 decrypt the resource content
  3. Run the decrypted code
To make development and debugging of the script easier, I'll create a standalone script for each task and merge them at the end. All the scripts in this post can be found in the test directory.

Decode the password used to encrypt the malicious code

As a first step we want to retrieve the password used to encrypt the content. We don't want a simple string search to reveal its value, so we will obfuscate it with a simple XOR operation using a 1-byte key with value 0xA1.

Below you will find the relevant code with a simple test case, I think its comments are self explanatory.

// this routine will be used to store the
// script global data, all labels are global
proc global_data
password: 
 // encoded password
 byte 0xe0, 0xe1, 0xe2, 0xe3
endp

proc decode_password(pwd, pwd_len) 
 .mov index, 0
 
decode_pwd_loop:
 // read the byte to decode
 .mov pwd_offset, (pwd + index)
 .read.b pwd_offset 
 pop xored_char
 
 // decode the byte with hardcoded key
 .xor xored_char, 0xA1
 
 // write back the result
 pop xored_char
 .write.b pwd_offset, xored_char
 
 // check if completed (increment first, then compare,
 // otherwise one extra byte past pwd_len is decoded)
 .inc index
 .cmp index, pwd_len
 push decode_pwd_loop
 jumpifl 
 
 ret
endp

proc main
 // result must be the first variable if I want
 // to retrieve the result with SacaraRun, so set it to 0
 .mov result, 0
 
 // invoke the routine to decode the password
 .decode_password(password, 4)
 
 // read the decoded password as a double word at the specified offset
 .read.dw password
 pop result
 halt
endp
To test that it works, we will try to deobfuscate the buffer 0x41, 0x40, 0x43, 0x42. We first have to obfuscate it, so we compute the XOR of the two integers 0x42434041 (the buffer read as a little-endian double word) and 0xA1A1A1A1, which results in 0xE3E2E1E0, that is, the bytes 0xE0, 0xE1, 0xE2, 0xE3 in memory (the same buffer that you will find at the top of the script).
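The same arithmetic can be double-checked outside of Sacara with a few lines of Python. This is just a sanity check of the values used in the test, not part of the packer.

```python
import struct

KEY = 0xA1  # 1-byte XOR key used by the script

plain = bytes([0x41, 0x40, 0x43, 0x42])      # buffer we expect after decoding
encoded = bytes(b ^ KEY for b in plain)      # bytes stored in global_data
decoded = bytes(b ^ KEY for b in encoded)    # what decode_password produces

# .read.dw reads the buffer as a little-endian double word
as_dword = struct.unpack("<I", decoded)[0]
print(encoded.hex(), hex(as_dword), as_dword)
```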

First let's assemble it:
c:\SacaraAsm.exe test_decode_buffer.sacara

          -=[ Sacara SIL Assembler ]=-
Copyright (c) 2018-2019 Antonio Parata - @s4tan

[INFO] 2019-12-08 11:52:07 - VM code written to file: test_decode_buffer.sac
We can now test the script by running it, passing the input value 0x42434041 and expecting 0x42434041 as the result (1111703617 in decimal notation), which is our buffer read as a little-endian double word.
c:\SacaraRun.exe -p test_decode_buffer.sac 0x42434041
Execute file: c:\test_decode_bufferdecode_password.sac
Code execution result: 1111703617


RC4 decrypt the resource content

Our second and most complex step is the decryption of the buffer. In my previous post I used a simple XOR algorithm; for this post I decided to implement the RC4 cryptographic algorithm. If you are used to reversing malware, you have probably encountered RC4 being used to encrypt configuration or code.

As in my previous script, I'll use a lot of comments to make the code easy to understand (I also avoided some trivial optimizations to avoid overcomplicating it). Since this code is quite long, you can find the source code of this step in a test script; in this post I'll only show the KSA and PRGA phases.
proc ksa(password, password_length)
 .mov i, 0
 .mov j, 0

ksa_loop:
 // read the i-th byte from S array
 .read.b (S + i)
 pop S_i
 
 // read the i-th byte from password
 .read.b (password + (i % password_length))
 pop pwd_i
 
 // compute loop expression
 .mov j, ((j + S_i + pwd_i) % 256) 
 .swap(i, j)
 
 // check if I have to iterate
 .inc i 
 .cmp i, 256
 push ksa_loop
 jumpifl 
 ret
endp

proc prga(buffer, buffer_length)
 .mov i, 0
 .mov j, 0
 .mov n, 0
 
prga_loop:
 // update index i
 .mov i, ((i + 1) % 256) 
 
 // update index j
 .read.b (S + i) 
 pop S_i
 .mov j, ((j + S_i) % 256)
 
 // swap
 .swap(i, j)
 
 // read indexes
 .read.b (S + i) 
 pop S_i
 
 .read.b (S + j) 
 pop S_j
 
 // compute random
 .read.b (S + ((S_i + S_j) % 256))
 pop rnd
 
 // read n-th buffer value
 .read.b (buffer + n)
 pop buffer_n 
 
 // XOR with buffer and write back the result 
 .xor buffer_n, rnd
 pop encrypted_char
 .write.b (buffer + n), encrypted_char 

 // check if I have to iterate
 .inc n
 .cmp n, buffer_length
 push prga_loop
 jumpifl
 
 ret
endp

The assembling step is the same as before.
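For comparison, here is a compact Python reference of the same two RC4 phases. It is handy for generating the encrypted resource or for checking the output of the Sacara script; it is a plain textbook RC4 written for this purpose, not code taken from the Sacara repository.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: encrypts or decrypts `data` with `key`."""
    # KSA: initialize S with the identity permutation, then mix in the key
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # PRGA: generate the keystream and XOR it with the buffer
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


# Well-known test vector: key "Key", plaintext "Plaintext"
print(rc4(b"Key", b"Plaintext").hex())
```

Since RC4 is a stream cipher, calling the function a second time with the same key on the output gives back the original buffer.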

Run the decrypted resource

At this step we have the code in decrypted form and we just have to run it. This can be done by the infrastructure code (the C code) or from Sacara. Since Sacara allows invoking native code via the ncall instruction, we will use this approach.

Build the code

Now we have all the pieces for our packer. Once the infrastructure code is compiled, we can add the needed resources, which are:

  • DATA: contains the buffer that will be decrypted and executed.
  • SECRET: the RC4 password that will be used to decrypt the buffer. It is XOR encoded with the 0x41 value.
  • SACARA: the Sacara code that will decrypt the buffer and execute it.


To add the resources you can use a simple utility that I wrote or any other PE explorer utility. Finally, the full source code of the Sacara packer can be found here.

Evaluation

As said my test will run a Cobalt Strike Payload encrypted with RC4 and the password: sacara_packer_password. I uploaded the file to VT and waited for the analysis, the result can be found here. We went from 52/70 to 24/69.

Not bad, but to be honest I was expecting a better result. If you run the code in a real environment you will notice that the decryption is quite CPU intensive and takes several seconds before the payload is invoked, which means that identifying the malicious content via emulation is improbable for performance reasons.

I decided to do a second test and upload the same program but without resources, which makes the program 100% safe. The result was very interesting: 12 of 69 AVs flagged my sample as malicious. This means that they flagged the Sacara code itself as malicious, and not the real payload.

Conclusion and Future Work

This new version of Sacara improved the language to make script development easier. The next step is to provide easy access to the Windows API in order to create more meaningful programs. Also, I want to improve the VM code to make it more resistant to reverse engineering.

Finally, I want to stress that a single test case is not a valid basis for deciding whether an AV is good or not, so please take my result as a first step in the complex process of AV evaluation.

Thursday, June 6, 2019

hm0x14 CTF: reversing a (not so simple) crackme

Twitter: @s4tan

Writeup GitHub project: https://github.com/enkomio/Misc/tree/master/Hm0x14Writeup

I don't usually participate in CTF competitions, but in this case I personally know the author of this challenge and I consider her to be very smart, so I decided to give it a try. As I hope to show in this writeup, the challenge is very interesting and not the typical reverse engineering challenge.

Introduction

The challenge file is:

hm0x14.exe
SHA-256: 7cad36c64df33e30673d98e24be4d60c38ba433aa72f8d2bec14f69db4dbf173

It is a C++ application. As a first step I ran the application to see what it looks like. I have to admit that the author put a lot of effort into making the challenge appealing from a UI point of view. Below is an image of the challenge running:



Analysis

Before I continue, I have to say that this challenge remained unsolved since its creation. For this reason the author decided to have a talk on how to solve it. This write up is not based on her presentation.

When you open the file in IDA you can immediately see that the main function is quite big. Taking a look at the decompiled source code, we can see that the program initializes a DES provider and then reads the resource Segreto from 4DES, whose content is displayed in the image below:



Proceeding with the debugging, it is clear that most of the code is in charge of the UI animation. After stepping a bit with the debugger, execution blocks on the function that reads the input password. After this function, it is easy to trace the program and see which function accepts the input password as its first parameter (which is, in this case, the string "1234567890").


00402E97 | 8D8D 50FEFFFF | lea ecx,dword ptr ss:[ebp-1B0] |
00402E9D | 50            | push eax                       | eax:&L"1234567890"
00402E9E | E8 08F2FFFF   | call hm0x14.4020AB             |
00402EA3 | 59            | pop ecx                        | ecx:L"xe"
00402EA4 | 33C0          | xor eax,eax                    | eax:&L"1234567890"



By decompiling the code we can see that the main goal of this function is to invoke another function, which I called hash_chars, and then generate 4 symmetric keys. Since in the video there is a 4DES banner, I suspect that this function creates 4 keys that will be used in this new crypto algorithm :)


  memset(&v25, 0, 0x20u);
  if ( v7 )
    v8 = password;
  else
    v8 = *password;
  hash_chars(v8, &v25, v6);            // generate 4DES key buffer (8 bytes = 64 bits, a typical DES key length)
  if ( *(password + 20) < 8u )
    v9 = password;
  else
    v9 = *password;
  hash_chars(v9 + 2 * v6, &v26, v6);   // generate 4DES key buffer (8 bytes = 64 bits, a typical DES key length)
  if ( *(password + 20) < 8u )
    v10 = password;
  else
    v10 = *password;
  hash_chars(v10 + 4 * v6, &v27, v6);  // generate 4DES key buffer (8 bytes = 64 bits, a typical DES key length)
  if ( *(password + 20) >= 8u )
    v3 = *password;
  hash_chars(v3 + 6 * v6, &v28, hKey); // generate 4DES key buffer (8 bytes = 64 bits, a typical DES key length)
  v11 = v19;
  v12 = bcrypt_generate_symmetric_key(v19, &hKey, &v25);
  v13 = bcrypt_generate_symmetric_key(v11, &v22, &v26);
  v14 = bcrypt_generate_symmetric_key(v11, &v23, &v27);
  v15 = bcrypt_generate_symmetric_key(v19, &v21, &v27);      // <----- ?!? (1)



At this point we are out of luck, since breaking this algorithm does not seem easy (just consider that 3DES is still considered a strong algorithm). Let's take a look at the function in charge of creating the key from our password. In the image below I have highlighted the main points:

The loop is executed a number of times that depends on the password length. The meaning of the various circles is:

* blue circle: reads the i-th character of the input password

* orange circle: this is the main code. It just multiplies the current key value by 0x1F and keeps only the low DWORD of the result (remember this fact).

* green circle: the value read in the blue circle is added to the result

* red circle: before the result is returned in the ESI register, the value is shifted left by 5 (another point to remember)

The code to generate the key from a password can be represented by the following F# code:


let hashChunk(password: String, offset: Int32, rounds: Int32) =
 let mutable result = 1UL
 for i=0 to rounds-1 do
  result <- (result * 0x1FUL) + uint64 password.[i + offset]
 (result <<< 5) &&& 0x00000000FFFFFFFFUL

let generateKey(password: String) =
 let keys = [
  let size = password.Length >>> 2
  for i=0 to 3 do
   let value = hashChunk(password, i * size, size)
   yield BitConverter.GetBytes(value)
 ]     

 (keys.[1], keys.[0]) // I'll explain later why I only return these two keys
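For readers who prefer Python, the F# code above can be ported as follows. This is a straightforward translation written for this post (the function names are mine); it returns all four key buffers and applies the DWORD truncation at every round, which gives the same final value as masking only at the end.

```python
def hash_chunk(password: str, offset: int, rounds: int) -> int:
    """Hash `rounds` characters of `password` starting at `offset`."""
    result = 1
    for i in range(rounds):
        # multiply by 0x1F, add the character, keep only the low DWORD
        result = (result * 0x1F + ord(password[offset + i])) & 0xFFFFFFFF
    # final shift left by 5, again truncated to 32 bits
    return (result << 5) & 0xFFFFFFFF


def generate_keys(password: str) -> list:
    """Split the password in 4 chunks and derive one 8-byte buffer per chunk."""
    size = len(password) >> 2
    return [
        hash_chunk(password, i * size, size).to_bytes(8, "little")
        for i in range(4)
    ]


for key in generate_keys("1234567890"):
    print(key.hex("-"))
```

Two properties are visible in the output regardless of the password: the upper 4 bytes of every buffer are zero, and the 5 low bits of the first byte are zero.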



Finally, the decryption of the resource content is done by executing the following code:



Which can be summarized as:

P = D(E(D(E(C, k1), k2), k3), k4)



If the application detects an error during decryption, the image of the skull is displayed (if you are wondering which skull, watch the first video till the end ^^).

Implementation Errors

Let's take a break to recap the info that we have. Despite the fact that each key is 8 bytes long, only the first 4 bytes carry key material (the remaining 4 are always zero), so here we have the first error. However, breaking such a keyspace is still not feasible with my laptop.

One of the most important aspects that will help us is marked in the decompiled code above with reference (1). I'll repeat the code below for easy reference:


v12 = bcrypt_generate_symmetric_key(v19, &hKey, &v25);
v13 = bcrypt_generate_symmetric_key(v11, &v22, &v26);
v14 = bcrypt_generate_symmetric_key(v11, &v23, &v27);
v15 = bcrypt_generate_symmetric_key(v19, &v21, &v27);



Do you see it? The last two calls reference the exact same value! Debugging the application confirms this: the first two operations of the decryption chain use the same key, so they cancel each other out. The effective decryption process is therefore:

P = D(E(D(E(C, k1), k1), k2), k3) = D(E(C, k2), k3)



So we downgraded the algorithm to 2DES, and if you have ever followed a cryptography course, you know that there is a reason why we jumped from DES to 3DES, skipping 2DES.

Meet In the Middle

2DES is considered insecure because of this specific attack. Quoting Wikipedia:

When trying to improve the security of a block cipher, a tempting idea is to encrypt the data several times using multiple keys. One might think this doubles or even n-tuples the security of the multiple-encryption scheme, depending on the number of times the data is encrypted, because an exhaustive search on all possible combination of keys (simple brute-force) would take 2^(n·k) attempts if the data is encrypted with k-bit keys n times.

The MITM is a generic attack which weakens the security benefits of using multiple encryptions by storing intermediate values from the encryptions or decryptions and using those to improve the time required to brute force the decryption keys. This makes a Meet-in-the-Middle attack (MITM) a generic space–time tradeoff cryptographic attack.

The MITM attack attempts to find the keys by using both the range (ciphertext) and domain (plaintext) of the composition of several functions (or block ciphers) such that the forward mapping through the first functions is the same as the backward mapping (inverse image) through the last functions, quite literally meeting in the middle of the composed function. For example, although Double DES encrypts the data with two different 56-bit keys, Double DES can be broken with 2^57 encryption and decryption operations.


Since we know that our key is 32 bits long, we can break this encryption with about 2^33 operations (2^32 encryptions for each half of the attack, following the same reasoning that gives 2^57 for 56-bit keys). Nice... in theory. I don't know about you, but my laptop is still not powerful enough to break such a keyspace. There must be some other way to reduce the key size.

Indeed, there is! If you take another look at the function that generates the key from a password, you will notice that the final result is shifted left by 5 bits, which means that the 5 least significant bits are always zero! With this information we can reduce the key from 32 bits to 27 bits!

At this point I started to implement the algorithm, but 27 bits are still too many for my laptop. I have to confess that I was stuck at this point. Talking with the author, she told me that there is still a way to reduce the key size from 27 to 24 bits.

I struggled a bit on this part, until I realized: the parity bit! It is well known that DES uses a 64-bit key but the effective size is 56 bits. This is because the least significant bit of each byte is used as a parity bit and is not considered in the encryption process. Since we have 3 full bytes (the remaining one is already reduced by the 5-bit shift, so it doesn't count), removing 1 bit from each of them brings us to the final size of 24 bits.
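The reduction from 32 to 24 bits can be verified with a short Python sketch. The `mangle_key` function below is a Python port of the mangleKey function from the source code at the end of this post; the remaining names are only used for this illustration.

```python
def mangle_key(k0: int, k1: int, k2: int, k3: int) -> bytes:
    """Build an 8-byte DES key buffer from the 24 bits that actually matter:
    3 bits for the first byte (its 5 low bits are always zero because of the
    shift) and 7 bits for each of the next three bytes (parity bit dropped)."""
    return bytes([
        (k0 << 5) & 0xFF,
        (k1 << 1) & 0xFF,
        (k2 << 1) & 0xFF,
        (k3 << 1) & 0xFF,
        0, 0, 0, 0,
    ])


naive_keyspace = 2 ** 32              # 4 meaningful bytes
after_shift = 2 ** (32 - 5)           # 5 low bits are always zero
after_parity = 8 * 128 * 128 * 128    # one parity bit dropped from 3 bytes

print(naive_keyspace, after_shift, after_parity)
print(mangle_key(1, 0x45, 0x1A, 0x20).hex("-"))
```

The last line rebuilds one of the two keys reported in the attack output further down (20-8A-34-40-00-00-00-00), which shows that the real keys do fit in this reduced keyspace.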

Finding the plaintext

At this point we have the theory and the feasibility of the attack in place, but we are missing one last piece: the plaintext to encrypt. Unfortunately the program doesn't seem to give any hints on the format of the plaintext, so I decided to take another look at the program. As every experienced reverse engineer in the world would do, I ran the most sophisticated analysis: I ran strings on the binary. I discovered some interesting strings like:

"Scrivi il messaggio e premi INVIO, control Z, INVIO... "
"Inserisci la password SEGRETA e premi INVIO!! "
"Stai per creare un messaggio segreto                    con"


Those strings were indeed referenced in the binary. By taking a look at the referencing code, I discovered that if the binary doesn't find the encrypted resource, it enters another state and allows the user to create a secret message. So I removed the resource and started the program again. The image below shows how to create a secret message.



Since I created a new protected message, I'm finally able to see the screen displayed to the user when a message is correctly decrypted. From this screen I can see that the first 8 bytes are always the same and their value is: "Oggetto:". Finally we have all the missing pieces of our puzzle.

Break the rule!

We reached the end of the writeup, let's do a quick recap of the attack:

P = D(E(C, k2), k3) => E(P, k3) = E(C, k2) =>
E("Oggetto:", k3) = E("\xA8\xEC\xE8\x6E\x9D\xB5\xE1\xB7", k2)



I have to compute both sides for each k2 and k3 until I find two keys that generate the same value.

So, the implementation of the Meet In The Middle attack is composed of two steps. In the first step I encrypt the plaintext with all the keys from the 24-bit keyspace and save each result together with its key. Then I encrypt the ciphertext with each possible key and try to find a match with the results of the first step. If a match is found, I have broken the encryption.
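The two steps can be tried out on a toy example in Python. To keep it fast and free of external libraries, the sketch below replaces DES with a made-up (and very weak) 32-bit block cipher with 12-bit keys; the cipher, the key values and all function names are assumptions of this example. Only the structure of the attack mirrors the one described above, with P = D(E(C, k2), k3).

```python
MASK = 0xFFFFFFFF
KEY_BITS = 12


def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK


def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def encrypt(block, key):
    """Toy invertible cipher, standing in for DES in this sketch."""
    x = block
    for r in range(4):
        x = (x + key * 0x9E3779B1 + r) & MASK
        x = rotl(x, 7)
        x ^= (key << 13) | key
    return x


def decrypt(block, key):
    x = block
    for r in reversed(range(4)):
        x ^= (key << 13) | key
        x = rotr(x, 7)
        x = (x - key * 0x9E3779B1 - r) & MASK
    return x


def meet_in_the_middle(plain, cipher):
    # Step 1: encrypt the known plaintext with every k3 and store the result
    table = {}
    for k3 in range(1 << KEY_BITS):
        table.setdefault(encrypt(plain, k3), []).append(k3)
    # Step 2: encrypt the ciphertext with every k2 and look for a match
    matches = []
    for k2 in range(1 << KEY_BITS):
        for k3 in table.get(encrypt(cipher, k2), []):
            matches.append((k2, k3))
    return matches


# Build a test ciphertext with two secret keys, then recover them.
SECRET_K2, SECRET_K3 = 0x123, 0xABC
P = int.from_bytes(b"Ogge", "big")
C = decrypt(encrypt(P, SECRET_K3), SECRET_K2)
matches = meet_in_the_middle(P, C)
print([(hex(k2), hex(k3)) for k2, k3 in matches])
```

The attack performs 2 * 2^12 encryptions instead of the 2^24 required by a brute force over both keys. With a single known block there can be false positives, so each candidate pair should be confirmed on a second block.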

On my laptop it took a while to complete. In the following result you can see the execution of the first step of the attack:


-=[ Start encrypt plaintext: 6/6/2019 2:54:27 PM ]=-
Start iteration 0 of 7 at 6/6/2019 2:54:27 PM
Start iteration 1 of 7 at 6/6/2019 3:07:01 PM
Start iteration 2 of 7 at 6/6/2019 3:18:45 PM
Start iteration 3 of 7 at 6/6/2019 3:32:16 PM
Start iteration 4 of 7 at 6/6/2019 3:45:15 PM
Start iteration 5 of 7 at 6/6/2019 3:58:26 PM
Start iteration 6 of 7 at 6/6/2019 4:10:37 PM
Start iteration 7 of 7 at 6/6/2019 4:22:27 PM
-=[ End encrypt plaintext: 6/6/2019 4:34:14 PM ]=-


And here is the second part of the attack where you can see that the password was successfully bruteforced:


-=[ Populate storage from pre-built table: 6/6/2019 5:56:37 PM ]=-
-=[ Start identify key: 6/6/2019 5:58:03 PM ]=-
Start iteration 0 of 7 at 6/6/2019 5:58:03 PM
Start iteration 1 of 7 at 6/6/2019 6:01:58 PM
Encrypt Password found: 20-8A-34-40-00-00-00-00
Decrypt Password found: 80-A0-DE-1C-00-00-00-00
-=[ End identify key: 6/6/2019 6:08:43 PM ]=-
-=[ Secret message: 6/6/2019 6:08:47 PM ]=-
Oggetto: Finché la barca va.

Quell'augel d'ebano, allora, così tronfio e pettoruto
tentò fino ad un sorriso il mio spirito abbattuto:
«Sebben spiumato e torvo, - dissi, - un vile non sei tu
certo, o vecchio spettral corvo della tenebra di Pluto?
Quale nome a te gli araldi dànno a corte di Re Pluto?»
Disse il corvo allor: «HM{2005d05af414ac92a3ffc5beecbd94f4}!».

PS: questo «4DES» non mi sembra molto sicuro. ho dei seri dubbi sull'algo-
ritmo di hashing della password, e comunque quando si implementa un algorit-
mo non standard si rischia sempre di fare degli erroi grossolani. anche ba-
nali errori di copia-incolla possono essere fatali per la sicurezza. E poi
il logo è così lento a disegnarsi! Troviamo un'alternativa?



Once I have retrieved the keys I'm able to decrypt the text and read the FLAG (which is HM{2005d05af414ac92a3ffc5beecbd94f4}), as can be seen in the following video:



Conclusion

This challenge was very entertaining, not because of the reversing part (which was pretty easy, to be honest) but because it was built to show how difficult it is to implement a new cryptographic algorithm, by demonstrating how a real-world attack works.

Side Note

The challenge is full of funny comments that poke fun at hackers. The author told me that she created this challenge with a middle-aged developer in mind.

The final message is an excerpt from the poem "The Raven" in which the word "Nevermore" was replaced with the MD5 of "Barbra Streisand" (the actual CTF flag value). This is a tribute to a meme that was popular at that time (actually I didn't catch the reference myself; it was the author who told me to search for that MD5).

Source Code

The full source code is on my GitHub account. I report here just the parts that matter most for breaking the encryption (for an updated version please visit the GitHub repository).

namespace Hm0x14Writeup

open System
open System.Text
open System.IO
open System.Collections.Generic
open System.Reflection

module Program =    

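 // Builds an 8-byte key from four small integers: k0 supplies 3 bits and
 // k1..k3 supply 7 bits each (24 bits in total); the last four bytes stay zero.
 let mangleKey(k0: Int32, k1: Int32, k2: Int32, k3: Int32) = [|
  byte k0 <<< 5
  byte k1 <<< 1
  byte k2 <<< 1
  byte k3 <<< 1
  0uy
  0uy
  0uy
  0uy
 |]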

 let getKeys() = seq {
  for i0=0 to 0x7 do
   Console.WriteLine("Start iteration {0} of 7 at {1} ", i0, DateTime.Now)
   for i1=0 to 0x7F do
    for i2=0 to 0x7F do
     for i3=0 to 0x7F do
      yield (mangleKey(i0, i1, i2, i3))
 }

 let buildEncryptedTextTable(plainText: Byte array, storage: Dictionary<String, Byte array>) =
  Console.WriteLine("-=[ Start encrypt plaintext: {0} ]=-", DateTime.Now)

  getKeys()
  |> Seq.iter(fun key ->
   try
    let encryptedBuffer = Encryption.encrypt(plainText, key)
    storage.[BitConverter.ToString(encryptedBuffer)] <- key
   with _ -> ()
  )

  Console.WriteLine()
  Console.WriteLine("-=[ End encrypt plaintext: {0} ]=-", DateTime.Now)
  Console.WriteLine()

 let populateStorage(storage: Dictionary<String, Byte array>) =
  if not <| Storage.storageExists() then
   let plainText = Encoding.UTF8.GetBytes("Oggetto:")
   buildEncryptedTextTable(plainText, storage)
   Storage.saveEncryptedText(storage) 
  Storage.loadEncryptedText(storage)

 let findKey(encryptedText: Byte array, storage: Dictionary<String, Byte array>) =
  Console.WriteLine("-=[ Start identify key: {0} ]=-", DateTime.Now)
  let mutable (encKey, decKey) = (Array.empty<Byte>, Array.empty<Byte>)

  getKeys()
  |> Seq.iter(fun key ->
   try
    let encryptedBuffer = Encryption.encrypt(encryptedText, key) |> BitConverter.ToString
    if storage.ContainsKey(encryptedBuffer) then
     encKey <- storage.[encryptedBuffer]
     decKey <- key
     Console.WriteLine("Encrypt Password found: " + BitConverter.ToString(storage.[encryptedBuffer]))
     Console.WriteLine("Decrypt Password found: " + BitConverter.ToString(key))                
     Console.ReadLine() |> ignore
   with _ -> ()
  )

  Console.WriteLine()
  Console.WriteLine("-=[ End identify key: {0} ]=-", DateTime.Now)
  Console.WriteLine()

  (encKey, decKey)

 let getCipherText() =
  let curDir = Path.GetDirectoryName(Assembly.GetEntryAssembly().Location)
  File.ReadAllBytes(Path.Combine(curDir, "4DES_SEGRETO"))

 [<EntryPoint>]
 let main argv = 
  let storage = new Dictionary<String, Byte array>()
  populateStorage(storage)

  // recover both keys from the first 8-byte block of the ciphertext
  let ciphertext = getCipherText()     
  let (encKey, decKey) = findKey(ciphertext.[0..7], storage)

  // decrypt and print the secret message
  let secretMessage = Encryption.twoDesDecrypt(encKey, decKey, ciphertext)
  Console.WriteLine(secretMessage)
  0

Sunday, 19 May 2019

Sojobo - Yet another binary analysis framework

Twitter: @s4tan

Sojobo GitHub project: https://github.com/enkomio/Sojobo

Sojobo is a new binary analysis framework written in .NET and based on B2R2. I created this project for learning purposes and to make my work easier during malware analysis.

B2R2

A couple of months ago a new binary analysis framework named B2R2 was released ([01, 02]); it also won the "BAR 2019 Best Paper Award" ([03]). It immediately attracted my attention since it is fully developed in F# on .NET Core and doesn't need any external libraries. This was a big plus for me since I love F# and I have always had issues with the most common binary analysis frameworks (they need a specific library version, or the Python binding doesn't work with the latest version, or they are supposed to run only on Linux).

B2R2 is a framework with an academic origin (this is a very rare case, since academics are often reluctant to release working source code) and the developer is very responsive (and kind) on GitHub. It supports various CPU architectures and implements a new IR (LowUIR) which is very simple to understand. It all sounds very promising :)

Unfortunately, as the B2R2 main developer wrote ([04]), it is a frontend framework and at the moment no implementation is provided as backend. Also, they are considering running a business on the implementation of a backend framework and at the moment they are unsure when they will release it.

While waiting for such code to be released, I decided to write a backend of my own :)

Using Sojobo

Sojobo allows you to emulate a 32-bit PE binary and to interact with the emulation. It implements a Sandbox class that can be used to emulate a given binary. In the following paragraph we will see how to write a simple generic unpacker.

Implementing a generic unpacker

As a first example I tried to write a tool that dumps a dynamically allocated memory region which is then executed. My purpose was to write a generic unpacker (as a POC of course) by following the principles described in the paper "Automatic Static Unpacking of Malware Binaries" ([05]). This kind of tool is pretty common among malware analysts; recently a new one was released ([06]).

You can find the source code of this sample in the GitHub repository; I'll paste it here for convenience:


#include <stdint.h>
#include <Windows.h>

void copy_code(void *buffer)
{
 __asm 
 {
  jmp start
 code:
  push ebp
  mov ebp, esp
  xor eax, eax
  mov edx, 1
  mov ecx, DWORD PTR [ebp+8]
 l: 
  xadd eax, edx
  loop l
  mov esp, ebp
  pop ebp
  ret
 start:
  mov esi, code;
  mov edi, buffer;
  mov ecx, start;
  sub ecx, code;
  rep movsb
 }
}


int main()
{
 uint32_t ret_val = 0;
 void *fibonacci = VirtualAlloc(NULL, 0x1000, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
 copy_code(fibonacci);
 ret_val = ((uint32_t (*)(uint32_t))fibonacci)(6);
 VirtualFree(fibonacci, 0x1000, MEM_RELEASE);
 return 0;
}


As you can see the code allocates a new memory region, invokes a function to copy some code and executes it. I tried to mimic a malware that unpacks the real payload in memory and then executes it. My goal is to dump such code.

To do that I'll follow a simple principle (described in the referenced paper): if a memory region that was previously written to is executed, then I'll dump it to disk. By using Sojobo I subscribed to an event that is raised each time memory is accessed. I can now step through the process and monitor whether a region that was previously written is now executed.
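
The write-then-execute principle can be pictured with a minimal Python sketch. It is independent of Sojobo: the WriteThenExecuteTracker class, its method names and the 4 KB page granularity are assumptions made only for this illustration.

```python
PAGE_SIZE = 0x1000  # assumed tracking granularity

class WriteThenExecuteTracker:
    """Remembers written pages and reports the first execution inside each of them."""

    def __init__(self):
        self.written_pages = set()
        self.dumped_pages = set()

    def on_write(self, address, size=1):
        # Mark every page touched by the write.
        first = address // PAGE_SIZE
        last = (address + size - 1) // PAGE_SIZE
        self.written_pages.update(range(first, last + 1))

    def on_execute(self, pc):
        # Return the base of the page to dump, or None if nothing has to be done.
        page = pc // PAGE_SIZE
        if page in self.written_pages and page not in self.dumped_pages:
            self.dumped_pages.add(page)
            return page * PAGE_SIZE
        return None
```

An emulator would call on_write from its memory-access hook and on_execute from its step hook; a non-None result is the signal to dump that page to disk.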

One of the first issues was emulating the invocation of external functions (like VirtualAlloc). With Sojobo you can easily emulate such calls by following a given coding convention (I'm a fan of the convention over configuration paradigm [07]). But don't worry, Sojobo already implements emulation for some functions and I plan to support many more.

That said, the solution to our problem is the following one (the code is also on GitHub):

namespace ES.EndToEndTests

open System
open System.IO
open System.Collections.Generic
open B2R2
open ES.Sojobo.Model
open ES.Sojobo

module DumpDynamicMemory =
    let private _memoryRegions = new List<MemoryRegion>()
    let mutable private _memoryDumped = false

    let private memoryAccessedHandler(operation: MemoryAccessOperation) =
        match operation with
        | Read address -> ()
        | Write(address, value) -> ()
        | Allocate memRegion -> _memoryRegions.Add(memRegion)
        | Free memRegion -> ()

    let private writeDisassembly(activeProcess: IProcessContainer) =
        let text = Utility.formatCurrentInstruction(activeProcess)
        Console.WriteLine(text)

    let private identifyUnpackedCode(activeProcess: IProcessContainer) =
        if not _memoryDumped then
            let pc = activeProcess.GetProgramCounter().Value |> BitVector.toUInt32
            _memoryRegions
            |> Seq.tryFind(fun memRegion -> 
                pc >= uint32 memRegion.BaseAddress &&
                pc < uint32 memRegion.BaseAddress + uint32 memRegion.Content.Length
            )
            |> Option.iter(fun memRegion ->
                // a previously allocated region now is being executed, maybe unpacked code!            
                let filename = String.Format("mem_{0}.bin", memRegion.BaseAddress)
                File.WriteAllBytes(filename, memRegion.Content)
                Console.WriteLine("[+] Dynamic code dumped to: {0}!", filename)
                _memoryDumped <- true
            )

    let private step(activeProcess: IProcessContainer) =
        writeDisassembly(activeProcess)
        identifyUnpackedCode(activeProcess)

    let private getTestFile() =
        ["Release"; "Debug"]
        |> Seq.map(fun dir -> Path.Combine("..", "..", "..", dir, "RunShellcodeWithVirtualAlloc.exe"))
        |> Seq.tryFind(File.Exists)

    let ``dump dynamically executed memory``() =
        let sandbox = new Win32Sandbox() 
        let exe = 
            match getTestFile() with
            | Some exe -> exe
            | None ->
                Console.WriteLine("RunShellcodeWithVirtualAlloc.exe not found, please compile it first!")
                Environment.Exit(1)
                String.Empty

        sandbox.Load(exe)

        // setup handlers
        let proc = sandbox.GetRunningProcess()
        proc.Memory.MemoryAccess.Add(memoryAccessedHandler)
        proc.Step.Add(step)
        
        // print imported function
        proc.GetImportedFunctions()
        |> Seq.iter(fun symbol ->
            Console.WriteLine(
                "Import: [0x{0}] {1} ({2}) from {3}", 
                symbol.Address.ToString("X"), 
                symbol.Name, 
                symbol.Kind, 
                symbol.LibraryName
            )            
        )
        
        // run the sample
        sandbox.Run()


The code is quite simple: each time a memory region is allocated I add it to a list. For each executed instruction I check whether EIP falls in the range of one of the previously allocated regions and, if so, I dump the region content to disk. If we execute the code, a new file is written to disk which contains the following disassembled code:


L_00000000:   push ebp
L_00000001:   mov ebp, esp
L_00000003:   xor eax, eax
L_00000005:   mov edx, 0x1
L_0000000A:   mov ecx, [ebp+0x8]
L_0000000D:   xadd eax, edx
L_00000010:   loop 0xd
L_00000012:   pop ebp
L_00000013:   ret 


A real world sample: emulating KPOT v2.0 and dumping the deobfuscated strings

Let's try to use Sojobo with a real world case. Recently, Proofpoint published a new article about a new KPOT version ([08]). We will consider the sample with SHA256: 67f8302a2fd28d15f62d6d20d748bfe350334e5353cbdef112bd1f8231b5599d.

In the GitHub repository I included the KPOT sample too. I took precautions to make sure that it is not executed by mistake (it is XORed, base64 encoded and has a corrupted PE header).

Our goal is to dump the strings once they are decrypted. The function in charge of the decryption is at address 0x0040C8F5; once it returns, EAX holds the length of the string and the EDI register points to the decrypted buffer. We can then read the memory content and print it.

Sojobo tries to emulate the most common functions and in particular it emulates GetLastError by returning 0 (success). If we take a look at the KPOT code we spot the following snippet:


.text:004103BB                 call    ds:LoadUserProfileW
.text:004103C1                 test    eax, eax
.text:004103C3                 jnz     short loc_4103D0
.text:004103C5                 call    ds:GetLastError
.text:004103CB                 cmp     eax, 57h ; 'W'
.text:004103CE                 jz      short loc_4103D5
.text:004103D0                 jmp     near ptr loc_4103D0+1 ; Jump to garbage


Basically, if the GetLastError code is different from 0x57 the process crashes (it jumps to garbage data). So we have to override the default GetLastError definition in order to force it to return 0x57. This is done by creating a class named Kernel32 with a function named GetLastError that accepts an ISandbox object as its first parameter. Take a look at this file for the implementation details. Then, we add our assembly to the Sandbox so that our function implementation is taken into account. Finally, as done before, we set up a process step handler, which contains the following code:


private static void ProcessStep(Object sender, IProcessContainer process)
{
 var ip = process.GetProgramCounter().ToInt32();
 if (ip == _retAddresDecryptString)
 {
  // read registers value
  var decryptedBufferAddress = process.GetRegister("EDI").ToUInt64();
  var bufferLength = process.GetRegister("EAX").ToInt32();
  
  // read decrypted string
  var decryptedBuffer = process.Memory.ReadMemory(decryptedBufferAddress, bufferLength);
  var decryptedString = Encoding.UTF8.GetString(decryptedBuffer);
  Console.WriteLine("[+] {0}", decryptedString);
 }
}
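
The convention-based override of GetLastError described above can be pictured with a small Python sketch. This is not Sojobo code: ToySandbox, builtin_libraries and the lookup rule (class name equals DLL name, method name equals function name, the most recently added library wins) are assumptions used only to illustrate the convention over configuration idea.

```python
def builtin_libraries():
    """Default emulated libraries of the hypothetical emulator."""
    class Kernel32:
        @staticmethod
        def GetLastError(sandbox):
            return 0  # ERROR_SUCCESS
    return [Kernel32]

class Kernel32:
    """User-supplied override, found purely through its class and method names."""
    @staticmethod
    def GetLastError(sandbox):
        return 0x57  # ERROR_INVALID_PARAMETER, the value KPOT expects

class ToySandbox:
    def __init__(self):
        self._libraries = list(builtin_libraries())

    def add_library(self, library):
        self._libraries.append(library)

    def resolve(self, dll_name, function_name):
        wanted = dll_name.lower()
        if wanted.endswith(".dll"):
            wanted = wanted[:-4]
        # Later registrations take precedence over the defaults.
        for library in reversed(self._libraries):
            if library.__name__.lower() == wanted:
                handler = getattr(library, function_name, None)
                if handler is not None:
                    return handler
        raise LookupError(f"{dll_name}!{function_name} is not emulated")

    def call(self, dll_name, function_name):
        return self.resolve(dll_name, function_name)(self)
```

No registration table has to be maintained: dropping in a class with the right name is enough to replace the default behaviour.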


By reversing the sample we know that the decrypt function ends at address 0x0040C928, so when this point is reached we can dump the decrypted string by reading the EAX and EDI register values and then the process memory. Find below an example of execution:


-=[ Start Emulation ]=-
[+] wininet.dll
[+] winhttp.dll
[+] ws2_32.dll
[+] user32.dll
[+] shell32.dll
[+] advapi32.dll
[+] dnsapi.dll
[+] netapi32.dll
[+] gdi32.dll
[+] gdiplus.dll
[+] oleaut32.dll
[+] ole32.dll
[+] shlwapi.dll
[+] userenv.dll
[+] urlmon.dll
[+] crypt32.dll
[+] mpr.dll
-=[ Emulation Completed ]=-


Of course that list is by no means exhaustive. We will see why in the next paragraphs.

Is it really so simple and smooth?

I would love to say yes, but there are still some limitations (that I already plan to solve). The output above is taken by emulating the KPOT function that is in charge of loading the DLLs that are really used. Before that code we have the following one:


.text:00406966 64 A1 30 00 00 00             mov     eax, large fs:30h ; read PEB
.text:0040696C 8B 40 18                      mov     eax, [eax+18h]    ; read Heap
.text:0040696F C3                            retn


Basically, it reads the Heap base address from the PEB. A solution to this would be to place some fake values, but it is not a good solution in the long term (KPOT resolves function addresses by walking the EAT). So I defined PEB and TEB structures and wrote them to the process memory (I also correctly initialized the FS register). I have also implemented a serialization algorithm that allows us to "read" a typed object from memory (instead of just a bunch of raw bytes). This will be very handy if we want to customize some complex structure (like the PEB in this case). In the next paragraph we will take advantage of this feature.

The second problem is that KPOT tries to resolve function addresses by walking the Ldr field. It also uses the Ldr field to find the base address of Kernel32; this is done by the following code:
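
Reading a typed object instead of raw bytes can be sketched in Python with ctypes. This is only an illustration of the idea, not the Sojobo serializer: FlatMemory is a hypothetical stand-in for the emulated memory, and Peb32 describes only the first fields of the 32-bit PEB (the Ldr and ProcessHeap offsets match the 0x0C and 0x18 used by the KPOT code).

```python
import ctypes

class Peb32(ctypes.LittleEndianStructure):
    # Partial layout of the 32-bit PEB; only the leading fields are described.
    _fields_ = [
        ("InheritedAddressSpace", ctypes.c_uint8),
        ("ReadImageFileExecOptions", ctypes.c_uint8),
        ("BeingDebugged", ctypes.c_uint8),
        ("BitField", ctypes.c_uint8),
        ("Mutant", ctypes.c_uint32),
        ("ImageBaseAddress", ctypes.c_uint32),
        ("Ldr", ctypes.c_uint32),            # offset 0x0C
        ("ProcessParameters", ctypes.c_uint32),
        ("SubSystemData", ctypes.c_uint32),
        ("ProcessHeap", ctypes.c_uint32),    # offset 0x18
    ]

class FlatMemory:
    """Hypothetical emulated memory: one contiguous buffer mapped at a base address."""

    def __init__(self, base, size):
        self.base = base
        self.data = bytearray(size)

    def write_struct(self, address, value):
        # Serialize the structure and copy its bytes into the buffer.
        offset = address - self.base
        raw = bytes(value)
        self.data[offset:offset + len(raw)] = raw

    def read_struct(self, address, struct_type):
        # Build a typed object from the bytes stored at the given address.
        return struct_type.from_buffer_copy(self.data, address - self.base)
```

With such a helper, customizing the PEB becomes a matter of reading the object, changing a field and writing it back, with no manual offset arithmetic.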


.text:00406936                               get_Kernel32_base_via_Ldr proc near
.text:00406936 64 A1 30 00 00 00             mov     eax, large fs:30h ; read PEB
.text:0040693C 8B 40 0C                      mov     eax, [eax+0Ch]    ; read Ldr
.text:0040693F 8B 40 0C                      mov     eax, [eax+0Ch]    ; read InLoadOrderModuleList
.text:00406942 8B 00                         mov     eax, [eax]        ; read first entry (ntdll)
.text:00406944 8B 00                         mov     eax, [eax]        ; read second entry (kernel32)
.text:00406946 8B 40 18                      mov     eax, [eax+18h]    ; read DllBase
.text:00406949 C3                            retn 
.text:00406949                               get_Kernel32_base_via_Ldr endp


Even in this case you can just fake this value and write the LDR_DATA_TABLE_ENTRY structure back to memory, but very soon you will discover that this strategy will fail (in fact, in our test the emulation raises an exception).
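
To make the pointer chasing of the routine above concrete, here is a minimal Python sketch that replays the same five reads on a fake memory. Memory32, build_fake_process and every address are invented for the example; the sketch also assumes three list entries in load order, with the last one standing in for Kernel32.

```python
import struct

class Memory32:
    """Sparse little-endian memory, just enough to chase 32-bit pointers."""

    def __init__(self):
        self._bytes = {}

    def write_u32(self, address, value):
        for i, b in enumerate(struct.pack("<I", value)):
            self._bytes[address + i] = b

    def read_u32(self, address):
        raw = bytes(self._bytes.get(address + i, 0) for i in range(4))
        return struct.unpack("<I", raw)[0]

def build_fake_process(mem, peb, ldr, modules):
    """modules: list of (entry_address, dll_base) tuples in load order."""
    mem.write_u32(peb + 0x0C, ldr)                 # PEB.Ldr
    list_head = ldr + 0x0C                         # InLoadOrderModuleList
    links = [list_head] + [entry for entry, _ in modules]
    for current, following in zip(links, links[1:] + [list_head]):
        mem.write_u32(current, following)          # Flink of each node
    for entry, dll_base in modules:
        mem.write_u32(entry + 0x18, dll_base)      # DllBase

def kernel32_base_via_ldr(mem, peb):
    eax = mem.read_u32(peb + 0x0C)   # mov eax, [eax+0Ch]  ; Ldr
    eax = mem.read_u32(eax + 0x0C)   # mov eax, [eax+0Ch]  ; first list entry
    eax = mem.read_u32(eax)          # mov eax, [eax]      ; next entry
    eax = mem.read_u32(eax)          # mov eax, [eax]      ; next entry
    return mem.read_u32(eax + 0x18)  # mov eax, [eax+18h]  ; DllBase
```

The walk only works if every node of the list is consistent, which is why faking a single value is not enough once the malware starts following the links.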

Dumping all strings from KPOT v2.0 (for real)

In the previous paragraph I introduced a feature that allows us to read objects from the process memory. In this paragraph we will see how to dump all encrypted strings in a very easy way. As said by Proofpoint, all strings are encrypted with a very simple algorithm and stored in a struct that has the following layout:


public class EncryptedString
{
 public UInt16 EncryptionKey;
 public UInt16 StringLength;
 public UInt32 Buffer;

 public String Decrypt(IProcessContainer process)
 {
  var buffer = process.Memory.ReadMemory(this.Buffer, this.StringLength);
  var stringContent = new StringBuilder();
  foreach(var b in buffer)
  {
   stringContent.Append((Char)(b ^ this.EncryptionKey));
  }

  return stringContent.ToString();
 }
}


It would be very useful if we could read an EncryptedString object from the process memory instead of a raw byte array (as done by the Proofpoint Python script). With Sojobo you can do it, and the code to print all the decrypted strings is as simple as this:


private static void DecryptStrings(IProcessContainer process)
{
 Console.WriteLine("-=[ Start Dump All Strings ]=-");
 
 // encrypted strings
 var encryptedStringsStartAddress = 0x00401288UL;
 var encryptedStringsEndAddress = 0x00401838UL;

 var currentOffset = encryptedStringsStartAddress;
 while (currentOffset < encryptedStringsEndAddress)
 {
  var encryptedString = process.Memory.ReadMemory<EncryptedString>(currentOffset);
  var decryptedString = encryptedString.Decrypt(process);
  Console.WriteLine("[+] {0}", decryptedString);

  // go to the next string
  currentOffset += 8UL; 
 }

 Console.WriteLine("-=[ Dump All Strings Completed ]=-");
}


In the GitHub repository you can find the full source code (to dump all strings pass --strings as first argument). The result is the same as the one provided by Proofpoint (but with cleaner code :P).
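
For readers without the emulator at hand, the same table walk can be sketched in plain Python over a byte buffer. The 8-byte entry layout (key, length, pointer) mirrors the EncryptedString class above; the function names, the image buffer and the base address are assumptions of this sketch.

```python
import struct

ENTRY_FORMAT = "<HHI"                        # EncryptionKey, StringLength, Buffer
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)   # 8 bytes per table entry

def decrypt_entry(image, base, entry_address):
    # Parse one table entry, then XOR every byte of the referenced buffer with the key.
    key, length, pointer = struct.unpack_from(ENTRY_FORMAT, image, entry_address - base)
    start = pointer - base
    return "".join(chr(b ^ key) for b in image[start:start + length])

def dump_strings(image, base, table_start, table_end):
    # Walk the table from table_start (inclusive) to table_end (exclusive).
    return [
        decrypt_entry(image, base, address)
        for address in range(table_start, table_end, ENTRY_SIZE)
    ]
```

Here image is expected to hold the bytes of the mapped module, so that an address minus base gives the offset into the buffer.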

Conclusion and future development

Sojobo is still in its infancy but it can already be used for some initial analysis. In future releases I'm going to add more emulated functions and the possibility to map other files into the process address space. By mapping external files (like Kernel32 or Ntdll) we can overcome problems related to indirect referencing (like in the case above) while still maintaining control over how the functions are emulated.

References

[01] B2R2: Building an Efficient Front-End for Binary Analysis - https://www.reddit.com/r/ReverseEngineering/comments/aultc1/b2r2_building_an_efficient_frontend_for_binary/
[02] B2R2: Building an Efficient Front-End for Binary Analysis (PDF) - https://ruoyuwang.me/bar2019/pdfs/bar2019-final51.pdf
[03] NDSS Workshop on Binary Analysis Research (BAR) 2019 - https://ruoyuwang.me/bar2019/
[04] Symbolic Execution component #question - https://github.com/B2R2-org/B2R2/issues/9
[05] Automatic Static Unpacking of Malware Binaries - https://www.researchgate.net/publication/221200507_Automatic_Static_Unpacking_of_Malware_Binaries
[06] MwEmu: Malware analysis emulator written in Python 3 (based on Unicorn) - ALPHA version - https://www.reddit.com/r/Malware/comments/bkb0p9/mwemu_malware_analysis_emulator_written_in_python/
[07] Convention over configuration - https://en.wikipedia.org/wiki/Convention_over_configuration
[08] New KPOT v2.0 stealer brings zero persistence and in-memory features to silently steal credentials - https://www.proofpoint.com/us/threat-insight/post/new-kpot-v20-stealer-brings-zero-persistence-and-memory-features-silently-steal

Sunday, 11 November 2018

Sacara VM Vs Antivirus Industry

Twitter: @s4tan

Sacara VM GitHub project: https://github.com/enkomio/sacara

In this blog post I want to describe my latest side project a bit and provide some data about how effective protections based on software virtualization are.

State of the art

If you have ever read an academic paper, you will have noticed that it is imperative to describe the current state of the art of the topic discussed. I find this section very helpful, so I decided to report here the articles that I have read and, in my opinion, their technical level. Of course this is not a complete list and it is very likely that I have missed some good resources.

Level beginner

As often happens, there are a lot of good resources to start with, and this is also true for the VM protection concept. At this level I think that the only needed skills are being able to read Assembly and to use a debugger. If you are looking for some code to read I suggest you take a look at Pasticciotto ([01]). It also has a nice writeup about how the VM works and which opcodes are implemented. Another very interesting challenge is the one created by MalwareTechBlog, where you have to reverse a binary in order to obtain the flag. You can find a good write-up at [02].

Level intermediate

Let's raise the difficulty bar and see some projects that were created with the real purpose of protecting code. The required skill is being able to write some simple scripts to make your task easier, but nothing too advanced.

Considering projects created only for fun, the two most renowned ones are hyperunpackme2 by thehyper ([03]) and the ReWolf x86 Virtualizer ([04]).
Maximus wrote a good (and lengthy) write-up about the first challenge at [05]. Rolf Rolles also wrote a post in which he created an IDA Processor module to analyze the code ([06]). Before you ask me, I don't consider writing a full IDA Processor module as having basic IDA scripting skills :)

Level Advanced

To tackle advanced reverse engineering problems it is not enough to have a very good understanding of the theoretical concepts; it is also necessary to be proficient with the available tools.

At this level, understanding what a program is doing requires an amount of work that cannot be handled by just looking at the assembly code (at least not without an enormous amount of pain). There are three cases in particular that I consider pretty difficult to analyze.

The first one is a crackme challenge implemented by Solar Designer in 1996 (yes, you read it correctly, more than 22 years ago) [07]. In his project the author implemented what is known as a "one instruction set computer (OISC)"; in particular he based all his work on the NOR instruction.

The second one is challenge number 12 of the 2018 Flare-On challenge (Suspicious Floppy Disk, by Nick Harbour). In this case the author went one step further and implemented two nested OISCs, where the first one is a SUbtract and Branch if Less than or EQual, aka "subleq", and the second one is a Reverse Subtract and Skip if Borrow, aka "RSSB".
You can read a solution for this challenge at [08,09].
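
To give an idea of how little a subleq machine needs, here is a minimal interpreter sketch in Python. It is unrelated to the Flare-On binary: the memory layout, the halting convention (a negative program counter) and the sample program that adds two numbers are all assumptions of this example.

```python
def run_subleq(memory, pc=0):
    """Each instruction is three cells a, b, c: mem[b] -= mem[a]; jump to c if the result <= 0."""
    while pc >= 0:
        a, b, c = memory[pc:pc + 3]
        memory[b] -= memory[a]
        pc = c if memory[b] <= 0 else pc + 3
    return memory

def add_program(x, y):
    # Cells 9, 10 and 11 hold A, B and a scratch cell Z (initially zero).
    return [
        9, 11, 3,     # Z = Z - A  (Z becomes -A), continue at 3 either way
        11, 10, 6,    # B = B - Z  (B becomes A + B), continue at 6 either way
        11, 11, -1,   # Z = Z - Z  (zero), which is <= 0, so jump to -1 and halt
        x, y, 0,
    ]
```

Even a plain addition takes three instructions and a scratch cell, which hints at how verbose (and how tedious to read) a real program becomes on this kind of machine.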

The last example, straight from academia, is the Tigress challenge [10], which is based on the obfuscation of various hash functions using state-of-the-art protections (VM, jitting, etc.). A solution to part of the challenge was provided by Jonathan Salwan in [11].

As you can see by reading the solutions to those challenges, the authors used some advanced techniques that involve the creation of a custom CPU processor or emulation via symbolic execution. Without proficient knowledge of the tools, solving that kind of challenge would be a very complicated (almost impossible) task.

Introducing Sacara VM

Sacara is another project that implements a custom low-level language that can be used to obfuscate parts of your code. It is not a tool that translates a PE binary into an obfuscated one; you have to write your own program :)

It tries to protect the code by using some features that make reverse engineering more difficult (like opcode encryption based on the location, multiple representations of the same opcode, usage of the NOR instruction to implement various arithmetic functions, anti-debugging, and so on).

I created the project since I wanted to experiment a bit in this area. In the GitHub repository you can find the assembler (written in F#) and the VM to execute the code (written in x86 assembly). I'm not going to describe in detail how it works; it is open source, read the code if you are curious :) Instead, I want to show you how effective this kind of protection can be at hiding the real meaning of a program when the binary is analyzed by an Antivirus.

Before proceeding I want to make clear that this post is not another rant on how the AV industry sucks. Too often people forget how difficult it is to implement this kind of program. If you really want to write a rant about it, please be sure to also present an effective solution to the problems identified.
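
As an illustration of the NOR trick (and not of Sacara's actual implementation, which is in the repository), the following Python sketch rebuilds the usual logic operators and a 32-bit addition starting from a single nor function. All names are made up for the example, and the left shift used to propagate the carry is the only primitive that is not derived from NOR.

```python
MASK = 0xFFFFFFFF  # work on 32-bit values

def nor(a, b):
    return ~(a | b) & MASK

def not_(a):
    return nor(a, a)

def or_(a, b):
    return not_(nor(a, b))

def and_(a, b):
    return nor(not_(a), not_(b))

def xor_(a, b):
    # a xor b = (a or b) and not (a and b)
    return nor(and_(a, b), nor(a, b))

def add(a, b):
    # Ripple the carry until nothing is left to propagate.
    while b:
        carry = (and_(a, b) << 1) & MASK
        a = xor_(a, b)
        b = carry
    return a
```

A single addition expands into many NOR operations, so the original arithmetic is much harder to recognize in a trace of the executed instructions.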

Protecting a .NET binary

For my test I created a sample application that reads a blob from its resources and loads it via the Assembly.Load method. You can find the source code of this program in the GitHub project, under the Example\LoadEncryptedAssembly directory.

The program allows you to specify a .NET binary and a password in order to create a copy of itself with the specified file "encrypted" and embedded in its resources. The encryption is very simple, here is the code:

public static void ManagedEncrypt(Byte[] buffer, String password)
{
 var key = Encoding.Default.GetBytes(password);
 for (var i = 0; i < buffer.Length; i++)
 {
  buffer[i] = (byte)(buffer[i] ^ key[i % key.Length]);
 }
}

Once that is done, you can invoke the newly created program, which just loads the resource, decrypts it and runs it.

The important point is that I used the Sacara VM to decrypt the data. To do this I created a simple script that you can find here; find below the source:

In order to have a realistic test I chose a malware from VirusTotal with a very high detection rate. After searching for the Assembly keyword I found this file: 3dd7ae0bca5e8e817581646c0e77885ffd3a60333a5bd24df9ccbe90b9938293, which has a detection rate of 65/68, as you can see in the following image:



Then, I ran the following command:

 LoadEncryptedAssembly.exe -b 3dd7ae0bca5e8e817581646c0e77885ffd3a60333a5bd24df9ccbe90b9938293 -p sacara
 -=[ Dynamically load encrypted Assembly SacaraVm sample ]=-
 For more information pass -h as argument
 New file 'LoadEncryptedAssembly.build.exe' generated. Run it to execute the program.

As I said before, the command takes the file, encrypts it using sacara as the password and embeds it in the resources. It generates a new file named LoadEncryptedAssembly.build.exe; if you run it you will see that after a while the original malware binary is executed.

The question is, how effective is this kind of protection? I uploaded the new file to VT (2e46664c52373b9ec14c64496cf1d18661e745fb83f1cdaaf73970d4fca59bbe) in order to analyze it and, as you can see from the following image, the detection rate dropped drastically to 3/64:



Conclusion

As you have noticed, using an obfuscation based on a software VM brought a malware that had a detection rate of 65/68 down to a detection rate of 3/64.

There may be various reasons for this. I suspect that the transition from the managed world to the unmanaged world (in order to execute the decryption routine) may cause some problems. But this is something that most .NET malware already does, so I guess it shouldn't influence the result too much.

The second possibility is that the software emulation of the encryption code has caused trouble for the detection engines. Of course, all of this is pure speculation :)

References

[01] pasticciotto - https://github.com/peperunas/pasticciotto
[02] Reverse Engineering simple VM crackme - https://secrary.com/CrackMe/VM_1_MalwareTech/
[03] hyperunpackme2 by thehyper - https://crackmes.one/crackme/5ab77f5633c5d40ad448c280
[04] ReWolf x86 Virtualizer - https://github.com/rwfpl/rewolf-x86-virtualizer
[05] Reversing a Simple Virtual Machine - http://index-of.co.uk/Reversing-Exploiting/Reversing a Simple Virtual Machine.pdf
[06] Defeating HyperUnpackMe2 With an IDA Processor Module - http://www.msreverseengineering.com/blog/2014/8/5/defeating-hyperunpackme2-with-an-ida-processor-module
[07] Hackme - ftp://ftp.df.ru/pub/solar/dos/hackme.com
[08] Suspicious Floppy Disk - https://www.fireeye.com/content/dam/fireeye-www/blog/pdfs/FlareOn5_Challenge12_Solution.pdf
[09] Flare-On 2018 - Challenge 12 - Subleq'n'RSSB - https://emanuelecozzi.net/posts/ctf/flareon-2018-challenge-12-subleq-rssb-writeup/
[10] Reverse Engineering Challenges! - http://tigress.cs.arizona.edu/challenges.html
[11] Tigress_protection - https://github.com/JonathanSalwan/Tigress_protection

Monday, 26 February 2018

Analyzing the nasty .NET protection of the Ploutus.D malware.

Twitter: @s4tan

EDIT: The source code is now online: https://github.com/enkomio/Conferences/tree/master/HackInBo2018

Recently the ATM malware Ploutus.D reappeared in the news as being used to attack US ATMs ([1]). In this post I'll show a possible analysis approach aimed at understanding its main protection. The protection is composed of different layers; I'll focus on the one that, in my opinion, is the most annoying, leaving the others out. If you want a clear picture of all the protections involved, I strongly recommend you take a look at the de4dot Reactor deobfuscator code.

Introduction

Reversing .NET malware, in most cases, is not that difficult. This is mostly due to the awesome tool dnSpy ([2]), which allows debugging of the decompiled version of the Assembly. Most of the .NET malware use some kind of loader which decrypts a blob of data and then loads the result through a call to the Assembly.Load method ([3]).

From time to time some more advanced protections are involved, like the one analysed by Talos in [4]. What the article doesn't say is that in this specific case the malware uses a multi-file assembly ([5]).

This implies that instead of using the Assembly.Load method, it uses the far less known Assembly.LoadModule method ([6]). This protection method is a bit more difficult to implement but I have to say that it is far more effective as obfuscation. The malware also encrypts the method bodies and decrypts them only when necessary. This protection is easily overcome by calling the "Reload All Method Bodies" command in dnSpy at the right moment (as also shown in the Talos article).

Ploutus.D is also protected with an obfuscator which encrypts the method bodies and decrypts them only when necessary. The protector used is .NET Reactor ([7]), as also pointed out in a presentation by Kaspersky ([8]). This particular protection is called NecroBit Protection, and on the product website we can read that:

NecroBit is a powerful protection technology which stops decompilation. NecroBit replaces the CIL code within methods with encrypted code. This way it is not possible to decompile/reverse engineer your method source code.


The difference with the previous case is that if we try to use the "Reload All Method Bodies" feature in dnSpy, it will fail (strictly speaking it does not fail: as we will see, there is simply nothing to reload).

Reversing Ploutus.D obfuscation

To write this blog post I reversed the sample with MD5 ae3adcc482edc3e0579e152038c3844e. When I start to analyse a .NET malware, my first task is to run my tool Shed ([9]) in order to get a broad overview of what the malware does and to try to extract dynamically loaded Assemblies. In this case I was able to extract some useful strings (like the configured backend usbtest[.]ddns[.]net) but not the Assembly with the decrypted method bodies (this is not an error: as we will see, it is the correct behaviour).

The next step is to debug the program with dnSpy. If you run it the following Form will be displayed:

I started to dig a bit into the classes that extend the Form class in order to identify which commands are supported. Unfortunately most of the methods of these classes are empty, as can be seen from the following screenshot:


It is interesting to note that none of the static constructors is empty. All of them are pretty simple (in some cases they have just one instruction), and all of them call the same method: P9ZBIKXMsRMxLdTfcG.Nf9E3QXmJD();, which is marked as internal unsafe static void Nf9E3QXmJD().

Analysing it, things start to get interesting: this method is pretty huge, mostly because it implements a very annoying control flow obfuscation. If we set a breakpoint on this method and restart the debugging session, we can see that it is amongst the first methods invoked by the program. Scrolling through the code we can find the following interesting statement:

if (P9ZBIKXMsRMxLdTfcG.Ax6OYTY7tiMf4Yu1B4(P9ZBIKXMsRMxLdTfcG.XnSi7dQe0TUTJbDcxg(P9ZBIKXMsRMxLdTfcG.CQNheW6eOQNeBsXbJC(processModule)), "clrjit.dll"))


This piece of code is particularly interesting, since it tries to identify the clrjit.dll module. Once found, it identifies the CLR version, which in my case is 4.0.30319.0. Then, it extracts the resource m7fEJg2w6sBe9LM3D3.i4tjc9Xt0Vhu5G72Uh.

After a while the getJit string appears in the execution. This function is exported by clrjit.dll and it is very important, since it allows the caller to get a pointer to the compileMethod method. To know more about it you can refer to my Phrack article about .NET program instrumentation ([10]). We can also identify a call to the VirtualProtect method.

With this information we can start to make some assumptions, for example that the malware hooks the compileMethod method in order to force the compilation of the real MSIL bytecode. Let's verify our assumption. In order to do so we need to change tool: we will use WinDbg with the SOS extension (if you want to know more about debugging .NET applications with WinDbg, take a look at my presentation [11]).

In order to inspect the program at the right moment, we will ask the debugger to break when the clrjit.dll library is loaded. This is easily done with the command:

sxe ld clrjit.dll
Once this exception is raised, let's inspect the clrjit module as shown in the following image:



The getJit method is exported by clrjit.dll and returns the address of the VTable of an ICorJitCompiler object, where the first item is a pointer to the compileMethod method, as can be seen from the source code ([12]). But, since we don't trust the source code, let's debug the getJit method up to the ret instruction and inspect the return value stored in eax:


As can be seen from the image above, the address of compileMethod is 0x70f049b0. Now let the program run until the main window is displayed and then break the process in the debugger. Let's display again the content of the VTable (which was at 0x70f71420).


As can be seen from the image above, the value of the first entry of the VTable changed from 0x70f049b0 to 0x002a0000. So our assumption about the hooking of compileMethod was right :)
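
The check done by hand in WinDbg can be sketched as a small helper: a VTable slot of clrjit.dll is suspicious when the pointer it holds falls outside the address range of the module itself. The snippet below is a minimal Python sketch of this idea. The two pointer values and the VTable address are the ones observed above, while the module base and size are hypothetical values chosen for illustration (in a real session they would come from the lm command).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Address range of a loaded module."""
    name: str
    base: int
    size: int

    def contains(self, address: int) -> bool:
        return self.base <= address < self.base + self.size


def is_hooked(module: Module, slot_value: int) -> bool:
    """A VTable slot is considered hooked if it points outside its own module."""
    return not module.contains(slot_value)


# Hypothetical base and size for clrjit.dll (assumption, not taken from the post).
CLRJIT = Module("clrjit.dll", base=0x70F00000, size=0x80000)

# Values observed in the debugging session described above.
ORIGINAL_COMPILE_METHOD = 0x70F049B0
PATCHED_COMPILE_METHOD = 0x002A0000

if __name__ == "__main__":
    for value in (ORIGINAL_COMPILE_METHOD, PATCHED_COMPILE_METHOD):
        state = "hooked" if is_hooked(CLRJIT, value) else "original"
        print(f"{value:#010x}: {state}")
```

A pointer outside the module is only a hint, not a proof, but for a JIT VTable it is a very strong one.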

Now we want to identify which method hooked the compileMethod method. To do this we will load the SOS extension (with the command .loadby SOS clrjit), set a breakpoint at the compileMethod method and, when the breakpoint hits, type the !CLRStack command to see which method was set as replacement. In order to trigger the compileMethod breakpoint I clicked on a random button in the interface.


From the image above we can spot that the method we are interested in is qtlEIBBYuV. Find below the decompiled code of the method (I have renamed the arguments and added some comments):

What is interesting in the code above is that it:
  • reads the address of the CORINFO_METHOD_INFO structure at (1)
  • writes back the real MSIL bytecode at (2)
  • updates the fields ILCode and ILCodeSize at (3) and (4)
  • finally calls the original compileMethod at (5)
This ensures that the correct MSIL code is compiled and executed (for more info on this structure please refer to [10,12]).
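
The behaviour of the hook can be modelled in a few lines. The following Python sketch is only a simulation of the logic, not the malware's code: memory is a plain dictionary, MethodInfo keeps just the two fields discussed in this post, and all the names (MethodInfo, make_hook, real_bodies, scratch_address) as well as the sample IL bytes are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class MethodInfo:
    """Simplified stand-in for the method info structure (only two fields)."""
    il_code: int       # address of the IL buffer
    il_code_size: int  # size of the IL buffer in bytes


Memory = Dict[int, bytes]  # address -> buffer content (toy memory model)
CompileMethod = Callable[[Memory, MethodInfo], bytes]


def original_compile_method(memory: Memory, info: MethodInfo) -> bytes:
    """Stand-in for the JIT: it just returns the IL it was asked to compile."""
    return memory[info.il_code][:info.il_code_size]


def make_hook(original: CompileMethod, real_bodies: Dict[int, bytes],
              scratch_address: int) -> CompileMethod:
    """Build a replacement that swaps in the real IL before compiling."""
    def hook(memory: Memory, info: MethodInfo) -> bytes:
        # Look up the real body using the address of the stub IL as key.
        real_il = real_bodies.get(info.il_code)
        if real_il is not None:
            memory[scratch_address] = real_il   # write back the real IL
            info.il_code = scratch_address      # update ILCode
            info.il_code_size = len(real_il)    # update ILCodeSize
        return original(memory, info)           # call the original method
    return hook


if __name__ == "__main__":
    memory = {0x1000: b"\x00\x2a"}   # stub body: nop; ret
    real = {0x1000: b"\x17\x2a"}     # hypothetical real body: ldc.i4.1; ret
    compile_method = make_hook(original_compile_method, real, 0x2000)
    info = MethodInfo(il_code=0x1000, il_code_size=2)
    print(compile_method(memory, info).hex())
```

Methods that are not in the dictionary go through untouched, so in this model the hook is transparent for everything that is not protected.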

We finally have a pretty good understanding of how the real code is protected. Now we can try to implement a simple program which dumps the real MSIL bytecode and rebuilds the assembly. The de4dot tool, instead, uses a different approach, which is based on emulating the decryption code of the method bodies and then rebuilding the assembly.

Let the code speak

A possible approach to dump the real MSIL bytecode is:
  • Hook the compileMethod before the malware does
  • Force all static constructors to be invoked and force compilation of all methods via RuntimeHelpers.PrepareMethod. This will ensure that we are able to grab all the ILCode of the various methods.
  • When the hook is invoked store the values of the fields ILCode and ILCodeSize. We have to record also which method is currently compiled, this is done with the code getMethodInfoFromModule from [10].
  • Rebuild the assembly by using Mono.Cecil or dnlib (my choice)
However, for this specific case, I'll use a slightly different approach, which is not as generic as the previous one but it is simpler and more interesting imho :)

As we have seen from the code above, P9ZBIKXMsRMxLdTfcG.k6dbsY0qhy is a dictionary of objects which contains the real MSIL bytecode as value and the address of the MSIL buffer as key. What we can do is read the value of this object via reflection and rebuild the original binary. All this without hooking any method :)

I have implemented a simple program that extracts those values via reflection, calculates the address of each method and rebuilds the assembly. If you want to take a look at it, here is the code.

After dumping the real MSIL, we can see that the methods are not empty anymore:
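
Rebuilding the binary boils down to writing each recovered body at the right file offset. The sketch below is not the program linked above; it only illustrates, in Python, the standard PE conversion from a virtual address to a file offset followed by the patching step. The section layout, the image base and the sample body are hypothetical values, and it assumes that the dictionary keys are addresses inside the mapped image. It also ignores method headers and size changes, which a real rebuild has to handle.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Section:
    """The three section header fields needed for the conversion."""
    virtual_address: int      # RVA of the section once mapped
    virtual_size: int
    pointer_to_raw_data: int  # offset of the section in the file


def rva_to_offset(sections: Iterable[Section], rva: int) -> int:
    """Translate an RVA into a file offset using the section table."""
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.virtual_size:
            return rva - s.virtual_address + s.pointer_to_raw_data
    raise ValueError(f"RVA {rva:#x} is not inside any section")


def patch_bodies(image: bytes, image_base: int, sections: Iterable[Section],
                 bodies: Dict[int, bytes]) -> bytes:
    """Return a copy of the file with every body written at its offset."""
    sections = list(sections)
    patched = bytearray(image)
    for address, body in bodies.items():
        offset = rva_to_offset(sections, address - image_base)
        if offset + len(body) > len(patched):
            raise ValueError(f"body at {address:#x} does not fit in the file")
        patched[offset:offset + len(body)] = body
    return bytes(patched)


if __name__ == "__main__":
    # Hypothetical layout: one section mapped at RVA 0x2000, stored at 0x200.
    sections = [Section(virtual_address=0x2000, virtual_size=0x1000,
                        pointer_to_raw_data=0x200)]
    image = bytes(0x1200)
    bodies = {0x00402050: b"\x17\x2a"}  # address -> recovered IL (made up)
    rebuilt = patch_bodies(image, 0x00400000, sections, bodies)
    print(rebuilt[0x250:0x252].hex())
```

Libraries such as dnlib take care of these details, which is why they are a better choice than raw patching for anything beyond a quick experiment.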


Conclusion

The purpose of this post was to show how to analyse, in an effective way, a strongly obfuscated malware with the help of different tools and some knowledge of the inner workings of the .NET framework.

As an alternative, if you want to obtain a de-obfuscated sample I encourage you to use the de4dot tool (and to read the code since this project is a gold mine of information related to the .NET internals).

At the time of this writing the sample is not correctly deobfuscated by de4dot due to an error in the string decryption step. To obtain a deobfuscated sample with the real method body, just comment out the string decryption step in ObfuscatedFile.cs.

Too often developers underestimate the power of reflection, and as a result it is not uncommon to bypass protections (including license verification code) by using reflection and nothing more :)

References

[1] First ‘Jackpotting’ Attacks Hit U.S. ATMs - https://goo.gl/6WY14V
[2] dnSpy - https://github.com/0xd4d/dnSpy
[3] Assembly.Load Method (Byte[]) - https://goo.gl/owZtC1
[4] Recam Redux - DeConfusing ConfuserEx - https://goo.gl/oKgj1k
[5] How to: Build a Multifile Assembly - https://goo.gl/mVdHuU
[6] Assembly.LoadModule Method (String, Byte[]) - https://goo.gl/D6N797
[7] .NET REACTOR - http://www.eziriz.com/dotnet_reactor.htm
[8] Threat hunting .NET malware with YARA.pdf - https://goo.gl/RxEw1G
[9] Shed, .NET runtime inspector - https://github.com/enkomio/shed
[10] http://www.phrack.org/papers/dotnet_instrumentation.html
[11] .NET for hackers - https://www.slideshare.net/s4tan/net-for-hackers
[12] getJit() - https://github.com/dotnet/coreclr/blob/master/src/inc/corjit.h#L241