Wednesday, 16 September 2015

CryptoPHP Vs Tempesta

I recently published a new static analysis tool, Tempesta, which analyzes PHP source code to identify possibly malicious behavior. The project is still at an early stage, so I carefully monitor the files that are submitted and look into whatever causes problems. One of these files is the infamous CryptoPHP backdoor (Fox-IT report).

The submitted sample initially produced no results, mainly because it is not valid PHP code (ok, ok, there was also a bug in Tempesta that prevented the analysis :P).

The sample consists of only three class definitions, with no code that instantiates any object. I decided to take a closer look at this file to see whether Tempesta could analyze it.

The first step was to fix the syntax by closing all the unbalanced parentheses. Then I added three lines of code that simply instantiate each class correctly.
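A rough way to see which brackets a truncated sample is missing is to count them at the token level, so that brackets inside string literals are ignored. The sketch below is only an illustration: the helper name `bracketBalance` and the `Demo` class are made up, and it does not handle PHP 8 attribute syntax.

```php
<?php
// Hypothetical helper: reports how many brackets of each kind are still open.
// token_get_all() only tokenizes, so it also accepts syntactically broken code.
function bracketBalance(string $source): array
{
    $balance = ['(' => 0, '{' => 0, '[' => 0];
    $closers = [')' => '(', '}' => '{', ']' => '['];

    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // "{$" and "${" inside double-quoted strings open a brace
            // that is later closed by a plain "}" token.
            if ($token[0] === T_CURLY_OPEN || $token[0] === T_DOLLAR_OPEN_CURLY_BRACES) {
                $balance['{']++;
            }
            continue; // all other array tokens (strings, comments, ...) are skipped
        }
        if (isset($balance[$token])) {
            $balance[$token]++;
        } elseif (isset($closers[$token])) {
            $balance[$closers[$token]]--;
        }
    }
    return $balance;
}

// Made-up truncated input: two "{" are never closed; the "(" inside the
// string literal is not counted.
$broken = '<?php class Demo { function run() { return strrev("a(b"); ';
print_r(bracketBalance($broken));
```

A positive count means that many closing brackets of that kind have to be appended before the file parses.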

The first submission didn't return any meaningful result. Inspecting the code more carefully, it was easy to find a call to the function curl_setopt, which very likely means that the file also contains a list of contacted domains. This was confirmed by the following piece of code, which builds the array of domains to contact:

foreach ($this->uQfIZmMpqyjCaRQMgMoc as $ZRhtpGOgTZTZRdeSYVBw) {
    $ANVoslonRNQSwwQloQTx[] = base64_decode(str_rot13(strrev($ZRhtpGOgTZTZRdeSYVBw)));
}
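The decoding chain in this loop (reverse the string, apply ROT13, then Base64-decode) can be tried in isolation. In the minimal sketch below the function names and the domain `c2.example.test` are made up for illustration; the encoder is simply the inverse of the chain, not code taken from the sample.

```php
<?php
// Hypothetical encoder: the inverse of the decoding chain used by the sample.
function encodeDomain(string $plain): string
{
    return strrev(str_rot13(base64_encode($plain)));
}

// Same chain as in the sample: strrev -> str_rot13 -> base64_decode.
function decodeDomain(string $encoded): string
{
    return base64_decode(str_rot13(strrev($encoded)));
}

$encoded = encodeDomain('c2.example.test'); // made-up domain
echo $encoded, PHP_EOL;                     // obfuscated form
echo decodeDomain($encoded), PHP_EOL;       // prints "c2.example.test"
```

Since all three functions are deterministic and need no key, a static analyzer that models them can recover the plain strings without running the sample.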
However, further inspection of the code shows that the following snippet is then executed:

foreach ($eaRKAIVvmthhlFyDIslv as $emDnXOMHIUCHXVocAIgZ) {
    $ANVoslonRNQSwwQloQTx[] = $JKBGKZwspUYvdkeoetY[$emDnXOMHIUCHXVocAIgZ];
}
return $ANVoslonRNQSwwQloQTx;


where $JKBGKZwspUYvdkeoetY is the array populated in the previous snippet, containing the decoded domains, $eaRKAIVvmthhlFyDIslv is an opaque value, and $ANVoslonRNQSwwQloQTx is the final array containing the domains that will receive the stolen data. In this specific case, without knowing the exact value of $eaRKAIVvmthhlFyDIslv, we cannot correctly populate $ANVoslonRNQSwwQloQTx, and so we can't identify the list of contacted domains. This is a typical limitation of static analysis tools (or at least of the static analysis approach currently in use).
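The problem can be reproduced with readable names. In this sketch the domain values and the function `selectDomains` are made up; it only mirrors the shape of the second loop to show that the output is fully determined by the index list.

```php
<?php
// Stand-in for the table of decoded domains (made-up values).
$decoded = ['a.example.test', 'b.example.test', 'c.example.test', 'd.example.test'];

// Mirrors the second loop: picks the entries named by $indexes.
function selectDomains(array $decoded, array $indexes): array
{
    $selected = [];
    foreach ($indexes as $i) {
        $selected[] = $decoded[$i];
    }
    return $selected;
}

// Two different index lists give two different results.
print_r(selectDomains($decoded, [0, 2])); // a.example.test, c.example.test
print_r(selectDomains($decoded, [3]));    // d.example.test
```

An analyzer that cannot resolve the index list has no way to choose between these outcomes, even though it knows the whole decoded table.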

However, not all is lost. By adding some more fine-grained code inspection during the simulation, we were able to identify the list of domains.
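One generic way to cope with an unknown index is to over-approximate: if the index cannot be resolved, every entry of the table is reported as a possible result. This is an assumption made for illustration, not necessarily what Tempesta does; the names `UNKNOWN` and `possibleValues` are hypothetical.

```php
<?php
// Hypothetical marker for "value not known statically".
const UNKNOWN = null;

// Returns the entries that may be selected from $table.
function possibleValues(array $table, ?array $indexes): array
{
    if ($indexes === UNKNOWN) {
        // Over-approximate: any entry might be selected.
        return array_values($table);
    }
    $out = [];
    foreach ($indexes as $i) {
        if (array_key_exists($i, $table)) {
            $out[] = $table[$i];
        }
    }
    return $out;
}

$table = ['a.example.test', 'b.example.test']; // made-up values
print_r(possibleValues($table, UNKNOWN)); // both entries
print_r(possibleValues($table, [1]));     // only b.example.test
```

The price is precision: the report may include domains that a given run never contacts, but none that it does contact are missed.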

You can find the analysis of the sample at: http://enkomio.com/tempesta/#/scan/bbe1de38-a798-4f0a-99c3-1e0f1dee07b3


Update: Here is another analysis of a CryptoPHP backdoor: http://enkomio.com/tempesta/#/scan/18a7b126-31a8-4048-b032-80fe50b0ae72