DISMANTLING BLACKENERGY, PART2 – “THE MARK”

WRITTEN BY

Aleksey Yasinskiy

March 01, 2016 · 7 min read

I will not go over what the BlackEnergy framework is, since plenty has already been written about it by others. I do, however, want to refer to a passage from this particular review:

… The cybercriminal group behind BlackEnergy, the malware family that has been around since 2007 and has made a comeback in 2014, was also active in the year 2015. ESET has recently discovered that the BlackEnergy Trojan was recently used as a backdoor to deliver a destructive KillDisk component aimed at destruction of files on the hard disks in attacks against Ukrainian news media companies and against the electrical power industry…

Everyone, meet the Flashplayer!

During the investigation of the attack on our infrastructure we detected various malware samples, among them Flashplayerapp.exe (https://www.virustotal.com/ru/file/c787166ad731131c811d1a63080ac871ec11f10bcd77b9a1e665f1c9bbaa9a54/analysis/).

Briefly, flashplayerapp extracts main_light.dll into its own memory space and transfers control to it. This library is a light version of BlackEnergy; it communicates with the C&C server to download the main payload, whose functionality depends on the goals pursued by the attacker. It creates a file: C:\Users\user\Appdata\Adobe\cache.dat

This file is encrypted and contains various information, including the build version (such as 2015lstb) and the address of the command and control center from which it can retrieve various payloads (such as hxxps://188.40.8.72/l7vogLG/BVZ99/rt170v/solocVI/eegL7p.php). As I was examining this sample, my focus shifted away from the C&C, because one particular part of the code caught my attention: it was an obvious sign of a technique called code permutation. Let me translate a quote from this source http://hacks.clan.su/publ/11-1-0-481:

…Permutation is the transformation of already complete code. Most of the advances in this technique to date were made by Z0MBiE. However, even older engine versions are credited to someone called Lord_ASD. The next advances after him were made by Vecna and SBVC. So what is a permutation? The core algorithm of permutation engines is the disassembly of the code followed by its mutation and reassembly. Let us have a look at the simplest algorithm used in a permutation engine:
All code instructions are disassembled by length; conditional and unconditional jumps are marked.
Instructions are substituted with synonymous ones and garbage instructions are inserted in between. All jumps are recalculated.
This kind of polymorphism is the most promising for the following reasons:
Flexibility (not ease) of engine coding
High level of mutations
Requirement for emulation of all instructions
Despite this type of polymorphism being known for quite a while, it did not become mainstream due to the difficulty of implementation…

So here is what the code fragment looked like (the instructions added by the permutation engine are highlighted in red; they do not affect functionality at all, they just make the code more difficult to understand):

By removing the unnecessary lines, we get a mechanism that processes all function names in kernel32.dll

by turning each name into a hash that it later compares with a reference value:

Let us recreate this mechanism in C. The result is a utility that produces a “dictionary” containing the names of all functions in the kernel32.dll library together with the hash values of those names. Let us look up the hash values we obtained while reversing the code in our new “dictionary”. First, we look up the value from the EAX register (0x5147F60F), and then compare it to the reference value:

Our theory works: the sample has processed the name of the first function, computed its hash and is now comparing it with the reference value held in the buffer (0xC8AC8026). We search for this value and behold… we get the LoadLibraryA function!

To spare you another wall of text, I will just mention that our “sample” uses the same approach to find another function, GetProcAddress.

By using these two functions our “sample” can load other libraries and obtain the addresses of the functions it requires.
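The screenshots of the hashing routine are not reproduced in this text, so the following Python sketch rests on an assumption: that the routine rotates a 32-bit accumulator left by 7 bits and then XORs in the next character of the name. That assumption does reproduce both reference values discussed in this article (0xC8AC8026 for LoadLibraryA, and 0x1FC0EAEE for GetProcAddress, which is the byte sequence ee ea c0 1f in little-endian order). The function names and the short export list are illustrative only; the original utility was written in C and processed every function name in kernel32.dll.

```python
def rol32(value, count):
    """Rotate a 32-bit value left by `count` bits."""
    value &= 0xFFFFFFFF
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF


def rol7_xor_hash(name):
    """Assumed scheme: rotate the accumulator left by 7, then XOR in the next byte."""
    h = 0
    for byte in name.encode("ascii"):
        h = rol32(h, 7) ^ byte
    return h


def build_dictionary(export_names):
    """Map hash value -> function name, like the "dictionary" utility."""
    return {rol7_xor_hash(name): name for name in export_names}


# Illustrative subset of kernel32.dll exports; a real run would list them all.
EXPORTS = ["CreateFileA", "GetProcAddress", "LoadLibraryA", "VirtualAlloc"]

if __name__ == "__main__":
    dictionary = build_dictionary(EXPORTS)
    for reference in (0xC8AC8026, 0x1FC0EAEE):
        print(f"0x{reference:08X} -> {dictionary.get(reference, 'not found')}")
```

Run as a script, this prints `0xC8AC8026 -> LoadLibraryA` and `0x1FC0EAEE -> GetProcAddress`. A reverse lookup table of this kind is what makes hash-based API resolution readable again during static analysis.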

The tracks left by the Engine

Hindering static code analysis by calling functions via a hash value rather than by name is not a new technique. However, it led me to the following thought: regardless of how the malicious code is processed, we know the algorithm used to hash the functions’ names, and thus we can confidently look for these two functions (LoadLibraryA and GetProcAddress) in the code. So I started searching the internet for mentions of this hash (0xC8AC8026) to see whether someone had used the same engine before, and to my surprise, I did find something:

This hash is first mentioned in 2006 (the article can only be found at archive.org): https://web.archive.org/web/20060614030412/http://osix.net/modules/article/?id=789

Later, in 2008, a potential author of the algorithm makes an appearance on the forum:
https://exelab.ru/f/index.php?action=vthread&forum=6&topic=11845

Afterwards, in 2009, an analysis of the MS08-067 shellcode is published at blogs.technet.com: http://blogs.technet.com/b/srd/archive/2009/06/05/shellcode-analysis-via-msec-debugger-extensions.aspx

Then in 2013, “XAKEP” magazine (one of the biggest Russian media outlets on IT and information security) publishes an article https://xakep.ru/2011/06/23/55780/ that mentions the exact same algorithm, which produces the same hash (!):

Hold on just a second: the same algorithm, generating the same hash values, has been in use for more than 10 years and yet nobody has noticed? Had it been noticed, flashplayerapp.exe (https://www.virustotal.com/ru/file/c787166ad731131c811d1a63080ac871ec11f10bcd77b9a1e665f1c9bbaa9a54/analysis/) in its present form could hardly have stayed under the radar of anti-virus solutions in 2016:

There is a clear “mark” left by the particular permutation engine that was used on our analyzed sample, and it looks like a valid signature to me. Moreover, it is rather easy to spot directly in clear text:

Well, we have our assumptions and the theory is straightened out, so let us proceed to practice and turn our conclusions into a Yara signature / rule:

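rule API_Hash
{
    meta:
        description = "Hash of LoadLibraryA {26 80 ac c8} and GetProcAddress {ee ea c0 1f}"

    strings:
        // 0xC8AC8026 (LoadLibraryA) stored in little-endian byte order
        $a = { 26 80 ac c8 }
        // hash of GetProcAddress, also little-endian
        $b = { ee ea c0 1f }

    condition:
        $a and $b
}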
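For readers without Yara at hand, the logic of the rule can be approximated in a few lines of Python. This is only a sketch of the same check (both four-byte constants present anywhere in the data); the helper names are hypothetical, and it is not a replacement for running the rule itself.

```python
import struct

LOADLIBRARYA_HASH = 0xC8AC8026
GETPROCADDRESS_HASH = 0x1FC0EAEE

# x86 stores 32-bit immediates little-endian, hence 26 80 ac c8 and ee ea c0 1f.
PATTERN_A = struct.pack("<I", LOADLIBRARYA_HASH)
PATTERN_B = struct.pack("<I", GETPROCADDRESS_HASH)


def matches_api_hash(data):
    """Return True when both hash constants occur anywhere in the buffer."""
    return PATTERN_A in data and PATTERN_B in data


def scan_file(path):
    """Hypothetical helper: apply the same check to a file on disk."""
    with open(path, "rb") as handle:
        return matches_api_hash(handle.read())
```

Like the rule, this matches on only eight bytes in total, so it is best treated as one signal among several rather than as a verdict.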

The only thing left to do now is to test it against a large and varied set of samples (for example, those of an antivirus laboratory) in order to find out how much malicious code was really packed by this engine.
At this point, I can only state that I was able to run the experiment on a very small number of samples at one of the antivirus companies, yet the results were well worth it! Among the caught samples, we have:

Win32/Spy.Bebloh (banking trojan) — http://www.virusradar.com/en/Win32_Spy.Bebloh/detail
Win32/PSW.Fareit (Trojan for stealing passwords) — http://www.virusradar.com/en/Win32_PSW.Fareit/detail
Win32/Rustock (Backdoor) — http://www.virusradar.com/en/Win32_Rustock/detail
Win32/TrojanDownloader.Carberp (Dropper that installs banking Trojan Carberp) — http://www.virusradar.com/en/Win32_TrojanDownloader.Carberp/detail
Win32/Kelihos (Spam sender) — http://www.virusradar.com/en/Win32_Kelihos/detail

This list contains only well-known families. These two hash values also appear in other malware families including Ransomware / file encryptors (Win32/Filecoder.HydraCrypt), a Bitcoin miner (CoinMiner.LC), a WinLocker (LockScreen.AQT) and others.

So far, I can also add that this rather simple-looking Yara rule has not produced a single false positive to date, but the testing was conducted on only a few dozen computers. I therefore hope that, after this research is published, you (!), the readers who have made it this far, will share enough feedback to prove or disprove this rule’s precision. Of course, eight bytes is exceptionally small for a signature, and the rule will very likely catch some “legit” files. Nevertheless, this lightweight Yara rule can live in a sandbox as part of a more complex analysis logic.

Conclusions

The goal of this article is not to reverse engineer the BlackEnergy malware. Rather, I want to highlight the fact that the tools used to assemble a piece of malware leave their traces, and those traces can yield useful indicators for defense. And those indicators can live unnoticed for a rather long time…

Translated from original by Andrii Bezverkhyi | CEO SOC Prime
