Improving YARA Rules from TA17–293A

Oct 22, 2017

First, I’d like to thank the US CERT for publishing its newest threat report on Dragonfly activity, TA17–293A.

I believe that even information that is in some respects inaccurate or questionable benefits from the feedback of a strong and interconnected community. I am therefore a strong supporter of threat information sharing, and I emphasize this position by extracting, refining and deriving new signatures from the published reports of all sharing parties within our community.

The latest report by the US CERT contains a lot of information on a threat actor named Dragonfly, which my “APT Groups and Operations” Google Docs spreadsheet maps to “Energetic Bear” and “Crouching Yeti”.

“Russia” section in “APT Groups and Operations” spreadsheet (https://goo.gl/8A8Sa2)

Could someone from Crowdstrike or Kaspersky please check if the mapping is correct? There seem to be too many “Bears” in a single line.

This report contained TTPs, IOCs and YARA rules for the mentioned malware and hack tool samples.

The first thing I do is check the provided hashes with my script “Munin”. It gives me a better overview of the samples by reporting the file names used, the first submission date, the AV coverage and certain tags like “Signed” (signed executable), “Revoked” (revoked certificate) or “Harmless” (Microsoft software catalogue).

TA17–293A samples checked with “munin.py” (https://github.com/Neo23x0/munin)

Results of the sample online check: https://docs.google.com/spreadsheets/d/1DLsQrADjmmHxhHJo5DPhCn0aDEP5wkbS5iYEzAyk6BU/edit?usp=sharing

While the script was running in the background, I noticed that the adversary used a renamed version of “PsExec” called “ps.exe” and that the US CERT included hashes of this well-known and very popular tool in the IOC list. The US CERT correctly lists them as “benign” and adds a comment that explains why they were included.

They state “According to DHS analysis, cyber actors may place ps.exe on compromised systems. Attention should be paid to copies of Psexec.exe that have been renamed and appear in unusual locations”.

PsExec with comment in the TA17–293A IOC list

There is no problem with that, as long as an analyst reads and pre-qualifies the IOCs before bringing them into production. Unfortunately, such IOCs often find their way into big threat information exchanges and finally end up in a security monitoring / SIEM / EDR solution, where they cause numerous false positives.

The PsExec hash in Alienvault’s OTX (https://otx.alienvault.com/browse/pulses/?q=AEEE996FD3484F28E5CD85FE26B6BDCD&sort=-created)

The PsExec hash in CIRCL.LU’s MISP (some of them correctly marked “IDS=No”)

The two listed versions of PsExec could be an indicator of compromise if an organisation forbids the use of PsExec or can be certain that its system administrators do not use the listed versions (both are rather unlikely).

If you have a proper security monitoring solution in place you could check for process starts with the following command line:

ps.exe -accepteula

Simply alerting on the execution of “ps.exe” would not work in most corporate environments, because numerous tools, including the “ps.exe” in Cygwin, share this name. But no SysInternals tool that accepts the “-accepteula” parameter is named “ps.exe”, so the combination of both is suspicious.

I quickly wrote a Sigma rule to describe this detection method.

Sigma rule to cover the PsExec execution mentioned in TA17–293A (https://github.com/Neo23x0/sigma/blob/master/rules/apt/apt_ta17_293a_ps.yml)
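The detection idea behind that rule can be sketched in a few lines of Python. This is only an illustration of the logic (image name “ps.exe” plus the “-accepteula” parameter), not the Sigma rule itself; the function name is_renamed_psexec and the sample command lines are made up for this example.

```python
import ntpath
import shlex


def is_renamed_psexec(command_line: str) -> bool:
    """Return True if an image named ps.exe is started with -accepteula.

    Hypothetical helper for illustration; a real deployment would express
    this as a query in the monitoring solution.
    """
    try:
        # posix=False keeps the backslashes of Windows paths intact
        tokens = shlex.split(command_line, posix=False)
    except ValueError:
        # Unbalanced quotes: fall back to plain whitespace splitting
        tokens = command_line.split()
    if not tokens:
        return False
    # In non-POSIX mode the surrounding quotes stay on the token
    image = ntpath.basename(tokens[0].strip('"')).lower()
    args = [token.lower() for token in tokens[1:]]
    return image == "ps.exe" and "-accepteula" in args


print(is_renamed_psexec(r"C:\Temp\ps.exe -accepteula \\host cmd"))
print(is_renamed_psexec("ps.exe aux"))
```

The second call models the Cygwin case: the image name matches, but without the “-accepteula” parameter the check does not trigger.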

But let us proceed. I was especially worried about the included YARA rules. After a quick review, I noted that I had to untangle the complex conditions to uncover the original intentions of the author. Furthermore I didn’t have the samples for which these rules were created. So I couldn’t just drop strings or conditions because I wasn’t able to verify that the modified rule would still match the original sample set.

Line breaks helped me understand the conditions better. I noticed that e.g. the string $s25 was included in a condition that says:

 StringA AND StringB OR StringC

Because “AND” binds tighter than “OR”, the condition is evaluated as (StringA AND StringB) OR StringC, so StringC alone always triggers the rule. A simple “/icon.png” in any file causes this rule to fire.
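Python uses the same precedence for “and” and “or”, so the effect is easy to reproduce outside of YARA. The sketch below compares the condition as written with one possible grouping. The grouping is only an assumption about what the author may have wanted, and the function names are made up.

```python
def as_written(string_a: bool, string_b: bool, string_c: bool) -> bool:
    # "and" binds tighter than "or": parsed as (A and B) or C
    return string_a and string_b or string_c


def grouped(string_a: bool, string_b: bool, string_c: bool) -> bool:
    # Explicit parentheses: A is required, plus either B or C
    return string_a and (string_b or string_c)


# Only StringC is present in the file
print(as_written(False, False, True))
print(grouped(False, False, True))
```

With only StringC present, the condition as written is true while the grouped variant is false, which is exactly the difference between a rule that fires on any “/icon.png” and one that does not.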

I guessed that the author intended to combine the string $s25 with string $s0, which is “file://”. I still wasn’t convinced that this wouldn’t fire on hundreds of samples.

I renamed the strings $s14 to $s21 to $x*, the prefix I use for any string that is specific enough to serve as a single indicator, and simplified the condition for these strings to “1 of ($x*)”.

I renamed the $s0 string to $n1 and combined it with strings named $ax* that looked specific enough to serve as the second part of this condition. I tried to use the strings $au* in combination with $n1, but they proved to be too weak. If I had the original samples, I would have analyzed one that included “file://” and “icon.png” and checked it for other usable values.

My solution is prone to false negatives but I am totally aware of that risk and have no other choice but to take it.

Other strings in that listing looked so specific that I made them easier to match by moving them into the $x* string list. The rest went into the list with the $s* identifier. Two of those strings are needed to make the rule trigger.
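To make the resulting logic easier to follow, here is a small Python model of the reworked condition as I read it from the description above. Joining the three parts with “or” is an assumption, and the function and parameter names are made up. The real rule works on YARA string matches, not on counters.

```python
def rule_fires(x_hits: int, n1_hit: bool, ax_hits: int, s_hits: int) -> bool:
    """Model of: 1 of ($x*) or ($n1 and 1 of ($ax*)) or 2 of ($s*)."""
    return (
        x_hits >= 1                     # one highly specific string is enough
        or (n1_hit and ax_hits >= 1)    # "file://" only counts with a second string
        or s_hits >= 2                  # weaker strings need a partner
    )


# "file://" alone no longer triggers the rule
print(rule_fires(x_hits=0, n1_hit=True, ax_hits=0, s_hits=0))
```

The point of the grouping is that weak strings such as “file://” never decide the outcome on their own.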

The second rule I changed was the 4th rule in the set.

The rule isn’t wrong but it contains a very short string:

a(“

We use YARA in scanners that walk the whole file system and scan every single file on disk, so high performance is crucial. YARA uses the Aho-Corasick algorithm for string matching and is therefore very fast, but poorly chosen strings can affect the performance substantially.

YARA includes a feature that warns on strings that have negative effects on performance. This three-character string didn’t cause the warning message, but I am pretty sure that an atom of this length has negative effects on performance. A long time ago I wrote a gist named “YARA Performance Guidelines”, which is basically just a compilation of Victor’s and Wesley’s feedback on questions that I’ve asked them. I’ll add that gist to the links at the end of this post.

As the condition combines four $decode* strings of sufficient complexity, I simply tried leaving the very short string out.

Modified YARA rule (for performance reasons)

The last step was a minor improvement of the rules that include the ZIP header check at position 0.

strings:
    $zip_magic = { 50 4b 03 04 }
condition:
    $zip_magic at 0

All elements in the “strings” section take part in string matching, so YARA first finds every occurrence of these strings in a file and only then checks their location if the condition defines one. We can improve the rule by removing the ZIP magic from the strings and checking the bytes at position 0 directly in the condition.

uint32(0) == 0x04034b50

Alternatively, use uint32be() to read a big-endian value at offset 0.

uint32be(0) == 0x504b0304
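The two constants look different only because of the byte order. A quick check with Python’s struct module shows that both describe the same four bytes 50 4b 03 04. The helper names uint32_le and uint32_be are made up for this sketch and merely mimic the YARA functions.

```python
import struct

# The ZIP local file header magic as it appears on disk
ZIP_MAGIC = bytes.fromhex("504b0304")


def uint32_le(data: bytes, offset: int = 0) -> int:
    # Little-endian read, like YARA's uint32()
    return struct.unpack_from("<I", data, offset)[0]


def uint32_be(data: bytes, offset: int = 0) -> int:
    # Big-endian read, like YARA's uint32be()
    return struct.unpack_from(">I", data, offset)[0]


print(hex(uint32_le(ZIP_MAGIC)))
print(hex(uint32_be(ZIP_MAGIC)))
```

The big-endian form has the advantage that the constant reads in the same order as the bytes in a hex editor.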

YARA conditions are subject to “short-circuit evaluation”. The condition is evaluated from left to right, so in a chain of “and” expressions try to place the elements that are most likely to be “False” first. The sooner the condition can be identified as “non-triggering”, the better.
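Python’s “and” short-circuits from left to right in the same way, so the principle can be demonstrated with a small sketch. The two check functions and the “needle” marker are invented for this example; they stand in for a cheap header test and a more expensive expression in a condition.

```python
calls = []  # records which checks were actually evaluated


def cheap_header_check(data: bytes) -> bool:
    calls.append("cheap")
    return data[:4] == b"PK\x03\x04"


def expensive_scan(data: bytes) -> bool:
    calls.append("expensive")
    return b"needle" in data


def condition(data: bytes) -> bool:
    # The cheap, often-false check comes first; if it fails,
    # the expensive check is never evaluated.
    return cheap_header_check(data) and expensive_scan(data)


condition(b"MZ\x90\x00 needle")
print(calls)
```

For a file without the ZIP header, only the cheap check is recorded, which is the saving the ordering is meant to achieve.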

See the “YARA Performance Guidelines” for more details on this matter.

Links

Conquer String Search with the Aho-Corasick Algorithm

The Aho-Corasick algorithm uses a trie data structure to efficiently match multiple patterns against a large blob of…

www.toptal.com

YARA Performance Guidelines

Written by Florian Roth