Yara and Rich Headers are full of win

Rich Headers are awesome and you should read about them here: http://www.ntcore.com/files/richsign.htm

I was looking at a sample recently and, being the good analyst I am, attempted to automate my DFIR processes using the tools I had handy.

Step 1: Get the Rich Headers for the malicious binary.
Let's use 9e6658fab423d9b3fabc3578ac5482bf4f21f6fb98949d8ef4f3cad349862b82, a known RAMNIT ransomware sample.

$ laika.py 9e6658fab423d9b3fabc3578ac5482bf4f21f6fb98949d8ef4f3cad349862b82 | jq '.scan_result[].moduleMetadata | select(.META_PE) | .META_PE."Rich Header"'
{
  "Checksum": 3183276661,
  "Hashes": {
    "SHA1": "dd0f0861bc67028b3cab0d6cdc55a29cc2822f64",
    "SHA256": "8fd544eee7389c64a29440555f5b968ae0632ed3b989a63402b6d43934a8973e",
    "MD5": "8451f317057b289c04b2c5b202c09715"
  },
  "Rich Header Values": [
    {
      "Count": 6,
      "Version": 7299,
      "Id": 14
    },
    {
      "Count": 20,
      "Version": 0,
      "Id": 1
    },
    {
      "Count": 7,
      "Version": 4035,
      "Id": 93
    },
    {
      "Count": 4,
      "Version": 9044,
      "Id": 48
    },
    {
      "Count": 1,
      "Version": 8803,
      "Id": 42
    },
    {
      "Count": 1,
      "Version": 8447,
      "Id": 4
    }
  ]
}

Each Id and Version pair identifies a specific build tool (compiler, linker, assembler and so on), and the associated Count is the number of input files that tool processed.
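If you don't have LaikaBOSS handy, the same values can be pulled straight out of the file. Here's a minimal Python sketch of that, which is not what LaikaBOSS does internally. It assumes the usual Rich Header layout described in the ntcore article linked above: an XOR-masked "DanS" marker, three padding dwords, then masked (comp id, count) pairs, ending with a plain "Rich" marker followed by the XOR key. It also assumes the "Checksum" field in the output above is that XOR key. The function name parse_rich_header is made up for this example.

```python
import struct


def parse_rich_header(data: bytes):
    """Return (xor_key, entries) for the Rich Header in `data`, or None if absent.

    Sketch only: assumes the common layout (masked "DanS", three padding
    dwords, masked comp-id/count pairs, then "Rich" and the XOR key).
    """
    rich_at = data.find(b"Rich")
    if rich_at < 0 or rich_at + 8 > len(data):
        return None
    (key,) = struct.unpack_from("<I", data, rich_at + 4)
    dans = struct.unpack("<I", b"DanS")[0]

    # Walk backwards one dword at a time until the unmasked value is "DanS".
    start = rich_at - 4
    while start >= 0:
        (word,) = struct.unpack_from("<I", data, start)
        if word ^ key == dans:
            break
        start -= 4
    else:
        return None  # never found the start marker

    entries = []
    # Skip "DanS" plus three padding dwords, then read 8-byte pairs.
    for off in range(start + 16, rich_at, 8):
        comp_id, count = struct.unpack_from("<II", data, off)
        comp_id ^= key
        count ^= key
        entries.append({
            "Count": count,
            "Version": comp_id & 0xFFFF,  # low 16 bits
            "Id": comp_id >> 16,          # high 16 bits
        })
    return key, entries
```

The Rich Header sits between the DOS stub and the PE header, so feeding this the first few kilobytes of a file should normally be enough. A real parser would also want to bound the search to the region before the PE header rather than trusting the first "Rich" it sees.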

Step 2: Determine the compilation toolset.
Let's take a look at our tools using my getrichlaikaboss script from https://github.com/agrajag9/getrichlaikaboss

$ ./getrich.py -i ~/Downloads/9e6658fab423d9b3fabc3578ac5482bf4f21f6fb98949d8ef4f3cad349862b82.json -c comp_ids.txt
[{'Count': 6, 'Id': 14, 'Version': 7299},
 {'Count': 20, 'Id': 1, 'Version': 0},
 {'Count': 7, 'Id': 93, 'Version': 4035},
 {'Count': 4, 'Id': 48, 'Version': 9044},
 {'Count': 1, 'Id': 42, 'Version': 8803},
 {'Count': 1, 'Id': 4, 'Version': 8447}]
000E1C83: No matching ID and Version
00010000: [---] Unmarked objects
005D0FC3: No matching ID and Version
00302354: No matching ID and Version
002A2263: No matching ID and Version
000420FF: [LNK] VC++ 6.0 SP5 imp/exp build 8447

Sadly the list of compilation Ids and Versions is far from complete, but we know the author was at least using the VC++ linker from Visual Studio 6.0, which was released in 1998.
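The eight-digit hex values in that output are just the Id and Version packed into one 32-bit number: Id in the high 16 bits, Version in the low 16 bits. Here's a rough Python sketch of the lookup, not the actual getrich.py code. The names KNOWN_COMP_IDS, comp_id and describe are made up for this example, and the table holds only the two labels that appear in the output above.

```python
# Only the two labels seen in the output above; a real comp_ids.txt
# would hold many more entries.
KNOWN_COMP_IDS = {
    0x00010000: "[---] Unmarked objects",
    0x000420FF: "[LNK] VC++ 6.0 SP5 imp/exp build 8447",
}


def comp_id(tool_id: int, version: int) -> int:
    """Pack Id (high 16 bits) and Version (low 16 bits) into one value."""
    return ((tool_id & 0xFFFF) << 16) | (version & 0xFFFF)


def describe(entry: dict) -> str:
    """Format one Rich Header entry the way the output above is laid out."""
    cid = comp_id(entry["Id"], entry["Version"])
    label = KNOWN_COMP_IDS.get(cid, "No matching ID and Version")
    return f"{cid:08X}: {label}"


entries = [
    {"Count": 6, "Id": 14, "Version": 7299},
    {"Count": 20, "Id": 1, "Version": 0},
    {"Count": 7, "Id": 93, "Version": 4035},
    {"Count": 4, "Id": 48, "Version": 9044},
    {"Count": 1, "Id": 42, "Version": 8803},
    {"Count": 1, "Id": 4, "Version": 8447},
]

for entry in entries:
    print(describe(entry))
```

For example, Id 4 is 0x0004 and Version 8447 is 0x20FF, which gives 000420FF, the one entry with a known label.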

But maybe we can find some more bad files this way?

Step 3: Make a yara signature for pivoting.
This one's nice and easy using yara 3.5:

import "pe"

rule getrich4 {
    condition:
        pe.rich_signature.toolid(14,7299)
        and pe.rich_signature.toolid(1,0)
        and pe.rich_signature.toolid(93,4035)
        and pe.rich_signature.toolid(48,9044)
        and pe.rich_signature.toolid(42,8803)
        and pe.rich_signature.toolid(4,8447)
}
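Typing those conditions by hand gets old fast if you do this for more than one sample. Here's a small Python sketch that builds the same rule text from the list of Id/Version entries. The function name build_rule is made up for this example, and the only yara feature it relies on is the pe.rich_signature.toolid(id, version) call used in the rule above.

```python
def build_rule(name: str, entries: list) -> str:
    """Build a yara rule that requires every (Id, Version) pair to be present."""
    if not entries:
        raise ValueError("need at least one Rich Header entry")
    checks = [
        f"pe.rich_signature.toolid({e['Id']},{e['Version']})" for e in entries
    ]
    # One check per line, joined with "and", matching the hand-written rule.
    condition = "\n        and ".join(checks)
    return (
        'import "pe"\n'
        "\n"
        f"rule {name} {{\n"
        "    condition:\n"
        f"        {condition}\n"
        "}\n"
    )


entries = [
    {"Count": 6, "Id": 14, "Version": 7299},
    {"Count": 20, "Id": 1, "Version": 0},
    {"Count": 7, "Id": 93, "Version": 4035},
    {"Count": 4, "Id": 48, "Version": 9044},
    {"Count": 1, "Id": 42, "Version": 8803},
    {"Count": 1, "Id": 4, "Version": 8447},
]

print(build_rule("getrich4", entries))
```

Note that Count is deliberately ignored here, same as in the hand-written rule: the signature keys on which tools were used, not on how many files went through each one.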

Step 4: Find more badness!
This is the expensive part. I use VirusTotal retrohunts to do this, but you could also run the rule against any large corpus of files you have on hand.

Step 5: Confirm your findings!
When I ran this in VT I got back over 1,000 other samples, some with 0 AV detections.

"Doesn't that mean this isn't a good signature?"

Not necessarily! Let's take a look at one of the 0-hit samples in VT: 2d54705ca9ba291be7a7bac9f8d71cf140ab8405454f45e511469aca06b293d9

VT says this hash was also uploaded in a bundle with b10275667d8d2acb93f46fd5ffd449dbe84e2669aabbe78328b98e333325a67a, which is another ransomware sample, this time of the Cerber family. How odd!

How about another with a 0-hit detection rate? This time aed8119e766064055393b159c1a36580b1cac1cca31009334bbbc85be0107540

VT says this is typically uploaded with the filename "RomeTW.exe", a filename from the game "Rome: Total War". Googling around shows this hash might actually be a legitimate file from the game. But here's where context is key: if you're operating on a business network, do you want your employees downloading games through your internet connection? Probably not, so a hit like this is still fine.

Let's try a sample with 2 AV detections this time: b3728881afa118ef8f05998c137bf9c7fdb77f07d75f3d2ed4d3bcdc8584988a

One of the 2 detections is from CrowdStrike: malicious_confidence_85% (D). Well that's interesting! Even more interesting is that the submission filename is "System.dll", which is identical to the submission filename for 2d54705ca9ba291be7a7bac9f8d71cf140ab8405454f45e511469aca06b293d9, the one that led to finding a Cerber sample!

Step 6: Do it live!
The vast majority of samples found using this signature were Ramnit ransomware samples. Only a small number of the samples identified were benign on their own, and even those seemed to be related to malicious files. This makes me think that we've built a signature at least good enough to find more malicious samples. And if you pair this signature with LaikaBOSS, then you'll be able to explode blobs being downloaded on your network and find matching malicious samples even inside other objects! That's some cache-$$$ right there.
