Preface

Recently, I joined @VK and @0verflow’s advanced malware analysis course, “Zero2Auto”. The first lesson covered algorithms in malware: compression, hashing and encryption. It came with a PDF, which has since been released as a post by Vitaly and is based on another post about the Netwalker sample. I was thinking about how I could practice this lesson, and I concluded that a simple thing I could do is expand upon these two posts, as they were not detailed enough for most beginner reverse engineers. My main goal is to prove the assumed findings in Vitaly’s post by expanding on the mechanisms detailed within it, and to detail new findings that relate to the lesson’s subject: how to recognize known encoding algorithms and automate them.


Required prior knowledge:


Dealing with the first stage PowerShell

The first thing we’ll do is download the sample referenced in both blog posts mentioned above:

SHA-256 hash of the sample: f4656a9af30e98ed2103194f798fa00fd1686618e3e62fba6b15c9959135b7be

This is a very long, obfuscated PowerShell script. It’s so long that I couldn’t even load it into my VM’s PowerShell ISE without it crashing, so I worked around this by loading the script into Sublime (which is the best text editor on earth).


The script might be long and scary but do not fear, for all we need to do is examine the first line of the code:


The first command, “Invoke-Expression”, simply runs the command wrapped inside “$()”, and within that statement we can see base64 decoding being performed. So to decode the first-stage obfuscation, we can remove the Invoke-Expression command and pipe the entire decoded string to a text file using “Out-File -FilePath .\Process.txt”, which leaves the decoded payload stored in Process.txt. The decoded output is the same scary mess, but this time a long byte array is decrypted within a loop that simply XORs each byte of the array with 0x47.
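The same two transformations are easy to reproduce outside PowerShell. Below is a minimal Python sketch of them, run on made-up data rather than the real script. The function names are mine, and chaining the two steps into one round trip is only for demonstration, since in the sample they belong to separate layers.

```python
import base64


def decode_base64_layer(b64_text: str) -> bytes:
    """Undo the base64 encoding used by the first layer."""
    return base64.b64decode(b64_text)


def xor_decode(data: bytes, key: int = 0x47) -> bytes:
    """XOR every byte with a single-byte key (0x47 in this sample)."""
    return bytes(b ^ key for b in data)


# Round trip on made-up data, not on the real sample.
inner = b"Write-Output 'next layer'"
wrapped = base64.b64encode(xor_decode(inner)).decode("ascii")
recovered = xor_decode(decode_base64_layer(wrapped))
print(recovered.decode("ascii"))
```

Because XOR is its own inverse, the same `xor_decode` function both hides and reveals the payload.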

Again, all I do is pipe the final product to a text file:


We reach the second stage payload, which contains two long byte arrays. As stated in the blog posts, these are two DLL files, the x86 and x64 versions of the malware DLL that will be loaded into the memory of explorer.exe. I’m only interested in the byte array representing the x86 DLL. Using Sublime Text, I can press Ctrl+Shift+L and click on the last line of the first byte array. I’ll copy this byte array to another script file, then invoke a PowerShell command to write it to a file. It’s worth noting that, for some reason, Sublime Text appended newline characters to the end of each line, so running this script won’t work unless you remove all the newline characters within the script.
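If the newline issue gets in the way, the array can also be dumped without PowerShell. This is a small sketch, assuming the array body was copied as comma-separated hex (`0x..`) or decimal byte values. The helper names and the output path are hypothetical, and the sample text below is made up.

```python
import re
from pathlib import Path


def parse_byte_array(text: str) -> bytes:
    """Collect hex (0x..) or decimal byte literals, ignoring whitespace and newlines."""
    tokens = re.findall(r"0x[0-9A-Fa-f]{1,2}|\b\d{1,3}\b", text)
    return bytes(int(t, 16) if t.lower().startswith("0x") else int(t) for t in tokens)


def dump_byte_array(text: str, out_path: str) -> int:
    """Write the parsed bytes to disk and return how many bytes were written."""
    return Path(out_path).write_bytes(parse_byte_array(text))


# Made-up array body with stray line breaks, similar to what the editor produced.
sample_text = "0xAD,0xDE,\r\n0x90,0x00,\n3"
print(parse_byte_array(sample_text).hex())
```

Because the parser only looks for byte literals, stray line breaks in the copied text no longer matter.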


Reversing the Netwalker x86 DLL:

Let’s throw the file into PE-bear and see if we can find anything interesting:


What is this? Ah yes, do not worry: the malware author corrupted the PE file header by replacing the “MZ” characters at the start of the header with the word value 0xDEAD (remember this).

What I’ll do is use HxD to restore this value to a proper “MZ” signature so we can examine the file in IDA and Resource Hacker:
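The same fix can be scripted instead of done by hand in a hex editor. This is a minimal sketch, assuming the stomped value occupies the first two bytes of the file. Since I am not certain of the byte order on disk, it accepts both `AD DE` (0xDEAD as a little-endian word) and `DE AD`. The function name is mine.

```python
def fix_stomped_header(data: bytes) -> bytes:
    """Return a copy of the image with the 0xDEAD marker replaced by 'MZ'."""
    if data[:2] == b"MZ":
        return data  # already a normal DOS header, nothing to do
    if data[:2] not in (b"\xAD\xDE", b"\xDE\xAD"):
        raise ValueError("unexpected header bytes: " + data[:2].hex())
    return b"MZ" + data[2:]


# Made-up 8-byte stand-in for the start of the dumped DLL.
stomped = b"\xAD\xDE\x90\x00\x03\x00\x00\x00"
print(fix_stomped_header(stomped)[:2])
```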


Much better!

Usually, upon reaching this point I would perform basic static analysis by examining the file’s strings, looking for anomalies in its header and examining its resources. Then I would perform basic dynamic analysis by running the sample and monitoring it. But we must remember our initial goal: to expand upon Vitaly’s findings and find any worthwhile material we can explore ourselves. So first, let’s examine Vitaly’s first mention of CRC32:


How did Vitaly know this is indeed a CRC32 hashing algorithm? Well, let’s start by utilizing the KANAL plugin within PEiD. I’ll load the malware into PEiD and launch the KANAL tool:


As we can see, KANAL recognizes references to the CRC32 algorithm in a lot of locations, but what exactly did it find there? Let’s jump to 0x1000424F:


What is this constant? Let’s google it:


Aha, alright. And anyone who has looked at how a CRC32 checksum is calculated will quickly recognize the division flow at 0x1000421C.
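For reference, this is what that division flow looks like in source form. The sketch assumes the standard reflected CRC32 polynomial 0xEDB88320, which is the constant signature tools usually key on. I am not claiming the sample’s routine matches this line for line. The inner shift-and-XOR loop is the part worth memorizing, and the result is checked against Python’s `zlib.crc32`.

```python
import zlib

POLY = 0xEDB88320  # standard reflected CRC32 polynomial (assumed to be the constant in question)


def make_crc_table() -> list:
    """Build the 256-entry table using the shift/XOR 'division' loop."""
    table = []
    for i in range(256):
        c = i
        for _ in range(8):
            # Shift right; if the bit shifted out was 1, XOR in the polynomial.
            c = (c >> 1) ^ POLY if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = make_crc_table()


def crc32(data: bytes) -> int:
    """Table-driven CRC32 with the usual 0xFFFFFFFF initial value and final XOR."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = CRC_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


print(hex(crc32(b"123456789")), hex(zlib.crc32(b"123456789")))
```

When reversing, spotting either the polynomial inside an 8-iteration shift loop or the precomputed table is usually enough to label a function as a CRC32 variant.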


When dynamically analyzing the file, at location 0x10001A59 one can see that the value 0x3e006b7a is resolved as FindResourceA.
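Resolving hashes like this one by hand does not scale, so the usual trick is to precompute a lookup table from candidate API names. The sketch below shows the idea with plain `zlib.crc32` over the ASCII name as a stand-in hash. I have not verified that this exact variant reproduces 0x3e006b7a, since the sample may use a different seed, casing, or final XOR, so the hash function is left pluggable. The helper name and the candidate list are mine.

```python
import zlib
from typing import Callable, Dict, Iterable


def plain_crc32(name: str) -> int:
    """Stand-in hash: standard CRC32 of the ASCII API name (an assumption)."""
    return zlib.crc32(name.encode("ascii")) & 0xFFFFFFFF


def build_hash_lookup(names: Iterable[str],
                      hash_fn: Callable[[str], int] = plain_crc32) -> Dict[int, str]:
    """Map hash value -> API name for every candidate name."""
    return {hash_fn(name): name for name in names}


# Hypothetical candidate list; in practice this would come from DLL export tables.
candidates = ["FindResourceA", "LoadResource", "LockResource", "SizeofResource"]
lookup = build_hash_lookup(candidates)

for value, name in sorted(lookup.items()):
    print(f"{value:#010x} -> {name}")
```

Once the sample’s real hash variant is reimplemented as `hash_fn`, every constant passed to the resolver can be looked up in the table.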


How NetWalker utilizes PE Header stomping to break analysis

Let’s examine the following assumption made in Vitaly’s blog:


At location 0x1000A0B0 one can find the API resolving function:


So, I assumed Vitaly is referencing the content of sub_10003710:


And indeed he was. However, simply breaking on this location and running the sample would crash it, and I decided to understand why. I first attempted to skip the call at 0x1000371E, which I renamed to func_checkValidHeader, but this function crashed the sample every time with an access violation exception, so let’s take a look at it.


First, it loads the offset of the current function into EAX and ANDs it with 0xFFFF000. It then iterates through a loop, subtracting 0x800 from EAX and looking for the value 0xDEAD at the address referenced by EAX. Sounds familiar? Sure does. As we recall, the DLL’s PE header was stomped with 0xDEAD, and this routine is attempting to validate that no one has tampered with the sample.
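To make the behaviour concrete, here is a small Python simulation of that backward scan over a fake memory region. It is a sketch of the idea only. The 0x800 step comes from the disassembly described above, while the page-alignment mask, the `AccessViolation` exception and the function name are my own stand-ins.

```python
import struct

MARKER = 0xDEAD      # value the sample expects at the image base
STEP = 0x800         # distance walked back on each iteration
ALIGN_MASK = ~0xFFF  # assumption: round the starting address down to a page


class AccessViolation(Exception):
    """Stand-in for the crash that occurs once the scan leaves mapped memory."""


def find_image_base(memory: bytes, region_base: int, func_addr: int) -> int:
    """Walk backwards from func_addr until a 16-bit MARKER is found."""
    addr = func_addr & ALIGN_MASK
    while True:
        offset = addr - region_base
        if offset < 0 or offset + 2 > len(memory):
            raise AccessViolation(f"read at unmapped address {addr:#x}")
        (word,) = struct.unpack_from("<H", memory, offset)
        if word == MARKER:
            return addr
        addr -= STEP


# Fake 0x5000-byte image mapped at 0x10000000 with the stomped header intact.
image = bytearray(0x5000)
image[0:2] = struct.pack("<H", MARKER)
print(hex(find_image_base(bytes(image), 0x10000000, 0x10003710)))
```

Swapping the first two bytes of `image` for `b"MZ"` makes the scan run off the start of the region and raise `AccessViolation`.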


Since I modified the header, the sample keeps looping until EAX points to an invalid memory location, resulting in an access violation exception. To add to this finding, if we go to address 0x1000372D, the sample attempts to fix the stomped header using the value returned by func_checkValidHeader, which should point to the base address of the DLL. It then replaces 0xDEAD with “MZ”, thus fixing the header.


To quickly solve this issue, I patched the binary by removing the header patcher, and that solved the problem.


We can indeed verify Vitaly’s findings after this, as the sample no longer crashes. First, the sample loads the resource and locks it.


The malware then gets the resource’s size and allocates a buffer on the heap, into which it copies the resource using memcpy.


The malware then loads the key length and the key itself and saves them to a stack variable:


Copied key:


Afterwards, the key, its size and a pointer to the heap buffer containing the resource are pushed as arguments to a function I renamed func_RC4Decrypt:


Vitaly assumes this within the blog:


If one followed the lesson in the course carefully, one knows that one of the recognizable features of the RC4 KSA is a loop iterating 256 times:


And if we examine the function located at 0x10009210, we can confirm examples of this at 0x10009281 and at 0x100092CF:


EBP is loaded with 256 in preparation for the second KSA loop:
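For comparison with the disassembly, here is textbook RC4 in Python, with the two 256-iteration KSA loops written out explicitly. This is the standard algorithm, not a decompilation of func_RC4Decrypt, and the key and plaintext below are the well-known public test vector rather than anything taken from the sample.

```python
def rc4_ksa(key: bytes) -> list:
    """Key-scheduling algorithm: two loops of exactly 256 iterations each."""
    s = [0] * 256
    for i in range(256):   # first loop: identity permutation
        s[i] = i
    j = 0
    for i in range(256):   # second loop: scramble the permutation with the key
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    return s


def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """Generate the keystream (PRGA) and XOR it with data; the same call encrypts and decrypts."""
    s = rc4_ksa(key)
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


ciphertext = rc4_crypt(b"Key", b"Plaintext")
print(ciphertext.hex())
```

The constant 256 (0x100) appearing as a loop bound twice, followed by a swap inside the second loop, is the pattern to look for in a disassembler.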



Decryption process as seen within the debugger:


Size (red), key (blue), resource (rest)
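That layout can be split apart with a few lines of Python. This is a sketch under one loud assumption: that the size field is a 4-byte little-endian integer holding the key length. The function name and the blob below are made up for illustration.

```python
import struct


def split_resource(blob: bytes):
    """Split a resource blob into (key, encrypted_data).

    Assumed layout: 4-byte little-endian key length, then the key, then the data.
    """
    if len(blob) < 4:
        raise ValueError("blob too short to hold a length field")
    (key_len,) = struct.unpack_from("<I", blob, 0)
    if 4 + key_len > len(blob):
        raise ValueError("key length exceeds blob size")
    key = blob[4:4 + key_len]
    encrypted = blob[4 + key_len:]
    return key, encrypted


# Made-up blob: length 3, key b"abc", followed by four bytes of "ciphertext".
demo_blob = struct.pack("<I", 3) + b"abc" + b"\x01\x02\x03\x04"
demo_key, demo_data = split_resource(demo_blob)
print(demo_key, demo_data.hex())
```

The returned key and data are the two inputs an RC4 decryption routine would need.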


After the decryption function has finished:


Finally, at location 0x10003832 the sample restores the Netwalker header back to 0xDEAD.


I wonder if 0xDEAD is a constant shared across all Netwalker samples ;)

Sources:

https://zero2auto.com/2020/05/19/netwalker-re/

https://blog.trendmicro.com/trendlabs-security-intelligence/netwalker-fileless-ransomware-injected-via-reflective-loading/

https://any.run/report/f4656a9af30e98ed2103194f798fa00fd1686618e3e62fba6b15c9959135b7be/ca44ad38-0e46-455e-8cfd-42fb53d41a1d