Zero2Auto - CruLoader Malware
As part of the course we were instructed to analyze a custom malware sample developed for us. Below is a full analysis of that sample, plus an automated script to extract its final payload.
During an ongoing investigation, one of our IR team members managed to locate an unknown sample on an infected machine belonging to one of our clients. We cannot pass that sample onto you currently as we are still analyzing it to determine what data was exfiltrated. However, one of our backend analysts developed a YARA rule based on the malware packer, and we were able to locate a similar binary that seemed to be an earlier version of the sample we’re dealing with. Would you be able to take a look at it? We’re all hands on deck here, dealing with this situation, and so we are unable to take a look at it ourselves. We’re not too sure how much the binary has changed, though developing some automation tools might be a good idea, in case the threat actors behind it start utilizing something like Cutwail to push their samples. I have uploaded the sample alongside this email.
Thanks, and Good Luck!
Basic Static & Dynamic Analysis
We begin by running the sample in a sandboxed environment like any.run. We can immediately see that this process launches itself and svchost as a sub process. This can also be confirmed within Intezer, and in Intezer one can examine the strings contained within each sub process. We can assume there is process injection happening.
Regarding any connection made to outer servers, we can see that pastebin.com is being contacted.
There is a suspicious looking resource within the resource section:
Before the sandbox session on Any.run ends, one can see a strange-looking MessageBox string.
Finally, the strings found in the first 3 loaded processes appear to be encrypted, and this can be easily confirmed within Intezer:
We’re going to be looking for process injection. I suspect that the resource located within the resource section is mapped into memory, unpacked or decrypted, and then injected into the second sub process. There are 4 sub processes in total, so our final payload should be located in the fourth one. It is assumed that the payload connects to pastebin and displays a message box. There is definitely string encryption going on, so we’ll have to deal with that as well.
Static and Dynamic Analysis
This binary is compiled with Visual Studio C++, which means the actual main function is located somewhere within the start function. I located it by recursively traversing xrefs from one of the imported functions. The main function is located at 0x00401400.
When viewing the code of the main function one can see gibberish strings being pushed before function calls:
The function that is called after each push is the same, so I instantly assume there is some string encryption going on. The extensive use of LoadLibraryA and GetProcAddress also makes me assume these strings are API names that are resolved using LoadLibraryA and GetProcAddress.
At loc_401550 and loc_401570 one can locate something that resembles an RC4 KSA routine; it is easily recognized by its two loops, each iterating 256 times.
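For reference, this is roughly what the textbook RC4 key-scheduling algorithm looks like in Python. It is the standard KSA, not code lifted from the sample, and the name rc4_ksa is chosen here purely for illustration. The point is the shape: one loop that fills a 256-entry table, and a second 256-iteration loop that shuffles it using the key.

```python
def rc4_ksa(key: bytes) -> list:
    """Textbook RC4 key scheduling: returns the shuffled 256-entry state."""
    # Loop 1: initialise the state table with 0..255.
    state = [i for i in range(256)]
    # Loop 2: swap entries, driven by the key bytes (key is reused cyclically).
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) & 0xFF
        state[i], state[j] = state[j], state[i]
    return state
```

In a disassembly the first loop usually shows up as a counter being stored into consecutive bytes, and the second as a loop with a modulo or an AND 0xFF on the index.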
These are just quick assumptions I’ve made by looking at the binary. The rest of it is filled with obfuscated code, heavy register use and dynamic resolving, so we’ll have to resort to dynamic analysis. I’ve disabled ASLR for this binary with CFF Explorer so it is easier to debug, and I’ve set the relevant breakpoints within the debugger: all the resource-related functions to catch any resource loading, CreateProcessInternalW for process creation, and VirtualProtect, VirtualAllocEx and WriteProcessMemory for injection. In addition, I’ve set breakpoints on OutputDebugString and IsDebuggerPresent to catch easily implemented anti-analysis checks.
First, as assumed, sub_401300 seems to be resolving strings:
The function func_StringDecrypt seems to be decrypting strings using some kind of custom base64 decoding. The reason I suspect this is the full alphanumeric string that is passed into this function. This function will be studied extensively later on in this paper, and we’ll attempt to write an automation script for it to decode all strings within this sample.
Then the sample does something interesting:
It allocates space for the resource but it skips the first 0x1C bytes within the resource.
Hmm, perhaps this is the decryption key for the RC4 algorithm we saw before?
Then a call is performed to sub_402DB0, which I renamed to func_CopyResourceWithoutKeyToAllocMem because that is exactly what it does: it copies the resource without its key, so starting at offset 0x1C.
This function seems very confusing at first, but if we set a hardware breakpoint on the allocated memory we break inside this function and can confirm this, because at 00403002 it performs the copy and then simply exits the function:
Then sub_4025B0 is executed. It receives a stack address, so we simply step over this function while watching that address in the dump, and we can see that it just zeroes it out.
So thankfully I’m saving a lot of time by skipping these rabbit holes.
Then the assumed RC4 algorithm executes. What I want to do is locate the address where the sample mapped the resource into memory, as I suspect that it’s going to be decrypted.
So I’m going to skip the RC4 decryption routine and jump straight to address 0x40161D
Yay! I decide to dump the new PE file out to disk but we’re not done yet. We have to see how it’s going to be injected to memory.
We jump into sub_401000:
First the sample resolves the ImageBase of the current executing sample and then it attempts to confirm and locate the address of the NT_Headers of the decrypted payload.
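The same header check is easy to reproduce offline on the dumped payload. Below is a minimal sketch, assuming a 32-bit (PE32) image; the function name parse_nt_headers and the returned dictionary keys are my own, while the field offsets come from the PE format itself.

```python
import struct

def parse_nt_headers(pe: bytes) -> dict:
    """Validate the DOS/NT signatures of a PE32 image and read a few fields."""
    if pe[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C in the DOS header) points at the NT headers.
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3C)
    if pe[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # Offsets below are relative to the NT headers and only valid for PE32.
    (magic,) = struct.unpack_from("<H", pe, e_lfanew + 0x18)
    if magic != 0x10B:
        raise ValueError("not a PE32 optional header")
    (entry_rva,) = struct.unpack_from("<I", pe, e_lfanew + 0x28)
    (image_base,) = struct.unpack_from("<I", pe, e_lfanew + 0x34)
    (size_of_image,) = struct.unpack_from("<I", pe, e_lfanew + 0x50)
    return {
        "e_lfanew": e_lfanew,
        "entry_rva": entry_rva,
        "image_base": image_base,
        "size_of_image": size_of_image,
    }
```

The image_base and size_of_image values are the ones a loader like this would typically hand to VirtualAllocEx.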
Then a new instance of the current executable, main.bin, is created in suspended mode.
Seems like there is going to be process injection involved.
First, memory is allocated within the new process. The sample calls VirtualAllocEx on the new process, attempting to allocate memory at the payload’s default PE ImageBase with the virtual size of the image. This won’t work under our modified execution, though, since we disabled ASLR. Why? Because both processes are loaded at the same ImageBase, so the new process already has memory allocated within that region. We must re-enable ASLR and start again. Let’s do just that, and we’ll see that it works.
Then the sample copies the payload’s sections to their correct virtual addresses:
Then something very interesting happens: the ImageBase of the payload is written to the PEB of the new process, specifically at offset 0x8, which is the ImageBaseAddress field of the 32-bit PEB.
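To make the section copy concrete, here is a rough offline equivalent that lays a PE32 file out the way it would sit in memory. This is only a sketch under my own assumptions: it works on a local buffer instead of calling WriteProcessMemory, it does no bounds checking or relocation, and map_sections is a name I made up.

```python
import struct

def map_sections(pe: bytes) -> bytearray:
    """Copy headers and sections of a PE32 file to their virtual offsets."""
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3C)
    (num_sections,) = struct.unpack_from("<H", pe, e_lfanew + 0x06)
    (opt_size,) = struct.unpack_from("<H", pe, e_lfanew + 0x14)
    (size_of_image,) = struct.unpack_from("<I", pe, e_lfanew + 0x50)
    (size_of_headers,) = struct.unpack_from("<I", pe, e_lfanew + 0x54)

    # The in-memory image starts zero-filled; the headers go at offset 0.
    image = bytearray(size_of_image)
    image[:size_of_headers] = pe[:size_of_headers]

    # The section table follows the optional header; each entry is 40 bytes.
    table = e_lfanew + 0x18 + opt_size
    for n in range(num_sections):
        entry = table + n * 40
        # VirtualAddress, SizeOfRawData, PointerToRawData (entry offsets 12..23).
        vaddr, raw_size, raw_ptr = struct.unpack_from("<III", pe, entry + 12)
        data = pe[raw_ptr:raw_ptr + raw_size]
        image[vaddr:vaddr + len(data)] = data
    return image
```

In the sample, each of these copies corresponds to one write into the suspended process at ImageBase plus the section’s VirtualAddress.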
We can assume that the payload would use this to resolve its own APIs.
Finally, SetThreadContext and ResumeThread are called, and the injected payload executes.
Second Payload Analysis
Advanced Static and Dynamic Analysis
I’m going to assume process injection again and go straight into analysis. I’ll disable ASLR for this execution, since I will be executing the second-stage payload independently, so the problem we encountered earlier due to disabling ASLR shouldn’t bother us.
This second stage uses API hashing, as the extensive use of CRC32 constants indicates:
In addition, the function that was seen in the first stage loader that simply copies the payload into memory from the resource section can be seen:
I’ve set the relevant breakpoints within the debugger: all the resource-related functions to catch any resource loading, CreateProcessInternalW for process creation, and VirtualProtect, VirtualAllocEx and WriteProcessMemory for injection. In addition, I’ve set breakpoints on OutputDebugString and IsDebuggerPresent to catch easily implemented anti-analysis checks.
The malware takes the name of the file it is currently executing from and hashes it using a CRC32 hashing algorithm, which can be identified by the CRC32 constants found within this function. The sample then compares the result against a constant value. I’m assuming it uses this as an anti-analysis method to check whether the sample’s name is “sample” or “malware”; it’s really hard to tell. If the hash matches, the sample quits execution.
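As a rough illustration of this kind of check, here is a bitwise CRC32 in Python next to a name comparison. I am assuming the sample uses the standard reflected CRC32 (polynomial 0xEDB88320) and hashes the name as plain ASCII without changing its case; neither detail is confirmed above. TARGET_HASH and the file names are hypothetical stand-ins, not values taken from the binary.

```python
CRC32_POLY = 0xEDB88320  # reflected CRC32 polynomial, the constant to look for

def crc32(data: bytes) -> int:
    """Standard bit-by-bit CRC32 (same result as zlib.crc32)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right; apply the polynomial when the dropped bit was set.
            crc = (crc >> 1) ^ (CRC32_POLY if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

# Hypothetical stand-in for the hardcoded constant in the sample.
TARGET_HASH = crc32(b"sample1.exe")

def name_matches(file_name: str) -> bool:
    """Mimic the check: hash the file name and compare it to the constant."""
    return crc32(file_name.encode("ascii")) == TARGET_HASH
```

With the real constant in hand, the same function can be run over a wordlist of likely file names to find out which name the sample is actually looking for.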
Then an API resolving routine is called, which utilizes the CRC32 hashing algorithm we saw earlier:
It takes the HashID (EDX) and the ID (ECX), which identifies the library from which to resolve the API.
I will not explain the API hashing and resolving in depth. Basically, the export table of each loaded DLL is hashed to check which function name matches the hash passed into the function. If anyone wants to read about how that might be implemented, they can read about it on my GitHub.
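In a nutshell, the lookup boils down to something like the sketch below. It assumes standard CRC32 over the ASCII export name and takes the export names as a plain list; walking the real export table is left out. resolve_by_hash is a name I picked for illustration.

```python
import zlib

def resolve_by_hash(export_names, wanted_hash):
    """Return the first export whose CRC32 equals wanted_hash, else None."""
    for name in export_names:
        if zlib.crc32(name.encode("ascii")) == wanted_hash:
            return name
    return None
```

Feeding it the export names of kernel32.dll and the hashes pulled from the binary is a quick way to label the resolved APIs during analysis.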
The sample resolves IsDebuggerPresent to check whether the malware is executing under a debugger; this can be easily circumvented. The next anti-analysis method, located within sub_401000, checks whether any blacklisted process is running: the malware hashes the name of each running process and checks whether the hash matches an entry in a precomputed hash array. If one matches, the malware quits execution.
Then the sample executes sub_401D50 which resolves a lot of APIs that might indicate process injection:
- Start svchost with suspended flags
- Copy current PE into allocated memory
- Allocate Memory in svchost
- Rebuild current PE payload relocation table
- Write payload into svchost
- Use CreateRemoteThread to execute function sub_401DC0
To continue execution, we simply attach a second debugger instance to svchost and set a breakpoint at the function’s location; after CreateRemoteThread runs we should hit it.
After setting up the breakpoint, let’s resume execution of the svchost instance, then step over CreateRemoteThread and see if anything happens.
Success! We can continue analyzing this function within our documented IDA instance as this function is located in the second stage payload.
Interestingly enough, this function is also called earlier, at loc_402085, when func_CRC32Hash returns a hash for the currently executing process name that matches a hardcoded hash.
I quickly assumed that this is a method to detect whether the binary is running from svchost because, as we saw in the malware’s process tree, it executes svchost twice. I later confirmed this check by renaming the sample to svchost.exe, and it worked:
Let’s continue with the analysis.
This function first resolves a few Internet WINAPIs:
Then a strange string located at offset 0x0413C7C is passed into a code block and decrypted by a very simple algorithm:
So, our encrypted byte sequence:
Which resolves to [Redacted]
Hmm.. This might indicate steganography is involved. Let’s continue with the analysis.
This pastebin link is passed into sub_401290. This function returns the image link contained within the pastebin and saves it in memory.
This resolved link is passed into sub_4013A0. First the function reads the contents of the linked image file using sub_401290, which I renamed to func_ReadWebContents.
Then the function decodes a string located at qword_413CA4
Which resolves to output.jpg. It then computes a path to the Temp directory and converts it to a WCHAR string, and specially picked bytes are extracted from the data section to compute this path:
Then CreateDirectory is invoked to create the cruloader folder, after which the output.jpg file name is appended to this path to create the following string:
CreateFileW is then invoked to create this file:
Then the PNG file downloaded from the previous website is copied into this file using WriteFile.
Afterwards the sample attempts to locate the string “redaolurc” (which is cruloader reversed) within the downloaded image data.
After locating the string within the image data, this offset is used to access the encrypted data. The data is XORed with an xmmword (which is 128 bits wide) 40 times. The XOR key is 0x61 (‘a’). One may also notice that there are a lot of ‘a’ characters within the encoded payload. These ‘a’ characters are actually zero bytes in the original binary, because 0 ^ 0x61 = 0x61, so this payload isn’t heavily obfuscated and one can infer the key pretty quickly.
And the result is a valid PE file:
After dumping this PE I’ve observed it within IDA and it would appear as if this is the final payload!
This PE is then mapped, relocated, fixed up with VirtualProtect and finally injected into svchost.exe again within the function sub_401750, thus executing the final payload.
And that’s pretty much it!
Alright, let’s begin attempting to automate the process of extracting all payloads and dumping them on disk.
We begin with the resource section; this one is pretty easy.
First we begin looking for a string at offset 0x60 + 0xc from the beginning of the resource section, and load the string, which is 16 bytes in length.
Then, we load the rest of the payload into another variable
Then we use the ARC4 Python module to decrypt this data and dump it on disk.
Now that we possess the second payload, we must locate the pastebin URL inside the PE file.
We know our URL is located two XMM_WORDS (32 bytes) after the offset of the string “cruloader”, so let’s set this up:
We calculate the offset by locating the “cruloader” string, adding 3 bytes to skip the null bytes, and then jumping past both irrelevant XMM_WORDS. We extract our data and set the XOR key to 0xC5.
Then for each extracted byte we perform a ROL by four and then a XOR to match the decryption algorithm.
We then use urllib to fetch the pastebin URL, and then we use that same library to fetch the contents of the payload image.
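A dependency-free sketch of this step could look like the following. To keep it self-contained I swapped the ARC4 module for a small pure-Python RC4. The function works on the raw resource data, so the offsets 0xC and 0x1C are relative to the start of that data; I am assuming the extra 0x60 mentioned above is just the distance from the start of the resource section to the data. The names rc4 and decrypt_resource and the three constants are mine.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)

KEY_OFFSET = 0x0C      # where the key starts inside the resource data
KEY_LENGTH = 0x10      # 16 bytes, as described in the text
PAYLOAD_OFFSET = 0x1C  # the encrypted PE follows the key

def decrypt_resource(resource: bytes) -> bytes:
    """Split the resource into key and ciphertext and RC4-decrypt it."""
    key = resource[KEY_OFFSET:KEY_OFFSET + KEY_LENGTH]
    return rc4(key, resource[PAYLOAD_OFFSET:])
```

If the output does not start with MZ, the key length or the offsets are the first things I would re-check against the debugger.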
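In Python the per-byte routine can be sketched like this. The helper names rol8 and decrypt_bytes are mine, and the function expects the already extracted encrypted bytes, so locating them inside the PE is left out.

```python
def rol8(value: int, count: int) -> int:
    """Rotate an 8-bit value left by count bits."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF

def decrypt_bytes(blob: bytes, key: int = 0xC5) -> bytes:
    """Rotate each byte left by 4 (a nibble swap), then XOR it with the key."""
    return bytes(rol8(b, 4) ^ key for b in blob)
```

Rotating a byte by four simply swaps its two nibbles, which is why the encrypted string looks scrambled but keeps the same length as the plaintext.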
Finally, to locate the payload within the PNG payload file we locate the reverse “cruloader” string, extract the payload and then xor it with 0x61.
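A minimal sketch of this last step, assuming the encrypted payload starts right after the marker and runs to the end of the downloaded file (the text above only says the marker offset is used to reach the data). The name extract_payload is mine.

```python
MARKER = b"redaolurc"  # "cruloader" reversed
XOR_KEY = 0x61         # ASCII 'a'

def extract_payload(image_data: bytes) -> bytes:
    """Find the marker in the image and XOR-decode everything after it."""
    pos = image_data.find(MARKER)
    if pos == -1:
        raise ValueError("marker not found in image data")
    encrypted = image_data[pos + len(MARKER):]
    return bytes(b ^ XOR_KEY for b in encrypted)
```

A quick sanity check on the result is that it begins with MZ and that the long runs of ‘a’ in the image have turned into zero bytes.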
And that’s it! Easy as that!
I really enjoyed this challenge and I’m looking forward to continuing the course 😊 Hope you enjoyed reading this!