Alright! Now, let’s get comfy with IDA and analyze some samples.
Lab 5-1
1.What is the address of DllMain?
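As can be seen in the Functions window and in IDA View, DllEntryPoint is located at 0x1001516D. Note that DllEntryPoint is the entry stub added by the compiler runtime, which in turn calls DllMain; DllMain itself is at 0x1000D02E in the .text section.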
Figure 1.1
2.Use the Imports window to browse to gethostbyname. Where is the import located?
gethostbyname is located at 0x100163cc in the .idata section.
Double-clicking the function sends us to the IDA View, where we can see that the function takes a string as the only argument.
Figure 2.1
3.How many functions call gethostbyname?
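With gethostbyname selected, pressing X opens the xrefs window, which lists 18 cross-references. 9 of them are of type p (a call cross-reference) and the other 9 are of type r (a read cross-reference). Each call site produces one entry of each type, so gethostbyname is actually called 9 times; grouping those call sites by the function name shown in the list gives five distinct calling functions.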
Figure 3.1
4.Focusing on the call to gethostbyname located at 0x10001757, can you figure out which DNS request will be made?
At 0x1000174E, the offset points to a variable named aThisIsRdoPicsP which has the value [This is RDO]pics.practicalmalwareanalysis.com.
In the next line, 0Dh (13 in base 10) is added to the eax register, so the pointer skips the first 13 characters of the string, that is, the [This is RDO] prefix. This leaves just pics.practicalmalwareanalysis.com, which is the DNS request to be made.
After that, the eax register is pushed onto the stack as the argument for the gethostbyname call.
Figure 4.1
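The pointer arithmetic above can be reproduced outside IDA. The following is only a minimal sketch in plain Python: the string is copied from the disassembly, while the helper name skip_prefix is made up for illustration and is not part of the sample.

```python
# String as shown in the disassembly for aThisIsRdoPicsP.
raw = "[This is RDO]pics.practicalmalwareanalysis.com"

# "add eax, 0Dh" advances the pointer by 13 bytes.
OFFSET = 0x0D


def skip_prefix(s: str, offset: int) -> str:
    """Mimic pointer arithmetic on a C string by dropping `offset` characters.

    Hypothetical helper, used only to illustrate the effect of the add.
    """
    return s[offset:]


hostname = skip_prefix(raw, OFFSET)
print(hostname)
```

Slicing at 13 drops exactly the bracketed tag, because "[This is RDO]" is 13 characters long.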
5.How many local variables has IDA Pro recognized for the subroutine at 0x10001656?
6.How many parameters has IDA Pro recognized for the subroutine at 0x10001656?
In the stack-frame listing that IDA shows at the top of the function, parameters have positive offsets and local variables have negative offsets. Counting them, IDA has recognized 23 local variables and one parameter, lpThreadParameter.
Figure 5.1
7.Use the Strings window to locate the string \cmd.exe /c in the disassembly. Where is it located?
It is located in the xdoors_d section, at 0x10095B34 and cross-referenced within the sub_1000FF58 subroutine.
Figure 7.1
8.What is happening in the area of code that references \cmd.exe /c?
Now, let’s analyze the code. In the figure mentioned above, we can find more strings which, I assume, are the lines the attacker sees once the reverse shell is established. Further down the code, one of the strings explicitly mentions a remote shell session.
In Figure 8.1, we can see more strings from the same subroutine that hint at other commands the malware can perform, such as exit, quit, and cd.
Figure 8.1
Looking into the subroutine, in Figure 8.2, we can see a few Windows API functions being used, such as GetCurrentDirectoryA, GetTickCount, and GetLocalTime. The GetTickCount function retrieves the number of milliseconds that have elapsed since the Windows system was started. This can be used as an anti-analysis tactic.
For example, the value returned by the function can be compared against a threshold of 10-15 minutes. That is roughly the time a sandbox might need to boot the VM, copy the malicious binary to it, let it run for a few minutes, create a report, and revert the VM to its clean state. A real user’s machine would, most of the time, have a much longer uptime.
The GetLocalTime function can be used in a similar manner. By reading the time at different points, the malware might try to detect how much time has elapsed since it started executing. An unusually long delay between two points suggests that someone is stepping through the code in a debugger.
Figure 8.2
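To make the uptime idea concrete, here is a small sketch of the kind of check described above. It is an assumption about how such a test could look, not code recovered from this sample; the function name and the 12-minute default threshold are made up for illustration.

```python
def uptime_looks_like_sandbox(uptime_ms: int, threshold_minutes: int = 12) -> bool:
    """Return True when the reported uptime is shorter than the threshold.

    `uptime_ms` stands in for the value GetTickCount would return.
    Hypothetical helper: the threshold is an illustrative guess.
    """
    return uptime_ms < threshold_minutes * 60 * 1000


# A VM restored five minutes ago versus a workstation that has been up for two days.
fresh_vm = 5 * 60 * 1000
workstation = 2 * 24 * 60 * 60 * 1000

print(uptime_looks_like_sandbox(fresh_vm))
print(uptime_looks_like_sandbox(workstation))
```

The check is easy to defeat (the analyst can simply leave the VM running before detonating the sample), which is one reason it is usually combined with other indicators.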
9.In the same area, at 0x100101C8, it looks like dword_1008E5C4 is a global variable that helps decide which path to take. How does the malware set dword_1008E5C4? (Hint: Use dword_1008E5C4’s cross-references.)
We can see that, at 0x100101C8, the dword_1008E5C4 variable is compared with the value in the register ebx (Figure 9.1). We can also observe the cross-references for this variable.
Figure 9.1
In Figure 9.2, we follow the first reference, of type “w” (which stands for write), since that is where the variable gets its value.
Figure 9.2
Following the call made at 0x10001673, we arrive at the subroutine sub_10003695. In Figure 9.3, we can see that the code retrieves the version information of the machine. In the end, it compares VersionInformation.dwPlatformId with the value 2.
Looking up the Microsoft documentation, we find that the value 2 represents Windows NT or later:
Win32NT 2 The operating system is Windows NT or later.
Figure 9.3
So, dword_1008E5C4 holds the result of that check: if the machine the malware runs on is Windows NT or later, the comparison succeeds and the code continues with the commands.
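The platform check can be summarized in a few lines. This is a sketch under the assumption that the subroutine's return value is what gets written to dword_1008E5C4; the Python names are made up, and only the numeric platform values come from the Microsoft documentation.

```python
# Platform identifiers as documented by Microsoft for OSVERSIONINFO.dwPlatformId.
VER_PLATFORM_WIN32S = 0
VER_PLATFORM_WIN32_WINDOWS = 1   # Windows 95/98/Me
VER_PLATFORM_WIN32_NT = 2        # Windows NT or later


def is_nt_platform(dw_platform_id: int) -> bool:
    """Hypothetical stand-in for the comparison made in sub_10003695."""
    return dw_platform_id == VER_PLATFORM_WIN32_NT


# Assumed: the result of the check is stored in the global flag.
dword_1008E5C4 = int(is_nt_platform(VER_PLATFORM_WIN32_NT))
print(dword_1008E5C4)
```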
10.A few hundred lines into the subroutine at 0x1000FF58, a series of comparisons use memcmp to compare strings. What happens if the string comparison to robotwork is successful (when memcmp returns 0)?
In Figure 10.1, the comparison to robotwork can be seen. The jump after it is jnz, “Jump if Not Zero”, so it is taken only when memcmp returns a nonzero value. We want to see what happens when memcmp returns 0, so we follow the red arrow, which is the path where the jump is not taken.
Figure 10.1
This leads us to the following block of commands in Figure 10.2:
Figure 10.2
The subroutine sub_100052A2 is called so let’s follow it. The subroutine can be seen in Figure 10.3.
Figure 10.3
We can see, in the bottom part, that the SOFTWARE\Microsoft\Windows\CurrentVersion registry key is used.
11.What does the export PSLIST do?
The export PSLIST can be seen in Figure 11.1.
Figure 11.1
In the bottom part of the code, a call to the subroutine sub_100036C3 is made. Let’s look into that in Figure 11.2!
Figure 11.2
It looks like it compares dwPlatformId with 2 again. If the condition is met, it compares dwMajorVersion with the value 5. We can find out what that value means by looking it up in the Windows documentation (Figure 11.3).
Figure 11.3
In the end, it pushes the value 1 onto the stack and returns, so that the rest of the execution can continue.
12.Use the graph mode to graph the cross-references from sub_10004E79. Which API functions could be called by entering this function? Based on the API functions alone, what could you rename this function?
As can be seen in Figure 12.1, the Xrefs from graph shows that the functions GetSystemDefaultLangID, sub_100038EE, sprintf, and strlen are called, and that the string aLanguageID0xX is referenced.
Based on the API function, we could call this function GetSysDefLang.
Figure 12.1
13.How many Windows API functions does DllMain call directly? How many at a depth of 2?
In Figure 13.1, we can see the functions called directly by DllMain, at depth 1. This graph was made with the User xrefs chart (View -> Graphs -> User xrefs chart) at a depth of 1.
Figure 13.1
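At depth 1, we can find 4 API functions: strncpy, _strnicmp, CreateThread, and strlen (strictly speaking, only CreateThread is a Windows API; the other three come from the C runtime).
For a graph at depth 2, let’s look at Figures 13.2, 13.3 and 13.4.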
Figure 13.2
Figure 13.3
Figure 13.4
There are a lot of Windows APIs used. Among them we can take note of CloseHandle, FreeLibrary, ExitThread, WinExec, closesocket, Sleep, gethostbyname, and memcpy.
14.At 0x10001358, there is a call to Sleep (an API function that takes one parameter containing the number of milliseconds to sleep). Looking backward through the code, how long will the program sleep if this code executes?
Analyzing the block of code in Figure 14.1, we first observe that the offset (or address) of the string [This is CTI]30 is moved into eax. Then, 13 (0Dh) is added, so that eax now points to the “3” in that string.
The atoi() function converts a character string to an integer value, so the remaining text “30” becomes the integer 30. Then, 30 is multiplied by 1000 and stored in the eax register. The value 30000 is pushed onto the stack right before the call to the Sleep function.
Hence, the program will sleep for 30000ms or 30s.
Figure 14.1
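The same arithmetic can be checked quickly in Python. This is only a sketch of the steps described above: the string is the one shown in the disassembly, and int() stands in for atoi() (unlike atoi(), int() raises an error on trailing non-digit characters, which does not matter for this input).

```python
# String referenced before the call to Sleep, as shown in the disassembly.
raw = "[This is CTI]30"

# "add eax, 0Dh" skips the 13-character tag, leaving only the digits.
digits = raw[0x0D:]

# atoi() equivalent for a purely numeric string.
seconds = int(digits)

# The result is multiplied by 1000 to convert seconds to milliseconds.
sleep_ms = seconds * 1000

print(digits, seconds, sleep_ms)
```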
15.At 0x10001701 is a call to socket. What are the three parameters?
The three parameters that are used to call the socket function are 6, 1, and 2, labeled as protocol, type and af.
Figure 15.1
16.Using the MSDN page for socket and the named symbolic constants functionality in IDA Pro, can you make the parameters more meaningful? What are the parameters after you apply changes?
Checking the Microsoft documentation for the socket function, https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-socket, we can see that the value 1 for type corresponds to SOCK_STREAM, a connection-oriented stream socket (used with TCP). For protocol, 6 corresponds to IPPROTO_TCP. And for af, 2 corresponds to AF_INET (IPv4). In IDA, we can right-click each operand, choose the symbolic constant option, and rename the values accordingly.
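The lookup done by hand in the documentation can also be written as a small table. The sketch below hard-codes a few Winsock values instead of relying on the host's socket module, so that it gives the same answer on any platform; the function name name_socket_args is made up for illustration.

```python
# A few Winsock constants, taken from the Microsoft documentation for socket().
ADDRESS_FAMILIES = {0: "AF_UNSPEC", 2: "AF_INET", 23: "AF_INET6"}
SOCKET_TYPES = {1: "SOCK_STREAM", 2: "SOCK_DGRAM", 3: "SOCK_RAW"}
PROTOCOLS = {1: "IPPROTO_ICMP", 6: "IPPROTO_TCP", 17: "IPPROTO_UDP"}


def name_socket_args(af: int, sock_type: int, protocol: int) -> tuple:
    """Translate the raw integers pushed before socket() into symbolic names.

    Hypothetical helper; unknown values are returned as plain numbers.
    """
    return (
        ADDRESS_FAMILIES.get(af, af),
        SOCKET_TYPES.get(sock_type, sock_type),
        PROTOCOLS.get(protocol, protocol),
    )


# Values seen at 0x10001701: af = 2, type = 1, protocol = 6.
print(name_socket_args(2, 1, 6))
```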
17.Search for usage of the in instruction (opcode 0xED). This instruction is used with a magic string VMXh to perform VMware detection. Is that in use in this malware? Using the cross-references to the function that executes the in instruction, is there further evidence of VMware detection?
Using Binary Search with Find all occurrences enabled (Figure 17.1), the in instruction was found at 0x100061DB, inside the function sub_10006196.
Figure 17.1
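A byte search like this one reports every occurrence of 0xED, whether or not it is really an instruction. The sketch below shows the idea on a made-up byte sequence (not taken from the sample); find_all is a hypothetical helper.

```python
from typing import List


def find_all(data: bytes, opcode: int) -> List[int]:
    """Return the offsets of every occurrence of a single byte value."""
    needle = bytes([opcode])
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = data.find(needle, pos + 1)
    return hits


# Made-up bytes: nop / in eax, dx / mov ebp, ebp / ret.
# The 0xED at offset 1 is the "in" opcode; the one at offset 3 is only
# the ModRM byte of "mov ebp, ebp" (8B ED), i.e. a false positive.
sample = bytes([0x90, 0xED, 0x8B, 0xED, 0xC3])

print(find_all(sample, 0xED))
```

This is why each hit still has to be inspected in the disassembly before concluding that an in instruction is present.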
Checking the most informed source on the internet, Google, we can find a few things about the magic string VMXh.
If we are outside VMware, a privilege error occurs. If we’re inside VMware, the magic value (VMXh) is moved to register EBX; otherwise, it is left at 0.
Based on the version values returned by ECX, we can even determine the specific VMware product.
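The magic value is easier to recognize in a hex listing once it is known as a number. The sketch below only converts the ASCII string to the 32-bit constant and models the comparison on EBX; the port value "VX" is the well-known VMware backdoor port and is not mentioned in the text above, and the function name is made up.

```python
# "VMXh" read as a big-endian 32-bit value, as it appears as an immediate operand.
VMWARE_MAGIC = int.from_bytes(b"VMXh", "big")

# I/O port used together with the magic value ("VX"); an addition for context.
VMWARE_PORT = int.from_bytes(b"VX", "big")


def ebx_indicates_vmware(ebx_after_in: int) -> bool:
    """Hypothetical model of the check performed after the in instruction."""
    return ebx_after_in == VMWARE_MAGIC


print(hex(VMWARE_MAGIC), hex(VMWARE_PORT))
print(ebx_indicates_vmware(VMWARE_MAGIC), ebx_indicates_vmware(0))
```

Searching the disassembly for the immediate 564D5868h is therefore another way to find this kind of check.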
The same instructions shown in Figure 17.2, taken from https://www.aldeid.com/wiki/VMXh-Magic-Value, can be found inside our program (Figure 17.3).
Figure 17.2
Figure 17.3
Now, let’s take a look at cross-references.
In Figure 17.4, we can see that the sample uses the function sub_10006196 to verify whether it is running inside a virtual machine. If it is, the installation is aborted.
Figure 17.4
18.Jump your cursor to 0x1001D988. What do you find?
At 0x1001D988 is the beginning of what looks like a string of seemingly random bytes (Figure 18.1).
Figure 18.1
19.If you have the IDA Python plug-in installed (included with the commercial version of IDA Pro), run Lab05-01.py, an IDA Pro Python script provided with the malware for this book. (Make sure the cursor is at 0x1001D988.) What happens after you run the script?
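The Python script is an XOR decoder: it XORs the bytes starting at the cursor with the key 0x55, so the unreadable data at 0x1001D988 becomes readable text. Without IDAPython, the same result can be obtained with an online XOR decoder and the key 0x55, as seen in the .py file.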
Figure 19.1
20.With the cursor in the same location, how do you turn this data into a single ASCII string?
We can turn this data into a single ASCII string by pressing the A key.
21.Open the script with a text editor. How does it work?
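We can see a variable initialized with ScreenEA(). This function returns the address at the current cursor position, which is why the cursor has to be at 0x1001D988 before the script is run.
Then the script iterates over the offsets from 0x00 up to 0x50 (80 in decimal, the length of the string), XORs each byte with the key 0x55, and writes the result back with PatchByte(), which modifies a byte in the IDA database.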
The code can be seen in Figure 21.1.
Figure 21.1
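The decoding loop can be tried without IDA. The following is a minimal sketch, not the book's script: a bytearray stands in for the IDA database, the in-place assignment stands in for PatchByte(), and the plaintext is a placeholder rather than the string hidden in the sample.

```python
KEY = 0x55      # XOR key used by the script
LENGTH = 0x50   # number of bytes processed (80)


def patch_in_place(memory: bytearray, start: int,
                   length: int = LENGTH, key: int = KEY) -> None:
    """XOR `length` bytes of `memory` starting at `start`, modifying it in place.

    Hypothetical stand-in for the read / XOR / PatchByte() loop.
    """
    for i in range(length):
        memory[start + i] ^= key


# Placeholder plaintext, padded to 0x50 bytes; NOT the sample's real string.
plain = b"placeholder text standing in for the hidden string".ljust(LENGTH, b".")

# Simulate the encoded data as it would sit in the binary.
memory = bytearray(b ^ KEY for b in plain)

patch_in_place(memory, 0)
print(memory.decode("ascii"))
```

Because XOR is its own inverse, running the same loop a second time would scramble the bytes again, so the script should be run only once at that address.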