		| In-Memory Execution of an Executable | 
	
		| Author:
		Amit Malik | 
	
		| This article is part of our free "Reverse Engineering & Malware Analysis Course". Visit our training page for details and for the presentations of all previous sessions. | 
	
		| In this article, we will learn how to perform in-memory (file-less) execution of an executable, with a practical code example. I will explain some of the techniques used by exploits and malware from the shellcode perspective. This article requires a strong understanding of the PE file format. If you are not comfortable with it, first visit our training session on PE Format Basics. | 
	
	
	
		| Technically, an exploit is a combination of two things: | 
	
	
		| 
		  Vulnerability – the software security bug
		  Shellcode – the actual malicious payload
	 |  | 
	
	  | The vulnerability gives us control over the execution flow, while the shellcode is the actual payload that carries out the malicious activity. Without the shellcode, the vulnerability is just a simple software bug. Shellcodes can be further divided into two types: | 
	
	  | 
	    Normal shellcodes
	    Staged shellcodes (often termed drive-by download)
 |  | 
	
	  | In a normal shellcode, the shellcode itself carries out the malicious activity, e.g. bind shell or reverse shell shellcodes. It does not need to download any other payload to work. Staged shellcodes, on the other hand, require another payload to work and are usually divided into two stages:
	    Stage 1 – downloads stage 2
	    Stage 2 – the actual malicious payload
 Stage 1 downloads the stage 2 payload and executes it. After that, stage 2 performs the malicious activity. The interesting part is how stage 1 executes the stage 2 payload, which this article discusses in detail.
      Stage 1 shellcode has two possible ways to execute the stage 2 payload: | 
	
	  | 
	    1. Download the payload, save it to disk and create a new process
	    2. Download the payload and execute it directly from memory
	 |  | 
	
	  | #1 leaves a larger footprint, and there is a greater chance of detection by host-based security software such as antivirus.
	    In #2 the payload is executed directly from memory, so it is much more likely to go unnoticed by host-based security software that relies on scanning files. Unfortunately, no Windows API provides a mechanism to execute a file directly from memory. APIs such as CreateProcess, WinExec and ShellExecute all require the file to be present on disk.
 So the question is: how can we do that if there is no such API?
 | 
	
	
		| To my knowledge, the first known work on in-memory execution was done by ZomBie of 29A labs, and Nologin later published its own version of the same idea. Stephen Fewer of Harmony Security then applied the logic to DLLs and coined the term reflective DLL injection, which is now an integral part of the Metasploit framework.
		  The technique is possible because a PE file uses the same data structures on disk as in mapped memory; only the alignment of the sections differs. Using the section headers, we can therefore calculate an address in memory from an offset on disk and vice versa. This makes it possible to mimic the operating system loader that loads an executable into memory. The OS loader is responsible for process initialization, so if we can build a prototype of it, we can probably also start an executable directly from memory. Before that, we need to look at how the OS loader works, especially how it maps an executable into memory.
        The following are the simplified steps carried out by the OS loader when you launch an executable. | 
	
	
	  | 
	    
	      Read first page of the file which includes DOS  header, PE header, section headers etc.
	       Fetch Image Base address from PE header and  determine if that address is available else allocate another area.  (Case of relocation)
	       Map the sections into the allocated area
	       Read information from import table and load the  DLLs
	       Resolve the function addresses and create Import  Address Table (IAT).
	       Create initial heap and stack using values from  PE header.
	       Create main thread and start the process. | 
	 |  | 
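The first loader step, reading the headers, can be sketched on a plain byte buffer. This is a minimal, portable sketch and not the loader's real implementation: it assumes a 32-bit PE (PE32) image and a little-endian host, and the names `PeSummary`, `read_at` and `parse_pe32_headers` are hypothetical helpers introduced only for illustration. The field offsets are those of the documented PE32 header layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical summary of the header fields a loader prototype needs (PE32 only).
struct PeSummary {
    uint16_t numberOfSections;
    uint32_t addressOfEntryPoint; // RVA of the entry point
    uint32_t imageBase;           // preferred load address
    uint32_t sizeOfImage;         // size of the mapped image
    uint32_t sizeOfHeaders;       // size of all headers in the file
};

// Copy a value of type T out of the buffer with bounds checking.
// Assumes a little-endian host, which matches the PE on-disk byte order.
template <typename T>
static bool read_at(const uint8_t* buf, size_t len, size_t off, T& out) {
    if (off > len || len - off < sizeof(T)) return false;
    std::memcpy(&out, buf + off, sizeof(T));
    return true;
}

// Validate the DOS and NT signatures and extract the fields above.
bool parse_pe32_headers(const uint8_t* buf, size_t len, PeSummary& out) {
    uint16_t mz = 0, magic = 0;
    uint32_t lfanew = 0, sig = 0;
    if (!read_at(buf, len, 0, mz) || mz != 0x5A4D) return false;       // "MZ"
    if (!read_at(buf, len, 0x3C, lfanew)) return false;                // e_lfanew
    const size_t nt = lfanew;                                          // start of NT headers
    if (!read_at(buf, len, nt, sig) || sig != 0x00004550) return false; // "PE\0\0"
    // The optional header starts 24 bytes after the NT signature.
    if (!read_at(buf, len, nt + 24, magic) || magic != 0x010B) return false; // PE32 magic
    return read_at(buf, len, nt + 6,  out.numberOfSections)
        && read_at(buf, len, nt + 40, out.addressOfEntryPoint)
        && read_at(buf, len, nt + 52, out.imageBase)
        && read_at(buf, len, nt + 80, out.sizeOfImage)
        && read_at(buf, len, nt + 84, out.sizeOfHeaders);
}
```

On Windows the same fields are available through `IMAGE_DOS_HEADER` and `IMAGE_NT_HEADERS` from `windows.h`; the explicit offsets here only keep the sketch independent of that header.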
	
	  | If we can write a program that mimics some of the above steps, we can execute an exe directly from memory. For example, when you download an exe/dll from the internet, it remains in volatile memory until you save it to disk. This means we can read the header information of that file directly from memory and, following the above steps, execute it from memory. In short, file-less execution of an exe/dll is possible. If you look closely at the above steps, most of the required information is stored in the PE header itself, which we can read programmatically. Technically, the minimum required to run an executable is as follows: | 
	
	  | 
	    
	      Address space
   Proper sections (exe sections) placement into the  address space
	       Imported API  addresses
           | 
	
	
	  | In PE, everything is relative to the Image Base, so if we can get an allocation at the Image Base address we can proceed to the next steps easily. Otherwise we would have to add relocation support to our loader prototype. For this article I am ignoring that part and assuming that we get an allocation at the Image Base. | 
	
	  | In the PE file header, the NumberOfSections field gives the total number of sections. We can then read each section header and write the section data to the proper address in memory: read SizeOfRawData bytes from file offset PointerToRawData and copy them to Image Base + VirtualAddress. | 
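The section copy described above can be sketched with plain buffers. This is a sketch under stated assumptions: `SectionInfo` is a hypothetical, trimmed-down stand-in for `IMAGE_SECTION_HEADER`, and `place_sections` and `rva_to_file_offset` are illustrative names, not Windows APIs. The second helper shows the memory-to-disk offset translation using the same section table.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical subset of IMAGE_SECTION_HEADER: only the fields used here.
struct SectionInfo {
    uint32_t virtualAddress;   // RVA of the section in the mapped image
    uint32_t sizeOfRawData;    // number of bytes stored in the file
    uint32_t pointerToRawData; // file offset of those bytes
};

// Copy each section's raw bytes from the file buffer to its RVA inside the
// image buffer. Returns false if a section would read or write out of bounds.
bool place_sections(const std::vector<uint8_t>& file,
                    const std::vector<SectionInfo>& sections,
                    std::vector<uint8_t>& image) {
    for (const SectionInfo& s : sections) {
        if (s.pointerToRawData > file.size() ||
            s.sizeOfRawData > file.size() - s.pointerToRawData) return false;
        if (s.virtualAddress > image.size() ||
            s.sizeOfRawData > image.size() - s.virtualAddress) return false;
        if (s.sizeOfRawData != 0) {
            std::memcpy(image.data() + s.virtualAddress,
                        file.data() + s.pointerToRawData,
                        s.sizeOfRawData);
        }
    }
    return true;
}

// Translate an RVA into a file offset by finding the section that holds it.
bool rva_to_file_offset(const std::vector<SectionInfo>& sections,
                        uint32_t rva, uint32_t& offset) {
    for (const SectionInfo& s : sections) {
        if (rva >= s.virtualAddress && rva - s.virtualAddress < s.sizeOfRawData) {
            offset = s.pointerToRawData + (rva - s.virtualAddress);
            return true;
        }
    }
    return false;
}
```

The bounds checks are a design choice for the sketch: section headers come from the file being loaded, so their values should not be trusted blindly.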
	
	  | By reading the import table we can get the names of the DLLs and APIs used by the executable. Remember that FirstThunk in each import descriptor points to the array that becomes the IAT once the names have been resolved to addresses.
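Each 32-bit entry in the thunk array is either an import by name or an import by ordinal. The following is a minimal sketch of how an entry could be classified before resolution, assuming a PE32 image; `ThunkTarget` and `decode_thunk32` are hypothetical names, and `kOrdinalFlag32` has the same value as `IMAGE_ORDINAL_FLAG32` from the Windows headers.

```cpp
#include <cstdint>

// Same value as IMAGE_ORDINAL_FLAG32: the high bit marks an import by ordinal.
const uint32_t kOrdinalFlag32 = 0x80000000u;

// Hypothetical result type for illustration.
struct ThunkTarget {
    bool     byOrdinal; // true if the function is imported by ordinal
    uint16_t ordinal;   // valid when byOrdinal is true
    uint32_t nameRva;   // valid when byOrdinal is false: RVA of IMAGE_IMPORT_BY_NAME
};

// Classify one 32-bit thunk entry of an unresolved import table.
ThunkTarget decode_thunk32(uint32_t thunk) {
    ThunkTarget t = {};
    if (thunk & kOrdinalFlag32) {
        t.byOrdinal = true;
        t.ordinal = static_cast<uint16_t>(thunk & 0xFFFFu); // ordinal is the low 16 bits
    } else {
        t.nameRva = thunk; // high bit is clear, so the value is the RVA itself
    }
    return t;
}
```

For a by-name entry, the function name string starts two bytes after `nameRva`, because `IMAGE_IMPORT_BY_NAME` begins with a 16-bit Hint field. For a by-ordinal entry, `GetProcAddress` accepts the ordinal in the low-order word of its name parameter with the high-order word set to zero.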
  
	
	
	  | Based on the above information we can write a basic loader prototype. Please note that I am intentionally ignoring a couple of important things in the code, such as relocation, section permissions and ordinal-based import entries. | 
	
	  |  | 
	
      | /* In memory execution example */
/*
Author: Amit Malik
http://www.securityxploded.com
Compile in Dev C++
*/
#include <stdio.h>
#include <stdlib.h>
#include <windows.h>
#define DEREF_32( name )*(DWORD *)(name)
int main()
{
     char file[MAX_PATH];
     HANDLE handle;
     PVOID vpointer;
     HINSTANCE laddress;
     LPSTR libname;
     DWORD size;
     DWORD EntryAddr;
     int state;
     DWORD byteread;
     PIMAGE_NT_HEADERS nt;
     PIMAGE_SECTION_HEADER section;
     DWORD dwValueA;
     DWORD dwValueB;
     DWORD dwValueC;
     DWORD dwValueD; 
     printf("Enter file name: ");
     // bounded read: at most MAX_PATH - 1 (259) characters plus the terminator
     scanf("%259s", file);
          
           
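     // read the file
     printf("Reading file..\n");
     handle = CreateFileA(file,GENERIC_READ,0,0,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,0);
     if (handle == INVALID_HANDLE_VALUE)
     {
         printf("Cannot open file (error %lu)\n", GetLastError());
         return 1;
     }
     
     // get the file size
     size = GetFileSize(handle,NULL);
     
     // Allocate the space 
     vpointer = VirtualAlloc(NULL,size,MEM_COMMIT,PAGE_READWRITE);
     if (vpointer == NULL)
     {
         printf("Cannot allocate read buffer\n");
         CloseHandle(handle);
         return 1;
     }
     
     // read file on the allocated space
     state = ReadFile(handle,vpointer,size,&byteread,NULL);
     CloseHandle(handle);
     if (!state || byteread != size)
     {
         printf("Cannot read file\n");
         return 1;
     }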
     printf("You can delete the file now!\n");
     system("pause");
     
     // read NT header of the file
     nt = PIMAGE_NT_HEADERS(PCHAR(vpointer) + PIMAGE_DOS_HEADER(vpointer)->e_lfanew);
     handle = GetCurrentProcess();
     
     // get VA of entry point
     EntryAddr = nt->OptionalHeader.ImageBase + nt->OptionalHeader.AddressOfEntryPoint;
     
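     // Allocate the space with Imagebase as a desired address allocation request
     PVOID memalloc = VirtualAllocEx(
                                     handle, 
                                     PVOID(nt->OptionalHeader.ImageBase), 
                                     nt->OptionalHeader.SizeOfImage, 
                                     MEM_RESERVE | MEM_COMMIT, 
                                     PAGE_EXECUTE_READWRITE
                                     );
     // no relocation support: stop if the Image Base address is not available
     if (memalloc == NULL)
     {
         printf("Cannot allocate memory at Image Base\n");
         return 1;
     }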
    
     // Write headers on the allocated space
     WriteProcessMemory(handle, 
     memalloc, 
     vpointer, 
     nt->OptionalHeader.SizeOfHeaders, 
     0
     );
     
     
     // write sections on the allocated space
     section = IMAGE_FIRST_SECTION(nt);
     for (ULONG i = 0; i < nt->FileHeader.NumberOfSections; i++) 
     {
         WriteProcessMemory(
                           handle, 
                           PCHAR(memalloc) + section[i].VirtualAddress, 
                           PCHAR(vpointer) + section[i].PointerToRawData, 
                           section[i].SizeOfRawData, 
                           0
                           );
     }
     
     // read import directory    
     dwValueB = (DWORD) &(nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT]);
     
     // get the VA 
     dwValueC = (DWORD)(nt->OptionalHeader.ImageBase) + 
                          ((PIMAGE_DATA_DIRECTORY)dwValueB)->VirtualAddress;
     
     
     while(((PIMAGE_IMPORT_DESCRIPTOR)dwValueC)->Name)
     {
            // get DLL name
            libname = (LPSTR)(nt->OptionalHeader.ImageBase + 
                              ((PIMAGE_IMPORT_DESCRIPTOR)dwValueC)->Name);
                              
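            // Load dll (ANSI variant, because import names are stored as ASCII strings)
            laddress = LoadLibraryA(libname);
            if (laddress == NULL)
            {
                printf("Cannot load %s\n", libname);
                return 1;
            }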
            
            // get first thunk, it will become our IAT
            dwValueA = nt->OptionalHeader.ImageBase + 
                                  ((PIMAGE_IMPORT_DESCRIPTOR)dwValueC)->FirstThunk;
            
            // resolve function addresses
            while(DEREF_32(dwValueA))
            {
                dwValueD = nt->OptionalHeader.ImageBase + DEREF_32(dwValueA);
                // get function name 
                LPSTR Fname = (LPSTR)((PIMAGE_IMPORT_BY_NAME)dwValueD)->Name;
                // get function addresses
                DEREF_32(dwValueA) = (DWORD)GetProcAddress(laddress,Fname);
                dwValueA += 4;
            }
            dwValueC += sizeof( IMAGE_IMPORT_DESCRIPTOR );
     }
   
   
     // call the entry point :: here we assume that everything is ok.
     ((void(*)(void))EntryAddr)();
           
}
           
           
 | 
	
	  |  | 
	
	  | Compile the above code in Dev C++. As a proof of concept, I will execute the MessageBox code that I showed in my 'Assembly Basics' article. Perform the following steps: | 
	
	  | 
	    Compile the MessageBox code again, but first open the project properties in WinAsm (project->Project Properties->Release) and add the following option in the Link block:
	        /BASE:0x500000
	    Click OK. Now assemble and link the code. You will get an EXE with an Image Base of 0x500000, which is suitable for our POC. | 
	
	
	  | The snapshot below shows the execution directly from memory. | 
	
	
		| Kaspersky recently reported seeing a file-less worm, but these things are not new. Metasploit has had a file-less Trojan for years in the form of reflective DLL injection. Many malicious programs and packers make heavy use of these techniques, and they are well known for bypassing security software. Overall it is a very powerful mechanism that every malware analyst should know. | 
	
	
	
	
		| 
		  Nologin - Remote Library Injection
		  Harmony Security - Reflective DLL Injection
		  In Memory Execution – Zombie | 
	