Process Doppelgänging – a new way to impersonate a process

Recently, at the Black Hat Europe conference, Tal Liberman and Eugene Kogan from the enSilo lab presented a new technique called Process Doppelgänging. The video from the talk is available here. (It is also worth mentioning that Tal Liberman is the author of the AtomBombing injection.)

This technique is a possible substitute for the well-known Process Hollowing (RunPE), which is commonly used in malware. Both Process Doppelgänging and Process Hollowing give the ability to run a malicious executable under the cover of a legitimate one. Although they both serve the same goal of process impersonation, they differ in implementation and make use of different API functions. This is why most of the current antivirus solutions struggled to detect Process Doppelgänging. In this post we will take a closer look at how Process Doppelgänging works and compare it with the popular RunPE.

WARNING: Running this PoC on Windows 10 may cause a BSOD – the reason is a bug in Windows 10. Details here.

Process Doppelgänging vs Process Hollowing (aka RunPE)

The popular RunPE technique substitutes the PE content after the process is created (suspended), but before it is fully initialized. In order to implement this technique, we need to perform on our own the steps that the Windows loader has taken so far: converting the PE file from its raw form into a virtual form, relocating it to the base where it is going to be loaded, and pasting it into the process’ memory. Then, we can wake the process from the suspended state, and the Windows loader will continue loading our (potentially malicious) payload. You can find a commented implementation here.

Process Doppelgänging, on the contrary, substitutes the PE content before the process is even created. We overwrite the file image before the loading starts – so the Windows loader automatically takes care of the aforementioned steps. My sample implementation of this technique can be found here.

NTFS transactions

On the way to its goal, Process Doppelgänging uses a very little-known API for NTFS transactions.

Transactions are a mechanism commonly used when operating on databases – however, a similar mechanism exists in the NTFS file system. It allows us to encapsulate a series of operations into a single unit. Thanks to this, multiple operations can be treated as a whole: they can either succeed as a whole – and be committed, or fail as a whole – and be rolled back. Outside of our transaction, the result of the operations is not visible. It becomes visible only after the transaction is committed.

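Windows API makes several functions available for the purpose of transactions, among them CreateTransaction, CreateFileTransacted and RollbackTransaction, which are used in the steps described below.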

Briefly speaking, we can create a file inside a transaction, and this file is not visible to any other process as long as our transaction is not committed. It can be used to drop and run malicious payloads in an unnoticed way. If we roll back the transaction at an appropriate moment, the operating system behaves as if our file was never created.
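To make the visibility rule more tangible, here is a toy, in-memory model of it. This is only a sketch: it does not touch NTFS or the real transaction API, and the names ToyFileSystem and ToyTransaction are made up for this illustration. It only shows the idea that staged writes are visible inside the transaction, invisible outside, and vanish on rollback.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: NOT the NTFS transaction API. All names are hypothetical.
class ToyFileSystem {
public:
    // The view that every observer outside of a transaction gets.
    bool read(const std::string& path, std::string& out) const {
        auto it = committed_.find(path);
        if (it == committed_.end()) return false;
        out = it->second;
        return true;
    }
private:
    friend class ToyTransaction;
    std::map<std::string, std::string> committed_;
};

class ToyTransaction {
public:
    explicit ToyTransaction(ToyFileSystem& fs) : fs_(fs), open_(true) {}

    // Writes are staged privately; the outside view stays unchanged.
    void write(const std::string& path, const std::string& data) {
        assert(open_ && "transaction already closed");
        pending_[path] = data;
    }
    // Inside the transaction, staged writes shadow the committed state.
    bool read(const std::string& path, std::string& out) const {
        auto it = pending_.find(path);
        if (it != pending_.end()) { out = it->second; return true; }
        return fs_.read(path, out);
    }
    // All staged writes become visible at once.
    void commit() {
        for (const auto& kv : pending_) fs_.committed_[kv.first] = kv.second;
        pending_.clear();
        open_ = false;
    }
    // All staged writes are discarded, as if they never happened.
    void rollback() {
        pending_.clear();
        open_ = false;
    }
private:
    ToyFileSystem& fs_;
    std::map<std::string, std::string> pending_;
    bool open_;
};
```

In this model, a file written through a ToyTransaction can be read back through the same transaction, while ToyFileSystem::read keeps reporting that it does not exist, both before and after a rollback.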

The steps taken

Usage of NTFS transactions

Firstly, we need to create a new transaction, using the API CreateTransaction.

Then, inside of this transaction we will create a dummy file to store our payload (using CreateFileTransacted).

This dummy file will then be used to create a section (a buffer in a special format), which serves as the base for our new process.

After we created the section, we no longer need the dummy file – we can close it and roll back the transaction (using RollbackTransaction).

Usage of undocumented process creation API

So far we have created a section containing our payload, loaded from the dummy file. You may ask – how are we going to create a process out of this? The well-known API functions for creating processes on Windows require a file path to be given. However, if we look deeper inside those functions, we will find that they are just wrappers for other, undocumented functions. There is a function Zw/NtCreateProcessEx which, rather than the path to the raw PE file, requires a section with PE content to be given. If we use this function, we can create a new process in a “fileless” way.

Definition of the NtCreateProcessEx:

Creating a process this way requires more steps to be taken – there are some structures that we have to fill and set up manually – such as process parameters (RTL_USER_PROCESS_PARAMETERS). After filling them and writing them into the address space of the remote process, we need to link them to the PEB. A mistake in doing so will cause the process to not run.

After setting everything up, we can run the process by creating a new thread starting from its Entry Point.

Despite some inconveniences, creating the process by a low-level API also gives interesting advantages. For example, we can set the file path manually – creating the illusion that this is the file that has been loaded, even if it was not. This way, we can impersonate any Windows executable, but we can also create the illusion that the PE file runs from a non-existing file, or from a file of a non-executable format.

Below you can see an example where the illusion was created, that the PE file runs from a TXT file:

How to detect?

Although this technique may look dangerous, it can be easily detected with the help of any tool that checks whether the image loaded in memory matches the corresponding file on the disk. Example: detection with PE-sieve (formerly hook_finder):

The process of injection is also not fully stealthy. It still requires writing into the memory (including PEB) of the newly created process, as well as creating a remote thread. Such operations may trigger alerts.

In addition, the mechanism of NTFS transactions is very rarely used – so, if any executable calls the related APIs, it should become an object of closer examination.

This technique is still new, which is why it is not broadly recognized by AV products – but once we are aware of its existence, implementing detection should not be difficult.
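The comparison idea mentioned above (image in memory vs. file on disk) can be sketched in a few lines. This is a simplified illustration and not how PE-sieve is implemented: it assumes that both buffers already share the same layout (i.e. the on-disk file has been converted to its virtual form first), and the names DiffRange and find_modified_ranges are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// A contiguous run of bytes that differs between the two images.
struct DiffRange {
    std::size_t offset;  // where the run starts
    std::size_t length;  // number of differing bytes
};

// Compare two images that are ASSUMED to share the same (virtual) layout
// and report every modified byte range.
std::vector<DiffRange> find_modified_ranges(const std::vector<std::uint8_t>& on_disk,
                                            const std::vector<std::uint8_t>& in_memory)
{
    std::vector<DiffRange> ranges;
    const std::size_t common = std::min(on_disk.size(), in_memory.size());
    std::size_t i = 0;
    while (i < common) {
        if (on_disk[i] == in_memory[i]) { ++i; continue; }
        const std::size_t start = i;
        while (i < common && on_disk[i] != in_memory[i]) ++i;
        ranges.push_back({start, i - start});
    }
    // A size mismatch is itself a difference: report the tail as one range.
    const std::size_t longest = std::max(on_disk.size(), in_memory.size());
    if (longest > common) ranges.push_back({common, longest - common});
    return ranges;
}
```

An empty result means the two images match; for a process created by Process Doppelgänging, the payload in memory would differ from the file whose path it pretends to run from.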


Hook the planet! Solving FlareOn4 Challenge6 with libPeConv

Recently I started making a small library for loading and manipulating PE files (libpeconv – it’s open source, available on my GitHub). In my previous post, I demonstrated how Challenge 3 from FlareOn4 could be solved with its help: I used libPeConv to import a function from the original crackme, so that it can be used as a local one – without the need to re-implement or emulate it.

This time, we will have a closer look at challenge 6 from FlareOn4. This challenge is a bit more difficult, so it is a good opportunity to show some other capabilities of libPeConv – not only importing functions, but also hooking the imported code in various ways.

When I solved this crackme for the first time, during the FlareOn competition, my approach was very dirty – it required me to go through 26 MessageBoxes, write down each value, convert them from hex to ASCII and put them together to make the full flag – oh, my! Looking at the write-ups afterwards, I noticed that most people did it this way (check the appendix for more details). But I was sure that there must be a better solution – and with the help of libPeConv, I finally did it the way I wanted: no pop-ups to click, the flag is automatically composed by the loader.


The end result looks like this:

The repository with all the presented loaders (code + compiled binaries) is available here.

The full code of the final loader:

In this post I will explain in detail how I made it, and show the experiments and the reasoning behind them.

Tools used

For analyzing the crackme:

For building the solution:


The challenge named payload.dll is a 64bit PE file. When we look at its export table, we can find that it exports one function, named EntryPoint:

But if we try to run it in a typical way, by rundll32.exe payload.dll,EntryPoint, it turns out that the function cannot be found:

Pretty weird… So, let’s try to run it by ordinal: rundll32.exe payload.dll,#1:

This way works – however, we are still far from getting the flag.

The curious thing is: why was the exported function not found by name? It seems that the export name has been overwritten while the DLL was loading. To check it quickly, we can use PE-sieve, a tool that detects modifications of the running PE vs the PE on the disk.

I called the function again by the ordinal, and when the MessageBox popped up, I scanned the running rundll32.exe process by PE-sieve. Indeed we can see that the DLL was overwritten:

The modified image has been automatically dumped by the PE-sieve, so we can open it by typical tools. First, I use PE-bear to take a look at the exports table:

And yes, now it looks very different… Let’s see this function in IDA.

Looking inside, we can confirm that this was the function responsible for displaying the message that we saw before:

This message is displayed when the function is called without any parameters. If, in contrast, it is called with proper parameters, some further chunk of code is decrypted and executed:

The name of the function is used as the key for the decryption. So, what are the conditions that the supplied arguments must fulfill?

The exported function expects 4 arguments:

As we can see, the checked argument is the third one of the arguments supplied:

It is compared against the function name. If it is exactly the same as the function name, the decryption proceeds – otherwise, the fail MessageBox (“Insert clever message…”) is shown.

Let’s run the function with proper parameters and see what happens.
This is a small wrapper that will help us call this function from our code:

We can do the same from the command line:

rundll32.exe payload.dll [func_name] [checked_str]

Cool, a new message popped up. It seems to be a chunk of the key: 0x75 -> ASCII ‘u’. But this is just one of the pieces, and we have to get the full key.
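The hex-to-ASCII step (0x75 -> ‘u’) is easy to automate. Below is a minimal sketch; the helper names hex_chunk_to_char and store_chunk are hypothetical, and it assumes that each pop-up carries one byte written as hex text and that the chunk index is also the character’s position in the key.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Convert a textual hex byte such as "0x75" into the character it encodes.
// Returns '\0' when the text is not a valid value in the range 0..255.
char hex_chunk_to_char(const std::string& text)
{
    char* end = nullptr;
    const unsigned long value = std::strtoul(text.c_str(), &end, 16);
    if (end == text.c_str() || *end != '\0' || value > 0xFF) return '\0';
    return static_cast<char>(value);
}

// Store the decoded character at its chunk index (assumption: the index
// of the chunk equals the position of the character in the key).
// `key` must already have one slot per chunk (26 in this challenge).
bool store_chunk(std::string& key, std::size_t index, const std::string& hex_text)
{
    if (index >= key.size()) return false;
    const char c = hex_chunk_to_char(hex_text);
    if (c == '\0') return false;
    key[index] = c;
    return true;
}
```

With a key pre-filled with placeholders, e.g. std::string key(26, '?'), each retrieved chunk can be dropped into its slot until no placeholder is left.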

For this purpose, we will look inside the DLL to find the code that was responsible for overwriting the exports table. That function starts at RVA 0x5D30:

This is the pseudocode:

It decrypts the code chunk pointed to by the given index and redirects to the new exported function. We want to manipulate the indexes in order to get all the remaining key parts. The index of the chunk is calculated based on the current time, inside the function at RVA 0x4710:

We can see the operation modulo 26, which means that the index always falls in the range 0–25. There are 26 possible indexes -> 26 pieces of the key. The calculated index is then supplied to the decrypting function. The decrypting function is pretty simple – based on XOR:

First, the random generator is initialized based on the supplied chunk_index + a constant. Then, pseudo-random values retrieved by rand() are used as the XOR key. Thanks to the nature of this (weak) random generator, the values are not really random – the same seed always gives the same sequence, so it works pretty well as the key.

The approach that is simplest to implement (and terribly annoying to execute) is to keep changing the system time, running the DLL and writing down the chunks of the key as they pop up. Sounds too ugly? Let’s see what libPeConv can do about it…
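The scheme described above can be sketched as follows. This is an illustration of the idea, not the crackme’s code: the value of kSeedConstant is a placeholder (the real constant is not given here), and the function name is made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Placeholder value: the crackme adds its own constant to the index.
const unsigned int kSeedConstant = 0x1234;

// XOR the buffer with the rand() stream seeded from chunk_index.
// Because the same seed always yields the same sequence, calling this
// twice with the same index restores the original data.
void xor_with_rand_stream(std::vector<std::uint8_t>& data, unsigned int chunk_index)
{
    std::srand(chunk_index + kSeedConstant);
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] ^= static_cast<std::uint8_t>(std::rand());
    }
}
```

The same routine serves for both encryption and decryption, which is why knowing (or forcing) the chunk index is enough to recover the chunk.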

Importing and hooking function with LibPeConv

Preparations and tests

In the previous post I gave some overview of PeConv library, so if you didn’t read that part, please take a look. This time, I will use the features that I introduced before, plus some others. We will not only import and use the code of the original crackme, but also mix it with our own code, to alter some behaviors.

First, I want to import from the crackme the function that overwrites the exports. This function is at RVA 0x5D30 and its prototype is:

__int64 __fastcall to_overwrite_mem(__int64 a1);

Let’s make a loader that will load the crackme to the current process. This was my first version:


Everything looks good and should work, but when I ran it, I met an unpleasant surprise:


When we try to debug the code, we will find that the exception is thrown from inside the statically linked functions srand() and rand(). They were used by the function dexor_chunk_index, within the function to_overwrite_mem that we imported from payload.dll.

This happened due to the fact that this DLL requires that the CRT should be initialized prior to use. It is easy to fix – we just need to find the proper functions, responsible for the CRT initialization, and then call them manually. Let’s have a look in IDA – it should automatically recognize and name the functions related to CRT.

The function for CRT initialization is at offset 0x664C:


Its prototype is:

char __fastcall _scrt_initialize_crt(int a1);

And the function for releasing the CRT (we need to call it at the end, in order to avoid stability issues in our application) is at offset 0x6824:


Its prototype is:

char __fastcall _scrt_uninitialize_crt(__int64 a1, __int64 a2);

We need to call them appropriately before and after our actions. Example:


And now everything works smoothly without any crashes.

Alternatively, if for some reason we don’t want to, or cannot, initialize the CRT, we can redirect the needed CRT functions to their local copies:

If we want to log the rand values, instead of redirecting to the original function, we can redirect to our own wrapper, e.g.:


Now, instead of running silently, it will print a value each time when the rand was called:


Now, let’s test if the exported function has been overwritten properly. If so, we should be able to use it in the same way as with the previous basic loader.
Instead of GetProcAddress, which I would use on a module loaded in a typical way, I used a function from PeConv with an analogous API:

peconv::get_exported_func(loaded_pe, MAKEINTRESOURCE(1));

And this is the code of the full loader, this time using libPeConv:

And yes, it works exactly the same:

As we know, the argument that we supply to the function changes depending on the current month and year. For December 2017 it is:


But it would be nice if our loader could fill it automatically.

This argument must be exactly the same as the exported function name. We can take advantage of this and use libPeConv’s feature of listing exported function names. Code:


This is the improved version of loader:

Everything works fine:

Ok, tests and preparations are over, now it is time for the solution.

Manipulating the chunk index

We have everything ready to start manipulating the index. There are various approaches – one of them is to hook the imported function GetSystemTime. But with libPeConv we can also hook local functions, so let’s make it even simpler.

The function that calculates the index can be found in the payload.dll at RVA 0x4710:


We will use exactly the same function as we used before, to redirect the statically linked rand and srand:


All we need to prepare is our own function returning the index. For example, we can force it to return some hardcoded index:


And it works!


But recompiling the loader each time we want to change the index is not a good idea. So, I made an improved version of the loader that allows you to view the chunk at the index supplied as the argument. Code of the loader:

See it in action:

This way, we can retrieve all the key pieces one by one. But still, we need to face those annoying MessageBoxes. Let’s replace them and redirect the message that was going to be displayed to our own function. This time we will be hooking a function linked dynamically – so, installation of the hook will be a bit different.

Hooking IAT with libPeConv

When libPeConv loads an executable, it resolves each imported function with the help of a specially dedicated class: peconv::t_function_resolver. The library also allows using non-standard resolvers instead of the default one. The only condition is that they have to inherit from this base class.

One of the resolvers that comes in the full package is hooking_func_resolver. It allows hooking the IAT. Basically, when it loads imports, it may substitute some of the imported functions with our own functions (the only condition is that they must have the same API). For now, this resolver supports replacing functions defined by names. So, for example, if we want to replace MessageBoxA by our own my_MessageBoxA:
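The idea of pluggable resolvers can be illustrated with a small, library-independent sketch. Everything below is hypothetical and only modeled on the description above: FunctionResolver, TableResolver and HookingResolver are made-up names and do not reflect the actual libPeConv classes or signatures. A custom resolver overrides selected function names and defers everything else to a default one.

```cpp
#include <map>
#include <string>

// Generic function pointer type standing in for the address of an import.
using FuncPtr = void (*)();

// Hypothetical base class: every resolver answers "where is lib!func?".
class FunctionResolver {
public:
    virtual ~FunctionResolver() {}
    virtual FuncPtr resolve(const std::string& lib, const std::string& func) = 0;
};

// Stand-in for a default resolver: a plain lookup table.
class TableResolver : public FunctionResolver {
public:
    void add(const std::string& lib, const std::string& func, FuncPtr ptr) {
        table_[lib + "!" + func] = ptr;
    }
    FuncPtr resolve(const std::string& lib, const std::string& func) override {
        auto it = table_.find(lib + "!" + func);
        return it == table_.end() ? nullptr : it->second;
    }
private:
    std::map<std::string, FuncPtr> table_;
};

// Resolver that substitutes selected functions (matched by name) and
// falls back to another resolver for everything else.
class HookingResolver : public FunctionResolver {
public:
    explicit HookingResolver(FunctionResolver& fallback) : fallback_(fallback) {}
    void add_hook(const std::string& func, FuncPtr replacement) {
        hooks_[func] = replacement;
    }
    FuncPtr resolve(const std::string& lib, const std::string& func) override {
        auto it = hooks_.find(func);
        if (it != hooks_.end()) return it->second;
        return fallback_.resolve(lib, func);
    }
private:
    FunctionResolver& fallback_;
    std::map<std::string, FuncPtr> hooks_;
};
```

A loader that fills the import table through such an interface does not need to know whether it received the original address or a replacement, which is what makes this kind of hooking transparent to the loaded module.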


For example, we can replace it by the following function:


Now, instead of displaying the MessageBox, with the character written in hex…


…it will display the same character as ASCII:


This is the code of the full loader:

We can use it along with a small batch script:

@echo off
set loopcount=0
:loop
peconv_hooked_msgbox_sol.exe payload.dll %loopcount%
set /a loopcount=loopcount+1
if %loopcount%==26 goto exitloop
goto loop
:exitloop

As the result we get the full flag printed and 0 annoying pop-ups:


In the final version, the key is composed by the application itself:


LibPeConv is my new project, still at a very early stage of development, so many things may change – but it has already proven that it can be useful in solving some challenges. It gives you the possibility not only to import code of other executables into your projects, but also to hook and modify it. I hope you will try it and have as much fun with it as I have developing it. I am looking forward to hearing some feedback from you!

All the binaries that were used in this demo are here – the password to the zip is “crackme”.


See also other approaches:



Import all the things! Solving FlareOn4 Challenge 3 with libPeConv

Recently I started making a small library for loading and converting PE files (libpeconv, available on my GitHub). The library is still at an early stage of development, so please don’t judge and don’t use it in any serious projects. The API may change anytime! However, I have so much fun developing and testing it that I wanted to share some of my experiments and ideas.

Some time ago I solved some of the FlareOn4 challenges, including challenge 3. At that time I didn’t have libpeconv yet, so I solved it by some other method. Now it came to my mind that with the help of my new library, solving it could be much faster and easier. In this post I will describe my alternative solution and some of the related experiments.

Tools used

For the static analysis:

  • IDA (demo version is enough)

For building the projects:

  • Visual Studio + CMake
  • Python27 (optional – for the helper script that I used in “Bonus”)


The challenge named greek_to_me.exe is a 32bit PE file. It has stripped relocations.


When we deploy it, it shows an empty console and waits. It is not reading any data from the standard input, so we can conclude that it is using some other way to read the password from the user.

We will start from some static analysis in IDA. The crackme has a very simple and clean structure, it is not obfuscated. We can see that at the beginning of the execution it creates a socket and waits for the input.

The socket listens at localhost on port 2222:


After getting the connection, it reads 4 bytes from the input into the buffer:


After it has read 4 bytes, it starts processing the input and uses it for decoding an encrypted buffer:


If the checksum is valid, it means the encrypted code was decrypted properly, and it is further executed.

As we can see, only 1 byte of the input is used for decoding the buffer, so we can easily brute-force it. The code responsible for decoding the buffer is also pretty simple:

const size_t encrypted_len = 0x79;
for (size_t i = 0; i < encrypted_len; i++) {
    BYTE val = encrypted_code[i];
    encrypted_code[i] = (unknown_byte ^ val) + 0x22;
}

The only part of the crackme that may be somewhat challenging is the checksum – this function is not that simple to reimplement. However, if we want to make a brute-forcer, we need to be able to calculate the checksum after every attempt.

In my previous solution, I just reimplemented the checksum – it worked but it was not so much fun 😉 . I also saw some other approaches, such as emulating the checksum function with the Unicorn engine, using the angr framework, or making a brute-forcer that talks to the original program via a socket. Can it be done even faster? Let’s see…
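Leaving the checksum aside for a moment, the overall shape of such a brute-forcer is simple. The sketch below uses the decoding transformation quoted above, but placeholder_checksum is NOT the crackme’s checksum: it is a stand-in invented for this example so that the loop can be shown in a self-contained way.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Same transformation as the decoding loop: (key XOR byte) + 0x22.
std::vector<std::uint8_t> decode(const std::vector<std::uint8_t>& enc, std::uint8_t key)
{
    std::vector<std::uint8_t> out(enc.size());
    for (std::size_t i = 0; i < enc.size(); ++i) {
        out[i] = static_cast<std::uint8_t>((key ^ enc[i]) + 0x22);
    }
    return out;
}

// Placeholder only: a simple rolling sum, NOT the crackme's checksum.
std::uint16_t placeholder_checksum(const std::vector<std::uint8_t>& data)
{
    std::uint16_t sum = 0;
    for (std::uint8_t b : data) sum = static_cast<std::uint16_t>(sum * 31 + b);
    return sum;
}

// Try all 256 candidate bytes; return the first one whose decoded buffer
// has the expected checksum, or -1 if none matches.
int brute_force(const std::vector<std::uint8_t>& enc, std::uint16_t expected)
{
    for (int key = 0; key <= 0xFF; ++key) {
        const std::vector<std::uint8_t> dec = decode(enc, static_cast<std::uint8_t>(key));
        if (placeholder_checksum(dec) == expected) return key;
    }
    return -1;
}
```

The search space is only 256 values, so the whole difficulty lies in having a faithful checksum function to plug in instead of the placeholder.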

LibPeConv comes into play

With PeConv we can convert any PE file from raw format to virtual and back. It also provides a custom PE loader – its goal is to make it possible to load any PE file into the current process (even if it is not a DLL and even if it does not have a relocations table – this will be explained in a later part). The loaded PE can later be used as a fully functional PE file that can run from inside the current process. We can also use any selected function from its code – all we need to know is the function’s RVA and the API.

In this case, I will use libpeconv to load the crackme and import from it the function calculating checksum. Also, rather than copying the encrypted buffer to my code, I will read it directly from the loaded PE.

Preparing the required information

Let’s take a look at the crackme again in IDA. We need to find the appropriate offsets and understand the API of the function that we are going to import.

The function calculating the checksum starts at RVA 0x11E6:


It takes 2 arguments: a pointer to the buffer and its size.

It returns a WORD type:


Summing up, we can define the function prototype as:

WORD calc_checksum(BYTE *decoded_buffer, size_t buf_size)

It is also worth noting that this function is self-contained and does not call any imported libraries – that makes importing it even easier (we are not forced to load any imports for the module or to apply relocations).

Another thing that we need is the encrypted buffer. It starts at RVA 0x107C and is 0x79 (121) bytes long:


That’s all! Let’s start coding.

Solving the crackme with libPeConv

The current version of libpeconv allows loading a PE file in two ways: by the function load_pe_module and by the function load_pe_executable. The second one, load_pe_executable, is a complete loader that loads the given PE into the current process in RWX memory, and automatically applies relocations and loads dependencies. The first one (load_pe_module) does not load the dependencies and also gives more control: we may load the PE file in non-executable memory, and applying the relocations is optional. More information (and *very* possible updates on the API) can be found here:

As we saw, the function that we want to import is self-contained, so it will do no harm if we load the crackme PE without imports and without relocations (to see it loaded as a fully functional PE, see the next part of the article). I will use the function load_pe_module:

BYTE* loaded_pe = (BYTE*)load_pe_module(
    path,   // path to the PE file
    v_size, // OUT: size of the loaded module
    true,   // executable
    false   // without relocations
);

Now, let’s import the function. First let’s make a pointer to it:

WORD (*calc_checksum) (BYTE *buffer, size_t buf_size) = NULL;

Calculate the absolute offset to the function within the loaded module:

ULONGLONG offset = DWORD(0x11e6) + (ULONGLONG) loaded_pe;

And filling the pointer:

calc_checksum = ( WORD (*) (BYTE *, size_t ) ) offset;

That’s it, now we can use the function in our application like any other function.

But before we can start brute-forcing, we also need to fill the pointer to the buffer:

g_Buffer = (uint8_t*) (0x107C + (ULONGLONG) loaded_pe);

This is the full brute-forcer that I prepared:
And it works  🙂  The value that we got is exactly what it was supposed to be:


But still, the found value is just a part of the solution, not the flag that we are searching for. As we know from the static analysis, if this value is correct, the chunk of code will be decrypted and executed. It would be cool to see exactly how that chunk of code looks when it is written into its place, don’t you think?

And also it is very easy to achieve. The PE file is loaded in the RWX memory inside the current process – so we can easily substitute the encrypted chunk of code with the decoded. Simple memcpy will do the job:

memcpy(g_Buffer, g_Buffer2, g_BufferLen);

Then, libPeConv will help us to convert the PE file back to the raw format, so that we can open it in IDA. We can do it with the help of pe_virtual_to_raw from libpeconv:

size_t out_size = 0;
BYTE* unmapped_module = pe_virtual_to_raw(
    loaded_pe, //pointer to the module
    v_size, //virtual size
    module_base, //in this case we need here
                 //the original module base, because
                 //the loaded PE was not relocated
    out_size //OUT: raw size of the unmapped PE
);

And this is the complete solution:

#include <stdio.h>
#include <string.h> // memcpy
#include "peconv.h"

BYTE *g_Buffer = NULL;
const size_t g_BufferLen = 0x79;
BYTE g_Buffer2[g_BufferLen] = { 0 };

WORD (*calc_checksum) (BYTE *decoded_buffer, size_t buf_size) = NULL;

// decode the buffer with the given value and verify the checksum
bool test_val(BYTE xor_val)
{
    for (size_t i = 0; i < g_BufferLen; i++) {
        BYTE val = g_Buffer[i];
        g_Buffer2[i] = (xor_val ^ val) + 0x22;
    }
    WORD checksum = calc_checksum(g_Buffer2, g_BufferLen);
    if (checksum == 0xfb5e) {
        return true;
    }
    return false;
}

BYTE brutforce()
{
    BYTE xor_val = 0;
    do {
        xor_val++;
    } while (!test_val(xor_val));
    return xor_val;
}

bool dump_to_file(char *out_path, BYTE* buffer, size_t buf_size)
{
    FILE *f1 = fopen(out_path, "wb");
    if (!f1) {
        return false;
    }
    fwrite(buffer, 1, buf_size, f1);
    fclose(f1);
    return true;
}

int main(int argc, char *argv[])
{
#ifdef _WIN64
    printf("Compile the loader as 32bit!\n");
    return 0;
#endif
    char default_path[] = "greek_to_me.exe";
    char *path = default_path;
    if (argc >= 2) {
        path = argv[1];
    }
    size_t v_size = 0;
    BYTE* loaded_pe = peconv::load_pe_module(path,
        v_size, // OUT: size of the loaded module
        true, // load as executable?
        false // apply relocations ?
    );
    if (!loaded_pe) {
        printf("Loading module failed!\n");
        return 0;
    }
    g_Buffer = (BYTE*) (0x107C + (ULONGLONG) loaded_pe);
    ULONGLONG func_offset = 0x11e6 + (ULONGLONG) loaded_pe;
    calc_checksum = ( WORD (*) (BYTE *, size_t ) ) func_offset;

    BYTE found = brutforce();
    printf("Found: %x\n", found);
    // overwrite the encrypted chunk with its decoded version:
    memcpy(g_Buffer, g_Buffer2, g_BufferLen);

    size_t out_size = 0;
    /* in this case we need to use the original module base, because
     * the loaded PE was not relocated */
    ULONGLONG module_base = peconv::get_image_base(loaded_pe);
    BYTE* unmapped_module = peconv::pe_virtual_to_raw(loaded_pe,
        v_size, // virtual size
        module_base, // the original module base
        out_size // OUT: size of the unmapped (raw) PE
    );
    if (unmapped_module) {
        char out_path[] = "modified_pe.exe";
        if (dump_to_file(out_path, unmapped_module, out_size)) {
            printf("Module dumped to: %s\n", out_path);
        }
        peconv::free_pe_buffer(unmapped_module, v_size);
    }
    peconv::free_pe_buffer(loaded_pe, v_size);
    return 0;
}

Comparing the dumped executable with the original one we can see that the buffer was overwritten:
So let’s see the modified exe in IDA:
And yes! At the known offset there is the flag revealed:

Bonus – loading and running a PE with stripped relocations

Ok, you may say – it was easy – the loaded function was self-contained, so we could as well rip it from the original file, without using any loaders. But what if the function calls several other functions within the given module, and also imported functions? Will the same trick work? And could it work even for a PE file without relocations?

To answer those questions I prepared another test case. Now, instead of loading one function, I will load and execute the full crackme from inside the brute-forcer.

First we will modify a few things. This time, instead of using load_pe_module I will use load_pe_executable – to load the full executable with dependencies.

BYTE* loaded_pe = (BYTE*)load_pe_executable(path, v_size);

The function will automatically detect that the PE file has no relocations, and enforce loading it at its original module base. Mind the fact that allocating memory at a specific base may not always work – so, sometimes it takes several runs to execute it properly. You must also make sure that the module base of the loader does not collide with the module base required by the payload (if the loader’s base is random it is good enough).

Once the PE file is loaded, we just need to get its Entry Point – and then we can call it like any other function*:

// Deploy the payload:
// read the Entry Point from the headers:
ULONGLONG ep_va = get_entry_point_rva(loaded_pe)
    + (ULONGLONG) loaded_pe;

//make pointer to the entry function:
int (*loaded_pe_entry)(void) = (int (*)(void)) ep_va;

//call the loaded PE's ep:
int ret = loaded_pe_entry();

* – but mind the fact that, depending on the payload’s implementation details, once you have redirected the execution to its entry point, it may just exit after finishing its job and never return to your code.

I am going to modify the brute-forcer code in such a way that this time, after finding the value, the original crackme will be run. This is the code of the full application:

#include <stdio.h>
#include "peconv.h"

BYTE *g_Buffer = NULL;
const size_t g_BufferLen = 0x79;
BYTE g_Buffer2[g_BufferLen] = { 0 };

WORD (*calc_checksum) (BYTE *decoded_buffer, size_t buf_size) = NULL;

// decode the buffer with the given value and verify the checksum
bool test_val(BYTE xor_val)
{
    for (size_t i = 0; i < g_BufferLen; i++) {
        BYTE val = g_Buffer[i];
        g_Buffer2[i] = (xor_val ^ val) + 0x22;
    }
    WORD checksum = calc_checksum(g_Buffer2, g_BufferLen);
    if (checksum == 0xfb5e) {
        return true;
    }
    return false;
}

BYTE brutforce()
{
    BYTE xor_val = 0;
    do {
        xor_val++;
    } while (!test_val(xor_val));
    return xor_val;
}

int main(int argc, char *argv[])
{
#ifdef _WIN64
    printf("Compile the loader as 32bit!\n");
    return 0;
#endif
    char default_path[] = "greek_to_me.exe";
    char *path = default_path;
    if (argc >= 2) {
        path = argv[1];
    }
    size_t v_size = 0;
    BYTE* loaded_pe = peconv::load_pe_executable(path, v_size);
    if (!loaded_pe) {
        printf("Loading module failed!\n");
        return 0;
    }
    g_Buffer = (BYTE*) (0x107C + (ULONGLONG) loaded_pe);
    ULONGLONG func_offset = 0x11e6 + (ULONGLONG) loaded_pe;
    calc_checksum = ( WORD (*) (BYTE *, size_t ) ) func_offset;

    BYTE found = brutforce();
    printf("Found: %x\n", found);

    // Deploy the payload!
    // read the Entry Point from the headers:
    ULONGLONG ep_va = peconv::get_entry_point_rva(loaded_pe) + (ULONGLONG) loaded_pe;
    //make pointer to the entry function:
    int (*loaded_pe_entry)(void) = (int (*)(void)) ep_va;
    //call the loaded PE's ep:
    printf("Calling the Entry Point of the loaded module:\n");
    int res = loaded_pe_entry();
    printf("Finished: %d\n", res);
    return 0;
}

To make sure that everything works fine (the deployed payload really creates the socket and responds in exactly the same way as the one deployed independently), I wrote a small Python script that will communicate with it and display the response:

import socket
import sys
import argparse

def main():
    parser = argparse.ArgumentParser(description="Send to the Crackme")
    parser.add_argument('--key', dest="key", default="0xa2", help="The value to be sent")
    args = parser.parse_args()
    # keep the value within a single byte
    my_key = int(args.key, 16) % 256
    print '[+] Checking the key: ' + hex(my_key)
    # the crackme reads 4 bytes, but only the first one is used
    key = chr(my_key) + '012'
    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # the crackme listens at localhost on port 2222
        s.connect(('127.0.0.1', 2222))
        s.send(key)
        result = s.recv(512)
        if result is not None:
            print "[+] Response: " + result
        s.close()
    except socket.error:
        print "Could not connect to the socket. Is the crackme running?"

if __name__ == "__main__":
    sys.exit(main())

And now, let’s see it all in action:

This is all that I prepared for today, I hope you enjoyed it! The lib is now under rapid development, so many things will get refactored and improved. Stay tuned!
The binaries of all the presented loaders, along with the crackme, are available here – the password to the zip is: crackme


See other approaches to solve the same crackme:


Solving the Shabak’s Airplane challenge – Task 3

Some time ago I solved the Airplane challenge published by Israeli Shin-Bet (Shabak). The crackme has three levels of increasing difficulty. Each one is a 32 bit Windows application. It was a very pleasant task, not difficult but also not too trivial. In this writeup I will present my solutions.

Task 1 and 2 have been described in the previous part, you can read it here. Now it’s time for the final one!

Task 3

Mirror [task3], password “Challenge”

This time the crackme comes with a hint:

Maybe this program doesn't do more than it seems, our special agent 
have told us that when the program was executed in a different 
country, it behaved differently

It reminds me of the techniques used by some malware to target only the chosen countries. Usually it is implemented in one of the two ways:

  1. sending a request to a service that returns geolocation data based on the external IP
  2. checking the installed language/keyboard layout

Let’s run the crackme and observe how it behaves, whether it makes any internet connections etc. (we can use e.g. ProcMon).

The crackme printed a message: “May you enter Deep and Dreamless Slumber”:

…and terminated after some timeout. No internet connection has been made. So, I guess it will do something about checking the installed language.

This time I will start from the static analysis in IDA. Let’s load the application and have a look at the referenced functions:

There is GetLocaleInfoEx. I suspect it will be involved in the verification process, so let’s follow where it is called:

The output is saved in a variable of WORD size. If I try to follow this variable and check the references, I don’t find anything more than the above line:

However, its upper byte seems to be referenced somewhere else!

It seems that if this flag matches, some other function is copied in place of the function printing the initial “slumber” message:


So, we have some self-modifying code here 🙂 ! Let’s see what this function is doing:


fs:30h -> PEB
Ldr + 0x14 -> _LIST_ENTRY InMemoryOrderModuleList

It doesn’t seem to be a function printing the password. Instead, it searches for Kernel32.dll among the loaded DLLs:


We can also see an atypical NOP instruction, that can confuse some debuggers:


OllyDbg and its derivatives fail to parse it properly:


If we want to analyze it under OllyDbg we need to substitute this fragment by a typical NOP (0x90) in order to get a clear view:


Now I will do some dynamic analysis in OllyDbg. I set the breakpoint on the flag check (the one that was deciding whether or not to overwrite the function):


…and when it was hit, I changed the Z flag in the registers. After the function was overwritten, I forced OllyDbg to re-analyze the code and then set a breakpoint at the function’s beginning:


When the breakpoint is hit, we can step through the function’s execution to see in detail what it is doing.

At the end there is something interesting – a new PE file in the memory:


We can see that the previously stored pointer to kernel32.dll is being overwritten by the pointer to this module:


I dumped this PE and unmapped it using pe_unmapper in order to get a better view. It is named stub.dll and it exports one function: GetComputerNameW:


After following the references in IDA, we can find that this module was unpacked just before the flag check:


It was manually loaded in the memory (without being dropped on the disk and without using LoadLibrary function). This trick is also very often used in malware.

Anyways, now we need to find out where the stub.dll is used. So, I set the breakpoint on this module:


The breakpoint is hit inside ntdll:


Now I set the breakpoint on the .text section of the main module (Third.exe) to see the point where the execution returns:


This is where the stub.dll was referenced from inside the main module:


So, at this point the application gets the address of the function: “GetComputerNameW”. It can fetch this function either from kernel32.dll (if the locale flag was not set) or from the stub.dll (if the locale flag was set).

It seems we are pretty close to the solution, because some formatted printing (“%s”) is done just after those lines (probably this is the flag being printed). Most probably the key lies inside the function GetComputerNameW, so let’s go there.


Inside GetComputerNameW:


Again, OllyDbg cannot parse some of the instructions. So, I opened the dumped version of stub.dll in IDA and used it as a reference.

This is how the beginning of the function looks:


fs:30h -> PEB
PEB + 0x10 -> _RTL_USER_PROCESS_PARAMETERS ProcessParameters
ProcessParameters + 0x44 -> _UNICODE_STRING CommandLine.Buffer

The function fetches the command line of the main process, and then processes the buffer.

Again, the atypical NOP instructions have been used (marked red on the picture):


We can see two buffers being compared. One of them is the command line buffer stored in memory, and the other is hardcoded in stub.dll. Four consecutive DWORDs are compared. If the two buffers do not match, the function sets an error code and exits:


The hardcoded buffer starts at RVA 0x2000 (that is the beginning of the .rdata section).
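
As a rough illustration of the check described above (not code taken from stub.dll), the comparison of four consecutive DWORDs could be sketched in Python like this. The function name `dwords_match` and the idea of passing both buffers as byte strings are assumptions made only for this example.

```python
import struct

def dwords_match(cmdline_buf: bytes, hardcoded_buf: bytes, count: int = 4) -> bool:
    """Compare `count` consecutive little-endian DWORDs from both buffers."""
    size = count * 4
    if len(cmdline_buf) < size or len(hardcoded_buf) < size:
        return False  # not enough data to compare
    fmt = "<%dI" % count
    ours = struct.unpack_from(fmt, cmdline_buf)
    expected = struct.unpack_from(fmt, hardcoded_buf)
    # any mismatching DWORD corresponds to the "set error code and exit" path
    return all(a == b for a, b in zip(ours, expected))
```

Only the first 16 bytes matter here, which is why overwriting that part of the command line buffer in memory is enough to pass the check.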


Of course we need the above function to exit without error – then our flag will be printed.

It is easy to conclude that, in order to get the flag, we must have the same values in the memory buffer as in the hardcoded buffer.

I set a breakpoint before this comparison started. Then, I just copied the hardcoded buffer and overwrote the buffer in memory with its content:


Now, let it run. And this is what we get:



We reached the “Airplane Complete” page.

R. Sanchez is safe, happy end! 🙂


Solving the Shabak’s Airplane challenge – Tasks 1 and 2

Some time ago I solved the Airplane challenge published by Israeli Shin-Bet (Shabak). The crackme has three levels of increasing difficulty. Each one is a 32 bit Windows application. It was a very pleasant task, not difficult but also not too trivial. In this writeup I will present my solutions.

The story is about saving Shabak’s operative, R. Sanchez, from prison 😀 So, let’s go!

Task 1
Mirror: [task1], password “Challenge”

When we run the application, it just exits and nothing happens. So, I opened it under a debugger (OllyDbg). I started the analysis from viewing the referenced strings, and I noticed something potentially interesting:

%PROGRAMFILES%\\meseeker inc

It seems to be some custom file referenced from inside the code. Let’s go to this point of code, and see how it is used:

Indeed, the program is searching for this file and checking its attributes. If they match the required ones, some output is printed, which is probably our password.

Now we can solve it in two ways: either create the file with the proper attributes, or influence the execution so that it prints the password no matter what. I have chosen the second way: setting a breakpoint on each condition and, when it is hit, changing the flag in order to emulate the appropriate condition being met.

So, indeed it resulted in printing the password:


Task 2
Mirror: [task2], password “Challenge”

In contrast to the previous one, Task 2 prompts for the password:

Let’s enter whatever and see what happens:

It dropped a file “GettingSchwifty.bat” and tried to load it. It turned out not to be a valid PE, so the error occurred.

It seems the password that we typed was supposed to decrypt this PE file (name .bat is just a disguise). Let’s take a look at the dropped file:


As we can see, it has some regular patterns inside. It made me think that it may be XOR encrypted. So, I tried to XOR it with some valid PE file, to see if it reveals the password (I used my Python script):

./ --file GettingSchwifty.bat --keyfile Second.exe

When we view the output by a hexeditor, we can see the repeating pattern at the beginning:

This may be our password, so let’s try. I copied this fragment, saved it as a key.bin and then tried dexor again:

./ --file GettingSchwifty.bat --keyfile key.bin
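
The idea behind both commands can be sketched in a few lines of Python. This is only an illustration of the known-plaintext trick, not the script used above: the helper names `xor_bytes` and `guess_key` are made up for this example, and it assumes that the known PE file agrees with the encrypted one at the beginning (the headers of PE files are largely similar).

```python
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR `data` with `key`, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

def guess_key(keystream: bytes, max_len: int = 64) -> bytes:
    """Return the shortest prefix that, when repeated, reproduces the keystream."""
    for n in range(1, min(max_len, len(keystream)) + 1):
        candidate = keystream[:n]
        # if the candidate is the repeating unit, XOR-ing gives only zeros
        if xor_bytes(keystream, candidate) == bytes(len(keystream)):
            return candidate
    return keystream

# encrypted XOR known_plaintext = keystream (the key repeated over and over)
known = b"MZ" + bytes(range(62))
encrypted = xor_bytes(known, b"secret")
keystream = xor_bytes(encrypted, known)
print(guess_key(keystream))
```

Once the repeating unit is known, applying `xor_bytes` to the whole encrypted file with that key gives back the original content.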

And hurray, the output is a valid PE file: a DLL named Piper.dll:

Since I already have the DLL, I don’t really care what password allowed to decrypt it. I will just run the main executable (Second.exe) under the debugger, set a breakpoint before GettingSchwifty.bat is loaded, and replace it with my version.

When the breakpoint before LoadLibraryA is hit, I delete the dropped GettingSchwifty.bat and copy my decrypted DLL in its place.

It got loaded properly, so now we can enter the function inside the DLL:

But it’s not over yet. One more password is required before we get our flag printed. The application asks a question over a pipe “flumbus_channel” and we are supposed to answer it:

After a brief analysis I concluded that brute force is not the solution, so we must approach it some other way. By some googling around I found the answer to the asked question: “What is cooler than being cool?”.


The answer is: “Ice cold”! Pretty obvious, isn’t it? 😉 But is it what the application wanted us to say? Let’s pass the input and check. I want a fast solution, so instead of writing a client that will talk over the pipe, I will just edit the buffer in the memory. Let’s set a breakpoint on the call to ReadFile and follow the buffer in dump:

After the ReadFile returned, we can edit this buffer in order to emulate the input being read:

The password is converted to uppercase, then it is used to decrypt the output buffer. The checksum of the decrypted buffer is calculated and compared with the hardcoded one: 0x55B8B000

It seems the password “ice cold” was the right one, the checksum matches! The output buffer got decrypted and by following it in dump we can already see the second flag:

However, displaying it nicely on the screen requires more effort: there are some debug checks that cause the application to exit:

I just patched the conditions above, so that the antidebug measures can not be taken:

And we get the password printed:

Another level cleared!

That’s how I reached the Task 3! This one will be a bit longer, so I am going to describe it in a second writeup.


Solution of the Task 3:


Starting with Windows Kernel Exploitation – part 3 – stealing the Access Token

Recently I started learning Windows Kernel Exploitation, so I decided to share some of my notes in form of a blog.

In the previous parts I showed how to set up the environment. Now we will get familiar with the payloads used for privilege escalation.

What I use for this part:

  • The environment described in the previous parts [1] and [2]
  • nasm
  • HxD

Just to recall, we are dealing with a vulnerable driver, to which we are supplying a buffer from the userland. In the previous part we managed to trigger some crashes, by supplying a malformed input. But the goal is to prepare the input in such a way, that instead of crashing the execution will be smoothly redirected into our code.

Very often, the passed payload is used to escalate privileges of the attacker’s application. It can be achieved by stealing the Access Token of the application with higher privileges.

Viewing the Access Token

Every process running on the system has its EPROCESS structure that encapsulates all the data related to it. You can see the full definition e.g. here. (The EPROCESS structure has some slight differences from one version of Windows to another – read more). Some members of EPROCESS, such as PEB (Process Environment Block), are accessible from the user mode. Others, such as the mentioned Access Token, are accessible only from the kernel mode. We can see all the fields of EPROCESS using WinDbg:



As we can see, the field Token has an offset 0xF8 from the beginning of the structure.

Let’s display the details of the type containing the token:

dt nt!_EX_FAST_REF


The token is stored in a union _EX_FAST_REF, having two fields: RefCnt (reference counter) and Value. We are interested in replacing the Value only. The reference counter should better stay untouched for the sake of application stability.

Now, let’s have a look at tokens of some applications running on the Debuggee machine. We can list the processes using:




The first column shown is an address of EPROCESS structure corresponding to the particular process.

Now, using the displayed addresses, we can find more details about chosen processes.

!process [address of EPROCESS]

We can notice the Access Token among the displayed fields:


We can also display the token in more low-level ways:

dt nt!_EX_FAST_REF [address of EPROCESS] + [offset to the Token field]



dd [address of EPROCESS] + [offset to the Token field]


As we can conclude from the above, the function !process automatically applied the mask and filtered out the reference counter from the displayed information. We can do the same thing manually, applying a mask that clears the last 3 bits, with the help of the expression evaluator:

?[token] & 0xFFFFFFF8
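
To make the masking more tangible, here is a small Python sketch that splits a raw value of the Token field into the pointer part and the reference counter. It assumes a 32 bit system, where the counter occupies the lowest 3 bits; the function name and the sample value are hypothetical.

```python
POINTER_MASK = 0xFFFFFFF8  # clears the lowest 3 bits
REFCNT_MASK = 0x00000007   # keeps only the lowest 3 bits

def split_fast_ref(raw: int):
    """Split a raw _EX_FAST_REF value into (token pointer, reference counter)."""
    return raw & POINTER_MASK, raw & REFCNT_MASK

token, refcnt = split_fast_ref(0x8A201275)  # made-up raw value
print(hex(token), refcnt)
```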


Stealing the Access Token via WinDbg

As an exercise, we will run cmd.exe on the Debuggee machine and elevate its privileges from the Debugger machine, using WinDbg. See the video:

First, I am listing all the processes. Then, I am displaying the Access Tokens of the chosen processes: System and cmd. I copied the Access Token of System into cmd, applying appropriate masks in order to preserve the reference counter. As a result, cmd.exe got elevated.
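
The arithmetic behind this manual swap can be written down as a short Python sketch. It is only a model of the calculation on 32 bit values (the function name and the sample numbers are invented for the example): the pointer bits come from the System token, while the lowest 3 bits of the target's field stay as they were.

```python
def merge_token(target_raw: int, system_raw: int) -> int:
    """Build the new Token field: System's pointer bits + the target's own RefCnt."""
    pointer_bits = system_raw & 0xFFFFFFF8  # token pointer of System
    refcnt_bits = target_raw & 0x7          # reference counter of the target
    return pointer_bits | refcnt_bits

# made-up raw values of the Token fields of cmd.exe and System
new_value = merge_token(0x9A4F1A35, 0x8A201273)
print(hex(new_value))
```

The resulting value is what would be written into the Token field of cmd.exe.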

The token-stealing payload

Now we have to replicate this behavior via injected code. Of course it is not gonna be as easy, because we will be no longer aided by WinDbg.

Some well documented examples of the token-stealing payloads are provided as a part of Exploit code in the official HEVD repository:

The purpose of all the included payloads is the same: stealing the Access Token. However, they come in slightly different variants, appropriate for particular vulnerabilities. Most of their code is identical, only the ending differs (commented as “Kernel Recovery Stub“). It is the code used to make all the necessary cleanups, so that the application will not crash while returning after the payload execution.

Anyways, let’s take a look at the generic one:

First of all, we have to find the beginning of the EPROCESS structure. With WinDbg no effort was required to do this: it was just displayed by the command. Now, we need to find the beginning of this structure on our own, navigating through some other fields.

As a starting point, we will use the KPCR (Kernel Processor Control Region) structure, which is pointed to by the FS register on 32 bit versions of Windows (and by GS on 64 bit).

The code presented above takes advantage of the relationship between the following structures:

KPCR (PrcbData) -> KPRCB (CurrentThread) -> KTHREAD (ApcState) -> KAPC_STATE (Process) -> KPROCESS

KPROCESS is the first field of the EPROCESS structure, so, by finding it we ultimately found the beginning of EPROCESS:

When the EPROCESS of the current process has been found, we will use its other fields to find the EPROCESS of the SYSTEM process.

LIST_ENTRY is an element of a doubly linked list, connecting all the running processes:

The field Flink points to the LIST_ENTRY field of the next process. So, by navigating there and subtracting the field’s offset, we get a pointer to the EPROCESS structure of another process.
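
The walk through the list can be modelled in Python on a fake memory, just to show the pointer arithmetic. Everything here is a simulation: the memory is a plain dictionary mapping addresses to DWORDs, and the offsets 0xB4 (UniqueProcessId) and 0xB8 (ActiveProcessLinks) are assumed values typical for 32 bit Windows 7, so they should be verified with WinDbg on the target system.

```python
PID_OFFSET = 0xB4    # assumed offset of UniqueProcessId (Windows 7 x86)
LINKS_OFFSET = 0xB8  # assumed offset of ActiveProcessLinks (Windows 7 x86)
SYSTEM_PID = 4       # PID of the System process

def find_system_eprocess(memory: dict, current_eprocess: int):
    """Follow the Flink pointers until the EPROCESS with PID 4 is found."""
    eprocess = current_eprocess
    while True:
        if memory[eprocess + PID_OFFSET] == SYSTEM_PID:
            return eprocess
        flink = memory[eprocess + LINKS_OFFSET]  # address of the next LIST_ENTRY
        eprocess = flink - LINKS_OFFSET          # back to the start of that EPROCESS
        if eprocess == current_eprocess:
            return None  # the whole list was walked without success

# two fake processes linked in a ring: PID 1234 at 0x1000 and System at 0x2000
memory = {
    0x1000 + PID_OFFSET: 1234, 0x1000 + LINKS_OFFSET: 0x2000 + LINKS_OFFSET,
    0x2000 + PID_OFFSET: 4,    0x2000 + LINKS_OFFSET: 0x1000 + LINKS_OFFSET,
}
print(hex(find_system_eprocess(memory, 0x1000)))
```

The key step is the subtraction: Flink points into the middle of the next EPROCESS, so the offset of the list field has to be removed to get back to the beginning of the structure.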

Now, we need to get the PID value (UniqueProcessId) and compare it with the PID typical for the System process:

This is the corresponding code fragment in the exploit:

Once we have EPROCESS of the System as well as EPROCESS of our process, we can copy the token from one to another. In the presented code reference counter was not preserved:

When we look for the offsets of particular fields, WinDbg comes in very handy. We can display commented structures by the following command:

dt nt!

For example:

dt nt!_KPCR

dt nt!_KPRCB

0x120 + 0x004 = 0x124

That gives the mentioned offset:

Writing the payload

We can write the code of the payload in inline assembler (embedded inside the C/C++ code), as demonstrated in the HEVD exploit:

However, in such a case our code will be wrapped by the compiler. As we can see, some additional prolog and epilog were added:


That’s why we have to remove the additional DWORDs from the stack before we return, by adding 3*sizeof(DWORD) = 12 (0xC) to the stack pointer (ESP):


If we want to avoid the hassle, we can declare our function as naked (read more here). It can be done by adding a special declaration before the function, e.g.:

__declspec(naked) VOID TokenStealingPayloadWin7()

Another option is to compile the assembler code externally, e.g. using NASM. Then, we can export the compiled buffer, e.g. to a hexadecimal string.

As an exercise, we will also add a slight modification to the above payload, so that it preserves the reference counter:



nasm.exe shellc.asm

Then, we can open the result in a hexeditor and copy the bytes. Some of the hexeditors (e.g. HxD) even support copying the data as an array appropriate for a specific programming language:
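
If no such hexeditor is at hand, the same conversion can be done with a few lines of Python. This is just a convenience sketch: the function name `to_c_array`, the array name and the sample bytes are arbitrary and do not represent the real payload.

```python
def to_c_array(data: bytes, name: str = "shellcode", per_line: int = 8) -> str:
    """Format raw bytes as a C array definition."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("    " + ", ".join("0x%02X" % b for b in chunk))
    body = ",\n".join(lines)
    return "unsigned char %s[%d] = {\n%s\n};" % (name, len(data), body)

# a few placeholder bytes instead of the real compiled buffer
print(to_c_array(b"\x60\x31\xc0", "payload"))
```

In practice the input would be the content of the file produced by NASM, read in binary mode.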

You can see both variants of the payload (the inline and the shellcode) demonstrated in my StackOverflow exploit for HEVD:


See it in action:

Details about exploiting this vulnerability will be described in the next part. See also writeups by Osanda and Sam added in the appendix.

Appendix

  • Osanda Malith on Stack Overflow
  • Sam Brown on Stack Overflow
  • a handy set of commonly used WinDbg commands


Starting with Windows Kernel Exploitation – part 2 – getting familiar with HackSys Extreme Vulnerable Driver

Recently I started learning Windows Kernel Exploitation, so I decided to share some of my notes in form of a blog.

The previous part was about setting up the lab. Now, we will play a bit with HackSysExtremeVulnerableDriver by Ashfaq Ansari in order to get comfortable with it. In the next parts I am planning to walk through the demonstrated vulnerabilities and exploitation techniques.

What I use for this part:

Installing and testing HEVD

First, I will show how to install HEVD. We will configure the Debuggee and the Debugger in order to see the Debug Strings and HEVD’s symbols. We will also play a bit with the dedicated exploits. You can see the video and read the explanations below:

Watching the DebugStrings

HEVD and the dedicated exploits print a lot of information as DebugStrings. We can watch them from the Debugger machine (using WinDbg) as well as from the Debuggee machine (using DebugView).

Before installing HEVD, we will set up everything in order to see the strings that are being printed during driver’s initialization.

On the Debugger:

We need to break the execution of the Debugee in order to get the kd prompt (in WinDbg: Debug -> Break). Then, we enable printing Debug Strings via command:

ed nt!Kd_Default_Mask 8

After that, we can let the Debugee run further by executing the command:


Warning: Enabling this slows down the Debugee. So, whenever possible, try to watch DebugStrings locally (on the Debugee only).

On the Debugee:

We need to run DebugView as Administrator. Then we choose from the menu:

Capture -> Capture Kernel


Installing the driver

First, we will download the pre-built package (driver+exploit) on the Debuggee (the victim machine), install it and test it. We can find it on the GitHub of HackSysTeam, in the releases section. The package contains two versions of the driver: vulnerable and not vulnerable. We will pick the vulnerable one, built for 32 bit (i386).


We choose Service Start as Automatic. Then we click: [Register Service] and when it succeeded: [Start Service].

For driver installation I used OSR Driver Loader, that is a very convenient manager. But alternatively, you can do the installation from commandline, using:

sc create [service name] type=kernel binpath=[driver path]
sc start [service name]

If the installation succeeded, we should see the HEVD banner printed on WinDbg (on the Debugger machine) as well as on DbgView on Debugee Machine.

Adding symbols

The precompiled package of HEVD comes with symbols (pdb file) that we can also add to our Debugger. First, let’s stop the Debuggee by sending it a break signal, and have a look at all the loaded modules.


To find the HEVD module, we can set a filter:

lm m H*

We will see, that it does not have any symbols attached. Well, it can be easily fixed. First, turn on:


– in order to print all the information about the paths to which WinDbg referred in search for the symbol. Then, try to reload the symbols:


…and try to refer to them again. You will see the path, where we can copy the pdb file. After moving the pdb file to the appropriate location on the Debugger machine, reload the symbols again. You can test them by trying to print all the functions from HEVD:

x HEVD!*

(See the details on the Video#1)

Testing the exploits

The same package contains also a set of the dedicated exploits. We can run each of them by executing an appropriate command. Let’s try to deploy some of them and set cmd.exe as a program to be executed.


Pool Overflow Exploit deployed:


If the exploitation was successful, the requested application (cmd.exe) will be deployed with elevated privileges.

By the command


we can confirm, that it is really run elevated:


At the same time, we can see on our Debugger machine the Debug Strings printed by the exploit:


All of the exploits except the double fetch should run well on one core. If we want the double fetch exploit to work, we need to enable two cores on the Debuggee machine.

WARNING: Some of the exploits are not 100% reliable and we can encounter a system crash after deploying them. Don’t worry, this is normal.

Hi driver, let’s talk!

Just like in the case of the user land, exploitation in the kernel land begins with finding the points where we can supply an input to the program. Then, we need to find an input that can corrupt the execution (in contrast to the user land, a crash in the kernel land will directly result in a blue screen!). Finally, we will try to craft the input in a way that lets us control the execution of the vulnerable program.

In order to communicate with a driver from user mode we will be sending it IOCTLs – Input-Output controls. The IOCTL allows us to send from the user land some input buffer to the driver. This is the point from which we can attempt the exploitation.

HEVD contains demos of various classes of vulnerabilities. Each of them can be triggered using a different IOCTL and exploited by the supplied buffer. Some (but not all) will cause our system to crash when triggered.

Finding Device name & IOCTLs

Before we try to communicate with a driver, we need to know two things:

  1. the device that the driver creates (if it doesn’t create any, we will not be able to communicate)
  2. list of IOCTLs (Input-Output Controls) that the driver accepts

HEVD is open-source, so we can read all the necessary data directly from the source code. In real life, most of the time we will have to reverse the driver in order to get it.

Let’s have a look at the fragment of code where HEVD creates a device.

The name of the device is mentioned above.

Now, let’s find the list of IOCTLs. We will start by looking at the array of IRPs:

The function linked to IRP_MJ_DEVICE_CONTROL will be dispatching the IOCTLs sent to the driver. So, we need to take a look inside this function.

It contains a switch that calls the handler function appropriate for a particular IOCTL. We can grab our list of IOCTLs by copying the switch cases. The values of the constants are defined in a header:

Writing a client application

OK, we have all the necessary data to communicate with the driver from our own program. We can put it all together in a header file, e.g. hevd_constants.h:

#pragma once
#include <windows.h>
const char kDevName[] = "\\\\.\\HackSysExtremeVulnerableDriver";


Number of each IOCTL is created by a macro defined in a standard windows header winioctl.h:

If you include the windows.h header, the above macro will be added automatically. For now, we do not need to bother about the meaning of the particular constants. We will just use the defined elements as they are.
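
For the curious, the way this macro packs its four arguments into one 32 bit number can be reproduced in Python. The bit layout below follows the CTL_CODE definition from winioctl.h; the function number 0x800 used in the example is an assumption (it is the value I know from HEVD's stack overflow IOCTL, but check it against the header of your version).

```python
FILE_DEVICE_UNKNOWN = 0x22
METHOD_NEITHER = 3
FILE_ANY_ACCESS = 0

def ctl_code(device_type: int, function: int, method: int, access: int) -> int:
    """Python equivalent of the CTL_CODE macro from winioctl.h."""
    return (device_type << 16) | (access << 14) | (function << 2) | method

# assumed parameters of the stack overflow IOCTL
code = ctl_code(FILE_DEVICE_UNKNOWN, 0x800, METHOD_NEITHER, FILE_ANY_ACCESS)
print(hex(code))
```

Being able to compute the number by hand is useful when an IOCTL shows up in a debugger or in a disassembly as a raw value rather than as a named constant.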

So, we are ready to write a simple user land application that will talk to the driver. First, we open the device using function CreateFile. Then, we can send the IOCTL using DeviceIoControl.

Below you can see a tiny example. This application sends the STACK_OVERFLOW IOCTL to the driver: send_ioctl.cpp

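#include <stdio.h>
#include <windows.h>
//IOCTL code of the stack overflow demo, as defined in HEVD's header:
#define HACKSYS_EVD_IOCTL_STACK_OVERFLOW CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_NEITHER, FILE_ANY_ACCESS)
const char kDevName[] = "\\\\.\\HackSysExtremeVulnerableDriver";

HANDLE open_device(const char* device_name)
{
    HANDLE device = CreateFileA(device_name,
        GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    return device;
}

void close_device(HANDLE device)
{
    CloseHandle(device);
}

BOOL send_ioctl(HANDLE device, DWORD ioctl_code)
{
    //prepare input buffer:
    DWORD bufSize = 0x4;
    BYTE* inBuffer = (BYTE*) HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, bufSize);
    //fill the buffer with some content:
    RtlFillMemory(inBuffer, bufSize, 'A');
    DWORD size_returned = 0;
    BOOL is_ok = DeviceIoControl(device,
        ioctl_code,
        inBuffer, bufSize,
        NULL, //outBuffer -> None
        0, //outBuffer size -> 0
        &size_returned, NULL);
    //release the input buffer:
    HeapFree(GetProcessHeap(), 0, (LPVOID)inBuffer);
    return is_ok;
}

int main()
{
    HANDLE dev = open_device(kDevName);
    if (dev == INVALID_HANDLE_VALUE) return -1; //the device could not be opened
    send_ioctl(dev, HACKSYS_EVD_IOCTL_STACK_OVERFLOW);
    close_device(dev);
    return 0;
}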


Try to compile this program and deploy it on the Debugee machine. Start the DebugView and observe DebugStrings printed by the driver.

If you enabled printing DebugStrings on the Debugger machine, you should see similar output:

As we can see, the driver got our input and reported about it.

Exercise: let’s have a crash!

As an exercise, I created a small client for HEVD, that allows to send it various IOCTLs with the input buffer of the requested length. You can find the source code here:

..and the compiled 32 bit binary here.

Try to play with various IOCTLs, till you get the crash. Because the Debugee runs under the control of the Debugger, you should not get a blue screen – instead, WinDbg will get triggered. Try to make a brief crash analysis for every case. Start from printing the information by:

!analyze -v

Some other helpful commands:

k - stack trace
kb - stack trace with parameters
r - registers
dd [address]- display data as DWORD starting from the address

For more, check the WinDbg help file:


In our sample application, the user buffer is filled with “A” (ASCII 0x41):

RtlFillMemory(inBuffer, bufSize, 'A');

So, wherever we see it in the crash analysis, it means the particular data can be filled by the user.
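
When there are many values to check, a tiny helper can do the spotting for us. The Python sketch below takes register values (typed in by hand from the WinDbg output) and reports which of them consist only of our fill byte. The function name and the register values are hypothetical and serve only as an illustration.

```python
def user_controlled(registers: dict, fill: bytes = b"A") -> list:
    """Return the names of registers whose DWORD value is the fill byte repeated."""
    pattern = int.from_bytes(fill * 4, "little")  # b"AAAA" -> 0x41414141
    return sorted(name for name, value in registers.items() if value == pattern)

# made-up register dump from a crash
regs = {"eip": 0x41414141, "esp": 0x8F5B3A0C, "ebp": 0x41414141}
print(user_controlled(regs))
```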

Example #1

Example #2

Mind the fact that triggering the same vulnerability can give you a different output, depending on the immediate source of the crash, which is related e.g. to the size of the overflow, the current layout of the memory, etc.

Part 3:



Starting with Windows Kernel Exploitation – part 1 – setting up the lab

Recently I started learning Windows Kernel Exploitation, so I decided to share some of my notes in form of a blog.

This part will be about setting up the lab. In further parts I am planning to describe how to do some of the exercises from HackSysExtremeVulnerableDriver by Ashfaq Ansari.

I hope someone will find this useful!

What I use for this part:

  • Kali Linux – as a host system (you can use anything you like)
  • VirtualBox
  • 2 Virtual Machines: Windows 7 32 bit (with VirtualBox Guest Additions installed) – one will be used as a Debugger and another as a Debugee
  • WinDbg (you can find it in Windows SDK)

When we do userland debugging, we can have a debugger and a debuggee on the same machine. In case of kernel debugging it is no longer possible: we need full control over the debuggee operating system. Also, when we interrupt the execution, the whole operating system will freeze. That’s why we need two virtual machines with separate roles.

Setting up the Debugger

The Debugger is the machine from which we will be watching the Debuggee. That’s why we need to install WinDbg there, along with the symbols that will allow us to interpret system structures.

In order to install WinDbg we need to download Windows SDK (depending on the version of Windows, sometimes we will also need to install some required updates).

It is important to choose Debugging Tools from the installer options:


Once we have WinDbg installed, we should add Symbols. In order to do this, we just need to add an environment variable, to which WinDbg will automatically refer:


… and fill it with the link from where it can download symbols.

Full variable content may look like this (downloaded symbols will be stored in C:\Symbols):


Setting up the Debugee

We need to enable the Debuggee to let it be controlled from outside. In order to do this, we add one more option to the boot menu: if we start the machine with this configuration, it is enabled for debugging.
We need to use the tool bcdedit. First we copy the current settings into a new entry, titled e.g. “Debug me”:

bcdedit /copy {current} /d "Debug me"

It gives us in return a GUID of the new entry. We need to copy it and use it to enable debugging on this entry:

bcdedit /debug {MY_GUID} on

At the end we can see the settings where the debugging interface will be available:

bcdedit /dbgsettings

Setting up the connection between the Debugger and the Debuggee

The Debugger and the Debuggee will be communicating via the serial port COM1, which will be emulated in the host system by a named pipe. It is very simple to configure: we just have to make sure that the Debugger and the Debuggee have the same pipe name set. The Debugger will be creating the pipe, while the Debuggee will be connecting to the existing one (that’s why we always have to run the Debugger first):

I use Linux as my host system, so I chose as a pipe name:


Note that if you are using Windows as your host system, your pipe name will have to follow different convention. Example:


Read more:

Testing the connection

We have everything set up, now we just need to test if it works correctly! Let’s start the Debugger first, run WinDbg, and make it wait for the connection from the Debugee. Example:

File->Kernel Debug


We are choosing COM as an interface:


Then we will run the Debuggee machine, and when we see that it got connected to the pipe, we will send it an interrupt. Example:

The Debugee is connected to the pipe:


Now we can interrupt it, clicking Debug->Break:


If we get the kd prompt, it means we are in control of the Debugee:


See the full process on the video:

The Debuggee is frozen, waiting for instructions from the Debugger. By the ‘g’ command we can release the Debuggee and let it run further:


Part 2:


Hijacking extensions handlers as a malware persistence method

Recently I gave a presentation titled “Wicked malware persistence methods” (read more here). After releasing the slides I got questions about some of the demonstrated methods – especially about the details of extension handler hijacking – so, I decided to explain it in a blog post.

As an introduction, you can see a video demonstrating how it looks in action:

The demo app is open source and you can find it on my github:

Basically, the goal to achieve was to deploy a malware each time when the user clicks a file with some defined extension – in a way that no change in the default behavior will be noticed. For the demonstration purpose, instead of a malware I used simply calc.exe 😉

How the extension handling works?

On Windows, extensions that are known by the operating system are defined in the registry. For example, we have an .html extension:

Each of the defined extensions may be connected with some handler, that is also defined in the registry. In my case, .html files are handled by FirefoxHTML:

The handler has various features defined – but the most important is command:

The command defines what action has to be taken when a file with the particular extension is clicked.
Let’s take a closer look at the above command:

C:\Program Files\Mozilla Firefox\firefox.exe -osint -url "%1"

As we can see, it runs firefox.exe with some parameters – one of them (%1) is the name of the file that was clicked. Thanks to this, Firefox opens the clicked file.
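To make the substitution concrete, here is a minimal Python sketch of how the %1 placeholder in such a command template could be expanded. The function name `expand_command` and the sample file path are hypothetical. The real expansion is done by the Windows shell, which supports more placeholders than this sketch handles.

```python
def expand_command(template: str, clicked_file: str) -> str:
    """Replace the %1 placeholder with the path of the clicked file."""
    return template.replace("%1", clicked_file)


# The template is the command shown above; the file path is a made-up example.
template = r'C:\Program Files\Mozilla Firefox\firefox.exe -osint -url "%1"'
command = expand_command(template, r"C:\Users\demo\page.html")
print(command)
```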

How to abuse it to execute malware?

Knowing the above, we can overwrite the command with our own, one that deploys malware. For example:

Now the .html extension is handled not by firefox.exe but by ProxyApp.exe (see the code here), which launches firefox.exe as well as a malicious app (or calc in our demo case 😉 ). From the user’s point of view nothing has changed: Firefox opens the document as before, but in the background another application starts to run…
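The idea of such a proxy can be sketched in a few lines of Python. This is not the code of the linked ProxyApp, only an illustration under the assumption that the proxy receives the original handler and its parameters as arguments. The names `run_proxy` and `launch`, and the sample paths, are hypothetical. The launcher is injectable, so the example below prints the commands instead of starting any process.

```python
import subprocess


def run_proxy(original_args, payload, launch=subprocess.Popen):
    """Start the original handler with the forwarded arguments, then the extra payload.

    `launch` is any callable that takes a command list. It defaults to
    subprocess.Popen, but can be swapped for a harmless stand-in.
    """
    started = []
    for command in (list(original_args), list(payload)):
        if command:  # skip an empty command instead of failing
            launch(command)
            started.append(command)
    return started


# Dry run: print what would be started, without launching anything.
forwarded = [
    r"C:\Program Files\Mozilla Firefox\firefox.exe",
    "-osint",
    "-url",
    r"C:\Users\demo\page.html",  # made-up path standing in for %1
]
run_proxy(forwarded, ["calc.exe"], launch=print)
```

Starting the original handler first is what keeps the visible behavior unchanged: the document still opens in the browser, and the second program is simply an additional process.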

If we modify the handlers defined under the keys of the current user, we do not need any type of privilege elevation to install this type of hijack.

Global and local handlers – and how to hijack even more extensions

Extension handlers are defined at two levels – global, that are defined in the


…and local – that are defined for a particular user, i.e.:


Their hierarchy of execution goes like this: if no local extension or extension handler is defined, then the global one is executed.
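This fallback can be modelled with two dictionaries standing in for the local and global registry views. It is a simplified sketch, not how Windows implements the lookup: the function name `resolve_handler`, the flat dictionary layout, and the ProxyApp path are assumptions made for illustration.

```python
def resolve_handler(extension, local_classes, global_classes):
    """Return (handler_name, command), preferring local definitions over global ones."""
    # Step 1: map the extension to a handler name, checking the local level first.
    handler = local_classes.get(extension) or global_classes.get(extension)
    if handler is None:
        return None
    # Step 2: map the handler name to its command, again local first.
    command = local_classes.get(handler) or global_classes.get(handler)
    return handler, command


global_classes = {
    ".html": "FirefoxHTML",
    "FirefoxHTML": r'C:\Program Files\Mozilla Firefox\firefox.exe -osint -url "%1"',
}
local_classes = {}

# Nothing is defined locally, so the global handler is used.
print(resolve_handler(".html", local_classes, global_classes))

# A local definition (hypothetical path) now takes precedence.
local_classes["FirefoxHTML"] = r'C:\demo\ProxyApp.exe "%1"'
print(resolve_handler(".html", local_classes, global_classes))
```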

This gives us a very important advantage…

Obviously, without Administrator privileges we cannot modify the keys under HKEY_CLASSES_ROOT – but still we can read them. Also, we can modify the keys belonging to the current user.

So, the trick is simple: read the extension handlers defined globally, rewrite them locally, and then install the hijack.
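The three steps (read globally, rewrite locally, hijack) can be sketched on the same kind of dictionary model. Again this is only an illustration: `install_local_hijack`, the dictionary layout, and the way the proxy path is prepended to the original command are assumptions, and the demo app may build its command differently. The executable path in the sample command is quoted here so that it would survive as a single argument.

```python
def install_local_hijack(extension, global_classes, local_classes, proxy_path):
    """Copy the global handler of `extension` to the local level and wrap its command."""
    # Read: global definitions are only read, never modified.
    handler = global_classes[extension]
    original_command = global_classes[handler]
    # Rewrite locally: the current user gets their own copy of the mapping.
    local_classes[extension] = handler
    # Hijack: the local command starts the proxy, which receives the original command.
    local_classes[handler] = f'"{proxy_path}" {original_command}'
    return local_classes[handler]


global_classes = {
    ".html": "FirefoxHTML",
    "FirefoxHTML": r'"C:\Program Files\Mozilla Firefox\firefox.exe" -osint -url "%1"',
}
local_classes = {}

hijacked = install_local_hijack(
    ".html", global_classes, local_classes, r"C:\demo\ProxyApp.exe"  # hypothetical path
)
print(hijacked)
```

Because only the local dictionary is written to, the model mirrors the point made above: the global definitions stay untouched and no elevated privileges are involved.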

Does it work for every version of Windows?

I tested it under Windows 7 32/64 bit, Windows 8.1 32/64 bit, and Windows 10 32 bit, and it worked stealthily, without any problems.

Win8.1 64 bit:

Win10 32 bit:



Introducing PE_unmapper

Recently I wrote a small tool that can be used as a helper in malware analysis.
Various malware types unpack their core modules in memory, then load and run them.
In order to unpack them quickly, we can let the malware do all the operations and then just dump the result. However, the dumps are in the virtual format, so we may have problems running them independently and viewing them with typical tools.
PE_unmapper converts those dumps into their raw format. The tool is fully independent, so it is up to you how you make the dumps. You only need to know the base at which the module was loaded, in order to relocate it properly.
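To illustrate what converting from the virtual to the raw format involves, here is a simplified Python sketch. It is not PE_unmapper’s code: the `Section` type, the function names, and the toy data are hypothetical, the section table is passed in rather than parsed from PE headers, and the pointers to fix are given as a plain list of offsets (assumed to be 32-bit) instead of being read from a relocation table.

```python
import struct
from dataclasses import dataclass


@dataclass
class Section:
    virtual_address: int  # offset of the section in the virtual (mapped) dump
    virtual_size: int
    raw_offset: int       # offset of the section in the raw (on-disk) layout
    raw_size: int


def unmap_image(virtual_image, headers_size, sections):
    """Rebuild the raw layout by moving each section back to its file offset."""
    raw_end = max([headers_size] + [s.raw_offset + s.raw_size for s in sections])
    raw = bytearray(raw_end)
    # Headers sit at the start of the image in both layouts.
    head = virtual_image[:headers_size]
    raw[:len(head)] = head
    for s in sections:
        size = min(s.raw_size, s.virtual_size)
        chunk = virtual_image[s.virtual_address:s.virtual_address + size]
        raw[s.raw_offset:s.raw_offset + len(chunk)] = chunk
    return bytes(raw)


def rebase_pointers(image, pointer_offsets, loaded_base, target_base):
    """Re-express absolute 32-bit pointers relative to another image base (in place)."""
    for offset in pointer_offsets:
        (value,) = struct.unpack_from("<I", image, offset)
        rebased = (value - loaded_base + target_base) & 0xFFFFFFFF
        struct.pack_into("<I", image, offset, rebased)


# Toy "dump": 4 header bytes and two sections spread out as they would be in memory.
virtual = bytearray(0x30)
virtual[0:4] = b"HDR!"
virtual[0x10:0x13] = b"ABC"
virtual[0x20:0x22] = b"XY"
sections = [Section(0x10, 3, 4, 4), Section(0x20, 2, 8, 4)]
raw = unmap_image(bytes(virtual), 4, sections)
print(raw)

# A pointer valid for load base 0x00A00000, rewritten for base 0x00400000.
pointers = bytearray(struct.pack("<I", 0x00A01000))
rebase_pointers(pointers, [0], loaded_base=0x00A00000, target_base=0x00400000)
print(hex(struct.unpack("<I", pointers)[0]))
```

The second function shows why the load base is needed: absolute addresses inside the dump were adjusted for the place where the module was actually loaded, so they have to be shifted back by the same difference.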

The tool is open-source, available on my GitHub

See it in action on YouTube:

