Import all the things! Solving FlareOn4 Challenge 3 with libPeConv

Recently I started making a small library for loading and converting PE files (libpeconv, available on my GitHub). The library is still in the early stages of development, so please don’t judge and don’t use it in any serious projects. The API may change at any time! However, I have had so much fun developing and testing it that I wanted to share some of my experiments and ideas.

Some time ago I solved some of the FlareOn4 challenges, including challenge 3. At that time I didn’t have libpeconv yet, so I solved it by another method. Now it occurred to me that, with the help of my new library, solving it could be much faster and easier. In this post I will describe my alternative solution and some of the related experiments.

Tools used

For the static analysis:

  • IDA (demo version is enough)

For building the projects:

  • Visual Studio + CMake
  • Python 2.7 (optional – for the helper script that I used in “Bonus”)

Overview

The challenge named greek_to_me.exe is a 32bit PE file. It has stripped relocations.

exe_info

When we run it, it shows an empty console and waits. It does not read any data from the standard input, so we can conclude that it uses some other way to read the password from the user.

We will start with some static analysis in IDA. The crackme has a very simple and clean structure, and it is not obfuscated. We can see that at the beginning of the execution it creates a socket and waits for the input.

The socket listens at localhost on port 2222:

make_socket

After getting the connection, it reads 4 bytes from the input into the buffer:

recv_4

After reading the 4 bytes, it starts processing the input and uses it for decoding an encrypted buffer:

read_buf

If the checksum is valid, it means the encrypted code was decrypted properly, and it is further executed.

As we can see, only 1 byte of the input is used for decoding the buffer, so we can easily brute-force it. The code responsible for decoding the buffer is also pretty simple:

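const size_t encrypted_len = 0x79; // length of the encrypted chunk
for (size_t i = 0; i < encrypted_len; i++) {
    BYTE val = encrypted_code[i];
    // XOR with the key byte, then add 0x22 (the result wraps to a BYTE)
    encrypted_code[i] = (unknown_byte ^ val) + 0x22;
}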

The only part of the crackme that may be somewhat challenging is the checksum – this function is not that simple to reimplement. However, if we want to make a brute-forcer, we need to be able to calculate the checksum after every attempt.
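The search loop itself is tiny once a checksum routine is available. Here is a minimal Python 3 sketch of the idea, under loud assumptions: `demo_checksum` is only a stand-in (it is not the crackme's checksum), and the toy buffer and key are made up so that the loop has something to find.

```python
import hashlib


def decode(encrypted, key):
    """Apply the transform from the loop above: (key XOR byte) + 0x22, kept to one byte."""
    return bytes(((key ^ b) + 0x22) & 0xFF for b in encrypted)


def brute_force(encrypted, checksum, expected):
    """Try all 256 key bytes; return (key, decoded) on the first checksum match."""
    for key in range(256):
        decoded = decode(encrypted, key)
        if checksum(decoded) == expected:
            return key, decoded
    return None


def demo_checksum(data):
    """Stand-in only: NOT the crackme's checksum routine."""
    return hashlib.sha256(data).digest()


# Toy data: encrypt a known plaintext with a known key, then search for the key.
plain = b"toy plaintext"
secret_key = 0x5A
toy_encrypted = bytes(((p - 0x22) & 0xFF) ^ secret_key for p in plain)
found = brute_force(toy_encrypted, demo_checksum, demo_checksum(plain))
```

With the real crackme, `checksum` would have to be the original routine (reimplemented, emulated, or imported), and `expected` the constant that the crackme compares against.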

In my previous solution, I just reimplemented the checksum – it worked but it was not so much fun 😉 . I also saw some other approaches, such as emulating the checksum function with the Unicorn engine, using the angr framework, or making a brute-forcer that talks to the original program via the socket. Can it be done even faster? Let’s see…

LibPeConv comes into play

With PeConv we can convert any PE file from raw format to virtual and back. It also provides a custom PE loader – its goal is to make it possible to load any PE file into the current process (even if it is not a DLL and even if it does not have a relocation table – this will be explained later in the article). The loaded PE can later be used as a fully functional PE file that runs from inside the current process. We can also use any selected function from its code – all we need to know is the function’s RVA and its API.
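To make the “raw to virtual” idea concrete, here is a simplified Python 3 sketch. It is not libpeconv's implementation: the section list is passed in by hand (a real tool reads it from the PE section table), alignment and headers validation are ignored, and all the names and toy values are hypothetical.

```python
def raw_to_virtual(raw, sections, image_size, headers_size):
    """Lay out a raw (on-disk) image the way a loader maps it into memory.

    sections: list of (virtual_address, raw_offset, raw_size) tuples.
    """
    image = bytearray(image_size)
    image[:headers_size] = raw[:headers_size]  # headers keep their position
    for virtual_address, raw_offset, raw_size in sections:
        chunk = raw[raw_offset:raw_offset + raw_size]
        image[virtual_address:virtual_address + len(chunk)] = chunk
    return bytes(image)


def rva_to_raw_offset(rva, sections):
    """Translate an RVA into an offset in the raw file, or None if no section holds it."""
    for virtual_address, raw_offset, raw_size in sections:
        if virtual_address <= rva < virtual_address + raw_size:
            return raw_offset + (rva - virtual_address)
    return None


# Toy file: 0x10 bytes of "headers" followed by one 0x10-byte section mapped at RVA 0x40.
raw = bytes(range(0x20))
sections = [(0x40, 0x10, 0x10)]
image = raw_to_virtual(raw, sections, image_size=0x60, headers_size=0x10)
```

Going back from virtual to raw is the same copy in the opposite direction, which is why an RVA can be used directly as an offset only in the virtual layout.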

In this case, I will use libpeconv to load the crackme and import from it the function calculating checksum. Also, rather than copying the encrypted buffer to my code, I will read it directly from the loaded PE.

Preparing the required information

Let’s take a look at the crackme again in IDA. We need to find the appropriate offsets and understand the API of the function that we are going to import.

The function calculating the checksum starts at RVA 0x11E6:

checksum_func

It takes 2 arguments: a pointer to the buffer and its size.

It returns a WORD type:

return_word

Summing up, we can define the function prototype as:

WORD calc_checksum(BYTE *decoded_buffer, size_t buf_size)

It is also worth noting that this function is self-contained and does not call any imported functions – that makes importing it even easier (we are not forced to load any imports for the module or to apply relocations).

Another thing that we need is the encrypted buffer. It starts at RVA 0x107C and is 0x79 (121) bytes long:

enc_code

That’s all! Let’s start coding.

Solving the crackme with libPeConv

The current version of libpeconv allows loading a PE file in two ways: with the function load_pe_module and with the function load_pe_executable. The second one, load_pe_executable, is a complete loader: it loads the given PE into the current process in RWX memory, automatically applies relocations and loads dependencies. The first one, load_pe_module, does not load the dependencies and gives more control: we may load the PE file into non-executable memory, and applying the relocations is optional. More information (and *very* possible updates on the API) can be found here:
https://github.com/hasherezade/libpeconv/blob/master/libpeconv/include/peconv/pe_loader.h

As we saw, the function that we want to import is self-contained, so it will do no harm if we load the crackme PE without imports and without relocations (to see it loaded as a fully functional PE, see the next part of the article). I will use the function load_pe_module:

BYTE* loaded_pe = (BYTE*)load_pe_module(
    path,
    v_size, // OUT: size of the loaded module
    true,   // executable
    false   // without relocations
);

Now, let’s import the function. First let’s make a pointer to it:

WORD (*calc_checksum) (BYTE *buffer, size_t buf_size) = NULL;

Calculate the absolute offset to the function within the loaded module:

ULONGLONG offset = DWORD(0x11e6) + (ULONGLONG) loaded_pe;

And fill the pointer:

calc_checksum = ( WORD (*) (BYTE *, size_t ) ) offset;

That’s it, now we can use the function in our application like any other function.

But before we can start brute-forcing, we also need to fill the pointer to the buffer:

g_Buffer = (uint8_t*) (0x107C + (ULONGLONG) loaded_pe);

This is the full brute-forcer that I prepared:
https://gist.github.com/hasherezade/44b440675ccc065f111dd6a90ed34399#file-brutforcer_1-cpp
And it works  🙂  The value that we got is exactly what it was supposed to be:

brutforce_1.png

But still, the found value is just a part of the solution, not the flag that we are searching for. As we know from the static analysis, if the correct value is given, the chunk of code will be decrypted and executed. It would be cool to see exactly how that chunk of code looks when it is written into its place, don’t you think?

It is also very easy to achieve. The PE file is loaded in RWX memory inside the current process, so we can easily substitute the encrypted chunk of code with the decoded one. A simple memcpy will do the job:

memcpy(g_Buffer, g_Buffer2, g_BufferLen);

Then, libPeConv will help us to convert the PE file back to the raw format, so that we can open it in IDA. We can do it with the help of pe_virtual_to_raw from libpeconv:

size_t out_size = 0;
BYTE* unmapped_module = pe_virtual_to_raw(
    loaded_pe, //pointer to the module
    v_size, //virtual size
    module_base, //in this case we need here
                 //the original module base, because
                 //the loaded PE was not relocated
    out_size //OUT: raw size of the unmapped PE
);

And this is the complete solution:
https://gist.github.com/hasherezade/36a4a531840cfe1fd5997bc7c5f6be4d#file-brutforcer_2-cpp

Comparing the dumped executable with the original one we can see that the buffer was overwritten:
filled_buf.png
So let’s see the modified exe in IDA:
flag_revealed
And yes! At the known offset there is the flag revealed:
et_tu_brute_force@flare-on.com

Bonus – loading  and running a PE with stripped relocations

Ok, you may say – it was easy – the loaded function was self-contained, so we could just as well rip it out of the original file, without using any loaders. But what if the function calls several other functions within the given module, and also imported functions? Will the same trick work? And could it work even for a PE file without relocations?

To answer those questions I prepared another test case. Now, instead of loading one function, I will load and execute the full crackme from inside the brute-forcer.

First we will modify a few things. This time, instead of using load_pe_module, I will use load_pe_executable to load the full executable with its dependencies.

BYTE* loaded_pe = (BYTE*)load_pe_executable(path, v_size);

The function will automatically detect that the PE file has no relocations, and enforce loading it at its original module base. Keep in mind that allocating memory at a specific base may not always work, so sometimes it takes several runs to execute it properly. You must also make sure that the module base of the loader does not collide with the module base required by the payload (if the loader’s base is random, that is good enough).

Once the PE file is loaded, we just need to get its Entry Point – and then we can call it like any other function*:

// Deploy the payload:
// read the Entry Point from the headers:
ULONGLONG ep_va = get_entry_point_rva(loaded_pe)
    + (ULONGLONG) loaded_pe;

//make pointer to the entry function:
int (*loaded_pe_entry)(void) = (int (*)(void)) ep_va;

//call the loaded PE's ep:
int ret = loaded_pe_entry();


* – but keep in mind that, depending on the payload’s implementation details, once you have redirected execution to its entry point, it may just exit after finishing its job and never return to your code.

I am going to modify the brute-forcer code in such a way that, this time, after finding the value, the original crackme will be run. This is the code of the full application:
https://gist.github.com/hasherezade/9d5186b27c730d01849ac1787b3d699b#file-brutforcer_3-cpp

To make sure that everything works fine (that the deployed payload really creates a socket and responds in exactly the same way as the one deployed independently), I wrote a small Python script that communicates with it and displays the response:
https://gist.github.com/hasherezade/328210a57464360e23e125929b62b301#file-test-py
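For readers who just want the idea, a client of this kind can be sketched in a few lines of Python 3. This is not the script from the gist: the function name and the timeout are my own choices, and placing the key in the first of the 4 bytes is an assumption.

```python
import socket


def send_key(key_byte, host="127.0.0.1", port=2222, timeout=5.0):
    """Send 4 bytes to the listening crackme and return whatever it replies."""
    # Assumption: the meaningful byte goes first; the other three are padding.
    payload = bytes([key_byte, 0, 0, 0])
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
        try:
            while True:
                data = conn.recv(1024)
                if not data:      # the peer closed the connection
                    break
                chunks.append(data)
        except socket.timeout:    # the peer went quiet; keep what we have
            pass
    return b"".join(chunks)
```

Calling `send_key(value)` with the value found by the brute-forcer, while the crackme is listening, should return the bytes it sends back.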

And now, let’s see it all in action:

This is all I prepared for today, I hope you enjoyed it! The lib is now under rapid development, so many things will get refactored and improved. Stay tuned!
The binaries of all the presented loaders, along with the crackme, are available here: https://drive.google.com/open?id=1ZFnRsuZxdlw6j2OVEfIJCLfmd8jwmu7y – the password to the zip is: crackme

Appendix

See other approaches to solve the same crackme:

Posted in CrackMe, Programming, Tools

Solving the Shabak’s Airplane challenge – Task 3

Some time ago I solved the Airplane challenge published by Israeli Shin-Bet (Shabak). The crackme has three levels of increasing difficulty. Each one is a 32 bit Windows application. It was a very pleasant task, not difficult but also not too trivial. In this writeup I will present my solutions.

Task 1 and 2 have been described in the previous part, you can read it here. Now it’s time for the final one!

Task 3

http://10100110110100001100001011000100110000101101011.com/Airplane/3_with_the_best.php

Mirror [task3], password “Challenge”

This time the crackme comes with a hint:

hint:
Maybe this program doesn't do more than it seems, our special agent 
have told us that when the program was executed in a different 
country, it behaved differently

It reminds me of the techniques used by some malware to target only the chosen countries. Usually it is implemented in one of two ways:

  1. sending a request to one of the services that give geolocation data based on the external IP
  2. checking the installed language/keyboard layout

Let’s run the crackme and observe how it behaves, whether it makes any internet connections etc. (we can use e.g. ProcMon).

The crackme printed a message: “May you enter Deep and Dreamless Slumber”:

…and terminated after some timeout. No internet connection has been made. So, I guess it will do something about checking the installed language.

This time I will start from the static analysis in IDA. Let’s load the application and have a look at the referenced functions:

There is GetLocaleInfoEx. I suspect it will be involved in the verification process, so let’s follow where it is called:

The output is saved in a variable of WORD size. If I try to follow this variable and check the references, I don’t find anything more than the above line:

However, its upper byte seems to be referenced somewhere else!

It seems that if this flag matches, some other function is copied over the function printing the initial “slumber” message:

check_cond.png

So, we have some self-modifying code here 🙂 ! Let’s see what this function is doing:

search_dll.png

fs:30h -> PEB
PEB + 0xC -> _PEB_LDR_DATA Ldr
Ldr + 0x14 -> _LIST_ENTRY InMemoryOrderModuleList

It doesn’t seem to be a function printing the password. Instead, it searches for Kernel32.dll among the loaded DLLs:

check_next_char.png

We can also see an atypical NOP instruction, that can confuse some debuggers:

confusing_nop

OllyDbg and its derivatives fail to parse it properly:

confsing_nop_olly

If we want to analyze it under OllyDbg, we need to replace this fragment with a typical NOP (0x90) in order to get a clear view:

to_nop

Now I will do some dynamic analysis in OllyDbg. I set the breakpoint on the flag check (the one that was deciding whether or not to overwrite the function):

cond_check_olly

…and when it was hit, I changed the Z flag in the registers. After the function was overwritten, I forced OllyDbg to re-analyze the code and then set a breakpoint at the function’s beginning:

overwritten_func

When the breakpoint is hit, we can step through the function’s execution to see in detail what it is doing.

At the end there is something interesting – a new PE file in the memory:

unpacked_dll

We can see that the previously stored pointer to kernel32.dll is being overwritten by the pointer to this module:

overwrite_kernel32_ptr

I dumped this PE and unmapped it using pe_unmapper in order to get a better view. It is named stub.dll and it exports one function: GetComputerNameW:

stub_dll.png

After following the references in IDA, we can find that this module was unpacked just before the flag check:

unpacking_stub.png

It was manually loaded in the memory (without being dropped on the disk and without using LoadLibrary function). This trick is also very often used in malware.

Anyways, now we need to find out where the stub.dll is used. So, I set the breakpoint on this module:

bp_at_stub

The breakpoint is hit inside ntdll:

bp_in_ntdll.png

Now I set the breakpoint on the .text section of the main module (Third.exe) to see the point where the execution returns:

bp_2

This is where the stub.dll was referenced from inside the main module:

get_computer_name.png

So, at this point the application gets the address of the function: “GetComputerNameW”. It can fetch this function either from kernel32.dll (if the locale flag was not set) or from the stub.dll (if the locale flag was set).

It seems we are pretty close to the solution, because some formatted printing (“%s”) is done just after those lines (probably this is the flag being printed). Most probably the key lies inside the function GetComputerNameW, so let’s go there.

go_to_func.png

Inside GetComputerNameW:

inside_get_comp_name

Again, OllyDbg cannot parse some of the instructions. So, I opened the dumped version of Stub.dll in IDA and used it as a reference.

This is how the beginning of the function looks:

get_comp_name.png

fs:30h -> PEB
PEB + 0x10 -> _RTL_USER_PROCESS_PARAMETERS ProcessParameters
ProcessParameters + 0x44 -> _UNICODE_STRING CommandLine.Buffer

The function fetches the command line of the main process, and then processes the buffer.

Again, the atypical NOP instructions have been used (marked red on the picture):

weird_nops.png

We can see two buffers being compared. One of them is the command line buffer stored in memory, and the other is hardcoded in the stub.dll. Four consecutive DWORDs are compared. If the two buffers do not match, the function sets an error code and exits:

buf_check.png

The hardcoded buffer starts at RVA 0x2000 (that is the beginning of the .rdata section).

rdata_start

Of course we need the above function to exit without error – then our flag will be printed.

It is easy to conclude that, in order to get the flag, we must have the same values in the memory buffer as in the hardcoded buffer.

I set a breakpoint before this comparison started. Then, I just copied the hardcoded buffer and overwrote the buffer in memory with its content:

overwritten_buf

Now, let it run. And this is what we get:

t3_done

Solved!

We reached the “Airplane Complete” page.

http://10100110110100001100001011000100110000101101011.com/Airplane/Airplane_Complete_U_D_1.html

R. Sanchez is safe, happy end! 🙂

Posted in CrackMe

Solving the Shabak’s Airplane challenge – Tasks 1 and 2

Some time ago I solved the Airplane challenge published by Israeli Shin-Bet (Shabak). The crackme has three levels of increasing difficulty. Each one is a 32 bit Windows application. It was a very pleasant task, not difficult but also not too trivial. In this writeup I will present my solutions.

The story is about saving Shabak’s operative, R. Sanchez, from prison 😀 So, let’s go!

Task 1

http://10100110110100001100001011000100110000101101011.com/Airplane/1_the_best_researcher.php
Mirror: [task1], password “Challenge”

When we run the application, it just exits and nothing happens. So, I opened it under a debugger (OllyDbg). I started the analysis from viewing the referenced strings, and I noticed something potentially interesting:

%PROGRAMFILES%\\meseeker inc

It seems to be some custom file referenced from inside the code. Let’s go to this point of code, and see how it is used:

Indeed, the program is searching for this file and checking its attributes. If they match the required ones, some output is printed, which is probably our password.

Now we can solve it in two ways: either create the file with the proper attributes, or influence the execution so that it will print the password no matter what. I chose the second way: setting a breakpoint on each condition and, when it is hit, changing the flag in order to emulate the appropriate condition being met.

So, indeed it resulted in printing the password:

Solved!

Task 2

http://10100110110100001100001011000100110000101101011.com/Airplane/2_should_work.php
Mirror: [task2], password “Challenge”

In contrast to the previous one, Task 2 prompts for the password:

Let’s enter whatever and see what happens:

It dropped a file “GettingSchwifty.bat” and tried to load it. It turned out not to be a valid PE, so the error occurred.

It seems the password that we typed was supposed to decrypt this PE file (name .bat is just a disguise). Let’s take a look at the dropped file:

Inside:

As we can see, it has some regular patterns inside. It made me think that it may be XOR encrypted. So, I tried to XOR it with some valid PE file, to see if it reveals the password (I used my python script: dexor.py):

./dexor.py --file GettingSchwifty.bat --keyfile Second.exe

When we view the output in a hex editor, we can see a repeating pattern at the beginning:

This may be our password, so let’s try. I copied this fragment, saved it as a key.bin and then tried dexor again:

./dexor.py --file GettingSchwifty.bat --keyfile key.bin
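The trick works because XOR is its own inverse: wherever the encrypted file and the known PE share the same plaintext bytes (as PE headers usually do), XORing them together leaves the bare key, repeated. A minimal Python 3 sketch of that reasoning follows; it is not dexor.py, and all the function names and toy values are hypothetical.

```python
def xor_bytes(data, key):
    """XOR data with a key that repeats as often as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def shortest_period(stream):
    """Return the shortest prefix that, repeated, reproduces the whole stream."""
    for size in range(1, len(stream) + 1):
        if all(stream[i] == stream[i % size] for i in range(len(stream))):
            return stream[:size]
    return stream


def recover_key(encrypted, known_plain):
    """XOR ciphertext with known plaintext; the overlap exposes the repeating key."""
    overlap = min(len(encrypted), len(known_plain))
    keystream = bytes(encrypted[i] ^ known_plain[i] for i in range(overlap))
    return shortest_period(keystream)


# Toy data: a made-up "PE-like" plaintext encrypted with a made-up key.
toy_key = b"KEY!"
toy_plain = b"MZ" + bytes(range(30))
toy_cipher = xor_bytes(toy_plain, toy_key)
recovered = recover_key(toy_cipher, toy_plain[:16])
```

In practice the overlap between two different PE files is only partial, so the repeating pattern still has to be picked out by eye (or by taking the cleanest stretch), which is what the hex editor step above does.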

And hurray, the output is a valid PE file: a DLL named Piper.dll:

Since I already have the DLL, I don’t really care what was the password that allowed to decrypt it. I will just run the main executable (Second.exe) under the debugger, set the breakpoint before the GettingSchwifty.bat was loaded, and replace it with my version.

When the breakpoint before LoadLibraryA is hit, I delete the dropped GettingSchwifty.bat and copy my decrypted DLL in its place.

It got loaded properly, so now we can step into the function inside the DLL:

But it’s not over yet. One more password is required before we get our flag printed. The application asks a question over a pipe “flumbus_channel” and we are supposed to answer it:

After a brief analysis I concluded that brute force is not the solution. So, we must approach it some other way. By some googling around I found the answer to the question asked: “What is cooler than being cool?”.

(source: http://www.urbandictionary.com/define.php?term=Cooler%20Than%20Being%20Cool)

The answer is: “Ice cold”! Pretty obvious, isn’t it? 😉 But is it what the application wanted us to say? Let’s pass the input and check. I want a fast solution, so instead of writing a client that will talk over the pipe, I will just edit the buffer in the memory. Let’s set a breakpoint on the call to ReadFile and follow the buffer in dump:

After the ReadFile returned, we can edit this buffer in order to emulate the input being read:

The password is translated to the uppercase, then it is used to decrypt the output buffer. Checksum of the decrypted buffer is calculated and compared with the hardcoded one: 0x55B8B000

It seems the password “ice cold” was the right one, the checksum matches! The output buffer got decrypted and by following it in dump we can already see the second flag:

However, displaying it nicely on the screen requires more effort – there are some debug checks that cause the application to exit:

I just patched the conditions above, so that the antidebug measures cannot be taken:

And we get the password printed:

Another level cleared!

That’s how I reached the Task 3! This one will be a bit longer, so I am going to describe it in a second writeup.

Cheers!

Solution of the Task 3: https://hshrzd.wordpress.com/2017/06/25/shabak-airplane-challenge-task-3/

Posted in CrackMe

Starting with Windows Kernel Exploitation – part 3 – stealing the Access Token

Recently I started learning Windows Kernel Exploitation, so I decided to share some of my notes in the form of a blog.

In the previous parts I showed how to set up the environment. Now we will get familiar with the payloads used for privilege escalation.

What I use for this part:

  • The environment described in the previous parts [1] and [2]
  • nasm
  • HxD

Just to recall, we are dealing with a vulnerable driver, to which we are supplying a buffer from the userland. In the previous part we managed to trigger some crashes, by supplying a malformed input. But the goal is to prepare the input in such a way, that instead of crashing the execution will be smoothly redirected into our code.

Very often, the passed payload is used to escalate privileges of the attacker’s application. It can be achieved by stealing the Access Token of the application with higher privileges.

Viewing the Access Token

Every process running on the system has its EPROCESS structure that encapsulates all the data related to it. You can see the full definition e.g. here. (The EPROCESS structure has some slight differences from one version of Windows to another – read more). Some members of EPROCESS, such as PEB (Process Environment Block), are accessible from the user mode. Others – e.g. the mentioned Access Token – only from the kernel mode. We can see all the fields of EPROCESS using WinDbg:

dt nt!_EPROCESS

eprocess

As we can see, the field Token has an offset 0xF8 from the beginning of the structure.

Let’s display the details of the type containing the token:

dt nt!_EX_FAST_REF

token.png

The token is stored in a union _EX_FAST_REF, having two fields: RefCnt (reference counter) and Value. We are interested in replacing the Value only. The reference counter should better stay untouched for the sake of application stability.

Now, let’s have a look at tokens of some applications running on the Debuggee machine. We can list the processes using:

!dml_proc

Example:

proc

The first column shown is an address of EPROCESS structure corresponding to the particular process.

Now, using the displayed addresses, we can find more details about chosen processes.

!process [address of EPROCESS]

We can notice the Access Token among the displayed fields:

vbox_info.png

We can also display the token in more low-level ways:

dt nt!_EX_FAST_REF [address of EPROCESS] + [offset to the Token field]

token_details.png

Or:

dd [address of EPROCESS] + [offset to the Token field]

token_raw.png

As we can conclude from the above, the command !process automatically applied the mask and filtered out the reference counter from the displayed information. We can do the same thing manually, applying a mask that clears the lowest 3 bits, with the help of the expression evaluator:

?[token] & 0xFFFFFFF8

mask.png
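The same masking can be written out in a few lines of Python 3. This is only an illustration of the arithmetic: the example value is made up, and the 3-bit width of the counter is taken from the 0xFFFFFFF8 mask used above (32-bit Windows).

```python
POINTER_MASK = 0xFFFFFFF8  # the same mask as in the WinDbg expression above
REFCNT_MASK = 0x00000007   # the 3 low bits that the mask clears


def split_fast_ref(value):
    """Split a raw _EX_FAST_REF value into (token pointer, reference counter)."""
    return value & POINTER_MASK, value & REFCNT_MASK


example = 0x8A201275  # made-up raw value, for illustration only
pointer, ref_cnt = split_fast_ref(example)
```

Here the pointer part comes out as 0x8A201270 and the counter as 5; OR-ing the two back together gives the original raw value.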

Stealing the Access Token via WinDbg

As an exercise, we will run a cmd.exe on the Debuggee machine and elevate its privileges from the Debugger machine, using WinDbg. See the video:

First, I am listing all the processes. Then, I am displaying the Access Tokens of the chosen processes: System and cmd. I copied the Access Token of System into cmd, applying appropriate masks in order to preserve the reference counter. As a result, cmd.exe got elevated.

The token-stealing payload

Now we have to replicate this behavior via injected code. Of course it is not gonna be as easy, because we will be no longer aided by WinDbg.

Some well documented examples of the token-stealing payloads are provided as a part of Exploit code in the official HEVD repository: https://github.com/hacksysteam/HackSysExtremeVulnerableDriver/blob/master/Exploit/Payloads.c

The purpose of all the included payloads is the same: stealing the Access Token. However, we can see that they are in a bit different variants, appropriate for particular vulnerabilities. Most of their code is identical, only the ending differs (commented as “Kernel Recovery Stub“). It is a code used to make all the necessary cleanups, so that the application will not crash while returning after the payload execution.

Anyways, let’s take a look at the generic one:

https://github.com/hacksysteam/HackSysExtremeVulnerableDriver/blob/master/Exploit/Payloads.c#L186

First of all, we have to find the beginning of the EPROCESS structure. With WinDbg no effort was required to do this – it was just displayed by the command. Now, we need to find the beginning of this structure on our own, navigating through some other fields.

As a starting point, we will use the KPCR (Kernel Processor Control Region) structure, which is pointed to by the FS register on 32-bit versions of Windows (and by GS on 64-bit).

The code presented above takes advantage of the relationship between the following structures:

KPCR (PrcbData) -> KPRCB (CurrentThread) -> KTHREAD (ApcState) -> KAPC_STATE (Process) -> KPROCESS

KPROCESS is the first field of the EPROCESS structure, so, by finding it we ultimately found the beginning of EPROCESS:

When the EPROCESS of the current process has been found, we will use its other fields to find the EPROCESS of the SYSTEM process.

LIST_ENTRY is an element of a doubly linked list, connecting all the running processes:

The field Flink points to the LIST_ENTRY field of the next process. So, by navigating there and subtracting the field’s offset, we get a pointer to the EPROCESS structure of another process.

Now, we need to get the PID value (UniqueProcessId) and compare it with the PID typical for the System process:

This is the corresponding code fragment in the exploit:

Once we have EPROCESS of the System as well as EPROCESS of our process, we can copy the token from one to another. In the presented code reference counter was not preserved:
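The whole walk can be simulated in user mode to check the pointer arithmetic. The Python 3 sketch below models memory as a dictionary and uses the offsets that the HEVD payload defines for Windows 7 x86; treat them as assumptions and verify them with `dt` on your own target. All addresses and token values are made up, and, unlike the payload described above, this variant keeps our own reference counter.

```python
# Offsets as used by the HEVD payload for Windows 7 x86 (assumed; verify with `dt`).
KTHREAD_OFFSET = 0x124   # KPCR.PrcbData.CurrentThread (0x120 + 0x004)
EPROCESS_OFFSET = 0x050  # KTHREAD.ApcState.Process
PID_OFFSET = 0x0B4       # EPROCESS.UniqueProcessId
FLINK_OFFSET = 0x0B8     # EPROCESS.ActiveProcessLinks.Flink
TOKEN_OFFSET = 0x0F8     # EPROCESS.Token
SYSTEM_PID = 4


def find_system_eprocess(mem, kpcr):
    """Follow KPCR -> KTHREAD -> EPROCESS, then walk the process list to PID 4."""
    current = mem[mem[kpcr + KTHREAD_OFFSET] + EPROCESS_OFFSET]
    eprocess = current
    while mem[eprocess + PID_OFFSET] != SYSTEM_PID:
        # Flink points at the next LIST_ENTRY, so subtract the field's offset.
        eprocess = mem[eprocess + FLINK_OFFSET] - FLINK_OFFSET
        if eprocess == current:
            raise LookupError("System process not found")
    return current, eprocess


def steal_token(mem, kpcr):
    """Copy System's token pointer into the current process, keeping our RefCnt."""
    current, system = find_system_eprocess(mem, kpcr)
    system_token = mem[system + TOKEN_OFFSET]
    own_token = mem[current + TOKEN_OFFSET]
    mem[current + TOKEN_OFFSET] = (system_token & 0xFFFFFFF8) | (own_token & 0x7)


def build_toy_memory():
    """Build a fake address space: three processes linked in a ring (made-up values)."""
    mem = {}
    kpcr, kthread = 0x1000, 0x2000
    processes = [(0x10000, 4, 0x8A20127A),     # System
                 (0x20000, 500, 0x8B301113),   # some other process
                 (0x30000, 1337, 0x9A3C1035)]  # our process (e.g. cmd.exe)
    for index, (base, pid, token) in enumerate(processes):
        next_base = processes[(index + 1) % len(processes)][0]
        mem[base + PID_OFFSET] = pid
        mem[base + FLINK_OFFSET] = next_base + FLINK_OFFSET
        mem[base + TOKEN_OFFSET] = token
    mem[kpcr + KTHREAD_OFFSET] = kthread
    mem[kthread + EPROCESS_OFFSET] = 0x30000
    return mem, kpcr, 0x30000
```

The simulation is deliberately simplified: in a real kernel the list also passes through the list head, which is not an EPROCESS, and the payload has to do all of this in assembly with no safety net.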


When we look for the offsets of particular fields, WinDbg comes in very handy. We can display commented structures with the following command:

dt nt!<structure name>

For example:

dt nt!_KPCR

dt nt!_KPRCB

0x120 + 0x004 = 0x124

That gives the mentioned offset:

Writing the payload

We can write the code of the payload by inline assembler (embedded inside the C/C++ code) as it is demonstrated in HEVD exploit:

https://github.com/hacksysteam/HackSysExtremeVulnerableDriver/blob/master/Exploit/Payloads.c#L63

However, in such a case our code will be wrapped by the compiler. As we can see, some additional prolog and epilog were added:

function

That’s why we have to remove the additional DWORDs from the stack before we return, by adding 12 (0xC) to the stack pointer (ESP):

https://github.com/hacksysteam/HackSysExtremeVulnerableDriver/blob/master/Exploit/Payloads.c#L94

fixing

If we want to avoid the hassle, we can declare our function as naked (read more here). It can be done by adding a special declaration before the function, e.g.:

__declspec(naked) VOID TokenStealingPayloadWin7()

https://github.com/hasherezade/wke_exercises/blob/master/stackoverflow_expl/payload.h#L16

Another option is to compile the assembler code externally, e.g. using NASM. Then, we can export the compiled buffer, e.g. to a hexadecimal string.

As an exercise, we will also add some slight modification to the above payload, so that it can preserve the reference counter:
https://github.com/hasherezade/wke_exercises/blob/master/stackoverflow_expl/shellc.asm

asm_snippet

Compile:

nasm.exe shellc.asm

Then, we can open the result in a hex editor and copy the bytes. Some hex editors (e.g. HxD) even support copying the data as an array appropriate for a specific programming language:

You can see both variants of the payload (the inline and the shellcode) demonstrated in my StackOverflow exploit for HEVD:

https://github.com/hasherezade/wke_exercises/tree/master/stackoverflow_expl

Compiled: https://drive.google.com/open?id=0Bzb5kQFOXkiSWTJOS2VZZ0JiU3c

See it in action:

Details about exploiting this vulnerability will be described in the next part. See also writeups by Osanda and Sam added in the appendix.

Appendix

https://osandamalith.com/2017/04/05/windows-kernel-exploitation-stack-overflow/ – Osanda Malith on Stack Overflow

https://www.whitehatters.academy/intro-to-windows-kernel-exploitation-3-my-first-driver-exploit/ – Sam Brown on Stack Overflow

https://briolidz.wordpress.com/2013/11/17/windbg-some-debugging-commands/ – a handy set of commonly used WinDbg commands

Posted in KernelMode, Tutorial, WKE

Starting with Windows Kernel Exploitation – part 2 – getting familiar with HackSys Extreme Vulnerable Driver

Recently I started learning Windows Kernel Exploitation, so I decided to share some of my notes in the form of a blog.

The previous part was about setting up the lab. Now, we will play a bit with HackSysExtremeVulnerableDriver by Ashfaq Ansari in order to get comfortable with it. In the next parts I am planning to walk through the demonstrated vulnerabilities and exploitation techniques.

What I use for this part:


Installing and testing HEVD

First, I will show how to install HEVD. We will configure the Debuggee and the Debugger in order to see the Debug Strings and HEVD’s symbols. Then we will play a bit with the dedicated exploits. You can see the video and read the explanations below:

Watching the DebugStrings

HEVD and the dedicated exploits print a lot of information as DebugStrings. We can watch them from the Debugger machine (using WinDbg) as well as from the Debuggee machine (using DebugView).

Before installing HEVD, we will set up everything in order to see the strings that are being printed during driver’s initialization.

On the Debugger:

We need to break the execution of the Debuggee in order to get the kd prompt (in WinDbg: Debug -> Break). Then, we enable printing Debug Strings via the command:

ed nt!Kd_Default_Mask 8

After that, we can let the Debuggee run further by executing the command:

g

Warning: Enabling this slows down the Debugee. So, whenever possible, try to watch DebugStrings locally (on the Debugee only).

On the Debugee:

We need to run DebugView as Administrator. Then we choose from the menu:

Capture -> Capture Kernel

capture_kernel

Installing the driver

First, we will download the pre-built package (driver + exploits) on the Debugee (the victim machine), then install and test it. We can find it on the GitHub of HackSysTeam, in the releases section (https://github.com/hacksysteam/HackSysExtremeVulnerableDriver/releases). The package contains two versions of the driver – vulnerable and not vulnerable. We will pick the vulnerable one, built for 32 bit (i386).

osr_load

We choose Service Start as Automatic. Then we click [Register Service] and, when it succeeds, [Start Service].

For the driver installation I used OSR Driver Loader, which is a very convenient manager. Alternatively, you can do the installation from the command line (note that sc expects a space after each “=”), using:

sc create [service name] type= kernel binPath= [driver path]
sc start [service name]

If the installation succeeded, we should see the HEVD banner printed on WinDbg (on the Debugger machine) as well as on DbgView on Debugee Machine.

Adding symbols

The precompiled package of HEVD comes with symbols (pdb file) that we can also add to our Debugger. First, let’s stop the Debugee by sending it a break signal, and have a look at all the loaded modules.

lm

To find the HEVD module, we can set a filter:

lm m H*

We will see that it does not have any symbols attached. Well, it can be easily fixed. First, turn on:

!sym noisy

– in order to print all the information about the paths to which WinDbg referred in search for the symbol. Then, try to reload the symbols:

.reload

…and try to refer to them again. You will see the path, where we can copy the pdb file. After moving the pdb file to the appropriate location on the Debugger machine, reload the symbols again. You can test them by trying to print all the functions from HEVD:

x HEVD!*

(See the details on the Video#1)

Testing the exploits

The same package contains also a set of the dedicated exploits. We can run each of them by executing an appropriate command. Let’s try to deploy some of them and set cmd.exe as a program to be executed.

deploying_pool_overfl

Pool Overflow Exploit deployed:

pool_overfl

If the exploitation was successful, the requested application (cmd.exe) will be deployed with elevated privileges.

By the command

whoami

we can confirm that it really runs elevated:

system

At the same time, we can see on our Debugger machine the Debug Strings printed by the exploit:

dgb_str

All of the exploits except the double fetch should run well on one core. If we want the double fetch exploit to work, we need to enable two cores on the Debugee machine.

WARNING: Some of the exploits are not 100% reliable and we can encounter a system crash after deploying them. Don’t worry, this is normal.


Hi driver, let’s talk!

Just like in the case of user land, kernel land exploitation begins with finding the points where we can supply input to the program. Then, we need to find an input that can corrupt the execution (in contrast to user land – in kernel land a crash will directly result in a blue screen!). Finally, we will try to craft the input in a way that lets us control the execution of the vulnerable program.

In order to communicate with a driver from user mode we will be sending it IOCTLs – Input-Output controls. The IOCTL allows us to send from the user land some input buffer to the driver. This is the point from which we can attempt the exploitation.

HEVD contains demos of various classes of vulnerabilities. Each of them can be triggered using a different IOCTL and exploited by the supplied buffer. Some (but not all) will cause our system to crash when triggered.

Finding Device name & IOCTLs

Before we try to communicate with a driver, we need to know two things:

  1. the device that the driver creates (if it doesn’t create any, we will not be able to communicate)
  2. list of IOCTLs (Input-Output Controls) that the driver accepts

HEVD is open-source, so we can read all the necessary data directly from the source code. In real life, most of the time we will have to reverse the driver in order to get it.

Let’s have a look at the fragment of code where HEVD creates a device. https://github.com/hacksysteam/HackSysExtremeVulnerableDriver/blob/master/Driver/HackSysExtremeVulnerableDriver.c#L79

The name of the device is mentioned above.

Now, let’s find the list of IOCTLs. We will start by looking at the array of IRPs:

https://github.com/hacksysteam/HackSysExtremeVulnerableDriver/blob/master/Driver/HackSysExtremeVulnerableDriver.c#L109

The function linked to IRP_MJ_DEVICE_CONTROL will be dispatching IOCTLs sent to the driver. So, we need to take a look inside this function.

https://github.com/hacksysteam/HackSysExtremeVulnerableDriver/blob/master/Driver/HackSysExtremeVulnerableDriver.c#L193

It contains a switch that calls the handler function appropriate for a particular IOCTL. We can grab our list of IOCTLs by copying the switch cases. The values of the constants are defined in a header:

https://github.com/hacksysteam/HackSysExtremeVulnerableDriver/blob/master/Driver/HackSysExtremeVulnerableDriver.h#L57
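To make the idea of the dispatcher more tangible, here is a toy model of it in Python. It is only a sketch: the dictionary stands in for the switch statement, the two IOCTL values are what I computed from the HEVD header (treat them as assumptions), and the handler names and returned messages are made up for illustration, not taken from the driver.

```python
# Assumption: IOCTL values as computed from the HEVD header
HEVD_STACK_OVERFLOW = 0x222003
HEVD_STACK_OVERFLOW_GS = 0x222007


def stack_overflow_handler(buffer):
    # Hypothetical handler: only reports what it received
    return "StackOverflow handler got {} bytes".format(len(buffer))


def stack_overflow_gs_handler(buffer):
    # Hypothetical handler: only reports what it received
    return "StackOverflowGS handler got {} bytes".format(len(buffer))


# The dictionary plays the role of the switch statement in the driver
HANDLERS = {
    HEVD_STACK_OVERFLOW: stack_overflow_handler,
    HEVD_STACK_OVERFLOW_GS: stack_overflow_gs_handler,
}


def dispatch(ioctl, buffer):
    """Pick the handler for the given IOCTL, or reject an unknown code."""
    handler = HANDLERS.get(ioctl)
    if handler is None:
        return "Invalid IOCTL code: {:#x}".format(ioctl)
    return handler(buffer)


print(dispatch(HEVD_STACK_OVERFLOW, b"A" * 8))
print(dispatch(0x1234, b""))
```

Every key of such a table is one entry on our list of IOCTLs, which is exactly what we are collecting from the switch cases.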

Writing a client application

OK, we have all the data necessary to communicate with the driver from our own program. We can put it all together in a header file, e.g.: hevd_constants.h

The number of each IOCTL is created by a macro defined in the standard Windows header winioctl.h:

If you include the windows.h header, the above macro will be added automatically. For now, we do not need to bother about the meaning of the particular constants – we will just use the defined elements as they are.
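For the curious, here is a small Python sketch of the arithmetic that the CTL_CODE macro performs. The bit layout and the three constants follow winioctl.h; the helper name ctl_code is my own, and the function number 0x800 for the stack overflow IOCTL is what I read from the HEVD header, so please double-check it against the source.

```python
# Constants as defined in winioctl.h
FILE_DEVICE_UNKNOWN = 0x22
METHOD_NEITHER = 3
FILE_ANY_ACCESS = 0


def ctl_code(device_type, function, method, access):
    """Pack the four fields the same way the CTL_CODE macro does."""
    return (device_type << 16) | (access << 14) | (function << 2) | method


# Assumption: HEVD uses function number 0x800 for the stack overflow IOCTL
HEVD_STACK_OVERFLOW = ctl_code(FILE_DEVICE_UNKNOWN, 0x800,
                               METHOD_NEITHER, FILE_ANY_ACCESS)

print(hex(HEVD_STACK_OVERFLOW))  # 0x222003
```

So an IOCTL number is just four small fields packed into one 32-bit value: the device type, the required access, the function number and the buffer passing method.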

So, we are ready to write a simple user land application that will talk to the driver. First, we open the device using function CreateFile. Then, we can send the IOCTL using DeviceIoControl.
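As a rough illustration of these two calls, here is a Python sketch using ctypes. It is my own stand-in, not the send_ioctl.cpp program mentioned below. The device path and the IOCTL value are assumptions based on my reading of the HEVD source, and the helper names make_input and send_ioctl are made up. The part that talks to the driver works only on Windows with HEVD loaded.

```python
import ctypes
import sys

# Assumptions: user-mode device path and stack overflow IOCTL value of HEVD
DEVICE_PATH = r"\\.\HackSysExtremeVulnerableDriver"
HEVD_STACK_OVERFLOW = 0x222003

GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000
OPEN_EXISTING = 3
FILE_ATTRIBUTE_NORMAL = 0x80
INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value


def make_input(size, fill=b"A"):
    """Build the input buffer: `size` copies of a single byte."""
    return fill * size


def send_ioctl(ioctl, data):
    """Open the device and pass `data` to the driver (Windows only)."""
    from ctypes import wintypes

    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.CreateFileW.restype = wintypes.HANDLE
    k32.CreateFileW.argtypes = [
        wintypes.LPCWSTR, wintypes.DWORD, wintypes.DWORD, wintypes.LPVOID,
        wintypes.DWORD, wintypes.DWORD, wintypes.HANDLE]
    k32.DeviceIoControl.restype = wintypes.BOOL
    k32.DeviceIoControl.argtypes = [
        wintypes.HANDLE, wintypes.DWORD, wintypes.LPVOID, wintypes.DWORD,
        wintypes.LPVOID, wintypes.DWORD, ctypes.POINTER(wintypes.DWORD),
        wintypes.LPVOID]
    k32.CloseHandle.argtypes = [wintypes.HANDLE]

    # Step 1: open the device created by the driver
    device = k32.CreateFileW(DEVICE_PATH, GENERIC_READ | GENERIC_WRITE, 0,
                             None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None)
    if device is None or device == INVALID_HANDLE_VALUE:
        raise ctypes.WinError(ctypes.get_last_error())

    try:
        # Step 2: send the IOCTL together with the input buffer
        in_buf = ctypes.create_string_buffer(data, len(data))
        returned = wintypes.DWORD(0)
        ok = k32.DeviceIoControl(device, ioctl, in_buf, len(data),
                                 None, 0, ctypes.byref(returned), None)
        if not ok:
            raise ctypes.WinError(ctypes.get_last_error())
        return returned.value
    finally:
        k32.CloseHandle(device)


if __name__ == "__main__" and sys.platform == "win32":
    count = send_ioctl(HEVD_STACK_OVERFLOW, make_input(16))
    print("IOCTL sent, bytes returned:", count)
```

The argtypes and restype declarations are there so that the handle is not truncated on 64-bit Python; the logic itself is just CreateFile followed by DeviceIoControl.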

Below you can see a tiny example. This application sends the STACK_OVERFLOW IOCTL to the driver: send_ioctl.cpp


Try to compile this program and deploy it on the Debugee machine. Start the DebugView and observe DebugStrings printed by the driver.

If you enabled printing DebugStrings on the Debugger machine, you should see similar output:

As we can see, the driver got our input and reported about it.

Exercise: let’s have a crash!

As an exercise, I created a small client for HEVD that allows sending it various IOCTLs with an input buffer of the requested length. You can find the source code here:

https://github.com/hasherezade/wke_exercises/tree/master/task1

..and the compiled 32 bit binary here.

Try to play with various IOCTLs, till you get the crash. Because the Debugee runs under the control of the Debugger, you should not get a blue screen – instead, WinDbg will get triggered. Try to make a brief crash analysis for every case. Start from printing the information by:

!analyze -v

Some other helpful commands:

k - stack trace
kb - stack trace with parameters
r - registers
dd [address]- display data as DWORD starting from the address

For more, check the WinDbg help file:

.hh

In our sample application, the user buffer is filled with “A” -> ASCII 0x41 (https://github.com/hasherezade/wke_exercises/blob/master/task1/src/main.cpp#L34):

RtlFillMemory(inBuffer, bufSize, 'A');

So, wherever we see it in the crash analysis, it means the particular data can be filled by the user.
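When going through registers and stack dumps it may help to have this check written down. The sketch below is only an illustration: the helper name looks_user_controlled is my own and the sample values are hypothetical, not taken from a real crash.

```python
FILL_BYTE = 0x41  # 'A', the byte used to fill the input buffer


def looks_user_controlled(value, width=4):
    """True if every byte of a `width`-byte value equals the fill byte."""
    pattern = int.from_bytes(bytes([FILL_BYTE]) * width, "little")
    return value == pattern


# Hypothetical register values copied from a crash analysis
print(looks_user_controlled(0x41414141))  # True
print(looks_user_controlled(0x8292A1B4))  # False
```

A value made only of 0x41 bytes in the instruction pointer or in a return address is a strong hint that our buffer reached that place.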

Example #1

Example #2

Mind the fact that triggering the same vulnerability can give you a different output, depending on the immediate cause of the crash, which is related to e.g. the size of the overflow, the current layout of the memory, etc.

Part 3:
https://hshrzd.wordpress.com/2017/06/22/starting-with-windows-kernel-exploitation-part-3-stealing-the-access-token/

Appendix


Starting with Windows Kernel Exploitation – part 1 – setting up the lab

Recently I started learning Windows Kernel Exploitation, so I decided to share some of my notes in the form of a blog.

This part will be about setting up the lab. In further parts I am planning to describe how to do some of the exercises from HackSysExtremeVulnerableDriver by Ashfaq Ansari.

I hope someone will find this useful!

What I use for this part:

  • Kali Linux – as a host system (you can use anything you like)
  • VirtualBox
  • 2 Virtual Machines: Windows 7 32 bit (with VirtualBox Guest Additions installed) – one will be used as a Debugger and another as a Debugee
  • WinDbg (you can find it in Windows SDK)

When we do userland debugging, we can have the debugger and the debuggee on the same machine. In the case of kernel debugging this is no longer possible – we need full control over the debuggee operating system. Also, when we interrupt the execution, the whole operating system will freeze. That’s why we need two virtual machines with separate roles.

Setting up the Debugger

The Debugger is the machine from where we will be watching the Debugee. That’s why we need to install WinDbg there, along with symbols that will allow us to interpret system structures.

In order to install WinDbg we need to download Windows SDK (depending on the version of Windows, sometimes we will also need to install some required updates).

It is important to choose Debugging Tools from the installer options:

install.png

Once we have WinDbg installed, we should add Symbols. In order to do this, we just need to add an environment variable, to which WinDbg will automatically refer:

_NT_SYMBOL_PATH

… and fill it with the link from where it can download symbols.

https://msdl.microsoft.com/download/symbols

Full variable content may look like this (downloaded symbols will be stored in C:\Symbols):

SRV*C:\Symbols*https://msdl.microsoft.com/download/symbols

Setting up the Debugee

We need to enable the Debugee to be controlled from outside. In order to do this, we add one more option to the boot menu – if we start the machine with this configuration, it is enabled for debugging.
We need to use the bcdedit tool. First we copy the current settings into a new entry, titled e.g. “Debug me”:

bcdedit /copy {current} /d "Debug me"

In return it gives us the GUID of the new entry. We need to copy it and use it to enable debugging on this entry:

bcdedit /debug {MY_GUID} on
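If you prefer not to copy the GUID by hand, the step can be sketched in Python as below. This is only an illustration: the helper name debug_command is my own, and the sample output line is an assumption about how bcdedit reports the copied entry, so adjust the parsing if your system prints something different.

```python
import re

# A GUID in braces, e.g. {01234567-89ab-cdef-0123-456789abcdef}
GUID_RE = re.compile(
    r"\{[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\}")


def debug_command(copy_output):
    """Extract the new entry's GUID and build the follow-up command."""
    match = GUID_RE.search(copy_output)
    if match is None:
        raise ValueError("no GUID found in bcdedit output")
    return "bcdedit /debug {} on".format(match.group(0))


# Hypothetical output of: bcdedit /copy {current} /d "Debug me"
sample = ("The entry was successfully copied to "
          "{01234567-89ab-cdef-0123-456789abcdef}.")
print(debug_command(sample))
```

The printed line is the command to run next, with the placeholder {MY_GUID} already replaced by the real identifier.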

At the end we can see the settings where the debugging interface will be available:

bcdedit /dbgsettings

Setting up the connection between the Debugger and the Debuggee

The Debugger and the Debuggee will be communicating via the serial port COM1, which will be emulated in the host system by a named pipe. It is very simple to configure: we just have to make sure that the Debugger and the Debuggee have the same pipe name set. The Debugger will be creating the pipe, while the Debuggee will be connecting to the existing one (that’s why we always have to run the Debugger first):

I use Linux as my host system, so I chose as a pipe name:

/tmp/wke_pipe

Note that if you are using Windows as your host system, your pipe name will have to follow a different convention. Example:

\\.\pipe\wke_pipe

Read more: https://en.wikipedia.org/wiki/Named_pipe

Testing the connection

We have everything set up, now we just need to test if it works correctly! Let’s start the Debugger first, run WinDbg, and make it wait for the connection from the Debugee. Example:

File->Kernel Debug

file_kd

We are choosing COM as an interface:

kernel_debugging

Then we will run the Debugee machine, and when we see that it got connected to the pipe, we will send it an interrupt. Example:

The Debugee is connected to the pipe:

connected_to_pipe.png

Now we can interrupt it, clicking Debug->Break:

debug_break

If we get the kd prompt, it means we are in control of the Debugee:

kd_prompt.png

See the full process on the video:

The Debugee is frozen, waiting for instructions from the Debugger. With the ‘g’ command we can release the Debugee and let it run further:

run_further

Part 2:
https://hshrzd.wordpress.com/2017/06/05/starting-with-windows-kernel-exploitation-part-2/


Hijacking extensions handlers as a malware persistence method

Recently I gave a presentation titled “Wicked malware persistence methods” (read more here). After releasing the slides I got questions about some of the demonstrated methods – especially about the details of extension handler hijacking – so, I decided to explain it in a blog post.

As an introduction, you can see a video demonstrating how it looks in action:

The demo app is open source and you can find it on my github:

https://github.com/hasherezade/persistence_demos/tree/master/extension_hijack

Basically, the goal was to deploy malware each time the user clicks a file with some defined extension – in a way that no change in the default behavior will be noticed. For demonstration purposes, instead of malware I simply used calc.exe 😉

How does the extension handling work?

On Windows, extensions that are known by the operating system are defined in the registry. For example, we have an .html extension:

Each of the defined extensions may be connected with some handler, that is also defined in the registry. In my case, .html files are handled by FirefoxHTML:

The handler has various features defined – but the most important is command:


The command defines what action has to be taken when a file with the particular extension is clicked.
Let’s take a closer look at the above command:

C:\Program Files\Mozilla Firefox\firefox.exe -osint -url "%1"

As we can see, it runs firefox.exe with some parameters – one of them (%1) is the name of the file that was clicked. Thanks to this, Firefox opens the clicked file.
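The substitution itself can be pictured with a few lines of Python. This is a simplified model, assuming that only the %1 placeholder is used (the real shell supports more of them); the function name and the sample path are made up for illustration.

```python
def expand_command(template, clicked_file):
    """Fill the %1 placeholder with the path of the clicked file."""
    return template.replace("%1", clicked_file)


handler = r'C:\Program Files\Mozilla Firefox\firefox.exe -osint -url "%1"'

# Hypothetical path of the file that the user clicked
print(expand_command(handler, r"C:\Users\tester\Desktop\index.html"))
```

Whatever program sits at the beginning of the command receives the clicked file as its argument, which is the detail that the rest of this post relies on.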

How to abuse it for executing a malware?

Knowing the above, we can overwrite the command with our own, which will deploy malware. For example:

Now, the .html extension is not handled by the firefox.exe, but by the ProxyApp.exe (see the code here) – that deploys firefox.exe, but also a malicious app (or calc in our demo case 😉 ). From the point of view of the user nothing has changed – firefox opens the document as it was before – but in the background another application starts to run…

If we modify the handlers defined under the keys of the current user, we do not need any type of privilege elevation to install this type of hijack.

Global and local handlers – and how to hijack even more extensions

Extension handlers are defined at two levels – global, that are defined in the

HKEY_CLASSES_ROOT

…and local – that are defined for a particular user, i.e.:

HKEY_USERS\S-1-5-21-1929933236-2258453022-3626796957-1000_Classes

Their hierarchy of execution goes like this: if no local extension/extension handler is defined, then the global one  is executed.

This gives us a very important advantage…

Obviously, without Administrator privileges we cannot modify the keys under HKEY_CLASSES_ROOT – but still we can read them. Also, we can modify the keys belonging to the current user.

So, the trick is simple – read the extensions handlers defined globally, rewrite them locally and then install the hijack.
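The lookup order described above can be modelled with two plain dictionaries. This sketch does not touch the registry at all: the function name, the handler names and the contents of both dictionaries are hypothetical, and they only show why a local entry shadows the global one.

```python
def resolve_handler(extension, local, global_):
    """Return the handler for an extension: the user's entry wins if present."""
    if extension in local:
        return local[extension]
    return global_.get(extension)


# Hypothetical registry contents, reduced to plain dictionaries
global_classes = {".html": "FirefoxHTML", ".txt": "txtfile"}
local_classes = {".html": "HijackedHTML"}

print(resolve_handler(".html", local_classes, global_classes))  # HijackedHTML
print(resolve_handler(".txt", local_classes, global_classes))   # txtfile
```

An extension that has only a global definition keeps its original handler until a local copy is created for it, and creating that copy requires nothing more than write access to the current user’s keys.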

Does it work for every version of Windows?

I tested it under Windows 7 32/64 bit, Windows 8.1 32/64 bit and Windows 10 32 bit – and it worked stealthily, without any problems.

Win8.1 64 bit:

Win10 32 bit:

Appendix

https://attack.mitre.org/wiki/Technique/T1042
