SAS CTF and the many ways to persist a kernel shellcode on Windows 7

On May 18, 2024, Kaspersky’s Global Research & Analysis Team (GReAT), with the help of its partners, held the qualifying stage of the SAS CTF, an international competition for cybersecurity experts that is part of the Security Analyst Summit conference. More than 800 teams from all over the world took part in the event, solving challenges based on real cases that Kaspersky GReAT had encountered in its work, but a couple of challenges remained unsolved. One of those challenges was based on a security issue that allows kernel shellcode to be hidden in the system registry and executed during system boot on a fully updated Windows 7/Windows Server 2008 R2, due to an incomplete fix for the CVE-2010-4398 vulnerability. Although security updates and technical support for Windows 7 ended in early 2020, the fact that the released patch only partially addressed the issue was known long before that, and we saw this flaw exploited in a targeted attack in 2018. At the time, we notified Microsoft about the in-the-wild exploitation, but Microsoft declined to address it because the technique requires attackers to have administrator privileges. In this blog post, we provide technical details about this flaw and the SAS CTF task based on it.

Vulnerability details

There is a design flaw in older versions of Windows operating systems (Windows NT 4.0 through Windows 7) that allows a kernel shellcode to persist and be launched at system boot by writing specially crafted data to some of the many locations in the system registry.

The Windows kernel API provides a function called RtlQueryRegistryValues that can be used to query multiple values from a registry subtree with a single call.

RtlQueryRegistryValues syntax

The values to be queried by this function are defined by the QueryTable parameter, which contains a pointer to a table consisting of _RTL_QUERY_REGISTRY_TABLE structures.

_RTL_QUERY_REGISTRY_TABLE structure definition

Each table entry defines the name of the value to query, its default type (e.g., REG_NONE, REG_BINARY, REG_DWORD, REG_SZ etc.; this is optional) and default data, the address of the buffer to store the value or the address of the callback function, and flags that control how to query this value.

One of the supported flags, RTL_QUERY_REGISTRY_DIRECT, causes RtlQueryRegistryValues not to execute a callback function (pointed to by the entry’s QueryRoutine field), but to store the queried value directly to the provided buffer (pointed to by the entry’s EntryContext field).

While writing data directly to the provided buffer instead of executing a callback may be more convenient, it leads to unexpected consequences if the requested value in the registry is for some reason of an unexpected type. For instance, if the code expects a value of type REG_DWORD, which has a fixed size of four bytes, but receives a value of type REG_BINARY, which is variable in size, the value may not fit fully into the prepared buffer. As a result, if RtlQueryRegistryValues returns more data than the calling function expected, a buffer overflow occurs that can be easily exploited on Windows 7 and older systems because of the lack of stack cookies.
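To make the mechanics concrete, here is a toy Python model of the overflow. It is a deliberately simplified sketch and not the kernel implementation: the real function has additional size-handling rules for non-string data, and the frame layout, the direct_store helper and the original return address are all hypothetical. Only the target address 0xfffff78000000800 comes from the attack described in this post.

```python
import struct

ORIGINAL_RETURN = 0xFFFFF80002A01234  # hypothetical saved return address
ATTACKER_TARGET = 0xFFFFF78000000800  # KUSER_SHARED_DATA + 0x800 (from this post)


def build_frame() -> bytearray:
    """Toy stack frame: 4-byte DWORD slot, 4 bytes of padding, 8-byte return address."""
    return bytearray(struct.pack("<I4xQ", 0, ORIGINAL_RETURN))


def direct_store(frame: bytearray, offset: int, data: bytes) -> None:
    """Copy a registry value into the frame without checking its type or size."""
    frame[offset:offset + len(data)] = data


def saved_return(frame: bytearray) -> int:
    """Read back the 8-byte return address stored at offset 8."""
    return struct.unpack_from("<Q", frame, 8)[0]


# Expected case: a 4-byte REG_DWORD fits the slot and nothing else changes.
ok = build_frame()
direct_store(ok, 0, struct.pack("<I", 2))

# Mismatch: a 16-byte REG_BINARY spills over the slot and replaces the return address.
bad = build_frame()
direct_store(bad, 0, b"\x00" * 8 + struct.pack("<Q", ATTACKER_TARGET))

print(hex(saved_return(ok)))
print(hex(saved_return(bad)))
```

In the well-formed case the saved return address survives; in the mismatched case the trailing eight bytes of the oversized value land exactly where the return address lives.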

To address this issue, Microsoft has implemented and encouraged developers to use an additional flag, RTL_QUERY_REGISTRY_TYPECHECK, which is intended to be used in conjunction with the RTL_QUERY_REGISTRY_DIRECT flag to check that the type of the requested value matches the type expected by the caller.

Note from RtlQueryRegistryValues documentation
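The way the expected type travels with the flags can be sketched in a few lines of Python. The constant values below are quoted from memory of the WDK header wdm.h and should be verified against your own headers; make_flags and query_allowed are hypothetical helpers that only model the check, they are not part of any Windows API.

```python
# Registry value types (winnt.h)
REG_SZ, REG_BINARY, REG_DWORD = 1, 3, 4

# Flag values as recalled from wdm.h; treat them as assumptions to verify.
RTL_QUERY_REGISTRY_DIRECT = 0x00000020
RTL_QUERY_REGISTRY_TYPECHECK = 0x00000100
RTL_QUERY_REGISTRY_TYPECHECK_SHIFT = 24
RTL_QUERY_REGISTRY_TYPECHECK_MASK = 0xFF << RTL_QUERY_REGISTRY_TYPECHECK_SHIFT


def make_flags(expected_type: int, typecheck: bool = True) -> int:
    """Build the Flags field for a direct query, optionally with a type check."""
    flags = RTL_QUERY_REGISTRY_DIRECT
    if typecheck:
        flags |= RTL_QUERY_REGISTRY_TYPECHECK
        flags |= expected_type << RTL_QUERY_REGISTRY_TYPECHECK_SHIFT
    return flags


def query_allowed(flags: int, actual_type: int) -> bool:
    """Model of the check: without TYPECHECK any type is accepted."""
    if not flags & RTL_QUERY_REGISTRY_TYPECHECK:
        return True
    expected = (flags & RTL_QUERY_REGISTRY_TYPECHECK_MASK) >> RTL_QUERY_REGISTRY_TYPECHECK_SHIFT
    return expected == actual_type


checked = make_flags(REG_DWORD)
unchecked = make_flags(REG_DWORD, typecheck=False)
print(hex(checked))
print(query_allowed(checked, REG_BINARY), query_allowed(unchecked, REG_BINARY))
```

With the type check in place, a REG_BINARY value is rejected where a REG_DWORD was requested; without it, the mismatched value is accepted, which is the situation the unpatched code paths remain in.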

However, this is by no means a complete fix, and for Windows 7 Microsoft itself started using the new flag only where it was absolutely necessary to address possible privilege escalation vulnerabilities. As for the vulnerable registry/code paths that could be accessed with admin rights, they were not patched, giving attackers the opportunity to stealthily store and execute kernel shellcode.

In one of the attacks, we observed an APT actor using two DirectX drivers for exploitation – “dxgmms1.sys” and “dxgkrnl.sys” – but a quick look revealed about a dozen vulnerable drivers included in the Windows 7/Windows Server 2008 R2 base package.

Exploitation

To execute kernel shellcode, attackers exploit multiple stack buffer overflows in two drivers using the RtlQueryRegistryValues function. This is done in two stages.

In the first stage, attackers exploit the insecure use of the RtlQueryRegistryValues function in the “dxgmms1.sys” driver. The vulnerable code queries several registry values from the path “HKLM\SYSTEM\ControlSet001\Control\GraphicsDrivers\MemoryManager”, and making these registry entries bigger than expected results in several buffer overflows. Attackers can use this to write the shellcode to a fixed location in the kernel memory at the address 0xfffff78000000800, which is an address of the KUSER_SHARED_DATA structure + 0x800.

Exploitation of “dxgmms1.sys” driver

In the second stage, attackers exploit the insecure use of the RtlQueryRegistryValues function in the “dxgkrnl.sys” driver – the registry values used by the vulnerable code are located at “HKLM\SYSTEM\ControlSet001\Control\GraphicsDrivers”. This allows attackers to overwrite the return address of one of the called functions with an address of 0xfffff78000000800, resulting in the execution of the shellcode written in the first stage of exploitation.

Exploitation of “dxgkrnl.sys” driver
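The address arithmetic used in both stages is easy to check. The short Python sketch below computes the shellcode address from the KUSER_SHARED_DATA base and shows how a 64-bit pointer looks once it is stored as little-endian bytes, which is the form it takes inside raw registry data. The helper name le_hex is ours.

```python
import struct

KUSER_SHARED_DATA = 0xFFFFF78000000000  # fixed kernel-mode address on x64 Windows 7
SHELLCODE_OFFSET = 0x800


def le_hex(address: int) -> str:
    """Hex string of a 64-bit address as it is laid out in memory (little endian)."""
    return struct.pack("<Q", address).hex()


shellcode_address = KUSER_SHARED_DATA + SHELLCODE_OFFSET

print(hex(shellcode_address))      # 0xfffff78000000800
print(le_hex(shellcode_address))   # 0008000080f7ffff
print(le_hex(KUSER_SHARED_DATA))   # 0000000080f7ffff
```

When hunting for such pointers in a hex dump of a hive, it is the byte-reversed string that has to be searched for, not the address as it is printed by a debugger.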

All registry values used during exploitation are expected to be of type REG_DWORD, but the attackers have set them to malicious values of type REG_SZ/REG_BINARY. Since the SYSTEM hive is explicitly trusted, the data type mismatch is ignored and this results in successful exploitation.
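This mismatch is also what makes the technique detectable. Below is a minimal sketch of such a check in Python. The value names are the ones that appear in the timeline output later in this post; find_type_mismatches and load_values are hypothetical helpers, and load_values assumes the third-party regipy package with its RegistryHive.get_key and iter_values calls, so treat that part as an assumption to verify against the regipy version you have installed.

```python
from typing import Dict, Iterable, List, Tuple

# Values expected to be REG_DWORD (names taken from the timeline in this post).
EXPECTED_TYPES: Dict[str, str] = {
    "TdrLevel": "REG_DWORD",
    "TdrDdiDelay": "REG_DWORD",
    "TdrDebugMode": "REG_DWORD",
}


def find_type_mismatches(
    values: Iterable[Tuple[str, str]], expected: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return (name, actual_type) for every value whose type differs from the expected one."""
    return [(name, vtype) for name, vtype in values
            if name in expected and vtype != expected[name]]


def load_values(hive_path: str, key_path: str) -> List[Tuple[str, str]]:
    """Read (name, type) pairs from a hive; requires regipy (imported lazily)."""
    from regipy.registry import RegistryHive  # third-party, assumed API
    key = RegistryHive(hive_path).get_key(key_path)
    return [(value.name, value.value_type) for value in key.iter_values()]


# Sample input mirroring the GraphicsDrivers key from the challenge hive.
sample = [
    ("DxgKrnlVersion", "REG_DWORD"),
    ("UseXPModel", "REG_DWORD"),
    ("TdrLevel", "REG_BINARY"),
    ("TdrDdiDelay", "REG_BINARY"),
    ("TdrDebugMode", "REG_BINARY"),
]
print(find_type_mismatches(sample, EXPECTED_TYPES))
```

Against a real hive, the same check could be run as find_type_mismatches(load_values("SYSTEM", r"\ControlSet001\Control\GraphicsDrivers"), EXPECTED_TYPES).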

The SAS CTF challenge

The beginning

You are presented with a README.txt note and three other files:

The SOFTWARE and SYSTEM files are exactly what their names suggest: registry hives of a Windows system.

Now, our first goal would be to find the piece of registry that is causing the VM to crash. This can be done in several ways, such as trying to find a piece of executable code in the registry hives (there is a NOP sled at offset 0x92D675 in the SYSTEM hive). But let’s try to reproduce the crash instead.
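For those who prefer the first route, a NOP sled search is a few lines of Python. This is a rough sketch under our own assumptions: the find_nop_sleds helper and the minimum run length of 16 bytes are arbitrary choices, and long runs of 0x90 can also occur in perfectly benign data, so every hit still needs a manual look.

```python
import re
from typing import List, Tuple


def find_nop_sleds(data: bytes, min_len: int = 16) -> List[Tuple[int, int]]:
    """Return (offset, length) for every run of at least min_len 0x90 bytes."""
    pattern = re.compile(rb"\x90{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]


# Synthetic buffer: a short run that is ignored, then a 32-byte sled at offset 108.
demo = b"A" * 100 + b"\x90" * 4 + b"B" * 4 + b"\x90" * 32 + b"\xcc"
print(find_nop_sleds(demo))
```

On the challenge file this would be called as find_nop_sleds(open("SYSTEM", "rb").read()), and the reported offsets can then be inspected in a hex editor or disassembler.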

Identifying the VM and the OS

We are going to use regipy to parse and dump the registry hives. By dumping the SYSTEM hive, we can easily recognize the VirtualBox devices:

Just to be sure, we can even find the right version of the VirtualBox additions package, which is 6.1.46:

We also can identify the exact Windows build to run, which turns out to be Windows 7 SP1 x64:

Now let’s grab a Windows 7 SP1 VM or install a fresh one in a VirtualBox VM. While the VM is booting, let’s also build a timeline of the registry hive that we may need later:

Now download your favorite Live CD, for example a vanilla Ubuntu Desktop ISO. We will boot it later to transplant the registry hives into the Windows system.

Install the VirtualBox guest additions from the official ISO to match what was installed in the original system. The clues in the README note (video driver!), the list of installed drivers and the shimcache (try “regipy-plugins-run -p shimcache -o output.txt SYSTEM && cat output.txt”, it will mention running dxdiag.exe) suggest that the system should be configured with Direct3D support, and this is crucial to triggering the exploit.

Once installed, “dxdiag.exe” should show “Enabled” for Direct3D on the VM:

Set up the debugger

Before we continue, let’s turn on kernel debugging inside the VM. Since we know there should be a BSOD, we will need it. You can also do this later, as long as you have backed up the original registry hives: restore them, boot into the working system and run the required commands.

We will also set up a second Windows VM with the Windows Debugger and connect it to our target VM using a pipe-based virtual COM port. Start WinDbg on the debugger system (“Kernel Debug”), reboot the VM and you should see the kernel debugger connect. If not, check the COM port connection between the machines. It is also possible to use the host machine to run the debugger.

Crash!

Once it is working, replace the SOFTWARE and SYSTEM hives. Back up the original files, copy the challenge hives to the VM (drag and drop, or via a share), reboot into a Live CD, mount the NTFS volume, then copy the hives to “mountpoint/Windows/System32/config/”. Reboot and you should get an infinite BSOD loop or, if the debugger is attached, a break into the debugger.

Without the debugger it looks like this:

With the debugger, WinDbg output looks like this:

Analyzing the crash

We need to investigate this crash. Now, we can either extract the crash dump and inspect it offline, or debug live with our debugger machine (host, or a second VM) – let’s continue with the latter course. Make sure you can download the correct symbols, set up the symbol path, and execute “.reload /f” in WinDbg to force the download.

By inspecting the addresses on the stack around the stack pointer we can find an address inside “dxgkrnl”:

Further on in the stack we see the return addresses from nt!ObCreateObject:

Now we have a choice: either analyze the vulnerability in dxgkrnl and dxgmms1 until we understand exactly what is happening, or take a more hacky route, guided by the task note (“I tried to fix the registry but now it bluescreens all the time”):

check the memory around the crash pointer. At the address +0x800 from the crash site you can clearly see a shellcode that doesn’t belong to any module and can be analyzed;

search for the crash pointer address in the registry, using the timeline we generated and looking for “recent” changes.

Nothing. Let’s reverse the byte order (it may be a binary string, little endian):

$ grep -i 0000000080f7ffff timeline-system.txt
2024-05-16
13:39:56.411698+00:00,\ControlSet001\Control\GraphicsDrivers,5,"[Value(name
='DxgKrnlVersion', value=8197, value_type='REG_DWORD', is_corrupted=False),
Value(name='UseXPModel', value=0, value_type='REG_DWORD',
is_corrupted=False), Value(name='TdrLevel',
value='00000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000080f7ffff',
value_type='REG_BINARY', is_corrupted=False), Value(name='TdrDdiDelay',
value='03000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000080f7ffff',
value_type='REG_BINARY', is_corrupted=False), Value(name='TdrDebugMode',
value='02000000000000000500000000000000030000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000080f7ffff', value_type='REG_BINARY', is_corrupted=False)]"
2024-05-16
13:41:27.599198+00:00,\ControlSet002\Control\GraphicsDrivers,5,"[Value(name