The (Anti-)EDR Compendium

EDR functionality and bypasses in 2024, with a focus on undetected shellcode loaders.

Currently, there is a big focus on memory encryption for implants:

  • SWAPPALA / SLE(A)PING
  • Thread Pool / Pool Party
  • Gargoyle
  • Ekko
  • Cronos
  • Foliage

Also, there is a lot of work involving call stack spoofing:

  • ThreadStackSpoofer
  • CallStackSpoofer
  • AceLdr
  • CallStackMasker
  • Unwinder
  • TitanLdr

This is cool, but what if I told you it is not strictly necessary? Read on.

This is part of a three article series:

The target audience is confused Red Teamers. Basic knowledge in anti-EDR and maldev is recommended.

I am not an EDR expert. I’ve just read “Evading EDR” by Matt Hand and Elastic Security Labs, and you should too.

This article gets updated regularly, and is not mobile friendly. Last update 01.10.2024.

I mentioned parts of this in a talk at HITB BKK 2024: My First and Last Shellcode Loader. Shortened, but with more background.

Intro

What’s an EDR

EDR is “Endpoint Detection and Response”. It’s an agent deployed on each machine, which observes events generated by the OS to identify attacks. If it detects something, it will generate an alert and send it to the SIEM or SOAR, where it will be looked at by human analysts. “Response” means the actions performed after a threat has been identified, like isolating the host, which is not part of this article. EPP is “Endpoint Protection Platform”, and will attempt to interrupt attacks instead of just detecting them.

The UI of MDE (Microsoft Defender for Endpoint): MDE UI Overview

We can see the EDR detected something, and attempts to give the analyst more information about the incident: involved processes, their arguments and hashes, child processes etc. In the end, the analyst has to decide whether it’s a false positive or an active attack. But generally the Red Team wants to avoid raising any alarms, and tries to stay under the radar.

EDR attempts to implement detections higher up on the pyramid of pain, mostly on TTPs: Tactics, Techniques, and Procedures.

Pyramid Of Pain

Idealized EDR

Knowing and understanding even just one EDR is hard, and knowing all EDRs is impossible. The EDR written about here is an abstract version of an ideal EDR. Not so much what is being done today, but what is theoretically possible with the available Windows sensor/telemetry infrastructure. The closest inspiration is Microsoft Defender for Endpoint (MDE), which I used for testing.

I will not teach you how to bypass a specific EDR, but how to think conceptually about the attack surface to implement your own techniques. The actual inner workings of an EDR are mostly unknown (except in the case of Elastic), and are considered a black box. While we mostly know what kind of information an EDR receives, it is not so clear how the information is being used and correlated internally.

As hackers, we are interested in the input and output of a system. This article should give an overview of the input.

Shellcode Loader

A loader will load a shellcode. The shellcode is usually our beacon, like CobaltStrike, Sliver, or Metasploit.

The loader contains the encrypted shellcode, loads it into memory, and executes it.

┌───────────┐   ┌────────────┐    ┌────────┐
│           │   │            │    │        │
│  Loader   ├──►│ C2 Beacon  ├───►│ Profit │
│           │   │ Shellcode  │    │        │
│           │   │            │    │        │
└───────────┘   └────────────┘    └────────┘

The goal is for this process to go undetected by the EDR during Initial Access (IA).

Shellcode Loader Example

When executing shellcode, the usual steps are:

  • Allocate a memory region with read-write permissions
  • Copy shellcode into that region (decrypt it too)
  • Change permissions of memory region to read-execute
  • Execute the shellcode

Which looks like this in C, but is similar in most languages:

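    char *shellcode = "\xAA\xBB...";  // placeholder bytes
    DWORD result;                     // receives the previous protection
    // 0x3000 == MEM_COMMIT | MEM_RESERVE
    char *dest = VirtualAlloc(NULL, 0x1234, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    memcpy(dest, shellcode, 0x1234);
    VirtualProtect(dest, 0x1234, PAGE_EXECUTE_READ, &result);
    (*(void(*)())(dest))();  // jump to dest: execute shellcode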
┌──────────┐                                  ┌───────────────┐
│          │      ┌─────────────────┐         │ Memory Region │
│          │      │ Alloc           │         │               │
│          │      │                 ├────────►│               │
│          │      └─────────┬───────┘         │               │
│          │                │                 │               │
│          │      ┌─────────▼───────┐         │               │
│ Payload  │      │ Copy & Decrypt  ├─────────►               │
│          ├─────►│                 │         │               │
│          │      └─────────┬───────┘         │               │
│          │                │                 │               │
│          │      ┌─────────▼───────┐         │               │
│          │      │ Make Executable ├────────►│               │
│          │      │                 │         │               │
│          │      └─────────┬───────┘         │               │
│          │                │                 │               │
│          │      ┌─────────▼───────┐         │               │
│          │      │ Execute         ├─────────►               │
│          │      │                 │         │               │
│          │      └─────────────────┘         │               │
└──────────┘                                  └───────────────┘

There are many variations of this simple recipe, some of which focus on injecting shellcode into remote processes. This works the same way: call OpenProcess() on the destination process, and pass the returned handle as the hProcess argument to function calls like VirtualAllocEx(hProcess, ...) and WriteProcessMemory(hProcess, ...). Cross-process access using hProcess is scrutinized more closely by the EDR.

Another typical approach is to call the shellcode by creating a new thread, be it with CreateThread() in your own address space, or CreateRemoteThread() for process injection or module stomping.

The copying itself, here performed by the userspace function memcpy(), can also be done with RtlCopyMemory() or others.

EDR Detection

Bubbles Of Bane

There are three main techniques for detection (of loaders):

  • File scanning: Signatures (“yara”) are matched against files
  • Memory scanning: Signatures (“yara”) are matched against process memory
  • Telemetry/Behaviour: Actions performed by the process (mostly via OS)

For example, Microsoft Defender Antivirus implements the AV scanning, while Microsoft Defender for Endpoint (MDE) is an EDR which heavily depends on telemetry to perform behaviour analysis. If it feels the need, it will scan the memory of processes too.

I call this the “Bubbles of Bane”:

            ┌───────────────────┐
            │         Memory    │
┌───────────┼─────┐   Scanning  │
│ AV        │     │             │
│ Signature │     │             │
│ Scanning  │     │             │
│       ┌───┼─────┼────────┐    │
│       │   │     │        │    │
│       │   └─────┼────────┼────┘
│       │         │        │     
└───────┼─────────┘        │     
        │                  │     
        │    Telemetry     │     
        │    Behaviour     │     
        │    Analysis      │     
        │                  │     
        └──────────────────┘     

Most .exe file implants generated out of the box by C2 frameworks are signatured, and therefore not useful. The first option is to obfuscate the code, which is hard. For an example, see Harnessing the Power of Cobalt Strike Profiles for EDR Evasion.

The alternative is to use a loader, which carries the implant as payload and loads it when executed. Most often this technique uses shellcode generated by the C2. Alternatively, the DLL or EXE output of the C2 can be used; it is possible to convert these into either shellcode or a DLL, for example with Donut. The advantage of using a loader is that the payload can be encrypted, so the only thing which needs to be hidden from AV file signature scanning is the loader itself.

Public loaders are usually signatured sooner or later. But they are easy to write in basically all languages Windows understands (C, .NET C#, VBA, VBS, PowerShell, JScript…). Simple self-written loaders are surprisingly effective, as this article will show.

Instead of scanning a file, the EDR can also scan the memory of processes. This defeats loaders, as the payload code has to be unencrypted in memory to be executed. To avoid detection in memory, the process needs to encrypt its memory regions when sleeping. Then at the time the EDR scans the process, nothing suspicious should be in memory. Memory scanning is a performance-intensive operation, and is only done if the EDR thinks it’s worthwhile. This decision is based on the telemetry collected (or scans happen at regular intervals “on-demand”, like once a day).
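
To make the effect of encryption on a signature scan concrete, here is a toy sketch in Python. The marker bytes, the single-byte XOR key and the helper names are all made up for illustration; a real scanner uses far richer rules than a substring search, and real implants do not use a one-byte XOR.

```python
SIGNATURE = b"EXAMPLE-BEACON-MARKER"  # hypothetical signature, not from a real rule set


def xor_bytes(data: bytes, key: int) -> bytes:
    """Toy stand-in for sleep-time memory encryption (single-byte XOR)."""
    return bytes(b ^ key for b in data)


def scan(region: bytes) -> bool:
    """Toy memory scan: report a hit if the signature occurs in the region."""
    return SIGNATURE in region


payload_region = b"\x90" * 16 + SIGNATURE + b"\x00" * 16

awake = payload_region                    # plain bytes while the code is running
asleep = xor_bytes(payload_region, 0x5A)  # same bytes, encrypted while sleeping

print(scan(awake))   # True
print(scan(asleep))  # False
```

The scanner only wins if it looks at the region while it is in the "awake" state, which is why the timing of the scan matters.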

Typical memory scanners are PE-sieve and Moneta.

Most of the detection use cases depend on telemetry: important function calls into Windows generate events which are processed, correlated and analysed by the EDR. Examples are changing the permissions of memory regions, creating processes and threads, copying memory and similar.

For example, if we use a loader to bypass AV, and simply allocate a memory region for our shellcode, we don’t generate much telemetry for the EDR. But the payload will be detectable by a memory scanner. If we introduce memory encryption to bypass the memory scanner, then we generate more telemetry, which in turn can be used to detect the memory encryption.

Bubbles of Bane with Ekko memory encryption:

            ┌───────────────────┐
            │         Memory    │
┌───────────┼─────┐   Scanning  │
│ AV        │     │             │
│ Signature │     │             │
│ Scanning  │     │             │
│       ┌───┼─────┼────────┐    │
│       │   │     │ [EKKO] │    │
│       │   └─────┼────────┼────┘
│       │         │        │     
└───────┼─────────┘        │     
        │                  │     
        │    Telemetry     │     
        │    Behaviour     │     
        │    Analysis      │     
        │                  │     
        └──────────────────┘     

AV Signature Scanning

When a file is being written to disk, it will be scanned by the AV. The AV has a database of signatures of known-bad malware (like yara rules). File write events are generated by the OS and delivered to the AV via AMSI or kernel minifilter.

The signature scanning is based on the static content of the file. The PE headers will be parsed, and the content of the PE sections scanned. This happens before the EXE is executed. Upon positive detection, the file will be removed before execution.
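
As a rough sketch of what “parsing the PE headers” involves, the following Python walks from the MZ header to the section table using only the struct module. It runs against a tiny synthetic header blob built in the same snippet (not a loadable executable); the function names are my own, and a real scanner handles many more fields and malformed inputs.

```python
import struct


def list_sections(image: bytes) -> list[tuple[str, int, int]]:
    """Return (name, raw_offset, raw_size) for each section of a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ header")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)  # offset of "PE\0\0"
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature.
    _machine, n_sections, _ts, _symtab, _nsyms, opt_size, _chars = struct.unpack_from(
        "<HHIIIHH", image, e_lfanew + 4)
    table = e_lfanew + 4 + 20 + opt_size  # section table follows the optional header
    sections = []
    for i in range(n_sections):
        entry = table + i * 40  # each section header is 40 bytes
        name = image[entry:entry + 8].rstrip(b"\x00").decode("ascii", "replace")
        raw_size, raw_offset = struct.unpack_from("<II", image, entry + 16)
        sections.append((name, raw_offset, raw_size))
    return sections


def build_demo_image() -> bytes:
    """Build a minimal synthetic header blob with two section headers."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # e_lfanew points right after the DOS header
    coff = struct.pack("<HHIIIHH", 0x8664, 2, 0, 0, 0, 0, 0)  # 2 sections, no optional header

    def section(name: bytes, raw_size: int, raw_offset: int) -> bytes:
        return struct.pack("<8sIIIIIIHHI", name, 0, 0, raw_size, raw_offset, 0, 0, 0, 0, 0)

    return (bytes(dos) + b"PE\x00\x00" + coff
            + section(b".text", 0x200, 0x400) + section(b".data", 0x200, 0x600))


print(list_sections(build_demo_image()))  # [('.text', 1024, 512), ('.data', 1536, 512)]
```

With the offsets and sizes in hand, a scanner can run its signatures over each section’s raw bytes separately.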

A signature will look similar to a yara rule:

// https://github.com/Yara-Rules/rules/blob/master/malware/APT_APT17.yar (shortened)
rule APT17_Sample_FXSST_DLL 
{
    meta:
        ...        
    strings:
        $x1 = "Microsoft? Windows? Operating System" fullword wide
        $x2 = "fxsst.dll" fullword ascii
        $y1 = "DllRegisterServer" fullword ascii
        $y2 = ".cSV" fullword ascii
        $s1 = "VirtualProtect"
        $s2 = "Sleep"
        $s3 = "GetModuleFileName"
   
   condition:
        uint16(0) == 0x5a4d and filesize < 800KB and ( 1 of ($x*) or all of ($y*) ) and all of ($s*)
}
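
To show how the condition of such a rule is evaluated, here is the same logic written out in plain Python. This is a simplified sketch, not how a yara engine works internally: the fullword modifier is ignored, and the function names are my own.

```python
def wide(text: str) -> bytes:
    """Approximate yara's `wide` modifier: each ASCII character followed by a zero byte."""
    return text.encode("utf-16-le")


def matches_rule(data: bytes) -> bool:
    # NOTE: simplified, the `fullword` modifier is not modelled here.
    x = [wide("Microsoft? Windows? Operating System") in data,
         b"fxsst.dll" in data]
    y = [b"DllRegisterServer" in data,
         b".cSV" in data]
    s = [b"VirtualProtect" in data,
         b"Sleep" in data,
         b"GetModuleFileName" in data]
    is_pe = data[:2] == b"MZ"        # uint16(0) == 0x5a4d
    is_small = len(data) < 800 * 1024  # filesize < 800KB
    return is_pe and is_small and (any(x) or all(y)) and all(s)


sample = b"MZ" + b"\x00" * 62 + b"fxsst.dll\x00VirtualProtect\x00Sleep\x00GetModuleFileNameA\x00"
print(matches_rule(sample))                             # True
print(matches_rule(sample.replace(b"Sleep", b"Slee_")))  # False
```

Changing a single one of the required strings is enough to break the match, which is what makes signatures like this brittle.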

A general solution would be code obfuscation, which I will not cover in this article. It generally cannot be reliably applied on compiled code, but needs to be incorporated into the compiling process. That means each tool needs to implement it by itself.

It would solve all our problems: No signatures on-disk or in-memory, and no need to load it, therefore no telemetry.

            ┌───────────────────┐
            │         Memory    │
┌───────────┼─────┐   Scanning  │
│ AV        │     │             │
│ Signature │     │             │
│ Scanning  │     │             │
│       ┌───┼─────┼────────┐    │
│       │   │Obfus│        │    │
│       │   │catio│        │    │
│       │   │n    │        │    │
│       │   └─────┼────────┼────┘
│       │         │        │     
└───────┼─────────┘        │     
        │                  │     
        │    Telemetry     │     
        │    Behaviour     │     
        │    Analysis      │     
        │                  │     
        └──────────────────┘     

https://retooling.io/blog/an-unexpected-journey-into-microsoft-defenders-signature-world
https://avred.r00ted.ch

AV Emulation

The AV component will also perform emulation of the target binary.

Emulation means that the AV will read and interpret the ASM instructions in the .text section by itself. It does not execute them natively, it is not virtualized execution, and also not qemu/bochs full emulation. It’s a CPU emulation, including common Windows syscalls and subsystems.

In pseudocode:

    asm_bytes = [
        0xB8, 0x04, 0x00, 0x00, 0x00,   # mov eax, 4
        0xBB, 0x06, 0x00, 0x00, 0x00,   # mov ebx, 6
        0x01, 0xD8                      # add eax, ebx
    ]

    asm_instructions = disassembler.disasm(asm_bytes)
    # asm_instructions = [
    #     { name = "mov", src = "4", dst = "eax" },
    #     { name = "mov", src = "6", dst = "ebx" },
    #     { name = "add", src = "ebx", dst = "eax" },
    # ]

    for instruction in asm_instructions:
        if instruction.name == "add":
            register[instruction.dst] += register[instruction.src]
        elif instruction.name == "mov":
            ...
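
For completeness, a runnable toy version of the same idea in Python. It decodes only the two instruction forms used in the bytes above (mov r32, imm32 and register-to-register add) and stops after an instruction budget. The function name and the budget parameter are my own; this is a sketch of the principle, nowhere near a real AV emulator.

```python
import struct

REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def emulate(code: bytes, max_instructions: int = 1000) -> dict:
    """Interpret a tiny x86 subset and return the final register values."""
    regs = {name: 0 for name in REGS}
    pc = 0
    executed = 0
    while pc < len(code):
        if executed >= max_instructions:
            raise RuntimeError("instruction budget exhausted")
        opcode = code[pc]
        if 0xB8 <= opcode <= 0xBF:            # mov r32, imm32
            (imm,) = struct.unpack_from("<I", code, pc + 1)
            regs[REGS[opcode - 0xB8]] = imm
            pc += 5
        elif opcode == 0x01:                  # add r/m32, r32
            modrm = code[pc + 1]
            if modrm >> 6 != 0b11:
                raise NotImplementedError("memory operands are not modelled")
            src = REGS[(modrm >> 3) & 7]      # reg field
            dst = REGS[modrm & 7]             # r/m field
            regs[dst] = (regs[dst] + regs[src]) & 0xFFFFFFFF
            pc += 2
        else:
            raise NotImplementedError(hex(opcode))
        executed += 1
    return regs


asm_bytes = bytes([
    0xB8, 0x04, 0x00, 0x00, 0x00,   # mov eax, 4
    0xBB, 0x06, 0x00, 0x00, 0x00,   # mov ebx, 6
    0x01, 0xD8,                     # add eax, ebx
])
print(emulate(asm_bytes)["eax"])  # 10
```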

The AV emulator implements its own “interpreter” for x86 assembly, and re-implements parts of the Windows OS API and syscalls, and with them a virtual file system (for CreateFile()), a virtual registry (for RegOpenKeyEx()), fake processes etc. The advapi32.dll function GetUserNameA() may be implemented to always return “JohnDoe”.
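
A minimal sketch of such a canned environment in Python. The class, the handler names file_exists and registry_read, and the example paths are all hypothetical and only there to illustrate the idea: every answer comes from a table inside the emulator, nothing touches the host.

```python
class FakeEnvironment:
    """Toy emulated environment with a virtual file system and registry."""

    def __init__(self):
        self.files = {r"C:\Users\JohnDoe\Desktop\notes.txt": b"hello"}  # virtual files
        self.registry = {r"HKCU\Software\Example\Installed": 1}         # virtual registry
        self.api_calls = 0                                              # counted for budgeting

    def _handlers(self):
        return {
            "GetUserNameA": lambda: "JohnDoe",                   # always the same answer
            "file_exists": lambda path: path in self.files,      # hypothetical helper
            "registry_read": lambda key: self.registry.get(key), # hypothetical helper
        }

    def call(self, api_name, *args):
        """Dispatch an emulated API call; unknown APIs return a harmless default."""
        self.api_calls += 1
        handler = self._handlers().get(api_name)
        if handler is None:
            return 0
        return handler(*args)


env = FakeEnvironment()
print(env.call("GetUserNameA"))  # JohnDoe
```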

Example experience for a RedTeamer:

  • Write a loader
  • Insert Metasploit shellcode
  • The file is detected when dropped on disk

Then:

  • Write a second loader
  • Encrypt the Metasploit shellcode with strong AES
  • It’s still detected when dropped on disk

The AV Emulator will execute/emulate the loader. After a while execution stops, and the Metasploit shellcode is found unencrypted in memory. AV will then detect the signatures of it in memory.

There is an infinite number of ways to detect an emulator. But generally the emulation does not run forever, and is restricted by:

What                     Typical Limit
Time                     ?
Number of instructions   ?
Number of API calls      ?
Amount of memory used    ?

Reference:

Receive Events

The EDR receives events about what processes are doing via the OS:

 Process                                                 
┌────────────────┐                    ┌─────────────┐    
│                │                    │             │    
│                │                    │  Windows    │    
│                │                    │  kernel     │    
├────────────────┤  Syscalls          │             │    
│ (Hooked)       ├───────────────────►│             │    
│                │                    │             │    
│ ntdll.dll      ├─────────────────┐  │             │    
│ NtApi          │   Usermode      │  │             │    
├────────────────┤   Hooks         │  └──────┬──────┘    
│                │                 │         │           
│                │                 │         │ kernel
│                │                 │         │ callbacks     
│                │                 │         │           
│                │                 ▼         ▼           
│                │          ┌────────────────────────┐   
│                │          │      EDR               │   
│                │          └────────────────────────┘   
└────────────────┘                                        

There are two main channels to receive data:

  • Usermode (hooked API)
  • Kernel callbacks (ETW, ETW-TI, kernel-mode driver)

These sensors will create events about what is happening in the system, when something is added/removed/changed like: 

  • Files
  • Registry Keys
  • Processes, Threads
  • Memory Regions

The EDR will contain rules to match the events for malicious behaviour. Rules can be either: 

  • Precise/Brittle: Detect one specific thing well (low False-Positive FP), easy to bypass
  • Robust: More generic detection, harder to bypass, higher FP, more exceptions

Note that the EDR does not see data modification inside the process by itself. Or in other words, a process calling a function RtlCopyMemory() of ntdll.dll will potentially generate telemetry, as ntdll.dll can be hooked. Doing the same with a byte-wise copy in a for-loop will not result in any telemetry.

Telemetry is gained from both hooked ntdll.dll and from the kernel. Usermode hooks can be trivially removed, but this generates telemetry. The kernelspace events are more trustworthy, and cannot be removed.

Note that the main execution unit for Windows is the thread, not the process. But to keep it simple, I will mostly use process.

The graphic is a bit oversimplified, and can be extended with more sensors, which are the input of an EDR:

                                                                     ┌──────────────┐          
                                                                     │              │          
        ┌─────────────┐ EtwWrite() ┌──────────┐   Kernel callbacks   │              │          
        │ Process     ├───────────►│          ├─────────────────────►│              │          
        │             │            │          │                      │              │          
        │             │            │          │                      │              │          
        ├─────────────┤            │   OS     │   ETW                │              │          
┌───────┤  ntdll.dll  │            │          ├─────────────────────►│              │          
│       │             │ syscall    │          │                      │              │          
│  ┌───►│             ├───────────►│          │   ETW-TI             │    EDR       │          
│  │    ├─────────────┤            │          ├─────────────────────►│              │          
│  │    │             │            └──────────┘                      │              │          
│  │    ├─────────────┤                                              │              │          
│  │    │ amsi.dll    │ pipe                      AMSI               │              │          
│  └────┤             ├─────────────────────────────────────────────►│              │          
│       │             │                                              │              │          
└──────►│             │                                              │              │          
        ├─────────────┤                                              │              │          
        │             │                                              │              │          
        │             │                                              │              │          
        │             │                                              │              │          
        │             │                                              │              │          
        └─────────────┘                                              └──────────────┘          

EDR input is therefore:

  • Usermode hooks / AMSI
  • Kernel callbacks
  • ETW
  • ETW-TI

And I will discuss each of them individually.

Usermode Hooks

While the official kernel interface on Linux is the syscall layer, on Windows it’s ntdll.dll. This is called the Native API (NtAPI). ntdll.dll will call the correct syscall for us. The Windows Application Program Interface (WinAPI), i.e. the other DLLs like kernel32.dll, all call the NtAPI (ntdll.dll) in the end. Note that syscall numbers may change between Windows versions, and therefore hardcoding them is not reliable.

 WinAPI                                       NtApi                                 Kernel                  
┌─────────────────────────────────────────┐  ┌───────────────────────────────────┐                          
│                                         │  │                                   │                          
│                                         │  │                                   │                          
│ ┌────────────────┐   ┌────────────────┐ │  │ ┌─────────────────────────┐       │ ┌───────────────────────┐
│ │                │   │                │ │  │ │                         │syscall│ │                       │
│ │ kernel32.dll   ├──►│ kernelbase.dll ├─┼──┤►│ ntdll.dll               ├───────┤►│Kernel                 │
│ │ OpenProcess    │   │ OpenProcess    │ │  │ │ NtOpenProcess           │       │ │NtOpenProcess          │
│ │                │   │                │ │  │ │                         │       │ │                       │
│ └────────────────┘   └────────────────┘ │  │ └─────────────────────────┘       │ └───────────────────────┘
│                                         │  │                                   │                          
│                                         │  │                                   │                          
│ ┌────────────────┐   ┌────────────────┐ │  │ ┌─────────────────────────┐       │ ┌───────────────────────┐
│ │                │   │                │ │  │ │                         │syscall│ │                       │
│ │ kernel32.dll   ├──►│ kernelbase.dll ├─┼──┤►│ ntdll.dll               ├───────┼─►Kernel                 │
│ │ VirtualAllocEx │   │ VirtualAllocEx │ │  │ │ NtAllocateVirtualMemory │       │ │NtAllocateVirtualMemory│
│ │                │   │                │ │  │ │                         │       │ │                       │
│ └────────────────┘   └────────────────┘ │  │ └─────────────────────────┘       │ └───────────────────────┘
│                                         │  │                                   │                          
│                                         │  │                                   │                          
└─────────────────────────────────────────┘  └───────────────────────────────────┘                          
       ▲                                         ▲                                      ▲                   
       │                                         │                                      │                   
       │                                         │                                      │                   
    Usermode Hooks                            Usermode Hooks                         Kernel                 
    Specific                                  Generic                                Callbacks              

Example NtAPI function in ntdll.dll, performing a syscall with ASM instruction syscall:

	SysNtCreateFile proc
			mov r10, rcx
			mov eax, 55h
			syscall
			ret
	SysNtCreateFile endp

Typical WinAPI call, with a hook:

                                                                                ┌─────────────────┐
                                                                                │                 │
┌───────────────────┐   ┌─────────────────┐   ┌───────────────────┐             │                 │
│                   │   │                 │   │                   │             │      OS         │
│  Application.exe  │   │ kernel32.dll    │   │  ntdll.dll        │  syscall    │                 │
│                   ├──►│                 ├──►│                   ├────────────►│                 │
│  .text            │   │ CreateFile()    │   │  NtCreateFile()   │             │      kernel     │
│                   │   │                 │   │                   │             │                 │
└───────────────────┘   └─────────────────┘   └─────────┬─────────┘             │                 │
                                                        │hook                   │                 │
                                                        │                       │                 │
                                               ┌────────▼────────────────┐      │                 │
                                               │                         │      │                 │
                                               │ amsi.dll                │      │                 │
                                               │                         │      │                 │
                                               │ NtCreateFile_Hook()     │      │                 │
                                               └─────────────────────────┘      │                 │
                                                         │                      └─────────────────┘
                                                         ▼
                                                        EDR

Userspace hooks are just patches in ntdll.dll exported functions, which call into another DLL before the function is executed. Windows provides functionality to directly hook functions.

 Original Function On-Disk:              EDR Hooked Function In-Memory:
 ----------------------                  -----------------------

 mov     r10, rcx                        mov     r10, rcx
>mov     eax, 50h                        jmp     0x7ffaeadea621
 test    byte ptr [0x7FFE0h], 1          test    byte ptr [0x7FFE0h], 1
 jne     0x17e76540ea5                   jne     0x17e76540ea5
 syscall                                 syscall
 ret                                     ret
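
The comparison above can be expressed as a small byte check. This Python sketch only looks at the exact pattern shown here (mov r10, rcx followed by either mov eax, imm32 or a jmp rel32) on bytes handed to it; the function name is my own, and real hooks can use other patterns that this check would report as unknown.

```python
MOV_R10_RCX = b"\x4c\x8b\xd1"  # mov r10, rcx


def classify_stub(prologue: bytes) -> str:
    """Classify the first bytes of an ntdll.dll syscall stub."""
    if not prologue.startswith(MOV_R10_RCX):
        return "unknown"
    next_opcode = prologue[3:4]
    if next_opcode == b"\xb8":   # mov eax, imm32: syscall number is loaded as on disk
        return "clean"
    if next_opcode == b"\xe9":   # jmp rel32: execution is redirected elsewhere
        return "hooked"
    return "unknown"


on_disk = bytes.fromhex("4c8bd1b850000000")    # mov r10, rcx; mov eax, 50h
in_memory = bytes.fromhex("4c8bd1e911223344")  # mov r10, rcx; jmp <rel32>

print(classify_stub(on_disk))    # clean
print(classify_stub(in_memory))  # hooked
```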

Examples of commonly hooked ntdll.dll functions:

Function name            Related attacker techniques
NtOpenProcess            Process Injection
NtAllocateVirtualMemory  Process Injection
NtWriteVirtualMemory     Process Injection
NtCreateThreadEx         Process Injection
NtSuspendThread          APC Shellcode Injection
NtResumeThread           APC Shellcode Injection
NtQueueApcThread         APC Shellcode Injection

The EDR receives the function call names and its parameters as telemetry.

This is accomplished by using kernel callbacks (PsSetCreateProcessNotifyRoutine) to get notified at an early stage whenever a new process is created, and then injecting a DLL into the process (like amsi.dll) by using kernel Asynchronous Procedure Calls (KAPC injection). The injected DLL patches the original ntdll.dll functions to take a detour into amsi.dll.

After ntdll.dll is patched, each function call will therefore be intercepted by amsi.dll.

EDR function hooking with KAPC will create an APC which performs the hooking. The technique “Early Bird APC injection” uses the same APC mechanism, and can therefore run before the KAPC hooking has been performed.

Usermode hooks can be bypassed with:

  • Direct syscalls (avoid calling ntdll.dll)
  • Indirect syscalls (calling ntdll.dll functions, but after the hook)
  • Patching / restoring ntdll.dll (removing the hooks completely)

Usermode hooks are easy to bypass, as they are completely located in “our own” memory space, where we can freely mess with it. But restoring ntdll.dll itself would generate telemetry, which is the reason why direct syscalls are being used for this.

An EDR should not depend solely on usermode hooks, but only use them for auxiliary telemetry. But they provide more information than kernel callbacks. Kernel callbacks only “see” the syscall/ntdll.dll function, not the higher-level function which was originally called. This is useful, as it generates more generic detections, without depending on hooking all the weird and unusual DLL functions. But it may generate more false positives, as it is more difficult to identify “non-malicious” behaviour with just the syscalls.

For example, CreateFileA(), CreateFileW(), OpenFile() and CreateFileTransacted() will all call NtCreateFile() at the end.
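
As a tiny illustration of this information loss, the mapping can be written as a lookup table in Python. Only the four functions named above are included; the helper name is my own.

```python
WINAPI_TO_NTAPI = {
    "CreateFileA": "NtCreateFile",
    "CreateFileW": "NtCreateFile",
    "OpenFile": "NtCreateFile",
    "CreateFileTransacted": "NtCreateFile",
}


def kernel_view(winapi_calls):
    """What a sensor at the syscall level sees for a list of WinAPI calls."""
    return [WINAPI_TO_NTAPI[name] for name in winapi_calls]


calls = ["CreateFileA", "OpenFile", "CreateFileTransacted"]
print(kernel_view(calls))  # ['NtCreateFile', 'NtCreateFile', 'NtCreateFile']
```

Three different entry points collapse into one event type, so a rule written on NtCreateFile covers all of them, but can no longer tell them apart.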

Note that the callstack can show which function in the chain has been initially called. Usermode hooks are used less and less, and not by all EDRs (source):

EDR Usermode Hooks

Kernel telemetry

The Windows OS provides information about processes in the form of notification callback routines, especially about process, thread and image creation. These are generated by the kernel itself, so there is no way to suppress them like with usermode hooks (without kernel privileges).
These callbacks are initiated in the context of the relevant process and thread. Therefore the events have information about the origin process.

There are various different sources of kernel mode instrumentation:

  • ETW (Event Tracing for Windows)
  • ETW-TI (Threat Intelligence)
  • Kernel Callbacks (PsSetCreateProcessNotifyRoutine etc.)
  • NDIS (for network) / Minifilter drivers (for filesystem)

Kernel callbacks are:

  • PsSetCreateProcessNotifyRoutine: Process creation, termination
  • PsSetCreateThreadNotifyRoutine: Thread creation, deletion
  • PsSetLoadImageNotifyRoutine: Windows image loader
  • ObRegisterCallbacks: Object Manager callbacks for handle operations on processes and threads, as triggered by NtOpenProcess, NtOpenThread, …

Reference:

An example is the PS_CREATE_NOTIFY_INFO structure passed to the process creation callback, which gives the EDR different pieces of information:

Field             Notes
ParentProcessId
CreatingThreadId
*FileObject       The .exe on disk
ImageFileName     Parameter of created process
CommandLine       Parameter of created process
CreationStatus

Sysmon can capture this event from the kernel, and will produce the following:

Process Create:
RuleName: - 
UtcTime: 2024-04-28 22:08:22.025

ProcessGuid: {a23eae89-bd56-5903-0000-0010e9d95e00}
ProcessId: 6228
Image: C:\Windows\System32\wbem\WmiPrvSE.exe
FileVersion: 10.0.22621.1 (WinBuild.160101.0800)
Description: WMI Provider Host
Product: Microsoft® Windows® Operating System
Company: Microsoft Corporation
OriginalFileName: Wmiprvse.exe
CommandLine: C:\Windows\system32\wbem\wmiprvse.exe -secured -Embedding
CurrentDirectory: C:\Windows\system32\

User: NT AUTHORITY\NETWORK SERVICE
LogonGuid: {a23eae89-b357-5903-0000-002005eb0700}
LogonId: 0x7EB05
TerminalSessionId: 1
IntegrityLevel: System
Hashes: SHA1=91180ED89976D16353404AC982A422A707F2AE37,MD5=7528CCABACCD5C1748E63E192097472A,SHA256=196CABED59111B6C4BBF78C84A56846D96CBBC4F06935A4FD4E6432EF0AE4083,IMPHASH=144C0DFA3875D7237B37631C52D608CB

ParentProcessGuid: {a23eae89-bd28-5903-0000-00102f345d00}
ParentProcessId: 580
ParentImage: C:\Windows\System32\svchost.exe
ParentCommandLine: C:\Windows\system32\svchost.exe -k DcomLaunch -p
ParentUser: NT AUTHORITY\SYSTEM

Note that only the kernel event fields ImageFileName, CommandLine and ParentProcessId translate directly to the Image, CommandLine and ParentProcessId fields of the Sysmon event. Most of the other information is gathered by Sysmon additionally, by querying the kernel, for example by issuing GetProcessInformation on the ProcessId, or in other ways, like parsing the PEB of the process. Not all information provided is equally trustworthy.
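
A short Python sketch that splits a Sysmon event into the fields which come straight from the kernel event and those added afterwards. The split follows the three fields named above; the function name is my own, and the sample is an excerpt of the event shown earlier.

```python
KERNEL_FIELDS = {"Image", "CommandLine", "ParentProcessId"}


def split_fields(event_text: str):
    """Return (kernel_fields, enriched_fields) parsed from 'Key: Value' lines."""
    kernel, enriched = {}, {}
    for line in event_text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep or not key.strip():
            continue  # skip the title line and empty lines
        target = kernel if key in KERNEL_FIELDS else enriched
        target[key] = value.strip()
    return kernel, enriched


sample_event = r"""Process Create:
Image: C:\Windows\System32\wbem\WmiPrvSE.exe
Description: WMI Provider Host
CommandLine: C:\Windows\system32\wbem\wmiprvse.exe -secured -Embedding
User: NT AUTHORITY\NETWORK SERVICE
IntegrityLevel: System
ParentProcessId: 580"""

kernel, enriched = split_fields(sample_event)
print(sorted(kernel))    # ['CommandLine', 'Image', 'ParentProcessId']
print(sorted(enriched))  # ['Description', 'IntegrityLevel', 'User']
```

When reasoning about a detection, it helps to know which of the two groups a field belongs to, since the enriched fields depend on how and when they were collected.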

An ETW ImageLoad event from Microsoft-Windows-kernel-Process recorded with SilkETW:

{
  ProviderGuid: "22fb2cd6-0e7b-422b-a0c7-2fad1fd0e716",
  ProviderName: "Microsoft-Windows-kernel-Process",
  EventName: "ImageLoad",
  ThreadID: 9584,
  ProcessID: 7536,
  ProcessName: "notepad",

  YaraMatch: [],
  Opcode: 0,
  OpcodeName: "Info",
  TimeStamp: "2024-07-08T19:06:10.8845667+01:00",
  PointerSize: 8,
  EventDataLength: 142,

  XmlEventData: {
    ProviderName: "Microsoft-Windows-kernel-Process",
    FormattedMessage: "Process 7’536 had an image loaded with name \Device\HarddiskVolume2\Windows\System32\notepad.exe. ",
    
    EventName: "ImageLoad"
    ProcessID: "7’536",
    PID: "7536",
    TID: "9584",
    
    PName: "",
    DefaultBase: "0x7ff631650000",
    ImageName: "\Device\HarddiskVolume2\Windows\System32\notepad.exe",
    ImageBase: "0x7ff631650000",
    ImageCheckSum: "265’248",
    ImageSize: "0x38000",

    MSec: "9705.0646",
    TimeDateStamp: "1’643’917’504",
  }
}

Memory Regions

Upon starting an .exe, the sections of the PE file get mapped into memory, each as a complete block.

.text contains the machine code, while .data and similar sections contain data for the program.

New memory regions can be created using VirtualAlloc() or similar.

 EXE                                               
 Program                 Process                   
                                                   
┌──────────┐            ┌──────────────┐           
│          │            │              │           
│  Header  ├───────────►│ Header       │           
│          │            │              │           
├──────────┤            ├──────────────┤           
│          │            │              │           
│          │            ├──────────────┤           
│  .text   ├─────┐      │              │     Backed 
│          │     │      │              │     RX    
│          │     └─────►│ .text        │           
├──────────┤            │              │           
│          │            │              │           
│  .data   ├────┐       ├──────────────┤           
│          │    │       │              │           
│          │    │       │              │           
└──────────┘    │       ├──────────────┤           
                │       │              │    Backed  
                │       │              │    RW     
                └──────►│ .data        │           
                        │              │           
                        ├──────────────┤           
                        │              │           
                        │              │           
                        ├──────────────┤           
                        │              │           
                        │ Virtual      │    Unbacked
                        │ Alloc()      │    RW     
                        │              │           
                        └──────────────┘           

The memory regions coming from the PE image are called backed regions. They are considered trustworthy, as they are 1:1 copies of the PE file, which is scanned on disk by the AV. The memory regions are “backed” by the file on disk. They are also called IMAGE regions.

If the process allocates additional memory at runtime, that memory is “unbacked”, as there is no file behind it. It is also called USER or PRIVATE memory.

Generally, memory regions can be thought of as having one of these properties:

  • USER/PRIVATE/Unbacked: Bad, potentially malicious, shellcode
  • IMAGE/Backed: Good, pretty trusted

This is mainly because shellcode from exploits or process injection usually lives in PRIVATE memory. Threads should also start from backed regions. PRIVATE RWX memory is even more suspicious.
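The rule of thumb above can be sketched in a few lines of portable C. This is only an illustration, not how any real EDR scores memory: the `mem_region` struct, the `region_type` enum and the score values are hypothetical names and numbers made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one memory region as a scanner sees it. */
typedef enum { REGION_IMAGE, REGION_PRIVATE } region_type;

typedef struct {
    region_type type;   /* IMAGE = backed by a file, PRIVATE = unbacked */
    bool writable;
    bool executable;
} mem_region;

/* Illustrative score only: 0 = nothing notable, higher = more suspicious. */
int region_suspicion(const mem_region *r)
{
    assert(r != NULL);
    if (r->type == REGION_IMAGE) {
        return 0;   /* backed memory is treated as trusted */
    }
    if (!r->executable) {
        return 0;   /* PRIVATE data (heap, stack) is very common */
    }
    if (r->writable) {
        return 2;   /* PRIVATE and RWX: most suspicious */
    }
    return 1;       /* PRIVATE and RX: possibly shellcode */
}
```

A backed .text section scores 0, a PRIVATE RX region scores 1, and a PRIVATE RWX region scores 2. A real scanner would combine this with many more signals.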

Here are some trustworthy memory regions of type IMG (IMAGE, backed): Memory Regions Good

Here are some untrustworthy memory regions of type PRV (PRIVATE, unbacked): Memory Regions Bad

One property of memory pages is Copy-On-Write (COW). A memory scanner is able to check if the memory page was written to, which is unusual for read-only .text sections and others, as these should be shared between processes. This is used by Moneta via PSAPI_WORKING_SET_EX_BLOCK from PSAPI_WORKING_SET_EX_INFORMATION structure. Data-only attacks, e.g. for AMSI-patch or ETW-patch, are preferred.

Memory Scanning

Memory signature scanning will detect malicious code in-memory, in either .text or data sections (stack, heap, .data etc.).

                        Event       
                          │       
 Process                  ▼       
┌───────────┐        ┌───────────┐
│           │        │           │
│           │        │           │
│           │        │           │
├───────────┤        │           │
│           │  Read  │           │
│ .text     ◄────────┤    EDR    │
│  (bad)    │  Scan  │           │
├───────────┤        │           │
│           │        │           │
│           ◄────────┤           │
│ .data     │        │           │
│  (bad)    │        └───────────┘
│           │                     
└───────────┘                     

It is basically the same as AV signature scanning: grep or yara the memory content against known malicious signatures.

Memory scanning is performance intensive. It is not done constantly, but depends on a trigger.
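To make the "grep the memory" idea concrete, here is a minimal sketch of a byte-signature search over a memory buffer. It assumes the scanner already has a copy of the region content; the function name `find_signature` is made up for this example, and real scanners use rule engines like yara rather than a single memcmp() loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first match of sig inside mem, or -1 if absent. */
long find_signature(const unsigned char *mem, size_t mem_len,
                    const unsigned char *sig, size_t sig_len)
{
    assert(mem != NULL && sig != NULL);
    if (sig_len == 0 || sig_len > mem_len) {
        return -1;
    }
    /* Naive scan: compare the signature at every possible offset. */
    for (size_t i = 0; i + sig_len <= mem_len; i++) {
        if (memcmp(mem + i, sig, sig_len) == 0) {
            return (long)i;
        }
    }
    return -1;
}
```

Even this naive loop touches every byte of the region at least once, which hints at why scanning many megabytes per process is not done constantly.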

Query Process Information

The EDR, upon receiving events, will also attempt to enrich them:

  • Process information (like executable name and command line arguments)
  • Memory scan (possibly)
  • Process image file scan (rarely)
                                                                                              
                                                                     ┌──────────────┐          
                                                                     │              │          
        ┌─────────────┐ EtwWrite() ┌──────────┐   Kernel callbacks   │              │          
        │ Process     ├───────────►│          ├─────────────────────►│              │          
        │             │            │          │                      │              │          
        │             │            │          │                      │              │          
        ├─────────────┤            │   OS     │   ETW                │              │          
┌───────┤  ntdll.dll  │            │          ├─────────────────────►│              │          
│       │             │ syscall    │          │                      │              │          
│  ┌───►│             ├───────────►│          │   ETW-TI             │    EDR       │          
│  │    ├─────────────┤            │          ├─────────────────────►│              │          
│  │    │             │            └──────────┘                      │              │          
│  │    ├─────────────┤                                              │              │          
│  │    │ amsi.dll    │ pipe                      AMSI               │              │          
│  └────┤             ├─────────────────────────────────────────────►│              │          
│       │             │                                              │              │          
└──────►│             │                                              │              │          
        ├─────────────┤                                              │              │          
        │             │                                              │              │          
        │             │                                              │              │          
        │  ┌──────────┤                          Process Info        │              │          
        │  │          │◄─────────────────────────────────────────────┤              │          
        │  │ PEB      │                                              │              │          
        │  │ Eprocess │                                              │              │          
        │  │          │                                              └──┬──┬────────┘          
        │  │          │                                                 │  │                   
        │  └──────────┤                          Memory Scan            │  │                   
        │             │◄────────────────────────────────────────────────┘  │                   
        └───────▲─────┘                                                    │                   
                │                                                          │                   
         File   │                                                          │                   
         ┌──────┴────┐                            File Scan                │                   
         │           │◄────────────────────────────────────────────────────┘                   
         │           │                                                                         
         │           │                                                                         
         │           │                                                                         
         └───────────┘                                                                         

The EDR does not only receive events, but will also actively query the OS for more information. For example, when receiving a PS_CREATE_NOTIFY event, the EDR will gather more information about the process that caused the event, for example by using GetProcessInformation() or OpenProcess() to access the PEB, arguments, or memory regions. Or it reads the ImageFileName and scans the originating EXE image file.

Note that the EDR is a normal process, even if it runs as SYSTEM or is PPL’d and has its own dedicated kernel driver. With its SYSTEM privileges it can gather information about pretty much all other processes.

Here is an example of a PsSetCreateProcessNotifyRoutine handler function:

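void CreateProcessNotifyRoutine(HANDLE ppid, HANDLE pid, BOOLEAN create) {
    UNREFERENCED_PARAMETER(ppid);

    if (create) {
        PEPROCESS process = NULL;
        PUNICODE_STRING processName = NULL;

        // Resolve the pid to its EPROCESS object (takes a reference)
        if (!NT_SUCCESS(PsLookupProcessByProcessId(pid, &process))) {
            return;
        }

        // Retrieve the image name; the kernel allocates the string for us
        if (NT_SUCCESS(SeLocateProcessImageName(process, &processName))) {
            DbgPrint("MyDumbEDR: %lu (%wZ) launched\n", HandleToULong(pid), processName);
            ExFreePool(processName); // the caller has to free the name buffer
        }

        ObDereferenceObject(process); // release the reference from the lookup
    }
}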

The handler function only receives the pid of the process. To also display the image name, a few functions have to be called, which access the PEB or the EPROCESS structure.

The PEB (Process Environment Block, at GS:[0x60]) is located in usermode memory and can be manipulated freely by the process. Data stored in it:

  • ImageBase Address
  • loaded DLLs
  • process parameters:
    • image name
    • arguments
    • environment variables
    • working directory

EPROCESS is a kernel data structure, and cannot be manipulated directly (sometimes indirectly):

  • process create and exit time
  • process id
  • parent process id
  • address of PEB
  • image filename
    • similar to process parameters image name in the PEB
    • also available in the SectionObject

Process Information Data Structures

The PEB:

typedef struct _PEB {
  BYTE                          Reserved1[2];
  BYTE                          BeingDebugged;
  BYTE                          Reserved2[1];
  PVOID                         Reserved3[2];
  PPEB_LDR_DATA                 Ldr;
  PRTL_USER_PROCESS_PARAMETERS  ProcessParameters;
  PVOID                         Reserved4[3];
  PVOID                         AtlThunkSListPtr;
  PVOID                         Reserved5;
  ULONG                         Reserved6;
  PVOID                         Reserved7;
  ULONG                         Reserved8;
  ULONG                         AtlThunkSListPtr32;
  PVOID                         Reserved9[45];
  BYTE                          Reserved10[96];
  PPS_POST_PROCESS_INIT_ROUTINE PostProcessInitRoutine;
  BYTE                          Reserved11[128];
  PVOID                         Reserved12[1];
  ULONG                         SessionId;
} PEB, *PPEB;

Whereas ProcessParameters is:

typedef struct _RTL_USER_PROCESS_PARAMETERS {
  BYTE           Reserved1[16];
  PVOID          Reserved2[10];
  UNICODE_STRING ImagePathName;
  UNICODE_STRING CommandLine;
} RTL_USER_PROCESS_PARAMETERS, *PRTL_USER_PROCESS_PARAMETERS;

Callstack Analysis

When a process calls a Windows function, it is possible to find out the parent functions which lead to this call. This is called the callstack.

Example callstack

The EDR can choose to inspect the process initiating a function or API call, and analyze the call stack for suspicious things:

 Process
┌──────────────────────────────────────────────────────────────────────┐           ┌─────────────────┐ 
│                                                                      │           │  OS  kernel     │ 
│  ┌───────────────────┐   ┌─────────────────┐   ┌───────────────────┐ │           │                 │ 
│  │                   │   │                 │   │                   │ │           │                 │ 
│  │  Application.exe  │   │ kernel32.dll    │   │  ntdll.dll        │ │syscall    │                 │ 
│  │                   ├──►│                 ├──►│                   ├─┼──────────►│ NtCreateFile()  │ 
│  │  .text            │   │ CreateFile()    │   │  NtCreateFile()   │ │           │                 │ 
│  │                   │   │                 │   │                   │ │           └────┬────────────┘ 
│  └───────────────────┘   └─────────────────┘   └───────────────────┘ │                │              
│                                                                      │                │Notify        
│                             Stack                                    │                │              
│                            ┌──────────────────────────────────┐      │                ▼              
│                            │ Application.exe: SomeFunction()  │      │  Inspect  ┌─────────────────┐ 
│                            │ kernel32.dll: CreateFile()       │◄─────┼───────────┤                 │ 
│                            │ ntdll.dll: NtCreateFile()        │      │           │                 │ 
│                            └──────────────────────────────────┘      │           │                 │ 
│                                                                      │           │     EDR         │ 
│                                                                      │           │                 │ 
│                                                                      │           │                 │ 
└──────────────────────────────────────────────────────────────────────┘           └─────────────────┘ 

It is possible to detect a wide variety of attacks and bypasses with this technique. But it is somewhat performance-intensive.

A callstack should originate in a backed memory region, go through a supporting DLL (e.g. user32.dll), then ntdll.dll, where the actual syscall instruction is finally executed.

Elastic has callstack analysis rules to identify:

  • Direct syscalls
  • Callback-based evasion
  • Module Stomping
  • Library loading from unbacked region
  • Process created from unbacked region

If a call comes from an unbacked region, it most likely comes from shellcode.
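The core of such a check can be sketched as follows, assuming the stack has already been walked and each return address resolved to a module. The `stack_frame` struct and `first_unbacked_frame()` are hypothetical names for this example; real implementations also have to deal with spoofed or truncated stacks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame record: module is NULL when the return address
 * points into unbacked (PRIVATE) memory. */
typedef struct {
    const char *module;   /* e.g. "ntdll.dll", or NULL if unbacked */
} stack_frame;

/* Returns the index of the first frame located in unbacked memory,
 * or -1 if every frame resolves to a backed module. */
int first_unbacked_frame(const stack_frame *frames, size_t count)
{
    assert(frames != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (frames[i].module == NULL) {
            return (int)i;
        }
    }
    return -1;
}
```

For a stack of ntdll.dll, kernel32.dll, Application.exe the function returns -1. If the last caller sits in PRIVATE memory (module NULL), it returns that frame's index and the call can be flagged.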

Call stack analysis is usually not applied to all API functions. Elastic mentions the following:

  • VirtualAlloc, VirtualProtect
  • MapViewOfFile, MapViewOfFile2
  • VirtualAllocEx, VirtualProtectEx
  • QueueUserAPC
  • SetThreadContext
  • WriteProcessMemory, ReadProcessMemory

Thread State Analysis

Threads can be sleeping for different reasons. By investigating the thread state, and how the thread got there according to its callstack, we can find indicators for sleeping beacons or memory encryption.

Clean (spoofed) callstack for NtDelayExecution(): Sleep Callstack Spoofed

If memory encryption is being used, the thread is usually put to sleep by calling either:

  • Kernelbase.dll!SleepEx
  • ntdll.dll!NtDelayExecution

Suspicious things for calls to these sleep functions:

  • Calls to virtual memory in the callstack
  • Source in non-backed memory regions

Performance Impact

Performance of the EDR is of utmost importance. If developer machines are slow when installing 10'000 NPM packages, people will move to Apple where there are fewer protections, and Microsoft can't allow that. This is such a problem that Microsoft introduced asynchronous Dev Drive scanning.

The least performance-intensive case is a detection that can be applied directly to a rare event (let's say, opening a process handle to lsass.exe). Memory scans can involve iterating over or yara-scanning megabytes of .text sections, which is very expensive. Scanning files is the most expensive, even with SSDs.

Most detections are in between those: One or multiple events with suspicious information, which leads to some more correlation. These then may kick-off the memory scanning.

Performance Impact   What
1                    Event
3                    Event Correlation
10                   Query process
100                  Memory Scan
1000                 File Scan

What could trigger a memory scan?

What                   Triggers scan?   Notes
VirtualAlloc()         No               Too common, except when RWX
WriteProcessMemory()   No               Very common
memcpy()               No               Not visible for EDR
VirtualProtect()       No?              RWX or RW->RX may be a trigger
CreateRemoteThread()   Yes              Should trigger memory scan

VirtualAlloc() and WriteProcessMemory() are very commonly called functions. CreateRemoteThread() is not only called less often, it is also a clearer indicator of potentially malicious behaviour.

EDR Attacks

The EDR receives events from a large number of sensors, with varying trustworthiness. Also, much of the required information is not available in the event itself, but has to be accessed in or via the kernel (KPROCESS, EPROCESS, e.g. parent process id) or the process memory space itself (e.g. PEB including command line arguments).

Many attacks exploit a TOCTOU condition (time of check, time of use): the data the EDR checks is not necessarily the data that is used.

Command Line Spoofing

EDRs can check for potentially malicious command line arguments of newly spawned processes, for example when using mimikatz: mimikatz.exe "privilege::debug" "lsadump::sam". Even if we rename mimikatz.exe, the argument privilege::debug is a pretty clear indicator with a low false positive rate.

But in Windows, it is possible to spoof command line arguments. The command line arguments of a process are stored in its PEB. Additionally, when we create a new process, the call to the process-creation function also contains the initial arguments (of the exe to be started).

So we have basically two places for command line arguments:

  • In the PEB of the child process
  • On child create function: CreateProcessW(..., "command line args", ...)

In the PEB:

typedef struct _PEB {
  ...
    PRTL_USER_PROCESS_PARAMETERS  ProcessParameters;
  ...
}

typedef struct _RTL_USER_PROCESS_PARAMETERS {
  ...
  UNICODE_STRING ImagePathName;
  UNICODE_STRING CommandLine;
} *PRTL_USER_PROCESS_PARAMETERS;

As the PEB is modifiable by its process, data in it cannot be trusted.

EDR queries an existing process for its command line, and usually trusts it blindly:

┌────────────────────┐                  ┌─────────────────┐    
│ Process            │                  │                 │    
│                    │                  │                 │    
│      PEB           │                  │                 │    
│     ┌──────────────┤                  │                 │    
│     │              │                  │    EDR          │    
│     │ ImageName    │◄─────────────────┤                 │    
│     │ CommandLine  │                  │                 │    
│     │              │                  │                 │    
│     └──────────────┤                  │                 │    
│                    │                  │                 │    
└────────────────────┘                  └─────────────────┘    

But it can be verified. When a parent process calls CreateProcess() to create a child process:

   ┌─────────┐                  ┌──────────┐           ┌───────────┐
   │ Process │                  │          │           │ Child     │
   │         │  CreateProcess() │   OS     │ Spawns    │ Process   │
   │         ├─────────────────►│          ├──────────►│           │
   │         │          ▲       │          │           │           │
   │         │          │       └──────────┘           │PEB        │
   │         │          │                              ├─────────┐ │
   │         │          │        ┌───────┐             │ Command │ │
   │         │          │        │       │       ┌────►│ Line    │ │
   │         │          └────────┤  EDR  ├───────┘     ├─────────┘ │
   │         │                   │       │             │           │
   └─────────┘                   └───────┘             └───────────┘

The EDR can compare the command line in CreateProcess() with the PEB of the resulting child process, and alert if they don't match.

Intercepting the function call arguments in CreateProcessW(..., "command line args", ...) does not really help much either, as we can create the process in a suspended state with fake arguments, overwrite them with the correct ones remotely, and then resume the process.

  1. Parent: Create new suspended process with fake arguments
  2. EDR: receives event with fake arguments
  3. Parent: Overwrite PEB of child with real arguments
  4. Parent: Continue (start) child process (using real arguments)
  5. Child process: Overwrite its PEB with fake arguments again
  6. EDR: querying the process gets the fake arguments

If the EDR thinks the child process is malicious in the future, it will provide information to the analyst, including the process' command line arguments, taken from the PEB. So the child process needs to overwrite the PEB again, as a “cleanup”.

Command line arguments for processes are therefore pretty untrustworthy.
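The ordering of the six steps is what matters, so here is a toy simulation of it in portable C. Nothing here touches a real process or PEB: `fake_process`, `set_peb()` and `spoof_cmdline()` are hypothetical names, and the "EDR" is just two buffers recording what it would have seen at creation time and when querying later.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CMDLINE_MAX 128

/* Hypothetical stand-in for a child process. */
typedef struct {
    char peb_cmdline[CMDLINE_MAX];       /* what an EDR reads back */
    char effective_cmdline[CMDLINE_MAX]; /* what the program really parsed */
} fake_process;

static void set_peb(fake_process *p, const char *cmdline)
{
    /* snprintf truncates and always terminates the string */
    snprintf(p->peb_cmdline, sizeof p->peb_cmdline, "%s", cmdline);
}

/* Replays steps 1-6. Both edr_* buffers must hold CMDLINE_MAX bytes. */
void spoof_cmdline(fake_process *child, const char *fake, const char *real,
                   char *edr_seen_at_create, char *edr_seen_on_query)
{
    assert(child && fake && real && edr_seen_at_create && edr_seen_on_query);
    set_peb(child, fake);                                 /* 1: create suspended   */
    strcpy(edr_seen_at_create, child->peb_cmdline);       /* 2: EDR gets event     */
    set_peb(child, real);                                 /* 3: parent overwrites  */
    strcpy(child->effective_cmdline, child->peb_cmdline); /* 4: child starts       */
    set_peb(child, fake);                                 /* 5: child cleans up    */
    strcpy(edr_seen_on_query, child->peb_cmdline);        /* 6: EDR queries later  */
}
```

After the call, both EDR buffers contain the fake arguments, while `effective_cmdline` holds the real ones. The real arguments only exist in the PEB between steps 3 and 5.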

PPID Spoofing

In Windows, unlike Linux, there is no dependency between parent- and child process, as there is (was) no fork(). The child gets certain attributes from the parent, including the PID of the parent. It will also be stored in the EPROCESS structure of the process.

The function CreateProcessW() can be instructed to provide its own attributes, including the parent process of the child, in the STARTUPINFOEX structure. So already upon creation, we can give the child a wrong parent PID.

CreateProcessW() interface:

BOOL CreateProcessW(
  [in, optional]      LPCWSTR               lpApplicationName,
  [in, out, optional] LPWSTR                lpCommandLine,
  [in, optional]      LPSECURITY_ATTRIBUTES lpProcessAttributes,
  [in, optional]      LPSECURITY_ATTRIBUTES lpThreadAttributes,
  [in]                BOOL                  bInheritHandles,
  [in]                DWORD                 dwCreationFlags,
  [in, optional]      LPVOID                lpEnvironment,
  [in, optional]      LPCWSTR               lpCurrentDirectory,
  [in]                LPSTARTUPINFOW        lpStartupInfo,  // PPID spoofing here
  [out]               LPPROCESS_INFORMATION lpProcessInformation
);

The actual PPID spoofing is just setting an attribute in the STARTUPINFOEX struct and passing it as the lpStartupInfo parameter:

{
  STARTUPINFOEXA si;
  HANDLE fakeParent = OpenProcess(.., <pid of fake parent process>);
  ..
  UpdateProcThreadAttribute(si.lpAttributeList, 0, PROC_THREAD_ATTRIBUTE_PARENT_PROCESS, &fakeParent, ..);
  CreateProcessA(NULL, (LPSTR)"notepad", .., EXTENDED_STARTUPINFO_PRESENT, .., &si.StartupInfo, ..);
}

Where:

typedef struct _STARTUPINFOEXA {
  STARTUPINFOA                 StartupInfo;
  LPPROC_THREAD_ATTRIBUTE_LIST lpAttributeList; // attributes, one is the ppid
} STARTUPINFOEXA, *LPSTARTUPINFOEXA;

It will be stored in the EPROCESS kernel structure:

typedef struct _EPROCESS
{
    KPROCESS Pcb;
    ...
    HANDLE InheritedFromUniqueProcessId; // PPID
    ...
}

This can be retrieved by the EDR with NtQueryInformationProcess():

__kernel_entry NTSTATUS NtQueryInformationProcess(
  [in]            HANDLE           ProcessHandle,
  [in]            PROCESSINFOCLASS ProcessInformationClass,
  [out]           PVOID            ProcessInformation,  // PROCESS_BASIC_INFORMATION
  [in]            ULONG            ProcessInformationLength,
  [out, optional] PULONG           ReturnLength
);

typedef struct _PROCESS_BASIC_INFORMATION {
    NTSTATUS ExitStatus;
    PPEB PebBaseAddress;
    ULONG_PTR AffinityMask;
    KPRIORITY BasePriority;
    ULONG_PTR UniqueProcessId;
    ULONG_PTR InheritedFromUniqueProcessId; // PPID
} PROCESS_BASIC_INFORMATION;

PPID spoofing can be detected, as upon process creation, an event about the new process is delivered to the EDR. This event is usually raised in the context of the origin process, or the origin process is referenced in it. The EDR can then compare the content of the STARTUPINFOEX structure with the process the event comes from (e.g. by just comparing the PID of both). Here the EDR sees the CreateProcess() call with PPID=y (2), and the effective PID of the process initiating this call (1) having PID=x.

  ┌─────────┐                  ┌──────────┐           ┌───────────┐
  │ Process │  CreateProcess() │          │           │ Child     │
  │         │  PPID=y          │   OS     │ Spawns    │ Process   │
  │         ├─────────────────►│          ├──────────►│           │
  │         │          ▲       │          │           │           │
  │         │          │       └──────────┘           │EPROCESS   │
  │ ┌───────┤    1     │2                             ├─────────┐ │
  │ │PID=x  │◄─────────┤        ┌───────┐          3  │ PPID=y  │ │
  │ │       │          │        │       │       ┌────►│         │ │
  │ └───────┤          └────────┤  EDR  ├───────┘     ├─────────┘ │
  │         │                   │       │             │           │
  └─────────┘                   └───────┘             └───────────┘

So the EDR has:

  1. Parent: PID
  2. Parent: PPID in its issued CreateProcess() call destined for the child
  3. Child: Its PPID

The EDR can then compare those, especially 1) and 2), or later 1) and 2) against 3). It is not always completely clear from the events received where the origin PID comes from (for example with ETW).

Note that InheritedFromUniqueProcessId is stored in EPROCESS, but still cannot be trusted, as it can be set from userspace.
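The comparison of the three values can be sketched like this. It assumes the EDR has already collected all three PIDs into one record, which (as noted above) is not always possible; `create_event` and `ppid_spoofed()` are hypothetical names for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical process-creation record assembled by an EDR. */
typedef struct {
    unsigned long creator_pid;    /* (1) PID of the process calling CreateProcess() */
    unsigned long requested_ppid; /* (2) parent requested in the attribute list */
    unsigned long child_ppid;     /* (3) PPID later read from the child */
} create_event;

/* True when the parent recorded for the child is not the real creator. */
bool ppid_spoofed(const create_event *ev)
{
    assert(ev != NULL);
    return ev->creator_pid != ev->requested_ppid
        || ev->creator_pid != ev->child_ppid;
}
```

With PID=x as creator and PPID=y requested and stored, the function returns true whenever x and y differ. A legitimate use of the parent attribute would trigger it too, so in practice this is a signal and not a verdict.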

ETW-patch

An ETW patch will overwrite EtwEventWrite() in ntdll.dll, so the process will not emit any ETW events by itself anymore. This is mostly useful for Powershell and .NET related events. It usually involves:

  • VirtualProtect .text: RX -> RW
  • Overwrite memory (replace function body with a return 0)
  • VirtualProtect .text: RW -> RX
   Process                                                    
  ┌──────────────────────┐                                    
  │                      │                                    
  │                      │                                    
  ├──────────────────────┤                                    
  │                      │   ntdll.dll RW -> patch -> RX      
  │ .text                ├──────────────┐                     
  │                      │              │                     
  ├──────────────────────┤              │       ┌─────────┐   
  │                      │              │       │         │   
  │                      │              │ ◄─────┤  EDR    │   
  │                      │              │       │  sus?   │   
  ├──────────────────────┤              │       │         │   
  │ ntdll.dll            │              │       └─────────┘   
  │                      │              │                     
  │  - EtwEventWrite()   │◄─────────────┘                     
  │                      │                                    
  │                      │                                    
  ├──────────────────────┤                                    
  │                      │                                    
  │                      │                                    
  │                      │                                    
  └──────────────────────┘                                                                       

Changing the permissions of ntdll.dll in order to modify it will probably generate more telemetry than the ETW patch avoids. Its memory permissions need to be changed from RX to RW and then back to RX again.

Note that this will only affect the events generated by the patched process. ETW cannot be deactivated globally.

ETW events are mostly used for managed processes (DotNet, C#) and Powershell. ETW was used a lot by Sysmon before, so ETW-patch was anti-Sysmon.

AMSI-AV patching

AMSI will scan scripts executed in supported Windows interpreters, like Powershell, MS Office VBA runtime, or .NET. Or in other words, the application itself asks the OS to perform an AV scan via AMSI on some file or buffer it intends to execute.

To disable AMSI runtime code scanning, patch for example amsi.dll!AmsiOpenSession so that no content is handed over for scanning anymore. Alternative targets are AmsiScanString() and AmsiScanBuffer().

The process is identical to ETW-patch: Make code section writeable, break the functions, restore original permissions again.

     Process                                                   
  ┌──────────────────────┐                                   
  │                      │                                   
  │                      │                                   
  ├──────────────────────┤                                   
  │                      │   amsi.dll RW -> patch -> RX      
  │ .text                ├──────────────┐                    
  │                      │              │                    
  ├──────────────────────┤              │       ┌─────────┐  
  │                      │              │       │         │  
  │                      │              │ ◄─────┤  EDR    │  
  │                      │              │       │  sus?   │  
  ├──────────────────────┤              │       │         │  
  │ amsi.dll             │              │       └─────────┘  
  │                      │              │                    
  │  - AmsiOpenSession() │◄─────────────┘                    
  │                      │                                   
  │                      │                                   
  ├──────────────────────┤                                   
  │                      │                                   
  │                      │                                   
  │                      │                                   
  └──────────────────────┘                                   

Disabling the AMSI-AV function is usually done by a loader, before it executes well-signatured malicious managed code or Powershell scripts. The loader itself is scanned, but the .NET/Powershell code loaded at runtime won't be.

This is useful when loading a signatured malicious Powershell script, which would otherwise be scanned through the AMSI interface. A famous site to generate obfuscated AMSI-AV patches is https://amsi.fail.

AMSI-hooks patching

AMSI-hook patching (or AMSI patching) is just removing the EDR’s ntdll.dll patches which call into amsi.dll. It is basically identical to ETW-patch or AMSI-AV patch, as it just modifies ntdll.dll again. It can generate additional telemetry, for example when loading a clean version of ntdll.dll from disk.

       Process                                                      
      ┌──────────────────────┐                                      
      │                      │                                      
      │                      │                                      
      ├──────────────────────┤                                      
      │                      │   ntdll.dll RW -> patch -> RX        
      │ .text                ├──────────────┐                       
      │                      │              │                       
      ├──────────────────────┤              │       ┌─────────┐     
      │                      │              │       │         │     
      │                      │              │ ◄─────┤  EDR    │     
      │                      │              │       │  sus?   │     
      ├──────────────────────┤              │       │         │     
      │ ntdll.dll            │              │       └─────────┘     
      │                      │              │                       
      │                      │◄─────────────┘                       
      │                      │                                      
      │                      │                                      
      ├──────────────────────┤                                      
      │                      │                                      
      │                      │                                      
      │                      │                                      
      └──────────────────────┘                                      

AMSI Bypass

AMSI bypass can mean either bypassing the AMSI-AV interface as described above, or calling OS kernel functions without invoking the hooks in ntdll.dll.

This can be done by using direct syscalls: If you know the correct syscall number, you can invoke it directly, without involving ntdll.dll.

Or by using indirect syscalls: re-use the part of the ntdll.dll function that comes AFTER the hook invocation.

In both cases, the AMSI-hooks are bypassed, and the EDR will not get any telemetry from them.

If this is the normal function call graph with hooked ntdll.dll:

                                                                                      ┌─────────────┐       
                                                                                      │             │       
┌───────────────────┐   ┌─────────────────┐   ┌───────────────────┐                   │             │       
│                   │   │                 │   │ ntdll.dll:        │                   │    OS       │       
│  Application.exe  │   │ kernel32.dll    │   │ NtCreateFile()    │                   │             │       
│                   ├──►│                 ├──►│                   │                   │             │       
│                   │   │ CreateFile()    │   │                   │                   │    Kernel   │       
│                   │   │                 │   │                   │                   │             │       
└───────────────────┘   └─────────────────┘   │                   │                   │             │       
                                              │                   │                   │             │       
                                     ┌────────┼───jmp callback    │                   │             │       
                                     │        │                   │          syscall  │             │       
                                     │ ┌──────┼──►syscall         ├─────────────────► │             │       
                                     │ │      │                   │                   │             │       
                                     │ │      │                   │                   │             │       
                                     │ │      └───────────────────┘                   │             │       
                                     │ │                                              │             │       
                                     │ │ ┌─────────────────────────┐                  │             │       
                                     │ └─┤                         │                  │             │       
                                     │   │ amsi.dll:               │                  └─────────────┘       
                                     └──►│ HookedNtCreateFile()    │                                        
                                         └──────────┬──────────────┘                                        
                                                    │ notify                                                
                                                    ▼                                                       
                                              ┌────────────┐                                                
                                              │    EDR     │                                                
                                              │    :-)     │                                                
                                              └────────────┘                                                

The same call graph with direct and indirect syscalls:

  • Direct syscall: Just do the syscall yourself (with the correct syscall number)
  • Indirect syscall: Re-use parts of the hooked ntdll.dll, invoking the syscall but not the hook

                  direct                                                                           
                  syscall                                                                        
                ┌────────────────────────────────────────────────────────┐            ┌─────────────┐   
                │                                                        │            │             │   
┌───────────────┴───┐   ┌─────────────────┐   ┌───────────────────┐      │            │             │   
│                   │   │                 │   │ ntdll.dll:        │      │            │    OS       │   
│  Application.exe  │   │ kernel32.dll    │   │ NtCreateFile():   │      │            │             │   
│                   ├──►│                 ├──►│                   │      │            │             │   
│                   │   │ CreateFile()    │   │                   │      │            │    Kernel   │   
│                   │   │                 │   │                   │      │            │             │   
└──────────────┬────┘   └─────────────────┘   │                   │      │   syscall  │             │   
               │                              │                   │      └──────────► │             │   
               │                              │   jmp callback    │                   │             │   
               │                              │                   │          syscall  │             │   
               └──────────────────────────────┼──►syscall         ├─────────────────► │             │   
                indirect                      │                   │                   │             │   
                syscall                       │                   │                   │             │   
                                              └───────────────────┘                   │             │   
                                                                                      │             │   
                                          ┌────────────────────────┐                  │             │   
                                          │amsi.dll                │                  └─────────────┘   
                                          │                        │                             
                                          │HookedNtCreateFile()    │                       
                                          └────────────────────────┘                                        
                                                       no notify                                                
                                                                                                             
                                              ┌────────────┐                                                
                                              │    EDR     │                                                
                                              │    :-(     │                                                
                                              └────────────┘                                                

Or replace ntdll.dll completely with an unhooked version from disk, like in RefleXXion.

Image Spoofing

Similar to spoofing arguments, an attacker may also want to “spoof” the exe: start a non-malicious exe like notepad.exe, which the EDR records, then replace the content of the process with a malicious one like mimikatz. This attempts to trick the EDR into thinking something non-malicious has been started. This bypasses simple EDRs.

The source .exe file is called the Image of a process.

Process hollowing:

                   Event: CreateProcess("notepad.exe")
                        ▲                             
                        │                             
                        │                             
                        │   notepad.exe               
┌───────────┐           │  ┌───────────┐              
│           │ Start     │  │           │              
│           │ Suspended │  │           │              
│           ├───────────┴─►│           │              
│           │              │           │              
│           │              ├───────────┤              
│           │ Overwrite    │ .text     │              
│           │ Memory       │           │              
│           ├──────────────┤►          │              
│           │              ├───────────┤              
│           │              │           │              
│           │              │           │              
│           │              │           │              
│           │ Resume       │           │              
│           ├─────────────►│           │              
│           │              │           │              
└───────────┘              └───────────┘              

There are several such techniques:

  • Process Hollowing: Overwrite process memory of a suspended process with WriteProcessMemory()
  • Process Doppelgänging: Overwrite a file inside a Transactional NTFS (TxF) transaction, create an image section from it, roll back the transaction so the original file is restored, then create the process from the section
  • Process Herpaderping: Write malicious code to an exe, create the process, then quickly replace the malicious content with non-malicious content before it gets scanned
  • Process Ghosting: Create an empty file, semi-delete it (delete-pending state), write malicious data, create the process from it

Image spoofing can be detected by memory scanning, which checks the memory of processes against signatures, like an AV. Malicious code like CobaltStrike can therefore still be identified, even if injected into a genuine process.

It can also be detected by comparing the process memory content with the exe file content. The original exe name is stored in the PEB (peb.ProcessParameters.ImagePathName), or in the kernel’s EPROCESS structure (eprocess.ImageFileName[15], eprocess.SeAuditProcessCreationInfo.ImageFileName). Comparing the content of memory with that of a file is performance intensive.
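
The memory/file comparison can be sketched roughly like this. It is a minimal model, not how any specific EDR does it: it assumes both buffers (the .text bytes from disk and from process memory) have already been extracted, and the function name is made up for illustration. A real comparison also has to tolerate legitimate differences such as relocations.

```python
def modified_ranges(on_disk: bytes, in_memory: bytes) -> list[tuple[int, int]]:
    """Return half-open [start, end) offsets where the two buffers differ."""
    ranges = []
    start = None
    length = min(len(on_disk), len(in_memory))
    for i in range(length):
        if on_disk[i] != in_memory[i]:
            if start is None:
                start = i  # a run of modified bytes begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, length))  # run reaches the end of the buffer
    return ranges


# Hypothetical data: 8 bytes on disk, 3 of them changed in memory
disk_text = b"\x90" * 8
mem_text = b"\x90\x90\xcc\xcc\x90\x90\x90\xcc"
print(modified_ranges(disk_text, mem_text))
```

Large modified ranges in a section that should be read-only are a strong hint that the image has been overwritten.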

Alternatively, the EDR can gather telemetry which identifies the manipulations themselves, or the supporting techniques like direct syscalls, e.g. with call stack analysis.

Technique     | Used API
Hollowing     | CreateProcess, NtUnmapViewOfSection, VirtualAllocEx, WriteProcessMemory, SetThreadContext, ResumeThread
Doppelgänging | CreateTransaction, CreateFileTransacted, NtCreateProcessEx
Herpaderping  | NtCreateSection, NtCreateProcessEx, NtCreateThreadEx
Ghosting      | CreateFileA, NtOpenFile, NtSetInformationFile, NtCreateSection, NtCreateProcess, WriteRemoteMem, NtCreateThreadEx
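
From the defender side, the table above can be read as a set of API sequences to look for in telemetry. The following is a minimal sketch of that idea, assuming the EDR already has an ordered list of API names per process (the list format and function names are assumptions for illustration). The API names are copied from the table as-is. Real detections correlate far more than names, for example handles, arguments and timing.

```python
# API sequences per technique, taken verbatim from the table above
TECHNIQUE_APIS = {
    "Hollowing": ["CreateProcess", "NtUnmapViewOfSection", "VirtualAllocEx",
                  "WriteProcessMemory", "SetThreadContext", "ResumeThread"],
    "Doppelgänging": ["CreateTransaction", "CreateFileTransacted", "NtCreateProcessEx"],
    "Herpaderping": ["NtCreateSection", "NtCreateProcessEx", "NtCreateThreadEx"],
    "Ghosting": ["CreateFileA", "NtOpenFile", "NtSetInformationFile", "NtCreateSection",
                 "NtCreateProcess", "WriteRemoteMem", "NtCreateThreadEx"],
}


def is_subsequence(needle, haystack):
    """True if all items of needle appear in haystack in the same order."""
    remaining = iter(haystack)
    # "api in remaining" consumes the iterator up to the first match
    return all(api in remaining for api in needle)


def match_techniques(observed_calls):
    """Return the techniques whose API sequence is contained in the telemetry."""
    return [name for name, apis in TECHNIQUE_APIS.items()
            if is_subsequence(apis, observed_calls)]


# Hypothetical telemetry of one process, unrelated calls in between are fine
observed = ["CreateProcess", "NtQueryInformationProcess", "NtUnmapViewOfSection",
            "VirtualAllocEx", "WriteProcessMemory", "SetThreadContext", "ResumeThread"]
print(match_techniques(observed))
```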

Module Stomping

This is similar to Image Spoofing, but with DLLs.

Module stomping writes the shellcode into the .text section of an unused DLL in a remote process, and creates a new thread starting there.

                   Event: LoadLibrary("genuine.dll")           
                        ▲                                      
                        │                                      
                        │                                      
                        │   genuine.dll                        
┌───────────┐           │  ┌───────────┐                       
│           │ Load      │  │           │                       
│           │ DLL       │  │           │                       
│           ├───────────┴─►│           │                       
│           │              │           │                       
│           │              ├───────────┤                       
│           │ Overwrite    │ .text     │                       
│           │ Memory       │           │                       
│           ├──────────────┤►          │                       
│           │              ├───────────┤                       
│           │              │           │                       
│           │              │           │                       
│           │              │           │                       
│           │ Start        │           │                       
│           ├─────────────►│           │                       
│           │              │           │                       
└───────────┘              └───────────┘     

Same as Image Spoofing, it can be detected by:

  • Memory signature scanning
  • Memory/file comparison of .text section
  • Telemetry of the stomping
  • Identifying supporting techniques like direct/indirect syscalls with telemetry

Memory Encryption

It is possible to encrypt all suspicious memory regions before sleeping, and decrypt them again when the process resumes. This is not trivial, and requires great care, weird Windows functionality, and support from the payload (e.g. the beacon itself). It can create a lot of telemetry, but much of it is hard for the EDR to capture.

                                             Event           
                                                 │           
                                                 │           
                                                 │           
 Process                Process                  ▼           
┌───────────┐          ┌───────────┐        ┌───────────┐    
│           │          │           │        │           │    
│           │          │           │        │           │    
│           │          │           │        │           │    
├───────────┤          ├───────────┤        │           │    
│           │          │           │  Read  │           │    
│ .text     ├─────────►│ .text     ◄────────┤    EDR    │    
│           │          │  Encrypted│  Scan  │           │    
├───────────┤          ├───────────┤        │           │    
│           │          │           │        │           │    
│           │          │           ◄────────┤           │    
│ .data     │          │ .data     │        │           │    
│           │          │  Encrypted│        └───────────┘    
│           │          │           │                         
└───────────┘          └───────────┘                         

A beacon usually calls Sleep() for a certain amount of time. If it uses memory encryption, any scans performed during this time will just see encrypted memory.
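
Why the scanner comes up empty can be shown with a toy model. This is only an illustration of the principle on plain byte strings: the signature is made up, and the single-byte XOR stands in for whatever encryption an implant actually uses.

```python
SIGNATURE = b"BEACON_MARKER"  # hypothetical signature, not from any real product


def scan(memory: bytes, signatures: list[bytes]) -> bool:
    """Naive signature scan: True if any signature occurs in the buffer."""
    return any(sig in memory for sig in signatures)


def xor_mask(memory: bytes, key: int) -> bytes:
    """Stand-in for the encryption step; applying it twice restores the input."""
    return bytes(b ^ key for b in memory)


awake = b"\x00" * 16 + SIGNATURE + b"\x00" * 16   # region while the beacon runs
asleep = xor_mask(awake, 0x5A)                    # same region during sleep

print(scan(awake, [SIGNATURE]))   # scan while awake
print(scan(asleep, [SIGNATURE]))  # scan during sleep
```

The scan only has a chance in the short window where the region is decrypted, which is why the trigger for a memory scan matters so much.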

Callstack spoofing

The callstack is basically a function call hierarchy: a list of functions, each called by the one before it. When a process calls a syscall (or a hooked ntdll.dll function), this list can be retrieved by the EDR and analyzed.

When using direct syscalls, indirect syscalls, or other shenanigans, the callstack looks “wrong” by default, which can be identified by the EDR.
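
One way an EDR can notice this is to check where the syscall returns to. Here is a minimal sketch of that check, assuming the EDR knows the loaded modules of the process and the return address of the syscall. The `Module` type, the addresses and the function names are hypothetical.

```python
from typing import NamedTuple, Optional


class Module(NamedTuple):
    name: str
    base: int
    size: int


def owning_module(address: int, modules: list[Module]) -> Optional[Module]:
    """Return the module whose address range contains the address, if any."""
    for module in modules:
        if module.base <= address < module.base + module.size:
            return module
    return None


def classify_syscall_return(address: int, modules: list[Module]) -> str:
    """Classify the return address of a syscall instruction."""
    module = owning_module(address, modules)
    if module is None:
        return "unbacked memory"          # e.g. shellcode in a private allocation
    if module.name.lower() == "ntdll.dll":
        return "ntdll.dll"                # what a normal call looks like
    return f"outside ntdll.dll ({module.name})"  # syscall stub compiled into a module


# Hypothetical module list of a process
modules = [
    Module("app.exe", 0x7FF600000000, 0x10000),
    Module("ntdll.dll", 0x7FF800000000, 0x200000),
]
print(classify_syscall_return(0x7FF800001234, modules))
print(classify_syscall_return(0x7FF600000100, modules))
print(classify_syscall_return(0x1A0000, modules))
```

Note that an indirect syscall passes this particular check, as the syscall instruction really is inside ntdll.dll. For those, the frames further down the stack have to be inspected.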

Callstack spoofing makes sure that the callstack looks genuine again. It is a supporting technique: e.g. an AMSI-bypass can be detected by using callstacks, so we need to improve the AMSI-bypass so the callstack looks more natural.

The actual callstack spoofing usually doesn’t generate telemetry, and can be implemented pretty safely. But existing callstack-spoofing implementations, when re-used, can be identified by signature scanning (be it on-disk or in-memory).

Suspicious callstack for NtDelayExecution(): [Image: Sleep Callstack Not Spoofed]

Clean (spoofed) callstack for NtDelayExecution(): [Image: Sleep Callstack Spoofed]

Anti-detection depends on faking the callstack, copying a clean one, or just hiding the malicious callstack. Many techniques exist to check the integrity of the callstack, often by correlating it with other information. For example, the thread start address should originate from a reasonable location.

In a normal thread, the user mode start address is typically the third function call in the thread’s stack – after ntdll!RtlUserThreadStart and kernel32!BaseThreadInitThunk. So, when the thread has been hijacked, this is going to be obvious in the call stack. For “early bird” APC injection, the base of the call stack will be ntdll!LdrInitializeThunk, ntdll!NtTestAlert, ntdll!KiUserApcDispatcher and then the injected code.
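
The check described above can be sketched as a simple pattern match on the base of the stack. It assumes the frames are already symbolized strings, ordered from the oldest frame upwards. The function name and the sample frames are made up for illustration.

```python
# Expected frames at the base of the stack, oldest first (from the text above)
NORMAL_BASE = ["ntdll!RtlUserThreadStart", "kernel32!BaseThreadInitThunk"]
EARLY_BIRD_BASE = ["ntdll!LdrInitializeThunk", "ntdll!NtTestAlert",
                   "ntdll!KiUserApcDispatcher"]


def classify_thread_base(frames: list[str]) -> tuple[str, "str | None"]:
    """Return (kind, first frame after the known base) for a call stack."""
    for kind, base in (("normal", NORMAL_BASE), ("early-bird APC", EARLY_BIRD_BASE)):
        if frames[:len(base)] == base:
            # The frame right after the base is the code that was started
            first = frames[len(base)] if len(frames) > len(base) else None
            return kind, first
    return "unexpected", None


# Hypothetical stacks, oldest frame first
normal = ["ntdll!RtlUserThreadStart", "kernel32!BaseThreadInitThunk",
          "app!WorkerMain", "kernel32!SleepEx"]
early_bird = EARLY_BIRD_BASE + ["0x1a0000"]

print(classify_thread_base(normal))
print(classify_thread_base(early_bird))
```

The returned frame is the interesting one: if it does not resolve to a symbol in a loaded module (like the raw address in the second example), the thread deserves a closer look.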

Remote Processes

The attacker can choose whether to mess with their own process, or with another one on the system. Most of the Windows functions described here can also be used on another process, just by calling OpenProcess() first.

This is mostly used for process injection. It is very useful to migrate into another process, like teams.exe. The C2 traffic can be hidden in the normal communication of the application, and as it runs JavaScript, it performs a lot of RW->RX allocations anyway.

Messing with remote processes is scrutinized more heavily by the EDR; it is safer to just stay in your own process. For migration, use DLL sideloading instead, or other techniques which do not depend on calling OpenProcess() on something.

This includes:

  • VirtualAllocEx() / VirtualFreeEx()
  • ReadProcessMemory() / WriteProcessMemory()
  • CreateRemoteThread()
  • QueryInformationProcess() / NtQueryInformationProcess()

 Process                                Child Process
┌──────────────┐                       ┌─────────────┐
│              │                       │             │
│              │ OpenProcess()         │             │
│              ├──────────────────────►│             │
│              │                handle │             │
│       HANDLE │◄──────────────────────┤             │
│              │                       │             │
│              │ VirtualAllocEx(handle)│             │
│              ├──────────────────────►│             │
└──────────────┘                       └─────────────┘
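
Why remote operations stand out can be seen from the EDR perspective: the caller and the target of the call are different processes. Below is a minimal sketch of such a filter, assuming telemetry events with the hypothetical keys `api`, `caller_pid` and `target_pid`.

```python
# APIs from the list above that take a process handle as target
REMOTE_APIS = {"VirtualAllocEx", "VirtualFreeEx", "ReadProcessMemory",
               "WriteProcessMemory", "CreateRemoteThread"}


def cross_process_events(events: list[dict]) -> list[dict]:
    """Keep only events where a process operates on a different process."""
    return [e for e in events
            if e["api"] in REMOTE_APIS and e["caller_pid"] != e["target_pid"]]


# Hypothetical telemetry
events = [
    {"api": "VirtualAllocEx", "caller_pid": 100, "target_pid": 100},      # own process
    {"api": "WriteProcessMemory", "caller_pid": 100, "target_pid": 200},  # remote
    {"api": "CreateRemoteThread", "caller_pid": 100, "target_pid": 200},  # remote
]
for event in cross_process_events(events):
    print(event["api"], event["caller_pid"], "->", event["target_pid"])
```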

Suspended processes

A very common approach is to create a suspended process with the CREATE_SUSPENDED flag, then mess with it, then let it execute/resume.

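STARTUPINFOA si = { sizeof(si) };
PROCESS_INFORMATION pi = { 0 };
CreateProcessA("C:\\Windows\\System32\\calc.exe", NULL, NULL, NULL, FALSE, CREATE_SUSPENDED, NULL, NULL, &si, &pi);
...
ResumeThread(pi.hThread);  // ResumeThread() takes the thread handle, not the process handle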

Many techniques depend on this functionality. Currently, using suspended processes doesn’t seem to bother the EDR much, but this may change in the future.

For example we can create a new process in suspended state, and queue an APC to execute our shellcode, which may make it invisible to an EDR (as it may be executed before KAPC injection).

 Process                                      Child Process
┌──────────────┐                             ┌─────────────┐  
│              │                             │             │  
│              │ CreateProcessA(suspended)   │             │  
│              ├────────────────────────────►│             │  
│              │                             │             │  
│       HANDLE │◄────────────────────────────┤             │  
│              │                             │             │  
│              │ VirtualAllocEx()            │             │  
│              │ WriteProcessMemory()        │             │  
│              │ QueueUserAPC()              │             │  
│              ├────────────────────────────►│             │  
│              │                             │             │  
│              │                             │             │  
│              │ ResumeThread()              │             │  
│              ├────────────────────────────►│             │  
└──────────────┘                             └─────────────┘  

Outro

EDR Wisdoms

  • Use threatcheck or avred to identify which part of your stuff gets identified by AV, and patch it
  • Memory scanning is performance intensive, and usually requires a trigger to be performed
  • Usermode AMSI is less and less relevant, and therefore AMSI-hooks patching too

Mistakes writing loaders

  • Using function calls to copy memory

  • Putting more than minimal amount of effort into handling entropy

  • Putting more than minimal amount of effort into handling encryption

  • Generating too much telemetry

  • Threads not starting in backed memory

  • Marking RX pages RW again

  • Having unclean callstacks
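
For the entropy item in the list above, it helps to know what is actually being measured. A minimal sketch of the Shannon entropy of a byte buffer, in bits per byte, follows. How a given AV weighs this value is not covered here.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (constant) up to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    probabilities = [count / total for count in Counter(data).values()]
    return sum(-p * math.log2(p) for p in probabilities)


print(shannon_entropy(b"\x00" * 64))       # a constant buffer
print(shannon_entropy(bytes(range(256))))  # every byte value exactly once
```

Encrypted or compressed payloads sit close to the upper end, while normal code and data sit well below it.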

Proposed Loader

Proposed loader layout:

                                                                        ┌──────────┐                  
                                                                        │ encrypted│                  
                                                                        │ Payload  │                  
                                                                        │          │                  
                                                                        └────┬─────┘                  
                                                                             │                        
                                                                             │                        
                                                                             ▼                        
┌───────────┐    ┌──────────────┐    ┌─────────────┐    ┌───────────┐   ┌──────────┐     ┌────────────┐
│   EXE     │    │  Execution   │    │ Anti        │    │EDR        │   │ Alloc RW │     │  Payload   │
│   File    ├───►│  Guardrails  ├───►│ Emulation   ├───►│conditioner├──►│ Decode/Cp├────►│  Execution │
│           │    │              │    │             │    │           │   │ RX       │     │            │
│           │    │              │    │             │    │           │   │ Exec     │     │            │
└───────────┘    └──────────────┘    └─────────────┘    └───────────┘   └──────────┘     └────────────┘
  • EXE File: All code should be contained in the .text section (IMAGE)
  • Execution Guardrails: Only let it execute on the intended target (Anti-Middleboxes)
  • Anti-Emulation: Stop AV emulating our binary (mem usage, cpu cycles count, time trickery…)
  • EDR Feng-Shui (the “EDR conditioner” box): Condition the EDR by running our Alloc/Copy/VirtualProtect loop many times with non-malicious data, freeing it each time
  • Payload: Encrypted (how doesn’t matter)
  • Alloc/Decode/Virtualprotect/Exec: As normal as possible (avoid using DLL functions here). Avoid RWX.
  • Payload Execution: As normal as possible (jmp to payload, avoid creating new threads)

Not part of this article

Detections based on:

  • File access
  • Registry access
  • Network access

Low level techniques which are not discussed:

  • Software breakpoints
  • Hardware breakpoints
  • VEH
  • APC injection