http://alex-ionescu.com/publications/euskalhack/euskalhack2017-cfg.pdf
This is a talk by Alex Ionescu that introduces a new approach to bypassing CFG.
MRDATA
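Starting with Windows 8.1, Microsoft added protection for the CFG bitmap: the bitmap pointer and a number of other global variables were moved into the .mrdata section of the file. This is a new PE section that holds mutable read-only data. The section is marked PAGE_READONLY when the module is loaded, so in theory it cannot be modified.
Sometimes, however, ntdll itself needs to modify data in the .mrdata section. For this purpose Windows provides a new function, LdrProtectMrdata(bProtect), which turns protection of the .mrdata section on or off: passing 0 means unprotect, passing 1 means protect.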
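This function is obviously called during module load and unload to update some data, but some functions also call it at run time. SetProtectedPolicy and GetProtectedPolicy, for example, both use it. These two functions set and query the protection policies of a process. The policies are stored in memory allocated from LdrMrdataHeap, which means they live in the .mrdata section.
Many other functions call it as well. A few of them are listed here: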
    RtlAddFunctionTable
    RtlAddGrowableFunctionTable
    RtlDeleteFunctionTable
    RtlDeleteGrowableFunctionTable
    RtlInsertInvertedFunctionTable
    RtlInstallFunctionTableCallback
    RtlSetProtectedPolicy
    RtlpAddVectoredHandler
    RtlpCallVectoredHandlers
    RtlpRemoveVectoredHandler
    RtlxRemoveInvertedFunctionTable
Bypassing CFG with MRDATA
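When Edge JITs code it performs a large number of operations on Growable Function Tables, and these operations call LdrProtectMrdata many times to change the protection of the .mrdata section. If an attacker triggers JIT repeatedly, the protection of the .mrdata section changes frequently.
A Growable Function Table is clearly a shared resource, so Microsoft guards the table with an SRWLock: the lock must be acquired before the table can be modified.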
SRWLock
An SRWLock is a lightweight reader/writer lock. It is essentially a single pointer-sized value that encodes its state as follows:
    bit   31 ......................... 16 ......................... 4 | 3 | 2 | 1 | 0
          |<------------------- upper 28 bits -------------------->| |<- 4 flags ->|
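The low four bits of the pointer are used as four separate flags:
- OWNED (bit 0): set when a thread is currently reading or writing the resource.
- CONTENDED (bit 1): set when one or more threads are waiting for the resource, in which case the upper bits point to the list of wait blocks.
- SHARED (bit 2): set when the resource is held in shared (read) mode by one or more threads.
- CONTENTION_LOCK (bit 3): set while a thread is acquiring the WAITBLOCK structure pointer.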
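The upper 28 bits of the pointer are the address bits. When no thread is requesting exclusive access, they hold the number of threads that currently share the resource for reading. When one or more threads are waiting for the resource, they instead point to a linked list of structures, whose address is aligned to 0x10. If a thread calls AcquireSRWLockExclusive or AcquireSRWLockShared while other threads are reading or writing the resource, it builds a structure on its own stack and links it into the list that the SRWLock points to. Every thread that has to wait to read or write the resource builds such a structure on its stack and adds it to the list.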
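The encoding described above can be sketched in a few lines of C. This is only an illustration under the assumptions stated in the text (four flag bits, the rest used as a share count or an aligned wait block address); the type `srw_view` and the helper names are hypothetical and are not part of ntdll or ReactOS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits in the low nibble, mirroring the RTL_SRWLOCK_* names in this article. */
#define SRW_OWNED           ((uintptr_t)1 << 0)
#define SRW_CONTENDED       ((uintptr_t)1 << 1)
#define SRW_SHARED          ((uintptr_t)1 << 2)
#define SRW_CONTENTION_LOCK ((uintptr_t)1 << 3)
#define SRW_BITS            4
#define SRW_FLAG_MASK       (((uintptr_t)1 << SRW_BITS) - 1)

/* Hypothetical helper type: the four flags of one lock word. */
typedef struct {
    bool owned, contended, shared, contention_lock;
} srw_view;

static srw_view srw_decode(uintptr_t value)
{
    srw_view v;
    v.owned           = (value & SRW_OWNED) != 0;
    v.contended       = (value & SRW_CONTENDED) != 0;
    v.shared          = (value & SRW_SHARED) != 0;
    v.contention_lock = (value & SRW_CONTENTION_LOCK) != 0;
    return v;
}

/* Without waiters, the upper bits are the number of readers. */
static uintptr_t srw_share_count(uintptr_t value)
{
    assert(!(value & SRW_CONTENDED));
    return value >> SRW_BITS;
}

/* With waiters, the upper bits are the 0x10-aligned wait block address. */
static uintptr_t srw_wait_block(uintptr_t value)
{
    return value & ~SRW_FLAG_MASK;
}
```

For example, the value 0x35 decodes as OWNED and SHARED with a share count of 3, while 0x1007 would refer to a wait block at address 0x1000.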
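The wait block structure is defined as follows:
    typedef struct _RTLP_SRWLOCK_WAITBLOCK
    {
        /* Number of shared acquirers */
        LONG SharedCount;

        /* Last wait block in the chain (only valid in the first wait block) */
        volatile struct _RTLP_SRWLOCK_WAITBLOCK *Last;

        /* Next wait block in the chain */
        volatile struct _RTLP_SRWLOCK_WAITBLOCK *Next;

        union
        {
            /* Used by exclusive wait blocks */
            LONG Wake;

            /* Used by shared wait blocks */
            struct
            {
                PRTLP_SRWLOCK_SHARED_WAKE SharedWakeChain;
                PRTLP_SRWLOCK_SHARED_WAKE LastSharedWake;
            };
        };

        /* TRUE if this is an exclusive wait block */
        BOOLEAN Exclusive;
    } volatile RTLP_SRWLOCK_WAITBLOCK, *PRTLP_SRWLOCK_WAITBLOCK;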
The singly linked list of shared waiters uses the following structure:
    typedef struct _RTLP_SRWLOCK_SHARED_WAKE
    {
        LONG Wake;
        volatile struct _RTLP_SRWLOCK_SHARED_WAKE *Next;
    } volatile RTLP_SRWLOCK_SHARED_WAKE, *PRTLP_SRWLOCK_SHARED_WAKE;
Because a write is always exclusive, the focus here is on exclusive requests.
AcquireSRWLockExclusive maps to the function RtlAcquireSRWLockExclusive. Its source is shown below; the comments describe the flow of the code.
    VOID NTAPI
    RtlAcquireSRWLockExclusive(IN OUT PRTL_SRWLOCK SRWLock)
    {
        __ALIGNED(16) RTLP_SRWLOCK_WAITBLOCK StackWaitBlock;
        PRTLP_SRWLOCK_WAITBLOCK First, Last;

        /* Fast path: if the OWNED bit was clear, the lock is now ours */
        if (InterlockedBitTestAndSetPointer(&SRWLock->Ptr, RTL_SRWLOCK_OWNED_BIT)) {
            LONG_PTR CurrentValue, NewValue;

            while (1) {
                CurrentValue = *(volatile LONG_PTR *)&SRWLock->Ptr;

                if (CurrentValue & RTL_SRWLOCK_SHARED) {
                    /* The lock is held by readers */
                    if (CurrentValue & RTL_SRWLOCK_CONTENDED) {
                        /* Other threads are already waiting: append to the list */
                        goto AddWaitBlock;
                    } else {
                        /* First waiter: keep the reader count in the wait block
                           and publish the wait block in the lock word */
                        StackWaitBlock.Exclusive = TRUE;
                        StackWaitBlock.SharedCount = CurrentValue >> RTL_SRWLOCK_BITS;
                        StackWaitBlock.Next = NULL;
                        StackWaitBlock.Last = &StackWaitBlock;
                        StackWaitBlock.Wake = 0;
                        ASSERT_SRW_WAITBLOCK(&StackWaitBlock);

                        NewValue = (ULONG_PTR)&StackWaitBlock | RTL_SRWLOCK_SHARED |
                                   RTL_SRWLOCK_CONTENDED | RTL_SRWLOCK_OWNED;
                        if (InterlockedCompareExchangePointer(&SRWLock->Ptr, (PVOID)NewValue,
                                                              (PVOID)CurrentValue) == (PVOID)CurrentValue) {
                            RtlpAcquireSRWLockExclusiveWait(SRWLock, &StackWaitBlock);
                            break;
                        }
                    }
                } else {
                    if (CurrentValue & RTL_SRWLOCK_OWNED) {
                        /* The lock is held exclusively */
                        if (CurrentValue & RTL_SRWLOCK_CONTENDED) {
    AddWaitBlock:
                            /* Append our wait block to the end of the existing list */
                            StackWaitBlock.Exclusive = TRUE;
                            StackWaitBlock.SharedCount = 0;
                            StackWaitBlock.Next = NULL;
                            StackWaitBlock.Last = &StackWaitBlock;
                            StackWaitBlock.Wake = 0;
                            ASSERT_SRW_WAITBLOCK(&StackWaitBlock);

                            First = RtlpAcquireWaitBlockLock(SRWLock);
                            if (First != NULL) {
                                Last = First->Last;
                                Last->Next = &StackWaitBlock;
                                First->Last = &StackWaitBlock;
                                RtlpReleaseWaitBlockLock(SRWLock);
                                RtlpAcquireSRWLockExclusiveWait(SRWLock, &StackWaitBlock);
                                break;
                            }
                        } else {
                            /* First waiter behind an exclusive owner */
                            StackWaitBlock.Exclusive = TRUE;
                            StackWaitBlock.SharedCount = 0;
                            StackWaitBlock.Next = NULL;
                            StackWaitBlock.Last = &StackWaitBlock;
                            StackWaitBlock.Wake = 0;
                            ASSERT_SRW_WAITBLOCK(&StackWaitBlock);

                            NewValue = (ULONG_PTR)&StackWaitBlock | RTL_SRWLOCK_OWNED |
                                       RTL_SRWLOCK_CONTENDED;
                            if (InterlockedCompareExchangePointer(&SRWLock->Ptr, (PVOID)NewValue,
                                                                  (PVOID)CurrentValue) == (PVOID)CurrentValue) {
                                RtlpAcquireSRWLockExclusiveWait(SRWLock, &StackWaitBlock);
                                break;
                            }
                        }
                    } else {
                        /* The lock became free in the meantime: try to take it */
                        if (!InterlockedBitTestAndSetPointer(&SRWLock->Ptr, RTL_SRWLOCK_OWNED_BIT)) {
                            break;
                        }
                    }
                }

                YieldProcessor();
            }
        }
    }
    VOID NTAPI
    RtlpAcquireSRWLockExclusiveWait(IN OUT PRTL_SRWLOCK SRWLock,
                                    IN PRTLP_SRWLOCK_WAITBLOCK WaitBlock)
    {
        LONG_PTR CurrentValue;

        while (1) {
            CurrentValue = *(volatile LONG_PTR *)&SRWLock->Ptr;
            if (!(CurrentValue & RTL_SRWLOCK_SHARED)) {
                if (CurrentValue & RTL_SRWLOCK_CONTENDED) {
                    if (WaitBlock->Wake != 0) {
                        break;
                    }
                } else {
                    break;
                }
            }

            /* The loop ends only when no thread is reading and either the lock
               is no longer contended or the owning thread has set this wait
               block's Wake flag to a non-zero value. */
            YieldProcessor();
        }
    }
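The behavior of RtlAcquireSRWLockExclusive can be summarized as follows:
- If threads are still reading the resource and no writer is waiting yet, the thread publishes its WAITBLOCK, sets the flags (SHARED, CONTENDED, OWNED) and starts waiting. Once all readers have released the lock, the wait ends and the thread accesses the resource exclusively.
- If other threads are already waiting, the thread appends its WAITBLOCK to the end of the list that the SRWLock pointer refers to and starts waiting. The wait ends only after every waiter queued before it has released the lock; the thread then accesses the resource exclusively.
- If a thread is writing the resource but no other thread is waiting, the thread publishes its WAITBLOCK, sets the flags (CONTENDED, OWNED) and starts waiting. Once the writer has released the lock, the wait ends and the thread accesses the resource exclusively.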
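The three cases above, plus the uncontended fast path, depend only on the flag bits of the lock word. The following sketch classifies a lock value into the path an exclusive acquire would take. It mirrors the branch structure of the ReactOS code quoted in this article; the enum and function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SRW_OWNED     ((uintptr_t)1 << 0)
#define SRW_CONTENDED ((uintptr_t)1 << 1)
#define SRW_SHARED    ((uintptr_t)1 << 2)

/* Hypothetical names for the paths taken by an exclusive acquire. */
typedef enum {
    PATH_ACQUIRE_IMMEDIATELY,   /* lock is free: just set OWNED          */
    PATH_WAIT_BEHIND_READERS,   /* first waiter, readers hold the lock   */
    PATH_WAIT_BEHIND_WRITER,    /* first waiter, a writer holds the lock */
    PATH_APPEND_TO_WAIT_LIST    /* waiters exist: append our wait block  */
} srw_exclusive_path;

static srw_exclusive_path srw_classify_exclusive(uintptr_t value)
{
    if (value & SRW_SHARED) {
        return (value & SRW_CONTENDED) ? PATH_APPEND_TO_WAIT_LIST
                                       : PATH_WAIT_BEHIND_READERS;
    }
    if (value & SRW_OWNED) {
        return (value & SRW_CONTENDED) ? PATH_APPEND_TO_WAIT_LIST
                                       : PATH_WAIT_BEHIND_WRITER;
    }
    return PATH_ACQUIRE_IMMEDIATELY;
}
```

Note that every path except the first one ends in a wait, which is what makes the flag bits an interesting target for corruption.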
The code of the matching release function, RtlReleaseSRWLockExclusive, is as follows:
    VOID NTAPI
    RtlReleaseSRWLockExclusive(IN OUT PRTL_SRWLOCK SRWLock)
    {
        LONG_PTR CurrentValue, NewValue;
        PRTLP_SRWLOCK_WAITBLOCK WaitBlock;

        while (1) {
            CurrentValue = *(volatile LONG_PTR *)&SRWLock->Ptr;

            if (!(CurrentValue & RTL_SRWLOCK_OWNED)) {
                RtlRaiseStatus(STATUS_RESOURCE_NOT_OWNED);
            }

            if (!(CurrentValue & RTL_SRWLOCK_SHARED)) {
                if (CurrentValue & RTL_SRWLOCK_CONTENDED) {
                    /* There are waiters: hand the lock over to them */
                    WaitBlock = RtlpAcquireWaitBlockLock(SRWLock);
                    if (WaitBlock != NULL) {
                        RtlpReleaseWaitBlockLockExclusive(SRWLock, WaitBlock);
                        break;
                    }
                } else {
                    /* No waiters: simply clear the lock word */
                    ASSERT(!(CurrentValue & ~RTL_SRWLOCK_OWNED));

                    NewValue = 0;
                    if (InterlockedCompareExchangePointer(&SRWLock->Ptr, (PVOID)NewValue,
                                                          (PVOID)CurrentValue) == (PVOID)CurrentValue) {
                        break;
                    }
                }
            } else {
                /* The lock is held in shared mode, not exclusively */
                RtlRaiseStatus(STATUS_RESOURCE_NOT_OWNED);
            }

            YieldProcessor();
        }
    }
RtlReleaseSRWLockShared is similar, except that it checks SharedCount: the threads in the wait list are woken only when SharedCount drops to 0.
    VOID NTAPI
    RtlReleaseSRWLockShared(IN OUT PRTL_SRWLOCK SRWLock)
    {
        LONG_PTR CurrentValue, NewValue;
        PRTLP_SRWLOCK_WAITBLOCK WaitBlock;
        BOOLEAN LastShared;

        while (1) {
            CurrentValue = *(volatile LONG_PTR *)&SRWLock->Ptr;

            if (CurrentValue & RTL_SRWLOCK_SHARED) {
                if (CurrentValue & RTL_SRWLOCK_CONTENDED) {
                    /* There are waiters: the reader count lives in the wait block */
                    WaitBlock = RtlpAcquireWaitBlockLock(SRWLock);
                    if (WaitBlock != NULL) {
                        LastShared = (--WaitBlock->SharedCount == 0);
                        if (LastShared)
                            RtlpReleaseWaitBlockLockLastShared(SRWLock, WaitBlock);
                        else
                            RtlpReleaseWaitBlockLock(SRWLock);
                        break;
                    }
                } else {
                    /* No waiters: the reader count lives in the lock word */
                    NewValue = CurrentValue >> RTL_SRWLOCK_BITS;
                    if (--NewValue != 0) {
                        NewValue = (NewValue << RTL_SRWLOCK_BITS) | RTL_SRWLOCK_SHARED |
                                   RTL_SRWLOCK_OWNED;
                    }

                    if (InterlockedCompareExchangePointer(&SRWLock->Ptr, (PVOID)NewValue,
                                                          (PVOID)CurrentValue) == (PVOID)CurrentValue) {
                        break;
                    }
                }
            } else {
                RtlRaiseStatus(STATUS_RESOURCE_NOT_OWNED);
            }

            YieldProcessor();
        }
    }
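The uncontended branch of the shared release is plain arithmetic on the lock word, which the following sketch isolates. It assumes the same bit layout as the rest of this article and leaves out the atomic compare-exchange retry loop; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SRW_OWNED     ((uintptr_t)1 << 0)
#define SRW_CONTENDED ((uintptr_t)1 << 1)
#define SRW_SHARED    ((uintptr_t)1 << 2)
#define SRW_BITS      4

/* Lock word after one reader releases, when no thread is waiting.
   Sketch only: the real code applies this with a compare-exchange loop. */
static uintptr_t srw_release_shared_uncontended(uintptr_t value)
{
    assert((value & SRW_SHARED) && !(value & SRW_CONTENDED));

    uintptr_t readers = value >> SRW_BITS;
    assert(readers > 0);

    readers--;
    if (readers == 0) {
        return 0;   /* last reader: the lock is completely free again */
    }
    return (readers << SRW_BITS) | SRW_SHARED | SRW_OWNED;
}
```

With two readers the word 0x25 becomes 0x15 after the first release and 0 after the second.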
The Windows implementation differs from the ReactOS one, but the overall idea is the same. The (decompiled) code of RtlAcquireSRWLockExclusive is as follows:
    #define SRWLockSpinCount 1024
    #define Busy_Lock    1
    #define Wait_Lock    2
    #define Release_Lock 4
    #define Mixed_Lock   8

    struct _SyncItem
    {
        _SyncItem* back;
        _SyncItem* notify;
        _SyncItem* next;
        QWORD shareCount;
        DWORD flag;
    };

    void __fastcall RtlAcquireSRWLockExclusive(volatile PRTL_SRWLOCK *srwlock)
    {
        __declspec( align( 16 ) ) _SyncItem syn = {0};

        _RDI = (volatile signed __int64 *)srwlock;
        v15 = 0;
        if ( _interlockedbittestandset64(srwlock, 0i64) )
        {
            lockStatu = srwlock->ptr;
            while ( 1 )
            {
                if ( lockStatu & Busy_Lock )
                {
                    if ( (unsigned __int8)RtlpWaitCouldDeadlock(a1, a2, a3, a4, v9) )
                        ZwTerminateProcess(-1i64, 3221225547i64);
                    syn->shareCount = NtCurrentTeb()->ClientId.UniqueThread;
                    v3 = 0;
                    syn->flag = 3;
                    syn->next = null;
                    if ( lockStatu & Wait_Lock )
                    {
                        syn->notify = null;
                        syn->bala = -1;
                        a1 = (volatile signed __int32 *)(unsigned __int8)lockStatu;
                        syn->back = lockStatu & 0xFFFFFFFFFFFFFFF0;
                        newStatu = &syn | lockStatu & 8 | 7;
                        v3 = ~((unsigned __int8)lockStatu >> 2) & 1;
                    }
                    else
                    {
                        syn->notify = &syn;
                        if ( (lockStatu >> 4) > 1 )
                            newStatu = &syn | Wait_Lock | Busy_Lock | Mixed_Lock;
                        else
                            newStatu = &syn | Wait_Lock | Busy_Lock;
                        if ( !(lockStatu >> 4) )
                            syn->bala = -2;
                    }
                    v8 = _InterlockedCompareExchange(srwlock->ptr, newStatu, lockStatu);
                    if ( v8 == lockStatu )
                    {
                        if ( v3 )
                            OptimizeSRWLockList(srwlock, newStatu);
                        for ( int i = SRWLockSpinCount; i > 0; --i )
                        {
                            if ( !(syn.flag & 2) )
                                break;
                            _mm_pause();
                        }
                        if ( interlockedbittestandreset(syn->flag, 1) )
                        {
                            do
                                NtWaitForAlertByThreadId(srwlock, 0);
                            while ( syn->flag & 4 );
                        }
                    }
                    else
                    {
                        RtlBackoff(&v15);
                        lockStatu = (size_t)pSRWLock->Ptr;
                        continue;
                    }
                }
                else
                {
                    if ( lockStatu == _InterlockedCompareExchange(srwlock, lockStatu + 1, lockStatu) )
                        return;
                    RtlBackoff(&v15);
                    lockStatu = (size_t)pSRWLock->Ptr;
                    continue;
                }
            }
        }
    }
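In general, an SRWLock that guards MRDATA is not itself located in MRDATA, yet some SRWLocks are only acquired after the protection of MRDATA has already been removed. This is an odd design, and it suggests that we may be able to achieve something by tampering with the data of an SRWLock.
First, pick any function that calls LdrProtectMrdata and look at it. The function chosen here is RtlDeleteGrowableFunctionTable, which is called when a JIT code segment is reclaimed. Its pseudocode is shown below, with some unrelated parts removed for readability: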
    __int64 __fastcall RtlDeleteGrowableFunctionTable(__int64 a1)
    {
        LdrProtectMrdata(0i64);

        if ( qword_18017A370 )
        {
            RtlAcquireSRWLockExclusive(&LdrpMrdataLock);
            v3 = *(_DWORD *)LdrpMrdataHeapUnprotected;
            if ( !*(_DWORD *)LdrpMrdataHeapUnprotected )
                LdrpChangeMrdataHeapProtection(4i64);
            *(_DWORD *)LdrpMrdataHeapUnprotected = v3 + 1;
            RtlReleaseSRWLockExclusive(&LdrpMrdataLock);
        }

        RtlAcquireSRWLockExclusive(&RtlpDynamicFunctionTableLock);
        RtlAvlRemoveNode(&RtlpDynamicFunctionTableTree, v1 + 11);
        *v5 = v4;
        *(_QWORD *)(v4 + 8) = v5;
        RtlReleaseSRWLockExclusive(&RtlpDynamicFunctionTableLock);

        RtlFreeHeap(v6, 0i64, v1);

        if ( qword_18017A370 )
        {
            RtlAcquireSRWLockExclusive(&LdrpMrdataLock);
            v7 = *(_DWORD *)LdrpMrdataHeapUnprotected;
            *(_DWORD *)LdrpMrdataHeapUnprotected = v7 - 1;
            if ( v7 == 1 )
                LdrpChangeMrdataHeapProtection(2i64);
            RtlReleaseSRWLockExclusive(&LdrpMrdataLock);
        }

        LdrProtectMrdata(1i64);
        return ;
    }
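As the code above shows, the function acquires an SRWLock three times, and all three operations involve .mrdata. The function therefore calls LdrProtectMrdata(0) at its start and LdrProtectMrdata(1) at its end to turn the protection of the .mrdata section off and back on.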
    LdrProtectMrdata( a1 )
    {
        if ( a1 )
        {
            LdrpChangeMrdataProtection(2u);
        }
        else
        {
            LdrpChangeMrdataProtection(4u);
        }
    }
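The protection logic seen in the two pseudocode listings can be modelled without any Windows API. The sketch below is a simplified model, not the ntdll implementation: the values 2 and 4 match the Windows PAGE_READONLY (0x02) and PAGE_READWRITE (0x04) constants, the counter stands in for LdrpMrdataHeapUnprotected, and all type and function names are hypothetical.

```c
#include <assert.h>

/* Same numeric values as PAGE_READONLY and PAGE_READWRITE on Windows. */
enum { MODEL_PAGE_READONLY = 2, MODEL_PAGE_READWRITE = 4 };

/* Model of the flag handling in LdrProtectMrdata: 1 protects, 0 unprotects. */
static unsigned mrdata_protection_for(int protect)
{
    return protect ? MODEL_PAGE_READONLY : MODEL_PAGE_READWRITE;
}

/* Model of the reference-counted heap protection in the listing above. */
typedef struct {
    unsigned protection;       /* current page protection of the heap */
    unsigned unprotect_count;  /* number of callers that need it writable */
} mrdata_heap_model;

static void heap_unprotect(mrdata_heap_model *h)
{
    if (h->unprotect_count == 0) {
        h->protection = MODEL_PAGE_READWRITE;  /* first user makes it writable */
    }
    h->unprotect_count++;
}

static void heap_protect(mrdata_heap_model *h)
{
    assert(h->unprotect_count > 0);
    if (--h->unprotect_count == 0) {
        h->protection = MODEL_PAGE_READONLY;   /* last user restores read-only */
    }
}
```

The point of the counter is that the heap stays writable as long as at least one caller is between its unprotect and protect steps. A caller that never reaches its protect step therefore keeps the region writable indefinitely.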
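Suppose we change the flag bits of LdrpMrdataLock or RtlpDynamicFunctionTableLock to RTL_SRWLOCK_SHARED | RTL_SRWLOCK_CONTENDED | RTL_SRWLOCK_OWNED. According to the description of the SRWLock functions above, the lock is then treated as already held, and every new request is suspended. The function above is therefore suspended when it requests the SRWLock, and none of the operations after that point are executed, including the final LdrProtectMrdata(1i64). Because LdrProtectMrdata(0i64) has already run at the start of the function, the protection of the .mrdata section stays switched off.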
Modifying the SRWLock thus breaks the integrity of the thread's call sequence and leaves the attacker with write access to .mrdata.
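The effect can be illustrated with a small single-threaded model. This is a sketch of the reasoning only, under loud simplifications: the lock is tried once instead of blocking, "hanging" is represented by returning false, and all names (`process_model`, `delete_function_table_model`, and so on) are hypothetical stand-ins for the ntdll internals discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SRW_OWNED     ((uintptr_t)1 << 0)
#define SRW_CONTENDED ((uintptr_t)1 << 1)
#define SRW_SHARED    ((uintptr_t)1 << 2)

enum { MODEL_PAGE_READONLY = 2, MODEL_PAGE_READWRITE = 4 };

typedef struct {
    unsigned  mrdata_protection;  /* protection of the modelled .mrdata      */
    uintptr_t table_lock;         /* stands in for the function table lock   */
} process_model;

/* Succeeds only if the lock word is free. The real function would block
   here instead of returning false. */
static bool try_acquire_exclusive(uintptr_t *lock)
{
    if (*lock != 0) {
        return false;
    }
    *lock = SRW_OWNED;
    return true;
}

static void release_exclusive(uintptr_t *lock)
{
    assert(*lock == SRW_OWNED);
    *lock = 0;
}

/* Returns true if the call completes, false if the thread would hang. */
static bool delete_function_table_model(process_model *p)
{
    p->mrdata_protection = MODEL_PAGE_READWRITE;   /* LdrProtectMrdata(0) */

    if (!try_acquire_exclusive(&p->table_lock)) {
        return false;                              /* never reaches the re-protect step */
    }
    /* ... the function table update would happen here ... */
    release_exclusive(&p->table_lock);

    p->mrdata_protection = MODEL_PAGE_READONLY;    /* LdrProtectMrdata(1) */
    return true;
}
```

With an untouched lock the model ends with the region read-only again. With the lock word set to SHARED | CONTENDED | OWNED the call stops at the acquire and the region is left writable.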
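JIT work normally runs on a separate thread and does not affect the JavaScript parsing thread, so JavaScript code keeps running after write access to .mrdata has been obtained. The attacker can then use an arbitrary-address write to modify the bitmap pointer and bypass CFG.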
Summary
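This approach is novel and simple to carry out: corrupting the flag bits of a single lock is enough to bypass CFG. Moreover, because the .mrdata section also contains many sensitive global objects, the same method may produce other, unexpected effects.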
Reference