A forum for reverse engineering, OS internals and malware analysis 

Forum for discussion about kernel-mode development.
 #14042  by iSecure
 Sun Jun 17, 2012 1:26 pm
Hi there =)
Today I want to share some code that tries to implement this idea.

I know the title is kind of confusing. By "self-relocatable" I mean code that is capable of constantly moving itself in memory while it keeps executing.

I'm planning to write an article about this. But before I do, I want to be sure that it is worth the effort. Maybe it will turn out to be purely theoretical work with no practical application.

At first sight this might look like a perfect technique for rootkits.
Would it be difficult to detect / prevent / disinfect such a thing on a live system?

The attachment contains the source code.
It relocates every n seconds; as an example payload, it prints process creation / termination events.
When executed, you should see something like this in the debug output:
[Image: nomad.jpg, screenshot of the debug output]
Tested with Windows XP / 7 / 8 Developer Preview (x86).

I would like to hear your thoughts, opinions and criticism about this. Thanks.
 #14050  by Cr4sh
 Sun Jun 17, 2012 6:35 pm
I think that removing and re-registering all hooks/callbacks on each "relocation" is not a very good idea.

My variant:
************************************
* Registered hook/callback with a
* static address.
************************************
   |
   V
************************************
* Polymorphic trampoline that
* transfers execution to the main
* rootkit body: 10-20 KB of randomly
* generated code.
* Its address is constant, but on
* each body relocation the rootkit
* generates unique trampoline code.
************************************
   |
   V
************************************
* Main rootkit body at a random
* address that changes every
* N seconds.
************************************
Also, dividing the rootkit body code between a few different buffers and linking them with jmps would be interesting.
 #14056  by iSecure
 Sun Jun 17, 2012 7:11 pm
Yeah, you are right, registering/deregistering every time is a big overhead, but performance will soon stop being a problem on modern hardware.
The main problem would be that in the time window between removal and re-installation of the routines, they would not be called. This window is really small, though, so it would probably not happen too often.

The problem I see with code at a static location is that once it has been found, it will be dumped, reversed, analysed, etc.
The key idea of the subject was not to prevent detection, but to prevent further actions. Ideally, the time period dT1 required to change the code's location in memory should be less than the time period dT2 required to take an action (dT1 < dT2). But currently this is a big performance hit (I have tested my code with smaller delays and with no delay at all; it could probably be solved with better programming).

I am also planning to develop a technique that allows custom relocation of each image section individually, meaning the sections no longer form a contiguous memory region (Section[n].VA + Section[n].Size != Section[n+1].VA). I think this could be a step towards "dividing the rootkit body code between a few different buffers and linking them with jmps".
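For illustration, the contiguity condition above can be checked from the analysis side with a few lines of Python. This is only a sketch: the `Section` tuple, `find_gaps` and the sample values are hypothetical, and it assumes that in an ordinary loaded image each section starts where the previous one ends, rounded up to the section alignment (0x1000 here).

```python
from typing import List, NamedTuple, Tuple


class Section(NamedTuple):
    name: str
    va: int    # virtual address where the section starts
    size: int  # size of the section in memory


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def find_gaps(sections: List[Section], alignment: int = 0x1000) -> List[Tuple[str, str]]:
    """Return pairs of neighbouring sections that are not contiguous."""
    ordered = sorted(sections, key=lambda s: s.va)
    gaps = []
    for cur, nxt in zip(ordered, ordered[1:]):
        # Contiguous layout: next section begins at the aligned end of this one.
        if align_up(cur.va + cur.size, alignment) != nxt.va:
            gaps.append((cur.name, nxt.name))
    return gaps


# Hypothetical layouts, made up for this example.
normal = [Section(".text", 0x1000, 0x1800),
          Section(".data", 0x3000, 0x200),
          Section(".reloc", 0x4000, 0x100)]
scattered = [Section(".text", 0x1000, 0x1800),
             Section(".data", 0x9000, 0x200),
             Section(".reloc", 0x4000, 0x100)]

print(find_gaps(normal))     # []
print(find_gaps(scattered))  # [('.text', '.reloc'), ('.reloc', '.data')]
```

An image whose sections were relocated individually would show up here as one or more gaps, which is exactly the property described in the post.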
 #14122  by fasmotol
 Wed Jun 20, 2012 2:23 pm
Cr4sh, isn't trash generation too slow a process? I mean, if we want to relocate the body as often as possible. However, the solution with the polymorphic trampoline is a really good one, I thought of the same thing, heh. And a question, by the way: what value is dT1 ideally supposed to be (I mean some average number)? Or does it not matter how often the body is relocated (the main thing being that it is relocated)?
I think this could be a step towards "dividing the rootkit body code between a few different buffers and linking them with jmps".
+ imho it would be a good idea to change the execution flow every time by putting trash procedures into the control-flow graph.
 #14124  by iSecure
 Wed Jun 20, 2012 2:45 pm
Ideally, dT1 should be as close to 0 as possible =) Of course, values that are too low (e.g. 100 ms or less) will lead to a huge performance degradation of the system, because each relocation has to allocate/free memory, which is a kinda 'heavy' operation. And I'm still not sure whether a low enough value of dT1 will indeed protect the code from dump/analysis. The idea is: while dumping is in progress and not yet completed, the code should already have moved itself to another memory region.
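The dT1 / dT2 trade-off can be made concrete with a toy model. The sketch below is not taken from the attached source; it assumes relocation happens at a fixed interval dT1, that a dump takes a fixed time dT2 and starts at a uniformly random moment within an interval, and that the dump only succeeds if no relocation happens while it runs. The function name is hypothetical.

```python
def dump_success_probability(relocation_interval: float, dump_duration: float) -> float:
    """Chance that a dump started at a random moment finishes before the
    next relocation (toy model: fixed interval, uniform start time)."""
    if relocation_interval <= 0:
        raise ValueError("relocation_interval must be positive")
    if dump_duration < 0:
        raise ValueError("dump_duration must not be negative")
    if dump_duration >= relocation_interval:
        # dT1 <= dT2: every dump is interrupted by at least one relocation.
        return 0.0
    # The dump fits only if it starts in the first (dT1 - dT2) of the interval.
    return 1.0 - dump_duration / relocation_interval


print(dump_success_probability(10.0, 2.5))  # 0.75
print(dump_success_probability(0.1, 0.5))   # 0.0
```

Under these assumptions the probability only reaches zero once dT1 <= dT2, which matches the condition stated earlier in the thread. The model says nothing about a machine whose execution has been frozen, since relocation cannot run at all in that case.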
 #14138  by Cr4sh
 Thu Jun 21, 2012 12:43 pm
The key idea of the subject was not to prevent detection, but to prevent further actions
I mean, if we want to relocate the body as often as possible.
Actually, researchers (during analysis of rootkit samples) don't care about such "anti-detection", 'cause they can work with static physical memory dumps, so the idea of body relocation 'as is' will not work.
So, the key features of my ideas (from my previous message in this thread) are:
1) Complicating detection of the main rootkit body by using trampolines (lots of randomly generated code) to transfer execution to it.
2) Complicating dumping of the rootkit body after detection by splitting it into blocks that are located in different memory regions and linked with jmps.
The idea is: while dumping is in progress and not yet completed, the code should already have moved itself to another memory region.
A researcher can freeze execution of the target machine entirely and dump the whole physical memory, as I said above.
 #14140  by iSecure
 Thu Jun 21, 2012 12:57 pm
This is intended for live systems; ofc if you can attach WinDbg etc. it will not be impossible to find / stop / dump / analyse it... But before you can research this in a laboratory, you have to acquire a sample somehow from a live infected system.
cause they can work with static physical memory dumps
First they need to know the current position of the rootkit body (parts) in virtual memory, then convert it to physical addresses. How do they know which memory region to analyse in a full physical memory dump, if they don't know where to look for it?
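As a reminder of what "convert it to physical addresses" involves, here is a minimal sketch of classic 32-bit x86 translation with 4 KB pages. It assumes non-PAE paging and ignores large pages, and the page tables are modelled as plain dictionaries rather than read from a real dump; `split_va`, `translate` and the sample values are hypothetical.

```python
from typing import Dict, Optional, Tuple


def split_va(va: int) -> Tuple[int, int, int]:
    """Split a 32-bit virtual address into (directory index, table index, offset)."""
    pde_index = (va >> 22) & 0x3FF  # top 10 bits
    pte_index = (va >> 12) & 0x3FF  # middle 10 bits
    offset = va & 0xFFF             # low 12 bits
    return pde_index, pte_index, offset


def translate(va: int, page_directory: Dict[int, Dict[int, int]]) -> Optional[int]:
    """Walk toy page tables (index -> table -> frame number); None if unmapped."""
    pde_index, pte_index, offset = split_va(va)
    page_table = page_directory.get(pde_index)
    if page_table is None:
        return None
    frame = page_table.get(pte_index)
    if frame is None:
        return None
    return (frame << 12) | offset


# Toy mapping: one virtual page backed by physical frame 0x1F3.
toy_directory = {0x201: {0x14A: 0x1F3}}

print([hex(x) for x in split_va(0x8054A123)])     # ['0x201', '0x14a', '0x123']
print(hex(translate(0x8054A123, toy_directory)))  # 0x1f3123
```

With a full physical dump the same walk is done against the real page directory, so the virtual address still has to be known first, which is the point of the question above.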
2) Complicating dumping of the rootkit body after detection by splitting it into blocks that are located in different memory regions and linked with jmps.
This is a good trick, but difficult to implement (for me at the moment). And as I said above, I'm looking forward to implementing something like this.
What if a rootkit combines these 2 technologies: self-relocation in memory + self-splitting into blocks?
Last edited by iSecure on Thu Jun 21, 2012 1:03 pm, edited 1 time in total.
 #14141  by Cr4sh
 Thu Jun 21, 2012 1:02 pm
First they need to know the current position of the rootkit body (parts) in virtual memory, then convert it to physical addresses. How do they know which memory region to analyse in a full physical memory dump, if they don't know where to look for it?
Ok, step-by-step:
1) Suspend machine execution.
2) Dump _whole_ physical memory.
3) Analyze hooks/callbacks/threads using the memory dump.
4) Using the information about hooks/callbacks/threads, find the rootkit body location in the memory dump.
5) ...
6) PROFIT!
And this is not rocket science; such analysis can be done very easily with publicly available tools.
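Steps 3 and 4 can be sketched as a simple range check: take every hook/callback/thread start address recovered from the dump and flag those that do not point into any known module. The snippet below is a hedged illustration, not the output of any particular tool; the function name, labels and addresses are made up for the example.

```python
from typing import Dict, List, Tuple

Module = Tuple[str, int, int]  # (name, base address, size in bytes)


def find_orphan_pointers(pointers: Dict[str, int], modules: List[Module]) -> List[str]:
    """Return labels of pointers whose target lies outside every module range."""
    orphans = []
    for label, target in pointers.items():
        inside = any(base <= target < base + size for _, base, size in modules)
        if not inside:
            orphans.append(label)
    return orphans


# Hypothetical module list and callback table recovered from a memory dump.
modules = [("ntoskrnl.exe", 0x804D7000, 0x1F8000),
           ("hal.dll", 0x806CF000, 0x20000)]
pointers = {"process_notify[0]": 0x80501234,
            "process_notify[1]": 0x86A3C010}

print(find_orphan_pointers(pointers, modules))  # ['process_notify[1]']
```

A pointer that lands in memory not backed by any loaded module is a natural starting point for locating the body in the dump, wherever it happened to be at the moment the machine was frozen.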
What if a rootkit combines these 2 technologies: self-relocation in memory + self-splitting into blocks?
This is the best variant.
 #14142  by iSecure
 Thu Jun 21, 2012 1:08 pm
Step-by-step:
1) Suspend machine execution.
2) Dump _whole_ physical memory.
3) Analyze hooks/callbacks/threads using the memory dump.
4) Using the information about hooks/callbacks/threads, find the rootkit body location in the memory dump.
5) ...
6) PROFIT!
What if:
1) the researcher can't suspend the infected live system
2) there are no callbacks / new threads used by the rootkit

Yes, it sounds like an impossible scenario, but still =)
 #14144  by fasmotol
 Thu Jun 21, 2012 1:41 pm
self-relocation in memory + self-splitting into blocks
+ encrypting these code blocks, I mean when one block decrypts the next one and so on. So it would be hard work to put all the decrypted pieces together into a clean rootkit body.
1) the researcher can't suspend the infected live system
2) there are no callbacks / new threads used by the rootkit
1) Can you show an example or describe a situation like that?
2) Then you've got a strange rootkit, 'cause it has weak control over the system.