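This article analyzes open-source inline hook code and summarizes how inline hooking works and how it is implemented.
This article was first published on the Kanxue forum (看雪论坛); please credit the source when reposting.
Preface #
While interviewing recently for a security position at a large company, I was asked several questions about hooking. After a brief look at F8's open-source code, I wrote this article.
References and project code #
Articles: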
http://ele7enxxh.com/Android-Arm-Inline-Hook.html
http://gslab.qq.com/portal.php?mod=view&aid=168
Projects:
https://github.com/ele7enxxh/Android-Inline-Hook
https://github.com/F8LEFT/FAInHook
Usage #
MainActivity:
static {
        System.loadLibrary("FHook");
    }
    
public native String stringFromJNI();
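We will test the hook through this stringFromJNI() function (the function that actually gets hooked is getStr, which it calls).
Analysis #
Usage #
In the native layer, main.cpp registers stringFromJNI and then calls an init function from JNI_OnLoad: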
jstring stringFromJNI(
        JNIEnv *env,
        jobject ) {
     doInHook();
    // doGotHook();
    return env->NewStringUTF(getStr());
}
JNIEXPORT jint JNICALL JNI_OnLoad(JavaVM* vm, void *reserved) {
    JNIEnv* env = nullptr;
    jint resultstr = -1;
    if (vm->GetEnv((void **) &env, JNI_VERSION_1_6) != JNI_OK) {
        return -1;
    }
    auto jclazz = env->FindClass("com/example/l0phtg/hookstudyf8/MainActivity");
    JNINativeMethod natives[] = {
            {"stringFromJNI", "()Ljava/lang/String;", (void*)stringFromJNI}};
    env->RegisterNatives(jclazz, natives, 1);
    env->DeleteLocalRef(jclazz);
    init();
    return JNI_VERSION_1_6;
}
Now let's look at the init function in Hook.cpp:
bool init() {
    auto hook = FAInHook::instance();
    hook->registerHook((Elf_Addr)getStr,
                       (Elf_Addr)inlCallback,
                       (Elf_Addr*)&inlCallbackSrc);
    auto lib = dlopen("libFHook.so", RTLD_NOW);
    gotCallbackSrc = (const char* (*)())dlsym(lib, "_Z6getStrv");
    dlclose(lib);
    return false;
}
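As we can see, init obtains the FAInHook singleton through FAInHook::instance(), which calls new FAInHook() on first use,
and then calls registerHook to register a hook on getStr.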
FAInHook *FAInHook::instance() {
    static FAInHook* mIns = nullptr;
    if(mIns == nullptr) {
        mIns = new FAInHook();
    }
    return mIns;
}
We also use doInHook:
bool doInHook() {
    static bool isHooked = false;
    if (isHooked) {
        isHooked = false;
        FAInHook::instance()->unhookAll();
    } else {
        isHooked = true;
        FAInHook::instance()->hookAll();
    }
    return true;
}
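So our main task is to analyze how registerHook and doInHook are implemented.
registerHook function #
We start with registerHook.
First the parameters. A function has to be registered before it can be hooked.
Function declaration: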
    HOOK_STATUS registerHook(Elf_Addr orginalFunAddr, Elf_Addr newFunAddr,
                             Elf_Addr* callOrigin);
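Parameters: the address of the original function, the address of the new function, and an out-parameter that receives a pointer for calling the original function.
Main flow:
- Register the function information and compute the hook stub. First check whether orginalFunAddr and newFunAddr are function addresses, then call auto info = getHookInfo(orginalFunAddr); to fetch any existing hook information and check whether the function is already hooked.
- Determine the instruction type (Thumb, ARM, x86, ...).
- createStub(info) creates the jump stub. In Thumb mode this is ldr.w pc, [pc] followed by the address of newFunAddr, which transfers control to the new function.
- createCallOriginalStub(info) creates the stub for calling the original function, which mainly involves handling PC-relative instructions.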
 
//  register hook
FAInHook::HOOK_STATUS FAInHook::registerHook(
        Elf_Addr orginalFunAddr, Elf_Addr newFunAddr, Elf_Addr *callOrigin) {
    // register hook information, calc hook stub at the same time.
    if(!FAHook::MemHelper::isFunctionAddr((void *) orginalFunAddr)
       || !FAHook::MemHelper::isFunctionAddr((void *) newFunAddr)) {
        return FERROR_NOT_EXECUTABLE;
    }
    auto info = getHookInfo(orginalFunAddr);
    if(nullptr != info) {
        auto hookStatus = info->getHookStatus();
        if(FAHook::HOOKED == hookStatus) {
            return FERROR_ALREADY_HOOKED;
        } else if(FAHook::REGISTERED == hookStatus) {
            delHookInfo(info);
        }
    }
    // check for FunctionType
    auto type = FAHook::Instruction::getFunctionType(orginalFunAddr);
    if(FAHook::ERRTYPE == type) {
        return FERROR_UNKNOWN;
    }
    info = new FAHook::HookInfo((void *) orginalFunAddr, (void *) newFunAddr);
    info->setOriginalFunctionType(type);
    FAHook::Instruction* instruction = nullptr;
    switch(type) {
#if defined(__arm__)
        case FAHook::ARM:
            instruction = new FAHook::ArmInstruction();
            break;
        case FAHook::THUMB:
            instruction = new FAHook::ThumbInstruction();
            break;
#elif defined(__aarch64__)
        case FAHook::ARM64:
            instruction = new FAHook::Arm64Instruction();
            break;
#elif defined(__i386__) || defined(__x86_64__)
        case FAHook::X86:
        case FAHook::X64:
            instruction = new FAHook::IntelInstruction();
            break;
#elif defined(__mips64__)
            case FAHook::MIPS64:
            instruction = new FAHook::Mips64Instruction();
            break;
#elif defined(__mips__)
        case FAHook::MIPS:
            instruction = new FAHook::MipsInstruction();
            break;
#endif
        default:
            assert(false && "not support abi");
            return FERROR_UNKNOWN;
            break;
    }
    if(!instruction->createStub(info)
       || !instruction->createBackStub(info)
       || (callOrigin != nullptr) ?
            !instruction->createCallOriginalStub(info) : false  // want a callback
       ) {
        delete instruction;
        delete info;
        return FERROR_MEMORY;
    }
    addHookInfo(info);
    info->setHookStatus(FAHook::REGISTERED);
    if(callOrigin != nullptr) {
        *callOrigin = (Elf_Addr) info->getCallOriginalIns();
    }
    delete instruction;
    return FERROR_SUCCESS;
}
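One detail in the final condition of the listing above is worth a second look. In C++ the conditional operator ?: binds more loosely than ||, so the expression is parsed as (a || b || callOrigin != nullptr) ? c : false rather than a || b || (callOrigin != nullptr ? c : false). The sketch below uses hypothetical boolean stand-ins (they are not FAInHook names) to show where the two readings differ.

```cpp
// Hypothetical stand-ins for the sub-expressions in registerHook:
//   stubFailed   ~ !instruction->createStub(info)
//   backFailed   ~ !instruction->createBackStub(info)
//   wantCallback ~ (callOrigin != nullptr)
//   origFailed   ~ !instruction->createCallOriginalStub(info)

// How the compiler parses the expression as written in the listing:
// the whole || chain becomes the condition of the ?: operator.
constexpr bool asWritten(bool stubFailed, bool backFailed,
                         bool wantCallback, bool origFailed) {
    return stubFailed || backFailed || wantCallback ? origFailed : false;
}

// What the layout of the listing suggests was intended.
constexpr bool asIntended(bool stubFailed, bool backFailed,
                          bool wantCallback, bool origFailed) {
    return stubFailed || backFailed || (wantCallback ? origFailed : false);
}
```

With stubFailed set and origFailed clear, asWritten reports success while asIntended reports failure. In the Thumb path shown later createStub always returns true, so this may never matter in practice, but extra parentheses would make the intent explicit.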
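Checking whether an address is a function address #
- Open /proc/self/maps and read it line by line, using strstr to check the permissions (r-xp means readable and executable, i.e. code).
- Check that addr >= startAddr && addr <= endAddr.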
 
bool FAHook::MemHelper::isFunctionAddr(void *addr) {
    char buf[MAX_BUF];
    auto fp = fopen(maps, "r");
    if(nullptr == fp) {
        return false;
    }
    while(fgets(buf, MAX_BUF, fp)) {
        if(strstr(buf, "r-xp") != nullptr) {
            void* startAddr = (void*)strtoul(strtok(buf, "-"), nullptr, 16);
            void* endAddr = (void*)strtoul(strtok(nullptr, " "), nullptr, 16);
            if(addr >= startAddr && addr <= endAddr) {
                fclose(fp);
                return true;
            }
        }
    }
    fclose(fp);
    FLOGE(this functionAddr is not a function!);
    return false;
}
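The same check can be sketched with sscanf instead of strtok. This is a minimal variation under my own assumptions, not the project's code: the name isExecutableAddr is hypothetical, it looks only at the execute bit instead of matching the exact string r-xp, and it treats the end address as exclusive (which is how /proc/self/maps reports ranges), whereas the listing above uses <=.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>

// Returns true if addr lies inside a mapping listed in /proc/self/maps
// whose permission field has the execute bit set. Linux-specific.
// Assumption: lines fit in the buffer; an over-long line is split by fgets
// and its tail simply fails to parse.
bool isExecutableAddr(const void *addr) {
    std::FILE *fp = std::fopen("/proc/self/maps", "r");
    if (fp == nullptr) {
        return false;
    }
    const auto target = static_cast<std::uint64_t>(
        reinterpret_cast<std::uintptr_t>(addr));
    char line[1024];
    bool found = false;
    while (!found && std::fgets(line, sizeof(line), fp) != nullptr) {
        std::uint64_t start = 0, end = 0;
        char perms[5] = {0};
        // Each line begins with "start-end perms", for example "7f00-7f10 r-xp".
        if (std::sscanf(line, "%" SCNx64 "-%" SCNx64 " %4s",
                        &start, &end, perms) != 3) {
            continue;
        }
        // The end address is exclusive, so compare with a half-open range.
        found = perms[2] == 'x' && target >= start && target < end;
    }
    std::fclose(fp);
    return found;
}
```

Called with the address of any function in the process it should return true, and with the address of an ordinary data variable it should return false.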
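Getting the hook information for an address and checking whether it is already hooked #
hook_map is a std::map<Elf_Addr, FAHook::HookInfo*>. Its find function returns an iterator whose it->first and it->second give access to the key and the value. This step handles functions for which a hook has already been registered.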
FAHook::HookInfo *FAInHook::getHookInfo(Elf_Addr origFunAddr) {
    auto it = hook_map.find(origFunAddr);
    if(it == hook_map.end()) {
        return nullptr;
    }
    return it->second;
}
Getting the instruction type of orginalFunAddr #
auto type = FAHook::Instruction::getFunctionType(orginalFunAddr);
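Below is the implementation of getFunctionType. The instruction set is selected at compile time through the compiler's predefined macros (__arm__, __aarch64__ and so on). On 32-bit ARM we additionally have to decide whether the function uses Thumb instructions, which is done by inspecting the low bits of the address.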
static FunctionType getFunctionType(Elf_Addr functionAddr) {
#if defined(__arm__)
            if(0 == functionAddr) {
                return ERRTYPE;
            } else if((functionAddr & 3) == 0) {
                return ARM;
            } else {
                return THUMB;
            }
#elif defined(__aarch64__)
            return ARM64;
#elif defined(__i386__)
            return X86;
#elif defined(__x86_64__)
            return X64;
#elif defined(__mips64__)  /* mips64el-* toolchain defines __mips__ too */
            return MIPS64;
#elif defined(__mips__)
            return MIPS;
#endif
        }
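On 32-bit ARM a pointer to a Thumb function carries the instruction-set state in bit 0, which is why the address itself can be used to tell the two modes apart. The following sketch mirrors the __arm__ branch of the listing above; the names FuncType, classify and codeAddress are my own and do not appear in the project.

```cpp
#include <cstdint>

enum class FuncType { Error, Arm, Thumb };

// Same decision as the __arm__ branch above: a null address is an error,
// an address with both low bits clear is treated as ARM, anything else
// is treated as Thumb.
constexpr FuncType classify(std::uintptr_t addr) {
    return addr == 0          ? FuncType::Error
           : (addr & 3) == 0  ? FuncType::Arm
                              : FuncType::Thumb;
}

// A Thumb function pointer has bit 0 set; clearing it yields the address
// at which the first instruction is actually stored.
constexpr std::uintptr_t codeAddress(std::uintptr_t addr) {
    return addr & ~static_cast<std::uintptr_t>(1);
}
```

For example, classify(0x1001) is Thumb and codeAddress(0x1001) is 0x1000, which is the kind of adjustment createStub performs with addr & 0xFFFFFFFE.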
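Next, the most important part #
- info = new HookInfo(orginalFunAddr, newFunAddr);
- info->setOriginalFunctionType(type); sets the instruction type (ARM or Thumb).
- instruction = new FAHook::ArmInstruction(); creates the ARM or Thumb instruction helper. I found no explicitly declared constructor.
- instruction->createStub(info); creates the jump stub that jumps to newFunAddr.
- instruction->createCallOriginalStub(info) creates the stub for calling back into the original function.
 
Looking at HookInfo.h, FAHook is a namespace whose main content is the HookInfo class:
Let's start with its constructor.
It uses a C++ member initializer list to initialize the class members.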
HookInfo(void* originalAddr, void* hookAddr)
    : original_addr_(originalAddr), hook_addr_(hookAddr),
        original_stub_back_(nullptr), back_len_(0), call_original_ins_(nullptr),
        hook_status_(ERRSTATUS),
        original_function_type_(ERRTYPE), hook_function_type_(ERRTYPE){}
        
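Analyzing createStub(info)
Here we analyze FAHook::ThumbInstruction::createStub(FAHook::HookInfo *info):
- Align the address to 4 bytes (a NOP is prepended when the function entry is not 4-byte aligned).
- Save the stub instructions so they can be patched in later. The instruction is LDR.W PC, [PC]; see http://ele7enxxh.com/Android-Arm-Inline-Hook.html.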
bool FAHook::ThumbInstruction::createStub(FAHook::HookInfo *info) {
    auto stubSize = 0;
    uint8_t *stub = nullptr;
    uint32_t addr = (uint32_t)info->getOriginalAddr();
    auto clearBit0 = addr & 0xFFFFFFFE;
    if (clearBit0 % 4 != 0) {                       // need to align 4, just patch with nop
        stub = new uint8_t[10];
        ((uint16_t*)stub)[stubSize++] = 0xBF00;     //NOP
    } else {
        stub = new uint8_t[8];
    }
    ((uint16_t*)stub)[stubSize++] = 0xF8DF;
    ((uint16_t*)stub)[stubSize++] = 0xF000; // LDR.W PC, [PC]
    ((uint16_t*)stub)[stubSize++] = (uint32_t)info->getHookAddr() & 0xFFFF;
    ((uint16_t*)stub)[stubSize++] = (uint32_t)info->getHookAddr() >> 16;
    info->setJumpStubLen(stubSize * 2);
    info->setJumpStubBack(stub);
    return true;
}
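The alignment step exists because, in Thumb state, LDR.W PC, [PC] reads the PC as the instruction address plus 4, rounded down to a multiple of 4. When the instruction starts on a 4-byte boundary, the word loaded is exactly the one that follows the 4-byte instruction, which is where the hook address is stored. The sketch below rebuilds the same stub as a vector of halfwords so the two layouts are easy to compare; buildThumbJumpStub is a hypothetical name, and storing the address as low halfword then high halfword assumes a little-endian target, as the listing above does.

```cpp
#include <cstdint>
#include <vector>

// Builds the Thumb-2 jump stub as a sequence of 16-bit halfwords.
// originalAddr and hookAddr are raw function pointers (bit 0 may be set).
std::vector<std::uint16_t> buildThumbJumpStub(std::uint32_t originalAddr,
                                              std::uint32_t hookAddr) {
    std::vector<std::uint16_t> stub;
    const std::uint32_t codeAddr = originalAddr & ~1u;  // drop the Thumb bit
    if (codeAddr % 4 != 0) {
        stub.push_back(0xBF00);  // NOP, so LDR.W starts on a 4-byte boundary
    }
    stub.push_back(0xF8DF);      // LDR.W PC, [PC, #0], first halfword
    stub.push_back(0xF000);      //                     second halfword
    // 32-bit literal read by the LDR.W: the address of the hook function.
    stub.push_back(static_cast<std::uint16_t>(hookAddr & 0xFFFF));
    stub.push_back(static_cast<std::uint16_t>(hookAddr >> 16));
    return stub;
}
```

For an entry at 0x1000 (pointer value 0x1001) the stub is 4 halfwords, i.e. 8 bytes; for an entry at 0x1002 (pointer value 0x1003) it is 5 halfwords, i.e. 10 bytes, starting with the NOP.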
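Next, createCallOriginalStub(HookInfo *info) #
Implementation in thumbInstruction.cpp:
Background:
reinterpret_cast<uint16_t*> is a type conversion (it reinterprets the address as a pointer to 16-bit halfwords).
- Handle the case where the first instruction is already ldr.w pc, [pc].
- Call createExecMemory(length); to allocate the buffer.
- Fix up PC-relative instructions (ldr literal, 16-bit b, 32-bit b, bl, cbz, ldr.w, add).
 
Why do PC-relative instructions need to be fixed up?
Take b as an example.
Encoding: [15:12] 1101 [11:8] cond [7:0] imm8
Decoding:
- imm32 = SignExtend(imm8:'0', 32);
- BranchWritePC(PC + imm32)
 
The encoding of b stores only the immediate imm8, while the actual branch target is PC + imm32. With an inline hook, the original function still has to run after our hook function finishes, so the instructions that were overwritten by the patch (which we saved) must be executed. Those saved instructions live in memory we obtained from mmap, at a different address from the one they were compiled to run at, so the PC has a different value there. For them to behave correctly, each PC-relative instruction must be rewritten into an equivalent PC-independent sequence.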
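To make the PC dependence concrete, here is a small sketch that decodes the target of a 16-bit conditional branch. The function name thumbCondBranchTarget is my own; the arithmetic follows the decoding rule quoted above, with PC reading as the instruction address plus 4 in Thumb state.

```cpp
#include <cstdint>

// Computes the branch target of a 16-bit Thumb conditional branch
// (encoding 1101 cond imm8) located at instrAddr.
inline std::uint32_t thumbCondBranchTarget(std::uint32_t instrAddr,
                                           std::uint16_t insn) {
    const std::int32_t imm8 = insn & 0xFF;
    // SignExtend(imm8:'0', 32): sign-extend the 8-bit field, then double it.
    const std::int32_t offset = ((imm8 ^ 0x80) - 0x80) * 2;
    // In Thumb state the PC reads as the instruction address plus 4.
    return instrAddr + 4 + static_cast<std::uint32_t>(offset);
}
```

The halfword 0xD001 (beq with imm8 = 1) branches to 0x1006 when it sits at 0x1000, but the very same halfword copied to 0x8000 would branch to 0x8006. This is why the relocated copy replaces such an instruction with a sequence that loads the absolute target computed from the original address.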
Inline hook schematic, from gslab.
  
bool FAHook::ThumbInstruction::createCallOriginalStub(FAHook::HookInfo *info) {
    uint16_t *area(reinterpret_cast<uint16_t *>(getOriginalAddr(info)));    // start address
    uint16_t *trail(reinterpret_cast<uint16_t *>(
                            reinterpret_cast<uintptr_t >(area) + info->getJumpStubLen())); // end address of the region to patch
    if(T$pcrel$ldrw(area[0]) &&  // first instruction
        area[1] == 0xF000   // check whether the first instruction is already ldr.w pc, [pc]
            ) {
        uint32_t *arm(reinterpret_cast<uint32_t *>(area));
        info->setCallOriginalIns(reinterpret_cast<uint8_t *>(arm[1]));
        return true;
    }
    size_t required((trail - area) * sizeof(uint16_t)); // required == number of bytes that will be patched
    size_t used(0);
    while (used < required)
        used += MSGetInstructionWidthThumb(reinterpret_cast<uint8_t *>(area) + used);
    used = (used + sizeof(uint16_t) - 1) / sizeof(uint16_t) * sizeof(uint16_t);
    size_t blank((used - required) / sizeof(uint16_t));
    uint16_t backup[used / sizeof(uint16_t)];
    memcpy(backup, area, used);
    size_t length(used);
    for (unsigned offset(0); offset != used / sizeof(uint16_t); ++offset)
        if (T$pcrel$ldr(backup[offset]))
            length += 3 * sizeof(uint16_t);
        else if (T$pcrel$b(backup[offset]))
            length += 6 * sizeof(uint16_t);
        else if (T2$pcrel$b(backup + offset)) {
            length += 5 * sizeof(uint16_t);
            ++offset;
        } else if (T$pcrel$bl(backup + offset)) {
            length += 5 * sizeof(uint16_t);
            ++offset;
        } else if (T$pcrel$cbz(backup[offset])) {
            length += 16 * sizeof(uint16_t);
        } else if (T$pcrel$ldrw(backup[offset])) {
            length += 4 * sizeof(uint16_t);
            ++offset;
        } else if (T$pcrel$add(backup[offset]))
            length += 6 * sizeof(uint16_t);
        else if (T$32bit$i(backup[offset]))
            ++offset;
    // The sizing loop above has ended; pad and length are computed once.
    unsigned pad((length & 0x2) == 0 ? 0 : 1);
    length += (pad + 2) * sizeof(uint16_t) + 2 * sizeof(uint32_t);
    uint16_t *buffer = (uint16_t *) MemHelper::createExecMemory(length);
    if(buffer == nullptr) {
        return false;
    }
    size_t start(pad), end(length / sizeof(uint16_t));
    uint32_t *trailer(reinterpret_cast<uint32_t *>(buffer + end));
    for (unsigned offset(0); offset != used / sizeof(uint16_t); ++offset) {
        if (T$pcrel$ldr(backup[offset])) {
            union {
                uint16_t value;
                struct {
                    uint16_t immediate : 8;
                    uint16_t rd : 3;
                    uint16_t : 5;
                };
            } bits = {backup[offset+0]};
            buffer[start+0] = T$ldr_rd_$pc_im_4$(bits.rd, T$Label(start+0, end-2) / 4);
            buffer[start+1] = T$ldr_rd_$rn_im_4$(bits.rd, bits.rd, 0);
            // XXX: this code "works", but is "wrong": the mechanism is more complex than this
            *--trailer = ((reinterpret_cast<uint32_t>(area + offset) + 4) & ~0x2) + bits.immediate * 4;
            start += 2;
            end -= 2;
        } else if (T$pcrel$b(backup[offset])) {
            union {
                uint16_t value;
                struct {
                    uint16_t imm8 : 8;
                    uint16_t cond : 4;
                    uint16_t /*1101*/ : 4;
                };
            } bits = {backup[offset+0]};
            intptr_t jump(bits.imm8 << 1);
            jump |= 1;
            jump <<= 23;
            jump >>= 23;
            buffer[start+0] = T$b$_$im(bits.cond, (end-6 - (start+0)) * 2 - 4);
            *--trailer = reinterpret_cast<uint32_t>(area + offset) + 4 + jump;
            *--trailer = A$ldr_rd_$rn_im$(A$pc, A$pc, 4 - 8);
            *--trailer = T$nop << 16 | T$bx(A$pc);
            start += 1;
            end -= 6;
        } else if (T2$pcrel$b(backup + offset)) {
            union {
                uint16_t value;
                struct {
                    uint16_t imm6 : 6;
                    uint16_t cond : 4;
                    uint16_t s : 1;
                    uint16_t : 5;
                };
            } bits = {backup[offset+0]};
            union {
                uint16_t value;
                struct {
                    uint16_t imm11 : 11;
                    uint16_t j2 : 1;
                    uint16_t a : 1;
                    uint16_t j1 : 1;
                    uint16_t : 2;
                };
            } exts = {backup[offset+1]};
            intptr_t jump(1);
            jump |= exts.imm11 << 1;
            jump |= bits.imm6 << 12;
            if (exts.a) {
                jump |= bits.s << 24;
                jump |= (~(bits.s ^ exts.j1) & 0x1) << 23;
                jump |= (~(bits.s ^ exts.j2) & 0x1) << 22;
                jump |= bits.cond << 18;
                jump <<= 7;
                jump >>= 7;
            } else {
                jump |= bits.s << 20;
                jump |= exts.j2 << 19;
                jump |= exts.j1 << 18;
                jump <<= 11;
                jump >>= 11;
            }
            buffer[start+0] = T$b$_$im(exts.a ? A$al : bits.cond, (end-6 - (start+0)) * 2 - 4);
            *--trailer = reinterpret_cast<uint32_t>(area + offset) + 4 + jump;
            *--trailer = A$ldr_rd_$rn_im$(A$pc, A$pc, 4 - 8);
            *--trailer = T$nop << 16 | T$bx(A$pc);
            ++offset;
            start += 1;
            end -= 6;
        } else if (T$pcrel$bl(backup + offset)) {
            union {
                uint16_t value;
                struct {
                    uint16_t immediate : 10;
                    uint16_t s : 1;
                    uint16_t : 5;
                };
            } bits = {backup[offset+0]};
            union {
                uint16_t value;
                struct {
                    uint16_t immediate : 11;
                    uint16_t j2 : 1;
                    uint16_t x : 1;
                    uint16_t j1 : 1;
                    uint16_t : 2;
                };
            } exts = {backup[offset+1]};
            int32_t jump(0);
            jump |= bits.s << 24;
            jump |= (~(bits.s ^ exts.j1) & 0x1) << 23;
            jump |= (~(bits.s ^ exts.j2) & 0x1) << 22;
            jump |= bits.immediate << 12;
            jump |= exts.immediate << 1;
            jump |= exts.x;
            jump <<= 7;
            jump >>= 7;
            buffer[start+0] = T$push_r(1 << A$r7);
            buffer[start+1] = T$ldr_rd_$pc_im_4$(A$r7, ((end-2 - (start+1)) * 2 - 4 + 2) / 4);
            buffer[start+2] = T$mov_rd_rm(A$lr, A$r7);
            buffer[start+3] = T$pop_r(1 << A$r7);
            buffer[start+4] = T$blx(A$lr);
            *--trailer = reinterpret_cast<uint32_t>(area + offset) + 4 + jump;
            ++offset;
            start += 5;
            end -= 2;
        } else if (T$pcrel$cbz(backup[offset])) {
            union {
                uint16_t value;
                struct {
                    uint16_t rn : 3;
                    uint16_t immediate : 5;
                    uint16_t : 1;
                    uint16_t i : 1;
                    uint16_t : 1;
                    uint16_t op : 1;
                    uint16_t : 4;
                };
            } bits = {backup[offset+0]};
            intptr_t jump(1);
            jump |= bits.i << 6;
            jump |= bits.immediate << 1;
            //jump <<= 24;
            //jump >>= 24;
            unsigned rn(bits.rn);
            unsigned rt(rn == A$r7 ? A$r6 : A$r7);
            buffer[start+0] = T$push_r(1 << rt);
            buffer[start+1] = T1$mrs_rd_apsr(rt);
            buffer[start+2] = T2$mrs_rd_apsr(rt);
            buffer[start+3] = T$cbz$_rn_$im(bits.op, rn, (end-10 - (start+3)) * 2 - 4);
            buffer[start+4] = T1$msr_apsr_nzcvqg_rn(rt);
            buffer[start+5] = T2$msr_apsr_nzcvqg_rn(rt);
            buffer[start+6] = T$pop_r(1 << rt);
            *--trailer = reinterpret_cast<uint32_t>(area + offset) + 4 + jump;
            *--trailer = A$ldr_rd_$rn_im$(A$pc, A$pc, 4 - 8);
            *--trailer = T$nop << 16 | T$bx(A$pc);
            *--trailer = T$nop << 16 | T$pop_r(1 << rt);
            *--trailer = T$msr_apsr_nzcvqg_rn(rt);
#if 0
            if ((start & 0x1) == 0)
                buffer[start++] = T$nop;
            buffer[start++] = T$bx(A$pc);
            buffer[start++] = T$nop;
            uint32_t *arm(reinterpret_cast<uint32_t *>(buffer + start));
            arm[0] = A$add(A$lr, A$pc, 1);
            arm[1] = A$ldr_rd_$rn_im$(A$pc, A$pc, (trailer - arm) * sizeof(uint32_t) - 8);
#endif
            start += 7;
            end -= 10;
        } else if (T$pcrel$ldrw(backup[offset])) {
            union {
                uint16_t value;
                struct {
                    uint16_t : 7;
                    uint16_t u : 1;
                    uint16_t : 8;
                };
            } bits = {backup[offset+0]};
            union {
                uint16_t value;
                struct {
                    uint16_t immediate : 12;
                    uint16_t rt : 4;
                };
            } exts = {backup[offset+1]};
            buffer[start+0] = T1$ldr_rt_$rn_im$(exts.rt, A$pc, T$Label(start+0, end-2));
            buffer[start+1] = T2$ldr_rt_$rn_im$(exts.rt, A$pc, T$Label(start+0, end-2));
            buffer[start+2] = T1$ldr_rt_$rn_im$(exts.rt, exts.rt, 0);
            buffer[start+3] = T2$ldr_rt_$rn_im$(exts.rt, exts.rt, 0);
            // XXX: this code "works", but is "wrong": the mechanism is more complex than this
            *--trailer = ((reinterpret_cast<uint32_t>(area + offset) + 4) & ~0x2) + (bits.u == 0 ? -exts.immediate : exts.immediate);
            ++offset;
            start += 4;
            end -= 2;
        } else if (T$pcrel$add(backup[offset])) {
            union {
                uint16_t value;
                struct {
                    uint16_t rd : 3;
                    uint16_t rm : 3;
                    uint16_t h2 : 1;
                    uint16_t h1 : 1;
                    uint16_t : 8;
                };
            } bits = {backup[offset+0]};
            if (bits.h1) {
                return false;
            }
            unsigned rt(bits.rd == A$r7 ? A$r6 : A$r7);
            buffer[start+0] = T$push_r(1 << rt);
            buffer[start+1] = T$mov_rd_rm(rt, (bits.h1 << 3) | bits.rd);
            buffer[start+2] = T$ldr_rd_$pc_im_4$(bits.rd, T$Label(start+2, end-2) / 4);
            buffer[start+3] = T$add_rd_rm((bits.h1 << 3) | bits.rd, rt);
            buffer[start+4] = T$pop_r(1 << rt);
            *--trailer = reinterpret_cast<uint32_t>(area + offset) + 4;
            start += 5;
            end -= 2;
        } else if (T$32bit$i(backup[offset])) {
            buffer[start++] = backup[offset];
            buffer[start++] = backup[++offset];
        } else {
            buffer[start++] = backup[offset];
        }
    }
    buffer[start++] = T$bx(A$pc);
    buffer[start++] = T$nop;
    uint32_t *transfer = reinterpret_cast<uint32_t *>(buffer + start);
    transfer[0] = A$ldr_rd_$rn_im$(A$pc, A$pc, 4 - 8);
    transfer[1] = reinterpret_cast<uint32_t>(area + used / sizeof(uint16_t)) + 1;
    info->setCallOriginalIns(reinterpret_cast<uint8_t *>(buffer + pad) + 1);
    return true;
}
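MSGetInstructionWidthThumb #
It is called as used += MSGetInstructionWidthThumb(reinterpret_cast<uint8_t *>(area) + used);. The parameter is a pointer to the instruction (void *), and the return value is the size of that instruction in bytes (4 or 2).
What T$32bit$i does: (ic & 0xe000) == 0xe000 && (ic & 0x1800) != 0x0000. The first test checks that the top three bits (bit[15], bit[14], bit[13]) are all 1. The second test checks that bit[12] and bit[11] are not both 0, i.e. at least one of them is 1.
In other words, it checks whether the Thumb instruction is a 32-bit Thumb instruction:
an instruction is 32 bits wide when bits[15:11] are 0b11101, 0b11110 or 0b11111.
MSGetInstructionWidthThumb: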
        static size_t MSGetInstructionWidthThumb(void *start) {
            uint16_t *thumb(reinterpret_cast<uint16_t *>(start));   //
            return T$32bit$i(thumb[0]) ? 4 : 2;
        }
T$32bit$i:
        static inline bool T$32bit$i(uint16_t ic) {
            return ((ic & 0xe000) == 0xe000 && (ic & 0x1800) != 0x0000);
        }
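The width test and the loop that uses it in createCallOriginalStub can be tried out on plain halfword arrays. The sketch below is an equivalent reformulation under hypothetical names (isThumb32, thumbWidth, bytesToRelocate): since the three qualifying bit patterns are the three largest 5-bit values from 0b11101 upward, the check reduces to a single comparison.

```cpp
#include <cstddef>
#include <cstdint>

// True when the halfword is the first half of a 32-bit Thumb instruction,
// i.e. bits[15:11] are 0b11101, 0b11110 or 0b11111.
constexpr bool isThumb32(std::uint16_t firstHalfword) {
    return (firstHalfword >> 11) >= 0x1D;
}

// Instruction width in bytes.
constexpr std::size_t thumbWidth(std::uint16_t firstHalfword) {
    return isThumb32(firstHalfword) ? 4 : 2;
}

// Walks whole instructions until at least `required` bytes are covered and
// returns the number of bytes that must be relocated. The result can exceed
// `required` when the boundary falls inside a 32-bit instruction.
inline std::size_t bytesToRelocate(const std::uint16_t *code,
                                   std::size_t required) {
    std::size_t used = 0;
    while (used < required) {
        used += thumbWidth(code[used / sizeof(std::uint16_t)]);
    }
    return used;
}
```

For the halfword sequence 0xB580, 0xAF00, 0x4770, 0xF8DF, 0xF000 and an 8-byte stub, three 16-bit instructions cover only 6 bytes, so the 32-bit instruction starting with 0xF8DF must be moved as well and the function returns 10.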
Analyzing the MemHelper class #
It has four methods:
- static bool isFunctionAddr(void* addr);
- static bool unProtectMemory(void* addr, uint32_t size); removes write protection
- static bool protectMemory(void* addr, uint32_t size); restores write protection
- static void* createExecMemory(uint32_t size); allocates executable memory
 
It has four fields:
- std::vector<void*> alloc_memory_page_;
- void* current_page = nullptr;
- uint32_t page_ptr_ = 0;
- static uint32_t page_size; set to sysconf(_SC_PAGESIZE).
 
Now let's look at createExecMemory(uint32_t size):
- The memory is allocated with mmap.
- alloc_memory_page_ is a vector; each element is a pointer to one page obtained from mmap.
 
createExecMemory:
void *FAHook::MemHelper::createExecMemory(uint32_t size) {
    if(size & 1) {
        size ++;
    }
    if(size > page_size) {
        return nullptr;
    }
    if(gMemHelper.current_page != nullptr && page_size - gMemHelper.page_ptr_ >= size) {
        auto funPtr = (void*)((size_t)gMemHelper.current_page + gMemHelper.page_ptr_);
        gMemHelper.page_ptr_ += size;
        // Align 4
        while(gMemHelper.page_ptr_ & 0x3) {
            gMemHelper.page_ptr_ ++;
        }
        return funPtr;
    }
    // scroll to next page
    auto newPage = mmap(nullptr, page_size, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
    if(newPage != MAP_FAILED) {
        gMemHelper.alloc_memory_page_.push_back(newPage);
        gMemHelper.current_page = newPage;
        gMemHelper.page_ptr_ = 0;
        return createExecMemory(size);
    }
    return nullptr;
}
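The function above is a bump allocator: it hands out consecutive slices of the current page and maps a fresh page when the current one cannot satisfy the request. A compact sketch of the same idea follows. ExecArena and its members are hypothetical names, and two details differ from the listing by my own choice: the size is rounded up to a multiple of 4 before allocation, and mmap is called with fd -1, which is the portable convention for MAP_ANONYMOUS. Like the original, it never unmaps its pages, and it assumes the system permits read-write-execute anonymous mappings.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>
#include <vector>

class ExecArena {
public:
    // Returns a 4-byte aligned block of at least `size` bytes of RWX memory,
    // or nullptr when the request is empty, larger than a page, or mmap fails.
    void *allocate(std::size_t size) {
        const auto pageSize = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
        if (size == 0 || size > pageSize) {
            return nullptr;
        }
        size = (size + 3) & ~static_cast<std::size_t>(3);  // round up to 4
        if (current_ == nullptr || pageSize - offset_ < size) {
            // Current page is missing or too full: map a new one.
            void *page = mmap(nullptr, pageSize,
                              PROT_READ | PROT_WRITE | PROT_EXEC,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (page == MAP_FAILED) {
                return nullptr;
            }
            pages_.push_back(page);
            current_ = static_cast<std::uint8_t *>(page);
            offset_ = 0;
        }
        void *block = current_ + offset_;
        offset_ += size;  // bump the cursor past the block just handed out
        return block;
    }

private:
    std::vector<void *> pages_;        // every page mapped so far
    std::uint8_t *current_ = nullptr;  // page currently being carved up
    std::size_t offset_ = 0;           // first free byte in current_
};
```

Two successive requests of 10 and 6 bytes come back 12 bytes apart in the same page, because the first request is rounded up to keep the second one aligned.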
Analyzing doInHook #
doInHook toggles between two operations:
- FAInHook::instance()->hookAll();
- FAInHook::instance()->unhookAll();
 
bool doInHook() {
    static bool isHooked = false;
    if (isHooked) {
        isHooked = false;
        FAInHook::instance()->unhookAll();
    } else {
        isHooked = true;
        FAInHook::instance()->hookAll();
    }
    return true;
}
hookAll():
void FAInHook::hookAll() {
    for(auto it: hook_map) {
        if(it.second->getHookStatus() == FAHook::REGISTERED) {
            Hook(it.second);
        }
    }
}
This leads to the Hook function:
- Call enableJumpStub(info)
- info->setHookStatus(FAHook::HOOKED)
 
bool FAInHook::Hook(FAHook::HookInfo *info) {
    if(!FAHook::Instruction::enableJumpStub(info)) {
        return false;
    }
    info->setHookStatus(FAHook::HOOKED);
    return true;
}
Analyzing enableJumpStub(info):
bool FAHook::Instruction::enableJumpStub(FAHook::HookInfo *info) {
    auto origAddr = getOriginalAddr(info);
    auto len = info->getJumpStubLen();
    auto stubAddr = info->getJumpStubBack();
    return patchMemory(origAddr, stubAddr, len);
}
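After obtaining origAddr, stubAddr and len, we enter the patching function patchMemory. Based on the earlier analysis of createStub, it overwrites the first 8 bytes of the original function's entry, or 10 bytes when a leading NOP is needed for alignment.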
patchMemory:
bool FAHook::Instruction::patchMemory(void *dest, void *src, uint32_t len) {
    if(dest == nullptr || src == nullptr || len == 0) {
        return false;
    }
    if(!MemHelper::unProtectMemory(dest, len)) {
        return false;
    }
    memcpy(dest, src, len);
    MemHelper::protectMemory(dest, len);
#ifdef __arm__
    cacheflush((Elf_Addr)dest, (Elf_Addr)dest + len, 0);
#endif
    return true;
}
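It first calls unProtectMemory to make the target memory writable (rwx), then uses memcpy to overwrite the instructions, and finally calls protectMemory to set the protection back to r-x. On 32-bit ARM it also calls cacheflush so that the instruction cache sees the newly written code.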
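The article does not show unProtectMemory itself. If it is built on mprotect, which is my assumption here, the address passed to mprotect has to be page-aligned, and a patch that straddles a page boundary needs both pages changed. The sketch below computes that range; PageRange and coveringPages are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

struct PageRange {
    std::uintptr_t start;  // page-aligned start address
    std::size_t length;    // multiple of the page size
};

// Returns the smallest page-aligned range covering [addr, addr + len).
// Assumption: pageSize is a power of two, as on all common platforms.
inline PageRange coveringPages(std::uintptr_t addr, std::size_t len,
                               std::size_t pageSize) {
    const std::uintptr_t mask = static_cast<std::uintptr_t>(pageSize) - 1;
    const std::uintptr_t start = addr & ~mask;              // round down
    const std::uintptr_t end = (addr + len + mask) & ~mask; // round up
    return PageRange{start, static_cast<std::size_t>(end - start)};
}
```

An 8-byte patch at 0x1FFC with 4 KiB pages yields start 0x1000 and length 0x2000, so a call such as mprotect(reinterpret_cast<void *>(r.start), r.length, PROT_READ | PROT_WRITE | PROT_EXEC) would cover both affected pages.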
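Summary #
That completes a brief analysis of the project. To summarize:
- registerHook mainly does two things:
  - createStub: creates the jump stub, which transfers control to our newFunAddr function.
  - createCallOriginalStub: creates the call-back stub, which calls back into our orginalFunAddr. This mainly involves fixing up PC-relative instructions, for the reasons explained above.
- doInHook
  - Essentially patchMemory: it overwrites the first few instructions of orginalFunAddr with the stub instructions. The PC-relative fix-ups prepared during registration are what allow the original function to keep executing afterwards.
- unHook
  - Also patchMemory: it restores the original instructions of the original function.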