Table of contents

Towards Automating Code-Reuse Attacks Using Synthesized Gadget Chains
- abstract
- Introduction
- shortcomings of state-of-the-art approaches
- Design
  - gadgets
  - Logical encoding
  - preconditions and postconditions
  - Formula Generation
  - Algorithm configuration
- Implementation
- Evaluation
  - Setup
  - Finding a chain
  - Real-World Applicability
  - Target-Specific Constraints
  - Chain statistics
- Limitation

Towards Automating Code-Reuse Attacks Using Synthesized Gadget Chains

ESORICS 2021

abstract

In the arms race between binary exploitation techniques and mitigation schemes, code-reuse attacks have been proven indispensable.

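The "arms race" here is the ongoing contest between exploitation techniques and mitigation schemes, in which code-reuse attacks have become indispensable.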

Typically, one of the initial hurdles is that an attacker cannot execute their own code due to countermeasures such as data execution prevention (DEP).

One of the first hurdles is that countermeasures such as DEP stop the attacker from executing their own code.

While this technique is powerful, the task of finding and correctly chaining gadgets remains cumbersome.

Finding gadgets and chaining them together is still tedious.

Although various methods automating this task have been proposed, they either rely on hard-coded heuristics or make specific assumptions about the gadget semantics.

Existing methods still rely on hard-coded heuristics or make assumptions about the gadgets.

This not only drastically limits the search space but also sacrifices their capability to find valid chains unless specific gadgets can be located.

As a result, they often produce no chain or an incorrect chain that crashes the program.

In this paper, we present SGC, the first generic approach to identify gadget chains in an automated manner without imposing restrictions on the gadgets or limiting its applicability to specific exploitation scenarios.

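The paper proposes a generic approach to identifying gadget chains that does not impose major restrictions on the gadgets or the exploitation scenario.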

Instead of using heuristics to find a gadget chain, we offload this task to an SMT solver.

No heuristics are used; the search is handed to an SMT solver instead.

More specifically, we build a logical formula that encodes the CPU and memory state at the time when the attacker can divert execution flow to the gadget chain, as well as the attacker’s desired program state that the gadget chain should construct.

In combination with a logical encoding of the data flow between gadgets, we query an SMT solver whether a valid gadget chain exists.

If successful, the solver provides a proof of existence in the form of a synthesized gadget chain.

This way, we remain fully flexible w.r.t. the gadgets.

w.r.t. is short for "with respect to".

In empirical tests, we find that the solver often uses all types of control-flow transfer instructions and even gadgets with side effects.

The authors find that the solver often uses all types of control-flow transfer instructions, and even gadgets with side effects.

Our evaluation shows that SGC successfully finds working gadget chains for real-world exploitation scenarios within minutes, even when all state-of-the-art approaches fail.

Introduction

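The introduction of DEP made it impossible to execute injected code, because memory is marked as either writable or executable, never both. This pushed attackers to develop new techniques that reuse existing code (for example, ret2libc).

As an additional line of defense, modern operating systems randomize the layout of a program's address space (ASLR).

Even so, a single information leak or a single non-randomized region is still enough for an attacker to mount an attack.

Control-flow integrity (CFI) enforces the property that only legitimate control-flow transfers, those within the benign set the program actually needs, are executed. Although this greatly restricts an attacker's freedom to chain arbitrary code fragments, code-reuse attacks remain feasible in practice.

Early attempts used pattern-matching strategies to identify chains. Later work used symbolic execution to classify gadgets and to detect undesirable side effects, such as writing values to memory.

Even the most advanced approaches to date, however, rely on various heuristics to restrict the large search space.

These heuristics try to find generic chains that work across many targets, but in some cases no such chain exists.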

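SGC encodes three things in a logical formula:
The CPU and memory state before the first gadget executes
The CPU and memory state the attacker wants to reach
The data flow between gadgets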

shortcomings of state-of-the-art approaches

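Existing tools for building gadget chains fall into two categories:

Hard-coded chaining rules: Ropper and ROPgadget belong to this category. They need hard-coded, regular-expression-based rules to connect gadgets, which makes them inflexible.
Symbolic exploration: angrop and ROPium operate on an intermediate representation of the gadgets, which lets them determine side effects symbolically and classify the gadgets. Gadgets are first lifted, then analyzed, and finally chained. The chaining step usually involves a search algorithm, such as depth-first search (ROPium) or breadth-first search (angrop), to find a gadget sequence that meets the attacker's specification. These tools lack support for more complex constraints.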
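To illustrate why text-based rules are brittle, here is a minimal sketch. The rule, the gadget addresses, and the disassembly strings are all made up for illustration; this is not how Ropper or ROPgadget are actually implemented. It only shows that a regular expression keyed to one spelling of a gadget misses a second gadget with the same effect on rdi and rsp.

```python
import re

# Hypothetical hard-coded rule: "load rdi from the stack" must look exactly like this.
POP_RDI_RULE = re.compile(r"^pop rdi ; ret$")

# Two toy gadgets with the same effect on rdi and rsp.
# (The second one additionally changes the flags because of `add`.)
gadgets = {
    0x401000: "pop rdi ; ret",
    0x402000: "mov rdi, qword ptr [rsp] ; add rsp, 8 ; ret",
}

def match_rule(rule, gadgets):
    """Return the addresses whose disassembly text matches the rule."""
    return [addr for addr, text in sorted(gadgets.items()) if rule.match(text)]

matches = match_rule(POP_RDI_RULE, gadgets)
print([hex(addr) for addr in matches])
```

Only the first gadget is found. A semantic approach that reasons about what each gadget does to the state could use either one.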

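The final formula consists of three main parts: the preconditions, the gadget chain, and the postconditions.

The preconditions describe the initial state, which serves as the input to the first gadget of the chain.

The gadget chain part contains the encoding of the individual instructions, the data flow between instructions within a gadget, and the data flow between gadgets.

The postconditions define the state that must hold after the gadget chain has executed.

The formula is passed to the SMT solver, which returns a model, that is, a satisfying assignment, if one exists.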
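The three parts can be made concrete with a toy model. The sketch below is not SGC's encoding and uses no SMT solver: it swaps in brute-force enumeration as a stand-in for the solver query, and the three gadgets, register names, and target values are assumptions chosen for illustration. Registers hold either a concrete value or a symbolic stack word that the attacker is free to choose, which plays the role of the solver's satisfying assignment.

```python
from itertools import product

# A gadget maps (registers, stack words consumed so far) to a new pair.
# A register holds a concrete int or a symbolic stack slot ("stack", i).
def pop(reg):
    def run(regs, used):
        return {**regs, reg: ("stack", used)}, used + 1
    return run

def mov_rsi_rax_clobber(regs, used):
    # Side effect: rax is zeroed after being copied into rsi.
    return {**regs, "rsi": regs["rax"], "rax": 0}, used

GADGETS = {
    "pop rdi ; ret": pop("rdi"),
    "pop rax ; ret": pop("rax"),
    "mov rsi, rax ; xor eax, eax ; ret": mov_rsi_rax_clobber,
}

PRE = {"rax": 0, "rdi": 0, "rsi": 0}               # preconditions
POST = {"rax": 10, "rdi": 0x1000, "rsi": 0x2000}   # postconditions

def solve(chain):
    """Return the data words to place on the stack so that `chain`
    reaches POST from PRE, or None if the chain cannot reach it.
    (Gadget addresses, which a real payload interleaves, are omitted.)"""
    regs, used = dict(PRE), 0
    for name in chain:
        regs, used = GADGETS[name](regs, used)
    stack = {}
    for reg, want in POST.items():
        have = regs[reg]
        if isinstance(have, tuple):          # symbolic: choose the stack word
            if stack.setdefault(have[1], want) != want:
                return None                  # one slot would need two values
        elif have != want:                   # concrete and wrong
            return None
    return [stack.get(i, 0) for i in range(used)]

def synthesize(max_len=4):
    """Enumerate chains by increasing length; stand-in for the SMT query."""
    for length in range(1, max_len + 1):
        for chain in product(GADGETS, repeat=length):
            stack = solve(chain)
            if stack is not None:
                return list(chain), stack
    return None

chain, stack = synthesize()
print(chain, [hex(word) for word in stack])
```

There is no `pop rsi` gadget here, so any valid chain has to route the rsi value through rax and then repair rax afterwards. This is the kind of side-effect handling that a search over semantics can find and a fixed rule set cannot.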

Algorithm configuration

A few parameters determine the performance of the approach.

Implementation

The implementation comprises 5,000 lines of code.

https://github.com/RUB-SysSec/gadget_synthesis

Gadget extraction relies on Binary Ninja (v2.3.2660).

All further steps build on Miasm: https://github.com/cea-sec/miasm

The resulting formula is then handed to the SMT solver Boolector.

Evaluation

Setup

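The experiments target the x86-64 architecture, with ASLR disabled to simplify analysis.

The evaluation uses a diverse set of programs: the latest versions of chromium, apache2, nginx, and openssl. All of these targets are dynamically linked. The tools are configured to ignore shared libraries, which models a scenario where the attacker knows the base address of the main executable but not the locations of the libraries.

To assess applicability to real-world vulnerabilities, dnsmasq version 2.77 is used.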

Finding a chain

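Attackers today mainly aim to call library functions or system calls: mprotect to change the protection flags of a memory region, mmap to map an RWX page in which shellcode can be placed, or, when making a system call directly, execve with its arguments set up.

mprotect(addr, len, prot): three arguments
mmap(addr, length, prot, flags, fd, offset): six arguments
execve(path, argv, envp): four values to set, namely the three arguments plus the system call number, and the path string must also be placed in memory

On x86-64, these arguments are passed in registers. The results are shown in the figure below.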
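As a sketch of what "passed in registers" means for the attacker's goal state, the snippet below turns a call into a register assignment. The register orders follow the System V AMD64 calling convention and the Linux x86-64 system call convention; the helper names and all addresses are hypothetical and chosen only for illustration.

```python
# System V AMD64 calling convention: integer arguments of a function call.
CALL_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
# Linux x86-64 system calls: rax holds the syscall number, r10 replaces rcx.
SYSCALL_REGS = ["rdi", "rsi", "rdx", "r10", "r8", "r9"]

PROT_READ, PROT_WRITE, PROT_EXEC = 1, 2, 4
SYS_EXECVE = 59  # execve system call number on Linux x86-64

def call_goal(args):
    """Registers that must be set before jumping to a library function."""
    return dict(zip(CALL_REGS, args))

def syscall_goal(number, args):
    """Registers that must be set before executing `syscall`."""
    return {"rax": number, **dict(zip(SYSCALL_REGS, args))}

# Hypothetical addresses: a page to make executable, and where "/bin/sh" lives.
PAGE_ADDR = 0x7FFFF7FF0000
BINSH_ADDR = 0x404000

mprotect_goal = call_goal([PAGE_ADDR, 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC])
execve_goal = syscall_goal(SYS_EXECVE, [BINSH_ADDR, 0, 0])

print(mprotect_goal)
print(execve_goal)
```

The execve goal has four entries, matching the count in the list above: three arguments plus rax for the system call number. Writing the "/bin/sh" string to BINSH_ADDR is a separate memory requirement that this sketch does not model.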

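Runtime analysis:

Disassembly takes a comparatively long time. It relies on a combination of Binary Ninja and Miasm: the whole binary is analyzed first, and the individual functions are then disassembled in Miasm.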

Limitation

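The disassembly approach is naive: it only considers regular instruction offsets. As an improvement, it could also search for unaligned gadgets, because any byte sequence can be interpreted as x86-64 instructions.
The SMT solver takes too long to run.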
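The first limitation can be shown with a few bytes. In the sketch below, `48 c7 c0 5f c3 00 00` is a single intended instruction, `mov rax, 0xc35f`, but decoding from offset 3 yields `5f c3`, which is `pop rdi ; ret`. The set of aligned instruction starts is assumed to be known from a regular disassembly; the scan itself is a plain byte search, not a disassembler.

```python
# One intended instruction: mov rax, 0xc35f
CODE = bytes.fromhex("48c7c05fc30000")
ALIGNED_STARTS = {0}                    # where intended instructions begin (assumed known)
POP_RDI_RET = bytes.fromhex("5fc3")     # pop rdi ; ret

def find_all(code, pattern):
    """Return every offset at which `pattern` occurs in `code`."""
    hits = []
    start = code.find(pattern)
    while start != -1:
        hits.append(start)
        start = code.find(pattern, start + 1)
    return hits

hits = find_all(CODE, POP_RDI_RET)
unaligned = [off for off in hits if off not in ALIGNED_STARTS]
print(hits, unaligned)
```

A gadget finder that only starts decoding at aligned offsets never sees this gadget, which is why searching unaligned offsets would enlarge the gadget pool.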

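DEP cannot defend against code reuse, and ASLR can be bypassed by leaking a base address. CFI prevents control flow from being redirected to arbitrary code locations, but constraints can be added to the SMT formula so that the solver only selects ROP chains that pass the CFI policy.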
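A rough sketch of such a constraint, under strong assumptions: the policy is modeled as a plain set of allowed target addresses (as a coarse-grained CFI scheme might permit), the addresses are invented, and the check is applied as a filter over candidate chains. In an SMT-based approach the same condition would be part of the formula instead of a post-filter.

```python
# Stand-in for the targets a coarse-grained CFI policy allows (assumption).
FUNCTION_ENTRIES = {0x401000, 0x401200, 0x401400}

# Each chain is a list of gadget start addresses (made up for illustration).
candidate_chains = [
    [0x401000, 0x401337],   # second gadget starts in the middle of a function
    [0x401200, 0x401400],   # every gadget starts at an allowed target
]

def passes_cfi(chain, allowed_targets):
    """True if every gadget address is a target the policy permits."""
    return all(addr in allowed_targets for addr in chain)

valid = [chain for chain in candidate_chains if passes_cfi(chain, FUNCTION_ENTRIES)]
print([[hex(addr) for addr in chain] for chain in valid])
```

Real CFI policies are usually more fine-grained than a single address set, so this only conveys the shape of the constraint.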

