Exploiting the Sparseness of Control-Flow and Call Graphs for Efficient and On-Demand Algebraic Program Analysis
Algebraic Program Analysis (APA) is a ubiquitous framework that has been employed as a unifying model for various problems in data-flow analysis, termination analysis, invariant generation, predicate abstraction and a wide variety of other standard static analysis tasks. APA models program summaries as elements of a regular algebra $\alg.$ Suppose that a summary in $A$ is assigned to every transition of the program and that we aim to compute the effect of running the program starting at line $s$ and ending at line $t.$ APA first computes a regular expression $\regexp$ capturing all program paths of interest. In case of intraprocedural analysis, $\regexp$ models all paths from $s$ to $t$, whereas in the interprocedural case it models all interprocedurally-valid paths, i.e.~paths that go back to the right caller function when a callee returns. This regular expression $\regexp$ is then interpreted over the algebra $\alg$ to obtain the desired result. Suppose the program has $n$ lines of code and each evaluation of an operation in the regular algebra takes $O(k)$ time. It is well-known that a single APA query, or a set of queries with the same starting point $s,$ can be answered in $O(n \cdot \alpha(n) \cdot k),$ where $\alpha$ is the inverse Ackermann function.
In this work, we consider an on-demand setting for APA: the program is given in the input and can be preprocessed. The analysis has to then answer a large number of on-line queries, each providing a pair $(s, t)$ of program lines which are the start and end point of the query, respectively. The goal is to avoid the significant cost of running a fresh APA instance for each query. Our main contribution is a series of algorithms that, after a lightweight preprocessing of $O(n \cdot \lg n \cdot k)$, answer each query in $O(k)$ time. In other words, our preprocessing has almost the same asymptotic complexity as a single APA query, except for a sub-logarithmic factor, and then every future query is answered instantly, i.e.~by a constant number of operations in the algebra. We achieve this remarkable speedup by relying on certain structural sparsity properties of control-flow and call graphs (CFGs and CGs). Specifically, we exploit the fact that control-flow graphs of real-world programs have a tree-like structure and bounded treewidth and nesting depth and that their call graphs have small treedepth in comparison to the size of the program. Finally, we provide experimental results demonstrating the effectiveness and efficiency of our approach and showing that it beats the runtime of classical APA by several orders of magnitude.
Thu 26 OctDisplayed time zone: Lisbon change
16:00 - 17:30 | |||
16:00 18mTalk | Historia: Refuting Callback Reachability with Message-History Logics OOPSLA Shawn Meier University of Colorado at Boulder, Sergio Mover École Polytechnique, Gowtham Kaki University of Colorado at Boulder, Bor-Yuh Evan Chang University of Colorado at Boulder; Amazon DOI | ||
16:18 18mTalk | Exploiting the Sparseness of Control-Flow and Call Graphs for Efficient and On-Demand Algebraic Program Analysis OOPSLA Giovanna Kobus Conrado Hong Kong University of Science and Technology, Amir Kafshdar Goharshady Hong Kong University of Science and Technology, Kerim Kochekov Hong Kong University of Science and Technology, Yun Chen Tsai Hong Kong University of Science and Technology, Ahmed Khaled Zaher Hong Kong University of Science and Technology DOI | ||
16:36 18mTalk | A Cocktail Approach to Practical Call Graph Construction OOPSLA Yuandao Cai Hong Kong University of Science and Technology, Charles Zhang Hong Kong University of Science and Technology DOI | ||
16:54 18mTalk | Building Dynamic System Call Sandbox with Partial Order Analysis OOPSLA Quan Zhang Tsinghua University, Chijin Zhou Tsinghua University, Yiwen Xu Tsinghua University, Zijing Yin Tsinghua University, Mingzhe Wang Tsinghua University, Zhuo Su Tsinghua University, Chengnian Sun University of Waterloo, Yu Jiang Tsinghua University, Jiaguang Sun Tsinghua University DOI | ||
17:12 18mTalk | Improving Oracle-Guided Inductive Synthesis by Efficient Question Selection OOPSLA Ruyi Ji Peking University, Chaozhe Kong Peking University, Yingfei Xiong Peking University, Zhenjiang Hu Peking University DOI |