After decades of research, constructing call graphs for modern C-based software remains either imprecise or inefficient when scaling up to the ever-growing complexity. The main culprit is the difficulty of resolving function pointers, as precise pointer analyses are cubic in nature and become exponential when considering calling contexts. This paper takes a practical stance by first conducting a comprehensive empirical study of function pointer manipulations in the wild. By investigating 5355 indirect calls in five popular open-source systems, we conclude that, instead of the past uniform treatments for function pointers, a cocktail approach can be more effective in “squeezing” the number of difficult pointers to a minimum using a potpourri of cheap methods. In particular, we decompose the costs of constructing highly precise call graphs of big code by tailoring several increasingly precise algorithms and synergizing them into a concerted workflow. As a result, many indirect calls can be precisely resolved in an efficient and principled fashion, thereby reducing the final, expensive refinements. This is, in spirit, similar to the well-known cocktail medical therapy.
The results are encouraging — our implemented prototype called Coral can achieve similar precision versus the previous field-, flow-, and context-sensitive Andersen-style call graph construction, yet scale up to millions of lines of code for the first time, to the best of our knowledge. Moreover, by evaluating the produced call graphs through the lens of downstream clients (i.e., use-after-free detection, thin slicing, and directed grey-box fuzzing), the results show that Coral can dramatically improve their effectiveness for better vulnerability hunting, understanding, and reproduction. More excitingly, we found twelve confirmed bugs (six impacted by indirect calls) in popular systems (e.g., MariaDB), spreading across multiple historical versions.
Thu 26 OctDisplayed time zone: Lisbon change
16:00 - 17:30 | |||
16:00 18mTalk | Historia: Refuting Callback Reachability with Message-History Logics OOPSLA Shawn Meier University of Colorado at Boulder, Sergio Mover École Polytechnique, Gowtham Kaki University of Colorado at Boulder, Bor-Yuh Evan Chang University of Colorado at Boulder; Amazon DOI | ||
16:18 18mTalk | Exploiting the Sparseness of Control-Flow and Call Graphs for Efficient and On-Demand Algebraic Program Analysis OOPSLA Giovanna Kobus Conrado Hong Kong University of Science and Technology, Amir Kafshdar Goharshady Hong Kong University of Science and Technology, Kerim Kochekov Hong Kong University of Science and Technology, Yun Chen Tsai Hong Kong University of Science and Technology, Ahmed Khaled Zaher Hong Kong University of Science and Technology DOI | ||
16:36 18mTalk | A Cocktail Approach to Practical Call Graph Construction OOPSLA Yuandao Cai Hong Kong University of Science and Technology, Charles Zhang Hong Kong University of Science and Technology DOI | ||
16:54 18mTalk | Building Dynamic System Call Sandbox with Partial Order Analysis OOPSLA Quan Zhang Tsinghua University, Chijin Zhou Tsinghua University, Yiwen Xu Tsinghua University, Zijing Yin Tsinghua University, Mingzhe Wang Tsinghua University, Zhuo Su Tsinghua University, Chengnian Sun University of Waterloo, Yu Jiang Tsinghua University, Jiaguang Sun Tsinghua University DOI | ||
17:12 18mTalk | Improving Oracle-Guided Inductive Synthesis by Efficient Question Selection OOPSLA Ruyi Ji Peking University, Chaozhe Kong Peking University, Yingfei Xiong Peking University, Zhenjiang Hu Peking University DOI |