Concrete Type Inference for Code Optimization using Machine Learning with SMT Solving
Despite the widespread popularity of dynamically typed languages such as Python, it is well known that they pose significant challenges to code optimization due to the lack of concrete type information. To overcome this limitation, many ahead-of-time optimizing compiler approaches for Python rely on programmers to provide optional type information as a prerequisite for extensive code optimization. Since few programmers provide this information, the large majority of Python applications are executed without the benefit of code optimization, collectively contributing to a significant worldwide waste of compute and energy resources.
In this paper, we introduce a new approach to concrete type inference that is shown to be effective in enabling code optimization for dynamically typed languages, without requiring the programmer to provide any type information. We explore three kinds of type inference algorithms in our approach based on: 1) machine learning models including GPT-4, 2) constraint-based inference based on SMT solving, and 3) a combination of 1) and 2). Our approach then uses the output from type inference to generate multi-version code for a bounded number of concrete type options, while also including a catch-all untyped version for the case when no match is found. The typed versions are then amenable to code optimization. Experimental results show that the combined algorithm in 3) delivers far superior precision and performance compared with the separate algorithms for 1) and 2).
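To make the multi-version code generation idea concrete, the following is a minimal illustrative sketch (not the authors' implementation; all function names are hypothetical). A bounded set of type-specialized versions is guarded by runtime type checks, with a catch-all untyped version when no concrete type option matches; in the paper's setting, the typed versions are the ones handed to an optimizing back-end such as Numba or Intrepydd.

```python
def _dot_float(xs, ys):
    # Specialized version: both inputs inferred as lists of floats.
    # Concrete types make this loop amenable to AOT optimization.
    acc = 0.0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

def _dot_int(xs, ys):
    # Specialized version: both inputs inferred as lists of ints.
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

def _dot_generic(xs, ys):
    # Catch-all untyped version, used when no concrete type matches;
    # it runs with ordinary dynamic dispatch and no specialization.
    return sum(x * y for x, y in zip(xs, ys))

def dot(xs, ys):
    # Runtime guards select among the bounded set of typed versions.
    if xs and ys:
        if isinstance(xs[0], float) and isinstance(ys[0], float):
            return _dot_float(xs, ys)
        if isinstance(xs[0], int) and isinstance(ys[0], int):
            return _dot_int(xs, ys)
    return _dot_generic(xs, ys)
```

The guards here check only the first element for brevity; a real implementation would use whatever runtime checks the inferred type options require before committing to a specialized version.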
The performance improvement due to type inference, in terms of geometric mean speedup across all benchmarks compared to standard Python, when using 3) is $26.4\times$ with Numba as an AOT optimizing back-end and $62.2\times$ with the Intrepydd optimizing compiler as a back-end. These vast performance improvements can have a significant impact on programmers' productivity, while also reducing their applications' use of compute and energy resources.
Wed 25 Oct (time zone: Lisbon)
11:00 - 12:30
11:00 (18m, Talk): Grounded Copilot: How Programmers Interact with Code-Generating Models (OOPSLA). Shraddha Barke, Michael B. James, Nadia Polikarpova (University of California at San Diego)
11:18 (18m, Talk): Turaco: Complexity-Guided Data Sampling for Training Neural Surrogates of Programs (OOPSLA). Alex Renda (Massachusetts Institute of Technology), Yi Ding (Purdue University), Michael Carbin (Massachusetts Institute of Technology)
11:36 (18m, Talk): Concrete Type Inference for Code Optimization using Machine Learning with SMT Solving (OOPSLA). Fangke Ye, Jisheng Zhao, Jun Shirako, Vivek Sarkar (Georgia Institute of Technology)
11:54 (18m, Talk): An Explanation Method for Models of Code (OOPSLA)
12:12 (18m, Talk): Optimization-Aware Compiler-Level Event Profiling (OOPSLA). Matteo Basso (Università della Svizzera italiana (USI), Switzerland), Aleksandar Prokopec (Oracle Labs), Andrea Rosà (USI Lugano), Walter Binder (USI Lugano)