SPLASH 2023
Sun 22 - Fri 27 October 2023 Cascais, Portugal
Wed 25 Oct 2023 14:18 - 14:36 at Room II - program synthesis 2, Chair(s): Chandrakana Nandi

Many data extraction tasks of practical relevance require not only syntactic pattern matching but also semantic reasoning about the content of the underlying text. While regular expressions are very well suited for tasks that require only syntactic pattern matching, they fall short for data extraction tasks that involve both a syntactic and semantic component. To address this issue, we introduce semantic regexes, a generalization of regular expressions that facilitates combined syntactic and semantic reasoning about textual data. We also propose a novel learning algorithm that can synthesize semantic regexes from a small number of positive and negative examples. Our proposed learning algorithm uses a combination of neural sketch generation and compositional type-directed synthesis for fast and effective generalization from a small number of examples. We have implemented these ideas in a new tool called Smore and evaluated it on representative data extraction tasks involving several textual datasets. Our evaluation shows that semantic regexes can better support complex data extraction tasks than standard regular expressions and that our learning algorithm significantly outperforms existing tools, including state-of-the-art neural networks and program synthesis tools.
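To make the idea of combined syntactic and semantic matching concrete, here is a minimal sketch in Python. It is not Smore's implementation or syntax. It assumes a hypothetical `is_city` predicate, backed by a small hard-coded set, as a stand-in for whatever semantic oracle a real system would use. The names `extract`, `PATTERN`, and `KNOWN_CITIES` are also illustrative.

```python
import re
from typing import Callable, List

# Hypothetical stand-in for a semantic oracle: a fixed set of city names.
KNOWN_CITIES = {"lisbon", "austin", "cascais"}


def is_city(text: str) -> bool:
    """Semantic check (assumed): is this token the name of a known city?"""
    return text.strip().lower() in KNOWN_CITIES


def extract(pattern: str, semantic_check: Callable[[str], bool], line: str) -> List[str]:
    """Return group-1 captures that match the pattern AND pass the semantic check."""
    return [
        m.group(1)
        for m in re.finditer(pattern, line)
        if semantic_check(m.group(1))
    ]


# Syntactic part: a capitalized word followed by a comma and a four-digit year.
PATTERN = r"\b([A-Z][a-z]+), \d{4}\b"

text = "Held in Cascais, 2023 and Austin, 2022; see also Report, 2021."
result = extract(PATTERN, is_city, text)
print(result)  # ['Cascais', 'Austin']
```

A plain regular expression would also return "Report", since it has the same shape as the two city names. Only the semantic check separates them, which is the gap the abstract describes.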

Wed 25 Oct

Displayed time zone: Lisbon

14:00 - 15:30
program synthesis 2 (OOPSLA) at Room II
Chair(s): Chandrakana Nandi (Certora)
14:00
18m
Talk
Mobius: Synthesizing Relational Queries with Recursive and Invented Predicates
OOPSLA
Aalok Thakkar (University of Pennsylvania), Nathaniel Sands (University of Southern California), Georgios Petrou (University of Southern California), Rajeev Alur (University of Pennsylvania), Mayur Naik (University of Pennsylvania), Mukund Raghothaman (University of Southern California)
14:18
18m
Talk
Data Extraction via Semantic Regular Expression Synthesis
OOPSLA
Jocelyn (Qiaochu) Chen (University of Texas at Austin), Arko Banerjee (University of Texas at Austin), Çağatay Demiralp (Massachusetts Institute of Technology), Greg Durrett (University of Texas at Austin), Işıl Dillig (University of Texas at Austin)
14:36
18m
Talk
Synthesizing Efficient Memoization Algorithms
OOPSLA
Yican Sun (Peking University), Xuanyu Peng (Peking University), Yingfei Xiong (Peking University)
14:54
18m
Talk
Algebro-geometric Algorithms for Template-Based Synthesis of Polynomial Programs (Distinguished Paper)
OOPSLA
Amir Kafshdar Goharshady (Hong Kong University of Science and Technology), S. Hitarth (Hong Kong University of Science and Technology), Fatemeh Mohammadi (KU Leuven), Harshit Jitendra Motwani (Ghent University)
15:12
18m
Talk
Modular Component-Based Quantum Circuit Synthesis
OOPSLA
Chan Gu Kang (Korea University), Hakjoo Oh (Korea University)