Programming-by-Demonstration for Long-Horizon Robot Tasks (POPL 2024 - POPL Research Papers)

Sun 14 - Sat 20 January 2024 London, United Kingdom

Who

Noah Patton, Kia Rahmani, Meghana Missula, Joydeep Biswas, Işıl Dillig

Track

POPL 2024

Time Zone

The program is currently displayed in (GMT) London.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Fri 19 Jan 2024 16:10 - 16:30 at Kelvin Lecture, in the session Machine and Automata Learning. Chair(s): Steven Holtzen

Abstract

The goal of programmatic Learning from Demonstration (LfD) is to learn a policy in a programming language that can be used to control a robot’s behavior from a set of user demonstrations. This paper presents a new programmatic LfD algorithm that targets long-horizon robot tasks that require synthesizing programs with complex control flow structures, including nested loops with multiple conditionals. Our proposed method first learns a program sketch that captures the target program’s control flow and then completes this sketch using an LLM-guided search procedure that incorporates a novel technique for proving unrealizability of programming-by-demonstration problems. We have implemented our approach in a new tool called PROLEX and present the results of a comprehensive experimental evaluation on 120 benchmarks involving complex tasks and environments. We show that, given a 120-second time limit, PROLEX can find a program consistent with the demonstrations in 80% of the cases. Furthermore, for 81% of the tasks for which a solution is returned, PROLEX is able to find the ground truth program with just one demonstration. In comparison, CVC5, a syntax-guided synthesis tool, is only able to solve 18% of the cases even when given the ground truth program sketch, and an LLM-based approach, GPT-Synth, is unable to solve any of the tasks due to the environment complexity.
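
As a rough illustration of the "sketch, then complete" idea described in the abstract, the toy Python example below fills the holes of a fixed sketch (a nested loop with one conditional) so that the resulting program reproduces a single demonstration. It is not PROLEX and does not reflect the paper's algorithm: the environment, the sketch, the candidate lists, and the `prior` function (a simple word-overlap score standing in for LLM guidance) are all hypothetical, and the unrealizability-proving step is not modeled at all.

```python
from itertools import product

# Hypothetical toy environment: rooms mapped to the items found in each room.
ENV = {
    "kitchen": ["mug", "plate"],
    "office": ["mug", "book"],
}

# A single user demonstration: the sequence of actions the user performed.
DEMO = [("goto", "kitchen"), ("grab", "mug"), ("goto", "office"), ("grab", "mug")]

# Hypothetical candidate fillings for the two holes in the sketch.
ITEM_KINDS = ["book", "mug", "plate"]
ACTIONS = ["grab", "open"]


def run_sketch(env, item_kind, action):
    """Run the sketch with its two holes filled and return the action trace.

    Sketch:  for room in rooms: goto(room)
                 for item in room: if item == ??1: ??2(item)
    """
    trace = []
    for room, items in env.items():
        trace.append(("goto", room))
        for item in items:
            if item == item_kind:
                trace.append((action, item))
    return trace


def prior(item_kind, action, demo):
    """Stand-in for learned guidance: count fillings that appear in the demo."""
    seen = {word for step in demo for word in step}
    return (item_kind in seen) + (action in seen)


def complete_sketch(env, demo):
    """Try fillings in descending prior order; return the first consistent one."""
    candidates = sorted(
        product(ITEM_KINDS, ACTIONS), key=lambda c: -prior(c[0], c[1], demo)
    )
    for tried, (item_kind, action) in enumerate(candidates, start=1):
        if run_sketch(env, item_kind, action) == demo:
            return (item_kind, action), tried
    # No filling of this sketch matches the demonstration.
    return None, len(candidates)


solution, attempts = complete_sketch(ENV, DEMO)
print(solution, attempts)
```

In this toy setting the ranking only changes the order in which fillings are tried; consistency with the demonstration is still checked by executing each candidate. When no filling matches, the function simply exhausts the candidate list, which is the kind of wasted search that a proof of unrealizability would aim to cut short.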

Noah Patton

The University of Texas at Austin

Kia Rahmani

The University of Texas at Austin

Meghana Missula

The University of Texas at Austin

Joydeep Biswas

The University of Texas at Austin

Işıl Dillig

University of Texas at Austin

United States

Session Program

Fri 19 Jan
Displayed time zone: London

15:10 - 16:30	Machine and Automata Learning (POPL) at Kelvin Lecture. Chair(s): Steven Holtzen (Northeastern University)

15:10	20m	Talk	Efficient CHAD. Tom Smeding (Utrecht University), Matthijs Vákár (Utrecht University)
15:30	20m	Talk	ReLU Hull Approximation. Zhongkui Ma (The University of Queensland), Jiaying Li (Microsoft), Guangdong Bai (The University of Queensland)
15:50	20m	Talk	On Learning Polynomial Recursive Programs. Alex Buna-Marginean (University of Oxford), Vincent Cheval (Inria Paris), Mahsa Shirmohammadi (CNRS & IRIF, Paris), James Worrell (University of Oxford)
16:10	20m	Talk	Programming-by-Demonstration for Long-Horizon Robot Tasks. Noah Patton (The University of Texas at Austin), Kia Rahmani (The University of Texas at Austin), Meghana Missula (The University of Texas at Austin), Joydeep Biswas (The University of Texas at Austin), Işıl Dillig (University of Texas at Austin)