On any modern computer architecture today, parallelism comes with a modest cost, born from the creation and management of threads or tasks. Today, programmers battle this cost by manually optimizing/tuning their codes to minimize the cost of parallelism without harming its benefit, performance. This is a difficult battle: programmers must reason about architectural constant factors hidden behind layers of software abstractions, including thread schedulers and memory managers, and their impact on performance, also at scale. In languages that support higher-order functions, the battle hardens: higher order functions can make it difficult, if not impossible, to reason about the cost and benefits of parallelism.
Motivated by these challenges and the numerous advantages of high-level languages, we believe that it has become essential to manage parallelism automatically so as to minimize its cost and maximize its benefit. This is a challenging problem, even when considered on a case-by-case, application-specific basis. But if a solution were possible, then it could combine the many correctness benefits of high-level languages with performance by managing parallelism without the programmer effort needed to ensure performance. This paper proposes techniques for such automatic management of parallelism by combining static (compilation) and run-time techniques. Specifically, we consider the Parallel ML language with (nested) task parallelism and describe a compiler pipeline that translates parallel tasks to ``parallelism potentials'' that may (or not) translate to actual parallelism at run time. We then pair this compilation pipeline with a run-time system that translates the potential to reality with the aim of managing parallelism, both its cost and benefit, automatically. We prove that our techniques have no asymptotic impact on the work and span of a parallel programs and thus preserve their asymptotic properties. We implement the proposed techniques by extending the MPL compiler for Parallel ML and show that it can eliminate the burden of manual optimization while delivering good practical performance.
Thu 18 JanDisplayed time zone: London change
15:30 - 16:50
|Pipelines and Beyond: Graph Types for ADTs with Futures
|Disentanglement with Futures, State, and Interaction
|DisLog: A Separation Logic for Disentanglement
Alexandre Moine Inria, Sam Westrick Carnegie Mellon University, Stephanie Balzer Carnegie Mellon UniversityPre-print
|Automatic Parallelism ManagementDistinguished Paper