Scaling up machine learning without tears (and what do programming languages have to do with it)
The rapid rise in demand for training large neural networks on thousands of accelerators has put partitioning techniques in the spotlight of ML systems. However, implementing the various forms of partitioning and parallelism often requires substantial programming effort, along with careful profiling and analysis, to achieve high hardware utilization. Making it easier for ML engineers to distribute ML workloads, and to predict or simulate their performance on existing and future accelerator systems, is key both for accelerating and productionizing ML research and for informing the design of future hardware systems. In this talk I will outline some of the challenges we faced and lessons we learnt while working in this space, and show how key concepts from programming languages — convenient domain-specific abstractions and types, program transformations at various intermediate representation levels, and abstract interpretation — help address some of these challenges. I will also highlight some idiosyncrasies of the domain of ML programs and accelerators that make certain problems more tractable, but also pose new problems compared to general-purpose languages.
Wed 25 Oct (time zone: Lisbon)
09:30 - 10:30
Keynote
Dimitrios Vytiniotis Google DeepMind