“Boring” Problems in Distributed ML feat. Richard Liaw | Stanford MLSys Seminar Episode 28
Episode 28 of the Stanford MLSys Seminar Series!
Assorted boring problems in distributed machine learning
Speaker: Richard Liaw
Abstract:
Much of the academic work on “distributing/scaling up machine learning” treats it as synonymous with “training larger supervised ML models like GPT-3 with ever more compute resources”. However, training is only a small part of the ML lifecycle. In this talk, I’ll focus on a few other machine learning problems that demand large amounts of compute, which may be a bit more “boring” but are equally (or arguably more!) important. I’ll cover several problems that my collaborators and I have worked on at UC Berkeley and now at Anyscale: abstractions for scalable reinforcement learning and building RLlib (ICML 18, ICLR 20), distributed hyperparameter tuning and dynamic resource allocation for hyperparameter tuning (SoCC 19, EuroSys 21), and Ray as a substrate for the next generation of ML platforms.
Speaker bio:
Richard Liaw is an engineer at Anyscale, where he works on open-source machine learning libraries built on top of Ray, including RLlib and Ray Tune.