Prof. Carlo Fischione (Royal Institute of Technology – KTH, Sweden) will give two classes on Monday 15 and Tuesday 16, from 2 pm to 6 pm in Aula Seminari DISIM (Alan Turing Building), on the following topics:
Title: Fundamentals of Machine Learning over Networks
Lecturer: Prof. Carlo Fischione
Abstract: This course covers the fundamentals of machine learning over networks (MLoNs). It starts from a conventional single-agent setting in which one server solves a convex or nonconvex optimization problem to learn an unknown function. We introduce several approaches to this seemingly simple yet fundamental problem. We then introduce an abstract form of MLoNs, present centralized and distributed solution approaches, and illustrate them by training a deep neural network over a network. The course covers various important aspects of MLoNs, including optimality, computational complexity, communication complexity, security, large-scale learning, online learning, MLoN with partial information, and several application areas. As most of these topics are under active research, the course is not based on a single textbook but builds on a series of key publications in the field.
 Bubeck, Sébastien. “Convex optimization: Algorithms and complexity.” Foundations and Trends in Machine Learning, vol. 8, no. 3-4 (2015): 231-357.
 L. Bottou, F. Curtis, J. Nocedal, “Optimization Methods for Large-Scale Machine Learning”, SIAM Rev., 60(2), 223–311, 2018.
 Boyd, Stephen, et al. “Distributed optimization and statistical learning via the alternating direction method of multipliers.” Foundations and Trends in Machine Learning 3.1 (2011): 1-122.
 I. Goodfellow, Y. Bengio, A. Courville, “Deep Learning”, MIT Press, 2016.
 Jordan, Michael I., Jason D. Lee, and Yun Yang. “Communication-efficient distributed statistical inference,” Journal of the American Statistical Association, 2018.
 Smith, Virginia, et al. “CoCoA: A general framework for communication-efficient distributed optimization.” Journal of Machine Learning Research 18 (2018): 230.
 Alistarh, Dan, et al. “QSGD: Communication-efficient SGD via gradient quantization and encoding.” Advances in Neural Information Processing Systems. 2017.
 Schmidt, Mark, Nicolas Le Roux, and Francis Bach. “Minimizing finite sums with the stochastic average gradient.” Mathematical Programming 162.1-2 (2017): 83-112.
 Boyd, Stephen, et al. “Randomized gossip algorithms,” IEEE Transactions on Information Theory, 2006.
 Scaman, Kevin, et al. “Optimal algorithms for smooth and strongly convex distributed optimization in networks,” ICML, 2017.