Computational and Quantitative Biology PhD


    Riccardo Zecchina - The learning landscape in deep neural networks and its exploitation by learning algorithms

    Seminars
    Date: Friday, 21st January 2022 15:00

    Venue: Webex

    Abstract

    Among the most surprising aspects of deep learning models are their high degree of overparametrization and the non-convexity of their training objective. Both traits are common to all deep learning models and have led to results that are unexpected from the standpoint of classical learning theory and non-convex optimization. Current deep neural networks (DNNs) are composed of millions (or even billions) of connection weights, and the learning process seeks to minimize a non-convex loss function that measures the number of classification errors made by the DNN. The empirical evidence shows that these highly expressive neural network models can fit the training data via simple variants of algorithms originally designed for convex optimization. Moreover, even when the learning processes are run with little control over their statistical complexity (e.g. regularisation, number of parameters, …), these models achieve unparalleled levels of prediction accuracy, contrary to what would be expected from the uniform convergence framework of classical statistical inference.
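
    As a toy illustration of that last point (a minimal sketch, not the speaker's setup: the network sizes, the random labels, and the use of a smooth logistic surrogate in place of the raw error count are all arbitrary choices here), the snippet below fits a two-layer ReLU network with roughly ten thousand weights to just 50 randomly labelled points using plain gradient descent, and typically reaches zero training errors:

        import numpy as np

        rng = np.random.default_rng(0)

        n, d, h = 50, 20, 500                 # 50 examples, 20 inputs, 500 hidden units
        X = rng.standard_normal((n, d))
        y = rng.choice([-1.0, 1.0], size=n)   # random labels: a pure memorization task

        W1 = rng.standard_normal((d, h)) / np.sqrt(d)   # ~10,000 weights >> 50 examples
        w2 = rng.standard_normal(h) / np.sqrt(h)

        def forward(X, W1, w2):
            H = np.maximum(X @ W1, 0.0)       # ReLU hidden activations
            return H, H @ w2                  # activations and real-valued outputs

        lr = 0.1
        for step in range(5000):
            H, out = forward(X, W1, w2)
            # gradient of the mean logistic loss log(1 + exp(-y * out)),
            # a smooth surrogate for the non-differentiable error count
            g = -y / (1.0 + np.exp(y * out)) / n
            W1 -= lr * (X.T @ (np.outer(g, w2) * (H > 0)))
            w2 -= lr * (H.T @ g)

        _, out = forward(X, W1, w2)
        print("training errors:", int(np.sum(np.sign(out) != y)), "/", n)  # typically 0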

    In this talk, we will discuss the geometrical structure of the space of solutions (zero-error configurations) in overparametrized non-convex neural networks trained to classify patterns drawn from some natural distribution. Building on statistical physics techniques for the study of disordered systems, we analyze the geometric structure of the different minima and critical points of the error loss function as the number of parameters increases, and we relate this structure to learning performance. Of particular interest is the role of rare flat minima, which are both accessible to algorithms and endowed with good generalisation properties, in contrast to the dominating minima, which are almost impossible to sample. We will show that the appearance of rare flat minima defines a phase boundary at which algorithms start to find solutions efficiently.
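
    A crude numerical probe of the flat-minima picture (again only an illustrative sketch, reusing X, y, W1 and w2 from the snippet above; it is not the statistical-physics analysis of the talk) is to perturb the trained weights with Gaussian noise of increasing strength and record how often the perturbed network still makes zero training errors. A flat, high-local-entropy minimum keeps this fraction near one out to much larger noise levels than a sharp, isolated one:

        def error_count(X, y, W1, w2):
            out = np.maximum(X @ W1, 0.0) @ w2
            return int(np.sum(np.sign(out) != y))

        def flatness_profile(X, y, W1, w2, sigmas, trials=200, seed=1):
            rng = np.random.default_rng(seed)
            profile = []
            for sigma in sigmas:
                hits = 0
                for _ in range(trials):
                    P1 = W1 + sigma * rng.standard_normal(W1.shape)
                    p2 = w2 + sigma * rng.standard_normal(w2.shape)
                    hits += (error_count(X, y, P1, p2) == 0)
                # fraction of nearby weight settings that are still solutions
                profile.append(hits / trials)
            return profile

        # e.g. flatness_profile(X, y, W1, w2, sigmas=[0.01, 0.03, 0.1, 0.3])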


    Short Bio

    Riccardo Zecchina is a professor of theoretical physics at Bocconi University in Milan, where he holds a chair in Machine Learning.

    He received his PhD in Theoretical Physics from the University of Turin under Tullio Regge. He was subsequently appointed research scientist and head of the Statistical Physics Group at the International Centre for Theoretical Physics in Trieste, and later full professor in Theoretical Physics at the Polytechnic University of Turin. In 2017 he moved to Bocconi University in Milan with a chair in Machine Learning. He has repeatedly been a long-term visiting scientist at Microsoft Research (in Redmond and Cambridge, MA) and at the Laboratory of Theoretical Physics and Statistical Models (LPTMS) of the University of Paris-Sud.

    His research interests lie at the interface of statistical physics, computer science, and machine learning. His current work focuses primarily on the out-of-equilibrium theory of learning algorithms in artificial and biological neural networks.

    He is the recipient of an Advanced Grant from the European Research Council. In 2016, he was awarded (with M. Mézard and G. Parisi) the Lars Onsager Prize in Theoretical Statistical Physics by the American Physical Society, "For groundbreaking work applying spin glass ideas to ensembles of computational problems, yielding both new classes of efficient algorithms and new perspectives on phase transitions in their structure and complexity."

     
