# Probability

## Random matrix theory of high-dimensional optimization - Lecture 13

Optimization theory seeks to characterize the performance of algorithms that find the (or a) minimizer $x\in\mathbb{R}^d$ of an objective function. The dimension $d$ of the parameter space has long been known to be a source of difficulty in designing good algorithms and in analyzing the objective function landscape. With the rise of machine learning in recent years, this has proven to be a manageable problem, but why? One explanation is that this high dimensionality is simultaneously mollified by three essential types of randomness: the data are random, the optimization algorithms are stochastic gradient methods, and the model parameters are randomly initialized (and much of this randomness remains). The resulting loss surfaces defy low-dimensional intuitions, especially in nonconvex settings.

Random matrix theory and spin glass theory provide a toolkit for the analysis of these landscapes when the dimension $d$ becomes large. In this course, we will show

* how random matrices can be used to describe high-dimensional inference

* nonconvex landscape properties

* high-dimensional limits of stochastic gradient methods.

## Random walks and branching random walks: old and new perspectives - Lecture 13

This course will focus on two well-studied models of modern probability: simple symmetric and branching random walks in $\mathbb{Z}^d$. The focus will be on the study of their trace in the regime where it is a small subset of the ambient space.

We will start by reviewing some useful classical (and not so classical) facts about simple random walks. We will introduce the notion of capacity and give many alternative forms for it. Then we will relate it to the covering problem of a domain by a simple random walk. We will review Lawler’s work on non-intersection probabilities and focus on the critical dimension $d=4$. With these tools at hand we will study the tails of the intersection of two infinite random walk ranges in dimensions $d\geq 5$.
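For orientation, one common textbook definition of the capacity of a finite set $A\subset\mathbb{Z}^d$, $d\geq 3$, is given below. This is a sketch under stated assumptions: the symbols $S_n$ (the walk), $\tau_A^+$, $\tau_A$ and $G$ are notation introduced here, and the lectures may use a different normalization.

$$
\mathrm{Cap}(A) = \sum_{x\in A} \mathbb{P}_x\big(\tau_A^+ = \infty\big), \qquad \tau_A^+ = \inf\{n\geq 1 : S_n\in A\}.
$$

One of the alternative forms expresses the same quantity as a rescaled limit of hitting probabilities from far away:

$$
\mathrm{Cap}(A) = \lim_{|x|\to\infty} \frac{\mathbb{P}_x(\tau_A<\infty)}{G(0,x)}, \qquad \tau_A = \inf\{n\geq 0 : S_n\in A\}, \quad G(0,x) = \sum_{n\geq 0}\mathbb{P}_0(S_n = x).
$$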

A branching random walk (or tree-indexed random walk) in $\mathbb{Z}^d$ is a non-Markovian process whose time index is a random tree. The random tree is either a critical Galton-Watson tree or a critical Galton-Watson tree conditioned to survive. Each edge of the tree is assigned an independent simple random walk increment in $\mathbb{Z}^d$, and the location of every vertex is given by summing all the increments along the geodesic from the root to that vertex. When $d\geq 5$, the branching random walk is transient and we will mainly focus on this regime. We will introduce the notion of branching capacity and show how it appears naturally as a suitably rescaled limit of hitting probabilities of sets. We will then use it to study covering problems analogously to the random walk case.
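The construction just described can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function name `simulate_branching_random_walk`, the offspring law (0 or 2 children with probability 1/2 each, which has mean 1 and is therefore critical) and the generation cap are all assumptions made for the example.

```python
import random


def simulate_branching_random_walk(d=5, max_generations=20, seed=0):
    """Sample a random walk in Z^d indexed by a critical Galton-Watson tree.

    Assumption: each vertex has 0 or 2 children with probability 1/2 each.
    Returns a list of (parent_index, position) pairs; the root has parent
    None and sits at the origin.
    """
    rng = random.Random(seed)
    vertices = [(None, (0,) * d)]
    current_generation = [0]
    for _generation in range(max_generations):
        next_generation = []
        for parent in current_generation:
            for _child in range(rng.choice((0, 2))):
                # Independent simple random walk increment on the new edge:
                # a uniformly chosen unit step +/- e_i.
                axis = rng.randrange(d)
                step = rng.choice((-1, 1))
                position = list(vertices[parent][1])
                position[axis] += step
                # The child's location is the parent's location plus the
                # increment, i.e. the sum of increments along the geodesic
                # from the root.
                vertices.append((parent, tuple(position)))
                next_generation.append(len(vertices) - 1)
        if not next_generation:
            break  # the tree died out
        current_generation = next_generation
    return vertices


if __name__ == "__main__":
    tree = simulate_branching_random_walk(d=5, max_generations=30, seed=1)
    trace = {position for _parent, position in tree}
    print(len(tree), "vertices,", len(trace), "distinct sites visited")
```

The set `trace` is the object of interest in the abstract: the collection of sites visited by the tree-indexed walk. Conditioning the tree to survive is not attempted in this sketch.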

## Random walks and branching random walks: old and new perspectives - Lecture 12

This course will focus on two well-studied models of modern probability: simple symmetric and branching random walks in $\mathbb{Z}^d$. The focus will be on the study of their trace in the regime where it is a small subset of the ambient space.

We will start by reviewing some useful classical (and not so classical) facts about simple random walks. We will introduce the notion of capacity and give many alternative forms for it. Then we will relate it to the covering problem of a domain by a simple random walk. We will review Lawler’s work on non-intersection probabilities and focus on the critical dimension $d=4$. With these tools at hand we will study the tails of the intersection of two infinite random walk ranges in dimensions $d\geq 5$.

A branching random walk (or tree-indexed random walk) in $\mathbb{Z}^d$ is a non-Markovian process whose time index is a random tree. The random tree is either a critical Galton-Watson tree or a critical Galton-Watson tree conditioned to survive. Each edge of the tree is assigned an independent simple random walk increment in $\mathbb{Z}^d$, and the location of every vertex is given by summing all the increments along the geodesic from the root to that vertex. When $d\geq 5$, the branching random walk is transient and we will mainly focus on this regime. We will introduce the notion of branching capacity and show how it appears naturally as a suitably rescaled limit of hitting probabilities of sets. We will then use it to study covering problems analogously to the random walk case.

## Random matrix theory of high-dimensional optimization - Lecture 12

Optimization theory seeks to characterize the performance of algorithms that find the (or a) minimizer $x\in\mathbb{R}^d$ of an objective function. The dimension $d$ of the parameter space has long been known to be a source of difficulty in designing good algorithms and in analyzing the objective function landscape. With the rise of machine learning in recent years, this has proven to be a manageable problem, but why? One explanation is that this high dimensionality is simultaneously mollified by three essential types of randomness: the data are random, the optimization algorithms are stochastic gradient methods, and the model parameters are randomly initialized (and much of this randomness remains). The resulting loss surfaces defy low-dimensional intuitions, especially in nonconvex settings.

Random matrix theory and spin glass theory provide a toolkit for the analysis of these landscapes when the dimension $d$ becomes large. In this course, we will show

* how random matrices can be used to describe high-dimensional inference

* nonconvex landscape properties

* high-dimensional limits of stochastic gradient methods.

## Permutations in random geometry - Lecture 3

I will introduce a new universal family of random permutons, called the skew Brownian permutons, describing the scaling limit of various natural models of random constrained permutations. After that, the main goal will be to discuss some connections between random permutations and random geometry. In particular, we will focus on the problem of the longest increasing subsequence in permutations sampled from the skew Brownian permuton and its connection with the study of certain directed metrics on planar maps, which conjecturally should converge in the limit to a notion of "directed Liouville quantum gravity metric".
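To make the longest increasing subsequence statistic concrete, here is a small Python sketch that computes its length with the standard patience-sorting method. It is only an illustration: the function name `longest_increasing_subsequence_length` is hypothetical, and the usage example samples a uniform random permutation as a stand-in, since sampling from the skew Brownian permuton is not attempted here.

```python
import bisect
import random


def longest_increasing_subsequence_length(perm):
    """Length of the longest strictly increasing subsequence of perm."""
    # tails[k] is the smallest possible last value of an increasing
    # subsequence of length k + 1 seen so far.
    tails = []
    for value in perm:
        index = bisect.bisect_left(tails, value)
        if index == len(tails):
            tails.append(value)
        else:
            tails[index] = value
    return len(tails)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10_000
    perm = list(range(1, n + 1))
    rng.shuffle(perm)  # uniform permutation, used only as a stand-in model
    print(longest_increasing_subsequence_length(perm))
```

The method runs in $O(n\log n)$ time, so it can be applied to fairly large sampled permutations.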

## Random matrix theory of high-dimensional optimization - Lecture 11

Optimization theory seeks to characterize the performance of algorithms that find the (or a) minimizer $x\in\mathbb{R}^d$ of an objective function. The dimension $d$ of the parameter space has long been known to be a source of difficulty in designing good algorithms and in analyzing the objective function landscape. With the rise of machine learning in recent years, this has proven to be a manageable problem, but why? One explanation is that this high dimensionality is simultaneously mollified by three essential types of randomness: the data are random, the optimization algorithms are stochastic gradient methods, and the model parameters are randomly initialized (and much of this randomness remains). The resulting loss surfaces defy low-dimensional intuitions, especially in nonconvex settings.

Random matrix theory and spin glass theory provide a toolkit for the analysis of these landscapes when the dimension $d$ becomes large. In this course, we will show

* how random matrices can be used to describe high-dimensional inference

* nonconvex landscape properties

* high-dimensional limits of stochastic gradient methods.

## Random walks and branching random walks: old and new perspectives - Lecture 11

This course will focus on two well-studied models of modern probability: simple symmetric and branching random walks in $\mathbb{Z}^d$. The focus will be on the study of their trace in the regime where it is a small subset of the ambient space.

We will start by reviewing some useful classical (and not so classical) facts about simple random walks. We will introduce the notion of capacity and give many alternative forms for it. Then we will relate it to the covering problem of a domain by a simple random walk. We will review Lawler’s work on non-intersection probabilities and focus on the critical dimension $d=4$. With these tools at hand we will study the tails of the intersection of two infinite random walk ranges in dimensions $d\geq 5$.

A branching random walk (or tree-indexed random walk) in $\mathbb{Z}^d$ is a non-Markovian process whose time index is a random tree. The random tree is either a critical Galton-Watson tree or a critical Galton-Watson tree conditioned to survive. Each edge of the tree is assigned an independent simple random walk increment in $\mathbb{Z}^d$, and the location of every vertex is given by summing all the increments along the geodesic from the root to that vertex. When $d\geq 5$, the branching random walk is transient and we will mainly focus on this regime. We will introduce the notion of branching capacity and show how it appears naturally as a suitably rescaled limit of hitting probabilities of sets. We will then use it to study covering problems analogously to the random walk case.

## Permutations in random geometry - Lecture 2

I will introduce a new universal family of random permutons, called the skew Brownian permutons, describing the scaling limit of various natural models of random constrained permutations. After that, the main goal will be to discuss some connections between random permutations and random geometry. In particular, we will focus on the problem of the longest increasing subsequence in permutations sampled from the skew Brownian permuton and its connection with the study of certain directed metrics on planar maps, which conjecturally should converge in the limit to a notion of "directed Liouville quantum gravity metric".

## Random walks and branching random walks: old and new perspectives - Lecture 10

We will start by reviewing some useful classical (and not so classical) facts about simple random walks. We will introduce the notion of capacity and give many alternative forms for it. Then we will relate it to the covering problem of a domain by a simple random walk. We will review Lawler’s work on non-intersection probabilities and focus on the critical dimension $d=4$. With these tools at hand we will study the tails of the intersection of two infinite random walk ranges in dimensions $d\geq 5$.

## Random matrix theory of high-dimensional optimization - Lecture 10

Optimization theory seeks to characterize the performance of algorithms that find the (or a) minimizer $x\in\mathbb{R}^d$ of an objective function. The dimension $d$ of the parameter space has long been known to be a source of difficulty in designing good algorithms and in analyzing the objective function landscape. With the rise of machine learning in recent years, this has proven to be a manageable problem, but why? One explanation is that this high dimensionality is simultaneously mollified by three essential types of randomness: the data are random, the optimization algorithms are stochastic gradient methods, and the model parameters are randomly initialized (and much of this randomness remains). The resulting loss surfaces defy low-dimensional intuitions, especially in nonconvex settings.

Random matrix theory and spin glass theory provide a toolkit for the analysis of these landscapes when the dimension $d$ becomes large. In this course, we will show

* how random matrices can be used to describe high-dimensional inference

* nonconvex landscape properties

* high-dimensional limits of stochastic gradient methods.