I will discuss recent work with Chantal David, Alexander Dunn, and Joshua Stucky, in which we prove that a positive proportion of Hecke L-functions associated to the cubic residue symbol modulo square-free Eisenstein integers do not vanish at the central point. Our principal new contribution is the asymptotic evaluation of the mollified second moment. No such asymptotic formula was previously known for a cubic family (even over function fields).
Our new approach makes crucial use of Patterson's evaluation of the Fourier coefficients of the cubic metaplectic theta function, Heath-Brown's cubic large sieve, and a Lindelöf-on-average upper bound for the second moment of cubic Dirichlet series that we establish. The significance of our result is that the family considered does not satisfy a perfectly orthogonal large sieve bound. This is quite unlike other families of Dirichlet L-functions for which unconditional results are known (namely the family of quadratic characters and the family of all Dirichlet characters modulo q). Consequently, our proof has fundamentally different features from the corresponding works of Soundararajan and of Iwaniec and Sarnak.
Gradient flows have emerged as a powerful framework for analyzing machine learning and statistical inference algorithms. Motivated by several applications in statistical inference, generative models, generalization, and robustness of learning algorithms, I will present a few new results on the kernel approximation of gradient flows, including a hidden link between the gradient flows of the kernel maximum-mean discrepancy and of relative entropies. These findings not only advance our theoretical understanding but also provide practical tools for enhancing machine learning algorithms. I will showcase inference and sampling algorithms that use a new kernel approximation of the Wasserstein-Fisher-Rao (a.k.a. Hellinger-Kantorovich) gradient flows, which admit a better convergence characterization and show improved computational performance.
Current LLM alignment techniques use pairwise human preferences at a sample level, and as such, they do not imply an alignment on the distributional level. We propose in this paper Alignment via Optimal Transport (AOT), a novel method for distributional preference alignment of LLMs. AOT aligns LLMs on unpaired preference data by making the reward distribution of the positive samples first-order stochastically dominant over the reward distribution of the negative samples. We introduce a convex relaxation of this first-order stochastic dominance and cast it as an optimal transport problem with a smooth and convex cost. Thanks to the one-dimensional nature of the resulting optimal transport problem and the convexity of the cost, it has a closed-form solution via sorting on empirical measures. We fine-tune LLMs with this AOT objective, which enables alignment by penalizing violations of the stochastic dominance of the reward distribution of the positive samples over that of the negative samples. We analyze the sample complexity of AOT by considering the dual of the OT problem and show that it converges at the parametric rate. Empirically, we show on a diverse set of alignment datasets and LLMs that AOT leads to state-of-the-art models in the 7B family of models when evaluated with Open LLM Benchmarks and AlpacaEval. We will also cover how these ideas extend to multivariate stochastic dominance, which is crucial for covering the multi-reward setting in the context of LLMs.
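The sorting-based closed form mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the softplus cost is only one assumed choice of smooth convex penalty, and the two samples are assumed to have equal size. Because the cost is a convex function of the difference of rewards, the one-dimensional optimal coupling matches sorted values (quantile to quantile), so no pairing between positive and negative samples is needed.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)); an assumed smooth convex penalty.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dominance_violation_loss(pos_rewards, neg_rewards):
    """Penalize violations of first-order dominance of positive over negative rewards.

    Hypothetical helper: sorts both (unpaired) samples and compares them
    quantile by quantile, which is the 1D optimal coupling for a convex cost.
    """
    if len(pos_rewards) != len(neg_rewards):
        raise ValueError("this sketch assumes equally sized samples")
    pos = sorted(pos_rewards)
    neg = sorted(neg_rewards)
    # Dominance asks for pos[i] >= neg[i] at every quantile; the cost grows
    # smoothly as neg[i] - pos[i] increases.
    return sum(softplus(n - p) for p, n in zip(pos, neg)) / len(pos)


good = dominance_violation_loss([3.0, 1.0, 2.0], [2.0, 0.0, 1.0])
bad = dominance_violation_loss([2.0, 0.0, 1.0], [3.0, 1.0, 2.0])
```

Here `good` is smaller than `bad`, since in the first call every quantile of the positive rewards lies above the corresponding quantile of the negative rewards. In an actual fine-tuning setup the rewards would depend on the model parameters and the loss would be differentiated through the sort; that part is omitted here.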
What happens to Wasserstein gradient flows if one uses entropic optimal transport instead of classical optimal transport? I will explain why it may be relevant to use Sinkhorn divergences, built on entropic optimal transport, as they allow the regularization parameter to remain fixed. This leads to the study of the Riemannian geometry induced by the Sinkhorn divergences: it retains some features of optimal transport geometry while being “smoother.” The gradient flows of potential energies in this geometry exhibit some intriguing features, which I will detail. This is joint work with Mathis Hardion, Jonas Luckhardt, Gilles Mordant, Bernhard Schmitzer and Luca Tamanini.
In practice, L-functions appear as generating functions encapsulating information about various objects, such as Galois representations, elliptic curves, arithmetic functions, modular forms, Maass forms, etc. Studying L-functions is therefore of utmost importance in number theory at large. Two of their attached data carry critical information: their zeros, which govern the distributional behavior of underlying objects; and their central values, which are related to invariants such as the class number of a field extension. We discuss a connection between low-lying zeros and central values of L-functions, in particular showing that results about the distribution of low-lying zeros (towards the density conjecture of Katz-Sarnak) imply results about the distribution of the central values (towards the normal distribution conjecture of Keating-Snaith). Even though we discuss this principle in general, we instantiate it in the case of modular forms in the level aspect to give a statement and explain the arguments of the proof.
Let $\mathrm{ord}_p(a)$ be the order of $a$ in $(\mathbb{Z}/p\mathbb{Z})^*$. In 1927, Artin conjectured that the set of primes $p$ for which an integer $a \neq -1, \square$ is a primitive root (i.e. $\mathrm{ord}_p(a) = p-1$) has a positive asymptotic density among all primes. In 1967 Hooley proved this conjecture assuming the Generalized Riemann Hypothesis (GRH). In this talk, we will study the behaviour of $\mathrm{ord}_p(a)$ as $p$ varies over primes. In particular, we will show, under GRH, that the set of primes $p$ for which $\mathrm{ord}_p(a)$ is “$k$ prime factors away” from $p-1$ has a positive asymptotic density among all primes, except for particular values of $a$ and $k$. We will interpret being “$k$ prime factors away” in three different ways:
$k = \omega\left(\frac{p-1}{\mathrm{ord}_p(a)}\right), \quad k = \Omega\left(\frac{p-1}{\mathrm{ord}_p(a)}\right), \quad k = \omega(p-1) - \omega(\mathrm{ord}_p(a)).$
We will present conditional results analogous to Hooley’s in all three cases and for all integers $k$. From this, we will derive conditionally the expectation for these quantities.
Furthermore, we will provide partial unconditional answers to some of these questions.
This is joint work with Leo Goldmakher and Greg Martin.
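As a small illustration of the three notions of being “k prime factors away” defined above, the following sketch computes them for a single prime by brute force. The function names are hypothetical, the input p is assumed to be a prime not dividing a, and trial division is used only because it keeps the example short.

```python
def multiplicative_order(a, p):
    # Order of a in (Z/pZ)^*, assuming p is prime and p does not divide a.
    a %= p
    if a == 0:
        raise ValueError("a must not be divisible by p")
    order, power = 1, a
    while power != 1:
        power = power * a % p
        order += 1
    return order


def prime_factors(n):
    # Prime factors of n >= 1 with multiplicity, by trial division.
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def omega(n):
    # Number of distinct prime factors.
    return len(set(prime_factors(n)))


def big_omega(n):
    # Number of prime factors counted with multiplicity.
    return len(prime_factors(n))


def prime_factor_distances(a, p):
    # The three quantities from the abstract, in the order they are listed.
    order = multiplicative_order(a, p)
    index = (p - 1) // order
    return omega(index), big_omega(index), omega(p - 1) - omega(order)


print(prime_factor_distances(2, 11))  # 2 is a primitive root mod 11
print(prime_factor_distances(2, 73))  # 2 has order 9 mod 73
```

For a = 2 and p = 73 the order is 9, so (p-1)/ord = 8 and the three quantities are 1, 3 and 1, which shows that the three interpretations can differ for the same prime.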
Over the years, there have been several open problems involving polynomials that I would love to tell others about. This opportunity to speak at my “home ground” seems the perfect time to do so. More specifically, I will discuss the following:
- A conjecture of Ruzsa for integers and a related problem in a joint work with Bell for polynomials over finite fields.
- A conjectural lower bound for the degree of irreducible factors of certain polynomials from a joint work with DeMarco, Ghioca, Krieger, Tucker, and Ye.
- The irreducibility of certain Gleason polynomials.
Cell movement requires long-range coordination of the cytoskeletal machinery that organizes cell morphogenesis. We have found that reciprocal interactions between biochemical signals and physical forces enable this long-range signal integration. Through a combination of optogenetic inputs, mechanical measurements, and mathematical modeling, we resolve a recent controversy regarding the role of membrane tension propagation in this process and reveal the requirements for long-range transmission of tension in cells. Most cells don't move in isolation; they migrate collectively by sharing information, similar to the flocking of birds, the schooling of fish, and the swarming of ants. We reveal a novel active signal relay system that rapidly and robustly ensures the proper level of immune cell recruitment to sites of injury and infection.
The quadratically regularized optimal transport problem (QOT) has emerged in the literature as a sparse alternative to entropic regularization (EOT). Unlike EOT, whose solutions always have full support—even for small regularization parameters—QOT solutions, or QOT plans, tend to approximate the support of the unregularized transport problem. This raises natural questions: Do the supports decrease monotonically? At what rate does this support reduction occur? How quickly does the QOT cost converge to the optimal transport cost? In this talk, we will review recent theoretical results addressing these questions.