The fundamental theorem of Tur\'{a}n from Extremal Graph Theory determines the maximum number of edges $t_r(n)$ in an $n$-vertex graph that does not contain a clique of size $r+1$. We establish an interesting link between Extremal Graph Theory and Algorithms by providing a simple linear-time compression algorithm that reduces the problem of finding a clique of size $\ell$, where $\ell\leq r+1$, in an $n$-vertex graph $G$ with $m \ge t_r(n)-k$ edges to the problem of finding a maximum clique in a graph on at most $5k$ vertices. This yields an algorithm deciding in time $2.49^{k}\cdot(n + m)$ whether $G$ has a clique of size $\ell$. As a byproduct of the new compression algorithm, we give an algorithm that decides in time $2^{\mathcal{O}(td^2)} \cdot n^2$ whether a graph contains an independent set of size at least $n/(d+1) + t$, where $d$ is the average vertex degree of the graph $G$. A multivariate complexity analysis based on the Exponential Time Hypothesis (ETH) indicates that the asymptotic dependence on several parameters in the running times of our algorithms is tight.
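For concreteness, $t_r(n)$ is attained by the Tur\'{a}n graph $T(n,r)$, the balanced complete $r$-partite graph on $n$ vertices. The minimal sketch below (function name illustrative, not from the paper) computes $t_r(n)$ by subtracting the non-edges inside each part from $\binom{n}{2}$:

```python
def turan_number(n: int, r: int) -> int:
    """Edge count of the Turan graph T(n, r): the balanced complete
    r-partite graph on n vertices, which attains t_r(n)."""
    q, s = divmod(n, r)                    # s parts of size q+1, r-s of size q
    sizes = [q + 1] * s + [q] * (r - s)
    total_pairs = n * (n - 1) // 2
    non_edges = sum(a * (a - 1) // 2 for a in sizes)  # pairs inside one part
    return total_pairs - non_edges

# Example: t_2(5) = 6, the edge count of K_{2,3}.
assert turan_number(5, 2) == 6
```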
Given a continuous definable function $f: S \to \mathbb{R}$ on a definable set $S$, we study sublevel sets of the form $S^f_t = \{x \in S: f(x) \leq t\}$ for all $t \in \mathbb{R}$. Using o-minimal structures, we prove that the Euler characteristic of $S^f_t$ is right continuous with respect to $t$. Furthermore, when $S$ is compact, we show that $S^f_{t+\delta}$ deformation retracts onto $S^f_t$ for all sufficiently small $\delta > 0$. Applying these results, we also characterize the relationship between the Euler characteristic transform (ECT) and the smooth Euler characteristic transform (SECT) in topological data analysis.
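For context, the Euler characteristic transform referenced here is the standard TDA construction: for $S \subset \mathbb{R}^d$ and each direction $v$, one records the Euler characteristics of the sublevel sets of the height function $f_v(x) = \langle x, v\rangle$, which are exactly sets of the form $S^{f_v}_t$ above (a background definition for orientation, not a statement from the paper):
\[
\mathrm{ECT}(S)\colon S^{d-1}\times\mathbb{R}\to\mathbb{Z},\qquad
\mathrm{ECT}(S)(v,t)=\chi\bigl(S^{f_v}_t\bigr)=\chi\bigl(\{x\in S:\langle x,v\rangle\le t\}\bigr),
\]
while the smooth variant replaces each Euler curve $t \mapsto \chi(S^{f_v}_t)$ by a centered integral, so that the transform takes values in a Hilbert space.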
The Variational Autoencoder (VAE) is known to suffer from the phenomenon of \textit{posterior collapse}, where the latent representations generated by the model become independent of the inputs. This leads to degenerate representations of the input, which is attributed to limitations of the VAE's objective function. In this work, we propose a novel solution to this issue, the Contrastive Regularization for Variational Autoencoders (CR-VAE). The core of our approach is to augment the original VAE with a contrastive objective that maximizes the mutual information between the representations of similar visual inputs. This strategy ensures that the information flow between the input and its latent representation is maximized, effectively avoiding posterior collapse. We evaluate our method on a series of visual datasets and demonstrate that CR-VAE outperforms state-of-the-art approaches in preventing posterior collapse.
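To make the contrastive objective concrete, here is a minimal PyTorch-style sketch of an ELBO plus an InfoNCE term between latents of two augmented views of the same inputs; the function names and the weighting scheme are illustrative assumptions, not the CR-VAE implementation:

```python
import torch
import torch.nn.functional as F

def elbo_loss(x, x_recon, mu, logvar):
    """Standard VAE objective: reconstruction + KL(q(z|x) || N(0, I))."""
    recon = F.mse_loss(x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

def info_nce(z1, z2, tau=0.1):
    """InfoNCE lower bound on mutual information: matching rows of z1
    and z2 (latents of two views of the same input) are positives,
    every other row in the batch acts as a negative."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                      # (B, B) similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

# Hypothetical combined objective; `lam` trades off the regularizer:
# loss = elbo_loss(x, x_recon, mu, logvar) + lam * info_nce(z_a, z_b)
```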
We propose and analyze exact and inexact regularized Newton-type methods for finding a global saddle point of \emph{convex-concave} unconstrained min-max optimization problems. Compared to first-order methods, our understanding of second-order methods for min-max optimization is relatively limited, as obtaining global rates of convergence with second-order information is much more involved. In this paper, we examine how second-order information can be used to speed up extra-gradient methods, even under inexactness. Specifically, we show that the proposed algorithms generate iterates that remain within a bounded set and that the averaged iterates converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function. Our algorithms match the theoretically established lower bound in this context, and our analysis provides a simple and intuitive convergence argument for second-order methods without any boundedness requirements. Finally, we present a series of numerical experiments on synthetic and real data that demonstrate the efficiency of the proposed algorithms.
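For readers unfamiliar with the extra-gradient template that the proposed methods build on, here is a minimal first-order sketch on the bilinear toy problem $\min_x \max_y x^\top A y$ (illustrative only; the paper's algorithms replace these gradient steps with regularized second-order updates):

```python
import numpy as np

def extragradient(A, x, y, eta=0.1, iters=2000):
    """First-order extra-gradient for min_x max_y x^T A y: take a
    half-step, re-evaluate the gradient at the midpoint, then step
    from the original iterate; return the averaged iterates."""
    xs, ys = [], []
    for _ in range(iters):
        gx, gy = A @ y, A.T @ x                  # gradients at (x, y)
        xh, yh = x - eta * gx, y + eta * gy      # exploratory half-step
        gx, gy = A @ yh, A.T @ xh                # gradients at midpoint
        x, y = x - eta * gx, y + eta * gy
        xs.append(x); ys.append(y)
    return np.mean(xs, axis=0), np.mean(ys, axis=0)
```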
Consider bivariate observations $(X_1,Y_1), \ldots, (X_n,Y_n) \in \mathbb{R}\times \mathbb{R}$ with unknown conditional distributions $Q_x$ of $Y$, given that $X = x$. The goal is to estimate these distributions under the sole assumption that $Q_x$ is isotonic in $x$ with respect to likelihood ratio order. If the observations are identically distributed, a related goal is to estimate the joint distribution $\mathcal{L}(X,Y)$ under the sole assumption that it is totally positive of order two in a certain sense. An algorithm is developed that estimates the unknown family of distributions $(Q_x)_x$ via empirical likelihood. The benefit of the stronger regularization imposed by likelihood ratio order over the usual stochastic order is evaluated in terms of estimation and predictive performance on simulated as well as real data.
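For orientation, the two order notions involved are standard (stated here for the case of densities, as background rather than as a result of the paper): writing $q_x$ for a density of $Q_x$, isotonicity in likelihood ratio order means that for $x_1 \le x_2$,
\[
y \;\mapsto\; \frac{q_{x_2}(y)}{q_{x_1}(y)} \quad \text{is nondecreasing (where defined)},
\]
while a bivariate density $f$ is totally positive of order two (TP2) if
\[
f(x_1,y_1)\,f(x_2,y_2) \;\ge\; f(x_1,y_2)\,f(x_2,y_1) \qquad \text{for all } x_1\le x_2,\ y_1\le y_2.
\]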
We study the differential privacy (DP) of a core ML problem, linear ordinary least squares (OLS), a.k.a. $\ell_2$-regression. Our key result is that the approximate LS algorithm (ALS) (Sarlos, 2006), a randomized solution to the OLS problem primarily used to improve performance on large datasets, also preserves privacy. ALS achieves a better privacy/utility tradeoff, without modifications or further noising, when compared to alternative private OLS algorithms which modify and/or noise OLS. We give the first {\em tight} DP-analysis for the ALS algorithm and the standard Gaussian mechanism (Dwork et al., 2014) applied to OLS. Our methodology directly improves the privacy analysis of (Blocki et al., 2012) and (Sheffet, 2019) and introduces new tools which may be of independent interest: (1) the exact spectrum of $(\epsilon, \delta)$-DP parameters (``DP spectrum'') for mechanisms whose output is a $d$-dimensional Gaussian, and (2) an improved DP spectrum for random projection (compared to (Blocki et al., 2012) and (Sheffet, 2019)). All methods for private OLS (including ours) assume, often implicitly, restrictions on the input database, such as bounds on leverage and residuals. We prove that such restrictions are necessary. Hence, computing the privacy of mechanisms such as ALS requires estimating these database parameters, which can be infeasible for big datasets. For more complex ML models, DP bounds may not even be tractable. There is a need for black-box DP-estimators (Lu et al., 2022) which empirically estimate data-dependent privacy. We demonstrate the effectiveness of such a DP-estimator by empirically recovering a DP spectrum that matches our theory for OLS. This validates the DP-estimator in a nontrivial ML application, opening the door to its use in more complex nonlinear ML settings where theory is unavailable.
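As background, ALS follows the sketch-and-solve pattern of (Sarlos, 2006). The minimal NumPy sketch below (with an illustrative sketch size $k$) shows the Gaussian-projection step whose randomness is also the source of the privacy guarantee analyzed in the paper:

```python
import numpy as np

def approx_least_squares(X, y, k, seed=0):
    """Sketch-and-solve OLS: compress the n-row problem to k rows with
    a Gaussian random projection, then solve the small problem exactly.
    The Gaussian sketch is the randomness that carries the DP guarantee
    (under leverage/residual restrictions on the database)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    S = rng.normal(size=(k, n)) / np.sqrt(k)   # Gaussian sketch
    beta, *_ = np.linalg.lstsq(S @ X, S @ y, rcond=None)
    return beta
```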
Many physical systems are governed by ordinary or partial differential equations (see, for example, the chapters ``Differential equations'' and ``Systems of Differential Equations''). Typically the solutions of such systems are functions of time or of a single space variable (in the case of ODEs), or they depend on multidimensional space coordinates or on space and time (in the case of PDEs). In some cases, the solutions may depend on several time or space scales. Examples include the damped harmonic oscillator, in the two extreme cases of very small or very large damping; the cardiovascular system, where the thickness of the arteries and veins varies from centimeters to microns; the shallow water equations, which are valid when the water depth is small compared to the typical wavelength of surface waves; and sorption kinetics, in which the range of interaction of a surfactant with an air bubble is much smaller than the size of the bubble itself. In all such cases a detailed simulation of the models which resolves all space or time scales is often inefficient or intractable, and usually unnecessary for a reasonable description of the behavior of the system. In the chapter ``Multiscale modeling with differential equations'' we present examples of systems described by ODEs and PDEs which are intrinsically multiscale, and illustrate how suitable modeling provides an effective way to capture the essential behavior of the solutions of such systems without resolving the small scales.
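As a concrete illustration of well-separated time scales (a standard textbook computation, included here for orientation): the damped oscillator $\ddot{x} + 2\gamma\dot{x} + \omega_0^2 x = 0$ has decay rates $\lambda_\pm = -\gamma \pm \sqrt{\gamma^2 - \omega_0^2}$, and in the strongly overdamped regime $\gamma \gg \omega_0$,
\[
\lambda_- \approx -2\gamma, \qquad \lambda_+ \approx -\frac{\omega_0^2}{2\gamma},
\]
so the solution mixes a fast transient on the scale $1/(2\gamma)$ with a slow relaxation on the scale $2\gamma/\omega_0^2$; resolving the fast scale over the slow horizon is exactly the kind of inefficiency that multiscale modeling avoids.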
We consider the problem of dynamically maintaining the convex hull of a set $S$ of points in the plane under the following special sequence of insertions and deletions (called {\em window-sliding updates}): insert a point to the right of all points of $S$ and delete the leftmost point of $S$. We propose an $O(|S|)$-space data structure that can handle each update in $O(1)$ amortized time, such that standard binary-search-based queries on the convex hull of $S$ can be answered in $O(\log h)$ time, where $h$ is the number of vertices of the convex hull of $S$, and the convex hull itself can be output in $O(h)$ time.
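The $O(1)$-amortized structure of the paper is considerably more involved; as a point of reference, the easy half of a window-sliding update, inserting a new rightmost point, can already be handled in $O(1)$ amortized time with a standard monotone-chain stack (sketched below for the upper hull; supporting leftmost deletions while keeping $O(\log h)$ queries is the paper's contribution):

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o): negative for a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

class UpperHull:
    """Upper hull of a point set whose points arrive in increasing
    x-order; each point is pushed and popped at most once, so a
    right-side insertion costs O(1) amortized."""
    def __init__(self):
        self.h = []                    # hull vertices, left to right
    def insert_right(self, p):
        # Pop vertices that no longer make a clockwise turn with p.
        while len(self.h) >= 2 and cross(self.h[-2], self.h[-1], p) >= 0:
            self.h.pop()
        self.h.append(p)
```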
Let $C$ be a linear code of length $n$ and dimension $k$ over the finite field $\mathbb{F}_{q^m}$. The trace code $\mathrm{Tr}(C)$ is a linear code of the same length $n$ over the subfield $\mathbb{F}_q$. The obvious upper bound for the dimension of the trace code over $\mathbb{F}_q$ is $mk$. If equality holds, then we say that $C$ has maximum trace dimension. The problem of finding the true dimension of trace codes and their duals is relevant for the size of the public key of various code-based cryptographic protocols. Let $C_{\mathbf{a}}$ denote the code obtained from $C$ and a multiplier vector $\mathbf{a}\in (\mathbb{F}_{q^m})^n$. In this paper, we give a lower bound for the probability that a random multiplier vector produces a code $C_{\mathbf{a}}$ of maximum trace dimension. We give an interpretation of the bound for the class of algebraic geometry codes in terms of the degree of the defining divisor. The bound explains the experimental fact that random alternant codes have minimal dimension. Our bound holds whenever $n\geq m(k+h)$, where $h\geq 0$ is the Singleton defect of $C$. For the extremal case $n=m(k+h)$, numerical experiments reveal a close connection between the probability of having maximum trace dimension and the probability that a random matrix has full rank.
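For reference, the standard constructions involved are (with $\mathrm{Tr}$ the field trace, stated here in the usual notation):
\[
\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(x)=\sum_{i=0}^{m-1}x^{q^i},\qquad
\mathrm{Tr}(C)=\bigl\{\bigl(\mathrm{Tr}(c_1),\dots,\mathrm{Tr}(c_n)\bigr):(c_1,\dots,c_n)\in C\bigr\},
\]
and $C_{\mathbf{a}}=\{(a_1c_1,\dots,a_nc_n):(c_1,\dots,c_n)\in C\}$ is the componentwise-scaled code. Since the trace is $\mathbb{F}_q$-linear, $\dim_{\mathbb{F}_q}\mathrm{Tr}(C_{\mathbf{a}})\le m\,\dim_{\mathbb{F}_{q^m}}C=mk$, which is the bound whose attainment the paper quantifies.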
We investigate the decidability of the ${0,\infty}$ fragment of Timed Propositional Temporal Logic (TPTL). We show that the satisfiability checking of TPTL$^{0,\infty}$ is PSPACE-complete. Moreover, even its 1-variable fragment (1-TPTL$^{0,\infty}$) is strictly more expressive than Metric Interval Temporal Logic (MITL), for which satisfiability checking is EXPSPACE-complete. Hence, we have a strictly more expressive logic with computationally easier satisfiability checking. To the best of our knowledge, TPTL$^{0,\infty}$ is the first multi-variable fragment of TPTL for which satisfiability checking is decidable without imposing any bounds/restrictions on the timed words (e.g. bounded variability, bounded time, etc.). Membership in PSPACE is obtained by a reduction to the emptiness checking problem for a new "non-punctual" subclass of Alternating Timed Automata with multiple clocks called Unilateral Very Weak Alternating Timed Automata (VWATA$^{0,\infty}$), which we prove to be in PSPACE. We show this by constructing a simulation-equivalent non-deterministic timed automaton whose number of clocks is polynomial in the size of the given VWATA$^{0,\infty}$.
In multi-turn dialog, utterances do not always take the full form of sentences \cite{Carbonell1983DiscoursePA}, which naturally makes understanding the dialog context more difficult. However, it is essential to fully grasp the dialog context to generate a reasonable response. Hence, in this paper, we propose to improve response generation performance by examining the model's ability to answer a reading comprehension question, where the question is focused on the omitted information in the dialog. Inspired by the multi-task learning scheme, we propose a joint framework that unifies these two tasks, sharing the same encoder to extract the common and task-invariant features, with different decoders to learn task-specific features. To better fuse information from the question and the dialog history in the encoding part, we propose to augment the Transformer architecture with a memory updater, which is designed to selectively store and update the dialog history so as to support downstream tasks. For the experiments, we employ human annotators to write and examine a large-scale dialog reading comprehension dataset. Extensive experiments are conducted on this dataset, and the results show that the proposed model brings substantial improvements over several strong baselines on both tasks. In this way, we demonstrate that reasoning can indeed help better response generation and vice versa. We release our large-scale dataset for further research.
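As an illustration of the ``selectively store and update'' idea, here is a minimal PyTorch sketch of a gated memory updater attached to encoder states; the module and its dimensions are illustrative assumptions, not the authors' exact architecture:

```python
import torch
import torch.nn as nn

class MemoryUpdater(nn.Module):
    """Gated update of a fixed-size memory from the current turn's
    encoder states: m_new = (1 - z) * m_old + z * tanh(candidate)."""
    def __init__(self, d_model, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)
        self.cand = nn.Linear(2 * d_model, d_model)

    def forward(self, memory, hidden):
        # memory: (B, M, d); hidden: (B, T, d) encoder states of the new turn
        read, _ = self.attn(memory, hidden, hidden)  # memory attends to turn
        z = torch.sigmoid(self.gate(torch.cat([memory, read], dim=-1)))
        c = torch.tanh(self.cand(torch.cat([memory, read], dim=-1)))
        return (1 - z) * memory + z * c              # selective overwrite
```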