Given a graph $G$, a set $S$ of vertices in $G$ is a general position set if no triple of vertices from $S$ lies on a common shortest path in $G$. The general position achievement/avoidance game is played on a graph $G$ by players A and B who alternately select vertices of $G$. Selecting a vertex is a legal move if the vertex has not been selected before and the set of selected vertices so far forms a general position set of $G$. The player who picks the last vertex is the winner in the general position achievement game and the loser in the avoidance game. In this paper, we prove that the general position achievement/avoidance games are PSPACE-complete even on graphs with diameter at most 4. To this end, we prove that the \textit{mis\`ere} play of the classical Node Kayles game is also PSPACE-complete. As positive results, we obtain linear-time algorithms to determine the winner of the general position avoidance game in rook's graphs, grids, cylinders, and lexicographic products with complete second factors.
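The defining condition can be verified directly with breadth-first search: $S$ is a general position set iff no vertex of $S$ lies strictly between two others at shortest-path distance. A minimal pure-Python sketch (function and variable names are illustrative; a connected graph given as an adjacency dict is assumed):

```python
from collections import deque
from itertools import combinations

def bfs_dist(adj, s):
    # single-source shortest-path distances by BFS (unweighted graph)
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def is_general_position(adj, S):
    # S is a general position set iff for no triple a, b, c in S the
    # vertex b lies on a shortest a-c path: d(a,c) == d(a,b) + d(b,c)
    d = {s: bfs_dist(adj, s) for s in S}
    for u, v, w in combinations(S, 3):
        for a, b, c in ((u, v, w), (u, w, v), (v, u, w)):
            if d[a][c] == d[a][b] + d[b][c]:
                return False
    return True
```

On the 5-cycle, $\{0,1,2\}$ fails (vertex 1 is between 0 and 2) while $\{0,1,3\}$ is in general position.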
We derive upper bounds for random design linear regression with dependent ($\beta$-mixing) data absent any realizability assumptions. In contrast to the strictly realizable martingale noise regime, no sharp instance-optimal non-asymptotic guarantees are available in the literature. Up to constant factors, our analysis correctly recovers the variance term predicted by the Central Limit Theorem -- the noise level of the problem -- and thus exhibits graceful degradation as misspecification is introduced. Past a burn-in period, our result is sharp in the moderate deviations regime and, in particular, does not inflate the leading-order term by mixing time factors.
We present simulation-free score and flow matching ([SF]$^2$M), a simulation-free objective for inferring stochastic dynamics given unpaired samples drawn from arbitrary source and target distributions. Our method generalizes both the score-matching loss used in the training of diffusion models and the recently proposed flow matching loss used in the training of continuous normalizing flows. [SF]$^2$M interprets continuous-time stochastic generative modeling as a Schr\"odinger bridge (SB) problem. It relies on static entropy-regularized optimal transport, or a minibatch approximation, to efficiently learn the SB without simulating the learned stochastic process. We find that [SF]$^2$M is more efficient and gives more accurate solutions to the SB problem than simulation-based methods from prior work. Finally, we apply [SF]$^2$M to the problem of learning cell dynamics from snapshot data. Notably, [SF]$^2$M is the first method to accurately model cell dynamics in high dimensions and can recover known gene regulatory networks from simulated data.
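The static entropy-regularized optimal transport subproblem mentioned above is classically solved by Sinkhorn iterations. A minimal sketch for uniform marginals (illustrative only, not the paper's implementation; `eps` is the entropic regularization strength):

```python
import math

def sinkhorn(cost, eps=0.1, iters=200):
    # entropy-regularized OT between uniform marginals via Sinkhorn
    # iterations; returns the (approximate) optimal coupling matrix
    n, m = len(cost), len(cost[0])
    K = [[math.exp(-c / eps) for c in row] for row in cost]  # Gibbs kernel
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # alternate marginal-matching updates for the scaling vectors
        u = [(1 / n) / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1 / m) / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
```

For small regularization the coupling concentrates on the cheapest matching, which is the static plan a minibatch variant would sample from.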
This paper analyzes a $\theta$-method combined with a 3-point time filter. The approach adds only one additional line of code to an existing $\theta$-method implementation. We prove the method's $0$-stability, accuracy, and $A$-stability for both constant and variable time steps. Numerical tests are performed to validate the theoretical results.
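On the linear test problem $y'=\lambda y$ the filtered $\theta$-method can be sketched in a few lines; the filter coefficient $\nu$ and all names here are illustrative assumptions, not taken from the paper:

```python
import math

def theta_filter_solve(lam, y0, T, n, theta=1.0, nu=1/3):
    # theta-method for y' = lam*y with a 3-point time filter
    # (nu and the parameter names are illustrative assumptions)
    k = T / n
    ys = [y0]
    for _ in range(n):
        yn = ys[-1]
        # theta-method step, solved exactly for the linear test equation
        y_new = yn * (1 + k * (1 - theta) * lam) / (1 - k * theta * lam)
        # 3-point time filter: the "one additional line of code"
        if len(ys) >= 2:
            y_new -= nu * (y_new - 2 * ys[-1] + ys[-2])
        ys.append(y_new)
    return ys[-1]
```

With $\theta=1$ this filters backward Euler; setting `nu=0` recovers the unfiltered method.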
In the Trivially Perfect Editing problem one is given an undirected graph $G = (V,E)$ and an integer $k$ and seeks to add or delete at most $k$ edges in $G$ to obtain a trivially perfect graph. In a recent work, Dumas, Perez and Todinca [Algorithmica 2023] proved that this problem admits a kernel with $O(k^3)$ vertices. This result heavily relies on the fact that the size of trivially perfect modules can be bounded by $O(k^2)$, as shown by Drange and Pilipczuk [Algorithmica 2018]. To obtain their cubic vertex-kernel, Dumas, Perez and Todinca [Algorithmica 2023] then showed that a more intricate structure, the so-called \emph{comb}, can be reduced to $O(k^2)$ vertices. In this work we show that the bound can be improved to $O(k)$ for both aforementioned structures and thus obtain a kernel with $O(k^2)$ vertices. Our approach relies on the straightforward yet powerful observation that any large enough structure contains unaffected vertices whose neighborhood remains unchanged by an editing of size $k$, implying strong structural properties.
Given a graph $G=(V,E)$ and an integer $k$, the Cluster Editing problem asks whether we can transform $G$ into a union of vertex-disjoint cliques by at most $k$ modifications (edge deletions or insertions). In this paper, we study the following variant of Cluster Editing. We are given a graph $G=(V,E)$, a packing ${\cal H}$ of modification-disjoint induced $P_3$s (no pair of $P_3$s in ${\cal H}$ share an edge or non-edge) and an integer $\ell$. The task is to decide whether $G$ can be transformed into a union of vertex-disjoint cliques by at most $\ell+|{\cal H}|$ modifications (edge deletions or insertions). We show that this problem is NP-hard even when $\ell=0$ (in which case the problem asks to turn $G$ into a disjoint union of cliques by performing exactly one edge deletion or insertion per element of ${\cal H}$) and when each vertex is in at most 23 $P_3$s of the packing. This answers negatively a question of van Bevern, Froese, and Komusiewicz (CSR 2016, ToCS 2018), repeated by C. Komusiewicz at Shonan meeting no. 144 in March 2019. We then initiate the study of finding the largest integer $c$ such that the problem remains tractable when restricting to packings such that each vertex is in at most $c$ packed $P_3$s. Here, packed $P_3$s are those belonging to the packing ${\cal H}$. Van Bevern et al. showed that the case $c = 1$ is fixed-parameter tractable with respect to $\ell$ and we show that the case $c = 2$ is solvable in $|V|^{2\ell + O(1)}$ time.
Simulation-based inference has become popular for amortized Bayesian computation. It is typical to have more than one posterior approximation, coming from different inference algorithms, different architectures, or simply the randomness of initialization and stochastic gradients. We present a general stacking framework, with a provable asymptotic guarantee, that makes use of all available posterior approximations. Our stacking method can combine densities, simulation draws, confidence intervals, and moments, and address the overall precision, calibration, coverage, and bias at the same time. We illustrate our method on several benchmark simulations and a challenging cosmological inference task.
A saddlepoint of an $n \times n$ matrix $A$ is an entry of $A$ that is a maximum in its row and a minimum in its column. Knuth (1968) gave several different algorithms for finding a saddlepoint. The worst-case running time of these algorithms is $\Theta(n^2)$, and Llewellyn, Tovey, and Trick (1988) showed that this cannot be improved, as in the worst case all entries of $A$ may need to be queried. A strict saddlepoint of $A$ is an entry that is the strict maximum in its row and the strict minimum in its column. The strict saddlepoint (if it exists) is unique, and Bienstock, Chung, Fredman, Sch\"affer, Shor, and Suri (1991) showed that it can be found in time $O(n \log{n})$, where a dominant runtime contribution is sorting the diagonal of the matrix. This upper bound has not been improved since 1991. In this paper we show that the strict saddlepoint can be found in $O(n \log^{*}{n})$ time, where $\log^{*}$ denotes the very slowly growing iterated logarithm function, coming close to the lower bound of $\Omega(n)$. In fact, we can also compute, within the same runtime, the value of a non-strict saddlepoint, assuming one exists. Our algorithm is based on a simple recursive approach, a feasibility test inspired by searching in sorted matrices, and a relaxed notion of saddlepoint.
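The object itself is easy to test for directly. A quadratic-time reference check (not the paper's near-linear algorithm) exploits that the strict row maximum, if unique, is the only candidate in its row:

```python
def strict_saddlepoint(A):
    # brute-force search for the (unique) strict saddlepoint:
    # an entry that is the strict max of its row and strict min of its column
    n = len(A)
    for i in range(n):
        row = A[i]
        m = max(row)
        if row.count(m) != 1:
            continue  # a strict row maximum must be unique in its row
        j = row.index(m)
        if all(A[k][j] > m for k in range(n) if k != i):
            return (i, j)
    return None
```

A rock-paper-scissors-style payoff matrix has no saddlepoint, strict or otherwise, so the function returns `None` on it.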
A common approach to evaluating the significance of a collection of $p$-values combines them with a pooling function, in particular when the original data are not available. These pooled $p$-values convert a sample of $p$-values into a single number which behaves like a univariate $p$-value. To clarify discussion of these functions, a telescoping series of alternative hypotheses is introduced that communicates the strength and prevalence of non-null evidence in the $p$-values before general pooling formulae are discussed. A pattern noticed in the UMP pooled $p$-value for a particular alternative motivates the definition and discussion of central and marginal rejection levels at $\alpha$. It is proven that central rejection is always greater than or equal to marginal rejection, motivating a quotient to measure the balance between the two for pooled $p$-values. A combining function based on the $\chi^2_{\kappa}$ quantile transformation is proposed to control this quotient and shown to be robust to misspecified parameters relative to the UMP. Different powers for different parameter settings motivate a map of plausible alternatives based on where this pooled $p$-value is minimized.
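For $\kappa = 2$ the $\chi^2_{\kappa}$ quantile transformation reduces to Fisher's classical method, since $F^{-1}_{\chi^2_2}(1-p) = -2\ln p$. A minimal pure-Python sketch of that special case (the even degrees of freedom admit a closed-form survival function):

```python
import math

def fisher_pool(pvals):
    # chi-squared pooling with kappa = 2 (Fisher's method): the statistic
    # -2 * sum(ln p_i) is chi-squared with 2n degrees of freedom under the null
    stat = -2.0 * sum(math.log(p) for p in pvals)
    m = len(pvals)  # dof = 2m is even, so the survival function is closed-form
    half = stat / 2.0
    term, sf = 1.0, 0.0
    for k in range(m):  # sf = sum_{k<m} (x/2)^k / k!, times exp(-x/2)
        sf += term
        term *= half / (k + 1)
    return math.exp(-half) * sf  # pooled p-value
```

With a single input the pooled value equals the input $p$-value, as any sensible pooling function should guarantee.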
For solving two-dimensional incompressible flow in the vorticity form by the fourth-order compact finite difference scheme and explicit strong stability preserving (SSP) temporal discretizations, we show that the simple bound-preserving limiter in [Li H., Xie S., Zhang X., SIAM J. Numer. Anal., 56 (2018)] can enforce the strict bounds of the vorticity, if the velocity field satisfies a discrete divergence-free constraint. For reducing oscillations, a modified TVB limiter adapted from [Cockburn B., Shu C.-W., SIAM J. Numer. Anal., 31 (1994)] is constructed without affecting the bound-preserving property. This bound-preserving finite difference method can be used for any passive convection equation with a divergence-free velocity field.
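The simple bound-preserving limiter referenced above is of the linear scaling type: point values are rescaled toward the cell average so that they fall inside the prescribed bounds. A generic sketch (assumes the average already lies in $[m, M]$; this is an illustration of the scaling idea, not the paper's exact finite-difference formulation):

```python
def scaling_limiter(vals, m, M):
    # rescale deviations from the mean so all values land in [m, M];
    # the mean itself is preserved, so the limiter is conservative
    avg = sum(vals) / len(vals)
    vmax, vmin = max(vals), min(vals)
    theta = 1.0
    if vmax > M:
        theta = min(theta, (M - avg) / (vmax - avg))
    if vmin < m:
        theta = min(theta, (avg - m) / (avg - vmin))
    return [avg + theta * (v - avg) for v in vals]
```

Because only deviations from the average are shrunk, the average (and hence the conserved quantity) is untouched while the strict bounds are enforced.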
Let ${\mathcal P}$ be a family of probability measures on a measurable space $(S,{\mathcal A}).$ Given a Banach space $E,$ a functional $f:E\to {\mathbb R}$ and a mapping $\theta: {\mathcal P}\to E,$ our goal is to estimate $f(\theta(P))$ based on i.i.d. observations $X_1,\dots, X_n\sim P, P\in {\mathcal P}.$ In particular, if ${\mathcal P}=\{P_{\theta}: \theta\in \Theta\}$ is an identifiable statistical model with parameter set $\Theta\subset E,$ one can consider the mapping $\theta(P)=\theta$ for $P\in {\mathcal P}, P=P_{\theta},$ resulting in a problem of estimation of $f(\theta)$ based on i.i.d. observations $X_1,\dots, X_n\sim P_{\theta}, \theta\in \Theta.$ Given a smooth functional $f$ and estimators $\hat \theta_n(X_1,\dots, X_n), n\geq 1$ of $\theta(P),$ we use these estimators, sample splitting and a Taylor expansion of $f(\theta(P))$ of a proper order to construct estimators $T_f(X_1,\dots, X_n)$ of $f(\theta(P)).$ For these estimators and for a functional $f$ of smoothness $s\geq 1,$ we prove upper bounds on the $L_p$-errors of the estimator $T_f(X_1,\dots, X_n)$ under certain moment assumptions on the base estimators $\hat \theta_n.$ We study the performance of the estimators $T_f(X_1,\dots, X_n)$ in several concrete problems, showing their minimax optimality and asymptotic efficiency. In particular, this includes functional estimation in high-dimensional models with many low-dimensional components, functional estimation in high-dimensional exponential families and estimation of functionals of covariance operators in infinite-dimensional subgaussian models.