In this work we investigate the Weihrauch degree of the problem $\mathsf{DS}$ of finding an infinite descending sequence through a given ill-founded linear order, a degree it shares with the problem $\mathsf{BS}$ of finding a bad sequence through a given non-well quasi-order. We show that $\mathsf{DS}$, despite being hard to solve (it has computable inputs with no hyperarithmetic solution), is rather weak in terms of uniform computational strength. To make the latter claim precise, we introduce the notion of the deterministic part of a Weihrauch degree. We then generalize $\mathsf{DS}$ and $\mathsf{BS}$ by considering $\boldsymbol{\Gamma}$-presented orders, where $\boldsymbol{\Gamma}$ is a Borel pointclass or one of $\boldsymbol{\Delta}^1_1$, $\boldsymbol{\Sigma}^1_1$, $\boldsymbol{\Pi}^1_1$. We compare the resulting $\mathsf{DS}$- and $\mathsf{BS}$-hierarchies of problems with the (effective) Baire hierarchy and show that they do not collapse at any finite level.
Let $a$ and $b$ be two non-zero elements of a finite field $\mathbb{F}_q$, where $q>2$. It has been shown that if $a$ and $b$ have the same multiplicative order in $\mathbb{F}_q$, then the families of $a$-constacyclic and $b$-constacyclic codes over $\mathbb{F}_q$ are monomially equivalent. In this paper, we investigate the monomial equivalence of $a$-constacyclic and $b$-constacyclic codes when $a$ and $b$ have distinct multiplicative orders. We present novel conditions for establishing monomial equivalence between such constacyclic codes, going beyond previous methods for determining monomially equivalent constacyclic and cyclic codes. As an application, we use these results to search for new linear codes more systematically. In particular, we present more than $70$ new record-breaking linear codes over various finite fields, as well as new binary quantum codes.
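For concreteness, the defining closure property of these codes can be sketched as follows. The shift map below is the standard $a$-constacyclic shift; the prime-field restriction and the function name are ours, for illustration only.

```python
def consta_shift(codeword, a, p):
    """One a-constacyclic shift over the prime field GF(p):
    (c_0, ..., c_{n-1}) -> (a * c_{n-1}, c_0, ..., c_{n-2}).

    A linear code over GF(p) is a-constacyclic precisely when it is
    closed under this map; a = 1 recovers the cyclic shift."""
    return [a * codeword[-1] % p] + codeword[:-1]
```

Monomial equivalence then asks for a permutation-and-scaling of coordinates carrying one such closed family onto another.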
Recently, order-preserving pattern (OPP) mining, a new sequential pattern mining method, has been proposed to mine frequent relative orders in a time series. Although frequent relative orders can be used as features to classify a time series, the mined patterns do not reflect the differences between two classes of time series well. To discover such differences effectively, this paper addresses top-k contrast OPP (COPP) mining and proposes the COPP-Miner algorithm, which discovers the top-k contrast patterns as features for time series classification and thereby avoids improper parameter settings. COPP-Miner is composed of three parts: extreme point extraction, which reduces the length of the original time series, and forward and reverse mining, which discover the COPPs. Forward mining comprises three steps: a group pattern fusion strategy to generate candidate patterns, a support calculation method to compute the support of a pattern efficiently, and two pruning strategies to further prune candidate patterns. Reverse mining applies the same process as forward mining but uses a single pruning strategy to prune candidate patterns. Experimental results validate the efficiency of the proposed algorithm and show that top-k COPPs can serve as features that yield better classification performance.
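A minimal sketch of the extreme-point-extraction idea: keep only strict local extrema (plus the two endpoints) of the series. This is one common formulation of the preprocessing step; COPP-Miner's actual rules may differ in detail.

```python
def extract_extreme_points(series):
    """Shorten a time series by keeping only its strict local extrema.

    A point is kept when it is strictly larger or strictly smaller
    than both immediate neighbours; endpoints are always kept.  The
    relative-order structure relevant to OPP mining survives while
    runs of monotone points are discarded.
    """
    if len(series) <= 2:
        return list(series)
    kept = [series[0]]
    for i in range(1, len(series) - 1):
        prev, cur, nxt = series[i - 1], series[i], series[i + 1]
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            kept.append(cur)
    kept.append(series[-1])
    return kept
```

Patterns are then mined over the shortened sequences, which is where the reported efficiency gains originate.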
We show that the known list-decoding algorithms for univariate multiplicity and folded Reed-Solomon (FRS) codes can be made to run in nearly linear time. This yields, to our knowledge, the first family of codes that can be decoded in nearly linear time even as they approach list-decoding capacity. Univariate multiplicity codes and FRS codes are natural variants of Reed-Solomon codes that were discovered and studied for their applications to list-decoding. It is known that for every $\epsilon >0$ and rate $R \in (0,1)$, there exist explicit families of these codes that have rate $R$ and can be list-decoded from a $(1-R-\epsilon)$ fraction of errors with constant list size in polynomial time (Guruswami & Wang (IEEE Trans. Inform. Theory, 2013) and Kopparty, Ron-Zewi, Saraf & Wootters (SIAM J. Comput. 2023)). In this work, we present randomized algorithms that perform these tasks in nearly linear time. Our algorithms have two main components. The first builds upon the lattice-based approach of Alekhnovich (IEEE Trans. Inform. Theory 2005), who designed a nearly-linear-time list-decoding algorithm for Reed-Solomon codes approaching the Johnson radius. As part of the second component, we design nearly-linear-time algorithms for two natural algebraic problems. The first algorithm solves linear differential equations of the form $Q\left(x, f(x), \frac{df}{dx}, \dots,\frac{d^m f}{dx^m}\right) \equiv 0$, where $Q$ has the form $Q(x,y_0,\dots,y_m) = \tilde{Q}(x) + \sum_{i = 0}^m Q_i(x)\cdot y_i$. The second solves functional equations of the form $Q\left(x, f(x), f(\gamma x), \dots,f(\gamma^m x)\right) \equiv 0$, where $\gamma$ is a field element of high multiplicative order. These algorithms can be viewed as generalizations of the classical algorithms of Sieveking (Computing 1972) and Kung (Numer. Math. 1974) for computing the modular inverse of a power series, and may be of independent interest.
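The Sieveking–Kung primitive mentioned above, inverting a power series by Newton iteration with precision doubling, can be sketched as follows. For clarity this sketch works over the rationals with naive truncated multiplication; the nearly-linear-time versions substitute fast polynomial multiplication.

```python
def series_mul(a, b, n):
    """Multiply two power series (coefficient lists) truncated mod x^n."""
    out = [0] * n
    for i, ai in enumerate(a[:n]):
        if ai:
            for j, bj in enumerate(b[: n - i]):
                out[i + j] += ai * bj
    return out

def series_inverse(f, n):
    """Sieveking--Kung iteration: find g with f * g = 1 (mod x^n).

    Requires f[0] invertible; we assume f[0] == 1 for simplicity.
    The working precision doubles each round via the Newton step
    g <- g * (2 - f * g), so the cost is dominated by the final
    multiplication.
    """
    assert f and f[0] == 1
    g = [1]
    k = 1
    while k < n:
        k = min(2 * k, n)
        fg = series_mul(f, g, k)
        two_minus_fg = [2 - fg[0]] + [-c for c in fg[1:]]
        g = series_mul(g, two_minus_fg, k)
    return g
```

For example, inverting $1 - x$ to precision $6$ returns the coefficients of $1 + x + x^2 + \dots + x^5$, as expected from the geometric series.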
We consider \emph{random linear programs} (rlps) as a subclass of \emph{random optimization problems} (rops) and study their typical behavior. Our particular focus is on appropriate linear objectives which connect the rlps to the mean widths of random polyhedra/polytopes. Utilizing the powerful machinery of \emph{random duality theory} (RDT) \cite{StojnicRegRndDlt10}, we obtain, in a large dimensional context, the exact characterizations of the program's objectives. In particular, for any $\alpha=\lim_{n\rightarrow\infty}\frac{m}{n}\in(0,\infty)$, any unit vector $\mathbf{c}\in{\mathbb R}^n$, any fixed $\mathbf{a}\in{\mathbb R}^m$, and $A\in {\mathbb R}^{m\times n}$ with iid standard normal entries, we have \begin{eqnarray*} \lim_{n\rightarrow\infty}{\mathbb P}_{A} \left ( (1-\epsilon) \xi_{opt}(\alpha;\mathbf{a}) \leq \min_{A\mathbf{x}\leq \mathbf{a}}\mathbf{c}^T\mathbf{x} \leq (1+\epsilon) \xi_{opt}(\alpha;\mathbf{a}) \right ) \longrightarrow 1, \end{eqnarray*} where \begin{equation*} \xi_{opt}(\alpha;\mathbf{a}) \triangleq \min_{x>0} \sqrt{x^2- x^2 \lim_{n\rightarrow\infty} \frac{\sum_{i=1}^{m} \left ( \frac{1}{2} \left (\left ( \frac{\mathbf{a}_i}{x}\right )^2 + 1\right ) \mbox{erfc}\left( \frac{\mathbf{a}_i}{x\sqrt{2}}\right ) - \frac{\mathbf{a}_i}{x} \frac{e^{-\frac{\mathbf{a}_i^2}{2x^2}}}{\sqrt{2\pi}} \right ) }{n} }. \end{equation*} For example, for $\mathbf{a}=\mathbf{1}$, one uncovers \begin{equation*} \xi_{opt}(\alpha) = \min_{x>0} \sqrt{x^2- x^2 \alpha \left ( \frac{1}{2} \left ( \frac{1}{x^2} + 1\right ) \mbox{erfc} \left ( \frac{1}{x\sqrt{2}}\right ) - \frac{1}{x} \frac{e^{-\frac{1}{2x^2}}}{\sqrt{2\pi}} \right ) }. \end{equation*} Moreover, $2 \xi_{opt}(\alpha)$ is precisely the concentrating point of the mean width of the polyhedron $\{\mathbf{x}|A\mathbf{x} \leq \mathbf{1}\}$.
We consider the weak convergence of the Euler-Maruyama approximation for Schr\"odinger-F\"ollmer diffusions, which solve Schr\"odinger bridge problems and can be used to sample from given distributions. We show that, under mild regularity conditions, the distribution of the terminal random variable of the time-discretized process converges weakly to the target distribution.
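The scheme in question discretizes an SDE $dX_t = b(t, X_t)\,dt + dW_t$ on $[0,1]$ and reads off the terminal value. The sketch below shows only this generic one-dimensional discretization skeleton; the actual Schrödinger–Föllmer drift involves expectations under the heat semigroup and is not implemented here, so `drift` is a placeholder for any user-supplied function.

```python
import math
import random

def euler_maruyama(drift, x0, t_end=1.0, n_steps=200, rng=None):
    """Euler--Maruyama scheme for dX_t = b(t, X_t) dt + dW_t in 1D.

    Steps of size h = t_end / n_steps:
        X_{k+1} = X_k + b(t_k, X_k) h + sqrt(h) * N(0, 1).
    Returns the terminal value X_{t_end}, whose law approximates the
    diffusion's terminal distribution.
    """
    rng = rng or random.Random(0)
    h = t_end / n_steps
    x, t = x0, 0.0
    for _ in range(n_steps):
        x = x + drift(t, x) * h + math.sqrt(h) * rng.gauss(0.0, 1.0)
        t += h
    return x
```

With zero drift, the terminal value is simply $W_1 \sim N(0,1)$, which gives a quick sanity check of the discretization.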
We propose an efficient algorithm for matching two correlated Erd\H{o}s--R\'enyi graphs with $n$ vertices whose edges are correlated through a latent vertex correspondence. When the edge density $q= n^{- \alpha+o(1)}$ for a constant $\alpha \in [0,1)$, we show that our algorithm has polynomial running time and succeeds in recovering the latent matching as long as the edge correlation is non-vanishing. This is closely related to our previous work on a polynomial-time algorithm that matches two Gaussian Wigner matrices with non-vanishing correlation, and it provides the first polynomial-time random graph matching algorithm (regardless of the regime of $q$) when the edge correlation is below the square root of Otter's constant (which is $\approx 0.338$).
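One common way to realize the correlated model (the paper's exact parameterization may differ) is to subsample a shared parent graph: each parent edge survives independently in each child graph, so the two edge indicator variables are positively correlated through the parent. A hedged sketch:

```python
import random

def correlated_er_pair(n, p_parent, s, seed=0):
    """Sample two correlated Erdos--Renyi graphs on vertices 0..n-1.

    Each potential edge appears in a parent graph G(n, p_parent), and
    each parent edge then survives independently in each child with
    probability s.  Both children have edge density p_parent * s, and
    they share edges through the parent.  In the matching problem the
    second graph would additionally be relabeled by a hidden vertex
    permutation, which is omitted here.
    """
    rng = random.Random(seed)
    edges1, edges2 = set(), set()
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p_parent:
                if rng.random() < s:
                    edges1.add((u, v))
                if rng.random() < s:
                    edges2.add((u, v))
    return edges1, edges2
```

Applying a uniformly random permutation to the vertex labels of the second graph produces the latent matching the algorithm is asked to recover.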
Let $D$ be a digraph. Its acyclic number $\vec{\alpha}(D)$ is the maximum order of an acyclic induced subdigraph, and its dichromatic number $\vec{\chi}(D)$ is the least integer $k$ such that $V(D)$ can be partitioned into $k$ subsets inducing acyclic subdigraphs. We study $\vec{a}(n)$ and $\vec{t}(n)$, the minimum of $\vec\alpha(D)$ and the maximum of $\vec{\chi}(D)$, respectively, over all oriented triangle-free graphs of order $n$. For every $\epsilon>0$ and $n$ large enough, we show $(1/\sqrt{2} - \epsilon) \sqrt{n\log n} \leq \vec{a}(n) \leq \frac{107}{8} \sqrt n \log n$ and $\frac{8}{107} \sqrt n/\log n \leq \vec{t}(n) \leq (\sqrt 2 + \epsilon) \sqrt{n/\log n}$. We also construct an oriented triangle-free graph on 25 vertices with dichromatic number~3, and show that every oriented triangle-free graph of order at most 17 has dichromatic number at most 2.
Iterated conditional expectation (ICE) g-computation is an estimation approach for addressing time-varying confounding for both longitudinal and time-to-event data. Unlike other g-computation implementations, ICE avoids the need to specify models for each time-varying covariate. For variance estimation, previous work has suggested the bootstrap. However, bootstrapping can be computationally intensive and sensitive to the number of resamples used. Here, we express ICE g-computation as a set of stacked estimating equations, so the variance of the ICE g-computation estimator can be consistently estimated using the empirical sandwich variance estimator. The performance of the variance estimator was evaluated empirically in a simulation study. The proposed approach is also demonstrated with an illustrative example on the effect of cigarette smoking on the prevalence of hypertension. In the simulation study, the empirical sandwich variance estimator appropriately estimated the variance. When comparing runtimes between the sandwich variance estimator and the bootstrap for the applied example, the sandwich estimator was substantially faster, even when the bootstraps were run in parallel. The empirical sandwich variance estimator is a viable option for variance estimation with ICE g-computation.
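A schematic of the empirical sandwich variance estimator for a generic stacked estimating-equation setup: with estimating functions $\psi$ solving $\sum_i \psi(O_i, \hat\theta) = 0$, the variance is estimated as $B^{-1} M (B^{-1})^T / n$, where $B$ is the averaged negative Jacobian ("bread") and $M$ the averaged outer product ("meat"). The specific $\psi$ stack for ICE g-computation is the paper's contribution and is not reproduced here; the Jacobian is taken numerically for brevity.

```python
import numpy as np

def sandwich_variance(psi, theta_hat, data, eps=1e-6):
    """Empirical sandwich variance for an M-estimator theta_hat.

    psi(o, theta) returns the stacked estimating functions for one
    observation o.  Computes bread B = -mean d psi / d theta (via
    forward differences) and meat M = mean psi psi^T, then returns
    B^{-1} M (B^{-1})^T / n.
    """
    n, p = len(data), len(theta_hat)
    bread = np.zeros((p, p))
    meat = np.zeros((p, p))
    for o in data:
        g = np.asarray(psi(o, theta_hat), dtype=float)
        meat += np.outer(g, g)
        for j in range(p):  # forward-difference Jacobian, column j
            th = np.array(theta_hat, dtype=float)
            th[j] += eps
            bread[:, j] -= (np.asarray(psi(o, th), dtype=float) - g) / eps
    bread /= n
    meat /= n
    binv = np.linalg.inv(bread)
    return binv @ meat @ binv.T / n
```

As a sanity check, for the sample mean (with $\psi(o,\theta) = o - \theta$) this reduces to the familiar (biased) sample variance divided by $n$.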
Let $M$ be an $n\times n$ matrix of homogeneous linear forms over a field $\Bbbk$. If the ideal $\mathcal{I}_{n-2}(M)$ generated by minors of size $n-1$ is Cohen-Macaulay, then the Gulliksen-Neg{\aa}rd complex is a free resolution of $\mathcal{I}_{n-2}(M)$. It has recently been shown that by taking into account the syzygy modules for $\mathcal{I}_{n-2}(M)$ which can be obtained from this complex, one can derive a refined signature-based Gr\"obner basis algorithm DetGB which avoids reductions to zero when computing a grevlex Gr\"obner basis for $\mathcal{I}_{n-2}(M)$. In this paper, we establish sharp complexity bounds on DetGB. To accomplish this, we prove several results on the sizes of reduced grevlex Gr\"obner bases of reverse lexicographic ideals, thanks to which we obtain two main complexity results which rely on conjectures similar to that of Fr\"oberg. The first one states that, in the zero-dimensional case, the size of the reduced grevlex Gr\"obner basis of $\mathcal{I}_{n-2}(M)$ is bounded from below by $n^{6}$ asymptotically. The second, also in the zero-dimensional case, states that the complexity of DetGB is bounded from above by $n^{2\omega+3}$ asymptotically, where $2\le\omega\le 3$ is any complexity exponent for matrix multiplication over $\Bbbk$.
We prove non-asymptotic error bounds for particle gradient descent (PGD)~(Kuntz et al., 2023), a recently introduced algorithm for maximum likelihood estimation of large latent variable models obtained by discretizing a gradient flow of the free energy. We begin by showing that, for models satisfying a condition generalizing both the log-Sobolev and the Polyak--{\L}ojasiewicz inequalities (LSI and P{\L}I, respectively), the flow converges exponentially fast to the set of minimizers of the free energy. We achieve this by extending a result well-known in the optimal transport literature (that the LSI implies the Talagrand inequality) and its counterpart in the optimization literature (that the P{\L}I implies the so-called quadratic growth condition), and applying it to our new setting. We also generalize the Bakry--\'Emery Theorem and show that the LSI/P{\L}I generalization holds for models with strongly concave log-likelihoods. For such models, we further control PGD's discretization error, obtaining non-asymptotic error bounds. While we are motivated by the study of PGD, we believe that the inequalities and results we extend may be of independent interest.
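In one common presentation of PGD, the parameter follows the particle-averaged gradient of the joint log-density while each particle takes an unadjusted Langevin step in the latent variable. The one-dimensional sketch below is schematic only, not the authors' exact scheme; the toy gradients in the usage example are ours.

```python
import math
import random

def particle_gradient_descent(grad_theta, grad_x, theta0, particles0,
                              step=1e-2, n_iters=100, rng=None):
    """Sketch of particle gradient descent (PGD) in one dimension.

    Per iteration:
      theta <- theta + h * mean_i grad_theta(theta, x_i)   (ascent step)
      x_i   <- x_i + h * grad_x(theta, x_i)
               + sqrt(2 h) * N(0, 1)                       (Langevin step)
    where grad_theta / grad_x stand for the gradients of the joint
    log-density log p(theta, x) in theta and x respectively.
    """
    rng = rng or random.Random(0)
    theta = theta0
    xs = list(particles0)
    for _ in range(n_iters):
        theta += step * sum(grad_theta(theta, x) for x in xs) / len(xs)
        xs = [x + step * grad_x(theta, x)
              + math.sqrt(2 * step) * rng.gauss(0.0, 1.0) for x in xs]
    return theta, xs
```

The discretization error controlled in the paper is exactly the gap between iterating this scheme and following the underlying free-energy gradient flow.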