The paper studies nonstationary high-dimensional vector autoregressions of order $k$, VAR($k$). Additional deterministic terms such as trend or seasonality are allowed. The number of time periods, $T$, and the number of coordinates, $N$, are assumed to be large and of the same order. Under this regime the first-order asymptotics of the Johansen likelihood ratio (LR), Pillai-Bartlett, and Hotelling-Lawley tests for cointegration are derived: the test statistics converge to nonrandom integrals. For a more refined analysis, the paper proposes and analyzes a modification of the Johansen test. The statistic of the new test for the absence of cointegration converges to a partial sum of the Airy$_1$ point process. Supporting Monte Carlo simulations indicate that the same behavior persists in many situations beyond those covered by our theorems. The paper presents empirical implementations of the approach for the analysis of S$\&$P$100$ stocks and of cryptocurrencies. The latter example shows a strong presence of multiple cointegrating relationships, while the results for the former are consistent with the null of no cointegration.
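For orientation, the null hypothesis can be phrased in the standard error-correction form of a VAR($k$); this is only a sketch in textbook Johansen notation, and the paper's exact specification of deterministic terms and normalizations may differ:
\[
\Delta X_t \;=\; \Pi X_{t-1} \;+\; \sum_{i=1}^{k-1} \Gamma_i \,\Delta X_{t-i} \;+\; \mu_t \;+\; \varepsilon_t, \qquad X_t \in \mathbb{R}^N,
\]
where the absence of cointegration corresponds to $\operatorname{rank}(\Pi)=0$ and the tests compare this null against a positive cointegration rank; the asymptotic regime of interest is $N,T\to\infty$ with $N/T$ bounded away from zero and infinity.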
Recent years have seen a surge of interest in the algorithmic estimation of stochastic entropy production (EP) from trajectory data via machine learning. A crucial element of such algorithms is the identification of a loss function whose minimization guarantees accurate EP estimation. In this study, we show that there exists a host of loss functions, namely those implementing a variational representation of the $\alpha$-divergence, which can be used for EP estimation. With $\alpha$ fixed to a value between $-1$ and $0$, the $\alpha$-NEEP (Neural Estimator for Entropy Production) exhibits much more robust performance against strong nonequilibrium driving or slow dynamics, both of which adversely affect the existing method based on the Kullback-Leibler divergence ($\alpha = 0$). In particular, the choice of $\alpha = -0.5$ tends to yield optimal results. To corroborate our findings, we present an exactly solvable simplification of the EP estimation problem, whose loss function landscape and stochastic properties give deeper intuition into the robustness of the $\alpha$-NEEP.
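As a point of reference (standard background rather than material from the abstract), variational representations of $f$-divergences, of which the $\alpha$-divergence is a member, take the generic form
\[
D_f(P\,\|\,Q) \;=\; \sup_{h}\;\Big\{\, \mathbb{E}_P\big[h(X)\big] \;-\; \mathbb{E}_Q\big[f^{*}\!\big(h(X)\big)\big] \,\Big\},
\]
where $f^{*}$ is the convex conjugate of the generator $f$ and the supremum is taken over functions $h$ parameterized by a neural network. The existing KL-based estimator corresponds to $\alpha=0$ in this family, while the $\alpha$-NEEP uses the losses associated with $\alpha\in(-1,0)$, with $\alpha=-0.5$ reported as the most robust choice.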
We study the edge-coloring problem in simple $n$-vertex $m$-edge graphs with maximum degree $\Delta$. This is one of the most classical and fundamental graph-algorithmic problems. Vizing's celebrated theorem provides a $(\Delta+1)$-edge-coloring in $O(m\cdot n)$ deterministic time. This running time was later improved to $O\left(m\cdot\min\left\{\Delta\cdot\log n, \sqrt{n}\right\}\right)$. It is also well known that a $3\left\lceil\frac{\Delta}{2}\right\rceil$-edge-coloring can be computed deterministically in $O(m\cdot\log\Delta)$ time. Duan et al. devised a randomized $(1+\varepsilon)\Delta$-edge-coloring algorithm with running time $O\left(m\cdot\frac{\log^6 n}{\varepsilon^2}\right)$. It remained open, however, whether there exists a deterministic near-linear-time algorithm for this basic problem. We devise a simple deterministic $(1+\varepsilon)\Delta$-edge-coloring algorithm with running time $O\left(m\cdot\frac{\log n}{\varepsilon}\right)$. We also devise a randomized $(1+\varepsilon)\Delta$-edge-coloring algorithm with running time $O(m\cdot(\varepsilon^{-18}+\log(\varepsilon\cdot\Delta)))$. For $\varepsilon\geq\frac{1}{\log^{1/18}\Delta}$, the latter running time is $O(m\cdot\log\Delta)$.
In data-driven control and machine learning, a common requirement is to decompose large matrices into smaller, low-rank factors with prescribed levels of sparsity. This paper introduces a new solution to the orthogonal nonnegative matrix factorization (ONMF) problem, where the objective is to approximate the input data by two low-rank nonnegative matrices subject to both orthogonality and $\ell_0$-norm sparsity constraints. The proposed maximum-entropy-principle-based framework ensures orthogonality and sparsity of the features or the mixing matrix, while maintaining nonnegativity in both. Additionally, the methodology offers a quantitative determination of the ``true'' number of underlying features, a crucial hyperparameter for ONMF. Experimental evaluation on synthetic and standard datasets highlights the method's superiority in terms of sparsity, orthogonality, and computational speed compared to existing approaches. Notably, the proposed method achieves reconstruction errors comparable to or better than those reported in the literature.
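A minimal sketch of the optimization problem being addressed is given below; whether the orthogonality and $\ell_0$ constraints are imposed on the feature matrix or on the mixing matrix follows the paper, so the placement here is illustrative only:
\[
\min_{W \ge 0,\; H \ge 0} \;\; \|X - WH\|_F^{2}
\qquad \text{s.t.} \qquad H H^{\top} = I, \quad \|H_{i\cdot}\|_{0} \le s \ \ \text{for every row } i,
\]
where $X$ is the input data matrix, $W$ and $H$ are the low-rank nonnegative factors, and $s$ is the sparsity budget; the number of rows of $H$ is the number of underlying features whose ``true'' value the maximum-entropy framework estimates.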
We propose an algorithm that solves symmetric cone programs (SCPs) by iteratively calling projection and rescaling methods, which are algorithms for solving exceptional cases of SCP. Although our algorithm can solve SCPs on its own, we propose it primarily as a post-processing step for interior-point methods, since it solves the problems more efficiently when supplied with an approximate optimal (interior feasible) solution. We also conduct numerical experiments to assess the performance of the proposed algorithm when used as a post-processing step for solvers implementing interior-point methods, using several instances in which the symmetric cone is a direct product of positive semidefinite cones. Numerical results show that our algorithm obtains approximate optimal solutions more accurately than the solvers alone. When at least one of the primal and dual problems had no interior feasible solution, the performance of our algorithm degraded slightly in terms of optimality. However, when both the primal and dual problems had interior feasible solutions, our algorithm stably returned more accurate solutions than the solvers.
This paper presents exact formulas for the probability density function (PDF) and moment generating function (MGF) of the sum-product of statistically independent but not necessarily identically distributed (i.n.i.d.) Nakagami-$m$ random variables (RVs) in terms of Meijer's G-function. Exact series representations are also derived for the sum of double-Nakagami RVs, providing useful insights into the trade-off between accuracy and computational cost. Simple asymptotic analytical expressions are provided to gain further insight into the derived formulas, and the achievable diversity order is obtained. The derived statistical properties prove to be a highly useful tool for modeling parallel cascaded Nakagami-$m$ fading channels. The application of these new results is illustrated by deriving exact expressions and simple tight upper bounds for the outage probability (OP) and average symbol error rate (ASER) of several binary and multilevel modulation signals in intelligent reflecting surface (IRS)-assisted communication systems operating over Nakagami-$m$ fading channels. It is demonstrated that the new asymptotic expressions are highly accurate and can be extended to encompass a wider range of scenarios. To validate the theoretical frameworks and formulations, Monte-Carlo simulation results are presented. Additionally, supplementary simulations are provided to compare the derived results with two common types of approximations available in the literature, namely the central limit theorem (CLT) and the gamma distribution.
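For concreteness, in the IRS setting the relevant quantity is a sum of products of independent Nakagami-$m$ envelopes; a minimal sketch of this double-Nakagami model, with notation introduced here for illustration, is
\[
Z \;=\; \sum_{i=1}^{N} h_i\, g_i,
\]
where $h_i$ and $g_i$ are i.n.i.d. Nakagami-$m$ RVs modeling the transmitter-to-IRS and IRS-to-receiver links of the $i$-th reflecting element. The PDF and MGF of $Z$, expressed via Meijer's G-function, are what the OP and ASER expressions are built on.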
This paper analyzes the stability of the class of Time-Accurate and Highly-Stable Explicit Runge-Kutta (TASE-RK) methods, introduced in 2021 by Bassenne et al. (J. Comput. Phys.) for the numerical solution of stiff Initial Value Problems (IVPs). Such numerical methods are easy to implement and require the solution of a limited number of linear systems per step, whose coefficient matrices involve the exact Jacobian $J$ of the problem. To significantly reduce the computational cost of TASE-RK methods without altering their consistency properties, it is possible to replace $J$ with a matrix $A$ (not necessarily tied to $J$) in their formulation, for instance fixed for a certain number of consecutive steps or even constant. However, the stability properties of TASE-RK methods strongly depend on this choice, and have so far been studied assuming $A=J$. In this manuscript, we theoretically investigate the conditional and unconditional stability of TASE-RK methods for arbitrary $A$. To this end, we first split the Jacobian as $J=A+B$. Then, through the use of stability diagrams and their connections with the field of values, we analyze both the case in which $A$ and $B$ are simultaneously diagonalizable and the case in which they are not. Numerical experiments, conducted on Partial Differential Equations (PDEs) arising from applications, show the correctness and utility of the theoretical results derived in the paper, as well as the good stability and efficiency of TASE-RK methods when $A$ is suitably chosen.
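A compact way to read the splitting described above (a sketch; the paper's stability functions are more detailed) is the linear test problem
\[
y'(t) \;=\; J\,y(t) \;=\; (A+B)\,y(t),
\]
in which $A$ is the matrix actually used inside the TASE operators and $B=J-A$ measures how far it is from the exact Jacobian; conditional and unconditional stability are then studied through the spectra, or more generally the fields of values, of $A$ and $B$.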
Typically, training LLMs with long context sizes is computationally expensive, requiring extensive training hours and GPU resources. Existing long-context extension methods usually require additional training procedures to support the corresponding long-context windows, where long-context training data (e.g., 32k) is needed and high GPU training costs are incurred. To address these issues, we propose an Efficient and Extreme length extension method for Large Language Models, called E$^2$-LLM, with only one training procedure and dramatically reduced computation cost, which also removes the need to collect long-context data. Concretely, first, the training data of our E$^2$-LLM only requires a short length (e.g., 4k), which greatly reduces the tuning cost. Second, the training procedure on the short training context window is performed only once, and we can support different evaluation context windows at inference. Third, in E$^2$-LLM, based on RoPE position embeddings, we introduce two different augmentation methods on the scale and position index parameters for different samples in training. The aim is to make the model more robust to different relative position distances when directly interpolating to an arbitrary context length at inference. Comprehensive experimental results on multiple benchmark datasets demonstrate the effectiveness of our E$^2$-LLM on challenging long-context tasks.
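As background (the standard RoPE parameterization, not a formula from the abstract), each pair of hidden dimensions $(x_{2j},x_{2j+1})$ at position index $m$ is rotated by the angle $m\,\theta_j$ with $\theta_j=b^{-2j/d}$ for a base $b$; a per-sample augmentation of the kind described above can then be indicated schematically as
\[
m \;\longmapsto\; \frac{m}{s} + \delta,
\]
where the scale $s$ and the position-index offset $\delta$ are drawn differently for different training samples. The exact augmentation distributions are specified in the paper; this display only marks where the two augmented parameters enter the RoPE computation.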
In the $0$-Extension problem, we are given an edge-weighted graph $G=(V,E,c)$, a set $T\subseteq V$ of its vertices called terminals, and a semi-metric $D$ over $T$, and the goal is to find an assignment $f$ of each non-terminal vertex to a terminal, minimizing the sum, over all edges $(u,v)\in E$, of the product of the edge weight $c(u,v)$ and the distance $D(f(u),f(v))$ between the terminals that $u,v$ are mapped to. The current best approximation algorithms for $0$-Extension are based on rounding a linear programming relaxation called the \emph{semi-metric LP relaxation}. The integrality gap of this LP, with best known upper bound $O(\log |T|/\log\log |T|)$ and best known lower bound $\Omega((\log |T|)^{2/3})$, has been shown to be closely related to the best achievable quality of cut and flow vertex sparsifiers. We study a variant of the $0$-Extension problem where Steiner vertices are allowed. Specifically, we focus on the integrality gap of the same semi-metric LP relaxation for this new problem. Following previous work, this new integrality gap turns out to be closely related to the quality achievable by cut/flow vertex sparsifiers with Steiner nodes, a major open problem in graph compression. Our main result is that the new integrality gap remains superconstant, $\Omega(\log\log |T|)$, even if we allow a super-linear $O(|T|\log^{1-\varepsilon}|T|)$ number of Steiner nodes.
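In symbols, the problem and its relaxation read as follows (the standard formulation; the variant studied here additionally allows Steiner vertices):
\[
\min_{\substack{f : V \to T \\ f(t) = t \ \forall t \in T}} \;\; \sum_{(u,v)\in E} c(u,v)\, D\big(f(u), f(v)\big),
\]
while the semi-metric LP relaxation minimizes $\sum_{(u,v)\in E} c(u,v)\,\delta(u,v)$ over all semi-metrics $\delta$ on $V$ that agree with $D$ on the terminal set $T$.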
SARRIGUREN, a new complete algorithm for SAT based on counting clauses (which is also valid for Unique-SAT and #SAT), is described, analyzed, and tested. Although existing complete algorithms for SAT perform more slowly on clauses with many literals, such clauses are an advantage for SARRIGUREN: the more literals a clause contains, the higher the probability of overlap among clauses, a property that makes the clause-counting process more efficient. In fact, it achieves $O(m^2 \times n/k)$ time complexity for random $k$-SAT instances with $n$ variables and $m$ relatively dense clauses, where density is measured relative to the number of variables $n$; that is, clauses are relatively dense when $k\geq7\sqrt{n}$. Although worst cases with exponential complexity are theoretically possible, the probability of encountering them in random $k$-SAT with relatively dense clauses is practically zero. The algorithm has been tested empirically, and the polynomial time complexity is maintained also for $k$-SAT instances with less dense clauses ($k\geq5\sqrt{n}$). That density can, for example, be as low as 0.049 when working with $n=20000$ variables and $k=989$ literals. In addition, two complementary algorithms are presented that provide the solutions to $k$-SAT instances and valuable information about the number of solutions for each literal. Although this algorithm does not settle the NP=P question (it is not a polynomial algorithm for 3-SAT), it broadens our knowledge of the subject, because $k$-SAT with $k>3$ and dense clauses is no harder than 3-SAT. Moreover, the Python implementation of the algorithms, together with all input datasets and the results obtained in the experiments, is made available.
We present approximation algorithms for the Fault-tolerant $k$-Supplier with Outliers ($\mathsf{F}k\mathsf{SO}$) problem. This is a common generalization of two known problems -- $k$-Supplier with Outliers, and Fault-tolerant $k$-Supplier -- each of which generalizes the well-known $k$-Supplier problem. In the $k$-Supplier problem the goal is to serve $n$ clients $C$ by opening $k$ facilities from a set of possible facilities $F$; the objective function is the farthest that any client must travel to access an open facility. In $\mathsf{F}k\mathsf{SO}$, each client $v$ has a fault-tolerance $\ell_v$ and requires $\ell_v$ facilities to serve it; so each client $v$'s contribution to the objective function is now its distance to the $\ell_v^{\text{th}}$ closest open facility. Furthermore, we are allowed to choose $m$ clients to serve, and only those clients contribute to the objective function, while the remaining $n-m$ are considered outliers. Our main result is a $\min\{4t-1,2^t+1\}$-approximation for the $\mathsf{F}k\mathsf{SO}$ problem, where $t$ is the number of distinct values of $\ell_v$ that appear in the instance. For $t=1$, i.e. the case where the $\ell_v$'s are uniformly some $\ell$, this yields a $3$-approximation, improving upon the $11$-approximation given for the uniform case by Inamdar and Varadarajan [2020], who also introduced the problem. Our result for the uniform case matches the tight $3$-approximations that exist for $k$-Supplier, $k$-Supplier with Outliers, and Fault-tolerant $k$-Supplier. Our key technical contribution is an application of the round-or-cut schema to $\mathsf{F}k\mathsf{SO}$. Guided by an LP relaxation, we reduce to a simpler optimization problem, which we can solve to obtain distance bounds for the "round" step and valid inequalities for the "cut" step.
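Writing the objective out explicitly, with $d_{\ell}(v,S)$ denoting the distance from client $v$ to its $\ell$-th closest facility in the open set $S$ (notation introduced here for illustration), $\mathsf{F}k\mathsf{SO}$ asks for
\[
\min_{\substack{S \subseteq F,\ |S| = k \\ C' \subseteq C,\ |C'| = m}} \;\; \max_{v \in C'} \; d_{\ell_v}\big(v, S\big),
\]
that is, open $k$ facilities and select $m$ clients to serve so that the largest fault-tolerant service distance among the served clients is minimized; the remaining $n-m$ clients are outliers and do not contribute.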