In this paper, we investigate the existence of parameterized algorithms running in subexponential time for two fundamental cycle-hitting problems: Feedback Vertex Set (FVS) and Triangle Hitting (TH). We focus on the class of pseudo-disk graphs, which forms a common generalization of several graph classes for which such results exist, such as disk graphs and square graphs. In these graphs, we show that TH can be solved in time $2^{O(k^{3/4}\log k)}n^{O(1)}$ and that, given a geometric representation, FVS can be solved in time $2^{O(k^{6/7}\log k)}n^{O(1)}$.
We study the problem of testing whether the missing values of a potentially high-dimensional dataset are Missing Completely at Random (MCAR). We relax the problem of testing MCAR to the problem of testing the compatibility of a collection of covariance matrices, motivated by the fact that this procedure is feasible when the dimension grows with the sample size. Our first contributions are to define a natural measure of the incompatibility of a collection of correlation matrices, which can be characterised as the optimal value of a Semi-definite Programming (SDP) problem, and to establish a key duality result allowing its practical computation and interpretation. By analysing the concentration properties of the natural plug-in estimator for this measure, we propose a novel hypothesis test, which is calibrated via a bootstrap procedure and demonstrates power against any distribution with incompatible covariance matrices. By considering key examples of missingness structures, we demonstrate that our procedures are minimax rate optimal in certain cases. We further validate our methodology with numerical simulations that provide evidence of validity and power, even when data are heavy tailed. Furthermore, tests of compatibility can be used to test the feasibility of positive semi-definite matrix completion problems with noisy observations, and thus our results may be of independent interest.
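The compatibility notion at the core of this procedure can be illustrated in a minimal three-variable setting (a simplified sketch of the underlying idea, not the paper's SDP-based incompatibility measure): if three missingness patterns each reveal one pairwise correlation, the patterns are compatible only if the assembled $3\times 3$ correlation matrix is positive semi-definite. Under MCAR the pattern-wise correlations all coincide with the marginal ones, so incompatibility is evidence against MCAR.

```python
import numpy as np

def pairwise_correlations_compatible(r12, r13, r23, tol=1e-12):
    """Check whether three pairwise correlations, each estimated on a
    different missingness pattern, can arise from a single joint
    distribution, i.e. whether the assembled matrix is PSD."""
    R = np.array([[1.0, r12, r13],
                  [r12, 1.0, r23],
                  [r13, r23, 1.0]])
    return np.linalg.eigvalsh(R).min() >= -tol

# Compatible: these correlations can come from one trivariate distribution.
print(pairwise_correlations_compatible(0.5, 0.5, 0.5))    # True
# Incompatible: no single joint distribution has these pairwise
# correlations, so data exhibiting them cannot be MCAR.
print(pairwise_correlations_compatible(0.9, 0.9, -0.9))   # False
```

In the general high-dimensional setting the paper addresses, this eigenvalue check is replaced by the SDP-based measure of incompatibility described above.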
In this paper we consider a nonconvex unconstrained optimization problem minimizing a twice differentiable objective function with H\"older continuous Hessian. Specifically, we first propose a Newton-conjugate gradient (Newton-CG) method for finding an approximate first- and second-order stationary point of this problem, assuming that the associated H\"older parameters are explicitly known. Then we develop a parameter-free Newton-CG method that requires no prior knowledge of these parameters. To the best of our knowledge, this method is the first parameter-free second-order method achieving the best-known iteration and operation complexity for finding an approximate first- and second-order stationary point of this problem. Finally, we present preliminary numerical results to demonstrate the superior practical performance of our parameter-free Newton-CG method over a well-known regularized Newton method.
In this paper, we consider a class of non-convex and non-smooth sparse optimization problems, which encompasses most existing nonconvex sparsity-inducing terms. We show that the second-order optimality conditions depend only on the nonzero entries of the stationary points. We propose two damped iteratively reweighted algorithms, the damped iteratively reweighted $\ell_1$ algorithm (DIRL$_1$) and the damped iteratively reweighted $\ell_2$ algorithm (DIRL$_2$), to solve these problems. For DIRL$_1$, we show that the reweighted $\ell_1$ subproblem has the support identification property, so that DIRL$_1$ locally reverts to a gradient descent algorithm around a stationary point. For DIRL$_2$, we show that the solution map of the reweighted $\ell_2$ subproblem is differentiable and Lipschitz continuous everywhere. Consequently, the iteration maps of DIRL$_1$ and DIRL$_2$ and their inverses are Lipschitz continuous, and the strict saddle points are unstable fixed points of both algorithms. By applying the stable manifold theorem, these algorithms are shown to converge only to local minimizers under random initialization when the strict saddle point property is assumed.
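As a rough illustration of a damped iteratively reweighted $\ell_1$ scheme of this kind (a minimal sketch under assumptions not taken from the paper: a denoising data-fidelity term $\tfrac12\|x-y\|^2$ and the log-sum penalty $\lambda\sum_i\log(1+|x_i|/\epsilon)$ as the nonconvex sparsity-inducing term), each outer step linearizes the penalty to obtain weights, solves the weighted $\ell_1$ subproblem in closed form by soft-thresholding, and applies a damped update:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of a (componentwise) weighted l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def dirl1_denoise(y, lam=0.5, eps=0.1, beta=0.7, iters=200):
    """Damped iteratively reweighted l1 (illustrative sketch).

    Targets 0.5*||x - y||^2 + lam * sum(log(1 + |x_i|/eps)).  Each
    iteration: reweight via the penalty's derivative at the current
    point, solve the weighted-l1 subproblem in closed form, damp."""
    x = y.copy()
    for _ in range(iters):
        w = lam / (eps + np.abs(x))       # weights from linearized penalty
        z = soft_threshold(y, w)          # exact weighted-l1 subproblem solution
        x = (1.0 - beta) * x + beta * z   # damped update
    return x

y = np.array([3.0, -2.5, 0.05, -0.04, 1.2])
x = dirl1_denoise(y)
# large entries converge to nonzero stationary values;
# small entries are driven (numerically) to zero
```

The support identification behavior is visible numerically: after finitely many iterations the weights of the small entries exceed the corresponding data values, the soft-threshold returns zero for them, and the damped iterate decays geometrically toward zero while the surviving entries converge to a stationary point.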
Given a pair of random lifetimes whose dependence is described by a Time Transformed Exponential (TTE) model, we provide analytical expressions for the distribution of their sum. These expressions are obtained by using a representation of the joint distribution in terms of multivariate distortions, an alternative to the classical copula representation. Since this approach yields conditional distributions and their inverses in simple form, we also show how it can be used to predict the value of the sum from the value of one of the variables (or vice versa) by using quantile regression techniques.
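To make the setting concrete, here is a small simulation sketch for one classical TTE specification (a gamma-frailty model with $\bar G(t)=(1+t)^{-1/\theta}$ and identity time transforms; these choices are illustrative and not taken from the paper). In this special case both the survival function of the sum and the conditional quantiles used for prediction are available in closed form:

```python
import numpy as np

# Gamma-frailty TTE model: given a frailty L ~ Gamma(1/theta, 1), the
# lifetimes X, Y are conditionally independent exponentials with rate L,
# so that P(X > x, Y > y) = (1 + x + y)^(-1/theta).
theta = 1.0
rng = np.random.default_rng(1)
n = 200_000
frailty = rng.gamma(shape=1.0 / theta, scale=1.0, size=n)
x = rng.exponential(size=n) / frailty
y = rng.exponential(size=n) / frailty
s = x + y

# Closed-form survival of the sum for theta = 1:
#   P(S > s0) = (1 + 2*s0) / (1 + s0)^2
s0 = 1.0
analytic = (1.0 + 2.0 * s0) / (1.0 + s0) ** 2   # = 0.75
empirical = (s > s0).mean()

def predict_sum_median(x_obs, theta=1.0):
    """Median of S = X + Y given X = x_obs, inverting the closed-form
    conditional survival P(Y > y | X = x) = ((1+x)/(1+x+y))^(1/theta+1)."""
    y_med = (1.0 + x_obs) * (2.0 ** (theta / (theta + 1.0)) - 1.0)
    return x_obs + y_med

# The median-regression predictor covers S half the time, as it should.
coverage = (s <= predict_sum_median(x, theta)).mean()   # ≈ 0.5
```

The simple closed-form conditional survival and its inverse are exactly the ingredients that make quantile-regression prediction of the sum tractable in this class of models.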
In this work, we present a semi-discrete scheme to approximate solutions of the scalar LWR traffic model with spatially discontinuous flux, described by the equation $u_t + (k(x)u(1-u))_x = 0$. The approach is based on the Lagrangian-Eulerian method proposed by E. Abreu, J. Francois, W. Lambert, and J. Perez [J. Comput. Appl. Math. 406 (2022) 114011] for scalar conservation laws. We derive a non-uniform bound on the growth rate of the total variation of the approximate solutions. Since the total variation can blow up only at $x=0$, we prove convergence of our scheme in $BV_{loc}(\mathbb{R}\setminus \lbrace 0 \rbrace)$ by using Helly's compactness theorem.
In this paper, we provide a mathematical framework for improving generalization in a class of learning problems related to point estimation for modeling high-dimensional nonlinear functions. In particular, we consider a variational problem for a weakly-controlled gradient system, whose control input enters the system dynamics as a coefficient of a nonlinear term scaled by a small parameter. Here, the optimization problem consists of a cost functional that gauges the quality of the estimated model parameters at a fixed final time w.r.t. the model validation dataset, while the time evolution of the weakly-controlled gradient system is guided by the model training dataset and a version of it perturbed by small random noise. Using perturbation theory, we provide results that allow us to solve a sequence of optimization problems, i.e., a set of decomposed optimization problems, and to aggregate the corresponding approximate optimal solutions, which are reasonably sufficient for improving generalization in this class of learning problems. Moreover, we provide an estimate of the rate of convergence of these approximate optimal solutions. Finally, we present some numerical results for a typical nonlinear regression problem.
There is a wide range of mathematical models that describe populations of large numbers of neurons. In this article, we focus on nonlinear noisy leaky integrate-and-fire (NNLIF) models that describe neuronal activity at the level of the membrane potential. We introduce a sequence of states, which we call pseudoequilibria, and give evidence of their defining role in the behavior of the NNLIF system when a significant synaptic delay is considered. The advantage is that these states are determined solely by the system's parameters and are derived from a sequence of firing rates that result from solving a recurrence equation. We propose a strategy to show convergence to an equilibrium for a weakly connected system with large transmission delay, based on following the sequence of pseudoequilibria. Unlike direct entropy dissipation methods, this technique allows us to see how a large delay favors convergence. We present a detailed numerical study to support our results. This study helps us understand, among other phenomena, the appearance of periodic solutions in strongly inhibitory networks.
Weighting with the inverse probability of censoring is an approach to dealing with censoring in regression analyses where the outcome may be missing due to right-censoring. In this paper, three separate approaches based on this idea are compared in a setting where the Kaplan--Meier estimator is used for estimating the censoring probability. In more detail, the three approaches involve weighted regression, regression with a weighted outcome, and regression of a jack-knife pseudo-observation based on a weighted estimator. Expressions for the asymptotic variances are given in each case, and these are compared to each other and to the uncensored case. In terms of low asymptotic variance, there is no clear winner: which approach has the lowest asymptotic variance depends on the censoring distribution. Expressions for the limit of the standard sandwich variance estimator in the three cases are also provided, revealing an overestimation under the implied assumptions.
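As a concrete sketch of the first of these approaches (weighted regression with Kaplan--Meier-based censoring weights; the data-generating model below is illustrative, not from the paper), each uncensored subject is weighted by the inverse of the estimated censoring survival probability just before its event time:

```python
import numpy as np

def km_left_limit(times, events):
    """Kaplan-Meier survival estimate, evaluated as the left limit at
    each subject's own time.  `events == 1` marks the event whose
    distribution is estimated (here: censoring).  Assumes no ties."""
    n = len(times)
    order = np.argsort(times)
    at_risk = n - np.arange(n)                            # risk-set sizes
    factors = np.where(events[order] == 1, 1.0 - 1.0 / at_risk, 1.0)
    surv_before = np.concatenate(([1.0], np.cumprod(factors)[:-1]))
    out = np.empty(n)
    out[order] = surv_before
    return out

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
t = np.exp(1.0 + 0.5 * x + 0.3 * rng.normal(size=n))  # true survival times
c = rng.exponential(scale=8.0, size=n)                # censoring times
obs = np.minimum(t, c)
delta = (t <= c).astype(float)                        # 1 if event observed

# IPCW weights: 1/G_hat(T-) for uncensored subjects, 0 otherwise, where
# G_hat is the Kaplan-Meier estimate of the censoring survival function.
g_hat = km_left_limit(obs, 1.0 - delta)
w = delta / g_hat

# Weighted least squares of the outcome log(T) on x.
sw = np.sqrt(w)
A = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(A * sw[:, None], np.log(obs) * sw, rcond=None)
# beta recovers approximately (1.0, 0.5), the uncensored coefficients
```

The other two approaches reuse the same weights but apply them to the outcome itself, or to a jack-knife pseudo-observation built from a weighted estimator, rather than to the regression.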
This perspective article aims at providing an outline of the state of the art and of future developments towards the integration of cutting-edge predictive language models with brain-computer interfaces (BCIs). A synthetic overview of early and more recent linguistic models, from natural language processing (NLP) models to recent large language models (LLMs), that to a varying extent improved predictive writing systems, is first provided. Second, a summary of previous BCI implementations integrating language models is presented. The few preliminary studies investigating the possible combination of LLMs with BCI spellers to efficiently support fast communication and control are then described. Finally, current challenges and limitations towards the full integration of LLMs with BCI systems are discussed. Recent investigations suggest that the combination of LLMs with BCIs might drastically improve human-computer interaction in patients with motor or language disorders as well as in healthy individuals. In particular, pretrained autoregressive transformer models, such as GPT, that capitalize on parallelization and on learning through pre-training and fine-tuning, promise a substantial improvement of BCIs for communication with respect to previous systems incorporating simpler language models. Indeed, among the various models, GPT-2 was shown to be an excellent candidate for integration into BCIs, although testing was only performed on simulated conversations and not in real BCI scenarios. Prospectively, the full integration of LLMs with advanced BCI systems might lead to a big leap forward towards fast, efficient and user-adaptive neurotechnology.
Among all the deterministic CholeskyQR-type algorithms, Shifted CholeskyQR3 is specifically designed to address the QR factorization of ill-conditioned matrices. This algorithm introduces a shift parameter $s$ to prevent failure during the initial Cholesky factorization step, making the choice of this parameter critical for the algorithm's effectiveness. Our goal is to identify a smaller $s$ compared to the traditional selection based on $\|X\|_{2}$. In this research, we propose a new matrix norm called the $g$-norm, which is based on the column properties of $X$. This norm allows us to obtain a reduced shift parameter $s$ for the Shifted CholeskyQR3 algorithm, thereby improving the sufficient condition on $\kappa_{2}(X)$ for this method. We provide rigorous proofs of orthogonality and residuals for the improved algorithm using our proposed $s$. Numerical experiments confirm the enhanced numerical stability of orthogonality and residuals with the reduced $s$. We find that Shifted CholeskyQR3 can effectively handle ill-conditioned $X$ with a larger $\kappa_{2}(X)$ when using our reduced $s$ compared to the original $s$. Furthermore, we compare CPU times with other algorithms to assess performance improvements.
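For reference, the basic Shifted CholeskyQR3 iteration (which is independent of how $s$ is chosen) can be sketched as follows. The default shift below is the conventional $\|X\|_{2}$-based choice from the literature that the reduced $s$ is meant to improve on; the exact constant is an assumption here and should be checked against the original papers.

```python
import numpy as np

def cholesky_qr(x):
    """One CholeskyQR step: X = QR with R the Cholesky factor of X^T X."""
    r = np.linalg.cholesky(x.T @ x).T      # upper-triangular factor
    q = np.linalg.solve(r.T, x.T).T        # Q = X R^{-1} via triangular system
    return q, r

def shifted_cholesky_qr3(x, s=None):
    """Shifted CholeskyQR3: one shifted step to survive ill-conditioning,
    then two ordinary CholeskyQR steps to restore orthogonality."""
    m, n = x.shape
    if s is None:
        # Conventional ||X||_2-based shift (constant assumed, not verified
        # against the original papers).
        u = np.finfo(x.dtype).eps
        s = 11.0 * (m * n + n * (n + 1)) * u * np.linalg.norm(x, 2) ** 2
    r1 = np.linalg.cholesky(x.T @ x + s * np.eye(n)).T
    q1 = np.linalg.solve(r1.T, x.T).T
    q2, r2 = cholesky_qr(q1)
    q3, r3 = cholesky_qr(q2)
    return q3, r3 @ r2 @ r1

# Ill-conditioned test matrix with kappa_2(X) = 1e8.
rng = np.random.default_rng(0)
u_mat, _ = np.linalg.qr(rng.normal(size=(300, 20)))
v_mat, _ = np.linalg.qr(rng.normal(size=(20, 20)))
x_mat = u_mat @ np.diag(np.logspace(0, -8, 20)) @ v_mat.T
q_f, r_f = shifted_cholesky_qr3(x_mat)
orth = np.linalg.norm(q_f.T @ q_f - np.eye(20))            # near machine precision
resid = np.linalg.norm(q_f @ r_f - x_mat) / np.linalg.norm(x_mat)
```

A smaller admissible $s$ matters because the shifted step leaves $Q_1$ with condition number roughly $\sigma_{\min}(X)/\sqrt{s}$ away from orthogonality; the smaller the shift, the better conditioned $Q_1$ is and the larger the $\kappa_{2}(X)$ the two clean-up steps can tolerate.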