This paper gives a comprehensive treatment of the convergence rates of penalized spline estimators for simultaneously estimating several leading principal component functions when the functional data are sparsely observed. The penalized spline estimators are defined as solutions of a penalized empirical risk minimization problem, where the loss function belongs to a general class motivated by the matrix Bregman divergence and the penalty term is the integrated squared derivative. The theory reveals that the asymptotic behavior of the penalized spline estimators depends on an interplay among several factors: the smoothness of the unknown functions, the spline degree, the number of spline knots, the penalty order, and the penalty parameter. The theory also classifies the asymptotic behavior into seven scenarios and characterizes whether and how the minimax optimal rates of convergence are achievable in each scenario.
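To fix ideas, a generic form of such a penalized empirical risk criterion (with notation chosen here for illustration rather than taken from the paper) is
\[
\big(\widehat f_1,\dots,\widehat f_r\big) \;=\; \operatorname*{arg\,min}_{f_1,\dots,f_r \in \mathcal S_{q,K}} \;\frac{1}{n}\sum_{i=1}^{n} D\Big(\widehat C_i,\; \sum_{j=1}^{r} f_j\otimes f_j\Big) \;+\; \lambda \sum_{j=1}^{r}\int \big(f_j^{(m)}(t)\big)^2\,dt,
\]
where $\mathcal S_{q,K}$ is a spline space of degree $q$ with $K$ knots, $D$ is a loss from the matrix-Bregman-divergence class applied to an empirical covariance object $\widehat C_i$, $\lambda$ is the penalty parameter, and $m$ is the penalty order; these are exactly the quantities whose interplay drives the seven scenarios.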
The latency location routing problem integrates the facility location problem and the multi-depot cumulative capacitated vehicle routing problem. It requires simultaneous decisions about depot locations and vehicle routes to serve customers while minimizing the sum of the customers' waiting (arrival) times. To address this computationally challenging problem, we propose a reinforcement-learning-guided hybrid evolutionary algorithm following the framework of the memetic algorithm. The proposed algorithm relies on a diversity-enhanced multi-parent edge assembly crossover to build promising offspring and a reinforcement-learning-guided variable neighborhood descent to determine the exploration order of multiple neighborhoods. Additionally, strategic oscillation is used to achieve a balanced exploration of both feasible and infeasible solutions. The competitiveness of the algorithm against state-of-the-art methods is demonstrated by experimental results on three sets of 76 popular benchmark instances, including 51 improved best solutions (new upper bounds) for the 59 instances with unknown optima and results matching the best known solutions for the remaining instances. We also conduct additional experiments to shed light on the key components of the algorithm.
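As an illustration of how a learned policy can order neighborhoods inside a variable neighborhood descent, the following minimal sketch uses an epsilon-greedy value update; the neighborhood moves, solution representation, and cost function are placeholders and are not taken from the paper.

```python
# Illustrative sketch (not the paper's implementation): an epsilon-greedy,
# Q-learning style selector that decides which neighborhood to explore next
# inside a variable neighborhood descent (VND) loop.
import random

class RLNeighborhoodSelector:
    def __init__(self, n_neighborhoods, epsilon=0.2, alpha=0.1):
        self.q = [0.0] * n_neighborhoods   # estimated value of each neighborhood
        self.epsilon = epsilon             # exploration rate
        self.alpha = alpha                 # learning rate

    def select(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda i: self.q[i])

    def update(self, idx, reward):
        # incremental update toward the observed reward (objective improvement)
        self.q[idx] += self.alpha * (reward - self.q[idx])

def rl_guided_vnd(solution, cost, neighborhoods, max_rounds=100):
    """Apply neighborhood moves in an order guided by the learned values."""
    selector = RLNeighborhoodSelector(len(neighborhoods))
    best, best_cost = solution, cost(solution)
    for _ in range(max_rounds):
        idx = selector.select()
        candidate = neighborhoods[idx](best)       # placeholder neighborhood move
        candidate_cost = cost(candidate)
        reward = max(0.0, best_cost - candidate_cost)
        selector.update(idx, reward)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost

# Toy usage: minimize (x - 7)^2 with "increment" and "decrement" neighborhoods.
if __name__ == "__main__":
    print(rl_guided_vnd(0, lambda x: (x - 7) ** 2, [lambda x: x + 1, lambda x: x - 1]))
```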
In this paper we consider functional data with heterogeneity in time and in population. We propose a mixture model with a segmentation of time to represent this heterogeneity while preserving the functional structure. The maximum likelihood estimator is considered and proved to be identifiable and consistent. In practice, an EM algorithm, combined with dynamic programming for the maximization step, is used to approximate the maximum likelihood estimator. The method is illustrated on a simulated dataset and applied to a real dataset of electricity consumption.
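For reference, in generic finite-mixture notation (not the paper's segmented functional model), the E-step of such an EM algorithm computes the responsibilities
\[
\tau_{ik}^{(t)} \;=\; \frac{\pi_k^{(t)}\, f_k\big(y_i \mid \theta_k^{(t)}\big)}{\sum_{\ell=1}^{K} \pi_\ell^{(t)}\, f_\ell\big(y_i \mid \theta_\ell^{(t)}\big)},
\]
and the M-step maximizes the resulting expected complete-data log-likelihood; in the segmented setting considered here, the maximization over the time segmentation can be carried out by dynamic programming.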
In this paper we investigate the existence of subexponential parameterized algorithms for three fundamental cycle-hitting problems in geometric graph classes. The problems considered, \textsc{Triangle Hitting} (TH), \textsc{Feedback Vertex Set} (FVS), and \textsc{Odd Cycle Transversal} (OCT), ask for the existence in a graph $G$ of a set $X$ of at most $k$ vertices such that $G-X$ is, respectively, triangle-free, acyclic, or bipartite. Such subexponential parameterized algorithms are known to exist in planar and even $H$-minor-free graphs from bidimensionality theory [Demaine et al., JACM 2005], and a recent line of work lifts these results to geometric graph classes consisting of intersection graphs of "fat" objects ([Grigoriev et al., FOCS 2022] and [Lokshtanov et al., SODA 2022]). In this paper we focus on "thin" objects by considering intersection graphs of segments in the plane with $d$ possible slopes ($d$-DIR graphs) and contact graphs of segments in the plane. Assuming the ETH, we rule out the existence of algorithms:
- solving TH in time $2^{o(n)}$ in 2-DIR graphs; and
- solving TH, FVS, and OCT in time $2^{o(\sqrt{n})}$ in $K_{2,2}$-free contact 2-DIR graphs.
These results indicate that additional restrictions are necessary in order to obtain subexponential parameterized algorithms for these problems. In this direction we provide:
- a $2^{O(k^{3/4}\cdot \log k)}n^{O(1)}$-time algorithm for FVS in contact segment graphs;
- a $2^{O(\sqrt d\cdot t^2 \log t\cdot k^{2/3}\log k)} n^{O(1)}$-time algorithm for TH in $K_{t,t}$-free $d$-DIR graphs; and
- a $2^{O(k^{7/9}\log^{3/2}k)} n^{O(1)}$-time algorithm for TH in contact segment graphs.
Mixed methods for linear elasticity with strongly symmetric stresses of the lowest order are studied in this paper. On each simplex, the stress space consists of piecewise linear components with respect to its Alfeld split (which connects the vertices to the barycenter), generalizing the two-dimensional Johnson-Mercier element to higher dimensions. Further reductions of the stress space in the three-dimensional case (to 24 degrees of freedom per tetrahedron) are possible when the displacement space is reduced to local rigid displacements. Proofs of optimal error estimates for the numerical solutions and improved error estimates via postprocessing and a duality argument are presented.
We present a finite element approach for diffusion problems with thermal fluctuations based on a fluctuating hydrodynamics model. The governing transport equations are stochastic partial differential equations with a fluctuating forcing term. We propose a discrete formulation of the stochastic forcing term that has the correct covariance matrix up to a standard discretization error. Furthermore, to obtain a numerical solution with spatial correlations that converge to those of the continuum equation, we derive a linear mapping that transforms the finite element solution into an equivalent discrete solution free from the artificial correlations introduced by the spatial discretization. The method is validated by applying it to two diffusion problems: a second-order diffusion equation and a fourth-order diffusion equation. The theoretical (continuum) solution to the first case exhibits spatially decorrelated fluctuations, while the second case exhibits fluctuations correlated over a finite length. In both cases, the numerical solution has a structure factor that closely approximates that of the continuum equation.
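For intuition only, the following sketch shows one explicit step of a 1D fluctuating diffusion equation discretized with finite differences and a face-centered stochastic flux; the paper's finite element formulation and its covariance-correcting linear mapping are not reproduced here.

```python
# Minimal illustrative sketch (finite differences, not the paper's finite
# element method): one explicit Euler-Maruyama step of a 1D fluctuating
# diffusion equation  dc/dt = D d2c/dx2 + div(sqrt(2 D c_bar) W),
# using a stochastic flux on each cell face so that the discrete noise is
# white in space with the standard fluctuating-hydrodynamics scaling.
import numpy as np

def fluctuating_diffusion_step(c, D, c_bar, dx, dt, rng):
    n = c.size
    # deterministic face fluxes: -D * dc/dx (periodic boundaries for simplicity)
    grad = (np.roll(c, -1) - c) / dx
    flux_det = -D * grad
    # stochastic face fluxes with variance 2*D*c_bar/(dx*dt) per face
    flux_sto = np.sqrt(2.0 * D * c_bar / (dx * dt)) * rng.standard_normal(n)
    flux = flux_det + flux_sto
    # conservative update: c_j <- c_j - dt/dx * (F_{j+1/2} - F_{j-1/2})
    return c - dt / dx * (flux - np.roll(flux, 1))

rng = np.random.default_rng(0)
c = np.full(256, 1.0)
for _ in range(1000):
    c = fluctuating_diffusion_step(c, D=1.0, c_bar=1.0, dx=0.1, dt=1e-3, rng=rng)
```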
Incomplete factorizations have long been popular general-purpose algebraic preconditioners for solving large sparse linear systems of equations. Guaranteeing that the factorization is breakdown free while computing a high quality preconditioner is challenging. A resurgence of interest in using low precision arithmetic makes the search for robustness both more urgent and more difficult. In this paper, we focus on symmetric positive definite problems and explore a number of approaches: a look-ahead strategy to anticipate breakdown as early as possible, the use of global shifts, and a modification of an idea developed in the field of numerical optimization for the complete Cholesky factorization of dense matrices. Our numerical simulations target highly ill-conditioned sparse linear systems with the goal of computing the factors in half precision arithmetic and then achieving double precision accuracy using mixed precision refinement.
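The global-shift idea can be illustrated in a few lines: if the factorization breaks down, retry on $A + \alpha I$ with an increasing shift. The sketch below uses a complete dense Cholesky in place of an incomplete factorization and is not the paper's implementation.

```python
# Sketch of a global-shift retry strategy (illustrative only): on breakdown,
# increase the diagonal shift and refactorize. A dense complete Cholesky
# stands in for the incomplete factorization used as a preconditioner.
import numpy as np

def shifted_cholesky(A, alpha0=1e-3, growth=10.0, max_tries=20):
    alpha = 0.0
    identity = np.eye(A.shape[0])
    for _ in range(max_tries):
        try:
            L = np.linalg.cholesky(A + alpha * identity)
            return L, alpha                 # success: factor of the shifted matrix
        except np.linalg.LinAlgError:       # breakdown: non-positive pivot detected
            alpha = alpha0 if alpha == 0.0 else alpha * growth
    raise RuntimeError("factorization failed even with a large shift")
```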
This paper studies the convergence of a spatial semidiscretization of a three-dimensional stochastic Allen-Cahn equation with multiplicative noise. For non-smooth initial values, the regularity of the mild solution is investigated, and an error estimate is derived in the spatial $ L^2 $-norm. For smooth initial values, two error estimates in general spatial $ L^q $-norms are established.
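For concreteness, a standard formulation of such an equation (notation chosen here for illustration, not quoted from the paper) reads
\[
\mathrm{d}u(t) \;=\; \big(\Delta u(t) + u(t) - u(t)^3\big)\,\mathrm{d}t \;+\; F\big(u(t)\big)\,\mathrm{d}W(t), \qquad u(0)=u_0,
\]
posed on a bounded spatial domain in $\mathbb{R}^3$, where $W$ is a Wiener process and the operator-valued coefficient $F(u)$ makes the noise multiplicative.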
This paper studies the fundamental limits of availability and throughput for independent and heterogeneous demands of a limited resource. Availability is the probability that the demands are below the capacity of the resource. Throughput is the expected fraction of the resource that is utilized by the demands. We offer a concentration inequality generator that gives lower bounds on feasible availability and throughput pairs with a given capacity and independent but not necessarily identical distributions of up-to-unit demands. We show that availability and throughput cannot both be poor. These bounds are analogous to tail inequalities on sums of independent random variables, but hold throughout the support of the demand distribution. This analysis gives analytically tractable bounds supporting the unit-demand characterization of Chawla, Devanur, and Lykouris (2023) and generalizes to up-to-unit demands. Our bounds also provide an approach towards improved multi-unit prophet inequalities (Hajiaghayi, Kleinberg, and Sandholm, 2007). They have applications to transaction fee mechanism design (for blockchains) where high availability limits the probability of profitable user-miner coalitions (Chung and Shi, 2023).
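In the notation used here for illustration, with independent demands $D_1,\dots,D_n$ and capacity $C$, the two quantities are
\[
A \;=\; \Pr\Big(\sum_{i=1}^{n} D_i \le C\Big), \qquad T \;=\; \frac{1}{C}\,\mathbb{E}\Big[\min\Big(\sum_{i=1}^{n} D_i,\; C\Big)\Big],
\]
and the bounds constrain which pairs $(A,T)$ are jointly feasible for a given capacity and given demand distributions.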
We consider the fundamental task of optimising a real-valued function defined on a potentially high-dimensional Euclidean space, such as the loss function in many machine-learning tasks or the logarithm of the probability distribution in statistical inference. We use notions from Riemannian geometry to recast the optimisation of a function on Euclidean space as optimisation on a Riemannian manifold with a warped metric, and then find the function's optimum along this manifold. The warped metric chosen for the search domain induces a computationally friendly metric tensor for which optimal search directions associated with geodesic curves on the manifold become easier to compute. Performing optimisation along geodesics is generally infeasible, yet we show that on this specific manifold we can analytically derive Taylor approximations of the geodesics up to third order. In general these approximations do not lie on the manifold, so we construct suitable retraction maps to pull them back onto it. We can therefore efficiently optimise along the approximate geodesic curves. We cover the related theory, describe a practical optimisation algorithm, and empirically evaluate it on a collection of challenging optimisation benchmarks. Our proposed algorithm, which uses third-order approximations of geodesics, tends to outperform standard Euclidean gradient-based counterparts in terms of the number of iterations until convergence.
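The general pattern of approximating a geodesic by a truncated Taylor expansion and then retracting back onto the manifold can be sketched as follows; the unit sphere stands in for the paper's warped-metric manifold (whose metric, geodesic coefficients, and retraction are not reproduced here), and only a second-order expansion is shown.

```python
# Generic illustration of "Taylor-approximate the geodesic, then retract",
# using the unit sphere as a stand-in manifold. Not the paper's algorithm.
import numpy as np

def retract(x):
    return x / np.linalg.norm(x)          # projection retraction onto the sphere

def step(x, riem_grad, t):
    v = -riem_grad                        # descent direction in the tangent space at x
    # second-order Taylor approximation of the geodesic gamma(t) on the sphere:
    # gamma(0) = x, gamma'(0) = v, gamma''(0) = -|v|^2 x
    approx = x + t * v + 0.5 * t**2 * (-np.dot(v, v) * x)
    return retract(approx)                # pull the approximation back onto the manifold
```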
Mendelian randomization uses genetic variants as instrumental variables to make causal inferences about the effects of modifiable risk factors on diseases from observational data. One of the major challenges in Mendelian randomization is that many genetic variants are only modestly or even weakly associated with the risk factor of interest, a setting known as many weak instruments. Many existing methods, such as the popular inverse-variance weighted (IVW) method, can be biased when the instruments are weak. To address this issue, the debiased IVW (dIVW) estimator, which is shown to be robust to many weak instruments, was recently proposed. However, this estimator still has non-negligible bias when the effective sample size is small. In this paper, we propose a modified debiased IVW (mdIVW) estimator obtained by multiplying the original dIVW estimator by a modification factor. After this simple correction, we show that the bias of the mdIVW estimator converges to zero at a faster rate than that of the dIVW estimator under some regularity conditions. Moreover, the mdIVW estimator has smaller variance than the dIVW estimator. We further extend the proposed method to account for instrumental variable selection and balanced horizontal pleiotropy. We demonstrate the improvement of the mdIVW estimator over the dIVW estimator through extensive simulation studies and real data analysis.
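In commonly used two-sample summary-data notation (with SNP-exposure estimates $\widehat\gamma_j$ having standard errors $\sigma_{Xj}$ and SNP-outcome estimates $\widehat\Gamma_j$ having standard errors $\sigma_{Yj}$), the IVW and dIVW estimators take the form
\[
\widehat\beta_{\mathrm{IVW}} \;=\; \frac{\sum_{j} \sigma_{Yj}^{-2}\,\widehat\gamma_j \widehat\Gamma_j}{\sum_{j} \sigma_{Yj}^{-2}\,\widehat\gamma_j^{2}}, \qquad \widehat\beta_{\mathrm{dIVW}} \;=\; \frac{\sum_{j} \sigma_{Yj}^{-2}\,\widehat\gamma_j \widehat\Gamma_j}{\sum_{j} \sigma_{Yj}^{-2}\big(\widehat\gamma_j^{2} - \sigma_{Xj}^{2}\big)};
\]
the mdIVW estimator multiplies the latter by a modification factor whose exact form is given in the paper.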