We consider the problem of finding the smallest or largest entry of a tensor of order N that is specified via its rank decomposition. Stated differently, we are given N sets of R-dimensional vectors and we wish to select one vector from each set such that the sum of the elements of the Hadamard product of the selected vectors is minimized or maximized. We show that this fundamental tensor problem is NP-hard for any tensor rank higher than one, and polynomial-time solvable in the rank-one case. We also propose a continuous relaxation and prove that it is tight for any rank. For low-enough ranks, the proposed continuous reformulation is amenable to low-complexity gradient-based optimization, and we propose a suite of gradient-based optimization algorithms drawing from projected gradient descent, Frank-Wolfe, and explicit parametrization of the relaxed constraints. We also show that our core results remain valid no matter what kind of polyadic tensor model is used to represent the tensor of interest, including Tucker, HOSVD/MLSVD, tensor train, or tensor ring. Next, we consider the class of problems that can be posed as special instances of the problem of interest. We show that this class includes the partition problem (and thus all NP-complete problems via polynomial-time transformation), integer least squares, integer linear programming, integer quadratic programming, sign retrieval (a special kind of mixed integer programming / restricted version of phase retrieval), and maximum likelihood decoding of parity check codes. We demonstrate promising experimental results on a number of hard problems, including state-of-the-art performance in decoding low-density parity check codes and general parity check codes.
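As a concrete illustration of the objective only (not the paper's algorithm), the sketch below evaluates an entry of a rank-R CPD tensor as the sum of the elements of the Hadamard product of one selected row per factor matrix, and finds the smallest entry by brute force on a tiny instance; the factor matrices and sizes are arbitrary placeholders.

    import itertools
    import numpy as np

    def cpd_entry(factors, index):
        """Entry of the rank-R tensor at `index`: sum of the elements of the
        Hadamard product of one selected row per factor matrix."""
        rows = [A[i] for A, i in zip(factors, index)]   # each row has length R
        return np.prod(rows, axis=0).sum()

    def smallest_entry_bruteforce(factors):
        """Exhaustive search over all index tuples; only feasible for tiny
        tensors, since the problem is NP-hard for rank R > 1."""
        sizes = [A.shape[0] for A in factors]
        return min(itertools.product(*map(range, sizes)),
                   key=lambda idx: cpd_entry(factors, idx))

    rng = np.random.default_rng(0)
    factors = [rng.standard_normal((4, 3)) for _ in range(3)]  # N = 3, R = 3
    idx = smallest_entry_bruteforce(factors)
    print(idx, cpd_entry(factors, idx))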
In this paper we present a new high-order semi-implicit DG scheme on two-dimensional staggered triangular meshes, applied to different nonlinear systems of hyperbolic conservation laws such as advection-diffusion models, the incompressible Navier-Stokes equations and natural convection problems. While the temperature and pressure fields are defined on a triangular main grid, the velocity field is defined on a quadrilateral edge-based staggered mesh. A semi-implicit time discretization is proposed, which separates slow and fast time scales by treating them explicitly and implicitly, respectively. The nonlinear convection terms are evolved explicitly using a semi-Lagrangian approach, whereas an implicit discretization is used for the diffusion terms and the pressure contribution. High order of accuracy in time is achieved using a new flexible and general framework of IMplicit-EXplicit (IMEX) Runge-Kutta schemes specifically designed to operate with semi-Lagrangian methods. To improve the efficiency of the computation of the DG divergence operator and the mass matrix, we propose to approximate the numerical solution with a less regular polynomial space on the edge-based mesh, defined on the two sub-triangles that split each staggered quadrilateral element. Owing to the implicit treatment of the fast-scale terms, the resulting numerical scheme is unconditionally stable for the considered governing equations. In contrast to a genuinely space-time discontinuous Galerkin scheme, the IMEX discretization preserves the symmetry and positive semi-definiteness of the resulting linear system for the pressure, which can be solved with the aid of an efficient matrix-free implementation of the conjugate gradient method. We present several convergence results, including nonlinear transport and density currents, up to third order of accuracy in both space and time.
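The following toy sketch illustrates only the semi-implicit splitting idea on a 1-D periodic advection-diffusion equation (explicit upwind convection, implicit backward-Euler diffusion); it is not the staggered DG/IMEX scheme of the paper, and all parameters are illustrative.

    import numpy as np
    from scipy.sparse import identity, diags
    from scipy.sparse.linalg import spsolve

    # Toy 1-D advection-diffusion u_t + a u_x = nu u_xx on a periodic grid:
    # slow convection treated explicitly, fast diffusion treated implicitly.
    N, a, nu = 200, 1.0, 0.1
    x = np.linspace(0.0, 1.0, N, endpoint=False)
    dx = x[1] - x[0]
    dt = 0.5 * dx / a                      # convective CFL only; diffusion is implicit
    u = np.exp(-200 * (x - 0.5) ** 2)

    lap = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(N, N)).tolil()
    lap[0, N - 1] = 1.0                    # periodic wrap-around
    lap[N - 1, 0] = 1.0
    A_impl = (identity(N) - dt * nu / dx**2 * lap.tocsr()).tocsc()

    for _ in range(200):
        u_star = u - dt * a * (u - np.roll(u, 1)) / dx   # explicit upwind convection
        u = spsolve(A_impl, u_star)                      # implicit diffusion step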
We propose a hybrid iterative method for PDEs based on MIONet, which combines a traditional numerical iterative solver with the recent powerful machine-learning method of neural operators, and we systematically analyze its theoretical properties, including the convergence condition, the spectral behavior, and the convergence rate, in terms of the errors of the discretization and the model inference. We establish theoretical results for the frequently used smoothers, i.e., Richardson (damped Jacobi) and Gauss-Seidel. We give an upper bound on the convergence rate of the hybrid method with respect to the model-correction period, which indicates the period that makes the hybrid iteration converge fastest. Several numerical examples, including the hybrid Richardson (Gauss-Seidel) iteration for the 1-d (2-d) Poisson equation, are presented to verify the theoretical results and demonstrate an excellent acceleration effect. As a meshless acceleration method, it has considerable potential for practical applications.
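A minimal sketch of the hybrid iteration, assuming a generic `model` that maps the current residual to an approximate error (here an exact-solve stand-in for a trained MIONet) and applying it every K Richardson (damped Jacobi) sweeps on a 1-D Poisson system; names and parameter choices are illustrative.

    import numpy as np

    def hybrid_richardson(A, b, model, K=10, omega=None, n_iter=200):
        """Richardson (damped Jacobi) sweeps, with a learned correction applied
        every K iterations. `model(residual)` is a stand-in for a trained
        neural operator that maps the residual to an approximate error."""
        if omega is None:
            omega = 1.0 / np.abs(np.diag(A)).max()     # simple damping choice
        u = np.zeros_like(b)
        for k in range(1, n_iter + 1):
            r = b - A @ u
            u = u + omega * r                          # smoother step
            if k % K == 0:
                u = u + model(r)                       # periodic model correction
        return u

    # 1-D Poisson: -u'' = f on (0,1), homogeneous Dirichlet BCs, n interior nodes.
    n = 127
    h = 1.0 / (n + 1)
    A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2
    b = np.ones(n)
    # A perfect "oracle" model solves the residual equation exactly; a trained
    # MIONet would only approximate this map.
    u = hybrid_richardson(A, b, model=lambda r: np.linalg.solve(A, r))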
Many products in engineering are highly reliable, with large mean lifetimes to failure. Performing life tests under normal operating conditions would thus require long experimentation times and high experimentation costs. Alternatively, accelerated life tests shorten the experimentation time by running the tests at higher-than-normal stress conditions, thereby inducing more failures. Additionally, a log-linear regression model can be used to relate the lifetime distribution of the product to the level of stress it experiences. After estimating the parameters of this relationship, results can be extrapolated to normal operating conditions. On the other hand, censored data are common in reliability analysis. Interval-censored data arise when continuous inspection is difficult or infeasible due to technical or budgetary constraints. In this paper, we develop robust restricted estimators based on the density power divergence for step-stress accelerated life tests under Weibull distributions with interval-censored data. We present theoretical asymptotic properties of the estimators and develop robust Rao-type test statistics, based on the proposed robust estimators, for testing composite null hypotheses on the model parameters.
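A rough sketch of a (non-restricted) minimum density power divergence fit for interval-censored data under an assumed Weibull model with a single constant stress level; the step-stress log-linear link, the restrictions, and the Rao-type tests developed in the paper are omitted, and all names and numbers are illustrative.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import weibull_min

    def dpd_objective(theta, counts, edges, beta=0.3):
        """Density power divergence (tuning parameter beta > 0) between the
        empirical interval proportions and Weibull cell probabilities.
        theta = (log shape, log scale); the last cell collects survivors."""
        shape, scale = np.exp(theta)
        cdf = weibull_min.cdf(edges, c=shape, scale=scale)
        p = np.append(np.diff(cdf), 1.0 - cdf[-1])          # cell probabilities
        p = np.clip(p, 1e-12, None)
        p_hat = counts / counts.sum()
        return np.sum(p**(1 + beta) - (1 + 1 / beta) * p_hat * p**beta)

    # Inspection times bounding the intervals; counts of failures per interval,
    # with the last entry counting devices still working at the final inspection.
    edges = np.array([0.0, 5.0, 10.0, 15.0, 20.0])
    counts = np.array([3, 7, 12, 9, 19], dtype=float)
    fit = minimize(dpd_objective, x0=np.zeros(2), args=(counts, edges),
                   method="Nelder-Mead")
    shape_hat, scale_hat = np.exp(fit.x)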
Accelerated life tests (ALTs) play a crucial role in reliability analyses, providing lifetime estimates of highly reliable products. Among ALTs, the step-stress design increases the stress level at predefined times, while maintaining a constant stress level between successive changes. This approach accelerates the occurrence of failures, reducing experimental duration and cost. While many studies assume a specific form for the lifetime distribution, in certain applications a general form satisfying certain properties should instead be preferred. The proportional hazards model assumes that the applied stresses act multiplicatively on the hazard rate, so the hazard function may be divided into two factors, one representing the effect of the stress and the other representing the baseline hazard. In this work we examine two particular forms of the baseline hazard, namely linear and quadratic. Moreover, certain experiments may face practical constraints that make continuous monitoring of devices infeasible. Instead, devices under test are inspected at predetermined intervals, leading to interval-censored data. On the other hand, recent works have shown an appealing trade-off between the efficiency and robustness of divergence-based estimators. This paper introduces the step-stress ALT model under proportional hazards and presents a robust family of minimum density power divergence estimators (MDPDEs) for estimating device reliability and related lifetime characteristics such as the mean lifetime and distributional quantiles. The asymptotic distributions of these estimators are derived, providing approximate confidence intervals. Empirical evaluations through Monte Carlo simulations demonstrate their performance in terms of robustness and efficiency. Finally, an illustrative example is provided to demonstrate the usefulness of the model and the associated methods developed.
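The sketch below shows one way to compute interval-failure probabilities under a proportional hazards step-stress model with a linear baseline hazard, assuming a multiplicative stress effect exp(gamma * x); the parameterization and all numbers are illustrative rather than the paper's exact formulation, and no MDPDE fitting is shown.

    import numpy as np

    def cum_baseline(t, a, b):
        """Cumulative baseline hazard for a linear baseline h0(t) = a + b*t."""
        return a * t + 0.5 * b * t**2

    def cum_hazard(t, a, b, gamma, stress_levels, change_times):
        """Proportional-hazards cumulative hazard under a step-stress pattern:
        the stress acts multiplicatively, as exp(gamma * x), on the baseline."""
        knots = np.concatenate(([0.0], change_times, [np.inf]))
        H = 0.0
        for x, lo, hi in zip(stress_levels, knots[:-1], knots[1:]):
            seg_end = np.clip(t, lo, hi)      # portion of [0, t] spent at stress x
            H += np.exp(gamma * x) * (cum_baseline(seg_end, a, b) - cum_baseline(lo, a, b))
        return H

    def interval_probs(inspect_times, a, b, gamma, stress_levels, change_times):
        """P(failure in each inspection interval) plus the survivor cell."""
        S = np.exp(-np.array([cum_hazard(t, a, b, gamma, stress_levels, change_times)
                              for t in inspect_times]))
        S = np.concatenate(([1.0], S))
        return np.append(-np.diff(S), S[-1])

    # Two stress levels with a change at time 10, inspections every 5 time units.
    p = interval_probs(inspect_times=[5, 10, 15, 20], a=0.01, b=0.002, gamma=0.8,
                       stress_levels=[1.0, 2.0], change_times=[10.0])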
Based on Holler (1982) and Armijos-Toro et al. (2021), we propose two power indices to measure the influence of the players in the class of weighted majority games in partition function form. We compare these new power indices with their original versions on the class of games in characteristic function form. Finally, we use both pairs of power indices for games in partition function form to study the distribution of power in the National Assembly of Ecuador that emerged after the elections of February 7, 2021.
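For reference, a small sketch of the classical Holler (1982) public good index for a weighted majority game in characteristic function form (counting minimal winning coalitions per player); the partition-function-form extensions proposed here would additionally take the organization of the outside players into account.

    from itertools import combinations

    def holler_index(weights, quota):
        """Public good index of Holler (1982) for a weighted majority game:
        count, for each player, the minimal winning coalitions containing it,
        then normalize the counts so that they sum to one."""
        n = len(weights)
        players = range(n)

        def wins(coal):
            return sum(weights[i] for i in coal) >= quota

        def minimal(coal):
            # Winning, and removing any single player makes it losing
            # (enough to check, since the game is monotone).
            return wins(coal) and all(not wins(coal - {i}) for i in coal)

        counts = [0] * n
        for size in range(1, n + 1):
            for coal in combinations(players, size):
                if minimal(frozenset(coal)):
                    for i in coal:
                        counts[i] += 1
        total = sum(counts)
        return [c / total for c in counts]

    # Toy example: quota of 51 out of 100 votes.
    print(holler_index(weights=[40, 30, 20, 10], quota=51))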
A new shock-tracking technique that avoids re-meshing the computational grid around the moving shock front was recently proposed by the authors (Ciallella et al., 2020). The method combines the unstructured shock-fitting approach (Paciorri and Bonfiglioli, 2009), developed over the last decade by some of the authors, with ideas coming from embedded boundary methods. In particular, second-order extrapolations based on Taylor series expansions are employed to transfer the solution and retain high order of accuracy. This paper describes the basic idea behind the new method and further algorithmic improvements that make the extrapolated Discontinuity Tracking Technique (eDIT) capable of dealing with complex shock topologies featuring shock-shock and shock-wall interactions occurring in steady problems. This method paves the way to a new class of shock-tracking techniques truly independent of the mesh structure and flow solver. Various test cases are included to prove the potential of the method, demonstrate the key features of the methodology, and thoroughly evaluate several technical aspects related to the extrapolation from/onto the shock and their impact on accuracy and conservation.
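A 1-D toy of the Taylor-series extrapolation ingredient only: the solution is transferred from nearby grid nodes to the shock position using a value plus a one-sided slope, which is second-order accurate; the actual eDIT method operates on unstructured 2-D grids with separate upstream and downstream states.

    import numpy as np

    def extrapolate_to_shock(x_nodes, u_nodes, x_shock):
        """Second-order Taylor extrapolation of a smooth one-sided state to the
        shock position: u(x_s) ~ u(x_i) + u'(x_i) (x_s - x_i), with the slope
        estimated by a one-sided finite difference on the same side of the shock."""
        i = len(x_nodes) - 1                        # closest node on this side
        slope = (u_nodes[i] - u_nodes[i - 1]) / (x_nodes[i] - x_nodes[i - 1])
        return u_nodes[i] + slope * (x_shock - x_nodes[i])

    # Upstream nodes of a smooth field; the shock sits just beyond the last node.
    x = np.array([0.0, 0.1, 0.2, 0.3])
    u = 1.0 + 0.5 * x                                # linear, so extrapolation is exact
    print(extrapolate_to_shock(x, u, x_shock=0.34))  # -> 1.17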
We provide a new theoretical framework for the variable-step deferred correction (DC) methods based on the well-known BDF2 formula. By using the discrete orthogonal convolution kernels, some high-order BDF2-DC methods are proven to be stable on arbitrary time grids according to the recent definition of stability (SINUM, 60: 2253-2272). This significantly relaxes the existing step-ratio restrictions for the BDF2-DC methods (BIT, 62: 1789-1822). The associated sharp error estimates are established by taking the numerical effects of the starting approximations into account, and they show that the BDF2-DC methods have no aftereffect, that is, the lower-order starting scheme for the BDF2 scheme does not cause a loss of accuracy in the high-order BDF2-DC methods. Extensive tests on graded and random time meshes are presented to support the new theory.
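For reference, a sketch of the variable-step BDF2 formula that the DC corrections build on, applied to the linear test equation y' = lam*y on a random time grid with a backward-Euler starting step; the deferred-correction stages and the DOC-kernel analysis are not reproduced here.

    import numpy as np

    def bdf2_variable_step(lam, y0, t):
        """Variable-step BDF2 for y' = lam*y on the (possibly nonuniform) grid t.
        With r_k = tau_k / tau_{k-1}, the implicit step reads
        (1+2r)/(1+r) y_k - (1+r) y_{k-1} + r^2/(1+r) y_{k-2} = tau_k * lam * y_k."""
        y = np.empty(len(t))
        y[0] = y0
        tau1 = t[1] - t[0]
        y[1] = y[0] / (1.0 - tau1 * lam)            # backward Euler starting step
        for k in range(2, len(t)):
            tau = t[k] - t[k - 1]
            r = tau / (t[k - 1] - t[k - 2])
            rhs = (1 + r) * y[k - 1] - r**2 / (1 + r) * y[k - 2]
            y[k] = rhs / ((1 + 2 * r) / (1 + r) - tau * lam)
        return y

    # Random positive step sizes, stiff decay lam = -50; compare with exp(lam*t).
    rng = np.random.default_rng(1)
    t = np.concatenate(([0.0], np.cumsum(rng.uniform(0.005, 0.02, size=100))))
    y = bdf2_variable_step(-50.0, 1.0, t)
    print(np.max(np.abs(y - np.exp(-50.0 * t))))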
Current high-throughput technologies provide a large number of variables to describe a phenomenon, while only a few of them are generally sufficient to answer the question at hand. Identifying them in a high-dimensional Gaussian linear regression model is one of the most widely used statistical approaches. In this article, we describe step by step the variable selection procedures built upon regularization paths. Regularization paths are obtained by combining a regularization function and an algorithm, and are then combined either with a model selection procedure using penalty functions or with a sampling strategy to obtain the final selected variables. We perform a comparison study by considering three simulation settings with various dependency structures on the variables. In all the settings, we evaluate (i) the ability to discriminate between the active and non-active variables along the regularization path (pROC-AUC), (ii) the prediction performance of the selected variable subset (MSE) and (iii) the relevance of the selected variables (recall, specificity, FDR). From the results, we provide recommendations on the strategies to be favored depending on the characteristics of the problem at hand. We find that the Elastic-net regularization function provides better results than the $\ell_1$ one most of the time, and that the lars algorithm should be preferred over the GD one. ESCV provides the best prediction performance. Bolasso and the knockoffs method are judicious choices to limit the selection of non-active variables while ensuring the selection of enough active variables. Conversely, the data-driven penalties considered in this review are not to be favored; as for Tigress and LinSelect, they are conservative methods.
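A short sketch of one building block of such a study: an elastic-net regularization path computed with scikit-learn on simulated sparse data, scored by how well coefficient magnitudes along the path separate active from non-active variables (an ROC-AUC proxy for the pROC-AUC criterion above); the sampling-based strategies (ESCV, Bolasso, knockoffs) are not shown.

    import numpy as np
    from sklearn.linear_model import enet_path
    from sklearn.metrics import roc_auc_score

    # Simulated sparse Gaussian linear model: 10 active variables out of 200.
    rng = np.random.default_rng(0)
    n, p, s = 100, 200, 10
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:s] = 2.0
    y = X @ beta + rng.standard_normal(n)

    # Elastic-net regularization path (coefs has shape (p, n_alphas)).
    alphas, coefs, _ = enet_path(X, y, l1_ratio=0.5, n_alphas=50)

    # Ability to separate active from non-active variables along the path:
    # score each variable by the largest |coefficient| it reaches on the path.
    active = (beta != 0).astype(int)
    score = np.max(np.abs(coefs), axis=1)
    print("path ROC-AUC:", roc_auc_score(active, score))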
Understanding the mechanisms through which neural networks extract statistics from input-label pairs is one of the most important unsolved problems in supervised learning. Prior works have identified that the Gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, a statement known as the Neural Feature Ansatz (NFA). However, the reason these quantities become correlated during training is poorly understood. In this work, we explain the emergence of this correlation. We identify that the NFA is equivalent to alignment between the left singular structure of the weight matrices and a significant component of the empirical neural tangent kernels associated with those weights. We establish that the NFA introduced in prior works is driven by a centered NFA that isolates this alignment. We show that the speed of NFA development can be predicted analytically at early training times in terms of simple statistics of the inputs and labels. Finally, we introduce a simple intervention to increase NFA correlation at any given layer, which dramatically improves the quality of the learned features.
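A small PyTorch sketch of the quantity in question: the correlation between the first-layer weight Gram matrix and the average gradient outer product (AGOP) of a toy MLP with respect to that layer's input; here it is evaluated at initialization purely for illustration, whereas the NFA concerns its growth during training, and all names are illustrative.

    import torch

    torch.manual_seed(0)
    d, h, n = 20, 64, 256
    layer1 = torch.nn.Linear(d, h, bias=False)
    layer2 = torch.nn.Linear(h, 1, bias=False)

    def f(x):                              # toy two-layer MLP with a scalar output
        return layer2(torch.relu(layer1(x))).squeeze(-1)

    # Average gradient outer product (AGOP) of f w.r.t. the first layer's input.
    X = torch.randn(n, d, requires_grad=True)
    grads = torch.autograd.grad(f(X).sum(), X)[0]        # rows are grad_x f(x_i)
    agop = (grads.T @ grads / n).detach()

    # Gram matrix of the first-layer weights; the NFA says these two matrices
    # become (approximately) proportional over the course of training.
    gram = (layer1.weight.T @ layer1.weight).detach()
    corr = torch.corrcoef(torch.stack([gram.flatten(), agop.flatten()]))[0, 1]
    print("NFA correlation:", float(corr))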
Training a deep architecture using a ranking loss has become standard for the person re-identification task. Increasingly, these deep architectures include additional components that leverage part detections, attribute predictions, pose estimators and other auxiliary information, in order to more effectively localize and align discriminative image regions. In this paper we adopt a different approach and carefully design each component of a simple deep architecture and, critically, the strategy for training it effectively for person re-identification. We extensively evaluate each design choice, leading to a list of good practices for person re-identification. By following these practices, our approach outperforms the state of the art, including more complex methods with auxiliary components, by large margins on four benchmark datasets. We also provide a qualitative analysis of our trained representation which indicates that, while compact, it is able to capture information from localized and discriminative regions, in a manner akin to an implicit attention mechanism.
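A minimal sketch of the ranking (triplet) loss referred to above, computed on placeholder embeddings with a fixed margin; the backbone architecture, triplet mining strategy, and the specific training practices of the paper are not reproduced.

    import torch
    import torch.nn.functional as F

    def triplet_ranking_loss(anchor, positive, negative, margin=0.3):
        """Push the anchor-positive distance below the anchor-negative distance
        by at least `margin`, as in standard triplet training for re-ID."""
        d_ap = F.pairwise_distance(anchor, positive)
        d_an = F.pairwise_distance(anchor, negative)
        return F.relu(d_ap - d_an + margin).mean()

    # Toy embeddings standing in for a backbone's output; in practice the
    # (anchor, positive, negative) triplets are mined within each batch.
    emb = torch.randn(3, 8, 128, requires_grad=True)   # triplet role x batch x dim
    loss = triplet_ranking_loss(emb[0], emb[1], emb[2])
    loss.backward()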