We present a rigorous theoretical analysis of the convergence rate of the deep mixed residual method (MIM) applied to a linear elliptic equation with various types of boundary conditions. MIM has been proposed as a more effective numerical approximation than the deep Galerkin method (DGM) and the deep Ritz method (DRM) in various settings. Our analysis shows that MIM outperforms DRM and the deep Galerkin method for weak solutions (DGMW) in the Dirichlet case due to its ability to enforce the boundary condition. For the Neumann and Robin cases, however, MIM performs similarly to the other methods. Our results provide valuable insights into the strengths of MIM and its comparative performance in solving linear elliptic equations with different boundary conditions.
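As a minimal sketch of the mixed residual idea (not the paper's exact formulation), consider the 1D Poisson problem -u'' = f with homogeneous Dirichlet conditions: MIM introduces an auxiliary flux variable p approximating u' and minimizes the residuals of the resulting first-order system. Network sizes, sampling, and the boundary penalty below are illustrative choices.

```python
import torch

# Hypothetical 1D problem: -u''(x) = f(x) on (0, 1), u(0) = u(1) = 0,
# rewritten as the first-order system p = u', -p' = f (the MIM idea).
u_net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
p_net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))

def f(x):
    return (torch.pi ** 2) * torch.sin(torch.pi * x)  # exact solution: sin(pi x)

def mim_loss(x):
    x = x.requires_grad_(True)
    u, p = u_net(x), p_net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    dp = torch.autograd.grad(p.sum(), x, create_graph=True)[0]
    residual = ((p - du) ** 2).mean() + ((-dp - f(x)) ** 2).mean()
    # Penalty form of the Dirichlet condition; MIM can instead enforce it
    # exactly, e.g., via the ansatz u(x) = x * (1 - x) * net(x).
    boundary = (u_net(torch.tensor([[0.0], [1.0]])) ** 2).mean()
    return residual + boundary

params = list(u_net.parameters()) + list(p_net.parameters())
opt = torch.optim.Adam(params, lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    mim_loss(torch.rand(128, 1)).backward()
    opt.step()
```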
This paper considers the problem of learning a single ReLU neuron with squared loss (a.k.a. ReLU regression) in the overparameterized regime, where the input dimension can exceed the number of samples. We analyze a Perceptron-type algorithm called GLM-tron (Kakade et al., 2011) and provide dimension-free risk upper bounds for it in high-dimensional ReLU regression, in both well-specified and misspecified settings. Our risk bounds recover several existing results as special cases. Moreover, in the well-specified setting, we provide an instance-wise matching risk lower bound for GLM-tron. Together, our upper and lower risk bounds give a sharp characterization of the high-dimensional ReLU regression problems that can be learned via GLM-tron. On the other hand, we provide negative results for stochastic gradient descent (SGD) on ReLU regression with symmetric Bernoulli data: if the model is well-specified, the excess risk of SGD is provably no better than that of GLM-tron, up to constant factors, for each problem instance; and in the noiseless case, GLM-tron achieves a small risk while SGD unavoidably suffers a constant risk in expectation. These results together suggest that GLM-tron might be preferable to SGD for high-dimensional ReLU regression.
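For concreteness, a minimal NumPy sketch of the GLM-tron update is given below; the step size, iteration count, and toy data are illustrative, not the paper's settings. The key contrast with SGD on the squared loss is that the update omits the ReLU derivative, so it does not vanish on inactive neurons.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def glmtron(X, y, steps=200, lr=0.5):
    """Perceptron-style updates for ReLU regression (Kakade et al., 2011)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        # SGD's gradient would carry an extra relu'(X @ w) factor here.
        w += lr * (y - relu(X @ w)) @ X / n
    return w

# Toy well-specified instance: y = relu(<w*, x>) plus small noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
w_star = rng.standard_normal(50) / np.sqrt(50)
y = relu(X @ w_star) + 0.01 * rng.standard_normal(200)
w_hat = glmtron(X, y)
print(np.mean((relu(X @ w_hat) - y) ** 2))
```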
Modern automotive functions are controlled by a large number of small computers called electronic control units (ECUs). These functions range from safety-critical autonomous driving to comfort and infotainment. ECUs communicate with one another over multiple internal networks using different technologies. Some, such as the Controller Area Network (CAN), are very simple and provide minimal or no security services. Machine learning techniques can be used to detect anomalous activity on such networks, but these techniques must themselves be robust to adversarial attacks. In this paper, we investigate adversarial-sample vulnerabilities in four machine learning-based intrusion detection systems for automotive networks. We show that adversarial samples negatively impact three of the four studied solutions. Furthermore, we analyze the transferability of adversarial samples between the different systems, and we investigate detection performance and attack success rate after including adversarial samples in the training data. Based on these results, we discuss whether current solutions are mature enough for use in modern vehicles.
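The abstract does not spell out how the adversarial samples are crafted; as a generic, hedged illustration of the kind of attack involved, a single fast-gradient-sign (FGSM) step against a differentiable intrusion detection model might look like this (the model, loss, and epsilon are assumptions, not the paper's setup):

```python
import torch

def fgsm(model, x, y, eps=0.05):
    """One FGSM step: perturb inputs in the gradient-sign direction.

    `model` is any differentiable classifier over network-traffic features;
    the perturbation budget eps is an illustrative choice.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).detach()
```

In practice, attacks on CAN traffic must additionally respect protocol constraints (e.g., valid identifiers and timing), which a raw feature-space perturbation like this ignores.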
Recently, the Meta-Auto-Decoder (MAD) was proposed as a novel reduced order model (ROM) for solving parametric partial differential equations (PDEs); the best possible performance of this method can be quantified by the decoder width. This paper provides a theoretical analysis of the decoder width. We examine the solution sets of several parametric PDEs and estimate upper bounds on the corresponding decoder widths. In addition to elliptic and parabolic equations on a fixed domain, we investigate advection equations, which present challenges for classical linear ROMs, as well as elliptic equations with the shape of the computational domain as a variable PDE parameter. The resulting fast decay rates of the decoder widths indicate the promising potential of MAD for these problems.
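As a rough sketch of the auto-decoder mechanism that the decoder width quantifies (architecture, sizes, and optimizer settings below are illustrative assumptions, not the paper's), a shared decoder maps a low-dimensional latent code to a discretized solution, and a new PDE parameter instance is handled by optimizing its latent code alone:

```python
import torch

# Illustrative decoder: latent code of size 8 -> solution sampled at 256 points.
decoder = torch.nn.Sequential(
    torch.nn.Linear(8, 64), torch.nn.Tanh(), torch.nn.Linear(64, 256)
)

def fit_latent(u_obs, steps=500, lr=1e-2):
    """Fit only the latent code of a (pre-trained) decoder to a new instance."""
    z = torch.zeros(8, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((decoder(z) - u_obs) ** 2).mean()
        loss.backward()
        opt.step()
    return z.detach()
```

The decoder width then measures, roughly, the best error achievable by such a decoder over the whole solution set, in analogy with the Kolmogorov n-width for linear ROMs.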
Tensor train decomposition is widely used in machine learning and quantum physics due to its concise representation of high-dimensional tensors, which overcomes the curse of dimensionality. Cross approximation, originally developed to represent a matrix from a set of selected rows and columns, is an efficient method for constructing a tensor train decomposition from a small number of the tensor's entries. While tensor train cross approximation has achieved remarkable performance in practical applications, its theoretical analysis, in particular regarding the approximation error, has so far been lacking. To our knowledge, existing results provide only element-wise accuracy guarantees, which lead to a very loose bound when extended to the entire tensor. In this paper, we bridge this gap by providing accuracy guarantees in terms of the entire tensor for both exact and noisy measurements. Our results illustrate how the choice of selected subtensors affects the quality of the cross approximation, and show that the approximation error caused by model error and/or measurement error need not grow exponentially with the order of the tensor. These results are verified by numerical experiments and may have important implications for the usefulness of cross approximations for high-order tensors, such as those encountered in the description of quantum many-body states.
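The matrix case underlying these results can be stated in a few lines: given row indices I and column indices J, the cross (skeleton) approximation is A ≈ A[:, J] (A[I, J])^{-1} A[I, :], which is exact for a rank-r matrix whenever the r x r intersection block is nonsingular. A small NumPy check follows (random index choice here is for illustration; maxvol-type selection is what controls the error in practice):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 200))  # rank 5
I = rng.choice(200, size=5, replace=False)  # selected rows
J = rng.choice(200, size=5, replace=False)  # selected columns
# Skeleton approximation from the selected crosses.
A_cross = A[:, J] @ np.linalg.solve(A[np.ix_(I, J)], A[I, :])
# Near machine precision when A[I, J] is well-conditioned.
print(np.linalg.norm(A - A_cross) / np.linalg.norm(A))
```

TT-cross applies this construction recursively along the tensor modes, which is where the error propagation analyzed in the paper arises.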
This paper proposes novel computational multiscale methods for linear second-order elliptic partial differential equations in nondivergence form with heterogeneous coefficients satisfying a Cordes condition. The construction follows the methodology of localized orthogonal decomposition (LOD) and provides operator-adapted coarse spaces by solving localized cell problems on a fine scale, in the spirit of numerical homogenization. The degrees of freedom of the coarse spaces are related to nonconforming and mixed finite element methods for homogeneous problems. A rigorous error analysis of one exemplary approach shows that the favorable properties of the LOD methodology known from divergence-form PDEs, namely its applicability and accuracy beyond scale separation and periodicity, remain valid for problems in nondivergence form.
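For reference, the Cordes condition mentioned above is commonly stated as follows (this is the standard form from the nondivergence-form literature; the paper's precise assumptions may differ in detail): there exists $\varepsilon \in (0,1]$ such that

\[
\frac{|A(x)|^2}{(\operatorname{tr} A(x))^2} \le \frac{1}{n-1+\varepsilon}
\quad \text{for a.e. } x \in \Omega,
\]

where $|A|$ denotes the Frobenius norm of the coefficient matrix and $n$ the space dimension; the condition guarantees well-posedness of the nondivergence-form problem in $H^2$.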
Deep neural networks have received significant attention due to their simplicity and flexibility in engineering and scientific computing. In this work, we solve a class of elliptic PDEs with multiple scales by means of a Fourier-based mixed physics-informed neural network (FMPINN), whose solver is configured as a multi-scale DNN model. Unlike the classical PINN method, a dual (flux) variable associated with the rough coefficient of the PDE is introduced to avoid the ill-conditioning of the neural tangent kernel matrix that results from the oscillating coefficients of multi-scale PDEs. Therefore, apart from the physical conservation laws, the discrepancy between the auxiliary flux variables and the gradients scaled by the multi-scale coefficients is incorporated into the cost function, and the optimization method then yields a satisfactory solution of the PDE by minimizing the defined loss. Additionally, a novel trigonometric activation function is introduced for FMPINN, which is well suited for representing the derivatives of complex target functions. Mapping the input data through Fourier features effectively improves the capacity of deep neural networks to solve high-frequency problems. Finally, through several numerical examples of multi-scale problems in Euclidean spaces of various dimensions, we validate the efficiency and robustness of the proposed FMPINN algorithm in both low-frequency and high-frequency oscillation cases.
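The Fourier feature mapping mentioned above can be sketched in a few lines; the frequency matrix B and its scale are illustrative assumptions, not the exact embedding used by FMPINN:

```python
import numpy as np

def fourier_features(x, B):
    """Map inputs x to [cos(2*pi*B x), sin(2*pi*B x)] before the network."""
    proj = 2.0 * np.pi * x @ B.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

rng = np.random.default_rng(0)
B = 10.0 * rng.standard_normal((64, 2))  # 64 random frequencies for 2D inputs
x = rng.uniform(size=(128, 2))           # collocation points in [0, 1]^2
phi = fourier_features(x, B)             # shape (128, 128): the network input
```

Larger frequency scales in B bias the downstream network toward higher-frequency content, which is the mechanism exploited for oscillatory multi-scale solutions.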
The propagation of internal gravity waves in stratified media, such as those found in ocean basins and lakes, leads to the development of geometrical patterns called "attractors". These structures accumulate much of the wave energy and make the fluid flow highly singular. In more analytical terms, the cause of this phenomenon has been attributed to the presence of a continuous spectrum in certain nonlocal zeroth-order pseudo-differential operators. In this work, we analyze the generation of these attractors from a numerical analysis perspective. First, we propose a high-order pseudo-spectral method to solve the evolution problem (whose long-term behaviour is known not to be square-integrable). Then, we use similar tools to discretize the corresponding eigenvalue problem. Since the eigenvalues are embedded in a continuous spectrum, we compute them using viscous approximations. Finally, we explore the effect that the embedded eigenmodes have on the long-term evolution of the system.
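A minimal sketch of the viscous-approximation step (the operators below are generic placeholders, not the paper's pseudo-differential operators): eigenvalues embedded in the continuous spectrum of a discretized operator L are approached through the eigenvalues of L + nu * D, with a dissipative term D and a decreasing sequence of viscosities nu.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigs

n = 400
# Placeholder operator with an (asymptotically) continuous spectrum on [-1, 1].
L = diags([np.linspace(-1.0, 1.0, n)], [0], format="csc")
# Dissipative regularizer: a standard second-difference (Laplacian) stencil.
D = diags([np.ones(n - 1), -2.0 * np.ones(n), np.ones(n - 1)], [-1, 0, 1], format="csc")
for nu in [1e-1, 1e-2, 1e-3]:
    vals = eigs(L + nu * D, k=4, which="SR", return_eigenvectors=False)
    print(nu, np.sort(vals.real))  # track candidate eigenvalues as nu -> 0
```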
The focus of precision medicine is decision support, often in the form of dynamic treatment regimes (DTRs): sequences of decision rules. At each decision point, the decision rules determine the next treatment according to the patient's baseline characteristics, the information on treatments and responses accrued by that point, and the patient's current health status, including symptom severity and other measures. However, DTR estimation with ordinal outcomes is rarely studied, and rarer still in the context of interference, where one patient's treatment may affect another's outcome. In this paper, we propose the weighted proportional odds model (WPOM): a regression-based, doubly robust approach to single-stage DTR estimation for ordinal outcomes. The method also accounts for possible interference between individuals sharing a household through covariate balancing weights derived from joint propensity scores. Examining different types of balancing weights, we verify the double robustness of WPOM with our adjusted weights via simulation studies. We further extend WPOM to multi-stage DTR estimation with household interference. Lastly, we apply the proposed methodology to longitudinal survey data from the Population Assessment of Tobacco and Health study, which motivates this work.
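As a hedged illustration of the weighting ingredient only (WPOM builds its weights from joint household propensity scores and fits a weighted proportional odds model, which this toy snippet does not reproduce), standard inverse-probability-of-treatment weights from an estimated propensity score look like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))                    # baseline covariates
a = rng.binomial(1, 1.0 / (1.0 + np.exp(-X[:, 0])))  # treatment indicator
ps = LogisticRegression().fit(X, a).predict_proba(X)[:, 1]
w = a / ps + (1 - a) / (1 - ps)  # balancing weights for the outcome model
```

Under household interference, the propensity model would instead be specified jointly over the treatments of household members, yielding one weight per individual given the household's treatment configuration.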
Deterministic finite automata (DFAs) are a classic tool for high-throughput matching of regular expressions, both in theory and in practice. Due to their high space consumption, extensive research has been devoted to compressed representations of DFAs that still support efficient pattern matching queries. Kumar~et~al.~[SIGCOMM 2006] introduced the \emph{delayed deterministic finite automaton} (\ddfa{}), which exploits the large redundancy between inter-state transitions in the automaton. They showed that it achieves up to two orders of magnitude compression on real-world DFAs, and their work formed the basis of numerous subsequent results. Their algorithm, as well as later algorithms based on their idea, has an inherent quadratic-time bottleneck, since every pair of states is considered when computing the optimal compression. In this work we present a simple, general framework based on locality-sensitive hashing for speeding up these algorithms, achieving sub-quadratic construction times for \ddfa{}s. We apply the framework to speed up several algorithms to near-linear time and experimentally evaluate their performance on real-world regular expression sets extracted from modern intrusion detection systems. We find an order of magnitude improvement in compression times, with little or no loss of compression, and in some cases significantly better compression.
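A sketch of the locality-sensitive-hashing idea in this setting (the hash count and the bucketing scheme are illustrative simplifications; a production version would use banded MinHash): represent each DFA state by its set of (symbol, target) transitions and only compare states that collide, avoiding the all-pairs bottleneck.

```python
def minhash_signature(items, num_hashes=8):
    """MinHash signature of a set: per seed, the minimum hash over elements."""
    return tuple(min(hash((seed, it)) for it in items) for seed in range(num_hashes))

def candidate_pairs(transitions, num_hashes=8):
    """transitions: dict mapping state -> set of (symbol, target) pairs."""
    buckets = {}
    for state, trans in transitions.items():
        buckets.setdefault(minhash_signature(trans, num_hashes), []).append(state)
    for group in buckets.values():  # only states with identical signatures collide
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                yield group[i], group[j]
```

States with nearly identical transition functions are exactly the ones worth connecting by default transitions in a \ddfa{}, so restricting the pairwise comparison to colliding states preserves most of the compression.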
Residual networks (ResNets) have achieved impressive results in pattern recognition and have recently garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of the network weights to a smooth function of depth as the number of layers increases. Through detailed numerical experiments, we investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth. We observe scaling regimes markedly different from those assumed in the neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation, or neither of these. These findings cast doubt on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.
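A simple numerical probe of the scaling question (the trained network and its list of residual-block weights are hypothetical here): if the weights converge to a smooth function of depth, as the neural ODE limit assumes, the norms of successive weight differences should shrink roughly like 1/L with depth L; other regimes show markedly different decay.

```python
import torch

def depth_increments(layers):
    """Norms of successive weight differences across a stack of L blocks."""
    ws = [layer.weight.detach() for layer in layers]
    return [torch.norm(ws[k + 1] - ws[k]).item() for k in range(len(ws) - 1)]

# Example with untrained placeholder blocks; in the experiments discussed above
# these would be the trained weights of ResNets of increasing depth.
blocks = [torch.nn.Linear(16, 16) for _ in range(32)]
print(depth_increments(blocks))
```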