Out-of-distribution (OOD) detection discerns OOD data where the predictor cannot make valid predictions as in-distribution (ID) data, thereby increasing the reliability of open-world classification. However, it is typically hard to collect real out-of-distribution (OOD) data for training a predictor capable of discerning ID and OOD patterns. This obstacle gives rise to data generation-based learning methods, synthesizing OOD data via data generators for predictor training without requiring any real OOD data. Related methods typically pre-train a generator on ID data and adopt various selection procedures to find those data likely to be the OOD cases. However, generated data may still coincide with ID semantics, i.e., mistaken OOD generation remains, confusing the predictor between ID and OOD data. To this end, we suggest that generated data (with mistaken OOD generation) can be used to devise an auxiliary OOD detection task to facilitate real OOD detection. Specifically, we can ensure that learning from such an auxiliary task is beneficial if the ID and the OOD parts have disjoint supports, with the help of a well-designed training procedure for the predictor. Accordingly, we propose a powerful data generation-based learning method named Auxiliary Task-based OOD Learning (ATOL) that can relieve the mistaken OOD generation. We conduct extensive experiments under various OOD detection setups, demonstrating the effectiveness of our method against its advanced counterparts.
An intelligent driving system should be capable of dynamically formulating appropriate driving strategies based on the current environment and vehicle status, while ensuring the security and reliability of the system. However, existing methods based on reinforcement learning and imitation learning suffer from low safety, poor generalization, and inefficient sampling. Additionally, they cannot accurately predict future driving trajectories, and the accurate prediction of future driving trajectories is a precondition for making optimal decisions. To solve these problems, in this paper, we introduce a Safe and Generalized end-to-end Autonomous Driving System (SGADS) for complex and various scenarios. Our SGADS incorporates variational inference with normalizing flows, enabling the intelligent vehicle to accurately predict future driving trajectories. Moreover, we propose the formulation of robust safety constraints. Furthermore, we combine reinforcement learning with demonstrations to augment search process of the agent. The experimental results demonstrate that our SGADS can significantly improve safety performance, exhibit strong generalization, and enhance the training efficiency of intelligent vehicles in complex urban scenarios compared to existing methods.
We propose a novel approach to nonlinear functional regression, called the Mapping-to-Parameter function model, which addresses complex and nonlinear functional regression problems in parameter space by employing any supervised learning technique. Central to this model is the mapping of function data from an infinite-dimensional function space to a finite-dimensional parameter space. This is accomplished by concurrently approximating multiple functions with a common set of B-spline basis functions by any chosen order, with their knot distribution determined by the Iterative Local Placement Algorithm, a newly proposed free knot placement algorithm. In contrast to the conventional equidistant knot placement strategy that uniformly distributes knot locations based on a predefined number of knots, our proposed algorithms determine knot location according to the local complexity of the input or output functions. The performance of our knot placement algorithms is shown to be robust in both single-function approximation and multiple-function approximation contexts. Furthermore, the effectiveness and advantage of the proposed prediction model in handling both function-on-scalar regression and function-on-function regression problems are demonstrated through several real data applications, in comparison with four groups of state-of-the-art methods.
A new sparse semiparametric model is proposed, which incorporates the influence of two functional random variables in a scalar response in a flexible and interpretable manner. One of the functional covariates is included through a single-index structure, while the other is included linearly through the high-dimensional vector formed by its discretised observations. For this model, two new algorithms are presented for selecting relevant variables in the linear part and estimating the model. Both procedures utilise the functional origin of linear covariates. Finite sample experiments demonstrated the scope of application of both algorithms: the first method is a fast algorithm that provides a solution (without loss in predictive ability) for the significant computational time required by standard variable selection methods for estimating this model, and the second algorithm completes the set of relevant linear covariates provided by the first, thus improving its predictive efficiency. Some asymptotic results theoretically support both procedures. A real data application demonstrated the applicability of the presented methodology from a predictive perspective in terms of the interpretability of outputs and low computational cost.
A fast and flexible $k$NN procedure is developed for dealing with a semiparametric functional regression model involving both partial-linear and single-index components. Rates of uniform consistency are presented. Simulated experiments highlight the advantages of the $k$NN procedure. A real data analysis is also shown.
The family of log-concave density functions contains various kinds of common probability distributions. Due to the shape restriction, it is possible to find the nonparametric estimate of the density, for example, the nonparametric maximum likelihood estimate (NPMLE). However, the associated uncertainty quantification of the NPMLE is less well developed. The current techniques for uncertainty quantification are Bayesian, using a Dirichlet process prior combined with the use of Markov chain Monte Carlo (MCMC) to sample from the posterior. In this paper, we start with the NPMLE and use a version of the martingale posterior distribution to establish uncertainty about the NPMLE. The algorithm can be implemented in parallel and hence is fast. We prove the convergence of the algorithm by constructing suitable submartingales. We also illustrate results with different models and settings and some real data, and compare our method with that within the literature.
We propose two graph neural network layers for graphs with features in a Riemannian manifold. First, based on a manifold-valued graph diffusion equation, we construct a diffusion layer that can be applied to an arbitrary number of nodes and graph connectivity patterns. Second, we model a tangent multilayer perceptron by transferring ideas from the vector neuron framework to our general setting. Both layers are equivariant with respect to node permutations and isometries of the feature manifold. These properties have been shown to lead to a beneficial inductive bias in many deep learning tasks. Numerical examples on synthetic data as well as on triangle meshes of the right hippocampus to classify Alzheimer's disease demonstrate the very good performance of our layers.
The problem of straggler mitigation in distributed matrix multiplication (DMM) is considered for a large number of worker nodes and a fixed small finite field. Polynomial codes and matdot codes are generalized by making use of algebraic function fields (i.e., algebraic functions over an algebraic curve) over a finite field. The construction of optimal solutions is translated to a combinatorial problem on the Weierstrass semigroups of the corresponding algebraic curves. Optimal or almost optimal solutions are provided. These have the same computational complexity per worker as classical polynomial and matdot codes, and their recovery thresholds are almost optimal in the asymptotic regime (growing number of workers and a fixed finite field).
In high-order and high-dimensional finite elements, ill-conditioned nodal distributions are often computationally cost-prohibitive. As a result, uniform distributions quickly fall apart. For tensor-product like elements, Gauss-Legendre-Lobatto (GLL) nodal distributions are often used as a substitute. Besides these, other efficient nodal distributions are difficult to create due to a desired symmetry within elements and conformity with neighboring elements. In this paper, we provide a general framework to construct symmetric, well-conditioned, cross-element compatible nodal distributions which can be used for high-order and high-dimensional finite elements. Starting from the inherent symmetries in any potential element, the framework is used to build up nodal groups in a structured and efficient manner utilizing the natural coordinates of each element, while ensuring nodes stay within the elements. By constructing constrained symmetry groups, the vertices, edges, and faces, of all elements are required to conform to their respective lower-dimensional distributions. Optimizing over these groups yields the desired optimized nodal distributions. We demonstrate the strength of this framework by creating and comparing optimized nodal distributions with GLL distributions (in elements such as the line, quadrilateral, and hexahedron), and its robustness by generating optimized nodal distributions for otherwise difficult elements (such as the triangle, tetrahedron, and triangular prism).
We present a tensor train (TT) based algorithm designed for sampling from a target distribution and employ TT approximation to capture the high-dimensional probability density evolution of overdamped Langevin dynamics. This involves utilizing the regularized Wasserstein proximal operator, which exhibits a simple kernel integration formulation, i.e., the softmax formula of the traditional proximal operator. The integration, performed in $\mathbb{R}^d$, poses a challenge in practical scenarios, making the algorithm practically implementable only with the aid of TT approximation. In the specific context of Gaussian distributions, we rigorously establish the unbiasedness and linear convergence of our sampling algorithm towards the target distribution. To assess the effectiveness of our proposed methods, we apply them to various scenarios, including Gaussian families, Gaussian mixtures, bimodal distributions, and Bayesian inverse problems in numerical examples. The sampling algorithm exhibits superior accuracy and faster convergence when compared to classical Langevin dynamics-type sampling algorithms.
Most existing works in visual question answering (VQA) are dedicated to improving the accuracy of predicted answers, while disregarding the explanations. We argue that the explanation for an answer is of the same or even more importance compared with the answer itself, since it makes the question and answering process more understandable and traceable. To this end, we propose a new task of VQA-E (VQA with Explanation), where the computational models are required to generate an explanation with the predicted answer. We first construct a new dataset, and then frame the VQA-E problem in a multi-task learning architecture. Our VQA-E dataset is automatically derived from the VQA v2 dataset by intelligently exploiting the available captions. We have conducted a user study to validate the quality of explanations synthesized by our method. We quantitatively show that the additional supervision from explanations can not only produce insightful textual sentences to justify the answers, but also improve the performance of answer prediction. Our model outperforms the state-of-the-art methods by a clear margin on the VQA v2 dataset.