We study the problem of change point (CP) detection with high dimensional time series, within the framework of frequency domain. The overarching goal is to locate all change points and for each change point, delineate which series are activated by the change, over which set of frequencies. The working assumption is that only a few series are activated per change and frequency. We solve the problem by computing a CUSUM tensor based on spectra estimated from blocks of the observed time series. A frequency-specific projection approach is applied to the CUSUM tensor for dimension reduction. The projection direction is estimated by a proposed sparse tensor decomposition algorithm. Finally, the projected CUSUM vectors across frequencies are aggregated by a sparsified wild binary segmentation for change point detection. We provide theoretical guarantees on the number of estimated change points and the convergence rate of their locations. We derive error bounds for the estimated projection direction for identifying the frequency-specific series that are activated in a change. We provide data-driven rules for the choice of parameters. We illustrate the efficacy of the proposed method by simulation and a stock returns application.
We study algorithms for online change-point detection (OCPD), where samples that are potentially heavy-tailed, are presented one at a time and a change in the underlying mean must be detected as early as possible. We present an algorithm based on clipped Stochastic Gradient Descent (SGD), that works even if we only assume that the second moment of the data generating process is bounded. We derive guarantees on worst-case, finite-sample false-positive rate (FPR) over the family of all distributions with bounded second moment. Thus, our method is the first OCPD algorithm that guarantees finite-sample FPR, even if the data is high dimensional and the underlying distributions are heavy-tailed. The technical contribution of our paper is to show that clipped-SGD can estimate the mean of a random vector and simultaneously provide confidence bounds at all confidence values. We combine this robust estimate with a union bound argument and construct a sequential change-point algorithm with finite-sample FPR guarantees. We show empirically that our algorithm works well in a variety of situations, whether the underlying data are heavy-tailed, light-tailed, high dimensional or discrete. No other algorithm achieves bounded FPR theoretically or empirically, over all settings we study simultaneously.
We introduce a novel one-stage end-to-end multi-person 2D pose estimation algorithm, known as Joint Coordinate Regression and Association (JCRA), that produces human pose joints and associations without requiring any post-processing. The proposed algorithm is fast, accurate, effective, and simple. The one-stage end-to-end network architecture significantly improves the inference speed of JCRA. Meanwhile, we devised a symmetric network structure for both the encoder and decoder, which ensures high accuracy in identifying keypoints. It follows an architecture that directly outputs part positions via a transformer network, resulting in a significant improvement in performance. Extensive experiments on the MS COCO and CrowdPose benchmarks demonstrate that JCRA outperforms state-of-the-art approaches in both accuracy and efficiency. Moreover, JCRA demonstrates 69.2 mAP and is 78\% faster at inference acceleration than previous state-of-the-art bottom-up algorithms. The code for this algorithm will be publicly available.
Most ordinary differential equation (ODE) models used to describe biological or physical systems must be solved approximately using numerical methods. Perniciously, even those solvers which seem sufficiently accurate for the forward problem, i.e., for obtaining an accurate simulation, may not be sufficiently accurate for the inverse problem, i.e., for inferring the model parameters from data. We show that for both fixed step and adaptive step ODE solvers, solving the forward problem with insufficient accuracy can distort likelihood surfaces, which may become jagged, causing inference algorithms to get stuck in local "phantom" optima. We demonstrate that biases in inference arising from numerical approximation of ODEs are potentially most severe in systems involving low noise and rapid nonlinear dynamics. We reanalyze an ODE changepoint model previously fit to the COVID-19 outbreak in Germany and show the effect of the step size on simulation and inference results. We then fit a more complicated rainfall-runoff model to hydrological data and illustrate the importance of tuning solver tolerances to avoid distorted likelihood surfaces. Our results indicate that when performing inference for ODE model parameters, adaptive step size solver tolerances must be set cautiously and likelihood surfaces should be inspected for characteristic signs of numerical issues.
In this paper, we study the partial pole assignment problem in symmetric quadratic pencil with time delay. A novel multi-step method is proposed to solve this problem, resulting in the undesired eigenvalues being moved to desired values, and the remaining eigenvalues unchanged. By establishing a new matrix equality relation and using a multi-step method, the problem is transformed into solving linear systems with low order. Specifically, assuming that there are $p$ undesired eigenvalues requiring reassigned, the size of the linear system we finally solved is $p^2$. Notably, the method demonstrates high efficiency for large systems with only a few poles requiring reassigned. Numerical examples are provided to illustrate the effectiveness of the proposed method
We study sampling problems associated with potentials that lack smoothness. The potentials can be either convex or non-convex. Departing from the standard smooth setting, the potentials are only assumed to be weakly smooth or non-smooth, or the summation of multiple such functions. We develop a sampling algorithm that resembles proximal algorithms in optimization for this challenging sampling task. Our algorithm is based on a special case of Gibbs sampling known as the alternating sampling framework (ASF). The key contribution of this work is a practical realization of the ASF based on rejection sampling for both non-convex and convex potentials that are not necessarily smooth. In almost all the cases of sampling considered in this work, our proximal sampling algorithm achieves better complexity than all existing methods.
The purpose of this work is to study an optimal control problem for a semilinear elliptic partial differential equation with a linear combination of Dirac measures as a forcing term; the control variable corresponds to the amplitude of such singular sources. We analyze the existence of optimal solutions and derive first and, necessary and sufficient, second order optimality conditions. We develop a solution technique that discretizes the state and adjoint equations with continuous piecewise linear finite elements; the control variable is already discrete. We analyze the convergence properties of discretizations and obtain, in two dimensions, an a priori error estimate for the underlying approximation of an optimal control variable.
Neuro-symbolic artificial intelligence is an emerging area that combines traditional symbolic techniques with neural networks. In this paper, we consider its application to sequential decision making under uncertainty. We introduce neuro-symbolic partially observable Markov decision processes (NS-POMDPs), which model an agent that perceives a continuous-state environment using a neural network and makes decisions symbolically, and study the problem of optimising discounted cumulative rewards. This requires functions over continuous-state beliefs, for which we propose a novel piecewise linear and convex representation (P-PWLC) in terms of polyhedra covering the continuous-state space and value vectors, and extend Bellman backups to this representation. We prove the convexity and continuity of value functions and present two value iteration algorithms that ensure finite representability by exploiting the underlying structure of the continuous-state model and the neural perception mechanism. The first is a classical (exact) value iteration algorithm extending $\alpha$-functions of Porta et al (2006) to the P-PWLC representation for continuous-state spaces. The second is a point-based (approximate) method called NS-HSVI, which uses the P-PWLC representation and belief-value induced functions to approximate value functions from below and above for two types of beliefs, particle-based and region-based. Using a prototype implementation, we show the practical applicability of our approach on two case studies that employ (trained) ReLU neural networks as perception functions, dynamic car parking and an aircraft collision avoidance system, by synthesising (approximately) optimal strategies. An experimental comparison with the finite-state POMDP solver SARSOP demonstrates that NS-HSVI is more robust to particle disturbances.
In this paper, practically computable low-order approximations of potentially high-dimensional differential equations driven by geometric rough paths are proposed and investigated. In particular, equations are studied that cover the linear setting, but we allow for a certain type of dissipative nonlinearity in the drift as well. In a first step, a linear subspace is found that contains the solution space of the underlying rough differential equation (RDE). This subspace is associated to covariances of linear Ito-stochastic differential equations which is shown exploiting a Gronwall lemma for matrix differential equations. Orthogonal projections onto the identified subspace lead to a first exact reduced order system. Secondly, a linear map of the RDE solution (quantity of interest) is analyzed in terms of redundant information meaning that state variables are found that do not contribute to the quantity of interest. Once more, a link to Ito-stochastic differential equations is used. Removing such unnecessary information from the RDE provides a further dimension reduction without causing an error. Finally, we discretize a linear parabolic rough partial differential equation in space. The resulting large-order RDE is subsequently tackled with the exact reduction techniques studied in this paper. We illustrate the enormous complexity reduction potential in the corresponding numerical experiments.
In this paper, we deal with nonparametric regression for circular data, meaning that observations are represented by points lying on the unit circle. We propose a kernel estimation procedure with data-driven selection of the bandwidth parameter. For this purpose, we use a warping strategy combined with a Goldenshluger-Lepski type estimator. To study optimality of our methodology, we consider the minimax setting and prove, by establishing upper and lower bounds, that our procedure is nearly optimal on anisotropic Holder classes of functions for pointwise estimation. The obtained rates also reveal the specific nature of regression for circular responses. Finally, a numerical study is conducted, illustrating the good performances of our approach.
Image segmentation is considered to be one of the critical tasks in hyperspectral remote sensing image processing. Recently, convolutional neural network (CNN) has established itself as a powerful model in segmentation and classification by demonstrating excellent performances. The use of a graphical model such as a conditional random field (CRF) contributes further in capturing contextual information and thus improving the segmentation performance. In this paper, we propose a method to segment hyperspectral images by considering both spectral and spatial information via a combined framework consisting of CNN and CRF. We use multiple spectral cubes to learn deep features using CNN, and then formulate deep CRF with CNN-based unary and pairwise potential functions to effectively extract the semantic correlations between patches consisting of three-dimensional data cubes. Effective piecewise training is applied in order to avoid the computationally expensive iterative CRF inference. Furthermore, we introduce a deep deconvolution network that improves the segmentation masks. We also introduce a new dataset and experimented our proposed method on it along with several widely adopted benchmark datasets to evaluate the effectiveness of our method. By comparing our results with those from several state-of-the-art models, we show the promising potential of our method.