The central problem we address in this work is estimation of the parameter support set S, the set of indices corresponding to nonzero parameters, in the context of a sparse parametric likelihood model for count-valued multivariate time series. We develop a computationally intensive algorithm that performs the estimation by aggregating support sets obtained by applying the LASSO to data subsamples. Our approach is to identify several well-fitting candidate models and estimate S by the most frequently used parameters, thus \textit{aggregating} candidate models rather than selecting a single candidate deemed optimal in some sense. While our method is broadly applicable to any selection problem, we focus on the generalized vector autoregressive model class, and in particular the Poisson case, because of (i) the difficulty of the support estimation problem caused by complex dependence in the data, (ii) recent work applying the LASSO in this context, and (iii) interesting applications in network recovery from discrete multivariate time series. We establish benchmark methods based on the LASSO and present empirical results demonstrating the superior performance of our method. Additionally, we present an application estimating ecological interaction networks from paleoclimatology data.
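The aggregation step described above can be illustrated with a minimal sketch. It assumes the candidate supports have already been obtained (for example, from LASSO fits on subsamples) and keeps every index that appears in at least a given fraction of them. The function name `aggregate_supports`, the `threshold` parameter, and the example supports are hypothetical and are not taken from the paper, whose actual aggregation rule may differ.

```python
from collections import Counter


def aggregate_supports(candidate_supports, threshold=0.5):
    """Keep indices selected in at least `threshold` of the candidate supports.

    `candidate_supports` is a list of index collections, one per candidate model.
    The frequency rule here is an illustrative assumption, not the paper's rule.
    """
    if not candidate_supports:
        return set()
    # Count each index at most once per candidate model.
    counts = Counter(idx for support in candidate_supports for idx in set(support))
    cutoff = threshold * len(candidate_supports)
    return {idx for idx, count in counts.items() if count >= cutoff}


# Hypothetical supports from four subsample fits.
supports = [{0, 2, 5}, {0, 2}, {0, 2, 7}, {0, 5}]
print(sorted(aggregate_supports(supports, threshold=0.75)))
```

Raising `threshold` makes the estimate more conservative: only indices that nearly every candidate model uses survive.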
In this paper, we derive explicit second-order necessary and sufficient optimality conditions for a local minimizer of an optimal control problem for a quasilinear second-order partial differential equation with a piecewise smooth but not differentiable nonlinearity in the leading term. The key argument rests on the analysis of level sets of the state. Specifically, we show that if a function vanishes on the boundary and its gradient is different from zero on a level set, then this set decomposes into finitely many closed simple curves. Moreover, the level sets depend continuously on the functions defining these sets. We also prove the continuity of the integrals on the level sets. In particular, Green's first identity is shown to be applicable on an open set determined by two functions with nonvanishing gradients. In the second part of this paper, the explicit sufficient second-order conditions will be used to derive error estimates for a finite-element discretization of the control problem.
Confounder selection, namely choosing a set of covariates to control for confounding between a treatment and an outcome, is arguably the most important step in the design of observational studies. Previous methods, such as Pearl's celebrated back-door criterion, typically require pre-specifying a causal graph, which can often be difficult in practice. We propose an interactive procedure for confounder selection that does not require pre-specifying the graph or the set of observed variables. This procedure iteratively expands the causal graph by finding what we call "primary adjustment sets" for a pair of possibly confounded variables. This can be viewed as inverting a sequence of latent projections of the underlying causal graph. Structural information in the form of primary adjustment sets is elicited from the user, bit by bit, until either a set of covariates is found to control for confounding or it can be determined that no such set exists. We show that if the user correctly specifies the primary adjustment sets in every step, our procedure is both sound and complete.
This article presents a new tool for the automatic detection of meteors. The Fast Meteor Detection Toolbox (FMDT) detects meteor sightings by analyzing videos acquired by cameras onboard weather balloons or aboard airplanes with stabilization. The challenge lies in designing a processing chain composed of simple algorithms that are robust to the high fluctuation of the videos and that satisfy the constraints on power consumption (10 W) and real-time processing (25 frames per second).
Next Point-of-Interest (POI) recommendation is a critical task in location-based services that aim to provide personalized suggestions for the user's next destination. Previous works on POI recommendation have focused on modeling the user's spatial preference. However, existing works that leverage spatial information are only based on the aggregation of users' previously visited positions, which discourages the model from recommending POIs in novel areas. This trait of position-based methods will harm the model's performance in many situations. Additionally, incorporating sequential information into the user's spatial preference remains a challenge. In this paper, we propose Diff-POI: a Diffusion-based model that samples the user's spatial preference for the next POI recommendation. Inspired by the wide application of diffusion algorithms in sampling from distributions, Diff-POI encodes the user's visiting sequence and spatial character with two tailor-designed graph encoding modules, followed by a diffusion-based sampling strategy to explore the user's spatial visiting trends. We leverage the diffusion process and its reversed form to sample from the posterior distribution and optimize the corresponding score function. We design a joint training and inference framework to optimize and evaluate the proposed Diff-POI. Extensive experiments on four real-world POI recommendation datasets demonstrate the superiority of our Diff-POI over state-of-the-art baseline methods. Further ablation and parameter studies on Diff-POI reveal the functionality and effectiveness of the proposed diffusion-based sampling strategy for addressing the limitations of existing methods.
Cram\'{e}r's moderate deviations give a quantitative estimate for the relative error of the normal approximation and provide theoretical justifications for many estimators used in statistics. In this paper, we establish self-normalized Cram\'{e}r type moderate deviations for martingales under some mild conditions. The result extends an earlier work of Fan, Grama, Liu and Shao [Bernoulli, 2019]. Moreover, applications of our result to Student's statistic, stationary martingale difference sequences and branching processes in a random environment are also discussed. In particular, we establish Cram\'{e}r type moderate deviations for Student's $t$-statistic for branching processes in a random environment.
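A short numerical check may help explain why results for self-normalized sums carry over to Student's statistic. With $S_n = \sum X_i$ and $V_n^2 = \sum X_i^2$, the classical algebraic identity $T_n = (S_n/V_n)\sqrt{(n-1)/(n-(S_n/V_n)^2)}$ holds for any sample with nonzero sample variance. The sketch below verifies it on made-up data. The function names and the sample are illustrative assumptions, and the identity is the standard one for a plain sample, not the martingale construction of the paper.

```python
import math
import statistics


def student_t(xs):
    """Student's t-statistic: sqrt(n) * sample mean / sample standard deviation."""
    n = len(xs)
    return math.sqrt(n) * statistics.mean(xs) / statistics.stdev(xs)


def self_normalized_sum(xs):
    """Self-normalized sum S_n / V_n with V_n^2 the sum of squares."""
    return sum(xs) / math.sqrt(sum(x * x for x in xs))


def t_from_self_normalized(w, n):
    """Map the self-normalized sum w = S_n / V_n to Student's t-statistic."""
    return w * math.sqrt((n - 1) / (n - w * w))


# Made-up sample, used only to check the identity numerically.
xs = [0.3, -1.2, 0.8, 2.1, -0.4, 1.5]
w = self_normalized_sum(xs)
t_direct = student_t(xs)
t_via_w = t_from_self_normalized(w, len(xs))
print(t_direct, t_via_w)
```

Because the map from $S_n/V_n$ to $T_n$ is increasing, tail events for one statistic correspond exactly to tail events for the other.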
In this research work, we propose a high-order time-adaptive scheme for pricing a coupled system of fixed-free boundary constant elasticity of variance (CEV) models on both equidistant and locally refined space grids. The performance of our method is substantially enhanced to handle irregularities in the model, which are both inherent and induced. Furthermore, the system of coupled PDEs is strongly nonlinear and involves several time-dependent coefficients that include the first-order derivative of the early exercise boundary. These coefficients are approximated from a fourth-order analytical approximation which is derived using a regularized square-root function. The semi-discrete equation for the option value and delta sensitivity is obtained from a non-uniform fourth-order compact finite difference scheme. The fifth-order 5(4) Dormand-Prince time integration method is used to solve the coupled system of discrete equations. Enhancing the performance of our proposed method with local mesh refinement and adaptive strategies enables us to obtain highly accurate solutions with very coarse space grids, hence reducing computational runtime substantially. We further verify the performance of our methodology as compared with some of the well-known and better-performing existing methods.
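To make the time-integration step concrete, here is a minimal sketch of the Dormand-Prince 5(4) embedded pair applied to a scalar ODE, using the standard published Butcher tableau. The function names, the step-size controller (safety factor 0.9, growth limits 0.2 and 5), and the test problem $y' = -y$ are assumptions for illustration. The abstract does not specify the authors' controller, and their system is a coupled semi-discrete PDE system rather than a scalar equation.

```python
import math

# Standard Dormand-Prince 5(4) Butcher tableau.
C = [0.0, 1 / 5, 3 / 10, 4 / 5, 8 / 9, 1.0, 1.0]
A = [
    [],
    [1 / 5],
    [3 / 40, 9 / 40],
    [44 / 45, -56 / 15, 32 / 9],
    [19372 / 6561, -25360 / 2187, 64448 / 6561, -212 / 729],
    [9017 / 3168, -355 / 33, 46732 / 5247, 49 / 176, -5103 / 18656],
    [35 / 384, 0.0, 500 / 1113, 125 / 192, -2187 / 6784, 11 / 84],
]
B5 = [35 / 384, 0.0, 500 / 1113, 125 / 192, -2187 / 6784, 11 / 84, 0.0]
B4 = [5179 / 57600, 0.0, 7571 / 16695, 393 / 640, -92097 / 339200, 187 / 2100, 1 / 40]


def dopri_step(f, t, y, h):
    """One step for a scalar ODE y' = f(t, y).

    Returns the fifth-order solution and the embedded error estimate |y5 - y4|.
    """
    k = []
    for c_i, a_row in zip(C, A):
        y_stage = y + h * sum(a * k_j for a, k_j in zip(a_row, k))
        k.append(f(t + c_i * h, y_stage))
    y5 = y + h * sum(b * k_j for b, k_j in zip(B5, k))
    y4 = y + h * sum(b * k_j for b, k_j in zip(B4, k))
    return y5, abs(y5 - y4)


def integrate(f, t0, y0, t_end, tol=1e-8, h=0.1):
    """Adaptive integration; the controller constants are illustrative choices."""
    t, y = t0, y0
    while t < t_end:
        last = h >= t_end - t
        if last:
            h = t_end - t  # do not step past the end point
        y_new, err = dopri_step(f, t, y, h)
        if err <= tol:  # accept the step
            y = y_new
            t = t_end if last else t + h
        # Rescale the step from the error estimate (exponent 1/5).
        factor = 0.9 * (tol / err) ** 0.2 if err > 0.0 else 5.0
        h *= min(5.0, max(0.2, factor))
    return y


def decay(t, y):
    """Test problem y' = -y with exact solution exp(-t)."""
    return -y


y_one_step, err_one_step = dopri_step(decay, 0.0, 1.0, 0.1)
y_final = integrate(decay, 0.0, 1.0, 1.0)
print(y_final, math.exp(-1.0))
```

The embedded fourth-order solution is used only to estimate the error; the fifth-order value is the one propagated, which is the usual local-extrapolation convention for this pair.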
In this work, we discuss some properties of the eigenvalues of some classes of signed complete graphs. We also obtain the form of the characteristic polynomial for these graphs.
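As a point of orientation, the two simplest signed complete graphs on $n$ vertices can be worked out by hand. These are standard facts given here for illustration only, and the classes treated in the paper are not specified in the abstract. Writing $J$ for the all-ones matrix and $I$ for the identity, the all-positive and all-negative signings have adjacency matrices $J - I$ and $I - J$, so that:

```latex
\begin{align*}
  \text{all edges positive:}\quad & A = J - I, &
  \det(xI - A) &= \bigl(x - (n-1)\bigr)(x + 1)^{n-1},\\
  \text{all edges negative:}\quad & A = I - J, &
  \det(xI - A) &= \bigl(x + (n-1)\bigr)(x - 1)^{n-1}.
\end{align*}
```

Both follow from the fact that $J$ has eigenvalue $n$ once and eigenvalue $0$ with multiplicity $n-1$.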
This work proposes an adjacent-category autoregressive model for time series of ordinal variables. We apply this model to dendrochronological records to study the effect of climate on the intensity of spruce budworm defoliation during outbreaks in two sites in eastern Canada. The model's parameters are estimated using the maximum likelihood approach. We show that this estimator is consistent and asymptotically Gaussian. We also propose a portmanteau test for goodness-of-fit. Our study shows that the seasonal ranges of maximum daily temperatures in the spring and summer have a significant quadratic effect on defoliation. The study reveals that for both regions, a greater range of summer daily maximum temperatures is associated with lower levels of defoliation up to a threshold estimated at 22.7$^\circ$C (CI of 0-39.7$^\circ$C at 95%) in T\'emiscamingue and 21.8$^\circ$C (CI of 0-54.2$^\circ$C at 95%) for Matawinie. For Matawinie, a greater range in spring daily maximum temperatures increased defoliation, up to a threshold of 32.5$^\circ$C (CI of 0-80.0$^\circ$C). We also present a statistical test to compare the autoregressive parameter values between different fits of the model, which allows us to detect changes in the defoliation dynamics between the study sites in terms of their respective autoregression structures.
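The adjacent-category structure can be sketched as follows. In a generic adjacent-category logit model, the linear predictor $\eta_k$ is the log-odds of category $k+1$ against category $k$, and the category probabilities follow by accumulating these log-odds. The parameterization below is a hypothetical illustration, not the paper's specification: the names `intercepts`, `ar_coef`, `beta`, the linear dependence on the previous level, and all numerical values are assumptions.

```python
import math


def adjacent_category_probs(etas):
    """Category probabilities from adjacent-category log-odds.

    etas[k] = log(P(Y = k + 1) / P(Y = k)), so K = len(etas) + 1 categories.
    """
    cumulative = [0.0]
    for eta in etas:
        cumulative.append(cumulative[-1] + eta)
    shift = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - shift) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def linear_predictors(intercepts, ar_coef, prev_level, beta, x):
    """Hypothetical predictor: intercept + autoregressive term + covariate effect."""
    return [a + ar_coef * prev_level + beta * x for a in intercepts]


# Made-up parameter values for a four-level ordinal response.
etas = linear_predictors([0.5, -0.2, -1.0], ar_coef=0.4, prev_level=2, beta=-0.1, x=3.0)
probs = adjacent_category_probs(etas)
print(probs)
```

Under this form a single covariate coefficient shifts every adjacent log-odds by the same amount, which is what makes the model suitable for ordered categories.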
Approximating differential operators defined on two-dimensional surfaces is an important problem that arises in many areas of science and engineering. Over the past ten years, localized meshfree methods based on generalized moving least squares (GMLS) and radial basis function finite differences (RBF-FD) have been shown to be effective for this task as they can give high orders of accuracy at low computational cost, and they can be applied to surfaces defined only by point clouds. However, there have yet to be any studies that perform a direct comparison of these methods for approximating surface differential operators (SDOs). The first purpose of this work is to fill that gap. For this comparison, we focus on an RBF-FD method based on polyharmonic spline kernels and polynomials (PHS+Poly) since it is most closely related to the GMLS method. Additionally, we use a relatively new technique for approximating SDOs with RBF-FD called the tangent plane method since it is simpler than previous techniques and natural to use with PHS+Poly RBF-FD. The second purpose of this work is to relate the tangent plane formulation of SDOs to the local coordinate formulation used in GMLS and to show that they are equivalent when the tangent space to the surface is known exactly. The final purpose is to use ideas from the GMLS SDO formulation to derive a new RBF-FD method for approximating the tangent space for a point cloud surface when it is unknown. For the numerical comparisons of the methods, we examine their convergence rates for approximating the surface gradient, divergence, and Laplacian as the point clouds are refined for various parameter choices. We also compare their efficiency in terms of accuracy per computational cost, both when including and excluding setup costs.
In settings where interference between units is possible, we define the prevalence of peer effects to be the number of units who are affected by the treatment of others. This quantity does not fully identify a peer effect, but may be used to show whether peer effects are widely prevalent. Given a randomized experiment with binary-valued outcomes, methods are presented for conservative point estimation and one-sided interval estimation. To show asymptotic coverage of our intervals in settings not previously covered, we provide a central limit theorem that combines local dependence and sampling without replacement. Consistency and minimax properties of the point estimator are shown as well. The approach is demonstrated on an experiment in which students were treated for a highly transmissible parasitic infection, for which we find that a significant fraction of students were affected by the treatment of schools other than their own.