Penalizing complexity (PC) priors provide a principled framework for designing priors that reduce model complexity. PC priors penalize the Kullback-Leibler divergence (KLD) between the distribution induced by a ``simple'' model and that induced by a more complex model. However, in many common cases, it is impossible to construct a prior in this way because the KLD is infinite. Various approximations are used to mitigate this problem, but the resulting priors then fail to follow the designed principles. We propose a new class of priors, the Wasserstein complexity penalization (WCP) priors, by replacing the KLD with the Wasserstein distance in the PC prior framework. These priors avoid the infinite model distance issue and can be derived by following the principles exactly, making them more interpretable. Furthermore, we propose principles and recipes for constructing joint WCP priors for multiple parameters, and we show that these priors can be obtained easily, either analytically or numerically, for a general class of models. The methods are illustrated through several examples to which PC priors have previously been applied.
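The infinite-distance issue mentioned above can be seen in a toy setting that is an assumption of this sketch and not an example taken from the paper: a zero-mean Gaussian with standard deviation sigma as the flexible model, and a base model whose standard deviation sigma0 shrinks to zero. The closed forms for the Gaussian KLD and 2-Wasserstein distance are standard; the function names are hypothetical.

```python
import math

def kl_gauss(sigma, sigma0):
    """KL( N(0, sigma^2) || N(0, sigma0^2) ) for zero-mean Gaussians."""
    return math.log(sigma0 / sigma) + sigma**2 / (2.0 * sigma0**2) - 0.5

def w2_gauss(mu1, sigma1, mu2, sigma2):
    """2-Wasserstein distance between two univariate Gaussians."""
    return math.hypot(mu1 - mu2, sigma1 - sigma2)

# Toy assumption: the base model degenerates to a point mass as sigma0 -> 0.
sigma = 1.0
for sigma0 in (1e-1, 1e-2, 1e-3):
    # The KLD grows without bound, while the Wasserstein distance tends to sigma.
    print(sigma0, kl_gauss(sigma, sigma0), w2_gauss(0.0, sigma, 0.0, sigma0))
```

In this toy case the Wasserstein distance to the degenerate base model stays finite (it equals sigma), which is the kind of behaviour a distance-based prior needs in order to be defined without approximation.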
Mixtures of regressions are a powerful class of models for regression learning when the response variable of interest is highly uncertain and heterogeneous. In addition to being a rich predictive model for the response given some covariates, the parameters in this model class provide useful information about the heterogeneity in the data population, which is represented by the conditional distributions for the response given the covariates associated with a number of distinct but latent subpopulations. In this paper, we investigate conditions of strong identifiability, rates of convergence for conditional density and parameter estimation, and the Bayesian posterior contraction behavior arising in finite mixture of regression models, under exact-fitted and over-fitted settings and when the number of components is unknown. This theory is applicable to common choices of link functions and families of conditional distributions employed by practitioners. We provide simulation studies and data illustrations, which shed some light on the parameter learning behavior found in several popular regression mixture models reported in the literature.
The scheduling problem is a key class of optimization problems with a wide range of applications in both practical and theoretical settings. Probabilistic analysis is a basic tool for investigating the performance of scheduling algorithms and has therefore been carried out in a large body of prior work. However, probabilistic analysis has several potential problems. For example, current research on the scheduling problem is limited to i.i.d. scenarios because they are simple to analyze. This paper provides a new framework for probabilistic analysis of the scheduling problem that aims to address such limitations. As a consequence, we obtain several theorems, including a theoretical limit of the scheduling problem, which can be applied to \emph{general, non-i.i.d. probability distributions}. Several information-theoretic techniques, such as the \emph{information-spectrum method}, turn out to be useful in proving our results. Since the scheduling problem is related to many other research fields, we hope that our framework will yield other interesting applications in the future.
We derive bounds on the moduli of the eigenvalues of a special type of matrix rational function using the following techniques: (1) the Bauer-Fike theorem applied to a block matrix associated with the given matrix rational function, (2) an associated real rational function together with Rouch\'e's theorem for the matrix rational function, and (3) a numerical radius inequality for a block matrix associated with the matrix rational function. These bounds are compared when the coefficients are unitary matrices. Numerical examples are given to illustrate the results obtained.
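As a minimal sketch of the first technique only, the following checks the Bauer-Fike theorem on a small real matrix: if A = V D V^{-1} is diagonalizable, every eigenvalue of A + E lies within cond_2(V) * ||E||_2 of some eigenvalue of A. The 2x2 matrices and helper names are assumptions chosen for illustration; they are not the block matrix constructed in the paper.

```python
import math

def spectral_norm_2x2(m):
    """Largest singular value of a real 2x2 matrix, in closed form."""
    (a, b), (c, d) = m
    frob2 = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = max(frob2 * frob2 - 4.0 * det * det, 0.0)
    return math.sqrt((frob2 + math.sqrt(disc)) / 2.0)

def cond_2x2(m):
    """2-norm condition number: sigma_max / sigma_min = sigma_max^2 / |det|."""
    (a, b), (c, d) = m
    return spectral_norm_2x2(m) ** 2 / abs(a * d - b * c)

def eig_2x2(m):
    """Eigenvalues of a real 2x2 matrix, assuming both are real."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    root = math.sqrt(tr * tr - 4.0 * det)
    return ((tr - root) / 2.0, (tr + root) / 2.0)

A = [[2.0, 1.0], [0.0, 3.0]]       # eigenvalues 2 and 3
V = [[1.0, 1.0], [0.0, 1.0]]       # columns are the eigenvectors (1,0) and (1,1)
E = [[0.01, -0.02], [0.03, 0.01]]  # small perturbation (illustrative values)
A_pert = [[A[i][j] + E[i][j] for j in range(2)] for i in range(2)]

# Bauer-Fike radius and the actual distance of each perturbed eigenvalue
# to the nearest unperturbed one.
bound = cond_2x2(V) * spectral_norm_2x2(E)
gaps = [min(abs(mu - lam) for lam in eig_2x2(A)) for mu in eig_2x2(A_pert)]
print(bound, gaps)
```

Every entry of `gaps` should fall below `bound`; how tight the bound is depends on the conditioning of the eigenvector matrix V.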
It is known that standard stochastic Galerkin methods encounter challenges when solving partial differential equations with high-dimensional random inputs, typically because of the large number of stochastic basis functions required. It is therefore crucial to choose effective basis functions so that the dimension of the stochastic approximation space can be reduced. In this work, we focus on the stochastic Galerkin approximation associated with generalized polynomial chaos (gPC), and explore the gPC expansion based on the analysis of variance (ANOVA) decomposition. A concise form of the gPC expansion is presented for each component function of the ANOVA expansion, and an adaptive ANOVA procedure is proposed to construct the overall stochastic Galerkin system. Numerical results demonstrate the efficiency of our proposed adaptive ANOVA stochastic Galerkin method for both diffusion and Helmholtz problems.
We study the problem of parameter estimation for large exchangeable interacting particle systems when a sample of discrete observations from a single particle is known. We propose a novel method based on martingale estimating functions constructed by employing the eigenvalues and eigenfunctions of the generator of the mean field limit, where the law of the process is replaced by the (unique) invariant measure of the mean field dynamics. We then prove that our estimator is asymptotically unbiased and asymptotically normal when the number of observations and the number of particles tend to infinity, and we provide a rate of convergence towards the exact value of the parameters. Finally, we present several numerical experiments which show the accuracy of our estimator and corroborate our theoretical findings, even when the mean field dynamics exhibit more than one steady state.
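The idea of an eigenfunction-based martingale estimating function can be sketched on a much simpler toy problem, assumed here for illustration: a single Ornstein-Uhlenbeck process dX = -theta X dt + sigma dW with no interaction term. Its generator has the eigenfunction phi(x) = x with eigenvalue theta, so E[X_{t+delta} | X_t] = exp(-theta delta) X_t, and setting the sum of X_n (X_{n+1} - exp(-theta delta) X_n) to zero gives a closed-form estimator. The weight X_n, the parameter values, and the function names are hypothetical choices, not the construction used in the paper.

```python
import math
import random

def simulate_ou(theta, sigma, delta, n, seed=0):
    """Exact discretisation of dX = -theta*X dt + sigma dW at step delta."""
    rng = random.Random(seed)
    rho = math.exp(-theta * delta)
    sd = sigma * math.sqrt((1.0 - rho * rho) / (2.0 * theta))
    # Start from the stationary distribution N(0, sigma^2 / (2 theta)).
    x = [rng.gauss(0.0, sigma / math.sqrt(2.0 * theta))]
    for _ in range(n):
        x.append(rho * x[-1] + sd * rng.gauss(0.0, 1.0))
    return x

def estimate_theta(x, delta):
    """Root of sum_n x_n * (x_{n+1} - exp(-theta*delta) * x_n) = 0."""
    num = sum(a * b for a, b in zip(x[:-1], x[1:]))
    den = sum(a * a for a in x[:-1])
    return -math.log(num / den) / delta

path = simulate_ou(theta=1.0, sigma=1.0, delta=0.1, n=20000)
theta_hat = estimate_theta(path, delta=0.1)
print(theta_hat)
```

The estimate should land near the true value theta = 1; in the interacting setting described above, the eigenpairs would instead come from the generator of the mean field limit evaluated at its invariant measure.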
A new sparse semiparametric model is proposed, which incorporates the influence of two functional random variables on a scalar response in a flexible and interpretable manner. One of the functional covariates is included through a single-index structure, while the other is included linearly through the high-dimensional vector formed by its discretised observations. For this model, two new algorithms are presented for selecting relevant variables in the linear part and estimating the model. Both procedures utilise the functional origin of the linear covariates. Finite sample experiments demonstrate the scope of application of both algorithms: the first is a fast algorithm that addresses, without loss in predictive ability, the significant computational time required by standard variable selection methods for estimating this model, and the second completes the set of relevant linear covariates provided by the first, thus improving its predictive efficiency. Some asymptotic results theoretically support both procedures. A real data application demonstrates the applicability of the presented methodology from a predictive perspective, in terms of the interpretability of outputs and low computational cost.
In arXiv:2305.03945 [math.NA], a first-order optimization algorithm was introduced to solve time-implicit schemes of reaction-diffusion equations. In this work, we conduct theoretical studies on this first-order algorithm equipped with a quadratic regularization term. We provide sufficient conditions under which the proposed algorithm and its time-continuous limit converge exponentially fast to a desired time-implicit numerical solution. We show both theoretically and numerically that the convergence rate is independent of the grid size, which makes our method suitable for large-scale problems. The efficiency of our algorithm is verified via a series of numerical examples on various types of reaction-diffusion equations. The choice of optimal hyperparameters and comparisons with some classical root-finding algorithms are also discussed in the numerical section.
We present a polymorphic linear lambda-calculus as a proof language for second-order intuitionistic linear logic. The calculus includes addition and scalar multiplication, enabling the proof of a linearity result at the syntactic level.
Asymptotic analysis for related inference problems often involves similar steps and proofs. These intermediate results could be shared across problems if each of them were made self-contained and easily identified. However, asymptotic analysis using Taylor expansions offers limited scope for borrowing results because it is a step-by-step procedural approach. This article introduces EEsy, a modular system for estimating finite- and infinite-dimensional parameters in related inference problems. It is based on the infinite-dimensional Z-estimation theorem, Donsker and Glivenko-Cantelli preservation theorems, and weight calibration techniques. This article identifies the systematic nature of these tools and consolidates them into one system containing several modules, which can be built, shared, and extended in a modular manner. This change to the structure of method development allows related methods to be developed in parallel and complex problems to be solved collaboratively, expediting the development of new analytical methods. This article considers four related inference problems: estimating parameters with random sampling, two-phase sampling, auxiliary information incorporation, and model misspecification. The modular approach is illustrated by systematically developing 9 parameter estimators and 18 variance estimators for the four related inference problems regarding semi-parametric additive hazards models. Simulation studies show that the obtained asymptotic results for these 27 estimators are valid. Finally, I describe how this system can simplify the use of empirical process theory, a powerful tool that is challenging for the broad community of methods developers to adopt. I also discuss challenges and the extension of this system to other inference problems.
We present a novel combination of dynamic embedded topic models and change-point detection to explore diachronic change of lexical semantic modality in classical and early Christian Latin. We demonstrate several methods for finding and characterizing patterns in the output, and relating them to traditional scholarship in Comparative Literature and Classics. This simple approach to unsupervised models of semantic change can be applied to any suitable corpus, and we conclude with future directions and refinements aiming to allow noisier, less-curated materials to meet that threshold.