We present a method for estimating the maximal symmetry of a regression function. Knowledge of such a symmetry can be used to significantly improve modelling by removing the modes of variation resulting from the symmetries. Symmetry estimation is carried out by strategically applying hypothesis tests for invariance over the subgroup lattice of a search group G acting on the feature space. We show that the unique maximal invariant subgroup of G can be estimated by testing on only a finite portion of the subgroup lattice when G_max is a compact subgroup of G, even for infinite search groups and lattices (such as the 3D rotation group SO(3)). We then show that the estimation is consistent when G is finite. We demonstrate the performance of this estimator in low-dimensional simulations and on a synthetic image classification task built on MNIST data, and we apply the methods to satellite measurements of the earth's magnetic field.
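Purely as an illustration of the kind of invariance hypothesis test such a lattice search relies on, the following minimal sketch (the model, the candidate transformation, and the paired test are all hypothetical choices, not the paper's procedure) checks whether a fitted regression function appears invariant under a single candidate group element:

```python
# A hypothetical invariance check: if the regression function is invariant under
# the transformation g, a model fit on the original data should predict equally
# well at g-transformed inputs.
import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic data whose true regression function m(x) = x0**2 + x1 is invariant
# to flipping the sign of the first coordinate.
X = rng.normal(size=(2000, 2))
y = X[:, 0] ** 2 + X[:, 1] + rng.normal(scale=0.1, size=2000)

train, test = slice(0, 1000), slice(1000, None)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[train], y[train])

def flip_first(X):
    Xg = X.copy()
    Xg[:, 0] = -Xg[:, 0]
    return Xg

err_id = (y[test] - model.predict(X[test])) ** 2
err_g = (y[test] - model.predict(flip_first(X[test]))) ** 2

# Paired test of "no loss in accuracy at transformed inputs" as a crude proxy
# for invariance under this one group element.
stat, p = stats.wilcoxon(err_id, err_g)
print(f"Wilcoxon p-value for invariance under the sign flip: {p:.3f}")
```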
Conventional harvesting problems for natural resources often assume physiological homogeneity of body length/weight among individuals. However, such assumptions are generally not valid in real-world problems, where heterogeneity plays an essential role in the planning of biological resource harvesting. Furthermore, heterogeneity is difficult to observe directly from the available data. This paper presents a novel optimal control framework for the cost-efficient harvesting of biological resources, with application to fisheries management. The heterogeneity is incorporated into the resource dynamics, which here are the population dynamics, through a probability density that can be distorted from reality. The distortion, which represents model uncertainty, is then penalized through a divergence, leading to a non-standard dynamic differential game in which the Hamilton-Jacobi-Bellman-Isaacs (HJBI) equation has a distinctive nonlinear partial differential term. Existence and uniqueness results for the HJBI equation are presented along with an explicit monotone finite difference method. Finally, the proposed optimal control is applied to a harvesting problem for a recreationally, economically, and ecologically important fish species using collected field data.
A bilateral (i.e., upper and lower) bound on the mean-square error under a general model mismatch is developed. The bound, which is derived from the variational representation of the chi-square divergence, is applicable in the Bayesian and non-Bayesian frameworks to biased and unbiased estimators. Unlike classical MSE bounds that depend only on the model, our bound is also estimator-dependent and can therefore be used to characterize the MSE of a specific estimator. The proposed bounding technique has a variety of applications, one of which is proving the consistency of estimators for a class of models. Furthermore, it provides insight into why certain estimators work well under general model mismatch conditions.
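For background only (this is the standard variational, Hammersley-Chapman-Robbins-type mechanism behind chi-square based MSE arguments, not the paper's bilateral bound): for distributions $P \ll Q$ and any statistic $T$ with finite second moment under $Q$, writing $L = \mathrm{d}P/\mathrm{d}Q$ and using $\mathbb{E}_Q[L-1]=0$,
$$
\mathbb{E}_P[T]-\mathbb{E}_Q[T]=\mathbb{E}_Q\!\left[(T-\mathbb{E}_Q T)(L-1)\right]
\quad\Longrightarrow\quad
\bigl(\mathbb{E}_P[T]-\mathbb{E}_Q[T]\bigr)^2 \le \chi^2(P\,\|\,Q)\,\operatorname{Var}_Q(T),
$$
where $\chi^2(P\,\|\,Q)=\mathbb{E}_Q[(L-1)^2]$; the inequality is Cauchy-Schwarz applied to the variational form of the divergence.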
This paper offers a new approach to studying the frequentist properties of the penalized MLE for general nonlinear regression models. The idea is to relax the nonlinear structural equation by introducing an auxiliary parameter for the regression response and replacing the structural equation with a penalty. This leads to a general semiparametric problem which is studied using the SLS approach from \cite{Sp2022}. We state sharp bounds on the concentration and accuracy of the penalized MLE, establish Fisher and Wilks expansions, evaluate the estimation risk over smoothness classes, and derive a number of further results. All the bounds are given in terms of the effective dimension and do not involve the ambient dimension of the parameter space.
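As a rough schematic of the relaxation described above (our gloss and notation, not the paper's): if the structural equation ties the regression response $m$ to the parameter $\theta$ via $m=f(\theta)$, one may treat $m$ as a free auxiliary parameter and maximize a penalized log-likelihood of the form
$$
(\widehat{\theta},\widehat{m}) \in \arg\max_{\theta,\,m}\Bigl\{\ell(m;Y)-\tfrac{\lambda}{2}\,\|m-f(\theta)\|^{2}-\mathrm{pen}(\theta)\Bigr\},
$$
so that the hard structural constraint is replaced by a quadratic penalty and the problem becomes a general semiparametric one in $(\theta,m)$.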
Hazard rate functions of natural and manufactured systems often show a bathtub-shaped failure rate: a high early rate of failures is followed by an extended period of useful working life where failures are rare, and finally the failure rate increases as the system reaches the end of its life. Parametric modelling of such hazard rate functions can lead to unnecessarily restrictive assumptions on the function shape; however, the most common non-parametric estimator (the Kaplan-Meier estimator) does not allow the bathtub-shape requirement to be imposed. In this paper we extend the Lo and Weng (1989) approach and specify four nonparametric bathtub hazard rate functions drawn from Gamma process priors. We implement all four models and demonstrate them in simulation.
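Independently of the Gamma-process construction (which is not reproduced here), failure times can be simulated from any specified bathtub-shaped hazard by inverting its cumulative hazard; the following sketch, with an arbitrary illustrative hazard, shows the idea:

```python
# Illustrative only: sample failure times from a bathtub-shaped hazard h(t) by
# solving H(T) = -log(U), U ~ Uniform(0, 1), with H the cumulative hazard
# integrated numerically on a grid.
import numpy as np

rng = np.random.default_rng(1)

def bathtub_hazard(t, a=2.0, b=3.0, c=0.05, d=0.01):
    """Early failures (decaying), a flat useful-life plateau, then wear-out."""
    return a * np.exp(-b * t) + c + d * t

grid = np.linspace(0.0, 50.0, 20001)          # grid should cover the support well
h = bathtub_hazard(grid)
H = np.concatenate([[0.0], np.cumsum(0.5 * (h[1:] + h[:-1]) * np.diff(grid))])

def sample_failure_times(n):
    u = rng.uniform(size=n)
    return np.interp(-np.log(u), H, grid)     # inverse cumulative hazard by interpolation

times = sample_failure_times(10_000)
print(times.mean(), np.quantile(times, [0.1, 0.5, 0.9]))
```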
Several kernel-based testing procedures are proposed to solve the problem of model selection in the presence of parameter estimation in a family of candidate models. Extending the two-sample test of Gretton et al. (2006), we first provide a way of testing whether some data are drawn from a given parametric model (model specification). Second, we provide a test statistic to decide whether two parametric models are equally valid for describing the data (model comparison), in the spirit of Vuong (1989). All our tests are asymptotically standard normal under the null, even when the true underlying distribution belongs to the competing parametric families. Simulations illustrate the performance of our tests in terms of power and level.
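For concreteness, the quantity underlying these tests is the maximum mean discrepancy of Gretton et al.; the sketch below computes its standard unbiased estimate for two samples (bandwidth choice, parameter estimation in the candidate model, and the asymptotic calibration of the tests are deliberately omitted):

```python
# Unbiased estimate of MMD^2 between a data sample X and a sample Y drawn from
# a candidate parametric model, using a Gaussian kernel.
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def mmd2_unbiased(X, Y, bandwidth=1.0):
    m, n = len(X), len(Y)
    Kxx = gaussian_kernel(X, X, bandwidth)
    Kyy = gaussian_kernel(Y, Y, bandwidth)
    Kxy = gaussian_kernel(X, Y, bandwidth)
    term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))   # off-diagonal means
    term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_x + term_y - 2.0 * Kxy.mean()

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(500, 1))   # observed data
Y = rng.normal(0.3, 1.0, size=(500, 1))   # draws from a (misspecified) candidate model
print("unbiased MMD^2 estimate:", mmd2_unbiased(X, Y))
```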
We consider the well-studied Robust $(k, z)$-Clustering problem, which generalizes the classic $k$-Median, $k$-Means, and $k$-Center problems. Given a constant $z\ge 1$, the input to Robust $(k, z)$-Clustering is a set $P$ of $n$ weighted points in a metric space $(M,\delta)$ and a positive integer $k$. Further, each point belongs to one (or more) of the $m$ different groups $S_1,S_2,\ldots,S_m$. Our goal is to find a set $X$ of $k$ centers such that $\max_{i \in [m]} \sum_{p \in S_i} w(p) \delta(p,X)^z$ is minimized. This problem arises in the domains of robust optimization [Anthony, Goyal, Gupta, Nagarajan, Math. Oper. Res. 2010] and algorithmic fairness. For polynomial-time computation, an approximation factor of $O(\log m/\log\log m)$ is known [Makarychev, Vakilian, COLT $2021$], which is tight under a plausible complexity assumption even in line metrics. For FPT time, there is a $(3^z+\epsilon)$-approximation algorithm, which is tight under GAP-ETH [Goyal, Jaiswal, Inf. Proc. Letters, 2023]. Motivated by the tight lower bounds for general discrete metrics, we focus on \emph{geometric} spaces such as the (discrete) high-dimensional Euclidean setting and metrics of low doubling dimension, which play an important role in data analysis applications. First, for a universal constant $\eta_0 > 0.0006$, we devise a $3^z(1-\eta_{0})$-factor FPT approximation algorithm for discrete high-dimensional Euclidean spaces, thereby bypassing the lower bound for general metrics. We complement this result by showing that even the special case of $k$-Center in dimension $\Theta(\log n)$ is $(\sqrt{3/2}- o(1))$-hard to approximate for FPT algorithms. Finally, we complete the FPT approximation landscape by designing an FPT $(1+\epsilon)$-approximation scheme (EPAS) for metrics of sub-logarithmic doubling dimension.
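To make the objective concrete, here is a short, hypothetical helper (not part of any of the cited algorithms) that simply evaluates the Robust $(k, z)$ cost of a candidate center set:

```python
# Evaluate max_{i} sum_{p in S_i} w(p) * dist(p, X)^z for a candidate center set X.
import numpy as np

def robust_kz_cost(points, weights, groups, centers, z=2):
    """points: (n, d) array; weights: (n,); groups: list of index arrays S_1..S_m;
    centers: (k, d). Returns the worst-group weighted z-th power cost."""
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)
    dist_to_X = d.min(axis=1)                 # distance of each point to its nearest center
    per_point = weights * dist_to_X ** z
    return max(per_point[S].sum() for S in groups)

rng = np.random.default_rng(0)
P = rng.normal(size=(100, 5))
w = np.ones(100)
groups = [np.arange(0, 50), np.arange(50, 100)]
X = P[rng.choice(100, size=3, replace=False)]   # an arbitrary candidate center set
print(robust_kz_cost(P, w, groups, X, z=2))
```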
In this paper we examine the use of low-rank approximations for the handling of radiation boundary conditions in a transient heat equation in a cavity radiation setting. The finite element discretization that arises from cavity radiation is well known to be dense, which poses difficulties for the efficiency and scalability of solvers. Here we consider a special treatment of the cavity radiation discretization using a block low-rank approximation combined with hierarchical matrices. We provide an overview of the methodology and discuss techniques that can be used to improve efficiency within the framework of hierarchical matrices, including the use of the adaptive cross approximation (ACA) method. We provide a number of numerical results that demonstrate the accuracy and efficiency of the approach on practical problems, showing significant speedup and memory reduction compared to the more conventional "dense matrix" approach.
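As a self-contained illustration of the ACA idea (written against a dense block for clarity; in an actual hierarchical-matrix code only the requested rows and columns of an admissible block would be generated, and the stopping rule would typically be norm-based rather than the simplified pivot test used here):

```python
# Adaptive cross approximation with partial pivoting: build A ~ U @ V.T from a
# small number of rows and columns of A.
import numpy as np

def aca(A, tol=1e-8, max_rank=50):
    m, n = A.shape
    U, V = [], []
    used_rows = {0}
    i = 0
    for _ in range(min(max_rank, m, n)):
        row = A[i, :].astype(float).copy()
        for u, v in zip(U, V):
            row -= u[i] * v                      # residual of the pivot row
        j = int(np.argmax(np.abs(row)))
        if np.abs(row[j]) < tol:
            break
        v = row / row[j]
        col = A[:, j].astype(float).copy()
        for u, w in zip(U, V):
            col -= w[j] * u                      # residual of the pivot column
        U.append(col)
        V.append(v)
        # next pivot row: largest residual-column entry among unused rows
        order = np.argsort(-np.abs(col))
        i = next(int(r) for r in order if int(r) not in used_rows)
        used_rows.add(i)
    if not U:
        return np.zeros((m, 1)), np.zeros((n, 1))
    return np.column_stack(U), np.column_stack(V)

# A smooth, well-separated kernel block is numerically low rank.
x = np.linspace(0.0, 1.0, 200)
y = np.linspace(5.0, 6.0, 200)
A = 1.0 / np.abs(x[:, None] - y[None, :])
U, V = aca(A)
print("rank:", U.shape[1], "rel. error:", np.linalg.norm(A - U @ V.T) / np.linalg.norm(A))
```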
The quantum density matrix represents all the information of a quantum system, and novel models of meaning employing density matrices naturally capture linguistic phenomena such as hyponymy and linguistic ambiguity, among others, in quantum question answering tasks. We therefore argue that applying the quantum density matrix to classical Question Answering (QA) tasks can yield more effective performance. Specifically, we (i) design a new mechanism based on Long Short-Term Memory (LSTM) to accommodate the case where the inputs are matrices; (ii) apply the new mechanism to QA problems together with a Convolutional Neural Network (CNN), obtaining an LSTM-based QA model with the quantum density matrix. Experiments with the new model on the TREC-QA and WIKI-QA datasets show encouraging results. Similarly, we argue that the quantum density matrix can also enrich image feature information and the relationships between features in classical image classification. Thus, we (i) combine density matrices and CNNs to design a new mechanism; (ii) apply the new mechanism to several representative classical image classification tasks. A series of experiments shows that applying the quantum density matrix to image classification generalizes well and is highly efficient across different datasets. Overall, applying the quantum density matrix to both classical question answering and classical image classification tasks yields more effective performance.
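As a toy illustration of the representation involved (not the LSTM or CNN mechanisms themselves), the sketch below builds a density matrix, i.e. a positive semidefinite, trace-one matrix, from a bag of word embeddings; names and dimensions are arbitrary:

```python
# Build rho = sum_i w_i |e_i><e_i| from unit-normalized embeddings e_i with
# nonnegative weights summing to one, giving a PSD matrix with trace 1.
import numpy as np

def density_matrix(embeddings, weights=None):
    E = np.asarray(embeddings, dtype=float)
    E = E / np.linalg.norm(E, axis=1, keepdims=True)        # pure states |e_i>
    w = np.full(len(E), 1.0 / len(E)) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()
    return np.einsum("i,id,ie->de", w, E, E)                 # mixed state rho

rng = np.random.default_rng(0)
sentence = rng.normal(size=(7, 50))      # 7 tokens, 50-dimensional embeddings
rho = density_matrix(sentence)
print(np.trace(rho), np.all(np.linalg.eigvalsh(rho) >= -1e-12))
```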
Let $G$ be a graph on $n$ vertices of maximum degree $\Delta$. We show that, for any $\delta > 0$, the down-up walk on independent sets of size $k \leq (1-\delta)\alpha_c(\Delta)n$ mixes in time $O_{\Delta,\delta}(k\log{n})$, thereby resolving a conjecture of Davies and Perkins in an optimal form. Here, $\alpha_{c}(\Delta)n$ is the NP-hardness threshold for the problem of counting independent sets of a given size in a graph on $n$ vertices of maximum degree $\Delta$. Our mixing time has optimal dependence on $k,n$ for the entire range of $k$; previously, even polynomial mixing was not known. In fact, for $k = \Omega_{\Delta}(n)$ in this range, we establish a log-Sobolev inequality with optimal constant $\Omega_{\Delta,\delta}(1/n)$. At the heart of our proof are three new ingredients, which may be of independent interest. The first is a method for lifting $\ell_\infty$-independence from a suitable distribution on the discrete cube -- in this case, the hard-core model -- to the slice by proving stability of an Edgeworth expansion using a multivariate zero-free region for the base distribution. The second is a generalization of the Lee-Yau induction to prove log-Sobolev inequalities for distributions on the slice with considerably less symmetry than the uniform distribution. The third is a sharp decomposition-type result which provides a lossless comparison between the Dirichlet form of the original Markov chain and that of the so-called projected chain in the presence of a contractive coupling.
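For readers unfamiliar with the chain being analyzed, here is a minimal sketch of one down-up step on independent sets of size $k$ (only the Markov chain itself; none of the mixing-time machinery is reproduced):

```python
# One down-up step: drop a uniformly random vertex of the independent set, then
# add a uniformly random vertex among those keeping the set independent of size k.
# The dropped vertex is always eligible again, so the step is well defined.
import random

def down_up_step(G, S, rng=random):
    """G: dict mapping each vertex to its set of neighbours; S: independent set."""
    S = set(S)
    S.remove(rng.choice(sorted(S)))                  # down step
    blocked = set(S)
    for u in S:
        blocked |= G[u]
    candidates = [v for v in G if v not in blocked]  # vertices preserving independence
    S.add(rng.choice(candidates))                    # up step
    return S

# Example: a 5-cycle with k = 2.
G = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {0, 3}}
S = {0, 2}
for _ in range(10):
    S = down_up_step(G, S)
print(S)
```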
In digital images, optical aberration manifests as a multivariate degradation, where the spectrum of the scene, the lens imperfections, and the field of view jointly determine the result. Besides eliminating aberrations at the hardware level, post-processing systems that exploit various kinds of prior information are important for correction. However, due to the content differences among these priors, a pipeline that simply aligns them shows limited efficiency and suboptimal restoration. Here, we propose a prior quantization model to correct optical aberrations in image processing systems. To integrate the prior information, we encode the various priors into a latent space and quantize them with learnable codebooks. After quantization, the prior codes are fused with the image restoration branch to realize targeted optical aberration correction. Comprehensive experiments demonstrate the flexibility of the proposed method and validate its potential for targeted restoration with a specific camera. Furthermore, our model makes it possible to analyze the correlation between the various priors and the optical aberration of a device, which is helpful for joint software-hardware design.
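To make the codebook idea concrete, here is a toy VQ-style quantizer in the spirit of the description above (the layer names, sizes, and commitment loss are standard VQ-VAE ingredients used purely for illustration, not the paper's architecture):

```python
# A learnable codebook that snaps encoded prior vectors to their nearest code,
# with a straight-through estimator so the encoder still receives gradients.
import torch
import torch.nn as nn

class PriorQuantizer(nn.Module):
    def __init__(self, num_codes=256, dim=64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z):                                   # z: (batch, dim) encoded priors
        d = torch.cdist(z, self.codebook.weight)            # distances to every code
        idx = d.argmin(dim=1)                               # nearest code index
        zq = self.codebook(idx)
        zq_st = z + (zq - z).detach()                       # straight-through estimator
        commit_loss = ((zq.detach() - z) ** 2).mean() + ((zq - z.detach()) ** 2).mean()
        return zq_st, idx, commit_loss

quantizer = PriorQuantizer()
z = torch.randn(8, 64)        # e.g. encoded spectral / field-of-view / lens priors
zq, idx, loss = quantizer(z)
print(zq.shape, idx.shape, loss.item())
```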