It is well known that universal codes, which estimate the entropy rate consistently, exist for any stationary ergodic source over a finite alphabet but not over a countably infinite one. We cast the problem of universal coding into the problem of universal densities with respect to a given reference measure on a countably generated measurable space, examples being the counting measure or the Lebesgue measure. We show that universal densities, which estimate the differential entropy rate consistently, exist if the reference measure is finite, which disproves the general necessity of the finite-alphabet assumption. To exhibit a universal density, we combine the prediction by partial matching (PPM) code with the non-parametric differential (NPD) entropy rate estimator, putting a prior both over all Markov orders and all quantization levels. The proof of universality applies Barron's asymptotic equipartition for densities and continuity of $f$-divergences for filtrations. As an application, we demonstrate that any universal density induces a strongly consistent Ces\`aro mean estimator of the conditional density given an infinite past, which, in passing, solves the problem of universal prediction with the $0$--$1$ loss for a countable alphabet. We also show that there exists a strongly consistent entropy rate estimator with respect to the Lebesgue measure in the class of stationary ergodic Gaussian processes.
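To illustrate the quantization idea underlying the NPD-style construction, here is a minimal sketch of a plug-in differential entropy estimate: the discrete entropy of the quantized samples plus the log of the bin width. The function name and bin count are illustrative choices, not the paper's PPM/NPD estimator.

```python
import numpy as np

def quantized_entropy_estimate(samples, bins=64):
    """Plug-in differential entropy estimate: discrete entropy of the
    quantized samples plus log(bin width). A generic illustration of the
    quantization idea, not the estimator constructed in the paper."""
    counts, edges = np.histogram(samples, bins=bins)
    width = edges[1] - edges[0]
    p = counts[counts > 0] / counts.sum()
    # Discrete entropy of the quantized distribution, shifted by log(width)
    # to approximate the differential entropy of the underlying density.
    return -np.sum(p * np.log(p)) + np.log(width)
```

For samples from a uniform distribution on $[0,1]$ (differential entropy $0$), the estimate is close to zero for large sample sizes; refining the quantization with growing sample size, as in the paper's prior over quantization levels, drives the bias down.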
If the probability distribution model aims to approximate the hidden mother distribution, it is imperative to establish a useful criterion for the resemblance between the mother and the model distributions. This study proposes a criterion that measures the Hellinger distance between discretized (quantized) samples from both distributions. First, unlike information criteria such as AIC, this criterion does not require the probability density function of the model distribution, which cannot be explicitly obtained for a complicated model such as a deep learning machine. Second, it can draw a positive conclusion (i.e., both distributions are sufficiently close) under a given threshold, whereas a statistical hypothesis test, such as the Kolmogorov-Smirnov test, cannot genuinely lead to a positive conclusion when the hypothesis is accepted. In this study, we establish a reasonable threshold for the criterion deduced from the Bayes error rate and also present the asymptotic bias of the estimator of the criterion. From these results, a reasonable and easy-to-use criterion is established that can be directly calculated from the two sets of samples from both distributions.
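The core quantity can be sketched directly: quantize both sample sets on a common grid and compute the Hellinger distance between the resulting empirical probability mass functions. The function name, the bin count, and the shared binning scheme are illustrative assumptions, not the paper's exact construction (which also involves the threshold and bias correction described above).

```python
import numpy as np

def hellinger_quantized(x, y, bins=32):
    """Hellinger distance between two 1-D sample sets after a common
    quantization. Bin count and binning scheme are illustrative choices."""
    lo = min(x.min(), y.min())
    hi = max(x.max(), y.max())
    edges = np.linspace(lo, hi, bins + 1)
    # Empirical PMFs of both sample sets on the shared grid.
    p, _ = np.histogram(x, bins=edges)
    q, _ = np.histogram(y, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))
```

The distance is $0$ for identical sample sets and $1$ when the quantized supports are disjoint, so a threshold in $(0,1)$, such as the one the paper deduces from the Bayes error rate, can be applied directly.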
We develop and implement a Bayesian approach for the estimation of the shape of a two-dimensional annular domain enclosing a Stokes flow from sparse and noisy observations of the enclosed fluid. Our setup includes the case of direct observations of the flow field as well as the measurement of concentrations of a solute passively advected by and diffusing within the flow. Adopting a statistical approach provides estimates of uncertainty in the shape due both to the non-invertibility of the forward map and to error in the measurements. When the shape represents a design problem of attempting to match desired target outcomes, this "uncertainty" can be interpreted as identifying remaining degrees of freedom available to the designer. We demonstrate the viability of our framework on three concrete test problems. These problems illustrate the promise of our framework for applications while providing a collection of test cases for recently developed Markov Chain Monte Carlo (MCMC) algorithms designed to resolve infinite-dimensional statistical quantities.
Several applications of molecular communications (MC) feature an alarm-prompt behavior for which the prevalent Shannon capacity may not be the appropriate performance metric. The identification capacity as an alternative measure for such systems has been motivated and established in the literature. In this paper, we study deterministic identification (DI) for the discrete-time \emph{Poisson} channel (DTPC) with inter-symbol interference (ISI), where the transmitter is subject to average and peak molecule release rate constraints. Such a channel serves as a model for diffusive MC systems featuring long channel impulse responses and employing molecule counting receivers. We derive lower and upper bounds on the DI capacity of the DTPC with ISI when the number of ISI channel taps $K$ may grow with the codeword length $n$ (e.g., due to increasing symbol rate). As a key finding, we establish that for deterministic encoding, the codebook size scales as $2^{(n\log n)R}$ assuming that the number of ISI channel taps scales as $K = 2^{\kappa \log n}$, where $R$ is the coding rate and $\kappa$ is the ISI rate. Moreover, we show that optimizing $\kappa$ leads to an effective identification rate [bits/s] that scales linearly with $n$, which is in contrast to the typical transmission rate [bits/s] that is independent of $n$.
We consider an optimal control problem constrained by a parabolic partial differential equation (PDE) with Robin boundary conditions. We use a well-posed space-time variational formulation in Lebesgue--Bochner spaces with minimal regularity. The abstract formulation of the optimal control problem yields the Lagrange function and Karush--Kuhn--Tucker (KKT) conditions in a natural manner. This results in space-time variational formulations of the adjoint and gradient equation in Lebesgue--Bochner spaces with minimal regularity. Necessary and sufficient optimality conditions are formulated and the optimality system is shown to be well-posed. Next, we introduce a conforming, uniformly stable, simultaneous space-time (tensor-product) discretization of the optimality system in these Lebesgue--Bochner spaces. Using finite elements of appropriate orders in space and time for trial and test spaces, this setting is known to be equivalent to a Crank--Nicolson time-stepping scheme for parabolic problems. Differences to existing methods are detailed. We show numerical comparisons with time-stepping methods. The space-time method shows good stability properties and requires fewer degrees of freedom in time to reach the same accuracy.
Recent works of Roughgarden (EC'21) and Chung and Shi (SODA'23) initiate the study of a new decentralized mechanism design problem called transaction fee mechanism design (TFM). Unlike the classical mechanism design literature, in the decentralized environment, even the auctioneer (i.e., the miner) can be a strategic player, and it can even collude with a subset of the users facilitated by binding side contracts. Chung and Shi showed two main impossibility results that rule out the existence of a {\it dream} TFM. First, any TFM that provides incentive compatibility for individual users and miner-user coalitions must always have zero miner revenue, no matter whether the block size is finite or infinite. Second, assuming finite block size, no non-trivial TFM can simultaneously provide incentive compatibility for any individual user, and for any miner-user coalition. In this work, we explore what new models and meaningful relaxations can allow us to circumvent the impossibility results of Chung and Shi. Besides today's model that does not employ cryptography, we introduce a new MPC-assisted model where the TFM is implemented by a joint multi-party computation (MPC) protocol among the miners. We prove several feasibility and infeasibility results for achieving {\it strict} and {\it approximate} incentive compatibility, respectively, in the plain model as well as the MPC-assisted model. We show that while cryptography is not a panacea, it indeed allows us to overcome some impossibility results pertaining to the plain model, leading to non-trivial mechanisms with useful guarantees that are otherwise impossible in the plain model. Our work is also the first to characterize the mathematical landscape of transaction fee mechanism design under approximate incentive compatibility, as well as in a cryptography-assisted model.
We revisit the problem of finding small $\epsilon$-separation keys introduced by Motwani and Xu (2008). In this problem, the input consists of $m$-dimensional tuples $x_1,x_2,\ldots,x_n$. The goal is to find a small subset of coordinates that separates at least $(1-\epsilon){n \choose 2}$ pairs of tuples. They provided a fast algorithm that runs on $\Theta(m/\epsilon)$ tuples sampled uniformly at random. We show that the sample size can be improved to $\Theta(m/\sqrt{\epsilon})$. Our algorithm also enjoys a faster running time. To obtain this result, we provide upper and lower bounds on the sample size to solve the following decision problem. Given a subset of coordinates $A$, reject if $A$ separates fewer than $(1-\epsilon){n \choose 2}$ pairs, and accept if $A$ separates all pairs. The algorithm must be correct with probability at least $1-\delta$ for all $A$. We show that for algorithms based on sampling:
- $\Theta(m/\sqrt{\epsilon})$ samples are sufficient and necessary so that $\delta \leq e^{-m}$, and
- $\Omega(\sqrt{\frac{\log m}{\epsilon}})$ samples are necessary so that $\delta$ is a constant.
Our analysis is based on a constrained version of the balls-into-bins problem, and we believe it may be of independent interest. We also study a related problem that asks for the following sketching algorithm: with given parameters $\alpha,k$ and $\epsilon$, the algorithm takes a subset of coordinates $A$ of size at most $k$ and returns an estimate of the number of unseparated pairs in $A$ up to a $(1\pm\epsilon)$ factor if it is at least $\alpha {n \choose 2}$. We show that even for constant $\alpha$ and success probability, such a sketching algorithm must use $\Omega(mk \log \epsilon^{-1})$ bits of space; on the other hand, uniform sampling yields a sketch of size $\Theta(\frac{mk \log m}{\alpha \epsilon^2})$ for this purpose.
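The sampling-based decision problem can be sketched as follows: draw a uniform sample of tuples and count, on the sample, how many pairs the candidate coordinate subset $A$ separates. This is a minimal illustration of the sampling primitive; the function name is hypothetical and the paper's actual algorithm and sample-size analysis are more refined.

```python
import itertools
import random

def estimate_separated_fraction(tuples, coords, sample_size, rng=random):
    """Estimate the fraction of tuple pairs separated by the coordinate
    subset `coords`, using a uniform sample of tuples. A pair is separated
    if the two tuples differ on at least one coordinate in `coords`."""
    sample = rng.sample(tuples, min(sample_size, len(tuples)))
    sep = total = 0
    for a, b in itertools.combinations(sample, 2):
        total += 1
        if any(a[i] != b[i] for i in coords):
            sep += 1
    return sep / total if total else 1.0
```

A decision rule would accept $A$ when the estimated separated fraction is close enough to $1$; the paper's contribution is pinning down how few sampled tuples suffice for this test to be correct with probability $1-\delta$ for all $A$ simultaneously.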
We consider a $p$-dimensional, centered normal population such that all variables have a positive variance $\sigma^2$ and any correlation coefficient between different variables is a given nonnegative constant $\rho<1$. Suppose that both the sample size $n$ and the population dimension $p$ tend to infinity with $p/n \to c>0$. We prove that the limiting spectral distribution of a sample correlation matrix is the Mar\v{c}enko--Pastur distribution of index $c$ and scale parameter $1-\rho$. Using these limiting spectral distributions, we rigorously derive the limiting behavior of two widespread stopping rules, the Guttman-Kaiser criterion and the cumulative-percentage-of-variation rule, in PCA and EFA. As a result, we establish the following dichotomous behavior of the Guttman-Kaiser criterion when both $n$ and $p$ are large, but $p/n$ is small: (1) the criterion retains a small number of variables for $\rho>0$, as suggested by Kaiser, Humphreys, and Tucker [Kaiser, H. F. (1992). On Cliff's formula, the Kaiser-Guttman rule and the number of factors. Percept. Mot. Ski. 74]; and (2) the criterion retains $p/2$ variables for $\rho=0$, as in a simulation study [Yeomans, K. A. and Golder, P. A. (1982). The Guttman-Kaiser criterion as a predictor of the number of common factors. J. Royal Stat. Soc. Series D. 31(3)].
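The dichotomy is easy to observe numerically. The sketch below simulates the equicorrelated normal model and applies the Guttman-Kaiser criterion, i.e., it counts eigenvalues of the sample correlation matrix exceeding $1$; the function name and the specific $(n, p, \rho)$ values are illustrative.

```python
import numpy as np

def guttman_kaiser_count(n, p, rho, seed=0):
    """Number of sample-correlation-matrix eigenvalues exceeding 1
    (the Guttman-Kaiser criterion) for an equicorrelated normal sample."""
    rng = np.random.default_rng(seed)
    # Equicorrelation matrix: 1 on the diagonal, rho off the diagonal.
    cov = np.full((p, p), rho) + (1 - rho) * np.eye(p)
    x = rng.multivariate_normal(np.zeros(p), cov, size=n)
    r = np.corrcoef(x, rowvar=False)
    return int(np.sum(np.linalg.eigvalsh(r) > 1.0))
```

With $p/n$ small, $\rho = 0$ yields roughly $p/2$ retained eigenvalues (the bulk straddles $1$), whereas $\rho > 0$ shrinks the bulk below $1$ and leaves only a few large eigenvalues, matching the dichotomy stated above.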
It is well-known that cohomology has a richer structure than homology. However, so far, in practice, the use of cohomology in the persistence setting has been limited to speeding up barcode computations. Two recently introduced invariants, namely, persistent cup-length and persistent Steenrod modules, to some extent, fill this gap. When added to the standard persistence barcode, they yield invariants that are more discriminative than the barcode alone. In this work, we introduce (the persistent variants of) the order-$k$ cup product modules, which are images of maps from the $k$-fold tensor products of the cohomology vector space of a complex to the cohomology vector space of the complex itself. We devise an $O(d n^4)$ algorithm for computing the order-$k$ cup product persistent modules for all $k \in \{2, \dots, d\}$, where $d$ denotes the dimension of the filtered complex, and $n$ denotes its size. Furthermore, we show that these modules are stable for \v{C}ech and Rips filtrations. Finally, we note that the persistent cup-length can be obtained as a byproduct of our computations, leading to a significantly faster algorithm for computing it.
Most published work on differential privacy (DP) focuses exclusively on meeting privacy constraints, by adding noise to the query drawn from a pre-specified parametric distribution model, typically with one or two degrees of freedom. The accuracy of the response and its utility to the intended use are frequently overlooked. Considering that several database queries are categorical in nature (e.g., a label, a ranking, etc.), or can be quantized, the parameters that define the randomized mechanism's distribution are finite. Thus, it is reasonable to search through numerical optimization for the probability masses that meet the privacy constraints while minimizing the query distortion. Considering the modulo summation of random noise as the DP mechanism, the goal of this paper is to introduce a tractable framework to design the optimum noise probability mass function (PMF) for database queries with a discrete and finite answer set, optimizing an expected distortion metric for a given $(\epsilon,\delta)$. We first show that the optimum PMF can be obtained by solving a mixed integer linear program (MILP). Then, we derive closed-form solutions for the optimum PMF that minimize the probability of error for two special cases. We show numerically that the proposed optimal mechanisms significantly outperform the state-of-the-art.
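The release step of a modulo-summation mechanism is simple to sketch: draw noise from a PMF over $\{0, \dots, m-1\}$ and add it to the query answer modulo $m$. In the paper, the PMF would be the output of the MILP; in this hypothetical sketch it is just an input.

```python
import numpy as np

def modulo_mechanism(query_value, pmf, m, rng=None):
    """Release (query_value + noise) mod m, with noise drawn from `pmf`
    over {0, ..., m-1}. The PMF is assumed given; designing it to meet
    an (epsilon, delta) constraint is the optimization problem above."""
    rng = rng or np.random.default_rng()
    noise = rng.choice(m, p=pmf)
    return (query_value + noise) % m
```

Because the output lives on the same finite set as the query, the expected distortion under a chosen metric is a linear function of the noise PMF, which is what makes the MILP formulation tractable.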
This paper introduces JAX-FEM, an open-source differentiable finite element method (FEM) library. Constructed on top of Google JAX, a rising machine learning library focusing on high-performance numerical computing, JAX-FEM is implemented in pure Python while remaining scalable to efficiently solve problems of moderate to large size. For example, in a 3D tensile loading problem with 7.7 million degrees of freedom, JAX-FEM with GPU achieves around 10$\times$ acceleration over a commercial FEM code, depending on the platform. Beyond efficiently solving forward problems, JAX-FEM employs the automatic differentiation technique so that inverse problems are solved in a fully automatic manner without the need to manually derive sensitivities. Examples of 3D topology optimization of nonlinear materials are shown to achieve optimal compliance. Finally, JAX-FEM is an integrated platform for machine learning-aided computational mechanics. We show an example of data-driven multi-scale computations of a composite material where JAX-FEM provides an all-in-one solution from microscopic data generation and model training to macroscopic FE computations. The source code of the library and these examples are shared with the community to facilitate computational mechanics research.
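The automatic-differentiation idea can be illustrated with a toy example in plain JAX: solve a tiny linear system playing the role of a discretized FEM problem and differentiate a compliance-style objective with respect to a design parameter via `jax.grad`, with no hand-derived adjoint. The 2-DOF system and all names here are illustrative, not JAX-FEM's API.

```python
import jax
import jax.numpy as jnp

def compliance(theta):
    """Toy 'FEM-like' objective: assemble a parameter-dependent 2x2
    stiffness matrix K(theta), solve K u = f, and return the compliance
    f^T u. Differentiating through the solve gives the sensitivity."""
    k = jnp.array([[2.0 + theta, -1.0],
                   [-1.0, 2.0 + theta]])
    f = jnp.array([1.0, 1.0])
    u = jnp.linalg.solve(k, f)
    return f @ u

# Sensitivity of the objective w.r.t. the design parameter, obtained
# automatically; no manual adjoint derivation is needed.
grad_fn = jax.grad(compliance)
```

For this system the compliance is $2/(1+\theta)$, so the automatic gradient at $\theta = 1$ is $-1/2$, matching the analytic sensitivity; JAX-FEM applies the same mechanism at the scale of full FE discretizations.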