We establish several properties of (weighted) generalized $\psi$-estimators introduced by Barczy and P\'ales in 2022: mean-type, monotonicity and sensitivity properties, a bisymmetry-type inequality, and some asymptotic and continuity properties. We also illustrate these properties with several examples, including statistical ones.
For factor analysis, many estimators have been developed, starting with the maximum likelihood estimator, and the statistical properties of most of them are well understood. In the early 2000s, a new estimator based on matrix factorization, called Matrix Decomposition Factor Analysis (MDFA), was developed. Although this estimator is obtained by minimizing a principal component analysis-like loss function, it empirically behaves like other consistent estimators of factor analysis, not like principal component analysis. Since the MDFA estimator cannot be formulated as a classical M-estimator, its statistical properties have not yet been discussed. To explain this unexpected behavior theoretically, we establish the consistency of the MDFA estimator as an estimator for factor analysis. That is, we show that the MDFA estimator has the same limit as other consistent estimators of factor analysis.
In decision-making, maxitive functions are used for worst-case and best-case evaluations. Maxitivity gives rise to a rich structure that is well-studied in the context of the pointwise order. In this article, we investigate maxitivity with respect to general preorders and provide a representation theorem for such functionals. The results are illustrated for different stochastic orders in the literature, including the usual stochastic order, the increasing convex/concave order, and the dispersive order.
Data sets tend to live in low-dimensional non-linear subspaces. Ideal tools for analysing such data sets should therefore account for this non-linear geometry. The symmetric Riemannian geometry setting is suitable for several reasons. First, it comes with a rich mathematical structure that accounts for a wide range of non-linear geometries and that, as empirical evidence from classical non-linear embeddings shows, is able to capture data geometry. Second, many standard data analysis tools initially developed for data in Euclidean space can be generalised efficiently to data on a symmetric Riemannian manifold. A conceptual challenge comes from the lack of guidelines for constructing a symmetric Riemannian structure on the data space itself, and for adapting successful data analysis algorithms on symmetric Riemannian manifolds to this setting. This work considers these challenges in the setting of pullback Riemannian geometry through a diffeomorphism. The first part of the paper characterises diffeomorphisms that result in proper, stable and efficient data analysis. The second part then uses these best practices to guide the construction of such diffeomorphisms through deep learning. As a proof of concept, different types of pullback geometries -- among which the proposed construction -- are tested on several data analysis tasks and on several toy data sets. The numerical experiments confirm the predictions from theory, i.e., that for proper, stable and efficient data analysis the diffeomorphism generating the pullback geometry needs to map the data manifold into a geodesic subspace of the pulled-back Riemannian manifold while preserving local isometry around the data manifold, and that pulling back positive curvature can be problematic in terms of stability.
The $\beta$-model has been extensively used to model degree heterogeneity in networks, wherein each node is assigned a unique parameter. In this article, we consider the problem of testing the hypothesis that two nodes $i$ and $j$ of a $\beta$-model have the same node parameter. We propose a test statistic and prove that its null distribution converges to the standard normal distribution. Further, we investigate the homogeneity test for the $\beta$-model, combining individual $p$-values to aggregate the small effects of multiple tests. Both simulation studies and real-world data examples indicate that the proposed method works well.
We show that certain diagrams of $\infty$-logoses are reconstructed in internal languages of their oplax limits via lex, accessible modalities, which enables us to use plain homotopy type theory to reason about not only a single $\infty$-logos but also a diagram of $\infty$-logoses. This also provides a higher dimensional version of Sterling's synthetic Tait computability -- a type theory for higher dimensional logical relations. To prove the main result, we establish a precise correspondence between the lex, accessible localizations of an $\infty$-logos and the lex, accessible modalities in the internal language of the $\infty$-logos. To do this, we also partly develop the Kripke-Joyal semantics of homotopy type theory in $\infty$-logoses.
We consider Gibbs distributions, which are families of probability distributions over a discrete space $\Omega$ with probability mass function of the form $\mu^\Omega_\beta(\omega) \propto e^{\beta H(\omega)}$ for $\beta$ in an interval $[\beta_{\min}, \beta_{\max}]$ and $H( \omega ) \in \{0 \} \cup [1, n]$. The partition function is the normalization factor $Z(\beta)=\sum_{\omega \in\Omega}e^{\beta H(\omega)}$. Two important parameters of these distributions are the log partition ratio $q = \log \tfrac{Z(\beta_{\max})}{Z(\beta_{\min})}$ and the counts $c_x = |H^{-1}(x)|$. These are correlated with system parameters in a number of physical applications and sampling algorithms. Our first main result is to estimate the counts $c_x$ using roughly $\tilde O( \frac{q}{\varepsilon^2})$ samples for general Gibbs distributions and $\tilde O( \frac{n^2}{\varepsilon^2} )$ samples for integer-valued distributions (ignoring some second-order terms and parameters), and we show this is optimal up to logarithmic factors. We illustrate with improved algorithms for counting connected subgraphs, independent sets, and perfect matchings. As a key subroutine, we also develop algorithms to compute the partition function $Z$ using $\tilde O(\frac{q}{\varepsilon^2})$ samples for general Gibbs distributions and using $\tilde O(\frac{n^2}{\varepsilon^2})$ samples for integer-valued distributions.
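The quantities in the abstract above can be illustrated by brute force on a tiny example. The following sketch (with a made-up discrete space and Hamiltonian, not the paper's sample-based algorithms) computes the partition function $Z(\beta)$, the log partition ratio $q$, and the counts $c_x$ directly from the definitions.

```python
import math
from collections import Counter

# Hypothetical toy example: a four-element space with H(omega) in {0} ∪ [1, n].
omega = ["00", "01", "10", "11"]
H = {"00": 0, "01": 1, "10": 1, "11": 2}

def Z(beta):
    """Partition function Z(beta) = sum over omega of exp(beta * H(omega))."""
    return sum(math.exp(beta * H[w]) for w in omega)

beta_min, beta_max = 0.0, 1.0
q = math.log(Z(beta_max) / Z(beta_min))  # log partition ratio
counts = Counter(H.values())             # counts c_x = |H^{-1}(x)|

print(q)
print(dict(counts))
```

Here $Z(0) = 4$ and $Z(1) = (1+e)^2$, so $q = 2\log\frac{1+e}{2}$; the point of the paper is to estimate these quantities from samples when $\Omega$ is far too large to enumerate.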
This article is concerned with the multilevel Monte Carlo (MLMC) methods for approximating expectations of some functions of the solution to the Heston 3/2-model from mathematical finance, which takes values in $(0, \infty)$ and possesses superlinearly growing drift and diffusion coefficients. To discretize the SDE model, a new Milstein-type scheme is proposed to produce independent sample paths. The proposed scheme can be explicitly solved and is positivity-preserving unconditionally, i.e., for any time step-size $h>0$. This positivity-preserving property for large discretization time steps is particularly desirable in the MLMC setting. Furthermore, a mean-square convergence rate of order one is proved in the non-globally Lipschitz regime, which is not trivial, as the diffusion coefficient grows super-linearly. The obtained order-one convergence in turn promises the desired relevant variance of the multilevel estimator and justifies the optimal complexity $\mathcal{O}(\epsilon^{-2})$ for the MLMC approach, where $\epsilon > 0$ is the required target accuracy. Numerical experiments are finally reported to confirm the theoretical findings.
The central path problem is a variation on the single facility location problem. The aim is to find, in a given connected graph $G$, a path $P$ minimizing its eccentricity, which is the maximal distance from $P$ to any vertex of $G$. The path eccentricity of $G$ is the minimal eccentricity achievable over all paths in $G$. In this article we consider the path eccentricity of the class of $k$-AT-free graphs. These are graphs in which any set of three vertices contains a pair such that every path between them uses at least one vertex of the closed neighborhood at distance $k$ of the third. We prove that they have path eccentricity bounded by $k$. Moreover, we answer a question of G\'omez and Guti\'errez asking whether there is a relation between path eccentricity and the consecutive ones property. The latter is the property of a binary matrix to admit a permutation of its rows placing the 1's consecutively in each column. It was already known that graphs whose adjacency matrices have the consecutive ones property have path eccentricity at most 1, and that the same remains true when the augmented adjacency matrix (with ones on the diagonal) has the consecutive ones property. We generalize these results as follows. We study graphs whose adjacency matrices can be made to satisfy the consecutive ones property after changing some values on the diagonal, and show that those graphs have path eccentricity at most 2, by showing that they are 2-AT-free.
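The eccentricity of a fixed path is straightforward to compute: it is the largest graph distance from any vertex to the path's vertex set, which a multi-source BFS finds in linear time. The following sketch illustrates the definition on a made-up five-vertex graph (the graph and names are illustrative, not from the paper).

```python
from collections import deque

def path_eccentricity(adj, path):
    """Max distance from any vertex of the graph to the vertex set of path,
    via multi-source BFS started simultaneously from all path vertices."""
    dist = {v: 0 for v in path}
    queue = deque(path)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())

# A path 0-1-2-3 with a pendant vertex 4 attached to vertex 1.
adj = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
print(path_eccentricity(adj, [0, 1, 2, 3]))  # vertex 4 is adjacent to 1, so 1
```

The path eccentricity of the graph is the minimum of this quantity over all paths; here the path $0$-$1$-$2$-$3$ already achieves eccentricity $1$.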
An $\epsilon$-test for any non-trivial property (one for which there are both satisfying inputs and inputs far from the property) should use a number of queries that is at least inversely proportional to $\epsilon$. However, to the best of our knowledge there is no reference proof of this intuition. Such a proof is provided here. It is written so as not to require any prior knowledge of the related literature, and in particular does not use Yao's method.
The paper presents a spectral representation for general two-sided discrete-time signals from $\ell_\infty$, i.e., for all bounded discrete-time signals, including signals that do not vanish at $\pm\infty$. This representation allows us to extend the notions of transfer functions, spectrum gaps, and filters to such general signals from $\ell_\infty$, and to obtain frequency-domain conditions for predictability and data recoverability.