Johnson-Lindenstrauss lemma states random projections can be used as a topology preserving embedding technique for fixed vectors. In this paper, we try to understand how random projections affect probabilistic properties of random vectors. In particular we prove the distribution of inner product of two independent random vectors $X, Z \in {R}^n$ is preserved by random projection $S:{R}^n \to {R}^m$. More precisely, \[ \sup_t \left| \text{P}(\frac{1}{C_{m,n}} X^TS^TSZ <t) - \text{P}(\frac{1}{\sqrt{n}} X^TZ<t) \right| \le O\left(\frac{1}{\sqrt{n}}+ \frac{1}{\sqrt{m}} \right) \] As a by-product, we obtain product central limit theorem (product-CLT) for $\sum_{k=1}^{n} X_k Y_k$, where $\{X_k\}$ is a martingale difference sequence, and $\{Y_k\}$ has dependency within the sequence. We also obtain the rate of convergence in the spirit of Berry-Esseen theorem.
In this paper we prove upper and lower bounds on the minimal spherical dispersion. In particular, we see that the inverse $N(\varepsilon,d)$ of the minimal spherical dispersion is, for fixed $\varepsilon>0$, up to logarithmic terms linear in the dimension $d$. We also derive upper and lower bounds on the expected dispersion for points chosen independently and uniformly at random from the Euclidean unit sphere.
A Multiplicative-Exponential Linear Logic (MELL) proof-structure can be expanded into a set of resource proof-structures: its Taylor expansion. We introduce a new criterion characterizing (and deciding in the finite case) those sets of resource proof-structures that are part of the Taylor expansion of some MELL proof-structure, through a rewriting system acting both on resource and MELL proof-structures. We also prove semi-decidability of the type inhabitation problem for cut-free MELL proof-structures.
As is well known, the stability of the 3-step backward differentiation formula (BDF3) on variable grids for a parabolic problem is analyzed in [Calvo and Grigorieff, \newblock BIT. \textbf{42} (2002) 689--701] under the condition $r_k:=\tau_k/\tau_{k-1}<1.199$, where $r_k$ is the adjacent time-step ratio. In this work, we establish the spectral norm inequality, which can be used to give a upper bound for the inverse matrix. Then the BDF3 scheme is unconditionally stable under a new condition $r_k\leq 1.405$. Meanwhile, we show that the upper bound of the ratio $r_k$ is less than $\sqrt{3}$ for BDF3 scheme. In addition, based on the idea of [Wang and Ruuth, J. Comput. Math. \textbf{26} (2008) 838--855; Chen, Yu, and Zhang, arXiv:2108.02910], we design a weighted and shifted BDF3 (WSBDF3) scheme for solving the parabolic problem. We prove that the WSBDF3 scheme is unconditionally stable under the condition $r_k\leq 1.771$, which is a significant improvement for the maximum time-step ratio. The error estimates are obtained by the stability inequality. Finally, numerical experiments are given to illustrate the theoretical results.
In this paper, we proposed a multi-objective approach for the EEG Inverse Problem. This formulation does not need unknown parameters that involve empirical procedures. Due to the combinatorial characteristics of the problem, this alternative included evolutionary strategies to resolve it. The result is a Multi-objective Evolutionary Algorithm based on Anatomical Restrictions (MOEAAR) to estimate distributed solutions. The comparative tests were between this approach and 3 classic methods of regularization: LASSO, Ridge-L and ENET-L. In the experimental phase, regression models were selected to obtain sparse and distributed solutions. The analysis involved simulated data with different signal-to-noise ratio (SNR). The indicators for quality control were Localization Error, Spatial Resolution and Visibility. The MOEAAR evidenced better stability than the classic methods in the reconstruction and localization of the maximum activation. The norm L0 was used to estimate sparse solutions with the evolutionary approach and its results were relevant.
This paper is concerned with the problem of comparing the population means of two groups of independent observations. An approximate randomization test procedure based on the test statistic of Chen & Qin (2010) is proposed. The asymptotic behavior of the test statistic as well as the randomized statistic is studied under weak conditions. In our theoretical framework, observations are not assumed to be identically distributed even within groups. No condition on the eigenstructure of the covariance matrices is imposed. And the sample sizes of two groups are allowed to be unbalanced. Under general conditions, all possible asymptotic distributions of the test statistic are obtained. We derive the asymptotic level and local power of the approximate randomization 20 test procedure. Our theoretical results show that the proposed test procedure can adapt to all possible asymptotic distributions of the test statistic and always has correct test level asymptotically. Also, the proposed test procedure has good power behavior. Our numerical experiments show that the proposed test procedure has favorable performance compared with several alternative test procedures.
A random dot product graph (RDPG) is a generative model for networks in which vertices correspond to positions in a latent Euclidean space and edge probabilities are determined by the dot products of the latent positions. We consider RDPGs for which the latent positions are randomly sampled from an unknown $1$-dimensional submanifold of the latent space. In principle, restricted inference, i.e., procedures that exploit the structure of the submanifold, should be more effective than unrestricted inference; however, it is not clear how to conduct restricted inference when the submanifold is unknown. We submit that techniques for manifold learning can be used to learn the unknown submanifold well enough to realize benefit from restricted inference. To illustrate, we test $1$- and $2$-sample hypotheses about the Fr\'{e}chet means of small communities of vertices, using the complete set of vertices to infer latent structure. We propose test statistics that deploy the Isomap procedure for manifold learning, using shortest path distances on neighborhood graphs constructed from estimated latent positions to estimate arc lengths on the unknown $1$-dimensional submanifold. Unlike conventional applications of Isomap, the estimated latent positions do not lie on the submanifold of interest. We extend existing convergence results for Isomap to this setting and use them to demonstrate that, as the number of auxiliary vertices increases, the power of our test converges to the power of the corresponding test when the submanifold is known. Finally, we apply our methods to an inference problem that arises in studying the connectome of the Drosophila larval mushroom body. The univariate learnt manifold test rejects ($p<0.05$), while the multivariate ambient space test does not ($p\gg0.05$), illustrating the value of identifying and exploiting low-dimensional structure for subsequent inference.
The combinatorial diameter $\operatorname{diam}(P)$ of a polytope $P$ is the maximum shortest path distance between any pair of vertices. In this paper, we provide upper and lower bounds on the combinatorial diameter of a random "spherical" polytope, which is tight to within one factor of dimension when the number of inequalities is large compared to the dimension. More precisely, for an $n$-dimensional polytope $P$ defined by the intersection of $m$ i.i.d.\ half-spaces whose normals are chosen uniformly from the sphere, we show that $\operatorname{diam}(P)$ is $\Omega(n m^{\frac{1}{n-1}})$ and $O(n^2 m^{\frac{1}{n-1}} + n^5 4^n)$ with high probability when $m \geq 2^{\Omega(n)}$. For the upper bound, we first prove that the number of vertices in any fixed two dimensional projection sharply concentrates around its expectation when $m$ is large, where we rely on the $\Theta(n^2 m^{\frac{1}{n-1}})$ bound on the expectation due to Borgwardt [Math. Oper. Res., 1999]. To obtain the diameter upper bound, we stitch these ``shadows paths'' together over a suitable net using worst-case diameter bounds to connect vertices to the nearest shadow. For the lower bound, we first reduce to lower bounding the diameter of the dual polytope $P^\circ$, corresponding to a random convex hull, by showing the relation $\operatorname{diam}(P) \geq (n-1)(\operatorname{diam}(P^\circ)-2)$. We then prove that the shortest path between any ``nearly'' antipodal pair vertices of $P^\circ$ has length $\Omega(m^{\frac{1}{n-1}})$.
We consider the interaction between a free flowing fluid and a porous medium flow, where the free flowing fluid is described using the time dependent Stokes equations, and the porous medium flow is described using Darcy's law in the primal formulation. To solve this problem numerically, we use the diffuse interface approach, where the weak form of the coupled problem is written on an extended domain which contains both Stokes and Darcy regions. This is achieved using a phase-field function which equals one in the Stokes region and zero in the Darcy region, and smoothly transitions between these two values on a diffuse region of width $\epsilon$ around the Stokes-Darcy interface. We prove the convergence of the diffuse interface formulation to the standard, sharp interface formulation, and derive the rates of the convergence. This is performed by analyzing the modeling error of the diffuse interface approach at the continuous level, and by deriving the a priori error estimates for the diffuse interface method at the discrete level. The convergence rates are also derived computationally in a numerical example.
A fundamental problem in numerical analysis and approximation theory is approximating smooth functions by polynomials. A much harder version under recent consideration is to enforce bounds constraints on the approximating polynomial. In this paper, we consider the problem of approximating functions by polynomials whose Bernstein coefficients with respect to a given degree satisfy such bounds, which implies such bounds on the approximant. We frame the problem as an inequality-constrained optimization problem and give an algorithm for finding the Bernstein coefficients of the exact solution. Additionally, our method can be modified slightly to include equality constraints such as mass preservation. It also extends naturally to multivariate polynomials over a simplex.
We consider the task of learning the parameters of a {\em single} component of a mixture model, for the case when we are given {\em side information} about that component, we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than solving the overall original problem, where one learns parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each one of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy, and also improved computation complexity than existing moment based mixture model algorithms (e.g. tensor methods). We also illustrate several natural ways one can obtain such side information, for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms showing significant improvement in runtime and accuracy.