
In statistics, independent, identically distributed random samples do not carry a natural ordering, and their statistics are typically invariant with respect to permutations of their order. Thus, an $n$-sample in a space $M$ can be considered as an element of the quotient space of $M^n$ modulo the permutation group. The present paper takes this definition of sample space and the related concept of orbit types as a starting point for developing a geometric perspective on statistics. We aim at deriving a general mathematical setting for studying the behavior of empirical and population means in spaces ranging from smooth Riemannian manifolds to general stratified spaces. We fully describe the orbifold and path-metric structure of the sample space when $M$ is a manifold or path-metric space, respectively. These results are non-trivial even when $M$ is Euclidean. We show that the infinite sample space exists in a Gromov-Hausdorff type sense and coincides with the Wasserstein space of probability distributions on $M$. We exhibit Fr\'echet means and $k$-means as metric projections onto 1-skeleta or $k$-skeleta in Wasserstein space, and we define a new and more general notion of polymeans. This geometric characterization via metric projections applies equally to sample and population means, and we use it to establish asymptotic properties of polymeans such as consistency and asymptotic normality.
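To make the projection characterization concrete, here is a minimal sketch (not the paper's construction) of an empirical Fréchet mean: the minimizer of the sum of squared distances to a sample, computed on the circle $S^1$ with its geodesic metric by brute-force grid search. The function names and the grid-search approach are illustrative choices.

```python
# A minimal sketch: the empirical Frechet mean of x_1, ..., x_n in a metric
# space (M, d) minimizes F(p) = sum_i d(p, x_i)^2. Here M is the unit circle
# S^1 with the arc-length (geodesic) distance.
import numpy as np

def arc_distance(a, b):
    """Geodesic distance between angles a and b on the unit circle."""
    diff = np.abs(a - b) % (2 * np.pi)
    return np.minimum(diff, 2 * np.pi - diff)

def frechet_mean_circle(sample, grid_size=10000):
    """Approximate the Frechet mean by grid search over candidate points."""
    candidates = np.linspace(0.0, 2 * np.pi, grid_size, endpoint=False)
    # Frechet functional: mean squared geodesic distance to the sample.
    costs = [(arc_distance(p, sample) ** 2).mean() for p in candidates]
    return candidates[int(np.argmin(costs))]

rng = np.random.default_rng(0)
sample = rng.normal(loc=np.pi / 2, scale=0.3, size=200) % (2 * np.pi)
print("Frechet mean (angle):", frechet_mean_circle(sample))
```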

Related content

The identification of interesting substructures within jets is an important tool for searching for new physics and probing the Standard Model at colliders. Many of these substructure tools have previously been shown to take the form of optimal transport problems, in particular the Energy Mover's Distance (EMD). In this work, we show that the EMD is in fact the natural structure for comparing collider events, which accounts for its recent success in understanding event and jet substructure. We then present a Shape Hunting Algorithm using Parameterized Energy Reconstruction (SHAPER), which is a general framework for defining and computing shape-based observables. SHAPER generalizes N-jettiness from point clusters to any extended, parametrizable shape. This is accomplished by efficiently minimizing the EMD between events and parameterized manifolds of energy flows representing idealized shapes, implemented using the dual-potential Sinkhorn approximation of the Wasserstein metric. We show how the geometric language of observables as manifolds can be used to define novel observables with built-in infrared-and-collinear safety. We demonstrate the efficacy of the SHAPER framework by performing empirical jet substructure studies using several examples of new shape-based observables.
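As a hedged illustration of the computational core, the following sketch approximates the EMD between two discrete, normalized energy flows with entropy-regularized optimal transport via Sinkhorn iterations; it is a toy stand-in for, not a copy of, the SHAPER implementation, and all names and parameter values are illustrative.

```python
# A minimal Sinkhorn sketch: entropy-regularized optimal transport between
# two discrete "energy flows", i.e. weighted point clouds (E_i, x_i) and
# (E'_j, y_j) in the plane.
import numpy as np

def sinkhorn_emd(a, b, C, eps=0.05, n_iter=500):
    """Approximate the EMD <P, C> with entropic regularization eps.
    a, b: nonnegative weight vectors summing to 1; C: pairwise cost matrix."""
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):              # alternating dual-potential updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]      # approximate transport plan
    return np.sum(P * C)

rng = np.random.default_rng(1)
x, y = rng.normal(size=(8, 2)), rng.normal(size=(10, 2))
a = np.full(8, 1 / 8); b = np.full(10, 1 / 10)
C = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)  # ground distance
print("Approximate EMD:", sinkhorn_emd(a, b, C))
```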

We study the problems of sequential nonparametric two-sample and independence testing. Sequential tests process data online and allow using observed data to decide whether to stop and reject the null hypothesis or to collect more data, while maintaining type I error control. We build upon the principle of (nonparametric) testing by betting, where a gambler places bets on future observations and their wealth measures evidence against the null hypothesis. While recently developed kernel-based betting strategies often work well on simple distributions, selecting a suitable kernel for high-dimensional or structured data, such as images, is often nontrivial. To address this drawback, we design prediction-based betting strategies that rely on the following fact: if a sequentially updated predictor starts to consistently determine (a) which distribution an instance is drawn from, or (b) whether an instance is drawn from the joint distribution or the product of the marginal distributions (the latter produced by external randomization), it provides evidence against the two-sample or independence nulls respectively. We empirically demonstrate the superiority of our tests over kernel-based approaches under structured settings. Our tests can be applied beyond the case of independent and identically distributed data, remaining valid and powerful even when the data distribution drifts over time.
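The following is a minimal sketch, not the paper's construction, of a prediction-based betting strategy for the two-sample null: a running nearest-mean classifier (an illustrative choice) predicts which stream each instance came from before its label is revealed. Under the null the label is an independent fair coin, so the wealth process is a nonnegative martingale and rejecting once it exceeds $1/\alpha$ controls the type I error by Ville's inequality.

```python
# A minimal sketch of a prediction-based betting test for the two-sample
# null; all names are illustrative, not the paper's API.
import numpy as np

def sequential_two_sample_test(xs, ys, alpha=0.05, lam=0.5, seed=0):
    rng = np.random.default_rng(seed)
    wealth, mean_x, mean_y = 1.0, 0.0, 0.0
    n_x = n_y = 0
    for t in range(min(len(xs), len(ys))):
        label = rng.integers(2)                      # 1 -> draw from xs
        z = xs[t] if label else ys[t]
        # Predict the label with a running nearest-mean classifier, fitted
        # only on past data (the prediction precedes the reveal).
        guess = int(abs(z - mean_x) < abs(z - mean_y)) if n_x and n_y else rng.integers(2)
        # Under the null, P(guess == label) = 1/2, so wealth is a martingale.
        wealth *= 1.0 + lam * (1.0 if guess == label else -1.0)
        if wealth >= 1.0 / alpha:
            return t + 1, True                       # stop and reject
        # Update the classifier with the revealed label.
        if label: n_x += 1; mean_x += (z - mean_x) / n_x
        else:     n_y += 1; mean_y += (z - mean_y) / n_y
    return min(len(xs), len(ys)), False

rng = np.random.default_rng(42)
print(sequential_two_sample_test(rng.normal(0, 1, 2000), rng.normal(1, 1, 2000)))
```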

We consider the framework of penalized estimation where the penalty term is given by a real-valued polyhedral gauge, which encompasses methods such as LASSO (and many variants thereof such as the generalized LASSO), SLOPE, OSCAR, PACS and others. Each of these estimators can uncover a different structure or ``pattern'' of the unknown parameter vector. We define a general notion of patterns based on subdifferentials and formalize an approach to measure their complexity. For pattern recovery, we provide a minimal condition for a particular pattern to be detected by the procedure with positive probability, the so-called accessibility condition. Using our approach, we also introduce the stronger noiseless recovery condition. For the LASSO, it is well known that the irrepresentability condition is necessary for pattern recovery with probability larger than $1/2$, and we show that the noiseless recovery condition plays exactly the same role, thereby extending and unifying the irrepresentability condition of the LASSO to a broad class of penalized estimators. We show that the noiseless recovery condition can be relaxed when turning to thresholded penalized estimators, extending the idea of the thresholded LASSO: we prove that the accessibility condition is already sufficient (and necessary) for sure pattern recovery by thresholded penalized estimation provided that the signal of the pattern is large enough. Throughout the article, we demonstrate how our findings can be interpreted through a geometrical lens.
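For the classical LASSO special case, the irrepresentability condition mentioned above can be checked numerically. The sketch below is illustrative of that special case only, not of the paper's general framework: it evaluates the statistic $\| X_{S^c}^\top X_S (X_S^\top X_S)^{-1} s_S \|_\infty$ for a candidate support $S$ and sign vector $s_S$, where a value at most $1$ is the necessary condition.

```python
# A minimal sketch for the classical LASSO case: check the irrepresentability
# condition for recovering the sign pattern s_S on a support S.
import numpy as np

def irrepresentability(X, support, signs):
    """Return the irrepresentability statistic; <= 1 is the necessary condition."""
    S = np.asarray(support)
    Sc = np.setdiff1d(np.arange(X.shape[1]), S)
    XS, XSc = X[:, S], X[:, Sc]
    w = np.linalg.solve(XS.T @ XS, np.asarray(signs, dtype=float))
    return np.max(np.abs(XSc.T @ XS @ w))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
stat = irrepresentability(X, support=[0, 1, 2], signs=[1, -1, 1])
print("statistic:", stat, "(condition holds)" if stat <= 1 else "(violated)")
```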

Kernel interpolation is a versatile tool for the approximation of functions from data, and it can be proven to have some optimality properties when used with kernels related to certain Sobolev spaces. In the context of interpolation, the selection of optimal function sampling locations is a central problem, both from a practical perspective and as an interesting theoretical question. Greedy interpolation algorithms provide a viable solution for this task, being efficient to run and provably accurate in their approximation. In this paper we close a gap in the convergence theory for these algorithms by employing a recent result on general greedy algorithms. This modification leads to new convergence rates which match the optimal ones when restricted to the $P$-greedy target-data-independent selection rule, and which can additionally be proven to be optimal when they fully exploit adaptivity ($f$-greedy). Beyond closing this gap, the new results have some significance in the broader setting of the optimality of general approximation algorithms in Reproducing Kernel Hilbert Spaces, as they allow us to compare adaptive interpolation with non-adaptive best nonlinear approximation.
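To fix ideas, here is a minimal sketch of the $P$-greedy selection rule: each step adds the candidate point maximizing the power function $P_X(x)^2 = k(x,x) - k(x,X) K_X^{-1} k(X,x)$, the worst-case pointwise interpolation error over the unit ball of the RKHS. The Gaussian kernel, the regularization jitter, and the candidate grid are illustrative assumptions.

```python
# A minimal sketch of P-greedy point selection (target-data-independent).
import numpy as np

def gauss_kernel(A, B, gamma=2.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def p_greedy(candidates, n_points):
    chosen = [0]                                 # arbitrary first point
    for _ in range(n_points - 1):
        X = candidates[chosen]
        K = gauss_kernel(X, X) + 1e-12 * np.eye(len(chosen))
        kxX = gauss_kernel(candidates, X)
        # Power function squared at every candidate (k(x, x) = 1 here).
        p2 = 1.0 - np.einsum('ij,ij->i', kxX @ np.linalg.inv(K), kxX)
        chosen.append(int(np.argmax(p2)))
    return candidates[chosen]

grid = np.random.default_rng(0).uniform(-1, 1, size=(500, 2))
print(p_greedy(grid, 10))
```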

We study the problem of fairly allocating $m$ indivisible items among $n$ agents. Envy-free allocations, in which each agent prefers her bundle to the bundle of every other agent, need not exist in the worst case. However, when agents have additive preferences and the value $v_{i,j}$ of agent $i$ for item $j$ is drawn independently from a distribution $D_i$, envy-free allocations exist with high probability when $m \in \Omega( n \log n / \log \log n )$. In this paper, we study the existence of envy-free allocations under stochastic valuations far beyond the additive setting. We introduce a new stochastic model in which each agent's valuation is sampled by first fixing a worst-case function, and then drawing a uniformly random renaming of the items, independently for each agent. This strictly generalizes known settings; for example, $v_{i,j} \sim D_i$ may be seen as picking a random (instead of a worst-case) additive function before renaming. We prove that random renaming is sufficient to ensure that envy-free allocations exist with high probability in very general settings. When valuations are non-negative and ``order-consistent,'' a valuation class that generalizes additive, budget-additive, unit-demand, and single-minded agents, SD-envy-free allocations (a stronger notion of fairness than envy-freeness) exist for $m \in \omega(n^2)$ when $n$ divides $m$, and SD-EFX allocations exist for all $m \in \omega(n^2)$. The dependence on $n$ is tight: for $m \in O(n^2)$, with constant probability no envy-free allocation exists. For the case of arbitrary valuations (allowing non-monotone, negative, or mixed-manna valuations) and $n=2$ agents, we prove envy-free allocations exist with probability $1 - \Theta(1/m)$ (and this is tight).
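As a hedged numerical companion covering only the classical additive i.i.d. baseline, not the renaming model introduced above, the following simulation estimates how often the welfare-maximizing allocation, which gives each item to the agent valuing it most, happens to be envy-free as $m$ grows.

```python
# A minimal simulation sketch for the additive i.i.d. baseline: estimate how
# often the welfare-maximizing allocation is envy-free when
# v_{i,j} ~ Uniform[0, 1] independently.
import numpy as np

def is_envy_free(V, owner):
    """V: n x m valuations; owner[j]: agent receiving item j (additive utilities)."""
    n, m = V.shape
    bundles = [np.flatnonzero(owner == i) for i in range(n)]
    # util[i, k] = value agent i assigns to agent k's bundle.
    util = np.array([[V[i, bundles[k]].sum() for k in range(n)] for i in range(n)])
    return all(util[i, i] >= util[i].max() for i in range(n))

def ef_probability(n, m, trials=2000, seed=0):
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        V = rng.uniform(size=(n, m))
        hits += is_envy_free(V, V.argmax(axis=0))   # item j -> its top agent
    return hits / trials

for m in (10, 40, 160):
    print(f"n=4, m={m}: P(EF) ~ {ef_probability(4, m):.3f}")
```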

It often happens that free algebras for a given theory satisfy useful reasoning principles that are not preserved under homomorphisms of algebras, and hence need not hold in an arbitrary algebra. For instance, if $M$ is the free monoid on a set $A$, then the scalar multiplication function $A\times M \to M$ is injective. Therefore, when reasoning in the formal theory of monoids under $A$, it is possible to use this injectivity law to make sound deductions even about monoids under $A$ for which scalar multiplication is not injective -- a principle known in algebra as the permanence of identity. Properties of this kind are of fundamental practical importance to the logicians and computer scientists who design and implement computerized proof assistants like Lean and Coq, as they enable the formal reductions of equational problems that make type checking tractable. As type theories have become increasingly more sophisticated, it has become more and more difficult to establish the useful properties of their free models that enable effective implementation. These obstructions have facilitated a fruitful return to foundational work in type theory, which has taken on a more geometrical flavor than ever before. Here we expose a modern way to prove a highly non-trivial injectivity law for free models of Martin-L\"of type theory, paying special attention to the ways that contemporary methods in type theory have been influenced by three important ideas of the Grothendieck school: the relative point of view, the language of universes, and the recollement of generalized spaces.
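The free-monoid example can be made concrete in a few lines; the sketch below is a toy illustration of that opening example only, not of the paper's type-theoretic development. It realizes the free monoid on a set as lists and exhibits the injectivity of scalar multiplication.

```python
# The free monoid on A realized as lists over A: scalar multiplication
# A x M -> M is (a, m) |-> [a] + m, and it is injective because a nonempty
# list determines its head and tail uniquely.
def scalar_mul(a, m):
    return [a] + m

# Distinct arguments give distinct results, in either coordinate.
assert scalar_mul('x', ['y']) != scalar_mul('z', ['y'])
assert scalar_mul('x', ['y']) != scalar_mul('x', ['z'])
# In a non-free monoid, e.g. one satisfying the extra relation x * y = x,
# the same operation identifies distinct arguments: injectivity is a
# property of the free model, not of the theory, so homomorphisms need
# not preserve it.
```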

We investigate a fundamental vertex-deletion problem called (Induced) Subgraph Hitting: given a graph $G$ and a set $\mathcal{F}$ of forbidden graphs, the goal is to compute a minimum-sized set $S$ of vertices of $G$ such that $G-S$ does not contain any graph in $\mathcal{F}$ as an (induced) subgraph. This is a generic problem that encompasses many well-known problems that were extensively studied on their own, particularly (but not only) from the perspectives of both approximation and parameterization. In this paper, we study the approximability of the problem on a large variety of graph classes. Our first result is a linear-time $(1+\varepsilon)$-approximation reduction from (Induced) Subgraph Hitting on any graph class $\mathcal{G}$ of bounded expansion to the same problem on bounded degree graphs within $\mathcal{G}$. This directly yields linear-size $(1+\varepsilon)$-approximation lossy kernels for the problems on any bounded-expansion graph classes. Our second result is a linear-time approximation scheme for (Induced) Subgraph Hitting on any graph class $\mathcal{G}$ of polynomial expansion, based on the local-search framework of Har-Peled and Quanrud [SICOMP 2017]. This approximation scheme can be applied to a more general family of problems that aim to hit all subgraphs satisfying a certain property $\pi$ that is efficiently testable and has bounded diameter. Both of our results have applications to Subgraph Hitting (not induced) on wide classes of geometric intersection graphs, resulting in linear-size lossy kernels and (near-)linear time approximation schemes for the problem.
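To ground the problem statement, here is a minimal greedy baseline for the special case $\mathcal{F} = \{\text{triangle}\}$; it is an illustrative heuristic with no approximation guarantee, not the local-search scheme of the paper.

```python
# A minimal greedy baseline for Subgraph Hitting with F = {triangle}:
# repeatedly delete the vertex in the most remaining triangles.
from itertools import combinations

def triangles(adj):
    return [t for t in combinations(adj, 3)
            if t[1] in adj[t[0]] and t[2] in adj[t[0]] and t[2] in adj[t[1]]]

def greedy_triangle_hitting(adj):
    adj = {u: set(vs) for u, vs in adj.items()}
    deleted = []
    while (tris := triangles(adj)):
        counts = {u: sum(u in t for t in tris) for u in adj}
        v = max(counts, key=counts.get)      # vertex hitting most triangles
        deleted.append(v)
        adj.pop(v)
        for nbrs in adj.values():
            nbrs.discard(v)
    return deleted

# Two triangles sharing vertex 0: deleting 0 hits both.
graph = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {0, 3}}
print(greedy_triangle_hitting(graph))   # -> [0]
```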

Knowledge graph embedding (KGE) is an increasingly popular technique that aims to represent entities and relations of knowledge graphs in low-dimensional semantic spaces for a wide spectrum of applications such as link prediction, knowledge reasoning and knowledge completion. In this paper, we provide a systematic review of existing KGE techniques based on representation spaces. In particular, we build a fine-grained classification that categorises the models according to three mathematical perspectives on the representation spaces: (1) the algebraic perspective, (2) the geometric perspective, and (3) the analytical perspective. We introduce rigorous definitions of the fundamental mathematical spaces before diving into KGE models and their mathematical properties. We further discuss different KGE methods across the three categories, and summarise how the advantages of each space serve different embedding needs. By collating experimental results from downstream tasks, we also explore the advantages of each mathematical space in different scenarios and the reasons behind them. Finally, we outline some promising research directions from a representation space perspective, which we hope will inspire researchers to design KGE models, and their related applications, with greater consideration of the properties of the underlying mathematical spaces.
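As one concrete instance of a geometrically motivated model in this taxonomy, the sketch below scores triples with TransE, which models a relation as a translation in $\mathbb{R}^d$; the embeddings are random stand-ins rather than trained ones, and all names are illustrative.

```python
# A minimal sketch of TransE: a relation is a translation in R^d, and a
# triple (h, r, t) is scored by how close h + r lands to t.
import numpy as np

rng = np.random.default_rng(0)
d = 50
entities = {name: rng.normal(size=d)
            for name in ("paris", "france", "berlin", "germany")}
relations = {"capital_of": rng.normal(size=d)}

def transe_score(h, r, t):
    """Higher (less negative) score = more plausible triple."""
    return -np.linalg.norm(entities[h] + relations[r] - entities[t])

# Link prediction: rank candidate tails for (paris, capital_of, ?).
candidates = sorted(entities, key=lambda t: -transe_score("paris", "capital_of", t))
print(candidates)
```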

The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods. Indeed, many high-dimensional learning tasks previously thought to be beyond reach -- such as computer vision, playing Go, or protein folding -- are in fact feasible with appropriate computational scale. Remarkably, the essence of deep learning is built from two simple algorithmic principles: first, the notion of representation or feature learning, whereby adapted, often hierarchical, features capture the appropriate notion of regularity for each task, and second, learning by local gradient-descent type methods, typically implemented as backpropagation. While learning generic functions in high dimensions is a cursed estimation problem, most tasks of interest are not generic, and come with essential pre-defined regularities arising from the underlying low-dimensionality and structure of the physical world. This text is concerned with exposing these regularities through unified geometric principles that can be applied throughout a wide spectrum of applications. Such a 'geometric unification' endeavour, in the spirit of Felix Klein's Erlangen Program, serves a dual purpose: on one hand, it provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers. On the other hand, it gives a constructive procedure to incorporate prior physical knowledge into neural architectures and provides a principled way to build future architectures yet to be invented.
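A minimal sketch of the symmetry principle at work, assuming a Deep-Sets-style architecture as the example: a shared per-element map followed by sum pooling is permutation-invariant by construction, so the symmetry of the data is built into the hypothesis class rather than learned.

```python
# A minimal sketch of a permutation-invariant architecture: shared per-element
# features, then sum pooling, which is blind to the input ordering.
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(3, 8)), rng.normal(size=(8, 4))

def set_network(X):
    """X: (n_elements, 3). Shared map per element, then invariant aggregation."""
    h = np.maximum(X @ W1, 0.0)      # shared ReLU feature map
    return h.sum(axis=0) @ W2        # sum pooling kills the ordering

X = rng.normal(size=(5, 3))
perm = rng.permutation(5)
assert np.allclose(set_network(X), set_network(X[perm]))  # invariance holds
```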

Recent years have witnessed the enormous success of low-dimensional vector space representations of knowledge graphs to predict missing facts or find erroneous ones. Currently, however, it is not yet well-understood how ontological knowledge, e.g. given as a set of (existential) rules, can be embedded in a principled way. To address this shortcoming, in this paper we introduce a framework based on convex regions, which can faithfully incorporate ontological knowledge into the vector space embedding. Our technical contribution is two-fold. First, we show that some of the most popular existing embedding approaches are not capable of modelling even very simple types of rules. Second, we show that our framework can represent ontologies that are expressed using so-called quasi-chained existential rules in an exact way, such that any set of facts which is induced using that vector space embedding is logically consistent and deductively closed with respect to the input ontology.
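As a hedged illustration of the general idea, the sketch below uses axis-aligned boxes, one particularly simple family of convex regions and not the paper's construction: a rule $A \to B$ is modelled as containment of the region for $A$ in the region for $B$, which forces every embedded fact $A(x)$ to entail $B(x)$.

```python
# A minimal sketch of rule-respecting region embeddings with axis-aligned
# boxes: a concept is a box, a fact holds when the entity's vector lies in
# the concept's box, and a rule A -> B is modelled as box containment.
import numpy as np

class Box:
    def __init__(self, lower, upper):
        self.lower, self.upper = np.asarray(lower), np.asarray(upper)
    def contains_point(self, x):
        return bool(np.all(self.lower <= x) and np.all(x <= self.upper))
    def contains_box(self, other):
        return bool(np.all(self.lower <= other.lower)
                    and np.all(other.upper <= self.upper))

dog    = Box([0.2, 0.2], [0.4, 0.4])
mammal = Box([0.0, 0.0], [0.5, 0.5])
assert mammal.contains_box(dog)          # rule: Dog(x) -> Mammal(x)

rex = np.array([0.3, 0.25])
# Containment makes the rule hold for every entity placed in 'dog'.
assert dog.contains_point(rex) and mammal.contains_point(rex)
```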
