Any discrete distribution with support on $\{0,\ldots, d\}$ can be constructed as the distribution of a sum of Bernoulli variables. We prove that the class of $d$-dimensional Bernoulli variables $\boldsymbol{X}=(X_1,\ldots, X_d)$ whose sums $\sum_{i=1}^dX_i$ have the same distribution $p$ is a convex polytope $\mathcal{P}(p)$, and we find its extremal points analytically. Our main result is that the Hausdorff measure of the polytopes $\mathcal{P}(p)$, $p\in \mathcal{D}_d$, is a continuous function $l(p)$ over $\mathcal{D}_d$ and is the density of a finite measure $\mu_s$ on $\mathcal{D}_d$ that is absolutely continuous with respect to the Hausdorff measure. We also prove that the measure $\mu_s$, normalized over the simplex $\mathcal{D}_d$, belongs to the class of Dirichlet distributions. We observe that the symmetric binomial distribution is the mean of this Dirichlet distribution on $\mathcal{D}_d$ and that, as $d$ increases, the mean converges to the mode.
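As a concrete illustration of the opening claim (a minimal sketch of ours, not taken from the paper): given a target law $p$ on $\{0,\ldots,d\}$, drawing $S \sim p$ and then placing $S$ ones uniformly at random yields an exchangeable Bernoulli vector whose sum has distribution $p$, i.e., one point of $\mathcal{P}(p)$.

```python
import numpy as np

def sample_bernoulli_with_sum_law(p, d, rng):
    """Draw X = (X_1, ..., X_d) in {0,1}^d whose sum S = X_1 + ... + X_d
    has the prescribed distribution p on {0, ..., d}.

    This exchangeable construction (draw S ~ p, then place S ones
    uniformly at random) realizes one element of the polytope P(p)."""
    s = rng.choice(d + 1, p=p)                     # S ~ p
    x = np.zeros(d, dtype=int)
    x[rng.choice(d, size=s, replace=False)] = 1    # uniform subset of size S
    return x

# example: p uniform on {0,1,2,3}, d = 3
rng = np.random.default_rng(0)
p = np.full(4, 0.25)
draws = np.array([sample_bernoulli_with_sum_law(p, 3, rng) for _ in range(10000)])
print(np.bincount(draws.sum(axis=1), minlength=4) / 10000)  # approximately p
```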
We systematically investigate the preservation of differential privacy in functional data analysis, beginning with functional mean estimation and extending to varying coefficient model estimation. Our work introduces a distributed learning framework involving multiple servers, each responsible for collecting several sparsely observed functions. This hierarchical setup introduces a mixed notion of privacy. Within each function, user-level differential privacy is applied to $m$ discrete observations. At the server level, central differential privacy is deployed to account for the centralised nature of data collection. Across servers, only private information is exchanged, adhering to federated differential privacy constraints. To address this complex hierarchy, we employ minimax theory to reveal several fundamental phenomena: the transition from sparse to dense regimes of functional data analysis, the respective costs of user-level, central, and federated differential privacy, and the intricate interplay between the regimes of functional data analysis and privacy preservation. To the best of our knowledge, this is the first study to rigorously examine functional data estimation under multiple privacy constraints. Our theoretical findings are complemented by efficient private algorithms and extensive numerical evidence, providing a comprehensive exploration of this challenging problem.
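To make the hierarchy concrete, here is a minimal scalar sketch (our simplification, not the paper's functional estimator; all names and parameters are illustrative): each server privatizes a clipped mean of its users' summaries with Laplace noise, and only those privatized estimates are exchanged across servers.

```python
import numpy as np

def private_server_mean(values, clip=1.0, epsilon=1.0, rng=None):
    """Central-DP estimate of one server's mean: clip each user summary
    to [-clip, clip] and add Laplace noise calibrated to the sensitivity
    2*clip/n of the clipped mean (pure epsilon-DP)."""
    rng = rng or np.random.default_rng()
    values = np.clip(np.asarray(values), -clip, clip)
    n = len(values)
    return values.mean() + rng.laplace(scale=2 * clip / (n * epsilon))

def federated_mean(servers, **kwargs):
    """Across servers, only the privatized per-server estimates move."""
    return float(np.mean([private_server_mean(v, **kwargs) for v in servers]))

# three servers, each holding noisy user summaries of the same signal
rng = np.random.default_rng(0)
servers = [0.3 + 0.1 * rng.standard_normal(500) for _ in range(3)]
print(federated_mean(servers, clip=1.0, epsilon=1.0))  # approximately 0.3
```

A functional version would privatize basis coefficients of each curve in the same fashion, which is where the sparse-versus-dense phenomena enter.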
One can recover vectors from $\mathbb{R}^m$ with arbitrary precision, using only $\lceil \log_2(m+1)\rceil +1$ continuous measurements that are chosen adaptively. This surprising result is explained and discussed, and we present applications to infinite-dimensional approximation problems.
Stable distributions are a celebrated class of probability laws used in various fields. The $\alpha$-stable process, and its exponentially tempered counterpart, the Classical Tempered Stable (CTS) process, are also prominent examples of L\'evy processes. Simulating these processes is critical for many applications, yet it remains computationally challenging due to their infinite jump activity. This survey provides an overview of the key properties of these objects, offering a roadmap for practitioners. The first part reviews the stability property; sampling algorithms are provided along with numerical illustrations. CTS processes are then presented, with the Baeumer-Meerschaert algorithm for increment simulation, and a computational analysis is provided with numerical illustrations across different time scales.
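For the stability part, the classical Chambers-Mallows-Stuck sampler is easy to state in the symmetric case ($\beta = 0$); the survey's algorithms also cover the skewed and tempered variants. A minimal sketch:

```python
import numpy as np

def symmetric_stable(alpha, size=1, rng=None):
    """Chambers-Mallows-Stuck sampler for symmetric alpha-stable draws
    (skewness beta = 0, unit scale); valid for 0 < alpha <= 2, alpha != 1.
    At alpha = 1 the symmetric law is Cauchy, sampled as tan(U)."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)   # U ~ Unif(-pi/2, pi/2)
    w = rng.exponential(1.0, size)                 # W ~ Exp(1), independent
    return (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
            * (np.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))

# sanity check: alpha = 2 reduces to a Gaussian with variance 2
x = symmetric_stable(2.0, 100000)
print(x.var())  # approximately 2
```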
We consider the problem of estimating the error when solving a system of differential-algebraic equations. Richardson extrapolation is a classical technique that can be used to judge when computational errors are irrelevant and to estimate the discretization error. We have simulated molecular dynamics with constraints using the GROMACS library and found that the output is not always amenable to Richardson extrapolation. We derive and illustrate Richardson extrapolation using a variety of numerical experiments, and we identify two necessary conditions that are not always satisfied by the GROMACS library.
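In its simplest form, Richardson extrapolation assumes the computed value admits an expansion $A(h) = A + Ch^p + O(h^{p+1})$ in the step size $h$ and that rounding errors are negligible; two runs at steps $h$ and $h/2$ then give the error estimate $A - A(h/2) \approx (A(h/2) - A(h))/(2^p - 1)$. A self-contained sketch (ours, with explicit Euler, $p = 1$):

```python
import numpy as np

def euler(f, y0, t1, n):
    """Explicit Euler with n steps on [0, t1]; global order p = 1."""
    y, h = y0, t1 / n
    for _ in range(n):
        y = y + h * f(y)
    return y

# Richardson error estimate for the fine solution:
# A - A(h/2) ~ (A(h/2) - A(h)) / (2^p - 1)
f = lambda y: -y                      # y' = -y, exact solution exp(-t)
coarse = euler(f, 1.0, 1.0, 100)      # step h
fine = euler(f, 1.0, 1.0, 200)        # step h/2
p = 1
est = (fine - coarse) / (2 ** p - 1)
true_err = np.exp(-1.0) - fine
print(est, true_err)                  # the two should roughly agree
```

When the expansion fails, or rounding noise dominates, the estimate is meaningless; conditions of this kind are what the abstract alludes to.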
We study a cost-aware programming language for higher-order recursion dubbed $\textbf{PCF}_\mathsf{cost}$ in the setting of synthetic domain theory (SDT). Our main contribution relates the denotational cost semantics of $\textbf{PCF}_\mathsf{cost}$ to its computational cost semantics, a new kind of dynamic semantics for program execution that serves as a mathematically natural alternative to operational semantics in SDT. In particular we prove an internal, cost-sensitive version of Plotkin's computational adequacy theorem, giving a precise correspondence between the denotational and computational semantics for complete programs at base type. The constructions and proofs of this paper take place in the internal dependent type theory of an SDT topos extended by a phase distinction in the sense of Sterling and Harper. By controlling the interpretation of cost structure via the phase distinction in the denotational semantics, we show that $\textbf{PCF}_\mathsf{cost}$ programs also evince a noninterference property of cost and behavior. We verify the axioms of the type theory by means of a model construction based on relative sheaf models of SDT.
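As a loose, purely illustrative analogue of a computational cost semantics (nothing like the paper's internal SDT construction), one can instrument an evaluator for a PCF-like fragment so that it returns value-cost pairs, charging one unit per successor and per $\beta$-step; recursion is omitted for brevity.

```python
# terms: ("num", n), ("var", x), ("lam", x, body), ("app", f, a), ("suc", e)
def eval_cost(t, env):
    """Evaluate a term, returning a (value, cost) pair."""
    tag = t[0]
    if tag == "num":                       # numeral: free
        return t[1], 0
    if tag == "var":                       # variable lookup: free
        return env[t[1]], 0
    if tag == "lam":                       # build a closure: free
        return ("clo", t[1], t[2], env), 0
    if tag == "suc":                       # successor: one step
        v, c = eval_cost(t[1], env)
        return v + 1, c + 1
    if tag == "app":                       # beta-reduction: one step
        (_, x, body, cenv), c1 = eval_cost(t[1], env)
        a, c2 = eval_cost(t[2], env)
        v, c3 = eval_cost(body, {**cenv, x: a})
        return v, c1 + c2 + c3 + 1
    raise ValueError(tag)

# (lam x. suc (suc x)) 2  ==> value 4 at cost 3 (two sucs + one beta)
term = ("app", ("lam", "x", ("suc", ("suc", ("var", "x")))), ("num", 2))
print(eval_cost(term, {}))  # (4, 3)
```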
Clustering and outlier detection are two important tasks in data mining. Outliers frequently interfere with how clustering algorithms measure the similarity between objects, resulting in unreliable clustering results. Currently, only a few clustering algorithms (e.g., DBSCAN) can detect outliers and thereby eliminate this interference. For other clustering algorithms, it is tedious to run a separate outlier detection step before each clustering process. Equipping more clustering algorithms with outlier detection ability is therefore worthwhile. Although a common strategy lets clustering algorithms detect outliers based on the distance between objects and clusters, it conflicts with improving the performance of clustering algorithms on datasets with outliers. In this paper, we propose a novel outlier detection approach, called ODAR, for clustering. ODAR maps outliers and normal objects into two separated clusters via feature transformation, so any clustering algorithm can detect outliers by identifying the corresponding cluster. Experiments show that ODAR is robust across diverse datasets. With the help of ODAR, the clustering algorithms achieve the best results on 7 out of 10 datasets compared with baseline methods, with accuracy improvements of at least 5%.
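The abstract does not spell out ODAR's transformation, but the general recipe can be sketched with a generic stand-in feature (our choice, not the authors'): map each object to its mean distance to its $k$ nearest neighbours, so that outliers and inliers form two separable clusters for any off-the-shelf clusterer.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.cluster import KMeans

def knn_outlier_features(X, k=10):
    """Map each object to its mean distance to its k nearest neighbours;
    outliers get large values, so a 2-cluster split can flag them."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, _ = nn.kneighbors(X)             # first column is the point itself
    return dist[:, 1:].mean(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)),      # inliers
               rng.uniform(-8, 8, (10, 2))])    # scattered outliers
F = knn_outlier_features(X)
labels = KMeans(n_clusters=2, n_init=10).fit_predict(F)
outlier_label = labels[F[:, 0].argmax()]        # cluster of the largest feature
print(np.where(labels == outlier_label)[0])     # indices flagged as outliers
```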
We develop graph-based tests for spherical symmetry of a multivariate distribution using a method based on data augmentation. These tests are constructed using a new notion of signs and ranks that are computed along a path obtained by optimizing an objective function based on pairwise dissimilarities among the observations in the augmented data set. The resulting tests based on these signs and ranks are exactly distribution-free: irrespective of the dimension of the data, the null distributions of the test statistics remain the same. These tests can be conveniently used for high-dimensional data, even when the dimension is much larger than the sample size. Under appropriate regularity conditions, we prove the consistency of these tests in the high-dimensional asymptotic regime, where the dimension grows to infinity while the sample size may or may not grow with it. We also propose a generalization of our methods to handle situations where the center of symmetry is not specified by the null hypothesis. Several simulated data sets and a real data set are analyzed to demonstrate the utility of the proposed tests.
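A toy stand-in for the general recipe (our simplification; the paper's sign/rank construction along an optimized path is more refined): augment the data with draws from a spherical reference, order all points along a heuristic short path through the pairwise distances, and calibrate a run-type statistic by permutation.

```python
import numpy as np
from scipy.spatial.distance import cdist

def greedy_path(D):
    """Heuristic short Hamiltonian path: repeatedly hop to the nearest
    unvisited point (a stand-in for the paper's optimized path)."""
    path, free = [0], set(range(1, len(D)))
    while free:
        nxt = min(free, key=lambda j: D[path[-1], j])
        path.append(nxt)
        free.remove(nxt)
    return np.array(path)

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))          # observed data (here truly spherical)
Y = rng.standard_normal((50, 20))          # augmented spherical reference
Z, labels = np.vstack([X, Y]), np.r_[np.zeros(50), np.ones(50)]
order = greedy_path(cdist(Z, Z))

runs = lambda lab: int((np.diff(lab) != 0).sum()) + 1
obs = runs(labels[order])
null = [runs(rng.permutation(labels)[order]) for _ in range(2000)]
p_value = np.mean([r <= obs for r in null])  # few runs => samples separate
print(obs, p_value)
```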
Under a multinormal distribution with an arbitrary unknown covariance matrix, the main purpose of this paper is to propose a framework that reconciles Bayesian, frequentist, and Fisherian reporting of $p$-values, Neyman-Pearson's optimal theory, and Wald's decision theory for the problems of testing the mean against restricted alternatives (closed convex cones). To proceed, we study the tests constructed via the likelihood ratio (LR) and the union-intersection (UI) principles. For the problems of testing against restricted alternatives, we first show that the LRT and the UIT are not proper Bayes tests; however, they are shown to be the integrated LRT and the integrated UIT, respectively. For the problem of testing against the positive orthant alternative, the null distributions of both the LRT and the UIT depend on the unknown nuisance covariance matrix, so Fisher's approach of reporting $p$-values is difficult to adopt. On the other hand, according to the definition of the level of significance, both the LRT and the UIT are shown to be power-dominated by the corresponding LRT and UIT for testing against the half-space alternative; hence both are $\alpha$-inadmissible, a result that runs against common statistical sense. Neither Fisher's approach of reporting $p$-values alone nor Neyman-Pearson's optimal theory for the power function alone is a satisfactory criterion for evaluating the performance of tests. Wald's decision theory via $d$-admissibility may shed light on resolving these challenging issues by balancing type I error and power.
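The difficulty is easiest to see against the simplified known-covariance case, which can be simulated directly (our illustration; the paper treats unknown covariance, where the null law depends on the nuisance parameter). For testing $\mu = 0$ against the positive orthant $\mu \ge 0$ with $\Sigma = I$, the LRT statistic is $n\sum_i \max(\bar X_i, 0)^2$, whose null law is a chi-bar-squared mixture:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, B = 3, 50, 20000

def lrt_orthant(xbar, n):
    """LRT statistic for H0: mu = 0 vs H1: mu >= 0 (known Sigma = I):
    the MLE under H1 is the positive part of the sample mean."""
    return n * np.sum(np.maximum(xbar, 0.0) ** 2)

# null distribution (a chi-bar-squared mixture) by simulation:
# under H0 the sample mean is N(0, I/n)
null = np.array([lrt_orthant(rng.normal(0, 1 / np.sqrt(n), d), n)
                 for _ in range(B)])
obs = lrt_orthant(rng.normal(0.3, 1 / np.sqrt(n), d), n)  # data with mu = 0.3
print("p-value ~", (null >= obs).mean())
```

With $\Sigma$ unknown, no such fixed reference distribution exists, which is precisely the obstruction to reporting $p$-values discussed above.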
Many articles have recently been devoted to Mahler equations, partly because of their links with other branches of mathematics such as automata theory. Hahn series (a generalization of the Puiseux series allowing arbitrary exponents of the indeterminate as long as the set supporting them is well-ordered) play a central role in the theory of Mahler equations. In this paper, we address the following fundamental question: is there an algorithm to compute the Hahn series solutions of a given linear Mahler equation? What makes this question interesting is that the Hahn series appearing in this context can have complicated supports with infinitely many accumulation points. Our (positive) answer to this question involves, among other things, the construction of a computable well-ordered receptacle for the supports of the potential Hahn series solutions.
We prove new complexity results for computational problems in certain wreath products of groups and (as an application) for free solvable groups. For a finitely generated group we study the so-called power word problem (does a given expression $u_1^{k_1} \ldots u_d^{k_d}$, where $u_1, \ldots, u_d$ are words over the group generators and $k_1, \ldots, k_d$ are binary encoded integers, evaluate to the group identity?) and the knapsack problem (does a given equation $u_1^{x_1} \ldots u_d^{x_d} = v$, where $u_1, \ldots, u_d,v$ are words over the group generators and $x_1,\ldots,x_d$ are variables, have a solution in the natural numbers?). We prove that the power word problem for wreath products of the form $G \wr \mathbb{Z}$ with $G$ nilpotent, and for iterated wreath products of free abelian groups, belongs to $\mathsf{TC}^0$. As an application of the latter, the power word problem for free solvable groups is in $\mathsf{TC}^0$. On the other hand, we show that for wreath products $G \wr \mathbb{Z}$, where $G$ is a so-called uniformly strongly efficiently non-solvable group (these form a large subclass of non-solvable groups), the power word problem is $\mathsf{coNP}$-hard. For the knapsack problem we show $\mathsf{NP}$-completeness for iterated wreath products of free abelian groups, and hence for free solvable groups. Moreover, the knapsack problem for every wreath product $G \wr \mathbb{Z}$, where $G$ is uniformly strongly efficiently non-solvable, is $\Sigma^p_2$-hard.
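To see why the binary encoding of the $k_i$ is the crux, consider the toy case of a free abelian group $\mathbb{Z}^r$ (far simpler than the wreath products treated above): a word reduces to its exponent vector, so a power word can be evaluated without ever expanding the powers.

```python
# Power word problem in the free abelian group Z^r: a word is just its
# exponent vector, so u_1^{k_1} ... u_d^{k_d} evaluates to sum_i k_i * v_i.
# The k_i may be astronomically large (binary-encoded) without expanding
# the word; Python integers are arbitrary precision. Toy case only.

def word_to_vector(word, r):
    """word: list of (generator index, +1/-1) letters."""
    v = [0] * r
    for g, s in word:
        v[g] += s
    return v

def power_word_is_identity(factors, r):
    """factors: list of (word, k). Decide u_1^{k_1}...u_d^{k_d} = 1 in Z^r."""
    total = [0] * r
    for word, k in factors:
        v = word_to_vector(word, r)
        total = [t + k * x for t, x in zip(total, v)]
    return all(t == 0 for t in total)

# (a b a^-1)^(2^100) * b^(-2^100) is the identity in Z^2
w = [(0, 1), (1, 1), (0, -1)]   # a b a^-1, exponent vector (0, 1)
print(power_word_is_identity([(w, 2**100), ([(1, -1)], 2**100)], 2))  # True
```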