
Many analyses of multivariate data focus on evaluating the dependence between two sets of variables, rather than the dependence among individual variables within each set. Canonical correlation analysis (CCA) is a classical data analysis technique that estimates parameters describing the dependence between such sets. However, inference procedures based on traditional CCA rely on the assumption that all variables are jointly normally distributed. We present a semiparametric approach to CCA in which the multivariate margins of each variable set may be arbitrary, but the dependence between variable sets is described by a parametric model that provides low-dimensional summaries of dependence. While maximum likelihood estimation in the proposed model is intractable, we propose two estimation strategies: one using a pseudolikelihood for the model and one using a Markov chain Monte Carlo (MCMC) algorithm that provides Bayesian estimates and confidence regions for the between-set dependence parameters. The MCMC algorithm is derived from a multirank likelihood function, which uses only part of the information in the observed data in exchange for being free of assumptions about the multivariate margins. We apply the proposed Bayesian inference procedure to Brazilian climate data and monthly stock returns from the materials and communications market sectors.
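
For orientation, classical Gaussian-model CCA, the baseline that the semiparametric approach relaxes, computes the canonical correlations as the singular values of the whitened cross-covariance matrix. The sketch below is illustrative only (the function name and toy data are ours, not from the paper):

```python
import numpy as np

def cca_correlations(X, Y):
    """Classical CCA: canonical correlations are the singular values
    of the whitened cross-covariance matrix."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1)
    Syy = Y.T @ Y / (n - 1)
    Sxy = X.T @ Y / (n - 1)
    # Whiten each block with a Cholesky factor; the singular values
    # are invariant to the particular inverse square root chosen.
    Wx = np.linalg.inv(np.linalg.cholesky(Sxx))
    Wy = np.linalg.inv(np.linalg.cholesky(Syy))
    return np.linalg.svd(Wx @ Sxy @ Wy.T, compute_uv=False)

# Toy data: two variable sets sharing one latent factor.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = z + 0.5 * rng.normal(size=(500, 2))
Y = z + 0.5 * rng.normal(size=(500, 3))
rho = cca_correlations(X, Y)  # leading value should be large
```

The semiparametric model described in the abstract keeps these between-set dependence parameters but drops the joint-normality assumption on the margins.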


We consider the task of constructing confidence intervals with differential privacy. We propose two private variants of the non-parametric bootstrap, which privately compute the median of the results of multiple "little" bootstraps run on partitions of the data and give asymptotic bounds on the coverage error of the resulting confidence intervals. For a fixed differential privacy parameter $\epsilon$, our methods enjoy the same error rates as that of the non-private bootstrap to within logarithmic factors in the sample size $n$. We empirically validate the performance of our methods for mean estimation, median estimation, and logistic regression with both real and synthetic data. Our methods achieve similar coverage accuracy to existing methods (and non-private baselines) while providing notably shorter ($\gtrsim 10$ times) confidence intervals than previous approaches.
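
The partition-and-median idea can be sketched as follows. This is an illustrative skeleton, not the authors' algorithm: the helper names, the even budget split, and the discretized exponential-mechanism median over a user-supplied bounded range are all our assumptions, and a real deployment would need careful privacy accounting.

```python
import numpy as np

def exp_mech_median(values, lo, hi, eps, grid_size=1000, rng=None):
    """Exponential-mechanism median over a bounded range [lo, hi].

    The utility of a candidate point is minus the number of values
    separating it from the true median; its sensitivity is 1, so
    sampling with weights exp(eps * u / 2) is eps-DP (given the
    bounded-range assumption).
    """
    if rng is None:
        rng = np.random.default_rng()
    grid = np.linspace(lo, hi, grid_size)
    ranks = np.searchsorted(np.sort(values), grid)
    util = -np.abs(ranks - len(values) / 2)
    w = np.exp(eps * (util - util.max()) / 2)
    return float(rng.choice(grid, p=w / w.sum()))

def private_boot_ci(data, stat, eps, lo, hi, parts=10, boots=50,
                    alpha=0.05, rng=None):
    """Run 'little' bootstraps on disjoint partitions, then release a
    noisy median of each CI endpoint, splitting the budget in two."""
    if rng is None:
        rng = np.random.default_rng()
    lows, highs = [], []
    for chunk in np.array_split(rng.permutation(data), parts):
        reps = [stat(rng.choice(chunk, size=len(chunk)))
                for _ in range(boots)]
        lows.append(np.quantile(reps, alpha / 2))
        highs.append(np.quantile(reps, 1 - alpha / 2))
    return (exp_mech_median(lows, lo, hi, eps / 2, rng=rng),
            exp_mech_median(highs, lo, hi, eps / 2, rng=rng))
```

Because each partition touches disjoint records, only the final median step needs noise, which is what keeps the intervals short.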

Many combinatorial optimization problems can be formulated as the search for a subgraph that satisfies certain properties and minimizes the total weight. We assume here that the vertices correspond to points in a metric space and can take any position in given uncertainty sets. The cost function to be minimized is then the sum of the distances for the worst positions of the vertices in their uncertainty sets. We propose two types of polynomial-time approximation algorithms. The first relies on solving a deterministic counterpart of the problem in which the uncertain distances are replaced with maximum pairwise distances. We study in detail the resulting approximation ratio, which depends on the structure of the feasible subgraphs and on whether the metric space is Ptolemaic. The second algorithm is a fully polynomial-time approximation scheme for the special case of $s$-$t$ paths.
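
For ball-shaped uncertainty sets in Euclidean space, the maximum pairwise distance used by the deterministic counterpart has a closed form, and the $s$-$t$ path special case then reduces to a shortest-path computation on those worst-case weights. A minimal sketch (the ball-shaped sets and function names are illustrative assumptions, not the paper's general setting):

```python
import heapq
import numpy as np

def max_pairwise_dist(c1, r1, c2, r2):
    """Worst-case Euclidean distance between two ball uncertainty sets
    with centers c1, c2 and radii r1, r2 (attained at antipodal
    boundary points)."""
    return float(np.linalg.norm(np.asarray(c1) - np.asarray(c2)) + r1 + r2)

def robust_shortest_path(centers, radii, edges, s, t):
    """Approximate robust s-t path: Dijkstra on the deterministic
    counterpart, with maximum pairwise distances as edge weights."""
    adj = {}
    for u, v in edges:
        d = max_pairwise_dist(centers[u], radii[u], centers[v], radii[v])
        adj.setdefault(u, []).append((v, d))
        adj.setdefault(v, []).append((u, d))
    dist, pq = {s: 0.0}, [(0.0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == t:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, duv in adj.get(u, []):
            nd = d + duv
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")
```

The approximation-ratio analysis in the paper quantifies how much this worst-case-weight surrogate can overestimate the true robust cost.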

Optimization constrained by high-fidelity computational models has potential for transformative impact. However, such optimization is frequently unattainable in practice due to the complexity and computational intensity of the model. An alternative is to optimize a low-fidelity model and use limited evaluations of the high-fidelity model to assess the quality of the solution. This article develops a framework that uses limited high-fidelity simulations to update the optimization solution computed with the low-fidelity model. Building on a previous article [22], which introduced hyper-differential sensitivity analysis with respect to model discrepancy, this article provides novel extensions of the algorithm to enable uncertainty quantification of the optimal solution update via a Bayesian framework. Specifically, we formulate a Bayesian inverse problem to estimate the model discrepancy and propagate the posterior model discrepancy distribution through the post-optimality sensitivity operator for the low-fidelity optimization problem. We provide a rigorous treatment of the Bayesian formulation, a computationally efficient algorithm to compute posterior samples, a guide to specifying and interpreting the algorithm hyperparameters, and a demonstration of the approach on three examples that highlight various types of discrepancy between low- and high-fidelity models.

Popular word embedding methods such as GloVe and Word2Vec are related to the factorization of the pointwise mutual information (PMI) matrix. In this paper, we link correspondence analysis (CA) to the factorization of the PMI matrix. CA is a dimensionality reduction method that uses singular value decomposition (SVD), and we show that CA is mathematically close to the weighted factorization of the PMI matrix. In addition, we present variants of CA that turn out to be successful in the factorization of the word-context matrix, namely CA applied to a matrix whose entries undergo a square-root transformation (ROOT-CA) or a root-root transformation (ROOTROOT-CA). An empirical comparison among CA- and PMI-based methods shows that the overall results of ROOT-CA and ROOTROOT-CA are slightly better than those of the PMI-based methods.
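
A compact sketch of the two ingredients being compared, SVD of a (positive) PMI matrix versus CA of transformed counts, might look as follows; the PPMI clipping and the exact output scaling are illustrative choices on our part, not necessarily those of the paper:

```python
import numpy as np

def pmi_embeddings(counts, dim):
    """Row embeddings from a truncated SVD of the PMI matrix.
    Negative entries are clipped to zero (PPMI), a common stabilization."""
    p = counts / counts.sum()
    pr = p.sum(axis=1, keepdims=True)
    pc = p.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore"):
        pmi = np.log(p / (pr * pc))
    pmi = np.maximum(pmi, 0.0)
    U, s, _ = np.linalg.svd(pmi, full_matrices=False)
    return U[:, :dim] * np.sqrt(s[:dim])

def ca_embeddings(counts, dim, transform=np.sqrt):
    """CA row coordinates from the SVD of standardized residuals,
    applied to transformed counts; transform=np.sqrt gives ROOT-CA,
    and applying it twice would give ROOTROOT-CA."""
    X = transform(counts.astype(float))
    p = X / X.sum()
    r = p.sum(axis=1, keepdims=True)
    c = p.sum(axis=0, keepdims=True)
    S = (p - r @ c) / np.sqrt(r @ c)
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    return (U[:, :dim] * s[:dim]) / np.sqrt(r)
```

Both functions consume the same word-context count matrix, which is what makes the empirical comparison in the abstract direct.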

We consider the application of the generalized Convolution Quadrature (gCQ) to approximate the solution of an important class of sectorial problems. The gCQ is a generalization of Lubich's Convolution Quadrature (CQ) that allows for variable time steps. The available stability and convergence theory for the gCQ requires unrealistic regularity assumptions on the data, which fail in many applications of interest, such as the approximation of subdiffusion equations. It is well known that for insufficiently smooth data the original CQ, with uniform steps, exhibits an order reduction close to the singularity. We generalize the analysis of the gCQ to data satisfying realistic regularity assumptions and provide sufficient conditions for stability and convergence on arbitrary sequences of time points. We consider the particular case of graded meshes and show how to choose them optimally, according to the behaviour of the data. An important advantage of the gCQ method is that it allows for a fast, memory-reduced implementation. We describe how the fast and oblivious gCQ can be implemented and illustrate our theoretical results with several numerical experiments.
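
A standard graded mesh concentrates time points near the $t = 0$ singularity. A minimal sketch, in which the grading exponent `gamma` must be matched to the regularity of the data (the optimal choice is exactly what the paper's analysis makes precise):

```python
import numpy as np

def graded_mesh(T, N, gamma):
    """Graded time mesh t_j = T * (j/N)**gamma on [0, T].

    gamma > 1 clusters points near t = 0, where solutions of
    subdiffusion-type problems are typically singular;
    gamma = 1 recovers the uniform mesh."""
    j = np.arange(N + 1)
    return T * (j / N) ** gamma
```

Because the resulting steps are non-uniform, such meshes fall outside classical CQ but are covered by the gCQ framework described above.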

Differential abundance analysis is a key component of microbiome studies. Although dozens of methods exist, there is currently no consensus on which are preferable. Correctness of results in differential abundance analysis is an ambiguous concept that cannot be evaluated without employing simulated data, but we argue that consistency of results across datasets should be considered an essential quality of a well-performing method. We compared the performance of 14 differential abundance analysis methods on datasets from 54 taxonomic profiling studies based on 16S rRNA gene or shotgun sequencing. For each method, we examined how well the results replicated between random partitions of each dataset and between datasets from independent studies. While certain methods showed good consistency, some widely used methods produced a substantial number of conflicting findings. Overall, the highest consistency without unnecessary loss of sensitivity was attained by analyzing relative abundances with a non-parametric method (Wilcoxon test or ordinal regression model) or with linear regression (MaAsLin2). Comparable performance was also attained by analyzing presence/absence of taxa with logistic regression.
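
The best-performing simple approach, a two-sample Wilcoxon (Mann-Whitney) test on relative abundances taxon by taxon, can be sketched as follows. The function name and data layout are illustrative, and multiple-testing correction (e.g. Benjamini-Hochberg) is left to the caller:

```python
import numpy as np
from scipy.stats import mannwhitneyu

def wilcoxon_daa(counts, groups):
    """Differential abundance via the two-sample Wilcoxon rank-sum
    (Mann-Whitney U) test on relative abundances.

    counts: samples x taxa count matrix.
    groups: boolean vector marking membership in the first group.
    Returns one uncorrected p-value per taxon.
    """
    rel = counts / counts.sum(axis=1, keepdims=True)
    return np.array([
        mannwhitneyu(rel[groups, j], rel[~groups, j]).pvalue
        for j in range(rel.shape[1])
    ])
```

Working on relative abundances rather than raw counts is what makes the per-taxon tests comparable across samples with different sequencing depths.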

The problems of optimal recovery of univariate functions and their derivatives are studied. To solve these problems, two variants of the truncation method are constructed, which are order-optimal both in the sense of accuracy and in the amount of Galerkin information involved. For numerical summation, we establish how the parameters characterizing the problem affect the stability of its solution.

Given a finite set of matrices with integer entries, the matrix mortality problem asks whether some product of these matrices equals the zero matrix. We consider the special case of this problem in which all entries of the matrices are nonnegative. This case is equivalent to the NFA mortality problem: given an NFA, find a word $w$ such that the image of every state under $w$ is the empty set. The size of the alphabet of the NFA is then equal to the number of matrices in the set. We study the length of the shortest such words as a function of the alphabet size. We show that for an NFA with $n$ states this length can be at least $2^n - 1$, $2^{(n - 4)/2}$ and $2^{(n - 2)/3}$ if the size of the alphabet is, respectively, equal to $n$, three and two.
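
The mortality condition itself is easy to check for a given word: propagate the full state set through the NFA and test whether it becomes empty. A small illustrative sketch (the example automaton is ours, not one of the paper's lower-bound constructions):

```python
def image(states, letter, delta):
    """Image of a set of states under one letter, for an NFA with
    transition function delta: (state, letter) -> set of states."""
    out = set()
    for q in states:
        out |= delta.get((q, letter), set())
    return out

def is_killing(word, n_states, delta):
    """True iff the word maps every state to the empty set."""
    states = set(range(n_states))
    for a in word:
        states = image(states, a, delta)
        if not states:
            return True
    return not states

# Two-state NFA over {a, b}: 'b' is undefined from state 1,
# so "ab" funnels everything into state 1 and then kills it.
delta = {(0, "a"): {1}, (1, "a"): {1}, (0, "b"): {0}}
```

The paper's results concern how long the shortest such killing word can be; the exponential lower bounds mean this check, while cheap per word, cannot be turned into a short certificate in general.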

Temporal reasoning with conditionals is more complex than both classical temporal reasoning and reasoning with timeless conditionals, and can lead to some rather counter-intuitive conclusions. For instance, Aristotle's famous "Sea Battle Tomorrow" puzzle leads to a fatalistic conclusion: whether or not there will be a sea battle tomorrow, that outcome is already necessarily the case now. We propose a branching-time logic LTC to formalise reasoning about temporal conditionals and provide that logic with adequate formal semantics. The logic LTC extends the Nexttime fragment of CTL* with operators for model updates, restricting the domain to only those future moments where the antecedent can still be satisfied. We provide formal semantics for these operators that implements the restrictor interpretation of antecedents of temporalized conditionals by suitably restricting the domain of discourse. As a motivating example, we demonstrate that a natural formalisation of the `Sea Battle' argument in our logic renders it unsound, thereby resolving the fatalistic conclusion it entails: the underlying reasoning-by-cases argument no longer applies when the cases are treated not as material implications but as temporal conditionals. On the technical side, we analyze the semantics of LTC and provide a series of reductions of LTC-formulae, first recursively eliminating the dynamic update operators and then the path quantifiers in such formulae. Using these reductions, we obtain a sound and complete axiomatization for LTC and reduce its decision problem to that of the modal logic KD.

We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.
