The conditional moment problem is a powerful formulation for describing structural causal parameters in terms of observables, a prominent example being instrumental variable regression. A standard approach reduces the problem to a finite set of marginal moment conditions and applies the optimally weighted generalized method of moments (OWGMM), but this requires that we know a finite set of identifying moments, can still be inefficient even when those moments are identifying, or can be theoretically efficient but practically unwieldy if we use a growing sieve of moment conditions. Motivated by a variational minimax reformulation of OWGMM, we define a very general class of estimators for the conditional moment problem, which we term the variational method of moments (VMM) and which naturally enables controlling infinitely many moments. We provide a detailed theoretical analysis of multiple VMM estimators, including ones based on kernel methods and neural nets, and provide conditions under which these are consistent, asymptotically normal, and semiparametrically efficient in the full conditional moment model. We additionally provide algorithms for valid statistical inference based on the same kind of variational reformulation, for both the kernel- and neural-net-based varieties. Finally, we demonstrate the strong performance of our proposed estimation and inference algorithms in a detailed series of synthetic experiments.
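To fix notation for the setting described above (a standard formulation, not a restatement of the paper's exact objective): writing $\psi(Z;\theta)$ for the residual function, the conditional moment restriction $\mathbb{E}[\psi(Z;\theta_0)\mid X]=0$ a.s. is equivalent to the continuum of unconditional moments
\[
\mathbb{E}\big[f(X)^\top \psi(Z;\theta_0)\big]=0 \quad \text{for all bounded, measurable } f,
\]
so a finite-moment GMM reduction fixes finitely many such $f$, whereas a variational estimator instead takes a supremum over a rich function class $\mathcal{F}$ (e.g., an RKHS ball or a neural-network class), schematically
\[
\hat\theta \in \arg\min_\theta\, \sup_{f\in\mathcal{F}}\; \mathbb{E}_n\big[f(X)^\top \psi(Z;\theta)\big] - \text{(regularization of } f\text{)}.
\]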
The Shannon entropy of a random variable $X$ exhibits behaviour closely analogous to that of a signed measure. Previous work has made this connection concrete by defining a signed measure $\mu$ on an abstract information space $\tilde{X}$, which is taken to represent the information that $X$ contains. This construction suffices to derive many measure-theoretic counterparts of information quantities, such as the mutual information $I(X; Y) = \mu(\tilde{X} \cap \tilde{Y})$, the joint entropy $H(X,Y) = \mu(\tilde{X} \cup \tilde{Y})$, and the conditional entropy $H(X|Y) = \mu(\tilde{X}\, \setminus \, \tilde{Y})$. We demonstrate that there exists a much finer decomposition with intuitive properties, which we call the logarithmic decomposition (LD). We show that the resulting signed measure space has the useful property that its logarithmic atoms are easily characterised as having negative or positive entropy, while also being coherent with Yeung's $I$-measure. We demonstrate the utility of our approach by re-examining the G\'acs-K\"orner common information from this new geometric perspective and characterising it in terms of our logarithmic atoms. We then highlight that our geometric refinement can account for an entire class of information quantities, which we call logarithmically decomposable quantities.
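For concreteness, the quoted correspondences are tied together by inclusion–exclusion: with $\mu(\tilde{X})=H(X)$, $\mu(\tilde{Y})=H(Y)$, and $\mu(\tilde{X} \cup \tilde{Y})=H(X,Y)$, finite additivity of the signed measure gives
\[
\mu(\tilde{X} \cap \tilde{Y}) = \mu(\tilde{X}) + \mu(\tilde{Y}) - \mu(\tilde{X} \cup \tilde{Y}) = H(X) + H(Y) - H(X,Y) = I(X;Y),
\]
\[
\mu(\tilde{X} \setminus \tilde{Y}) = \mu(\tilde{X} \cup \tilde{Y}) - \mu(\tilde{Y}) = H(X,Y) - H(Y) = H(X|Y),
\]
which is the sense in which entropy behaves like a measure that may become signed once three or more variables are involved.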
Deep learning models, including modern systems like large language models, are well known to offer unreliable estimates of the uncertainty of their decisions. To improve a model's calibration, i.e., the quality of its confidence levels, common approaches add either data-dependent or data-independent regularization terms to the training loss. Data-dependent regularizers have recently been introduced in the context of conventional frequentist learning to penalize deviations between confidence and accuracy. In contrast, data-independent regularizers are at the core of Bayesian learning, enforcing adherence of the variational distribution in the model parameter space to a prior density. The former approach is unable to quantify epistemic uncertainty, while the latter is severely affected by model misspecification. In light of the limitations of both methods, this paper proposes an integrated framework, referred to as calibration-aware Bayesian neural networks (CA-BNNs), that applies both regularizers while optimizing over a variational distribution as in Bayesian learning. Numerical results validate the advantages of the proposed approach in terms of expected calibration error (ECE) and reliability diagrams.
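Schematically (with the precise penalties and weights left to the paper), the integrated criterion optimizes a variational distribution $q$ over model parameters $\theta$ via
\[
\min_{q}\; \mathbb{E}_{\theta\sim q}\big[\mathcal{L}_{\text{data}}(\theta) + \lambda\,\mathcal{R}_{\text{cal}}(\theta)\big] + \mathrm{KL}\big(q \,\|\, p\big),
\]
where $\mathcal{L}_{\text{data}}$ is the training loss, $\mathcal{R}_{\text{cal}}$ a data-dependent penalty on confidence–accuracy mismatch, $p$ the prior, and $\lambda$ a weighting factor; this combines the data-dependent regularizer of calibration-aware frequentist learning with the data-independent (KL) regularizer of Bayesian learning.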
A Bayesian treatment of deep learning allows for the computation of uncertainties associated with the predictions of deep neural networks. We show how the concept of Errors-in-Variables can be used in Bayesian deep regression to also account for the uncertainty associated with the input of the employed neural network. The presented approach thereby exploits a relevant, but generally overlooked, source of uncertainty and yields a decomposition of the predictive uncertainty into an aleatoric and an epistemic part that is more complete and, in many cases, more consistent from a statistical perspective. We illustrate the approach on various simulated and real examples and observe that using an Errors-in-Variables model increases the estimated uncertainty while preserving the predictive performance of models without Errors-in-Variables. For examples with a known regression function, we observe that this ground truth is covered substantially better by the Errors-in-Variables model, indicating that the presented approach yields more reliable uncertainty estimates.
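In generic notation (an illustrative sketch, not the paper's exact parametrization), an Errors-in-Variables regression model treats the observed input as a noisy version of a latent true input,
\[
y_i = f_\theta(\zeta_i) + \varepsilon_i, \qquad x_i = \zeta_i + \delta_i,
\]
so that a Bayesian treatment infers posteriors over both the network parameters $\theta$ and the latent inputs $\zeta_i$; the input noise $\delta_i$ then enters the predictive uncertainty alongside the output noise $\varepsilon_i$ and the uncertainty about $\theta$, which is what enables the more complete aleatoric/epistemic decomposition described above.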
Inverse problems are in many cases solved with optimization techniques. When the underlying model is linear, first-order gradient methods are usually sufficient. With nonlinear models, due to nonconvexity, one must often resort to second-order methods that are computationally more expensive. In this work we aim to approximate a nonlinear model with a linear one and correct the resulting approximation error. We develop a sequential method that iteratively solves a linear inverse problem and updates the approximation error by evaluating it at the new solution. This treatment convexifies the problem and allows us to benefit from established convex optimization methods. We separately consider the case where the approximation is fixed over iterations and the case where the approximation is adaptive. In the fixed case we show theoretically under which assumptions the sequence converges. In the adaptive case, particularly for the special case of approximation by first-order Taylor expansion, we show that under certain assumptions the sequence converges to a critical point of the original nonconvex functional. Furthermore, we show that with quadratic objective functions the sequence corresponds to the Gauss-Newton method. Finally, we present numerical results superior to those of the conventional model correction method. We also show that a fixed approximation can provide competitive results at a considerable computational speed-up.
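One way to read the fixed-approximation scheme, in generic notation (a sketch rather than the authors' exact formulation): with nonlinear forward model $F$, linear approximation $A$, data $y$, and convex regularizer $R$, define the approximation error $\epsilon(x) = F(x) - Ax$ and iterate
\[
x_{k+1} \in \arg\min_{x}\; \tfrac12\big\|A x + \epsilon(x_k) - y\big\|^2 + R(x),
\]
so each subproblem is a convex, linear inverse problem and the error term is re-evaluated at the new iterate; in the adaptive Taylor variant, $A$ is replaced by the Jacobian $F'(x_k)$ at the current iterate.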
Code verification plays an important role in establishing the credibility of computational simulations by assessing the correctness of the implementation of the underlying numerical methods. In computational electromagnetics, the numerical solution to integral equations incurs multiple interacting sources of numerical error, as well as other challenges, which render traditional code-verification approaches ineffective. In this paper, we provide approaches to separately measure the numerical errors arising from these different error sources for the method-of-moments implementation of the combined-field integral equation. We demonstrate the effectiveness of these approaches for cases with and without coding errors.
We obtain bounds to quantify the distributional approximation in the delta method for vector statistics (the sample mean of $n$ independent random vectors) for normal and non-normal limits, measured using smooth test functions. For normal limits, we obtain bounds with the optimal order $n^{-1/2}$ rate of convergence, while for a wide class of non-normal limits, which includes quadratic forms amongst others, we achieve bounds with a faster order $n^{-1}$ convergence rate. We apply our general results to derive explicit bounds quantifying the distributional approximation of an estimator of the Bernoulli variance and of several statistics of sample moments, as well as order $n^{-1}$ bounds for the chi-square approximation of a family of rank-based statistics; we also provide an efficient independent derivation of an order $n^{-1}$ bound for the chi-square approximation of Pearson's statistic. In establishing our general results, we generalise recent results on Stein's method for functions of multivariate normal random vectors to vector-valued functions and sums of independent random vectors whose components may be dependent. These bounds are widely applicable and of independent interest.
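For orientation, recall the delta method in the form relevant here: if $\sqrt{n}(\bar W_n - \mu) \Rightarrow \mathcal{N}(0,\Sigma)$ and $g$ is differentiable at $\mu$ with $\nabla g(\mu) \neq 0$, then $\sqrt{n}\big(g(\bar W_n) - g(\mu)\big) \Rightarrow \mathcal{N}\big(0, \nabla g(\mu)^\top \Sigma\, \nabla g(\mu)\big)$. The Bernoulli variance estimator illustrates both regimes: with $\hat p$ the sample mean of i.i.d.\ $\mathrm{Bernoulli}(p)$ observations and $g(p)=p(1-p)$, we have $g'(p)=1-2p$, so for $p\neq\tfrac12$,
\[
\sqrt{n}\big(\hat p(1-\hat p) - p(1-p)\big) \Rightarrow \mathcal{N}\big(0,\,(1-2p)^2\,p(1-p)\big),
\]
while at $p=\tfrac12$ the derivative vanishes and $n\big(\tfrac14 - \hat p(1-\hat p)\big) = n\big(\hat p - \tfrac12\big)^2 \Rightarrow \tfrac14\chi^2_1$, a non-normal limit with the faster $n^{-1}$ rate.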
This paper introduces a novel Bayesian approach to detect changes in the variance of a Gaussian sequence model, focusing on quantifying the uncertainty in the change point locations and providing a scalable algorithm for inference. Such a measure of uncertainty is necessary when change point methods are deployed in sensitive applications, for example, when one is interested in determining whether an organ is viable for transplant. The key to our proposal is framing the problem as a product of multiple single changes in the scale parameter. We fit the model through an iterative procedure similar to what is done for additive models. The novelty is that each iteration returns a probability distribution on time instances, which captures the uncertainty in the change point location. Leveraging a recent result in the literature, we show that our proposal is a variational approximation of the exact model posterior distribution. We study the algorithm's convergence and the change point localization rate. Extensive simulation studies illustrate the performance of our method and the possibility of generalizing it to more complex data-generating mechanisms. We apply the new model to data from an experiment involving a novel technique to assess liver viability, as well as to oceanographic data.
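Schematically (in illustrative notation, not necessarily the authors' exact parametrization), the product formulation writes the data as $y_t \sim \mathcal{N}(0, \sigma_t^2)$ with
\[
\sigma_t^2 = \prod_{j=1}^{J} s_j(t), \qquad s_j(t) = \begin{cases} 1, & t < \tau_j,\\ \theta_j, & t \ge \tau_j,\end{cases}
\]
so that each factor carries a single change in scale at an unknown location $\tau_j$; the iterative fit then cycles over the factors, updating one single-change component at a time given the others, with each update returning a distribution over the corresponding change point location.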
In this paper, we present a novel stochastic normal map-based algorithm ($\mathsf{norM}\text{-}\mathsf{SGD}$) for nonconvex composite-type optimization problems and discuss its convergence properties. Using a time window-based strategy, we first analyze the global convergence behavior of $\mathsf{norM}\text{-}\mathsf{SGD}$ and show that every accumulation point of the generated sequence of iterates $\{\boldsymbol{x}^k\}_k$ is a stationary point almost surely and in an expectation sense. The obtained results hold under standard assumptions and extend the more limited convergence guarantees of the basic proximal stochastic gradient method. In addition, based on the well-known Kurdyka-{\L}ojasiewicz (KL) analysis framework, we provide novel point-wise convergence results for the iterates $\{\boldsymbol{x}^k\}_k$ and derive convergence rates that depend on the underlying KL exponent $\boldsymbol{\theta}$ and the step size dynamics $\{\alpha_k\}_k$. Specifically, for the popular step size scheme $\alpha_k=\mathcal{O}(1/k^\gamma)$, $\gamma \in (\frac23,1]$, (almost sure) rates of the form $\|\boldsymbol{x}^k-\boldsymbol{x}^*\| = \mathcal{O}(1/k^p)$, $p \in (0,\frac12)$, can be established. The obtained rates are faster than related and existing convergence rates for $\mathsf{SGD}$ and improve on the non-asymptotic complexity bounds for $\mathsf{norM}\text{-}\mathsf{SGD}$.
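As background on the composite setting (standard definitions, not specific to the paper's analysis), the problem is $\min_x \psi(x) := f(x) + \varphi(x)$ with $f$ smooth and possibly nonconvex and $\varphi$ convex but possibly nonsmooth; the basic proximal stochastic gradient step reads $\boldsymbol{x}^{k+1} = \mathrm{prox}_{\alpha_k \varphi}(\boldsymbol{x}^k - \alpha_k \boldsymbol{g}^k)$ for a stochastic gradient estimate $\boldsymbol{g}^k \approx \nabla f(\boldsymbol{x}^k)$, and stationarity can be tracked through the normal map
\[
F^{\mathrm{nor}}_{\lambda}(\boldsymbol{z}) = \nabla f\big(\mathrm{prox}_{\lambda\varphi}(\boldsymbol{z})\big) + \lambda^{-1}\big(\boldsymbol{z} - \mathrm{prox}_{\lambda\varphi}(\boldsymbol{z})\big),
\]
whose zeros $\boldsymbol{z}$ yield stationary points $\boldsymbol{x} = \mathrm{prox}_{\lambda\varphi}(\boldsymbol{z})$ of $\psi$, i.e., $0 \in \nabla f(\boldsymbol{x}) + \partial\varphi(\boldsymbol{x})$.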
Minimum flow decomposition (MFD) is the NP-hard problem of finding a smallest decomposition of a network flow/circulation $X$ on a directed graph $G$ into weighted source-to-sink paths whose superposition equals $X$. We show that, for acyclic graphs, considering the \emph{width} of the graph (the minimum number of paths needed to cover all of its edges) yields advances in our understanding of its approximability. For the version of the problem that uses only non-negative weights, we identify and characterise a new class of \emph{width-stable} graphs, for which a popular heuristic is a \gwsimple-approximation ($|X|$ being the total flow of $X$), and strengthen the lower bound on its worst-case approximation ratio from $\Omega(\sqrt{m})$ to $\Omega(m / \log m)$ for sparse graphs, where $m$ is the number of edges in the graph. We also study a new problem on graphs with cycles, Minimum Cost Circulation Decomposition (MCCD), and show that it generalises MFD through a simple reduction. For the version that also allows negative weights, we give a $(\lceil \log \Vert X \Vert \rceil +1)$-approximation ($\Vert X \Vert$ being the maximum absolute value of $X$ on any edge) using a power-of-two approach combined with parity-fixing arguments and a decomposition of unitary circulations ($\Vert X \Vert \leq 1$), based on a generalised notion of width for this problem. Finally, we disprove a conjecture about the linear independence of minimum (non-negative) flow decompositions posed by Kloster et al. [ALENEX 2018], but show that its useful implication (a polynomial-time assignment of weights to a given set of paths to decompose a flow) holds for the negative version.
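One way to unpack the power-of-two approach at a high level (a sketch of the idea, not the authors' precise algorithm): split the circulation as $X = X_{\mathrm{par}} + 2X'$, where $X_{\mathrm{par}}$ records edge parities and is repaired into a unitary circulation ($\Vert X_{\mathrm{par}} \Vert \le 1$) via parity-fixing arguments; decompose $X_{\mathrm{par}}$, recurse on $X'$ (whose maximum absolute value has roughly halved), and stop after about $\lceil \log \Vert X \Vert \rceil + 1$ rounds, each of which contributes a number of components bounded in terms of the generalised width, yielding the stated approximation factor.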
We consider $t$-Lee-error-correcting codes of length $n$ over the residue ring $\mathbb{Z}_m := \mathbb{Z}/m\mathbb{Z}$ and determine upper and lower bounds on the number of such codes. We use two different methods, namely estimating isolated nodes in bipartite graphs and the graph container method. The former gives density results for codes of fixed size and the latter for codes of any size. This confirms some recent density results for linear Lee metric codes and provides new density results for nonlinear codes. To apply a variant of the graph container algorithm, we also investigate some geometrical properties of the balls in the Lee metric.