In this paper, we propose a general framework for designing a sensing matrix $\boldsymbol{A} \in \mathbb{R}^{d\times p}$ for estimating a sparse covariance matrix from compressed measurements of the form $\boldsymbol{y} = \boldsymbol{A}\boldsymbol{x} + \boldsymbol{n}$, where $\boldsymbol{y}, \boldsymbol{n} \in \mathbb{R}^d$ and $\boldsymbol{x} \in \mathbb{R}^p$. By viewing covariance recovery as inference over factor graphs via a message passing algorithm, ideas from coding theory, such as \textit{Density Evolution} (DE), are leveraged to construct a framework for the design of the sensing matrix. The proposed framework can handle both (1) regular sensing, i.e., equal importance is given to all entries of the covariance, and (2) preferential sensing, i.e., higher importance is given to a part of the covariance matrix. Through experiments, we show that the sensing matrix designed via density evolution can match the state of the art for covariance recovery in the regular sensing paradigm and attain improved performance in the preferential sensing regime. Additionally, we study the feasibility of causal graph structure recovery using the estimated covariance matrix obtained from the compressed measurements.
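
The measurement model in the abstract can be illustrated with a small simulation. The sketch below is not the paper's density-evolution design: it assumes a random Gaussian sensing matrix, Gaussian noise with a hypothetical standard deviation `sigma`, and a hand-built sparse covariance `Sigma`. It only checks the identity $\mathrm{Cov}(\boldsymbol{y}) = \boldsymbol{A}\boldsymbol{\Sigma}\boldsymbol{A}^\top + \sigma^2 \boldsymbol{I}$, which follows from the model when $\boldsymbol{x}$ and $\boldsymbol{n}$ are independent.

```python
import numpy as np

def simulate_compressed_measurements(A, Sigma, sigma, num_samples, rng):
    """Draw y = A x + n with x ~ N(0, Sigma) and n ~ N(0, sigma^2 I)."""
    d, p = A.shape
    x = rng.multivariate_normal(np.zeros(p), Sigma, size=num_samples)  # (N, p)
    n = sigma * rng.standard_normal((num_samples, d))                  # (N, d)
    return x @ A.T + n                                                 # (N, d)

rng = np.random.default_rng(0)
p, d = 8, 4  # hypothetical dimensions with d < p (compression)

# Hypothetical sparse covariance: identity plus one off-diagonal pair.
Sigma = np.eye(p)
Sigma[0, 1] = Sigma[1, 0] = 0.5

# Assumption: a random Gaussian sensing matrix, not a designed one.
A = rng.standard_normal((d, p)) / np.sqrt(d)
sigma = 0.1

Y = simulate_compressed_measurements(A, Sigma, sigma, 200_000, rng)
empirical = np.cov(Y, rowvar=False)                 # covariance of the measurements
predicted = A @ Sigma @ A.T + sigma**2 * np.eye(d)  # model-implied covariance
max_gap = np.max(np.abs(empirical - predicted))
```

Recovering `Sigma` itself from `empirical` is the hard inverse problem the paper addresses; this sketch covers only the forward model.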

Related content

Covariance matrix

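In probability theory and statistics, a covariance matrix (also known as the auto-covariance matrix, dispersion matrix, variance matrix, or variance-covariance matrix) is a square matrix that gives the covariance between each pair of elements of a given random vector. The main diagonal holds the variances, i.e., the covariance of each element with itself.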
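
A minimal NumPy sketch of this definition, using synthetic data (the sample size, dimension, and the 0.8 coupling are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 samples of a 3-dimensional random vector (rows are samples).
samples = rng.standard_normal((1000, 3))
samples[:, 1] += 0.8 * samples[:, 0]  # make components 0 and 1 correlated

cov = np.cov(samples, rowvar=False)      # 3 x 3 sample covariance matrix
variances = samples.var(axis=0, ddof=1)  # per-component sample variances
```

Here `cov` is symmetric, and its diagonal coincides with `variances`, since both use the unbiased (`ddof=1`) normalization.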

Estimation/Estimator · Statistic · Learning · Functional · Kernelization

January 12, 2023

Statistical Learning with Sublinear Regret of Propagator Models

Eyal Neuman,Yufei Zhang

from arxiv, 49 pages

We consider a class of learning problems in which an agent liquidates a risky asset while creating both transient price impact driven by an unknown convolution propagator and linear temporary price impact with an unknown parameter. We characterize the trader's performance as maximization of a revenue-risk functional, where the trader also exploits available information on a price predicting signal. We present a trading algorithm that alternates between exploration and exploitation phases and achieves sublinear regrets with high probability. For the exploration phase we propose a novel approach for non-parametric estimation of the price impact kernel by observing only the visible price process and derive sharp bounds on the convergence rate, which are characterised by the singularity of the propagator. These kernel estimation methods extend existing methods from the area of Tikhonov regularisation for inverse problems and are of independent interest. The bound on the regret in the exploitation phase is obtained by deriving stability results for the optimizer and value function of the associated class of infinite-dimensional stochastic control problems. As a complementary result we propose a regression-based algorithm to estimate the conditional expectation of non-Markovian signals and derive its convergence rate.

Estimation/Estimator · Maximum likelihood · Maximum likelihood estimation · Likelihood · MoDELS

January 12, 2023

Maximum likelihood estimation and prediction error for a Matérn model on the circle

Sébastien Petit

This work considers Gaussian process interpolation with a periodized version of the Matérn covariance function (Stein, 1999, Section 6.7) with Fourier coefficients $\phi(\alpha^2 + j^2)^{-\nu-1/2}$. Convergence rates are studied for the joint maximum likelihood estimation of $\nu$ and $\phi$ when the data is sampled according to the model. The mean integrated squared error is also analyzed with fixed and estimated parameters, showing that maximum likelihood estimation yields asymptotically the same error as if the ground truth was known. Finally, the case where the observed function is a "deterministic" element of a continuous Sobolev space is also considered, suggesting that bounding assumptions on some parameters can lead to different estimates.

Decomposable · Partition · Learning · Model evaluation · Reducible

January 12, 2023

Efficient Ridge Solution for the Incremental Broad Learning System on Added Nodes by Inverse Cholesky Factorization of a Partitioned Matrix

Hufei Zhu,Chenghao Wei

To accelerate the existing Broad Learning System (BLS) for newly added nodes in [7], we extend the inverse Cholesky factorization in [10] to deduce an efficient inverse Cholesky factorization for a Hermitian matrix partitioned into $2 \times 2$ blocks, which is utilized to develop the proposed BLS algorithm 1. The proposed BLS algorithm 1 computes the ridge solution (i.e., the output weights) from the inverse Cholesky factor of the Hermitian matrix in the ridge inverse, and updates the inverse Cholesky factor efficiently. From the proposed BLS algorithm 1, we deduce the proposed ridge inverse, which can be obtained from the generalized inverse in [7] by changing just one matrix in the equation that computes the newly added sub-matrix. We also modify the proposed algorithm 1 into the proposed algorithm 2, which is equivalent to the existing BLS algorithm [7] in terms of numerical computations. The proposed algorithms 1 and 2 can reduce the computational complexity, since the Hermitian matrix in the ridge inverse is usually smaller than the ridge inverse. With respect to the existing BLS algorithm, the proposed algorithms 1 and 2 usually require about 1/3 and 2/3 of the complexity, respectively, while in numerical experiments they achieve speedups (in each additional training time) of 2.40 to 2.91 and 1.36 to 1.60, respectively. Numerical experiments also show that the proposed algorithm 1 and the standard ridge solution always achieve the same testing accuracy, and so, usually, do the proposed algorithm 2 and the existing BLS algorithm. The existing BLS assumes the ridge parameter $\lambda \to 0$, since it is based on the generalized inverse with the ridge regression approximation. When the assumption $\lambda \to 0$ is not satisfied, the standard ridge solution clearly achieves a better testing accuracy than the existing BLS algorithm in numerical experiments.
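
For orientation, the standard ridge solution mentioned in the abstract, $W = (A^\top A + \lambda I)^{-1} A^\top Y$, can be computed with an ordinary Cholesky factorization. The sketch below shows only this textbook baseline in the real-valued case. It is not the paper's incremental inverse Cholesky algorithm, and the names `A`, `Y`, and `lam` are assumptions chosen for illustration.

```python
import numpy as np

def ridge_solution_cholesky(A, Y, lam):
    """Solve (A^T A + lam I) W = A^T Y using a Cholesky factorization."""
    k = A.shape[1]
    R = A.T @ A + lam * np.eye(k)    # symmetric positive definite for lam > 0
    L = np.linalg.cholesky(R)        # R = L L^T with L lower triangular
    # np.linalg.solve is a general solver; a dedicated triangular solver
    # would exploit the structure of L, but this keeps the sketch NumPy-only.
    Z = np.linalg.solve(L, A.T @ Y)  # forward step: L Z = A^T Y
    return np.linalg.solve(L.T, Z)   # backward step: L^T W = Z

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))  # hypothetical expanded input matrix
Y = rng.standard_normal((50, 3))   # hypothetical training targets
W = ridge_solution_cholesky(A, Y, lam=1e-2)
```

The factorized matrix is $k \times k$, where $k$ is the number of columns of `A`, so its size does not grow with the number of training samples.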

Estimation/Estimator · Reducible · Maximum likelihood · Positive definite · Maximum likelihood estimation

January 12, 2023

Localized covariance estimation: A Bayesian perspective

Robert J. Webber,Matthias Morzfeld

from arxiv, 20 pages, 4 figures

A major problem in numerical weather prediction (NWP) is the estimation of high-dimensional covariance matrices from a small number of samples. Maximum likelihood estimators cannot provide reliable estimates when the overall dimension is much larger than the number of samples. Fortunately, NWP practitioners have found ingenious ways to boost the accuracy of their covariance estimators by leveraging the assumption that the correlations decay with spatial distance. In this work, Bayesian statistics is used to provide a new justification and analysis of the practical NWP covariance estimators. The Bayesian framework involves manipulating distributions over symmetric positive definite matrices, and it leads to two main findings: (i) the commonly used "hybrid estimator" for the covariance matrix has a naturally Bayesian interpretation; (ii) the very commonly used "Schur product estimator" is not Bayesian, but it can be studied and understood within the Bayesian framework. As practical implications, the Bayesian framework shows how to reduce the amount of tuning required for covariance estimation, and it suggests that efficient covariance estimation should be rooted in understanding and penalizing conditional correlations, rather than correlations.
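
The two estimators named in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulation: the Schur product estimator is taken to be the elementwise product of the sample covariance with a distance-based localization matrix (here a Gaussian taper on a 1-D grid with a hypothetical `length_scale`), and the hybrid estimator is taken to be a convex combination of the sample covariance and a fixed background covariance with a hypothetical weight `alpha`.

```python
import numpy as np

def schur_product_estimator(samples, length_scale):
    """Sample covariance tapered elementwise by a localization matrix."""
    n_vars = samples.shape[1]
    S = np.cov(samples, rowvar=False)
    idx = np.arange(n_vars)
    dist = np.abs(idx[:, None] - idx[None, :])         # 1-D grid distance (assumption)
    taper = np.exp(-0.5 * (dist / length_scale) ** 2)  # Gaussian taper (assumption)
    return S * taper                                   # Schur (Hadamard) product

def hybrid_estimator(samples, background_cov, alpha):
    """Convex combination of the sample covariance and a background covariance."""
    S = np.cov(samples, rowvar=False)
    return alpha * S + (1.0 - alpha) * background_cov

rng = np.random.default_rng(0)
samples = rng.standard_normal((10, 40))  # 10 samples of 40 variables
localized = schur_product_estimator(samples, length_scale=3.0)
hybrid = hybrid_estimator(samples, np.eye(40), alpha=0.5)
```

With 10 samples of 40 variables the raw sample covariance is rank deficient. The taper suppresses long-range entries while leaving the diagonal untouched, and the hybrid combination with a positive definite background matrix is itself positive definite.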

Estimation/Estimator · Covariate shift · Sample · Generalization theory · Biased

January 12, 2023

A Framework for Generalization and Transportation of Causal Estimates Under Covariate Shift

Apoorva Lal,Wenjing Zheng,Simon Ejdemyr

from arxiv, 7 pages, 1 figure

Randomized experiments are an excellent tool for estimating internally valid causal effects with the sample at hand, but their external validity is frequently debated. While classical results on the estimation of Population Average Treatment Effects (PATE) implicitly assume random selection into experiments, this is typically far from true in many medical, social-scientific, and industry experiments. When the experimental sample is different from the target sample along observable or unobservable dimensions, experimental estimates may be of limited use for policy decisions. We begin by decomposing the extrapolation bias from estimating the Target Average Treatment Effect (TATE) using the Sample Average Treatment Effect (SATE) into covariate shift, overlap, and effect modification components, which researchers can reason about in order to diagnose the severity of extrapolation bias. Next, we cast covariate shift as a sample selection problem and propose estimators that re-weight the doubly-robust scores from experimental subjects to estimate treatment effects in the overall sample (=: generalization) or in an alternate target sample (=: transportation). We implement these estimators in the open-source R package causalTransportR, illustrate its performance in a simulation study, and discuss diagnostics to evaluate its performance.

Inference · Networking · Vector space · Threshold · Networks

January 12, 2023

Variational Inference: Posterior Threshold Improves Network Clustering Accuracy in Sparse Regimes

Xuezhen Li,Can M. Le

Variational inference has been widely used in the machine learning literature to fit various Bayesian models. In network analysis, this method has been successfully applied to community detection problems. Although these results are promising, their theoretical support is only for relatively dense networks, an assumption that may not hold for real networks. In addition, it has been shown recently that the variational loss surface has many saddle points, which may severely affect its performance, especially when applied to sparse networks. This paper proposes a simple way to improve the variational inference method by hard thresholding the posterior of the community assignment after each iteration. Using a random initialization that correlates with the true community assignment, we show that the proposed method converges and can accurately recover the true community labels, even when the average node degree of the network is bounded. An extensive numerical study further confirms the advantage of the proposed method over classical variational inference and another state-of-the-art algorithm.

Learning · Pattern Recognition · Interpretability · Deep learning · Model building

September 14, 2022

A Review and Roadmap of Deep Learning Causal Discovery in Different Variable Paradigms

Hang Chen,Keqing Du,Xinyu Yang,Chenguang Li

from arxiv, 26 pages, 10 figures. arXiv admin note: text overlap with arXiv:2012.07138, arXiv:1605.08179, arXiv:2203.14237 by other authors

Understanding causality helps to structure interventions to achieve specific goals and enables predictions under interventions. With the growing importance of learning causal relationships, causal discovery tasks have transitioned from using traditional methods to infer potential causal structures from observational data to the field of pattern recognition involved in deep learning. The rapid accumulation of massive data promotes the emergence of causal search methods with strong scalability. Existing summaries of causal discovery methods mainly focus on traditional methods based on constraints, scores, and FCMs; a thorough organization and elaboration of deep learning-based methods is lacking, as is an exploration of causal discovery methods from the perspective of variable paradigms. Therefore, we divide the possible causal discovery tasks into three types according to the variable paradigm and define each of the three tasks, define and instantiate the relevant datasets for each task together with the final causal model constructed, and then review the main existing causal discovery methods for the different tasks. Finally, we propose some roadmaps from different perspectives for the current research gaps in the field of causal discovery and point out future research directions.

Optimizer · Graph · GPU · Neural Networks · Kernelization

January 28, 2021

Interpreting and Unifying Graph Neural Networks with An Optimization Framework

Meiqi Zhu,Xiao Wang,Chuan Shi,Houye Ji,Peng Cui

from arxiv, WWW2021, 12 pages

Graph Neural Networks (GNNs) have received considerable attention on graph-structured data learning for a wide variety of tasks. The well-designed propagation mechanism, which has been demonstrated to be effective, is the most fundamental part of GNNs. Although most GNNs basically follow a message passing manner, little effort has been made to discover and analyze their essential relations. In this paper, we establish a surprising connection between different propagation mechanisms with a unified optimization problem, showing that despite the proliferation of various GNNs, in fact, their proposed propagation mechanisms are the optimal solution optimizing a feature fitting function over a wide class of graph kernels with a graph regularization term. Our proposed unified optimization framework, summarizing the commonalities between several of the most representative GNNs, not only provides a macroscopic view on surveying the relations between different GNNs, but also further opens up new opportunities for flexibly designing new GNNs. With the proposed framework, we discover that existing works usually utilize naive graph convolutional kernels for the feature fitting function, and we further develop two novel objective functions considering adjustable graph kernels showing low-pass or high-pass filtering capabilities respectively. Moreover, we provide the convergence proofs and expressive power comparisons for the proposed models. Extensive experiments on benchmark datasets clearly show that the proposed GNNs not only outperform the state-of-the-art methods but also have a good ability to alleviate over-smoothing, and further verify the feasibility of designing GNNs with our unified optimization framework.
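
One simple instance of a feature fitting term plus a graph regularization term is $\min_Z \|Z - H\|_F^2 + \xi\,\mathrm{tr}(Z^\top L Z)$, whose minimizer solves $(I + \xi L) Z = H$. The sketch below illustrates only this special case and is not the paper's general framework. It assumes the symmetric normalized Laplacian without self-loops and an identity fitting kernel, and the names `adj`, `H`, and `xi` are hypothetical.

```python
import numpy as np

def smooth_features(adj, H, xi):
    """Minimize ||Z - H||_F^2 + xi * tr(Z^T L Z) in closed form.

    L is the symmetric normalized Laplacian; every node is assumed
    to have at least one neighbor so that degrees are nonzero.
    """
    n = len(adj)
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # Setting the gradient 2 (Z - H) + 2 xi L Z to zero gives (I + xi L) Z = H.
    return np.linalg.solve(np.eye(n) + xi * L, H)

# Hypothetical 4-node path graph with 2-dimensional node features.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Z = smooth_features(adj, H, xi=1.0)
```

Larger values of `xi` weight the regularization term more heavily and pull neighboring node features closer together, while `xi = 0` returns `H` unchanged.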

Sample · Class · Loss · Performer · SimPLe

January 16, 2019

Class-Balanced Loss Based on Effective Number of Samples

Yin Cui,Menglin Jia,Tsung-Yi Lin,Yang Song,Serge Belongie

from arxiv, Code is available at: //github.com/richardaecn/class-balanced-loss

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
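
The effective-number formula in the abstract, together with a re-weighting built on it, can be sketched in a few lines of plain Python. The weights below are taken to be inversely proportional to each class's effective number. Normalizing them to sum to the number of classes is an assumption of this sketch, and the class counts are made up for illustration.

```python
def effective_number(n, beta):
    """Effective number of samples (1 - beta**n) / (1 - beta), for beta in [0, 1)."""
    return (1.0 - beta ** n) / (1.0 - beta)

def class_balanced_weights(samples_per_class, beta):
    """Per-class weights inversely proportional to the effective number.

    Normalizing the weights to sum to the number of classes is an
    assumption of this sketch.
    """
    raw = [1.0 / effective_number(n, beta) for n in samples_per_class]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]

counts = [5000, 500, 50]  # hypothetical long-tailed class counts
weights = class_balanced_weights(counts, beta=0.999)
```

With `beta = 0` every class has effective number 1, so no re-weighting occurs. As `beta` approaches 1 the effective number approaches `n`, and the weights approach inverse class frequency.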