We propose a covariance-corrected random batch method for interacting particle systems. By establishing a certain entropic central limit theorem, we provide entropic convergence guarantees for the law of the entire trajectories of all particles of the proposed method to the law of the trajectories of the discrete-time interacting particle system whenever the batch size $B \gg (\alpha n)^{\frac{1}{3}}$ (where $n$ is the number of particles and $\alpha$ is the time discretization parameter). This in turn implies that the outputs of these methods are nearly \emph{statistically indistinguishable} when $B$ is even moderately large. Previous works mainly considered convergence in Wasserstein distance and either required stringent assumptions on the potentials or obtained bounds with an exponential dependence on the time horizon. This work makes minimal assumptions on the interaction potentials and, in particular, establishes that even when the particle trajectories diverge to infinity, they do so in the same way for both methods. Such guarantees are very useful in light of recent advances in interacting-particle-based algorithms for sampling.
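To make the batching concrete, the following Python sketch shows one Euler--Maruyama step of the classical (uncorrected) random batch method on which the proposal builds; the covariance correction itself is not reproduced here, and `grad_K`, `sigma`, and the mean-field scaling are illustrative assumptions.

```python
import numpy as np

def random_batch_step(x, grad_K, alpha, B, sigma, rng):
    """One Euler--Maruyama step of the classical random batch method.

    x      : (n, d) array of particle positions
    grad_K : pairwise interaction force, grad_K(xi, xj) -> (d,) array
    alpha  : time discretization parameter (step size)
    B      : batch size
    sigma  : diffusion coefficient
    """
    n, d = x.shape
    perm = rng.permutation(n)          # random partition into batches
    x_new = x.copy()
    for start in range(0, n, B):
        batch = perm[start:start + B]
        for i in batch:
            force = np.zeros(d)
            for j in batch:
                if j != i:
                    force += grad_K(x[i], x[j])
            # the within-batch average stands in for the full
            # 1/(n-1) sum over all other particles
            x_new[i] = x[i] + alpha * force / max(len(batch) - 1, 1) \
                       + np.sqrt(alpha) * sigma * rng.standard_normal(d)
    return x_new
```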
We study the problem of providing channel state information (CSI) at the transmitter in multi-user massive MIMO systems operating in frequency division duplexing (FDD). The wideband MIMO channel is a vector-valued random process correlated in time, space (antennas), and frequency (subcarriers). The base station (BS) periodically broadcasts $\beta_{\rm tr}$ pilot symbols from its $M$ antenna ports to $K$ single-antenna users (UEs). Correspondingly, the $K$ UEs send feedback messages about their channel state using $\beta_{\rm fb}$ symbols in the uplink (UL). Using results from remote rate-distortion theory, we show that, as $\mathsf{snr} \to \infty$, the optimal feedback strategy achieves a channel state estimation mean squared error (MSE) that behaves as $\Theta(1)$ if $\beta_{\rm tr} < r$ and as $\Theta(\mathsf{snr}^{-\alpha})$ if $\beta_{\rm tr} \ge r$, where $\alpha = \min(\beta_{\rm fb}/r, 1)$ and $r$ is the rank of the channel covariance matrix. The MSE-optimal rate-distortion strategy implies encoding of long sequences of channel states, which would yield completely stale CSI and therefore poor multiuser precoding performance. Hence, we consider three practical one-shot CSI strategies with a minimum one-slot delay and analyze their large-SNR channel estimation MSE behavior. These are: (1) digital feedback via entropy-coded scalar quantization (ECSQ), (2) analog feedback (AF), and (3) local channel estimation at the UEs with digital feedback. These schemes have different requirements in terms of knowledge of the channel statistics at the UE and at the BS. In particular, the last strategy requires no statistical knowledge and is closely inspired by a CSI feedback scheme currently proposed in 3GPP standardization.
The monotone minimal perfect hash function (MMPHF) problem is the following indexing problem. Given a set $S= \{s_1,\ldots,s_n\}$ of $n$ distinct keys from a universe $U$ of size $u$, create a data structure $DS$ that answers the following query: \[ \mathrm{RankOp}(q) = \text{rank of } q \text{ in } S \text{ for all } q\in S, \text{ and an arbitrary answer otherwise.} \] Solutions to the MMPHF problem are in widespread use in both theory and practice. The best known upper bound for the problem encodes $DS$ in $O(n\log\log\log u)$ bits and performs queries in $O(\log u)$ time. It has been an open problem to either improve the space upper bound or to show that this somewhat odd-looking bound is tight. In this paper, we show the latter: specifically, that any data structure (deterministic or randomized) for monotone minimal perfect hashing of any collection of $n$ elements from a universe of size $u$ requires $\Omega(n \cdot \log\log\log{u})$ expected bits to answer every query correctly. We achieve our lower bound by defining a graph $\mathbf{G}$ where the nodes are the possible ${u \choose n}$ inputs and where two nodes are adjacent if they cannot share the same $DS$. The size of $DS$ is then lower bounded by the log of the chromatic number of $\mathbf{G}$. Finally, we show that the fractional chromatic number (and hence the chromatic number) of $\mathbf{G}$ is lower bounded by $2^{\Omega(n \log\log\log u)}$.
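As a point of reference for the query contract, here is a deliberately naive Python solution that stores the sorted keys and answers by binary search; it uses $O(n \log u)$ bits, far above the succinct bounds discussed, and is purely illustrative.

```python
from bisect import bisect_left

def build_mmphf(keys):
    """Toy MMPHF: stores the sorted key set outright (O(n log u) bits).

    rank_op(q) returns the rank of q in S for q in S; for q outside S
    the answer may be arbitrary, which bisect_left happens to satisfy.
    """
    s = sorted(keys)

    def rank_op(q):
        return bisect_left(s, q)

    return rank_op

rank_op = build_mmphf({3, 17, 42, 99})
assert rank_op(42) == 2   # 42 is the third-smallest key in S
```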
Boosting is one of the most significant developments in machine learning. This paper studies the rate of convergence of $L_2$Boosting, which is tailored for regression, in a high-dimensional setting. Moreover, we introduce so-called \textquotedblleft post-Boosting\textquotedblright. This is a post-selection estimator that applies ordinary least squares to the variables selected in the first stage by $L_2$Boosting. Another variant is \textquotedblleft Orthogonal Boosting\textquotedblright, where an orthogonal projection is conducted after each step. We show that both post-$L_2$Boosting and Orthogonal Boosting achieve the same rate of convergence as LASSO in a sparse, high-dimensional setting. We show that the rate of convergence of classical $L_2$Boosting depends on the design matrix, as described by a sparse eigenvalue constant. To establish the latter result, we derive new approximation results for the pure greedy algorithm, based on analyzing the revisiting behavior of $L_2$Boosting. We also introduce feasible rules for early stopping, which can be easily implemented and used in applied work. Our results also allow a direct comparison between LASSO and boosting, which has been missing from the literature. Finally, we present simulation studies and applications to illustrate the relevance of our theoretical results and to provide insights into the practical aspects of boosting. In these simulation studies, post-$L_2$Boosting clearly outperforms LASSO.
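A minimal numpy sketch of componentwise $L_2$Boosting (the pure greedy algorithm) with a post-Boosting OLS refit may help fix ideas; the step size `nu`, the fixed number of steps, and the standardization assumption are illustrative choices rather than the rules analyzed in the paper.

```python
import numpy as np

def l2_boost(X, y, steps=50, nu=0.1):
    """Componentwise L2Boosting: at each step, regress the current residual
    on the single best-fitting predictor and take a shrunken step of size nu.
    Columns of X are assumed standardized (mean 0, variance 1)."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float).copy()
    for _ in range(steps):
        corr = X.T @ resid / n              # per-variable OLS coefficients
        j = np.argmax(np.abs(corr))         # best-fitting predictor
        beta[j] += nu * corr[j]
        resid -= nu * corr[j] * X[:, j]
    return beta

def post_boost(X, y, beta):
    """Post-L2Boosting: OLS refit on the support selected by boosting."""
    support = np.flatnonzero(beta)
    beta_post = np.zeros_like(beta)
    beta_post[support] = np.linalg.lstsq(X[:, support], y, rcond=None)[0]
    return beta_post
```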
We propose a penalized nonparametric approach to estimating the quantile regression process (QRP) in a nonseparable model using rectified quadratic unit (ReQU) activated deep neural networks, and introduce a novel penalty function to enforce non-crossing of quantile regression curves. We establish non-asymptotic excess risk bounds for the estimated QRP and derive the mean integrated squared error of the estimated QRP under mild smoothness and regularity conditions. To establish these non-asymptotic risk and estimation error bounds, we also develop a new error bound for approximating $C^s$ smooth functions with $s >0$, together with their derivatives, using ReQU activated neural networks. This approximation result for ReQU networks is new, of independent interest, and may be useful in other problems. Our numerical experiments demonstrate that the proposed method is competitive with, or outperforms, two existing approaches to nonparametric quantile regression, based on reproducing kernels and on random forests respectively.
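For concreteness, a short PyTorch sketch of the ReQU activation and one natural non-crossing penalty follows; the hinge-type penalty is a plausible stand-in, not necessarily the novel penalty proposed in the paper.

```python
import torch

def requ(x):
    """Rectified quadratic unit: relu(x) squared, giving a C^1 activation."""
    return torch.relu(x) ** 2

def non_crossing_penalty(q_hat):
    """Hypothetical hinge penalty discouraging quantile crossing.

    q_hat : (batch, n_taus) predicted conditional quantiles, with columns
    ordered by increasing quantile level; adjacent differences should be
    nonnegative, so any negative gap is penalized."""
    gaps = q_hat[:, 1:] - q_hat[:, :-1]
    return torch.relu(-gaps).mean()
```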
We establish the minimax risk for parameter estimation in sparse high-dimensional Gaussian mixture models and show that a constrained maximum likelihood estimator (MLE) achieves minimax optimality. However, the optimization-based constrained MLE is computationally intractable due to the non-convexity of the problem. Therefore, we propose a Bayesian approach to estimating high-dimensional Gaussian mixtures whose cluster centers exhibit sparsity, using a continuous spike-and-slab prior, and prove that the posterior contraction rate of the proposed Bayesian method is minimax optimal. The mis-clustering rate is obtained as a by-product using tools from matrix perturbation theory. Computationally, posterior inference for the proposed Bayesian method can be implemented via an efficient Gibbs sampler with data augmentation, circumventing challenging frequentist non-convex optimization-based algorithms. The proposed Bayesian sparse Gaussian mixture model does not require pre-specifying the number of clusters, which is allowed to grow with the sample size and can be adaptively estimated via posterior inference. The validity and usefulness of the proposed method are demonstrated through simulation studies and the analysis of a real-world single-cell RNA sequencing dataset.
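To illustrate the prior, here is a minimal numpy sketch of drawing a sparse cluster center from a continuous spike-and-slab prior, modeled as a two-component Gaussian mixture per coordinate; the hyperparameters and the exact parameterization are illustrative assumptions, not the paper's specification.

```python
import numpy as np

def spike_slab_draw(p, w=0.1, tau_spike=0.01, tau_slab=1.0, rng=None):
    """Draw a p-dimensional cluster center: each coordinate comes from the
    'slab' N(0, tau_slab^2) with probability w, else from the narrow
    'spike' N(0, tau_spike^2), which shrinks it toward zero."""
    rng = rng or np.random.default_rng()
    in_slab = rng.random(p) < w           # latent inclusion indicators
    scale = np.where(in_slab, tau_slab, tau_spike)
    return rng.normal(0.0, scale), in_slab
```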
This paper studies the design of two-wave experiments in the presence of spillover effects when the researcher aims to conduct precise inference on treatment effects. We consider units connected through a single network, local dependence among individuals, and a general class of estimands encompassing average treatment and average spillover effects. We introduce a statistical framework for designing two-wave experiments with networks, in which the researcher optimizes over participants and treatment assignments to minimize the variance of the estimators of interest, using a first-wave (pilot) experiment to estimate that variance. We derive guarantees for inference on treatment effects and regret guarantees on the variance obtained from the proposed design mechanism. Our results illustrate the existence of a trade-off in the choice of the pilot study and formally characterize the pilot's size relative to the main experiment. Simulations based on synthetic and real-world networks illustrate the advantages of the method.
Compressed Stochastic Gradient Descent (SGD) algorithms have been recently proposed to address the communication bottleneck in distributed and decentralized optimization problems, such as those that arise in federated machine learning. Existing compressed SGD algorithms assume the use of non-adaptive step sizes (constant or diminishing) to provide theoretical convergence guarantees. Typically, the step sizes are fine-tuned in practice to the dataset and the learning algorithm to provide good empirical performance. Such fine-tuning might be impractical in many learning scenarios, and it is therefore of interest to study compressed SGD using adaptive step sizes. Motivated by prior work on adaptive step-size methods for SGD to train neural networks efficiently in the uncompressed setting, we develop an adaptive step-size method for compressed SGD. In particular, we introduce a scaling technique for the descent step in compressed SGD, which we use to establish order-optimal convergence rates for convex and smooth and strongly convex and smooth objectives under an interpolation condition, and for non-convex objectives under a strong growth condition. We also show through simulation examples that without this scaling, the algorithm can fail to converge. We present experimental results on deep neural networks for real-world datasets, compare the performance of our proposed algorithm with previously proposed compressed SGD methods in the literature, and demonstrate improved performance on ResNet-18, ResNet-34, and DenseNet architectures for the CIFAR-100 and CIFAR-10 datasets at various levels of compression.
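The sketch below shows one compressed-SGD step with top-$k$ sparsification, an AdaGrad-Norm-style adaptive step size, and a scaling of the descent direction; the particular scaling factor is a hypothetical stand-in for the paper's technique, and the accumulator state is initialized as `state = {'G2': 0.0, 'eta0': 1.0}`.

```python
import numpy as np

def top_k(g, k):
    """Keep the k largest-magnitude coordinates of g; zero the rest."""
    out = np.zeros_like(g)
    idx = np.argpartition(np.abs(g), -k)[-k:]
    out[idx] = g[idx]
    return out

def compressed_sgd_step(w, grad, k, state, eps=1e-8):
    """One compressed-SGD step with an adaptive (AdaGrad-Norm-style)
    step size and a scaling of the compressed descent direction."""
    g = grad(w)
    c = top_k(g, k)
    # hypothetical scaling: damp the step when compression discards mass
    scale = np.linalg.norm(c) / (np.linalg.norm(g) + eps)
    state['G2'] += np.linalg.norm(c) ** 2      # adaptive accumulator
    eta = state['eta0'] / np.sqrt(state['G2'] + eps)
    return w - eta * scale * c
```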
Importance sampling (IS) is valuable in reducing the variance of Monte Carlo sampling in many areas, including finance, rare event simulation, and Bayesian inference. It is natural to combine quasi-Monte Carlo (QMC) methods with IS to achieve a faster rate of convergence; however, a naive replacement of Monte Carlo with QMC may not work well. This paper investigates the convergence rates of randomized QMC-based IS for estimating integrals with respect to a Gaussian measure, in which the IS measure is a Gaussian or $t$ distribution. We prove that if the target function satisfies the so-called boundary growth condition and the covariance matrix of the IS density has eigenvalues no smaller than 1, then randomized QMC with a Gaussian proposal has a root mean squared error of $O(N^{-1+\epsilon})$ for arbitrarily small $\epsilon>0$. Similar results are established for a $t$ distribution as the proposal. These sufficient conditions help to assess the effectiveness of IS in QMC. For some particular applications, we find that Laplace IS, a very general approach that approximates the target function by a quadratic Taylor expansion around its mode, can yield a proposal covariance with eigenvalues smaller than 1, making the resulting integrand less favorable for QMC. From this point of view, when using Gaussian distributions as the IS proposal, a change of measure via Laplace IS may transform a favorable integrand into an unfavorable one for QMC, even though the variance of Monte Carlo sampling is reduced. We also give some examples to verify our propositions and warn against a naive replacement of MC with QMC under IS proposals. Numerical results suggest that using Laplace IS with $t$ distributions is more robust than with Gaussian distributions.
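The following Python sketch shows randomized QMC importance sampling with a Gaussian proposal $N(\mu, LL^\top)$ for integrals under a standard Gaussian measure, using scrambled Sobol' points from SciPy; the proposal parameters `mu` and the lower-triangular Cholesky factor `L` are user-supplied assumptions.

```python
import numpy as np
from scipy.stats import qmc, norm

def rqmc_is_gaussian(f, mu, L, m, d, seed=0):
    """Estimate E[f(Z)], Z ~ N(0, I_d), by randomized QMC importance
    sampling with Gaussian proposal N(mu, L L^T); minimal sketch.

    Uses 2**m scrambled Sobol' points; L is lower triangular with
    positive diagonal."""
    sob = qmc.Sobol(d, scramble=True, seed=seed)
    u = sob.random_base2(m)
    z = norm.ppf(u)                      # standard normal QMC points
    x = mu + z @ L.T                     # proposal draws N(mu, L L^T)
    # log importance weights: phi(x; 0, I) / phi(x; mu, L L^T)
    log_w = (-0.5 * np.sum(x * x, axis=1)
             + 0.5 * np.sum(z * z, axis=1)
             + np.sum(np.log(np.diag(L))))
    return np.mean(f(x) * np.exp(log_w))
```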
We propose entropy-preserving and entropy-stable partitioned Runge--Kutta (RK) methods. In particular, we extend the explicit relaxation Runge--Kutta methods to IMEX--RK methods and to a class of explicit second-order multirate methods for stiff problems arising from scale-separable or grid-induced stiffness in a system. The proposed approaches not only mitigate system stiffness but also fully retain the entropy-preserving and entropy-stable properties at the discrete level. The key idea of the relaxation approach is to adjust the step completion with a relaxation parameter so that the time-adjusted solution satisfies the entropy condition at the discrete level. In general, the relaxation parameter is computed by solving a scalar nonlinear equation at each timestep; for a quadratic entropy function, however, we theoretically derive the explicit form of the relaxation parameter and numerically confirm that it performs as expected on the Burgers equation. Several numerical results for ordinary differential equations and the Burgers equation are presented to demonstrate the entropy-conserving/stable behavior of these methods. We also compare the relaxation approach and the incremental direction technique for the Burgers equation, with and without a limiter, in the presence of shocks.
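For a quadratic entropy $\eta(u) = \tfrac{1}{2}\|u\|^2$ and a conservative right-hand side ($\langle u, f(u)\rangle = 0$), the relaxation parameter has a closed form known from the relaxation Runge--Kutta literature, used in the sketch below; the explicit Butcher tableau $(A, b)$ is supplied by the user.

```python
import numpy as np

def relaxation_rk_step(f, u, dt, A, b):
    """One explicit relaxation RK step preserving eta(u) = ||u||^2 / 2
    for conservative f (i.e. <u, f(u)> = 0).

    Closed-form relaxation parameter for quadratic entropies:
        gamma = 2 * sum_ij b_i a_ij <F_i, F_j> / sum_ij b_i b_j <F_i, F_j>
    """
    s = len(b)
    F = np.zeros((s,) + u.shape)
    for i in range(s):                    # explicit stage derivatives
        F[i] = f(u + dt * sum(A[i, j] * F[j] for j in range(i)))
    G = F.reshape(s, -1) @ F.reshape(s, -1).T    # Gram matrix <F_i, F_j>
    gamma = 2 * (b @ (A * G).sum(axis=1)) / (b @ G @ b)
    return u + gamma * dt * np.einsum('i,i...->...', b, F)
```

With the classical RK4 tableau and, for instance, $f(u) = (-u_2, u_1)$ (a linear oscillator), the norm $\|u\|$ is preserved up to roundoff across steps, whereas the unrelaxed method drifts.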
The fundamental challenge of drawing causal inference is that counterfactual outcomes are not fully observed for any unit. Furthermore, in observational studies, treatment assignment is likely to be confounded. Many statistical methods have emerged for causal inference under unconfoundedness conditions given pre-treatment covariates, including propensity score-based methods, prognostic score-based methods, and doubly robust methods. Unfortunately for applied researchers, there is no `one-size-fits-all' causal method that performs optimally in all settings. In practice, causal methods are primarily evaluated quantitatively on handcrafted simulated data. Such data-generating procedures can be of limited value because they are typically stylized models of reality: they are simplified for tractability and lack the complexities of real-world data. For applied researchers, it is critical to understand how well a method performs for the data at hand. Our work introduces a deep generative model-based framework, Credence, to validate causal inference methods. The framework's novelty stems from its ability to generate synthetic data anchored at the empirical distribution of the observed sample, and therefore virtually indistinguishable from the latter. The approach allows the user to specify ground truth for the form and magnitude of causal effects and confounding bias as functions of covariates. The simulated data sets can then be used to evaluate the potential performance of various causal estimation methods when applied to data similar to the observed sample. We demonstrate Credence's ability to accurately assess the relative performance of causal estimation techniques in an extensive simulation study and in two real-world data applications, based on the LaLonde and Project STAR studies.