一级欧美一级日韩大片,精品女同性恋一区二区三区,丁香五月开心婷婷在线,成人欧美一区二区三区视频不卡,性色网站国产高青

Multilevel linear models allow flexible statistical modelling of complex data with different levels of stratification. Identifying the most appropriate model from the large set of possible candidates is a challenging problem. In the Bayesian setting, the standard approach is a comparison of models using the model evidence or the Bayes factor. Explicit expressions for these quantities are available for simple linear models, but in most cases, direct computation is impossible. In practice, Markov Chain Monte Carlo approaches are widely used, such as sequential Monte Carlo, but it is not always clear how well such techniques perform. We present a method for estimation of the log model evidence, by an intermediate marginalisation over non-variance parameters. This reduces the dimensionality of the Monte Carlo sampling algorithm, which in turn yields more consistent estimates. The aim of this paper is to show how this framework fits together and works in practice, particularly on data with hierarchical structure. We illustrate this method on a popular multilevel dataset containing levels of radon in homes in the US state of Minnesota.

相關內容

MoDELS

關注 43

ACM/IEEE第23屆模型驅動工程語言和系統國際會議，是模型驅動軟件和系統工程的首要會議系列，由ACM-SIGSOFT和IEEE-TCSE支持組織。自1998年以來，模型涵蓋了建模的各個方面，從語言和方法到工具和應用程序。模特的參加者來自不同的背景，包括研究人員、學者、工程師和工業專業人士。MODELS 2019是一個論壇，參與者可以圍繞建模和模型驅動的軟件和系統交流前沿研究成果和創新實踐經驗。今年的版本將為建模社區提供進一步推進建模基礎的機會，并在網絡物理系統、嵌入式系統、社會技術系統、云計算、大數據、機器學習、安全、開源等新興領域提出建模的創新應用以及可持續性。官網鏈接： · 估計/估計量 · 穩健性 · MoDELS · 方差 ·

2022 年 9 月 19 日

Robust leave-one-out cross-validation for high-dimensional Bayesian models

Luca Silva,Giacomo Zanella

from arxiv, 37 pages, 8 figures

Leave-one-out cross-validation (LOO-CV) is a popular method for estimating out-of-sample predictive accuracy. However, computing LOO-CV criteria can be computationally expensive due to the need to fit the model multiple times. In the Bayesian context, importance sampling provides a possible solution but classical approaches can easily produce estimators whose variance is infinite, making them potentially unreliable. Here we propose and analyze a novel mixture estimator to compute Bayesian LOO-CV criteria. Our method retains the simplicity and computational convenience of classical approaches, while guaranteeing finite variance of the resulting estimators. Both theoretical and numerical results are provided to illustrate the improved robustness and efficiency. The computational benefits are particularly significant in high-dimensional problems, allowing to perform Bayesian LOO-CV for a broader range of models. The proposed methodology is easily implementable in standard probabilistic programming software and has a computational cost roughly equivalent to fitting the original model once.

規范化的 · 估計/估計量 · Analysis · 集成 · 線性的 ·

2022 年 9 月 19 日

Multilevel Estimation of Normalization Constants Using the Ensemble Kalman-Bucy Filter

Hamza Ruzayqat,Neil K. Chada,Ajay Jasra

from arxiv, 33 pages, 21 figures

In this article we consider the application of multilevel Monte Carlo, for the estimation of normalizing constants. In particular we will make use of the filtering algorithm, the ensemble Kalman-Bucy filter (EnKBF), which is an N-particle representation of the Kalma-Bucy filter (KBF). The EnKBF is of interest as it coincides with the optimal filter in the continuous-linear setting, i.e. the KBF. This motivates our particular setup in the linear setting. The resulting methodology we will use is the multilevel ensemble Kalman-Bucy filter (MLEnKBF). We provide an analysis based on deriving Lq-bounds for the normalizing constants using both the single-level, and the multilevel algorithms. Our results will be highlighted through numerical results, where we firstly demonstrate the error-to-cost rates of the MLEnKBF comparing it to the EnKBF on a linear Gaussian model. Our analysis will be specific to one variant of the MLEnKBF, whereas the numerics will be tested on different variants. We also exploit this methodology for parameter estimation, where we test this on the models arising in atmospheric sciences, such as the stochastic Lorenz 63 and 96 model.

優化器 · Storage · 在線 · 可約的 · motivation ·

2022 年 9 月 17 日

Optimal Online Peak Minimization Using Energy Storage

Yanfang Mo,Qiulin Lin,Minghua Chen,Si-Zhao Joe Qin

The significant presence of demand charges in electric bills motivates large-load customers to utilize energy storage to reduce the peak procurement from the grid. We herein study the problem of energy storage allocation for peak minimization, under the online setting where irrevocable decisions are sequentially made without knowing future demands. The problem is uniquely challenging due to (i) the coupling of online decisions across time imposed by the inventory constraints and (ii) the noncumulative nature of the peak procurement. We apply the CR-Pursuit framework and address the challenges unique to our minimization problem to design an online algorithm achieving the optimal competitive ratio (CR) among all online algorithms. We show that the optimal CR can be computed in polynomial time by solving a linear number of linear-fractional problems. More importantly, we generalize our approach to develop an \emph{anytime-optimal} online algorithm that achieves the best possible CR at any epoch, given the inputs and online decisions so far. The algorithm retains the optimal worst-case performance and attains adaptive average-case performance. Trace-driven simulations show that our algorithm can decrease the peak demand by an extra 19% compared to baseline alternatives under typical settings.

Agent · 估計/估計量 · 廣義線性模型 · 線性的 · 線性模型 ·

2022 年 9 月 16 日

Truthful Generalized Linear Models

Yuan Qiu,Jinyan Liu,Di Wang

from arxiv, To appear in The 18th Conference on Web and Internet Economics (WINE 2022)

In this paper we study estimating Generalized Linear Models (GLMs) in the case where the agents (individuals) are strategic or self-interested and they concern about their privacy when reporting data. Compared with the classical setting, here we aim to design mechanisms that can both incentivize most agents to truthfully report their data and preserve the privacy of individuals' reports, while their outputs should also close to the underlying parameter. In the first part of the paper, we consider the case where the covariates are sub-Gaussian and the responses are heavy-tailed where they only have the finite fourth moments. First, motivated by the stationary condition of the maximizer of the likelihood function, we derive a novel private and closed form estimator. Based on the estimator, we propose a mechanism which has the following properties via some appropriate design of the computation and payment scheme for several canonical models such as linear regression, logistic regression and Poisson regression: (1) the mechanism is $o(1)$-jointly differentially private (with probability at least $1-o(1)$); (2) it is an $o(\frac{1}{n})$-approximate Bayes Nash equilibrium for a $(1-o(1))$-fraction of agents to truthfully report their data, where $n$ is the number of agents; (3) the output could achieve an error of $o(1)$ to the underlying parameter; (4) it is individually rational for a $(1-o(1))$ fraction of agents in the mechanism ; (5) the payment budget required from the analyst to run the mechanism is $o(1)$. In the second part, we consider the linear regression model under more general setting where both covariates and responses are heavy-tailed and only have finite fourth moments. By using an $\ell_4$-norm shrinkage operator, we propose a private estimator and payment scheme which have similar properties as in the sub-Gaussian case.

極大似然 · 似然 · Processing（編程語言） · MoDELS · Learning ·

2022 年 9 月 16 日

Maximum Likelihood Training of Implicit Nonlinear Diffusion Models

Dongjun Kim,Byeonghu Na,Se Jung Kwon,Dongsoo Lee,Wanmo Kang,Il-Chul Moon

Whereas diverse variations of diffusion models exist, expanding the linear diffusion into a nonlinear diffusion process is investigated only by a few works. The nonlinearity effect has been hardly understood, but intuitively, there would be more promising diffusion patterns to optimally train the generative distribution towards the data distribution. This paper introduces such a data-adaptive and nonlinear diffusion process for score-based diffusion models. The proposed Implicit Nonlinear Diffusion Model (INDM) learns the nonlinear diffusion process by combining a normalizing flow and a diffusion process. Specifically, INDM implicitly constructs a nonlinear diffusion on the \textit{data space} by leveraging a linear diffusion on the \textit{latent space} through a flow network. This flow network is the key to forming a nonlinear diffusion as the nonlinearity fully depends on the flow network. This flexible nonlinearity is what improves the learning curve of INDM to nearly Maximum Likelihood Estimation (MLE) training, against the non-MLE training of DDPM++, which turns out to be a special case of INDM with the identity flow. Also, training the nonlinear diffusion yields the sampling robustness by the discretization step sizes. In experiments, INDM achieves the state-of-the-art FID on CelebA.

邊緣化 · Learning · 線性的 · 模型評估 · 類標記 ·

2022 年 9 月 15 日

Private Synthetic Data for Multitask Learning and Marginal Queries

Giuseppe Vietri,Cedric Archambeau,Sergul Aydore,William Brown,Michael Kearns,Aaron Roth,Ankit Siva,Shuai Tang,Zhiwei Steven Wu

from arxiv, The short version of this paper appears in the proceedings of NeurIPS-22

We provide a differentially private algorithm for producing synthetic data simultaneously useful for multiple tasks: marginal queries and multitask machine learning (ML). A key innovation in our algorithm is the ability to directly handle numerical features, in contrast to a number of related prior approaches which require numerical features to be first converted into {high cardinality} categorical features via {a binning strategy}. Higher binning granularity is required for better accuracy, but this negatively impacts scalability. Eliminating the need for binning allows us to produce synthetic data preserving large numbers of statistical queries such as marginals on numerical features, and class conditional linear threshold queries. Preserving the latter means that the fraction of points of each class label above a particular half-space is roughly the same in both the real and synthetic data. This is the property that is needed to train a linear classifier in a multitask setting. Our algorithm also allows us to produce high quality synthetic data for mixed marginal queries, that combine both categorical and numerical features. Our method consistently runs 2-5x faster than the best comparable techniques, and provides significant accuracy improvements in both marginal queries and linear prediction tasks for mixed-type datasets.

規范化的 · 最小最大值規范化 · INFORMS · Performer · Principle ·

2022 年 9 月 15 日

Effect of Post-processing on Contextualized Word Representations

Hassan Sajjad,Firoj Alam,Fahim Dalvi,Nadir Durrani

from arxiv, COLING 2022

Post-processing of static embedding has beenshown to improve their performance on both lexical and sequence-level tasks. However, post-processing for contextualized embeddings is an under-studied problem. In this work, we question the usefulness of post-processing for contextualized embeddings obtained from different layers of pre-trained language models. More specifically, we standardize individual neuron activations using z-score, min-max normalization, and by removing top principle components using the all-but-the-top method. Additionally, we apply unit length normalization to word representations. On a diverse set of pre-trained models, we show that post-processing unwraps vital information present in the representations for both lexical tasks (such as word similarity and analogy)and sequence classification tasks. Our findings raise interesting points in relation to theresearch studies that use contextualized representations, and suggest z-score normalization as an essential step to consider when using them in an application.

Learning · Analysis · 表示學習 · 泛化理論 · Seven ·

2022 年 9 月 15 日

Generalized Representations Learning for Time Series Classification

Wang Lu,Jindong Wang,Xinwei Sun,Yiqiang Chen,Xing Xie

from arxiv, Technical report; 18 pages

Time series classification is an important problem in real world. Due to its non-stationary property that the distribution changes over time, it remains challenging to build models for generalization to unseen distributions. In this paper, we propose to view the time series classification problem from the distribution perspective. We argue that the temporal complexity attributes to the unknown latent distributions within. To this end, we propose DIVERSIFY to learn generalized representations for time series classification. DIVERSIFY takes an iterative process: it first obtains the worst-case distribution scenario via adversarial training, then matches the distributions of the obtained sub-domains. We also present some theoretical insights. We conduct experiments on gesture recognition, speech commands recognition, wearable stress and affect detection, and sensor-based human activity recognition with a total of seven datasets in different settings. Results demonstrate that DIVERSIFY significantly outperforms other baselines and effectively characterizes the latent distributions by qualitative and quantitative analysis.

Facebook AI Research · 優化器 · Extensibility · 模型評估 · 經驗風險最小化 ·

2022 年 9 月 15 日

A Stochastic Optimization Framework for Fair Risk Minimization

Andrew Lowy,Sina Baharlouei,Rakesh Pavan,Meisam Razaviyayn,Ahmad Beirami

Despite the success of large-scale empirical risk minimization (ERM) at achieving high accuracy across a variety of machine learning tasks, fair ERM is hindered by the incompatibility of fairness constraints with stochastic optimization. We consider the problem of fair classification with discrete sensitive attributes and potentially large models and data sets, requiring stochastic solvers. Existing in-processing fairness algorithms are either impractical in the large-scale setting because they require large batches of data at each iteration or they are not guaranteed to converge. In this paper, we develop the first stochastic in-processing fairness algorithm with guaranteed convergence. For demographic parity, equalized odds, and equal opportunity notions of fairness, we provide slight variations of our algorithm--called FERMI--and prove that each of these variations converges in stochastic optimization with any batch size. Empirically, we show that FERMI is amenable to stochastic solvers with multiple (non-binary) sensitive attributes and non-binary targets, performing well even with minibatch size as small as one. Extensive experiments show that FERMI achieves the most favorable tradeoffs between fairness violation and test accuracy across all tested setups compared with state-of-the-art baselines for demographic parity, equalized odds, equal opportunity. These benefits are especially significant with small batch sizes and for non-binary classification with large number of sensitive attributes, making FERMI a practical fairness algorithm for large-scale problems.

樣本 · 類別 · 損失 · Performer · SimPLe ·

2019 年 1 月 16 日

Class-Balanced Loss Based on Effective Number of Samples

Yin Cui,Menglin Jia,Tsung-Yi Lin,Yang Song,Serge Belongie

from arxiv, Code is available at: //github.com/richardaecn/class-balanced-loss

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.