精品自在线观看影片天天看,亚洲精品国产字幕久久AV,久久九九RE热这里有精品

Geometric Semantic Geometric Programming (GSGP) is one of the most prominent Genetic Programming (GP) variants, thanks to its solid theoretical background, the excellent performance achieved, and the execution time significantly smaller than standard syntax-based GP. In recent years, a new mutation operator, Geometric Semantic Mutation with Local Search (GSM-LS), has been proposed to include a local search step in the mutation process based on the idea that performing a linear regression during the mutation can allow for a faster convergence to good-quality solutions. While GSM-LS helps the convergence of the evolutionary search, it is prone to overfitting. Thus, it was suggested to use GSM-LS only for a limited number of generations and, subsequently, to switch back to standard geometric semantic mutation. A more recently defined variant of GSGP (called GSGP-reg) also includes a local search step but shares similar strengths and weaknesses with GSM-LS. Here we explore multiple possibilities to limit the overfitting of GSM-LS and GSGP-reg, ranging from adaptive methods to estimate the risk of overfitting at each mutation to a simple regularized regression. The results show that the method used to limit overfitting is not that important: providing that a technique to control overfitting is used, it is possible to consistently outperform standard GSGP on both training and unseen data. The obtained results allow practitioners to better understand the role of local search in GSGP and demonstrate that simple regularization strategies are effective in controlling overfitting.

相關內容

過擬合

關注 8

過擬合，在AI領域多指機器學習得到模型太過復雜，導致在訓練集上表現很好，然而在測試集上卻不盡人意。過擬合（over-fitting）也稱為過學習，它的直觀表現是算法在訓練集上表現好，但在測試集上表現不好，泛化性能差。過擬合是在模型參數擬合過程中由于訓練數據包含抽樣誤差，在訓練時復雜的模型將抽樣誤差也進行了擬合導致的。

Analysis · 訓練數據 · 模型評估 · MoDELS · 輸出 ·

2023 年 7 月 14 日

Global sensitivity analysis in the limited data setting with application to char combustion

Dongjin Lee,Elle Lavichant,Boris Kramer

from arxiv, 26 pages, 11 figures

In uncertainty quantification, variance-based global sensitivity analysis quantitatively determines the effect of each input random variable on the output by partitioning the total output variance into contributions from each input. However, computing conditional expectations can be prohibitively costly when working with expensive-to-evaluate models. Surrogate models can accelerate this, yet their accuracy depends on the quality and quantity of training data, which is expensive to generate (experimentally or computationally) for complex engineering systems. Thus, methods that work with limited data are desirable. We propose a diffeomorphic modulation under observable response preserving homotopy (D-MORPH) regression to train a polynomial dimensional decomposition surrogate of the output that minimizes the number of training data. The new method first computes a sparse Lasso solution and uses it to define the cost function. A subsequent D-MORPH regression minimizes the difference between the D-MORPH and Lasso solution. The resulting D-MORPH surrogate is more robust to input variations and more accurate with limited training data. We illustrate the accuracy and computational efficiency of the new surrogate for global sensitivity analysis using mathematical functions and an expensive-to-simulate model of char combustion. The new method is highly efficient, requiring only 15% of the training data compared to conventional regression.

估計/估計量 · MoDELS · 泛函 · 置換不變性 · 不變 ·

2023 年 7 月 13 日

Choice Models and Permutation Invariance

Amandeep Singh,Ye Liu,Hema Yoganarasimhan

Choice Modeling is at the core of many economics, operations, and marketing problems. In this paper, we propose a fundamental characterization of choice functions that encompasses a wide variety of extant choice models. We demonstrate how nonparametric estimators like neural nets can easily approximate such functionals and overcome the curse of dimensionality that is inherent in the non-parametric estimation of choice functions. We demonstrate through extensive simulations that our proposed functionals can flexibly capture underlying consumer behavior in a completely data-driven fashion and outperform traditional parametric models. As demand settings often exhibit endogenous features, we extend our framework to incorporate estimation under endogenous features. Further, we also describe a formal inference procedure to construct valid confidence intervals on objects of interest like price elasticity. Finally, to assess the practical applicability of our estimator, we utilize a real-world dataset from S. Berry, Levinsohn, and Pakes (1995). Our empirical analysis confirms that the estimator generates realistic and comparable own- and cross-price elasticities that are consistent with the observations reported in the existing literature.

移動平均 · MoDELS · 蒙特卡羅 · Analysis · 講稿 ·

2023 年 7 月 13 日

Bayesian Analysis of Beta Autoregressive Moving Average Models

Aline Foerster Grande,Guilherme Pumi,Gabriela Bettella Cybis

This work presents a Bayesian approach for the estimation of Beta Autoregressive Moving Average ($\beta$ARMA) models. We discuss standard choice for the prior distributions and employ a Hamiltonian Monte Carlo algorithm to sample from the posterior. We propose a method to approach the problem of unit roots in the model's systematic component. We then present a series of Monte Carlo simulations to evaluate the performance of this Bayesian approach. In addition to parameter estimation, we evaluate the proposed approach to verify the presence of unit roots in the model's systematic component and study prior sensitivity. An empirical application is presented to exemplify the usefulness of the method. In the application, we compare the fitted Bayesian and frequentist approaches in terms of their out-of-sample forecasting capabilities.

高斯混合（模型） · Learning · 高斯混合模型 · MoDELS · Networking ·

2023 年 7 月 13 日

Cramer Type Distances for Learning Gaussian Mixture Models by Gradient Descent

Ruichong Zhang

from arxiv, 21 pages, 4 figures

The learning of Gaussian Mixture Models (also referred to simply as GMMs) plays an important role in machine learning. Known for their expressiveness and interpretability, Gaussian mixture models have a wide range of applications, from statistics, computer vision to distributional reinforcement learning. However, as of today, few known algorithms can fit or learn these models, some of which include Expectation-Maximization algorithms and Sliced Wasserstein Distance. Even fewer algorithms are compatible with gradient descent, the common learning process for neural networks. In this paper, we derive a closed formula of two GMMs in the univariate, one-dimensional case, then propose a distance function called Sliced Cram\'er 2-distance for learning general multivariate GMMs. Our approach has several advantages over many previous methods. First, it has a closed-form expression for the univariate case and is easy to compute and implement using common machine learning libraries (e.g., PyTorch and TensorFlow). Second, it is compatible with gradient descent, which enables us to integrate GMMs with neural networks seamlessly. Third, it can fit a GMM not only to a set of data points, but also to another GMM directly, without sampling from the target model. And fourth, it has some theoretical guarantees like global gradient boundedness and unbiased sampling gradient. These features are especially useful for distributional reinforcement learning and Deep Q Networks, where the goal is to learn a distribution over future rewards. We will also construct a Gaussian Mixture Distributional Deep Q Network as a toy example to demonstrate its effectiveness. Compared with previous models, this model is parameter efficient in terms of representing a distribution and possesses better interpretability.

INTERACT · Agent · MoDELS · 情景 · 約束 ·

2023 年 7 月 13 日

A Local-Time Semantics for Negotiations

Madhavan Mukund,Adwitee Roy,B Srivathsan

from arxiv, A shorter version appears in FORMATS 2023

Negotiations, introduced by Esparza et al., are a model for concurrent systems where computations involving a set of agents are described in terms of their interactions. In many situations, it is natural to impose timing constraints between interactions -- for instance, to limit the time available to enter the PIN after inserting a card into an ATM. To model this, we introduce a real-time aspect to negotiations. In our model of local-timed negotiations, agents have local reference times that evolve independently. Inspired by the model of networks of timed automata, each agent is equipped with a set of local clocks. Similar to timed automata, the outcomes of a negotiation contain guards and resets over the local clocks. As a new feature, we allow some interactions to force the reference clocks of the participating agents to synchronize. This synchronization constraint allows us to model interesting scenarios. Surprisingly, it also gives unlimited computing power. We show that reachability is undecidable for local-timed negotiations with a mixture of synchronized and unsynchronized interactions. We study restrictions on the use of synchronized interactions that make the problem decidable.

Performer · MoDELS · 變換 · 信息通信技術 · 縮放 ·

2023 年 7 月 12 日

Joint Hierarchical Priors and Adaptive Spatial Resolution for Efficient Neural Image Compression

Ahmed Ghorbel,Wassim Hamidouche,Luce Morin

Recently, the performance of neural image compression (NIC) has steadily improved thanks to the last line of study, reaching or outperforming state-of-the-art conventional codecs. Despite significant progress, current NIC methods still rely on ConvNet-based entropy coding, limited in modeling long-range dependencies due to their local connectivity and the increasing number of architectural biases and priors, resulting in complex underperforming models with high decoding latency. Motivated by the efficiency investigation of the Tranformer-based transform coding framework, namely SwinT-ChARM, we propose to enhance the latter, as first, with a more straightforward yet effective Tranformer-based channel-wise auto-regressive prior model, resulting in an absolute image compression transformer (ICT). Through the proposed ICT, we can capture both global and local contexts from the latent representations and better parameterize the distribution of the quantized latents. Further, we leverage a learnable scaling module with a sandwich ConvNeXt-based pre-/post-processor to accurately extract more compact latent codes while reconstructing higher-quality images. Extensive experimental results on benchmark datasets showed that the proposed framework significantly improves the trade-off between coding efficiency and decoder complexity over the versatile video coding (VVC) reference encoder (VTM-18.0) and the neural codec SwinT-ChARM. Moreover, we provide model scaling studies to verify the computational efficiency of our approach and conduct several objective and subjective analyses to bring to the fore the performance gap between the adaptive image compression transformer (AICT) and the neural codec SwinT-ChARM.

全局極小值 · 優化器 · 極小值 · 非凸 · 近似 ·

2021 年 3 月 24 日

Why Do Local Methods Solve Nonconvex Problems?

Tengyu Ma

from arxiv, This is the Chapter 21 of the book "Beyond the Worst-Case Analysis of Algorithms"

Non-convex optimization is ubiquitous in modern machine learning. Researchers devise non-convex objective functions and optimize them using off-the-shelf optimizers such as stochastic gradient descent and its variants, which leverage the local geometry and update iteratively. Even though solving non-convex functions is NP-hard in the worst case, the optimization quality in practice is often not an issue -- optimizers are largely believed to find approximate global minima. Researchers hypothesize a unified explanation for this intriguing phenomenon: most of the local minima of the practically-used objectives are approximately global minima. We rigorously formalize it for concrete instances of machine learning problems.

圖形處理器 · Neural Networks · 圖 · 規范化的 · Principle ·

2021 年 2 月 16 日

GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training

Tianle Cai,Shengjie Luo,Keyulu Xu,Di He,Tie-Yan Liu,Liwei Wang

Normalization is known to help the optimization of deep neural networks. Curiously, different architectures require specialized normalization methods. In this paper, we study what normalization is effective for Graph Neural Networks (GNNs). First, we adapt and evaluate the existing methods from other domains to GNNs. Faster convergence is achieved with InstanceNorm compared to BatchNorm and LayerNorm. We provide an explanation by showing that InstanceNorm serves as a preconditioner for GNNs, but such preconditioning effect is weaker with BatchNorm due to the heavy batch noise in graph datasets. Second, we show that the shift operation in InstanceNorm results in an expressiveness degradation of GNNs for highly regular graphs. We address this issue by proposing GraphNorm with a learnable shift. Empirically, GNNs with GraphNorm converge faster compared to GNNs using other normalization. GraphNorm also improves the generalization of GNNs, achieving better performance on graph classification benchmarks.

contrastive · 學成 · 對比學習 · Extensibility · SSL ·

2020 年 6 月 18 日

Contrastive learning of global and local features for medical image segmentation with limited annotations

Krishna Chaitanya,Ertunc Erdil,Neerav Karani,Ender Konukoglu

from arxiv, 16 pages, 2 figures, 7 tables. This article is a pre-print and is currently under review at a conference

A key requirement for the success of supervised deep learning is a large labeled dataset - a condition that is difficult to meet in medical image analysis. Self-supervised learning (SSL) can help in this regard by providing a strategy to pre-train a neural network with unlabeled data, followed by fine-tuning for a downstream task with limited annotations. Contrastive learning, a particular variant of SSL, is a powerful technique for learning image-level representations. In this work, we propose strategies for extending the contrastive learning framework for segmentation of volumetric medical images in the semi-supervised setting with limited annotations, by leveraging domain-specific and problem-specific cues. Specifically, we propose (1) novel contrasting strategies that leverage structural similarity across volumetric medical images (domain-specific cue) and (2) a local version of the contrastive loss to learn distinctive representations of local regions that are useful for per-pixel segmentation (problem-specific cue). We carry out an extensive evaluation on three Magnetic Resonance Imaging (MRI) datasets. In the limited annotation setting, the proposed method yields substantial improvements compared to other self-supervision and semi-supervised learning techniques. When combined with a simple data augmentation technique, the proposed method reaches within 8% of benchmark performance using only two labeled MRI volumes for training, corresponding to only 4% (for ACDC) of the training data used to train the benchmark.

推斷 · 估計/估計量 · 統計量 · Machine Learning · 學成 ·

2020 年 2 月 5 日

A Survey on Causal Inference

Liuyi Yao,Zhixuan Chu,Sheng Li,Yaliang Li,Jing Gao,Aidong Zhang

Causal inference is a critical research topic across many domains, such as statistics, computer science, education, public policy and economics, for decades. Nowadays, estimating causal effect from observational data has become an appealing research direction owing to the large amount of available data and low budget requirement, compared with randomized controlled trials. Embraced with the rapidly developed machine learning area, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well known causal inference framework. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework or not. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including the applications in advertising, recommendation, medicine and so on. Moreover, the commonly used benchmark datasets as well as the open-source codes are also summarized, which facilitate researchers and practitioners to explore, evaluate and apply the causal inference methods.