Generating calibrated and sharp neural network predictive distributions for regression problems is essential for optimal decision-making in many real-world applications. To address the miscalibration issue of neural networks, various methods have been proposed to improve calibration, including post-hoc methods that adjust predictions after training and regularization methods that act during training. While post-hoc methods have shown better improvement in calibration compared to regularization methods, the post-hoc step is completely independent of model training. We introduce a novel end-to-end model training procedure called Quantile Recalibration Training, integrating post-hoc calibration directly into the training process without additional parameters. We also present a unified algorithm that includes our method and other post-hoc and regularization methods, as particular cases. We demonstrate the performance of our method in a large-scale experiment involving 57 tabular regression datasets, showcasing improved predictive accuracy while maintaining calibration. We also conduct an ablation study to evaluate the significance of different components within our proposed method, as well as an in-depth analysis of the impact of the base model and different hyperparameters on predictive accuracy.
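The post-hoc step that the abstract refers to can be illustrated with a minimal sketch of quantile recalibration. This is not the paper's Quantile Recalibration Training procedure; it only shows the standard post-hoc idea of mapping predicted CDF values through the empirical CDF of the probability integral transform (PIT) values. The Gaussian predictive distribution, the toy data, and all function names are assumptions made for illustration.

```python
import math
import random
from bisect import bisect_right


def gaussian_cdf(y, mu, sigma):
    """CDF of a Gaussian predictive distribution N(mu, sigma^2) at y."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def fit_recalibration_map(pit_values):
    """Return R(p), the empirical CDF of the calibration-set PIT values."""
    sorted_pit = sorted(pit_values)
    n = len(sorted_pit)

    def recalibration_map(p):
        # Fraction of calibration PIT values that are <= p.
        return bisect_right(sorted_pit, p) / n

    return recalibration_map


# Toy data (assumed): the true noise has std 1.0, but the model claims std 2.0,
# so its predictive distribution is too wide.
rng = random.Random(0)
targets = [rng.gauss(0.0, 1.0) for _ in range(5000)]
pit = [gaussian_cdf(y, mu=0.0, sigma=2.0) for y in targets]

recal = fit_recalibration_map(pit)

# Before recalibration, the nominal 90% level covers far more than 90% of targets.
coverage_before = sum(p <= 0.9 for p in pit) / len(pit)
# After recalibration (evaluated in-sample for brevity), coverage matches the level.
coverage_after = sum(recal(p) <= 0.9 for p in pit) / len(pit)
```

In practice the map would be fitted on a held-out calibration set and evaluated on separate test data; the in-sample check above only demonstrates the mechanics.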

Related content

Regularization term

Processing (programming language) · Language processing · MoDELS · Natural language processing · NLP

May 1, 2024

A Legal Framework for Natural Language Processing Model Training in Portugal

Rúben Almeida, Evelin Amorim

from arxiv, LEGAL2024 Legal and Ethical Issues in Human Language Technologies, LREC 2024

Recent advances in deep learning have promoted the advent of many computational systems capable of performing intelligent actions that, until then, were restricted to the human intellect. In the particular case of human languages, these advances allowed the introduction of applications like ChatGPT that are capable of generating coherent text without being explicitly programmed to do so. Instead, these models use large volumes of textual data to learn meaningful representations of human languages. Associated with these advances, concerns about copyright and data privacy infringements caused by these applications have emerged. Despite these concerns, the pace at which new natural language processing applications have been developed has far outpaced the introduction of new regulations. Today, communication barriers between legal experts and computer scientists lead to many unintentional legal infringements during the development of such applications. In this paper, a multidisciplinary team intends to bridge this communication gap and promote more compliant Portuguese NLP research by presenting a series of everyday NLP use cases, while highlighting the Portuguese legislation that may apply during their development.

Integration · Neural Networks · Networking · Reducible · Dynamical systems

April 30, 2024

Neural Operator Learning for Long-Time Integration in Dynamical Systems with Recurrent Neural Networks

Katarzyna Michałowska, Somdatta Goswami, George Em Karniadakis, Signe Riemer-Sørensen

from arxiv, 8 pages, 5 figures

Deep neural networks are an attractive alternative for simulating complex dynamical systems, as in comparison to traditional scientific computing methods, they offer reduced computational costs during inference and can be trained directly from observational data. Existing methods, however, cannot extrapolate accurately and are prone to error accumulation in long-time integration. Herein, we address this issue by combining neural operators with recurrent neural networks, learning the operator mapping, while offering a recurrent structure to capture temporal dependencies. The integrated framework is shown to stabilize the solution and reduce error accumulation for both interpolation and extrapolation of the Korteweg-de Vries equation.

Model selection · MoDELS · Estimation/estimator · Analysis · Hyperparameters

April 29, 2024

Empirical Analysis of Model Selection for Heterogeneous Causal Effect Estimation

Divyat Mahajan, Ioannis Mitliagkas, Brady Neal, Vasilis Syrgkanis

from arxiv, Proceedings of the 12th International Conference on Learning Representations (ICLR), 2024. (Spotlight)

We study the problem of model selection in causal inference, specifically for conditional average treatment effect (CATE) estimation. Unlike machine learning, there is no perfect analogue of cross-validation for model selection as we do not observe the counterfactual potential outcomes. Towards this, a variety of surrogate metrics have been proposed for CATE model selection that use only observed data. However, we do not have a good understanding regarding their effectiveness due to limited comparisons in prior studies. We conduct an extensive empirical analysis to benchmark the surrogate model selection metrics introduced in the literature, as well as the novel ones introduced in this work. We ensure a fair comparison by tuning the hyperparameters associated with these metrics via AutoML, and provide more detailed trends by incorporating realistic datasets via generative modeling. Our analysis suggests novel model selection strategies based on careful hyperparameter selection of CATE estimators and causal ensembling.

Normalized · Estimation/estimator · 3D · state-of-the-art · Round

April 29, 2024

3D Gaussian Splatting with Deferred Reflection

Keyang Ye, Qiming Hou, Kun Zhou

Neural and Gaussian-based radiance field methods have achieved great success in novel view synthesis. However, specular reflection remains non-trivial, as the high-frequency radiance field is notoriously difficult to fit stably and accurately. We present a deferred shading method to effectively render specular reflection with Gaussian splatting. The key challenge comes from the environment map reflection model, which requires accurate surface normals while simultaneously bottlenecking normal estimation with discontinuous gradients. We leverage the per-pixel reflection gradients generated by deferred shading to bridge the optimization process of neighboring Gaussians, allowing nearly correct normal estimations to gradually propagate and eventually spread over all reflective objects. Our method significantly outperforms state-of-the-art techniques and concurrent work in synthesizing high-quality specular reflection effects, demonstrating a consistent improvement of peak signal-to-noise ratio (PSNR) for both synthetic and real-world scenes, while running at a frame rate almost identical to vanilla Gaussian splatting.
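For readers unfamiliar with the PSNR metric mentioned in the abstract, the following is a minimal sketch of its standard definition, PSNR = 10 · log10(MAX² / MSE). The function name, the flat-list image representation, and the default peak value of 1.0 are assumptions for illustration and are not taken from the paper.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    max_value is the largest possible pixel value (1.0 for normalized images,
    255 for 8-bit images).
    """
    if len(reference) != len(rendered):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by 0.1, so MSE is about 0.01 and PSNR is about 20 dB.
example_db = psnr([0.0, 1.0], [0.1, 0.9])
```

Higher values indicate a rendered image closer to the reference.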

INFORMS · Optimizer · Reducible · Agent · Multimodal

April 29, 2024

Trajectory Optimization for Adaptive Informative Path Planning with Multimodal Sensing

Joshua Ott, Edward Balaban, Mykel Kochenderfer

from arxiv, IEEE International Conference on Control, Decision and Information Technologies

We consider the problem of an autonomous agent equipped with multiple sensors, each with different sensing precision and energy costs. The agent's goal is to explore the environment and gather information subject to its resource constraints in unknown, partially observable environments. The challenge lies in reasoning about the effects of sensing and movement while respecting the agent's resource and dynamic constraints. We formulate the problem as a trajectory optimization problem and solve it using a projection-based trajectory optimization approach where the objective is to reduce the variance of the Gaussian process world belief. Our approach outperforms previous approaches in long-horizon trajectories by achieving an overall variance reduction of up to 85% and reducing the root-mean-square error in the environment belief by 50%. This approach was developed in support of rover path planning for the NASA VIPER Mission.

Sparse · Sparse weights · Weight · Pruning · Neural Networks

April 26, 2024

Sparse Weight Averaging with Multiple Particles for Iterative Magnitude Pruning

Moonseok Choi, Hyungi Lee, Giung Nam, Juho Lee

from arxiv, ICLR 2024

Given the ever-increasing size of modern neural networks, the significance of sparse architectures has surged due to their accelerated inference speeds and minimal memory demands. When it comes to global pruning techniques, Iterative Magnitude Pruning (IMP) still stands as a state-of-the-art algorithm despite its simple nature, particularly in extremely sparse regimes. In light of the recent finding that the two successive matching IMP solutions are linearly connected without a loss barrier, we propose Sparse Weight Averaging with Multiple Particles (SWAMP), a straightforward modification of IMP that achieves performance comparable to an ensemble of two IMP solutions. For every iteration, we concurrently train multiple sparse models, referred to as particles, using different batch orders yet the same matching ticket, and then weight average such models to produce a single mask. We demonstrate that our method consistently outperforms existing baselines across different sparsities through extensive experiments on various data and neural network structures.
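The averaging-then-pruning step described in the abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the training of each particle is omitted, weights are flat Python lists, and the function names `average_particles` and `magnitude_prune` are hypothetical.

```python
def average_particles(particles, mask):
    """Element-wise average of particle weights that share one sparsity mask."""
    n = len(particles)
    return [m * sum(p[i] for p in particles) / n for i, m in enumerate(mask)]


def magnitude_prune(weights, mask, prune_fraction):
    """Zero out the smallest-magnitude fraction of the still-active weights."""
    alive = [i for i, m in enumerate(mask) if m == 1]
    n_prune = int(len(alive) * prune_fraction)
    # Indices of the active weights with the smallest absolute value.
    to_prune = sorted(alive, key=lambda i: abs(weights[i]))[:n_prune]
    new_mask = list(mask)
    for i in to_prune:
        new_mask[i] = 0
    return new_mask


# Two particles assumed to have been trained from the same ticket
# with different batch orders (the training itself is not shown).
mask = [1, 1, 1, 1, 0]
particles = [
    [0.9, -0.1, 0.4, 0.05, 0.0],
    [1.1, 0.3, 0.6, -0.05, 0.0],
]

averaged = average_particles(particles, mask)
new_mask = magnitude_prune(averaged, mask, prune_fraction=0.5)
```

In an iterative magnitude pruning loop, the new mask would be applied before the next round of training; how the weights are reset between rounds is left out of this sketch.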

INFORMS · Recommender systems · Operation · TEAM · AI

October 17, 2023

Improving Operator Situation Awareness when Working with AI Recommender Systems

Divya K. Srivastava, J. Mason Lilly, Karen M. Feigh

from arxiv, 26 pages, 12 figures, submitted to Springer's "Cognition, Technology, and Work" journal

AI recommender systems are sought for decision support by providing suggestions to operators responsible for making final decisions. However, these systems are typically considered black boxes, and are often presented without any context or insight into the underlying algorithm. As a result, recommender systems can lead to miscalibrated user reliance and decreased situation awareness. Recent work has focused on improving the transparency of recommender systems in various ways such as improving the recommender's analysis and visualization of the figures of merit, providing explanations for the recommender's decision, as well as improving user training or calibrating user trust. In this paper, we introduce an alternative transparency technique of structuring the order in which contextual information and the recommender's decision are shown to the human operator. This technique is designed to improve the operator's situation awareness and therefore the shared situation awareness between the operator and the recommender system. This paper presents the results of a two-phase between-subjects study in which participants and a recommender system jointly make a high-stakes decision. We varied the amount of contextual information the participant had, the assessment technique of the figures of merit, and the reliability of the recommender system. We found that providing contextual information upfront improves the team's shared situation awareness by improving the human decision maker's initial and final judgment, as well as their ability to discern the recommender's error boundary. Additionally, this technique accurately calibrated the human operator's trust in the recommender. This work proposes and validates a way to provide model-agnostic transparency into AI systems that can support the human decision maker and lead to improved team performance.

INFORMS · Identifiable · Networking · Neural Networks · Black box

October 4, 2021

Fine-Grained Neural Network Explanation by Identifying Input Features with Predictive Information

Yang Zhang, Ashkan Khakzar, Yawei Li, Azade Farshad, Seong Tae Kim, Nassir Navab

from arxiv, Accepted in NeurIPS 2021 (Neural Information Processing Systems)

One principal approach for illuminating a black-box neural network is feature attribution, i.e., identifying the importance of input features for the network's prediction. The predictive information of features was recently proposed as a proxy for the measure of their importance. So far, the predictive information has only been identified for latent features by placing an information bottleneck within the network. We propose a method to identify features with predictive information in the input domain. The method results in fine-grained identification of input features' information and is agnostic to network architecture. The core idea of our method is leveraging a bottleneck on the input that only lets input features associated with predictive latent features pass through. We compare our method with several feature attribution methods using mainstream feature attribution evaluation experiments. The code is publicly available.

Discretization · Graph · GPU · Neural Networks · Networking

March 28, 2019

Learning Discrete Structures for Graph Neural Networks

Luca Franceschi, Mathias Niepert, Massimiliano Pontil, Xiao He

from arxiv, 18 pages

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.

Example · Black box · Networking · MoDELS · Origin

January 15, 2018

Generating Adversarial Examples with Adversarial Networks

Chaowei Xiao, Bo Li, Jun-Yan Zhu, Warren He, Mingyan Liu, Dawn Song

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but how to produce them with high perceptual quality and more efficiently requires more research effort. In this paper, we propose AdvGAN to generate adversarial examples with generative adversarial networks (GANs), which can learn and approximate the distribution of original instances. For AdvGAN, once the generator is trained, it can generate adversarial perturbations efficiently for any instance, so as to potentially accelerate adversarial training as defenses. We apply AdvGAN in both semi-whitebox and black-box attack settings. In semi-whitebox attacks, there is no need to access the original target model after the generator is trained, in contrast to traditional white-box attacks. In black-box attacks, we dynamically train a distilled model for the black-box model and optimize the generator accordingly. Adversarial examples generated by AdvGAN on different target models have a high attack success rate under state-of-the-art defenses compared to other attacks. Our attack placed first with 92.76% accuracy on a public MNIST black-box attack challenge.