We consider the problem of learning support vector machines robust to uncertainty. It has been established in the literature that typical loss functions, including the hinge loss, are sensitive to data perturbations and outliers, and thus perform poorly in the setting considered. In contrast, using the 0-1 loss or a suitable non-convex approximation results in robust estimators, at the expense of large computational costs. In this paper we use mixed-integer optimization techniques to derive a new loss function that better approximates the 0-1 loss compared with existing alternatives, while preserving the convexity of the learning problem. In our computational results, we show that the proposed estimator is competitive with standard SVMs using the hinge loss in outlier-free regimes and superior in the presence of outliers.
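
As a rough numerical illustration of the motivation only (this is not the paper's mixed-integer-derived loss), the snippet below compares how a single extreme outlier contributes to the hinge loss, the 0-1 loss, and a ramp-style truncated hinge loss, the latter being one standard example of the non-convex approximations mentioned above. All quantities are margins m = y*f(x) for a hypothetical classifier.

import numpy as np

# Illustrative losses as functions of the margin m = y * f(x).
def hinge_loss(margin):
    # Convex surrogate used by standard SVMs; grows linearly for badly
    # misclassified points, so a single outlier can dominate the objective.
    return np.maximum(0.0, 1.0 - margin)

def zero_one_loss(margin):
    # The robust but non-convex target: every misclassification costs 1,
    # no matter how far the point lies from the decision boundary.
    return (margin <= 0.0).astype(float)

def ramp_loss(margin, s=-1.0):
    # A truncated (non-convex) approximation of the 0-1 loss, shown only
    # for contrast with the unbounded hinge loss.
    return np.minimum(hinge_loss(margin), 1.0 - s)

margins = np.array([2.0, 0.5, -0.5, -10.0])  # the last point is an extreme outlier
print("hinge:", hinge_loss(margins))         # outlier contributes 11.0
print("0-1:  ", zero_one_loss(margins))      # outlier contributes 1.0
print("ramp: ", ramp_loss(margins))          # outlier capped at 2.0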

Related content

Adaptiveness is a key principle in information processing including statistics and machine learning. We investigate the usefulness of adaptive methods in the framework of asymptotic binary hypothesis testing, when each hypothesis represents asymptotically many independent instances of a quantum channel, and the tests are based on using the unknown channel and observing outputs. Unlike the familiar setting of quantum states as hypotheses, there is a fundamental distinction between adaptive and non-adaptive strategies with respect to the channel uses, and we introduce a number of further variants of the discrimination tasks by imposing different restrictions on the test strategies. The following results are obtained: (1) We prove that for classical-quantum channels, adaptive and non-adaptive strategies lead to the same error exponents both in the symmetric (Chernoff) and asymmetric (Hoeffding, Stein) settings. (2) The first separation between adaptive and non-adaptive symmetric hypothesis testing exponents for quantum channels, which we derive from a general lower bound on the error probability for non-adaptive strategies; the concrete example we analyze is a pair of entanglement-breaking channels. (3) We prove, in some sense generalizing the previous statement, that for general channels adaptive strategies restricted to classical feed-forward and product state channel inputs are not superior in the asymptotic limit to non-adaptive product state strategies. (4) As an application of our findings, we address the discrimination power of an arbitrary quantum channel and show that adaptive strategies with classical feedback and no quantum memory at the input do not increase the discrimination power of the channel beyond non-adaptive tensor product input strategies.
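
For background only (standard results, not claims from the paper): in the familiar setting of quantum states as hypotheses, with $n$ i.i.d. copies of $\rho$ versus $\sigma$, the symmetric and asymmetric error exponents that the channel-discrimination exponents above generalize are given by the quantum Chernoff bound and quantum Stein's lemma,
\[
\xi_{\mathrm{Chernoff}}(\rho,\sigma) \;=\; -\log \min_{0 \le s \le 1} \operatorname{Tr}\!\big(\rho^{s}\sigma^{1-s}\big),
\qquad
\lim_{n\to\infty} -\tfrac{1}{n}\log \beta_{n} \;=\; D(\rho\|\sigma) \;=\; \operatorname{Tr}\!\big(\rho\,(\log\rho - \log\sigma)\big),
\]
where $\beta_n$ is the optimal type-II error probability under a fixed bound on the type-I error.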

When complex Bayesian models exhibit implausible behaviour, one solution is to assemble available information into an informative prior. Challenges arise as prior information is often only available for the observable quantity, or some model-derived marginal quantity, rather than directly pertaining to the natural parameters in our model. We propose a method for translating available prior information, in the form of an elicited distribution for the observable or model-derived marginal quantity, into an informative joint prior. Our approach proceeds given a parametric class of prior distributions with as yet undetermined hyperparameters, and minimises the difference between the supplied elicited distribution and corresponding prior predictive distribution. We employ a global, multi-stage Bayesian optimisation procedure to locate optimal values for the hyperparameters. Three examples illustrate our approach: a cure-fraction survival model, where censoring implies that the observable quantity is a priori a mixed discrete/continuous quantity; a setting in which prior information pertains to $R^{2}$ -- a model-derived quantity; and a nonlinear regression model.
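
A minimal sketch of the general idea under toy assumptions: the observable is $y \sim \mathrm{Normal}(\mu, 1)$ with prior $\mu \sim \mathrm{Normal}(m, s)$, the elicited distribution is summarised by three hypothetical quantiles, and plain random search stands in for the paper's multi-stage Bayesian optimisation.

import numpy as np

rng = np.random.default_rng(0)

# Elicited information about the observable y: 10%, 50%, 90% quantiles.
target_quantiles = np.array([-2.0, 0.0, 2.0])
probs = np.array([0.1, 0.5, 0.9])

def prior_predictive_quantiles(m, s, n=20_000):
    mu = rng.normal(m, s, size=n)    # draw hyperparameter-dependent prior samples
    y = rng.normal(mu, 1.0)          # push them through the likelihood
    return np.quantile(y, probs)

def discrepancy(hyper):
    m, log_s = hyper
    q = prior_predictive_quantiles(m, np.exp(log_s))
    return np.sum((q - target_quantiles) ** 2)   # distance between elicited and prior predictive quantiles

# Crude global search over the hyperparameter space (the paper uses a
# multi-stage Bayesian optimisation procedure instead).
candidates = rng.uniform([-5.0, -2.0], [5.0, 2.0], size=(500, 2))
best = min(candidates, key=discrepancy)
print("m =", round(best[0], 2), " s =", round(float(np.exp(best[1])), 2))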

This work presents the cubature scheme for fitting spatio-temporal Poisson point processes. The methodology is implemented in the R (R Core Team, 2024) package stopp (D'Angelo and Adelfio, 2023), published on the Comprehensive R Archive Network (CRAN) and available at https://CRAN.R-project.org/package=stopp. Since the number of dummy points should be sufficient for an accurate estimate of the likelihood, numerical experiments are currently under development to give guidelines on this aspect.
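
The package itself is written in R; as a language-neutral sketch of the generic dummy-point (Berman-Turner-style) quadrature device behind such cubature schemes, and not of the package's actual implementation, the Poisson log-likelihood can be approximated by a weighted sum over data and dummy points:

import numpy as np

def quad_loglik(theta, points, dummies, weights_fn, intensity_fn):
    # Quadrature nodes = observed points plus dummy points.
    nodes = np.vstack([points, dummies])
    z = np.concatenate([np.ones(len(points)), np.zeros(len(dummies))])
    w = weights_fn(nodes)              # cubature weights, summing to ~|window|
    lam = intensity_fn(nodes, theta)   # parametric intensity at each node
    y = z / w
    # sum_j w_j (y_j log lam_j - lam_j)  ~  sum_i log lam(x_i) - integral of lam
    return np.sum(w * (y * np.log(lam) - lam))

# Toy example on the unit cube (two spatial + one temporal coordinate),
# with a log-linear intensity lam(u) = exp(theta0 + theta1 * t).
rng = np.random.default_rng(1)
pts = rng.uniform(size=(60, 3))        # pretend these are the observed events
dum = rng.uniform(size=(500, 3))       # dummy points filling the window
w_fn = lambda nodes: np.full(len(nodes), 1.0 / len(nodes))   # window volume = 1
lam_fn = lambda nodes, th: np.exp(th[0] + th[1] * nodes[:, 2])
print(quad_loglik(np.array([4.0, 0.0]), pts, dum, w_fn, lam_fn))

The more dummy points, the better the quadrature approximation of the intensity integral, which is the accuracy question the numerical experiments mentioned above are meant to address.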

In recent years, power analysis has become widely used in applied sciences, with the increasing importance of the replicability issue. When distribution-free methods, such as Partial Least Squares (PLS)-based approaches, are considered, formulating power analysis turns out to be challenging. In this study, we introduce the methodological framework of a new procedure for performing power analysis when PLS-based methods are used. Data are simulated by the Monte Carlo method, assuming the null hypothesis of no effect is false and exploiting the latent structure estimated by PLS in the pilot data. In this way, the complex correlation data structure is explicitly considered in power analysis and sample size estimation. The paper offers insights into selecting statistical tests for the power analysis procedure, comparing accuracy-based tests and those based on continuous parameters estimated by PLS. Simulated and real datasets are investigated to show how the method works in practice.
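
A minimal sketch of the Monte Carlo power loop, with an ordinary two-sample t-test standing in for the PLS-based tests discussed in the paper, and with purely hypothetical effect-size and sample-size values; in the actual procedure the simulation would reuse the latent structure estimated by PLS on the pilot data.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def estimated_power(n_per_group, effect_size, n_sim=2000, alpha=0.05):
    # Simulate under the alternative, apply the test, count rejections.
    rejections = 0
    for _ in range(n_sim):
        x = rng.normal(0.0, 1.0, n_per_group)
        y = rng.normal(effect_size, 1.0, n_per_group)
        _, p = stats.ttest_ind(x, y)
        rejections += (p < alpha)
    return rejections / n_sim

# Smallest sample size per group reaching ~80% power for a 0.5 SD effect.
for n in range(10, 200, 10):
    if estimated_power(n, 0.5) >= 0.80:
        print("n per group ~", n)
        break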

In this thesis, we study problems at the interface of analysis and discrete mathematics. We discuss analogues of well-known Hardy-type inequalities and rearrangement inequalities on the lattice graphs $\mathbb{Z}^d$, with a particular focus on the behaviour of sharp constants and optimizers. In the first half of the thesis, we analyse Hardy inequalities on $\mathbb{Z}^d$, first for $d=1$ and then for $d \geq 3$. We prove a sharp weighted Hardy inequality on the integers with power weights of the form $n^\alpha$. This is done via two different methods, namely the super-solution method and the Fourier method. We also use the Fourier method to prove a weighted Hardy-type inequality for higher-order operators. After discussing the one-dimensional case, we study the Hardy inequality in higher dimensions ($d \geq 3$). In particular, we compute the asymptotic behaviour of the sharp constant in the discrete Hardy inequality as $d \rightarrow \infty$. This is done by converting the inequality into a continuous Hardy-type inequality on a torus for functions having zero average. These continuous inequalities are new and interesting in themselves. In the second half, we focus our attention on analogues of rearrangement inequalities on lattice graphs. We begin by analysing the situation in dimension one. We define various notions of rearrangement and prove the corresponding Pólya-Szegő inequality. These inequalities are also applied to prove some weighted Hardy inequalities on the integers. Finally, we study rearrangement (Pólya-Szegő) inequalities on general graphs, with a particular focus on the lattice graphs $\mathbb{Z}^d$ for $d \geq 2$. We develop a framework to study these inequalities, using which we derive concrete results in dimension two. In particular, these results establish connections between the Pólya-Szegő inequality and various isoperimetric inequalities on graphs.
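
For context (a classical fact, not a result of the thesis), the original one-dimensional discrete Hardy inequality, of which the weighted and higher-dimensional inequalities studied here are analogues, reads
\[
\sum_{n=1}^{\infty}\left(\frac{1}{n}\sum_{k=1}^{n} a_k\right)^{p}
\;\le\; \left(\frac{p}{p-1}\right)^{p}\sum_{n=1}^{\infty} a_n^{p},
\qquad a_n \ge 0,\; p>1,
\]
and the constant $(p/(p-1))^{p}$ is sharp.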

Curiosity-driven learning has shown significant positive effects on students' learning experiences and outcomes. Despite this importance, however, reports show that children lack this skill, especially in formal educational settings. To address this challenge, we propose an 8-session workshop that aims to enhance children's curiosity by training a set of specific metacognitive skills we hypothesize are involved in the curiosity process. Our workshop contains animated videos presenting declarative knowledge about curiosity and these metacognitive skills, as well as practice sessions to apply the skills (e.g. expressing uncertainty, formulating questions) during a reading-comprehension task, using a web platform designed for this study. We conduct a pilot study with 15 primary school students aged between 8 and 10. Our first results show a positive impact on children's metacognitive efficiency and their ability to express their curiosity through question-asking behaviors.

Surrogate modelling techniques have seen growing attention in recent years when applied to both modelling and optimisation of industrial design problems. These techniques are highly relevant when assessing the performance of a particular design carries a high cost, as the overall cost can be mitigated via the construction of a model to be queried in lieu of the available high-cost source. The construction of these models can sometimes employ other sources of information that are both cheaper and less accurate. The existence of these sources, however, poses the question of which sources should be used when constructing a model. Recent studies have attempted to characterise harmful data sources to guide practitioners in choosing when to ignore a certain source. These studies have done so in a synthetic setting, characterising sources using a large amount of data that is not available in practice. Some of these studies have also been shown to potentially suffer from bias in the benchmarks used in the analysis. In this study, we present a characterisation of harmful low-fidelity sources using only the limited data available to train a surrogate model. We employ recently developed benchmark filtering techniques to conduct a bias-free assessment, providing objectively varied benchmark suites of different sizes for future research. Analysing one of these benchmark suites with the technique known as Instance Space Analysis, we provide an intuitive visualisation of when a low-fidelity source should be used and use this analysis to provide guidelines that can be used in an applied industrial setting.
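
As a loosely related toy diagnostic only, not the characterisation developed in this study and unrelated to Instance Space Analysis, one crude check that uses nothing but the limited training data is to rank-correlate the low- and high-fidelity responses at the points where both are available:

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical one-dimensional design problem with a small high-fidelity
# sample and a cheap source that happens to be misleading.
x = rng.uniform(0, 1, 15)
f_high = np.sin(6 * x) + 0.05 * rng.normal(size=x.size)
f_low = np.sin(6 * x + 1.5) + 0.3 * rng.normal(size=x.size)

rho, _ = stats.spearmanr(f_high, f_low)
print("Spearman correlation between fidelities:", round(float(rho), 2))
if rho < 0.5:
    print("the low-fidelity source looks unreliable on the available data")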

Software testing is essential for the reliable development of complex software systems. A key step in software testing is fault localization, which uses test data to pinpoint failure-inducing combinations for further diagnosis. Existing fault localization methods, however, are largely deterministic, and thus do not provide a principled approach for assessing probabilistic risk of potential root causes, or for integrating domain and/or structural knowledge from test engineers. To address this, we propose a novel Bayesian fault localization framework called BayesFLo, which leverages a flexible Bayesian model on potential root cause combinations. A key feature of BayesFLo is its integration of the principles of combination hierarchy and heredity, which capture the structured nature of failure-inducing combinations. A critical challenge, however, is the sheer number of potential root cause scenarios to consider, which renders the computation of posterior root cause probabilities infeasible even for small software systems. We thus develop new algorithms for efficient computation of such probabilities, leveraging recent tools from integer programming and graph representations. We then demonstrate the effectiveness of BayesFLo over state-of-the-art fault localization methods, in a suite of numerical experiments and in two motivating case studies on the JMP XGBoost interface.
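
A toy sketch of the underlying Bayesian logic on a three-factor binary system, using brute-force enumeration with made-up tests, outcomes, prior weights, and noise level; BayesFLo's actual algorithms avoid this enumeration via integer programming and graph representations.

import itertools
import numpy as np

# Covering-array-style tests (rows) over three binary factors, with outcomes
# 1 = failed, 0 = passed.
tests = np.array([[0, 1, 1],
                  [1, 1, 0],
                  [1, 0, 1],
                  [0, 0, 1]])
outcomes = np.array([1, 1, 0, 0])

# Candidate root causes: single settings and pairs of settings, encoded as
# {factor_index: required_value}.  Restricting attention to low orders
# reflects the hierarchy principle; a heredity-style prior could further
# link orders.
candidates = []
for k in (1, 2):
    for idx in itertools.combinations(range(3), k):
        for vals in itertools.product([0, 1], repeat=k):
            candidates.append(dict(zip(idx, vals)))

def covers(test, cause):
    return all(test[i] == v for i, v in cause.items())

eps = 0.05   # small probability of a flaky outcome

def likelihood(cause):
    p = 1.0
    for t, y in zip(tests, outcomes):
        fail_prob = 1 - eps if covers(t, cause) else eps
        p *= fail_prob if y == 1 else 1 - fail_prob
    return p

# Hierarchy-flavoured prior: higher-order combinations are a priori less likely.
prior = np.array([0.8 if len(c) == 1 else 0.2 for c in candidates])
post = prior * np.array([likelihood(c) for c in candidates])
post /= post.sum()
for c, p in sorted(zip(candidates, post), key=lambda t: -t[1])[:3]:
    print(c, round(float(p), 3))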

In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints such as computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communication and synchronisation, and is difficult to scale. In this paper we present four algorithms to solve these problems. The combination of these algorithms enables each agent to improve its task allocation strategy through reinforcement learning, while adjusting how much it explores the system in response to how optimal it believes its current strategy is, given its past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, limiting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, which then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to within 6.7% of the theoretical optimum for the system configurations considered. It provides 5x better performance recovery than no-knowledge-retention approaches when system connectivity is impacted, and is tested on systems of up to 100 agents with less than a 9% impact on the algorithms' performance.
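
A minimal single-agent sketch of the core idea, with hypothetical peer qualities and an ad hoc confidence measure; it ignores the paper's resource limits, networking effects, and the interplay between the four algorithms.

import numpy as np

rng = np.random.default_rng(4)

true_quality = np.array([0.3, 0.6, 0.8, 0.5])   # unknown success rate of each peer
n_peers = len(true_quality)
value = np.zeros(n_peers)                       # estimated value of allocating to each peer
counts = np.zeros(n_peers)

for step in range(2000):
    # Confidence-driven exploration: explore less once the best estimate
    # clearly separates from the second best.
    top_two = np.sort(value)[-2:]
    epsilon = max(0.02, 0.5 * np.exp(-5.0 * (top_two[1] - top_two[0])))
    peer = rng.integers(n_peers) if rng.random() < epsilon else int(np.argmax(value))
    reward = float(rng.random() < true_quality[peer])     # did the peer complete the subtask?
    counts[peer] += 1
    value[peer] += (reward - value[peer]) / counts[peer]  # incremental mean update

print("learned values:", np.round(value, 2))
print("preferred peer:", int(np.argmax(value)))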

The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.
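
A small illustration (not the survey's analysis) of two of these principles in overparametrized linear regression: gradient descent started from zero converges to the minimum-norm interpolant, computed here in closed form via the pseudoinverse; it fits the training data exactly and can be split into a component aligned with the signal and an orthogonal remainder that absorbs the noise.

import numpy as np

rng = np.random.default_rng(5)

n, d = 50, 500                                # many more parameters than samples
theta_star = np.zeros(d)
theta_star[:5] = 1.0                          # sparse "useful" signal
X = rng.normal(size=(n, d))
y = X @ theta_star + 0.1 * rng.normal(size=n)

theta_hat = np.linalg.pinv(X) @ y             # minimum-norm solution among all interpolants
print("max train residual:", float(np.max(np.abs(X @ theta_hat - y))))   # ~0: perfect fit

# Decompose the estimate into the part aligned with the signal direction and
# an orthogonal remainder.
u = theta_star / np.linalg.norm(theta_star)
aligned = (theta_hat @ u) * u
remainder = theta_hat - aligned
print("norm of aligned part:", round(float(np.linalg.norm(aligned)), 3))
print("norm of remainder:  ", round(float(np.linalg.norm(remainder)), 3))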
