
Conformal prediction is a popular, modern technique for providing valid predictive inference for arbitrary machine learning models. Its validity relies on the assumptions of exchangeability of the data, and symmetry of the given model fitting algorithm as a function of the data. However, exchangeability is often violated when predictive models are deployed in practice. For example, if the data distribution drifts over time, then the data points are no longer exchangeable; moreover, in such settings, we might want to use a nonsymmetric algorithm that treats recent observations as more relevant. This paper generalizes conformal prediction to deal with both aspects: we employ weighted quantiles to introduce robustness against distribution drift, and design a new randomization technique to allow for algorithms that do not treat data points symmetrically. Our new methods are provably robust, with substantially less loss of coverage when exchangeability is violated due to distribution drift or other challenging features of real data, while also achieving the same coverage guarantees as existing conformal prediction methods if the data points are in fact exchangeable. We demonstrate the practical utility of these new tools with simulations and real-data experiments on electricity and election forecasting.
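As a rough illustration of the weighted-quantile idea (a generic sketch under assumed geometrically decaying weights and absolute-residual nonconformity scores, not the paper's exact procedure):

```python
import numpy as np

def weighted_conformal_quantile(scores, weights, alpha):
    """Weighted (1 - alpha) quantile of calibration nonconformity scores.

    The unknown test-point score is treated as +infinity (with weight 1),
    which keeps the resulting prediction interval conservative.
    """
    scores = np.append(np.asarray(scores, float), np.inf)
    weights = np.append(np.asarray(weights, float), 1.0)
    order = np.argsort(scores)
    cum = np.cumsum(weights[order]) / weights.sum()
    # Smallest score whose cumulative normalized weight reaches 1 - alpha.
    return scores[order][np.searchsorted(cum, 1.0 - alpha)]

# Example: downweight older observations to hedge against distribution drift.
rng = np.random.default_rng(0)
residuals = np.abs(rng.normal(size=500))   # |y_i - mu_hat(x_i)| on calibration data
weights = 0.99 ** np.arange(500)[::-1]     # most recent points get weight close to 1
q = weighted_conformal_quantile(residuals, weights, alpha=0.1)
# Prediction interval for a new x: mu_hat(x) +/- q
```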

Related content

We present a novel approach to determine the evolution of level sets under uncertainties in the velocity fields. This leads to a stochastic description of the level sets. To compute the quantiles of random level sets, we use the stochastic Galerkin method for a hyperbolic reformulation of the level-set equations. A novel intrusive Galerkin formulation is presented and proven hyperbolic. It induces a corresponding finite-volume scheme that is specifically tailored for uncertain velocities.
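For orientation, a standard form of the level-set advection equation with an uncertain velocity field $v(x,t;\xi)$ depending on random parameters $\xi$ is (the paper's hyperbolic reformulation may differ in detail):
\[
\partial_t \phi(x,t;\xi) + v(x,t;\xi)\cdot\nabla_x \phi(x,t;\xi) = 0, \qquad \phi(x,0;\xi) = \phi_0(x),
\]
so that the zero level set $\{x : \phi(x,t;\xi)=0\}$ is itself a random set, whose quantiles the stochastic Galerkin discretization is designed to compute.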

Classic no-regret online prediction algorithms, including variants of the Upper Confidence Bound ($\texttt{UCB}$) algorithm, $\texttt{Hedge}$, and $\texttt{EXP3}$, are inherently unfair by design. The unfairness stems from their very objective of playing the most rewarding arm as many times as possible while ignoring the less rewarding ones among $N$ arms. In this paper, we consider a fair prediction problem in the stochastic setting with hard lower bounds on the rate of accrual of rewards for a set of arms. We study the problem in both full and bandit feedback settings. Using queueing-theoretic techniques in conjunction with adversarial learning, we propose a new online prediction policy called $\texttt{BanditQ}$ that achieves the target reward rates while incurring a regret and target-rate-violation penalty of $O(T^{\frac{3}{4}})$. In the full-information setting, the regret bound can be further improved to $O(\sqrt{T})$ when considering the average regret over the entire horizon of length $T$. The proposed policy is efficient and admits a black-box reduction from the fair prediction problem to the standard MAB problem with a carefully defined sequence of rewards. The design and analysis of the $\texttt{BanditQ}$ policy involve a novel use of the potential function method in conjunction with scale-free second-order regret bounds and a new self-bounding inequality for the reward gradients, which are of independent interest.
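As generic background on the queueing-theoretic ingredient (a sketch of virtual fairness queues with names and a simplistic greedy rule assumed for illustration, not the $\texttt{BanditQ}$ policy itself):

```python
import numpy as np

def virtual_queue_play(target_rates, mean_rewards, horizon):
    """Illustrative only: every arm carries a virtual queue that grows at its
    target rate and drains when the arm is played, so accumulated fairness
    debt eventually forces under-served arms to be selected."""
    target_rates = np.asarray(target_rates, float)
    n = len(target_rates)
    queues = np.zeros(n)
    pulls = np.zeros(n)
    for _ in range(horizon):
        # Serve fairness debt first; otherwise exploit the best-looking arm.
        arm = int(np.argmax(queues)) if queues.max() > 0 else int(np.argmax(mean_rewards))
        pulls[arm] += 1
        served = np.zeros(n)
        served[arm] = 1.0
        # Lyapunov-style update of the virtual queues.
        queues = np.maximum(queues + target_rates - served, 0.0)
    return pulls / horizon  # empirical play rates; should meet the targets when feasible
```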

The topic of this paper is testing exchangeability using e-values in the batch mode, with the Markov model as the alternative. The null hypothesis of exchangeability is formalized as a Kolmogorov-type compression model, and the Bayes mixture of the Markov model with respect to the uniform prior is taken as the simple alternative hypothesis. Using e-values instead of p-values leads to a computationally efficient testing procedure. In the appendices I explain connections with the algorithmic theory of randomness and with the traditional theory of testing statistical hypotheses. In standard statistical terminology, this paper proposes a new permutation test. This test can also be interpreted as a poor man's version of Kolmogorov's deficiency of randomness.
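As standard background (not specific to this paper): an e-variable for a null hypothesis $\mathcal{P}$ is a nonnegative statistic $E$ with $\mathbb{E}_P[E] \le 1$ for every $P \in \mathcal{P}$, and the likelihood ratio of a simple alternative $q$ to a simple null $p$ is the canonical example,
\[
E(x) = \frac{q(x)}{p(x)}, \qquad \mathbb{E}_p[E] = \int \frac{q(x)}{p(x)}\, p(x)\, dx = 1,
\]
which is why the Bayes mixture of the Markov model can serve directly as the alternative's density in such a ratio.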

We consider the following decision problems: given a finite, rational Markov chain, source and target states, and a rational threshold, does there exist an n such that the probability of reaching the target from the source in n steps is equal to the threshold (resp. crosses the threshold)? These problems are known to be equivalent to the Skolem (resp. Positivity) problems for Linear Recurrence Sequences (LRS). These are number-theoretic problems whose decidability has been open for decades. We present a short, self-contained, and elementary reduction from LRS to Markov Chains that improves the state of the art as follows: (a) We reduce to ergodic Markov Chains, a class that is widely used in Model Checking. (b) We reduce LRS to Markov Chains of significantly lower order than before. We thus get sharper hardness results for a more ubiquitous class of Markov Chains. Immediate applications include problems in modeling biological systems, and regular automata-based counting problems.
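The connection to LRS can be seen from a standard identity (background, not the paper's new reduction): if $M$ is the transition matrix and $e_s, e_t$ are the indicator vectors of the source and target states, then the $n$-step reachability probability is
\[
p_n \;=\; e_s^{\top} M^{\,n} e_t ,
\]
and by the Cayley–Hamilton theorem the sequence $(p_n)_{n \ge 0}$ satisfies a linear recurrence of order at most the number of states, so asking whether $p_n$ ever equals (resp. exceeds) the threshold is a Skolem-type (resp. Positivity-type) question.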

Let us consider the deconvolution problem, that is, to recover a latent source $x(\cdot)$ from the observations $\mathbf{y} = [y_1,\ldots,y_N]$ of a convolution process $y = x\star h + \eta$, where $\eta$ is an additive noise, the observations in $\mathbf{y}$ might have missing parts with respect to $y$, and the filter $h$ could be unknown. We propose a novel strategy to address this task when $x$ is a continuous-time signal: we adopt a Gaussian process (GP) prior on the source $x$, which allows for closed-form Bayesian nonparametric deconvolution. We first analyse the direct model to establish the conditions under which the model is well defined. Then, we turn to the inverse problem, where we study i) some necessary conditions under which Bayesian deconvolution is feasible, and ii) to what extent the filter $h$ can be learnt from data or approximated for the blind deconvolution case. The proposed approach, termed Gaussian process deconvolution (GPDC), is compared to other deconvolution methods conceptually, via illustrative examples, and using real-world datasets.
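To make the closed-form claim concrete under the simplest assumptions (known filter $h$ and i.i.d. Gaussian noise of variance $\sigma_\eta^2$; the paper's general treatment may differ): since $x \sim \mathcal{GP}(0,k)$ and convolution is linear, $(x(t), \mathbf{y})$ is jointly Gaussian, and conditioning gives
\[
\mathbb{E}\bigl[x(t)\mid\mathbf{y}\bigr] = \mathbf{k}_{xy}(t)^{\top}\bigl(K_{yy} + \sigma_\eta^{2} I\bigr)^{-1}\mathbf{y},
\]
where the cross-covariance vector $\mathbf{k}_{xy}(t)$ and the covariance $K_{yy}$ of the noiseless convolved process are obtained by convolving the kernel $k$ with $h$ once and twice, respectively.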

Temporal modeling is crucial for multi-frame human pose estimation. Most existing methods directly employ optical flow or deformable convolution to predict full-spectrum motion fields, which might incur numerous irrelevant cues, such as a nearby person or background. Without further efforts to excavate meaningful motion priors, their results are suboptimal, especially in complicated spatiotemporal interactions. On the other hand, the temporal difference has the ability to encode representative motion information which can potentially be valuable for pose estimation but has not been fully exploited. In this paper, we present a novel multi-frame human pose estimation framework, which employs temporal differences across frames to model dynamic contexts and leverages mutual-information objectives to facilitate the disentanglement of useful motion information. Specifically, we design a multi-stage Temporal Difference Encoder that performs incremental cascaded learning conditioned on multi-stage feature difference sequences to derive informative motion representations. We further propose a Representation Disentanglement module from the mutual information perspective, which can grasp discriminative task-relevant motion signals by explicitly defining useful and noisy constituents of the raw motion features and minimizing their mutual information. These contributions enable us to rank No. 1 in the Crowd Pose Estimation in Complex Events Challenge on the benchmark dataset HiEve, and to achieve state-of-the-art performance on three benchmarks: PoseTrack2017, PoseTrack2018, and PoseTrack21.

In this paper we consider the compression of asymptotically many i.i.d. copies of ensembles of mixed quantum states where the encoder has access to a side information system. The figure of merit is the per-copy or local error criterion. Rate-distortion theory studies the trade-off between the compression rate and the per-copy error. The optimal trade-off can be characterized by the rate-distortion function, which is the best rate achievable at a given distortion. In this paper, we derive the rate-distortion function of mixed-state compression. The rate-distortion functions in the entanglement-assisted and unassisted scenarios are given in terms of a single-letter mutual-information quantity and the regularized entanglement of purification, respectively. For the general setting where the consumption of both communication and entanglement is considered, we present the full qubit-entanglement rate region. Our compression scheme covers both the blind and visible compression models (and other models in between), depending on the structure of the side information system.

Most existing studies on linear bandits focus on the one-dimensional characterization of the overall system. While being representative, this formulation may fail to model applications with high-dimensional but favorable structures, such as the low-rank tensor representation for recommender systems. To address this limitation, this work studies a general tensor bandits model, where actions and system parameters are represented by tensors as opposed to vectors, and we particularly focus on the case that the unknown system tensor is low-rank. A novel bandit algorithm, coined TOFU (Tensor Optimism in the Face of Uncertainty), is developed. TOFU first leverages flexible tensor regression techniques to estimate low-dimensional subspaces associated with the system tensor. These estimates are then utilized to convert the original problem to a new one with norm constraints on its system parameters. Lastly, a norm-constrained bandit subroutine is adopted by TOFU, which utilizes these constraints to avoid exploring the entire high-dimensional parameter space. Theoretical analyses show that TOFU improves the best-known regret upper bound by a multiplicative factor that grows exponentially in the system order. A novel performance lower bound is also established, which further corroborates the efficiency of TOFU.
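A typical formulation of such a low-rank tensor bandit (written here as general background; the paper's exact model and notation may differ): at round $t$ the learner chooses a tensor action $\mathcal{A}_t$ and observes
\[
r_t = \langle \mathcal{A}_t, \mathcal{X} \rangle + \varepsilon_t, \qquad \operatorname{rank}\bigl(\mathcal{X}_{(j)}\bigr) \le r \ \text{ for every mode-}j\text{ unfolding},
\]
so that estimating the low-dimensional subspaces of the unknown system tensor $\mathcal{X}$ drastically reduces the effective dimension compared with vectorizing the problem.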

Many state-of-the-art hyperparameter optimization (HPO) algorithms rely on model-based optimizers that learn surrogate models of the target function to guide the search. Gaussian processes are the de facto surrogate model due to their ability to capture uncertainty, but they make strong assumptions about the observation noise, which might not be warranted in practice. In this work, we propose to leverage conformalized quantile regression, which makes minimal assumptions about the observation noise and, as a result, models the target function in a more realistic and robust fashion, which translates to quicker HPO convergence on empirical benchmarks. To apply our method in a multi-fidelity setting, we propose a simple, yet effective, technique that aggregates observed results across different resource levels and outperforms conventional methods across many empirical tasks.
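For reference, a minimal sketch of standard conformalized quantile regression (shown as generic background for the surrogate; the quantile GBMs and the train/calibration split are assumptions of this sketch, not the paper's exact multi-fidelity HPO procedure):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def cqr_interval(X_train, y_train, X_cal, y_cal, X_test, alpha=0.1):
    """Fit lower/upper quantile models, then conformalize on a held-out set."""
    lo = GradientBoostingRegressor(loss="quantile", alpha=alpha / 2).fit(X_train, y_train)
    hi = GradientBoostingRegressor(loss="quantile", alpha=1 - alpha / 2).fit(X_train, y_train)
    # Conformity score: how far each calibration target falls outside [lo, hi].
    scores = np.maximum(lo.predict(X_cal) - y_cal, y_cal - hi.predict(X_cal))
    n = len(y_cal)
    level = min(1.0, np.ceil((1 - alpha) * (n + 1)) / n)
    q = np.quantile(scores, level, method="higher")
    # Widen both ends of the predicted interval by q to restore marginal coverage.
    return lo.predict(X_test) - q, hi.predict(X_test) + q
```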

Causal discovery and causal reasoning are classically treated as separate and consecutive tasks: one first infers the causal graph, and then uses it to estimate causal effects of interventions. However, such a two-stage approach is uneconomical, especially in terms of actively collected interventional data, since the causal query of interest may not require a fully-specified causal model. From a Bayesian perspective, it is also unnatural, since a causal query (e.g., the causal graph or some causal effect) can be viewed as a latent quantity subject to posterior inference -- other unobserved quantities that are not of direct interest (e.g., the full causal model) ought to be marginalized out in this process and contribute to our epistemic uncertainty. In this work, we propose Active Bayesian Causal Inference (ABCI), a fully-Bayesian active learning framework for integrated causal discovery and reasoning, which jointly infers a posterior over causal models and queries of interest. In our approach to ABCI, we focus on the class of causally-sufficient, nonlinear additive noise models, which we model using Gaussian processes. We sequentially design experiments that are maximally informative about our target causal query, collect the corresponding interventional data, and update our beliefs to choose the next experiment. Through simulations, we demonstrate that our approach is more data-efficient than several baselines that only focus on learning the full causal graph. This allows us to accurately learn downstream causal queries from fewer samples while providing well-calibrated uncertainty estimates for the quantities of interest.
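One standard way to instantiate "maximally informative about our target causal query" is the myopic expected-information-gain criterion (written here as general background; the paper's exact acquisition function may differ): with current data $\mathcal{D}$, query $q$, and candidate experiment $a$ with outcome $y_a$,
\[
a^{\star} \in \arg\max_{a}\; I\bigl(q;\, y_a \mid \mathcal{D}\bigr)
= \arg\max_{a}\; \Bigl\{ \mathrm{H}\bigl(q \mid \mathcal{D}\bigr) - \mathbb{E}_{y_a \mid \mathcal{D}}\bigl[\mathrm{H}\bigl(q \mid \mathcal{D}, y_a\bigr)\bigr] \Bigr\}.
\]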
