
The problem of robust binary hypothesis testing is studied. Under both hypotheses, the data-generating distributions are assumed to belong to uncertainty sets constructed through moments; in particular, the sets contain distributions whose moments are centered around the empirical moments obtained from training samples. The goal is to design a test that performs well under all distributions in the uncertainty sets, i.e., that minimizes the worst-case error probability over the uncertainty sets. In the finite-alphabet case, the optimal test is obtained. In the infinite-alphabet case, a tractable approximation to the worst-case error probability, computed from finitely many points of the alphabet, is derived and shown to converge to the optimal value. A test is further constructed that generalizes to the entire alphabet. An exponentially consistent test for testing batch samples is also proposed. Numerical results are provided to demonstrate the performance of the proposed robust tests.
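
To make the minimax objective concrete, the following is a hedged formalization; the symbols $\phi$, $f$, $\hat{\mu}_i$, and $\theta_i$ are notation chosen here rather than taken from the paper:

$$ \min_{\phi}\; \max_{P_0 \in \mathcal{P}_0,\; P_1 \in \mathcal{P}_1} \; \frac{1}{2}\Big( \mathbb{E}_{P_0}[\phi(X)] + \mathbb{E}_{P_1}[1 - \phi(X)] \Big), \qquad \mathcal{P}_i = \big\{ P :\; \big\| \mathbb{E}_P[f(X)] - \hat{\mu}_i \big\| \le \theta_i \big\}, $$

where $\phi(x) \in [0,1]$ is the probability of deciding in favor of hypothesis 1, $f$ collects the moment functions, $\hat{\mu}_i$ is the vector of empirical moments computed from the training samples under hypothesis $i$, and $\theta_i$ is the radius of the uncertainty set.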

Related Content

In this paper, we consider a recently-proposed model of teaching and learning under uncertainty, in which a teacher receives independent observations of a single bit corrupted by binary symmetric noise, and sequentially transmits to a student through another binary symmetric channel based on the bits observed so far. After a given number $n$ of transmissions, the student outputs an estimate of the unknown bit, and we are interested in the exponential decay rate of the error probability as $n$ increases. We propose a novel block-structured teaching strategy in which the teacher encodes the number of 1s received in each block, and show that the resulting error exponent is the binary relative entropy $D\big(\frac{1}{2}\|\max(p,q)\big)$, where $p$ and $q$ are the noise parameters. This matches a trivial converse result based on the data processing inequality, and settles two conjectures of [Jog and Loh, 2021] and [Huleihel, Polyanskiy, and Shayevitz, 2019]. In addition, we show that the computation time required by the teacher and student is linear in $n$. We also study a more general setting in which the binary symmetric channels are replaced by general binary-input discrete memoryless channels. We provide an achievability bound and a converse bound, and show that the two coincide in certain cases, including (i) when the two channels are identical, and (ii) when the student-teacher channel is a binary symmetric channel. More generally, we give sufficient conditions under which our learning rate is the best possible for block-structured protocols.
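
The claimed exponent is easy to evaluate numerically; the sketch below (in Python, with example noise parameters chosen here) computes the binary relative entropy $D\big(\frac{1}{2}\|\max(p,q)\big)$ from the abstract.

```python
import numpy as np

def binary_kl(a, b):
    """D(a || b) for Bernoulli(a) vs Bernoulli(b), in nats."""
    return a * np.log(a / b) + (1 - a) * np.log((1 - a) / (1 - b))

p, q = 0.1, 0.2          # example noise parameters (chosen for illustration)
exponent = binary_kl(0.5, max(p, q))
print(exponent)          # error probability decays roughly like exp(-n * exponent)
```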

Black-box machine learning models are now routinely used in high-risk settings, like medical diagnostics, which demand uncertainty quantification to avoid consequential model failures. Conformal prediction is a user-friendly paradigm for creating statistically rigorous uncertainty sets/intervals for the predictions of such models. Critically, the sets are valid in a distribution-free sense: they possess explicit, non-asymptotic guarantees even without distributional assumptions or model assumptions. One can use conformal prediction with any pre-trained model, such as a neural network, to produce sets that are guaranteed to contain the ground truth with a user-specified probability, such as 90%. It is easy to understand, easy to use, and general, applying naturally to problems arising in the fields of computer vision, natural language processing, deep reinforcement learning, and so on. This hands-on introduction aims to give the reader a working understanding of conformal prediction and related distribution-free uncertainty quantification techniques in one self-contained document. We lead the reader through practical theory for and examples of conformal prediction and describe its extensions to complex machine learning tasks involving structured outputs, distribution shift, time-series, outliers, models that abstain, and more. Throughout, there are many explanatory illustrations, examples, and code samples in Python. With each code sample comes a Jupyter notebook implementing the method on a real-data example; the notebooks can be accessed and easily run using our codebase.
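
In the spirit of the code samples the text describes, here is a minimal Python sketch of split conformal prediction for classification. The Dirichlet-sampled probabilities are a synthetic stand-in for a real pre-trained model's softmax outputs, and the score and quantile follow the standard split-conformal recipe rather than any particular notebook from the codebase.

```python
import numpy as np

# Calibration data: softmax probabilities and labels. These are synthetic;
# in practice they come from your pre-trained model on held-out data.
rng = np.random.default_rng(0)
n_cal, n_classes = 1000, 10
probs_cal = rng.dirichlet(np.ones(n_classes), size=n_cal)
labels_cal = rng.integers(0, n_classes, size=n_cal)

alpha = 0.1  # target miscoverage: sets contain the truth ~90% of the time

# Nonconformity score: one minus the probability assigned to the true class.
scores = 1.0 - probs_cal[np.arange(n_cal), labels_cal]

# Conformal quantile with the finite-sample correction (n+1)(1-alpha)/n.
level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal
qhat = np.quantile(scores, min(level, 1.0), method="higher")

# Prediction set for a new example: all classes whose score is below qhat.
probs_test = rng.dirichlet(np.ones(n_classes))
prediction_set = np.where(1.0 - probs_test <= qhat)[0]
print(prediction_set)
```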

We develop and implement a Bayesian approach for the estimation of the shape of a two-dimensional annular domain enclosing a Stokes flow from sparse and noisy observations of the enclosed fluid. Our setup includes the case of direct observations of the flow field as well as the measurement of concentrations of a solute passively advected by and diffusing within the flow. Adopting a statistical approach provides estimates of uncertainty in the shape due both to the non-invertibility of the forward map and to error in the measurements. When the shape represents a design problem of attempting to match desired target outcomes, this "uncertainty" can be interpreted as identifying remaining degrees of freedom available to the designer. We demonstrate the viability of our framework on three concrete test problems. These problems illustrate the promise of our framework for applications while providing a collection of test cases for recently developed Markov Chain Monte Carlo (MCMC) algorithms designed to resolve infinite-dimensional statistical quantities.
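
One representative MCMC algorithm designed for infinite-dimensional posteriors is the preconditioned Crank-Nicolson (pCN) sampler; it is not necessarily the one used in the paper. The sketch below illustrates pCN on a toy Gaussian-prior inverse problem, where the forward map `G`, the noise level, and all dimensions are stand-ins rather than the Stokes-flow setup.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 20                        # discretized shape parameters (toy dimension)
G = lambda u: np.cumsum(u)    # hypothetical forward map, not a Stokes solver
y = G(rng.standard_normal(d)) + 0.1 * rng.standard_normal(d)   # noisy data
sigma = 0.1                   # observation noise level
phi = lambda u: 0.5 * np.sum((y - G(u)) ** 2) / sigma**2  # neg. log-likelihood

beta, n_steps = 0.2, 5000
u = np.zeros(d)
samples = []
for _ in range(n_steps):
    xi = rng.standard_normal(d)                # draw from the N(0, I) prior
    v = np.sqrt(1 - beta**2) * u + beta * xi   # pCN proposal (prior-preserving)
    if np.log(rng.uniform()) < phi(u) - phi(v):
        u = v                                  # accept with prob min(1, e^{phi(u)-phi(v)})
    samples.append(u.copy())
posterior_mean = np.mean(samples[1000:], axis=0)   # discard burn-in
print(posterior_mean[:5])
```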

We present a noisy channel generative model of two sequences, for example text and speech, which enables uncovering the association between the two modalities when limited paired data is available. To address the intractability of the exact model under a realistic data setup, we propose a variational inference approximation. To train this variational model with categorical data, we propose a KL encoder loss approach which has connections to the wake-sleep algorithm. Identifying the joint or conditional distributions from unpaired samples of the marginals alone is possible only under certain conditions on the data distribution, and we discuss the types of conditional independence assumptions under which this can be achieved, which in turn guides the architecture design. Experimental results show that even a tiny amount of paired data (5 minutes) is sufficient to learn to relate the two modalities (graphemes and phonemes here) when a massive amount of unpaired data is available, paving the path to adopting this principled approach for all seq2seq models in low data resource regimes.
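
The KL encoder loss can be illustrated in a toy fully-categorical setting: draw (z, x) pairs from the current generative model and fit the encoder q(z|x) by cross-entropy, which in expectation minimizes a KL divergence to the model posterior, echoing the sleep phase of wake-sleep. Everything below (the toy tables, learning rate, and parametrization) is an assumption for illustration, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(2)
n_z, n_x = 4, 6
p_z = np.full(n_z, 1 / n_z)                      # prior over latent categories
p_x_given_z = rng.dirichlet(np.ones(n_x), n_z)   # toy "noisy channel" likelihood

# Exact posterior p(z|x) by Bayes' rule (tractable only in this toy case).
joint = p_z[:, None] * p_x_given_z               # shape (n_z, n_x)
post = joint / joint.sum(axis=0, keepdims=True)

q = np.full((n_z, n_x), 1 / n_z)                 # encoder q(z|x), to be trained
lr = 0.1
for _ in range(5000):
    z = rng.choice(n_z, p=p_z)                   # sample from the generative model
    x = rng.choice(n_x, p=p_x_given_z[z])
    # SGD step on the cross-entropy -log q(z|x) in logit space; in
    # expectation this drives q(.|x) toward the model posterior p(.|x).
    g = q[:, x].copy()
    g[z] -= 1.0
    logits = np.log(q[:, x]) - lr * g
    q[:, x] = np.exp(logits - logits.max())
    q[:, x] /= q[:, x].sum()

print(np.abs(q - post).max())                    # gap shrinks as training proceeds
```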

When data streams from multiple sources, conventional training methods update model weights as if every source were equally reliable; that is, the model does not consider the data quality of each source during training. In many applications, sources can have varied levels of noise or corruption that negatively affect the learning of a robust deep learning model. A key issue is that the quality of data or labels for individual sources is often not available during training and can vary over time. Our solution is to track the mistakes made while training on data from each source and use them to form a perceived data quality for that source. This paper demonstrates a straightforward and novel technique that can be applied to any gradient descent optimiser: update model weights as a function of the perceived reliability of data sources within a wider data set. The algorithm controls the plasticity of a given model to weight updates based on the history of losses from individual data sources (see the sketch below). We show that applying this technique can significantly improve model performance when training on a mixture of reliable and unreliable data sources, and maintains performance when all data sources are considered reliable. All code to reproduce this work's experiments and to implement the algorithm in the reader's own models is made available.
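
A minimal sketch of one plausible instantiation, assuming an exponential weighting of each source's excess running loss; the exact rule in the paper may differ.

```python
import numpy as np

class SourceReliability:
    """Track a running loss per source and turn it into an update weight."""

    def __init__(self, n_sources, momentum=0.99):
        self.avg_loss = np.zeros(n_sources)   # running loss per source
        self.momentum = momentum

    def update(self, source, loss):
        self.avg_loss[source] = (self.momentum * self.avg_loss[source]
                                 + (1 - self.momentum) * loss)

    def weight(self, source):
        # Sources whose losses stay high relative to the best source are
        # treated as less reliable and receive smaller update steps.
        best = self.avg_loss.min()
        return np.exp(-(self.avg_loss[source] - best))

rel = SourceReliability(n_sources=3)
rel.update(0, loss=0.2)
rel.update(1, loss=1.5)
print(rel.weight(0), rel.weight(1))   # source 1 gets a smaller step
# Inside a training loop one would then scale each step, e.g.:
#   params -= base_lr * rel.weight(src) * grad
```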

Monitoring the correctness of distributed cyber-physical systems is essential, and detecting possible safety violations can be hard when some samples are uncertain or missing. Here we monitor a black-box cyber-physical system whose logs are uncertain in both the state and timestamp dimensions: not only is the logged value known with some uncertainty, but the time at which the log was made is uncertain too. In addition, we make use of an over-approximated yet expressive model, given by a non-linear extension of dynamical systems. Given an offline log, our approach is able to monitor the log against safety specifications with a limited number of false alarms. As a second contribution, we show that our approach can be used online to minimize the number of sample triggers, with the aim of energy efficiency. We apply our approach to three benchmarks: an anesthesia model, an adaptive cruise controller, and an aircraft orbiting system.
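
A minimal sketch of the conservative flavor of such monitoring: each sample carries interval uncertainty in value and timestamp, and a threshold-style safety property is flagged as either a certain or a merely possible violation. This checks samples pointwise only; the paper additionally propagates an over-approximated dynamics model between samples, which is omitted here.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    t_lo: float   # earliest possible timestamp
    t_hi: float   # latest possible timestamp
    v_lo: float   # lower bound on the logged value
    v_hi: float   # upper bound on the logged value

def monitor(log, threshold):
    # Conservative pointwise check of "value stays below threshold".
    verdicts = []
    for s in log:
        if s.v_lo > threshold:
            verdicts.append(("violation", s))   # unsafe for every realization
        elif s.v_hi > threshold:
            verdicts.append(("possible", s))    # unsafe for some realization
    return verdicts

log = [Sample(0.9, 1.1, 4.0, 4.6), Sample(1.9, 2.2, 5.1, 5.4)]
print(monitor(log, threshold=5.0))              # second sample: certain violation
```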

We present a federated learning framework that is designed to robustly deliver good predictive performance across individual clients with heterogeneous data. The proposed approach hinges upon a superquantile-based learning objective that captures the tail statistics of the error distribution over heterogeneous clients. We present a stochastic training algorithm that interleaves differentially private client filtering with federated averaging steps. We prove finite-time convergence guarantees for the algorithm: $O(1/\sqrt{T})$ in the nonconvex case in $T$ communication rounds and $O(\exp(-T/\kappa^{3/2}) + \kappa/T)$ in the strongly convex case with local condition number $\kappa$. Experimental results on benchmark datasets for federated learning demonstrate that our approach is competitive with classical ones in terms of average error and outperforms them in terms of tail statistics of the error.
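
The superquantile (also known as the conditional value-at-risk) of the per-client losses is the tail statistic such an objective targets. Below is a simple discrete approximation, with synthetic losses standing in for real client errors; a more careful version would also handle the fractional atom at the quantile.

```python
import numpy as np

def superquantile(losses, theta=0.9):
    # Mean of the worst (1 - theta) fraction of losses: a simple discrete
    # approximation that ignores the fractional atom at the theta-quantile.
    losses = np.sort(np.asarray(losses))
    tail_start = int(np.floor(theta * len(losses)))
    return losses[tail_start:].mean()

client_losses = np.random.default_rng(3).exponential(size=100)  # synthetic errors
print(superquantile(client_losses))   # the tail statistic driving training
```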

Semi-supervised object detection (SSOD) aims to boost detection performance by leveraging extra unlabeled data. The teacher-student framework has been shown to be promising for SSOD, in which a teacher network generates pseudo-labels for unlabeled data to assist the training of a student network. Since the pseudo-labels are noisy, filtering them is crucial to exploiting the potential of such a framework. Unlike existing suboptimal methods, we propose two-step pseudo-label filtering for the classification and regression heads of a teacher-student framework. For the classification head, OCL (Object-wise Contrastive Learning) regularizes object representation learning on unlabeled data, improving pseudo-label filtering by enhancing the discriminativeness of the classification score; it is designed to pull together objects of the same class and push apart objects of different classes. For the regression head, we further propose RUPL (Regression-Uncertainty-guided Pseudo-Labeling), which learns the aleatoric uncertainty of object localization for label filtering. By jointly filtering the pseudo-labels for the classification and regression heads, the student network receives better guidance from the teacher network for the object detection task. Experimental results on the Pascal VOC and MS-COCO datasets demonstrate the superiority of our proposed method, with competitive performance compared to existing methods.
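
The joint filtering logic can be sketched as two thresholds, one on the classification score and one on the predicted localization uncertainty; the thresholds and record fields below are illustrative assumptions, not the paper's tuned values.

```python
def filter_pseudo_labels(dets, score_thr=0.7, unc_thr=0.3):
    # Keep a teacher detection only if (i) its classification score clears
    # a threshold and (ii) its predicted localization uncertainty is low.
    kept = []
    for d in dets:   # d: dict with 'box', 'score', 'loc_uncertainty'
        if d["score"] >= score_thr and d["loc_uncertainty"] <= unc_thr:
            kept.append(d)
    return kept

dets = [{"box": (10, 10, 50, 80), "score": 0.92, "loc_uncertainty": 0.1},
        {"box": (5, 5, 20, 20), "score": 0.95, "loc_uncertainty": 0.6}]
print(filter_pseudo_labels(dets))   # second box dropped: uncertain localization
```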

Robust controllers that stabilize dynamical systems even under disturbances and noise are often formulated as solutions of nonsmooth, nonconvex optimization problems. While methods such as gradient sampling can handle the nonconvexity and nonsmoothness, the costs of evaluating the objective function may be substantial, making robust control challenging for dynamical systems with high-dimensional state spaces. In this work, we introduce multi-fidelity variants of gradient sampling that leverage low-cost, low-fidelity models with low-dimensional state spaces for speeding up the optimization process while nonetheless providing convergence guarantees for a high-fidelity model of the system of interest, which is primarily accessed in the last phase of the optimization process. Our first multi-fidelity method initiates gradient sampling on higher fidelity models with starting points obtained from cheaper, lower fidelity models. Our second multi-fidelity method relies on ensembles of gradients that are computed from low- and high-fidelity models. Numerical experiments with controlling the cooling of a steel rail profile and laminar flow in a cylinder wake demonstrate that our new multi-fidelity gradient sampling methods achieve up to two orders of magnitude speedup compared to the single-fidelity gradient sampling method that relies on the high-fidelity model alone.
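
Here is a sketch of a single gradient-sampling step, including the min-norm convex combination of sampled gradients that defines the descent direction. The toy objective, the Frank-Wolfe inner solver, and the idea of warm-starting from a low-fidelity point are simplifications under stated assumptions, not a faithful implementation of the proposed methods.

```python
import numpy as np

rng = np.random.default_rng(4)

def f(x):
    return np.abs(x).max()            # nonsmooth toy objective, not a control criterion

def grad(x):
    # A subgradient of f: sign of the largest-magnitude coordinate.
    g = np.zeros_like(x)
    i = np.argmax(np.abs(x))
    g[i] = np.sign(x[i])
    return g

def gs_step(x, eps=0.1, m=20, lr=0.05):
    # Sample gradients in an eps-neighborhood of x.
    G = np.stack([grad(x + eps * rng.uniform(-1, 1, x.shape)) for _ in range(m)])
    # Min-norm element of the convex hull of sampled gradients via Frank-Wolfe.
    lam = np.full(m, 1.0 / m)
    for k in range(100):
        s = np.zeros(m)
        s[np.argmin(G @ (G.T @ lam))] = 1.0
        lam += 2.0 / (k + 2) * (s - lam)
    return x - lr * (G.T @ lam)       # descend along the stabilized direction

x = rng.standard_normal(5)            # pretend: warm start from a low-fidelity solve
for _ in range(200):
    x = gs_step(x)
print(f(x))
```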

This paper introduces video domain generalization, a setting in which most video classification networks degenerate because they lack exposure to target domains with divergent distributions. We observe that global temporal features are less generalizable due to temporal domain shift: videos from unseen domains may exhibit an unexpected absence or misalignment of temporal relations. This finding motivates us to solve video domain generalization by effectively learning local-relation features at different timescales, which are more generalizable, and exploiting them along with global-relation features to maintain discriminability. This paper presents the VideoDG framework with two technical contributions. The first is a new deep architecture, the Adversarial Pyramid Network, which improves the generalizability of video features by progressively capturing local-relation, global-relation, and cross-relation features. On the basis of these pyramid features, the second contribution is a new and robust adversarial data augmentation approach that bridges different video domains by improving the diversity and quality of the augmented data. We construct three video domain generalization benchmarks in which domains are divided according to different datasets, different consequences of actions, or different camera views. VideoDG consistently outperforms combinations of previous video classification models and existing domain generalization methods on all benchmarks.
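
To fix intuition about local- versus global-relation features, here is an illustrative sketch over per-frame feature vectors; the pairwise-difference relation and the pooling choices are assumptions for illustration only, not the Adversarial Pyramid Network.

```python
import numpy as np

rng = np.random.default_rng(5)
frames = rng.standard_normal((16, 128))        # 16 frames, 128-dim features

def relation(a, b):
    return np.abs(a - b)                       # simple pairwise relation feature

# Local relations: adjacent frames within short windows (small timescale).
local = np.mean([relation(frames[i], frames[i + 1])
                 for i in range(len(frames) - 1)], axis=0)

# Global relation: frames far apart across the whole clip (large timescale).
glob = relation(frames[0], frames[-1])

video_feature = np.concatenate([local, glob])  # combine for classification
print(video_feature.shape)
```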
