
We provide a composite version of Ville's theorem that an event has zero measure if and only if there exists a nonnegative martingale which explodes to infinity when that event occurs. This is a classic result connecting measure-theoretic probability to the sequence-by-sequence game-theoretic probability, recently developed by Shafer and Vovk. Our extension of Ville's result involves appropriate composite generalizations of nonnegative martingales and measure-zero events: these are respectively provided by "e-processes" and a new inverse capital outer measure. We then develop a novel line-crossing inequality for sums of random variables which are only required to have a finite first moment, which we use to prove a composite version of the strong law of large numbers (SLLN). This allows us to show that violation of the SLLN is an event of outer measure zero and that our e-process explodes to infinity on every such violating sequence, while this is provably not achievable with a nonnegative (super)martingale.
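For orientation, the classical single-measure statement that this abstract generalizes can be written out as below. This is only a sketch: the notation ($P$, $A$, $M_n$, $\mathcal{P}$, $E_n$, $\tau$) is assumed here for illustration and is not taken from the paper, and the e-process condition shown is the definition commonly used in the literature, not necessarily the exact form the authors adopt.

```latex
% Classical Ville's theorem for a single probability measure P
% on a filtered space (notation assumed for illustration):
P(A) = 0
\iff
\exists\, (M_n)_{n \ge 0} \text{ a nonnegative $P$-martingale with } M_0 = 1
\text{ and } \lim_{n \to \infty} M_n(\omega) = \infty \text{ for all } \omega \in A.

% Commonly used definition of an e-process for a composite class \mathcal{P}:
E_n \ge 0 \ \text{for all } n,
\qquad
\sup_{P \in \mathcal{P}} \mathbb{E}_P\!\left[ E_\tau \right] \le 1
\ \text{ for every stopping time } \tau.
```

When $\mathcal{P}$ contains a single measure, every nonnegative martingale started at 1 satisfies the second condition, which is why e-processes are a natural composite replacement for the martingale in the first statement.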

Related content

Generalization theory

Analysis · Design · Identifiable · TransAct · Integration

June 16, 2023

Boundary Blending: Reconsidering the Design of Multi-View Visualizations

Maoyuan Sun, Abdul Rahman Shaikh, Yue Ma, David Koop, Hamed Alhoori

Multiple-view visualizations (MVs) have been widely used for visual analysis. Each view shows some part of the data in a usable way, and together multiple views enable a holistic understanding of the data under investigation. For example, an analyst may check a social network graph, a map of sensitive locations, a table of transaction records, and a collection of reports to identify suspicious activities. While each view is designed to preserve its own visual context with visible borders or perceivable spatial distance from others, the key to solving real-world analysis problems often requires "breaking" such boundaries, and further integrating and synthesizing the data scattered across multiple views. This calls for blending the boundaries in MVs, instead of simply breaking them, which brings key questions: what are possible boundaries in MVs, and what are design options that can support the boundary blending in MVs? To answer these questions, we present three boundaries in MVs: 1) data boundary, 2) representation boundary, and 3) semantic boundary, corresponding to three major aspects regarding the usage of MVs: encoded information, visual representation, and interpretation. Then, we discuss four design strategies (highlighting, linking, embedding, and extending) and their pros and cons for supporting boundary blending in MVs. We conclude our discussion with future research opportunities.

Reviewer · Graph · Latent · Representation · INFORMS

June 16, 2023

Latent Graph Representations for Critical View of Safety Assessment

Aditya Murali, Deepak Alapatt, Pietro Mascagni, Armine Vardazaryan, Alain Garcia, Nariaki Okamoto, Didier Mutter, Nicolas Padoy

From arXiv, 12 pages, 4 figures

Assessing the critical view of safety (CVS) in laparoscopic cholecystectomy requires accurate identification and localization of key anatomical structures, reasoning about their geometric relationships to one another, and determining the quality of their exposure. Prior works have approached this task by including semantic segmentation as an intermediate step, using predicted segmentation masks to then predict the CVS. While these methods are effective, they rely on extremely expensive ground-truth segmentation annotations and tend to fail when the predicted segmentation is incorrect, limiting generalization. In this work, we propose a method for CVS prediction wherein we first represent a surgical image using a disentangled latent scene graph, then process this representation using a graph neural network. Our graph representations explicitly encode semantic information (object location, class information, geometric relations) to improve anatomy-driven reasoning, as well as visual features to retain differentiability and thereby provide robustness to semantic errors. Finally, to address annotation cost, we propose to train our method using only bounding box annotations, incorporating an auxiliary image reconstruction objective to learn fine-grained object boundaries. We show that our method not only outperforms several baseline methods when trained with bounding box annotations, but also scales effectively when trained with segmentation masks, maintaining state-of-the-art performance.

Estimation/Estimator · Inference · Controller · Loss · State-of-the-art

June 15, 2023

UrbanIR: Large-Scale Urban Scene Inverse Rendering from a Single Video

Zhi-Hao Lin, Bohan Liu, Yi-Ting Chen, David Forsyth, Jia-Bin Huang, Anand Bhattad, Shenlong Wang

We show how to build a model that allows realistic, free-viewpoint renderings of a scene under novel lighting conditions from video. Our method, UrbanIR (Urban Scene Inverse Rendering), computes an inverse graphics representation from the video. UrbanIR jointly infers shape, albedo, visibility, and sun and sky illumination from a single video of unbounded outdoor scenes with unknown lighting. UrbanIR uses videos from cameras mounted on cars (in contrast to many views of the same points in typical NeRF-style estimation). As a result, standard methods produce poor geometry estimates (for example, roofs), and there are numerous "floaters". Errors in inverse graphics inference can result in strong rendering artifacts. UrbanIR uses novel losses to control these and other sources of error. UrbanIR uses a novel loss to make very good estimates of shadow volumes in the original scene. The resulting representations facilitate controllable editing, delivering photorealistic free-viewpoint renderings of relit scenes and inserted objects. Qualitative evaluation demonstrates strong improvements over the state-of-the-art.

Optimizer · Comprehensibility · Learning · Lipschitz constant · Networking

June 15, 2023

Understanding Optimization of Deep Learning

Xianbiao Qi, Jianan Wang, Lei Zhang

From arXiv, International Digital Economy Academy (IDEA)

This article provides a comprehensive understanding of optimization in deep learning, with a primary focus on the challenges of gradient vanishing and gradient exploding, which normally lead to diminished model representational ability and training instability, respectively. We analyze these two challenges through several strategic measures, including the improvement of gradient flow and the imposition of constraints on a network's Lipschitz constant. To help understand the current optimization methodologies, we categorize them into two classes: explicit optimization and implicit optimization. Explicit optimization methods involve direct manipulation of optimizer parameters, including weight, gradient, learning rate, and weight decay. Implicit optimization methods, by contrast, focus on improving the overall landscape of a network by enhancing its modules, such as residual shortcuts, normalization methods, attention mechanisms, and activations. In this article, we provide an in-depth analysis of these two optimization classes and undertake a thorough examination of the Jacobian matrices and the Lipschitz constants of many widely used deep learning modules, highlighting existing issues as well as potential improvements. Moreover, we also conduct a series of analytical experiments to substantiate our theoretical discussions. This article does not aim to propose a new optimizer or network. Rather, our intention is to present a comprehensive understanding of optimization in deep learning. We hope that this article will help readers gain deeper insight into this field and encourage the development of more robust, efficient, and high-performing models.
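The link between layer Jacobians, Lipschitz constants, and vanishing or exploding gradients can be illustrated with a toy chain of scalar layers. The sketch below is not from the article: the function names `chain_gradient` and `lipschitz_upper_bound`, the scalar layer form `h = act(w * h)`, and the chosen weights are all assumptions made for illustration. It relies only on the chain rule and on the fact that tanh and the identity are 1-Lipschitz, so the product of `|w|` bounds the chain's Lipschitz constant.

```python
import math


def chain_gradient(weights, x, activation="tanh"):
    """Forward pass and input gradient for the scalar chain h <- act(w * h).

    Hypothetical toy model: each "layer" is a single scalar weight.
    Returns (output, d output / d x).
    """
    h, grad = x, 1.0
    for w in weights:
        z = w * h
        if activation == "tanh":
            h = math.tanh(z)
            local = 1.0 - h * h  # derivative of tanh evaluated at z
        else:  # identity activation
            h = z
            local = 1.0
        grad *= local * w  # chain rule: multiply the per-layer Jacobians
    return h, grad


def lipschitz_upper_bound(weights):
    """Product of |w|: an upper bound on the chain's Lipschitz constant,
    valid because tanh and the identity are both 1-Lipschitz."""
    bound = 1.0
    for w in weights:
        bound *= abs(w)
    return bound


if __name__ == "__main__":
    depth = 50
    _, g_small = chain_gradient([0.5] * depth, 1.0)
    _, g_large = chain_gradient([1.5] * depth, 1.0, activation="identity")
    print("gradient with |w| < 1:", g_small)  # shrinks geometrically with depth
    print("gradient with |w| > 1:", g_large)  # grows geometrically with depth
    print("Lipschitz bound, |w| = 0.5:", lipschitz_upper_bound([0.5] * depth))
```

In this toy setting, constraining each `|w|` to stay near 1 keeps the bound, and hence the gradient magnitude, from growing geometrically with depth, which is the intuition behind constraining a network's Lipschitz constant.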

Generalization theory · Scenario · Example · Loss · Training instance

June 15, 2023

Tighter Information-Theoretic Generalization Bounds from Supersamples

Ziqiao Wang, Yongyi Mao

From arXiv, Accepted to ICML 2023, fixed some typos in the camera-ready version

In this work, we present a variety of novel information-theoretic generalization bounds for learning algorithms, from the supersample setting of Steinke & Zakynthinou (2020), the setting of the "conditional mutual information" framework. Our development exploits projecting the loss pair (obtained from a training instance and a testing instance) down to a single number and correlating loss values with a Rademacher sequence (and its shifted variants). The presented bounds include square-root bounds, fast-rate bounds (including those based on variance and sharpness), and bounds for interpolating algorithms, among others. We show theoretically or empirically that these bounds are tighter than all information-theoretic bounds known to date on the same supersample setting.

Cluster · Approximation · Graph · Biased · Motivation

June 14, 2023

Multi-class Graph Clustering via Approximated Effective $p$-Resistance

Shota Saito, Mark Herbster

From arXiv, To appear at ICML 2023

This paper develops an approximation to the (effective) $p$-resistance and applies it to multi-class clustering. Spectral methods based on the graph Laplacian and its generalization to the graph $p$-Laplacian have been a backbone of non-Euclidean clustering techniques. The advantage of the $p$-Laplacian is that the parameter $p$ induces a controllable bias on cluster structure. The drawback of $p$-Laplacian eigenvector based methods is that the third and higher eigenvectors are difficult to compute. Thus, instead, we are motivated to use the $p$-resistance induced by the $p$-Laplacian for clustering. For $p$-resistance, small $p$ biases towards clusters with high internal connectivity while large $p$ biases towards clusters of small "extent," that is, a preference for smaller shortest-path distances between vertices in the cluster. However, the $p$-resistance is expensive to compute. We overcome this by developing an approximation to the $p$-resistance. We prove upper and lower bounds on this approximation and observe that it is exact when the graph is a tree. We also provide theoretical justification for the use of $p$-resistance for clustering. Finally, we provide experiments comparing our approximated $p$-resistance clustering to other $p$-Laplacian based methods.

Linear · Subspace · Performer · Integration · Discretization

June 14, 2023

Improved ParaDiag via low-rank updates and interpolation

Daniel Kressner, Stefano Massei, Junli Zhu

This work is concerned with linear matrix equations that arise from the space-time discretization of time-dependent linear partial differential equations (PDEs). Such matrix equations have been considered, for example, in the context of parallel-in-time integration leading to a class of algorithms called ParaDiag. We develop and analyze two novel approaches for the numerical solution of such equations. Our first approach is based on the observation that the modification of these equations performed by ParaDiag in order to solve them in parallel has low rank. Building upon previous work on low-rank updates of matrix equations, this allows us to make use of tensorized Krylov subspace methods to account for the modification. Our second approach is based on interpolating the solution of the matrix equation from the solutions of several modifications. Both approaches avoid the use of iterative refinement needed by ParaDiag and related space-time approaches in order to attain good accuracy. In turn, our new algorithms have the potential to outperform, sometimes significantly, existing methods. This potential is demonstrated for several different types of PDEs.

Learning · Generalization theory · Probably approximately correct (PAC) · Generalization error · Few-shot learning

July 29, 2022

A Survey of Learning on Small Data

Xiaofeng Cao, Weixin Bu, Shengjun Huang, Yingpeng Tang, Yaming Guo, Yi Chang, Ivor W. Tsang

Learning on big data has brought success for artificial intelligence (AI), but the annotation and training costs are high. In the future, learning on small data is one of the ultimate goals of AI, which requires machines to recognize objectives and scenarios from small data as humans do. A series of machine learning approaches, such as active learning, few-shot learning, and deep clustering, is moving in this direction. However, there are few theoretical guarantees for their generalization performance. Moreover, most of their settings are passive, that is, the label distribution is explicitly controlled by one specified sampling scenario. This survey follows agnostic active sampling under a PAC (Probably Approximately Correct) framework to analyze the generalization error and label complexity of learning on small data in both supervised and unsupervised fashions. With these theoretical analyses, we categorize the small data learning models from two geometric perspectives: the Euclidean and non-Euclidean (hyperbolic) mean representation, where their optimization solutions are also presented and discussed. Later, some potential learning scenarios that may benefit from small data learning are summarized and analyzed. Finally, some challenging applications, such as computer vision and natural language processing, that may benefit from learning on small data are also surveyed.

Generalization theory · Learning · Machine Learning · Performer · Supervised model

August 31, 2021

Towards Out-Of-Distribution Generalization: A Survey

Zheyan Shen, Jiashuo Liu, Yue He, Xingxuan Zhang, Renzhe Xu, Han Yu, Peng Cui

Classic machine learning methods are built on the $i.i.d.$ assumption that training and testing data are independent and identically distributed. However, in real scenarios, the $i.i.d.$ assumption can hardly be satisfied, causing a sharp drop in the performance of classic machine learning algorithms under distributional shifts, which indicates the significance of investigating the Out-of-Distribution generalization problem. The Out-of-Distribution (OOD) generalization problem addresses the challenging setting where the testing distribution is unknown and different from the training distribution. This paper serves as the first effort to systematically and comprehensively discuss the OOD generalization problem, from the definition, methodology, and evaluation to the implications and future directions. Firstly, we provide the formal definition of the OOD generalization problem. Secondly, existing methods are categorized into three parts based on their positions in the whole learning pipeline, namely unsupervised representation learning, supervised model learning, and optimization, and typical methods for each category are discussed in detail. We then demonstrate the theoretical connections of different categories, and introduce the commonly used datasets and evaluation metrics. Finally, we summarize the whole literature and raise some future directions for the OOD generalization problem. The summary of OOD generalization methods reviewed in this survey can be found at //out-of-distribution-generalization.com.

Estimation/Estimator · Estimation error · MoDELS · Learning · Unbiased

December 17, 2020

The Causal Learning of Retail Delinquency

Yiyan Huang, Cheuk Hang Leung, Xing Yan, Qi Wu, Nanbo Peng, Dongdong Wang, Zhixiang Huang

From arXiv, This paper was accepted and will be published in the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects, and hence the estimation error can be very large. As such, we propose another approach to construct the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is substantial if the causal effects are accounted for correctly.