
Modern datasets often exhibit high dimensionality, yet the data reside in low-dimensional manifolds that can reveal underlying geometric structures critical for data analysis. A prime example of such a dataset is a collection of cell cycle measurements, where the inherently cyclical nature of the process can be represented as a circle or sphere. Motivated by the need to analyze these types of datasets, we propose a nonlinear dimension reduction method, Spherical Rotation Component Analysis (SRCA), that incorporates geometric information to better approximate low-dimensional manifolds. SRCA is a versatile method designed to work in both high-dimensional and small sample size settings. By employing spheres or ellipsoids, SRCA provides a low-rank spherical representation of the data with general theoretic guarantees, effectively retaining the geometric structure of the dataset during dimensionality reduction. A comprehensive simulation study, along with a successful application to human cell cycle data, further highlights the advantages of SRCA compared to state-of-the-art alternatives, demonstrating its superior performance in approximating the manifold while preserving inherent geometric structures.
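As a rough illustration of the geometric idea of representing data on a sphere, the sketch below projects a point onto a sphere with a known center and radius. This is not the SRCA algorithm itself: the abstract does not specify how SRCA chooses the center, radius, or rotation, so the function name `project_to_sphere` and its inputs are hypothetical and serve only to show the elementary projection step.

```python
import math


def project_to_sphere(point, center, radius):
    """Return the point on the sphere (center, radius) closest to `point`.

    Hypothetical helper for illustration only; the center and radius are
    assumed to be given rather than estimated from data.
    """
    offset = [p - c for p, c in zip(point, center)]
    norm = math.sqrt(sum(o * o for o in offset))
    if norm == 0.0:
        # The center itself is equidistant from every point on the sphere.
        raise ValueError("the center has no unique projection onto the sphere")
    # Rescale the offset from the center so that its length equals the radius.
    return [c + radius * o / norm for c, o in zip(center, offset)]


# Project a 2D point onto the unit circle centered at the origin.
projected = project_to_sphere([3.0, 4.0], center=[0.0, 0.0], radius=1.0)
```

After projection, every point lies at distance `radius` from the center, so cyclical structure such as a cell cycle could be read off as an angle on the circle.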

Related content

Sphering


Marginalization · Optimizer · Global optimization · Performer · Hyperplane

June 12, 2023

Horospherical Decision Boundaries for Large Margin Classification in Hyperbolic Space

Xiran Fan,Chun-Hao Yang,Baba C. Vemuri

Hyperbolic spaces have been quite popular in the recent past for representing hierarchically organized data. Further, several classification algorithms for data in these spaces have been proposed in the literature. These algorithms mainly use either hyperplanes or geodesics for decision boundaries in a large-margin classifier setting, leading to a non-convex optimization problem. In this paper, we propose a novel large-margin classifier based on horospherical decision boundaries that leads to a geodesically convex optimization problem, which can be optimized using any Riemannian gradient descent technique with a guarantee of a globally optimal solution. We present several experiments depicting the competitive performance of our classifier in comparison to SOTA.

Processing (programming language) · Estimation/estimator · Inference · Likelihood · MoDELS

June 11, 2023

A Penalized Poisson Likelihood Approach to High-Dimensional Semi-Parametric Inference for Doubly-Stochastic Point Processes

Si Cheng,Jon Wakefield,Ali Shojaie

Doubly-stochastic point processes model the occurrence of events over a spatial domain as an inhomogeneous Poisson process conditioned on the realization of a random intensity function. They are flexible tools for capturing spatial heterogeneity and dependence. However, implementations of doubly-stochastic spatial models are computationally demanding, often have limited theoretical guarantees, and/or rely on restrictive assumptions. We propose a penalized regression method for estimating covariate effects in doubly-stochastic point processes that is computationally efficient and does not require a parametric form or stationarity of the underlying intensity. We establish the consistency and asymptotic normality of the proposed estimator, and develop a covariance estimator that leads to a conservative statistical inference procedure. A simulation study shows the validity of our approach under less restrictive assumptions on the data generating mechanism, and an application to Seattle crime data demonstrates better prediction accuracy compared with existing alternatives.

Robustness · Robot · Signed distance · Commentator · Fidelity

June 11, 2023

Contact Reduction with Bounded Stiffness for Robust Sim-to-Real Transfer of Robot Assembly

Nghia Vuong,Quang-Cuong Pham

In sim-to-real Reinforcement Learning (RL), a policy is trained in a simulated environment and then deployed on the physical system. The main challenge of sim-to-real RL is to overcome the reality gap: the discrepancies between the real world and its simulated counterpart. Using general geometric representations, such as convex decompositions, triangular meshes, and signed distance fields, can improve simulation fidelity and thus potentially narrow the reality gap. Common to these approaches is that many contact points are generated for geometrically complex objects, which slows down simulation and may cause numerical instability. Contact reduction methods address these issues by limiting the number of contact points, but the validity of these methods for sim-to-real RL has not been confirmed. In this paper, we present a contact reduction method with bounded stiffness to improve the simulation accuracy. Our experiments show that the proposed method critically enables training an RL policy for a tight-clearance double pin insertion task and successfully deploying the policy on a rigid, position-controlled physical robot.

Functional · Frequentist school · state-of-the-art · Bootstrap · Regular

June 9, 2023

Semiparametric posterior corrections

Andrew Yiu,Edwin Fong,Chris Holmes,Judith Rousseau

from arxiv, 53 pages

We present a new approach to semiparametric inference using corrected posterior distributions. The method allows us to leverage the adaptivity, regularization and predictive power of nonparametric Bayesian procedures to estimate low-dimensional functionals of interest without being restricted by the holistic Bayesian formalism. Starting from a conventional nonparametric posterior, we target the functional of interest by transforming the entire distribution with a Bayesian bootstrap correction. We provide conditions for the resulting $\textit{one-step posterior}$ to possess calibrated frequentist properties and specialize the results for several canonical examples: the integrated squared density, the mean of a missing-at-random outcome, and the average causal treatment effect on the treated. The procedure is computationally attractive, requiring only a simple, efficient post-processing step that can be attached onto any arbitrary posterior sampling algorithm. Using the ACIC 2016 causal data analysis competition, we illustrate that our approach can outperform the existing state-of-the-art through the propagation of Bayesian uncertainty.

Analysis · Processing (programming language) · Fourier transform · Optimizer · Transform

June 9, 2023

HVOX: Scalable Interferometric Synthesis and Analysis of Spherical Sky Maps

Sepand Kashani,Joan Rué Queralt,Adrian Jarret,Matthieu Simeoni

Analysis and synthesis are key steps of the radio-interferometric imaging process, serving as a bridge between visibility and sky domains. They can be expressed as partial Fourier transforms involving a large number of non-uniform frequencies and spherically-constrained spatial coordinates. Due to the data non-uniformity, these partial Fourier transforms are computationally expensive and represent a serious bottleneck in the image reconstruction process. The W-gridding algorithm achieves log-linear complexity for both steps by applying a series of 2D non-uniform FFTs (NUFFT) to the data sliced along the so-called $w$ frequency coordinate. A major drawback of this method, however, is its restriction to direction-cosine meshes, which are fundamentally ill-suited for large fields of view. This paper introduces the HVOX gridder, a novel algorithm for analysis/synthesis based on a 3D-NUFFT. Unlike W-gridding, the latter is compatible with arbitrary spherical meshes such as the popular HEALPix scheme for spherical data processing. The 3D-NUFFT allows one to optimally select the size of the inner FFTs, in particular the number of W-planes. This results in a better performing and auto-tuned algorithm, with controlled accuracy guarantees backed by strong results from approximation theory. To cope with the challenging scale of next-generation radio telescopes, we moreover propose a chunked evaluation strategy: by partitioning the visibility and sky domains, the 3D-NUFFT is decomposed into sub-problems which execute in parallel, while simultaneously cutting memory requirements. Our benchmarking results demonstrate the scalability of HVOX for both SKA and LOFAR, considering state-of-the-art challenging imaging setups. HVOX is moreover computationally competitive with W-gridder, despite the absence of domain-specific optimizations in our implementation.

Variance reduction · Networking · Learning · Variance · Neural Networks

June 9, 2023

On the effectiveness of partial variance reduction in federated learning with heterogeneous data

Bo Li,Mikkel N. Schmidt,Tommy S. Alstrøm,Sebastian U. Stich

from arxiv, Accepted to CVPR 2023

Data heterogeneity across clients is a key challenge in federated learning. Prior works address this by either aligning client and server models or using control variates to correct client model drift. Although these methods achieve fast convergence in convex or simple non-convex problems, the performance in over-parameterized models such as deep neural networks is lacking. In this paper, we first revisit the widely used FedAvg algorithm in a deep neural network to understand how data heterogeneity influences the gradient updates across the neural network layers. We observe that while the feature extraction layers are learned efficiently by FedAvg, the substantial diversity of the final classification layers across clients impedes the performance. Motivated by this, we propose to correct model drift by variance reduction only on the final layers. We demonstrate that this significantly outperforms existing benchmarks at a similar or lower communication cost. We furthermore provide proof for the convergence rate of our algorithm.

Estimation/estimator · Design · Performer · Sample · Comprehensibility

June 8, 2023

Task-specific experimental design for treatment effect estimation

Bethany Connolly,Kim Moore,Tobias Schwedes,Alexander Adam,Gary Willis,Ilya Feige,Christopher Frye

from arxiv, To appear in ICML 2023; 8 pages, 7 figures, 4 appendices

Understanding causality should be a core requirement of any attempt to build real impact through AI. Due to the inherent unobservability of counterfactuals, large randomised trials (RCTs) are the standard for causal inference. But large experiments are generically expensive, and randomisation carries its own costs, e.g. when suboptimal decisions are trialed. Recent work has proposed more sample-efficient alternatives to RCTs, but these are not adaptable to the downstream application for which the causal effect is sought. In this work, we develop a task-specific approach to experimental design and derive sampling strategies customised to particular downstream applications. Across a range of important tasks, real-world datasets, and sample sizes, our method outperforms other benchmarks, e.g. requiring an order-of-magnitude less data to match RCT performance on targeted marketing tasks.

Analysis · Discretization · Invariant · Estimation/estimator · Functional

June 8, 2023

Design and analysis of a hybridized discontinuous Galerkin method for incompressible flows on meshes with quadrilateral cells

Joseph P. Dean,Sander Rhebergen,Garth N. Wells

We present and analyse a hybridized discontinuous Galerkin method for incompressible flow problems using non-affine cells, proving that it preserves a key invariance property that eludes most methods, namely that any irrotational component of the prescribed force is exactly balanced by the pressure gradient and does not influence the velocity field. This invariance property can be preserved in the discrete problem if the incompressibility constraint is satisfied in a sufficiently strong sense. We derive sufficient conditions to guarantee discretely divergence-free functions are exactly divergence-free, and give examples of divergence-free finite elements on meshes containing triangular, quadrilateral, tetrahedral, or hexahedral cells generated by a (possibly non-affine) map from their respective reference cells. In the case of quadrilateral cells, we prove an optimal error estimate for the velocity field that does not depend on the pressure approximation. Our theoretical analysis is supported by numerical results.

MoDELS · Learned · contrastive · Mutually independent · Downstream task

May 26, 2021

GeomCA: Geometric Evaluation of Data Representations

Petra Poklukar,Anastasia Varava,Danica Kragic

from arxiv, ICML2021 camera ready version

Evaluating the quality of learned representations without relying on a downstream task remains one of the challenges in representation learning. In this work, we present the Geometric Component Analysis (GeomCA) algorithm, which evaluates representation spaces based on their geometric and topological properties. GeomCA can be applied to representations of any dimension, independently of the model that generated them. We demonstrate its applicability by analyzing representations obtained from a variety of scenarios, such as contrastive learning models, generative models and supervised learning models.

Sample · Class · Loss · Performer · SimPLe

January 16, 2019

Class-Balanced Loss Based on Effective Number of Samples

Yin Cui,Menglin Jia,Tsung-Yi Lin,Yang Song,Serge Belongie

from arxiv, Code is available at: //github.com/richardaecn/class-balanced-loss

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
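The effective-number formula in the abstract above can be sketched in a few lines of Python. The function names, the example class counts, and the choice to weight each class by the inverse of its effective number and rescale the weights to sum to the number of classes are assumptions made for illustration; the abstract states only the formula and that the effective number is used to re-balance the loss.

```python
def effective_number(n, beta):
    """Effective number of samples, (1 - beta**n) / (1 - beta), for beta in [0, 1)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(counts, beta):
    """Per-class weights proportional to 1 / effective_number(count).

    Assumption for illustration: weights are rescaled to sum to the number
    of classes. Every count must be at least 1.
    """
    inverse = [1.0 / effective_number(n, beta) for n in counts]
    scale = len(counts) / sum(inverse)
    return [w * scale for w in inverse]


# Hypothetical long-tailed class counts: one head class and two tail classes.
counts = [5000, 500, 50]
weights = class_balanced_weights(counts, beta=0.999)
```

With $\beta = 0$ the effective number is 1 for every class, so all weights are equal and no re-weighting takes place; as $\beta$ approaches 1 the effective number approaches $n$, and the weights approach inverse class frequency.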