Statistical problems often involve linear equality and inequality constraints on model parameters. Direct estimation of parameters restricted to general polyhedral cones, particularly when one is interested in estimating low dimensional features, may be challenging. We use a dual form parameterization to characterize parameter vectors restricted to lower dimensional faces of polyhedral cones and use the characterization to define a notion of 'sparsity' on such cones. We show that the proposed notion agrees with the usual notion of sparsity in the unrestricted case and prove the validity of the proposed definition as a measure of sparsity. The identifiable parameterization of the lower dimensional faces allows a generalization of popular spike-and-slab priors to a closed convex polyhedral cone. The prior measure utilizes the geometry of the cone by defining a Markov random field over the adjacency graph of the extreme rays of the cone. We describe an efficient way of computing the posterior of the parameters in the restricted case. We illustrate the usefulness of the proposed methodology for imposing linear equality and inequality constraints by using wearables data from the National Health and Nutrition Examination Survey (NHANES) actigraph study where the daily average activity profiles of participants exhibit patterns that seem to obey such constraints.
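As a rough illustration of the geometry involved (not the paper's dual parameterization or its sparsity measure), the faces of a polyhedral cone {θ : Aθ ≥ 0} are indexed by the inequality constraints that hold with equality. The sketch below lists those active constraints for a given θ. The helper name `active_constraints` and the two example cones are assumptions made for illustration. For the nonnegative orthant, where A is the identity, the active constraints are exactly the zero coordinates, which is the usual notion of sparsity.

```python
def active_constraints(A, theta, tol=1e-9):
    """Return indices i with a_i . theta == 0 (within tol) for the cone {theta : A theta >= 0}.

    Hypothetical helper for illustration; raises if theta lies outside the cone.
    """
    active = []
    for i, row in enumerate(A):
        value = sum(a * t for a, t in zip(row, theta))
        if value < -tol:
            raise ValueError(f"theta violates constraint {i}")
        if abs(value) <= tol:
            active.append(i)
    return active


# Nonnegative orthant in R^3: A is the identity, so active constraints are zero coordinates.
orthant = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Monotone cone in R^4: theta_1 <= theta_2 <= theta_3 <= theta_4.
monotone = [[-1, 1, 0, 0], [0, -1, 1, 0], [0, 0, -1, 1]]

orthant_active = active_constraints(orthant, [0.0, 2.5, 0.0])
monotone_active = active_constraints(monotone, [1.0, 1.0, 3.0, 3.0])
```

The more constraints are active, the lower the dimension of the face containing θ. In the monotone example, each active constraint ties two neighbouring coordinates together, so θ is a step function with fewer distinct levels.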

Related content

February 16, 2024

The Price of Adaptivity in Stochastic Convex Optimization

Yair Carmon,Oliver Hinder

We prove impossibility results for adaptivity in non-smooth stochastic convex optimization. Given a set of problem parameters we wish to adapt to, we define a "price of adaptivity" (PoA) that, roughly speaking, measures the multiplicative increase in suboptimality due to uncertainty in these parameters. When the initial distance to the optimum is unknown but a gradient norm bound is known, we show that the PoA is at least logarithmic for expected suboptimality, and double-logarithmic for median suboptimality. When there is uncertainty in both distance and gradient norm, we show that the PoA must be polynomial in the level of uncertainty. Our lower bounds nearly match existing upper bounds, and establish that there is no parameter-free lunch.

February 16, 2024

Best of Three Worlds: Adaptive Experimentation for Digital Marketing in Practice

Tanner Fiez,Houssam Nassif,Arick Chen,Sergio Gamez,Lalit Jain

Adaptive experimental design (AED) methods are increasingly being used in industry as a tool to boost testing throughput or reduce experimentation cost relative to traditional A/B/N testing methods. However, the behavior and guarantees of such methods are not well-understood beyond idealized stationary settings. This paper shares lessons learned regarding the challenges of naively using AED systems in industrial settings where non-stationarity is prevalent, while also providing perspectives on the proper objectives and system specifications in such settings. We developed an AED framework for counterfactual inference based on these experiences, and tested it in a commercial environment.

February 16, 2024

Multi-Model 3D Registration: Finding Multiple Moving Objects in Cluttered Point Clouds

David Jin,Sushrut Karmalkar,Harry Zhang,Luca Carlone

from arxiv, 8 pages, Accepted by ICRA 2024

We investigate a variation of the 3D registration problem, named multi-model 3D registration. In the multi-model registration problem, we are given two point clouds picturing a set of objects at different poses (and possibly including points belonging to the background) and we want to simultaneously reconstruct how all objects moved between the two point clouds. This setup generalizes standard 3D registration where one wants to reconstruct a single pose, e.g., the motion of the sensor picturing a static scene. Moreover, it provides a mathematically grounded formulation for relevant robotics applications, e.g., where a depth sensor onboard a robot perceives a dynamic scene and has the goal of estimating its own motion (from the static portion of the scene) while simultaneously recovering the motion of all dynamic objects. We assume a correspondence-based setup where we have putative matches between the two point clouds and consider the practical case where these correspondences are plagued with outliers. We then propose a simple approach based on Expectation-Maximization (EM) and establish theoretical conditions under which the EM approach converges to the ground truth. We evaluate the approach in simulated and real datasets ranging from table-top scenes to self-driving scenarios and demonstrate its effectiveness when combined with state-of-the-art scene flow methods to establish dense correspondences.
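To make the EM idea concrete, here is a deliberately simplified sketch that assumes every object moves by a pure 2D translation, whereas the paper estimates full 3D rigid poses. Each putative correspondence contributes a displacement b_i - a_i, and EM alternates between softly assigning displacements to motions (plus a uniform outlier class) and re-estimating each translation as a weighted mean. The function name, noise level `sigma`, and `outlier_density` are assumptions chosen for illustration, not values from the paper.

```python
import math


def em_translations(displacements, init, sigma=0.1, outlier_density=0.01, iters=50):
    """Toy EM for a mixture of 2D translations plus a uniform outlier class.

    displacements: list of (dx, dy) = b_i - a_i, one per putative correspondence.
    init: initial guesses for the K translations.
    Returns the estimated translations and the final responsibilities.
    """
    translations = [tuple(t) for t in init]
    k = len(translations)
    weights = [1.0 / (k + 1)] * (k + 1)  # last entry is the outlier class
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    resp = []
    for _ in range(iters):
        # E-step: posterior probability that each correspondence belongs to each motion.
        resp = []
        for dx, dy in displacements:
            likes = [
                weights[j] * norm
                * math.exp(-((dx - tx) ** 2 + (dy - ty) ** 2) / (2.0 * sigma ** 2))
                for j, (tx, ty) in enumerate(translations)
            ]
            likes.append(weights[k] * outlier_density)
            total = sum(likes)
            resp.append([value / total for value in likes])
        # M-step: each translation becomes the responsibility-weighted mean displacement.
        for j in range(k):
            mass = sum(r[j] for r in resp)
            if mass > 0.0:
                translations[j] = (
                    sum(r[j] * d[0] for r, d in zip(resp, displacements)) / mass,
                    sum(r[j] * d[1] for r, d in zip(resp, displacements)) / mass,
                )
        weights = [sum(r[j] for r in resp) / len(displacements) for j in range(k + 1)]
    return translations, resp


# Two moving objects and one gross outlier correspondence.
observed = [
    (1.00, 0.00), (1.05, -0.02), (0.97, 0.03), (1.02, 0.01),
    (-2.00, 1.00), (-1.96, 1.02), (-2.03, 0.97),
    (5.00, 5.00),
]
estimated, responsibilities = em_translations(observed, init=[(0.8, 0.1), (-1.8, 0.8)])
```

In the rigid-pose setting, the M-step would solve a weighted point-set alignment problem instead of taking a weighted mean. As in any EM scheme, the result depends on the initial guesses.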

February 16, 2024

Compositional Generative Modeling: A Single Model is Not All You Need

Yilun Du,Leslie Kaelbling

Large monolithic generative models trained on massive amounts of data have become an increasingly dominant approach in AI research. In this paper, we argue that we should instead construct large generative systems by composing smaller generative models together. We show how such a compositional generative approach enables us to learn distributions in a more data-efficient manner, enabling generalization to parts of the data distribution unseen at training time. We further show how this enables us to program and construct new generative models for tasks completely unseen at training. Finally, we show that in many cases, we can discover separate compositional components from data.

February 16, 2024

In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss

Yuri Kuratov,Aydar Bulatov,Petr Anokhin,Dmitry Sorokin,Artyom Sorokin,Mikhail Burtsev

This paper addresses the challenge of processing long documents using generative transformer models. To evaluate different approaches, we introduce BABILong, a new benchmark designed to assess model capabilities in extracting and processing distributed facts within extensive texts. Our evaluation, which includes benchmarks for GPT-4 and RAG, reveals that common methods are effective only for sequences up to $10^4$ elements. In contrast, fine-tuning GPT-2 with recurrent memory augmentations enables it to handle tasks involving up to $10^7$ elements. This achievement marks a substantial leap, as it is by far the longest input processed by any open neural network model to date, demonstrating a significant improvement in the processing capabilities for long sequences.

February 15, 2024

EntailE: Introducing Textual Entailment in Commonsense Knowledge Graph Completion

Ying Su,Tianqing Fang,Huiru Xiao,Weiqi Wang,Yangqiu Song,Tong Zhang,Lei Chen

from arxiv, 10 pages, 5 figures, 9 tables

Commonsense knowledge graph completion is a new challenge for commonsense knowledge graph construction and application. In contrast to factual knowledge graphs such as Freebase and YAGO, commonsense knowledge graphs (CSKGs; e.g., ConceptNet) utilize free-form text to represent named entities, short phrases, and events as their nodes. Such a loose structure results in large and sparse CSKGs, which makes the semantic understanding of these nodes more critical for learning rich commonsense knowledge graph embeddings. While current methods leverage semantic similarities to increase the graph density, the semantic plausibility of the nodes and their relations is under-explored. Previous works adopt conceptual abstraction to improve the consistency of modeling (event) plausibility, but they are not scalable enough and still suffer from data sparsity. In this paper, we propose to adopt textual entailment to find implicit entailment relations between CSKG nodes, to effectively densify the subgraph connecting nodes within the same conceptual class, which indicates a similar level of plausibility. Each node in the CSKG finds its top entailed nodes using a transformer fine-tuned on natural language inference (NLI) tasks, which sufficiently captures textual entailment signals. The entailment relations between these nodes are further utilized to: 1) build new connections between source triplets and entailed nodes to densify the sparse CSKGs; 2) enrich the generalization ability of node representations by comparing the node embeddings with a contrastive loss. Experiments on two standard CSKGs demonstrate that our proposed framework EntailE can improve the performance of CSKG completion tasks under both transductive and inductive settings.
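The abstract does not specify which contrastive loss is used, so the sketch below substitutes a common choice, an InfoNCE-style loss over cosine similarities, purely to show what comparing node embeddings contrastively can look like. Treating an entailed node as the positive and unrelated nodes as negatives is an assumption, as are the function names and the `temperature` value.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: -log softmax score of the positive among all candidates."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_denominator - logits[0]


# Hypothetical 3-dimensional embeddings: a node, one entailed node, two unrelated nodes.
node = [1.0, 0.2, 0.0]
entailed = [0.9, 0.3, 0.1]
unrelated = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.5]]

loss_close = info_nce(node, entailed, unrelated)
loss_far = info_nce(node, [0.0, 0.0, 1.0], unrelated)
```

Minimizing this loss pulls a node's embedding toward its entailed node and pushes it away from the negatives, so `loss_close` is smaller than `loss_far` in the example.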

February 14, 2024

Stabilizing Agreement is Impossible in Delayed Message Passing Models

Stephan Felber,Hugo Rincon Galeana

Most distributed computing research has focused on terminating problems like consensus and similar agreement problems. However, non-terminating problems have been studied exhaustively in the context of self-stabilizing distributed algorithms, which may start from arbitrary initial states and can tolerate arbitrary transient faults. Somewhere in between lies the stabilizing consensus problem, where the processes start from a well-defined initial state but do not need to decide irrevocably and need to agree on a common value only eventually. Charron-Bost and Moran studied stabilizing consensus in synchronous dynamic networks controlled by a message adversary. They introduced the simple and elegant class of min-max algorithms, which solve stabilizing consensus under every message adversary that (i) allows at least one process to reach all other processes infinitely often, and (ii) does so within a bounded (but unknown) number of rounds. Moreover, the authors proved that (i) is a necessary condition. The question of whether (i) is also sufficient, i.e., whether (ii) is also necessary, was left open. We answer this question by proving that stabilizing consensus is impossible if (ii) is dropped, i.e., even if some process reaches all other processes infinitely often but only within finite time. We accomplish this by introducing a novel class of arbitrarily delayed message adversaries, which also allows us to establish a connection between terminating task solvability under some message adversary and stabilizing task solvability under the corresponding arbitrarily delayed message adversary. Finally, we outline how to extend this relation to terminating task solvability in asynchronous message passing with guaranteed broadcasts, which highlights the asynchronous characteristics induced by arbitrary delays.

February 14, 2024

Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization

Hongrui Chen,Lexing Ying

from arxiv, 19 pages

Diffusion models have achieved huge empirical success in data generation tasks. Recently, some efforts have been made to adapt the framework of diffusion models to discrete state space, providing a more natural approach for modeling intrinsically discrete data, such as language and graphs. This is achieved by formulating both the forward noising process and the corresponding reversed process as Continuous Time Markov Chains (CTMCs). In this paper, we investigate the theoretical properties of the discrete diffusion model. Specifically, we introduce an algorithm leveraging the uniformization of continuous Markov chains, implementing transitions on random time points. Under reasonable assumptions on the learning of the discrete score function, we derive Total Variation distance and KL divergence guarantees for sampling from any distribution on a hypercube. Our results align with state-of-the-art achievements for diffusion models in $\mathbb{R}^d$ and further underscore the advantages of discrete diffusion models in comparison to the $\mathbb{R}^d$ setting.
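For readers unfamiliar with uniformization, the following sketch shows the textbook version for a time-homogeneous CTMC with generator matrix Q. It is not the paper's algorithm, which handles the learned reverse process. Pick a rate λ at least as large as every exit rate, let the number of candidate jumps on [0, T] be Poisson(λT), and at each candidate jump move according to the kernel P = I + Q/λ. The function names and the two-state example (a single bit that flips at rate 1) are assumptions for illustration.

```python
import math
import random


def uniformized_kernel(Q, lam):
    """Jump kernel P = I + Q / lam of the uniformized chain."""
    n = len(Q)
    return [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)] for i in range(n)]


def transient_distribution(Q, p0, T, max_terms=10_000, tol=1e-12):
    """Distribution at time T as a Poisson-weighted sum of powers of P."""
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n))
    if lam == 0.0:
        return list(p0)  # no transitions ever occur
    P = uniformized_kernel(Q, lam)
    weight = math.exp(-lam * T)  # Poisson(lam * T) probability of zero jumps
    vec = list(p0)
    result = [weight * x for x in vec]
    covered = weight
    for k in range(1, max_terms):
        if 1.0 - covered <= tol:
            break
        vec = [sum(vec[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= lam * T / k
        covered += weight
        result = [r + weight * x for r, x in zip(result, vec)]
    return result


def sample_state(Q, start, T, rng):
    """Draw the state at time T by applying P at the points of a rate-lam Poisson process."""
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n))
    if lam == 0.0:
        return start
    P = uniformized_kernel(Q, lam)
    state = start
    t = rng.expovariate(lam)
    while t <= T:
        state = rng.choices(range(n), weights=P[state])[0]
        t += rng.expovariate(lam)
    return state


# One bit that flips at rate 1: the chance of seeing the starting value again
# at time T is (1 + exp(-2T)) / 2.
flip = [[-1.0, 1.0], [1.0, -1.0]]
dist = transient_distribution(flip, [1.0, 0.0], T=0.7)
draw = sample_state(flip, 0, 0.7, random.Random(0))
```

Because the candidate jump times come from a Poisson process with a fixed rate, sampling needs no time discretization, so the only error in `transient_distribution` comes from truncating the Poisson sum.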

February 14, 2024

Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling

Mingze Wang,Weinan E

from arxiv, 65 pages

We conduct a systematic study of the approximation properties of Transformer for sequence modeling with long, sparse and complicated memory. We investigate the mechanisms through which different components of Transformer, such as the dot-product self-attention, positional encoding and feed-forward layer, affect its expressive power, and we study their combined effects through establishing explicit approximation rates. Our study reveals the roles of critical parameters in the Transformer, such as the number of layers and the number of attention heads, and these insights also provide natural suggestions for alternative architectures.

May 10, 2022

Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Julian Wörmann,Daniel Bogdoll,Etienne Bührle,Han Chen,Evaristus Fuh Chuo,Kostadin Cvejoski,Ludger van Elst,Tobias Gleißner,Philip Gottschall,Stefan Griesche,Christian Hellert,Christian Hesels,Sebastian Houben,Tim Joseph,Niklas Keil,Johann Kelsch,Hendrik Königshof,Erwin Kraft,Leonie Kreuser,Kevin Krone,Tobias Latka,Denny Mattern,Stefan Matthes,Mohsin Munir,Moritz Nekolla,Adrian Paschke,Maximilian Alexander Pintz,Tianming Qiu,Faraz Qureishi,Syed Tahseen Raza Rizvi,Jörg Reichardt,Laura von Rueden,Stefan Rudolph,Alexander Sagel,Gerhard Schunk,Hao Shen,Hendrik Stapelbroek,Vera Stehr,Gurucharan Srinivas,Anh Tuan Tran,Abhishek Vivekanandan,Ya Wang,Florian Wasserrab,Tino Werner,Christian Wirth,Stefan Zwicklbauer

from arxiv, 93 pages

The existence of representative datasets is a prerequisite for many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcoming the limitations of purely data-driven approaches, and eventually to increasing the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.