Probabilistic databases (PDBs) model uncertainty in data in a quantitative way. In the established formal framework, probabilistic (relational) databases are finite probability spaces over relational database instances. This finiteness can clash with intuitive query behavior (Ceylan et al., KR 2016), and with application scenarios that are better modeled by continuous probability distributions (Dalvi et al., CACM 2009). We formally introduced infinite PDBs in (Grohe and Lindner, PODS 2019) with a primary focus on countably infinite spaces. However, an extension beyond countable probability spaces raises nontrivial foundational issues concerned with the measurability of events and queries and ultimately with the question whether queries have a well-defined semantics. We argue that finite point processes are an appropriate model from probability theory for dealing with general probabilistic databases. This allows us to construct suitable (uncountable) probability spaces of database instances in a systematic way. Our main technical results are measurability statements for relational algebra queries as well as aggregate queries and Datalog queries.
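As a point of reference for the established finite framework mentioned above, the following is a minimal sketch of a finite PDB as a probability space over database instances. The `Temp` relation, its facts, and the helper names are hypothetical illustration data, not taken from the paper; the infinite and uncountable constructions that the abstract argues for are not covered here.

```python
from fractions import Fraction

# An instance is a frozenset of facts; a fact is (relation_name, argument_tuple).
# A finite PDB maps each possible instance to its probability.
# The "Temp" relation and all values below are hypothetical.

def query_probability(pdb, boolean_query):
    """Probability of the event {D : boolean_query(D) holds}."""
    return sum((p for inst, p in pdb.items() if boolean_query(inst)), Fraction(0))

def some_reading_above(threshold):
    """Boolean query: is there a Temp fact whose reading exceeds the threshold?"""
    def q(instance):
        return any(rel == "Temp" and args[1] > threshold for rel, args in instance)
    return q

low = ("Temp", ("office", 20))
high = ("Temp", ("office", 25))

pdb = {
    frozenset(): Fraction(1, 10),
    frozenset({low}): Fraction(5, 10),
    frozenset({high}): Fraction(3, 10),
    frozenset({low, high}): Fraction(1, 10),
}
assert sum(pdb.values()) == 1  # the instances form a probability space

p = query_probability(pdb, some_reading_above(22))
print(p)  # 2/5
```

In the finite case every set of instances is an event, so any query is trivially measurable. With a continuous-valued reading, the set of possible instances becomes uncountable, and that is where the measurability questions raised in the abstract arise.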

Related content

Color · marginal distribution · mode collapse · Performer · estimation/estimator

September 29, 2021

Generative Probabilistic Image Colorization

Chie Furusawa, Shinya Kitaoka, Michael Li, Yuri Odagiri

from arxiv, 11 pages

We propose Generative Probabilistic Image Colorization, a diffusion-based generative process that trains a sequence of probabilistic models to reverse each step of noise corruption. Given a line-drawing image as input, our method suggests multiple candidate colorized images. Therefore, our method accounts for the ill-posed nature of the colorization problem. We conduct comprehensive experiments investigating the colorization of line-drawing images, report the influence of a score-based MCMC approach that corrects the marginal distribution of estimated samples, and further compare different combinations of models and the similarity of their generated images. Despite using only a relatively small training dataset, we experimentally develop a method for generating multiple diverse colorization candidates that avoids mode collapse and does not require any additional constraints, losses, or re-training with alternative training conditions. Our proposed approach performed well not only on color-conditional image generation tasks using biased initial values, but also on some practical image completion and inpainting tasks.

manifold · MoDELS · identifiable · model performance · Performance

September 29, 2021

Flow Based Models For Manifold Data

Mingtian Zhang, Yitong Sun, Steven McDonagh, Chen Zhang

Flow-based generative models typically define a latent space with dimensionality identical to the observational space. In many problems, however, the data do not populate the full ambient data space in which they natively reside, but instead inhabit a lower-dimensional manifold. In such scenarios, flow-based models are unable to represent data structures exactly, as their density will always have support off the data manifold, potentially degrading model performance. In addition, the requirement for equal latent and data space dimensionality can unnecessarily increase complexity for contemporary flow models. To address these problems, we propose to learn a manifold prior that affords benefits to both sample generation and representation quality. An auxiliary benefit of our approach is the ability to identify the intrinsic dimension of the data distribution.
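The claim that a flow's density always has support off the data manifold follows from the standard change-of-variables formula, written out below as a reminder. The symbols $f$, $p_Z$, $p_X$, and $D$ are notation assumed here for illustration and do not appear in the abstract.

```latex
% Standard normalizing-flow density for an invertible, differentiable map
% f : R^D -> R^D with latent prior p_Z on R^D (notation assumed here):
\log p_X(x) \;=\; \log p_Z\bigl(f(x)\bigr)
  \;+\; \log\left|\det \frac{\partial f(x)}{\partial x}\right| ,
\qquad x \in \mathbb{R}^D .
% If p_Z(z) > 0 for all z (e.g. a standard Gaussian) and the Jacobian of f
% is nonsingular everywhere, then p_X(x) > 0 for every x in R^D,
% including points that lie off a lower-dimensional data manifold.
```

Because both terms on the right are finite for every $x$, the model cannot assign zero density outside a lower-dimensional manifold, which is the limitation the abstract describes.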

statistic · likelihood · identifiable · Pivotal (company) · Continuity

September 28, 2021

Constructing Prediction Intervals Using the Likelihood Ratio Statistic

Qinglong Tian, Daniel J. Nordman, William Q. Meeker

Statistical prediction plays an important role in many decision processes such as university budgeting (depending on the number of students who will enroll), capital budgeting (depending on the remaining lifetime of a fleet of systems), the needed amount of cash reserves for warranty expenses (depending on the number of warranty returns), and whether a product recall is needed (depending on the number of potentially life-threatening product failures). In statistical inference, likelihood ratios have a long history of use for decision making relating to model parameters (e.g., in evidence-based medicine and forensics). We propose a general prediction method, based on a likelihood ratio (LR) involving both the data and a future random variable. This general approach provides a way to identify prediction interval methods that have excellent statistical properties. For example, if a prediction method can be based on a pivotal quantity, our LR-based method will often identify it. For applications where a pivotal quantity does not exist, the LR-based method provides a procedure with good coverage properties for both continuous and discrete-data prediction applications.
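To make the notion of a pivotal quantity concrete, here is the textbook normal-sample case. It is offered only as a familiar illustration under the stated i.i.d. normal assumption; it is not taken from the paper and says nothing about how the LR-based method itself is constructed.

```latex
% Assumption (illustration only): X_1, ..., X_n and a future Y are i.i.d. N(\mu, \sigma^2).
% With sample mean \bar{X} and sample standard deviation S, the quantity
T \;=\; \frac{Y - \bar{X}}{S\sqrt{1 + 1/n}} \;\sim\; t_{n-1}
% is pivotal: its distribution does not depend on (\mu, \sigma).
% Inverting P(|T| \le t_{1-\alpha/2,\,n-1}) = 1 - \alpha gives the exact
% 100(1-\alpha)\% prediction interval for Y:
\bar{X} \;\pm\; t_{1-\alpha/2,\,n-1}\; S\,\sqrt{1 + \frac{1}{n}} .
```

The $\sqrt{1 + 1/n}$ factor combines the variance of the future observation with the variance of the sample mean, which is why a prediction interval is wider than the corresponding confidence interval for $\mu$.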

stream · sample · inference · Automator · Bayesian inference

September 26, 2021

Statically Bounded-Memory Delayed Sampling for Probabilistic Streams

Eric Atkinson, Guillaume Baudart, Louis Mandel, Charles Yuan, Michael Carbin

from arxiv, OOPSLA 2021

Probabilistic programming languages aid developers performing Bayesian inference. These languages provide programming constructs and tools for probabilistic modeling and automated inference. Prior work introduced a probabilistic programming language, ProbZelus, to extend probabilistic programming functionality to unbounded streams of data. This work demonstrated that the delayed sampling inference algorithm could be extended to work in a streaming context. ProbZelus showed that while delayed sampling could be effectively deployed on some programs, depending on the probabilistic model under consideration, delayed sampling is not guaranteed to use a bounded amount of memory over the course of the execution of the program. In this paper, we present conditions on a probabilistic program's execution under which delayed sampling will execute in bounded memory. The two conditions are dataflow properties of the core operations of delayed sampling: the $m$-consumed property and the unseparated paths property. A program executes in bounded memory under delayed sampling if, and only if, it satisfies the $m$-consumed and unseparated paths properties. We propose a static analysis that abstracts over these properties to soundly ensure that any program that passes the analysis satisfies these properties, and thus executes in bounded memory under delayed sampling.

Integration · statistic · inference · comprehensibility · reviewer

September 25, 2021

Statistical Inference for Data Integration

Xi Yang, Katherine A. Hoadley, Jan Hannig, J. S. Marron

In the age of big data, data integration is a critical step, especially in understanding how diverse data types work together and work separately. Among the data integration methods, the Angle-Based Joint and Individual Variation Explained (AJIVE) is particularly attractive because it studies not only joint behavior but also individual behavior. Typically, scores indicate relationships between data objects. The drivers of those relationships are determined by the loadings. A fundamental question is which loadings are statistically significant. A useful approach for assessing this is the jackstraw method. In this paper, we develop jackstraw for the loadings of the AJIVE data analysis. This provides statistical inference about the drivers in both joint and individual feature spaces.

Extensibility · statistic · inference · exact inference · sparse

September 24, 2021

On Statistical Inference with High Dimensional Sparse CCA

Nilanjana Laha, Nathan Huey, Brent Coull, Rajarshi Mukherjee

We consider asymptotically exact inference on the leading canonical correlation directions and strengths between two high dimensional vectors under sparsity restrictions. In this regard, our main contribution is the development of a loss function, based on which, one can operationalize a one-step bias-correction on reasonable initial estimators. Our analytic results in this regard are adaptive over suitable structural restrictions of the high dimensional nuisance parameters, which, in this set-up, correspond to the covariance matrices of the variables of interest. We further supplement the theoretical guarantees behind our procedures with extensive numerical studies.

reducible · Euclidean distance · functional · AIM

September 24, 2021

Dynamic Data Structures for $k$-Nearest Neighbor Queries

Sarita de Berg, Frank Staals

from arxiv, 20 pages, 7 figures

Our aim is to develop dynamic data structures that support $k$-nearest neighbors ($k$-NN) queries for a set of $n$ point sites in $O(f(n) + k)$ time, where $f(n)$ is some polylogarithmic function of $n$. The key component is a general query algorithm that allows us to find the $k$-NN spread over $t$ substructures simultaneously, thus reducing an $O(tk)$ term in the query time to $O(k)$. Combining this technique with the logarithmic method allows us to turn any static $k$-NN data structure into a data structure supporting both efficient insertions and queries. For the fully dynamic case, this technique allows us to recover the deterministic, worst-case, $O(\log^2n/\log\log n +k)$ query time for the Euclidean distance claimed before, while preserving the polylogarithmic update times. We adapt this data structure to also support fully dynamic geodesic $k$-NN queries among a set of sites in a simple polygon. For this purpose, we design a shallow cutting based, deletion-only $k$-NN data structure. More generally, we obtain a dynamic $k$-NN data structure for any type of distance function for which we can build vertical shallow cuttings. We apply all of our methods in the plane for the Euclidean distance, the geodesic distance, and general, constant-complexity, algebraic distance functions.
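The logarithmic method mentioned above can be sketched as follows. This is a minimal illustration, not the paper's data structure: `StaticKNN` is a hypothetical brute-force stand-in for a real static $k$-NN structure, and the query merges up to $k$ candidates from each of the $t$ substructures, which is exactly the naive $O(tk)$ step that the abstract says the authors avoid.

```python
import heapq
import math

class StaticKNN:
    """Hypothetical stand-in: brute-force k-NN over a fixed point set."""
    def __init__(self, points):
        self.points = list(points)

    def query(self, q, k):
        return heapq.nsmallest(k, self.points, key=lambda p: math.dist(p, q))

class InsertOnlyKNN:
    """Logarithmic method: level i is either empty or a static structure
    on exactly 2**i points, like the digits of a binary counter."""
    def __init__(self):
        self.levels = []

    def insert(self, point):
        carry = [point]
        i = 0
        while True:
            if i == len(self.levels):
                self.levels.append(None)
            if self.levels[i] is None:
                # Empty slot: rebuild a static structure from the carried points.
                self.levels[i] = StaticKNN(carry)
                return
            # Occupied slot: absorb its points and carry them to the next level.
            carry.extend(self.levels[i].points)
            self.levels[i] = None
            i += 1

    def query(self, q, k):
        # Naive merge: collect k candidates per non-empty level, keep the best k.
        candidates = []
        for level in self.levels:
            if level is not None:
                candidates.extend(level.query(q, k))
        return heapq.nsmallest(k, candidates, key=lambda p: math.dist(p, q))

ds = InsertOnlyKNN()
for pt in [(0, 0), (5, 5), (1, 0), (0, 2), (9, 9)]:
    ds.insert(pt)
print(ds.query((0, 0), 3))  # [(0, 0), (1, 0), (0, 2)]
```

After $n$ insertions there are at most $\lfloor \log_2 n \rfloor + 1$ non-empty levels, and each point takes part in at most that many rebuilds, which is where the method's amortized insertion bound comes from.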

Guidance · Performer · ground truth · model evaluation · loss

September 24, 2021

CNN-based synthesis of realistic high-resolution LiDAR data

Larissa T. Triess, David Peter, Christoph B. Rist, Markus Enzweiler, J. Marius Zöllner

from arxiv, Project Page: //ltriess.github.io/pc-upsampling

This paper presents a novel CNN-based approach for synthesizing high-resolution LiDAR point cloud data. Our approach generates semantically and perceptually realistic results with guidance from specialized loss functions. First, we utilize a modified per-point loss that addresses missing LiDAR point measurements. Second, we align the quality of our generated output with real-world sensor data by applying a perceptual loss. In large-scale experiments on real-world datasets, we evaluate both the geometric accuracy and semantic segmentation performance using our generated data vs. ground truth. In mean opinion score testing, we further assess the perceptual quality of our generated point clouds. Our results demonstrate a significant quantitative and qualitative improvement in both geometry and semantics over traditional non-CNN-based up-sampling methods.

MoDELS · Extensibility · negative correlation method · correlation coefficient · learning

May 17, 2018

Probabilistic Embedding of Knowledge Graphs with Box Lattice Measures

Luke Vilnis, Xiang Li, Shikhar Murty, Andrew McCallum

from arxiv, ACL 2018 camera-ready version, 14 pages including appendices

Embedding methods which enforce a partial order or lattice structure over the concept space, such as Order Embeddings (OE) (Vendrov et al., 2016), are a natural way to model transitive relational data (e.g. entailment graphs). However, OE learns a deterministic knowledge base, limiting expressiveness of queries and the ability to use uncertainty for both prediction and learning (e.g. learning from expectations). Probabilistic extensions of OE (Lai and Hockenmaier, 2017) have provided the ability to somewhat calibrate these denotational probabilities while retaining the consistency and inductive bias of ordered models, but lack the ability to model the negative correlations found in real-world knowledge. In this work we show that a broad class of models that assign probability measures to OE can never capture negative correlation, which motivates our construction of a novel box lattice and accompanying probability measure to capture anticorrelation and even disjoint concepts, while still providing the benefits of probabilistic modeling, such as the ability to perform rich joint and conditional queries over arbitrary sets of concepts, and both learning from and predicting calibrated uncertainty. We show improvements over previous approaches in modeling the Flickr and WordNet entailment graphs, and investigate the power of the model.
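The idea of a box lattice with an accompanying probability measure can be illustrated with a small sketch. It assumes that each concept is an axis-aligned box inside the unit cube and that probability is box volume under the uniform measure. The concept names and coordinates are hypothetical, and nothing here reflects the paper's training procedure.

```python
from math import prod

def volume(box):
    """Volume of an axis-aligned box given as (lower, upper) coordinate tuples.
    Under the uniform measure on the unit cube this is the concept's probability."""
    lo, hi = box
    return prod(max(0.0, h - l) for l, h in zip(lo, hi))

def meet(a, b):
    """Intersection of two boxes; an empty intersection has volume 0."""
    (alo, ahi), (blo, bhi) = a, b
    return (tuple(map(max, alo, blo)), tuple(map(min, ahi, bhi)))

def joint(a, b):
    """P(a and b), the volume of the intersection."""
    return volume(meet(a, b))

def conditional(a, b):
    """P(a | b)."""
    return joint(a, b) / volume(b)

# Hypothetical concepts embedded in the unit square.
animal = ((0.0, 0.0), (0.5, 1.0))
plant = ((0.5, 0.0), (1.0, 1.0))
dog = ((0.0, 0.0), (0.25, 1.0))

print(joint(animal, plant))                   # 0.0: disjoint concepts
print(volume(animal) * volume(plant))         # 0.25: value under independence
print(conditional(animal, dog))               # 1.0: dog is contained in animal
```

Here the joint probability of `animal` and `plant` (0.0) falls below the product of their marginals (0.25), so the two concepts are negatively correlated, while containment of `dog` in `animal` still encodes entailment.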

deep reinforcement learning · learning · reinforcement learning · tuning · CASE

January 17, 2018

The Case for Automatic Database Administration using Deep Reinforcement Learning

Ankur Sharma, Felix Martin Schuhknecht, Jens Dittrich

Like any large software system, a full-fledged DBMS offers an overwhelming number of configuration knobs. These range from static initialisation parameters like buffer sizes, degree of concurrency, or level of replication to complex runtime decisions like creating a secondary index on a particular column or reorganising the physical layout of the store. To simplify the configuration, industry-grade DBMSs are usually shipped with various advisory tools that provide recommendations for given workloads and machines. However, reality shows that the actual configuration, tuning, and maintenance are usually still done by a human administrator, relying on intuition and experience. Recent work on deep reinforcement learning has shown very promising results in solving problems that require such a sense of intuition. For instance, it has been applied very successfully in learning how to play complicated games with enormous search spaces. Motivated by these achievements, in this work we explore how deep reinforcement learning can be used to administer a DBMS. First, we will describe how deep reinforcement learning can be used to automatically tune an arbitrary software system like a DBMS by defining a problem environment. Second, we showcase our concept of NoDBA on the concrete example of index selection and evaluate how well it recommends indexes for given workloads.