Sketching is a stochastic dimension reduction method that preserves the geometric structure of data and has applications in high-dimensional regression, low-rank approximation, and graph sparsification. In this work, we show that sketching can be used to compress simulation data while still accurately estimating time autocorrelation and power spectral density. For a given compression ratio, the accuracy is much higher than that of previously known methods. In addition to providing theoretical guarantees, we apply sketching to a molecular dynamics simulation of methanol and find that the estimate of the spectral density is 90% accurate using only 10% of the data.
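To make the mechanism concrete, here is a minimal NumPy sketch of the underlying idea (not the paper's exact construction): a shared Gaussian random projection, Johnson-Lindenstrauss style, approximately preserves the inner products between snapshots that define time autocorrelation. The toy trajectory, sketch dimension `k`, and lag set are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy trajectory: d-dimensional snapshots at T time steps (correlated noise).
d, T, k = 2000, 1000, 200              # k/d = 10%, echoing the abstract's ratio
X = rng.standard_normal((d, T)).cumsum(axis=1)

# One shared Gaussian sketching matrix S with E[S^T S] = I.
S = rng.standard_normal((k, d)) / np.sqrt(k)
Y = S @ X                              # compressed snapshots, k x T

def autocorr(Z, lag):
    """Average inner product between snapshots `lag` steps apart."""
    return np.mean(np.sum(Z[:, :-lag] * Z[:, lag:], axis=0))

for lag in (1, 10, 100):
    print(f"lag={lag:3d}  exact={autocorr(X, lag):12.1f}  "
          f"sketched={autocorr(Y, lag):12.1f}")
```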
Accurate and efficient entity resolution (ER) is a significant challenge in many data mining and analysis projects that require integrating and processing massive data collections. It is becoming increasingly important in real-world applications to develop ER solutions that produce prompt responses to entity queries on large-scale databases. Some of these applications demand that entity queries be matched against large-scale reference databases within a short time. In this work, we define this as the query matching problem in ER. Indexing or blocking techniques reduce the search space and execution time in the ER process. However, approximate indexing techniques that scale to very large datasets remain an open research problem. In this paper, we investigate the query matching problem in ER and propose an indexing method suitable for approximate and efficient query matching. We first use spatial mappings to embed records in a multidimensional Euclidean space that preserves the domain-specific similarity. Among the various mapping techniques, we choose multidimensional scaling. Then, using a Kd-tree and nearest neighbour search, the method returns a block of records that includes potential matches for a query. Our method can process queries against a large-scale dataset using only a fraction $L$ of the data (where the dataset size is $N$), with $O(L^2)$ complexity, where $L \ll N$. Experiments conducted on several datasets demonstrate the effectiveness of the proposed method.
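As an illustration of the embed-then-index pipeline (not the paper's exact construction), the following Python sketch embeds a handful of hypothetical records with scikit-learn's MDS on a precomputed string-distance matrix, indexes the embedding with a Kd-tree, and returns a candidate block for a query. The record list, the distance function, and the crude out-of-sample placement of the query are all assumptions made for the demo.

```python
import numpy as np
from difflib import SequenceMatcher
from scipy.spatial import cKDTree
from sklearn.manifold import MDS

# Hypothetical reference records; real ER data would have names, addresses, etc.
records = ["jon smith", "john smyth", "jane smith", "peter jones",
           "petra jonas", "joanna smythe", "p jones", "john smith"]

def dist(a, b):
    """1 - string similarity, standing in for a domain-specific distance."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()

# Pairwise distances over the records used to build the index.
D = np.array([[dist(a, b) for b in records] for a in records])

# Multidimensional scaling: embed records so Euclidean distances approximate D.
points = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(D)

# Index the embedded records with a Kd-tree.
tree = cKDTree(points)

# Embed a query by its distances to the references; here it is crudely placed
# at its nearest reference's coordinates (the paper's out-of-sample mapping
# is more principled).
q = "jhon smith"
q_point = points[np.argmin([dist(q, r) for r in records])]

# Nearest-neighbour search returns a block of candidate matches.
_, idx = tree.query(q_point, k=3)
print("candidate block:", [records[i] for i in idx])
```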
Single-channel speech separation has experienced great progress in the last few years. However, training neural speech separation for a large number of speakers (e.g., more than 10 speakers) is out of reach for current methods, which rely on permutation invariant training (PIT). In this work, we present a permutation invariant training variant that employs the Hungarian algorithm in order to train with $O(C^3)$ time complexity, where $C$ is the number of speakers, in comparison to the $O(C!)$ complexity of PIT-based methods. Furthermore, we present a modified architecture that can handle the increased number of speakers. Our approach separates up to $20$ speakers and improves the previous results for large $C$ by a wide margin.
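The key observation is that, for any pairwise loss, finding the best permutation of estimated-to-reference sources is an assignment problem. A minimal NumPy/SciPy sketch of this idea (with MSE standing in for the separation losses actually used in such systems) looks like this:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_pit_loss(est, ref):
    """est, ref: (C, T) arrays of C estimated / reference sources.
    Builds the C x C pairwise loss matrix and solves the assignment problem
    in O(C^3) with the Hungarian algorithm, instead of scanning all C! orders."""
    pair_loss = ((est[:, None, :] - ref[None, :, :]) ** 2).mean(axis=2)
    rows, cols = linear_sum_assignment(pair_loss)   # optimal permutation
    return pair_loss[rows, cols].mean(), cols

rng = np.random.default_rng(0)
C, T = 20, 16000                       # 20 speakers, as in the abstract
ref = rng.standard_normal((C, T))
est = ref[rng.permutation(C)] + 0.01 * rng.standard_normal((C, T))

loss, perm = hungarian_pit_loss(est, ref)
print(f"loss={loss:.4f}, first entries of recovered permutation: {perm[:5]}")
```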
Frequency estimation is one of the most fundamental problems in streaming algorithms. Given a stream $S$ of elements from some universe $U=\{1 \ldots n\}$, the goal is to compute, in a single pass, a short sketch of $S$ so that for any element $i \in U$, one can estimate the number $x_i$ of times $i$ occurs in $S$ based on the sketch alone. Two state-of-the-art solutions to this problem are the Count-Min and Count-Sketch algorithms. The frequency estimator $\tilde{x}$ produced by Count-Min, using $O(1/\varepsilon \cdot \log n)$ dimensions, guarantees that $\|\tilde{x}-x\|_{\infty} \le \varepsilon \|x\|_1$ with high probability, and $\tilde{x} \ge x$ holds deterministically. Also, Count-Min works under the assumption that $x \ge 0$. On the other hand, Count-Sketch, using $O(1/\varepsilon^2 \cdot \log n)$ dimensions, guarantees that $\|\tilde{x}-x\|_{\infty} \le \varepsilon \|x\|_2$ with high probability. A natural question is whether it is possible to design a best-of-both-worlds sketching method, with error guarantees depending on the $\ell_2$ norm and space comparable to Count-Sketch, which (like Count-Min) also has the no-underestimation property. Our main set of results shows that the answer to the above question is negative. We show this in two incomparable computational models: linear sketching and streaming algorithms. We also study the complementary problem, where the sketch is required not to overestimate, i.e., $\tilde{x} \le x$ should always hold.
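For reference, here is a minimal Python implementation of the standard Count-Min update and query; the width/depth values and the toy stream are arbitrary. The `min` over rows is exactly what yields the no-underestimation property on non-negative streams.

```python
import random

class CountMin:
    """Minimal Count-Min sketch: d rows of w counters, one hash per row."""
    def __init__(self, w, d, seed=0):
        rnd = random.Random(seed)
        self.w, self.d = w, d
        self.seeds = [rnd.randrange(1 << 30) for _ in range(d)]
        self.table = [[0] * w for _ in range(d)]

    def _bucket(self, r, x):
        return hash((self.seeds[r], x)) % self.w

    def update(self, x, c=1):
        for r in range(self.d):
            self.table[r][self._bucket(r, x)] += c

    def query(self, x):
        # Each row can only over-count (non-negative stream),
        # so taking the min never underestimates.
        return min(self.table[r][self._bucket(r, x)] for r in range(self.d))

cm = CountMin(w=272, d=3)              # w ~ e/eps, d ~ ln(1/delta) in theory
for item in ["a"] * 1000 + ["b"] * 100 + [f"x{i}" for i in range(2000)]:
    cm.update(item)
print(cm.query("a"), cm.query("b"))    # >= true counts 1000 and 100
```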
In this paper, we develop deterministic fully dynamic algorithms for computing approximate distances in a graph with worst-case update time guarantees. In particular, we obtain improved dynamic algorithms that, given an unweighted and undirected graph $G=(V,E)$ undergoing edge insertions and deletions, and a parameter $0 < \epsilon \leq 1$, maintain $(1+\epsilon)$-approximations of the $st$ distance of a single pair of nodes, the distances from a single source to all nodes ("SSSP"), the distances from multiple sources to all nodes ("MSSP"), or the distances between all nodes ("APSP"). Our main result is a deterministic algorithm for maintaining $(1+\epsilon)$-approximate single-source distances with worst-case update time $O(n^{1.529})$ (for the current best known bound on the matrix multiplication exponent $\omega$). This matches a conditional lower bound by [BNS, FOCS 2019]. We further show that we can go beyond this SSSP bound for the problem of maintaining approximate $st$ distances by providing a deterministic algorithm with worst-case update time $O(n^{1.447})$. This even improves upon the fastest known randomized algorithm for this problem. At the core, our approach is to combine algebraic distance maintenance data structures with near-additive emulator constructions. This also leads to novel dynamic algorithms for maintaining $(1+\epsilon, \beta)$-emulators that improve upon the state of the art, which might be of independent interest. Our techniques also lead to improvements for randomized approximate diameter maintenance.
Forensic entomology contributes important information to crime scene investigations. In this paper, we propose a method to estimate the hatching time of larvae (or maggots) based on their lengths, the temperature profile at the crime scene, and experimental data on larval development. This requires the estimation of a time-dependent growth curve from experiments in which larvae have been exposed to a relatively small number of constant temperature profiles. Since temperature influences the developmental speed, a crucial step is the time alignment of the curves at different temperatures. We propose a model for growth under time-varying temperature profiles, based on the local growth rate estimated from the experimental data. This allows us to estimate the most likely hatching time for a sample of larvae from the crime scene. Asymptotic properties are provided for the estimators of the growth curves and the hatching time. We explore via simulations the robustness of the method to errors in the estimated temperature profile. We also apply the methodology to data from two criminal cases from the United Kingdom.
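The following Python sketch is a schematic of the underlying idea only, not the paper's estimator (which aligns growth curves across temperatures and comes with asymptotic guarantees): accumulate a temperature-dependent local growth rate over the scene's temperature profile to predict the length a larva hatching at hour $h_0$ would reach, then invert that prediction at the observed length. The growth-rate function, temperature profile, and observed length are all invented for illustration.

```python
import numpy as np

# Hypothetical local growth rate g(T): length gained per hour at temperature T,
# as would be estimated from the constant-temperature experiments.
def growth_rate(temp_c):
    return np.maximum(0.0, 0.004 * (temp_c - 10.0))  # linear stand-in, mm/hour

# Synthetic hourly temperature profile at the scene: one week, day/night cycle.
hours = np.arange(24 * 7)
temp = 16 + 6 * np.sin(2 * np.pi * hours / 24)

def predicted_length(h0):
    """Length at discovery for a larva hatching at hour h0: accumulated growth."""
    return growth_rate(temp[h0:]).sum()

observed = 2.6                                       # mm, hypothetical sample mean
pred = np.array([predicted_length(h) for h in hours])
h_hat = int(np.argmin((pred - observed) ** 2))
print(f"estimated hatching ~{len(hours) - h_hat} hours before discovery")
```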
Demands are increasing to measure per-flow statistics in the data plane of high-speed switches. Measuring flows with exact counting is infeasible due to processing and memory constraints, but a sketch is a promising candidate for collecting approximate per-flow statistics in the data plane in real time. Among existing sketches, the Count-Min sketch is a versatile tool for measuring the spectral density of high-volume data using a small amount of memory and low processing overhead. Due to its simplicity and versatility, the Count-Min sketch and its variants have been adopted in many works, either stand-alone or as a supporting measurement tool. However, Count-Min's estimation accuracy is limited because its data structure does not fully accommodate the Zipfian distribution and its indiscriminate update algorithm ignores counter values. This in turn degrades the accuracy of heavy-hitter, heavy-changer, cardinality, and entropy estimation. There have been various attempts to enhance the measurement accuracy of Count-Min. One of the most notable approaches is to cascade multiple sketches in a sequential manner so that either mouse or elephant flows are filtered out, separating elephants from mice, as in the Elastic sketch (an elephant filter leveraging TCAM + Count-Min) and the FCM sketch (Count-Min-based layered mouse filters). In this paper, we first show that these cascaded filtering approaches, which adopt a pyramid-shaped data structure (allocating more counters for mouse flows), still suffer from under-utilization of memory, leaving room for better estimation. To this end, we face two challenges: (a) how to make Count-Min's data structure accommodate the Zipfian distribution more effectively, and (b) how to make update and query operations work without delaying packet processing in the switch's data plane. Count-Less adopts a different combination ...
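Since the abstract is truncated before Count-Less's design is spelled out, the sketch below instead shows a well-known value-aware refinement of the plain Count-Min update, conservative update, which raises an item's counters only up to the current minimum plus the increment. It illustrates what "considering a counter value" during updates can mean; it is not the Count-Less algorithm itself.

```python
import random

class CountMinCU:
    """Count-Min with conservative update: on each update, counters are raised
    only up to min(current counters) + c, which tightens over-estimates while
    preserving the over-estimation guarantee for non-negative streams."""
    def __init__(self, w, d, seed=0):
        rnd = random.Random(seed)
        self.w = w
        self.seeds = [rnd.randrange(1 << 30) for _ in range(d)]
        self.table = [[0] * w for _ in range(d)]

    def _buckets(self, x):
        return [hash((s, x)) % self.w for s in self.seeds]

    def update(self, x, c=1):
        bs = self._buckets(x)
        target = min(row[b] for row, b in zip(self.table, bs)) + c
        for row, b in zip(self.table, bs):
            if row[b] < target:        # value-aware: skip already-large counters
                row[b] = target

    def query(self, x):
        return min(row[b] for row, b in zip(self.table, self._buckets(x)))
```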
In 1998, Brassard, Hoyer, Mosca, and Tapp (BHMT) gave a quantum algorithm for approximate counting. Given a list of $N$ items, $K$ of them marked, their algorithm estimates $K$ to within relative error $\varepsilon$ by making only $O\left( \frac{1}{\varepsilon}\sqrt{\frac{N}{K}}\right)$ queries. Although this speedup is of "Grover" type, the BHMT algorithm has the curious feature of relying on the Quantum Fourier Transform (QFT), more commonly associated with Shor's algorithm. Is this necessary? This paper presents a simplified algorithm, which we prove achieves the same query complexity using Grover iterations only. We also generalize this to a QFT-free algorithm for amplitude estimation. Related approaches to approximate counting were sketched previously by Grover, Abrams and Williams, Suzuki et al., and Wie (the latter two appearing as we were writing this paper), but in all cases without a rigorous analysis.
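To see why Grover iterations alone carry enough information for counting, note that after $t$ iterations the probability of measuring a marked item is $\sin^2((2t+1)\theta)$ with $\sin^2\theta = K/N$. The classical simulation below samples these statistics at a few iteration counts and recovers $K$ by a naive grid-search maximum likelihood, loosely in the spirit of the Suzuki et al. approach mentioned above; the schedule and estimator are illustrative only and carry none of the paper's guarantees.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 1 << 20, 937                    # ground truth, hidden from the estimator
theta = np.arcsin(np.sqrt(K / N))

def measure(t, shots):
    """After t Grover iterations, a marked item is seen w.p. sin^2((2t+1)*theta)."""
    return rng.binomial(shots, np.sin((2 * t + 1) * theta) ** 2)

ts, shots = [0, 1, 2, 4, 8, 16, 32], 50    # exponentially spaced schedule
hits = [measure(t, shots) for t in ts]

# Naive grid-search MLE for theta from the simulated measurement record.
grid = np.linspace(1e-4, np.pi / 4, 200_000)
loglik = np.zeros_like(grid)
for t, h in zip(ts, hits):
    p = np.clip(np.sin((2 * t + 1) * grid) ** 2, 1e-12, 1 - 1e-12)
    loglik += h * np.log(p) + (shots - h) * np.log(1 - p)
theta_hat = grid[np.argmax(loglik)]
print(f"true K = {K}, estimated K = {N * np.sin(theta_hat) ** 2:.0f}")
```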
Many representative graph neural networks, e.g., GPR-GNN and ChebyNet, approximate graph convolutions with graph spectral filters. However, existing work either applies predefined filter weights or learns them without the necessary constraints, which may lead to oversimplified or ill-posed filters. To overcome these issues, we propose $\textit{BernNet}$, a novel graph neural network with theoretical support that provides a simple but effective scheme for designing and learning arbitrary graph spectral filters. In particular, for any filter over the normalized Laplacian spectrum of a graph, our BernNet estimates it by an order-$K$ Bernstein polynomial approximation and designs its spectral property by setting the coefficients of the Bernstein basis. Moreover, we can learn the coefficients (and the corresponding filter weights) based on observed graphs and their associated signals, and thus achieve a BernNet specialized for the data. Our experiments demonstrate that BernNet can learn arbitrary spectral filters, including complicated band-rejection and comb filters, and that it achieves superior performance in real-world graph modeling tasks.
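Concretely, the Bernstein-basis parameterization the abstract describes takes the form $h(\lambda)=\sum_{k=0}^{K}\theta_k \binom{K}{k}(\tfrac{\lambda}{2})^k(1-\tfrac{\lambda}{2})^{K-k}$ for eigenvalues $\lambda \in [0,2]$ of the normalized Laplacian. A minimal NumPy evaluation of this basis follows; the specific coefficient vector is an arbitrary example, not a learned filter.

```python
import numpy as np
from scipy.special import comb

def bernstein_filter(lam, theta):
    """h(lam) = sum_k theta_k * C(K,k) * (lam/2)^k * (1 - lam/2)^(K-k),
    evaluated on normalized-Laplacian eigenvalues lam in [0, 2]."""
    K = len(theta) - 1
    x = lam / 2.0
    basis = np.array([comb(K, k) * x**k * (1 - x)**(K - k) for k in range(K + 1)])
    return theta @ basis               # same shape as lam

K = 10
lam = np.linspace(0.0, 2.0, 5)
# Non-negative coefficients keep h >= 0 on [0, 2]; weighting the high-k basis
# functions more heavily produces a high-pass shape, for example.
theta_high = np.array([(k / K) ** 2 for k in range(K + 1)])
print(bernstein_filter(lam, theta_high))
```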
Implicit probabilistic models are models defined naturally in terms of a sampling procedure; they often induce a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, but can be shown to be equivalent to maximizing likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.
This paper addresses the problem of viewpoint estimation of an object in a given image. It presents five key insights that should be taken into consideration when designing a CNN to solve the problem. Based on these insights, the paper proposes a network in which (i) the architecture jointly solves detection, classification, and viewpoint estimation; (ii) new types of data are added and trained on; (iii) a novel loss function, which takes into account both the geometry of the problem and the new types of data, is proposed. Our network improves the state-of-the-art results for this problem by 9.8%.