
Given a database of bit strings $A_1,\ldots,A_m\in \{0,1\}^n$, a fundamental data structure task is to estimate the distances between a given query $B\in \{0,1\}^n$ and all the strings in the database. In addition, one might further want to ensure the integrity of the database by releasing these distance statistics in a secure manner. In this work, we propose differentially private (DP) data structures for this type of task, with a focus on Hamming and edit distance. On top of the strong privacy guarantees, our data structures are also time- and space-efficient. In particular, our data structure is $\epsilon$-DP against any sequence of queries of arbitrary length, and for any query $B$ such that the maximum distance to any string in the database is at most $k$, we output $m$ distance estimates. Moreover:
- For Hamming distance, our data structure answers any query in $\widetilde O(mk+n)$ time and each estimate deviates from the true distance by at most $\widetilde O(k/e^{\epsilon/\log k})$;
- For edit distance, our data structure answers any query in $\widetilde O(mk^2+n)$ time and each estimate deviates from the true distance by at most $\widetilde O(k/e^{\epsilon/(\log k \log n)})$.
For moderate $k$, both data structures support sublinear query operations. We obtain these results via a novel adaptation of the randomized response technique as a bit flipping procedure, applied to the sketched strings.
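To make the bit-flipping idea concrete, here is a minimal sketch of textbook randomized response applied directly to a raw bit string, with a debiased Hamming distance estimate. It is not the data structure from the abstract, which applies the flips to sketched strings and attains much smaller error; the function names and the choice of flip probability $1/(1+e^{\epsilon})$ (the standard per-bit $\epsilon$-DP setting) are assumptions made for illustration only.

```python
import math
import random


def flip_probability(epsilon: float) -> float:
    """Per-bit flip probability of standard randomized response.

    With p = 1 / (1 + e^epsilon), the ratio (1 - p) / p equals e^epsilon,
    which is the per-bit epsilon-DP condition.
    """
    return 1.0 / (1.0 + math.exp(epsilon))


def randomized_response(bits, epsilon, rng):
    """Return a noisy copy of `bits`, flipping each bit independently."""
    p = flip_probability(epsilon)
    return [b ^ 1 if rng.random() < p else b for b in bits]


def hamming(a, b):
    """Exact Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))


def debiased_hamming(noisy, query, epsilon):
    """Unbiased estimate of the distance between the original string and `query`.

    If the true distance is d, then E[hamming(noisy, query)] = d(1 - 2p) + n p,
    so we invert that affine map.
    """
    n = len(query)
    p = flip_probability(epsilon)
    return (hamming(noisy, query) - n * p) / (1.0 - 2.0 * p)
```

The estimator is only defined for $\epsilon > 0$, since at $\epsilon = 0$ the flip probability is $1/2$ and the noisy string carries no information. Its standard deviation grows like $\sqrt{n}$, which is one reason a sketching step before flipping is attractive.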

Related content


Learning · Point cloud · Sparse · Performance · Sensor

December 20, 2024

Sparse Point Clouds Assisted Learned Image Compression

Yiheng Jiang,Haotian Zhang,Li Li,Dong Liu,Zhu Li

from arxiv, Accepted by TCSVT

In the field of autonomous driving, a variety of sensor data types exist, each representing different modalities of the same scene. Therefore, it is feasible to utilize data from other sensors to facilitate image compression. However, few techniques have explored the potential benefits of utilizing inter-modality correlations to enhance the image compression performance. In this paper, motivated by the recent success of learned image compression, we propose a new framework that uses sparse point clouds to assist in learned image compression in the autonomous driving scenario. We first project the 3D sparse point cloud onto a 2D plane, resulting in a sparse depth map. Utilizing this depth map, we proceed to predict camera images. Subsequently, we use these predicted images to extract multi-scale structural features. These features are then incorporated into the learned image compression pipeline as additional information to improve the compression performance. Our proposed framework is compatible with various mainstream learned image compression models, and we validate our approach using different existing image compression methods. The experimental results show that incorporating point cloud assistance into the compression pipeline consistently enhances the performance.
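The first step described above, projecting a 3D point cloud onto the image plane to obtain a sparse depth map, can be sketched with a plain pinhole camera model. This is an illustrative assumption rather than the authors' pipeline: the intrinsics `fx`, `fy`, `cx`, `cy`, the convention that points are already in the camera frame, and the use of `0.0` for empty pixels are all hypothetical choices.

```python
def project_to_depth_map(points, fx, fy, cx, cy, width, height):
    """Project camera-frame 3D points (x, y, z) to a sparse depth map.

    Pixels that receive no point keep the value 0.0. When several points
    land on the same pixel, the nearest one (smallest z) wins.
    """
    depth = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:
            continue  # point is behind the camera, so it is not visible
        # Pinhole projection: u = fx * x / z + cx, v = fy * y / z + cy.
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            if depth[v][u] == 0.0 or z < depth[v][u]:
                depth[v][u] = float(z)
    return depth
```

Because LiDAR returns are far fewer than image pixels, most entries of the resulting map stay empty, which is why the abstract calls the depth map sparse.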

Rainbow · Mutually independent · Square matrix · Notability · Extensibility

December 19, 2024

Rainbow Arborescence Conjecture

Kristóf Bérczi,Tamás Király,Yutaro Yamaguchi,Yu Yokoi

from arxiv, 9 pages

The famous Ryser--Brualdi--Stein conjecture asserts that every $n \times n$ Latin square contains a partial transversal of size $n-1$. Since its appearance, the conjecture has attracted significant interest, leading to several generalizations. One of the most notable extensions is to matroid intersection given by Aharoni, Kotlar, and Ziv, focusing on the existence of a common independent transversal of the common independent sets of two matroids. In this paper, we study a special case of this setting, the Rainbow Arborescence Conjecture, which states that any graph on $n$ vertices formed by the union of $n-1$ spanning arborescences contains an arborescence using exactly one arc from each. We prove that the computational problem of testing the existence of such an arborescence with a fixed root is NP-complete, verify the conjecture in several cases, and explore relaxed versions of the problem.

MoDELS · Functional · Mutually independent · Smoothness · Category

December 19, 2024

Asymptotic Equivalence for Nonparametric Regression

Ion Grama,Michael Nussbaum

from arxiv, 36 pages, 0 figures

We consider a nonparametric model $\mathcal{E}^{n},$ generated by independent observations $X_{i},$ $i=1,...,n,$ with densities $p(x,\theta_{i}),$ $i=1,...,n,$ the parameters of which $\theta _{i}=f(i/n)\in \Theta $ are driven by the values of an unknown function $f:[0,1]\rightarrow \Theta $ in a smoothness class. The main result of the paper is that, under regularity assumptions, this model can be approximated, in the sense of the Le Cam deficiency pseudodistance, by a nonparametric Gaussian shift model $Y_{i}=\Gamma (f(i/n))+\varepsilon _{i},$ where $\varepsilon_{1},...,\varepsilon _{n}$ are i.i.d. standard normal r.v.'s, the function $\Gamma (\theta ):\Theta \rightarrow \mathrm{R}$ satisfies $\Gamma ^{\prime}(\theta )=\sqrt{I(\theta )}$ and $I(\theta )$ is the Fisher information corresponding to the density $p(x,\theta ).$

Continuity · Operation · Regular · Discretization · Generalization theory

December 18, 2024

The Church Synthesis Problem over Continuous Time

Alexander Rabinovich,Daniel Fattal

The Church Problem asks for the construction of a procedure which, given a logical specification A(I,O) between input omega-strings I and output omega-strings O, determines whether there exists an operator F that implements the specification in the sense that A(I, F(I)) holds for all inputs I. Büchi and Landweber provided a procedure to solve the Church problem for MSO specifications and operators computable by finite-state automata. We investigate a generalization of the Church synthesis problem to the continuous time domain of the non-negative reals. We show that in the continuous time domain there are phenomena which are very different from the canonical discrete time domain of the natural numbers.

Scenario · Identical · CASE · Decomposed · Pairwise

December 18, 2024

Subset Selection Problems in Planar Point Sets

József Balogh,Felix Christian Clemen,Adrian Dumitrescu,Dingyuan Liu

from arxiv, 19 pages, 4 figures, comments are welcome

Given a finite set satisfying condition $\mathcal{A}$, the subset selection problem asks, how large of a subset satisfying condition $\mathcal{B}$ can we find? We make progress on three instances of subset selection problems in planar point sets. Let $n,s\in\mathbb{N}$ with $n\geq s$, and let $P\subseteq\mathbb{R}^2$ be a set of $n$ points, where at most $s$ points lie on the same line. Firstly, we select a general position subset of $P$, i.e., a subset containing no $3$ points on the same line. This problem was proposed by Erd\H{o}s under the regime when $s$ is a constant. For $s$ being non-constant, we give new lower and upper bounds on the maximum size of such a subset. In particular, we show that in the worst case such a set can have size at most $O(n/s)$ when $n^{1/3}\leq s\leq n$ and $O(n^{5/6+o(1)}/\sqrt{s})$ when $3\leq s\leq n^{1/3}$. Secondly, we select a monotone general position subset of $P$, that is, a subset in general position where the points are ordered from left to right and their $y$-coordinates are either non-decreasing or non-increasing. We present bounds on the maximum size of such a subset. In particular, when $s=\Theta(\sqrt{n})$, our upper and lower bounds differ only by a logarithmic factor. Lastly, we select a subset of $P$ with pairwise distinct slopes. This problem was initially studied by Erd\H{o}s, Graham, Ruzsa, and Taylor on the grid. We show that for $s=O(\sqrt{n})$ such a subset of size $\Omega((n/\log{s})^{1/3})$ can always be found in $P$. When $s=\Theta(\sqrt{n})$, this matches a lower bound given by Zhang on the grid. As for the upper bound, we show that in the worst case such a subset has size at most $O(\sqrt{n})$ for $2\leq s\leq n^{3/8}$ and $O((n/s)^{4/5})$ for $n^{3/8}\leq s=O(\sqrt{n})$. The proofs use a wide range of tools such as incidence geometry, probabilistic methods, the hypergraph container method, and additive combinatorics.

Conditionally independent · Mutually independent · Approximation · MoDELS · Performer

December 18, 2024

Conditional Diffusion Models Based Conditional Independence Testing

Yanfeng Yang,Shuai Li,Yingjie Zhang,Zhuoran Sun,Hai Shu,Ziqi Chen,Renming Zhang

from arxiv, 17 pages, 7 figures, aaai 2025

Conditional independence (CI) testing is a fundamental task in modern statistics and machine learning. The conditional randomization test (CRT) was recently introduced to test whether two random variables, $X$ and $Y$, are conditionally independent given a potentially high-dimensional set of random variables, $Z$. The CRT operates exceptionally well under the assumption that the conditional distribution $X|Z$ is known. However, since this distribution is typically unknown in practice, accurately approximating it becomes crucial. In this paper, we propose using conditional diffusion models (CDMs) to learn the distribution of $X|Z$. Theoretically and empirically, it is shown that CDMs closely approximate the true conditional distribution. Furthermore, CDMs offer a more accurate approximation of $X|Z$ compared to GANs, potentially leading to a CRT that performs better than those based on GANs. To accommodate complex dependency structures, we utilize a computationally efficient classifier-based conditional mutual information (CMI) estimator as our test statistic. The proposed testing procedure performs effectively without requiring assumptions about specific distribution forms or feature dependencies, and is capable of handling mixed-type conditioning sets that include both continuous and discrete variables. Theoretical analysis shows that our proposed test achieves a valid control of the type I error. A series of experiments on synthetic data demonstrates that our new test effectively controls both type-I and type-II errors, even in high dimensional scenarios.
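The generic conditional randomization test that the abstract builds on can be sketched as follows. In the paper the sampler for $X|Z$ is a learned conditional diffusion model and the test statistic is a classifier-based CMI estimator; here both are hypothetical placeholder callables (`sample_x_given_z`, `statistic`) supplied by the caller, so this shows only the resampling and p-value logic.

```python
import random


def crt_p_value(x, y, z, sample_x_given_z, statistic, num_resamples, rng):
    """Conditional randomization test p-value.

    x, y, z: equal-length lists of observations.
    sample_x_given_z(z_i, rng): draws one value from (an approximation of) X | Z = z_i.
    statistic(x, y, z): larger values indicate stronger conditional dependence.
    """
    t_obs = statistic(x, y, z)
    exceed = 0
    for _ in range(num_resamples):
        # Resample X from its conditional law; under the null hypothesis
        # (x_null, y, z) has the same distribution as (x, y, z).
        x_null = [sample_x_given_z(z_i, rng) for z_i in z]
        if statistic(x_null, y, z) >= t_obs:
            exceed += 1
    # The +1 terms count the observed statistic itself, which keeps the
    # p-value valid when the sampler matches the true conditional law.
    return (1 + exceed) / (1 + num_resamples)
```

When the sampler only approximates $X|Z$, the validity of the p-value depends on the approximation quality, which is the motivation the abstract gives for using a more accurate conditional generative model.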

Multimodal · Anomaly detection · Point cloud · Extensibility · Link

March 1, 2023

Multimodal Industrial Anomaly Detection via Hybrid Fusion

Yue Wang,Jinlong Peng,Jiangning Zhang,Ran Yi,Yabiao Wang,Chengjie Wang

from arxiv, Accepted by CVPR 2023

2D-based industrial anomaly detection has been widely discussed; however, multimodal industrial anomaly detection based on 3D point clouds and RGB images still has many unexplored areas. Existing multimodal industrial anomaly detection methods directly concatenate the multimodal features, which leads to a strong disturbance between features and harms the detection performance. In this paper, we propose Multi-3D-Memory (M3DM), a novel multimodal anomaly detection method with a hybrid fusion scheme: firstly, we design an unsupervised feature fusion with patch-wise contrastive learning to encourage the interaction of different modal features; secondly, we use a decision layer fusion with multiple memory banks to avoid loss of information and additional novelty classifiers to make the final decision. We further propose a point feature alignment operation to better align the point cloud and RGB features. Extensive experiments show that our multimodal industrial anomaly detection model outperforms the state-of-the-art (SOTA) methods in both detection and segmentation precision on the MVTec-3D AD dataset. Code is available at //github.com/nomewang/M3DM.

Learning · Processing (programming language) · MoDELS · Decomposed · Representation learning

November 21, 2022

Disentangled Representation Learning

Xin Wang,Hong Chen,Si'ao Tang,Zihao Wu,Wenwu Zhu

from arxiv, 22 pages,9 figures

Disentangled Representation Learning (DRL) aims to learn a model capable of identifying and disentangling the underlying factors hidden in the observable data in representation form. The process of separating underlying factors of variation into variables with semantic meaning helps in learning explainable representations of data, which imitates the meaningful understanding process of humans when observing an object or relation. As a general learning strategy, DRL has demonstrated its power in improving the model explainability, controllability, robustness, as well as generalization capacity in a wide range of scenarios such as computer vision, natural language processing, data mining, etc. In this article, we comprehensively review DRL from various aspects including motivations, definitions, methodologies, evaluations, applications and model designs. We discuss works on DRL based on two well-recognized definitions, i.e., Intuitive Definition and Group Theory Definition. We further categorize the methodologies for DRL into four groups, i.e., Traditional Statistical Approaches, Variational Auto-encoder Based Approaches, Generative Adversarial Networks Based Approaches, Hierarchical Approaches and Other Approaches. We also analyze principles to design different DRL models that may benefit different tasks in practical applications. Finally, we point out challenges in DRL as well as potential research directions deserving future investigations. We believe this work may provide insights for promoting the DRL research in the community.

entity · Graph · Knowledge graph · Question answering · MoDELS

October 15, 2019

Efficiently Embedding Dynamic Knowledge Graphs

Tianxing Wu,Arijit Khan,Huan Gao,Cheng Li

from arxiv, 14 pages

Knowledge graph (KG) embedding encodes the entities and relations from a KG into low-dimensional vector spaces to support various applications such as KG completion, question answering, and recommender systems. In the real world, knowledge graphs (KGs) are dynamic and evolve over time with the addition or deletion of triples. However, most existing models focus on embedding static KGs while neglecting dynamics. To adapt to the changes in a KG, these models need to be re-trained on the whole KG with a high time cost. In this paper, to tackle the aforementioned problem, we propose a new context-aware Dynamic Knowledge Graph Embedding (DKGE) method which supports the embedding learning in an online fashion. DKGE introduces two different representations (i.e., knowledge embedding and contextual element embedding) for each entity and each relation, in the joint modeling of entities and relations as well as their contexts, by employing two attentive graph convolutional networks, a gate strategy, and translation operations. This effectively helps limit the impacts of a KG update to certain regions, not the entire graph, so that DKGE can rapidly acquire the updated KG embedding by a proposed online learning algorithm. Furthermore, DKGE can also learn KG embedding from scratch. Experiments on the tasks of link prediction and question answering in a dynamic environment demonstrate the effectiveness and efficiency of DKGE.

Long short-term memory network · Named entity recognition · MoDELS · Better · Gating

May 15, 2018

Chinese NER Using Lattice LSTM

Yue Zhang,Jie Yang

from arxiv, Accepted at ACL 2018 as Long paper

We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.
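The "all potential words that match a lexicon" part can be illustrated by enumerating every multi-character substring of a sentence that appears in a word list. This is a minimal sketch of the lattice construction step only, not the LSTM itself; the function name, the `max_len` cutoff, and the toy lexicon are assumptions for illustration.

```python
def match_lexicon_words(sentence, lexicon, max_len=4):
    """Return (start, end, word) for every lexicon word found in `sentence`.

    `start` and `end` are inclusive character indices. Single characters are
    skipped because the character sequence already covers them.
    """
    matches = []
    n = len(sentence)
    for start in range(n):
        # Candidate words have between 2 and max_len characters.
        for end in range(start + 2, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word in lexicon:
                matches.append((start, end - 1, word))
    return matches


# Toy example: overlapping candidates such as "南京市" and "市长" are all kept,
# leaving the choice among them to the downstream model.
toy_lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
toy_matches = match_lexicon_words("南京市长江大桥", toy_lexicon)
```

Keeping every overlapping match, rather than committing to one segmentation up front, is what lets a lattice model avoid the segmentation errors mentioned in the abstract.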