
The modular subset sum problem consists of deciding, given a modulus $m$, a multiset $S$ of $n$ integers in $0..m-1$, and a target integer $t$, whether there exists a subset of $S$ whose elements sum to $t \pmod{m}$, and of reporting such a subset if it exists. We give a simple algorithm that solves modular subset sum in $O(m \log m)$ time with high probability (w.h.p.). This builds on and improves upon a previous $O(m \log^7 m)$ w.h.p. algorithm of Axiotis, Backurs, Jin, Tzamos, and Wu (SODA 19). Our method uses the ADT of the dynamic strings structure of Gawrychowski et al. (SODA 18). However, as this structure is rather complicated, we present a much simpler alternative which we call the Data Dependent Tree. As an application, we consider the computational version of a fundamental theorem in zero-sum Ramsey theory. The Erd\H{o}s-Ginzburg-Ziv Theorem states that a multiset of $2n - 1$ integers always contains a subset of cardinality exactly $n$ whose values sum to a multiple of $n$. We give an algorithm that finds such a subset in $O(n \log n)$ time w.h.p., improving on an $O(n^2)$ algorithm due to Del Lungo, Marini, and Mori (Disc. Math. 09).
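To make the problem concrete, here is a minimal baseline: the textbook $O(nm)$ dynamic program over residues, with parent pointers for witness recovery. This is only a reference point, not the $O(m \log m)$ algorithm described above; the function name and structure are ours.

```python
# Baseline O(n*m) dynamic program for modular subset sum (reference only,
# not the paper's O(m log m) algorithm).
def modular_subset_sum(S, m, t):
    """Return a subset of S summing to t mod m, or None if none exists.
    Note: the empty subset counts for t = 0 (mod m)."""
    # parent[r] = (previous residue, element used) for the first time r is reached
    parent = {0: None}
    for x in S:
        # Snapshot current residues so each element is used at most once.
        for r in list(parent):
            nr = (r + x) % m
            if nr not in parent:
                parent[nr] = (r, x)
    if t % m not in parent:
        return None
    subset, r = [], t % m
    while parent[r] is not None:
        prev, x = parent[r]
        subset.append(x)
        r = prev
    return subset

print(modular_subset_sum([3, 5, 7], 10, 2))  # [7, 5], since 7 + 5 = 12 ≡ 2 (mod 10)
```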

Related content

In mathematics, a multiset is a modification of the concept of a set that, unlike a set, allows multiple instances of each element. The positive integer number of instances given for an element is called the multiplicity of that element in the multiset. As a consequence, there exist infinitely many multisets that contain only the elements a and b but differ in the multiplicities of their elements: (1) the set {a, b} contains only the elements a and b, each with multiplicity 1 when {a, b} is viewed as a multiset; (2) in the multiset {a, a, b}, the element a has multiplicity 2 and b has multiplicity 1; (3) in the multiset {a, a, a, b, b, b}, both a and b have multiplicity 3.
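As a small illustration, Python's `collections.Counter` is a standard way to represent multisets, with values recording multiplicities:

```python
from collections import Counter

# Multisets as Counters: keys are elements, values are multiplicities.
a = Counter({'a': 1, 'b': 1})   # the set {a, b} viewed as a multiset
b = Counter({'a': 2, 'b': 1})   # {a, a, b}: a has multiplicity 2
c = Counter({'a': 3, 'b': 3})   # {a, a, a, b, b, b}

print(b['a'])                                  # 2 -- multiplicity of a in {a, a, b}
print(sorted((b + Counter('b')).elements()))   # ['a', 'a', 'b', 'b']
```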

We give a fully dynamic algorithm that maintains a $(1-\varepsilon)$-approximate directed densest subgraph in $\tilde{O}(\log^3(n)/\varepsilon^6)$ amortized time or $\tilde{O}(\log^4(n)/\varepsilon^7)$ worst-case time per edge update (where $\tilde{O}$ hides $\log\log$ factors), building on earlier work by Chekuri and Quanrud [arXiv:2210.02611]. This result improves on earlier work by Sawlani and Wang [arXiv:1907.03037], which guarantees $O(\log^5(n)/\varepsilon^7)$ worst-case time per edge insertion or deletion.
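For context, the sketch below shows the classic static greedy peeling routine, a 2-approximation for the undirected densest subgraph. It is a much weaker baseline than the dynamic directed $(1-\varepsilon)$-approximation discussed above and is included only to fix ideas; the data layout is our choice.

```python
import heapq

# Classic greedy peeling: repeatedly remove a minimum-degree vertex and
# remember the densest prefix seen. A static 2-approximation baseline.
def densest_subgraph_peel(adj):
    """adj: dict node -> set of neighbours (undirected, no self-loops)."""
    deg = {v: len(ns) for v, ns in adj.items()}
    alive = set(adj)
    edges = sum(deg.values()) // 2
    heap = [(d, v) for v, d in deg.items()]
    heapq.heapify(heap)
    best_density, best_size, removed = 0.0, len(alive), []
    while alive:
        density = edges / len(alive)
        if density > best_density:
            best_density, best_size = density, len(alive)
        d, v = heapq.heappop(heap)
        if v not in alive or d != deg[v]:
            continue  # stale heap entry
        alive.discard(v)
        removed.append(v)
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
                edges -= 1
                heapq.heappush(heap, (deg[u], u))
    # The best subgraph is everything except the first (n - best_size) removals.
    return best_density, set(adj) - set(removed[: len(adj) - best_size])

adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}  # triangle plus a pendant
print(densest_subgraph_peel(adj))  # (1.0, {1, 2, 3, 4})
```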

Recent work by Dhulipala, Liu, Raskhodnikova, Shi, Shun, and Yu~\cite{DLRSSY22} initiated the study of the $k$-core decomposition problem under differential privacy. They show that approximate $k$-core numbers can be output while guaranteeing differential privacy, incurring a multiplicative error of $(2+\eta)$ (for any constant $\eta > 0$) and an additive error of $\mathrm{poly}(\log n)/\varepsilon$. In this paper, we revisit this problem. Our main result is an $\varepsilon$-edge differentially private algorithm for $k$-core decomposition which outputs the core numbers with no multiplicative error and $O(\log(n)/\varepsilon)$ additive error. This improves upon previous work by a factor of 2 in the multiplicative error, while giving near-optimal additive error. With a little additional work, this implies improved algorithms for densest subgraph and low out-degree ordering under differential privacy. For low out-degree ordering, we give an $\varepsilon$-edge differentially private algorithm which outputs an implicit orientation such that the out-degree of each vertex is at most $d + O(\log(n)/\varepsilon)$, where $d$ is the degeneracy of the graph. This improves upon the best known guarantees for the problem by a factor of $4$ and gives near-optimal additive error. For densest subgraph, we give an $\varepsilon$-edge differentially private algorithm outputting a subset of nodes that induces a subgraph of density at least $D^*/2 - O(\log(n)/\varepsilon)$, where $D^*$ is the density of the optimal subgraph.
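As a point of reference, the exact (non-private) $k$-core decomposition can be computed by min-degree peeling, as in the sketch below; the private algorithm's calibrated-noise mechanism is beyond a short snippet, so only the exact baseline is shown.

```python
import heapq

# Exact (non-private) k-core decomposition by min-degree peeling:
# the core number of v is the running maximum of removal degrees
# up to v's removal in a minimum-degree peeling order.
def core_numbers(adj):
    """adj: dict node -> set of neighbours. Returns dict node -> core number."""
    deg = {v: len(ns) for v, ns in adj.items()}
    heap = [(d, v) for v, d in deg.items()]
    heapq.heapify(heap)
    core, seen, k = {}, set(), 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in seen or d != deg[v]:
            continue  # stale heap entry
        seen.add(v)
        k = max(k, d)
        core[v] = k
        for u in adj[v]:
            if u not in seen:
                deg[u] -= 1
                heapq.heappush(heap, (deg[u], u))
    return core

adj = {'a': {'b', 'c'}, 'b': {'a', 'c'}, 'c': {'a', 'b', 'd'}, 'd': {'c'}}
print(core_numbers(adj))  # triangle nodes get core 2, the pendant 'd' gets 1
```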

The two-parameter Mittag-Leffler function $E_{\alpha, \beta}$ is of fundamental importance in fractional calculus. It appears frequently in the solutions of fractional differential and integral equations. Nonetheless, this vital function is often expensive to compute, and several attempts have been made to construct cost-effective and accurate approximations. These attempts focus mainly on the completely monotone Mittag-Leffler functions. However, when $\alpha > 1$ the monotonicity property is largely lost, and roots and oscillations are exhibited. Consequently, existing approximants constructed mainly for $\alpha \in (0,1)$ often fail to capture this oscillatory behavior. In this paper, we construct computationally efficient and accurate rational approximants for $E_{\alpha, \beta}(-t)$, $t \ge 0$, with $\alpha \in (1,2)$. The construction rests on decomposing a Mittag-Leffler function with real roots into one without real roots plus a polynomial; new approximants are then obtained by combining the global Pad\'e approximation with a polynomial of appropriate degree. The rational approximants are extended to the approximation of the matrix Mittag-Leffler function, and different approaches to achieve efficient implementation for matrix arguments are discussed. Numerical experiments illustrate the significant accuracy improvement achieved by the proposed approximants.
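For concreteness, the defining power series $E_{\alpha,\beta}(z)=\sum_{k\ge 0} z^k/\Gamma(\alpha k+\beta)$ can be evaluated directly for moderate arguments, as in the minimal sketch below; its impracticality beyond that regime is precisely what motivates rational approximants.

```python
from math import gamma, exp

def ml_series(z, alpha, beta, tol=1e-15, max_terms=200):
    """Truncated power series for E_{alpha,beta}(z); moderate |z| only."""
    s = 0.0
    for k in range(max_terms):
        g = alpha * k + beta
        if g > 170:          # math.gamma overflows past ~171
            break
        term = z**k / gamma(g)
        s += term
        if abs(term) < tol * max(1.0, abs(s)):
            break
    return s

# Sanity checks: E_{1,1}(z) = exp(z); a sign change appears for alpha in (1,2)
print(abs(ml_series(1.0, 1.0, 1.0) - exp(1.0)) < 1e-12)   # True
print(ml_series(-5.0, 1.5, 1.0))                          # about -0.30: oscillatory
```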

We characterize the equilibrium properties of a model of $y$ coupled binary perceptrons in the teacher-student scenario, subject to a learning rule, with an explicit ferromagnetic coupling proportional to the Hamming distance between the students' weights. In contrast to recent works, we analyze a more general setting in which thermal noise is present and affects each student's generalization performance. In the nonzero-temperature regime, we find that the coupling of replicas bends the phase diagram towards smaller values of $\alpha$: this suggests that, at a fixed fraction of examples, the free energy landscape becomes smoother around the solution with perfect generalization (i.e., the teacher's), allowing standard thermal updates such as Simulated Annealing to reach the teacher solution easily and to avoid the entrapment in metastable states that occurs in the unreplicated case, even in the so-called computationally easy regime. These results provide additional analytic and numerical evidence for the recently conjectured Bayes-optimality of Replicated Simulated Annealing (RSA) given a sufficient number of replicas. From a learning perspective, these results also suggest that multiple students working together (in this case, reviewing the same data) are able to learn the same rule both significantly faster and with fewer examples, a property that could be exploited in the context of cooperative and federated learning.
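As an illustration only (not the paper's exact Hamiltonian), the sketch below writes down an energy for $y$ replicated binary students with a ferromagnetic penalty on pairwise Hamming distances; the coupling strength `gamma_` and the 0/1 training error are our assumptions.

```python
import numpy as np

# Illustrative energy of y replicated +/-1 students on teacher-labelled data,
# plus a ferromagnetic coupling penalising Hamming distance between students.
rng = np.random.default_rng(0)
N, P, y, gamma_ = 101, 300, 3, 0.5      # N odd so sign(.) is never 0
teacher = rng.choice([-1, 1], size=N)
X = rng.choice([-1, 1], size=(P, N))
labels = np.sign(X @ teacher)

def energy(students):
    """students: (y, N) array of +/-1 weights."""
    train_err = sum(np.sum(np.sign(X @ w) != labels) for w in students)
    ham = sum(np.sum(students[a] != students[b])       # pairwise Hamming distances
              for a in range(y) for b in range(a + 1, y))
    return train_err + gamma_ * ham

students = rng.choice([-1, 1], size=(y, N))
print(energy(students))  # energy a thermal update rule (e.g. RSA) would lower
```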

Given a collection of $m$ sets from a universe $\mathcal{U}$, the Maximum Set Coverage problem consists of finding $k$ sets whose union has the largest cardinality. This problem is NP-hard, but its solution can be approximated in polynomial time up to a factor of $1-1/e$. However, this algorithm does not scale well with the input size. In a streaming context, practical high-quality solutions are found, but with a space complexity that scales linearly with the size of the universe $|\mathcal{U}|$. However, one randomized streaming algorithm has been shown to produce a $1-1/e-\varepsilon$ approximation of the optimal solution with a space complexity that scales only poly-logarithmically with respect to $m$ and $|\mathcal{U}|$. To achieve such a low space complexity, the authors used a technique called subsampling, based on limited-independence hash functions and $F_0$-sketching. This article focuses on this sublinear-space algorithm and introduces methods to reduce the time cost of subsampling. Firstly, we give optimizations that do not alter the space complexity, number of passes, or approximation quality of the original algorithm. In particular, we reanalyze the error bounds to show that the original independence factor of $\Omega(\varepsilon^{-2} k \log m)$ can be fine-tuned to $\Omega(k \log m)$. Secondly, we show that $F_0$-sketching can be replaced by a much simpler mechanism. Finally, our experimental results show that even a pairwise-independent hash-function sampler produces solutions no worse than those of the original algorithm, while running several orders of magnitude faster.
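For reference, the classic offline greedy algorithm achieving the $1-1/e$ guarantee is a few lines; the streaming algorithm above replaces its exact marginal-gain computations with sketched, subsampled estimates.

```python
# Classic offline greedy for Maximum Set Coverage: pick the set with the
# largest marginal gain, k times. Achieves the (1 - 1/e) guarantee.
def greedy_max_coverage(sets, k):
    covered, chosen = set(), []
    for _ in range(k):
        best = max(range(len(sets)), key=lambda i: len(sets[i] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds new elements
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 5}]
print(greedy_max_coverage(sets, 2))  # ([2, 0], {1, 2, 3, 4, 5, 6, 7})
```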

We consider the problem of learning a graph that models the statistical relations among $d$ variables from a dataset of $n$ samples $X \in \mathbb{R}^{n \times d}$. Standard approaches amount to searching for a precision matrix $\Theta$ representative of a Gaussian graphical model that adequately explains the data. However, most maximum-likelihood-based estimators usually require storing the $d^{2}$ values of the empirical covariance matrix, which can become prohibitive in a high-dimensional setting. In this work, we adopt a compressive viewpoint and aim to estimate a sparse $\Theta$ from a \emph{sketch} of the data, i.e., a low-dimensional vector of size $m \ll d^{2}$ carefully designed from $X$ using non-linear random features. Under certain assumptions on the spectrum of $\Theta$ (or on its condition number), we show that it is possible to estimate it from a sketch of size $m=\Omega\left((d+2k)\log(d)\right)$, where $k$ is the maximal number of edges of the underlying graph. These information-theoretic guarantees are inspired by compressed sensing theory and involve restricted isometry properties and instance-optimal decoders. We investigate the possibility of achieving practical recovery with an iterative algorithm based on the graphical lasso, viewed as a specific denoiser. We compare our approach with the graphical lasso on synthetic datasets, demonstrating its favorable performance even when the dataset is compressed.
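The uncompressed baseline is the classical graphical lasso fitted from the full dataset, e.g. via scikit-learn as sketched below; the approach above replaces the $d^2$-sized empirical covariance with an $m$-dimensional sketch. The regularization strength here is an arbitrary choice of ours, and scikit-learn is assumed to be available.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Uncompressed baseline: fit a sparse precision matrix Theta from the full
# n x d dataset with the graphical lasso.
rng = np.random.default_rng(0)
d = 10
# Ground-truth sparse precision matrix: tridiagonal, positive definite.
theta = np.eye(d) + 0.3 * (np.diag(np.ones(d - 1), 1) + np.diag(np.ones(d - 1), -1))
X = rng.multivariate_normal(np.zeros(d), np.linalg.inv(theta), size=2000)

model = GraphicalLasso(alpha=0.05).fit(X)
print(np.round(model.precision_, 2))  # estimate of the sparse Theta
```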

This paper studies robust nonparametric regression, in which an adversarial attacker can modify the values of up to $q$ samples from a training dataset of size $N$. Our initial solution is an M-estimator based on Huber loss minimization. Compared with simple kernel regression, i.e., the Nadaraya-Watson estimator, this method can significantly weaken the impact of malicious samples on the regression performance. We provide the convergence rate as well as the corresponding minimax lower bound. The results show that, with proper bandwidth selection, the $\ell_\infty$ error is minimax optimal. The $\ell_2$ error is optimal for relatively small $q$ but suboptimal for larger $q$, because this estimator is vulnerable when many attacked samples concentrate in a small region. To address this issue, we propose a correction method that projects the initial estimate onto the space of Lipschitz functions. The final estimate is nearly minimax optimal for arbitrary $q$, up to a $\ln N$ factor.
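A minimal sketch of the first estimator's idea: a kernel-localized Huber M-estimate at a query point, solved by a few iteratively reweighted least-squares steps. The bandwidth, Huber threshold, and iteration count below are illustrative choices of ours, not the paper's tuned values.

```python
import numpy as np

def huber_kernel_estimate(X, y, x0, h=0.1, delta=1.0, iters=50):
    """Local-constant Huber M-estimate at x0 via IRLS."""
    w_kern = np.exp(-0.5 * ((x0 - X) / h) ** 2)    # Gaussian kernel weights
    theta = np.sum(w_kern * y) / np.sum(w_kern)    # Nadaraya-Watson start
    for _ in range(iters):
        r = np.abs(y - theta)
        w_hub = np.minimum(1.0, delta / np.maximum(r, 1e-12))  # psi(r)/r
        w = w_kern * w_hub
        theta = np.sum(w * y) / np.sum(w)
    return theta

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, 500)
y = np.sin(2 * np.pi * X) + 0.1 * rng.normal(size=500)
y[:10] = 100.0                              # a few adversarially corrupted responses
print(huber_kernel_estimate(X, y, 0.5))     # near sin(pi) = 0 despite outliers
```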

We consider the Distinct Shortest Walks problem: given two vertices $s$ and $t$ of a graph database $\mathcal{D}$ and a regular path query, enumerate all walks of minimal length from $s$ to $t$ that carry a label conforming to the query. Usual theoretical solutions turn out to be inefficient when applied to graph models that are closer to real-life systems, in particular because edges may carry multiple labels; indeed, known algorithms may repeat the same answer exponentially many times. We propose an efficient algorithm for multi-labelled graph databases. The preprocessing runs in $O(|\mathcal{D}|\times|\mathcal{A}|)$ and the delay between two consecutive outputs is in $O(\lambda\times|\mathcal{A}|)$, where $\mathcal{A}$ is a nondeterministic automaton representing the query and $\lambda$ is the minimal length. The algorithm can handle $\varepsilon$-transitions in $\mathcal{A}$ or queries given as regular expressions at no additional cost.
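The standard ingredient is a BFS over the product of the database and the automaton, which yields the minimal length $\lambda$; the sketch below shows only this product construction (the paper's contribution is enumerating all distinct shortest walks with the stated delay).

```python
from collections import deque

# BFS over the product (graph node, NFA state) to find the minimal length
# of a walk from s to t whose label is accepted by the automaton.
def min_walk_length(edges, nfa, s, t, q0, accepting):
    """edges: dict u -> list of (label, v); nfa: dict (q, label) -> set of q'."""
    start = (s, q0)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u, q = queue.popleft()
        if u == t and q in accepting:
            return dist[(u, q)]
        for label, v in edges.get(u, []):
            for q2 in nfa.get((q, label), ()):
                if (v, q2) not in dist:
                    dist[(v, q2)] = dist[(u, q)] + 1
                    queue.append((v, q2))
    return None  # no conforming walk

# Example: query a*b over a tiny multi-labelled graph
edges = {'s': [('a', 's'), ('a', 'u')], 'u': [('b', 't'), ('a', 't')]}
nfa = {(0, 'a'): {0}, (0, 'b'): {1}}
print(min_walk_length(edges, nfa, 's', 't', 0, {1}))  # 2: s -a-> u -b-> t
```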

Multi-distribution learning (MDL), which seeks a shared model that minimizes the worst-case risk across $k$ distinct data distributions, has emerged as a unified framework in response to the evolving demand for robustness, fairness, multi-group collaboration, etc. Achieving data-efficient MDL necessitates adaptive sampling, also called on-demand sampling, throughout the learning process. However, there exist substantial gaps between the state-of-the-art upper and lower bounds on the optimal sample complexity. Focusing on a hypothesis class of Vapnik-Chervonenkis (VC) dimension $d$, we propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$ (modulo some logarithmic factor), matching the best-known lower bound. Our algorithmic ideas and theory are further extended to accommodate Rademacher classes. The proposed algorithms are oracle-efficient: they access the hypothesis class solely through an empirical risk minimization oracle. Additionally, we establish the necessity of randomization, unveiling a large sample-size barrier when only deterministic hypotheses are permitted. These findings resolve three open problems presented in COLT 2023 (Awasthi et al., 2023, Problems 1, 3, and 4).
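A generic template in this area, shown below purely for orientation, maintains multiplicative weights over the $k$ distributions, samples on demand from the currently hardest mixture, and calls an ERM oracle. This is the standard scaffold rather than the sample-optimal algorithm above; `sample`, `erm_oracle`, and `risk` are user-supplied stand-ins.

```python
import numpy as np

# Multiplicative-weights scaffold for multi-distribution learning.
def mdl_multiplicative_weights(sample, erm_oracle, risk, k, rounds=100, eta=0.1):
    w = np.ones(k) / k
    hypotheses = []
    for _ in range(rounds):
        p = w / w.sum()
        data = sample(p)              # on-demand sampling from mixture p
        h = erm_oracle(data)          # empirical risk minimization oracle call
        hypotheses.append(h)
        losses = np.array([risk(h, i) for i in range(k)])
        w *= np.exp(eta * losses)     # up-weight distributions where h is bad
    return hypotheses                 # randomized hypothesis: uniform over these
```

Returning the uniform mixture over the collected hypotheses, rather than a single one, reflects the necessity of randomization established above.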

This paper studies the joint community detection and phase synchronization problem on the \textit{stochastic block model with relative phase}, where each node is associated with an unknown phase angle. This problem, which has a variety of real-world applications, aims to recover the cluster structure and the associated phase angles simultaneously. By closely examining its maximum likelihood estimation (MLE) formulation, we show that this problem exhibits a \textit{``multi-frequency''} structure, a perspective from which existing methods do not originate. To this end, we propose two simple yet efficient algorithms that leverage the MLE formulation and benefit from the information across multiple frequencies. The first is a spectral method based on a novel multi-frequency column-pivoted QR factorization; applied to the top eigenvectors of the observation matrix, the factorization provides key information about the cluster structure and the associated phase angles. The second is an iterative multi-frequency generalized power method, in which each iteration updates the estimate in a matrix-multiplication-then-projection manner. Numerical experiments show that, compared to state-of-the-art algorithms, our proposed algorithms significantly improve the ability to exactly recover the cluster structure and the accuracy of the estimated phase angles.
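A single-frequency illustration of the CPQR-on-eigenvectors primitive is sketched below: column-pivoted QR applied to the top-$k$ eigenvectors selects anchor nodes from which clusters are read off. The paper's method operates across multiple frequencies and also recovers phase angles, both of which this toy example ignores.

```python
import numpy as np
from scipy.linalg import qr

# CPQR on eigenvectors for a noisy symmetric block model (single frequency,
# no phases): pick k anchor rows, assign each node to its closest anchor.
rng = np.random.default_rng(0)
n, k = 60, 3
labels_true = np.repeat(np.arange(k), n // k)
A = (labels_true[:, None] == labels_true[None, :]).astype(float)
A += 0.4 * rng.standard_normal((n, n))
A = (A + A.T) / 2                       # noisy symmetric observation matrix

vals, vecs = np.linalg.eigh(A)
V = vecs[:, -k:]                        # top-k eigenvectors, shape (n, k)
_, _, piv = qr(V.T, pivoting=True)      # column-pivoted QR selects spread-out rows
anchors = V[piv[:k]]                    # (k, k) anchor block
labels = np.argmax(np.abs(V @ anchors.T), axis=1)
print(labels)                           # clusters up to a relabelling of 0..k-1
```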
