We study the question of whether submodular functions of random variables satisfying various notions of negative dependence satisfy Chernoff-like concentration inequalities. We prove such a concentration inequality for the lower tail when the random variables satisfy negative association or negative regression, partially resolving an open problem raised in (Qiu and Singla [QS22]). Previous work showed such concentration results for random variables that come from specific dependent-rounding algorithms (Chekuri, Vondrak, and Zenklusen [CVZ10] and Harvey and Olver [HO14]). We discuss some applications of our results to combinatorial optimization and beyond. We also show applications to the concentration of read-k families [Gav+15] under certain forms of negative dependence; we further show a simplified proof of the entropy-method approach of [Gav+15].
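For readers unfamiliar with the terms, the standard definitions the abstract relies on can be written out as follows. This is a sketch only: the symbols $f$, $g$, $h$, $I$, $J$, $\mu$, $\delta$ and the unspecified constant $c > 0$ are notation assumed here, and the exact conditions and constants of the paper's theorem are not given in the abstract.

```latex
% Submodularity of a set function f : 2^{[n]} \to \mathbb{R}
f(A \cup B) + f(A \cap B) \le f(A) + f(B)
\quad \text{for all } A, B \subseteq [n].

% Negative association of random variables X_1, \dots, X_n
\mathbb{E}\bigl[g(X_I)\, h(X_J)\bigr] \le \mathbb{E}\bigl[g(X_I)\bigr]\, \mathbb{E}\bigl[h(X_J)\bigr]
\quad \text{for all disjoint } I, J \subseteq [n] \text{ and nondecreasing } g, h.

% General shape of a Chernoff-like lower-tail bound,
% with \mu = \mathbb{E}[f(X)], 0 < \delta < 1, and some constant c > 0
\Pr\bigl[f(X) \le (1 - \delta)\,\mu\bigr] \le \exp\bigl(-c\,\delta^{2}\mu\bigr).
```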

Related content

Random variables

MoDELS · INFORMS · Estimation/Estimator · Performer · Covariance matrix

November 5, 2024

Assessment of Misspecification in CDMs Using a Generalized Information Matrix Test

Reyhaneh Hosseinpourkhoshkbari, Richard M. Golden

If the probability model is correctly specified, then we can estimate the covariance matrix of the asymptotic maximum likelihood estimate distribution using either the first or second derivatives of the likelihood function. Therefore, if the determinants of these two different covariance matrix estimation formulas differ, this indicates model misspecification. This misspecification detection strategy is the basis of the Determinant Information Matrix Test ($GIMT_{Det}$). To investigate the performance of the $GIMT_{Det}$, a Deterministic Input Noisy And gate (DINA) Cognitive Diagnostic Model (CDM) was fit to the Fraction-Subtraction dataset. Next, various misspecified versions of the original DINA CDM were fit to bootstrap data sets generated by sampling from the original fitted DINA CDM. The $GIMT_{Det}$ showed good discrimination performance for larger levels of misspecification. In addition, the $GIMT_{Det}$ did not detect model misspecification when model misspecification was not present and additionally did not detect model misspecification when the level of misspecification was very low. However, the $GIMT_{Det}$ discrimination performance was highly variable across different misspecification strategies when the misspecification level was moderately sized. The proposed new misspecification detection methodology is promising but additional empirical studies are required to further characterize its strengths and limitations.
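The idea of comparing first-derivative and second-derivative covariance estimates can be illustrated on a much simpler model than a DINA CDM. The sketch below is not the authors' $GIMT_{Det}$ statistic: it assumes a two-parameter normal model, and the helper name `logdet_gap` and the choice of a plain log-determinant difference are illustrative assumptions.

```python
import numpy as np

def logdet_gap(x):
    """Log-determinant gap between two covariance estimates for a normal
    model N(mu, var) fitted by maximum likelihood (illustrative only)."""
    x = np.asarray(x, dtype=float)
    mu, var = x.mean(), x.var()          # maximum likelihood estimates
    r = x - mu
    # Per-observation gradient of the negative log-likelihood w.r.t. (mu, var).
    scores = np.column_stack([-r / var, 0.5 / var - 0.5 * r**2 / var**2])
    # First-derivative (outer-product-of-gradients) information estimate.
    B = scores.T @ scores / x.size
    # Second-derivative (Hessian) information estimate, averaged over the data.
    A = np.array([
        [1.0 / var, r.mean() / var**2],
        [r.mean() / var**2, -0.5 / var**2 + (r**2).mean() / var**3],
    ])
    # The covariance estimates are the inverses of A and B, so
    # log det(A^-1) - log det(B^-1) = log det(B) - log det(A).
    return np.linalg.slogdet(B)[1] - np.linalg.slogdet(A)[1]

rng = np.random.default_rng(0)
gap_correct = logdet_gap(rng.normal(size=20_000))   # model matches the data
gap_wrong = logdet_gap(rng.laplace(size=20_000))    # heavier tails than normal
print(f"correctly specified: {gap_correct:+.3f}")
print(f"misspecified:        {gap_wrong:+.3f}")
```

When the normal model is correct the two estimates agree and the gap stays near zero; with Laplace data the fourth-moment mismatch pushes the gap clearly away from zero.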

Functional · Oracle · MoDELS · Identical · Scenario

November 4, 2024

Unclonable Cryptography with Unbounded Collusions and Impossibility of Hyperefficient Shadow Tomography

Alper Çakan, Vipul Goyal

from arxiv, Theory of Cryptography Conference (TCC) 2024. Full version with proofs

Quantum no-cloning theorem gives rise to the intriguing possibility of quantum copy protection where we encode a program or functionality in a quantum state such that a user in possession of k copies cannot create k+1 copies, for any k. Introduced by Aaronson (CCC'09) over a decade ago, copy protection has proven to be notoriously hard to achieve. Previous work has been able to achieve copy-protection for various functionalities only in restricted models: (i) in the bounded collusion setting where k -> k+1 security is achieved for a-priori fixed collusion bound k (in the plain model with the same computational assumptions as ours, by Liu, Liu, Qian, Zhandry [TCC'22]), or, (ii) only k -> 2k security is achieved (relative to a structured quantum oracle, by Aaronson [CCC'09]). In this work, we give the first unbounded collusion-resistant (i.e. multiple-copy secure) copy-protection schemes, answering the long-standing open question of constructing such schemes, raised by multiple previous works starting with Aaronson (CCC'09). More specifically, we obtain the following results.
- We construct (i) public-key encryption, (ii) public-key functional encryption, (iii) signature and (iv) pseudorandom function schemes whose keys are copy-protected against unbounded collusions in the plain model (i.e. without any idealized oracles), assuming (post-quantum) subexponentially secure iO and LWE.
- We show that any unlearnable functionality can be copy-protected against unbounded collusions, relative to a classical oracle.
- As a corollary of our results, we rule out the existence of hyperefficient quantum shadow tomography,
  * even given non-black-box access to the measurements, assuming subexponentially secure iO and LWE, or,
  * unconditionally relative to a quantumly accessible classical oracle,
  and hence answer an open question by Aaronson (STOC'18).

General Dynamics · Logistic regression · Commentator · Less · Separable

November 4, 2024

Gradient Descent on Logistic Regression with Non-Separable Data and Large Step Sizes

Si Yi Meng, Antonio Orvieto, Daniel Yiming Cao, Christopher De Sa

We study gradient descent (GD) dynamics on logistic regression problems with large, constant step sizes. For linearly-separable data, it is known that GD converges to the minimizer with arbitrarily large step sizes, a property which no longer holds when the problem is not separable. In fact, the behaviour can be much more complex -- a sequence of period-doubling bifurcations begins at the critical step size $2/\lambda$, where $\lambda$ is the largest eigenvalue of the Hessian at the solution. Using a smaller-than-critical step size guarantees convergence if initialized near the solution: but does this suffice globally? In one dimension, we show that a step size less than $1/\lambda$ suffices for global convergence. However, for all step sizes between $1/\lambda$ and the critical step size $2/\lambda$, one can construct a dataset such that GD converges to a stable cycle. In higher dimensions, this is actually possible even for step sizes less than $1/\lambda$. Our results show that although local convergence is guaranteed for all step sizes less than the critical step size, global convergence is not, and GD may instead converge to a cycle depending on the initialization.
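The role of the thresholds $1/\lambda$ and $2/\lambda$ can be seen on a tiny one-dimensional problem. The following is a minimal sketch, not the paper's construction: the two-point dataset `Z`, the starting point, and the helper names are all assumptions chosen so that the minimizer and curvature are easy to check by hand.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# Toy 1-D data written as z_i = y_i * x_i. Mixed signs make it non-separable.
Z = [1.0, -1.0]

def grad(w):
    # Derivative of the loss mean_i log(1 + exp(-z_i * w)).
    return sum(-z * sigmoid(-z * w) for z in Z) / len(Z)

def hess(w):
    # Second derivative of the same loss.
    return sum(z * z * sigmoid(z * w) * sigmoid(-z * w) for z in Z) / len(Z)

def minimizer(w=0.0, iters=50):
    # Newton's method; the 1-D loss is strictly convex for this data.
    for _ in range(iters):
        w -= grad(w) / hess(w)
    return w

def run_gd(step, w0, iters=500):
    w = w0
    for _ in range(iters):
        w -= step * grad(w)
    return w

w_star = minimizer()
lam = hess(w_star)                     # curvature (Hessian) at the solution
w_small = run_gd(0.9 / lam, w0=3.0)    # step size below 1/lambda
w_large = run_gd(2.5 / lam, w0=3.0)    # step size above the critical 2/lambda
print("minimizer:", w_star)
print("GD, step 0.9/lambda:", w_small)
print("GD, step 2.5/lambda:", w_large)
```

With the smaller step the iterate lands on the minimizer, while the larger step keeps it bouncing between two points away from it, which is the kind of cycling behaviour the abstract describes past the critical step size.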

Analysis · Computational cost · Cost · Feasible · Principle

November 4, 2024

Cost-Gain Analysis of Sequence Selection for Nonlinearity Mitigation

Stella Civelli, Marco Secondini

from arxiv, The manuscript has been submitted for publication at the optical fiber communication (OFC) conference 2025

We propose a low-complexity sign-dependent metric for sequence selection and study the nonlinear shaping gain achievable for a given computational cost, establishing a benchmark for future research. Small gains are obtained with feasible complexity. Higher gains are achievable in principle, but with high complexity or a more sophisticated metric.

ReLU · Functional · Biased · Learning · Gaussian distribution

November 4, 2024

Agnostic Learning of General ReLU Activation Using Gradient Descent

Pranjal Awasthi, Alex Tang, Aravindan Vijayaraghavan

from arxiv, 28 pages

We provide a convergence analysis of gradient descent for the problem of agnostically learning a single ReLU function with moderate bias under Gaussian distributions. Unlike prior work that studies the setting of zero bias, we consider the more challenging scenario when the bias of the ReLU function is non-zero. Our main result establishes that starting from random initialization, in a polynomial number of iterations gradient descent outputs, with high probability, a ReLU function that achieves an error that is within a constant factor of the optimal error of the best ReLU function with moderate bias. We also provide finite sample guarantees, and these techniques generalize to a broader class of marginal distributions beyond Gaussians.

Performer · Processing (programming language) · Identical · Node · Paper

November 2, 2024

An Implementation and Experimental Comparison of Dynamic Ordered Sets

Jordan Malek

from arxiv, 6 Chapters, 71 Pages

It is becoming increasingly difficult to improve the performance of a single process (thread) on a computer due to physical limitations. Modern systems use multi-core processors in which multiple processes (threads) may run concurrently. A lock-free data structure can allow these processes to communicate with each other without requiring mutual exclusion, and may increase the amount of work they may perform in parallel rather than sequentially, thus improving the performance of the system as a whole. This paper contains an implementation of Ko's Lock-Free Binary Trie, which stores a dynamic set of keys from an ordered universe. It supports insert, remove, search and predecessor operations. One novel component of this implementation is a lock-free linked list which allows multiple processes to attempt to insert the same node, but which prevents a node from being reinserted once it has been removed from the list. The final section of this paper contains an experimental comparison of this implementation against other data structures which implement the same abstract data type (ADT) as the lock-free trie. Analysis of these experiments reveals that the implementation of Ko's Trie performs better than existing theoretical implementations of this ADT when the universe of keys is large, when removes are rare and when the number of processes performing operations concurrently is low.
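The abstract data type in question (insert, remove, search, predecessor over an ordered universe) can be sketched with a plain sequential binary trie. This is explicitly not Ko's lock-free trie and offers no concurrency guarantees; the class name `BinaryTrie` and its array-per-level layout are assumptions made for illustration.

```python
class BinaryTrie:
    """Sequential binary trie over the universe {0, ..., 2**bits - 1}."""

    def __init__(self, bits):
        self.bits = bits
        # Level d has 2**d nodes; level `bits` holds the leaves (the keys).
        # A node's flag is True when its subtree contains at least one key.
        self.levels = [[False] * (1 << d) for d in range(bits + 1)]

    def search(self, key):
        return self.levels[self.bits][key]

    def insert(self, key):
        # Mark the leaf and every ancestor on the path to the root.
        for d in range(self.bits, -1, -1):
            self.levels[d][key >> (self.bits - d)] = True

    def remove(self, key):
        self.levels[self.bits][key] = False
        idx = key
        # Clear ancestors until one still has a non-empty child.
        for d in range(self.bits - 1, -1, -1):
            idx >>= 1
            if self.levels[d + 1][2 * idx] or self.levels[d + 1][2 * idx + 1]:
                break
            self.levels[d][idx] = False

    def predecessor(self, key):
        """Largest stored key strictly smaller than `key`, or None."""
        idx, d = key, self.bits
        # Climb until we are a right child whose left sibling is non-empty.
        while d > 0:
            if idx & 1 and self.levels[d][idx - 1]:
                idx -= 1
                break
            idx >>= 1
            d -= 1
        else:
            return None
        # Descend to the largest key in that sibling's subtree.
        while d < self.bits:
            d += 1
            idx = 2 * idx + 1 if self.levels[d][2 * idx + 1] else 2 * idx
        return idx


trie = BinaryTrie(bits=4)
for k in (3, 7, 12):
    trie.insert(k)
before = [trie.predecessor(k) for k in (3, 7, 8, 15)]
trie.remove(7)
after = trie.predecessor(12)
print(before, after)
```

Every operation touches at most one node per level, so the cost is proportional to the number of bits in a key; a lock-free version has to preserve the same subtree flags while several processes update them concurrently.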

Loss · Learning · Scenario · Continuity · Neural Networks

November 1, 2024

A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning

Arthur Juliani, Jordan T. Ash

Continual learning with deep neural networks presents challenges distinct from both the fixed-dataset and convex continual learning regimes. One such challenge is plasticity loss, wherein a neural network trained in an online fashion displays a degraded ability to fit new tasks. This problem has been extensively studied in both supervised learning and off-policy reinforcement learning (RL), where a number of remedies have been proposed. Still, plasticity loss has received less attention in the on-policy deep RL setting. Here we perform an extensive set of experiments examining plasticity loss and a variety of mitigation methods in on-policy deep RL. We demonstrate that plasticity loss is pervasive under domain shift in this regime, and that a number of methods developed to resolve it in other settings fail, sometimes even performing worse than applying no intervention at all. In contrast, we find that a class of ``regenerative'' methods are able to consistently mitigate plasticity loss in a variety of contexts, including in gridworld tasks and more challenging environments like Montezuma's Revenge and ProcGen.

MoDELS · Latent · Subspace · Decomposed · Intelligibility

October 31, 2024

Exploring Behavior-Relevant and Disentangled Neural Dynamics with Generative Diffusion Models

Yule Wang, Chengrui Li, Weihan Li, Anqi Wu

Understanding the neural basis of behavior is a fundamental goal in neuroscience. Current research in large-scale neuro-behavioral data analysis often relies on decoding models, which quantify behavioral information in neural data but lack details on behavior encoding. This raises an intriguing scientific question: ``how can we enable in-depth exploration of neural representations in behavioral tasks, revealing interpretable neural dynamics associated with behaviors''. However, addressing this issue is challenging due to the varied behavioral encoding across different brain regions and mixed selectivity at the population level. To tackle this limitation, our approach, named ``BeNeDiff'', first identifies a fine-grained and disentangled neural subspace using a behavior-informed latent variable model. It then employs state-of-the-art generative diffusion models to synthesize behavior videos that interpret the neural dynamics of each latent factor. We validate the method on multi-session datasets containing widefield calcium imaging recordings across the dorsal cortex. Through guiding the diffusion model to activate individual latent factors, we verify that the neural dynamics of latent factors in the disentangled neural subspace provide interpretable quantifications of the behaviors of interest. At the same time, the neural subspace in BeNeDiff demonstrates high disentanglement and neural reconstruction quality.

Graphics processing unit · Graph · Identifiable · Neural Networks · Networking

May 31, 2021

On Explainability of Graph Neural Networks via Subgraph Explorations

Hao Yuan, Haiyang Yu, Jie Wang, Kang Li, Shuiwang Ji

from arxiv, Accepted by ICML 2021

We consider the problem of explaining the predictions of graph neural networks (GNNs), which otherwise are considered as black boxes. Existing methods invariably focus on explaining the importance of graph nodes or edges but ignore the substructures of graphs, which are more intuitive and human-intelligible. In this work, we propose a novel method, known as SubgraphX, to explain GNNs by identifying important subgraphs. Given a trained GNN model and an input graph, our SubgraphX explains its predictions by efficiently exploring different subgraphs with Monte Carlo tree search. To make the tree search more effective, we propose to use Shapley values as a measure of subgraph importance, which can also capture the interactions among different subgraphs. To expedite computations, we propose efficient approximation schemes to compute Shapley values for graph data. Our work represents the first attempt to explain GNNs via identifying subgraphs explicitly and directly. Experimental results show that our SubgraphX achieves significantly improved explanations, while keeping computations at a reasonable level.
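To make the role of Shapley values concrete, the sketch below estimates them by sampling random player orderings. It is a generic permutation-sampling estimator, not SubgraphX's approximation scheme, and the value function here (counting edges inside a node set of a toy path graph) is a hypothetical stand-in for a GNN's prediction score.

```python
import random

EDGES = [(0, 1), (1, 2), (2, 3)]   # toy path graph on four nodes
NODES = [0, 1, 2, 3]

def value(coalition):
    # Hypothetical stand-in for a model score: edges fully inside the coalition.
    members = set(coalition)
    return sum(1 for u, v in EDGES if u in members and v in members)

def shapley_mc(players, value_fn, samples=2000, seed=0):
    """Monte Carlo Shapley estimate via random permutations of the players."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(samples):
        order = list(players)
        rng.shuffle(order)
        prefix, prev = [], value_fn([])
        for p in order:
            prefix.append(p)
            cur = value_fn(prefix)
            totals[p] += cur - prev    # marginal contribution of p in this order
            prev = cur
    return {p: total / samples for p, total in totals.items()}

phi = shapley_mc(NODES, value)
print(phi)
```

For this toy game each edge's unit of value is split evenly between its endpoints, so the inner nodes should score about 1.0 and the end nodes about 0.5, and the estimates always sum to the value of the full node set.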

INFORMS · Graph · Reducible · Knowledge graph · Identifiable

August 29, 2018

Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction

Yi Luan, Luheng He, Mari Ostendorf, Hannaneh Hajishirzi

We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.