秋霞网一区二区三区_东京热无码一区二区三区无码_精品亚洲国产成人小电影_国产午夜精品久久久久婷婷_人人射人人干人人插_亚洲中久无码永久在线观看同_亚洲国产中文午夜精品不卡

The missing item problem, as introduced by Stoeckl in his work at SODA 23, focuses on continually identifying a missing element $e$ in a stream of elements ${e_1, ..., e_{\ell}}$ from the set $\{1,2,...,n\}$, such that $e \neq e_i$ for any $i \in \{1,...,n\}$. Stoeckl's investigation primarily delves into scenarios with $\ell<n$, providing bounds for the (i) deterministic case, (ii) the static case -- where the algorithm might be randomized but the stream is fixed in advanced) and (iii) the adversarially robust case -- where the algorithm is randomized and each stream element can be chosen depending on earlier algorithm outputs. Building upon this foundation, our paper addresses previously unexplored aspects of the missing item problem. In the first segment, we examine the static setting with a long stream, where the length of the steam $\ell$ is close to or even exceeds the size of the universe $n$. We present an algorithm demonstrating that even when $\ell$ is very close to $n$ (say $\ell=n-1$), polylog($n$) bits of memory suffice to identify the missing item. Additionally, we establish tight bounds of $\tilde{\Theta(k)}$ for the scenario of $\ell = n+k$. The second segment of this part of our work focuses on the {\em adversarially robust setting}. We show a lower bound for a pseudo-deterministic error-zero (where the algorithm reports its errors) algorithm of approximating $\Omega(\ell)$, up to polylog factors. Based on Stoeckl's work, we establish a lower bound for a random-start (only use randomness at initialization) error-zero streaming algorithm. In the final segment, we explore streaming algorithms with randomness-on-the-fly, where the random bits that are saved for future use are included in the space cost. For streams with length $\ell = O(\sqrt{n})$, we provide an upper bound of $O(log n)$. This establishes a gap between randomness-on-the-fly to random-start.

相關內容

流(liu)

關注 1

圖 · contrastive · Networking · Learning · 層 ·

2024 年 2 月 23 日

Oversmoothing: A Nightmare for Graph Contrastive Learning?

Jintang Li,Wangbin Sun,Ruofan Wu,Yuchang Zhu,Liang Chen,Zibin Zheng

from arxiv, Technical report; Code available at //github.com/EdisonLeeeee/BlockGCL

Oversmoothing is a common phenomenon observed in graph neural networks (GNNs), in which an increase in the network depth leads to a deterioration in their performance. Graph contrastive learning (GCL) is emerging as a promising way of leveraging vast unlabeled graph data. As a marriage between GNNs and contrastive learning, it remains unclear whether GCL inherits the same oversmoothing defect from GNNs. This work undertakes a fundamental analysis of GCL from the perspective of oversmoothing on the first hand. We demonstrate empirically that increasing network depth in GCL also leads to oversmoothing in their deep representations, and surprisingly, the shallow ones. We refer to this phenomenon in GCL as `long-range starvation', wherein lower layers in deep networks suffer from degradation due to the lack of sufficient guidance from supervision. Based on our findings, we present BlockGCL, a remarkably simple yet effective blockwise training framework to prevent GCL from notorious oversmoothing. Without bells and whistles, BlockGCL consistently improves robustness and stability for well-established GCL methods with increasing numbers of layers on several real-world graph benchmarks.

語言模型化 · 大語言模型 · MoDELS · 有偏 · INTERACT ·

2024 年 2 月 22 日

How Susceptible are Large Language Models to Ideological Manipulation?

Kai Chen,Zihao He,Jun Yan,Taiwei Shi,Kristina Lerman

Large Language Models (LLMs) possess the potential to exert substantial influence on public perceptions and interactions with information. This raises concerns about the societal impact that could arise if the ideologies within these models can be easily manipulated. In this work, we investigate how effectively LLMs can learn and generalize ideological biases from their instruction-tuning data. Our findings reveal a concerning vulnerability: exposure to only a small amount of ideologically driven samples significantly alters the ideology of LLMs. Notably, LLMs demonstrate a startling ability to absorb ideology from one topic and generalize it to even unrelated ones. The ease with which LLMs' ideologies can be skewed underscores the risks associated with intentionally poisoned training data by malicious actors or inadvertently introduced biases by data annotators. It also emphasizes the imperative for robust safeguards to mitigate the influence of ideological manipulations on LLMs.

近似 · 收縮 · 優化器 · Agent · Learning ·

2024 年 2 月 22 日

Are Bounded Contracts Learnable and Approximately Optimal?

Yurong Chen,Zhaohua Chen,Xiaotie Deng,Zhiyi Huang

This paper considers the hidden-action model of the principal-agent problem, in which a principal incentivizes an agent to work on a project using a contract. We investigate whether contracts with bounded payments are learnable and approximately optimal. Our main results are two learning algorithms that can find a nearly optimal bounded contract using a polynomial number of queries, under two standard assumptions in the literature: a costlier action for the agent leads to a better outcome distribution for the principal, and the agent's cost/effort has diminishing returns. Our polynomial query complexity upper bound shows that standard assumptions are sufficient for achieving an exponential improvement upon the known lower bound for general instances. Unlike the existing algorithms, which relied on discretizing the contract space, our algorithms directly learn the underlying outcome distributions. As for the approximate optimality of bounded contracts, we find that they could be far from optimal in terms of multiplicative or additive approximation, but satisfy a notion of mixed approximation.

規范化的 · 樣例 · 可理解性 · state-of-the-art · 多樣性 ·

2024 年 2 月 22 日

Reimagining Anomalies: What If Anomalies Were Normal?

Philipp Liznerski,Saurabh Varshneya,Ece Calikus,Sophie Fellenz,Marius Kloft

from arxiv, 30 pages; preprint

Deep learning-based methods have achieved a breakthrough in image anomaly detection, but their complexity introduces a considerable challenge to understanding why an instance is predicted to be anomalous. We introduce a novel explanation method that generates multiple counterfactual examples for each anomaly, capturing diverse concepts of anomalousness. A counterfactual example is a modification of the anomaly that is perceived as normal by the anomaly detector. The method provides a high-level semantic explanation of the mechanism that triggered the anomaly detector, allowing users to explore "what-if scenarios." Qualitative and quantitative analyses across various image datasets show that the method applied to state-of-the-art anomaly detectors can achieve high-quality semantic explanations of detectors.

INTERACT · INFORMS · Processing（編程語言） · 語言模型化 · 回合 ·

2023 年 5 月 22 日

Interactive Natural Language Processing

Zekun Wang,Ge Zhang,Kexin Yang,Ning Shi,Wangchunshu Zhou,Shaochun Hao,Guangzheng Xiong,Yizhi Li,Mong Yuan Sim,Xiuying Chen,Qingqing Zhu,Zhenzhu Yang,Adam Nik,Qi Liu,Chenghua Lin,Shi Wang,Ruibo Liu,Wenhu Chen,Ke Xu,Dayiheng Liu,Yike Guo,Jie Fu

from arxiv, 110 pages

Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP, aimed at addressing limitations in existing frameworks while aligning with the ultimate goals of artificial intelligence. This paradigm considers language models as agents capable of observing, acting, and receiving feedback iteratively from external entities. Specifically, language models in this context can: (1) interact with humans for better understanding and addressing user needs, personalizing responses, aligning with human values, and improving the overall user experience; (2) interact with knowledge bases for enriching language representations with factual knowledge, enhancing the contextual relevance of responses, and dynamically leveraging external information to generate more accurate and informed responses; (3) interact with models and tools for effectively decomposing and addressing complex tasks, leveraging specialized expertise for specific subtasks, and fostering the simulation of social behaviors; and (4) interact with environments for learning grounded representations of language, and effectively tackling embodied tasks such as reasoning, planning, and decision-making in response to environmental observations. This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept. We then provide a systematic classification of iNLP, dissecting its various components, including interactive objects, interaction interfaces, and interaction methods. We proceed to delve into the evaluation methodologies used in the field, explore its diverse applications, scrutinize its ethical and safety issues, and discuss prospective research directions. This survey serves as an entry point for researchers who are interested in this rapidly evolving area and offers a broad view of the current landscape and future trajectory of iNLP.

全局極小值 · 優化器 · 極小值 · 非凸 · 近似 ·

2021 年 3 月 24 日

Why Do Local Methods Solve Nonconvex Problems?

Tengyu Ma

from arxiv, This is the Chapter 21 of the book "Beyond the Worst-Case Analysis of Algorithms"

Non-convex optimization is ubiquitous in modern machine learning. Researchers devise non-convex objective functions and optimize them using off-the-shelf optimizers such as stochastic gradient descent and its variants, which leverage the local geometry and update iteratively. Even though solving non-convex functions is NP-hard in the worst case, the optimization quality in practice is often not an issue -- optimizers are largely believed to find approximate global minima. Researchers hypothesize a unified explanation for this intriguing phenomenon: most of the local minima of the practically-used objectives are approximately global minima. We rigorously formalize it for concrete instances of machine learning problems.

次最優 · ML · 極小點 · state-of-the-art · MoDELS ·

2020 年 12 月 10 日

Composite Adversarial Attacks

Xiaofeng Mao,Yuefeng Chen,Shuhui Wang,Hang Su,Yuan He,Hui Xue

from arxiv, To appear in AAAI 2021, code will be released later

Adversarial attack is a technique for deceiving Machine Learning (ML) models, which provides a way to evaluate the adversarial robustness. In practice, attack algorithms are artificially selected and tuned by human experts to break a ML system. However, manual selection of attackers tends to be sub-optimal, leading to a mistakenly assessment of model security. In this paper, a new procedure called Composite Adversarial Attack (CAA) is proposed for automatically searching the best combination of attack algorithms and their hyper-parameters from a candidate pool of \textbf{32 base attackers}. We design a search space where attack policy is represented as an attacking sequence, i.e., the output of the previous attacker is used as the initialization input for successors. Multi-objective NSGA-II genetic algorithm is adopted for finding the strongest attack policy with minimum complexity. The experimental result shows CAA beats 10 top attackers on 11 diverse defenses with less elapsed time (\textbf{6 $\times$ faster than AutoAttack}), and achieves the new state-of-the-art on $l_{\infty}$, $l_{2}$ and unrestricted adversarial attacks.

AdderNet · Neural Networks · Networking · 卷積 · 模型評估 ·

2019 年 12 月 31 日

AdderNet: Do We Really Need Multiplications in Deep Learning?

Hanting Chen,Yunhe Wang,Chunjing Xu,Boxin Shi,Chao Xu,Qi Tian,Chang Xu

Compared with cheap addition operation, multiplication operation is of much higher computation complexity. The widely-used convolutions in deep neural networks are exactly cross-correlation to measure the similarity between input feature and convolution filters, which involves massive multiplications between float values. In this paper, we present adder networks (AdderNets) to trade these massive multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs. In AdderNets, we take the $\ell_1$-norm distance between filters and input feature as the output response. The influence of this new similarity measure on the optimization of neural network have been thoroughly analyzed. To achieve a better performance, we develop a special back-propagation approach for AdderNets by investigating the full-precision gradient. We then propose an adaptive learning rate strategy to enhance the training procedure of AdderNets according to the magnitude of each neuron's gradient. As a result, the proposed AdderNets can achieve 74.9% Top-1 accuracy 91.7% Top-5 accuracy using ResNet-50 on the ImageNet dataset without any multiplication in convolution layer.

置信度 · MoDELS · Extensibility · 圖 · entity ·

2019 年 2 月 26 日

Embedding Uncertain Knowledge Graphs

Xuelu Chen,Muhao Chen,Weijia Shi,Yizhou Sun,Carlo Zaniolo

Embedding models for deterministic Knowledge Graphs (KG) have been extensively studied, with the purpose of capturing latent semantic relations between entities and incorporating the structured knowledge into machine learning. However, there are many KGs that model uncertain knowledge, which typically model the inherent uncertainty of relations facts with a confidence score, and embedding such uncertain knowledge represents an unresolved challenge. The capturing of uncertain knowledge will benefit many knowledge-driven applications such as question answering and semantic search by providing more natural characterization of the knowledge. In this paper, we propose a novel uncertain KG embedding model UKGE, which aims to preserve both structural and uncertainty information of relation facts in the embedding space. Unlike previous models that characterize relation facts with binary classification techniques, UKGE learns embeddings according to the confidence scores of uncertain relation facts. To further enhance the precision of UKGE, we also introduce probabilistic soft logic to infer confidence scores for unseen relation facts during training. We propose and evaluate two variants of UKGE based on different learning objectives. Experiments are conducted on three real-world uncertain KGs via three tasks, i.e. confidence prediction, relation fact ranking, and relation fact classification. UKGE shows effectiveness in capturing uncertain knowledge by achieving promising results on these tasks, and consistently outperforms baselines on these tasks.

長短期記憶網絡 · 命名實體識別 · MoDELS · Better · 門控 ·

2018 年 5 月 15 日

Chinese NER Using Lattice LSTM

Yue Zhang,Jie Yang

from arxiv, Accepted at ACL 2018 as Long paper

We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.