美国式禁忌电影在线观看免费观看_91久久精品美女高潮喷水APP_成在线人免费视频一区二区三区_无码片久久久天堂中文字幕_国产又粗又猛又爽又黄在线观看_免费中文字幕不卡视频_日韩精品欧美精品视频一区

In real-world applications, one often encounters ambiguously labeled data, where different annotators assign conflicting class labels. Partial-label learning allows training classifiers in this weakly supervised setting, where state-of-the-art methods already show good predictive performance. However, even the best algorithms give incorrect predictions, which can have severe consequences when they impact actions or decisions. We propose a novel risk-consistent partial-label learning algorithm with a reject option, that is, the algorithm can reject unsure predictions. Extensive experiments on artificial and real-world datasets show that our method provides the best trade-off between the number and accuracy of non-rejected predictions when compared to our competitors, which use confidence thresholds for rejecting unsure predictions instead. When evaluated without the reject option, our nearest neighbor-based approach also achieves competitive prediction performance.

相關內容

Learning

關注 12

圖 · MoDELS · 優化器 · Networks · 結點 ·

2024 年 7 月 15 日

Friedkin-Johnsen Model for Opinion Dynamics on Signed Graphs

Xiaotian Zhou,Haoxin Sun,Wanyue Xu,Wei Li,Zhongzhi Zhang

A signed graph offers richer information than an unsigned graph, since it describes both collaborative and competitive relationships in social networks. In this paper, we study the opinion dynamics on a signed graph, based on the Friedkin-Johnsen model. We first interpret the equilibrium opinion in terms of a defined random walk on an augmented signed graph, by representing the equilibrium opinion of every node as a combination of all nodes' internal opinions, with the coefficient of the internal opinion for each node being the difference of two absorbing probabilities. We then quantify some relevant social phenomena and express them in terms of the $\ell_2$ norms of vectors. We also design a nearly-linear time signed Laplacian solver for assessing these quantities, by establishing a connection between the absorbing probability of random walks on a signed graph and that on an associated unsigned graph. We further study the opinion optimization problem by changing the initial opinions of a fixed number of nodes, which can be optimally solved in cubic time. We provide a nearly-linear time algorithm with error guarantee to approximately solve the problem. Finally, we execute extensive experiments on sixteen real-life signed networks, which show that both of our algorithms are effective and efficient, and are scalable to massive graphs with over 20 million nodes.

專利 · INFORMS · Performer · 詞表 · Transformer ·

2024 年 7 月 14 日

Comparing Complex Concepts with Transformers: Matching Patent Claims Against Natural Language Text

Matthias Blume,Ghobad Heidari,Christoph Hewel

from arxiv, 5th Workshop on Patent Text Mining and Semantic Technologies (PatentSemTech 2024) at ACM SIGIR

A key capability in managing patent applications or a patent portfolio is comparing claims to other text, e.g. a patent specification. Because the language of claims is different from language used elsewhere in the patent application or in non-patent text, this has been challenging for computer based natural language processing. We test two new LLM-based approaches and find that both provide substantially better performance than previously published values. The ability to match dense information from one domain against much more distributed information expressed in a different vocabulary may also be useful beyond the intellectual property space.

Processing（編程語言） · 無限 · Cognition · MoDELS · 大語言模型 ·

2024 年 7 月 12 日

Human-like Episodic Memory for Infinite Context LLMs

Zafeirios Fountas,Martin A Benfeghoul,Adnan Oomerjee,Fenia Christopoulou,Gerasimos Lampouras,Haitham Bou-Ammar,Jun Wang

Large language models (LLMs) have shown remarkable capabilities, but still struggle with processing extensive contexts, limiting their ability to maintain coherence and accuracy over long sequences. In contrast, the human brain excels at organising and retrieving episodic experiences across vast temporal scales, spanning a lifetime. In this work, we introduce EM-LLM, a novel approach that integrates key aspects of human episodic memory and event cognition into LLMs, enabling them to effectively handle practically infinite context lengths while maintaining computational efficiency. EM-LLM organises sequences of tokens into coherent episodic events using a combination of Bayesian surprise and graph-theoretic boundary refinement in an on-line fashion. When needed, these events are retrieved through a two-stage memory process, combining similarity-based and temporally contiguous retrieval for efficient and human-like access to relevant information. Experiments on the LongBench dataset demonstrate EM-LLM's superior performance, outperforming the state-of-the-art InfLLM model with an overall relative improvement of 4.3% across various tasks, including a 33% improvement on the PassageRetrieval task. Furthermore, our analysis reveals strong correlations between EM-LLM's event segmentation and human-perceived events, suggesting a bridge between this artificial system and its biological counterpart. This work not only advances LLM capabilities in processing extended contexts but also provides a computational framework for exploring human memory mechanisms, opening new avenues for interdisciplinary research in AI and cognitive science.

Attention · 情景 · 多樣性 · MoDELS · motivation ·

2024 年 7 月 12 日

Calibrated Recommendations for Users with Decaying Attention

Jon Kleinberg,Emily Ryu,éva Tardos

from arxiv, 31 pages, 1 figure, 17th International Symposium on Algorithmic Game Theory (SAGT 2024). This paper incorporates and supersedes our earlier paper arXiv:2203.00233

Recommendation systems capable of providing diverse sets of results are a focus of increasing importance, with motivations ranging from fairness to novelty and other aspects of optimizing user experience. One form of diversity of recent interest is calibration, the notion that personalized recommendations should reflect the full distribution of a user's interests, rather than a single predominant category -- for instance, a user who mainly reads entertainment news but also wants to keep up with news on the environment and the economy would prefer to see a mixture of these genres, not solely entertainment news. Existing work has formulated calibration as a subset selection problem; this line of work observes that the formulation requires the unrealistic assumption that all recommended items receive equal consideration from the user, but leaves as an open question the more realistic setting in which user attention decays as they move down the list of results. In this paper, we consider calibration with decaying user attention under two different models. In both models, there is a set of underlying genres that items can belong to. In the first setting, where items are represented by fine-grained mixtures of genre percentages, we provide a $(1-1/e)$-approximation algorithm by extending techniques for constrained submodular optimization. In the second setting, where items are coarsely binned into a single genre each, we surpass the $(1-1/e)$ barrier imposed by submodular maximization and give a $2/3$-approximate greedy algorithm. Our work thus addresses the problem of capturing ordering effects due to decaying attention, allowing for the extension of near-optimal calibration from recommendation sets to recommendation lists.

MoDELS · 縮放 · Performer · Extensibility · AIM ·

2024 年 7 月 12 日

Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey

Zeyu Han,Chao Gao,Jinyang Liu,Jeff Zhang,Sai Qian Zhang

from arxiv, 42 pages, 12 figures. Due to word limit, the abstract here is truncated. The full abstract is available in the PDF

Large models represent a groundbreaking advancement in multiple application fields, enabling remarkable achievements across various tasks. However, their unprecedented scale comes with significant computational costs. These models, often consisting of billions of parameters, require vast amounts of computational resources for execution. Especially, the expansive scale and computational demands pose considerable challenges when customizing them for particular downstream tasks, particularly over the hardware platforms constrained by computational capabilities. Parameter Efficient Fine-Tuning (PEFT) provides a practical solution by efficiently adjusting the large models over the various downstream tasks. In particular, PEFT refers to the process of adjusting the parameters of a pre-trained large models to adapt it to a specific task or domain while minimizing the number of additional parameters introduced or computational resources required. This approach is particularly important when dealing with large-scale language models with high parameter counts, as fine-tuning these models from scratch can be computationally expensive and resource-intensive, posing considerable challenges in the supporting system platform design. In this survey, we present comprehensive studies of various PEFT algorithms, examining their performance and computational overhead. Moreover, we provide an overview of applications developed using different PEFT algorithms and discuss common techniques employed to mitigate computation costs for PEFT. In addition to providing an extensive survey from an algorithmic standpoint, we also examine various real-world system designs to investigate the implementation costs associated with different PEFT approaches. This survey serves as an indispensable resource for researchers aiming to understand both the PEFT algorithm and its system implementation, offering detailed ......

Performer · 損失 · 塑造 · 約束 · Learning ·

2024 年 7 月 11 日

Loss Shaping Constraints for Long-Term Time Series Forecasting

Ignacio Hounie,Javier Porras-Valenzuela,Alejandro Ribeiro

Several applications in time series forecasting require predicting multiple steps ahead. Despite the vast amount of literature in the topic, both classical and recent deep learning based approaches have mostly focused on minimising performance averaged over the predicted window. We observe that this can lead to disparate distributions of errors across forecasting steps, especially for recent transformer architectures trained on popular forecasting benchmarks. That is, optimising performance on average can lead to undesirably large errors at specific time-steps. In this work, we present a Constrained Learning approach for long-term time series forecasting that aims to find the best model in terms of average performance that respects a user-defined upper bound on the loss at each time-step. We call our approach loss shaping constraints because it imposes constraints on the loss at each time step, and leverage recent duality results to show that despite its non-convexity, the resulting problem has a bounded duality gap. We propose a practical Primal-Dual algorithm to tackle it, and demonstrate that the proposed approach exhibits competitive average performance in time series forecasting benchmarks, while shaping the distribution of errors across the predicted window.

MoDELS · 語言模型化 · 大語言模型 · FAST · 控制器 ·

2024 年 7 月 11 日

Real-Time Anomaly Detection and Reactive Planning with Large Language Models

Rohan Sinha,Amine Elhafsi,Christopher Agia,Matthew Foutter,Edward Schmerling,Marco Pavone

from arxiv, Accepted to Robotics: Science and Systems (RSS) 2024

Foundation models, e.g., large language models (LLMs), trained on internet-scale data possess zero-shot generalization capabilities that make them a promising technology towards detecting and mitigating out-of-distribution failure modes of robotic systems. Fully realizing this promise, however, poses two challenges: (i) mitigating the considerable computational expense of these models such that they may be applied online, and (ii) incorporating their judgement regarding potential anomalies into a safe control framework. In this work, we present a two-stage reasoning framework: First is a fast binary anomaly classifier that analyzes observations in an LLM embedding space, which may then trigger a slower fallback selection stage that utilizes the reasoning capabilities of generative LLMs. These stages correspond to branch points in a model predictive control strategy that maintains the joint feasibility of continuing along various fallback plans to account for the slow reasoner's latency as soon as an anomaly is detected, thus ensuring safety. We show that our fast anomaly classifier outperforms autoregressive reasoning with state-of-the-art GPT models, even when instantiated with relatively small language models. This enables our runtime monitor to improve the trustworthiness of dynamic robotic systems, such as quadrotors or autonomous vehicles, under resource and time constraints. Videos illustrating our approach in both simulation and real-world experiments are available on this project page: //sites.google.com/view/aesop-llm.

數據集 · 語言模型化 · MoDELS · 大語言模型 · AIM ·

2024 年 7 月 11 日

Leveraging GPT for the Generation of Multi-Platform Social Media Datasets for Research

Henry Tari,Danial Khan,Justus Rutten,Darian Othman,Rishabh Kaushal,Thales Bertaglia,Adriana Iamnitchi

Social media datasets are essential for research on disinformation, influence operations, social sensing, hate speech detection, cyberbullying, and other significant topics. However, access to these datasets is often restricted due to costs and platform regulations. As such, acquiring datasets that span multiple platforms which are crucial for a comprehensive understanding of the digital ecosystem is particularly challenging. This paper explores the potential of large language models to create lexically and semantically relevant social media datasets across multiple platforms, aiming to match the quality of real datasets. We employ ChatGPT to generate synthetic data from two real datasets, each consisting of posts from three different social media platforms. We assess the lexical and semantic properties of the synthetic data and compare them with those of the real data. Our empirical findings suggest that using large language models to generate synthetic multi-platform social media data is promising. However, further enhancements are necessary to improve the fidelity of the outputs.

圖 · Networking · INTERACT · INFORMS · 圖形處理器 ·

2020 年 11 月 25 日

Time-Series Event Prediction with Evolutionary State Graph

Wenjie Hu,Yang Yang,Ziqiang Cheng,Carl Yang,Xiang Ren

from arxiv, A long version of EvoNet (WSDM 2021)

The accurate and interpretable prediction of future events in time-series data often requires the capturing of representative patterns (or referred to as states) underpinning the observed data. To this end, most existing studies focus on the representation and recognition of states, but ignore the changing transitional relations among them. In this paper, we present evolutionary state graph, a dynamic graph structure designed to systematically represent the evolving relations (edges) among states (nodes) along time. We conduct analysis on the dynamic graphs constructed from the time-series data and show that changes on the graph structures (e.g., edges connecting certain state nodes) can inform the occurrences of events (i.e., time-series fluctuation). Inspired by this, we propose a novel graph neural network model, Evolutionary State Graph Network (EvoNet), to encode the evolutionary state graph for accurate and interpretable time-series event prediction. Specifically, Evolutionary State Graph Network models both the node-level (state-to-state) and graph-level (segment-to-segment) propagation, and captures the node-graph (state-to-segment) interactions over time. Experimental results based on five real-world datasets show that our approach not only achieves clear improvements compared with 11 baselines, but also provides more insights towards explaining the results of event predictions.

entity · 鏈路預測 · 圖 · 知識圖譜 · MoDELS ·

2019 年 12 月 25 日

Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction

Zhanqiu Zhang,Jianyu Cai,Yongdong Zhang,Jie Wang

from arxiv, Accepted to AAAI 2020

Knowledge graph embedding, which aims to represent entities and relations as low dimensional vectors (or matrices, tensors, etc.), has been shown to be a powerful technique for predicting missing links in knowledge graphs. Existing knowledge graph embedding models mainly focus on modeling relation patterns such as symmetry/antisymmetry, inversion, and composition. However, many existing approaches fail to model semantic hierarchies, which are common in real-world applications. To address this challenge, we propose a novel knowledge graph embedding model---namely, Hierarchy-Aware Knowledge Graph Embedding (HAKE)---which maps entities into the polar coordinate system. HAKE is inspired by the fact that concentric circles in the polar coordinate system can naturally reflect the hierarchy. Specifically, the radial coordinate aims to model entities at different levels of the hierarchy, and entities with smaller radii are expected to be at higher levels; the angular coordinate aims to distinguish entities at the same level of the hierarchy, and these entities are expected to have roughly the same radii but different angles. Experiments demonstrate that HAKE can effectively model the semantic hierarchies in knowledge graphs, and significantly outperforms existing state-of-the-art methods on benchmark datasets for the link prediction task.