国产特级黄色片A级无毛视频,国产高潮白浆调教福利在线视频

A critical piece of the modern information retrieval puzzle is approximate nearest neighbor search. Its objective is to return a set of $k$ data points that are closest to a query point, with its accuracy measured by the proportion of exact nearest neighbors captured in the returned set. One popular approach to this question is clustering: The indexing algorithm partitions data points into non-overlapping subsets and represents each partition by a point such as its centroid. The query processing algorithm first identifies the nearest clusters -- a process known as routing -- then performs a nearest neighbor search over those clusters only. In this work, we make a simple observation: The routing function solves a ranking problem. Its quality can therefore be assessed with a ranking metric, making the function amenable to learning-to-rank. Interestingly, ground-truth is often freely available: Given a query distribution in a top-$k$ configuration, the ground-truth is the set of clusters that contain the exact top-$k$ vectors. We develop this insight and apply it to Maximum Inner Product Search (MIPS). As we demonstrate empirically on various datasets, learning a simple linear function consistently improves the accuracy of clustering-based MIPS.

相關內容

近鄰

關注 0

估計/估計量 · MoDELS · Artificial Life · INTERACT · 易處理的 ·

2024 年 5 月 31 日

Methods to Estimate Cryptic Sequence Complexity

Matthew Andres Moreno

Complexity is a signature quality of interest in artificial life systems. Alongside other dimensions of assessment, it is common to quantify genome sites that contribute to fitness as a complexity measure. However, limitations to the sensitivity of fitness assays in models with implicit replication criteria involving rich biotic interactions introduce the possibility of difficult-to-detect ``cryptic'' adaptive sites, which contribute small fitness effects below the threshold of individual detectability or involve epistatic redundancies. Here, we propose three knockout-based assay procedures designed to quantify cryptic adaptive sites within digital genomes. We report initial tests of these methods on a simple genome model with explicitly configured site fitness effects. In these limited tests, estimation results reflect ground truth cryptic sequence complexities well. Presented work provides initial steps toward development of new methods and software tools that improve the resolution, rigor, and tractability of complexity analyses across alife systems, particularly those requiring expensive in situ assessments of organism fitness.

通道 · MoDELS · Performer · 分桶 · 情景 ·

2024 年 5 月 29 日

Broadcasting on Adversarial Multiple Access Channels

Bader A. Aldawsari,Bogdan S. Chlebus,Dariusz R. Kowalski

We study deterministic distributed algorithms for broadcasting on multiple-access channels. Packet injection is modeled by leaky-bucket adversaries. There is a fixed set of stations attached to a channel. Additional features of the model of communication include an upper bound on the number of stations activated in a round, an individual injection rate, and randomness in generating and injecting packets. We demonstrate that some broadcast algorithms designed for ad-hoc channels have bounded latency for increased ranges of injection rates than in ad-hoc channels when executed on channels with a fixed number of stations against adversaries that can activate at most one station per round. Individual injection rates are shown to impact latency, as compared to the model of general leaky bucket adversaries. Outcomes of experiments are given that compare the performance of broadcast algorithms against randomized adversaries. The experiments include deterministic algorithms and randomized backoff algorithms.

控制器 · MoDELS · 誤差函數 · 泛函 · 情景 ·

2024 年 5 月 29 日

Programmable Motion Generation for Open-Set Motion Control Tasks

Hanchao Liu,Xiaohang Zhan,Shaoli Huang,Tai-Jiang Mu,Ying Shan

from arxiv, Accepted by CVPR 2024

Character animation in real-world scenarios necessitates a variety of constraints, such as trajectories, key-frames, interactions, etc. Existing methodologies typically treat single or a finite set of these constraint(s) as separate control tasks. They are often specialized, and the tasks they address are rarely extendable or customizable. We categorize these as solutions to the close-set motion control problem. In response to the complexity of practical motion control, we propose and attempt to solve the open-set motion control problem. This problem is characterized by an open and fully customizable set of motion control tasks. To address this, we introduce a new paradigm, programmable motion generation. In this paradigm, any given motion control task is broken down into a combination of atomic constraints. These constraints are then programmed into an error function that quantifies the degree to which a motion sequence adheres to them. We utilize a pre-trained motion generation model and optimize its latent code to minimize the error function of the generated motion. Consequently, the generated motion not only inherits the prior of the generative model but also satisfies the required constraints. Experiments show that we can generate high-quality motions when addressing a wide range of unseen tasks. These tasks encompass motion control by motion dynamics, geometric constraints, physical laws, interactions with scenes, objects or the character own body parts, etc. All of these are achieved in a unified approach, without the need for ad-hoc paired training data collection or specialized network designs. During the programming of novel tasks, we observed the emergence of new skills beyond those of the prior model. With the assistance of large language models, we also achieved automatic programming. We hope that this work will pave the way for the motion control of general AI agents.

邊 · Networking · 中國電子信息產業集團有限公司 · 邊緣計算 · 多跳 ·

2024 年 5 月 29 日

Multi-Source Coflow Scheduling in Collaborative Edge Computing with Multihop Network

Yuvraj Sahni,Jiannong Cao,Lei Yang,Shengwei Wang

Collaborative edge computing has become a popular paradigm where edge devices collaborate by sharing resources. Data dissemination is a fundamental problem in CEC to decide what data is transmitted from which device and how. Existing works on data dissemination have not focused on coflow scheduling in CEC, which involves deciding the order of flows within and across coflows at network links. Coflow implies a set of parallel flows with a shared objective. The existing works on coflow scheduling in data centers usually assume a non-blocking switch and do not consider congestion at different links in the multi-hop path in CEC, leading to increased coflow completion time (CCT). Furthermore, existing works do not consider multiple flow sources that cannot be ignored, as data can have duplicate copies at different edge devices. This work formulates the multi-source coflow scheduling problem in CEC, which includes jointly deciding the source and flow ordering for multiple coflows to minimize the sum of CCT. This problem is shown to be NP-hard and challenging as each flow can have multiple dependent conflicts at multiple links. We propose a source and coflow-aware search and adjust (SCASA) heuristic that first provides an initial solution considering the coflow characteristics. SCASA further improves the initial solution using the source search and adjust heuristic by leveraging the knowledge of both coflows and network congestion at links. Evaluation done using simulation experiments shows that SCASA leads to up to 83% reduction in the sum of CCT compared to benchmarks without a joint solution.

MoDELS · 語言模型化 · Performer · 詞元分析器 · Attention ·

2024 年 5 月 29 日

Training-Free Long-Context Scaling of Large Language Models

Chenxin An,Fei Huang,Jun Zhang,Shansan Gong,Xipeng Qiu,Chang Zhou,Lingpeng Kong

The ability of Large Language Models (LLMs) to process and generate coherent text is markedly weakened when the number of input tokens exceeds their pretraining length. Given the expensive overhead of finetuning large-scale models with longer sequences, we propose Dual Chunk Attention (DCA), which enables Llama2 70B to support context windows of more than 100k tokens without continual training. By decomposing the attention computation for long sequences into chunk-based modules, DCA manages to effectively capture the relative positional information of tokens within the same chunk (Intra-Chunk) and across distinct chunks (Inter-Chunk), as well as integrates seamlessly with Flash Attention. In addition to its impressive extrapolation capability, DCA achieves performance on practical long-context tasks that is comparable to or even better than that of finetuned models. When compared with proprietary models, our training-free 70B model attains 94% of the performance of gpt-3.5-16k, indicating it is a viable open-source alternative. All code and data used in this work are released at \url{//github.com/HKUNLP/ChunkLlama}.

動力系統 · 在線 · Processing（編程語言） · 塑造 · INFORMS ·

2024 年 5 月 28 日

A Dynamical Systems Approach to Bots and Online Political Communication

Beril Bulat,Martin Hilbert

Bots have become increasingly prevalent in the digital sphere and have taken up a proactive role in shaping democratic processes. While previous studies have focused on their influence at the individual level, their potential macro-level impact on communication dynamics is still little understood. This study adopts an information theoretic approach from dynamical systems theory to examine the role of political bots shaping the dynamics of an online political discussion on Twitter. We quantify the components of this dynamic process in terms of its complexity, predictability, and the remaining uncertainty. Our findings suggest that bot activity is associated with increased complexity and uncertainty in the structural dynamics of online political communication. This work serves as a showcase for the use of information-theoretic measures from dynamical systems theory in modeling human-bot dynamics as a computational process that unfolds over time.

Better · 代碼 · Automator · Performer · 可理解性 ·

2024 年 5 月 28 日

Training LLMs to Better Self-Debug and Explain Code

Nan Jiang,Xiaopeng Li,Shiqi Wang,Qiang Zhou,Soneya Binta Hossain,Baishakhi Ray,Varun Kumar,Xiaofei Ma,Anoop Deoras

In the domain of code generation, self-debugging is crucial. It allows LLMs to refine their generated code based on execution feedback. This is particularly important because generating correct solutions in one attempt proves challenging for complex tasks. Prior works on self-debugging mostly focus on prompting methods by providing LLMs with few-shot examples, which work poorly on small open-sourced LLMs. In this work, we propose a training framework that significantly improves self-debugging capability of LLMs. Intuitively, we observe that a chain of explanations on the wrong code followed by code refinement helps LLMs better analyze the wrong code and do refinement. We thus propose an automated pipeline to collect a high-quality dataset for code explanation and refinement by generating a number of explanations and refinement trajectories and filtering via execution verification. We perform supervised fine-tuning (SFT) and further reinforcement learning (RL) on both success and failure trajectories with a novel reward design considering code explanation and refinement quality. SFT improves the pass@1 by up to 15.92% and pass@10 by 9.30% over four benchmarks. RL training brings additional up to 3.54% improvement on pass@1 and 2.55% improvement on pass@10. The trained LLMs show iterative refinement ability, and can keep refining code continuously. Lastly, our human evaluation shows that the LLMs trained with our framework generate more useful code explanations and help developers better understand bugs in source code.

INFORMS · 回合 · 變換 · Things · 操作 ·

2021 年 6 月 3 日

Image-Audio Encoding to Improve C2 Decision-Making in Multi-Domain Environment

Piyush K. Sharma,Adrienne Raglin

from arxiv, Published in: The 25th International Command and Control Research and Technology Symposium (ICCRTS - 2020)

The military is investigating methods to improve communication and agility in its multi-domain operations (MDO). Nascent popularity of Internet of Things (IoT) has gained traction in public and government domains. Its usage in MDO may revolutionize future battlefields and may enable strategic advantage. While this technology offers leverage to military capabilities, it comes with challenges where one is the uncertainty and associated risk. A key question is how can these uncertainties be addressed. Recently published studies proposed information camouflage to transform information from one data domain to another. As this is comparatively a new approach, we investigate challenges of such transformations and how these associated uncertainties can be detected and addressed, specifically unknown-unknowns to improve decision-making.

Vision · 模型評估 · 可約的 · 計算機視覺 · DNN ·

2020 年 3 月 24 日

A Survey of Methods for Low-Power Deep Learning and Computer Vision

Abhinav Goel,Caleb Tung,Yung-Hsiang Lu,George K. Thiruvathukal

from arxiv, Accepted for publication at 2020 IEEE 6th World Forum on Internet of Things (WF-IoT), New Orleans, LA, USA 2020

Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation and memory intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically in regards to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.

INFORMS · 圖像分割 · Networking · state-of-the-art · Integration ·

2019 年 4 月 9 日

Cross-Modal Self-Attention Network for Referring Image Segmentation

Linwei Ye,Mrigank Rochan,Zhi Liu,Yang Wang

from arxiv, Accepted to CVPR2019

We consider the problem of referring image segmentation. Given an input image and a natural language expression, the goal is to segment the object referred by the language expression in the image. Existing works in this area treat the language expression and the input image separately in their representations. They do not sufficiently capture long-range correlations between these two modalities. In this paper, we propose a cross-modal self-attention (CMSA) module that effectively captures the long-range dependencies between linguistic and visual features. Our model can adaptively focus on informative words in the referring expression and important regions in the input image. In addition, we propose a gated multi-level fusion module to selectively integrate self-attentive cross-modal features corresponding to different levels in the image. This module controls the information flow of features at different levels. We validate the proposed approach on four evaluation datasets. Our proposed approach consistently outperforms existing state-of-the-art methods.