
Strassen's asymptotic rank conjecture [Progr. Math. 120 (1994)] claims a strong submultiplicative upper bound on the rank of a three-tensor obtained as an iterated Kronecker product of a constant-size base tensor. The conjecture, if true, most notably would put square matrix multiplication in quadratic time. We note here that some more-or-less unexpected algorithmic results in the area of exponential-time algorithms would also follow. Specifically, we study the so-called set cover conjecture, which states that for any $\epsilon>0$ there exists a positive integer constant $k$ such that no algorithm solves the $k$-Set Cover problem in worst-case time $\mathcal{O}((2-\epsilon)^n|\mathcal F|\operatorname{poly}(n))$. The $k$-Set Cover problem asks, given as input an $n$-element universe $U$, a family $\mathcal F$ of size-at-most-$k$ subsets of $U$, and a positive integer $t$, whether there is a subfamily of at most $t$ sets in $\mathcal F$ whose union is $U$. The conjecture was formulated by Cygan et al. in the monograph Parameterized Algorithms [Springer, 2015] but was implicit as a hypothesis already in Cygan et al. [CCC 2012, ACM Trans. Algorithms 2016], there conjectured to follow from the Strong Exponential Time Hypothesis. We prove that if the asymptotic rank conjecture is true, then the set cover conjecture is false. Using a reduction by Krauthgamer and Trabelsi [STACS 2019], in this scenario we would also get a $\mathcal{O}((2-\delta)^n)$-time randomized algorithm for some constant $\delta>0$ for another well-studied problem for which no such algorithm is known, namely that of deciding whether a given $n$-vertex directed graph has a Hamiltonian cycle.
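For orientation, the $2^n|\mathcal F|$ baseline that the set cover conjecture refers to can be reached by a textbook dynamic program over subsets of the universe. The sketch below is not taken from the paper; the function names `min_cover_size` and `has_cover` and the encoding of $U$ as `{0, ..., n-1}` are assumptions made for illustration.

```python
from typing import List, Optional, Set


def min_cover_size(n: int, family: List[Set[int]]) -> Optional[int]:
    """Fewest sets from `family` whose union is {0, ..., n-1}; None if no cover exists.

    Assumes every set in `family` only contains elements from range(n).
    """
    full = (1 << n) - 1
    # Encode each set as a bitmask over the universe.
    masks = [sum(1 << x for x in s) for s in family]
    inf = n + 1  # a minimum cover never needs more than n sets
    # best[u] = fewest sets whose union is exactly the subset encoded by u
    best = [inf] * (full + 1)
    best[0] = 0
    # Adding a set only ever turns bits on, so increasing order finalises best[u]
    # before u is extended. Total work is about 2^n * |family| steps.
    for covered in range(full + 1):
        if best[covered] == inf:
            continue
        for m in masks:
            nxt = covered | m
            if best[covered] + 1 < best[nxt]:
                best[nxt] = best[covered] + 1
    return best[full] if best[full] < inf else None


def has_cover(n: int, family: List[Set[int]], t: int) -> bool:
    """Decision version: is there a subfamily of at most t sets whose union is U?"""
    size = min_cover_size(n, family)
    return size is not None and size <= t
```

For example, `has_cover(4, [{0, 1}, {2, 3}, {1, 2}], 2)` asks whether two of the three sets cover a four-element universe. The set cover conjecture asserts that, for large enough set size $k$, no algorithm improves the $2^n$ factor of this approach to $(2-\epsilon)^n$.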

Related content


Reviewer · Scenario · Pair · Weight · CASE ·

December 5, 2023

Switch Points of Bi-Persistence Matching Distance

Robyn Brooks, Celia Hacker, Claudia Landi, Barbara I. Mahler, Elizabeth R. Stephenson

from arxiv, 30 pages, 10 figures. Comments welcome

In multi-parameter persistence, the matching distance is defined as the supremum of weighted bottleneck distances on the barcodes given by the restriction of persistence modules to lines with a positive slope. In the case of finitely presented bi-persistence modules, all the available methods to compute the matching distance are based on restricting the computation to lines through pairs from a finite set of points in the plane. Some of these points are determined by the filtration data as they are entrance values of critical simplices. However, these critical values alone are not sufficient for the matching distance computation and it is necessary to add so-called switch points, i.e. points such that on a line through any of them, the bottleneck matching switches the matched pair. This paper is devoted to the algorithmic computation of the set of switch points given a set of critical values. We find conditions under which a candidate switch point is erroneous or superfluous. The obtained conditions are turned into algorithms that have been implemented. With this, we analyze how the size of the set of switch points increases as the number of critical values increases, and how it varies depending on the distribution of critical values. Experiments are carried out on various types of bi-persistence modules.

contrastive · MoDELS · Fidelity · Compositionality · Performer ·

December 4, 2023

A Contrastive Compositional Benchmark for Text-to-Image Synthesis: A Study with Unified Text-to-Image Fidelity Metrics

Xiangru Zhu, Penglei Sun, Chengyu Wang, Jingping Liu, Zhixu Li, Yanghua Xiao, Jun Huang

from arxiv, 17 pages, 14 figures, 11 tables

Text-to-image (T2I) synthesis has recently achieved significant advancements. However, challenges remain in the model's compositionality, which is the ability to create new combinations from known components. We introduce Winoground-T2I, a benchmark designed to evaluate the compositionality of T2I models. This benchmark includes 11K complex, high-quality contrastive sentence pairs spanning 20 categories. These contrastive sentence pairs with subtle differences enable fine-grained evaluations of T2I synthesis models. Additionally, to address the inconsistency across different metrics, we propose a strategy that evaluates the reliability of various metrics by using comparative sentence pairs. We use Winoground-T2I with a dual objective: to evaluate the performance of T2I models and the metrics used for their evaluation. Finally, we provide insights into the strengths and weaknesses of these metrics and the capabilities of current T2I models in tackling challenges across a range of complex compositional categories. Our benchmark is publicly available at //github.com/zhuxiangru/Winoground-T2I .

Approximation · Tensor · Rank · Noise · Approximation error ·

December 4, 2023

Theoretical Bounds for Noise Filtration using Low-Rank Tensor Approximations

Sergey Petrov, Nikolai Zamarashkin

Low-rank tensor approximation error bounds are proposed for the case of noisy input data that depend on low-rank representation type, rank and the dimensionality of the tensor. The bounds show that high-dimensional low-rank structured approximations provide superior noise-filtering properties compared to matrices with the same rank and total element count.

Language modeling · MoDELS · Integration · Performance · Stride ·

December 4, 2023

Exchange-of-Thought: Enhancing Large Language Model Capabilities through Cross-Model Communication

Zhangyue Yin, Qiushi Sun, Cheng Chang, Qipeng Guo, Junqi Dai, Xuanjing Huang, Xipeng Qiu

from arxiv, 19 pages, 11 figures, accepted by EMNLP2023

Large Language Models (LLMs) have recently made significant strides in complex reasoning tasks through the Chain-of-Thought technique. Despite this progress, their reasoning is often constrained by their intrinsic understanding, lacking external insights. To address this, we propose Exchange-of-Thought (EoT), a novel framework that enables cross-model communication during problem-solving. Drawing inspiration from network topology, EoT integrates four unique communication paradigms: Memory, Report, Relay, and Debate. This paper delves into the communication dynamics and volume associated with each paradigm. To counterbalance the risks of incorrect reasoning chains, we implement a robust confidence evaluation mechanism within these communications. Our experiments across diverse complex reasoning tasks demonstrate that EoT significantly surpasses established baselines, underscoring the value of external insights in enhancing LLM performance. Furthermore, we show that EoT achieves these superior results in a cost-effective manner, marking a promising advancement for efficient and collaborative AI problem-solving.

SR · SimPLe · MoDELS · Estimation/Estimator · Correlation coefficient ·

December 3, 2023

CEScore: Simple and Efficient Confidence Estimation Model for Evaluating Split and Rephrase

AlMotasem Bellah Al Ajlouni, Jinlong Li

The split and rephrase (SR) task aims to divide a long, complex sentence into a set of shorter, simpler sentences that convey the same meaning. This challenging problem in NLP has gained increased attention recently because of its benefits as a pre-processing step in other NLP tasks. Evaluating the quality of SR is challenging, as there is no automatic metric fit to evaluate this task. In this work, we introduce CEScore, a novel statistical model to automatically evaluate the SR task. By mimicking the way humans evaluate SR, CEScore provides 4 metrics (Sscore, Gscore, Mscore, and CEscore) to assess simplicity, grammaticality, meaning preservation, and overall quality, respectively. In experiments with 26 models, CEScore correlates strongly with human evaluations, achieving a Spearman correlation of 0.98 at the model level. This underscores the potential of CEScore as a simple and effective metric for assessing the overall quality of SR models.
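A model-level Spearman correlation, as reported in the abstract above, compares how an automatic metric and human judges rank the same set of systems. The following is a minimal sketch of that statistic only, not of CEScore itself; the helper names `average_ranks` and `spearman` are hypothetical, and any scores passed to them would be one value per system.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks of `values`; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)
```

Because only ranks enter the computation, the statistic is insensitive to the scale of the metric: a value near 1 means the metric orders the systems almost exactly as the human evaluators do.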

PULSE · Channel · Block · Shaping · Identifiable ·

December 3, 2023

On Merits of Faster-than-Nyquist Signaling in the Finite Blocklength Regime

Yong Jin Daniel Kim

We identify potential merits of faster-than-Nyquist (FTN) signaling in the finite blocklength (FBL) regime. A unique aspect of FTN signaling is that it can increase the blocklength by packing more data symbols within the same time and frequency to yield a strictly higher number of independent signaling dimensions than Nyquist rate signaling. Using finite-blocklength information theory, we provide tight bounds on the maximum channel coding rate (MCCR) of FTN signaling for any finite time-bandwidth product. The merits are categorized into two operating regions of FTN, i.e., when the time-acceleration factor of FTN, $\tau$, is above or below a certain threshold $\tau_{0}$. When $\tau > \tau_{0}$, FTN has both higher channel capacity and higher MCCR than Nyquist rate signaling, provided the utilized pulse shape is non-sinc. Since the issues associated with the ideal sinc pulse only get exacerbated when packets are short, the benefit of FTN becomes more significant in the FBL regime. On the other hand, when $\tau < \tau_{0}$, the channel capacity is fixed but the MCCR of FTN can continue to increase to a certain degree, thereby reducing the gap between the capacity and MCCR. This benefit is present regardless of the utilized pulse shape, including the ideal sinc pulse, and is unique to the FBL regime. Instead of increasing MCCR for fixed block error rates, FTN can alternatively lower the block error rates for fixed channel coding rates. These results imply that FTN can lower the penalty from limited channel coding over short blocklengths and can improve the performance and reliability of short packet communications.

Learning · Machine Learning · MoDELS · ML · state-of-the-art ·

December 1, 2023

Machine Learning for Actionable Warning Identification: A Comprehensive Survey

Xiuting Ge, Chunrong Fang, Xuanye Li, Weisong Sun, Daoyuan Wu, Juan Zhai, Shangwei Lin, Zhihong Zhao, Yang Liu, Zhenyu Chen

Actionable Warning Identification (AWI) plays a crucial role in improving the usability of static code analyzers. With recent advances in Machine Learning (ML), various approaches have been proposed to incorporate ML techniques into AWI. These ML-based AWI approaches, benefiting from ML's strong ability to learn subtle and previously unseen patterns from historical data, have demonstrated superior performance. However, a comprehensive overview of these approaches is missing, which could hinder researchers/practitioners from understanding the current process and discovering potential for future improvement in the ML-based AWI community. In this paper, we systematically review the state-of-the-art ML-based AWI approaches. First, we employ a meticulous survey methodology and gather 50 primary studies from 2000/01/01 to 2023/09/01. Then, we outline the typical ML-based AWI workflow, including warning dataset preparation, preprocessing, AWI model construction, and evaluation stages. In such a workflow, we categorize ML-based AWI approaches based on the warning output format. Besides, we analyze the techniques used in each stage, along with their strengths, weaknesses, and distribution. Finally, we provide practical research directions for future ML-based AWI approaches, focusing on aspects like data improvement (e.g., enhancing the warning labeling strategy) and model exploration (e.g., exploring large language models for AWI).

MoDELS · Machine Learning · Learning · Linear · Linear model ·

September 6, 2021

A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning

Yehuda Dar, Vidya Muthukumar, Richard G. Baraniuk

The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge the longstanding dogma of the field. One of the most important riddles is the good empirical generalization of overparameterized models. Overparameterized models are excessively complex with respect to the size of the training dataset, which results in them perfectly fitting (i.e., interpolating) the training data, which is usually noisy. Such interpolation of noisy data is traditionally associated with detrimental overfitting, and yet a wide range of interpolating models -- from simple linear models to deep neural networks -- have recently been observed to generalize extremely well on fresh test data. Indeed, the recently discovered double descent phenomenon has revealed that highly overparameterized models often improve over the best underparameterized model in test performance. Understanding learning in this overparameterized regime requires new theory and foundational empirical studies, even for the simplest case of the linear model. The underpinnings of this understanding have been laid in very recent analyses of overparameterized linear regression and related statistical learning tasks, which resulted in precise analytic characterizations of double descent. This paper provides a succinct overview of this emerging theory of overparameterized ML (henceforth abbreviated as TOPML) that explains these recent findings through a statistical signal processing perspective. We emphasize the unique aspects that define the TOPML research area as a subfield of modern ML theory and outline interesting open questions that remain.

MoDELS · Performer · Processing (programming language) · Learning · Robustness ·

September 3, 2021

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

Paul Michel

from arxiv, PhD thesis

The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction based question answering or machine translation). However, it builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and propose benchmarks and evaluation metrics to measure its effect on current deep learning architectures. We then proceed to take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we demonstrate that these approaches yield more robust models on a selection of realistic problems. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule which alleviates catastrophic forgetting issues during adaptation.

Convolutional neural networks · Neural Networks · Performer · Seven · Processing (programming language) ·

January 17, 2019

A Survey of the Recent Architectures of Deep Convolutional Neural Networks

Asifullah Khan, Anabia Sohail, Umme Zahoora, Aqsa Saeed Qureshi

from arxiv, 60 pages, 11 figures, 1 table

Deep Convolutional Neural Networks (CNNs) are a special type of Neural Networks, which have shown state-of-the-art results on various competitive benchmarks. The powerful learning ability of deep CNNs is largely achieved through multiple non-linear feature extraction stages that can automatically learn hierarchical representations from the data. The availability of a large amount of data and improvements in hardware processing units have accelerated research in CNNs, and very interesting deep CNN architectures have recently been reported. The recent race in deep CNN architectures for achieving high performance on challenging benchmarks has shown that innovative architectural ideas, as well as parameter optimization, can improve CNN performance on various vision-related tasks. In this regard, different ideas in CNN design have been explored, such as the use of different activation and loss functions, parameter optimization, regularization, and restructuring of processing units. However, the major improvement in representational capacity is achieved by restructuring the processing units. In particular, the idea of using a block as a structural unit instead of a layer is gaining substantial appreciation. This survey thus focuses on the intrinsic taxonomy present in the recently reported CNN architectures and, consequently, classifies the recent innovations in CNN architectures into seven different categories. These seven categories are based on spatial exploitation, depth, multi-path, width, feature map exploitation, channel boosting, and attention. Additionally, it covers the elementary understanding of CNN components and sheds light on the current challenges and applications of CNNs.