Computing the matching statistics of a string $P[1..m]$ with respect to a text $T[1..n]$ is a fundamental problem with applications to genome sequence comparison. In this paper, we study the problem of computing matching statistics on highly repetitive texts. We design three different data structures that are similar to LZ-compressed indexes. The space cost of each can be measured by $\gamma$, the size of the smallest string attractor [STOC'2018], and $\delta$, a better measure of repetitiveness [LATIN'2020].
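
For orientation, the standard definition can be made concrete: MS[i] is the length of the longest prefix of P[i..m] that occurs somewhere in T. The sketch below is a naive baseline in plain Python, not one of the three compressed data structures mentioned in the abstract. The function name `matching_statistics` and the 0-based indexing are assumptions made for illustration.

```python
def matching_statistics(pattern: str, text: str) -> list[int]:
    """Return MS, where MS[i] is the length of the longest prefix of
    pattern[i:] that occurs as a substring of text (0-based indices)."""
    ms = []
    length = 0
    for i in range(len(pattern)):
        # If pattern[i-1 : i-1+L] occurs in text, then pattern[i : i+L-1]
        # occurs too, so MS[i] >= MS[i-1] - 1 and we can resume from there.
        length = max(length - 1, 0)
        # Extend the match one character at a time while it still occurs.
        while (i + length < len(pattern)
               and pattern[i:i + length + 1] in text):
            length += 1
        ms.append(length)
    return ms


print(matching_statistics("bcab", "abcxab"))
```

The substring test `in text` rescans the text each time, which is exactly the cost an index over a repetitive text is meant to avoid.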

Related content

April 20, 2022

Fast Circular Pattern Matching

Will Solow, Matthew Barich, Brendan Mumey

The Exact Circular Pattern Matching (ECPM) problem consists of reporting every occurrence of a rotation of a pattern $P$ in a text $T$. In many real-world applications, specifically in computational biology, circular rotations are of interest because of their prominence in virus DNA. Thus, given no restrictions on pre-processing time, how quickly all such circular rotation occurrences can be reported is of interest to many areas of study. We highlight, to the best of our knowledge, a novel approach to the ECPM problem and present four data structures that accompany this approach, each with its own time-space trade-offs, in addition to experimental results to determine the most computationally feasible data structure.
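
To make the problem statement concrete, here is a naive baseline that reports every text position where some rotation of the pattern starts. It is not the approach or any of the four data structures from the abstract. The function name `circular_occurrences` is hypothetical, and the code relies only on the fact that every rotation of P is a length-m substring of P concatenated with itself.

```python
def circular_occurrences(pattern: str, text: str) -> list[int]:
    """Return the 0-based start positions in text where any rotation
    of pattern occurs (naive baseline, no preprocessing of text)."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    doubled = pattern + pattern
    # Each rotation of pattern is doubled[k : k + m] for some 0 <= k < m.
    rotations = {doubled[k:k + m] for k in range(m)}
    # Slide a window of length m over the text and test set membership.
    return [i for i in range(len(text) - m + 1)
            if text[i:i + m] in rotations]


print(circular_occurrences("abc", "xbcaabcx"))
```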

April 20, 2022

Theoretical analysis of edit distance algorithms: an applied perspective

Paul Medvedev

Given its status as a classic problem and its importance to both theoreticians and practitioners, edit distance provides an excellent lens through which to understand how the theoretical analysis of algorithms impacts practical implementations. From an applied perspective, the goals of theoretical analysis are to predict the empirical performance of an algorithm and to serve as a yardstick to design novel algorithms that perform well in practice. In this paper, we systematically survey the types of theoretical analysis techniques that have been applied to edit distance and evaluate the extent to which each one has achieved these two goals. These techniques include traditional worst-case analysis, worst-case analysis parametrized by edit distance or entropy or compressibility, average-case analysis, semi-random models, and advice-based models. We find that the track record is mixed. On one hand, two algorithms widely used in practice have been born out of theoretical analysis and their empirical performance is captured well by theoretical predictions. On the other hand, all the algorithms developed using theoretical analysis as a yardstick since then have not had any practical relevance. We conclude by discussing the remaining open problems and how they can be tackled.
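
As a point of reference for the worst-case analysis mentioned above, the textbook dynamic program for edit distance (unit-cost insertions, deletions, and substitutions) runs in time proportional to the product of the two string lengths. The abstract does not name the algorithms it surveys, so the sketch below is only the classic baseline, with an illustrative function name.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit (Levenshtein) distance via the classic
    row-by-row dynamic program, keeping only two rows in memory."""
    # prev[j] holds the distance between the empty prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))
```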

April 20, 2022

Quantity vs Quality: Investigating the Trade-Off between Sample Size and Label Reliability

Timo Bertram, Johannes Fürnkranz, Martin Müller

From arXiv: preliminary work under review for ICML 2022

In this paper, we study learning in probabilistic domains where the learner may receive incorrect labels but can improve the reliability of labels by repeatedly sampling them. In such a setting, one faces the problem of whether the fixed budget for obtaining training examples should be used for obtaining all different examples or for improving the label quality of a smaller number of examples by re-sampling their labels. We motivate this problem in an application to compare the strength of poker hands where the training signal depends on the hidden community cards, and then study it in depth in an artificial setting where we insert controlled noise levels into the MNIST database. Our results show that with increasing levels of noise, resampling previous examples becomes increasingly important compared with obtaining new examples, as classifier performance deteriorates when the number of incorrect labels is too high. In addition, we propose two different validation strategies: switching from lower to higher validations over the course of training, and using chi-square statistics to approximate the confidence in obtained labels.
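
The budget trade-off can be illustrated with a deliberately simplified model that is an assumption of this note, not the setting of the paper: labels are binary, each sampled label is independently correct with probability p, and the final label is the majority vote over an odd number k of samples. The helper name `majority_correct_prob` is hypothetical.

```python
from math import comb


def majority_correct_prob(p: float, k: int) -> float:
    """Probability that the majority vote over k independent label
    samples is correct, when each sample is correct with probability p.
    Assumes binary labels and odd k, so ties cannot occur."""
    # Sum the binomial probabilities of more than k/2 correct samples.
    return sum(comb(k, j) * p**j * (1 - p)**(k - j)
               for j in range(k // 2 + 1, k + 1))


# Under a fixed budget, spending 3 samples on one example buys a more
# reliable label at the price of seeing one third as many examples.
for k in (1, 3, 5):
    print(k, majority_correct_prob(0.7, k))
```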

April 20, 2022

Isogeometric Analysis of Acoustic Scattering with Perfectly Matched Layers (IGAPML)

Jon Vegard Venås, Trond Kvamsdal

The perfectly matched layer (PML) formulation is a prominent way of handling radiation problems in unbounded domains and has gained interest due to its simple implementation in finite element codes. However, its simplicity can be advanced further using the isogeometric framework. This work presents a spline-based PML formulation which avoids additional coordinate transformation, as the formulation is based on the same space in which the numerical solution is sought. The procedure can be automated for any convex artificial boundary. This removes restrictions on the domain construction using PML and can therefore reduce computational cost and improve mesh quality. The usage of spline basis functions with higher continuity also improves the accuracy of the numerical solution.

April 19, 2022

On the Coexistence of Stability and Incentive Compatibility in Fractional Matchings

Shivika Narang, Y Narahari

Stable matchings have been studied extensively in the social choice literature. The focus has been mostly on integral matchings, in which the nodes on the two sides are wholly matched. A fractional matching, which is a convex combination of integral matchings, is a natural extension of integral matchings. The topic of stability of fractional matchings has started receiving attention only very recently. Further, incentive compatibility in the context of fractional matchings has received very little attention. With this as the backdrop, our paper studies the important topic of incentive compatibility of mechanisms to find stable fractional matchings. We work with preferences expressed in the form of cardinal utilities. Our first result is an impossibility result: there are matching instances for which no mechanism that produces a stable fractional matching can be incentive compatible or even approximately incentive compatible. This provides the motivation to seek special classes of matching instances for which there exist incentive compatible mechanisms that produce stable fractional matchings. Our study leads to a class of matching instances that admit unique stable fractional matchings. We first show that a unique stable fractional matching for a matching instance exists if and only if the given matching instance satisfies the conditional mutual first preference (CMFP) property. To this end, we provide a polynomial-time algorithm that makes ingenious use of envy-graphs to find a non-integral stable matching whenever the preferences are strict and the given instance is not a CMFP matching instance. For this class of CMFP matching instances, we prove that every mechanism that produces the unique stable fractional matching is (a) incentive compatible and further (b) resistant to coalitional manipulations.

April 18, 2022

Modx: Binary Level Partial Imported Third-Party Library Detection through Program Modularization and Semantic Matching

Can Yang, Zhengzi Xu, Hongxu Chen, Yang Liu, Xiaorui Gong, Baoxu Liu

With the rapid growth of software, using third-party libraries (TPLs) has become increasingly popular. The widespread use of libraries has given software engineers a handful of methods to facilitate and speed up program development. Unfortunately, it also poses great challenges, as it becomes much more difficult to manage the large volume of libraries. Studies have been proposed to detect and understand the TPLs in software. However, most existing approaches rely on syntactic features, which are not robust when these features are changed or deliberately hidden by adversarial parties. Moreover, these approaches typically model each imported library as a whole and therefore cannot be applied to scenarios where the host software only partially uses the library code segments. To detect both fully and partially imported TPLs at the semantic level, we propose ModX, a framework that leverages novel program modularization techniques to decompose the program into fine-grained functionality-based modules. By extracting both syntactic and semantic features, it measures the distance between modules to detect similar library module reuse in the program. Experimental results show that ModX outperforms other modularization tools by distinguishing more coherent program modules with 353% higher module quality scores and beats other TPL detection tools with, on average, 17% better precision and 8% better recall.

April 18, 2022

Optimal Conformal Prediction for Small Areas

Elizabeth Bersson, Peter D. Hoff

From arXiv: 24 pages, 9 figures

Existing inferential methods for small area data involve a trade-off between maintaining area-level frequentist coverage rates and improving inferential precision via the incorporation of indirect information. In this article, we propose a method to obtain an area-level prediction region for a future observation which mitigates this trade-off. The proposed method takes a conformal prediction approach in which the conformity measure is the posterior predictive density of a working model that incorporates indirect information. The resulting prediction region has guaranteed frequentist coverage regardless of the working model, and, if the working model assumptions are accurate, the region has minimum expected volume compared to other regions with the same coverage rate. When constructed under a normal working model, we prove such a prediction region is an interval and construct an efficient algorithm to obtain the exact interval. We illustrate the performance of our method through simulation studies and an application to EPA radon survey data.
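
For readers unfamiliar with conformal prediction, the sketch below shows the simpler split-conformal variant with absolute residuals as the conformity score. This is a swapped-in technique chosen for brevity: the method in the abstract uses the posterior predictive density of a working model as the conformity measure, which is not reproduced here. The function name and arguments are illustrative assumptions.

```python
import math


def split_conformal_interval(cal_residuals, prediction, alpha=0.1):
    """Split-conformal prediction interval around a point prediction.

    cal_residuals: absolute errors |y - y_hat| on a held-out
                   calibration set (assumed exchangeable with new data).
    Returns (lower, upper) with coverage at least 1 - alpha.
    """
    n = len(cal_residuals)
    # Rank of the calibration residual used as the interval half-width.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for the requested coverage level.
        return (-math.inf, math.inf)
    half_width = sorted(cal_residuals)[rank - 1]
    return (prediction - half_width, prediction + half_width)


print(split_conformal_interval([3, 1, 2], prediction=10, alpha=0.5))
```

The coverage guarantee holds regardless of how the point predictions were produced, which mirrors the "regardless of the working model" property described in the abstract.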

April 17, 2022

Optimization of Scoring Rules

Jason D. Hartline, Yingkai Li, Liren Shan, Yifan Wu

This paper introduces an objective for optimizing proper scoring rules. The objective is to maximize the increase in payoff of a forecaster who exerts a binary level of effort to refine a posterior belief from a prior belief. In this framework we characterize optimal scoring rules in simple settings, give efficient algorithms for computing optimal scoring rules in complex settings, and identify simple scoring rules that are approximately optimal. In comparison, standard scoring rules in theory and practice -- for example the quadratic rule, scoring rules for the expectation, and scoring rules for multiple tasks that are averages of single-task scoring rules -- can be very far from optimal.
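
As background on what "proper" means for the quadratic rule mentioned above, the sketch below checks numerically that, for a binary event, a forecaster maximizes expected quadratic score by reporting the true belief. It does not implement the optimization objective of the paper. The function names and the grid search are assumptions made for illustration.

```python
def quadratic_score(report: float, outcome: int) -> float:
    """Quadratic (Brier-style) score for a binary outcome in {0, 1};
    higher is better."""
    return 1.0 - (outcome - report) ** 2


def expected_score(report: float, belief: float) -> float:
    """Expected score of a report when the event truly has
    probability `belief`."""
    return (belief * quadratic_score(report, 1)
            + (1 - belief) * quadratic_score(report, 0))


def best_report(belief: float, grid: int = 101) -> float:
    """Report on an evenly spaced grid over [0, 1] that maximizes
    the expected quadratic score."""
    candidates = [k / (grid - 1) for k in range(grid)]
    return max(candidates, key=lambda q: expected_score(q, belief))


print(best_report(0.3))
```

Algebraically, the expected score equals a constant minus (report - belief)^2, so the truthful report is the unique maximizer.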

April 16, 2022

Computationally Efficient and Statistically Optimal Robust Low-rank Matrix Estimation

Yinan Shen, Jingyang Li, Jian-Feng Cai, Dong Xia

Low-rank matrix estimation under heavy-tailed noise is challenging, both computationally and statistically. Convex approaches have been proven statistically optimal but suffer from high computational costs, especially since robust loss functions are usually non-smooth. More recently, computationally fast non-convex approaches via sub-gradient descent have been proposed, which, unfortunately, fail to deliver a statistically consistent estimator even under sub-Gaussian noise. In this paper, we introduce a novel Riemannian sub-gradient (RsGrad) algorithm which is not only computationally efficient with linear convergence but also statistically optimal, be the noise Gaussian or heavy-tailed. Convergence theory is established for a general framework, and specific applications to absolute loss, Huber loss, and quantile loss are investigated. Compared with existing non-convex methods, ours reveals a surprising phenomenon of dual-phase convergence. In phase one, RsGrad behaves as in a typical non-smooth optimization that requires gradually decaying stepsizes. However, phase one only delivers a statistically sub-optimal estimator, as already observed in the existing literature. Interestingly, during phase two, RsGrad converges linearly as if minimizing a smooth and strongly convex objective function, and thus a constant stepsize suffices. Underlying the phase-two convergence is the smoothing effect of random noise on the non-smooth robust losses in an area close but not too close to the truth. Lastly, RsGrad is applicable for low-rank tensor estimation under heavy-tailed noise where a statistically optimal rate is attainable with the same phenomenon of dual-phase convergence, and a novel shrinkage-based second-order moment method is guaranteed to deliver a warm initialization. Numerical simulations confirm our theoretical discovery and showcase the superiority of RsGrad over prior methods.

April 15, 2022

Investigation of Bare-bones Algorithms from Quantum Perspective: A Quantum Dynamical Global Optimizer

Peng Wang, Gang Xin, Fang Wang

From arXiv: the paper may provide a new quantum perspective for studying bare-bones intelligence algorithms

In recent decades, the emergence of numerous novel algorithms has made it a gimmick to propose an intelligent optimization system based on a metaphor, and has hindered researchers from exploring the essence of search behavior in algorithms. However, it is difficult to directly discuss the search behavior of an intelligent optimization algorithm, since there are so many kinds of intelligent schemes. To address this problem, an intelligent optimization system is regarded as a simulated physical optimization system in this paper. The dynamic search behavior of such a simplified physical optimization system is investigated with quantum theory. To achieve this goal, the Schroedinger equation is employed as the dynamics equation of the optimization algorithm, which is used to describe dynamic search behavior in the evolution process with quantum theory. Moreover, to explore the basic behavior of the optimization system, the optimization problem is assumed to be decomposed and approximated. Correspondingly, the basic search behavior is derived, which constitutes the basic iterative process of a simple optimization system. The basic iterative process is compared with some classical bare-bones schemes to verify the similarity of search behavior under different metaphors. The search strategies of these bare-bones algorithms are analyzed through experiments.