Inspired by a mathematical riddle involving fuses, we define the "fusible numbers" as follows: $0$ is fusible, and whenever $x,y$ are fusible with $|y-x|<1$, the number $(x+y+1)/2$ is also fusible. We prove that the set of fusible numbers, ordered by the usual order on $\mathbb R$, is well-ordered, with order type $\varepsilon_0$. Furthermore, we prove that the density of the fusible numbers along the real line grows at an incredibly fast rate: Letting $g(n)$ be the largest gap between consecutive fusible numbers in the interval $[n,\infty)$, we have $g(n)^{-1} \ge F_{\varepsilon_0}(n-c)$ for some constant $c$, where $F_\alpha$ denotes the fast-growing hierarchy. Finally, we derive some true statements that can be formulated but not proven in Peano Arithmetic, of a different flavor than previously known such statements: PA cannot prove the true statement "For every natural number $n$ there exists a smallest fusible number larger than $n$." Also, consider the algorithm "$M(x)$: if $x<0$ return $-x$, else return $M(x-M(x-1))/2$." Then $M$ terminates on real inputs, although PA cannot prove the statement "$M$ terminates on all natural inputs."
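The recursion for $M$ quoted above can be transcribed directly. The following is a minimal sketch, assuming exact rational arithmetic via Python's `fractions.Fraction` (a choice made here so that no rounding occurs; the abstract does not prescribe a representation). It is only evaluated on very small inputs, since the growth rates stated in the abstract suggest that larger inputs quickly become infeasible.

```python
from fractions import Fraction


def M(x: Fraction) -> Fraction:
    """Literal transcription of: if x < 0 return -x, else return M(x - M(x-1)) / 2."""
    if x < 0:
        return -x
    return M(x - M(x - 1)) / 2


# Small inputs only; do not try M(3) this way.
for n in range(3):
    print(n, M(Fraction(n)))
```

For the inputs 0, 1 and 2 this returns 1/2, 1/8 and 1/1024 respectively, which can be checked by hand by unrolling the recursion.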

Related content

Serialization · Vision · Performance · Computer Vision · Minimum Point

April 20, 2022

Review of Serial and Parallel Min-Cut/Max-Flow Algorithms for Computer Vision

Patrick M. Jensen, Niels Jeppesen, Anders B. Dahl, Vedrana A. Dahl

from arxiv, 20 pages, 13 figures, accepted for publication at T-PAMI

Minimum cut/maximum flow (min-cut/max-flow) algorithms solve a variety of problems in computer vision and thus significant effort has been put into developing fast min-cut/max-flow algorithms. As a result, it is difficult to choose an ideal algorithm for a given problem. Furthermore, parallel algorithms have not been thoroughly compared. In this paper, we evaluate the state-of-the-art serial and parallel min-cut/max-flow algorithms on the largest set of computer vision problems yet. We focus on generic algorithms, i.e., for unstructured graphs, but also compare with the specialized GridCut implementation. When applicable, GridCut performs best. Otherwise, the two pseudoflow algorithms, Hochbaum pseudoflow and excesses incremental breadth-first search, achieve the overall best performance. The most memory-efficient implementation tested is the Boykov-Kolmogorov algorithm. Amongst generic parallel algorithms, we find the bottom-up merging approach by Liu and Sun to be best, but no method is dominant. Of the generic parallel methods, only the parallel preflow push-relabel algorithm is able to efficiently scale with many processors across problem sizes, and no generic parallel method consistently outperforms serial algorithms. Finally, we provide and evaluate strategies for algorithm selection to obtain good expected performance. We make our dataset and implementations publicly available for further research.
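For readers unfamiliar with the underlying problem, the sketch below shows a textbook augmenting-path solver (Edmonds-Karp, i.e. shortest augmenting paths found by breadth-first search). It is not one of the implementations evaluated in the paper and is far slower than them; it only illustrates what a min-cut/max-flow solver returns. The function name `max_flow_min_cut` and the dense capacity-matrix input are assumptions made for this illustration.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Edmonds-Karp on a dense capacity matrix.

    Returns (flow value, set of nodes on the source side of a minimum cut).
    """
    n = len(capacity)
    residual = [row[:] for row in capacity]  # residual capacities
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            break  # no augmenting path left: the flow is maximum
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount along the path.
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck
    # Nodes reached by the final (failed) search form the source side of a min cut.
    source_side = {v for v in range(n) if parent[v] != -1}
    return flow, source_side


# Toy graph: node 0 is the source, node 3 is the sink.
toy = [
    [0, 3, 2, 0],
    [0, 0, 1, 2],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
]
print(max_flow_min_cut(toy, 0, 3))
```

In vision applications the source side and sink side of the cut typically correspond to the two labels of a binary segmentation.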

Estimation/Estimator · Sample · Statistic · Oracle · INTERACT

April 19, 2022

Making Progress Based on False Discoveries

Roi Livni

We consider the question of adaptive data analysis within the framework of convex optimization. We ask how many samples are needed in order to compute $\epsilon$-accurate estimates of $O(1/\epsilon^2)$ gradients queried by gradient descent, and we provide two intermediate answers to this question. First, we show that for a general analyst (not necessarily gradient descent) $\Omega(1/\epsilon^3)$ samples are required. This rules out the possibility of a foolproof mechanism. Our construction builds upon a new lower bound (that may be of interest in its own right) for an analyst that may ask several non-adaptive questions in a batch of fixed and known $T$ rounds of adaptivity and requires a fraction of true discoveries. We show that for such an analyst $\Omega (\sqrt{T}/\epsilon^2)$ samples are necessary. Second, we show that, under certain assumptions on the oracle, in an interaction with gradient descent $\tilde \Omega(1/\epsilon^{2.5})$ samples are necessary. Our assumptions are that the oracle has only \emph{first order access} and is \emph{post-hoc generalizing}. First order access means that it can only compute the gradients of the sampled function at points queried by the algorithm. Our assumption of \emph{post-hoc generalization} follows from existing lower bounds for statistical queries. More generally, we provide a generic reduction from the standard setting of statistical queries to the problem of estimating gradients queried by gradient descent. These results are in contrast with classical bounds that show that with $O(1/\epsilon^2)$ samples one can optimize the population risk to accuracy of $O(\epsilon)$ but, as it turns out, with spurious gradients.

Functional · Mutually Independent · Precision/Accuracy · contrastive · MASS

April 18, 2022

Low Degree Testing over the Reals

Vipul Arora, Arnab Bhattacharyya, Noah Fleming, Esty Kelman, Yuichi Yoshida

We study the problem of testing whether a function $f: \mathbb{R}^n \to \mathbb{R}$ is a polynomial of degree at most $d$ in the \emph{distribution-free} testing model. Here, the distance between functions is measured with respect to an unknown distribution $\mathcal{D}$ over $\mathbb{R}^n$ from which we can draw samples. In contrast to previous work, we do not assume that $\mathcal{D}$ has finite support. We design a tester that given query access to $f$, and sample access to $\mathcal{D}$, makes $(d/\varepsilon)^{O(1)}$ many queries to $f$, accepts with probability $1$ if $f$ is a polynomial of degree $d$, and rejects with probability at least $2/3$ if every degree-$d$ polynomial $P$ disagrees with $f$ on a set of mass at least $\varepsilon$ with respect to $\mathcal{D}$. Our result also holds under mild assumptions when we receive only a polynomial number of bits of precision for each query to $f$, or when $f$ can only be queried on rational points representable using a logarithmic number of bits. Along the way, we prove a new stability theorem for multivariate polynomials that may be of independent interest.

Reducible · Hamming Distance · Functional · Precision/Accuracy · Decomposable

April 15, 2022

Towards a Stronger Theory for Permutation-based Evolutionary Algorithms

Benjamin Doerr, Yassine Ghannane, Marouane Ibn Brahim

from arxiv, To appear in the proceedings of GECCO 2022. This version contains the proofs omitted in the proceedings version for reasons of space

While the theoretical analysis of evolutionary algorithms (EAs) has made significant progress for pseudo-Boolean optimization problems in the last 25 years, only sporadic theoretical results exist on how EAs solve permutation-based problems. To overcome the lack of permutation-based benchmark problems, we propose a general way to transfer the classic pseudo-Boolean benchmarks into benchmarks defined on sets of permutations. We then conduct a rigorous runtime analysis of the permutation-based $(1+1)$ EA proposed by Scharnow, Tinnefeld, and Wegener (2004) on the analogues of the \textsc{LeadingOnes} and \textsc{Jump} benchmarks. The latter shows that, different from bit-strings, it is not only the Hamming distance that determines how difficult it is to mutate a permutation $\sigma$ into another one $\tau$, but also the precise cycle structure of $\sigma \tau^{-1}$. For this reason, we also regard the more symmetric scramble mutation operator. We observe that it not only leads to simpler proofs, but also reduces the runtime on jump functions with odd jump size by a factor of $\Theta(n)$. Finally, we show that a heavy-tailed version of the scramble operator, as in the bit-string case, leads to a speed-up of order $m^{\Theta(m)}$ on jump functions with jump size $m$.
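The distinction between Hamming distance and cycle structure can be made concrete with a small sketch. It assumes permutations are stored as tuples over $0,\dots,n-1$ and that $\sigma\tau^{-1}$ means "apply $\tau^{-1}$ first, then $\sigma$"; both conventions, and the helper names `hamming` and `cycle_type`, are choices made here rather than taken from the paper.

```python
def hamming(sigma, tau):
    """Number of positions at which the two permutations differ."""
    return sum(s != t for s, t in zip(sigma, tau))


def cycle_type(sigma, tau):
    """Sorted lengths of the non-trivial cycles of sigma composed with tau inverse."""
    n = len(sigma)
    tau_inv = [0] * n
    for i, t in enumerate(tau):
        tau_inv[t] = i
    pi = [sigma[tau_inv[i]] for i in range(n)]  # pi = sigma after tau^{-1}
    seen = [False] * n
    lengths = []
    for start in range(n):
        length = 0
        j = start
        while not seen[j]:
            seen[j] = True
            j = pi[j]
            length += 1
        if length > 1:  # ignore fixed points and already-visited elements
            lengths.append(length)
    return sorted(lengths, reverse=True)


identity = (0, 1, 2, 3)
one_four_cycle = (1, 2, 3, 0)
two_swaps = (1, 0, 3, 2)

print(hamming(one_four_cycle, identity), cycle_type(one_four_cycle, identity))
print(hamming(two_swaps, identity), cycle_type(two_swaps, identity))
```

Both example permutations differ from the identity in all four positions, yet one consists of a single 4-cycle and the other of two 2-cycles, which is the kind of difference the abstract says affects mutation difficulty.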

Extensibility · Scenario · Origin · Paper

April 15, 2022

Subset Sum in $O(n^{16}\log(n))$

Rion Tolchin

from arxiv, 7 pages, expands on Section 5.4 to replace and simplify large portions of the algorithm

This extensive revision of my paper "Description of an $O(\text{poly}(n))$ Algorithm for NP-Complete Combinatorial Problems" will dramatically simplify the content of the original paper by solving subset-sum instead of $3$-SAT. I will first define the "product-derivative" method, which will be used to generate a system of equations for solving for unknown polynomial coefficients. Then I will describe the "Dragonfly" algorithm, which can be used to solve subset-sum in $O(n^{16}\log(n))$ and which is itself composed of a set of symbolic algebra steps on monic polynomials that convert a subset, $S_T$, of a set of positive integers, $S$, with a given target sum, $T$, into a polynomial with roots corresponding to the elements of $S_T$.

INFORMS · Linear · Processing (programming language) · Functional · Feasible

April 15, 2022

Linear Programs with Polynomial Coefficients and Applications to 1D Cellular Automata

Guy Bresler, Chenghao Guo, Yury Polyanskiy

Given a matrix $A$ and vector $b$ with polynomial entries in $d$ real variables $\delta=(\delta_1,\ldots,\delta_d)$ we consider the following notion of feasibility: the pair $(A,b)$ is locally feasible if there exists an open neighborhood $U$ of $0$ such that for every $\delta\in U$ there exists $x$ satisfying $A(\delta)x\ge b(\delta)$ entry-wise. For $d=1$ we construct a polynomial time algorithm for deciding local feasibility. For $d \ge 2$ we show local feasibility is NP-hard. As an application (which was the primary motivation for this work) we give a computer-assisted proof of ergodicity of the following elementary 1D cellular automaton: given the current state $\eta_t \in \{0,1\}^{\mathbb{Z}}$ the next state $\eta_{t+1}(n)$ at each vertex $n\in \mathbb{Z}$ is obtained by $\eta_{t+1}(n)= \text{NAND}\big(\text{BSC}_\delta(\eta_t(n-1)), \text{BSC}_\delta(\eta_t(n))\big)$. Here the binary symmetric channel $\text{BSC}_\delta$ takes a bit as input and flips it with probability $\delta$ (and leaves it unchanged with probability $1-\delta$). We also consider the problem of broadcasting information on the 2D-grid of noisy binary-symmetric channels $\text{BSC}_\delta$, where each node may apply an arbitrary processing function to its input bits. We prove that there exists $\delta_0'>0$ such that for all noise levels $0<\delta<\delta_0'$ it is impossible to broadcast information for any processing function, as conjectured in Makur, Mossel, Polyanskiy (ISIT 2021).
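The update rule of the noisy NAND automaton can be simulated in a few lines. This is a minimal sketch under one loud assumption: the infinite lattice $\mathbb{Z}$ is replaced by a finite ring with periodic boundary, so it illustrates the dynamics only and says nothing about ergodicity. Each input to the NAND gate passes through its own independent channel use; the names `bsc`, `nand` and `step` are illustrative.

```python
import random


def bsc(bit, delta, rng):
    """Binary symmetric channel: flip `bit` with probability `delta`."""
    return bit ^ 1 if rng.random() < delta else bit


def nand(a, b):
    """NAND of two bits."""
    return 1 - (a & b)


def step(state, delta, rng):
    """One synchronous update on a ring (assumption: periodic boundary).

    Cell n reads its left neighbour n-1 and itself, each through a fresh BSC.
    """
    n = len(state)
    return [
        nand(bsc(state[i - 1], delta, rng), bsc(state[i], delta, rng))
        for i in range(n)
    ]


rng = random.Random(0)
state = [rng.randint(0, 1) for _ in range(16)]
for _ in range(5):
    state = step(state, 0.05, rng)
    print("".join(map(str, state)))
```

With `delta = 0` the rule reduces to the deterministic NAND of each cell with its left neighbour, which gives a convenient sanity check.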

Functional · Representation · Normalized · Closed Form · Linear

April 14, 2022

On the representation of non-holonomic univariate power series

Bertrand Teguia Tabuguia, Wolfram Koepf

from arxiv, 20 pages; 26 references. Update: revised version

Holonomic functions play an essential role in Computer Algebra since they allow the application of many symbolic algorithms. Among all algorithmic attempts to find formulas for power series, the holonomic property remains the most important requirement to be satisfied by the function under consideration. The targeted functions mainly summarize that of meromorphic functions. However, expressions like $\tan(z)$, $z/(\exp(z)-1)$, $\sec(z)$, etc., particularly, reciprocals, quotients and compositions of holonomic functions, are generally not holonomic. Therefore their power series are inaccessible by the holonomic framework. From the mathematical dictionaries, one can observe that most of the known closed-form formulas of non-holonomic power series involve another sequence whose evaluation depends on some finite summations. In the case of $\tan(z)$ and $\sec(z)$ the corresponding sequences are the Bernoulli and Euler numbers, respectively. Thus providing a symbolic approach that yields complete representations when linear summations for power series coefficients of non-holonomic functions appear, might be seen as a step forward towards the representation of non-holonomic power series. By adapting the method of ansatz with undetermined coefficients, we build an algorithm that computes least-order quadratic differential equations with polynomial coefficients for a large class of non-holonomic functions. A differential equation resulting from this procedure is converted into a recurrence equation by applying the Cauchy product formula and rewriting powers into polynomials and derivatives into shifts. Finally, using enough initial values we are able to give normal form representations to characterize several non-holonomic power series and prove non-trivial identities. We discuss this algorithm and its implementation for Maple 2022.
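The conversion step described above (quadratic differential equation to recurrence via the Cauchy product) can be illustrated on $\tan(z)$. The sketch takes the well-known equation $y' = 1 + y^2$ with $y(0) = 0$ as given by hand; it does not reproduce the authors' algorithm for finding such an equation, nor their Maple implementation. Writing $y = \sum_n a_n z^n$, comparing coefficients of $z^n$ gives $(n+1)\,a_{n+1} = [n = 0] + \sum_{k=0}^{n} a_k a_{n-k}$.

```python
from fractions import Fraction


def tan_series(order):
    """Taylor coefficients a_0..a_order of tan(z), from y' = 1 + y^2, y(0) = 0."""
    a = [Fraction(0)]  # a_0 = tan(0) = 0
    for n in range(order):
        # Cauchy product: coefficient of z^n in y^2.
        conv = sum(a[k] * a[n - k] for k in range(n + 1))
        constant = 1 if n == 0 else 0  # the "1" in 1 + y^2 only affects z^0
        a.append((conv + constant) / (n + 1))
    return a


for n, coeff in enumerate(tan_series(7)):
    print(n, coeff)
```

The odd-index coefficients come out as 1, 1/3, 2/15 and 17/315, matching the familiar expansion of $\tan(z)$, while all even-index coefficients vanish.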

Identifiable · GROUP · Integration · MoDELS · Statistic

April 14, 2022

A general framework for identification of permissible variable subsets and development of structured variable selection methods

Guanbo Wang, Mireille E. Schnitzer, Tom Chen, Rui Wang, Robert W. Platt

In variable selection, a selection rule that prescribes the permissible sets of selected variables (called a "selection dictionary") is desirable due to the inherent structural constraints among the candidate variables. The methods that can incorporate such restrictions can improve model interpretability and prediction accuracy. Penalized regression can integrate selection rules by assigning the coefficients to different groups and then applying penalties to the groups. However, no general framework has been proposed to formalize selection rules and their applications. In this work, we establish a framework for structured variable selection that can incorporate universal structural constraints. We develop a mathematical language for constructing arbitrary selection rules, where the selection dictionary is formally defined. We show that all selection rules can be represented as a combination of operations on constructs, which can be used to identify the related selection dictionary. One may then apply some criteria to select the best model. We show that the theoretical framework can help to identify the grouping structure in existing penalized regression methods. In addition, we formulate structured variable selection into mixed-integer optimization problems which can be solved by existing software. Finally, we discuss the significance of the framework in the context of statistics.

Learned · Gaussian Distribution · UniFormer · Pair · contrastive

April 14, 2022

Testing distributional assumptions of learning algorithms

Ronitt Rubinfeld, Arsen Vasilyan

There are many important high dimensional function classes that have fast agnostic learning algorithms when strong assumptions on the distribution of examples can be made, such as Gaussianity or uniformity over the domain. But how can one be sufficiently confident that the data indeed satisfies the distributional assumption, so that one can trust in the output quality of the agnostic learning algorithm? We propose a model by which to systematically study the design of tester-learner pairs $(\mathcal{A},\mathcal{T})$, such that if the distribution on examples in the data passes the tester $\mathcal{T}$ then one can safely trust the output of the agnostic learner $\mathcal{A}$ on the data. To demonstrate the power of the model, we apply it to the classical problem of agnostically learning halfspaces under the standard Gaussian distribution and present a tester-learner pair with a combined run-time of $n^{\tilde{O}(1/\epsilon^4)}$. This qualitatively matches that of the best known ordinary agnostic learning algorithms for this task. In contrast, finite sample Gaussian distribution testers do not exist for the $L_1$ and EMD distance measures. A key step in the analysis is a novel characterization of concentration and anti-concentration properties of a distribution whose low-degree moments approximately match those of a Gaussian. We also use tools from polynomial approximation theory. In contrast, we show strong lower bounds on the combined run-times of tester-learner pairs for the problems of agnostically learning convex sets under the Gaussian distribution and for monotone Boolean functions under the uniform distribution over $\{0,1\}^n$. Through these lower bounds we exhibit natural problems where there is a dramatic gap between standard agnostic learning run-time and the run-time of the best tester-learner pair.

Sample · Class · Loss · Performer · SimPLe

January 16, 2019

Class-Balanced Loss Based on Effective Number of Samples

Yin Cui, Menglin Jia, Tsung-Yi Lin, Yang Song, Serge Belongie

from arxiv, Code is available at: //github.com/richardaecn/class-balanced-loss

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
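The effective-number formula is simple enough to evaluate directly. The sketch below computes $(1-\beta^{n})/(1-\beta)$ per class and then derives per-class weights. Two points are assumptions not stated in the abstract: weights are taken inversely proportional to the effective number, and they are rescaled to sum to the number of classes. The function names are illustrative and do not refer to the linked repository.

```python
def effective_number(n, beta):
    """Effective number of samples (1 - beta**n) / (1 - beta), for beta in [0, 1)."""
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(counts, beta):
    """Per-class loss weights.

    Assumptions: weight is proportional to 1 / effective number, and the
    weights are normalised so that they sum to the number of classes.
    """
    inverse = [1.0 / effective_number(n, beta) for n in counts]
    scale = len(counts) / sum(inverse)
    return [w * scale for w in inverse]


counts = [1000, 100, 10]  # a toy long-tailed class distribution
for beta in (0.0, 0.9, 0.99, 0.999):
    print(beta, [round(w, 3) for w in class_balanced_weights(counts, beta)])
```

At $\beta = 0$ every class has effective number 1, so all weights are equal (no re-weighting); as $\beta$ approaches 1 the effective number approaches $n$ and the weights approach inverse class frequency.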