
Given a set of $n$ points in the Euclidean plane, the $k$-MinSumRadius problem asks to cover this point set using $k$ disks with the objective of minimizing the sum of the radii of the disks. After a long line of research on related problems, it was finally discovered that this problem admits a polynomial time algorithm [GKKPV~'12]; however, the running time of this algorithm is $O(n^{881})$, so its relevance is mostly theoretical. A practically and structurally interesting special case of the $k$-MinSumRadius problem is that of small $k$. For the $2$-MinSumRadius problem, a near-quadratic time algorithm with expected running time $O(n^2 \log^2 n \log^2 \log n)$ was given over 30 years ago [Eppstein~'92]. We present the first improvement of this result, namely, a near-linear time algorithm for the $2$-MinSumRadius problem that runs in expected $O(n \log^2 n \log^2 \log n)$ time. We generalize this result to any constant dimension $d$, for which we give an $O(n^{2-1/(\lceil d/2\rceil + 1) + \varepsilon})$ time algorithm. Additionally, we give a near-quadratic time algorithm for $3$-MinSumRadius in the plane that runs in expected $O(n^2 \log^2 n \log^2 \log n)$ time. All of these algorithms rely on insights that uncover a surprisingly simple structure of optimal solutions: one can specify a linear number of candidate lines such that, in an optimal solution, one of them separates one of the clusters from the remaining clusters.
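To make the separation structure concrete, here is a minimal brute-force sketch in Python: it tries candidate separating lines, splits the points by side, and sums the radii of the two minimum enclosing disks (computed with Welzl's algorithm). For simplicity it uses all $O(n^2)$ lines spanned by point pairs rather than the linear-size family from the paper, giving a cubic-time illustration; the function names and the candidate family are our assumptions, not the paper's algorithm.

```python
import math
import random
from itertools import combinations

def _circle_two(a, b):
    # smallest circle with a and b on the boundary (diameter circle)
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, math.dist(a, b) / 2)

def _circle_three(a, b, c):
    # circumcircle of a, b, c; None if the points are (nearly) collinear
    d = 2 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        return None
    sa, sb, sc = a[0]**2 + a[1]**2, b[0]**2 + b[1]**2, c[0]**2 + c[1]**2
    ux = (sa * (b[1] - c[1]) + sb * (c[1] - a[1]) + sc * (a[1] - b[1])) / d
    uy = (sa * (c[0] - b[0]) + sb * (a[0] - c[0]) + sc * (b[0] - a[0])) / d
    return (ux, uy, math.dist((ux, uy), a))

def _inside(c, p, eps=1e-9):
    return math.dist((c[0], c[1]), p) <= c[2] + eps

def min_enclosing_disk(points):
    # Welzl's randomized incremental algorithm, expected linear time
    pts = list(points)
    random.shuffle(pts)
    c = (pts[0][0], pts[0][1], 0.0)
    for i, p in enumerate(pts):
        if _inside(c, p):
            continue
        c = (p[0], p[1], 0.0)
        for j, q in enumerate(pts[:i]):
            if _inside(c, q):
                continue
            c = _circle_two(p, q)
            for r in pts[:j]:
                if not _inside(c, r):
                    c = _circle_three(p, q, r) or c
    return c

def two_min_sum_radius_bruteforce(points):
    # one disk may cover everything while the other degenerates to radius 0
    best = min_enclosing_disk(points)[2]
    for a, b in combinations(points, 2):
        nx, ny = b[1] - a[1], a[0] - b[0]   # normal of the line through a, b
        left = [p for p in points if (p[0] - a[0]) * nx + (p[1] - a[1]) * ny <= 0]
        right = [p for p in points if (p[0] - a[0]) * nx + (p[1] - a[1]) * ny > 0]
        if left and right:
            best = min(best, min_enclosing_disk(left)[2]
                             + min_enclosing_disk(right)[2])
    return best
```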

Related content

This paper discusses two approaches to the diachronic normalization of Polish texts: a rule-based solution that relies on a set of handcrafted patterns, and a neural normalization model based on the text-to-text transfer transformer architecture. The training and evaluation data prepared for the task are discussed in detail, along with experiments conducted to compare the proposed normalization solutions. A quantitative and qualitative analysis is presented. It is shown that, at the current stage of inquiry into the problem, the rule-based solution outperforms the neural one on 3 out of 4 variants of the prepared dataset, although in practice both approaches have distinct advantages and disadvantages.

The length-constrained cycle partition problem (LCCP) is a graph optimization problem in which a set of nodes must be partitioned into a minimum number of cycles. Every node is associated with a critical time, and the length of every cycle must not exceed the critical time of any node in the cycle. We formulate LCCP as a set partitioning model and solve it using an exact branch-and-price approach. We use a dynamic programming-based pricing algorithm to generate improving cycles, exploiting the particular structure of the pricing problem for efficient bidirectional search and symmetry breaking. Computational results show that the LP relaxation of the set partitioning model produces strong dual bounds and that our branch-and-price method improves significantly over the state of the art. It solves previously closed instances in a fraction of the time needed before and closes 13 previously unsolved instances, one of which has 76 nodes, a notable improvement over the previous limit of 52 nodes.
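As a concrete illustration of the set partitioning model (though not of the branch-and-price machinery), the sketch below enumerates all feasible cycles of a tiny instance explicitly and solves the resulting integer program with PuLP; in the actual method such cycles would be generated on the fly by the pricing algorithm. The instance data and the size cap on enumerated cycles are invented for illustration.

```python
import math
from itertools import combinations, permutations
import pulp  # pip install pulp

# toy instance: node -> (x, y, critical time); all data invented
nodes = {0: (0, 0, 9), 1: (3, 0, 9), 2: (3, 4, 14), 3: (0, 4, 14), 4: (8, 2, 7)}
dist = {(i, j): math.dist(nodes[i][:2], nodes[j][:2]) for i in nodes for j in nodes}

def cycle_length(tour):
    return sum(dist[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))

# enumerate feasible cycles: the best tour over a subset must not exceed the
# critical time of any node in it (singletons form trivial zero-length cycles)
cycles = []
for size in range(1, 4):                      # cap at size 3 to stay tiny
    for subset in combinations(nodes, size):
        best = 0.0 if size == 1 else min(
            cycle_length((subset[0],) + p) for p in permutations(subset[1:]))
        if best <= min(nodes[v][2] for v in subset):
            cycles.append(subset)

# set partitioning: pick a minimum number of cycles covering each node exactly once
prob = pulp.LpProblem("LCCP", pulp.LpMinimize)
x = [pulp.LpVariable(f"x{c}", cat="Binary") for c in range(len(cycles))]
prob += pulp.lpSum(x)
for v in nodes:
    prob += pulp.lpSum(x[c] for c, s in enumerate(cycles) if v in s) == 1
prob.solve(pulp.PULP_CBC_CMD(msg=0))
print([cycles[c] for c in range(len(cycles)) if x[c].value() == 1])
```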

We study a cooperative multi-agent bandit setting in the distributed GOSSIP model: in every round, each of $n$ agents chooses an action from a common set, observes the action's corresponding reward, and subsequently exchanges information with a single randomly chosen neighbor, which informs its policy in the next round. We introduce and analyze several families of fully-decentralized local algorithms in this setting under the constraint that each agent has only constant memory. We highlight a connection between the global evolution of such decentralized algorithms and a new class of "zero-sum" multiplicative weights update methods, and we develop a general framework for analyzing the population-level regret of these natural protocols. Using this framework, we derive sublinear regret bounds for both stationary and adversarial reward settings. Moreover, we show that these simple local algorithms can approximately optimize convex functions over the simplex, assuming that the reward distributions are generated from a stochastic gradient oracle.
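The following toy simulation conveys the flavor of such constant-memory gossip protocols (it is not one of the paper's algorithms): each agent stores only its current action, plays it, and after observing rewards copies the action of a uniformly random partner when the partner did better, with a small exploration rate. All parameter values are illustrative.

```python
import random

def gossip_bandit_sim(n_agents=1000, n_actions=5, rounds=500, explore=0.02, seed=0):
    rng = random.Random(seed)
    means = [rng.random() for _ in range(n_actions)]      # stationary Bernoulli arms
    actions = [rng.randrange(n_actions) for _ in range(n_agents)]  # one action id = constant memory
    best, regret = max(means), 0.0
    for _ in range(rounds):
        rewards = [1.0 if rng.random() < means[a] else 0.0 for a in actions]
        regret += sum(best - means[a] for a in actions) / n_agents  # per-agent regret this round
        prev = actions[:]
        for i in range(n_agents):
            j = rng.randrange(n_agents)          # single uniformly chosen gossip partner
            if rng.random() < explore:
                actions[i] = rng.randrange(n_actions)
            elif rewards[j] > rewards[i]:
                actions[i] = prev[j]             # imitate a better-performing partner
    return regret, means, actions

total_regret, means, final = gossip_bandit_sim()
print(f"avg per-round regret: {total_regret / 500:.3f}; "
      f"fraction on best arm: {final.count(means.index(max(means))) / len(final):.2f}")
```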

The Strong Exponential Hierarchy $SEH$ was shown by Hemachandra to collapse to $P^{NExp}$, by proving $P^{NExp} = NP^{NExp}$ via a census argument. Nonetheless, Hemachandra also asked for certificate-based and alternating Turing machine characterizations of the levels of $SEH$, in the hope that these might reveal deeper structural reasons behind the collapse. These open questions have thus far remained unanswered. To close them, building upon the notion of Hausdorff reductions, we investigate a natural normal form for the intermediate levels of the (generalized) exponential hierarchies, i.e., the single-exponential hierarchy, the double-exponential hierarchy, and so on. Although the two characterizations asked for follow from our Hausdorff characterization, it is the latter that uncovers a surprising structural reason behind the collapse of $SEH$, as a consequence of a very general result: the intermediate levels of the exponential hierarchies are precisely characterized by specific "Hausdorff classes", which define these levels without resorting to oracle machines. Hence, in contrast to oracle classes, which may have different shapes for the same class (e.g., $P^{NP}_{||} = P^{NP[Log]} = LogSpace^{NP}$), the intermediate levels of the hierarchies are univocally identified by Hausdorff classes (under the hypothesis that the hierarchies do not collapse). In fact, we show that the rather simple reason behind many equivalences of oracle classes is that they merely refer to different ways of deciding the languages of the same Hausdorff class; this happens also for $P^{NExp}$ and $NP^{NExp}$. In addition, via Hausdorff classes, we define complete problems for various intermediate levels of the exponential hierarchies. Through these, we obtain matching lower bounds for problems known to be in $P^{NExp[Log]}$ whose hardness was left open due to the lack of known $P^{NExp[Log]}$-complete problems.
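For readers unfamiliar with the underlying notion, the following is the classical Hausdorff (difference hierarchy) scheme that this characterization builds on; the paper's Hausdorff classes for the exponential hierarchies instantiate this pattern with suitable base classes, so the block below is standard background, not the paper's exact definition.

```latex
% A language $L$ lies in the $k$-th Hausdorff class over a base class
% $\mathcal{C}$ if there are languages $D_1 \supseteq D_2 \supseteq \dots
% \supseteq D_k$, all in $\mathcal{C}$, such that
\[
  x \in L
  \iff
  \max\bigl(\{0\} \cup \{\, i : x \in D_i \,\}\bigr) \text{ is odd},
\]
% or, equivalently, such that
\[
  L \;=\; (D_1 \setminus D_2) \,\cup\, (D_3 \setminus D_4) \,\cup\, \cdots
\]
```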

We conduct a systematic study of the approximation properties of the Transformer for sequence modeling with long, sparse, and complicated memory. We investigate the mechanisms through which different components of the Transformer, such as dot-product self-attention, positional encoding, and the feed-forward layer, affect its expressive power, and we study their combined effects by establishing explicit approximation rates. Our study reveals the roles of critical parameters in the Transformer, such as the number of layers and the number of attention heads, and these insights also provide natural suggestions for alternative architectures.
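A minimal numerical sketch of the two components most relevant to these approximation results, single-head dot-product self-attention and sinusoidal positional encoding, written in plain NumPy (the weight shapes and toy sizes are arbitrary choices for illustration):

```python
import numpy as np

def scaled_dot_product_attention(X, Wq, Wk, Wv):
    """Single-head dot-product self-attention on a sequence X of shape (T, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # (T, T) pairwise interactions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V

def sinusoidal_positional_encoding(T, d):
    """The standard absolute positional encoding (Vaswani et al., 2017)."""
    pos = np.arange(T)[:, None]
    i = np.arange(d)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

# toy usage: attention applied to a positionally-encoded random sequence
rng = np.random.default_rng(0)
T, d = 8, 16
X = rng.normal(size=(T, d)) + sinusoidal_positional_encoding(T, d)
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
out = scaled_dot_product_attention(X, Wq, Wk, Wv)    # shape (8, 16)
```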

The solution to empirical risk minimization with $f$-divergence regularization (ERM-$f$DR) is presented under mild conditions on $f$. Under such conditions, the optimal measure is shown to be unique. Examples of the solution for particular choices of the function $f$ are presented. Previously known solutions to common regularization choices are obtained by leveraging the flexibility of the family of $f$-divergences. These include the unique solutions to empirical risk minimization with relative entropy regularization (Type-I and Type-II). The analysis of the solution unveils the following properties of $f$-divergences when used in the ERM-$f$DR problem: $(i)$ $f$-divergence regularization forces the support of the solution to coincide with the support of the reference measure, which introduces a strong inductive bias that dominates the evidence provided by the training data; and $(ii)$ any $f$-divergence regularization is equivalent to a different $f$-divergence regularization with an appropriate transformation of the empirical risk function.
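For context, a common way to write the ERM-$f$DR problem (the notation here is assumed, since the abstract does not fix it) is the following; property $(i)$ above is then immediate, because the $f$-divergence is defined through the density of $P$ with respect to the reference measure $Q$:

```latex
\[
  \min_{P \ll Q} \;\; \int \mathsf{L}(\theta)\, \mathrm{d}P(\theta)
  \;+\; \lambda\, D_f(P \| Q),
  \qquad
  D_f(P \| Q) \;=\; \int f\!\left(\frac{\mathrm{d}P}{\mathrm{d}Q}\right) \mathrm{d}Q,
\]
% where $\mathsf{L}$ is the empirical risk induced by the training data and
% $\lambda > 0$ is the regularization factor; the constraint $P \ll Q$ makes
% explicit that any feasible (hence the optimal) measure is supported within
% the support of $Q$.
```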

Let $S_d(n)$ denote the minimum number of wires of a depth-$d$ (unbounded fan-in) circuit encoding an error-correcting code $C:\{0, 1\}^n \to \{0, 1\}^{32n}$ with distance at least $4n$. G\'{a}l, Hansen, Kouck\'{y}, Pudl\'{a}k, and Viola [IEEE Trans. Inform. Theory 59(10), 2013] proved that $S_d(n) = \Theta_d(\lambda_d(n)\cdot n)$ for any fixed $d \ge 3$. By improving their construction and analysis, we prove $S_d(n)= O(\lambda_d(n)\cdot n)$. Letting $d = \alpha(n)$, a version of the inverse Ackermann function, we obtain circuits of linear size. This depth $\alpha(n)$ is the minimum possible to within an additive constant of 2; we credit the nearly-matching depth lower bound to G\'{a}l et al., since it follows directly from their method (although it is not explicitly claimed or fully verified in that work); the bound is obtained by making some constants explicit in a graph-theoretic lemma of Pudl\'{a}k [Combinatorica, 14(2), 1994] and extending it to super-constant depths. We also study a subclass of MDS codes $C: \mathbb{F}^n \to \mathbb{F}^m$ characterized by the Hamming-distance relation $\mathrm{dist}(C(x), C(y)) \ge m - \mathrm{dist}(x, y) + 1$ for any distinct $x, y \in \mathbb{F}^n$. (For linear codes this is equivalent to the generator matrix being totally invertible.) We call these superconcentrator-induced codes, and we show their tight connection with superconcentrators. Specifically, we observe that any linear or nonlinear circuit encoding a superconcentrator-induced code must be a superconcentrator graph, and any superconcentrator graph can be converted to a linear circuit, over a sufficiently large field (exponential in the size of the graph), encoding a superconcentrator-induced code.
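The defining distance relation is easy to check by brute force on a toy linear code. The sketch below (illustrative; the field, sizes, and matrix choice are ours) uses a small Cauchy matrix over GF(7), which is totally invertible, and verifies $\mathrm{dist}(C(x), C(y)) \ge m - \mathrm{dist}(x, y) + 1$ via linearity, i.e., $\mathrm{wt}(Gz) \ge m - \mathrm{wt}(z) + 1$ for all nonzero $z$:

```python
from itertools import product

P = 7                                  # small prime field GF(7)
def inv(a):
    return pow(a, P - 2, P)            # Fermat inverse, a != 0 mod P

# m x n Cauchy matrix over GF(7): G[i][j] = 1 / (x_i - y_j); Cauchy matrices
# are totally invertible (every square submatrix is nonsingular)
xs, ys = [0, 1, 2, 3], [4, 5]
m, n = len(xs), len(ys)
G = [[inv((xi - yj) % P) for yj in ys] for xi in xs]

def encode(z):                         # C(z) = G z over GF(7)
    return [sum(G[i][j] * z[j] for j in range(n)) % P for i in range(m)]

def wt(v):                             # Hamming weight
    return sum(1 for a in v if a % P != 0)

# check dist(C(x), C(y)) >= m - dist(x, y) + 1 for all distinct x, y;
# by linearity this reduces to wt(Gz) >= m - wt(z) + 1 for all z != 0
for z in product(range(P), repeat=n):
    if any(z):
        assert wt(encode(list(z))) >= m - wt(list(z)) + 1
print("superconcentrator-induced distance relation verified for all nonzero z")
```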

We consider the problem of finding a geodesic disc of smallest radius containing at least $k$ points from a set of $n$ points in a simple polygon that has $m$ vertices, $r$ of which are reflex vertices. We refer to such a disc as a SKEG disc. We present an algorithm to compute a SKEG disc using higher-order geodesic Voronoi diagrams with worst-case time $O(k^{2} n + k^{2} r + \min(kr, r(n-k)) + m)$ ignoring polylogarithmic factors. We then present two $2$-approximation algorithms that find a geodesic disc containing at least $k$ points whose radius is at most twice that of a SKEG disc. The first algorithm computes a $2$-approximation with high probability in $O((n^{2} / k) \log n \log r + m)$ worst-case time with $O(n + m)$ space. The second algorithm runs in $O(n \log^{2} n \log r + m)$ expected time using $O(n + m)$ expected space, independent of $k$. Note that the first algorithm is faster when $k \in \omega(n / \log n)$.
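For intuition about where the factor 2 comes from, here is the folklore Euclidean (no polygon) analogue of such a 2-approximation, not the paper's geodesic algorithms: if an optimal disc of radius $r^*$ covers $k$ points, then any covered input point is within $2r^*$ of all of them, so centering a disc at an input point and taking its $k$-th nearest-neighbor distance loses at most a factor 2 in the radius.

```python
import math

def skeg_2approx_euclidean(points, k):
    """2-approximation for the smallest disc containing >= k of the points,
    in the plane without obstacles: center at an input point and use its
    k-th nearest-neighbor distance (the point itself counts). O(n^2 log n).
    """
    best_center, best_r = None, math.inf
    for p in points:
        dists = sorted(math.dist(p, q) for q in points)  # includes dist(p, p) = 0
        r = dists[k - 1]            # radius reaching p's k-th nearest point
        if r < best_r:
            best_center, best_r = p, r
    return best_center, best_r
```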

This paper leverages \emph{Gram iteration}, an efficient, deterministic, and differentiable method for computing the spectral norm with an upper bound guarantee. While the Gram iteration was designed for circular convolutional layers, we generalize it to zero-padding convolutional layers and prove its quadratic convergence. We also provide theorems that bridge the gap between the spectral norms of circular and zero-padding convolutions. We design a \emph{spectral rescaling} that can be used as a competitive $1$-Lipschitz layer that enhances network robustness. As demonstrated through experiments, our method outperforms state-of-the-art techniques in precision, computational cost, and scalability. The code of the experiments is available at //github.com/blaisedelattre/lip4conv.
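The core of the Gram iteration is easy to state for a dense matrix: squaring the Gram matrix doubles the exponent of the spectral norm while the Frobenius norm remains a valid upper bound, so the overestimation factor shrinks doubly exponentially (quadratic convergence). The following is a minimal dense-matrix sketch with the rescaling needed for numerical stability; the paper's contribution concerns applying this to circular and zero-padding convolutional layers, which this sketch does not cover.

```python
import numpy as np

def gram_specnorm_bound(W, t=8):
    """Upper bound on the spectral norm sigma_max(W) via Gram iteration.

    Uses sigma_max(W)^(2^t) = sigma_max(G_t) <= ||G_t||_F, where
    G_1 = W W^T and G_{s+1} = G_s G_s^T; each step factors out the
    Frobenius norm to avoid overflow and tracks it in log space."""
    G = W @ W.T
    log_scale = 0.0
    for _ in range(t - 1):
        f = np.linalg.norm(G)            # Frobenius norm of the current iterate
        G = G / f
        log_scale = 2.0 * (log_scale + np.log(f))
        G = G @ G                        # G is symmetric PSD, so G @ G = G @ G.T
    log_bound = (log_scale + np.log(np.linalg.norm(G))) / 2.0 ** t
    return float(np.exp(log_bound))

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 128))
print(gram_specnorm_bound(W), np.linalg.norm(W, 2))  # bound is >= and quickly tight
```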

We investigate the complexity of solving stable, or perturbation-resilient, instances of $k$-Means and $k$-Median clustering in fixed-dimension Euclidean metrics (and, more generally, doubling metrics). The notion of stable (perturbation-resilient) instances was introduced by Bilu and Linial [2010] and Awasthi et al. [2012]. In our context, a $k$-Means instance is $\alpha$-stable if there is a unique optimal solution that remains optimal when distances are (non-uniformly) stretched by a factor of at most $\alpha$. Stable clustering instances have been studied to explain why heuristics such as Lloyd's algorithm perform well in practice. In this work we show that, for any fixed $\epsilon>0$, $(1+\epsilon)$-stable instances of $k$-Means in doubling metrics can be solved in polynomial time. More precisely, we show that a natural multiswap local search algorithm finds the optimal solution for $(1+\epsilon)$-stable instances of $k$-Means and $k$-Median in a polynomial number of iterations. We complement this result by showing that, under a new PCP theorem, this is essentially tight: when the dimension $d$ is part of the input, there is a fixed $\epsilon_0>0$ such that there is not even a PTAS for $(1+\epsilon_0)$-stable $k$-Means in $\mathbb{R}^d$ unless NP=RP. To do this, we consider a robustness property of CSPs: call an instance stable if there is a unique optimal solution $x^*$ and, for any other solution $x'$, the number of unsatisfied clauses is proportional to the Hamming distance between $x^*$ and $x'$. Dinur et al. have already shown that stable QSAT is hard to approximate for some constant Q; our hypothesis is simply that stable QSAT with bounded variable occurrence is also hard. Given this hypothesis, we use "stability-preserving" reductions to prove our hardness result for stable $k$-Means. Such reductions seem to be more fragile than standard L-reductions and may be of further use in demonstrating that other stable optimization problems are hard.
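As a concrete reference point, here is a plain multiswap local search for the discrete variant of $k$-Means in which centers are restricted to input points (a common simplification; the paper analyzes multiswap local search on stable instances, but this sketch makes no stability assumptions and all details are fixed arbitrarily for illustration):

```python
import itertools
import random

def kmeans_cost(points, centers):
    """k-Means objective: sum of squared distances to the closest center."""
    return sum(min((px - cx) ** 2 + (py - cy) ** 2 for (cx, cy) in centers)
               for (px, py) in points)

def multiswap_local_search(points, k, swap_size=1, seed=0):
    """Repeatedly try swapping `swap_size` current centers for non-centers
    and keep any swap that lowers the cost, until no swap improves."""
    rng = random.Random(seed)
    centers = set(rng.sample(range(len(points)), k))
    cost = kmeans_cost(points, [points[i] for i in centers])
    improved = True
    while improved:
        improved = False
        for outs in itertools.combinations(sorted(centers), swap_size):
            for ins in itertools.combinations(
                    [i for i in range(len(points)) if i not in centers], swap_size):
                cand = (centers - set(outs)) | set(ins)
                c = kmeans_cost(points, [points[i] for i in cand])
                if c < cost - 1e-12:
                    centers, cost, improved = cand, c, True
                    break
            if improved:
                break
    return [points[i] for i in centers], cost
```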
