亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We consider the Low Rank Approximation problem, where the input consists of a matrix $A \in \mathbb{R}^{n_R \times n_C}$ and an integer $k$, and the goal is to find a matrix $B$ of rank at most $k$ that minimizes $\| A - B \|_0$, which is the number of entries where $A$ and $B$ differ. For any constant $k$ and $\varepsilon > 0$, we present a polynomial time $(1 + \varepsilon)$-approximation time for this problem, which significantly improves the previous best $poly(k)$-approximation. Our algorithm is obtained by viewing the problem as a Constraint Satisfaction Problem (CSP) where each row and column becomes a variable that can have a value from $\mathbb{R}^k$. In this view, we have a constraint between each row and column, which results in a {\em dense} CSP, a well-studied topic in approximation algorithms. While most of previous algorithms focus on finite-size (or constant-size) domains and involve an exhaustive enumeration over the entire domain, we present a new framework that bypasses such an enumeration in $\mathbb{R}^k$. We also use tools from the rich literature of Low Rank Approximation in different objectives (e.g., $\ell_p$ with $p \in (0, \infty)$) or domains (e.g., finite fields/generalized Boolean). We believe that our techniques might be useful to study other real-valued CSPs and matrix optimization problems. On the hardness side, when $k$ is part of the input, we prove that Low Rank Approximation is NP-hard to approximate within a factor of $\Omega(\log n)$. This is the first superconstant NP-hardness of approximation for any $p \in [0, \infty]$ that does not rely on stronger conjectures (e.g., the Small Set Expansion Hypothesis).

相關內容

We present a $p$-adic algorithm to recover the lexicographic Gr\"obner basis $\mathcal G$ of an ideal in $\mathbb Q[x,y]$ with a generating set in $\mathbb Z[x,y]$, with a complexity that is less than cubic in terms of the dimension of $\mathbb Q[x,y]/\langle \mathcal G \rangle$ and softly linear in the height of its coefficients. We observe that previous results of Lazard's that use Hermite normal forms to compute Gr\"obner bases for ideals with two generators can be generalized to a set of $t\in \mathbb N^+$ generators. We use this result to obtain a bound on the height of the coefficients of $\mathcal G$, and to control the probability of choosing a \textit{good} prime $p$ to build the $p$-adic expansion of $\mathcal G$.

Federated Learning (FL) has emerged as a promising distributed learning paradigm that enables multiple clients to learn a global model collaboratively without sharing their private data. However, the effectiveness of FL is highly dependent on the quality of the data that is being used for training. In particular, data heterogeneity issues, such as label distribution skew and feature skew, can significantly impact the performance of FL. Previous studies in FL have primarily focused on addressing label distribution skew data heterogeneity, while only a few recent works have made initial progress in tackling feature skew issues. Notably, these two forms of data heterogeneity have been studied separately and have not been well explored within a unified FL framework. To address this gap, we propose Fed-CO$_{2}$, a universal FL framework that handles both label distribution skew and feature skew within a \textbf{C}ooperation mechanism between the \textbf{O}nline and \textbf{O}ffline models. Specifically, the online model learns general knowledge that is shared among all clients, while the offline model is trained locally to learn the specialized knowledge of each individual client. To further enhance model cooperation in the presence of feature shifts, we design an intra-client knowledge transfer mechanism that reinforces mutual learning between the online and offline models, and an inter-client knowledge transfer mechanism to increase the models' domain generalization ability. Extensive experiments show that our Fed-CO$_{2}$ outperforms a wide range of existing personalized federated learning algorithms in terms of handling label distribution skew and feature skew, both individually and collectively. The empirical results are supported by our convergence analyses in a simplified setting.

We propose two approaches, based on Riemannian optimization, for computing a stochastic approximation of the $p$th root of a stochastic matrix $A$. In the first approach, the approximation is found in the Riemannian manifold of positive stochastic matrices. In the second approach, we introduce the Riemannian manifold of positive stochastic matrices sharing with $A$ the Perron eigenvector and we compute the approximation of the $p$th root of $A$ in such a manifold. This way, differently from the available methods based on constrained optimization, $A$ and its $p$th root approximation share the Perron eigenvector. Such a property is relevant, from a modelling point of view, in the embedding problem for Markov chains. The extended numerical experimentation shows that, in the first approach, the Riemannian optimization methods are generally faster and more accurate than the available methods based on constrained optimization. In the second approach, even though the stochastic approximation of the $p$th root is found in a smaller set, the approximation is generally more accurate than the one obtained by standard constrained optimization.

Formal theories of arithmetic have traditionally been based on either classical or intuitionistic logic, leading to the development of Peano and Heyting arithmetic, respectively. We propose a use $\mu$MALL as a formal theory of arithmetic based on linear logic. This formal system is presented as a sequent calculus proof system that extends the standard proof system for multiplicative-additive linear logic (MALL) with the addition of the logical connectives universal and existential quantifiers (first-order quantifiers), term equality and non-equality, and the least and greatest fixed point operators. We first demonstrate how functions defined using $\mu$MALL relational specifications can be computed using a simple proof search algorithm. By incorporating weakening and contraction into $\mu$MALL, we obtain $\mu$LK+, a natural candidate for a classical sequent calculus for arithmetic. While important proof theory results are still lacking for $\mu$LK+ (including cut-elimination and the completeness of focusing), we prove that $\mu$LK+ is consistent and that it contains Peano arithmetic. We also prove two conservativity results regarding $\mu$LK+ over $\mu$MALL.

We consider online reinforcement learning (RL) in episodic Markov decision processes (MDPs) under the linear $q^\pi$-realizability assumption, where it is assumed that the action-values of all policies can be expressed as linear functions of state-action features. This class is known to be more general than linear MDPs, where the transition kernel and the reward function are assumed to be linear functions of the feature vectors. As our first contribution, we show that the difference between the two classes is the presence of states in linearly $q^\pi$-realizable MDPs where for any policy, all the actions have approximately equal values, and skipping over these states by following an arbitrarily fixed policy in those states transforms the problem to a linear MDP. Based on this observation, we derive a novel (computationally inefficient) learning algorithm for linearly $q^\pi$-realizable MDPs that simultaneously learns what states should be skipped over and runs another learning algorithm on the linear MDP hidden in the problem. The method returns an $\epsilon$-optimal policy after $\text{polylog}(H, d)/\epsilon^2$ interactions with the MDP, where $H$ is the time horizon and $d$ is the dimension of the feature vectors, giving the first polynomial-sample-complexity online RL algorithm for this setting. The results are proved for the misspecified case, where the sample complexity is shown to degrade gracefully with the misspecification error.

We study the task of $(\epsilon, \delta)$-differentially private online convex optimization (OCO). In the online setting, the release of each distinct decision or iterate carries with it the potential for privacy loss. This problem has a long history of research starting with Jain et al. [2012] and the best known results for the regime of {\epsilon} not being very small are presented in Agarwal et al. [2023]. In this paper we improve upon the results of Agarwal et al. [2023] in terms of the dimension factors as well as removing the requirement of smoothness. Our results are now the best known rates for DP-OCO in this regime. Our algorithms builds upon the work of [Asi et al., 2023] which introduced the idea of explicitly limiting the number of switches via rejection sampling. The main innovation in our algorithm is the use of sampling from a strongly log-concave density which allows us to trade-off the dimension factors better leading to improved results.

In the semi-streaming model for processing massive graphs, an algorithm makes multiple passes over the edges of a given $n$-vertex graph and is tasked with computing the solution to a problem using $O(n \cdot \text{polylog}(n))$ space. Semi-streaming algorithms for Maximal Independent Set (MIS) that run in $O(\log\log{n})$ passes have been known for almost a decade, however, the best lower bounds can only rule out single-pass algorithms. We close this large gap by proving that the current algorithms are optimal: Any semi-streaming algorithm for finding an MIS with constant probability of success requires $\Omega(\log\log{n})$ passes. This settles the complexity of this fundamental problem in the semi-streaming model, and constitutes one of the first optimal multi-pass lower bounds in this model. We establish our result by proving an optimal round vs communication tradeoff for the (multi-party) communication complexity of MIS. The key ingredient of this result is a new technique, called hierarchical embedding, for performing round elimination: we show how to pack many but small hard $(r-1)$-round instances of the problem into a single $r$-round instance, in a way that enforces any $r$-round protocol to effectively solve all these $(r-1)$-round instances also. These embeddings are obtained via a novel application of results from extremal graph theory -- in particular dense graphs with many disjoint unique shortest paths -- together with a newly designed graph product, and are analyzed via information-theoretic tools such as direct-sum and message compression arguments.

In this paper, we revisit the bilevel optimization problem, in which the upper-level objective function is generally nonconvex and the lower-level objective function is strongly convex. Although this type of problem has been studied extensively, it still remains an open question how to achieve an ${O}(\epsilon^{-1.5})$ sample complexity in Hessian/Jacobian-free stochastic bilevel optimization without any second-order derivative computation. To fill this gap, we propose a novel Hessian/Jacobian-free bilevel optimizer named FdeHBO, which features a simple fully single-loop structure, a projection-aided finite-difference Hessian/Jacobian-vector approximation, and momentum-based updates. Theoretically, we show that FdeHBO requires ${O}(\epsilon^{-1.5})$ iterations (each using ${O}(1)$ samples and only first-order gradient information) to find an $\epsilon$-accurate stationary point. As far as we know, this is the first Hessian/Jacobian-free method with an ${O}(\epsilon^{-1.5})$ sample complexity for nonconvex-strongly-convex stochastic bilevel optimization.

A novel H3N3-2$_\sigma$ interpolation approximation for the Caputo fractional derivative of order $\alpha\in(1,2)$ is derived in this paper, which improves the popular L2C formula with (3-$\alpha$)-order accuracy. By an interpolation technique, the second-order accuracy of the truncation error is skillfully estimated. Based on this formula, a finite difference scheme with second-order accuracy both in time and in space is constructed for the initial-boundary value problem of the time fractional hyperbolic equation. It is well known that the coefficient properties of discrete fractional derivatives are fundamental to the numerical stability of time fractional differential models. We prove the related properties of the coefficients of the H3N3-2$_\sigma$ approximate formula. With these properties, the numerical stability and convergence of the difference scheme is derived immediately by the energy method in the sense of $H^1$-norm. Considering the weak regularity of the solution to the problem at the starting time, a finite difference scheme on the graded meshes based on H3N3-2$_\sigma$ formula is also presented. The numerical simulations are performed to show the effectiveness of the derived finite difference schemes, in which the fast algorithms are employed to speed up the numerical computation.

For a permutation $\pi: [K]\rightarrow [K]$, a sequence $f: \{1,2,\cdots, n\}\rightarrow \mathbb R$ contains a $\pi$-pattern of size $K$, if there is a sequence of indices $(i_1, i_2, \cdots, i_K)$ ($i_1<i_2<\cdots<i_K$), satisfying that $f(i_a)<f(i_b)$ if $\pi(a)<\pi(b)$, for $a,b\in [K]$. Otherwise, $f$ is referred to as $\pi$-free. For the special case where $\pi = (1,2,\cdots, K)$, it is referred to as the monotone pattern. \cite{newman2017testing} initiated the study of testing $\pi$-freeness with one-sided error. They focused on two specific problems, testing the monotone permutations and the $(1,3,2)$ permutation. For the problem of testing monotone permutation $(1,2,\cdots,K)$, \cite{ben2019finding} improved the $(\log n)^{O(K^2)}$ non-adaptive query complexity of \cite{newman2017testing} to $O((\log n)^{\lfloor \log_{2} K\rfloor})$. Further, \cite{ben2019optimal} proposed an adaptive algorithm with $O(\log n)$ query complexity. However, no progress has yet been made on the problem of testing $(1,3,2)$-freeness. In this work, we present an adaptive algorithm for testing $(1,3,2)$-freeness. The query complexity of our algorithm is $O(\epsilon^{-2}\log^4 n)$, which significantly improves over the $O(\epsilon^{-7}\log^{26}n)$-query adaptive algorithm of \cite{newman2017testing}. This improvement is mainly achieved by the proposal of a new structure embedded in the patterns.

北京阿比特科技有限公司