亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We present the first mini-batch algorithm for maximizing a non-negative monotone decomposable submodular function, $F=\sum_{i=1}^N f^i$, under a set of constraints. We improve over the sparsifier based approach both in theory and in practice. We experimentally observe that our algorithm generates solutions that are far superior to those generated by the sparsifier based approach.

相關內容

We study the problem of $(\epsilon,\delta)$-differentially private learning of linear predictors with convex losses. We provide results for two subclasses of loss functions. The first case is when the loss is smooth and non-negative but not necessarily Lipschitz (such as the squared loss). For this case, we establish an upper bound on the excess population risk of $\tilde{O}\left(\frac{\Vert w^*\Vert}{\sqrt{n}} + \min\left\{\frac{\Vert w^* \Vert^2}{(n\epsilon)^{2/3}},\frac{\sqrt{d}\Vert w^*\Vert^2}{n\epsilon}\right\}\right)$, where $n$ is the number of samples, $d$ is the dimension of the problem, and $w^*$ is the minimizer of the population risk. Apart from the dependence on $\Vert w^\ast\Vert$, our bound is essentially tight in all parameters. In particular, we show a lower bound of $\tilde{\Omega}\left(\frac{1}{\sqrt{n}} + {\min\left\{\frac{\Vert w^*\Vert^{4/3}}{(n\epsilon)^{2/3}}, \frac{\sqrt{d}\Vert w^*\Vert}{n\epsilon}\right\}}\right)$. We also revisit the previously studied case of Lipschitz losses [SSTT20]. For this case, we close the gap in the existing work and show that the optimal rate is (up to log factors) $\Theta\left(\frac{\Vert w^*\Vert}{\sqrt{n}} + \min\left\{\frac{\Vert w^*\Vert}{\sqrt{n\epsilon}},\frac{\sqrt{\text{rank}}\Vert w^*\Vert}{n\epsilon}\right\}\right)$, where $\text{rank}$ is the rank of the design matrix. This improves over existing work in the high privacy regime. Finally, our algorithms involve a private model selection approach that we develop to enable attaining the stated rates without a-priori knowledge of $\Vert w^*\Vert$.

Consider the Telephone Broadcast problem in which an input is a connected graph $G$ on $n$ vertices, a source vertex $s \in V(G)$, and a positive integer $t$. The objective is to decide whether there is a broadcast protocol from $s$ that ensures that all the vertices of $G$ get the message in at most $t$ rounds. We consider the broadcast protocol where, in a round, any node aware of the message can forward it to at most one of its neighbors. As the number of nodes aware of the message can at most double at each round, for a non-trivial instance we have $n \le 2^t$. Hence, the brute force algorithm that checks all the permutations of the vertices runs in time $2^{2^{\calO(t)}} \cdot n^{\calO(1)}$. As our first result, we prove this simple algorithm is the best possible in the following sense. Telephone Broadcast does not admit an algorithm running in time $2^{2^{o(t)}} \cdot n^{\calO(1)}$, unless the \ETH\ fails. To the best of our knowledge, this is only the fourth example of \NP-Complete problem that admits a double exponential lower bound when parameterized by the solution size. It also resolves the question by Fomin, Fraigniaud, and Golovach [WG 2023]. In the same article, the authors asked whether the problem is \FPT\ when parameterized by the feedback vertex set number of the graph. We answer this question in the negative. Telephone Broadcast, when restricted to graphs of the feedback vertex number one, and hence treewidth of two, is \NP-\complete. We find this a relatively rare example of problems that admit a polynomial-time algorithm on trees but is \NP-\complete\ on graphs of treewidth two.

We study the problem of maintaining a lightweight bounded-degree $(1+\varepsilon)$-spanner of a dynamic point set in a $d$-dimensional Euclidean space, where $\varepsilon>0$ and $d$ are arbitrary constants. In our fully-dynamic setting, points are allowed to be inserted as well as deleted, and our objective is to maintain a $(1+\varepsilon)$-spanner that has constant bounds on its maximum degree and its lightness (the ratio of its weight to that of the minimum spanning tree), while minimizing the recourse, which is the number of edges added or removed by each point insertion or deletion. We present a fully-dynamic algorithm that handles point insertion with amortized constant recourse and point deletion with amortized $O(\log\Delta)$ recourse, where $\Delta$ is the aspect ratio of the point set.

In 1991, Brenier proved a theorem that generalizes the $QR$ decomposition for square matrices -- factored as PSD $\times$ unitary -- to any vector field $F:\mathbb{R}^d\rightarrow \mathbb{R}^d$. The theorem, known as the polar factorization theorem, states that any field $F$ can be recovered as the composition of the gradient of a convex function $u$ with a measure-preserving map $M$, namely $F=\nabla u \circ M$. We propose a practical implementation of this far-reaching theoretical result, and explore possible uses within machine learning. The theorem is closely related to optimal transport (OT) theory, and we borrow from recent advances in the field of neural optimal transport to parameterize the potential $u$ as an input convex neural network. The map $M$ can be either evaluated pointwise using $u^*$, the convex conjugate of $u$, through the identity $M=\nabla u^* \circ F$, or learned as an auxiliary network. Because $M$ is, in general, not injective, we consider the additional task of estimating the ill-posed inverse map that can approximate the pre-image measure $M^{-1}$ using a stochastic generator. We illustrate possible applications of \citeauthor{Brenier1991PolarFA}'s polar factorization to non-convex optimization problems, as well as sampling of densities that are not log-concave.

The $(k, z)$-Clustering problem in Euclidean space $\mathbb{R}^d$ has been extensively studied. Given the scale of data involved, compression methods for the Euclidean $(k, z)$-Clustering problem, such as data compression and dimension reduction, have received significant attention in the literature. However, the space complexity of the clustering problem, specifically, the number of bits required to compress the cost function within a multiplicative error $\varepsilon$, remains unclear in existing literature. This paper initiates the study of space complexity for Euclidean $(k, z)$-Clustering and offers both upper and lower bounds. Our space bounds are nearly tight when $k$ is constant, indicating that storing a coreset, a well-known data compression approach, serves as the optimal compression scheme. Furthermore, our lower bound result for $(k, z)$-Clustering establishes a tight space bound of $\Theta( n d )$ for terminal embedding, where $n$ represents the dataset size. Our technical approach leverages new geometric insights for principal angles and discrepancy methods, which may hold independent interest.

An \emph{eight-partition} of a finite set of points (respectively, of a continuous mass distribution) in $\mathbb{R}^3$ consists of three planes that divide the space into $8$ octants, such that each open octant contains at most $1/8$ of the points (respectively, of the mass). In 1966, Hadwiger showed that any mass distribution in $\mathbb{R}^3$ admits an eight-partition; moreover, one can prescribe the normal direction of one of the three planes. The analogous result for finite point sets follows by a standard limit argument. We prove the following variant of this result: Any mass distribution (or point set) in $\mathbb{R}^3$ admits an eight-partition for which the intersection of two of the planes is a line with a prescribed direction. Moreover, we present an efficient algorithm for calculating an eight-partition of a set of $n$ points in~$\mathbb{R}^3$ (with prescribed normal direction of one of the planes) in time $O^{*}(n^{5/2})$.

We design a deterministic subexponential time algorithm that takes as input a multivariate polynomial $f$ computed by a constant-depth circuit over rational numbers, and outputs a list $L$ of circuits (of unbounded depth and possibly with division gates) that contains all irreducible factors of $f$ computable by constant-depth circuits. This list $L$ might also include circuits that are spurious: they either do not correspond to factors of $f$ or are not even well-defined, e.g. the input to a division gate is a sub-circuit that computes the identically zero polynomial. The key technical ingredient of our algorithm is a notion of the pseudo-resultant of $f$ and a factor $g$, which serves as a proxy for the resultant of $g$ and $f/g$, with the advantage that the circuit complexity of the pseudo-resultant is comparable to that of the circuit complexity of $f$ and $g$. This notion, which might be of independent interest, together with the recent results of Limaye, Srinivasan and Tavenas, helps us derandomize one key step of multivariate polynomial factorization algorithms - that of deterministically finding a good starting point for Newton Iteration for the case when the input polynomial as well as the irreducible factor of interest have small constant-depth circuits.

We propose principled Gaussian processes (GPs) for modeling functions defined over the edge set of a simplicial 2-complex, a structure similar to a graph in which edges may form triangular faces. This approach is intended for learning flow-type data on networks where edge flows can be characterized by the discrete divergence and curl. Drawing upon the Hodge decomposition, we first develop classes of divergence-free and curl-free edge GPs, suitable for various applications. We then combine them to create \emph{Hodge-compositional edge GPs} that are expressive enough to represent any edge function. These GPs facilitate direct and independent learning for the different Hodge components of edge functions, enabling us to capture their relevance during hyperparameter optimization. To highlight their practical potential, we apply them for flow data inference in currency exchange, ocean currents and water supply networks, comparing them to alternative models.

We introduce a new algorithm for solving unconstrained discrete-time optimal control problems. Our method follows a direct multiple shooting approach, and consists of applying the SQP method together with an $\ell_2$ augmented Lagrangian primal-dual merit function. We use the LQR algorithm to efficiently solve the primal-dual SQP problem. As our algorithm is a specialization of NPSQP (Gill et al. 1992), it inherits its generic properties, including global convergence, fast local convergence, and the lack of need for second order corrections, improving on existing direct multiple shooting approaches such as GNMS (Giftthaler et al. 2018) and FDDP (Mastalli et al. 2020).

We develop an algorithm for parameter-free stochastic convex optimization (SCO) whose rate of convergence is only a double-logarithmic factor larger than the optimal rate for the corresponding known-parameter setting. In contrast, the best previously known rates for parameter-free SCO are based on online parameter-free regret bounds, which contain unavoidable excess logarithmic terms compared to their known-parameter counterparts. Our algorithm is conceptually simple, has high-probability guarantees, and is also partially adaptive to unknown gradient norms, smoothness, and strong convexity. At the heart of our results is a novel parameter-free certificate for SGD step size choice, and a time-uniform concentration result that assumes no a-priori bounds on SGD iterates.

北京阿比特科技有限公司