
Let $P_Z$ be a given distribution on $\mathbb{R}^n$. For any $y\in\mathbb{R}^n$, we may interpret $\rho(y):=\ln\mathbb{E}[e^{\left<y,Z\right>}]$ as a soft-max of $\left<y,Z\right>$. We explore lower bounds on $\mathbb{E}[\rho(Y)]$ in terms of the minimum mutual information $I(Z;\bar{Z})$ over all couplings $P_{Z\bar{Z}}$ of $P_Z$ with itself such that $Z-\bar{Z}$ is bounded in a certain sense. This may be viewed as a soft version of Sudakov's minoration, which lower bounds the expected supremum of a stochastic process in terms of the packing number. Our method is based on convex geometry (thrifty approximation of convex bodies) and works for general non-Gaussian $Y$. When $Y$ is Gaussian and $\bar{Z}$ converges to $Z$, this recovers a recent inequality of Bai-Wu-Ozgur on information-constrained optimal transport, previously established using Gaussian-specific techniques. We also use soft-minoration to obtain asymptotically (in the tensor order) tight bounds on the free energy in the Sherrington-Kirkpatrick model with spins uniformly distributed on a type class, implying asymptotically tight bounds for the type~II error exponent in spiked tensor detection.
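To see why $\rho$ behaves like a soft maximum, consider (as an illustration not taken from the abstract) $Z$ uniform on a finite set $\{z_1,\dots,z_N\}$. Then

$$\rho(y)=\ln\frac{1}{N}\sum_{i=1}^{N}e^{\left<y,z_i\right>},\qquad \max_{1\le i\le N}\left<y,z_i\right>-\ln N\;\le\;\rho(y)\;\le\;\max_{1\le i\le N}\left<y,z_i\right>,$$

so $\rho(y)$ tracks the maximum up to an additive $\ln N$, and $\mathbb{E}[\rho(Y)]$ is a smoothed version of the expected supremum appearing in Sudakov's minoration.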

Related Content

Shortest paths problems have been studied extensively in classic distributed models such as CONGEST or the Congested Clique. These models dictate how nodes may communicate in order to determine shortest paths in a distributed input graph. This article focuses on shortest paths problems in the HYBRID model, which combines local communication along edges of the input graph with global communication between arbitrary pairs of nodes that is restricted in terms of bandwidth. Previous work showed that it takes $\tilde \Omega(\sqrt{k})$ rounds in the HYBRID model for each node to learn its distance to $k$ dedicated source nodes (aka the $k$-SSP problem), even for crude approximations. This lower bound was also matched with algorithmic solutions for $k \geq n^{2/3}$. However, as $k$ gets smaller, the gap between the known upper and lower bounds diverges and even becomes exponential for a single source. In this work we close this gap for the whole range of $k$ (up to terms that are polylogarithmic in $n$) by giving algorithmic solutions for $k$-SSP in $\tilde O\big(\sqrt{k}\big)$ rounds for any $k$.

We propose a novel probabilistically robust controller for the guidance of an unmanned aerial vehicle (UAV) in coverage planning missions, which can simultaneously optimize both the UAV's motion and camera control inputs for the 3D coverage of a given object of interest. Specifically, the coverage planning problem is formulated in this work as an optimal control problem with logical constraints to enable the UAV agent to jointly: a) select a series of discrete camera field-of-view states which satisfy a set of coverage constraints, and b) optimize its motion control inputs according to a specified mission objective. We show how this hybrid optimal control problem can be solved with standard optimization tools by converting the logical expressions in the constraints into equality/inequality constraints involving only continuous variables. Finally, probabilistic robustness is achieved by integrating the unscented transformation into the proposed controller, thus enabling the design of robust open-loop coverage plans which take into account the future posterior distribution of the UAV's state over the planning horizon.
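As a generic illustration of how a logical constraint can be rewritten using only continuous variables (a standard complementarity-style encoding, not necessarily the one used in the paper), a disjunction $g_1(x)\le 0\ \vee\ g_2(x)\le 0$ can be expressed as

$$g_1(x)\le s_1,\qquad g_2(x)\le s_2,\qquad s_1\ge 0,\ s_2\ge 0,\qquad s_1 s_2 = 0,$$

since the equality $s_1 s_2=0$ forces at least one slack to vanish, which in turn forces the corresponding inequality to hold. A standard nonlinear programming solver can then treat all constraints as smooth equalities and inequalities over continuous variables.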

We show that convex-concave Lipschitz stochastic saddle point problems (also known as stochastic minimax optimization) can be solved under the constraint of $(\epsilon,\delta)$-differential privacy with \emph{strong (primal-dual) gap} rate of $\tilde O\big(\frac{1}{\sqrt{n}} + \frac{\sqrt{d}}{n\epsilon}\big)$, where $n$ is the dataset size and $d$ is the dimension of the problem. This rate is nearly optimal, based on existing lower bounds in differentially private stochastic optimization. Specifically, we prove a tight upper bound on the strong gap via a novel implementation and analysis of the recursive regularization technique repurposed for saddle point problems. We show that this rate can be attained with $O\big(\min\big\{\frac{n^2\epsilon^{1.5}}{\sqrt{d}}, n^{3/2}\big\}\big)$ gradient complexity, and $\tilde{O}(n)$ gradient complexity if the loss function is smooth. As a byproduct of our method, we develop a general algorithm that, given black-box access to a subroutine satisfying a certain $\alpha$ primal-dual accuracy guarantee with respect to the empirical objective, gives a solution to the stochastic saddle point problem with a strong gap of $\tilde{O}(\alpha+\frac{1}{\sqrt{n}})$. We show that this $\alpha$-accuracy condition is satisfied by standard algorithms for the empirical saddle point problem such as the proximal point method and the stochastic gradient descent ascent algorithm. Further, we show that, even for simple problems, it is possible for an algorithm to have zero weak gap and suffer from an $\Omega(1)$ strong gap. We also show that there exists a fundamental tradeoff between stability and accuracy. Specifically, we show that any $\Delta$-stable algorithm has empirical gap $\Omega\big(\frac{1}{\Delta n}\big)$, and that this bound is tight. This result also holds for empirical risk minimization problems in particular and may be of independent interest.
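For concreteness, the stochastic gradient descent ascent (SGDA) baseline mentioned above can be sketched on a toy strongly-convex–strongly-concave objective (a minimal illustration, not the private recursive-regularization algorithm of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy empirical saddle point problem:
#   min_x max_y  mean_i [ 0.5*||x - a_i||^2 + <x, y> - 0.5*||y - b_i||^2 ]
a = rng.normal(size=(1000, 5))   # "data" for the primal part
b = rng.normal(size=(1000, 5))   # "data" for the dual part

x = np.zeros(5)
y = np.zeros(5)
eta = 0.05                       # step size

for t in range(2000):
    i = rng.integers(len(a))     # sample one data point
    grad_x = (x - a[i]) + y      # stochastic gradient in x
    grad_y = x - (y - b[i])      # stochastic gradient in y
    x -= eta * grad_x            # descent step on the primal variable
    y += eta * grad_y            # ascent step on the dual variable

print("approximate saddle point:", x, y)
```

A differentially private variant would additionally clip and noise these stochastic gradients, which this sketch omits.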

Consider the family of power divergence statistics based on $n$ trials, each leading to one of $r$ possible outcomes. This includes the log-likelihood ratio and Pearson's statistic as important special cases. It is known that in certain regimes (e.g., when $r$ is of order $n^2$ and the allocation is asymptotically uniform as $n\to\infty$) the power divergence statistic converges in distribution to a linear transformation of a Poisson random variable. We establish explicit error bounds in the Kolmogorov (or uniform) metric to complement this convergence result, which may be applied for any values of $n$, $r$ and the index parameter $\lambda$ for which such a finite-sample bound is meaningful. We further use this Poisson approximation result to derive error bounds in Gaussian approximation of the power divergence statistics.
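For reference, the Cressie–Read power divergence family referred to here is, for observed counts $X_1,\dots,X_r$ with $\sum_j X_j=n$ and null probabilities $p_1,\dots,p_r$, given by $T_\lambda=\frac{2}{\lambda(\lambda+1)}\sum_{j=1}^r X_j\big[(X_j/(np_j))^{\lambda}-1\big]$, which yields Pearson's statistic at $\lambda=1$ and the log-likelihood ratio statistic in the limit $\lambda\to 0$. A quick numerical check (illustrative only, not the error-bound machinery of the paper) is possible with SciPy:

```python
import numpy as np
from scipy.stats import power_divergence

counts = np.array([18, 22, 29, 31])        # observed counts over r = 4 outcomes
expected = np.full(4, counts.sum() / 4)    # uniform null: n * p_j = n / r

# lambda_=1 recovers Pearson's chi-square, lambda_=0 the log-likelihood ratio (G) statistic,
# lambda_=2/3 is the classical Cressie-Read recommendation.
for lam in (1.0, 0.0, 2 / 3):
    stat, p = power_divergence(counts, expected, lambda_=lam)
    print(f"lambda = {lam:.3f}: statistic = {stat:.4f}, chi-square p-value = {p:.4f}")
```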

Federated learning is an approach to collaboratively training machine learning models across multiple parties that cannot share their data. One of the challenges in federated learning is non-IID data between clients, as a single model cannot fit the data distributions of all clients. Meta-learning methods, such as Per-FedAvg, have been introduced to cope with this challenge. Meta-learning learns shared initial parameters for all clients; each client then employs gradient descent to quickly adapt the initialization to its local data distribution and thereby realize model personalization. However, due to the non-convex loss function and the randomness of sampled updates, meta-learning approaches have unstable adaptation targets for the same client. This fluctuation across adaptation directions hinders convergence in meta-learning. To overcome this challenge, we use the historical locally adapted model to restrict the direction of the inner loop and propose an elastic-constrained method. As a result, the inner loop in the current round retains historical goals while adapting to better solutions. Experiments show that our method boosts meta-learning convergence and improves personalization without additional computation or communication. Our method achieves state-of-the-art results on all metrics across three public datasets.
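A minimal sketch of the idea, with hypothetical names and a simple quadratic proximal ("elastic") penalty pulling the inner-loop adaptation toward the previous round's adapted parameters (the paper's exact formulation may differ):

```python
import numpy as np

def inner_loop_adapt(theta_init, grad_fn, theta_prev_adapted, steps=5, lr=0.01, mu=0.1):
    """One client's inner-loop adaptation starting from the shared meta-initialization.

    theta_init         : shared initial parameters broadcast by the server
    grad_fn(theta)     : stochastic gradient of the client's local loss (hypothetical callback)
    theta_prev_adapted : the client's adapted parameters from the previous round, or None
    mu                 : strength of the elastic penalty toward the historical adapted model
    """
    theta = theta_init.copy()
    for _ in range(steps):
        g = grad_fn(theta)
        if theta_prev_adapted is not None:
            # elastic constraint: keep the adaptation direction consistent with history
            g = g + mu * (theta - theta_prev_adapted)
        theta = theta - lr * g
    return theta

# toy usage: quadratic local loss 0.5*||theta - target||^2 with noisy gradients
rng = np.random.default_rng(0)
target = np.array([1.0, -2.0])
noisy_grad = lambda th: (th - target) + 0.05 * rng.normal(size=th.shape)
adapted = inner_loop_adapt(np.zeros(2), noisy_grad, theta_prev_adapted=np.array([0.8, -1.5]))
print(adapted)
```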

Information projections have found many important applications in probability theory, statistics, and related fields. In the field of hypothesis testing in particular, the reverse information projection (RIPr) has recently been shown to lead to so-called growth-rate optimal (GRO) e-statistics for testing simple alternatives against composite null hypotheses. However, the RIPr as well as the GRO criterion are only defined in cases where the infimum information divergence between the null and alternative is finite. Here, we show that under much weaker conditions there often still exists an element in the alternative that is `closest' to the null: the universal reverse information projection. The universal reverse information projection and its non-universal counterpart coincide whenever the KL divergence is finite, and the strictness of this generalization is shown by an example. Furthermore, the universal RIPr leads to optimal e-statistics in a sense that is a novel, but natural, extension of the GRO criterion. Finally, we discuss conditions under which the universal RIPr is a strict sub-probability distribution, and conditions under which an approximation of the universal RIPr leads to approximate e-statistics.
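For orientation, and stated informally using standard definitions from the e-value literature (an added gloss, not part of the abstract): for a simple alternative $Q$ and a composite null $\mathcal{H}_0$,

$$P^\star=\arg\min_{P\in\overline{\mathrm{conv}}(\mathcal{H}_0)} D(Q\,\|\,P),\qquad E^\star=\frac{\mathrm{d}Q}{\mathrm{d}P^\star},\qquad \mathbb{E}_Q[\ln E^\star]=\sup\Big\{\mathbb{E}_Q[\ln E]\,:\,\mathbb{E}_P[E]\le 1\ \ \forall P\in\mathcal{H}_0\Big\},$$

whenever the minimizing divergence is finite; the first object is the RIPr and the second the GRO e-statistic. The universal RIPr described above extends this picture to settings where the infimum divergence is infinite.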

Many isomorphism problems for tensors, groups, algebras, and polynomials were recently shown to be equivalent to one another under polynomial-time reductions, prompting the introduction of the complexity class TI (Grochow & Qiao, ITCS '21; SIAM J. Comp., '23). Using the tensorial viewpoint, Grochow & Qiao (CCC '21) then gave moderately exponential-time search- and counting-to-decision reductions for a class of $p$-groups. A significant issue was that the reductions usually incurred a quadratic increase in the length of the tensors involved. When the tensors represent $p$-groups, this corresponds to an increase in the order of the group of the form $|G|^{\Theta(\log |G|)}$, negating any asymptotic gains in the Cayley table model. In this paper, we present a new kind of tensor gadget that allows us to replace those quadratic-length reductions with linear-length ones, yielding the following consequences: 1. Combined with the recent breakthrough $|G|^{O((\log |G|)^{5/6})}$-time isomorphism test for $p$-groups of class 2 and exponent $p$ (Sun, STOC '23), our reductions extend this runtime to $p$-groups of class $c$ and exponent $p$ where $c<p$. 2. Our reductions show that Sun's algorithm solves several TI-complete problems over $F_p$, such as isomorphism problems for cubic forms, algebras, and tensors, in time $p^{O(n^{1.8} \log p)}$. 3. Polynomial-time search- and counting-to-decision reductions for testing isomorphism of $p$-groups of class $2$ and exponent $p$ in the Cayley table model. This answers questions of Arvind and T\'oran (Bull. EATCS, 2005) for this group class, which is thought to be one of the hardest cases of Group Isomorphism. 4. If Graph Isomorphism is in P, then testing equivalence of cubic forms and testing isomorphism of algebras over a finite field $F_q$ can both be solved in time $q^{O(n)}$, improving on the brute-force upper bound $q^{O(n^2)}$.

We prove existence of equal area partitions of the unit sphere via optimal transport methods, accompanied by diameter bounds written in terms of Monge--Kantorovich distances. This can be used to obtain bounds on the expectation of the maximum diameter of partition sets, when points are uniformly sampled from the sphere. An application to the computation of sliced Monge--Kantorovich distances is also presented.
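As a point of reference for the last application, a standard Monte Carlo estimator of the sliced Monge--Kantorovich (sliced Wasserstein-$p$) distance between two empirical measures projects both samples onto random directions and averages the resulting one-dimensional distances (an illustrative sketch, not the construction used in the paper):

```python
import numpy as np

def sliced_wasserstein(X, Y, p=2, n_dirs=200, rng=None):
    """Monte Carlo sliced Wasserstein-p distance between two equal-size samples in R^d."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    dirs = rng.normal(size=(n_dirs, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)   # random unit directions
    total = 0.0
    for theta in dirs:
        x_proj = np.sort(X @ theta)          # 1D optimal transport reduces to sorting
        y_proj = np.sort(Y @ theta)
        total += np.mean(np.abs(x_proj - y_proj) ** p)
    return (total / n_dirs) ** (1 / p)

# toy usage: two samples of points drawn uniformly on the unit sphere in R^3
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3)); X /= np.linalg.norm(X, axis=1, keepdims=True)
Y = rng.normal(size=(500, 3)); Y /= np.linalg.norm(Y, axis=1, keepdims=True)
print(sliced_wasserstein(X, Y))
```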

Conventional solvers are often computationally expensive for constrained optimization, particularly in large-scale and time-critical problems. While this has led to growing interest in using neural networks (NNs) as fast approximators of optimal solutions, incorporating constraints into NNs is challenging. In this regard, we propose deep Lagrange dual with equality embedding (DeepLDE), a framework that learns to find an optimal solution without using labels. To ensure feasible solutions, we embed equality constraints into the NNs and train the NNs using the primal-dual method to impose inequality constraints. Furthermore, we prove the convergence of DeepLDE and show that the primal-dual learning method alone cannot ensure equality constraints without the help of equality embedding. Simulation results on convex, non-convex, and AC optimal power flow (AC-OPF) problems show that the proposed DeepLDE achieves the smallest optimality gap among all NN-based approaches while always ensuring feasible solutions. Furthermore, the proposed method is about 5 to 250 times faster than DC3 and conventional solvers in solving constrained convex and non-convex optimization and/or AC-OPF.
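A minimal sketch of the primal-dual idea with an equality-embedding step (hypothetical problem and variable names; not the DeepLDE implementation): the network predicts only a subset of the variables, the remaining ones are completed so that the equality constraints hold exactly, and inequality violations are handled through Lagrange multipliers updated by dual ascent.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical constrained problem, parameterized by b:
#   min_x ||x||^2   s.t.  A x = b  (equality),  x >= 0  (inequality),  x in R^4, b in R^2
A = torch.tensor([[1., 2., 1., 0.],
                  [0., 1., 0., 1.]])
A_free, A_dep = A[:, :2], A[:, 2:]           # split variables; A_dep is invertible
A_dep_inv = torch.linalg.inv(A_dep)

net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))  # predicts x_free only
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
lam = torch.zeros(4)                          # Lagrange multipliers for x >= 0
eta_dual = 0.05

for step in range(2000):
    b = torch.rand(64, 2) + 0.5               # sample problem instances
    x_free = net(b)
    # equality embedding: complete the solution so that A x = b holds exactly
    x_dep = (b - x_free @ A_free.T) @ A_dep_inv.T
    x = torch.cat([x_free, x_dep], dim=1)

    g = -x                                    # inequality g(x) <= 0 encodes x >= 0
    lagrangian = (x ** 2).sum(dim=1) + (lam * g).sum(dim=1)
    loss = lagrangian.mean()

    opt.zero_grad()
    loss.backward()
    opt.step()                                # primal step on the network weights
    with torch.no_grad():                     # dual ascent on the multipliers
        lam = torch.clamp(lam + eta_dual * g.mean(dim=0), min=0.0)

print("multipliers:", lam)
```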

Despite rapid advancement in the field of Constrained Natural Language Generation, little time has been spent exploring the potential of language models whose vocabularies are lexically, semantically, and/or phonetically constrained. We find that most language models generate compelling text even under significant constraints. We present a simple and universally applicable technique for modifying the output of a language model by compositionally applying filter functions to the language model's vocabulary before a unit of text is generated. This approach is plug-and-play and requires no modification to the model. To showcase the value of this technique, we present an easy-to-use AI writing assistant called Constrained Text Generation Studio (CTGS). CTGS allows users to generate or choose from text with any combination of a wide variety of constraints, such as banning a particular letter, forcing the generated words to have a certain number of syllables, and/or forcing the words to be partial anagrams of another word. We introduce a novel dataset of prose that omits the letter e. We show that our method results in strictly superior performance compared to fine-tuning alone on this dataset. We also present a Huggingface space web-app presenting this technique called Gadsby. The code is available to the public here: //github.com/Hellisotherpeople/Constrained-Text-Generation-Studio
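A minimal sketch of vocabulary filtering of this kind with Hugging Face Transformers (a generic illustration of the approach; the actual CTGS code is at the repository linked above): before each token is sampled, mask out every vocabulary entry whose surface form contains the letter "e".

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# Precompute a filter over the vocabulary: True for tokens whose text contains "e"/"E".
banned = torch.tensor(
    [("e" in tok.decode([i]).lower()) for i in range(len(tok))], dtype=torch.bool
)

ids = tok("The color of the sky is", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(20):
        logits = model(ids).logits[0, -1]      # next-token logits
        logits[banned] = float("-inf")         # compositional filter: ban the letter "e"
        next_id = torch.argmax(logits)         # greedy decoding for simplicity
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

print(tok.decode(ids[0]))
```

Additional constraints compose naturally: each one is just another boolean mask applied to the logits before the next token is chosen.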
