
Minimum Weight Cycle (MWC) is the problem of finding a simple cycle of minimum weight in a graph $G=(V,E)$. This is a fundamental graph problem with classical sequential algorithms that run in $\tilde{O}(n^3)$ and $\tilde{O}(mn)$ time, where $n=|V|$ and $m=|E|$. In recent years this problem has received significant attention in the context of hardness through fine-grained sequential complexity, as well as in the design of faster sequential approximation algorithms. For computing MWC in the distributed CONGEST model, lower and upper bounds on round complexity that are near-linear in $n$ are known for directed graphs (weighted and unweighted) and for undirected weighted graphs; these lower bounds also apply to any $(2-\epsilon)$-approximation algorithm. This paper focuses on round complexity bounds for approximating MWC in the CONGEST model: for coarse approximations we show that for any constant $\alpha > 1$, computing an $\alpha$-approximation of MWC requires $\Omega(\frac{\sqrt{n}}{\log n})$ rounds on weighted undirected graphs and on directed graphs, even if unweighted. We complement these lower bounds with sublinear $\tilde{O}(n^{2/3}+D)$-round algorithms for approximating MWC to within a factor close to 2 in these classes of graphs. A key ingredient of our approximation algorithms is an efficient algorithm for computing $(1+\epsilon)$-approximate shortest paths from $k$ sources in directed and weighted graphs, which may be of independent interest for other CONGEST problems. We present an algorithm that runs in $\tilde{O}(\sqrt{nk} + D)$ rounds if $k \ge n^{1/3}$ and $\tilde{O}(\sqrt{nk} + k^{2/5}n^{2/5+o(1)}D^{2/5} + D)$ rounds if $k < n^{1/3}$; this round complexity smoothly interpolates between the best known upper bounds for approximate (or exact) SSSP when $k=1$ and APSP when $k=n$.
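To make the claimed interpolation concrete, directly substituting the two extreme values of $k$ into the stated bounds recovers the familiar single-source and all-pairs complexities:
\[
k=1:\quad \tilde{O}\bigl(\sqrt{n} + n^{2/5+o(1)}D^{2/5} + D\bigr) \qquad \text{(approximate SSSP regime)},
\]
\[
k=n:\quad \tilde{O}\bigl(\sqrt{n \cdot n} + D\bigr) = \tilde{O}(n + D) \qquad \text{(APSP regime)}.
\]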

Related content

We study a new graph separation problem called Multiway Near-Separator. Given an undirected graph $G$, an integer $k$, and a terminal set $T \subseteq V(G)$, it asks whether there is a vertex set $S \subseteq V(G) \setminus T$ of size at most $k$ such that in the graph $G-S$, no pair of distinct terminals can be connected by two pairwise internally vertex-disjoint paths. Hence each terminal pair can be separated in $G-S$ by removing at most one vertex. The problem is therefore a generalization of (Node) Multiway Cut, which asks for a vertex set $S$ such that each terminal lies in a different component of $G-S$. We develop a fixed-parameter tractable algorithm for Multiway Near-Separator running in time $2^{O(k \log k)} \cdot n^{O(1)}$. Our algorithm is based on a new pushing lemma for solutions with respect to important separators, along with two problem-specific ingredients. The first is a polynomial-time subroutine that reduces the number of terminals in the instance to a polynomial in the solution size $k$ plus the size of a given suboptimal solution. The second is a polynomial-time algorithm that, given a graph $G$ and a terminal set $T \subseteq V(G)$ along with a single vertex $x \in V(G)$ that forms a multiway near-separator, computes a 14-approximation for the problem of finding a multiway near-separator not containing $x$.
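To make the feasibility condition concrete, here is a minimal brute-force checker and solver (not the paper's FPT algorithm, just the definition made executable); the function names are introduced here for illustration, and by Menger's theorem the local node connectivity counts internally vertex-disjoint paths:

```python
from itertools import combinations
import networkx as nx

def is_near_separator(G, T, S):
    """Check whether S is a multiway near-separator for terminal set T:
    in G - S, no pair of distinct terminals may be connected by two
    pairwise internally vertex-disjoint paths."""
    if set(S) & set(T):          # S must be disjoint from the terminals
        return False
    H = G.copy()
    H.remove_nodes_from(S)
    for t1, t2 in combinations(T, 2):
        # Menger: local node connectivity = max number of internally
        # vertex-disjoint t1-t2 paths; it must be at most 1.
        if nx.node_connectivity(H, t1, t2) >= 2:
            return False
    return True

def multiway_near_separator(G, T, k):
    """Exhaustive search over candidate sets of size <= k.
    Exponential time; for building intuition only."""
    candidates = [v for v in G.nodes if v not in T]
    for size in range(k + 1):
        for S in combinations(candidates, size):
            if is_near_separator(G, T, S):
                return set(S)
    return None
```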

Kinetic equations are crucial for modeling non-equilibrium phenomena, but their computational complexity is a challenge. This paper presents a data-driven approach that uses reduced order models (ROMs) to efficiently model non-equilibrium flows in kinetic equations, comparing two ROM approaches: Proper Orthogonal Decomposition (POD) and autoencoder neural networks (AEs). While AEs initially demonstrate higher accuracy, POD's precision improves as more modes are considered. Notably, our work recognizes that the classical POD-MOR approach, although capable of accurately representing the non-linear solution manifold of the kinetic equation, may not provide a parsimonious model of the data due to the inherently non-linear nature of the data manifold. We demonstrate how AEs can be used to find the intrinsic dimension of a system and to correlate the intrinsic quantities with macroscopic quantities that have a physical interpretation.
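A minimal sketch of the POD side of such a comparison, assuming the kinetic solution snapshots are stacked column-wise into a matrix (the array shapes and mode count here are placeholders, not the paper's setup):

```python
import numpy as np

# X: snapshot matrix, each column a flattened solution state (n_dof x n_snapshots)
X = np.random.rand(1000, 200)   # placeholder for kinetic-equation snapshots

# POD = truncated SVD of the snapshot matrix
U, s, Vt = np.linalg.svd(X, full_matrices=False)

r = 10                       # number of retained POD modes
basis = U[:, :r]             # reduced linear basis
coeffs = basis.T @ X         # project snapshots onto the basis
X_rom = basis @ coeffs       # linear reconstruction from r modes

# Relative reconstruction error shrinks as r grows, which is the sense in
# which "POD's precision improves as more modes are considered".
err = np.linalg.norm(X - X_rom) / np.linalg.norm(X)
energy = np.cumsum(s**2) / np.sum(s**2)   # cumulative "energy" per mode count
```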

We develop a vector space semantics for Lambek Calculus with Soft Subexponentials, apply the calculus to construct compositional vector interpretations for parasitic gap noun phrases and for discourse units with anaphora and ellipsis, and experiment with the constructions in a distributional sentence similarity task. In contrast to previous work, which used Lambek Calculus with a Relevant Modality, the calculus in this paper uses a bounded version of the modality and is decidable. The vector space semantics of this new modality allows us to meaningfully define contraction as projection and to provide a linear theory behind what we could previously only achieve via nonlinear maps.
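One way to picture "contraction as projection" (an illustrative reading, not the paper's exact construction): projecting a tensor product $v \otimes w$ onto the diagonal subspace recovers the elementwise product, which is a linear map on the tensor space, whereas the copying map $v \mapsto v \otimes v$ is quadratic, hence nonlinear, in $v$:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -1.0, 4.0])

tensor = np.outer(v, w)        # element of V (x) V
diag_proj = np.diag(tensor)    # projection onto the diagonal subspace

# The projection is linear on V (x) V and yields the elementwise product ...
assert np.allclose(diag_proj, v * w)
# ... while copying, v -> v (x) v, scales quadratically: (2v)(x)(2v) = 4 v(x)v.
assert np.allclose(np.diag(np.outer(2 * v, 2 * v)), 4 * (v * v))
```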

Large Language Models (LLMs) can solve a variety of tasks, such as text summarization and mathematical questions, just out of the box, but they are often trained with a single task in mind. Due to high computational costs, the current trend is to use prompt instruction tuning to better adjust monolithic, pretrained LLMs for new -- but often individual -- downstream tasks. Thus, how one would expand prompt tuning to handle -- concomitantly -- heterogeneous tasks and data distributions remains a wide-open question. To address this gap, we suggest the use of \emph{Mixture of Prompts}, or MoPs, paired with smart gating functionality: the latter -- whose design is one of the contributions of this paper -- can identify relevant skills embedded in different groups of prompts and dynamically assign combined experts (i.e., collections of prompts) based on the target task. Additionally, MoPs are empirically agnostic to any model compression technique applied -- for efficiency reasons -- as well as to the instruction data source and task composition. In practice, MoPs can simultaneously mitigate prompt training "interference" in multi-task, multi-source scenarios (e.g., task and data heterogeneity across sources), as well as possible implications of model approximations. As a highlight, MoPs decrease final perplexity by between $\sim20\%$ and $\sim70\%$ relative to baselines in the federated scenario, and by between $\sim3\%$ and $\sim30\%$ in the centralized scenario.
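A hypothetical minimal sketch of gating over prompt groups in PyTorch; the module, dimensions, and mean-pooled task summary are assumptions for illustration, not the paper's implementation:

```python
import torch
import torch.nn as nn

class MixtureOfPrompts(nn.Module):
    """Gate over groups of soft prompts and prepend the mixture to the input."""
    def __init__(self, n_groups=4, prompt_len=8, d_model=512):
        super().__init__()
        # One learnable soft-prompt block per skill group.
        self.prompts = nn.Parameter(torch.randn(n_groups, prompt_len, d_model) * 0.02)
        self.gate = nn.Linear(d_model, n_groups)   # scores groups from a task summary

    def forward(self, x):                  # x: (batch, seq, d_model) input embeddings
        task_repr = x.mean(dim=1)          # crude task summary: mean-pooled embeddings
        weights = torch.softmax(self.gate(task_repr), dim=-1)   # (batch, n_groups)
        # Convex combination of prompt groups, per example in the batch.
        mixed = torch.einsum('bg,gld->bld', weights, self.prompts)
        return torch.cat([mixed, x], dim=1)   # prepend mixed prompt to the sequence

# Usage: out = MixtureOfPrompts()(torch.randn(2, 16, 512))   # shape (2, 24, 512)
```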

Information geometry is the study of statistical manifolds, that is, spaces of probability distributions, from a geometric perspective. Its classical information-theoretic applications relate to statistical concepts such as Fisher information, sufficient statistics, and efficient estimators. Today, information geometry has emerged as an interdisciplinary field that finds applications in diverse areas such as radar sensing, array signal processing, quantum physics, deep learning, and optimal transport. This article presents an overview of essential information geometry to initiate an information theorist who may be unfamiliar with this exciting area of research. We explain the concepts of divergences on statistical manifolds, generalized notions of distance, orthogonality, and geodesics, thereby paving the way for concrete applications and novel theoretical investigations. We also highlight some recent information-geometric developments that are of interest to the broader information theory community.
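As one concrete entry point connecting divergences to the geometry (a standard fact, stated here for orientation): the KL divergence between nearby members of a parametric family is locally quadratic, with the Fisher information acting as the metric tensor,
\[
D_{\mathrm{KL}}\bigl(p_\theta \,\|\, p_{\theta+\mathrm{d}\theta}\bigr) = \tfrac{1}{2}\,\mathrm{d}\theta^{\top} I(\theta)\,\mathrm{d}\theta + O\bigl(\|\mathrm{d}\theta\|^3\bigr),
\qquad
I_{ij}(\theta) = \mathbb{E}_{p_\theta}\!\left[\partial_{\theta_i}\log p_\theta \; \partial_{\theta_j}\log p_\theta\right].
\]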

The Optimal Transport (OT) problem seeks a transport map that bridges two distributions while minimizing a given cost function. In this regard, OT between a tractable prior distribution and data has been utilized for generative modeling tasks. However, OT-based methods are susceptible to outliers and face optimization challenges during training. In this paper, we propose a novel generative model based on the semi-dual formulation of Unbalanced Optimal Transport (UOT). Unlike OT, UOT relaxes the hard constraint on distribution matching. This approach provides better robustness against outliers, stability during training, and faster convergence. We validate these properties empirically through experiments. Moreover, we study the theoretical upper bound of the divergence between distributions in UOT. Our model outperforms existing OT-based generative models, achieving FID scores of 2.97 on CIFAR-10 and 5.80 on CelebA-HQ-256. The code is available at \url{https://github.com/Jae-Moo/UOTM}.
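For orientation, one standard (primal) form of the UOT relaxation, of which the paper uses the semi-dual: the hard marginal constraints of OT are replaced by soft divergence penalties,
\[
\mathrm{UOT}(\mu,\nu) = \inf_{\pi \ge 0} \int c(x,y)\, \mathrm{d}\pi(x,y) + \tau_1\, D_{\psi}(\pi_0 \,\|\, \mu) + \tau_2\, D_{\psi}(\pi_1 \,\|\, \nu),
\]
where $\pi_0, \pi_1$ are the marginals of $\pi$, $D_\psi$ is a Csiszár divergence (e.g., KL), and $\tau_1, \tau_2$ control how strictly the marginals are matched; the hard constraints of OT are recovered as $\tau_1, \tau_2 \to \infty$.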

Following the success of GPT-4, there has been a surge of interest in multimodal large language model (MLLM) research. This line of research focuses on developing general-purpose LLMs through fine-tuning pre-trained LLMs and vision models. However, catastrophic forgetting, a notorious phenomenon where the fine-tuned model fails to retain performance similar to that of the pre-trained model, remains an inherent problem in MLLMs. In this paper, we introduce EMT (Evaluating MulTimodality), a framework for evaluating catastrophic forgetting in MLLMs by treating each MLLM as an image classifier. We first apply EMT to several open-source fine-tuned MLLMs and discover that almost all of them fail to retain the performance levels of their vision encoders on standard image classification tasks. Moreover, we continue fine-tuning LLaVA, an MLLM, and use EMT to assess performance throughout the fine-tuning. Interestingly, our results suggest that early-stage fine-tuning on an image dataset improves performance across other image datasets by enhancing the alignment of text and visual features. However, as fine-tuning proceeds, the MLLMs begin to hallucinate, resulting in a significant loss of generalizability, even when the image encoder remains frozen. Our results suggest that MLLMs have yet to demonstrate performance on par with their vision models on standard image classification tasks, and that the current MLLM fine-tuning procedure still has room for improvement.
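A hypothetical sketch of the evaluation loop implied by "treating each MLLM as an image classifier"; the `mllm.generate` interface, prompt wording, and substring-matching scorer are assumptions for illustration, not EMT's actual protocol:

```python
def emt_style_accuracy(mllm, dataset, class_names):
    """Query an MLLM with each image and score it like a classifier.
    `mllm.generate(image, prompt)` is a stand-in for the model's real API."""
    prompt = ("What is the object in this image? "
              f"Answer with one of: {', '.join(class_names)}.")
    correct = 0
    for image, label in dataset:
        answer = mllm.generate(image, prompt).lower()
        # Credit the prediction if the ground-truth class name is mentioned.
        if class_names[label].lower() in answer:
            correct += 1
    return correct / len(dataset)
```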

In stochastic gradient descent (SGD) for sequential simulations, such as neural stochastic differential equations, the Multilevel Monte Carlo (MLMC) method is known to offer better theoretical computational complexity than the naive Monte Carlo approach. However, in practice, MLMC scales poorly on massively parallel computing platforms such as modern GPUs because of its large parallel complexity, which is equivalent to that of the naive Monte Carlo method. To cope with this issue, we propose the delayed MLMC gradient estimator, which drastically reduces the parallel complexity of MLMC by recycling previously computed gradient components from earlier steps of SGD. The proposed estimator provably reduces the average parallel complexity per iteration at the cost of a slightly worse per-iteration convergence rate. In our numerical experiments, we use an example of deep hedging to demonstrate the superior parallel complexity of our method compared to standard MLMC in SGD.
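A minimal sketch of the recycling idea, under the assumption that the level-$l$ correction is refreshed only every $2^l$ SGD steps (this geometric schedule and the `grad_level` oracle are illustrative, not the paper's exact estimator):

```python
import numpy as np

def delayed_mlmc_grad(grad_level, theta, step, L, cache):
    """MLMC gradient g = sum_l Delta_l(theta), where the expensive level-l
    correction is recomputed only every 2**l steps and recycled in between."""
    g = np.zeros_like(theta)
    for l in range(L + 1):
        if step % (2 ** l) == 0 or l not in cache:
            cache[l] = grad_level(theta, l)   # fresh level-l estimate
        g += cache[l]                          # stale but recycled otherwise
    return g
```

On most iterations only the cheap low levels are refreshed, so the expected parallel work per step drops sharply, at the price of slightly stale high-level terms, which is the trade-off behind the "slightly worse per-iteration convergence rate".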

Fictitious Play (FP) is a simple and natural dynamic for repeated play, with many applications in game theory and multi-agent reinforcement learning. It was introduced by Brown (1949, 1951), and its convergence properties for two-player zero-sum games were established later by Robinson (1951). Potential games (Monderer and Shapley, 1996b) are another class of games that exhibit the FP property (Monderer and Shapley, 1996a), i.e., FP dynamics converge to a Nash equilibrium if all agents follow them. Nevertheless, except for two-player zero-sum games and for specific instances of payoff matrices (Abernethy et al., 2021) or adversarial tie-breaking rules (Daskalakis and Pan, 2014), the convergence rate of FP is unknown. In this work, we focus on the rate of convergence of FP when applied to potential games, and more specifically to identical-payoff games. We prove that FP can take exponential time (in the number of strategies) to reach a Nash equilibrium, even if the game is restricted to two agents and for arbitrary tie-breaking rules. To prove this, we recursively construct a two-player coordination game with a unique Nash equilibrium. Moreover, every approximate Nash equilibrium in the constructed game must be close to the pure Nash equilibrium in $\ell_1$-distance.
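For readers new to the dynamic, a minimal two-player FP loop; the payoff matrices are placeholders, and ties are broken by `argmax`'s lowest-index rule (one fixed tie-breaking rule among the arbitrary ones the result covers):

```python
import numpy as np

def fictitious_play(A, B, steps=1000):
    """Two-player FP: each round, each player best-responds to the opponent's
    empirical (historical) mixed strategy. A, B: payoff matrices of players 1, 2."""
    n, m = A.shape
    counts1, counts2 = np.zeros(n), np.zeros(m)
    counts1[0] += 1; counts2[0] += 1            # arbitrary initial plays
    for _ in range(steps):
        emp1, emp2 = counts1 / counts1.sum(), counts2 / counts2.sum()
        counts1[np.argmax(A @ emp2)] += 1       # best response to opponent history
        counts2[np.argmax(B.T @ emp1)] += 1
    return counts1 / counts1.sum(), counts2 / counts2.sum()

# In an identical-payoff (potential) game, set B = A; the abstract's lower
# bound says this loop can need exponentially many steps to reach equilibrium.
```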

Visual Question Answering (VQA) models have so far struggled with counting objects in natural images. We identify a fundamental problem caused by the soft attention used in these models. To circumvent this problem, we propose a neural network component that allows robust counting from object proposals. Experiments on a toy task show the effectiveness of this component, and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced-pair metric, the component improves counting over a strong baseline by 6.6%.
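A tiny numpy illustration of the soft-attention problem the abstract points to (a simplified reading: because attention weights are normalized, the attended feature is an average, so duplicating an object leaves the output unchanged and the count unrecoverable):

```python
import numpy as np

def attend(features, scores):
    """Soft attention: softmax-normalized weighted average of proposal features."""
    w = np.exp(scores) / np.exp(scores).sum()
    return w @ features

cat = np.array([1.0, 0.0])                     # feature of one "cat" proposal
one_cat  = attend(np.stack([cat]),      np.array([2.0]))
two_cats = attend(np.stack([cat, cat]), np.array([2.0, 2.0]))

# Identical outputs for 1 vs. 2 cats: averaging destroys count information.
assert np.allclose(one_cat, two_cats)
```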
