
In this work, we demonstrate the application of a first-order Taylor expansion to approximate a generic function $F: \mathbb{R}^{n \times m} \to \mathbb{R}^{n \times m}$ and utilize it in language modeling. To enhance the basic Taylor expansion, we introduce iteration and piecewise modeling, leading us to name the algorithm the Iterative Piecewise Affine (IPA) approximation. The final algorithm exhibits interesting resemblances to the Transformer decoder architecture. By comparing parameter arrangements in IPA and Transformers, we observe strikingly similar performance, with IPA outperforming Transformers by 1.5\% in the next-token prediction task with cross-entropy loss for smaller sequence lengths.
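To make the first-order Taylor step concrete, here is a minimal sketch of an affine approximation $F(X) \approx F(X_0) + dF(X_0)[X - X_0]$ for a matrix-valued function. It is not the IPA algorithm itself: the helper names, the central finite-difference estimate of the directional derivative, and the elementwise-square test function are all assumptions chosen for illustration.

```python
from typing import Callable, List

Matrix = List[List[float]]


def _combine(a: Matrix, b: Matrix, sa: float, sb: float) -> Matrix:
    """Return sa * a + sb * b, computed entry by entry."""
    return [[sa * x + sb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def taylor_first_order(F: Callable[[Matrix], Matrix], X0: Matrix, X: Matrix,
                       h: float = 1e-5) -> Matrix:
    """Approximate F(X) by F(X0) + dF(X0)[X - X0].

    The directional derivative along D = X - X0 is estimated with a
    central finite difference (an assumption of this sketch; an
    analytic Jacobian would serve equally well).
    """
    D = _combine(X, X0, 1.0, -1.0)                  # displacement X - X0
    plus = F(_combine(X0, D, 1.0, h))               # F(X0 + h D)
    minus = F(_combine(X0, D, 1.0, -h))             # F(X0 - h D)
    dF = _combine(plus, minus, 0.5 / h, -0.5 / h)   # estimate of J(X0) D
    return _combine(F(X0), dF, 1.0, 1.0)


def square(M: Matrix) -> Matrix:
    """Hypothetical test function: elementwise square."""
    return [[x * x for x in row] for row in M]


X0 = [[1.0, 2.0], [3.0, 4.0]]
X = [[1.1, 2.0], [3.0, 3.9]]
approx = taylor_first_order(square, X0, X)
exact = square(X)
```

For the elementwise square, the approximation differs from the exact value only by the second-order term $(x - x_0)^2$, which is about $0.01$ in the two perturbed entries.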

Related content

We study the task of $(\epsilon, \delta)$-differentially private online convex optimization (OCO). In the online setting, the release of each distinct decision or iterate carries with it the potential for privacy loss. This problem has a long history of research starting with Jain et al. [2012], and the best known results for the regime where $\epsilon$ is not very small are presented in Agarwal et al. [2023]. In this paper we improve upon the results of Agarwal et al. [2023] in terms of the dimension factors, and we remove the requirement of smoothness. Our results are now the best known rates for DP-OCO in this regime. Our algorithm builds upon the work of [Asi et al., 2023], which introduced the idea of explicitly limiting the number of switches via rejection sampling. The main innovation in our algorithm is the use of sampling from a strongly log-concave density, which allows us to trade off the dimension factors better, leading to improved results.

In this paper, we consider the problem of preprocessing a text $T$ of length $n$ and a dictionary $\mathcal{D}$ to answer multiple types of pattern queries. Inspired by [Charalampopoulos-Kociumaka-Mohamed-Radoszewski-Rytter-Wale\'n ISAAC 2019], we consider the Internal Dictionary, where the dictionary is internal in the sense that every pattern is given as a fragment of $T$. Therefore, the size of $\mathcal{D}$ is proportional to the number of patterns instead of their total length, which could be $\Theta(n \cdot |\mathcal{D}|)$. We propose a new technique to preprocess $T$ and organize the substring structure. In this way, we are able to develop algorithms to answer queries more efficiently than in previous works.

We present a randomized algorithm that computes single-source shortest paths (SSSP) in $O(m\log^8(n)\log W)$ time when edge weights are integral and can be negative. This essentially resolves the classic negative-weight SSSP problem. The previous bounds are $\tilde O((m+n^{1.5})\log W)$ [BLNPSSSW FOCS'20] and $m^{4/3+o(1)}\log W$ [AMV FOCS'20]. Near-linear time algorithms were known previously only for the special case of planar directed graphs [Fakcharoenphol and Rao FOCS'01]. In contrast to all recent developments that rely on sophisticated continuous optimization methods and dynamic algorithms, our algorithm is simple: it requires only a simple graph decomposition and elementary combinatorial tools. In fact, ours is the first combinatorial algorithm for negative-weight SSSP to break through the classic $\tilde O(m\sqrt{n}\log W)$ bound from over three decades ago [Gabow and Tarjan SICOMP'89].
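For readers unfamiliar with the problem statement, the following sketch shows the textbook Bellman-Ford algorithm for negative-weight SSSP. This is explicitly not the near-linear-time algorithm of the abstract; it is the classic $O(mn)$ baseline, included only to pin down the input and output of the problem. The function name and the edge-list representation are assumptions of this sketch.

```python
from typing import List, Optional, Tuple


def bellman_ford(n: int, edges: List[Tuple[int, int, int]],
                 source: int) -> Optional[List[float]]:
    """Textbook Bellman-Ford on vertices 0..n-1.

    edges holds directed (u, v, w) triples with integer, possibly
    negative, weights. Returns the distance list, or None when a
    negative cycle is reachable from the source.
    """
    INF = float("inf")
    dist: List[float] = [INF] * n
    dist[source] = 0
    # At most n - 1 rounds of relaxing every edge.
    for _ in range(n - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] != INF and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break
    # Any edge that can still be relaxed lies on or after a negative cycle.
    for u, v, w in edges:
        if dist[u] != INF and dist[u] + w < dist[v]:
            return None
    return dist


distances = bellman_ford(4, [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)], 0)
# distances == [0, 2, 5, 4]
```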

Physical reasoning is a crucial aspect in the development of general AI systems, given that human learning starts with interacting with the physical world before progressing to more complex concepts. Although researchers have studied and assessed the physical reasoning of AI approaches through various specific benchmarks, there is no comprehensive approach to evaluating and measuring progress. Therefore, we aim to offer an overview of existing benchmarks and their solution approaches and propose a unified perspective for measuring the physical reasoning capacity of AI systems. We select benchmarks that are designed to test algorithmic performance in physical reasoning tasks. While each of the selected benchmarks poses a unique challenge, their ensemble provides a comprehensive proving ground for an AI generalist agent with a measurable skill level for various physical reasoning concepts. This gives an advantage to such an ensemble of benchmarks over other holistic benchmarks that aim to simulate the real world by intertwining its complexity and many concepts. We group the presented set of physical reasoning benchmarks into subcategories so that more narrow generalist AI agents can be tested first on these groups.

We present a modular approach to \emph{reinforcement learning} (RL) in environments consisting of simpler components evolving in parallel. A monolithic view of such modular environments may be prohibitively large to learn, or may require unrealizable communication between the components in the form of a centralized controller. Our proposed approach is based on the assume-guarantee paradigm where the optimal control for the individual components is synthesized in isolation by making \emph{assumptions} about the behaviors of neighboring components, and providing \emph{guarantees} about their own behavior. We express these \emph{assume-guarantee contracts} as regular languages and provide automatic translations to scalar rewards to be used in RL. By combining local probabilities of satisfaction for each component, we provide a lower bound on the probability of satisfaction of the complete system. By solving a Markov game for each component, RL can produce a controller for each component that maximizes this lower bound. The controller utilizes the information it receives through communication, observations, and any knowledge of a coarse model of other agents. We experimentally demonstrate the efficiency of the proposed approach on a variety of case studies.

For a permutation $\pi: [K]\rightarrow [K]$, a sequence $f: \{1,2,\cdots, n\}\rightarrow \mathbb R$ contains a $\pi$-pattern of size $K$, if there is a sequence of indices $(i_1, i_2, \cdots, i_K)$ ($i_1<i_2<\cdots<i_K$), satisfying that $f(i_a)<f(i_b)$ if $\pi(a)<\pi(b)$, for $a,b\in [K]$. Otherwise, $f$ is referred to as $\pi$-free. For the special case where $\pi = (1,2,\cdots, K)$, it is referred to as the monotone pattern. \cite{newman2017testing} initiated the study of testing $\pi$-freeness with one-sided error. They focused on two specific problems, testing the monotone permutations and the $(1,3,2)$ permutation. For the problem of testing monotone permutation $(1,2,\cdots,K)$, \cite{ben2019finding} improved the $(\log n)^{O(K^2)}$ non-adaptive query complexity of \cite{newman2017testing} to $O((\log n)^{\lfloor \log_{2} K\rfloor})$. Further, \cite{ben2019optimal} proposed an adaptive algorithm with $O(\log n)$ query complexity. However, no progress has yet been made on the problem of testing $(1,3,2)$-freeness. In this work, we present an adaptive algorithm for testing $(1,3,2)$-freeness. The query complexity of our algorithm is $O(\epsilon^{-2}\log^4 n)$, which significantly improves over the $O(\epsilon^{-7}\log^{26}n)$-query adaptive algorithm of \cite{newman2017testing}. This improvement is mainly achieved by the proposal of a new structure embedded in the patterns.
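The definition of a $\pi$-pattern can be checked directly by brute force. The sketch below enumerates all index tuples $i_1 < \cdots < i_K$, so it takes $O(n^K)$ time and reads the entire sequence; it is not the sublinear-query tester described in the abstract, and the function name is an assumption of this sketch.

```python
from itertools import combinations
from typing import Sequence


def contains_pattern(f: Sequence[float], pi: Sequence[int]) -> bool:
    """Return True if f contains a pi-pattern, following the definition:
    some indices i_1 < ... < i_K satisfy f(i_a) < f(i_b) whenever
    pi(a) < pi(b)."""
    K = len(pi)
    # combinations() yields index tuples in strictly increasing order.
    for idx in combinations(range(len(f)), K):
        if all(f[idx[a]] < f[idx[b]]
               for a in range(K) for b in range(K) if pi[a] < pi[b]):
            return True
    return False


print(contains_pattern([4, 1, 3, 2], (1, 3, 2)))  # True: values 1, 3, 2
print(contains_pattern([1, 2, 3], (1, 3, 2)))     # False: (1,3,2)-free
```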

Segment anything model (SAM) addresses two practical yet challenging segmentation tasks: \textbf{segment anything (SegAny)}, which utilizes a certain point to predict the mask for a single object of interest, and \textbf{segment everything (SegEvery)}, which predicts the masks for all objects on the image. What makes SegAny slow for SAM is its heavyweight image encoder, which has been addressed by MobileSAM via decoupled knowledge distillation. The efficiency bottleneck of SegEvery with SAM, however, lies in its mask decoder because it needs to first generate numerous masks with redundant grid-search prompts and then perform filtering to obtain the final valid masks. We propose to improve its efficiency by directly generating the final masks with only valid prompts, which can be obtained through object discovery. Our proposed approach not only helps reduce the total time on the mask decoder by at least 16 times but also achieves superior performance. Specifically, our approach yields an average performance boost of 3.6\% (42.5\% \textit{vs.} 38.9\%) for zero-shot object proposal on the LVIS dataset with the mask AR@$K$ metric. Qualitative results show that our approach generates fine-grained masks while avoiding over-segmenting things. This project, which targets faster SegEvery than the original SAM, is termed MobileSAMv2 to differentiate it from MobileSAM, which targets faster SegAny. Moreover, we demonstrate that our new prompt sampling is also compatible with the distilled image encoders in MobileSAM, contributing to a unified framework for efficient SegAny and SegEvery. The code is available at the same link as the MobileSAM project: \href{//github.com/ChaoningZhang/MobileSAM}{\textcolor{red}{//github.com/ChaoningZhang/MobileSAM}}.

Given a graph $G$, an integer $k\geq 0$, and a non-negative integral function $f:V(G) \rightarrow \mathcal{N}$, the {\sc Vector Domination} problem asks whether a set $S$ of vertices, of cardinality $k$ or less, exists in $G$ so that every vertex $v \in V(G)-S$ has at least $f(v)$ neighbors in $S$. The problem generalizes several domination problems and it has also been shown to generalize Bounded-Degree Vertex Deletion. In this paper, the parameterized version of Vector Domination is studied when the input graph is planar. A linear problem kernel is presented.
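The problem definition translates into a short verifier: given a candidate set $S$, check the size bound and the demand $f(v)$ of every vertex outside $S$. The sketch below only verifies a proposed solution; it does not implement the kernelization from the paper, and the adjacency-dictionary representation and function name are assumptions.

```python
from typing import Dict, Hashable, Iterable, Set


def is_vector_dominating_set(adj: Dict[Hashable, Set[Hashable]],
                             f: Dict[Hashable, int],
                             S: Iterable[Hashable],
                             k: int) -> bool:
    """Return True if |S| <= k and every vertex v outside S has at
    least f(v) neighbors inside S."""
    chosen = set(S)
    if len(chosen) > k:
        return False
    return all(len(adj[v] & chosen) >= f[v]
               for v in adj if v not in chosen)


# Path graph 0 - 1 - 2 with demand 1 at every vertex.
path = {0: {1}, 1: {0, 2}, 2: {1}}
demand = {0: 1, 1: 1, 2: 1}
print(is_vector_dominating_set(path, demand, {1}, k=1))  # True
```

With $f(v) = 1$ for all $v$, the check reduces to ordinary domination, which is one of the special cases the problem generalizes.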

We introduce a termination method for the algebraic graph transformation framework PBPO+, in which we weigh objects by summing a class of weighted morphisms targeting them. The method is well-defined in rm-adhesive quasitoposes (which include toposes and therefore many graph categories of interest), and is applicable to non-linear rules. The method is also defined for other frameworks, including SqPO and left-linear DPO, because we have previously shown that they are naturally encodable into PBPO+ in the quasitopos setting. We have implemented our method, and the implementation includes a REPL that can be used for guiding relative termination proofs.

Federated Learning (FL) is a decentralized machine-learning paradigm, in which a global server iteratively averages the model parameters of local users without accessing their data. User heterogeneity has imposed significant challenges to FL, which can incur drifted global models that are slow to converge. Knowledge Distillation has recently emerged to tackle this issue, by refining the server model using aggregated knowledge from heterogeneous users, rather than directly averaging their model parameters. This approach, however, depends on a proxy dataset, making it impractical unless such a prerequisite is satisfied. Moreover, the ensemble knowledge is not fully utilized to guide local model learning, which may in turn affect the quality of the aggregated model. Inspired by the prior art, we propose a data-free knowledge distillation approach to address heterogeneous FL, where the server learns a lightweight generator to ensemble user information in a data-free manner, which is then broadcast to users, regulating local training using the learned knowledge as an inductive bias. Empirical studies powered by theoretical implications show that our approach facilitates FL with better generalization performance using fewer communication rounds, compared with the state-of-the-art.
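The baseline that the abstract contrasts against, a server averaging user parameters, can be sketched in a few lines. This illustrates only the plain averaging step, not the proposed data-free distillation; weighting each user by its local sample count is an assumption of this sketch (the abstract says only that parameters are averaged), and the function name is hypothetical.

```python
from typing import List, Sequence


def federated_average(user_params: Sequence[Sequence[float]],
                      user_sizes: Sequence[int]) -> List[float]:
    """Average flattened parameter vectors from several users,
    weighting each user by its local sample count (assumed)."""
    total = sum(user_sizes)
    dim = len(user_params[0])
    return [
        sum(params[i] * size for params, size in zip(user_params, user_sizes)) / total
        for i in range(dim)
    ]


# Two users with 1 and 3 local samples respectively.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
# global_params == [2.5, 3.5]
```

Under heterogeneous user data, the averaged vector can drift away from any single user's optimum, which is the issue the distillation approach aims to address.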
