We describe two algorithms for multiplying n x n matrices using time and energy n^2 polylog(n) under basic models of classical physics. The first algorithm is for multiplying integer-valued matrices, and the second, quite different algorithm, is for Boolean matrix multiplication. We hope this work inspires a deeper consideration of physically plausible/realizable models of computing that might allow for algorithms which improve upon the runtimes and energy usages suggested by the parallel RAM model in which each operation requires one unit of time and one unit of energy.
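For readers unfamiliar with the second problem, Boolean matrix multiplication replaces the sum and product of ordinary matrix multiplication with logical OR and AND. The sketch below is only the textbook definition, with each row packed into a Python integer used as a bitset. It is not the physical algorithm the abstract describes, and the name `bool_matmul` is illustrative.

```python
def bool_matmul(a, b):
    """Boolean product of two square 0/1 matrices given as lists of lists.

    C[i][j] = OR over k of (A[i][k] AND B[k][j]).
    """
    n = len(a)
    # Pack each row of B into an integer bitset: bit j is set iff B[k][j] == 1.
    b_rows = [sum(bit << j for j, bit in enumerate(row)) for row in b]
    c = []
    for i in range(n):
        acc = 0
        for k in range(n):
            if a[i][k]:
                acc |= b_rows[k]  # OR in row k of B wherever A[i][k] is 1
        # Unpack the accumulated bitset back into a 0/1 row.
        c.append([(acc >> j) & 1 for j in range(n)])
    return c
```

This conventional version still performs on the order of n^2 word operations per output row in the worst case; the abstract's n^2 polylog(n) bound refers to a different, physically motivated cost model.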

Related content

Language modeling · MoDELS · Large language models · Supervision

January 19, 2024

How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition

Guanting Dong,Hongyi Yuan,Keming Lu,Chengpeng Li,Mingfeng Xue,Dayiheng Liu,Wei Wang,Zheng Yuan,Chang Zhou,Jingren Zhou

Large language models (LLMs) with enormous numbers of pre-training tokens and parameters exhibit diverse emergent abilities, including math reasoning, code generation, and instruction following. These abilities are further enhanced by supervised fine-tuning (SFT). While the open-source community has explored ad-hoc SFT for enhancing individual capabilities, proprietary LLMs exhibit versatility across various skills. Therefore, understanding the facilitation of multiple abilities via SFT is paramount. In this study, we specifically focus on the interplay of data composition between mathematical reasoning, code generation, and general human-aligning abilities during SFT. We propose four intriguing research questions to explore the association between model performance and various factors including data amount, composition ratio, model size and SFT strategies. Our experiments reveal that distinct capabilities scale differently and that larger models generally show superior performance with the same amount of data. Mathematical reasoning and code generation consistently improve with increasing data amount, whereas general abilities plateau after roughly a thousand samples. Moreover, we observe that data composition appears to enhance various abilities under limited data conditions, yet can lead to performance conflicts when data is plentiful. Our findings also suggest that the amount of composition data influences performance more than the composition ratio. In our analysis of SFT strategies, we find that sequentially learning multiple skills risks catastrophic forgetting. Our proposed Dual-stage Mixed Fine-tuning (DMT) strategy offers a promising solution for learning multiple abilities with different scaling patterns.

Generalization error · Learning · Generalization theory · Latent variables · MoDELS

January 19, 2024

Generalization Error Guaranteed Auto-Encoder-Based Nonlinear Model Reduction for Operator Learning

Hao Liu,Biraj Dahal,Rongjie Lai,Wenjing Liao

Many physical processes in science and engineering are naturally represented by operators between infinite-dimensional function spaces. The problem of operator learning, in this context, seeks to extract these physical processes from empirical data, which is challenging due to the infinite or high dimensionality of data. An integral component in addressing this challenge is model reduction, which reduces both the data dimensionality and problem size. In this paper, we utilize low-dimensional nonlinear structures in model reduction by investigating Auto-Encoder-based Neural Network (AENet). AENet first learns the latent variables of the input data and then learns the transformation from these latent variables to corresponding output data. Our numerical experiments validate the ability of AENet to accurately learn the solution operator of nonlinear partial differential equations. Furthermore, we establish a mathematical and statistical estimation theory that analyzes the generalization error of AENet. Our theoretical framework shows that the sample complexity of training AENet is intricately tied to the intrinsic dimension of the modeled process, while also demonstrating the remarkable resilience of AENet to noise.

Estimation/estimator · Reducible · Overestimation · Example · CASE

January 18, 2024

Residual Based Error Estimator for Chemical-Mechanically Coupled Battery Active Particles

Raphael Schoof,Lennart Flür,Florian Tuschner,Willy Dörfler

Adaptive finite element methods are a powerful tool to obtain numerical simulation results in a reasonable time. Due to complex chemical and mechanical couplings in lithium-ion batteries, numerical simulations are very helpful to investigate promising new battery active materials such as amorphous silicon featuring a higher energy density than graphite. Based on a thermodynamically consistent continuum model with large deformation and chemo-mechanically coupled approach, we compare three different spatial adaptive refinement strategies: Kelly-, gradient recovery- and residual based error estimation. For the residual based case, the strong formulation of the residual is explicitly derived. With amorphous silicon as example material, we investigate two 3D representative host particle geometries, reduced with symmetry assumptions to a 1D unit interval and a 2D elliptical domain. Our numerical studies show that the Kelly estimator overestimates the error, whereas the gradient recovery estimator leads to lower refinement levels and a good capture of the change of the lithium flux. The residual based error estimator reveals a strong dependency on the cell error part which can be improved by a more suitable choice of constants to be more efficient. In a 2D domain, the concentration has a larger influence on the mesh distribution than the Cauchy stress.

Square matrix · Identifiable · Sample complexity · state-of-the-art · HTTPS

January 18, 2024

Recovering Simultaneously Structured Data via Non-Convex Iteratively Reweighted Least Squares

Christian Kümmerle,Johannes Maly

from arxiv, 35 pages, 7 figures

We propose a new algorithm for the problem of recovering data that adheres to multiple, heterogeneous low-dimensional structures from linear observations. Focusing on data matrices that are simultaneously row-sparse and low-rank, we propose and analyze an iteratively reweighted least squares (IRLS) algorithm that is able to leverage both structures. In particular, it optimizes a combination of non-convex surrogates for row-sparsity and rank, a balancing of which is built into the algorithm. We prove locally quadratic convergence of the iterates to a simultaneously structured data matrix in a regime of minimal sample complexity (up to constants and a logarithmic factor), which is known to be impossible for a combination of convex surrogates. In experiments, we show that the IRLS method exhibits favorable empirical convergence, identifying simultaneously row-sparse and low-rank matrices from fewer measurements than state-of-the-art methods. Code is available at //github.com/ckuemmerle/simirls.

AutoML · Learning · Reducible · Performer · Dataset

January 18, 2024

A Meta-Level Learning Algorithm for Sequential Hyper-Parameter Space Reduction in AutoML

Giorgos Borboudakis,Paulos Charonyktakis,Konstantinos Paraschakis,Ioannis Tsamardinos

AutoML platforms have numerous options for the algorithms to try for each step of the analysis, i.e., different possible algorithms for imputation, transformations, feature selection, and modelling. Finding the optimal combination of algorithms and hyper-parameter values is computationally expensive, as the number of combinations to explore leads to an exponential explosion of the space. In this paper, we present the Sequential Hyper-parameter Space Reduction (SHSR) algorithm that reduces the space for an AutoML tool with negligible drop in its predictive performance. SHSR is a meta-level learning algorithm that analyzes past runs of an AutoML tool on several datasets and learns which hyper-parameter values to filter out from consideration on a new dataset to analyze. SHSR is evaluated on 284 classification and 375 regression problems, showing an approximate 30% reduction in execution time with a performance drop of less than 0.1%.

Approximation · Generalization theory · Operation · Robustness · Networking

January 17, 2024

Approximating Numerical Fluxes Using Fourier Neural Operators for Hyperbolic Conservation Laws

Taeyoung Kim,Myungjoo Kang

from arxiv, 26 pages, 28 figures

Traditionally, classical numerical schemes have been employed to solve partial differential equations (PDEs) using computational methods. Recently, neural network-based methods have emerged. Despite these advancements, neural network-based methods, such as physics-informed neural networks (PINNs) and neural operators, exhibit deficiencies in robustness and generalization. To address these issues, numerous studies have integrated classical numerical frameworks with machine learning techniques, incorporating neural networks into parts of traditional numerical methods. In this study, we focus on hyperbolic conservation laws by replacing traditional numerical fluxes with neural operators. To this end, we developed loss functions inspired by established numerical schemes related to conservation laws and approximated numerical fluxes using Fourier neural operators (FNOs). Our experiments demonstrated that our approach combines the strengths of both traditional numerical schemes and FNOs, outperforming standard FNO methods in several respects. For instance, we demonstrate that our method is robust, has resolution invariance, and is feasible as a data-driven method. In particular, our method can make continuous predictions over time and exhibits superior generalization capabilities with out-of-distribution (OOD) samples, which are challenges that existing neural operator methods encounter.
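To make the role of a numerical flux concrete, here is a minimal sketch of a conservative finite-volume update for the inviscid Burgers equation with the classical Lax-Friedrichs flux on a periodic grid. This is the kind of hand-designed flux the abstract proposes to replace with a learned operator; it is not the authors' FNO-based method, and the function names and the `numerical_flux` parameter are assumptions made for illustration.

```python
def burgers_flux(u):
    """Physical flux f(u) = u^2 / 2 of the inviscid Burgers equation."""
    return 0.5 * u * u


def lax_friedrichs_flux(u_left, u_right, dx, dt):
    """Classical Lax-Friedrichs numerical flux at one cell interface."""
    average = 0.5 * (burgers_flux(u_left) + burgers_flux(u_right))
    dissipation = 0.5 * (dx / dt) * (u_right - u_left)
    return average - dissipation


def step(u, dx, dt, numerical_flux=lax_friedrichs_flux):
    """One conservative finite-volume update on a periodic grid."""
    n = len(u)
    # flux[i] approximates the flux through the interface between cells i and i+1;
    # the modulo wraps the last interface around to cell 0 (periodic boundary).
    flux = [numerical_flux(u[i], u[(i + 1) % n], dx, dt) for i in range(n)]
    # flux[i - 1] with i == 0 reads flux[-1], the wrapped interface, as intended.
    return [u[i] - (dt / dx) * (flux[i] - flux[i - 1]) for i in range(n)]
```

Because every interface flux is added to one cell and subtracted from its neighbor, the sum of the cell values is preserved regardless of which flux function is plugged in. In this sketch a learned flux would be passed through the hypothetical `numerical_flux` argument; how the paper trains and integrates its neural operator is not described here.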

Lipschitz · Lipschitz continuity · Continuity · Rescaling · Discretization

January 16, 2024

A Continuous-Time Perspective on Global Acceleration for Monotone Equation Problems

Tianyi Lin,Michael I. Jordan

from arxiv, Accepted by Communications in Optimization Theory; 29 Pages

We propose a new framework to design and analyze accelerated methods that solve general monotone equation (ME) problems $F(x)=0$. Traditional approaches include generalized steepest descent methods and inexact Newton-type methods. If $F$ is uniformly monotone and twice differentiable, these methods achieve local convergence rates while the latter methods are globally convergent thanks to line search and hyperplane projection. However, a global rate is unknown for these methods. The variational inequality methods can be applied to yield a global rate that is expressed in terms of $\|F(x)\|$ but these results are restricted to first-order methods and a Lipschitz continuous operator. It has not been clear how to obtain global acceleration using high-order Lipschitz continuity. This paper takes a continuous-time perspective where accelerated methods are viewed as the discretization of dynamical systems. Our contribution is to propose accelerated rescaled gradient systems and prove that they are equivalent to closed-loop control systems. Based on this connection, we establish the properties of solution trajectories. Moreover, we provide a unified algorithmic framework obtained from discretization of our system, which together with two approximation subroutines yields both existing high-order methods and new first-order methods. We prove that the $p^{th}$-order method achieves a global rate of $O(k^{-p/2})$ in terms of $\|F(x)\|$ if $F$ is $p^{th}$-order Lipschitz continuous and the first-order method achieves the same rate if $F$ is $p^{th}$-order strongly Lipschitz continuous. If $F$ is strongly monotone, the restarted versions achieve local convergence with order $p$ when $p \geq 2$. Our discrete-time analysis is largely motivated by the continuous-time analysis and demonstrates the fundamental role that rescaled gradients play in global acceleration for solving ME problems.

LDPC · Performer · Decoding · Distillation · Threshold

January 16, 2024

Entanglement Purification with Quantum LDPC Codes and Iterative Decoding

Narayanan Rengaswamy,Nithin Raveendran,Ankur Raina,Bane Vasić

from arxiv, Final accepted version in Quantum; includes a new algorithm to generate logical Pauli operators for stabilizer codes; our software is available at: //github.com/nrenga/ghz_distillation_qec/tree/main/qldpc-ghz_protocol_II and //zenodo.org/record/8284903. arXiv admin note: substantial text overlap with arXiv:2109.06248

Recent constructions of quantum low-density parity-check (QLDPC) codes provide optimal scaling of the number of logical qubits and the minimum distance in terms of the code length, thereby opening the door to fault-tolerant quantum systems with minimal resource overhead. However, the hardware path from nearest-neighbor-connection-based topological codes to long-range-interaction-demanding QLDPC codes is a challenging one. Given the practical difficulty in building a monolithic architecture for quantum computers based on optimal QLDPC codes, it is worth considering a distributed implementation of such codes over a network of interconnected quantum processors. In such a setting, all syndrome measurements and logical operations must be performed using high-fidelity shared entangled states between the processing nodes. Since probabilistic many-to-1 distillation schemes for purifying entanglement are inefficient, we investigate quantum error correction based entanglement purification in this work. Specifically, we employ QLDPC codes to distill GHZ states, as the resulting high-fidelity logical GHZ states can interact directly with the code used to perform distributed quantum computing (DQC), e.g. for fault-tolerant Steane syndrome extraction. This protocol is applicable beyond DQC since entanglement purification is a quintessential task of any quantum network. We use the min-sum algorithm (MSA) based iterative decoder for distilling $3$-qubit GHZ states using a rate $0.118$ family of lifted product QLDPC codes and obtain an input threshold of $\approx 0.7974$ under i.i.d. single-qubit depolarizing noise. This represents the best threshold for a yield of $0.118$ for any GHZ purification protocol. Our results apply to larger size GHZ states as well, where we extend our technical result about a measurement property of $3$-qubit GHZ states to construct a scalable GHZ purification protocol.

Knowledge · MoDELS · Graph · Knowledge graph · AIM

December 12, 2022

Reasoning over Different Types of Knowledge Graphs: Static, Temporal and Multi-Modal

Ke Liang,Lingyuan Meng,Meng Liu,Yue Liu,Wenxuan Tu,Siwei Wang,Sihang Zhou,Xinwang Liu,Fuchun Sun

from arxiv, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Knowledge graph reasoning (KGR), aiming to deduce new facts from existing facts based on mined logic rules underlying knowledge graphs (KGs), has become a fast-growing research direction. It has been proven to significantly benefit the usage of KGs in many AI applications, such as question answering and recommendation systems. According to the graph types, the existing KGR models can be roughly divided into three categories, i.e., static models, temporal models, and multi-modal models. The early works in this domain mainly focus on static KGR and tend to directly apply general knowledge graph embedding models to the reasoning task. However, these models are not suitable for more complex but practical tasks, such as inductive static KGR, temporal KGR, and multi-modal KGR. To this end, multiple works have been developed recently, but no survey papers or open-source repositories comprehensively summarize and discuss models in this important direction. To fill the gap, we conduct a survey of knowledge graph reasoning, tracing from static to temporal and then to multi-modal KGs. Concretely, the preliminaries, summaries of KGR models, and typical datasets are introduced and discussed in turn. Moreover, we discuss the challenges and potential opportunities. The corresponding open-source repository is shared on GitHub: //github.com/LIANGKE23/Awesome-Knowledge-Graph-Reasoning.

Reducible · Model evaluation · Object detection · FAST · Processing (programming language)

March 27, 2018

Dynamic Zoom-in Network for Fast Object Detection in Large Images

Mingfei Gao,Ruichi Yu,Ang Li,Vlad I. Morariu,Larry S. Davis

from arxiv, CVPR2018

We introduce a generic framework that reduces the computational cost of object detection while retaining accuracy for scenarios where objects with varied sizes appear in high resolution images. Detection progresses in a coarse-to-fine manner, first on a down-sampled version of the image and then on a sequence of higher resolution regions identified as likely to improve the detection accuracy. Built upon reinforcement learning, our approach consists of a model (R-net) that uses coarse detection results to predict the potential accuracy gain for analyzing a region at a higher resolution and another model (Q-net) that sequentially selects regions to zoom in on. Experiments on the Caltech Pedestrians dataset show that our approach reduces the number of processed pixels by over 50% without a drop in detection accuracy. The merits of our approach become more significant on a high resolution test set collected from the YFCC100M dataset, where our approach maintains high detection performance while reducing the number of processed pixels by about 70% and the detection time by over 50%.