ML is shifting from the cloud to the edge. Edge computing reduces the surface exposing private data and enables reliable throughput guarantees in real-time applications. Of the panoply of devices deployed at the edge, resource-constrained MCUs, e.g., Arm Cortex-M, are more prevalent, orders of magnitude cheaper, and less power-hungry than application processors or GPUs. Thus, enabling intelligence at the deep edge is the zeitgeist, with researchers focusing on unveiling novel approaches to deploy ANNs on these constrained devices. Quantization is a well-established technique that has proved effective in enabling the deployment of neural networks on MCUs; however, the robustness of QNNs in the face of adversarial examples remains an open question. To fill this gap, we empirically evaluate the effectiveness of attacks and defenses from (full-precision) ANNs on (constrained) QNNs. Our evaluation includes three QNNs targeting TinyML applications, ten attacks, and six defenses. With this study, we draw a set of interesting findings. First, quantization increases the point distance to the decision boundary and leads the gradient estimated by some attacks to explode or vanish. Second, quantization can act as a noise attenuator or amplifier, depending on the noise magnitude, and causes gradient misalignment. Regarding adversarial defenses, we conclude that input pre-processing defenses show impressive results on small perturbations; however, they fall short as the perturbation increases. At the same time, train-based defenses increase the average point distance to the decision boundary, which holds after quantization. However, we argue that train-based defenses still need to smooth the quantization-shift and gradient misalignment phenomena to counteract adversarial example transferability to QNNs. All artifacts are open-sourced to enable independent validation of results.
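The observation that quantization can attenuate or amplify noise depending on its magnitude can be illustrated with a toy uniform quantizer. This is a minimal sketch and not the evaluation code from the paper: the function names `quantize` and `quantized_shift`, the step size of 0.25, and the sample values are all assumptions chosen for illustration.

```python
import math


def quantize(x: float, step: float = 0.25) -> float:
    """Snap x to the nearest multiple of step (toy uniform quantizer)."""
    # floor(v + 0.5) rounds half up and avoids Python's round-half-to-even.
    return math.floor(x / step + 0.5) * step


def quantized_shift(x: float, noise: float, step: float = 0.25) -> float:
    """Return how far the quantized value moves when noise is added to x."""
    return quantize(x + noise, step) - quantize(x, step)


# Hypothetical input that sits exactly on a quantization level.
x = 0.5
small = quantized_shift(x, 0.05)  # noise below half a step
large = quantized_shift(x, 0.13)  # noise just above half a step
print(small, large)
```

In this toy setting, the 0.05 perturbation is rounded away entirely (attenuation), while the 0.13 perturbation crosses a rounding threshold and shows up as a full 0.25 step (amplification).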

Related content

Edge

MoDELS · Origin · Online · Recommender systems · INFORMS

June 13, 2024

Retrieval and Distill: A Temporal Data Shift-Free Paradigm for Online Recommendation System

Lei Zheng,Ning Li,Weinan Zhang,Yong Yu

Current recommendation systems are significantly affected by a serious issue of temporal data shift, which is the inconsistency between the distribution of historical data and that of online data. Most existing models focus on utilizing updated data, overlooking the transferable, temporal data shift-free information that can be learned from shifting data. We propose the Temporal Invariance of Association theorem, which suggests that given a fixed search space, the relationship between the data and the data in the search space remains invariant over time. Leveraging this principle, we designed a retrieval-based recommendation system framework that can train a data shift-free relevance network using shifting data, significantly enhancing the predictive performance of the original model in the recommendation system. However, retrieval-based recommendation models face substantial inference time costs when deployed online. To address this, we further designed a distillation framework that can distill information from the relevance network into a parameterized module using shifting data. The distilled model can be deployed online alongside the original model, with only a minimal increase in inference time. Extensive experiments on multiple real datasets demonstrate that our framework significantly improves the performance of the original model by utilizing shifting data.

Linear · Linear regression · Exchangeable · Statistic · Scenario

June 11, 2024

The Exchangeability Assumption for Permutation Tests of Multiple Regression Models: Implications for Statistics and Data Science

Johanna Hardin,Lauren Quesada,Julie Ye,Nicholas J. Horton

Permutation tests are a powerful and flexible approach to inference via resampling. As computational methods become more ubiquitous in the statistics curriculum, use of permutation tests has become more tractable. At the heart of the permutation approach is the exchangeability assumption, which determines the appropriate null sampling distribution. We explore the exchangeability assumption in the context of permutation tests for multiple linear regression models. Various permutation schemes for the multiple linear regression setting have been previously proposed and assessed in the literature. As has been demonstrated previously, in most settings, the choice of how to permute a multiple linear regression model does not materially change inferential conclusions. Regardless, we believe that (1) understanding exchangeability in the multiple linear regression setting and also (2) how it relates to the null hypothesis of interest is valuable. We also briefly explore model settings beyond multiple linear regression (e.g., settings where clustering or hierarchical relationships exist) as a motivation for the benefit and flexibility of permutation tests. We close with pedagogical recommendations for instructors who want to bring multiple linear regression permutation inference into their classroom as a way to deepen student understanding of resampling-based inference.
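The basic idea of a permutation test for a regression coefficient can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not one of the schemes assessed in the paper: it uses a single predictor rather than multiple regression, permutes the response values, and relies on hypothetical names (`slope`, `permutation_p_value`) and made-up data.

```python
import random


def slope(x, y):
    """Least-squares slope of y on x for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the null of no association.

    Under the null, the responses are assumed exchangeable, so shuffling
    y against x generates the null distribution of the slope.
    """
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(slope(x, shuffled)) >= observed:
            hits += 1
    # Add one to both counts so the observed arrangement is included.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data with a strong linear trend.
x = list(range(10))
y = [2 * a + 1 for a in x]
print(permutation_p_value(x, y))
```

Shuffling the response is only valid when the responses are exchangeable under the null hypothesis, which is the assumption the abstract highlights. With several predictors, other schemes (for example, permuting residuals) are possible.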

Separated · Large language models · Functional · Controller · Guidance

June 11, 2024

Guiding LLM Temporal Logic Generation with Explicit Separation of Data and Control

William Murphy,Nikolaus Holzer,Nathan Koenig,Leyi Cui,Raven Rothkopf,Feitong Qiao,Mark Santolucito

Temporal logics are powerful tools that are widely used for the synthesis and verification of reactive systems. The recent progress on Large Language Models (LLMs) has the potential to make the process of writing such specifications more accessible. However, writing specifications in temporal logics remains challenging for all but the most expert users. A key question in using LLMs for temporal logic specification engineering is to understand what kind of guidance is most helpful to the LLM and the users to easily produce specifications. Looking specifically at the problem of reactive program synthesis, we explore the impact of providing an LLM with guidance on the separation of control and data--making explicit for the LLM what functionality is relevant for the specification, and treating the remaining functionality as an implementation detail for a series of pre-defined functions and predicates. We present a benchmark set and find that this separation of concerns improves specification generation. Our benchmark provides a test set against which to verify future work in LLM generation of temporal logic specifications.

NLP · Paper · ACL · Sample · Natural language processing

June 10, 2024

Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research

Surangika Ranathunga,Nisansa de Silva,Dilith Jayakody,Aloka Fernando

from arxiv, Will appear in ACL 2024

We analysed a sample of NLP research papers archived in ACL Anthology as an attempt to quantify the degree of openness and the benefit of such an open culture in the NLP community. We observe that papers published in different NLP venues show different patterns related to artefact reuse. We also note that more than 30% of the papers we analysed do not release their artefacts publicly, despite promising to do so. Further, we observe a wide language-wise disparity in publicly available NLP-related artefacts.

Identifiable · Latent · Stationary · Learning · International Digital Economy Academy (IDEA)

June 7, 2024

When and How: Learning Identifiable Latent States for Nonstationary Time Series Forecasting

Zijian Li,Ruichu Cai,Zhenhui Yang,Haiqin Huang,Guangyi Chen,Yifan Shen,Zhengming Chen,Xiangchen Song,Kun Zhang

Temporal distribution shifts are ubiquitous in time series data. One of the most popular methods assumes that the temporal distribution shift occurs uniformly to disentangle the stationary and nonstationary dependencies. But this assumption is difficult to meet, as we do not know when the distribution shifts occur. To solve this problem, we propose to learn IDentifiable latEnt stAtes (IDEA) to detect when the distribution shifts occur. Beyond that, we further disentangle the stationary and nonstationary latent states via a sufficient observation assumption to learn how the latent states change. Specifically, we formalize the causal process with environment-irrelated stationary and environment-related nonstationary variables. Under mild conditions, we show that latent environments and stationary/nonstationary variables are identifiable. Based on these theories, we devise the IDEA model, which incorporates an autoregressive hidden Markov model to estimate latent environments and modular prior networks to identify latent states. The IDEA model outperforms several recent nonstationary forecasting methods on various benchmark datasets, highlighting its advantages in real-world scenarios.

Graph · GPU · Networking · Neural Networks · Confidence

June 6, 2024

GNNAnatomy: Systematic Generation and Evaluation of Multi-Level Explanations for Graph Neural Networks

Hsiao-Ying Lu,Yiran Li,Ujwal Pratap Krishna Kaluvakolanu Thyagarajan,Kwan-Liu Ma

Graph Neural Networks (GNNs) have proven highly effective in various machine learning (ML) tasks involving graphs, such as node/graph classification and link prediction. However, explaining the decisions made by GNNs poses challenges because of the aggregated relational information based on graph structure, leading to complex data transformations. Existing methods for explaining GNNs often face limitations in systematically exploring diverse substructures and evaluating results in the absence of ground truths. To address this gap, we introduce GNNAnatomy, a model- and dataset-agnostic visual analytics system designed to facilitate the generation and evaluation of multi-level explanations for GNNs. In GNNAnatomy, we employ graphlets to elucidate GNN behavior in graph-level classification tasks. By analyzing the associations between GNN classifications and graphlet frequencies, we formulate hypothesized factual and counterfactual explanations. To validate a hypothesized graphlet explanation, we introduce two metrics: (1) the correlation between its frequency and the classification confidence, and (2) the change in classification confidence after removing this substructure from the original graph. To demonstrate the effectiveness of GNNAnatomy, we conduct case studies on both real-world and synthetic graph datasets from various domains. Additionally, we qualitatively compare GNNAnatomy with a state-of-the-art GNN explainer, demonstrating the utility and versatility of our design.

Layer · SimPLe · Tokenizer · Transformation · Vision

June 6, 2024

DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs

Lingchen Meng,Jianwei Yang,Rui Tian,Xiyang Dai,Zuxuan Wu,Jianfeng Gao,Yu-Gang Jiang

from arxiv, Project Page: //deepstack-vl.github.io/

Most large multimodal models (LMMs) are implemented by feeding visual tokens as a sequence into the first layer of a large language model (LLM). The resulting architecture is simple but significantly increases computation and memory costs, as it has to handle a large number of additional tokens in its input layer. This paper presents a new architecture, DeepStack, for LMMs. Considering N layers in the language and vision transformer of LMMs, we stack the visual tokens into N groups and feed each group to its aligned transformer layer from bottom to top. Surprisingly, this simple method greatly enhances the power of LMMs to model interactions among visual tokens across layers but with minimal additional cost. We apply DeepStack to both the language and vision transformers in LMMs, and validate the effectiveness of DeepStack LMMs with extensive empirical results. Using the same context length, our DeepStack models with 7B and 13B parameters surpass their counterparts by 2.7 and 2.9 on average across 9 benchmarks, respectively. Using only one-fifth of the context length, DeepStack closely rivals the counterparts that use the full context length. These gains are particularly pronounced on high-resolution tasks, e.g., 4.2, 11.0, and 4.0 improvements on TextVQA, DocVQA, and InfoVQA compared to LLaVA-1.5-7B, respectively. We further apply DeepStack to vision transformer layers, which brings a similar amount of improvement, 3.8 on average compared with LLaVA-1.5-7B.

Controller · Operation · INFORMS · Performer · Functional

June 5, 2024

The Semantics of Effects: Centrality, Quantum Control and Reversible Recursion

Louis Lemonnier

from arxiv, PhD thesis from Université Paris-Saclay

This thesis revolves around an area of computer science called "semantics". We work with operational semantics, equational theories, and denotational semantics. The first contribution of this thesis is a study of the commutativity of effects through the prism of monads. Monads are the generalisation of algebraic structures such as monoids, which have a notion of centre: the centre of a monoid is made of elements which commute with all others. We provide the necessary and sufficient conditions for a monad to have a centre. We also detail the semantics of a language with effects that carry information on which effects are central. Moreover, we provide a strong link between its equational theories and its denotational semantics. The second focus of the thesis is quantum computing, seen as a reversible effect. Physically permissible quantum operations are all reversible, except measurement; however, measurement can be deferred to the end of the computation, allowing us to focus on the reversible part first. We define a simply-typed reversible programming language performing quantum operations called "unitaries". A denotational semantics and an equational theory adapted to the language are presented, and we prove that the former is complete. Furthermore, we study recursion in reversible programming, providing adequate operational and denotational semantics to a Turing-complete, reversible, functional programming language. The denotational semantics uses the dcpo enrichment of rig join inverse categories. This mathematical account of higher-order reasoning on reversible computing does not directly generalise to its quantum counterpart. In the conclusion, we detail the limitations and possible future for higher-order quantum control through guarded recursion.
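The notion of centre that the abstract recalls for monoids can be checked by brute force on small finite examples. The sketch below covers only that elementary monoid notion, not the monad-level construction developed in the thesis; the helper names `centre` and `compose` and the two example monoids are assumptions chosen for illustration.

```python
from itertools import permutations


def centre(elements, op):
    """Return the elements that commute with every element under op."""
    return [a for a in elements if all(op(a, b) == op(b, a) for b in elements)]


def compose(p, q):
    """Compose permutations stored as tuples: (p after q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


# Non-commutative example: all permutations of three items under composition.
s3 = list(permutations(range(3)))
# Commutative example: integers modulo 4 under multiplication.
z4 = list(range(4))

print(centre(s3, compose))
print(centre(z4, lambda a, b: (a * b) % 4))
```

For the permutations, only the identity commutes with everything, so the centre is trivial. For multiplication modulo 4, every element is central because the operation is commutative.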

Learned · Big data · Same · Artificial intelligence · Statistical methods

May 5, 2020

A Survey of Learning Causality with Data: Problems and Methods

Ruocheng Guo,Lu Cheng,Jundong Li,P. Richard Hahn,Huan Liu

from arxiv, 35 pages, accepted by ACM CSUR

This work considers the question of how convenient access to copious data impacts our ability to learn causal effects and relations. In what ways is learning causality in the era of big data different from -- or the same as -- the traditional one? To answer this question, this survey provides a comprehensive and structured review of both traditional and frontier methods in learning causality and relations along with the connections between causality and machine learning. This work points out on a case-by-case basis how big data facilitates, complicates, or motivates each approach.

Graph · Neural Networks · GPU · Networking · INFORMS

December 20, 2018

Graph Neural Networks: A Review of Methods and Applications

Jie Zhou,Ganqu Cui,Zhengyan Zhang,Cheng Yang,Zhiyuan Liu,Maosong Sun

Many learning tasks require dealing with graph data, which contains rich relational information among elements. Modeling physical systems, learning molecular fingerprints, predicting protein interfaces, and classifying diseases all require a model to learn from graph inputs. In other domains such as learning from non-structural data like texts and images, reasoning on extracted structures, like the dependency tree of sentences and the scene graph of images, is an important research topic which also needs graph reasoning models. Graph neural networks (GNNs) are connectionist models that capture the dependence of graphs via message passing between the nodes of graphs. Unlike standard neural networks, graph neural networks retain a state that can represent information from its neighborhood with an arbitrary depth. Although the primitive graph neural networks have been found difficult to train for a fixed point, recent advances in network architectures, optimization techniques, and parallel computation have enabled successful learning with them. In recent years, systems based on graph convolutional networks (GCNs) and gated graph neural networks (GGNNs) have demonstrated ground-breaking performance on many of the tasks mentioned above. In this survey, we provide a detailed review of existing graph neural network models, systematically categorize the applications, and propose four open problems for future research.
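Message passing between nodes can be sketched without any learned parameters. The example below is a deliberately simplified illustration, not a GCN or GGNN from the survey: it assumes scalar node states, mean aggregation over a node and its neighbours, and a hypothetical three-node path graph, with the function name `message_passing_step` chosen for illustration.

```python
def message_passing_step(states, neighbors):
    """One round: each node averages its own state with its neighbours' states."""
    updated = {}
    for node, state in states.items():
        pool = [state] + [states[n] for n in neighbors.get(node, [])]
        updated[node] = sum(pool) / len(pool)
    return updated


# Hypothetical path graph a - b - c, with a signal only on node a.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
states = {"a": 1.0, "b": 0.0, "c": 0.0}

after_one = message_passing_step(states, neighbors)
after_two = message_passing_step(after_one, neighbors)
print(after_one)
print(after_two)
```

After one round, node c has seen nothing of the signal on node a because the two are two hops apart. After the second round, its state becomes non-zero, which illustrates how repeated rounds let a node's state reflect progressively deeper neighbourhoods.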