亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We study feature interactions in the context of feature attribution methods for post-hoc interpretability. In interpretability research, getting to grips with feature interactions is increasingly recognised as an important challenge, because interacting features are key to the success of neural networks. Feature interactions allow a model to build up hierarchical representations for its input, and might provide an ideal starting point for the investigation into linguistic structure in language models. However, uncovering the exact role that these interactions play is also difficult, and a diverse range of interaction attribution methods has been proposed. In this paper, we focus on the question which of these methods most faithfully reflects the inner workings of the target models. We work out a grey box methodology, in which we train models to perfection on a formal language classification task, using PCFGs. We show that under specific configurations, some methods are indeed able to uncover the grammatical rules acquired by a model. Based on these findings we extend our evaluation to a case study on language models, providing novel insights into the linguistic structure that these models have acquired.

相關內容

IFIP TC13 Conference on Human-Computer Interaction是人機交互領域的研究者和實踐者展示其工作的重要平臺。多年來,這些會議吸引了來自幾個國家和文化的研究人員。官網鏈接: · 再參數化/重參數化 · 分離的 · Better · 非凸 ·
2023 年 8 月 14 日

Pathwise coordinate descent algorithms have been used to compute entire solution paths for lasso and other penalized regression problems quickly with great success. They improve upon cold start algorithms by solving the problems that make up the solution path sequentially for an ordered set of tuning parameter values, instead of solving each problem separately. However, extending pathwise coordinate descent algorithms to more the general bridge or power family of $\ell_q$ penalties is challenging. Faster algorithms for computing solution paths for these penalties are needed because $\ell_q$ penalized regression problems can be nonconvex and especially burdensome to solve. In this paper, we show that a reparameterization of $\ell_q$ penalized regression problems is more amenable to pathwise coordinate descent algorithms. This allows us to improve computation of the mode-thresholding function for $\ell_q$ penalized regression problems in practice and introduce two separate pathwise algorithms. We show that either pathwise algorithm is faster than the corresponding cold-start alternative, and demonstrate that different pathwise algorithms may be more likely to reach better solutions.

This paper proposes a new approach for change point detection in causal networks of multivariate Hawkes processes using Frechet statistics. Our method splits the point process into overlapping windows, estimates kernel matrices in each window, and reconstructs the signed Laplacians by treating the kernel matrices as the adjacency matrices of the causal network. We demonstrate the effectiveness of our method through experiments on both simulated and real-world cryptocurrency datasets. Our results show that our method is capable of accurately detecting and characterizing changes in the causal structure of multivariate Hawkes processes, and may have potential applications in fields such as finance and neuroscience. The proposed method is an extension of previous work on Frechet statistics in point process settings and represents an important contribution to the field of change point detection in multivariate point processes.

The performance of graph representation learning is affected by the quality of graph input. While existing research usually pursues a globally smoothed graph embedding, we believe the rarely observed anomalies are as well harmful to an accurate prediction. This work establishes a graph learning scheme that automatically detects (locally) corrupted feature attributes and recovers robust embedding for prediction tasks. The detection operation leverages a graph autoencoder, which does not make any assumptions about the distribution of the local corruptions. It pinpoints the positions of the anomalous node attributes in an unbiased mask matrix, where robust estimations are recovered with sparsity promoting regularizer. The optimizer approaches a new embedding that is sparse in the framelet domain and conditionally close to input observations. Extensive experiments are provided to validate our proposed model can recover a robust graph representation from black-box poisoning and achieve excellent performance.

Hyperproperties extend trace properties to express properties of sets of traces, and they are increasingly popular in specifying various security and performance-related properties in domains such as cyber-physical systems, smart grids, and automotive. This paper introduces a model checking algorithm for a new formalism, HyperTWTL, which extends Time Window Temporal Logic (TWTL) -- a domain-specific formal specification language for robotics, by allowing explicit and simultaneous quantification over multiple execution traces. We present HyperTWTL with both \emph{synchronous} and \emph{asynchronous} semantics, based on the alignment of the timestamps in the traces. Consequently, we demonstrate the application of HyperTWTL in formalizing important information-flow security policies and concurrency for robotics applications. Finally, we propose a model checking algorithm for verifying fragments of HyperTWTL by reducing the problem to a TWTL model checking problem.

Conformal inference is a popular tool for constructing prediction intervals (PI). We consider here the scenario of post-selection/selective conformal inference, that is PIs are reported only for individuals selected from an unlabeled test data. To account for multiplicity, we develop a general split conformal framework to construct selective PIs with the false coverage-statement rate (FCR) control. We first investigate the Benjamini and Yekutieli (2005)'s FCR-adjusted method in the present setting, and show that it is able to achieve FCR control but yields uniformly inflated PIs. We then propose a novel solution to the problem, named as Selective COnditional conformal Predictions (SCOP), which entails performing selection procedures on both calibration set and test set and construct marginal conformal PIs on the selected sets by the aid of conditional empirical distribution obtained by the calibration set. Under a unified framework and exchangeable assumptions, we show that the SCOP can exactly control the FCR. More importantly, we provide non-asymptotic miscoverage bounds for a general class of selection procedures beyond exchangeablity and discuss the conditions under which the SCOP is able to control the FCR. As special cases, the SCOP with quantile-based selection or conformal p-values-based multiple testing procedures enjoys valid coverage guarantee under mild conditions. Numerical results confirm the effectiveness and robustness of SCOP in FCR control and show that it achieves more narrowed PIs over existing methods in many settings.

Graph neural networks (GNNs) is widely used to learn a powerful representation of graph-structured data. Recent work demonstrates that transferring knowledge from self-supervised tasks to downstream tasks could further improve graph representation. However, there is an inherent gap between self-supervised tasks and downstream tasks in terms of optimization objective and training data. Conventional pre-training methods may be not effective enough on knowledge transfer since they do not make any adaptation for downstream tasks. To solve such problems, we propose a new transfer learning paradigm on GNNs which could effectively leverage self-supervised tasks as auxiliary tasks to help the target task. Our methods would adaptively select and combine different auxiliary tasks with the target task in the fine-tuning stage. We design an adaptive auxiliary loss weighting model to learn the weights of auxiliary tasks by quantifying the consistency between auxiliary tasks and the target task. In addition, we learn the weighting model through meta-learning. Our methods can be applied to various transfer learning approaches, it performs well not only in multi-task learning but also in pre-training and fine-tuning. Comprehensive experiments on multiple downstream tasks demonstrate that the proposed methods can effectively combine auxiliary tasks with the target task and significantly improve the performance compared to state-of-the-art methods.

Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.

Embedding entities and relations into a continuous multi-dimensional vector space have become the dominant method for knowledge graph embedding in representation learning. However, most existing models ignore to represent hierarchical knowledge, such as the similarities and dissimilarities of entities in one domain. We proposed to learn a Domain Representations over existing knowledge graph embedding models, such that entities that have similar attributes are organized into the same domain. Such hierarchical knowledge of domains can give further evidence in link prediction. Experimental results show that domain embeddings give a significant improvement over the most recent state-of-art baseline knowledge graph embedding models.

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.

We advocate the use of implicit fields for learning generative models of shapes and introduce an implicit field decoder for shape generation, aimed at improving the visual quality of the generated shapes. An implicit field assigns a value to each point in 3D space, so that a shape can be extracted as an iso-surface. Our implicit field decoder is trained to perform this assignment by means of a binary classifier. Specifically, it takes a point coordinate, along with a feature vector encoding a shape, and outputs a value which indicates whether the point is outside the shape or not. By replacing conventional decoders by our decoder for representation learning and generative modeling of shapes, we demonstrate superior results for tasks such as shape autoencoding, generation, interpolation, and single-view 3D reconstruction, particularly in terms of visual quality.

北京阿比特科技有限公司