When using ordinal patterns, which describe the ordinal structure within a data vector, the problem of ties arises persistently. So far, model classes have been used that do not allow for ties; randomization has been another attempt to overcome this problem. Often, time periods with constant values have even been counted as times of monotone increase. To overcome this, a new approach is proposed: it explicitly allows for ties and, hence, considers more patterns than before. Ties are no longer seen as a nuisance but as carrying valuable information. Limit theorems in the new framework are provided, both for a single time series and for the dependence between two time series. The methods are applied to hydrological data sets. It is common to distinguish five flood classes (plus 'absence of flood'). Considering data vectors of these classes at a certain gauge in a river basin, one will usually encounter several ties. Co-monotonic behavior between the data sets of two gauges (increasing, constant, decreasing) can be detected by the method, as well as spatial patterns. Thus, it helps to analyze the strength of dependence between different gauges in an intuitive way. This knowledge can be used to assess risk and to plan future construction projects.
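To make the idea of tie-aware ordinal patterns concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a pattern is encoded by dense ranks, so that equal values share a rank; the function names and the flood-class series are hypothetical and chosen only for illustration.

```python
def ordinal_pattern_with_ties(window):
    """Return the dense ranks of the values in `window`; equal values share a rank."""
    distinct = sorted(set(window))
    rank_of = {value: rank for rank, value in enumerate(distinct)}
    return tuple(rank_of[value] for value in window)


def pattern_counts(series, d):
    """Count the tie-aware patterns of all sliding windows of length d."""
    counts = {}
    for start in range(len(series) - d + 1):
        pattern = ordinal_pattern_with_ties(series[start:start + d])
        counts[pattern] = counts.get(pattern, 0) + 1
    return counts


# Hypothetical flood classes at one gauge (0 = absence of flood, 1..5 = flood classes).
flood_classes = [0, 0, 2, 2, 1, 3, 3, 3]
counts = pattern_counts(flood_classes, 3)
for pattern, count in sorted(counts.items()):
    print(pattern, count)
```

Under this encoding, a constant window such as (3, 3, 3) maps to its own pattern (0, 0, 0) instead of being counted as an increase, which is the distinction the abstract emphasizes.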
Social recommendation systems face the problem of social influence bias, which can lead to an overemphasis on recommending items that friends have interacted with. Addressing this problem is crucial, and existing methods often rely on techniques such as weight adjustment or leveraging unbiased data to eliminate this bias. However, we argue that not all biases are detrimental, i.e., some items recommended by friends may align with the user's interests. Blindly eliminating such biases could undermine these positive effects, potentially diminishing recommendation accuracy. In this paper, we propose a Causal Disentanglement-based framework for Regulating Social influence Bias in social recommendation, named CDRSB, to improve recommendation performance. From the perspective of causal inference, we find that the user social network could be regarded as a confounder between the user and item embeddings (treatment) and ratings (outcome). Due to the presence of this social network confounder, two paths exist from user and item embeddings to ratings: a non-causal social influence path and a causal interest path. Building upon this insight, we propose a disentangled encoder that focuses on disentangling user and item embeddings into interest and social influence embeddings. Mutual information-based objectives are designed to enhance the distinctiveness of these disentangled embeddings, eliminating redundant information. Additionally, we design a regulatory decoder that employs a weight calculation module to dynamically learn the weights of social influence embeddings, thereby regulating social influence bias effectively. Experimental results on four large-scale real-world datasets (Ciao, Epinions, Dianping, and Douban book) demonstrate the effectiveness of CDRSB compared to state-of-the-art baselines.
The quantum density matrix represents all the information of an entire quantum system, and novel models of meaning employing density matrices naturally model linguistic phenomena such as hyponymy and linguistic ambiguity, among others, in quantum question answering tasks. We therefore argue that applying the quantum density matrix to classical Question Answering (QA) tasks can yield more effective performance. Specifically, we (i) design a new mechanism based on Long Short-Term Memory (LSTM) to accommodate the case when the inputs are matrices; (ii) apply the new mechanism to QA problems with a Convolutional Neural Network (CNN) and obtain an LSTM-based QA model with the quantum density matrix. Experiments with our new model on the TREC-QA and WIKI-QA data sets show encouraging results. Similarly, we argue that the quantum density matrix can also enhance the image feature information and the relationship between the features in classical image classification. Thus, we (i) combine density matrices and CNN to design a new mechanism; (ii) apply the new mechanism to some representative classical image classification tasks. A series of experiments shows that applying the quantum density matrix to image classification generalizes well and is highly efficient across different datasets. In both classical question answering and classical image classification tasks, applying the quantum density matrix yields more effective performance.
Virtual reality (VR) is a promising data engine for autonomous driving (AD). However, data fidelity in this paradigm is often degraded by VR inconsistency, for which the existing VR approaches become ineffective, as they ignore the inter-dependency between low-level VR synchronizer designs (i.e., data collector) and high-level VR synthesizer designs (i.e., data processor). This paper presents a seamless virtual reality (SVR) platform for AD, which mitigates such inconsistency, enabling VR agents to interact with each other in a shared symbiotic world. The crux of SVR is an integrated synchronizer and synthesizer (IS2) design, which consists of a drift-aware lidar-inertial synchronizer for VR colocation and a motion-aware deep visual synthesis network for augmented reality image generation. We implement SVR on car-like robots in two sandbox platforms, achieving cm-level VR colocalization accuracy and 3.2% VR image deviation, thereby avoiding missed collisions or model clippings. Experiments show that the proposed SVR reduces the intervention times, missed turns, and failure rates compared to other benchmarks. The SVR-trained neural network can handle unseen situations in real-world environments by leveraging its knowledge learnt from the VR space.
Knowledge distillation, the technique of transferring knowledge from large, complex models to smaller ones, marks a pivotal step towards efficient AI deployment. Distilling Step-by-Step (DSS), a novel method utilizing chain-of-thought (CoT) distillation, has demonstrated promise by imbuing smaller models with the superior reasoning capabilities of their larger counterparts. In DSS, the distilled model acquires the ability to generate rationales and predict labels concurrently through a multi-task learning framework. However, DSS overlooks the intrinsic relationship between the two training tasks, leading to ineffective integration of CoT knowledge with the task of label prediction. To this end, we investigate the mutual relationship of the two tasks from an Information Bottleneck perspective and formulate it as maximizing the mutual information of the representation features of the two tasks. We propose a variational approach to solve this optimization problem using a learning-based method. Our experimental results across four datasets demonstrate that our method outperforms the state-of-the-art DSS. Our findings offer insightful guidance for future research on language model distillation as well as applications involving CoT. Code and models will be released soon.
We present AlloyInEcore, a tool for specifying metamodels with their static semantics to facilitate automated, formal reasoning on models. Software development projects require that software systems be specified in various models (e.g., requirements models, architecture models, test models, and source code). It is crucial to reason about those models to ensure correct and complete system specifications. AlloyInEcore allows the user to specify metamodels with their static semantics and, using these semantics, automatically detects inconsistent models and completes partial models. It has been evaluated on three industrial case studies in the automotive domain (//modelwriter.github.io/AlloyInEcore/).
We propose a new and generic approach for detecting multiple change-points in general dependent data, termed random interval distillation (RID). By collecting random intervals with sufficient strength of signals and reassembling them into a sequence of informative short intervals, our new approach captures the shifts in signal characteristics across diverse dependent data forms, including locally stationary high-dimensional time series and dynamic networks with Markov formation. We further propose a range of secondary refinements tailored to various data types to enhance the localization precision. Notably, for univariate time series and low-rank autoregressive networks, our methods achieve the same minimax optimality as their independent counterparts. For practical applications, we introduce a clustering-based and data-driven procedure to determine the optimal threshold for signal strength, which exploits the connection between RID and clustering and is adaptable to a wide array of dependent data scenarios. Additionally, our method has been extended to identify kinks and changes in signals characterized by piecewise polynomial trends. We examine the effectiveness and usefulness of our methodology via extensive simulation studies and a real data example, and implement it in the R package rid.
This paper applies the sample-average-approximation (SAA) scheme to solve a system of stochastic equations (SSE), which has many applications in a variety of fields. The SAA is an effective paradigm to address risks and uncertainty in stochastic models from the perspective of the Monte Carlo principle. Nonetheless, a numerical conflict arises from the sample size of SAA when one has to make a tradeoff between the accuracy of solutions and the computational cost. To alleviate this issue, we incorporate a gradually reinforced SAA scheme into a differentiable homotopy method and develop a gradually reinforced sample-average-approximation (GRSAA) differentiable homotopy method. By introducing a series of continuously differentiable functions of the homotopy parameter $t$ ranging between zero and one, we establish a differentiable homotopy system, which is able to gradually increase the sample size of SAA as $t$ descends from one to zero. The set of solutions to the homotopy system contains an everywhere smooth path, which starts from an arbitrary point and ends at a solution to the SAA with any desired accuracy. The GRSAA differentiable homotopy method serves as a bridge between the gradually reinforced SAA scheme and a differentiable homotopy method; it retains the global convergence property of the homotopy method while greatly reducing the computational cost of attaining a desired solution to the original SSE. Several numerical experiments further confirm the effectiveness and efficiency of the proposed method.
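The tradeoff between sample size and cost can be illustrated with a plain SAA loop that enlarges the sample step by step and warm-starts each solve from the previous solution. This is only a sketch under stated assumptions and is not the GRSAA differentiable homotopy method: the scalar equation $E[\xi x^3 + x - 2] = 0$ with $\xi$ uniform on $[0.5, 1.5]$ (exact root $x = 1$), the Newton solver, and all function names are hypothetical choices made for illustration.

```python
import random


def saa_residual(x, samples):
    """Sample average of F(x, xi) = xi * x**3 + x - 2 and its derivative in x."""
    mean_xi = sum(samples) / len(samples)
    return mean_xi * x ** 3 + x - 2.0, 3.0 * mean_xi * x ** 2 + 1.0


def solve_saa(x0, samples, tol=1e-10, max_iter=50):
    """Newton iteration on the sample-average equation, started at x0."""
    x = x0
    for _ in range(max_iter):
        value, slope = saa_residual(x, samples)
        if abs(value) < tol:
            break
        x -= value / slope
    return x


def saa_with_growing_samples(sample_sizes, seed=0):
    """Solve a sequence of SAA problems, reusing samples and warm-starting."""
    rng = random.Random(seed)
    samples, x, history = [], 0.0, []
    for n in sample_sizes:
        while len(samples) < n:
            samples.append(rng.uniform(0.5, 1.5))  # assumed distribution of xi
        x = solve_saa(x, samples)  # cheap early solves give a good start
        history.append((n, x))
    return history


history = saa_with_growing_samples([10, 100, 10000])
for n, x in history:
    print(n, x)
```

Early solves with few samples are cheap and move the iterate close to the root, so the expensive large-sample solve needs only a few Newton steps.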
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
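As a concrete instance of distributing real values over a fixed discrete set, the sketch below shows uniform quantization to 4-bit integer codes and the corresponding dequantization. It is one simple scheme chosen for illustration, not a summary of the methods surveyed; the function names and the example weights are hypothetical.

```python
def quantize_uniform(values, num_bits=4):
    """Map floats onto integer codes in [0, 2**num_bits - 1] with a uniform step."""
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from integer codes."""
    return [lo + c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [-1.0, -0.4, 0.0, 0.3, 0.5]
codes, scale, lo = quantize_uniform(weights, num_bits=4)
restored = dequantize(codes, scale, lo)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, max_error)
```

With rounding to the nearest level, the reconstruction error of each value is at most half a step, so fewer bits (a larger step) trade accuracy for memory.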
We introduce a generic framework that reduces the computational cost of object detection while retaining accuracy for scenarios where objects with varied sizes appear in high-resolution images. Detection progresses in a coarse-to-fine manner, first on a down-sampled version of the image and then on a sequence of higher-resolution regions identified as likely to improve the detection accuracy. Built upon reinforcement learning, our approach consists of a model (R-net) that uses coarse detection results to predict the potential accuracy gain for analyzing a region at a higher resolution and another model (Q-net) that sequentially selects regions to zoom in. Experiments on the Caltech Pedestrians dataset show that our approach reduces the number of processed pixels by over 50% without a drop in detection accuracy. The merits of our approach become more significant on a high-resolution test set collected from the YFCC100M dataset, where our approach maintains high detection performance while reducing the number of processed pixels by about 70% and the detection time by over 50%.
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
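The smoothing step can be sketched as follows. The abstract only specifies "a strongly convex regularizer"; this sketch assumes the negative-entropy regularizer, for which the smoothed max becomes a scaled log-sum-exp and its gradient a softmax. The chain-structured scoring, the parameter `gamma`, and all function names are hypothetical choices for illustration rather than the authors' implementation.

```python
import math


def smoothed_max(values, gamma=1.0):
    """Entropy-smoothed max: gamma * log(sum(exp(v / gamma))), computed stably."""
    m = max(values)
    return m + gamma * math.log(sum(math.exp((v - m) / gamma) for v in values))


def smoothed_max_gradient(values, gamma=1.0):
    """Gradient of smoothed_max with respect to values: a softmax distribution."""
    m = max(values)
    weights = [math.exp((v - m) / gamma) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def smoothed_viterbi_value(unary, pairwise, gamma=1.0):
    """Viterbi recursion on a chain with max replaced by smoothed_max.

    unary[t][s] scores state s at step t; pairwise[p][s] scores moving p -> s.
    """
    num_states = len(unary[0])
    scores = list(unary[0])
    for t in range(1, len(unary)):
        scores = [
            unary[t][s] + smoothed_max(
                [scores[p] + pairwise[p][s] for p in range(num_states)], gamma)
            for s in range(num_states)
        ]
    return smoothed_max(scores, gamma)


# Hypothetical scores for a 3-step chain with 2 states.
unary = [[1.0, 0.0], [0.0, 2.0], [1.5, 0.0]]
pairwise = [[0.5, 0.0], [0.0, 0.5]]
print(smoothed_viterbi_value(unary, pairwise, gamma=1.0))
print(smoothed_viterbi_value(unary, pairwise, gamma=0.01))
```

As `gamma` shrinks, the smoothed value approaches the score of the best path (4.5 for these scores), while larger `gamma` gives a smoother operator whose value upper-bounds the hard maximum.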