亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Deep Reinforcement Learning (DRL) has gained prominence as an effective approach for control systems. However, its practical deployment is impeded by state perturbations that can severely impact system performance. Addressing this critical challenge requires robustness verification about system performance, which involves tackling two quantitative questions: (i) how to establish guaranteed bounds for expected cumulative rewards, and (ii) how to determine tail bounds for cumulative rewards. In this work, we present the first approach for robustness verification of DRL-based control systems by introducing reward martingales, which offer a rigorous mathematical foundation to characterize the impact of state perturbations on system performance in terms of cumulative rewards. Our verified results provide provably quantitative certificates for the two questions. We then show that reward martingales can be implemented and trained via neural networks, against different types of control policies. Experimental results demonstrate that our certified bounds tightly enclose simulation outcomes on various DRL-based control systems, indicating the effectiveness and generality of the proposed approach.

相關內容

Process mining provides methods to analyse event logs generated by information systems during the execution of processes. It thereby supports the design, validation, and execution of processes in domains ranging from healthcare, through manufacturing, to e-commerce. To explore the regularities of flexible processes that show a large behavioral variability, it was suggested to mine recurrent behavioral patterns that jointly describe the underlying process. Existing approaches to behavioral pattern mining, however, suffer from two limitations. First, they show limited scalability as incremental computation is incorporated only in the generation of pattern candidates, but not in the evaluation of their quality. Second, process analysis based on mined patterns shows limited effectiveness due to an overwhelmingly large number of patterns obtained in practical application scenarios, many of which are redundant. In this paper, we address these limitations to facilitate the analysis of complex, flexible processes based on behavioral patterns. Specifically, we improve COBPAM, our initial behavioral pattern mining algorithm, by an incremental procedure to evaluate the quality of pattern candidates, optimizing thereby its efficiency. Targeting a more effective use of the resulting patterns, we further propose pruning strategies for redundant patterns and show how relations between the remaining patterns are extracted and visualized to provide process insights. Our experiments with diverse real-world datasets indicate a considerable reduction of the runtime needed for pattern mining, while a qualitative assessment highlights how relations between patterns guide the analysis of the underlying process.

Printed Electronics (PE) feature distinct and remarkable characteristics that make them a prominent technology for achieving true ubiquitous computing. This is particularly relevant in application domains that require conformal and ultra-low cost solutions, which have experienced limited penetration of computing until now. Unlike silicon-based technologies, PE offer unparalleled features such as non-recurring engineering costs, ultra-low manufacturing cost, and on-demand fabrication of conformal, flexible, non-toxic, and stretchable hardware. However, PE face certain limitations due to their large feature sizes, that impede the realization of complex circuits, such as machine learning classifiers. In this work, we address these limitations by leveraging the principles of Approximate Computing and Bespoke (fully-customized) design. We propose an automated framework for designing ultra-low power Multilayer Perceptron (MLP) classifiers which employs, for the first time, a holistic approach to approximate all functions of the MLP's neurons: multiplication, accumulation, and activation. Through comprehensive evaluation across various MLPs of varying size, our framework demonstrates the ability to enable battery-powered operation of even the most intricate MLP architecture examined, significantly surpassing the current state of the art.

Business Process Simulation (BPS) is a common approach to estimate the impact of changes to a business process on its performance measures. For example, it allows us to estimate what would be the cycle time of a process if we automated one of its activities, or if some resources become unavailable. The starting point of BPS is a business process model annotated with simulation parameters (a BPS model). In traditional approaches, BPS models are manually designed by modeling specialists. This approach is time-consuming and error-prone. To address this shortcoming, several studies have proposed methods to automatically discover BPS models from event logs via process mining techniques. However, current techniques in this space discover BPS models that only capture waiting times caused by resource contention or resource unavailability. Oftentimes, a considerable portion of the waiting time in a business process corresponds to extraneous delays, e.g., a resource waits for the customer to return a phone call. This article proposes a method that discovers extraneous delays from event logs of business process executions. The proposed approach computes, for each pair of causally consecutive activity instances in the event log, the time when the target activity instance should theoretically have started, given the availability of the relevant resource. Based on the difference between the theoretical and the actual start times, the approach estimates the distribution of extraneous delays, and it enhances the BPS model with timer events to capture these delays. An empirical evaluation involving synthetic and real-life logs shows that the approach produces BPS models that better reflect the temporal dynamics of the process, relative to BPS models that do not capture extraneous delays.

Recent developments enable the quantification of causal control given a structural causal model (SCM). This has been accomplished by introducing quantities which encode changes in the entropy of one variable when intervening on another. These measures, named causal entropy and causal information gain, aim to address limitations in existing information theoretical approaches for machine learning tasks where causality plays a crucial role. They have not yet been properly mathematically studied. Our research contributes to the formal understanding of the notions of causal entropy and causal information gain by establishing and analyzing fundamental properties of these concepts, including bounds and chain rules. Furthermore, we elucidate the relationship between causal entropy and stochastic interventions. We also propose definitions for causal conditional entropy and causal conditional information gain. Overall, this exploration paves the way for enhancing causal machine learning tasks through the study of recently-proposed information theoretic quantities grounded in considerations about causality.

Information Bottleneck (IB) is a widely used framework that enables the extraction of information related to a target random variable from a source random variable. In the objective function, IB controls the trade-off between data compression and predictiveness through the Lagrange multiplier $\beta$. Traditionally, to find the trade-off to be learned, IB requires a search for $\beta$ through multiple training cycles, which is computationally expensive. In this study, we introduce Flexible Variational Information Bottleneck (FVIB), an innovative framework for classification task that can obtain optimal models for all values of $\beta$ with single, computationally efficient training. We theoretically demonstrate that across all values of reasonable $\beta$, FVIB can simultaneously maximize an approximation of the objective function for Variational Information Bottleneck (VIB), the conventional IB method. Then we empirically show that FVIB can learn the VIB objective as effectively as VIB. Furthermore, in terms of calibration performance, FVIB outperforms other IB and calibration methods by enabling continuous optimization of $\beta$. Our codes are available at //github.com/sotakudo/fvib.

Efficient parallel computing has become a pivotal element in advancing artificial intelligence. Yet, the deployment of Spiking Neural Networks (SNNs) in this domain is hampered by their inherent sequential computational dependency. This constraint arises from the need for each time step's processing to rely on the preceding step's outcomes, significantly impeding the adaptability of SNN models to massively parallel computing environments. Addressing this challenge, our paper introduces the innovative Parallel Spiking Unit (PSU) and its two derivatives, the Input-aware PSU (IPSU) and Reset-aware PSU (RPSU). These variants skillfully decouple the leaky integration and firing mechanisms in spiking neurons while probabilistically managing the reset process. By preserving the fundamental computational attributes of the spiking neuron model, our approach enables the concurrent computation of all membrane potential instances within the SNN, facilitating parallel spike output generation and substantially enhancing computational efficiency. Comprehensive testing across various datasets, including static and sequential images, Dynamic Vision Sensor (DVS) data, and speech datasets, demonstrates that the PSU and its variants not only significantly boost performance and simulation speed but also augment the energy efficiency of SNNs through enhanced sparsity in neural activity. These advancements underscore the potential of our method in revolutionizing SNN deployment for high-performance parallel computing applications.

Deep generative models have recently emerged as an effective approach to offline reinforcement learning. However, their large model size poses challenges in computation. We address this issue by proposing a knowledge distillation method based on data augmentation. In particular, high-return trajectories are generated from a conditional diffusion model, and they are blended with the original trajectories through a novel stitching algorithm that leverages a new reward generator. Applying the resulting dataset to behavioral cloning, the learned shallow policy whose size is much smaller outperforms or nearly matches deep generative planners on several D4RL benchmarks.

One of the most important challenges of Smart City Applications is to adapt the system to interact with non-expert users. Robot imitation frameworks aim to simplify and reduce times of robot programming by allowing users to program directly through demonstrations. In classical frameworks, actions are modeled using joint or Cartesian space trajectories. Other features, such as visual ones, are not always well represented with these pure geometrical approaches. Continuous Goal-Directed Actions (CGDA) is an alternative to these methods, as it encodes actions as changes of any feature that can be extracted from the environment. As a consequence of this, the robot joint trajectories for execution must be fully computed to comply with this feature-agnostic encoding. This is achieved using Evolutionary Algorithms (EA), which usually requires too many evaluations to perform this evolution step in the actual robot. Current strategies involve performing evaluations in a simulation, transferring the final joint trajectory to the actual robot. Smart City applications involve working in highly dynamic and complex environments, where having a precise model is not always achievable. Our goal is to study the tractability of performing these evaluations directly in a real-world scenario. Two different approaches to reduce the number of evaluations using EA, are proposed and compared. In the first approach, Particle Swarm Optimization (PSO)-based methods have been studied and compared within CGDA: naive PSO, Fitness Inheritance PSO (FI-PSO), and Adaptive Fuzzy Fitness Granulation with PSO (AFFG-PSO). The second approach studied the introduction of geometrical and velocity constraints within CGDA. The effects of both approaches were analyzed and compared in the wax and paint actions, two CGDA commonly studied use cases. Results from this paper depict an important reduction in the number of evaluations.

There is strong interest in developing mathematical methods that can be used to understand complex neural networks used in image analysis. In this paper, we introduce techniques from Linear Algebra to model neural network layers as maps between signal spaces. First, we demonstrate how signal spaces can be used to visualize weight spaces and convolutional layer kernels. We also demonstrate how residual vector spaces can be used to further visualize information lost at each layer. Second, we introduce the concept of invertible networks and an algorithm for computing input images that yield specific outputs. We demonstrate our approach on two invertible networks and ResNet18.

Spectral clustering (SC) is a popular clustering technique to find strongly connected communities on a graph. SC can be used in Graph Neural Networks (GNNs) to implement pooling operations that aggregate nodes belonging to the same cluster. However, the eigendecomposition of the Laplacian is expensive and, since clustering results are graph-specific, pooling methods based on SC must perform a new optimization for each new sample. In this paper, we propose a graph clustering approach that addresses these limitations of SC. We formulate a continuous relaxation of the normalized minCUT problem and train a GNN to compute cluster assignments that minimize this objective. Our GNN-based implementation is differentiable, does not require to compute the spectral decomposition, and learns a clustering function that can be quickly evaluated on out-of-sample graphs. From the proposed clustering method, we design a graph pooling operator that overcomes some important limitations of state-of-the-art graph pooling techniques and achieves the best performance in several supervised and unsupervised tasks.

北京阿比特科技有限公司