We investigate modifications to Bayesian Optimization for a resource-constrained setting of sequential experimental design where changes to certain design variables of the search space incur a switching cost. This models the scenario where there is a trade-off between performing more evaluations with the same setup, or switching and restricting the number of possible evaluations due to the incurred cost. We adapt two process-constrained batch algorithms to this sequential problem formulation, and propose two new methods: one cost-aware and one cost-ignorant. We validate and compare the algorithms using a set of 7 scalable test functions in different dimensionalities and switching-cost settings for 30 total configurations. Our proposed cost-aware hyperparameter-free algorithm yields comparable results to tuned process-constrained algorithms in all settings we considered, suggesting some degree of robustness to varying landscape features and cost trade-offs. This method starts to outperform the other algorithms as the switching cost increases. Our work broadens the scope of other recent Bayesian Optimization studies in resource-constrained settings, which consider a batch setting only. While the contributions of this work are relevant to the general class of resource-constrained problems, they are particularly relevant to problems where adaptability to varying resource availability is of high importance.
We consider the problem of selecting an optimal subset of information sources for a hypothesis testing/classification task where the goal is to identify the true state of the world from a finite set of hypotheses, based on finite observation samples from the sources. In order to characterize the learning performance, we propose a misclassification penalty framework, which enables non-uniform treatment of different misclassification errors. In a centralized Bayesian learning setting, we study two variants of the subset selection problem: (i) selecting a minimum cost information set to ensure that the maximum penalty of misclassifying the true hypothesis remains bounded and (ii) selecting an optimal information set under a limited budget to minimize the maximum penalty of misclassifying the true hypothesis. Under certain assumptions, we prove that the objective (or constraints) of these combinatorial optimization problems are weakly (or approximately) submodular, and establish high-probability performance guarantees for greedy algorithms. Further, we propose an alternate metric for information set selection which is based on the total penalty of misclassification. We prove that this metric is submodular and establish near-optimal guarantees for the greedy algorithms for both information set selection problems. Finally, we present numerical simulations to validate our theoretical results over several randomly generated instances.
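As a rough illustration of the kind of greedy selection referred to above, the following sketch runs a cost-benefit greedy rule under a budget. It is a minimal example under stated assumptions, not the algorithm analyzed in the abstract: the function `greedy_select`, the toy data `DISTINGUISHES` and `COSTS`, and the `coverage` objective (a simple submodular stand-in for a misclassification-penalty metric) are all hypothetical.

```python
def greedy_select(costs, budget, value):
    """Cost-benefit greedy: repeatedly add the affordable source with the
    largest marginal gain per unit cost, until nothing affordable helps."""
    chosen, remaining = [], budget
    while True:
        base = value(chosen)
        best, best_ratio = None, 0.0
        for name in sorted(costs):  # sorted for deterministic tie-breaking
            if name in chosen or costs[name] > remaining:
                continue
            ratio = (value(chosen + [name]) - base) / costs[name]
            if ratio > best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            return chosen
        chosen.append(best)
        remaining -= costs[best]


# Hypothetical toy instance: each source can tell apart some hypothesis pairs.
DISTINGUISHES = {
    "a": {1, 2, 3},
    "b": {1, 2},
    "c": {3, 4},
    "d": {1, 2, 3, 4, 5},
}
COSTS = {"a": 2, "b": 1, "c": 1, "d": 3}  # assumed positive costs


def coverage(selected):
    """Number of hypothesis pairs separated by at least one selected source."""
    covered = set()
    for name in selected:
        covered |= DISTINGUISHES[name]
    return len(covered)


selection = greedy_select(COSTS, budget=2, value=coverage)
print(selection, coverage(selection))
```

The coverage objective is used only because it is easy to verify by hand; any other monotone set function could be passed as `value`.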
Image restoration has made remarkable progress with the advent of deep learning. Previous methods usually rely on designing powerful network architectures to improve performance; however, the naturalness of the restored results is limited by color and texture distortions. Besides visual perceptual quality, semantic perception recovery is an important but often overlooked aspect of restored images, and it is crucial for deployment in high-level tasks. In this paper, we propose a new perspective to resolve these issues by introducing a naturalness-oriented and semantic-aware optimization mechanism, dubbed DiffLoss. Specifically, inspired by the powerful distribution coverage capability of diffusion models for natural image generation, we exploit the Markov chain sampling property of the diffusion model and project the restored results of existing networks into the sampling space. Besides, we reveal that the bottleneck feature of diffusion models, also dubbed the h-space feature, is a natural high-level semantic space. We delve into this property and propose a semantic-aware loss to further unlock its potential for semantic perception recovery, which paves the way to connect the image restoration task with downstream high-level recognition tasks. With these two strategies, DiffLoss can endow existing restoration methods with both more natural and semantic-aware results. We verify the effectiveness of our method on a substantial number of common image restoration tasks and benchmarks. Code will be available at //github.com/JosephTiTan/DiffLoss.
We propose a reinforcement learning (RL) approach to model optimal exercise strategies for option-type products. We pursue the RL avenue in order to learn the optimal action-value function of the underlying stopping problem. In addition to retrieving the optimal Q-function at any time step, one can also price the contract at inception. We first discuss the standard setting with one exercise right, and later extend this framework to the case of multiple stopping opportunities in the presence of constraints. We propose to approximate the Q-function with a deep neural network, which does not require the specification of basis functions as in the least-squares Monte Carlo framework and is scalable to higher dimensions. We derive a lower bound on the option price obtained from the trained neural network and an upper bound from the dual formulation of the stopping problem, which can also be expressed in terms of the Q-function. Our methodology is illustrated with examples covering the pricing of swing options.
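To make the role of the Q-function in a single-exercise stopping problem concrete, the sketch below compares Q(exercise) with Q(continue) at every node of a binomial tree. This deliberately swaps the deep-network approximation described above for exact backward induction on a Cox-Ross-Rubinstein tree, and it prices a plain put rather than a swing option; the function name `put_price_via_q` and all parameter values are illustrative assumptions.

```python
import math


def put_price_via_q(s0, strike, rate, sigma, maturity, steps, early_exercise=True):
    """Backward induction on a Cox-Ross-Rubinstein tree.

    At each node, Q(exercise) is the immediate payoff and Q(continue) is the
    discounted expected value of the next step; the value is their maximum.
    """
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    prob_up = (math.exp(rate * dt) - down) / (up - down)
    discount = math.exp(-rate * dt)

    def payoff(t, j):
        # Node (t, j): j up-moves and (t - j) down-moves after t steps.
        return max(strike - s0 * up**j * down**(t - j), 0.0)

    values = [payoff(steps, j) for j in range(steps + 1)]
    for t in range(steps - 1, -1, -1):
        next_values = values
        values = []
        for j in range(t + 1):
            q_continue = discount * (prob_up * next_values[j + 1]
                                     + (1.0 - prob_up) * next_values[j])
            q_exercise = payoff(t, j) if early_exercise else float("-inf")
            values.append(max(q_exercise, q_continue))
    return values[0]  # price at inception: max over actions of Q at the root


american = put_price_via_q(100.0, 100.0, 0.05, 0.2, 1.0, steps=50)
european = put_price_via_q(100.0, 100.0, 0.05, 0.2, 1.0, steps=50,
                           early_exercise=False)
print(f"American put: {american:.4f}, European put: {european:.4f}")
```

Disabling early exercise gives the European price on the same tree, so the gap between the two numbers is the value of the stopping right.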
Modern automated driving solutions utilize trajectory planning and control components with numerous parameters that need to be tuned for different driving situations and vehicle types to achieve optimal performance. This paper proposes a method to automatically tune such parameters to resemble expert demonstrations. We utilize a cost function which captures deviations of the closed-loop operation of the controller from the recorded desired driving behavior. Parameter tuning is then accomplished by using local optimization techniques. Three optimization alternatives are compared in a case study, where a trajectory planner is tuned for lane following in a real-world driving scenario. The results suggest that the proposed approach improves manually tuned initial parameters significantly even with respect to noisy demonstration data.
Activation steering methods were shown to be effective in conditioning language model generation by additively intervening over models' intermediate representations. However, the evaluation of these techniques has so far been limited to single conditioning properties and synthetic settings. In this work, we conduct a comprehensive evaluation of various activation steering strategies, highlighting the property-dependent nature of optimal parameters to ensure a robust effect throughout generation. To address this issue, we propose Dynamic Activation Composition, an information-theoretic approach to modulate the steering intensity of one or more properties throughout generation. Our experiments on multi-property steering show that our method successfully maintains high conditioning while minimizing the impact of conditioning on generation fluency.
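The additive intervention mentioned above can be sketched on plain Python lists. This is a toy illustration only: it does not reproduce the information-theoretic rule of Dynamic Activation Composition, and the function `steer`, the property names, the vectors, and the hand-written per-step `schedule` are all hypothetical placeholders.

```python
def steer(hidden, steering_vectors, intensities):
    """Return hidden + sum_k intensities[k] * steering_vectors[k]."""
    steered = list(hidden)  # copy so the original activation is untouched
    for name, vector in steering_vectors.items():
        alpha = intensities.get(name, 0.0)
        for i, component in enumerate(vector):
            steered[i] += alpha * component
    return steered


# Hypothetical 3-dimensional example with two steered properties.
vectors = {"formal": [1.0, 0.0, 0.0], "french": [0.0, 2.0, 0.0]}
hidden_state = [0.5, 0.5, 0.5]

# Hand-written per-step intensities; a placeholder for any modulation rule.
schedule = [
    {"formal": 1.0, "french": 1.0},
    {"formal": 0.5, "french": 0.25},
    {"formal": 0.0, "french": 0.0},
]
for step_index, alphas in enumerate(schedule):
    print(step_index, steer(hidden_state, vectors, alphas))
```

Keeping one intensity per property and per generation step is what allows several properties to be composed while each is strengthened or relaxed independently.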
Operator splitting methods tailored to coupled linear port-Hamiltonian systems are developed. We present algorithms that are able to exploit scalar coupling, as well as the multirate potential of these coupled systems. The obtained algorithms preserve the dissipative structure of the overall system and are convergent of second order. Numerical results for coupled mass-spring-damper chains illustrate the computational efficiency of the splitting methods compared to a straightforward application of the implicit midpoint rule to the overall system.
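For orientation, the reference scheme named above, the implicit midpoint rule, can be sketched for a single mass-spring-damper written as a linear port-Hamiltonian system with state (q, p) and energy H = p^2/(2m) + k q^2/2. This is not one of the splitting methods, only the baseline on the smallest possible example; the function names and parameter values are assumptions made for illustration.

```python
def midpoint_step(q, p, h, mass, stiffness, damping):
    """One implicit midpoint step for q' = p/m, p' = -k*q - c*p/m."""
    # Solve (I - h/2 A) x_new = (I + h/2 A) x_old for the 2x2 matrix
    # A = [[0, 1/m], [-k, -c/m]] using the explicit 2x2 inverse.
    a = h / (2.0 * mass)
    b = h * stiffness / 2.0
    c = h * damping / (2.0 * mass)
    rhs_q = q + a * p
    rhs_p = -b * q + (1.0 - c) * p
    det = (1.0 + c) + a * b
    q_new = ((1.0 + c) * rhs_q + a * rhs_p) / det
    p_new = (rhs_p - b * rhs_q) / det
    return q_new, p_new


def energy(q, p, mass, stiffness):
    """Hamiltonian: kinetic plus spring energy."""
    return p * p / (2.0 * mass) + stiffness * q * q / 2.0


def simulate(q, p, h, n_steps, mass=1.0, stiffness=4.0, damping=0.3):
    """Return the list of energies along the discrete trajectory."""
    energies = [energy(q, p, mass, stiffness)]
    for _ in range(n_steps):
        q, p = midpoint_step(q, p, h, mass, stiffness, damping)
        energies.append(energy(q, p, mass, stiffness))
    return energies


damped = simulate(1.0, 0.0, h=0.1, n_steps=200)
undamped = simulate(1.0, 0.0, h=0.1, n_steps=200, damping=0.0)
print(damped[0], damped[-1], undamped[-1])
```

For a quadratic Hamiltonian the midpoint rule changes the discrete energy by exactly the dissipated amount at the midpoint, so the energy sequence never increases with damping and stays constant (up to round-off) without it.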
Cellular automata (CA) are simulation models that can produce complex emergent behaviors from simple local rules. Although state-of-the-art GPU solutions are already fast due to their data-parallel nature, their performance can rapidly degrade in CA with a large neighborhood radius. With the inclusion of tensor cores across the entire GPU ecosystem, interest has grown in finding ways to leverage these fast units outside the field of artificial intelligence, which was their original purpose. In this work, we present CAT, a GPU tensor core approach that can accelerate CA in which the cell transition function acts on a weighted summation of its neighborhood. CAT is evaluated theoretically, using an extended PRAM cost model, as well as empirically using the Larger Than Life (LTL) family of CA as case studies. The results confirm that the cost model is accurate, showing that CAT exhibits constant time throughout the entire radius range $1 \le r \le 16$, and its theoretical speedups agree with the empirical results. At low radius $r=1,2$, CAT is competitive and is only surpassed by the fastest state-of-the-art GPU solution. Starting from $r=3$, CAT progressively outperforms all other approaches, reaching speedups of up to $101\times$ over a GPU baseline and up to $\sim 14\times$ over the fastest state-of-the-art GPU approach. In terms of energy efficiency, CAT is competitive in the range $1 \le r \le 4$, and for $r \ge 5$ it is the most energy-efficient approach. As for performance scaling across GPU architectures, CAT shows a promising trend that, if it continues in future generations, would increase its performance at a higher rate than classical GPU solutions. The results obtained in this work position CAT as an attractive GPU approach for scientists who need to study emergent phenomena on CA with a large neighborhood radius.
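One way to see why a neighborhood summation is amenable to matrix hardware is that, on a periodic grid, the box sum of radius r equals the product B X B, where B is a 0/1 band matrix. The CPU sketch below illustrates only that identity and is not the CAT implementation; the periodic boundary, the exclusion of the center cell from the count, and the names `band_matrix`, `neighborhood_sums`, and `step` are assumptions.

```python
def band_matrix(n, r):
    """n x n symmetric 0/1 matrix: 1 where i and j are within r on a ring."""
    return [[1 if min(abs(i - j), n - abs(i - j)) <= r else 0
             for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two lists of lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a]


def neighborhood_sums(grid, r):
    """Live-neighbor counts in the radius-r square, center cell excluded."""
    rows, cols = len(grid), len(grid[0])
    box = matmul(matmul(band_matrix(rows, r), grid), band_matrix(cols, r))
    return [[box[i][j] - grid[i][j] for j in range(cols)]
            for i in range(rows)]


def step(grid, r, birth, survive):
    """Apply one transition; birth and survive are inclusive (lo, hi) ranges."""
    sums = neighborhood_sums(grid, r)
    new_grid = []
    for i, row in enumerate(grid):
        new_row = []
        for j, alive in enumerate(row):
            lo, hi = survive if alive else birth
            new_row.append(1 if lo <= sums[i][j] <= hi else 0)
        new_grid.append(new_row)
    return new_grid


# Radius 1 with birth (3, 3) and survival (2, 3) is Conway's Game of Life.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
after_one = step(blinker, r=1, birth=(3, 3), survive=(2, 3))
for row in after_one:
    print(row)
```

Larger radii only change the band width of B, not the number of matrix products, which is the intuition behind a radius-independent cost on matrix units.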
In relational verification, judicious alignment of computational steps facilitates proof of relations between programs using simple relational assertions. Relational Hoare logics (RHL) provide compositional rules that embody various alignments of executions. Seemingly more flexible alignments can be expressed in terms of product automata based on program transition relations. A single degenerate alignment rule (self-composition), atop a complete Hoare logic, comprises a RHL for $\forall\forall$ properties that is complete in the ordinary logical sense (Cook'78). The notion of alignment completeness was previously proposed as a more satisfactory measure, and some rules were shown to be alignment complete with respect to a few ad hoc forms of alignment automata. This paper proves alignment completeness with respect to a general class of $\forall\forall$ alignment automata, for a RHL comprised of standard rules together with a rule of semantics-preserving rewrites based on Kleene algebra with tests. A new logic for $\forall\exists$ properties is introduced and shown to be alignment complete. The $\forall\forall$ and $\forall\exists$ automata are shown to be semantically complete. Thus the logics are both complete in the ordinary sense. Recent work by D'Osualdo et al. highlights the importance of completeness relative to assumptions (which we term entailment completeness), and presents $\forall\forall$ examples seemingly beyond the scope of RHLs. Additional rules enable these examples to be proved in our RHL, shedding light on the open problem of entailment completeness.
The success of AI models relies on the availability of large, diverse, and high-quality datasets, which can be challenging to obtain due to data scarcity, privacy concerns, and high costs. Synthetic data has emerged as a promising solution by generating artificial data that mimics real-world patterns. This paper provides an overview of synthetic data research, discussing its applications, challenges, and future directions. We present empirical evidence from prior art to demonstrate its effectiveness and highlight the importance of ensuring its factuality, fidelity, and unbiasedness. We emphasize the need for responsible use of synthetic data to build more powerful, inclusive, and trustworthy language models.
The recent proliferation of knowledge graphs (KGs), coupled with incomplete or partial information in the form of missing relations (links) between entities, has fueled a lot of research on knowledge base completion (also known as relation prediction). Several recent works suggest that convolutional neural network (CNN) based models generate richer and more expressive feature embeddings and hence also perform well on relation prediction. However, we observe that these KG embeddings treat triples independently and thus fail to cover the complex and hidden information that is inherently implicit in the local neighborhood surrounding a triple. To this end, our paper proposes a novel attention-based feature embedding that captures both entity and relation features in any given entity's neighborhood. Additionally, we also encapsulate relation clusters and multi-hop relations in our model. Our empirical study offers insights into the efficacy of our attention-based model, and we show marked performance gains in comparison to state-of-the-art methods on all datasets.