
In this paper, we address the issue of fairness in preference-based reinforcement learning (PbRL) in the presence of multiple objectives. The goal is to design control policies that can optimize multiple objectives while treating each objective fairly. To this end, we design a new fairness-induced preference-based reinforcement learning method, FPbRL. The main idea of FPbRL is to learn vector reward functions associated with the multiple objectives via new welfare-based preferences, rather than the reward-based preferences of standard PbRL, coupled with policy learning that maximizes a generalized Gini welfare function. Finally, we provide experimental studies on three different environments showing that the proposed FPbRL approach can achieve both efficiency and equity in learning effective and fair policies.
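
As a concrete illustration of the welfare objective, below is a minimal sketch of a generalized Gini welfare function, assuming the common choice of strictly decreasing weights $w_i = 1/2^i$ over the per-objective returns sorted in ascending order; the weights and the example values are illustrative, not the paper's exact configuration:

```python
import numpy as np

def generalized_gini_welfare(returns, weights=None):
    """Generalized Gini welfare: a weighted sum of per-objective returns
    sorted in ascending order, with strictly decreasing weights so that
    worse-off objectives count more (fairness-encouraging)."""
    v = np.sort(np.asarray(returns, dtype=float))  # ascending: worst first
    if weights is None:
        # One common choice of strictly decreasing weights: w_i = 1 / 2**i.
        weights = np.array([1.0 / 2**i for i in range(len(v))])
    return float(np.dot(weights, v))

# Two policies with the same total return; the fairer one scores higher.
print(generalized_gini_welfare([5.0, 5.0]))  # 1*5 + 0.5*5 = 7.5
print(generalized_gini_welfare([9.0, 1.0]))  # 1*1 + 0.5*9 = 5.5
```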

Related content

We present a new technique to enhance the robustness of imitation learning methods by generating corrective data to account for compounding errors and disturbances. While existing methods rely on interactive expert labeling, additional offline datasets, or domain-specific invariances, our approach requires minimal additional assumptions beyond access to expert data. The key insight is to leverage local continuity in the environment dynamics to generate corrective labels. Our method first constructs a dynamics model from the expert demonstrations, encouraging local Lipschitz continuity in the learned model. In locally continuous regions, this model allows us to generate corrective labels within the neighborhood of the demonstrations but beyond the actual set of states and actions in the dataset. Training on this augmented data enhances the agent's ability to recover from perturbations and to deal with compounding errors. We demonstrate the effectiveness of the generated labels through simulated experiments in a variety of robotics domains with distinct forms of continuity and discontinuity, including classic control problems, drone flying, navigation with high-dimensional sensor observations, legged locomotion, and tabletop manipulation.
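
To make the label-generation idea concrete, here is a minimal PyTorch sketch, assuming a differentiable learned dynamics model `dynamics(s, a) -> s_next`; the optimizer-based solve and all names are illustrative assumptions, not the authors' exact procedure:

```python
import torch

def corrective_label(dynamics, s_perturbed, s_next_demo, a_init,
                     steps=100, lr=0.1):
    """Sketch: given a learned (locally Lipschitz) dynamics model, solve
    for an action that steers a perturbed state back toward the
    demonstrated next state, via gradient descent on the action."""
    a = a_init.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([a], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((dynamics(s_perturbed, a) - s_next_demo) ** 2).sum()
        loss.backward()
        opt.step()
    # (s_perturbed, a) becomes a corrective training pair for the policy.
    return a.detach()
```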

In this paper, we explore conjunctive query rewriting, focusing on queries containing universally quantified negation within the framework of disjunctive existential rules. We address the undecidability of the existence of a finite and complete UCQ-rewriting and the identification of finite unification sets (fus) of rules. We introduce new rule classes, connected linear rules and connected domain restricted rules, that exhibit the fus property for existential rules. Additionally, we propose disconnected disjunction for disjunctive existential rules to achieve the fus property when extending the introduced rule fragments to disjunctive existential rules. We present ECOMPLETO, a system for efficient query rewriting with disjunctive existential rules, capable of handling UCQs with universally quantified negation. Our experiments demonstrate ECOMPLETO's consistent ability to produce finite UCQ-rewritings and describe its performance on different ontologies and queries.
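
For intuition, the following is a toy sketch of a single backward-rewriting step with a simple non-disjunctive existential rule. It ignores the piece-unification and disjunction machinery that a full system such as ECOMPLETO implements, and the atom encoding is invented for illustration:

```python
# Atoms are (predicate, args) tuples; rule variables start with '?'.
# Only a single-body-atom (linear) rule is handled, for simplicity.

def unify(query_atom, head_atom):
    """Match a query atom against a rule head; return a substitution
    mapping head variables to query terms, or None on clash."""
    if query_atom[0] != head_atom[0] or len(query_atom[1]) != len(head_atom[1]):
        return None
    subst = {}
    for q, h in zip(query_atom[1], head_atom[1]):
        if h.startswith('?'):
            if subst.setdefault(h, q) != q:
                return None
        elif h != q:
            return None
    return subst

def rewrite_step(query_atom, body_atom, head_atom):
    """Replace a query atom with the rule body under the head unifier;
    unbound body variables stay as fresh variables of the new query."""
    subst = unify(query_atom, head_atom)
    if subst is None:
        return None
    return (body_atom[0], tuple(subst.get(t, t) for t in body_atom[1]))

# Rule: worksFor(?x, ?y) -> employee(?x);  query atom: employee(alice)
print(rewrite_step(('employee', ('alice',)),
                   ('worksFor', ('?x', '?y')),
                   ('employee', ('?x',))))
# -> ('worksFor', ('alice', '?y'))
```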

In this paper, we study the problem of computing the majority function by low-depth monotone circuits and the related problem of constructing low-depth sorting networks. We consider both the classical setting with elementary operations of arity $2$ and the generalized setting with operations of arity $k$, where $k$ is a parameter. For both problems and both settings, various constructions are known, the minimal known depth being logarithmic. However, there is currently no known construction that simultaneously achieves sub-log-squared depth, effective constructability, and simplicity, and that has the potential to be used in practice. In this paper we make progress towards a resolution of this problem. For computing majority by standard monotone circuits (gates of arity 2), we provide an explicit monotone circuit of depth $O(\log_2^{5/3} n)$. The construction combines several known and not too complicated ideas. For arbitrary gate arity $k$, we provide a new sorting network architecture inspired by the representation of inputs as a high-dimensional cube. As a result, we obtain a simple construction that improves the previous upper bound of $4 \log_k^2 n$ to $2 \log_k^2 n$. We prove a similar bound for the depth of a circuit computing the majority of $n$ bits that consists of gates computing the majority of $k$ bits. Note that for both problems an explicit construction of depth $O(\log_k n)$ is known, but that construction is complicated and the constant hidden in the $O$-notation is huge.
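
For context on the $\log^2$-depth regime these constructions improve on, the sketch below computes the depth of Batcher's classical odd-even mergesort network, the textbook $\Theta(\log^2 n)$ baseline (this is not the paper's new architecture):

```python
def batcher_depth(n):
    """Depth of Batcher's odd-even mergesort on n = 2**m wires:
    m*(m+1)/2 comparator layers, i.e. Theta(log^2 n). A majority
    circuit follows by reading the middle wire of the sorted output."""
    m = n.bit_length() - 1
    assert n == 1 << m, "power of two expected"
    return m * (m + 1) // 2

for n in (8, 64, 1024):
    print(n, batcher_depth(n))  # 8 -> 6, 64 -> 21, 1024 -> 55
```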

In this work, we aim to establish a strong connection between two significant bodies of machine learning research: continual learning and sequence modeling. That is, we propose to formulate continual learning as a sequence modeling problem, allowing advanced sequence models to be utilized for continual learning. Under this formulation, the continual learning process becomes the forward pass of a sequence model. By adopting the meta-continual learning (MCL) framework, we can train the sequence model at the meta-level, on multiple continual learning episodes. As a specific example of our new formulation, we demonstrate the application of Transformers and their efficient variants as MCL methods. Our experiments on seven benchmarks, covering both classification and regression, show that sequence models can be an attractive solution for general MCL.
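
A minimal sketch of this formulation in PyTorch 2.x follows: each $(x_t, y_t)$ pair becomes a token, a causal Transformer encoder plays the role of the continual learner, and the prediction for a query input is read off the final position. The dimensions and token encoding are illustrative assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn

class SequenceMCL(nn.Module):
    """Sketch: a continual-learning episode as one forward pass of a
    causal Transformer. The stream of (x, y) pairs is the 'training'
    phase; the query token at the end is the 'test' prediction."""
    def __init__(self, x_dim, y_dim, d_model=128):
        super().__init__()
        self.embed = nn.Linear(x_dim + y_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, y_dim)

    def forward(self, xs, ys, x_query):
        # xs: (B, T, x_dim), ys: (B, T, y_dim), x_query: (B, 1, x_dim).
        stream = torch.cat([xs, ys], dim=-1)
        query = torch.cat([x_query, torch.zeros_like(ys[:, :1])], dim=-1)
        tokens = self.embed(torch.cat([stream, query], dim=1))
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        h = self.encoder(tokens, mask=mask)  # causal: no peeking ahead
        return self.head(h[:, -1])           # prediction for the query
```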

Maximum likelihood estimation is widely used for statistical inference. In this study, we reformulate the h-likelihood, proposed by Lee and Nelder in 1996, whose maximization yields maximum likelihood estimators for fixed parameters and asymptotically best unbiased predictors for random parameters. We establish the statistical foundations for h-likelihood theories, which extend classical likelihood theories to embrace broad classes of statistical models with random parameters. The maximum h-likelihood estimators asymptotically achieve the generalized Cramér-Rao lower bound. Furthermore, we explore the asymptotic theory when the consistency of either fixed parameter estimation or random parameter prediction is violated. The new h-likelihood framework enables likelihood theories to cover inference for a much broader class of models, while also providing computationally efficient fitting algorithms that give asymptotically optimal estimators for fixed parameters and predictors for random parameters.
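
For reference, a standard form of the h-likelihood (following Lee and Nelder, with data $y$, fixed parameters $\theta$, and random parameters $v$; the notation here is assumed) is $h(\theta, v) = \log f_{\theta}(y \mid v) + \log f_{\theta}(v)$, and joint maximization over $(\theta, v)$ yields the maximum h-likelihood estimators whose asymptotic optimality the paper establishes.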

In this paper, we propose to apply a meta-learning approach to low-resource automatic speech recognition (ASR). We formulate ASR for different languages as different tasks and meta-learn the initialization parameters from many pretraining languages to achieve fast adaptation to an unseen target language, via the recently proposed model-agnostic meta-learning algorithm (MAML). We evaluate the proposed approach using six languages as pretraining tasks and four languages as target tasks. Preliminary results show that the proposed method, MetaASR, significantly outperforms the state-of-the-art multitask pretraining approach on all target languages with different combinations of pretraining languages. In addition, owing to MAML's model-agnostic property, this paper also opens a new research direction of applying meta-learning to more speech-related applications.
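
A generic sketch of one MAML meta-update in PyTorch 2.x follows (full second-order, via `create_graph=True`); the task interface, loss, and hyperparameters are illustrative and not tied to the MetaASR codebase:

```python
import torch

def maml_step(model, loss_fn, tasks, inner_lr=0.01, outer_lr=1e-3):
    """One MAML meta-update. `tasks` yields (support, query) batch pairs,
    one per pretraining language in the MetaASR setting."""
    meta_opt = torch.optim.SGD(model.parameters(), lr=outer_lr)
    meta_opt.zero_grad()
    for (x_s, y_s), (x_q, y_q) in tasks:
        # Inner loop: adapt a fast copy of the weights on the support set.
        fast = dict(model.named_parameters())
        grads = torch.autograd.grad(
            loss_fn(torch.func.functional_call(model, fast, (x_s,)), y_s),
            list(fast.values()), create_graph=True)
        fast = {n: p - inner_lr * g
                for (n, p), g in zip(fast.items(), grads)}
        # Outer loop: evaluate adapted weights on the query set;
        # gradients flow back to the shared initialization.
        loss_fn(torch.func.functional_call(model, fast, (x_q,)), y_q).backward()
    meta_opt.step()
```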

The recent proliferation of knowledge graphs (KGs), coupled with incomplete or partial information in the form of missing relations (links) between entities, has fueled a lot of research on knowledge base completion (also known as relation prediction). Several recent works suggest that convolutional neural network (CNN) based models generate richer and more expressive feature embeddings and hence also perform well on relation prediction. However, we observe that these KG embeddings treat triples independently and thus fail to capture the complex and hidden information that is inherently implicit in the local neighborhood surrounding a triple. To this end, our paper proposes a novel attention-based feature embedding that captures both entity and relation features in any given entity's neighborhood. Additionally, we also encapsulate relation clusters and multi-hop relations in our model. Our empirical study offers insights into the efficacy of our attention-based model, and we show marked performance gains in comparison to state-of-the-art methods on all datasets.
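
The following is a minimal PyTorch sketch of attention over a single entity's neighborhood triples; the joint triple embedding and the scoring form are illustrative simplifications, not the paper's exact layer:

```python
import torch
import torch.nn as nn

class TripleAttentionLayer(nn.Module):
    """Sketch: each neighborhood triple (h, r, t) is embedded jointly,
    and the target entity aggregates the triple features with learned
    attention weights over its neighborhood."""
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(3 * dim, dim, bias=False)  # triple -> feature
        self.a = nn.Linear(dim, 1, bias=False)        # attention score
        self.leaky = nn.LeakyReLU(0.2)

    def forward(self, e_h, e_r, e_t):
        # e_h, e_r, e_t: (num_edges, dim), all edges of one entity.
        c = self.W(torch.cat([e_h, e_r, e_t], dim=-1))
        alpha = torch.softmax(self.leaky(self.a(c)), dim=0)  # over edges
        return (alpha * c).sum(dim=0)  # updated entity embedding

out = TripleAttentionLayer(32)(torch.randn(7, 32),
                               torch.randn(7, 32),
                               torch.randn(7, 32))
```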

BERT, a pre-trained Transformer model, has achieved ground-breaking performance on multiple NLP tasks. In this paper, we describe BERTSUM, a simple variant of BERT, for extractive summarization. Our system is the state of the art on the CNN/Dailymail dataset, outperforming the previous best-performing system by 1.65 on ROUGE-L. The code to reproduce our results is available at https://github.com/nlpyang/BertSum
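
A sketch of the extractive scoring idea, with the BERT encoder abstracted away: given one vector per sentence (e.g., from per-sentence [CLS] tokens), a small classifier scores each sentence for inclusion in the summary. The sizes and the top-k selection below are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ExtractiveScorer(nn.Module):
    """Sketch of BERTSUM-style extractive scoring on top of
    per-sentence encoder vectors; the BERT encoder is assumed."""
    def __init__(self, hidden=768):
        super().__init__()
        self.classifier = nn.Linear(hidden, 1)

    def forward(self, sent_vecs):  # (num_sentences, hidden)
        return torch.sigmoid(self.classifier(sent_vecs)).squeeze(-1)

scores = ExtractiveScorer()(torch.randn(5, 768))
summary_ids = scores.topk(3).indices.sort().values  # top-3, document order
```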

Recently, ensembling has been applied to deep metric learning to yield state-of-the-art results. Deep metric learning aims to learn deep neural networks for feature embeddings whose distances satisfy a given constraint. In deep metric learning, an ensemble takes the average of distances learned by multiple learners. As diversity is an important aspect of an ensemble, the learners should be diverse in their feature embeddings. To this end, we propose an attention-based ensemble, which uses multiple attention masks so that each learner can attend to different parts of the object. We also propose a divergence loss, which encourages diversity among the learners. The proposed method is applied to the standard benchmarks of deep metric learning, and experimental results show that it outperforms the state-of-the-art methods by a significant margin on image retrieval tasks.
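
A sketch of one plausible form of such a divergence loss in PyTorch: for each sample, the embeddings produced by different learners are pushed at least a margin apart. The hinge form and the margin are assumptions for illustration, not the paper's exact formula:

```python
import torch

def divergence_loss(embeddings, margin=1.0):
    """Sketch: encourage ensemble diversity by penalizing learner pairs
    whose embeddings of the same sample are closer than `margin`.
    `embeddings` is (num_learners, batch, dim)."""
    m = embeddings.shape[0]
    loss = embeddings.new_zeros(())
    pairs = 0
    for i in range(m):
        for j in range(i + 1, m):
            d = (embeddings[i] - embeddings[j]).pow(2).sum(-1).sqrt()
            loss = loss + torch.relu(margin - d).mean()
            pairs += 1
    return loss / max(pairs, 1)
```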

In this paper, we propose a novel multi-task learning architecture, which incorporates recent advances in attention mechanisms. Our approach, the Multi-Task Attention Network (MTAN), consists of a single shared network containing a global feature pool, together with task-specific soft-attention modules, which are trainable in an end-to-end manner. These attention modules allow for learning of task-specific features from the global pool, whilst simultaneously allowing for features to be shared across different tasks. The architecture can be built upon any feed-forward neural network, is simple to implement, and is parameter efficient. Experiments on the CityScapes dataset show that our method outperforms several baselines in both single-task and multi-task learning, and is also more robust to the various weighting schemes in the multi-task loss function. We further explore the effectiveness of our method through experiments over a range of task complexities, and show how our method scales well with task complexity compared to baselines.
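
A minimal PyTorch sketch of an MTAN-style soft-attention module: a task-specific mask with values in $[0, 1]$ gates features from the shared network elementwise. The layer shapes are illustrative, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class TaskAttention(nn.Module):
    """Sketch: one task-specific attention module over shared features.
    Each task gets its own instance; the shared backbone is assumed."""
    def __init__(self, channels):
        super().__init__()
        self.mask = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.BatchNorm2d(channels), nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid())                    # soft attention in [0, 1]

    def forward(self, shared_feat):          # (B, C, H, W) shared features
        return self.mask(shared_feat) * shared_feat

out = TaskAttention(64)(torch.randn(2, 64, 32, 32))
```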
