亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

A long line of works characterizes the sample complexity of regret minimization in sequential decision-making by min-max programs. In the corresponding saddle-point game, the min-player optimizes the sampling distribution against an adversarial max-player that chooses confusing models leading to large regret. The most recent instantiation of this idea is the decision-estimation coefficient (DEC), which was shown to provide nearly tight lower and upper bounds on the worst-case expected regret in structured bandits and reinforcement learning. By re-parametrizing the offset DEC with the confidence radius and solving the corresponding min-max program, we derive an anytime variant of the Estimation-To-Decisions (E2D) algorithm. Importantly, the algorithm optimizes the exploration-exploitation trade-off online instead of via the analysis. Our formulation leads to a practical algorithm for finite model classes and linear feedback models. We further point out connections to the information ratio, decoupling coefficient and PAC-DEC, and numerically evaluate the performance of E2D on simple examples.

相關內容

Constrained optimization plays a crucial role in the fields of quantum physics and quantum information science and becomes especially challenging for high-dimensional complex structure problems. One specific issue is that of quantum process tomography, in which the goal is to retrieve the underlying quantum process based on a given set of measurement data. In this paper, we introduce a modified version of stochastic gradient descent on a Riemannian manifold that integrates recent advancements in numerical methods for Riemannian optimization. This approach inherently supports the physically driven constraints of a quantum process, takes advantage of state-of-the-art large-scale stochastic objective optimization, and has superior performance to traditional approaches such as maximum likelihood estimation and projected least squares. The data-driven approach enables accurate, order-of-magnitude faster results, and works with incomplete data. We demonstrate our approach on simulations of quantum processes and in hardware by characterizing an engineered process on quantum computers.

We present a novel technique for online safety verification of autonomous systems, which performs reachability analysis efficiently for both bounded and unbounded horizons by employing neural barrier certificates. Our approach uses barrier certificates given by parameterized neural networks that depend on a given initial set, unsafe sets, and time horizon. Such networks are trained efficiently offline using system simulations sampled from regions of the state space. We then employ a meta-neural network to generalize the barrier certificates to state space regions that are outside the training set. These certificates are generated and validated online as sound over-approximations of the reachable states, thus either ensuring system safety or activating appropriate alternative actions in unsafe scenarios. We demonstrate our technique on case studies from linear models to nonlinear control-dependent models for online autonomous driving scenarios.

The framework of approximate differential privacy is considered, and augmented by leveraging the notion of ``the total variation of a (privacy-preserving) mechanism'' (denoted by $\eta$-TV). With this refinement, an exact composition result is derived, and shown to be significantly tighter than the optimal bounds for differential privacy (which do not consider the total variation). Furthermore, it is shown that $(\varepsilon,\delta)$-DP with $\eta$-TV is closed under subsampling. The induced total variation of commonly used mechanisms are computed. Moreover, the notion of total variation of a mechanism is studied in the local privacy setting and privacy-utility tradeoffs are investigated. In particular, total variation distance and KL divergence are considered as utility functions and studied through the lens of contraction coefficients. Finally, the results are compared and connected to the locally differentially private setting.

We define QSE, a symbolic execution framework for quantum programs by integrating symbolic variables into quantum states and the outcomes of quantum measurements. The soundness of QSE is established through a theorem that ensures the correctness of symbolic execution within operational semantics. We further introduce symbolic stabilizer states, which symbolize the phases of stabilizer generators, for the efficient analysis of quantum error correction (QEC) programs. Within the QSE framework, we can use symbolic expressions to characterize the possible discrete Pauli errors in QEC, providing a significant improvement over existing methods that rely on sampling with simulators. We implement QSE with the support of symbolic stabilizer states in a prototype tool named QuantumSE.jl. Our experiments on representative QEC codes, including quantum repetition codes, Kitaev's toric codes, and quantum Tanner codes, demonstrate the efficiency of QuantumSE.jl for debugging QEC programs with over 1000 qubits. In addition, by substituting concrete values in symbolic expressions of measurement results, QuantumSE.jl is also equipped with a sampling feature for stabilizer circuits. Despite a longer initialization time than the state-of-the-art stabilizer simulator, Google's Stim, QuantumSE.jl offers a quicker sampling rate in the experiments.

Many techniques for automated inference of inductive invariants for distributed protocols have been developed over the past several years, but their performance can still be unpredictable and their failure modes opaque for large-scale verification tasks. In this paper, we present inductive proof slicing, a new automated, compositional technique for inductive invariant inference that scales effectively to large distributed protocol verification tasks. Our technique is built on a core, novel data structure, the inductive proof graph, which explicitly represents the lemma and action dependencies of an inductive invariant and is built incrementally during the inference procedure, backwards from a target safety property. We present an invariant inference algorithm that integrates localized syntax-guided lemma synthesis routines at nodes of this graph, which are accelerated by computation of localized grammar and state variable slices. Additionally, in the case of failure to produce a complete inductive invariant, maintenance of this proof graph structure allows failures to be localized to small sub-components of this graph, enabling fine-grained failure diagnosis and repair by a user. We evaluate our technique on several complex distributed and concurrent protocols, including a large scale specification of the Raft consensus protocol, which is beyond the capabilities of modern distributed protocol verification tools, and also demonstrate how its interpretability features allow effective diagnosis and repair in cases of initial failure.

Despite significant progress in robotic systems for operation within human-centric environments, existing models still heavily rely on explicit human commands to identify and manipulate specific objects. This limits their effectiveness in environments where understanding and acting on implicit human intentions are crucial. In this study, we introduce a novel task: reasoning grasping, where robots need to generate grasp poses based on indirect verbal instructions or intentions. To accomplish this, we propose an end-to-end reasoning grasping model that integrates a multi-modal Large Language Model (LLM) with a vision-based robotic grasping framework. In addition, we present the first reasoning grasping benchmark dataset generated from the GraspNet-1 billion, incorporating implicit instructions for object-level and part-level grasping, and this dataset will soon be available for public access. Our results show that directly integrating CLIP or LLaVA with the grasp detection model performs poorly on the challenging reasoning grasping tasks, while our proposed model demonstrates significantly enhanced performance both in the reasoning grasping benchmark and real-world experiments.

For robustness toward model misspecification, the generalized posterior inference approach modifies the likelihood term by raising it to the power of a learning rate, thereby adjusting the spread of the posterior. This paper proposes a computationally efficient strategy for selecting an appropriate learning rate. The proposed approach builds upon the generalized posterior calibration (GPC) algorithm introduced by Syring and Martin (2019) [Biometrika, Volume 106, Issue 2, pp. 479-486], which is designed to select the learning rate to achieve the nominal frequentist coverage. This algorithm, which evaluates the coverage probability based on bootstrap samples, suffers from high computational costs due to the need for repeated posterior simulations for bootstrap samples. To address this limitation, the study proposes an algorithm that combines elements of the GPC algorithm with the sequential Monte Carlo (SMC) sampler. By leveraging the similarity between the learning rate in generalized posterior inference and the inverse temperature in SMC sampling, the proposed algorithm efficiently calibrates the posterior distribution with less computational cost. For demonstration, the proposed algorithm was applied to several statistical learning models.

Neural reflectance models are capable of reproducing the spatially-varying appearance of many real-world materials at different scales. Unfortunately, existing techniques such as NeuMIP have difficulties handling materials with strong shadowing effects or detailed specular highlights. In this paper, we introduce a neural appearance model that offers a new level of accuracy. Central to our model is an inception-based core network structure that captures material appearances at multiple scales using parallel-operating kernels and ensures multi-stage features through specialized convolution layers. Furthermore, we encode the inputs into frequency space, introduce a gradient-based loss, and employ it adaptive to the progress of the learning phase. We demonstrate the effectiveness of our method using a variety of synthetic and real examples.

We investigate the role of the initial screening order (ISO) in candidate screening processes, such as employee hiring and academic admissions. The ISO refers to the order in which the screener evaluates the candidate pool. It has been largely overlooked in the literature, despite its potential impact on the optimality and fairness of the chosen set, especially under a human screener. We define two problem formulations: the best-$k$, where the screener selects the $k$ best candidates, and the good-$k$, where the screener selects the $k$ first good-enough candidates. To study the impact of the ISO, we introduce a human-like screener and compare it to its algorithmic counterpart. The human-like screener is conceived to be inconsistent over time due to fatigue. Our analysis shows that the ISO, in particular, under a human-like screener hinders individual fairness despite meeting group level fairness. This is due to the position bias, where a candidate's evaluation is affected by its position within the ISO. We report extensive simulated experiments exploring the parameters of the best-$k$ and good-$k$ problem formulations both for the algorithmic and human-like screeners. This work is motivated by a real world candidate screening problem studied in collaboration with a large European company.

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.

北京阿比特科技有限公司