亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Bayesian optimization is an effective method for optimizing expensive-to-evaluate black-box functions. High-dimensional problems are particularly challenging as the surrogate model of the objective suffers from the curse of dimensionality, which makes accurate modeling difficult. We propose a group testing approach to identify active variables to facilitate efficient optimization in these domains. The proposed algorithm, Group Testing Bayesian Optimization (GTBO), first runs a testing phase where groups of variables are systematically selected and tested on whether they influence the objective. To that end, we extend the well-established theory of group testing to functions of continuous ranges. In the second phase, GTBO guides optimization by placing more importance on the active dimensions. By exploiting the axis-aligned subspace assumption, GTBO is competitive against state-of-the-art methods on several synthetic and real-world high-dimensional optimization tasks. Furthermore, GTBO aids in the discovery of active parameters in applications, thereby enhancing practitioners' understanding of the problem at hand.

相關內容

Designing a perfect reward function that depicts all the aspects of the intended behavior is almost impossible, especially generalizing it outside of the training environments. Active Inverse Reward Design (AIRD) proposed the use of a series of queries, comparing possible reward functions in a single training environment. This allows the human to give information to the agent about suboptimal behaviors, in order to compute a probability distribution over the intended reward function. However, it ignores the possibility of unknown features appearing in real-world environments, and the safety measures needed until the agent completely learns the reward function. I improved this method and created Risk-averse Batch Active Inverse Reward Design (RBAIRD), which constructs batches, sets of environments the agent encounters when being used in the real world, processes them sequentially, and, for a predetermined number of iterations, asks queries that the human needs to answer for each environment of the batch. After this process is completed in one batch, the probabilities have been improved and are transferred to the next batch. This makes it capable of adapting to real-world scenarios and learning how to treat unknown features it encounters for the first time. I also integrated a risk-averse planner, similar to that of Inverse Reward Design (IRD), which samples a set of reward functions from the probability distribution and computes a trajectory that takes the most certain rewards possible. This ensures safety while the agent is still learning the reward function, and enables the use of this approach in situations where cautiousness is vital. RBAIRD outperformed the previous approaches in terms of efficiency, accuracy, and action certainty, demonstrated quick adaptability to new, unknown features, and can be more widely used for the alignment of crucial, powerful AI models.

QR decomposition is an essential operation for solving linear equations and obtaining least-squares solutions. In high-performance computing systems, large-scale parallel QR decomposition often faces node faults. We address this issue by proposing a fault-tolerant algorithm that incorporates `coded computing' into the parallel Gram-Schmidt method, commonly used for QR decomposition. Coded computing introduces error-correcting codes into computational processes to enhance resilience against intermediate failures. While traditional coding strategies cannot preserve the orthogonality of $Q$, recent work has proven a post-orthogonalization condition that allows low-cost restoration of the degraded orthogonality. In this paper, we construct a checksum-generator matrix for multiple-node failures that satisfies the post-orthogonalization condition and prove that our code satisfies the maximum-distance separable (MDS) property with high probability. Furthermore, we consider in-node checksum storage setting where checksums are stored in original nodes. We obtain the minimal number of checksums required to be resilient to any $f$ failures under the in-node checksum storage, and also propose an in-node systematic MDS coding strategy that achieves the lower bound. Extensive experiments validate our theories and showcase the negligible overhead of our coded computing framework for fault-tolerant QR decomposition.

Knowledge distillation is an effective paradigm for boosting the performance of pocket-size model, especially when multiple teacher models are available, the student would break the upper limit again. However, it is not economical to train diverse teacher models for the disposable distillation. In this paper, we introduce a new concept dubbed Avatars for distillation, which are the inference ensemble models derived from the teacher. Concretely, (1) For each iteration of distillation training, various Avatars are generated by a perturbation transformation. We validate that Avatars own higher upper limit of working capacity and teaching ability, aiding the student model in learning diverse and receptive knowledge perspectives from the teacher model. (2) During the distillation, we propose an uncertainty-aware factor from the variance of statistical differences between the vanilla teacher and Avatars, to adjust Avatars' contribution on knowledge transfer adaptively. Avatar Knowledge Distillation AKD is fundamentally different from existing methods and refines with the innovative view of unequal training. Comprehensive experiments demonstrate the effectiveness of our Avatars mechanism, which polishes up the state-of-the-art distillation methods for dense prediction without more extra computational cost. The AKD brings at most 0.7 AP gains on COCO 2017 for Object Detection and 1.83 mIoU gains on Cityscapes for Semantic Segmentation, respectively. Code is available at //github.com/Gumpest/AvatarKD.

We examine the extent to which sublinear-sample property testing and estimation applies to settings where samples are independently but not identically distributed. Specifically, we consider the following distributional property testing framework: Suppose there is a set of distributions over a discrete support of size $k$, $\textbf{p}_1, \textbf{p}_2,\ldots,\textbf{p}_T$, and we obtain $c$ independent draws from each distribution. Suppose the goal is to learn or test a property of the average distribution, $\textbf{p}_{\mathrm{avg}}$. This setup models a number of important practical settings where the individual distributions correspond to heterogeneous entities -- either individuals, chronologically distinct time periods, spatially separated data sources, etc. From a learning standpoint, even with $c=1$ samples from each distribution, $\Theta(k/\varepsilon^2)$ samples are necessary and sufficient to learn $\textbf{p}_{\mathrm{avg}}$ to within error $\varepsilon$ in TV distance. To test uniformity or identity -- distinguishing the case that $\textbf{p}_{\mathrm{avg}}$ is equal to some reference distribution, versus has $\ell_1$ distance at least $\varepsilon$ from the reference distribution, we show that a linear number of samples in $k$ is necessary given $c=1$ samples from each distribution. In contrast, for $c \ge 2$, we recover the usual sublinear sample testing of the i.i.d. setting: we show that $O(\sqrt{k}/\varepsilon^2 + 1/\varepsilon^4)$ samples are sufficient, matching the optimal sample complexity in the i.i.d. case in the regime where $\varepsilon \ge k^{-1/4}$. Additionally, we show that in the $c=2$ case, there is a constant $\rho > 0$ such that even in the linear regime with $\rho k$ samples, no tester that considers the multiset of samples (ignoring which samples were drawn from the same $\textbf{p}_i$) can perform uniformity testing.

We introduce and study the problem of dueling optimization with a monotone adversary, which is a generalization of (noiseless) dueling convex optimization. The goal is to design an online algorithm to find a minimizer $\mathbf{x}^{*}$ for a function $f\colon X \to \mathbb{R}$, where $X \subseteq \mathbb{R}^d$. In each round, the algorithm submits a pair of guesses, i.e., $\mathbf{x}^{(1)}$ and $\mathbf{x}^{(2)}$, and the adversary responds with any point in the space that is at least as good as both guesses. The cost of each query is the suboptimality of the worse of the two guesses; i.e., ${\max} \left( f(\mathbf{x}^{(1)}), f(\mathbf{x}^{(2)}) \right) - f(\mathbf{x}^{*})$. The goal is to minimize the number of iterations required to find an $\varepsilon$-optimal point and to minimize the total cost (regret) of the guesses over many rounds. Our main result is an efficient randomized algorithm for several natural choices of the function $f$ and set $X$ that incurs cost $O(d)$ and iteration complexity $O(d\log(1/\varepsilon)^2)$. Moreover, our dependence on $d$ is asymptotically optimal, as we show examples in which any randomized algorithm for this problem must incur $\Omega(d)$ cost and iteration complexity.

We propose integrating the edge-computing paradigm into the multi-robot collaborative scheduling to maximize resource utilization for complex collaborative tasks, which many robots must perform together. Examples include collaborative map-merging to produce a live global map during exploration instead of traditional approaches that schedule tasks on centralized cloud-based systems to facilitate computing. Our decentralized approach to a consensus-based scheduling strategy benefits a multi-robot-edge collaboration system by adapting to dynamic computation needs and communication-changing statistics as the system tries to optimize resources while maintaining overall performance objectives. Before collaborative task offloading, continuous device, and network profiling are performed at the computing resources, and the distributed scheduling scheme then selects the resource with maximum utility derived using a utility maximization approach. Thorough evaluations with and without edge servers on simulation and real-world multi-robot systems demonstrate that a lower task latency, a large throughput gain, and better frame rate processing may be achieved compared to the conventional edge-based systems.

We introduce an autonomous system with closed-loop damping for first-order convex optimization. While, to this day, optimal rates of convergence are only achieved by non-autonomous methods via open-loop damping (e.g., Nesterov's algorithm), we show that our system is the first one featuring a closed-loop damping while exhibiting a rate arbitrarily close to the optimal one. We do so by coupling the damping and the speed of convergence of the system via a well-chosen Lyapunov function. We then derive a practical first-order algorithm called LYDIA by discretizing our system, and present numerical experiments supporting our theoretical findings.

We propose an efficient semi-Lagrangian characteristic mapping method for solving the one+one-dimensional Vlasov-Poisson equations with high precision on a coarse grid. The flow map is evolved numerically and exponential resolution in linear time is obtained. Global third-order convergence in space and time is shown and conservation properties are assessed. For benchmarking, we consider linear and nonlinear Landau damping and the two-stream instability. We compare the results with a Fourier pseudo-spectral method. The extreme fine-scale resolution features are illustrated showing the method's capabilities to efficiently treat filamentation in fusion plasma simulations.

Most object recognition approaches predominantly focus on learning discriminative visual patterns while overlooking the holistic object structure. Though important, structure modeling usually requires significant manual annotations and therefore is labor-intensive. In this paper, we propose to "look into object" (explicitly yet intrinsically model the object structure) through incorporating self-supervisions into the traditional framework. We show the recognition backbone can be substantially enhanced for more robust representation learning, without any cost of extra annotation and inference speed. Specifically, we first propose an object-extent learning module for localizing the object according to the visual patterns shared among the instances in the same category. We then design a spatial context learning module for modeling the internal structures of the object, through predicting the relative positions within the extent. These two modules can be easily plugged into any backbone networks during training and detached at inference time. Extensive experiments show that our look-into-object approach (LIO) achieves large performance gain on a number of benchmarks, including generic object recognition (ImageNet) and fine-grained object recognition tasks (CUB, Cars, Aircraft). We also show that this learning paradigm is highly generalizable to other tasks such as object detection and segmentation (MS COCO). Project page: //github.com/JDAI-CV/LIO.

Due to their inherent capability in semantic alignment of aspects and their context words, attention mechanism and Convolutional Neural Networks (CNNs) are widely applied for aspect-based sentiment classification. However, these models lack a mechanism to account for relevant syntactical constraints and long-range word dependencies, and hence may mistakenly recognize syntactically irrelevant contextual words as clues for judging aspect sentiment. To tackle this problem, we propose to build a Graph Convolutional Network (GCN) over the dependency tree of a sentence to exploit syntactical information and word dependencies. Based on it, a novel aspect-specific sentiment classification framework is raised. Experiments on three benchmarking collections illustrate that our proposed model has comparable effectiveness to a range of state-of-the-art models, and further demonstrate that both syntactical information and long-range word dependencies are properly captured by the graph convolution structure.

北京阿比特科技有限公司