
A great deal of research has been conducted on meta-heuristic optimisation methods that are able to find global optima in settings where gradient-based optimisers have traditionally struggled. Of these, so-called particle swarm optimisation (PSO) approaches have proven to be highly effective in a number of application areas. Given the maturity of the PSO field, it is likely that novel variants of the PSO algorithm stand to offer only marginal gains in terms of performance -- there is, after all, no free lunch. Instead of only chasing performance on suites of benchmark optimisation functions, it is argued herein that research effort is better placed in the pursuit of algorithms that also have other useful properties. In this work, a highly general, interpretable variant of the PSO algorithm -- particle attractor algorithm (PAO) -- is proposed. Furthermore, the algorithm is designed such that the transition densities (describing the motions of the particles from one generation to the next) can be computed exactly in closed form for each step. Access to closed-form transition densities has important ramifications for the closely related field of Sequential Monte Carlo (SMC). In order to demonstrate that the useful properties do not come at the cost of performance, PAO is compared to several other state-of-the-art heuristic optimisation algorithms in a benchmark comparison study.
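
For readers unfamiliar with the baseline, the canonical global-best PSO update that such variants build on can be sketched as follows. This is a minimal illustration of standard PSO, not the PAO algorithm proposed in the paper, whose update rule is not given in this abstract. The function name `pso_minimise`, the `sphere` test objective, and all parameter values (inertia 0.7, acceleration coefficients 1.5) are assumptions chosen for illustration.

```python
import random


def pso_minimise(objective, bounds, n_particles=30, n_iters=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Canonical global-best PSO (illustrative sketch, not the paper's PAO)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and value so far.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def sphere(x):
    """Simple convex test objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_f = pso_minimise(sphere, [(-5.0, 5.0)] * 2)
```

The random factors `r1` and `r2` are what make each particle's next position a random variable; the abstract's point is that PAO is constructed so that the density of this transition is available in closed form.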

Related content

In this work, we aim to reconstruct a time-varying 3D model, capable of producing photo-realistic renderings with independent control of viewpoint, illumination, and time, from Internet photos of large-scale landmarks. The core challenges are twofold. First, different types of temporal changes, such as changes in illumination and changes to the underlying scene itself (such as replacing one graffiti artwork with another), are entangled together in the imagery. Second, scene-level temporal changes are often discrete and sporadic over time, rather than continuous. To tackle these problems, we propose a new scene representation equipped with a novel temporal step function encoding method that can model discrete scene-level content changes as piece-wise constant functions over time. Specifically, we represent the scene as a space-time radiance field with a per-image illumination embedding, where temporally-varying scene changes are encoded using a set of learned step functions. To facilitate our task of chronology reconstruction from Internet imagery, we also collect a new dataset of four scenes that exhibit various changes over time. We demonstrate that our method exhibits state-of-the-art view synthesis results on this dataset, while achieving independent control of viewpoint, time, and illumination.
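
The idea of describing discrete scene changes as a piece-wise constant function of time can be illustrated with a sum of step functions. The sketch below only shows that target functional form; the abstract does not specify how the paper's learned encoding is parameterised, so the function name `piecewise_constant`, the hard threshold, and the example change times and offsets are all assumptions.

```python
def piecewise_constant(t, base, change_times, deltas):
    """Evaluate base + sum_i deltas[i] * H(t - change_times[i]).

    H is the unit step, taken here as 1 when t >= change_times[i].
    In a learned setting the change times and offsets would be trainable
    parameters; here they are fixed, hypothetical values.
    """
    return base + sum(d for t_i, d in zip(change_times, deltas) if t >= t_i)


# Hypothetical scene attribute that changes abruptly at t = 2 and t = 5.
change_times = [2.0, 5.0]
deltas = [1.0, -3.0]
values = [piecewise_constant(t, 0.5, change_times, deltas) for t in (0.0, 2.0, 4.9, 5.0)]
```

Between change times the value stays constant, which matches the observation that scene-level edits such as replaced graffiti are sporadic rather than continuous.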

Quickly and reliably finding accurate inverse kinematics (IK) solutions remains a challenging problem for robotic manipulation. Existing numerical solvers are broadly applicable, but typically only produce a single solution and rely on local search techniques to minimize highly nonconvex objective functions. More recent learning-based approaches that approximate the entire feasible set of solutions have shown promise as a means to generate multiple fast and accurate IK results in parallel. However, existing learning-based techniques have a significant drawback: each robot of interest requires a specialized model that must be trained from scratch. To address this key shortcoming, we investigate a novel distance-geometric robot representation coupled with a graph structure that allows us to leverage the flexibility of graph neural networks (GNNs). We use this approach to train the first learned generative graphical inverse kinematics (GGIK) solver that is able to produce a large number of diverse solutions in parallel and to also generalize: a single learned model can be used to produce IK solutions for a variety of different robots. When compared to several other learned IK methods, GGIK provides more accurate solutions. GGIK is also able to generalize reasonably well to robot manipulators unseen during training. Finally, we show that GGIK can be used to complement local IK solvers by providing reliable initializations to seed the local optimization process.

Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts, but there has been limited understanding of exactly how these explanations function or why they are effective. This work aims to better understand the mechanisms by which explanations are used for in-context learning. We first study the impact of two different factors on the performance of prompts with explanations: the computation trace (the way the solution is decomposed) and the natural language used to express the prompt. By perturbing explanations on three controlled tasks, we show that both factors contribute to the effectiveness of explanations. We further study how to form maximally effective sets of explanations for solving a given test query. We find that LLMs can benefit from the complementarity of the explanation set: diverse reasoning skills shown by different exemplars can lead to better performance. Therefore, we propose a maximal marginal relevance-based exemplar selection approach for constructing exemplar sets that are both relevant and complementary, which successfully improves the in-context learning performance across three real-world tasks on multiple LLMs.
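
Maximal marginal relevance (MMR) selection, on which the proposed exemplar selection is based, can be sketched as a greedy loop that trades relevance to the query against redundancy with exemplars already chosen. This is a generic MMR sketch, not the paper's exact procedure: representing exemplars as plain vectors, using cosine similarity, and the names `mmr_select` and `lam` are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_select(query, candidates, k, lam=0.5):
    """Greedily pick k candidate indices by maximal marginal relevance.

    lam = 1.0 ranks purely by relevance to the query; smaller values
    penalise candidates similar to those already selected.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical exemplar embeddings: indices 0 and 1 are near-duplicates.
query = [1.0, 1.0, 0.0]
exemplars = [[1.0, 0.05, 0.0], [1.0, 0.1, 0.0], [0.0, 1.0, 0.0]]
diverse = mmr_select(query, exemplars, k=2, lam=0.5)
relevance_only = mmr_select(query, exemplars, k=2, lam=1.0)
```

With `lam=0.5` the second pick skips the near-duplicate in favour of the complementary exemplar, whereas `lam=1.0` simply takes the two most relevant ones.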

We study the generalization behavior of transfer learning of deep neural networks (DNNs). We adopt the overparameterization perspective -- featuring interpolation of the training data (i.e., approximately zero train error) and the double descent phenomenon -- to explain the delicate effect of the transfer learning setting on generalization performance. We study how the generalization behavior of transfer learning is affected by the dataset size in the source and target tasks, the number of transferred layers that are kept frozen in the target DNN training, and the similarity between the source and target tasks. We show that the test error evolution during the target DNN training has a more significant double descent effect when the target training dataset is sufficiently large. In addition, a larger source training dataset can yield a slower target DNN training. Moreover, we demonstrate that the number of frozen layers can determine whether the transfer learning is effectively underparameterized or overparameterized and, in turn, this may induce a freezing-wise double descent phenomenon that determines the relative success or failure of learning. Also, we show that the double descent phenomenon may make a transfer from a less related source task better than a transfer from a more related source task. We establish our results using image classification experiments with the ResNet, DenseNet and the vision transformer (ViT) architectures.

We study step-wise time approximations of non-linear hyperbolic initial value problems. The technique used here is a generalization of the minimizing movements method, using two time-scales: one for velocity, the other (potentially much larger) for acceleration. The main applications are from elastodynamics, namely so-called generalized solids. The evolution follows an underlying variational structure exploited by step-wise minimisation. We show for a large family of (elastic) energies that the introduced scheme is stable, allowing for non-linearities of highest order. If the highest order can be assumed to be linear, we show that the limit solutions are regular and that the minimizing movements scheme converges with optimal linear rate. Thus this work extends numerical time-step minimization methods to the realm of hyperbolic problems.

To efficiently exploit the massive amounts of raw data that are increasingly being generated in mobile edge networks, federated learning (FL) has emerged as a promising distributed learning technique. By collaboratively training a shared learning model on edge devices, raw data transmission and storage are replaced by the exchange of the locally computed parameters/gradients in FL, which thus helps address latency and privacy issues. However, the number of resource blocks when using traditional orthogonal transmission strategies for FL scales linearly with the number of participating devices, which conflicts with the scarcity of communication resources. To tackle this issue, over-the-air computation (AirComp) has recently emerged, which leverages the inherent superposition property of wireless channels to perform one-shot model aggregation. However, the aggregation accuracy in AirComp suffers from the unfavorable wireless propagation environment. In this paper, we consider the use of intelligent reflecting surfaces (IRSs) to mitigate this problem and improve FL performance with AirComp. Specifically, a performance-oriented design scheme that directly minimizes the optimality gap of the loss function is proposed to accelerate the convergence of AirComp-based FL. We first analyze the convergence behavior of the FL procedure in the absence of channel fading and noise. Based on the obtained optimality gap, which characterizes the impact of channel fading and noise in different communication rounds on the ultimate performance of FL, we propose both online and offline approaches to tackle the resulting design problem. Simulation results demonstrate that such a performance-oriented design strategy can achieve higher test accuracy than the conventional isolated mean square error (MSE) minimization approach in FL.

Diffusion models have demonstrated excellent potential for generating diverse images. However, their performance often suffers from slow generation due to iterative denoising. Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few without significant quality degradation. However, existing distillation methods either require significant amounts of offline computation for generating synthetic training data from the teacher model or need to perform expensive online learning with the help of real data. In this work, we present a novel technique called BOOT, that overcomes these limitations with an efficient data-free distillation algorithm. The core idea is to learn a time-conditioned model that predicts the output of a pre-trained diffusion model teacher given any time step. Such a model can be efficiently trained based on bootstrapping from two consecutive sampled steps. Furthermore, our method can be easily adapted to large-scale text-to-image diffusion models, which are challenging for conventional methods given the fact that the training sets are often large and difficult to access. We demonstrate the effectiveness of our approach on several benchmark datasets in the DDIM setting, achieving comparable generation quality while being orders of magnitude faster than the diffusion teacher. The text-to-image results show that the proposed approach is able to handle highly complex distributions, shedding light on more efficient generative modeling.

Deep learning has shown great potential for modeling the physical dynamics of complex particle systems such as fluids (in Lagrangian descriptions). Existing approaches, however, require the supervision of consecutive particle properties, including positions and velocities. In this paper, we consider a partially observable scenario known as fluid dynamics grounding, that is, inferring the state transitions and interactions within the fluid particle systems from sequential visual observations of the fluid surface. We propose a differentiable two-stage network named NeuroFluid. Our approach consists of (i) a particle-driven neural renderer, which incorporates fluid physical properties into the volume rendering function, and (ii) a particle transition model optimized to reduce the differences between the rendered and the observed images. NeuroFluid provides the first solution to unsupervised learning of particle-based fluid dynamics by training these two models jointly. It is shown to reasonably estimate the underlying physics of fluids with different initial shapes, viscosities, and densities. It is a potential alternative approach to understanding complex fluid mechanics, such as turbulence, that is difficult to model using traditional methods of mathematical physics.

Deep reinforcement learning algorithms can perform poorly in real-world tasks due to the discrepancy between source and target environments. This discrepancy is commonly viewed as the disturbance in transition dynamics. Many existing algorithms learn robust policies by modeling the disturbance and applying it to source environments during training, which usually requires prior knowledge about the disturbance and control of simulators. However, these algorithms can fail in scenarios where the disturbance from target environments is unknown or is intractable to model in simulators. To tackle this problem, we propose a novel model-free actor-critic algorithm -- namely, state-conservative policy optimization (SCPO) -- to learn robust policies without modeling the disturbance in advance. Specifically, SCPO reduces the disturbance in transition dynamics to that in state space and then approximates it by a simple gradient-based regularizer. The appealing features of SCPO include that it is simple to implement and does not require additional knowledge about the disturbance or specially designed simulators. Experiments in several robot control tasks demonstrate that SCPO learns robust policies against the disturbance in transition dynamics.

Multivariate time series forecasting has been extensively studied over the years, with ubiquitous applications in areas such as finance, traffic, and the environment. Still, concerns have been raised that traditional methods are incapable of modeling the complex patterns or dependencies found in real-world data. To address such concerns, various deep learning models, mainly Recurrent Neural Network (RNN) based methods, have been proposed. Nevertheless, capturing extremely long-term patterns while effectively incorporating information from other variables remains a challenge for time-series forecasting. Furthermore, lack of explainability remains a serious drawback of deep neural network models. Inspired by the Memory Network proposed for solving the question-answering task, we propose a deep learning based model named Memory Time-series network (MTNet) for time series forecasting. MTNet consists of a large memory component, three separate encoders, and an autoregressive component, which are trained jointly. Additionally, the attention mechanism is designed to make MTNet highly interpretable: we can easily tell which part of the historic data is referenced the most.
