亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Missing values challenge the probabilistic wind power forecasting at both parameter estimation and operational forecasting stages. In this paper, we illustrate that we are allowed to estimate forecasting functions for each missing patterns conveniently, and propose an adaptive quantile regression model whose parameters can adapt to missing patterns. For that, we particularly design a feature extraction block within the quantile regression model, where parameters are set as a function of missingness pattern and only account for observed values. To avoidthe quantile-crossing phenomena, we design a multi-task model to ensure the monotonicity of quantiles, where higher quantiles are derived by the addition between lower quantiles and non-negative increments modeled by neural networks. The proposed approach is distribution-free and applicable to both missing-at-random and missing-not-at-random cases. Case studies demonstrate that the proposed approach achieves the state-of-the-art in terms of the continuous ranked probability score.

相關內容

Cellwise contamination remains a challenging problem for data scientists, particularly in research fields that require the selection of sparse features. Traditional robust methods may not be feasible nor efficient in dealing with such contaminated datasets. We propose CR-Lasso, a robust Lasso-type cellwise regularization procedure that performs feature selection in the presence of cellwise outliers by minimising a regression loss and cell deviation measure simultaneously. To evaluate the approach, we conduct empirical studies comparing its selection and prediction performance with several sparse regression methods. We show that CR-Lasso is competitive under the settings considered. We illustrate the effectiveness of the proposed method on real data through an analysis of a bone mineral density dataset.

We report a flexible language-model based deep learning strategy, applied here to solve complex forward and inverse problems in protein modeling, based on an attention neural network that integrates transformer and graph convolutional architectures in a causal multi-headed graph mechanism, to realize a generative pretrained model. The model is applied to predict secondary structure content (per-residue level and overall content), protein solubility, and sequencing tasks. Further trained on inverse tasks, the model is rendered capable of designing proteins with these properties as target features. The model is formulated as a general framework, completely prompt-based, and can be adapted for a variety of downstream tasks. We find that adding additional tasks yields emergent synergies that the model exploits in improving overall performance, beyond what would be possible by training a model on each dataset alone. Case studies are presented to validate the method, yielding protein designs specifically focused on structural proteins, but also exploring the applicability in the design of soluble, antimicrobial biomaterials. While our model is trained to ultimately perform 8 distinct tasks, with available datasets it can be extended to solve additional problems. In a broader sense, this work illustrates a form of multiscale modeling that relates a set of ultimate building blocks (here, byte-level utf8 characters that define the nature of the physical system at hand) to complex output. This materiomic scheme captures complex emergent relationships between universal building block and resulting properties via a synergizing learning capacity to express a set of potentialities embedded in the knowledge used in training, via the interplay of universality and diversity.

As the use of solar power increases, having accurate and timely forecasts will be essential for smooth grid operators. There are many proposed methods for forecasting solar irradiance / solar power production. However, many of these methods formulate the problem as a time-series, relying on near real-time access to observations at the location of interest to generate forecasts. This requires both access to a real-time stream of data and enough historical observations for these methods to be deployed. In this paper, we propose the use of Global methods to train our models in a generalised way, enabling them to generate forecasts for unseen locations. We apply this approach to both classical ML and state of the art methods. Using data from 20 locations distributed throughout the UK and widely available weather data, we show that it is possible to build systems that do not require access to this data. We utilise and compare both satellite and ground observations (e.g. temperature, pressure) of weather data. Leveraging weather observations and measurements from other locations we show it is possible to create models capable of accurately forecasting solar irradiance at new locations. This could facilitate use planning and optimisation for both newly deployed solar farms and domestic installations from the moment they come online. Additionally, we show that training a single global model for multiple locations can produce a more robust model with more consistent and accurate results across locations.

Adaptive learning is necessary for non-stationary environments where the learning machine needs to forget past data distribution. Efficient algorithms require a compact model update to not grow in computational burden with the incoming data and with the lowest possible computational cost for online parameter updating. Existing solutions only partially cover these needs. Here, we propose the first adaptive sparse Gaussian Process (GP) able to address all these issues. We first reformulate a variational sparse GP algorithm to make it adaptive through a forgetting factor. Next, to make the model inference as simple as possible, we propose updating a single inducing point of the sparse GP model together with the remaining model parameters every time a new sample arrives. As a result, the algorithm presents a fast convergence of the inference process, which allows an efficient model update (with a single inference iteration) even in highly non-stationary environments. Experimental results demonstrate the capabilities of the proposed algorithm and its good performance in modeling the predictive posterior in mean and confidence interval estimation compared to state-of-the-art approaches.

In this study, a density-on-density regression model is introduced, where the association between densities is elucidated via a warping function. The proposed model has the advantage of a being straightforward demonstration of how one density transforms into another. Using the Riemannian representation of density functions, which is the square-root function (or half density), the model is defined in the correspondingly constructed Riemannian manifold. To estimate the warping function, it is proposed to minimize the average Hellinger distance, which is equivalent to minimizing the average Fisher-Rao distance between densities. An optimization algorithm is introduced by estimating the smooth monotone transformation of the warping function. Asymptotic properties of the proposed estimator are discussed. Simulation studies demonstrate the superior performance of the proposed approach over competing approaches in predicting outcome density functions. Applying to a proteomic-imaging study from the Alzheimer's Disease Neuroimaging Initiative, the proposed approach illustrates the connection between the distribution of protein abundance in the cerebrospinal fluid and the distribution of brain regional volume. Discrepancies among cognitive normal subjects, patients with mild cognitive impairment, and Alzheimer's disease (AD) are identified and the findings are in line with existing knowledge about AD.

Deep Ensembles, as a type of Bayesian Neural Networks, can be used to estimate uncertainty on the prediction of multiple neural networks by collecting votes from each network and computing the difference in those predictions. In this paper, we introduce a method for uncertainty estimation that considers a set of independent categorical distributions for each layer of the network, giving many more possible samples with overlapped layers than in the regular Deep Ensembles. We further introduce an optimized inference procedure that reuses common layer outputs, achieving up to 19x speed up and reducing memory usage quadratically. We also show that the method can be further improved by ranking samples, resulting in models that require less memory and time to run while achieving higher uncertainty quality than Deep Ensembles.

Traditional centralized multi-agent reinforcement learning (MARL) algorithms are sometimes unpractical in complicated applications, due to non-interactivity between agents, curse of dimensionality and computation complexity. Hence, several decentralized MARL algorithms are motivated. However, existing decentralized methods only handle the fully cooperative setting where massive information needs to be transmitted in training. The block coordinate gradient descent scheme they used for successive independent actor and critic steps can simplify the calculation, but it causes serious bias. In this paper, we propose a flexible fully decentralized actor-critic MARL framework, which can combine most of actor-critic methods, and handle large-scale general cooperative multi-agent setting. A primal-dual hybrid gradient descent type algorithm framework is designed to learn individual agents separately for decentralization. From the perspective of each agent, policy improvement and value evaluation are jointly optimized, which can stabilize multi-agent policy learning. Furthermore, our framework can achieve scalability and stability for large-scale environment and reduce information transmission, by the parameter sharing mechanism and a novel modeling-other-agents methods based on theory-of-mind and online supervised learning. Sufficient experiments in cooperative Multi-agent Particle Environment and StarCraft II show that our decentralized MARL instantiation algorithms perform competitively against conventional centralized and decentralized methods.

Car-following behavior modeling is critical for understanding traffic flow dynamics and developing high-fidelity microscopic simulation models. Most existing impulse-response car-following models prioritize computational efficiency and interpretability by using a parsimonious nonlinear function based on immediate preceding state observations. However, this approach disregards historical information, limiting its ability to explain real-world driving data. Consequently, serially correlated residuals are commonly observed when calibrating these models with actual trajectory data, hindering their ability to capture complex and stochastic phenomena. To address this limitation, we propose a dynamic regression framework incorporating time series models, such as autoregressive processes, to capture error dynamics. This statistically rigorous calibration outperforms the simple assumption of independent errors and enables more accurate simulation and prediction by leveraging higher-order historical information. We validate the effectiveness of our framework using HighD and OpenACC data, demonstrating improved probabilistic simulations. In summary, our framework preserves the parsimonious nature of traditional car-following models while offering enhanced probabilistic simulations.

An algorithm is said to be adaptive to a certain parameter (of the problem) if it does not need a priori knowledge of such a parameter but performs competitively to those that know it. This dissertation presents our work on adaptive algorithms in following scenarios: 1. In the stochastic optimization setting, we only receive stochastic gradients and the level of noise in evaluating them greatly affects the convergence rate. Tuning is typically required when without prior knowledge of the noise scale in order to achieve the optimal rate. Considering this, we designed and analyzed noise-adaptive algorithms that can automatically ensure (near)-optimal rates under different noise scales without knowing it. 2. In training deep neural networks, the scales of gradient magnitudes in each coordinate can scatter across a very wide range unless normalization techniques, like BatchNorm, are employed. In such situations, algorithms not addressing this problem of gradient scales can behave very poorly. To mitigate this, we formally established the advantage of scale-free algorithms that adapt to the gradient scales and presented its real benefits in empirical experiments. 3. Traditional analyses in non-convex optimization typically rely on the smoothness assumption. Yet, this condition does not capture the properties of some deep learning objective functions, including the ones involving Long Short-Term Memory networks and Transformers. Instead, they satisfy a much more relaxed condition, with potentially unbounded smoothness. Under this condition, we show that a generalized SignSGD algorithm can theoretically match the best-known convergence rates obtained by SGD with gradient clipping but does not need explicit clipping at all, and it can empirically match the performance of Adam and beat others. Moreover, it can also be made to automatically adapt to the unknown relaxed smoothness.

While recent studies on semi-supervised learning have shown remarkable progress in leveraging both labeled and unlabeled data, most of them presume a basic setting of the model is randomly initialized. In this work, we consider semi-supervised learning and transfer learning jointly, leading to a more practical and competitive paradigm that can utilize both powerful pre-trained models from source domain as well as labeled/unlabeled data in the target domain. To better exploit the value of both pre-trained weights and unlabeled target examples, we introduce adaptive consistency regularization that consists of two complementary components: Adaptive Knowledge Consistency (AKC) on the examples between the source and target model, and Adaptive Representation Consistency (ARC) on the target model between labeled and unlabeled examples. Examples involved in the consistency regularization are adaptively selected according to their potential contributions to the target task. We conduct extensive experiments on several popular benchmarks including CUB-200-2011, MIT Indoor-67, MURA, by fine-tuning the ImageNet pre-trained ResNet-50 model. Results show that our proposed adaptive consistency regularization outperforms state-of-the-art semi-supervised learning techniques such as Pseudo Label, Mean Teacher, and MixMatch. Moreover, our algorithm is orthogonal to existing methods and thus able to gain additional improvements on top of MixMatch and FixMatch. Our code is available at //github.com/SHI-Labs/Semi-Supervised-Transfer-Learning.

北京阿比特科技有限公司