In this paper, we propose a framework to enhance the robustness of neural models by mitigating the effects of process-induced and aging-related variations in analog computing components on the accuracy of analog neural networks. We model these variations as noise affecting the precision of the activations and introduce a denoising block inserted between selected layers of a pre-trained model. We demonstrate that training the denoising block significantly increases the model's robustness against various noise levels. To minimize the overhead of adding these blocks, we present an exploration algorithm that identifies optimal insertion points for them. Additionally, we propose a specialized architecture for executing the denoising blocks efficiently, which can be integrated into mixed-signal accelerators. We evaluate the effectiveness of our approach using Deep Neural Network (DNN) models trained on the ImageNet and CIFAR-10 datasets. The results show that, on average, at the cost of a 2.03% increase in parameter count, the accuracy drop due to these variations is reduced from 31.7% to 1.15%.
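
As a concrete illustration of the idea, the sketch below (PyTorch; not the authors' code, and the block topology, noise model, and insertion point are our assumptions) trains a small residual denoising block placed between two frozen halves of a pre-trained network, with additive Gaussian noise standing in for the analog variations:

```python
import torch
import torch.nn as nn

class DenoisingBlock(nn.Module):
    """Lightweight residual block; the paper's exact topology may differ."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        # Predict a correction to the noisy activation.
        return x + self.body(x)

def train_step(front, block, back, x, y, sigma, loss_fn, opt):
    """One step; `opt` is built over block.parameters() only."""
    with torch.no_grad():
        a = front(x)  # clean activation from the frozen front layers
    a_noisy = a + sigma * torch.randn_like(a)  # analog variations as noise
    logits = back(block(a_noisy))  # denoise, then finish the forward pass
    loss = loss_fn(logits, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```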

In this work, we consider the approximation capabilities of shallow neural networks in weighted Sobolev spaces for functions in the spectral Barron space. The existing literature already covers several cases in which the spectral Barron space can be approximated well, i.e., without the curse of dimensionality, by shallow networks with several different classes of activation functions. The main limitation of the existing results lies in the error measures considered: the results are restricted to Sobolev spaces over a bounded domain. Here we treat two cases that extend the existing results: bounded domains with Muckenhoupt weights, and unbounded domains with weights that are required to decay. We first present embedding results for the more general weighted Fourier-Lebesgue spaces into the weighted Sobolev spaces, and then establish asymptotic approximation rates for shallow neural networks that are free of the curse of dimensionality.
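
For orientation, the following displays recall the standard objects involved, in our own notation; the paper's precise weights, exponents, and norms may differ. The spectral Barron norm integrates the Fourier transform against a polynomial weight, and results of this type bound the shallow-network approximation error in a weighted Sobolev norm at a dimension-free rate:

```latex
% Hedged sketch (notation ours; the paper's weights/exponents may differ).
\[
  \|f\|_{\mathcal{B}^s(\Omega)}
  \;=\; \inf_{f_e|_{\Omega} = f} \int_{\mathbb{R}^d}
  (1 + |\xi|)^s \,\bigl|\widehat{f_e}(\xi)\bigr| \, d\xi,
\]
\[
  \inf_{\text{shallow } f_n,\ n \text{ neurons}}
  \|f - f_n\|_{W^{k,p}(\Omega,\, w)}
  \;\lesssim\; \|f\|_{\mathcal{B}^s(\Omega)}\; n^{-1/2},
\]
% where w is the Muckenhoupt (or decaying) weight and the implied constant
% is independent of the input dimension d.
```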

The utilization of AIoT technology has become a crucial trend in modern poultry management, offering the potential to optimize farming operations and reduce human workloads. This paper presents a real-time, compact, edge-AI-enabled detector designed to identify chickens and their health statuses using frames captured by a lightweight, intelligent camera equipped with an edge-AI-enabled CMOS sensor. To ensure efficient deployment within the memory-constrained edge-AI-enabled CMOS sensor, we employ an FCOS-Lite detector with a MobileNet backbone. To mitigate the reduced accuracy of compact edge-AI detectors without incurring additional inference costs, we propose a gradient weighting loss function as the classification loss and adopt the CIoU loss function as the localization loss. Additionally, we propose a knowledge distillation scheme to transfer valuable information from a large teacher detector to the proposed FCOS-Lite detector, thereby enhancing its performance while preserving a compact model size. Experimental results demonstrate that the proposed edge-AI-enabled detector achieves commendable performance, including a mean average precision (mAP) of 95.1% and an F1-score of 94.2%. Notably, with int8 quantization, the detector can be efficiently deployed on the edge-AI-enabled CMOS sensor and runs at over 20 FPS, meeting the practical demands of automated poultry health monitoring with lightweight intelligent cameras, low power consumption, and minimal bandwidth costs.
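
The CIoU localization loss named above has a standard closed form (Zheng et al., 2020), sketched below in plain Python; the gradient weighting classification loss is paper-specific and not reproduced here:

```python
import math

def ciou_loss(box_p, box_g, eps=1e-9):
    """Boxes as (x1, y1, x2, y2); returns 1 - CIoU."""
    # Intersection over union
    iw = max(0.0, min(box_p[2], box_g[2]) - max(box_p[0], box_g[0]))
    ih = max(0.0, min(box_p[3], box_g[3]) - max(box_p[1], box_g[1]))
    inter = iw * ih
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_g = (box_g[2] - box_g[0]) * (box_g[3] - box_g[1])
    iou = inter / (area_p + area_g - inter + eps)

    # Squared center distance, normalized by the enclosing-box diagonal
    rho2 = ((box_p[0] + box_p[2] - box_g[0] - box_g[2]) ** 2
            + (box_p[1] + box_p[3] - box_g[1] - box_g[3]) ** 2) / 4.0
    cw = max(box_p[2], box_g[2]) - min(box_p[0], box_g[0])
    ch = max(box_p[3], box_g[3]) - min(box_p[1], box_g[1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight
    wp, hp = box_p[2] - box_p[0], box_p[3] - box_p[1]
    wg, hg = box_g[2] - box_g[0], box_g[3] - box_g[1]
    v = (4 / math.pi ** 2) * (math.atan(wg / (hg + eps))
                              - math.atan(wp / (hp + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - (iou - rho2 / c2 - alpha * v)
```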

In this paper, we explore a multi-task semantic communication (SemCom) system for distributed sources, extending the existing focus on collaborative single-task execution. We build on the cooperative multi-task processing introduced in [1], which divides the encoder into a common unit (CU) and multiple specific units (SUs). While earlier studies in multi-task SemCom assumed full observations, we explore a more realistic setting in which only distributed partial observations are available, such as a production line monitored by multiple sensing nodes. To address this, we propose a SemCom system that supports multi-task processing through cooperation on the transmitter side, via a split structure, and collaboration on the receiver side. Our end-to-end data-driven approach is grounded in an information-theoretic formulation with variational approximations. Simulation results demonstrate that the proposed cooperative and collaborative multi-task (CCMT) SemCom system significantly improves task execution accuracy, particularly on complex datasets, provided the channel noise does not dominate task performance. Our findings contribute to a more general SemCom framework capable of handling distributed sources and multiple tasks simultaneously, advancing the applicability of SemCom systems in real-world scenarios.
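
A conceptual PyTorch sketch of the split transmitter follows; one common unit shared across tasks feeds several task-specific units, whose outputs pass through an AWGN channel. Layer sizes and the channel model are our assumptions, not the paper's:

```python
import torch
import torch.nn as nn

class SplitEncoder(nn.Module):
    """CU/SU split in the style of [1]: cooperate via a shared common unit."""
    def __init__(self, obs_dim, common_dim, latent_dim, num_tasks):
        super().__init__()
        self.cu = nn.Sequential(nn.Linear(obs_dim, common_dim), nn.ReLU())
        self.sus = nn.ModuleList(
            nn.Linear(common_dim, latent_dim) for _ in range(num_tasks)
        )

    def forward(self, obs, snr_db=10.0):
        shared = self.cu(obs)  # common representation from a partial observation
        outputs = []
        for su in self.sus:
            z = su(shared)
            # AWGN channel: noise power set by the target SNR
            noise_std = z.pow(2).mean().sqrt() * 10 ** (-snr_db / 20)
            outputs.append(z + noise_std * torch.randn_like(z))
        return outputs  # one noisy task-specific codeword per task
```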

In this work, we study the generalizability of diffusion models by looking into the hidden properties of the learned score functions, which are essentially a series of deep denoisers trained on various noise levels. We observe that as diffusion models transition from memorization to generalization, their corresponding nonlinear diffusion denoisers exhibit increasing linearity. This discovery leads us to investigate the linear counterparts of the nonlinear diffusion models, which are a series of linear models trained to match the function mappings of the nonlinear diffusion denoisers. Surprisingly, these linear denoisers are approximately the optimal denoisers for a multivariate Gaussian distribution characterized by the empirical mean and covariance of the training dataset. This finding implies that diffusion models have an inductive bias toward capturing and utilizing the Gaussian structure (covariance information) of the training dataset for data generation. We empirically demonstrate that this inductive bias is a unique property of diffusion models in the generalization regime, and that it becomes increasingly evident when the model's capacity is relatively small compared to the size of the training dataset. When the model is highly overparameterized, this inductive bias emerges during the initial training phases, before the model fully memorizes its training data. Our study provides crucial insights into understanding the notably strong generalization phenomenon recently observed in real-world diffusion models.
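
The reference object in this comparison has a closed form: when the data are exactly Gaussian N(mu, Sigma) and the observation is x0 plus isotropic noise, the MMSE denoiser is the Gaussian posterior mean. The NumPy sketch below computes it with the empirical mean and covariance of a (synthetic, illustrative) training set standing in for mu and Sigma:

```python
import numpy as np

def gaussian_mmse_denoiser(x_t, mu, Sigma, sigma):
    """Posterior mean E[x0 | x_t] = mu + Sigma (Sigma + sigma^2 I)^{-1} (x_t - mu)."""
    d = mu.shape[0]
    gain = Sigma @ np.linalg.inv(Sigma + sigma**2 * np.eye(d))
    return mu + gain @ (x_t - mu)

# Synthetic data for illustration only
rng = np.random.default_rng(0)
train = rng.standard_normal((1000, 8)) @ rng.standard_normal((8, 8))
mu, Sigma = train.mean(axis=0), np.cov(train, rowvar=False)
sigma = 0.5
x_t = train[0] + sigma * rng.standard_normal(8)  # noisy observation of a sample
print(gaussian_mmse_denoiser(x_t, mu, Sigma, sigma))
```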

Motivated by studies investigating causal effects in survival analysis, we propose a transformation model to quantify the impact of a binary treatment on a time-to-event outcome. The approach is based on a flexible linear transformation structural model that links a monotone function of the time-to-event with the propensity for treatment through a bivariate Gaussian distribution. The model equations are specified as functions of additive predictors, allowing the impacts of observed confounders to be accounted for flexibly. Furthermore, the effect of the instrumental variable may be regularized through a ridge penalty, while interactions between the treatment and modifier variables can be incorporated into the model to acknowledge potential variations in treatment effects across different subgroups. The baseline survival function is estimated in a flexible manner using monotonic P-splines, while unobserved confounding is captured through the dependence parameter of the bivariate Gaussian. The proposal naturally provides an intuitive causal measure of interest, the survival average treatment effect. Parameter estimation is achieved via a computationally efficient and stable penalized maximum likelihood approach, with intervals constructed using the related inferential results. We revisit a dataset from the Illinois Reemployment Bonus Experiment to estimate the causal effect of a cash bonus on unemployment duration, unveiling new insights. The modeling framework is incorporated into the R package GJRM, enabling researchers and practitioners to fit the proposed causal survival model and obtain easy-to-interpret numerical and visual summaries.
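
One plausible way to write the structural equations is sketched below; the notation is ours and the paper's exact parameterization may differ:

```latex
% Hedged schematic: a monotone transform of the event time T and a latent
% propensity for the binary treatment A, linked through a bivariate Gaussian
% whose correlation \rho absorbs unobserved confounding.
\begin{align*}
  h(T) &= \beta A + s_T(\mathbf{x}) + \varepsilon_T, \\
  A &= \mathbf{1}\{\gamma Z + s_A(\mathbf{x}) + \varepsilon_A > 0\}, \\
  (\varepsilon_T, \varepsilon_A)^\top &\sim \mathcal{N}_2\!\left(\mathbf{0},
    \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right),
\end{align*}
% with h monotone (modeled by P-splines), Z the ridge-penalized instrument,
% and s_T, s_A additive predictors in observed confounders \mathbf{x}.
% The survival average treatment effect is then
% \mathrm{SATE}(t) = \mathbb{E}[S_1(t)] - \mathbb{E}[S_0(t)].
```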

In this study, we introduce FEET, a standardized protocol designed to guide the development and benchmarking of foundation models. While numerous benchmark datasets exist for evaluating these models, we propose a structured evaluation protocol across three distinct scenarios to gain a comprehensive understanding of their practical performance. We define three primary use cases: frozen embeddings, few-shot embeddings, and fully fine-tuned embeddings. Each scenario is detailed and illustrated through two case studies: one in sentiment analysis and another in the medical domain, demonstrating how these evaluations provide a thorough assessment of foundation models' effectiveness in research applications. We recommend this protocol as a standard for future research aimed at advancing representation learning models.
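
A hypothetical sketch of the three FEET scenarios on a single classification dataset follows; `model.embed`, `model.fine_tune`, and `tuned.evaluate` are placeholder names for illustration, not the paper's API:

```python
from sklearn.linear_model import LogisticRegression

def sample_k_per_class(X, y, k):
    """Take the first k examples of each class (few-shot split)."""
    keep = [i for c in sorted(set(y))
            for i in [j for j, t in enumerate(y) if t == c][:k]]
    return [X[i] for i in keep], [y[i] for i in keep]

def feet_eval(model, X_train, y_train, X_test, y_test, k=16):
    results = {}

    # 1) Frozen embeddings: encoder fixed, lightweight probe on top.
    probe = LogisticRegression(max_iter=1000)
    probe.fit(model.embed(X_train), y_train)
    results["frozen"] = probe.score(model.embed(X_test), y_test)

    # 2) Few-shot embeddings: same frozen encoder, k labeled examples per class.
    Xk, yk = sample_k_per_class(X_train, y_train, k)
    probe = LogisticRegression(max_iter=1000)
    probe.fit(model.embed(Xk), yk)
    results["few_shot"] = probe.score(model.embed(X_test), y_test)

    # 3) Fully fine-tuned: all encoder weights updated end-to-end.
    tuned = model.fine_tune(X_train, y_train)  # placeholder method
    results["fine_tuned"] = tuned.evaluate(X_test, y_test)
    return results
```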

In this article, we explore the use of the equivariance criterion for estimation in the fixed-X linear model and extend the model to allow multiple populations, which, in turn, leads to a larger transformation group. The minimum risk equivariant estimators of the coefficient vector and the covariance matrix are derived via maximal invariants, consistent with earlier work. This article serves as an early exploration of the equivariance criterion in linear models.
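
For the single-population case, the setup and the equivariance requirement take the following standard form (notation ours; the paper's multiple-population extension enlarges the group):

```latex
\[
  Y = X\beta + \varepsilon, \qquad \varepsilon \sim \mathcal{N}_n(0,\, \sigma^2 I_n).
\]
% Under the group acting by Y \mapsto aY + Xb with a > 0, b \in \mathbb{R}^p,
% an estimator \delta of \beta is equivariant when
\[
  \delta(aY + Xb) \;=\; a\,\delta(Y) + b \qquad \text{for all } a > 0,\ b \in \mathbb{R}^p,
\]
% and the minimum risk equivariant (MRE) estimator attains the smallest
% (necessarily constant) risk in this class, identified via maximal invariants.
```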

In this paper, we discuss the second-order finite element method (FEM) and finite difference method (FDM) for numerically solving elliptic cross-interface problems characterized by vertical and horizontal straight lines, piecewise constant coefficients, two homogeneous jump conditions, continuous source terms, and Dirichlet boundary conditions. For brevity, we consider a 2D simplified version where the intersection points of the interface lines coincide with grid points in uniform Cartesian grids. Our findings reveal interesting and important results: (1) When the coefficient functions exhibit either high jumps with low-frequency oscillations or low jumps with high-frequency oscillations, the finite element method and finite difference method yield similar numerical solutions. (2) However, when the interface problems involve high-contrast and high-frequency coefficient functions, the numerical solutions obtained from the finite element and finite difference methods differ significantly. Given that the widely studied SPE10 benchmark problem (see //www.spe.org/web/csp/datasets/set02.htm) typically involves high-contrast and high-frequency permeability due to varying geological layers in porous media, this phenomenon warrants attention. Furthermore, this observation is particularly important for developing multiscale methods, as reference solutions for these methods are usually obtained using the standard second-order finite element method with a fine mesh, and analytical solutions are not available. We provide sufficient details to enable replication of our numerical results, and the implementation is straightforward. This simplicity ensures that readers can easily confirm the validity of our findings.
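
As an illustrative baseline (not the paper's code), the sketch below assembles a second-order finite difference discretization of -div(a grad u) = f on the unit square with homogeneous Dirichlet boundary conditions; face coefficients use harmonic means, a common choice near coefficient jumps, and the example coefficient is of the high-contrast, high-frequency kind the paper flags:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def solve_fdm(a, f, n):
    """a, f: callables on (x, y); n: interior grid points per direction."""
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1 - h, n)
    idx = lambda i, j: i * n + j
    A = sp.lil_matrix((n * n, n * n))
    b = np.empty(n * n)
    harm = lambda p, q: 2 * p * q / (p + q)  # harmonic mean across a cell face
    for i in range(n):
        for j in range(n):
            xc, yc = x[i], x[j]
            aE = harm(a(xc, yc), a(xc + h, yc))
            aW = harm(a(xc, yc), a(xc - h, yc))
            aN = harm(a(xc, yc), a(xc, yc + h))
            aS = harm(a(xc, yc), a(xc, yc - h))
            k = idx(i, j)
            A[k, k] = (aE + aW + aN + aS) / h**2
            if i + 1 < n: A[k, idx(i + 1, j)] = -aE / h**2
            if i > 0:     A[k, idx(i - 1, j)] = -aW / h**2
            if j + 1 < n: A[k, idx(i, j + 1)] = -aN / h**2
            if j > 0:     A[k, idx(i, j - 1)] = -aS / h**2
            b[k] = f(xc, yc)
    return spla.spsolve(A.tocsr(), b).reshape(n, n)

# High-contrast, high-frequency checkerboard coefficient (illustrative)
a = lambda x, y: 1.0 if (int(8 * x) + int(8 * y)) % 2 == 0 else 1e4
u = solve_fdm(a, lambda x, y: 1.0, n=63)
```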

Molecular design and synthesis planning are two critical steps in the process of molecular discovery that we propose to formulate as a single shared task of conditional synthetic pathway generation. We report an amortized approach to generate synthetic pathways as a Markov decision process conditioned on a target molecular embedding. This approach allows us to conduct synthesis planning in a bottom-up manner and design synthesizable molecules by decoding from optimized conditional codes, demonstrating the potential to solve both problems of design and synthesis simultaneously. The approach leverages neural networks to probabilistically model the synthetic trees, one reaction step at a time, according to reactivity rules encoded in a discrete action space of reaction templates. We train these networks on hundreds of thousands of artificial pathways generated from a pool of purchasable compounds and a list of expert-curated templates. We validate our method with (a) the recovery of molecules using conditional generation, (b) the identification of synthesizable structural analogs, and (c) the optimization of molecular structures given oracle functions relevant to drug discovery.
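
The MDP control flow can be sketched as follows; every component here (`policy`, `apply_template`, the `action` fields) is a placeholder standing in for the paper's learned networks and reaction-template chemistry:

```python
def generate_pathway(target_embedding, policy, apply_template, max_steps=10):
    """Grow a synthetic tree bottom-up, one reaction step at a time."""
    pathway = []     # list of (template, product) steps taken so far
    molecule = None  # current root molecule of the partial tree
    for _ in range(max_steps):
        # The policy scores the discrete action space of reaction templates,
        # conditioned on the partial tree and the target embedding.
        action = policy.sample_action(pathway, molecule, target_embedding)
        if action.is_end():  # learned stop action terminates the episode
            break
        molecule = apply_template(action.template, action.reactants)
        pathway.append((action.template, molecule))
    return pathway, molecule
```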

In this paper, we propose jointly learning attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on either model exist (e.g., for image captioning), training such network architectures typically requires pre-defined label sequences. For multi-label classification, it would be desirable to have a robust inference process so that prediction errors do not propagate and degrade performance. Our proposed model uniquely integrates attention and Long Short-Term Memory (LSTM) models, which not only addresses the above problem but also allows one to identify visual objects of interest of varying sizes without prior knowledge of a particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, our proposed network model achieves efficient prediction of multiple labels.
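
A simplified version of beam search over label sequences is sketched below; the `step_logprob` scoring function stands in for the attention-LSTM decoder's next-label distribution, and this is not the paper's implementation:

```python
def beam_search_labels(step_logprob, beam_width=3, max_len=5, end=0):
    """step_logprob(seq) -> log-probabilities over the next label, where
    index `end` is the stop symbol; returns the highest-scoring label set."""
    beams = [([], 0.0)]  # (label sequence so far, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for label, lp in enumerate(step_logprob(seq)):
                if label == end:
                    finished.append((seq, score + lp))
                elif label not in seq:  # labels form a set: no repeats
                    candidates.append((seq + [label], score + lp))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if not beams:
            break
    finished.extend(beams)
    return max(finished, key=lambda c: c[1])[0]
```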
