
This paper focuses on one of the most frequently used multithreading library interfaces - ParallelFor. In this study, it is inferred that ParallelFor's end-to-end latency is noticeably affected by the frequency with which fetch-and-add (FAA) is called during program execution. This can be explained by ParallelFor's uniform semantics and its use of atomic FAA. To test this assumption, a battery of tests was designed and conducted on diverse platforms. From the collected performance statistics and overall trends, several conclusions are drawn and a cost model is proposed to enhance performance by mitigating the influence of FAA.
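To make the mechanism concrete, here is a minimal sketch of a ParallelFor-style scheduler that hands out loop chunks with an atomic fetch-and-add counter. It is illustrative only, not the paper's or any particular library's implementation; the names parallel_for and grain and the thread-pool layout are assumptions.

    // Minimal sketch: dynamic chunking driven by one shared atomic counter.
    // Each claimed chunk costs exactly one fetch-and-add (FAA).
    #include <algorithm>   // std::max, std::min
    #include <atomic>
    #include <cstddef>
    #include <functional>
    #include <thread>
    #include <vector>

    void parallel_for(std::size_t n, std::size_t grain,
                      const std::function<void(std::size_t)>& body) {
        std::atomic<std::size_t> next{0};  // shared iteration counter
        unsigned workers = std::max(1u, std::thread::hardware_concurrency());
        std::vector<std::thread> pool;
        for (unsigned t = 0; t < workers; ++t) {
            pool.emplace_back([&] {
                for (;;) {
                    // One atomic FAA per claimed chunk [begin, end).
                    std::size_t begin = next.fetch_add(grain, std::memory_order_relaxed);
                    if (begin >= n) break;
                    std::size_t end = std::min(begin + grain, n);
                    for (std::size_t i = begin; i < end; ++i) body(i);
                }
            });
        }
        for (auto& th : pool) th.join();
    }

    int main() {
        std::vector<double> v(1 << 20, 1.0);
        parallel_for(v.size(), 4096, [&](std::size_t i) { v[i] *= 2.0; });
    }

Under this layout, a loop of n iterations issues roughly n/grain FAA operations, so the chunk size directly trades scheduling granularity against the FAA frequency that the paper's cost model targets.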

Related Content

In Multiple Instance Learning (MIL), models are trained using bags of instances, where only a single label is provided for each bag. A bag label is often only determined by a handful of key instances within a bag, making it difficult to interpret what information a classifier is using to make decisions. In this work, we establish the key requirements for interpreting MIL models. We then go on to develop several model-agnostic approaches that meet these requirements. Our methods are compared against existing inherently interpretable MIL models on several datasets, and achieve an increase in interpretability accuracy of up to 30%. We also examine the ability of the methods to identify interactions between instances and scale to larger datasets, improving their applicability to real-world problems.
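As a point of reference (this is the standard MIL assumption, not a claim about the specific datasets used here), a bag label is the OR of its instance labels,
\[ Y_{\text{bag}} = \max_{i \in \text{bag}} y_i, \qquad y_i \in \{0,1\}, \]
so a single positive "key instance" suffices to make the whole bag positive, and interpreting a MIL classifier amounts to attributing the bag-level decision back to a few such instances.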

Consider solving large sparse range symmetric singular linear systems $A{\bf x} = {\bf b}$ which arise, for instance, in the discretization of convection diffusion equations with periodic boundary conditions, and of partial differential equations for electromagnetic fields using the edge-based finite element method. In theory, the Generalized Minimal Residual (GMRES) method converges to the least squares solution for inconsistent systems if the coefficient matrix $A$ is range symmetric, i.e., ${\rm R}(A)={\rm R}(A^{\rm T})$, where ${\rm R}(A)$ is the range space of $A$. However, in practice, GMRES may not converge due to numerical instability. In order to improve the convergence, we propose using the pseudo-inverse for the solution of the severely ill-conditioned Hessenberg systems in GMRES. Numerical experiments on semi-definite inconsistent systems indicate that the method is efficient and robust. Finally, we further improve the convergence of the method by reorthogonalizing the Modified Gram-Schmidt procedure.
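As a sketch of where the modification enters (standard Arnoldi notation, which the abstract does not spell out): after $k$ Arnoldi steps GMRES solves the small least squares problem
\[ \min_{\mathbf{y}\in\mathbb{R}^k} \left\| \beta \mathbf{e}_1 - \bar{H}_k \mathbf{y} \right\|_2, \qquad \mathbf{x}_k = \mathbf{x}_0 + V_k \mathbf{y}_k, \]
where $\bar{H}_k$ is the $(k+1)\times k$ Hessenberg matrix, $V_k$ the Krylov basis, and $\beta = \|\mathbf{r}_0\|_2$. The proposal is to take $\mathbf{y}_k = \bar{H}_k^{+}(\beta\mathbf{e}_1)$, the minimum-norm pseudo-inverse solution, instead of back substitution on a nearly singular triangular factor when $\bar{H}_k$ becomes severely ill conditioned.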

Though platform trials have been touted for their flexibility and streamlined use of trial resources, their statistical efficiency is not well understood. We fill this gap by establishing that, for comparing the relative efficacy of multiple interventions, a platform trial is more efficient than several separate two-arm trials, where the relative efficacy of an arbitrary pair of interventions is evaluated by contrasting their relative risks as compared to control. In theoretical and numerical studies, we demonstrate that the inference of such a contrast using data from a platform trial enjoys identical or better precision than using data from separate trials, even when the former enrolls substantially fewer participants. This benefit is attributed to the sharing of controls among interventions under contemporaneous randomization, which is a key feature of platform trials. We further provide a novel procedure for establishing the non-inferiority of a given intervention relative to the most efficacious of the other interventions under evaluation, where this procedure is adaptive in the sense that it need not be \textit{a priori} known which of these other interventions is most efficacious. Our numerical studies show that this testing procedure can attain substantially better power when the data arise from a platform trial rather than multiple separate trials. Our results are illustrated using data from two monoclonal antibody trials for the prevention of HIV.
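A simplified illustration of why shared controls help (two interventions $a$ and $b$ with event risks $p_a$, $p_b$ and control risk $p_0$; the paper's estimands may be defined more generally): the contrast of relative risks is
\[ \log\frac{\mathrm{RR}_a}{\mathrm{RR}_b} = \log\frac{p_a/p_0}{p_b/p_0} = \log p_a - \log p_b . \]
When $\widehat{\mathrm{RR}}_a$ and $\widehat{\mathrm{RR}}_b$ are estimated against the same concurrent control arm, the control-arm sampling error cancels from the estimated contrast, whereas two separate trials contribute the noise of two independent control groups to the same comparison.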

We introduce a novel, probabilistic binary latent variable model to detect noisy or approximate repeats of patterns in sparse binary data. The model is based on the "Noisy-OR model" (Heckerman, 1990), used previously for disease and topic modelling. The model's capability is demonstrated by extracting structure from recordings of retinal neurons, but it can be widely applied to discover and model latent structure in noisy binary data. In the context of spiking neural data, the task is to "explain" the spikes of individual neurons in terms of groups of neurons, "Cell Assemblies" (CAs), that often fire together due to mutual interactions or other causes. The model infers sparse activity in a set of binary latent variables, each describing the activity of a cell assembly. When the latent variable of a cell assembly is active, it reduces the probability that the neurons belonging to this assembly remain inactive. The conditional probability kernels of the latent components are learned from the data in an expectation maximization scheme, alternating between inference of the latent states and adjustments of the model parameters. We thoroughly validate the model on synthesized spike trains constructed to statistically resemble recorded retinal responses to white noise and natural movie stimuli. We also apply our model to spiking responses recorded in retinal ganglion cells (RGCs) during stimulation with a movie and discuss the structure it uncovers.
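For concreteness (standard noisy-OR notation; the symbols here are illustrative rather than taken from the paper), with binary assembly activities $z_k$, a per-neuron leak rate $p_n$, and coupling weights $w_{nk}$, the probability that neuron $n$ spikes in a time bin is
\[ P(x_n = 1 \mid \mathbf{z}) = 1 - (1 - p_n) \prod_{k} (1 - w_{nk})^{z_k}, \]
so every active assembly containing neuron $n$ multiplicatively shrinks the probability of the neuron staying silent, which is the mechanism described above; EM then alternates between inferring $\mathbf{z}$ and updating $p_n$ and $w_{nk}$.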

Spatio-temporal forecasting has numerous applications in analyzing wireless, traffic, and financial networks. Classical statistical models often fall short in handling the complexity and high non-linearity present in time-series data. Recent advances in deep learning allow for better modelling of spatial and temporal dependencies. While most of these models focus on obtaining accurate point forecasts, they do not characterize the prediction uncertainty. In this work, we consider the time-series data as a random realization from a nonlinear state-space model and target Bayesian inference of the hidden states for probabilistic forecasting. We use particle flow as the tool for approximating the posterior distribution of the states, as it is shown to be highly effective in complex, high-dimensional settings. Thorough experimentation on several real-world time-series datasets demonstrates that our approach provides better characterization of uncertainty while maintaining accuracy comparable to state-of-the-art point forecasting methods.
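The state-space view, written generically (the specific transition and observation functions are modelling choices not given in the abstract), treats the observed series $\mathbf{y}_t$ as emissions from hidden states $\mathbf{x}_t$,
\[ \mathbf{x}_t = f(\mathbf{x}_{t-1}) + \mathbf{u}_t, \qquad \mathbf{y}_t = g(\mathbf{x}_t) + \mathbf{v}_t, \]
and probabilistic forecasting then requires the filtering posterior $p(\mathbf{x}_t \mid \mathbf{y}_{1:t})$, which particle flow approximates by migrating particles continuously from the prior toward the posterior rather than reweighting and resampling them as in a standard particle filter.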

The collective attention on online items such as web pages, search terms, and videos reflects trends that are of social, cultural, and economic interest. Moreover, attention trends of different items exhibit mutual influence via mechanisms such as hyperlinks or recommendations. Many visualisation tools exist for time series, network evolution, or network influence; however, few systems connect all three. In this work, we present AttentionFlow, a new system to visualise networks of time series and the dynamic influence they have on one another. Centred around an ego node, our system simultaneously presents the time series on each node using two visual encodings: a tree ring for an overview and a line chart for details. AttentionFlow supports interactions such as overlaying time series of influence and filtering neighbours by time or flux. We demonstrate AttentionFlow using two real-world datasets, VevoMusic and WikiTraffic. We show that attention spikes in songs can be explained by external events such as major awards, or changes in the network such as the release of a new song. Separate case studies also demonstrate how an artist's influence changes over their career, and that correlated Wikipedia traffic is driven by cultural interests. More broadly, AttentionFlow can be generalised to visualise networks of time series on physical infrastructures such as road networks, or natural phenomena such as weather and geological measurements.

Personalized recommender systems are playing an increasingly important role as more content and services become available and users struggle to identify what might interest them. Although matrix factorization and deep learning based methods have proved effective in user preference modeling, they violate the triangle inequality and fail to capture fine-grained preference information. To tackle this, we develop a distance-based recommendation model with several novel aspects: (i) each user and item is parameterized by a Gaussian distribution to capture the learning uncertainties; (ii) an adaptive margin generation scheme is proposed to generate the margins for different training triplets; (iii) explicit user-user/item-item similarity modeling is incorporated in the objective function. The Wasserstein distance is employed to determine preferences because it obeys the triangle inequality and can measure the distance between probability distributions. In comparisons with state-of-the-art methods on five real-world datasets, the proposed model outperforms the best existing models by 4-22% in terms of recall@K for Top-K recommendation.
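For the Gaussian parameterization described above, the squared 2-Wasserstein distance has a closed form (stated here for reference; whether the model restricts to diagonal covariances is an implementation detail the abstract does not specify):
\[ W_2^2\!\left(\mathcal{N}(\boldsymbol{\mu}_1,\Sigma_1),\,\mathcal{N}(\boldsymbol{\mu}_2,\Sigma_2)\right) = \|\boldsymbol{\mu}_1-\boldsymbol{\mu}_2\|_2^2 + \operatorname{tr}\!\left(\Sigma_1+\Sigma_2-2\big(\Sigma_2^{1/2}\Sigma_1\Sigma_2^{1/2}\big)^{1/2}\right), \]
which simplifies to $\|\boldsymbol{\mu}_1-\boldsymbol{\mu}_2\|_2^2 + \|\Sigma_1^{1/2}-\Sigma_2^{1/2}\|_F^2$ for commuting (e.g. diagonal) covariances and, unlike an inner-product score, satisfies the triangle inequality.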

Real-time semantic segmentation plays an important role in practical applications such as self-driving and robots. Most research on semantic segmentation focuses on accuracy with little consideration for efficiency, and several existing studies that emphasize high-speed inference often cannot produce high-accuracy segmentation results. In this paper, we propose a novel convolutional network named Efficient Dense modules with Asymmetric convolution (EDANet), which employs an asymmetric convolution structure incorporating dilated convolution and dense connectivity to attain high efficiency with low computational cost, short inference time, and small model size. Compared to FCN, EDANet is 11 times faster and has 196 times fewer parameters, while achieving a higher mean intersection-over-union (mIoU) score without any additional decoder structure, context module, post-processing scheme, or pretrained model. We evaluate EDANet on the Cityscapes and CamVid datasets and compare it with other state-of-the-art systems. Our network runs on 512x1024 inputs at 108 and 81 frames per second on a single GTX 1080Ti and Titan X, respectively.
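A back-of-the-envelope note on the asymmetric design (generic factorization arithmetic, not EDANet's exact layer counts): replacing a $3\times 3$ convolution between $C$ input and $C$ output channels with a $3\times 1$ followed by a $1\times 3$ convolution cuts the weights per block from
\[ 3\cdot 3\cdot C\cdot C = 9C^2 \quad\text{to}\quad 3\cdot 1\cdot C\cdot C + 1\cdot 3\cdot C\cdot C = 6C^2, \]
roughly a third fewer parameters and multiply-accumulates, while dilating the asymmetric pair enlarges the receptive field at no extra parameter cost.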

Clustering and classification critically rely on distance metrics that provide meaningful comparisons between data points. We present mixed-integer optimization approaches to find optimal distance metrics that generalize the Mahalanobis metric extensively studied in the literature. Additionally, we generalize and improve upon leading methods by removing the reliance on pre-designated "target neighbors," "triplets," and "similarity pairs." Another salient feature of our method is its ability to enable active learning by recommending precise regions to sample after an optimal metric is computed, in order to improve classification performance. This targeted acquisition can significantly reduce the computational burden by ensuring training data completeness, representativeness, and economy. We demonstrate the classification and computational performance of the algorithms through several simple and intuitive examples, followed by results on real image and medical datasets.
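For reference, the Mahalanobis family being generalized is parameterized by a positive semidefinite matrix $M$ (the abstract does not detail the exact generalization, so only the baseline form is shown):
\[ d_M(\mathbf{x},\mathbf{y}) = \sqrt{(\mathbf{x}-\mathbf{y})^{\top} M (\mathbf{x}-\mathbf{y})}, \qquad M \succeq 0, \]
and learning a metric amounts to choosing $M$ (here via mixed-integer optimization) so that same-class points end up close and different-class points end up far apart.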

This paper proposes a new neural architecture for collaborative ranking with implicit feedback. Our model, LRML (\textit{Latent Relational Metric Learning}), is a novel metric learning approach for recommendation. More specifically, instead of simple push-pull mechanisms between user and item pairs, we propose to learn latent relations that describe each user-item interaction. This helps to alleviate the potential geometric inflexibility of existing metric learning approaches, enabling not only better performance but also a greater extent of modeling capability, allowing our model to scale to a larger number of interactions. To do so, we employ an augmented memory module and learn to attend over these memory blocks to construct latent relations. The memory-based attention module is controlled by the user-item interaction, making the learned relation vector specific to each user-item pair. Hence, this can be interpreted as learning an exclusive and optimal relational translation for each user-item interaction. The proposed architecture demonstrates state-of-the-art performance across multiple recommendation benchmarks. LRML outperforms other metric learning models by $6\%-7.5\%$ in terms of Hits@10 and nDCG@10 on large datasets such as Netflix and MovieLens20M. Moreover, qualitative studies also provide evidence that our proposed model is able to infer and encode explicit sentiment, temporal, and attribute information despite being trained only on implicit feedback. This ascertains the ability of LRML to uncover hidden relational structure within implicit datasets.
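In outline (the notation here is illustrative and the exact attention parameterization is a detail of the paper), LRML scores a user embedding $\mathbf{p}$ and item embedding $\mathbf{q}$ through a learned translation vector $\mathbf{r}$ built from memory slots $\mathbf{m}_i$:
\[ a_i = \operatorname{softmax}_i\!\big(\mathbf{k}_i^{\top}(\mathbf{p}\odot\mathbf{q})\big), \qquad \mathbf{r} = \sum_i a_i \mathbf{m}_i, \qquad s(\mathbf{p},\mathbf{q}) = -\,\|\mathbf{p}+\mathbf{r}-\mathbf{q}\|_2^2, \]
so each user-item pair induces its own relational translation instead of a single global push-pull margin.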
