亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Diffusion model has become a main paradigm for synthetic data generation in many subfields of modern machine learning, including computer vision, language model, or speech synthesis. In this paper, we leverage the power of diffusion model for generating synthetic tabular data. The heterogeneous features in tabular data have been main obstacles in tabular data synthesis, and we tackle this problem by employing the auto-encoder architecture. When compared with the state-of-the-art tabular synthesizers, the resulting synthetic tables from our model show nice statistical fidelities to the real data, and perform well in downstream tasks for machine learning utilities. We conducted the experiments over 15 publicly available datasets. Notably, our model adeptly captures the correlations among features, which has been a long-standing challenge in tabular data synthesis. Our code is available upon request and will be publicly released if paper is accepted.

相關內容

ACM/IEEE第23屆模型驅動工程語言和系統國際會議,是模型驅動軟件和系統工程的首要會議系列,由ACM-SIGSOFT和IEEE-TCSE支持組織。自1998年以來,模型涵蓋了建模的各個方面,從語言和方法到工具和應用程序。模特的參加者來自不同的背景,包括研究人員、學者、工程師和工業專業人士。MODELS 2019是一個論壇,參與者可以圍繞建模和模型驅動的軟件和系統交流前沿研究成果和創新實踐經驗。今年的版本將為建模社區提供進一步推進建模基礎的機會,并在網絡物理系統、嵌入式系統、社會技術系統、云計算、大數據、機器學習、安全、開源等新興領域提出建模的創新應用以及可持續性。 官網鏈接: · Machine Learning · 模型評估 · Learning · Weight ·
2023 年 12 月 11 日

Density functional theory (DFT) stands as a cornerstone method in computational quantum chemistry and materials science due to its remarkable versatility and scalability. Yet, it suffers from limitations in accuracy, particularly when dealing with strongly correlated systems. To address these shortcomings, recent work has begun to explore how machine learning can expand the capabilities of DFT; an endeavor with many open questions and technical challenges. In this work, we present Grad DFT: a fully differentiable JAX-based DFT library, enabling quick prototyping and experimentation with machine learning-enhanced exchange-correlation energy functionals. Grad DFT employs a pioneering parametrization of exchange-correlation functionals constructed using a weighted sum of energy densities, where the weights are determined using neural networks. Moreover, Grad DFT encompasses a comprehensive suite of auxiliary functions, notably featuring a just-in-time compilable and fully differentiable self-consistent iterative procedure. To support training and benchmarking efforts, we additionally compile a curated dataset of experimental dissociation energies of dimers, half of which contain transition metal atoms characterized by strong electronic correlations. The software library is tested against experimental results to study the generalization capabilities of a neural functional across potential energy surfaces and atomic species, as well as the effect of training data noise on the resulting model accuracy.

The joint modeling of multiple longitudinal biomarkers together with a time-to-event outcome is a challenging modeling task of continued scientific interest. In particular, the computational complexity of high dimensional (generalized) mixed effects models often restricts the flexibility of shared parameter joint models, even when the subject-specific marker trajectories follow highly nonlinear courses. We propose a parsimonious multivariate functional principal components representation of the shared random effects. This allows better scalability, as the dimension of the random effects does not directly increase with the number of markers, only with the chosen number of principal component basis functions used in the approximation of the random effects. The functional principal component representation additionally allows to estimate highly flexible subject-specific random trajectories without parametric assumptions. The modeled trajectories can thus be distinctly different for each biomarker. We build on the framework of flexible Bayesian additive joint models implemented in the R-package 'bamlss', which also supports estimation of nonlinear covariate effects via Bayesian P-splines. The flexible yet parsimonious functional principal components basis used in the estimation of the joint model is first estimated in a preliminary step. We validate our approach in a simulation study and illustrate its advantages by analyzing a study on primary biliary cholangitis.

This study examines the varying coefficient model in tail index regression. The varying coefficient model is an efficient semiparametric model that avoids the curse of dimensionality when including large covariates in the model. In fact, the varying coefficient model is useful in mean, quantile, and other regressions. The tail index regression is not an exception. However, the varying coefficient model is flexible, but leaner and simpler models are preferred for applications. Therefore, it is important to evaluate whether the estimated coefficient function varies significantly with covariates. If the effect of the non-linearity of the model is weak, the varying coefficient structure is reduced to a simpler model, such as a constant or zero. Accordingly, the hypothesis test for model assessment in the varying coefficient model has been discussed in mean and quantile regression. However, there are no results in tail index regression. In this study, we investigate the asymptotic properties of an estimator and provide a hypothesis testing method for varying coefficient models for tail index regression.

Normal modal logics extending the logic K4.3 of linear transitive frames are known to lack the Craig interpolation property, except some logics of bounded depth such as S5. We turn this `negative' fact into a research question and pursue a non-uniform approach to Craig interpolation by investigating the following interpolant existence problem: decide whether there exists a Craig interpolant between two given formulas in any fixed logic above K4.3. Using a bisimulation-based characterisation of interpolant existence for descriptive frames, we show that this problem is decidable and coNP-complete for all finitely axiomatisable normal modal logics containing K4.3. It is thus not harder than entailment in these logics, which is in sharp contrast to other recent non-uniform interpolation results. We also extend our approach to Priorean temporal logics (with both past and future modalities) over the standard time flows-the integers, rationals, reals, and finite strict linear orders-none of which is blessed with the Craig interpolation property.

With the rising concern on model interpretability, the application of eXplainable AI (XAI) tools on deepfake detection models has been a topic of interest recently. In image classification tasks, XAI tools highlight pixels influencing the decision given by a model. This helps in troubleshooting the model and determining areas that may require further tuning of parameters. With a wide range of tools available in the market, choosing the right tool for a model becomes necessary as each one may highlight different sets of pixels for a given image. There is a need to evaluate different tools and decide the best performing ones among them. Generic XAI evaluation methods like insertion or removal of salient pixels/segments are applicable for general image classification tasks but may produce less meaningful results when applied on deepfake detection models due to their functionality. In this paper, we perform experiments to show that generic removal/insertion XAI evaluation methods are not suitable for deepfake detection models. We also propose and implement an XAI evaluation approach specifically suited for deepfake detection models.

We introduce a general framework for active learning in regression problems. Our framework extends the standard setup by allowing for general types of data, rather than merely pointwise samples of the target function. This generalization covers many cases of practical interest, such as data acquired in transform domains (e.g., Fourier data), vector-valued data (e.g., gradient-augmented data), data acquired along continuous curves, and, multimodal data (i.e., combinations of different types of measurements). Our framework considers random sampling according to a finite number of sampling measures and arbitrary nonlinear approximation spaces (model classes). We introduce the concept of generalized Christoffel functions and show how these can be used to optimize the sampling measures. We prove that this leads to near-optimal sample complexity in various important cases. This paper focuses on applications in scientific computing, where active learning is often desirable, since it is usually expensive to generate data. We demonstrate the efficacy of our framework for gradient-augmented learning with polynomials, Magnetic Resonance Imaging (MRI) using generative models and adaptive sampling for solving PDEs using Physics-Informed Neural Networks (PINNs).

Distributed quantum computing, particularly distributed quantum machine learning, has gained substantial prominence for its capacity to harness the collective power of distributed quantum resources, transcending the limitations of individual quantum nodes. Meanwhile, the critical concern of privacy within distributed computing protocols remains a significant challenge, particularly in standard classical federated learning (FL) scenarios where data of participating clients is susceptible to leakage via gradient inversion attacks by the server. This paper presents innovative quantum protocols with quantum communication designed to address the FL problem, strengthen privacy measures, and optimize communication efficiency. In contrast to previous works that leverage expressive variational quantum circuits or differential privacy techniques, we consider gradient information concealment using quantum states and propose two distinct FL protocols, one based on private inner-product estimation and the other on incremental learning. These protocols offer substantial advancements in privacy preservation with low communication resources, forging a path toward efficient quantum communication-assisted FL protocols and contributing to the development of secure distributed quantum machine learning, thus addressing critical privacy concerns in the quantum computing era.

We present a novel clustering algorithm, visClust, that is based on lower dimensional data representations and visual interpretation. Thereto, we design a transformation that allows the data to be represented by a binary integer array enabling the use of image processing methods to select a partition. Qualitative and quantitative analyses measured in accuracy and an adjusted Rand-Index show that the algorithm performs well while requiring low runtime and RAM. We compare the results to 6 state-of-the-art algorithms with available code, confirming the quality of visClust by superior performance in most experiments. Moreover, the algorithm asks for just one obligatory input parameter while allowing optimization via optional parameters. The code is made available on GitHub and straightforward to use.

PDDSparse is a new hybrid parallelisation scheme for solving large-scale elliptic boundary value problems on supercomputers, which can be described as a Feynman-Kac formula for domain decomposition. At its core lies a stochastic linear, sparse system for the solutions on the interfaces, whose entries are generated via Monte Carlo simulations. Assuming small statistical errors, we show that the random system matrix ${\tilde G}(\omega)$ is near a nonsingular M-matrix $G$, i.e. ${\tilde G}(\omega)+E=G$ where $||E||/||G||$ is small. Using nonstandard arguments, we bound $||G^{-1}||$ and the condition number of $G$, showing that both of them grow moderately with the degrees of freedom of the discretisation. Moreover, the truncated Neumann series of $G^{-1}$ -- which is straightforward to calculate -- is the basis for an excellent preconditioner for ${\tilde G}(\omega)$. These findings are supported by numerical evidence.

We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.

北京阿比特科技有限公司