亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Multi-task learning (MTL) aims to improve estimation and prediction performance by sharing common information among related tasks. One natural assumption in MTL is that tasks are classified into clusters based on their characteristics. However, existing MTL methods based on this assumption often ignore outlier tasks that have large task-specific components or no relation to other tasks. To address this issue, we propose a novel MTL method called Multi-Task Learning via Robust Regularized Clustering (MTLRRC). MTLRRC incorporates robust regularization terms inspired by robust convex clustering, which is further extended to handle non-convex and group-sparse penalties. The extension allows MTLRRC to simultaneously perform robust task clustering and outlier task detection. The connection between the extended robust clustering and the multivariate M-estimator is also established. This provides an interpretation of the robustness of MTLRRC against outlier tasks. An efficient algorithm based on a modified alternating direction method of multipliers is developed for the estimation of the parameters. The effectiveness of MTLRRC is demonstrated through simulation studies and application to real data.

相關內容

This work aims to extend the well-known high-order WENO finite-difference methods for systems of conservation laws to nonconservative hyperbolic systems. The main difficulty of these systems both from the theoretical and the numerical points of view comes from the fact that the definition of weak solution is not unique: according to the theory developed by Dal Maso, LeFloch, and Murat in 1995, it depends on the choice of a family of paths. A general strategy is proposed here in which WENO operators are not only used to reconstruct fluxes but also the nonconservative products of the system. Moreover, if a Roe linearization is available, the nonconservative products can be computed through matrix-vector operations instead of path-integrals. The methods are extended to problems with source terms and two different strategies are introduced to obtain well-balanced schemes. These numerical schemes will be then applied to the two-layer shallow water equations in one- and two- dimensions to obtain high-order methods that preserve water-at-rest steady states.

Machine learning (ML)-based monitoring systems have been extensively developed to enhance the print quality of additive manufacturing (AM). In-situ and in-process data acquired using sensors can be used to train ML models that detect process anomalies, predict part quality, and adjust process parameters. However, the reproducibility of the proposed AM monitoring systems has not been investigated. There has not been a method to evaluate and improve reproducibility in the joint domain of AM and ML. Consequently, some crucial information for reproducing the research is usually missing from the publications; thus, systems reproduced based on the publications often cannot achieve the claimed performance. This paper establishes the definition of reproducibility in this domain, proposes a reproducibility investigation pipeline, and composes a reproducibility checklist. A research is reproducible if a performance comparable to the original research can be obtained when reproduced by a different team using a different experiment setup. The reproducibility investigation pipeline sequentially guides the readers through all the necessary reproduction steps, during which the reproducibility checklist will help extract the reproducibility information from the publication. A case study that reproduced a vision-based warping detection system demonstrated the usage and validated the efficacy of the proposed pipeline and checklist. It has been observed that the reproducibility checklist can help the authors verify that all the information critical to reproducibility is provided in the publications. The investigation pipeline can help identify the missing reproducibility information, which should be acquired from the original authors to achieve the claimed performance.

We study weighted basic parallel processes (WBPP), a nonlinear recursive generalisation of weighted finite automata inspired from process algebra and Petri net theory. Our main result is an algorithm of 2-EXPSPACE complexity for the WBPP equivalence problem. While (unweighted) BPP language equivalence is undecidable, we can use this algorithm to decide multiplicity equivalence of BPP and language equivalence of unambiguous BPP, with the same complexity. These are long-standing open problems for the related model of weighted context-free grammars. Our second contribution is a connection between WBPP, power series solutions of systems of polynomial differential equations, and combinatorial enumeration. To this end we consider constructible differentially finite power series (CDF), a class of multivariate differentially algebraic series introduced by Bergeron and Reutenauer in order to provide a combinatorial interpretation to differential equations. CDF series generalise rational, algebraic, and a large class of D-finite (holonomic) series, for which decidability of equivalence was an open problem. We show that CDF series correspond to commutative WBPP series. As a consequence of our result on WBPP and commutativity, we show that equivalence of CDF power series can be decided with 2-EXPTIME complexity. The complexity analysis is based on effective bounds from algebraic geometry, namely on the length of chains of polynomial ideals constructed by repeated application of finitely many, not necessarily commuting derivations of a multivariate polynomial ring. This is obtained by generalising a result of Novikov and Yakovenko in the case of a single derivation, which is noteworthy since generic bounds on ideal chains are non-primitive recursive in general. On the way, we develop the theory of \WBPP~series and \CDF~power series, exposing several of their appealing properties.

We introduce a fast algorithm for Gaussian process regression in low dimensions, applicable to a widely-used family of non-stationary kernels. The non-stationarity of these kernels is induced by arbitrary spatially-varying vertical and horizontal scales. In particular, any stationary kernel can be accommodated as a special case, and we focus especially on the generalization of the standard Mat\'ern kernel. Our subroutine for kernel matrix-vector multiplications scales almost optimally as $O(N\log N)$, where $N$ is the number of regression points. Like the recently developed equispaced Fourier Gaussian process (EFGP) methodology, which is applicable only to stationary kernels, our approach exploits non-uniform fast Fourier transforms (NUFFTs). We offer a complete analysis controlling the approximation error of our method, and we validate the method's practical performance with numerical experiments. In particular we demonstrate improved scalability compared to to state-of-the-art rank-structured approaches in spatial dimension $d>1$.

The manipulation of deformable linear objects (DLOs) via model-based control requires an accurate and computationally efficient dynamics model. Yet, data-driven DLO dynamics models require large training data sets while their predictions often do not generalize, whereas physics-based models rely on good approximations of physical phenomena and often lack accuracy. To address these challenges, we propose a physics-informed neural ODE capable of predicting agile movements with significantly less data and hyper-parameter tuning. In particular, we model DLOs as serial chains of rigid bodies interconnected by passive elastic joints in which interaction forces are predicted by neural networks. The proposed model accurately predicts the motion of an robotically-actuated aluminium rod and an elastic foam cylinder after being trained on only thirty seconds of data. The project code and data are available at: \url{//tinyurl.com/neuralprba}

Large language models (LLMs) bear promise as a fast and accurate material modeling paradigm for evaluation, analysis, and design. Their vast number of trainable parameters necessitates a wealth of data to achieve accuracy and mitigate overfitting. However, experimental measurements are often limited and costly to obtain in sufficient quantities for finetuning. To this end, we present a physics-based training pipeline that tackles the pathology of data scarcity. The core enabler is a physics-based modeling framework that generates a multitude of synthetic data to align the LLM to a physically consistent initial state before finetuning. Our framework features a two-phase training strategy: (1) utilizing the large-in-amount while less accurate synthetic data for supervised pretraining, and (2) finetuning the phase-1 model with limited experimental data. We empirically demonstrate that supervised pretraining is vital to obtaining accurate finetuned LLMs, via the lens of learning polymer flammability metrics where cone calorimeter data is sparse.

There is an ongoing need for scalable tools to aid researchers in both retrospective and prospective standardization of discrete entity types -- such as disease names, cell types or chemicals -- that are used in metadata associated with biomedical data. When metadata are not well-structured or precise, the associated data are harder to find and are often burdensome to reuse, analyze or integrate with other datasets due to the upfront curation effort required to make the data usable -- typically through retrospective standardization and cleaning of the (meta)data. With the goal of facilitating the task of standardizing metadata -- either in bulk or in a one-by-one fashion; for example, to support auto-completion of biomedical entities in forms -- we have developed an open-source tool called text2term that maps free-text descriptions of biomedical entities to controlled terms in ontologies. The tool is highly configurable and can be used in multiple ways that cater to different users and expertise levels -- it is available on PyPI and can be used programmatically as any Python package; it can also be used via a command-line interface; or via our hosted, graphical user interface-based Web application (//text2term.hms.harvard.edu); or by deploying a local instance of our interactive application using Docker.

This dissertation studies a fundamental open challenge in deep learning theory: why do deep networks generalize well even while being overparameterized, unregularized and fitting the training data to zero error? In the first part of the thesis, we will empirically study how training deep networks via stochastic gradient descent implicitly controls the networks' capacity. Subsequently, to show how this leads to better generalization, we will derive {\em data-dependent} {\em uniform-convergence-based} generalization bounds with improved dependencies on the parameter count. Uniform convergence has in fact been the most widely used tool in deep learning literature, thanks to its simplicity and generality. Given its popularity, in this thesis, we will also take a step back to identify the fundamental limits of uniform convergence as a tool to explain generalization. In particular, we will show that in some example overparameterized settings, {\em any} uniform convergence bound will provide only a vacuous generalization bound. With this realization in mind, in the last part of the thesis, we will change course and introduce an {\em empirical} technique to estimate generalization using unlabeled data. Our technique does not rely on any notion of uniform-convergece-based complexity and is remarkably precise. We will theoretically show why our technique enjoys such precision. We will conclude by discussing how future work could explore novel ways to incorporate distributional assumptions in generalization bounds (such as in the form of unlabeled data) and explore other tools to derive bounds, perhaps by modifying uniform convergence or by developing completely new tools altogether.

We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.

Deep learning is usually described as an experiment-driven field under continuous criticizes of lacking theoretical foundations. This problem has been partially fixed by a large volume of literature which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized in six groups: (1) complexity and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamic systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drives the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns in ethics and security and their relationships with generalizability.

北京阿比特科技有限公司