
Single particle cryo-electron microscopy has become a critical tool in structural biology over the last decade, able to achieve atomic-scale resolution in three-dimensional models from hundreds of thousands of (noisy) two-dimensional projection views of particles frozen at unknown orientations. This is accomplished by using a suite of software tools to (i) identify particles in large micrographs, (ii) obtain low-resolution reconstructions, (iii) refine those low-resolution structures, and (iv) finally match the obtained electron scattering density to the constituent atoms that make up the macromolecule or macromolecular complex of interest. Here, we focus on the second stage of the reconstruction pipeline: obtaining a low-resolution model from picked particle images. Our goal is to create an algorithm that is capable of ab initio reconstruction from small data sets (on the order of a few thousand selected particles). More precisely, we seek an algorithm that is robust, automatic, able to assess particle quality, and fast enough that it can potentially be used to assist in the assessment of the data being generated while the microscopy experiment is still underway.

Related content

Robustness


MoDELS · FAST · Task-oriented dialogue systems · Learning · state-of-the-art

October 26, 2022

Fast Yet Effective Speech Emotion Recognition with Self-distillation

Zhao Ren,Thanh Tam Nguyen,Yi Chang,Björn W. Schuller

from arxiv, Submitted to ICASSP 2023

Speech emotion recognition (SER) is the task of recognising human emotional states from speech. SER is extremely prevalent in helping dialogue systems to truly understand our emotions and become a trustworthy human conversational partner. Due to the lengthy nature of speech, SER also suffers from the lack of abundant labelled data for powerful models like deep neural networks. Pre-trained complex models on large-scale speech datasets have been successfully applied to SER via transfer learning. However, fine-tuning complex models still requires large memory space and results in low inference efficiency. In this paper, we argue that achieving a fast yet effective SER is possible with self-distillation, a method of simultaneously fine-tuning a pretrained model and training shallower versions of itself. The benefits of our self-distillation framework are threefold: (1) the adoption of the self-distillation method upon the acoustic modality breaks through the limited ground truth of speech data, and outperforms the existing models' performance on an SER dataset; (2) executing powerful models at different depths can achieve adaptive accuracy-efficiency trade-offs on resource-limited edge devices; (3) a new fine-tuning process rather than training from scratch for self-distillation leads to faster learning time and state-of-the-art accuracy on data with small quantities of label information.
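To make the idea of self-distillation concrete, here is a minimal sketch of one common form of the loss, assuming that each shallower exit is trained with cross-entropy against the label plus a KL term that pulls it toward the deepest exit's softened output. The function names, the weight `alpha`, and the `temperature` are illustrative assumptions and are not taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Negative log-probability of the true class."""
    return -math.log(softmax(logits)[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def self_distillation_loss(exit_logits, label, alpha=0.5, temperature=2.0):
    """exit_logits: logit vectors of the exits, ordered shallow -> deep.

    Assumption: the deepest exit acts as the teacher for all shallower
    exits (in a real framework its output would be treated as a constant
    when differentiating the KL term).
    """
    teacher = softmax(exit_logits[-1], temperature)
    loss = cross_entropy(exit_logits[-1], label)
    for logits in exit_logits[:-1]:
        student = softmax(logits, temperature)
        loss += (1 - alpha) * cross_entropy(logits, label)
        loss += alpha * kl_divergence(teacher, student)
    return loss
```

At inference time, any of the exits can then be used on its own, which is what allows the accuracy-efficiency trade-off mentioned in point (2).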

Scenario · MoDELS · Functional · Model evaluation · Sparse

October 25, 2022

Exploring the Whole Rashomon Set of Sparse Decision Trees

Rui Xin,Chudi Zhong,Zhi Chen,Takuya Takagi,Margo Seltzer,Cynthia Rudin

from arxiv, NeurIPS 2022 (Oral)

In any given machine learning problem, there may be many models that could explain the data almost equally well. However, most learning algorithms return only one of these models, leaving practitioners with no practical way to explore alternative models that might have desirable properties beyond what could be expressed within a loss function. The Rashomon set is the set of all these almost-optimal models. Rashomon sets can be extremely complicated, particularly for highly nonlinear function classes that allow complex interaction terms, such as decision trees. We provide the first technique for completely enumerating the Rashomon set for sparse decision trees; in fact, our work provides the first complete enumeration of any Rashomon set for a non-trivial problem with a highly nonlinear discrete function class. This allows the user an unprecedented level of control over model choice among all models that are approximately equally good. We represent the Rashomon set in a specialized data structure that supports efficient querying and sampling. We show three applications of the Rashomon set: 1) it can be used to study variable importance for the set of almost-optimal trees (as opposed to a single tree), 2) the Rashomon set for accuracy enables enumeration of the Rashomon sets for balanced accuracy and F1-score, and 3) the Rashomon set for a full dataset can be used to produce Rashomon sets constructed with only subsets of the data set. Thus, we are able to examine Rashomon sets across problems with a new lens, enabling users to choose models rather than be at the mercy of an algorithm that produces only a single model.
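As a toy illustration of what a Rashomon set is, the sketch below brute-forces it for a tiny model class: constant predictors and single-split stumps over binary features. It assumes a regularized objective (error rate plus a per-leaf penalty `lam`) and a multiplicative threshold `(1 + epsilon)` times the best objective; the data, names, and parameter values are made up for illustration, and this exhaustive search is not the paper's enumeration technique.

```python
def enumerate_candidates(n_features):
    """All constant predictors and single-split stumps, with leaf counts."""
    candidates = {
        "always_0": (lambda x: 0, 1),
        "always_1": (lambda x: 1, 1),
    }
    for j in range(n_features):
        candidates[f"x{j}"] = (lambda x, j=j: x[j], 2)
        candidates[f"not_x{j}"] = (lambda x, j=j: 1 - x[j], 2)
    return candidates

def objective(predict, n_leaves, X, y, lam):
    """Misclassification rate plus a sparsity penalty per leaf."""
    errors = sum(predict(x) != label for x, label in zip(X, y))
    return errors / len(y) + lam * n_leaves

def rashomon_set(X, y, lam=0.01, epsilon=1.0):
    """Models whose objective is within (1 + epsilon) of the best one."""
    scores = {
        name: objective(predict, leaves, X, y, lam)
        for name, (predict, leaves) in enumerate_candidates(len(X[0])).items()
    }
    best = min(scores.values())
    return {name: s for name, s in scores.items() if s <= (1 + epsilon) * best}

# Hypothetical toy data: three binary features, six samples.
X = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 1), (1, 1, 0), (0, 0, 0)]
y = [0, 0, 1, 1, 0, 0]
print(sorted(rashomon_set(X, y)))
```

With `epsilon=0` the set collapses to the single best model; larger values admit more alternatives for the user to choose among.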

INFORMS · binary · MoDELS · Better · Output

October 25, 2022

Structure-Unified M-Tree Coding Solver for Math Word Problem

Bin Wang,Jiangzhou Ju,Yang Fan,Xinyu Dai,Shujian Huang,Jiajun Chen

from arxiv, Accepted by EMNLP2022

As one of the challenging NLP tasks, designing math word problem (MWP) solvers has attracted increasing research attention over the past few years. In previous work, models designed by taking into account the properties of the binary tree structure of mathematical expressions at the output side have achieved better performance. However, the expressions corresponding to an MWP are often diverse (e.g., $n_1+n_2 \times n_3-n_4$, $n_3\times n_2-n_4+n_1$, etc.), and so are the corresponding binary trees, which creates difficulties in model learning due to the non-deterministic output space. In this paper, we propose the Structure-Unified M-Tree Coding Solver (SUMC-Solver), which applies a tree with any M branches (M-tree) to unify the output structures. To learn the M-tree, we use a mapping to convert the M-tree into the M-tree codes, where codes store the information of the paths from tree root to leaf nodes and the information of leaf nodes themselves, and then devise a Sequence-to-Code (seq2code) model to generate the codes. Experimental results on the widely used MAWPS and Math23K datasets have demonstrated that SUMC-Solver not only outperforms several state-of-the-art models under similar experimental settings but also performs much better under low-resource conditions.
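The sketch below illustrates why a tree with arbitrarily many children can unify equivalent expressions: chains of `+`/`-` and of `*` are flattened into a single node, subtraction is turned into a negated child, and children are put into a canonical order. The tuple encoding, the `"neg"` wrapper, and the function names are assumptions made for this illustration; they are not the paper's M-tree node types or its code mapping, and division is left out.

```python
def to_mtree(expr):
    """Convert a binary expression tree (op, left, right) to a canonical n-ary tree."""
    if isinstance(expr, str):  # leaf such as "n1"
        return expr
    if expr[0] == "*":
        children = product_factors(expr)
        return ("*", tuple(sorted(children, key=repr)))
    children = sum_terms(expr, sign=1)  # "+" or "-"
    return ("+", tuple(sorted(children, key=repr)))

def product_factors(expr):
    """Collect the operands of a chain of multiplications."""
    if not isinstance(expr, str) and expr[0] == "*":
        return product_factors(expr[1]) + product_factors(expr[2])
    return [to_mtree(expr)]

def sum_terms(expr, sign):
    """Collect the signed operands of a chain of additions and subtractions."""
    if not isinstance(expr, str) and expr[0] in ("+", "-"):
        op, left, right = expr
        right_sign = sign if op == "+" else -sign
        return sum_terms(left, sign) + sum_terms(right, right_sign)
    term = to_mtree(expr)
    return [term if sign == 1 else ("neg", term)]

# n1 + n2 * n3 - n4  and  n3 * n2 - n4 + n1 as binary trees
e1 = ("-", ("+", "n1", ("*", "n2", "n3")), "n4")
e2 = ("+", ("-", ("*", "n3", "n2"), "n4"), "n1")
print(to_mtree(e1) == to_mtree(e2))  # True
```

Both binary trees collapse to the same structure, so a model trained on it sees one target per problem instead of several equivalent ones.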

binary · MoDELS · Inference · Bayesian inference · Multivariate normal distribution

October 24, 2022

Bayesian inference on high-dimensional multivariate binary responses

Antik Chakraborty,Rihui Ou,David B. Dunson

It has become increasingly common to collect high-dimensional binary response data; for example, with the emergence of new sampling techniques in ecology. In smaller dimensions, multivariate probit (MVP) models are routinely used for inferences. However, algorithms for fitting such models face issues in scaling up to high dimensions due to the intractability of the likelihood, involving an integral over a multivariate normal distribution having no analytic form. Although a variety of algorithms have been proposed to approximate this intractable integral, these approaches are difficult to implement and/or inaccurate in high dimensions. Our main focus is on accommodating high-dimensional binary response data with a small to moderate number of covariates. We propose a two-stage approach for inference on model parameters while taking care of uncertainty propagation between the stages. We use the special structure of latent Gaussian models to reduce the highly expensive computation involved in joint parameter estimation to focus inference on marginal distributions of model parameters. This essentially makes the method embarrassingly parallel for both stages. We illustrate performance in simulations and applications to joint species distribution modeling in ecology.
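For reference, the intractable integral mentioned above can be written in the standard textbook form of the multivariate probit model. The notation here ($q$ binary responses $y_j$, covariates $x$, coefficient matrix $B$, latent correlation matrix $\Sigma$) is chosen for this note and may differ from the paper's:

$$
y_j = \mathbb{1}(z_j > 0), \qquad z \sim N_q(B^\top x, \Sigma),
$$

$$
P(y \mid x) = \int_{A_1} \cdots \int_{A_q} \phi_q(z;\, B^\top x, \Sigma)\, dz_1 \cdots dz_q,
\qquad
A_j = \begin{cases} (0, \infty) & \text{if } y_j = 1, \\ (-\infty, 0] & \text{if } y_j = 0, \end{cases}
$$

where $\phi_q$ is the $q$-variate normal density. The $q$-dimensional orthant probability has no closed form in general, which is why the cost grows quickly with $q$.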

Local minimum · Minimum · Optimization · CASE · Optimizer

October 24, 2022

Noisy Low-rank Matrix Optimization: Geometry of Local Minima and Convergence Rate

Ziye Ma,Somayeh Sojoudi

This paper is concerned with low-rank matrix optimization, which has found a wide range of applications in machine learning. This problem in the special case of matrix sensing has been studied extensively through the notion of Restricted Isometry Property (RIP), leading to a wealth of results on the geometric landscape of the problem and the convergence rate of common algorithms. However, the existing results can handle the problem in the case with a general objective function subject to noisy data only when the RIP constant is close to 0. In this paper, we develop a new mathematical framework to solve the above-mentioned problem with a far less restrictive RIP constant. We prove that as long as the RIP constant of the noiseless objective is less than $1/3$, any spurious local solution of the noisy optimization problem must be close to the ground truth solution. By working through the strict saddle property, we also show that an approximate solution can be found in polynomial time. We characterize the geometry of the spurious local minima of the problem in a local region around the ground truth in the case when the RIP constant is greater than $1/3$. Compared to the existing results in the literature, this paper offers the strongest RIP bound and provides a complete theoretical analysis on the global and local optimization landscapes of general low-rank optimization problems under random corruptions from any finite-variance family.
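For readers unfamiliar with the term, the Restricted Isometry Property is usually stated as follows. This is the standard definition from the matrix sensing literature; the exact rank at which the paper imposes it (for example $r$ versus $2r$) is not given in the abstract and is left generic here. A linear measurement operator $\mathcal{A}: \mathbb{R}^{n \times n} \to \mathbb{R}^m$ satisfies the RIP of rank $k$ with constant $\delta \in [0, 1)$ if

$$
(1 - \delta)\, \|M\|_F^2 \;\le\; \|\mathcal{A}(M)\|_2^2 \;\le\; (1 + \delta)\, \|M\|_F^2
\qquad \text{for all } M \text{ with } \operatorname{rank}(M) \le k.
$$

A smaller $\delta$ means $\mathcal{A}$ behaves more like an isometry on low-rank matrices, so a threshold such as $\delta < 1/3$ is a weaker requirement than $\delta$ close to 0.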

Statistic · Analysis · Bootstrap · DATE · Estimation/Estimator

October 22, 2022

Assessing the Most Vulnerable Subgroup to Type II Diabetes Associated with Statin Usage: Evidence from Electronic Health Record Data

Xinzhou Guo,Waverly Wei,Molei Liu,Tianxi Cai,Chong Wu,Jingshen Wang

from arxiv, 25 pages, 2 figures, 5 tables

There have been increased concerns that the use of statins, one of the most commonly prescribed drugs for treating coronary artery disease, is potentially associated with the increased risk of new-onset type II diabetes (T2D). Nevertheless, to date, there is no robust evidence as to whether, and which, populations are indeed vulnerable to developing T2D after taking statins. In this case study, leveraging the biobank and electronic health record data in the Partner Health System, we introduce a new data analysis pipeline and a novel statistical methodology that address existing limitations by (i) designing a rigorous causal framework that systematically examines the causal effects of statin usage on T2D risk in observational data, (ii) uncovering which patient subgroup is most vulnerable to developing T2D after taking statins, and (iii) assessing the replicability and statistical significance of the most vulnerable subgroup via a bootstrap calibration procedure. Our proposed approach delivers asymptotically sharp confidence intervals and a debiased estimate for the treatment effect of the most vulnerable subgroup in the presence of high-dimensional covariates. With our proposed approach, we find that females with high T2D genetic risk are at the highest risk of developing T2D due to statin usage.

INFORMS · Feature selection · Mutual information · Layer · Reducible

October 21, 2022

A GA-like Dynamic Probability Method With Mutual Information for Feature Selection

Gaoshuai Wang,Fabrice Lauri,Amir Hajjam El Hassani

from arxiv, 18 pages; submitted to Applied Intelligence

Feature selection plays a vital role in promoting the classifier's performance. However, current methods do not effectively distinguish the complex interactions among the selected features. To further remove these hidden negative interactions, we propose a GA-like dynamic probability (GADP) method with mutual information, which has a two-layer structure. The first layer applies the mutual information method to obtain a primary feature subset. The GA-like dynamic probability algorithm, as the second layer, mines more supportive features based on the former candidate features. Essentially, the GA-like method is a population-based algorithm, so its working mechanism is similar to that of the GA. Unlike popular works that frequently focus on improving GA's operators to enhance the search ability and lower the convergence time, we boldly abandon GA's operators and employ a dynamic probability that relies on the performance of each chromosome to determine feature selection in the new generation. The dynamic probability mechanism significantly reduces the number of parameters in GA, making it easy to use. As each gene's probability is independent, the chromosome variety in GADP is more notable than in traditional GA, which ensures GADP has a wider search space and selects relevant features more effectively and accurately. To verify our method's superiority, we evaluate it under multiple conditions on 15 datasets. The results show that the proposed method outperforms the alternatives and generally achieves the best accuracy. Further, we also compare the proposed model to popular heuristic methods like POS, FPA, and WOA. Our model retains advantages over them.
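The first layer described above can be sketched as a simple mutual information filter. The code below computes the empirical mutual information (in bits) between a discrete feature and the labels and keeps the top-scoring features. The top-k rule, the function names, and the toy data in the usage note are assumptions for illustration; the abstract does not say how the primary subset is cut off, and the second (dynamic probability) layer is not shown.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for discrete sequences."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    # p(x, y) * log2( p(x, y) / (p(x) p(y)) ), written with raw counts
    return sum(
        (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )

def select_top_k(columns, labels, k):
    """Rank features by mutual information with the labels and keep k of them."""
    scores = {name: mutual_information(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

For example, with labels `[0, 1, 0, 1]`, a feature `[0, 1, 0, 1]` carries one full bit of information about the label, while `[0, 0, 1, 1]` carries none, so the former is the one retained by `select_top_k(..., k=1)`.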

Hamming distance · Channel · Graph · MoDELS · Minimizer

October 21, 2022

The sequence reconstruction problem for permutations with the Hamming distance

Xiang Wang,Elena V. Konstantinova

V. Levenshtein first proposed the sequence reconstruction problem in 2001. This problem studies the model where the same sequence from some set is transmitted over multiple channels, and the decoder receives the different outputs. Assume that the transmitted sequence is at distance $d$ from some code and there are at most $r$ errors in every channel. Then the sequence reconstruction problem is to find the minimum number of channels required to recover the transmitted sequence exactly, which has to be greater than the maximum intersection between two metric balls of radius $r$, where the distance between their centers is at least $d$. In this paper, we study the sequence reconstruction problem of permutations under the Hamming distance. In this model, we define a Cayley graph and find the exact value of the largest intersection of two metric balls in this graph under the Hamming distance for $r=4$ with $d\geqslant 5$, and for $d=2r$.
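For very small permutation lengths, the quantity studied here can be checked by brute force. The sketch below enumerates Hamming balls in the symmetric group and computes the largest intersection of two balls of radius `r` whose centers are at distance at least `d`. It fixes one center at the identity, which loses no generality because relabeling the values of both permutations by the same bijection preserves the Hamming distance. The function names are illustrative, and the approach is only feasible for small `n`; it is not the paper's method.

```python
from itertools import permutations

def hamming(p, q):
    """Number of positions where two permutations differ."""
    return sum(a != b for a, b in zip(p, q))

def ball(center, r):
    """All permutations within Hamming distance r of the center."""
    n = len(center)
    return {p for p in permutations(range(n)) if hamming(p, center) <= r}

def max_ball_intersection(n, r, d):
    """Largest |B_r(p) ∩ B_r(q)| over centers with hamming(p, q) >= d."""
    identity = tuple(range(n))
    ball_id = ball(identity, r)
    best = 0
    for q in permutations(range(n)):
        if hamming(identity, q) >= d:
            best = max(best, len(ball_id & ball(q, r)))
    return best

print(len(ball((0, 1, 2, 3), 2)))      # 7: the identity plus six transpositions
print(max_ball_intersection(4, 2, 4))  # 2
```

In this model the number of channels needed for exact recovery is one more than the value returned by `max_ball_intersection`.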

Conformer · Regularization term · Estimation/Estimator · Approximation · Optimizer

October 21, 2022

On the necessity of the inf-sup condition for a mixed finite element formulation

Fleurianne Bertrand,Daniele Boffi

We study a nonstandard mixed formulation of the Poisson problem, sometimes known as the dual mixed formulation. For reasons related to the equilibration of the flux, we use finite elements that are conforming in H(div) for the approximation of the gradients, even if the formulation would allow for discontinuous finite elements. The scheme is not uniformly inf-sup stable, but we can show existence and uniqueness of the solution, as well as optimal error estimates for the gradient variable when suitable regularity assumptions are made. Several additional remarks complete the paper, shedding some light on the sources of instability for mixed formulations.

Optimizer · Graph · GPU · Neural Networks · Kernelization

January 28, 2021

Interpreting and Unifying Graph Neural Networks with An Optimization Framework

Meiqi Zhu,Xiao Wang,Chuan Shi,Houye Ji,Peng Cui

from arxiv, WWW2021, 12 pages

Graph Neural Networks (GNNs) have received considerable attention on graph-structured data learning for a wide variety of tasks. The well-designed propagation mechanism, which has been demonstrated to be effective, is the most fundamental part of GNNs. Although most GNNs basically follow a message passing manner, little effort has been made to discover and analyze their essential relations. In this paper, we establish a surprising connection between different propagation mechanisms with a unified optimization problem, showing that despite the proliferation of various GNNs, in fact, their proposed propagation mechanisms are the optimal solution optimizing a feature fitting function over a wide class of graph kernels with a graph regularization term. Our proposed unified optimization framework, summarizing the commonalities between several of the most representative GNNs, not only provides a macroscopic view on surveying the relations between different GNNs, but also further opens up new opportunities for flexibly designing new GNNs. With the proposed framework, we discover that existing works usually utilize naive graph convolutional kernels for the feature fitting function, and we further develop two novel objective functions considering adjustable graph kernels showing low-pass or high-pass filtering capabilities respectively. Moreover, we provide the convergence proofs and expressive power comparisons for the proposed models. Extensive experiments on benchmark datasets clearly show that the proposed GNNs not only outperform the state-of-the-art methods but also have a good ability to alleviate over-smoothing, and further verify the feasibility of designing GNNs with our unified optimization framework.
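A small numerical sketch can show how a propagation rule falls out of an optimization problem. It assumes the simplest instance of such an objective, a feature fitting term plus a graph Laplacian regularizer, $\|Z - H\|^2 + \xi\, Z^\top (I - \hat{A}) Z$, with scalar node features and $\hat{A}$ the symmetrically normalized adjacency with self-loops. Setting the gradient to zero gives the fixed-point iteration $Z \leftarrow \alpha H + (1 - \alpha)\hat{A} Z$ with $\alpha = 1/(1+\xi)$, which has the form of a personalized-PageRank-style propagation. The graph, names, and parameter values are illustrative assumptions, not the paper's models.

```python
import math

def normalized_adjacency(edges, n):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of an undirected graph."""
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1.0
    deg = [sum(row) for row in A]
    return [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matvec(M, v):
    """Dense matrix-vector product on plain lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def propagate(A_hat, H, xi, steps=200):
    """Fixed-point iteration Z <- alpha * H + (1 - alpha) * A_hat Z."""
    alpha = 1.0 / (1.0 + xi)
    Z = list(H)
    for _ in range(steps):
        AZ = matvec(A_hat, Z)
        Z = [alpha * h + (1 - alpha) * az for h, az in zip(H, AZ)]
    return Z

def gradient(A_hat, H, Z, xi):
    """Gradient of ||Z - H||^2 + xi * Z^T (I - A_hat) Z with respect to Z."""
    AZ = matvec(A_hat, Z)
    return [2 * (z - h) + 2 * xi * (z - az) for z, h, az in zip(Z, H, AZ)]

# Path graph 0 - 1 - 2 with a feature spike on node 0.
A_hat = normalized_adjacency([(0, 1), (1, 2)], 3)
H = [1.0, 0.0, 0.0]
Z = propagate(A_hat, H, xi=1.0)
print(max(abs(g) for g in gradient(A_hat, H, Z, xi=1.0)) < 1e-9)  # True
```

The vanishing gradient confirms that the converged propagation output is the minimizer of the objective; a larger `xi` weights the smoothness term more heavily and spreads the features further along the graph.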