
In this paper, we introduce a new methodology for solving the orthogonal nonnegative matrix factorization (ONMF) problem, where the objective is to approximate an input data matrix by a product of two nonnegative matrices, the features matrix and the mixing matrix, one of which is orthogonal. We show how the ONMF can be interpreted as a specific facility-location problem (FLP), and adapt a maximum-entropy-principle based solution for the FLP to the ONMF problem. The proposed approach guarantees orthogonality and sparsity of the features or the mixing matrix, while ensuring nonnegativity of both. Additionally, our methodology provides a quantitative characterization of the ``true'' number of underlying features, a hyperparameter required for the ONMF. An evaluation of the proposed method on synthetic datasets, as well as on a standard genetic microarray dataset, indicates significantly better sparsity, orthogonality, and computational speed than similar methods in the literature, with comparable or improved reconstruction errors.
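The quality criteria named in the abstract (reconstruction error, orthogonality, sparsity) can be checked directly for any candidate ONMF solution. Below is a minimal NumPy sketch of such a check; it does not reproduce the maximum-entropy facility-location solver, and the random matrices are placeholders.

```python
import numpy as np

def onmf_quality(X, W, H, tol=1e-8):
    """Score a candidate ONMF factorization X ~= W @ H.

    Returns the relative reconstruction error, the deviation of W from
    column-orthogonality (||W^T W - I||_F), and the fraction of (near-)zero
    entries in W. Illustrative only; the paper's solver is not reproduced.
    """
    rel_err = np.linalg.norm(X - W @ H, "fro") / np.linalg.norm(X, "fro")
    gram = W.T @ W
    ortho_dev = np.linalg.norm(gram - np.eye(gram.shape[0]), "fro")
    sparsity = np.mean(W < tol)
    return rel_err, ortho_dev, sparsity

# toy usage with random nonnegative factors (placeholders, not real features)
rng = np.random.default_rng(0)
X = np.abs(rng.normal(size=(50, 30)))
W = np.abs(rng.normal(size=(50, 5)))
H = np.abs(rng.normal(size=(5, 30)))
print(onmf_quality(X, W, H))
```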

Related Content

We consider the well-studied Robust $(k, z)$-Clustering problem, which generalizes the classic $k$-Median, $k$-Means, and $k$-Center problems. Given a constant $z\ge 1$, the input to Robust $(k, z)$-Clustering is a set $P$ of $n$ weighted points in a metric space $(M,\delta)$ and a positive integer $k$. Further, each point belongs to one (or more) of the $m$ many different groups $S_1,S_2,\ldots,S_m$. Our goal is to find a set $X$ of $k$ centers such that $\max_{i \in [m]} \sum_{p \in S_i} w(p) \delta(p,X)^z$ is minimized. This problem arises in the domains of robust optimization [Anthony, Goyal, Gupta, Nagarajan, Math. Oper. Res. 2010] and algorithmic fairness. For polynomial time computation, an approximation factor of $O(\log m/\log\log m)$ is known [Makarychev, Vakilian, COLT 2021], which is tight under a plausible complexity assumption even in line metrics. For FPT time, there is a $(3^z+\epsilon)$-approximation algorithm, which is tight under GAP-ETH [Goyal, Jaiswal, Inf. Proc. Letters, 2023]. Motivated by the tight lower bounds for general discrete metrics, we focus on \emph{geometric} spaces such as the (discrete) high-dimensional Euclidean setting and metrics of low doubling dimension, which play an important role in data analysis applications. First, for a universal constant $\eta_0 >0.0006$, we devise a $3^z(1-\eta_0)$-factor FPT approximation algorithm for discrete high-dimensional Euclidean spaces, thereby bypassing the lower bound for general metrics. We complement this result by showing that even the special case of $k$-Center in dimension $\Theta(\log n)$ is $(\sqrt{3/2}- o(1))$-hard to approximate for FPT algorithms. Finally, we complete the FPT approximation landscape by designing an FPT $(1+\epsilon)$-approximation scheme (EPAS) for the metric of sub-logarithmic doubling dimension.
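For concreteness, the objective $\max_{i \in [m]} \sum_{p \in S_i} w(p) \delta(p,X)^z$ can be evaluated directly for a candidate center set. The NumPy sketch below uses the Euclidean metric and synthetic points; it only illustrates the objective, not the approximation algorithms described in the abstract.

```python
import numpy as np

def robust_kz_cost(points, weights, groups, centers, z=2):
    """Evaluate max_i sum_{p in S_i} w(p) * dist(p, X)^z for a candidate
    center set X under the Euclidean metric. Purely an illustration of the
    objective; none of the approximation algorithms are reproduced here."""
    # distance from every point to its nearest chosen center
    d = np.min(np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2), axis=1)
    contrib = weights * d ** z
    return max(contrib[idx].sum() for idx in groups)

# toy usage: 12 points split into m = 3 groups, k = 2 candidate centers
rng = np.random.default_rng(1)
P = rng.normal(size=(12, 2))
w = np.ones(12)
S = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
X = P[rng.choice(12, size=2, replace=False)]
print(robust_kz_cost(P, w, S, X, z=2))
```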

We have developed a diffusion-based speech refiner that improves the reference-free perceptual quality of the audio predicted by preceding single-channel speech separation models. Although modern deep neural network-based speech separation models have shown high performance in reference-based metrics, they often produce perceptually unnatural artifacts. Recent advancements in diffusion models motivated us to tackle this problem by restoring the degraded parts of initial separations with a generative approach. Utilizing the denoising diffusion restoration model (DDRM) as a basis, we propose a shared DDRM-based refiner that generates samples conditioned on the global information of preceding outputs from arbitrary speech separation models. We experimentally show that our refiner can provide a clearer harmonic structure of speech and improves the reference-free metric of perceptual quality for arbitrary preceding model architectures. Furthermore, we tune the variance of the measurement noise based on the preceding outputs, which results in higher scores in both reference-free and reference-based metrics. The separation quality can also be further improved by blending the discriminative and generative outputs.
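The final blending step mentioned above amounts to a weighted combination of the discriminative separator's output and the diffusion refiner's output. A minimal sketch is shown below; the 0.7 weight is an arbitrary placeholder rather than a value taken from the paper.

```python
import numpy as np

def blend_outputs(separated, refined, alpha=0.7):
    """Blend the discriminative output with the generative (refined) output
    in the waveform domain. alpha is a placeholder mixing weight; in practice
    it would be tuned on a validation set."""
    assert separated.shape == refined.shape
    return alpha * separated + (1.0 - alpha) * refined

# toy usage on random "waveforms" standing in for model outputs
rng = np.random.default_rng(0)
separated = rng.normal(size=16000)
refined = rng.normal(size=16000)
blended = blend_outputs(separated, refined, alpha=0.7)
```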

Random smoothing data augmentation is a unique form of regularization that can prevent overfitting by introducing noise to the input data, encouraging the model to learn more generalized features. Despite its success in various applications, there has been a lack of systematic study on the regularization ability of random smoothing. In this paper, we aim to bridge this gap by presenting a framework for random smoothing regularization that can adaptively and effectively learn a wide range of ground truth functions belonging to the classical Sobolev spaces. Specifically, we investigate two underlying function spaces: the Sobolev space of low intrinsic dimension, which includes the Sobolev space in $D$-dimensional Euclidean space or low-dimensional sub-manifolds as special cases, and the mixed smooth Sobolev space with a tensor structure. By using random smoothing regularization as novel convolution-based smoothing kernels, we can attain optimal convergence rates in these cases using a kernel gradient descent algorithm, either with early stopping or weight decay. It is noteworthy that our estimator can adapt to the structural assumptions of the underlying data and avoid the curse of dimensionality. This is achieved through various choices of injected noise distributions such as Gaussian, Laplace, or general polynomial noises, allowing for broad adaptation to the aforementioned structural assumptions of the underlying data. The convergence rate depends only on the effective dimension, which may be significantly smaller than the actual data dimension. We conduct numerical experiments on simulated data to validate our theoretical results.
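As a concrete illustration of the augmentation itself, the sketch below perturbs copies of the design matrix with Gaussian or Laplace noise, two of the injected-noise families mentioned above; the scale and number of copies are arbitrary illustration values, and the kernel gradient descent estimator is not reproduced.

```python
import numpy as np

def random_smoothing_augment(X, noise="gaussian", scale=0.1, copies=4, seed=0):
    """Create noisy copies of the design matrix X; the injected noise acts as a
    convolution-based smoothing of the empirical distribution. Gaussian and
    Laplace are two of the noise families considered in the paper; the scale
    and number of copies are arbitrary illustration values."""
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(copies):
        if noise == "gaussian":
            eps = rng.normal(scale=scale, size=X.shape)
        elif noise == "laplace":
            eps = rng.laplace(scale=scale, size=X.shape)
        else:
            raise ValueError("unsupported noise family")
        draws.append(X + eps)
    return np.concatenate(draws, axis=0)

# toy usage: augment a 100 x 5 design matrix with Laplace noise
X = np.random.default_rng(2).normal(size=(100, 5))
X_aug = random_smoothing_augment(X, noise="laplace", scale=0.05)
```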

Quantum dynamics can be simulated on a quantum computer by exponentiating elementary terms from the Hamiltonian in a sequential manner. However, such an implementation of Trotter steps has gate complexity depending on the total Hamiltonian term number, comparing unfavorably to algorithms using more advanced techniques. We develop methods to perform faster Trotter steps with complexity sublinear in the number of terms. We achieve this for a class of Hamiltonians whose interaction strength decays with distance according to power law. Our methods include one based on a recursive block encoding and one based on an average-cost simulation, overcoming the normalization-factor barrier of these advanced quantum simulation techniques. We also realize faster Trotter steps when certain blocks of Hamiltonian coefficients have low rank. Combining with a tighter error analysis, we show that it suffices to use $\left(\eta^{1/3}n^{1/3}+\frac{n^{2/3}}{\eta^{2/3}}\right)n^{1+o(1)}$ gates to simulate uniform electron gas with $n$ spin orbitals and $\eta$ electrons in second quantization in real space, asymptotically improving over the best previous work. We obtain an analogous result when the external potential of nuclei is introduced under the Born-Oppenheimer approximation. We prove a circuit lower bound when the Hamiltonian coefficients take a continuum range of values, showing that generic $n$-qubit $2$-local Hamiltonians with commuting terms require at least $\Omega(n^2)$ gates to evolve with accuracy $\epsilon=\Omega(1/poly(n))$ for time $t=\Omega(\epsilon)$. Our proof is based on a gate-efficient reduction from the approximate synthesis of diagonal unitaries within the Hamming weight-$2$ subspace, which may be of independent interest. Our result thus suggests the use of Hamiltonian structural properties as both necessary and sufficient to implement Trotter steps with lower gate complexity.
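The baseline being improved upon, a Trotter step built by sequentially exponentiating each Hamiltonian term, can be written down directly for small systems; its cost is linear in the number of terms, which is exactly the scaling the sublinear constructions above avoid. The dense-matrix sketch below is a toy illustration with placeholder two-qubit terms.

```python
import numpy as np
from scipy.linalg import expm

def first_order_trotter_step(terms, dt):
    """One first-order Trotter step exp(-i H_L dt) ... exp(-i H_1 dt) formed by
    sequentially exponentiating the Hamiltonian terms. Dense-matrix toy version
    for small systems; cost grows with the number of terms."""
    dim = terms[0].shape[0]
    U = np.eye(dim, dtype=complex)
    for H_j in terms:
        U = expm(-1j * dt * H_j) @ U
    return U

# toy 2-qubit example with two commuting diagonal terms (placeholder Hamiltonian)
Z = np.diag([1.0, -1.0])
I2 = np.eye(2)
terms = [np.kron(Z, Z), 0.5 * np.kron(Z, I2)]
U = first_order_trotter_step(terms, dt=0.01)
```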

We study the problem of overcoming exponential sample complexity in differential entropy estimation under Gaussian convolutions. Specifically, we consider the estimation of the differential entropy $h(X+Z)$ via $n$ independently and identically distributed samples of $X$, where $X$ and $Z$ are independent $D$-dimensional random variables with $X$ sub-Gaussian with bounded second moment and $Z\sim\mathcal{N}(0,\sigma^2I_D)$. Under the absolute-error loss, the above problem has a parametric estimation rate of $\frac{c^D}{\sqrt{n}}$, which is exponential in the data dimension $D$ and often problematic for applications. We overcome this exponential sample complexity by projecting $X$ to a low-dimensional space via principal component analysis (PCA) before the entropy estimation, and show that the asymptotic error overhead vanishes as the unexplained variance of the PCA vanishes. This implies near-optimal performance for inherently low-dimensional structures embedded in high-dimensional spaces, including hidden-layer outputs of deep neural networks (DNN), which can be used to estimate mutual information (MI) in DNNs. We provide numerical results verifying the performance of our PCA approach on Gaussian and spiral data. We also apply our method to the analysis of information flow through neural network layers (cf. the information bottleneck), with results measuring mutual information in a noisy fully connected network and a noisy convolutional neural network (CNN) for MNIST classification.
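The core step, projecting onto the leading principal components before estimating the entropy of the noise-smoothed variable, can be sketched as follows. The Gaussian (maximum-entropy) approximation used here is a crude stand-in for the paper's estimator and is meant only to show where the dimensionality reduction enters.

```python
import numpy as np

def pca_gaussian_entropy(X, d, sigma):
    """Project samples of X onto their top-d principal components, then estimate
    h(X_proj + Z) with Z ~ N(0, sigma^2 I_d) via the Gaussian approximation
    0.5 * log det(2 pi e (Cov + sigma^2 I)). A crude stand-in estimator,
    intended only to illustrate the PCA step."""
    Xc = X - X.mean(axis=0)
    # top-d principal directions via SVD of the centered data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Xp = Xc @ Vt[:d].T
    cov = np.cov(Xp, rowvar=False) + sigma**2 * np.eye(d)
    _, logdet = np.linalg.slogdet(2 * np.pi * np.e * cov)
    return 0.5 * logdet

# toy usage: 3-dimensional structure embedded in 20 dimensions
rng = np.random.default_rng(3)
latent = rng.normal(size=(500, 3))
X = latent @ rng.normal(size=(3, 20))
print(pca_gaussian_entropy(X, d=3, sigma=0.5))
```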

Surrogate-assisted evolutionary algorithms (SAEAs) use efficient computational models to approximate the fitness function in evolutionary computation systems. This area of research has been active for over two decades and has received significant attention from the specialised research community in different areas, for example, single- and many-objective optimisation or dynamic and stationary optimisation problems. An emerging and exciting area that has received little attention from the SAEA community is neuroevolution. This refers to the use of evolutionary algorithms in the automatic configuration of artificial neural network (ANN) architectures, hyper-parameters and/or the training of ANNs. However, ANNs suffer from two major issues: (a) the intensive computational power required for their correct training, and (b) the highly specialised human expertise required to configure a well-performing network. This work aims to fill this important research gap in SAEAs for neuroevolution by addressing these two issues. We demonstrate how one can use a Kriging Partial Least Squares method that allows the efficient computation of good approximate surrogate models, in contrast to the well-known Kriging method, which normally cannot be used in neuroevolution due to the high dimensionality of the data.
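The surrogate-modelling idea can be sketched with a plain Gaussian-process (Kriging) regressor standing in for the Kriging Partial Least Squares model described above; the encoded configurations and fitness values below are purely synthetic placeholders.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Fit a surrogate of the (expensive) fitness over encoded ANN configurations.
# A plain Gaussian process is used here as a placeholder for the Kriging
# Partial Least Squares model in the paper; the "configurations" and "fitness"
# values are synthetic.
rng = np.random.default_rng(0)
configs = rng.uniform(size=(40, 10))        # encoded architectures / hyper-parameters
fitness = np.sin(configs.sum(axis=1)) + 0.05 * rng.normal(size=40)

surrogate = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
surrogate.fit(configs, fitness)

# screen new candidate configurations cheaply instead of training each ANN
candidates = rng.uniform(size=(5, 10))
mean, std = surrogate.predict(candidates, return_std=True)
```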

Finding the seed set that maximizes the influence spread over a network is a well-known NP-hard problem. Though a greedy algorithm can provide near-optimal solutions, the subproblem of influence estimation renders the solutions inefficient. In this work, we propose \textsc{Glie}, a graph neural network that learns how to estimate the influence spread of the independent cascade. GLIE relies on a theoretical upper bound that is tightened through supervised training. Experiments indicate that it provides accurate influence estimation for real graphs up to 10 times larger than the train set. Subsequently, we incorporate it into three influence maximization techniques. We first utilize Cost Effective Lazy Forward optimization, substituting Monte Carlo simulations with GLIE, surpassing the benchmarks albeit with a computational overhead. To improve computational efficiency, we then devise a Q-learning method that learns to choose seeds sequentially using GLIE's predictions. Finally, we arrive at the most efficient approach by developing a provably submodular influence spread based on GLIE's representations, to rank nodes while building the seed set adaptively. The proposed algorithms are inductive, meaning they are trained on graphs with fewer than 300 nodes and up to 5 seeds, and tested on graphs with millions of nodes and up to 200 seeds. The final method exhibits the most promising combination of time efficiency and influence quality, outperforming several baselines.
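The Cost Effective Lazy Forward (CELF) loop referenced above can be written generically against any influence-estimate oracle; GLIE's prediction would play that role. The sketch below uses a toy set-coverage function as a placeholder oracle; lazy re-evaluation is sound because the spread function is submodular.

```python
import heapq

def celf(nodes, influence, k):
    """CELF greedy seed selection given an influence oracle influence(seed_set).
    Lazy updates exploit submodularity: a stale marginal gain is an upper bound,
    so a node whose gain is up to date and still on top can be selected."""
    heap = [(-influence({v}), v, 0) for v in nodes]   # (neg. marginal gain, node, round computed)
    heapq.heapify(heap)
    seeds, spread = set(), 0.0
    while len(seeds) < k and heap:
        neg_gain, v, last = heapq.heappop(heap)
        if last == len(seeds):                 # gain is current w.r.t. the seed set: select
            seeds.add(v)
            spread += -neg_gain
        else:                                  # stale: recompute lazily and push back
            gain = influence(seeds | {v}) - spread
            heapq.heappush(heap, (-gain, v, len(seeds)))
    return seeds, spread

# toy usage with a trivially submodular coverage function (placeholder for GLIE)
reach = {0: {0, 1, 2}, 1: {1, 3}, 2: {2, 3, 4}, 3: {4}}
influence = lambda S: float(len(set().union(*(reach[v] for v in S)))) if S else 0.0
print(celf(list(reach), influence, k=2))
```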

In this paper, we propose two new algorithms for maximum-likelihood estimation (MLE) of high-dimensional sparse covariance matrices. Unlike most state-of-the-art methods, which either use regularization techniques or penalize the likelihood to impose sparsity, we solve the MLE problem based on an estimated covariance graph. More specifically, we propose a two-stage procedure: in the first stage, we determine the sparsity pattern of the target covariance matrix (in other words, the marginal independence structure of the covariance graph under a Gaussian graphical model) using multiple hypothesis testing with false discovery rate (FDR) control, and in the second stage we use either a block coordinate descent approach to estimate the non-zero values or a proximal distance approach that penalizes the distance between the estimated covariance graph and the target covariance matrix. Doing so gives rise to two different methods, each with its own advantage: the coordinate descent approach does not require tuning of any hyper-parameters, whereas the proximal distance approach is computationally fast but requires careful tuning of the penalty parameter. Both methods are effective even in cases where the number of observed samples is less than the dimension of the data. For performance evaluation, we test the proposed methods on both simulated and real-world data and show that they provide more accurate estimates of the sparse covariance matrix than two state-of-the-art methods.
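A minimal sketch of the first stage, recovering a sparsity pattern by FDR-controlled tests of zero marginal correlation (Fisher z-transform plus Benjamini-Hochberg), is given below; the specific test statistic is an assumption made for illustration, and the second-stage constrained MLE is not reproduced.

```python
import numpy as np
from scipy import stats

def fdr_sparsity_pattern(X, alpha=0.05):
    """Stage-1 sketch: estimate the covariance-graph sparsity pattern by testing
    each marginal correlation for zero (Fisher z-transform) and applying the
    Benjamini-Hochberg procedure at level alpha. Assumes n > 3 samples."""
    n, d = X.shape
    R = np.corrcoef(X, rowvar=False)
    iu = np.triu_indices(d, k=1)
    z = np.arctanh(np.clip(R[iu], -0.999999, 0.999999)) * np.sqrt(n - 3)
    pvals = 2 * stats.norm.sf(np.abs(z))
    order = np.argsort(pvals)
    m = pvals.size
    thresh = alpha * np.arange(1, m + 1) / m
    passed = pvals[order] <= thresh
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    keep = np.zeros(m, dtype=bool)
    keep[order[:k]] = True                 # BH: reject all hypotheses up to the last passing rank
    pattern = np.eye(d, dtype=bool)
    pattern[iu] = keep
    pattern |= pattern.T
    return pattern

# toy usage: n = 200 samples, d = 8, with one induced nonzero off-diagonal entry
rng = np.random.default_rng(4)
X = rng.normal(size=(200, 8))
X[:, 1] += 0.8 * X[:, 0]
print(fdr_sparsity_pattern(X, alpha=0.05).astype(int))
```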

We develop the first active learning method in the predict-then-optimize framework. Specifically, we develop a learning method that sequentially decides whether to request the "labels" of feature samples from an unlabeled data stream, where the labels correspond to the parameters of an optimization model for decision-making. Our active learning method is the first to be directly informed by the decision error induced by the predicted parameters, which is referred to as the Smart Predict-then-Optimize (SPO) loss. Motivated by the structure of the SPO loss, our algorithm adopts a margin-based criterion utilizing the concept of distance to degeneracy and minimizes a tractable surrogate of the SPO loss on the collected data. In particular, we develop an efficient active learning algorithm with both hard and soft rejection variants, each with theoretical excess risk (i.e., generalization) guarantees. We further derive bounds on the label complexity, which refers to the number of samples whose labels are acquired to achieve a desired small level of SPO risk. Under some natural low-noise conditions, we show that these bounds can be better than the naive supervised learning approach that labels all samples. Furthermore, when using the SPO+ loss function, a specialized surrogate of the SPO loss, we derive a significantly smaller label complexity under separability conditions. We also present numerical evidence showing the practical value of our proposed algorithms in the settings of personalized pricing and the shortest path problem.
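The margin-based query rule can be illustrated on a finite feasible region, where a simple proxy for the distance to degeneracy is the gap between the best and second-best objective values under the predicted cost vector; the helper names and threshold below are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def distance_to_degeneracy(c_hat, feasible):
    """Gap between the best and second-best objective values over a finite
    feasible set (rows of `feasible`) -- a simple proxy for the
    distance-to-degeneracy quantity used in the margin-based criterion."""
    vals = np.sort(feasible @ c_hat)
    return vals[1] - vals[0]

def should_query(c_hat, feasible, margin):
    """Hard-rejection variant: request the label only when the predicted cost
    vector lies close to degeneracy (small decision margin)."""
    return distance_to_degeneracy(c_hat, feasible) < margin

# toy usage: three feasible decisions (rows) and a nearly degenerate prediction
feasible = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.6]])
c_hat = np.array([0.51, 0.50])
print(should_query(c_hat, feasible, margin=0.1))   # near-degenerate -> query the label
```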

This paper considers the problem of inference in cluster randomized trials where treatment status is determined according to a "matched pairs" design. Here, by a cluster randomized experiment, we mean one in which treatment is assigned at the level of the cluster; by a "matched pairs" design we mean that a sample of clusters is paired according to baseline, cluster-level covariates and, within each pair, one cluster is selected at random for treatment. We study the large-sample behavior of a weighted difference-in-means estimator and derive two distinct sets of results depending on whether the matching procedure matches on cluster size. We then propose a single variance estimator that is consistent in either regime. Combining these results establishes the asymptotic exactness of tests based on these estimators. Next, we consider the properties of two common testing procedures based on t-tests constructed from linear regressions, and argue that both are generally conservative in our framework. We additionally study the behavior of a randomization test which permutes the treatment status for clusters within pairs, and establish its finite-sample and asymptotic validity for testing specific null hypotheses. Finally, we propose a covariate-adjusted estimator which adjusts for additional baseline covariates not used for treatment assignment, and establish conditions under which such an estimator leads to improvements in precision. A simulation study confirms the practical relevance of our theoretical results.
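A small sketch of two ingredients named above, a weighted difference-in-means estimator and a within-pair randomization (permutation) test, is given below; the particular weighting by total cluster size is one simple choice assumed for illustration rather than the paper's exact specification.

```python
import numpy as np

def weighted_diff_in_means(y_treat, y_ctrl, n_treat, n_ctrl):
    """Difference in cluster-size-weighted means across pairs: total outcomes
    divided by total cluster sizes in each arm (one simple weighting, assumed
    here for illustration)."""
    return y_treat.sum() / n_treat.sum() - y_ctrl.sum() / n_ctrl.sum()

def pair_permutation_test(y_treat, y_ctrl, n_treat, n_ctrl, n_perm=2000, seed=0):
    """Randomization test that re-randomizes treatment within each matched pair
    and reports the observed statistic with a two-sided permutation p-value."""
    rng = np.random.default_rng(seed)
    obs = weighted_diff_in_means(y_treat, y_ctrl, n_treat, n_ctrl)
    perm_stats = []
    for _ in range(n_perm):
        flip = rng.integers(0, 2, size=y_treat.size).astype(bool)
        yt, yc = np.where(flip, y_ctrl, y_treat), np.where(flip, y_treat, y_ctrl)
        nt, nc = np.where(flip, n_ctrl, n_treat), np.where(flip, n_treat, n_ctrl)
        perm_stats.append(weighted_diff_in_means(yt, yc, nt, nc))
    return obs, np.mean(np.abs(perm_stats) >= abs(obs))

# toy usage: 30 pairs of clusters with synthetic totals and sizes
rng = np.random.default_rng(5)
n_t, n_c = rng.integers(20, 60, size=30), rng.integers(20, 60, size=30)
y_t, y_c = n_t * 0.6 + rng.normal(size=30), n_c * 0.5 + rng.normal(size=30)
print(pair_permutation_test(y_t, y_c, n_t, n_c))
```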
