亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We study the performance of empirical risk minimization on the $p$-norm linear regression problem for $p \in (1, \infty)$. We show that, in the realizable case, under no moment assumptions, and up to a distribution-dependent constant, $O(d)$ samples are enough to exactly recover the target. Otherwise, for $p \in [2, \infty)$, and under weak moment assumptions on the target and the covariates, we prove a high probability excess risk bound on the empirical risk minimizer whose leading term matches, up to a constant that depends only on $p$, the asymptotically exact rate. We extend this result to the case $p \in (1, 2)$ under mild assumptions that guarantee the existence of the Hessian of the risk at its minimizer.

相關內容

經驗風險是對訓練集中的所有樣本點損失函數的平均最小化。經驗風險越小說明模型f(X)對訓練集的擬合程度越好。

In this paper we present a general theory of $\Pi_{2}$-rules for systems of intuitionistic and modal logic. We introduce the notions of $\Pi_{2}$-rule system and of an Inductive Class, and provide model-theoretic and algebraic completeness theorems, which serve as our basic tools. As an illustration of the general theory, we analyse the structure of inductive classes of G\"{o}del algebras, from a structure theoretic and logical point of view. We show that unlike other well-studied settings (such as logics, or single-conclusion rule systems), there are continuum many $\Pi_{2}$-rule systems extending $\mathsf{LC}=\mathsf{IPC}+(p\rightarrow q)\vee (q\rightarrow p)$, and show how our methods allow easy proofs of the admissibility of the well-known Takeuti-Titani rule. Our final results concern general questions admissibility in $\mathsf{LC}$: (1) we present a full classification of those inductive classes which are inductively complete, i.e., where all $\Pi_{2}$-rules which are admissible are derivable, and (2) show that the problem of admissibility of $\Pi_{2}$-rules over $\mathsf{LC}$ is decidable.

We study the parameterized complexity of algorithmic problems whose input is an integer set $A$ in terms of the doubling constant $C := |A + A|/|A|$, a fundamental measure of additive structure. We present evidence that this new parameterization is algorithmically useful in the form of new results for two difficult, well-studied problems: Integer Programming and Subset Sum. First, we show that determining the feasibility of bounded Integer Programs is a tractable problem when parameterized in the doubling constant. Specifically, we prove that the feasibility of an integer program $I$ with $n$ polynomially-bounded variables and $m$ constraints can be determined in time $n^{O_C(1)} poly(|I|)$ when the column set of the constraint matrix has doubling constant $C$. Second, we show that the Subset Sum and Unbounded Subset Sum problems can be solved in time $n^{O_C(1)}$ and $n^{O_C(\log \log \log n)}$, respectively, where the $O_C$ notation hides functions that depend only on the doubling constant $C$. We also show the equivalence of achieving an FPT algorithm for Subset Sum with bounded doubling and achieving a milestone result for the parameterized complexity of Box ILP. Finally, we design near-linear time algorithms for $k$-SUM as well as tight lower bounds for 4-SUM and nearly tight lower bounds for $k$-SUM, under the $k$-SUM conjecture. Several of our results rely on a new proof that Freiman's Theorem, a central result in additive combinatorics, can be made efficiently constructive. This result may be of independent interest.

The $k$-center problem is a central optimization problem with numerous applications for machine learning, data mining, and communication networks. Despite extensive study in various scenarios, it surprisingly has not been thoroughly explored in the traditional distributed setting, where the communication graph of a network also defines the distance metric. We initiate the study of the $k$-center problem in a setting where the underlying metric is the graph's shortest path metric in three canonical distributed settings: the LOCAL, CONGEST, and CLIQUE models. Our results encompass constant-factor approximation algorithms and lower bounds in these models, as well as hardness results for the bi-criteria approximation setting.

Parameter estimation with the maximum $L_q$-likelihood estimator (ML$q$E) is an alternative to the maximum likelihood estimator (MLE) that considers the $q$-th power of the likelihood values for some $q<1$. In this method, extreme values are down-weighted because of their lower likelihood values, which yields robust estimates. In this work, we study the properties of the ML$q$E for spatial data with replicates. We investigate the asymptotic properties of the ML$q$E for Gaussian random fields with a Mat\'ern covariance function, and carry out simulation studies to investigate the numerical performance of the ML$q$E. We show that it can provide more robust and stable estimation results when some of the replicates in the spatial data contain outliers. In addition, we develop a mechanism to find the optimal choice of the hyper-parameter $q$ for the ML$q$E. The robustness of our approach is further verified on a United States precipitation dataset. Compared with other robust methods for spatial data, our proposal is more intuitive and easier to understand, yet it performs well when dealing with datasets containing outliers.

Sequential Monte Carlo Squared (SMC$^2$) is a Bayesian method which can infer the states and parameters of non-linear, non-Gaussian state-space models. The standard random-walk proposal in SMC$^2$ faces challenges, particularly with high-dimensional parameter spaces. This study outlines a novel approach by harnessing first-order gradients derived from a Common Random Numbers - Particle Filter (CRN-PF) using PyTorch. The resulting gradients can be leveraged within a Langevin proposal without accept/reject. Including Langevin dynamics within the proposal can result in a higher effective sample size and more accurate parameter estimates when compared with the random-walk. The resulting algorithm is parallelized on distributed memory using Message Passing Interface (MPI) and runs in $\mathcal{O}(\log_2N)$ time complexity. Utilizing 64 computational cores we obtain a 51x speed-up when compared to a single core. A GitHub link is given which provides access to the code.

A $k$-defective clique of an undirected graph $G$ is a subset of its vertices that induces a nearly complete graph with a maximum of $k$ missing edges. The maximum $k$-defective clique problem, which asks for the largest $k$-defective clique from the given graph, is important in many applications, such as social and biological network analysis. In the paper, we propose a new branching algorithm that takes advantage of the structural properties of the $k$-defective clique and uses the efficient maximum clique algorithm as a subroutine. As a result, the algorithm has a better asymptotic running time than the existing ones. We also investigate upper-bounding techniques and propose a new upper bound utilizing the \textit{conflict relationship} between vertex pairs. Because conflict relationship is common in many graph problems, we believe that this technique can be potentially generalized. Finally, experiments show that our algorithm outperforms state-of-the-art solvers on a wide range of open benchmarks.

Are score function estimators an underestimated approach to learning with $k$-subset sampling? Sampling $k$-subsets is a fundamental operation in many machine learning tasks that is not amenable to differentiable parametrization, impeding gradient-based optimization. Prior work has focused on relaxed sampling or pathwise gradient estimators. Inspired by the success of score function estimators in variational inference and reinforcement learning, we revisit them within the context of $k$-subset sampling. Specifically, we demonstrate how to efficiently compute the $k$-subset distribution's score function using a discrete Fourier transform, and reduce the estimator's variance with control variates. The resulting estimator provides both exact samples and unbiased gradient estimates while also applying to non-differentiable downstream models, unlike existing methods. Experiments in feature selection show results competitive with current methods, despite weaker assumptions.

The coalgebraic $\mu$-calculus provides a generic semantic framework for fixpoint logics over systems whose branching type goes beyond the standard relational setup, e.g. probabilistic, weighted, or game-based. Previous work on the coalgebraic $\mu$-calculus includes an exponential-time upper bound on satisfiability checking, which however relies on the availability of tableau rules for the next-step modalities that are sufficiently well-behaved in a formally defined sense; in particular, rule matches need to be representable by polynomial-sized codes, and the sequent duals of the rules need to absorb cut. While such rule sets have been identified for some important cases, they are not known to exist in all cases of interest, in particular ones involving either integer weights as in the graded $\mu$-calculus, or real-valued weights in combination with non-linear arithmetic. In the present work, we prove the same upper complexity bound under more general assumptions, specifically regarding the complexity of the (much simpler) satisfiability problem for the underlying one-step logic, roughly described as the nesting-free next-step fragment of the logic. The bound is realized by a generic global caching algorithm that supports on-the-fly satisfiability checking. Notably, our approach directly accommodates unguarded formulae, and thus avoids use of the guardedness transformation. Example applications include new exponential-time upper bounds for satisfiability checking in an extension of the graded $\mu$-calculus with polynomial inequalities (including positive Presburger arithmetic), as well as an extension of the (two-valued) probabilistic $\mu$-calculus with polynomial inequalities.

For the outlier problem in linear regression models, the Student-$t$ linear regression model is one of the common methods for robust modeling and is widely adopted in the literature. However, most of them applies it without careful theoretical consideration. This study provides the practically useful and quite simple conditions to ensure that the Student-$t$ linear regression model is robust against an outlier in the $y$-direction using regular variation theory.

We noisily observe solutions of an ordinary differential equation $\dot u = f(u)$ at given times, where $u$ lives in a $d$-dimensional state space. The model function $f$ is unknown and belongs to a H\"older-type smoothness class with parameter $\beta$. For the nonparametric problem of estimating $f$, we provide lower bounds on the error in two complementary model specifications: the snake model with few, long observed solutions and the stubble model with many short ones. The lower bounds are minimax optimal in some settings. They depend on various parameters, which in the optimal asymptotic regime leads to the same rate for the squared error in both models: it is characterized by the exponent $-2\beta/(2(\beta+1)+d)$ for the total number of observations $n$. To derive these results, we establish a master theorem for lower bounds in general nonparametric regression problems, which makes the proofs more comparable and seems to be a useful tool for future use.

北京阿比特科技有限公司