The persistent and switchable polarization of HfO$_2$-based ferroelectric compounds, together with their compatibility with large-scale integration, makes them attractive synaptic elements for neuromorphic computing. To achieve a record current density of 0.01 A/cm$^2$ (at a read voltage of 80 mV) as well as ideal memristive behavior (a linear current-voltage relation and analog resistive switching), devices based on an ultra-thin (2.7 nm thick), polycrystalline HfZrO$_4$ ferroelectric layer are fabricated by Atomic Layer Deposition. A semiconducting oxide interlayer (WO$_{x<3}$) at one of the interfaces induces an asymmetric energy profile upon ferroelectric polarization reversal and thus the long-term potentiation/depression (conductance increase/decrease) of interest. Moreover, it favors the stable retention of both the low and the high resistive states. Thanks to the low operating voltage (<3.5 V), programming requires less than 10$^{-12}$ J for 20 ns long pulses. Remarkably, the memristors show no wake-up or fatigue effects.
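As an illustrative back-of-the-envelope check (ours, not the authors'), the stated figures bound the average programming power and current per pulse:
\[ P = \frac{E}{t} < \frac{10^{-12}\,\mathrm{J}}{20\,\mathrm{ns}} = 50\,\mu\mathrm{W}, \qquad \bar{I} = \frac{P}{V} < \frac{50\,\mu\mathrm{W}}{3.5\,\mathrm{V}} \approx 14\,\mu\mathrm{A}. \]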
Out-of-distribution (OOD) detection discerns OOD data, on which the predictor cannot make valid predictions, from in-distribution (ID) data, thereby increasing the reliability of open-world classification. However, it is typically hard to collect real OOD data for training a predictor capable of discerning ID and OOD patterns. This obstacle gives rise to data generation-based learning methods, which synthesize OOD data via data generators for predictor training without requiring any real OOD data. Related methods typically pre-train a generator on ID data and adopt various selection procedures to find those generated data likely to be OOD cases. However, generated data may still coincide with ID semantics, i.e., mistaken OOD generation remains, confusing the predictor between ID and OOD data. To address this, we suggest that generated data (even with mistaken OOD generation) can be used to devise an auxiliary OOD detection task that facilitates real OOD detection. Specifically, we can ensure that learning from such an auxiliary task is beneficial if the ID and the OOD parts have disjoint supports, with the help of a well-designed training procedure for the predictor. Accordingly, we propose a powerful data generation-based learning method named Auxiliary Task-based OOD Learning (ATOL) that can relieve mistaken OOD generation. We conduct extensive experiments under various OOD detection setups, demonstrating the effectiveness of our method against its advanced counterparts.
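A minimal sketch of the general recipe, closer to generic outlier-exposure-style training than to ATOL's specific auxiliary-task construction: treat generated samples as auxiliary OOD data and push their scores above those of ID data. The score function (negative max logit) and the margin loss are illustrative assumptions.

import torch
import torch.nn.functional as F

def ood_step(predictor, id_batch, gen_batch, optimizer, margin=1.0):
    """One illustrative training step: ID inputs should receive low OOD scores,
    generator outputs (treated as auxiliary OOD data) high ones."""
    optimizer.zero_grad()
    id_score = -predictor(id_batch).max(dim=1).values   # low when confident on ID
    gen_score = -predictor(gen_batch).max(dim=1).values
    # Hinge loss: require the mean generated score to exceed the mean ID score
    # by `margin`, separating the two score distributions.
    loss = F.relu(margin + id_score.mean() - gen_score.mean())
    loss.backward()
    optimizer.step()
    return loss.item()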
We design and implement two single-pass semi-streaming algorithms for the maximum weight $k$-disjoint matching ($k$-DM) problem. Given an integer $k$, the $k$-DM problem is to find $k$ pairwise edge-disjoint matchings such that the sum of the weights of the matchings is maximized. For $k \geq 2$, this problem is NP-hard. Our first algorithm is based on the primal-dual framework of a linear programming relaxation of the problem and is $\frac{1}{3+\varepsilon}$-approximate. We also develop an approximation-preserving reduction from $k$-DM to the maximum weight $b$-matching problem. Leveraging this reduction and an existing semi-streaming $b$-matching algorithm, we design a $\frac{k}{(2+\varepsilon)(k+1)}$-approximate semi-streaming algorithm for $k$-DM. For any constant $\varepsilon > 0$, both of these algorithms require $O(nk \log_{1+\varepsilon}^2 n)$ bits of space. To the best of our knowledge, this is the first study of semi-streaming algorithms for the $k$-DM problem. We compare our two algorithms to state-of-the-art offline algorithms on 82 real-world and synthetic test problems. On the smaller instances, our streaming algorithms used significantly less memory (from 6$\times$ to 114$\times$ less) and ran faster than the offline algorithms. Our solutions were often within 5\% of the best weights from the offline algorithms. On a collection of six large graphs with a memory limit of 1 TB and with $k=8$, the offline algorithms terminated on only one graph (mycielskian20). The best offline algorithm on this instance required 640 GB of memory and 20 minutes to complete. In contrast, our slowest streaming algorithm for this instance took under four minutes and produced a matching that was 18\% better in weight, using only 1.4 GB of memory.
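The reduction-based algorithm suggests a simple offline mental model: pick a maximum-weight degree-bounded subgraph (a $b$-matching with $b(v)=k$), split its edges into matchings by edge coloring, and keep the $k$ heaviest classes. The sketch below (plain Python, our illustrative stand-in, not the paper's semi-streaming algorithm) uses greedy choices for both steps.

def k_disjoint_matchings(edges, k):
    """Simplified offline sketch: greedy degree-bounded subgraph (b-matching
    with b(v)=k), then greedy edge coloring, keeping the k heaviest classes.
    `edges` is a list of (weight, u, v) tuples."""
    deg, chosen = {}, []
    for w, u, v in sorted(edges, reverse=True):       # heaviest edges first
        if deg.get(u, 0) < k and deg.get(v, 0) < k:
            chosen.append((w, u, v))
            deg[u] = deg.get(u, 0) + 1
            deg[v] = deg.get(v, 0) + 1
    used, classes = {}, {}                            # vertex -> colors; color -> edges
    for w, u, v in chosen:
        c = 0                                         # first color free at both endpoints
        while c in used.get(u, set()) or c in used.get(v, set()):
            c += 1
        used.setdefault(u, set()).add(c)
        used.setdefault(v, set()).add(c)
        classes.setdefault(c, []).append((w, u, v))
    # Each color class is a matching; keep the k classes of largest total weight.
    return sorted(classes.values(),
                  key=lambda m: -sum(w for w, _, _ in m))[:k]

Greedy coloring of a graph with maximum degree $k$ uses at most $2k-1$ colors, and each color class is a matching by construction, so the returned lists are always $k$ pairwise edge-disjoint matchings.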
We consider the problem of sequential multiple hypothesis testing with nontrivial data collection costs. This problem appears, for example, when conducting biological experiments to identify differentially expressed genes of a disease process. This work builds on the generalized $\alpha$-investing framework, which enables control of the false discovery rate in a sequential testing setting. We present a theoretical analysis of the long-term asymptotic behavior of $\alpha$-wealth, which motivates a consideration of sample size in the $\alpha$-investing decision rule. Posing the testing process as a game with nature, we construct a decision rule that optimizes the expected $\alpha$-wealth reward (ERO) and provides an optimal sample size for each test. Empirical results show that a cost-aware ERO decision rule correctly rejects more false null hypotheses than other methods for $n=1$, where $n$ is the sample size. When the sample size is not fixed, cost-aware ERO uses a prior on the null hypothesis to adaptively allocate the sample budget to each test. We extend cost-aware ERO investing to finite-horizon testing, which enables the decision rule to allocate samples in a non-myopic manner. Finally, empirical tests on real data sets from biological experiments show that cost-aware ERO balances the allocation of samples to an individual test against the allocation of samples across multiple tests.
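For background, a minimal sketch of plain (non-generalized) $\alpha$-investing in the style of Foster and Stine; the fixed-fraction spending rule and payout value are illustrative choices, and the paper's cost-aware ERO rule replaces the spending rule with one that also optimizes the sample size of each test.

def alpha_investing(p_values, w0=0.05, payout=0.05, spend_frac=0.1):
    """Each test j is run at a level alpha_j funded by the current alpha-wealth;
    a rejection earns `payout`, a non-rejection costs alpha_j / (1 - alpha_j)."""
    wealth = w0
    decisions = []
    for p in p_values:
        if wealth <= 0:                  # wealth exhausted: no further tests funded
            decisions.append(False)
            continue
        alpha_j = spend_frac * wealth    # spending rule: fixed fraction of wealth
        reject = p <= alpha_j
        wealth += payout if reject else -alpha_j / (1.0 - alpha_j)
        decisions.append(reject)
    return decisions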
Standard probabilistic sparse coding assumes a Laplace prior, a linear mapping from latents to observables, and Gaussian observable distributions. Here we derive a solely entropy-based learning objective for the parameters of standard sparse coding. The novel variational objective has the following features: (A) unlike MAP approximations, it uses non-trivial posterior approximations for probabilistic inference; (B) unlike previous non-trivial approximations, the novel objective is fully analytical; and (C) the objective allows for a novel principled form of annealing. The objective is derived by first showing that the standard ELBO objective converges to a sum of entropies, which matches similar recent results for generative models with Gaussian priors. The conditions under which the ELBO becomes equal to entropies are then shown to have analytical solutions, which leads to the fully analytical objective. Numerical experiments are used to demonstrate the feasibility of learning with such entropy-based ELBOs. We investigate different posterior approximations, including Gaussians with correlated latents and deep amortized approximations. Furthermore, we numerically investigate entropy-based annealing, which results in improved learning. Our main contributions are theoretical, however, and they are twofold: (1) for non-trivial posterior approximations, we provide the (to our knowledge) first analytical ELBO objective for standard probabilistic sparse coding; and (2) we provide the first demonstration of how a recently shown convergence of the ELBO to entropy sums can be used for learning.
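For concreteness, the standard model referred to in the first sentence, written out with textbook definitions (unit-scale Laplace prior; the notation is ours, not the paper's):
\[ p_\Theta(\mathbf{z}) = \prod_{h=1}^{H} \tfrac{1}{2}\, e^{-|z_h|}, \qquad p_\Theta(\mathbf{x} \mid \mathbf{z}) = \mathcal{N}(\mathbf{x};\, W\mathbf{z},\, \sigma^2 \mathbf{I}), \]
and learning maximizes the ELBO
\[ \mathcal{L}(\Phi, \Theta) = \mathbb{E}_{q_\Phi(\mathbf{z} \mid \mathbf{x})}\!\big[\log p_\Theta(\mathbf{x} \mid \mathbf{z})\big] - D_{\mathrm{KL}}\!\big(q_\Phi(\mathbf{z} \mid \mathbf{x}) \,\|\, p_\Theta(\mathbf{z})\big), \]
whose convergence to a sum of entropies is the starting point of the derivation.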
While current NL2SQL methods built on foundation models have achieved commendable results, their direct application to natural language to graph query language (NL2GQL) tasks poses challenges due to the significant differences between GQL and SQL expressions and the numerous types of GQL. Our extensive experiments reveal that in NL2GQL tasks, larger foundation models demonstrate superior cross-schema generalization abilities, while smaller foundation models struggle to improve their GQL generation capabilities through fine-tuning. However, after fine-tuning, smaller models exhibit better intent comprehension and higher grammatical accuracy. Diverging from rule-based and slot-filling techniques, we introduce R3-NL2GQL, which employs both smaller and larger foundation models as reranker, rewriter, and refiner. The approach harnesses the comprehension ability of smaller models for information reranking and rewriting, and the exceptional generalization and generation capabilities of larger models to transform input natural language queries and code-structured schema into GQLs of any form. Recognizing the lack of established datasets in this nascent domain, we have created a bilingual dataset derived from graph database documentation and some open-source knowledge graphs (KGs). We tested our approach on this dataset, and the experimental results show that it delivers promising performance and robustness. Our code and dataset are available at https://github.com/zhiqix/NL2GQL
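A schematic of the three-stage division of labor described above; every function name and prompt below is a hypothetical stand-in, not the released R3-NL2GQL code.

def nl2gql(question, schema_docs, small_model, large_model):
    """Illustrative pipeline: small (fine-tuned) model reranks and rewrites,
    large model generates the final GQL query."""
    # 1. Rerank: the smaller model selects the schema fragments most
    #    relevant to the question.
    ranked = small_model.rerank(query=question, candidates=schema_docs)
    # 2. Rewrite: the smaller model normalizes the question using the
    #    schema's entity/relation terminology.
    rewritten = small_model.rewrite(question=question, context=ranked[:5])
    # 3. Refine/generate: the larger model turns the rewritten question plus
    #    schema into an executable GQL query.
    return large_model.generate(
        prompt=f"Schema:\n{ranked[:5]}\n\nQuestion: {rewritten}\n\nGQL:")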
We consider the problem of supply chain data visibility in a blockchain-enabled supply chain network. Existing methods typically record transactions happening in a supply chain on a single blockchain and are limited in their ability to deal with different levels of data visibility. To address this limitation, we present FoodFresh -- a multi-chain consortium where organizations store immutable data on their blockchains. A decentralized hub coordinates the cross-chain exchange of digital assets among the heterogeneous blockchains. Mechanisms for enabling blockchain interoperability help to preserve the benefits of independent sovereign blockchains while allowing for data sharing across blockchain boundaries.
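A minimal sketch of what hub-coordinated cross-chain exchange can look like; the record fields and verification flow are our assumptions, not FoodFresh's actual interfaces.

from dataclasses import dataclass

@dataclass(frozen=True)
class CrossChainTransfer:
    """Illustrative record a decentralized hub could relay between chains."""
    source_chain: str   # e.g., a supplier consortium's chain
    target_chain: str   # e.g., a retailer consortium's chain
    asset_id: str       # the digital asset being transferred
    proof: bytes        # inclusion proof issued by the source chain

def relay(transfer, verify_proof, commit):
    """Hub sketch: admit the asset onto the target chain only after the
    source-chain inclusion proof verifies; both callbacks are assumed."""
    if not verify_proof(transfer.source_chain, transfer.asset_id, transfer.proof):
        raise ValueError("invalid source-chain proof")
    commit(transfer.target_chain, transfer.asset_id)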
The spreading of prion proteins is at the basis of brain neurodegeneration. This paper deals with the numerical modelling of the misfolding process of $\alpha$-synuclein in Parkinson's disease. We introduce and analyze a discontinuous Galerkin method for the semi-discrete approximation of the Fisher-Kolmogorov (FK) equation, which can be employed to model the process. For space discretization we employ a discontinuous Galerkin method on polygonal and polyhedral grids (PolyDG), to accurately simulate the wavefronts typically observed in prionic spreading, and we prove stability and a priori error estimates. Next, we use a Crank-Nicolson scheme to advance in time. For the verification of our numerical model, we first consider a manufactured solution and then a case with wavefront propagation on two-dimensional polygonal grids. Next, we carry out a simulation of $\alpha$-synuclein spreading in a two-dimensional brain slice in the sagittal plane, with a polygonal agglomerated grid that takes full advantage of the flexibility of the PolyDG approximation. Finally, we present a simulation in a three-dimensional geometry reconstructed from magnetic resonance images of a patient's brain.
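For reference, the FK equation in its usual reaction-diffusion form (conventional notation, assumed rather than quoted from the paper), with $c$ the relative concentration of misfolded protein, $\mathbf{D}$ the diffusion tensor, and $\alpha$ the reaction rate:
\[ \frac{\partial c}{\partial t} = \nabla \cdot (\mathbf{D}\, \nabla c) + \alpha\, c\, (1 - c) \quad \text{in } \Omega \times (0, T]. \]
Writing the PolyDG semi-discrete problem as $\dot{\mathbf{c}} = F(\mathbf{c})$, the Crank-Nicolson step then reads
\[ \frac{\mathbf{c}^{n+1} - \mathbf{c}^{n}}{\Delta t} = \tfrac{1}{2}\big( F(\mathbf{c}^{n+1}) + F(\mathbf{c}^{n}) \big). \]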
We study the maximum $s,t$-flow oracle problem on planar directed graphs, where the goal is to design a data structure answering max $s,t$-flow value (or equivalently, min $s,t$-cut value) queries for arbitrary source-target pairs $(s,t)$. For the case of polynomially bounded integer edge capacities, we describe an exact max $s,t$-flow oracle with truly subquadratic space and preprocessing, and sublinear query time. Moreover, if $(1-\epsilon)$-approximate answers are acceptable, we obtain a static oracle with near-linear preprocessing and $\tilde{O}(n^{3/4})$ query time, and a dynamic oracle supporting edge capacity updates and queries in $\tilde{O}(n^{6/7})$ worst-case time. To the best of our knowledge, for directed planar graphs, no (approximate) max $s,t$-flow oracles have been described even in the unweighted case; only trivial tradeoffs, involving either no preprocessing or precomputing all the $n^2$ possible answers, have been known. One key technical tool we develop on the way is a sublinear (in the number of edges) algorithm for finding a negative cycle in so-called dense distance graphs. By plugging it into earlier frameworks, we obtain improved bounds for other fundamental problems on planar digraphs. In particular, we show: (1) a deterministic $O(n\log(nC))$ time algorithm for negatively-weighted SSSP in planar digraphs with integer edge weights at least $-C$, improving upon the previously known bounds in the important case of weights polynomial in $n$; and (2) an improved $O(n\log{n})$ bound on finding a perfect matching in a bipartite planar graph.
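For orientation, the classical negative-cycle routine that runs in $O(nm)$ time on general graphs is Bellman-Ford; a standard textbook sketch follows (ours, for contrast only; the paper's contribution is a routine sublinear in the number of edges, specialized to dense distance graphs, which this does not attempt to reproduce).

def find_negative_cycle(n, edges):
    """Classical Bellman-Ford negative-cycle detection.
    `edges` is a list of (u, v, w) with vertices 0..n-1; returns a negative
    cycle as a vertex list, or None if none exists."""
    dist = [0.0] * n                 # start from all vertices (virtual source)
    parent = [-1] * n
    x = -1
    for _ in range(n):               # a relaxation in the n-th pass proves a cycle
        x = -1
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                parent[v] = u
                x = v
    if x == -1:
        return None                  # no negative cycle
    for _ in range(n):               # step back n times to land on the cycle
        x = parent[x]
    cycle, cur = [x], parent[x]
    while cur != x:
        cycle.append(cur)
        cur = parent[cur]
    return cycle[::-1]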
When labeled training data is scarce, a promising data augmentation approach is to generate visual features of unknown classes using their attributes. To learn the class-conditional distribution of CNN features, these models rely on pairs of image features and class attributes; hence, they cannot make use of the abundance of unlabeled data samples. In this paper, we tackle any-shot learning problems, i.e., zero-shot and few-shot, in a unified feature generating framework that operates in both inductive and transductive learning settings. We develop a conditional generative model that combines the strengths of VAEs and GANs and, in addition, via an unconditional discriminator, learns the marginal feature distribution of unlabeled images. We empirically show that our model learns highly discriminative CNN features on five datasets, i.e., CUB, SUN, AWA and ImageNet, and establishes a new state of the art in any-shot learning, i.e., inductive and transductive (generalized) zero- and few-shot learning settings. We also demonstrate that our learned features are interpretable: we visualize them by inverting them back to the pixel space, and we explain them by generating textual arguments for why they are associated with a certain label.
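A schematic of the generator-side objective suggested by the abstract, combining a conditional VAE term, a conditional adversarial term, and an unconditional adversarial term; module interfaces and the unit weighting are illustrative assumptions, not the paper's exact model.

import torch
import torch.nn.functional as F

def generator_losses(enc, dec, disc_cond, disc_uncond, x_feat, attrs):
    """Generator-side objective for a conditional VAE-GAN feature generator
    with an additional unconditional discriminator (itself trained elsewhere
    against unlabeled real features)."""
    # VAE term: reconstruct labeled CNN features conditioned on class attributes.
    mu, logvar = enc(x_feat, attrs)
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
    recon = F.mse_loss(dec(z, attrs), x_feat)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    # Adversarial terms: fool the conditional discriminator given attributes,
    # and the unconditional one on the marginal feature distribution.
    x_gen = dec(torch.randn_like(mu), attrs)
    g_cond = -disc_cond(x_gen, attrs).mean()
    g_uncond = -disc_uncond(x_gen).mean()
    return recon + kl + g_cond + g_uncond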
We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.
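A sketch of the shared-span-representation idea: one span encoder feeds three task heads, so that entity, relation, and coreference losses all update the same span features. Sizes and head designs are illustrative, not SciIE's exact architecture.

import torch
import torch.nn as nn

class SharedSpanModel(nn.Module):
    """One shared span encoder, three task-specific heads."""
    def __init__(self, hidden=256, n_entity_types=6, n_relations=7):
        super().__init__()
        self.span_ffn = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())
        self.entity_head = nn.Linear(hidden, n_entity_types)
        self.relation_head = nn.Linear(2 * hidden, n_relations)
        self.coref_head = nn.Linear(2 * hidden, 1)

    def span_repr(self, token_states, start, end):
        # Shared representation: concatenate the boundary token states.
        return self.span_ffn(torch.cat([token_states[start],
                                        token_states[end]], dim=-1))

    def forward(self, token_states, spans):
        # token_states: (seq_len, hidden); spans: list of (start, end) pairs.
        reps = torch.stack([self.span_repr(token_states, s, e)
                            for s, e in spans])          # (num_spans, hidden)
        n = len(spans)
        pair = torch.cat([reps.unsqueeze(1).expand(-1, n, -1),
                          reps.unsqueeze(0).expand(n, -1, -1)], dim=-1)
        return (self.entity_head(reps),                  # per-span entity logits
                self.relation_head(pair),                # per-pair relation logits
                self.coref_head(pair).squeeze(-1))       # per-pair coref scores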