Voice-controlled smart speaker devices have gained a foothold in many modern households. Their prevalence combined with their intrusion into core private spheres of life has motivated research on security and privacy intrusions, especially those performed by third-party applications used on such devices. In this work, we take a closer look at such third-party applications from a less pessimistic angle: we consider their potential to provide personalized and secure capabilities and investigate measures to authenticate users (``PIN'', ``Voice authentication'', ``Notification'', and presence of ``Nearby devices''). To this end, we asked 100 participants to evaluate 15 application categories and 51 apps with a wide range of functions. The central questions we explored focused on: users' preferences for security and personalization for different categories of apps; the preferred security and personalization measures for different apps; and the preferred frequency of the respective measure. After an initial pilot study, we focused primarily on 7 categories of apps for which security and personalization are reported to be important; those include the three crucial categories finance, bills, and shopping. We found that ``Voice authentication'', while not currently employed by the apps we studied, is a highly popular measure to achieve security and personalization. Many participants were open to exploring combinations of security measures to increase the protection of highly relevant apps. Here, the combination of ``PIN'' and ``Voice authentication'' was clearly the most desired one. This finding indicates systems that seamlessly combine ``Voice authentication'' with other measures might be a good candidate for future work.
Non-Fungible Tokens (NFTs) represent deeds of ownership, based on blockchain technologies and smart contracts, of unique crypto assets on digital art forms (e.g., artworks or collectibles). In the spotlight after skyrocketing in 2021, NFTs have attracted the attention of crypto enthusiasts and investors intent on placing promising investments in this profitable market. However, the NFT financial performance prediction has not been widely explored to date. In this work, we address the above problem based on the hypothesis that NFT images and their textual descriptions are essential proxies to predict the NFT selling prices. To this purpose, we propose MERLIN, a novel multimodal deep learning framework designed to train Transformer-based language and visual models, along with graph neural network models, on collections of NFTs' images and texts. A key aspect in MERLIN is its independence on financial features, as it exploits only the primary data a user interested in NFT trading would like to deal with, i.e., NFT images and textual descriptions. By learning dense representations of such data, a price-category classification task is performed by MERLIN models, which can also be tuned according to user preferences in the inference phase to mimic different risk-return investment profiles. Experimental evaluation on a publicly available dataset has shown that MERLIN models achieve significant performances according to several financial assessment criteria, fostering profitable investments, and also beating baseline machine-learning classifiers based on financial features.
Considerable scientific work involves locating, analyzing, systematizing, and synthesizing other publications. Its results end up in a paper's "background" section or in standalone articles, which include meta-analyses and systematic literature reviews. The required research is aided through the use of online scientific publication databases and search engines, such as Web of Science, Scopus, and Google Scholar. However, use of online databases suffers from a lack of repeatability and transparency, as well as from technical restrictions. Thankfully, open data, powerful personal computers, and open source software now make it possible to run sophisticated publication studies on the desktop in a self-contained environment that peers can readily reproduce. Here we report a Python software package and an associated command-line tool that can populate embedded relational databases with slices from the complete set of Crossref publication metadata, ORCID author records, and other open data sets, for in-depth processing through performant queries. We demonstrate the software's utility by analyzing the underlying dataset's contents, by visualizing the evolution of publications in diverse scientific fields and relationships among them, by outlining scientometric facts associated with COVID-19 research, and by replicating commonly-used bibliometric measures of productivity, impact, and disruption.
Serverless computing is a popular cloud computing paradigm that frees developers from server management. Function-as-a-Service (FaaS) is the most popular implementation of serverless computing, representing applications as event-driven and stateless functions. However, existing studies report that functions of FaaS applications severely suffer from cold-start latency. In this paper, we propose an approach namely FaaSLight to accelerating the cold start for FaaS applications through application-level optimization. We first conduct a measurement study to investigate the possible root cause of the cold start problem of FaaS. The result shows that application code loading latency is a significant overhead. Therefore, loading only indispensable code from FaaS applications can be an adequate solution. Based on this insight, we identify code related to application functionalities by constructing the function-level call graph, and separate other code (i.e., optional code) from FaaS applications. The separated optional code can be loaded on demand to avoid the inaccurate identification of indispensable code causing application failure. In particular, a key principle guiding the design of FaaSLight is inherently general, i.e., platform- and language-agnostic. The evaluation results on real-world FaaS applications show that FaaSLight can significantly reduce the code loading latency (up to 78.95%, 28.78% on average), thereby reducing the cold-start latency. As a result, the total response latency of functions can be decreased by up to 42.05% (19.21% on average). Compared with the state-of-the-art, FaaSLight achieves a 21.25X improvement in reducing the average total response latency.
The report provides an intricate analysis of cyber security defined in contemporary operational digital environments. An extensive literature review is formed to determine how the construct is reviewed in modern scholarly contexts. The article seeks to offer a comprehensive definition of the term "cybersecurity" to accentuate its multidisciplinary perspectives. A meaningful concise, and inclusive dimension will be provided to assist in designing scholarly discourse on the subject. The report will offer a unified framework for examining activities that constitute the concept resulting in a new definition; "Cybersecurity is the collection and concerting of resources including personnel and infrastructure, structures, and processes to protect networks and cyber-enabled computer systems from events that compromise the integrity and interfere with property rights, resulting in some extent of the loss." The encapsulation of the interdisciplinary domains will be critical in improving understanding and response to emerging challenges in cyberspace.
We propose a new approach to portfolio optimization that utilizes a unique combination of synthetic data generation and a CVaR-constraint. We formulate the portfolio optimization problem as an asset allocation problem in which each asset class is accessed through a passive (index) fund. The asset-class weights are determined by solving an optimization problem which includes a CVaR-constraint. The optimization is carried out by means of a Modified CTGAN algorithm which incorporates features (contextual information) and is used to generate synthetic return scenarios, which, in turn, are fed into the optimization engine. For contextual information we rely on several points along the U.S. Treasury yield curve. The merits of this approach are demonstrated with an example based on ten asset classes (covering stocks, bonds, and commodities) over a fourteen-and-half year period (January 2008-June 2022). We also show that the synthetic generation process is able to capture well the key characteristics of the original data, and the optimization scheme results in portfolios that exhibit satisfactory out-of-sample performance. We also show that this approach outperforms the conventional equal-weights (1/N) asset allocation strategy and other optimization formulations based on historical data only.
Search engines decide what we see for a given search query. Since many people are exposed to information through search engines, it is fair to expect that search engines are neutral. However, search engine results do not necessarily cover all the viewpoints of a search query topic, and they can be biased towards a specific view since search engine results are returned based on relevance, which is calculated using many features and sophisticated algorithms where search neutrality is not necessarily the focal point. Therefore, it is important to evaluate the search engine results with respect to bias. In this work we propose novel web search bias evaluation measures which take into account the rank and relevance. We also propose a framework to evaluate web search bias using the proposed measures and test our framework on two popular search engines based on 57 controversial query topics such as abortion, medical marijuana, and gay marriage. We measure the stance bias (in support or against), as well as the ideological bias (conservative or liberal). We observe that the stance does not necessarily correlate with the ideological leaning, e.g. a positive stance on abortion indicates a liberal leaning but a positive stance on Cuba embargo indicates a conservative leaning. Our experiments show that neither of the search engines suffers from stance bias. However, both search engines suffer from ideological bias, both favouring one ideological leaning to the other, which is more significant from the perspective of polarisation in our society.
Novel AI-based code-writing Large Language Models (LLMs) such as OpenAI's Codex have demonstrated capabilities in many coding-adjacent domains. In this work we consider how LLMs maybe leveraged to automatically repair security relevant bugs present in hardware designs. We focus on bug repair in code written in the Hardware Description Language Verilog. For this study we build a corpus of domain-representative hardware security bugs. We then design and implement a framework to quantitatively evaluate the performance of any LLM tasked with fixing the specified bugs. The framework supports design space exploration of prompts (i.e., prompt engineering) and identifying the best parameters for the LLM. We show that an ensemble of LLMs can repair all ten of our benchmarks. This ensemble outperforms the state-of-the-art Cirfix hardware bug repair tool on its own suite of bugs. These results show that LLMs can repair hardware security bugs and the framework is an important step towards the ultimate goal of an automated end-to-end bug repair framework.
Autonomous vehicles are expected to operate safely in real-life road conditions in the next years. Nevertheless, unanticipated events such as the existence of unexpected objects in the range of the road, can put safety at risk. The advancement of sensing and communication technologies and Internet of Things may facilitate the recognition of hazardous situations and information exchange in a cooperative driving scheme, providing new opportunities for the increase of collaborative situational awareness. Safe and unobtrusive visualization of the obtained information may nowadays be enabled through the adoption of novel Augmented Reality (AR) interfaces in the form of windshields. Motivated by these technological opportunities, we propose in this work a saliency-based distributed, cooperative obstacle detection and rendering scheme for increasing the driver's situational awareness through (i) automated obstacle detection, (ii) AR visualization and (iii) information sharing (upcoming potential dangers) with other connected vehicles or road infrastructure. An extensive evaluation study using a variety of real datasets for pothole detection showed that the proposed method provides favorable results and features compared to other recent and relevant approaches.
Few-shot learning (FSL) has emerged as an effective learning method and shows great potential. Despite the recent creative works in tackling FSL tasks, learning valid information rapidly from just a few or even zero samples still remains a serious challenge. In this context, we extensively investigated 200+ latest papers on FSL published in the past three years, aiming to present a timely and comprehensive overview of the most recent advances in FSL along with impartial comparisons of the strengths and weaknesses of the existing works. For the sake of avoiding conceptual confusion, we first elaborate and compare a set of similar concepts including few-shot learning, transfer learning, and meta-learning. Furthermore, we propose a novel taxonomy to classify the existing work according to the level of abstraction of knowledge in accordance with the challenges of FSL. To enrich this survey, in each subsection we provide in-depth analysis and insightful discussion about recent advances on these topics. Moreover, taking computer vision as an example, we highlight the important application of FSL, covering various research hotspots. Finally, we conclude the survey with unique insights into the technology evolution trends together with potential future research opportunities in the hope of providing guidance to follow-up research.
The world population is anticipated to increase by close to 2 billion by 2050 causing a rapid escalation of food demand. A recent projection shows that the world is lagging behind accomplishing the "Zero Hunger" goal, in spite of some advancements. Socio-economic and well being fallout will affect the food security. Vulnerable groups of people will suffer malnutrition. To cater to the needs of the increasing population, the agricultural industry needs to be modernized, become smart, and automated. Traditional agriculture can be remade to efficient, sustainable, eco-friendly smart agriculture by adopting existing technologies. In this survey paper the authors present the applications, technological trends, available datasets, networking options, and challenges in smart agriculture. How Agro Cyber Physical Systems are built upon the Internet-of-Agro-Things is discussed through various application fields. Agriculture 4.0 is also discussed as a whole. We focus on the technologies, such as Artificial Intelligence (AI) and Machine Learning (ML) which support the automation, along with the Distributed Ledger Technology (DLT) which provides data integrity and security. After an in-depth study of different architectures, we also present a smart agriculture framework which relies on the location of data processing. We have divided open research problems of smart agriculture as future research work in two groups - from a technological perspective and from a networking perspective. AI, ML, the blockchain as a DLT, and Physical Unclonable Functions (PUF) based hardware security fall under the technology group, whereas any network related attacks, fake data injection and similar threats fall under the network research problem group.