This paper proposes a novel approach for identifying software bugs that builds on a meaningful combination of word embeddings, graph-based text representations and graph attention networks. Existing approaches aim to advance each of the above components individually, without considering an integrative approach; as a result, they ignore information related either to the structure of a given text or to its individual words. Instead, our approach seamlessly incorporates both semantic and structural characteristics into a graph, which is then fed to a graph attention network in order to classify GitHub issues as bugs or features. Our experimental results demonstrate a significant improvement in terms of accuracy, precision and recall compared to a list of classical and graph-based machine learning models. The dataset for the experiments reported in this paper has been retrieved from the kaggle.com platform and concerns GitHub issues with short-text attributes.
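To make the described pipeline more concrete, the following is a minimal sketch, not the authors' implementation, of classifying issue graphs with a graph attention network in PyTorch Geometric; node features are assumed to be word embeddings, and all class and parameter names are illustrative assumptions.

```python
# Sketch only: a GAT that pools the nodes of each issue graph and outputs
# "bug" vs. "feature" logits. Assumes node features (e.g., word embeddings),
# an edge_index built from the graph-based text representation, and a batch
# vector mapping nodes to issue graphs are prepared elsewhere.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv, global_mean_pool

class IssueGAT(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim=64, num_classes=2, heads=4):
        super().__init__()
        self.gat1 = GATConv(in_dim, hidden_dim, heads=heads)
        self.gat2 = GATConv(hidden_dim * heads, hidden_dim, heads=1)
        self.classifier = torch.nn.Linear(hidden_dim, num_classes)

    def forward(self, x, edge_index, batch):
        x = F.elu(self.gat1(x, edge_index))
        x = F.elu(self.gat2(x, edge_index))
        return self.classifier(global_mean_pool(x, batch))  # bug/feature logits
```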
Conference
Predicting prices of Airbnb listings via Graph Neural Networks and Document Embeddings: The case of the island of Santorini
Nikos Kanakaris and Nikos Karacapilidis
International Conference on ENTERprise Information Systems (CENTERIS) (to appear) 2022
We propose a new approach for predicting prices of Airbnb listings in touristic destinations such as the island of Santorini using graph neural networks and document embeddings. Existing methods rely only on the features of each individual listing, ignoring any topological or neighborhood properties. Our approach represents the listings of a given area as a graph, where each node corresponds to a listing and each edge connects two similar neighboring listings. This enables us not only to exploit the features of each individual listing, but also to take into consideration information related to its neighborhood. Our preliminary experiments demonstrate that the proposed approach outperforms a list of classical regression models as far as the coefficient of determination (R2) is concerned, and decreases the Mean Squared Error (MSE). The data for the experiments reported in this paper have been retrieved from the insideairbnb.com platform and describe the Airbnb listings of the island of Santorini.
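As a rough illustration of the idea rather than the paper's actual code, one might connect each listing to its most similar neighbours and regress prices with a small graph neural network; the feature matrix, the value of k and the model sizes below are assumptions.

```python
# Sketch only: build a k-nearest-neighbour similarity graph over listing
# features and predict a price per listing (node) with a two-layer GCN.
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.neighbors import kneighbors_graph
from torch_geometric.nn import GCNConv

def build_listing_graph(features: np.ndarray, k: int = 5) -> torch.Tensor:
    adj = kneighbors_graph(features, n_neighbors=k, mode="connectivity")
    row, col = adj.nonzero()
    return torch.tensor(np.vstack([row, col]), dtype=torch.long)  # edge_index

class PriceGNN(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim=32):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, 1)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index).squeeze(-1)  # predicted price per node
```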
Book Chapter
A comparative survey of graph databases and software for social network analytics: The link prediction perspective
Nikos Kanakaris, Dimitrios Michail, and Iraklis Varlamis
Book chapter for Graph Databases and their use in social media and smart cities, Science Publishers and CRC Press, Taylor and Francis Group (to appear) 2022
In recent years, we have witnessed an excessive increase in the amounts of data available on the Web. These data originate mostly from social media applications or social networks and are thus highly connected. Graph databases are capable of managing such data successfully, since they are specifically designed for storing, retrieving, and searching data that is rich in relationships. This chapter aims to provide a detailed literature review of the existing graph databases and software libraries suitable for performing common social network analytics tasks. In addition, a classification of these graph technologies is proposed, taking into consideration (i) the provided means of storing, importing, exporting, and querying data, (ii) the available algorithms, (iii) the ability to deal with big social graphs, and (iv) the CPU and memory usage of each of the reported technologies.
Journal
Detection of fake news campaigns using graph convolutional networks
Dimitrios Michail, Nikos Kanakaris, and Iraklis Varlamis
International Journal of Information Management Data Insights, Elsevier 2022
The detection of organised disinformation campaigns that spread fake news by first camouflaging it as real is crucial in the battle against misinformation and disinformation in social media. This article presents a method for classifying the diffusion graphs of news formed in social media, by taking into account the profiles of the users that participate in the graph, the profiles of their social relations and the way the news spreads, while ignoring the actual text content of the news or the messages that spread it. This increases the robustness of the method and widens its applicability in different contexts. The results of this study show that the proposed method outperforms methods that rely on textual information only and provides a model that can be employed for detecting similar disinformation campaigns in different contexts in the same social medium.
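A minimal sketch of this setting, under the assumption that each news item yields one diffusion graph whose node features are user-profile vectors, could look as follows; the model and its dimensions are illustrative, not the article's implementation.

```python
# Sketch only: classify a whole diffusion graph (fake-news campaign vs. real)
# from user-profile node features and the spread structure, with no text input.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class DiffusionGCN(torch.nn.Module):
    def __init__(self, profile_dim, hidden_dim=64, num_classes=2):
        super().__init__()
        self.conv1 = GCNConv(profile_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)
        self.out = torch.nn.Linear(hidden_dim, num_classes)

    def forward(self, x, edge_index, batch):
        x = F.relu(self.conv1(x, edge_index))
        x = F.relu(self.conv2(x, edge_index))
        return self.out(global_mean_pool(x, batch))  # campaign vs. genuine logits
```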
Journal
Making personnel selection smarter through word embeddings: A graph-based approach
Nikos Kanakaris, Nikolaos Giarelis, Ilias Siachos, and Nikos Karacapilidis
This paper employs techniques and algorithms from the fields of natural language processing, graph representation learning and word embeddings to assist project managers in the task of personnel selection. To do so, our approach initially represents multiple textual documents as a single graph. Then, it computes word embeddings through representation learning on graphs and performs feature selection. Finally, it builds a classification model that is able to estimate how qualified a candidate employee is to work on a given task, taking as input only the descriptions of the tasks and a list of word embeddings. Our approach differs from existing ones in that it does not require the calculation of key performance indicators or any other form of structured data in order to operate properly. For our experiments, we retrieved data from the Jira issue tracking system of the Apache Software Foundation. The evaluation results show, in most cases, an increase of 0.43% in the accuracy of the proposed classification models when compared against a widely adopted baseline method, while their validation loss is significantly decreased, by 65.54%.
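The embedding step can be sketched roughly as follows, assuming the multi-document graph is available as a networkx graph; the random-walk plus Word2Vec recipe and the averaging of embeddings per task description are illustrative simplifications, not the paper's exact method.

```python
# Sketch only: learn word embeddings via random walks over a word graph,
# then average them per task description and fit a scikit-learn classifier.
import random
import numpy as np
import networkx as nx
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

def random_walks(graph: nx.Graph, walks_per_node: int = 10, walk_length: int = 20):
    walks = []
    for node in graph.nodes():
        for _ in range(walks_per_node):
            walk = [node]
            for _ in range(walk_length - 1):
                neighbours = list(graph.neighbors(walk[-1]))
                if not neighbours:
                    break
                walk.append(random.choice(neighbours))
            walks.append([str(n) for n in walk])
    return walks

def embed_words(graph: nx.Graph, dim: int = 64):
    return Word2Vec(random_walks(graph), vector_size=dim, window=5, min_count=1).wv

def task_vector(tokens, wv, dim: int = 64):
    vectors = [wv[t] for t in tokens if t in wv]
    return np.mean(vectors, axis=0) if vectors else np.zeros(dim)

# clf = LogisticRegression().fit([task_vector(t, wv) for t in task_tokens], labels)
```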
Book Chapter
Medical Knowledge Graphs in the Discovery of Future Research Collaborations
Nikolaos Giarelis, Nikos Kanakaris, and Nikos Karacapilidis
This chapter introduces a framework that is based on a novel graph-based text representation method and combines graph-based feature selection, text categorization and link prediction to advance the discovery of future research collaborations. Our approach integrates into a single knowledge graph both structured and unstructured textual data through a novel representation of multiple scientific documents. The Neo4j graph database is used for the representation of the proposed scientific knowledge graph. For the implementation of our approach, we use the Python programming language and the scikit-learn machine learning library. We assess our approach against classical link prediction algorithms using accuracy, recall and precision as our performance metrics. Our experiments achieve state-of-the-art accuracy in the task of predicting future research collaborations. The experiments reported in this chapter use the COVID-19 Open Research Dataset.
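As an illustration of how part of such a knowledge graph might be loaded into Neo4j with the official Python driver (5.x API assumed), consider the following sketch; the node labels, relationship types and properties are hypothetical, not the chapter's actual schema.

```python
# Sketch only: store papers, authors and abstract words as nodes, connected
# by WRITES and INCLUDES relationships (hypothetical schema).
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def load_paper(tx, title, authors, words):
    tx.run("MERGE (p:Paper {title: $title})", title=title)
    for name in authors:
        tx.run(
            "MERGE (a:Author {name: $name}) "
            "WITH a MATCH (p:Paper {title: $title}) MERGE (a)-[:WRITES]->(p)",
            name=name, title=title,
        )
    for word in words:
        tx.run(
            "MERGE (w:Word {value: $value}) "
            "WITH w MATCH (p:Paper {title: $title}) MERGE (p)-[:INCLUDES]->(w)",
            value=word, title=title,
        )

with driver.session() as session:
    session.execute_write(load_paper, "A sample paper", ["A. Author"], ["graph", "collaboration"])
```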
2021
Journal
Converting Biomedical Text Annotated Resources into FAIR Research Objects with an Open Science Platform
Alexandros Kanterakis, Nikos Kanakaris, Manos Koutoulakis, Konstantina Pitianou, Nikos Karacapilidis, Lefteris Koumakis, and George Potamias
Today, there are excellent resources for the semantic annotation of biomedical text. These resources range from ontologies and NLP tools to annotators and web services. Most of them are available either as open-source components (e.g., MetaMap) or as web services that offer free access (e.g., Whatizit). In order to use these resources in automatic text annotation pipelines, researchers face significant technical challenges. For open-source tools, the challenges include setting up the computational environment, resolving dependencies, and compiling and installing the software. For web services, the challenge is implementing clients that communicate with the respective web APIs. Even resources that are available as Docker containers (e.g., the NCBO annotator) require significant technical skills for installation and setup. This work deals with the task of creating ready-to-install and ready-to-run Research Objects (ROs) for a large collection of components in biomedical text analysis. These components include (a) tools such as cTAKES, NOBLE Coder, MetaMap, NCBO annotator, BeCAS, and Neji; (b) ontologies from BioPortal, NCBI BioSystems, and Open Biomedical Ontologies; and (c) text corpora such as BC4GO, the Mantra Gold Standard Corpus, and the COVID-19 Open Research Dataset. We make these resources available in OpenBio.eu, an open-science RO repository and workflow management system. All ROs can be searched, shared, edited, downloaded, commented on, and rated. We also demonstrate how one can easily connect these ROs to form a large variety of text annotation pipelines.
Journal
Shall I Work with Them? A Knowledge Graph-Based Approach for Predicting Future Research Collaborations
Nikos Kanakaris, Nikolaos Giarelis, Ilias Siachos, and Nikos Karacapilidis
We consider the prediction of future research collaborations as a link prediction problem applied on a scientific knowledge graph. To the best of our knowledge, this is the first work on the prediction of future research collaborations that combines structural and textual information of a scientific knowledge graph through a purposeful integration of graph algorithms and natural language processing techniques. Our work: (i) investigates whether the integration of unstructured textual data into a single knowledge graph affects the performance of a link prediction model, (ii) studies the effect of previously proposed graph kernel-based approaches on the performance of an ML model, as far as the link prediction problem is concerned, and (iii) proposes a three-phase pipeline that enables the exploitation of structural and textual information, as well as of pre-trained word embeddings. We benchmark the proposed approach against classical link prediction algorithms using accuracy, recall, and precision as our performance metrics. Finally, we empirically test our approach through various feature combinations with respect to the link prediction problem. Our experiments with the new COVID-19 Open Research Dataset demonstrate a significant improvement in the abovementioned performance metrics in the prediction of future research collaborations.
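The combination of structural and textual information can be sketched as follows for a single candidate author pair; the chosen structural features and the classifier are assumptions for illustration, not the paper's exact three-phase pipeline.

```python
# Sketch only: concatenate classical link-prediction features of an author
# pair with the cosine similarity of their text embeddings, then train a
# scikit-learn classifier on labelled pairs.
import networkx as nx
import numpy as np
from sklearn.linear_model import LogisticRegression

def pair_features(graph: nx.Graph, u, v, emb: dict) -> list:
    common = len(list(nx.common_neighbors(graph, u, v)))
    pref_attach = graph.degree(u) * graph.degree(v)
    jaccard = next(nx.jaccard_coefficient(graph, [(u, v)]))[2]
    cosine = float(np.dot(emb[u], emb[v]) /
                   (np.linalg.norm(emb[u]) * np.linalg.norm(emb[v]) + 1e-9))
    return [common, pref_attach, jaccard, cosine]

# X = [pair_features(g, u, v, embeddings) for (u, v) in candidate_pairs]
# y = [1 if (u, v) in future_collaborations else 0 for (u, v) in candidate_pairs]
# model = LogisticRegression().fit(X, y)
```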
Conference
A Comparative Assessment of State-Of-The-Art Methods for Multilingual Unsupervised Keyphrase Extraction
Nikolaos Giarelis, Nikos Kanakaris, and Nikos Karacapilidis
In Artificial Intelligence Applications and Innovations 2021
Keyphrase extraction is a fundamental task in information management, which is often used as a preliminary step in various information retrieval and natural language processing tasks. The main contribution of this paper lies in providing a comparative assessment of prominent multilingual unsupervised keyphrase extraction methods that build on statistical (RAKE, YAKE), graph-based (TextRank, SingleRank) and deep learning (KeyBERT) techniques. For the experiments reported in this paper, we employ well-known datasets designed for keyphrase extraction from five different natural languages (English, French, Spanish, Portuguese and Polish). We use the F1 score and a partial-match evaluation framework, aiming to investigate whether the number of terms of the documents and the language of each dataset affect the accuracy of the selected methods. Our experimental results reveal a set of insights about the suitability of the selected methods for texts of different sizes, as well as about the performance of these methods on datasets of different languages.
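For readers who want to reproduce the general setup, three of the compared method families can be invoked as in the sketch below (library APIs as commonly documented; this is not the paper's evaluation harness and omits the partial-match F1 computation).

```python
# Sketch only: run YAKE (statistical), RAKE (statistical) and KeyBERT (deep
# learning) on the same document and collect their top keyphrases.
import yake
from rake_nltk import Rake
from keybert import KeyBERT

text = "Graph attention networks combine word embeddings with structural information."

yake_kws = [kw for kw, _ in yake.KeywordExtractor(lan="en", top=10).extract_keywords(text)]

rake = Rake()
rake.extract_keywords_from_text(text)
rake_kws = rake.get_ranked_phrases()[:10]

bert_kws = [kw for kw, _ in KeyBERT().extract_keywords(text, top_n=10)]

print(yake_kws, rake_kws, bert_kws, sep="\n")
```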
2020
Conference
On the Utilization of Structural and Textual Information of a Scientific Knowledge Graph to Discover Future Research Collaborations: A Link Prediction Perspective
Nikolaos Giarelis, Nikos Kanakaris, and Nikos Karacapilidis
We consider the discovery of future research collaborations as a link prediction problem applied on scientific knowledge graphs. Our approach integrates into a single knowledge graph both structured and unstructured textual data through a novel representation of multiple scientific documents. The Neo4j graph database is used for the representation of the proposed scientific knowledge graph. For the implementation of our approach, we use the Python programming language and the scikit-learn ML library. We benchmark our approach against classical link prediction algorithms using accuracy, recall, and precision as our performance metrics. Our initial experiments demonstrate a significant improvement in the accuracy of the future collaboration prediction task. The experiments reported in this paper use the new COVID-19 Open Research Dataset.
Conference
An Innovative Graph-Based Approach to Advance Feature Selection from Multiple Textual Documents
Nikolaos Giarelis, Nikos Kanakaris, and Nikos Karacapilidis
In Artificial Intelligence Applications and Innovations 2020
This paper introduces a novel graph-based approach to select features from multiple textual documents. The proposed solution enables the investigation of the importance of a term within a whole corpus of documents by utilizing contemporary graph theory methods, such as community detection algorithms and node centrality measures. Compared to well-tried existing solutions, evaluation results show that the proposed approach increases the accuracy of most of the text classifiers employed and decreases the number of features required to achieve ‘state-of-the-art’ accuracy. Well-known datasets used for the experiments reported in this paper include 20Newsgroups, LingSpam, Amazon Reviews and Reuters.
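A minimal sketch of this kind of graph-based feature selection, assuming a word co-occurrence graph is already built with networkx and using PageRank as the centrality measure, could look as follows; the per-community selection rule is an assumption for illustration.

```python
# Sketch only: detect communities of terms and keep the most central terms
# of each community as the feature set of a downstream text classifier.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def select_features(word_graph: nx.Graph, per_community: int = 20) -> list:
    centrality = nx.pagerank(word_graph)
    selected = []
    for community in greedy_modularity_communities(word_graph):
        ranked = sorted(community, key=lambda w: centrality[w], reverse=True)
        selected.extend(ranked[:per_community])
    return selected
```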
Conference
On a Novel Representation of Multiple Textual Documents in a Single Graph
Nikolaos Giarelis, Nikos Kanakaris, and Nikos Karacapilidis
This paper introduces a novel approach to represent multiple documents as a single graph, namely the graph-of-docs model, together with an associated novel algorithm for text categorization. The proposed approach enables the investigation of the importance of a term within a whole corpus of documents and supports the inclusion of relationship edges between documents, thus enabling the calculation of important document-level metrics. Compared to well-tried existing solutions, our initial experiments demonstrate a significant improvement in the accuracy of the text categorization process. For the experiments reported in this paper, we used a well-known dataset containing about 19,000 documents organized in various subjects.
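A simplified sketch of such a multi-document graph, with assumed edge semantics (word co-occurrence within a window, document-word inclusion, and document-document similarity via shared terms), is given below; it illustrates the general idea rather than the exact graph-of-docs construction.

```python
# Sketch only: build a single networkx graph over word and document nodes.
import itertools
import networkx as nx

def graph_of_docs(docs: dict, window: int = 3) -> nx.Graph:
    g = nx.Graph()
    for doc_id, tokens in docs.items():
        g.add_node(doc_id, kind="document")
        for i, token in enumerate(tokens):
            g.add_node(token, kind="word")
            g.add_edge(doc_id, token, kind="includes")
            for other in tokens[i + 1 : i + window]:
                g.add_edge(token, other, kind="connects")  # co-occurrence edge
    for d1, d2 in itertools.combinations(docs, 2):
        shared = set(docs[d1]) & set(docs[d2])
        if shared:
            g.add_edge(d1, d2, kind="is_similar", weight=len(shared))
    return g
```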
Conference
On the Exploitation of Textual Descriptions for a Better-informed Task Assignment Process
Nikos Kanakaris, Nikos Karacapilidis, and Georgios Kournetas
Project Management is a complex practice associated with a series of challenges, such as the handling of conflicts and dependencies in resource allocation, the fine-tuning of projects to avoid fragmented planning, the handling of potential opportunities or threats during the execution of a project, and the alignment between projects and business objectives. Traditionally, methods and tools to address these issues are based on analytical approaches developed in the realm of the Operations Research discipline. Aiming to facilitate and augment the quality of the Project Management practice, this paper proposes a hybrid approach that builds on the synergy between contemporary Machine Learning and Operations Research techniques. Based on past data, Machine Learning techniques can predict undesired situations, provide timely warnings and recommend preventive actions regarding problematic resource loads or deviations from business priority lists. The applicability of our approach is demonstrated through two real examples that elaborate on two different datasets. In these examples, we comment on the proper orchestration of the associated Operations Research and Machine Learning algorithms, paying equal attention to both optimization and big-data manipulation issues.
2019
Conference
Towards Reproducible Bioinformatics: The OpenBio-C Scientific Workflow Environment
A. Kanterakis, G. Iatraki, K. Pityanou, L. Koumakis, N. Kanakaris, N. Karacapilidis, and G. Potamias
In 2019 IEEE 19th International Conference on Bioinformatics and Bioengineering (BIBE), Oct 2019
Plant life all over the world is decreasing rapidly, while outbreaks of forest fires are increasing. As a result, many remote sensing programs and missions have been created with the goal of collecting data to map out burnt areas. NASA’s Landsat Program provides information that can be used for ’burn-scar’ mapping. The European Space Agency (ESA) also offers its services via the Copernicus programme, which is responsible for the satellite missions called ’Sentinels’. The National Observatory of Athens has played a significant role in mapping out burnt areas throughout Greek territory by developing systems that apply digital image processing algorithms and filters. The data used come mostly from the Landsat Program, whose satellite images are of high quality and large in size. The need to process a large number of such large images makes a sequential implementation of the system insufficient in terms of execution time. This thesis parallelizes the algorithms and filters of the ’burn-scar’ mapping system implemented by the National Observatory of Athens. The Python programming language and the Message Passing Interface (MPI) are used for the implementation. The parallelization decreases the total execution time from 14.7 minutes to less than 1 minute, using up to 33 quad-core computers at the laboratory of the Department of Informatics and Telematics.
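The described parallelisation can be sketched with mpi4py as follows; the file layout and the process_image function are hypothetical placeholders, not the actual system of the National Observatory of Athens.

```python
# Sketch only: the root rank lists the satellite images and scatters roughly
# equal shares to all ranks; each rank runs the filtering pipeline on its share.
from mpi4py import MPI
import glob

def process_image(path):
    ...  # placeholder for the burn-scar filters applied to a single image

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

if rank == 0:
    paths = sorted(glob.glob("images/*.tif"))
    chunks = [paths[i::size] for i in range(size)]
else:
    chunks = None

my_paths = comm.scatter(chunks, root=0)
results = [process_image(p) for p in my_paths]
all_results = comm.gather(results, root=0)  # collected at rank 0
```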