Publications – Smart Data Analytics

You can find a list of completed PhD theses here.

2022

Chaudhuri, Debanjan

Enriching Text-Based Human-Machine Interactions with Additional World Knowledge PhD Thesis

Rheinische Friedrich-Wilhelms-Universität Bonn, 2022.

Bader, Sebastian Richard

Semantic Digital Twins in the Industrial Internet of Things PhD Thesis

Rheinische Friedrich-Wilhelms-Universität Bonn, 2022.

Lange, Christoph; Langkau, Jörg; Bader, Sebastian R.

The IDS Information Model: A Semantic Vocabulary for Sovereign Data Exchange Book Section

In: Designing Data Spaces: The Ecosystem Approach to Competitive Advantage, pp. 111–127, Springer, 2022.

Rony, Md. Rashad Al Hasan; Kovriguina, Liubov; Chaudhuri, Debanjan; Usbeck, Ricardo; Lehmann, Jens

RoMe: A Robust Metric for Evaluating Natural Language Generation Proceedings Article

In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, pp. 5645–5657, Association for Computational Linguistics, 2022.

Draschner, Carsten Felix; Jabeen, Hajira; Lehmann, Jens

SimE4KG: Distributed and Explainable Multi-Modal Semantic Similarity Estimation for Knowledge Graphs Proceedings Article

In: 5th IEEE International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2022, Laguna Hills, CA, USA, September 19-21, 2022, pp. 1–8, IEEE, 2022.

Draschner, Carsten Felix; Jabeen, Hajira; Lehmann, Jens

Ethical and Sustainability Considerations for Knowledge Graph based Machine Learning Proceedings Article

In: 5th IEEE International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2022, Laguna Hills, CA, USA, September 19-21, 2022, pp. 53–60, IEEE, 2022.

Kacupaj, Endri; Singh, Kuldeep; Maleshkova, Maria; Lehmann, Jens

Contrastive Representation Learning for Conversational Question Answering over Knowledge Graphs Proceedings Article

In: Proceedings of the 31st ACM International Conference on Information & Knowledge Management, Atlanta, GA, USA, October 17-21, 2022, pp. 925–934, ACM, 2022.

Sun, Wenya; Grubenmann, Tobias; Cheng, Reynold; Kao, Ben; Ching, Wai-Ki

Modeling Long-Range Travelling Times with Big Railway Data Proceedings Article

In: Database Systems for Advanced Applications - 27th International Conference, DASFAA 2022, Virtual Event, April 11-14, 2022, Proceedings, Part III, pp. 443–454, Springer, 2022.

Vlad, Adriano; Vahdati, Sahar; Nayyeri, Mojtaba; Bellomarini, Luigi; Sallinger, Emanuel

Towards Hybrid Logic-based and Embedding-based Reasoning on Financial Knowledge Graphs Proceedings Article

In: Proceedings of the Workshops of the EDBT/ICDT 2022 Joint Conference, Edinburgh, UK, March 29, 2022, CEUR-WS.org, 2022.

Nayyeri, Mojtaba; Vahdati, Sahar; Khan, Md Tansen; Alam, Mirza Mohtashim; Wenige, Lisa; Behrend, Andreas; Lehmann, Jens

Dihedron Algebraic Embeddings for Spatio-Temporal Knowledge Graph Completion Proceedings Article

In: The Semantic Web - 19th International Conference, ESWC 2022, Hersonissos, Crete, Greece, May 29 - June 2, 2022, Proceedings, pp. 253–269, Springer, 2022.

Kraft, Angelie; Zorn, Hans-Peter; Fecht, Pascal; Simon, Judith; Biemann, Chris; Usbeck, Ricardo

Measuring Gender Bias in German Language Generation Proceedings Article

In: 52. Jahrestagung der Gesellschaft für Informatik, INFORMATIK 2022, Informatik in den Naturwissenschaften, 26. - 30. September 2022, Hamburg, pp. 1257–1274, Gesellschaft für Informatik, Bonn, 2022.

Knoblach, Judith; Acharya, Nikhil; Koranemkattil, Bhavya; Both, Andreas; Collarana, Diego

Combining Knowledge Graphs and Language Models to Answer Questions over Tables Proceedings Article

In: Proceedings of Poster and Demo Track and Workshop Track of the 18th International Conference on Semantic Systems co-located with 18th International Conference on Semantic Systems (SEMANTiCS 2022), Vienna, Austria, September 13th to 15th, 2022, CEUR-WS.org, 2022.

McTear, Michael F.; Jokinen, Kristiina; Dubey, Mohnish; Chollet, Gérard; Boudy, Jérôme; Lohr, Christophe; Roelen, Sonja-Dana; Mössing, Wanja; Wieching, Rainer

Empowering Well-Being Through Conversational Coaching for Active and Healthy Ageing Proceedings Article

In: Participative Urban Health and Healthy Aging in the Age of AI - 19th International Conference, ICOST 2022, Paris, France, June 27-30, 2022, Proceedings, pp. 257–265, Springer, 2022.

Reimann, Lars; Kniesel-Wünsche, Günter

Improving the Learnability of Machine Learning APIs by Semi-Automated API Wrapping Proceedings Article

In: 44th IEEE/ACM International Conference on Software Engineering: New Ideas and Emerging Results ICSE (NIER) 2022, Pittsburgh, PA, USA, May 22-24, 2022, pp. 46–50, IEEE/ACM, 2022.

Ali, Mehdi; Berrendorf, Max; Galkin, Mikhail; Thost, Veronika; Ma, Tengfei; Tresp, Volker; Lehmann, Jens

Improving Inductive Link Prediction Using Hyper-Relational Facts (Extended Abstract) Proceedings Article

In: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI 2022, Vienna, Austria, 23-29 July 2022, pp. 5259–5263, ijcai.org, 2022.

Rony, Md. Rashad Al Hasan; Zuo, Ying; Kovriguina, Liubov; Teucher, Roman; Lehmann, Jens

Climate Bot: A Machine Reading Comprehension System for Climate Change Question Answering Proceedings Article

In: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI 2022, Vienna, Austria, 23-29 July 2022, pp. 5249–5252, ijcai.org, 2022.

Kraft, Angelie; Usbeck, Ricardo

The Lifecycle of "Facts": A Survey of Social Bias in Knowledge Graphs Proceedings Article

In: Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, AACL/IJCNLP 2022 - Volume 1: Long Papers, Online Only, November 20-23, 2022, pp. 639–652, Association for Computational Linguistics, 2022.

Shoushtari, Hossein; Kassawat, Firas; Harder, Dorian; Venzke, Korvin; Müller-Lietzkow, Jörg; Sternberg, Harald

L5IN+: From an Analytical Platform to Optimization of Deep Inertial Odometry Proceedings Article

In: WiP Proceedings of the Twelfth International Conference on Indoor Positioning and Indoor Navigation - Work-in-Progress Papers (IPIN-WiP 2022) co-located with 12th International Conference on Indoor Positioning and Indoor Navigation (IPIN 2022), Beijing, China, 5 September - 7 September, 2022, CEUR-WS.org, 2022.

Andreadis, Stelios; Mavropoulos, Thanassis; Pantelidis, Nick; Vrochidis, Stefanos; Elias, Mirette; Papadopoulos, Charis; Gialampoukidis, Ilias; Kompatsiaris, Ioannis

SPARQL querying for validating the usage of automatically georeferenced social media data as human sensors for air quality Proceedings Article

In: 14th IEEE Image, Video, and Multidimensional Signal Processing Workshop, IVMSP 2022, Nafplio, Greece, June 26-29, 2022, pp. 1–5, IEEE, 2022.

Xiong, Bo; Zhu, Shichao; Nayyeri, Mojtaba; Xu, Chengjin; Pan, Shirui; Zhou, Chuan; Staab, Steffen

Ultrahyperbolic Knowledge Graph Embeddings Proceedings Article

In: KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14 - 18, 2022, pp. 2130–2139, ACM, 2022.

Perevalov, Aleksandr; Yan, Xi; Kovriguina, Liubov; Jiang, Longquan; Both, Andreas; Usbeck, Ricardo

Knowledge Graph Question Answering Leaderboard: A Community Resource to Prevent a Replication Crisis Proceedings Article

In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, LREC 2022, Marseille, France, 20-25 June 2022, pp. 2998–3007, European Language Resources Association, 2022.

Rony, Md. Rashad Al Hasan; Usbeck, Ricardo; Lehmann, Jens

DialoKG: Knowledge-Structure Aware Task-Oriented Dialogue Generation Proceedings Article

In: Findings of the Association for Computational Linguistics: NAACL 2022, Seattle, WA, United States, July 10-15, 2022, pp. 2557–2571, Association for Computational Linguistics, 2022.

Rula, Anisa; Calegari, Gloria Re; Azzini, Antonia; Bucci, Davide; Baroni, Ilaria; Celino, Irene

Eliciting and Curating Procedural Knowledge in Industry: Challenges and Opportunities Proceedings Article

In: Proceedings of the Third Conference on Digital Curation Technologies (Qurator 2022), Berlin, Germany, Sept. 19th-23rd, 2022, CEUR-WS.org, 2022.

Han, Xiaolin; Cheng, Reynold; Grubenmann, Tobias; Maniu, Silviu; Ma, Chenhao; Li, Xiaodong

Leveraging Contextual Graphs for Stochastic Weight Completion in Sparse Road Networks Proceedings Article

In: Proceedings of the 2022 SIAM International Conference on Data Mining, SDM 2022, Alexandria, VA, USA, April 28-30, 2022, pp. 64–72, SIAM, 2022.

Bagozi, Ada; Bianchini, Devis; Rula, Anisa

A Multi-Perspective Data Model for Cyber Physical Production Networks Proceedings Article

In: Proceedings of the 30th Italian Symposium on Advanced Database Systems, SEBD 2022, Tirrenia (PI), Italy, June 19-22, 2022, pp. 44–51, CEUR-WS.org, 2022.

Moghaddam, Farshad Bakhshandegan; Lehmann, Jens; Jabeen, Hajira

DistAD: A Distributed Generic Anomaly Detection Framework over Large KGs Proceedings Article

In: 16th IEEE International Conference on Semantic Computing, ICSC 2022, Laguna Hills, CA, USA, January 26-28, 2022, pp. 243–250, IEEE, 2022.

Perevalov, Aleksandr; Diefenbach, Dennis; Usbeck, Ricardo; Both, Andreas

QALD-9-plus: A Multilingual Dataset for Question Answering over DBpedia and Wikidata Translated by Native Speakers Proceedings Article

In: 16th IEEE International Conference on Semantic Computing, ICSC 2022, Laguna Hills, CA, USA, January 26-28, 2022, pp. 229–234, IEEE, 2022.

Xiong, Bo; Potyka, Nico; Tran, Trung-Kien; Nayyeri, Mojtaba; Staab, Steffen

Faithful Embeddings for EL++ Knowledge Bases Proceedings Article

In: The Semantic Web - ISWC 2022 - 21st International Semantic Web Conference, Virtual Event, October 23-27, 2022, Proceedings, pp. 22–38, Springer, 2022.

Banerjee, Debayan; Nair, Pranav Ajit; Kaur, Jivat Neet; Usbeck, Ricardo; Biemann, Chris

Modern Baselines for SPARQL Semantic Parsing Proceedings Article

In: SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11 - 15, 2022, pp. 2260–2265, ACM, 2022.

Jiang, Longquan; Usbeck, Ricardo

Knowledge Graph Question Answering Datasets and Their Generalizability: Are They Enough for Future Research? Proceedings Article

In: SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11 - 15, 2022, pp. 3209–3218, ACM, 2022.

Bagozi, Ada; Bianchini, Devis; Rula, Anisa

A Distributed Registry of Multi-perspective Data Services in Cyber Physical Production Networks Proceedings Article

In: Proceedings of the 18th International Conference on Web Information Systems and Technologies, WEBIST 2022, Valletta, Malta, October 25-27, 2022, pp. 174–181, SCITEPRESS, 2022.

Xu, Chengjin; Su, Fenglong; Xiong, Bo; Lehmann, Jens

Time-aware Entity Alignment using Temporal Relational Attention Proceedings Article

In: WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25 - 29, 2022, pp. 788–797, ACM, 2022.

Alam, Mirza Mohtashim; Rony, Md. Rashad Al Hasan; Nayyeri, Mojtaba; Mohiuddin, Karishma; Akter, M. S. T. Mahfuja; Vahdati, Sahar; Lehmann, Jens

Language Model Guided Knowledge Graph Embeddings Journal Article

In: IEEE Access, vol. 10, pp. 76008–76020, 2022.

Rony, Md. Rashad Al Hasan; Chaudhuri, Debanjan; Usbeck, Ricardo; Lehmann, Jens

Tree-KGQA: An Unsupervised Approach for Question Answering Over Knowledge Graphs Journal Article

In: IEEE Access, vol. 10, pp. 50467–50478, 2022.

Rony, Md. Rashad Al Hasan; Kumar, Uttam; Teucher, Roman; Kovriguina, Liubov; Lehmann, Jens

SGPT: A Generative Approach for SPARQL Query Generation From Natural Language Questions Journal Article

In: IEEE Access, vol. 10, pp. 70712–70723, 2022.

Bagozi, Ada; Bianchini, Devis; Rula, Anisa

Multi-perspective Data Modelling in Cyber Physical Production Networks: Data, Services and Actors Journal Article

In: Data Sci. Eng., vol. 7, no. 3, pp. 193–212, 2022.

Westphal, Patrick; Grubenmann, Tobias; Collarana, Diego; Bin, Simon; Bühmann, Lorenz; Lehmann, Jens

Spatial concept learning and inference on geospatial polygon data Journal Article

In: Knowl. Based Syst., vol. 241, pp. 108233, 2022.

Ali, Mehdi; Berrendorf, Max; Hoyt, Charles Tapley; Vermue, Laurent; Galkin, Mikhail; Sharifzadeh, Sahand; Fischer, Asja; Tresp, Volker; Lehmann, Jens

Bringing Light Into the Dark: A Large-Scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework Journal Article

In: IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 12, pp. 8825–8845, 2022.

Han, Xiaolin; Cheng, Reynold; Ma, Chenhao; Grubenmann, Tobias

DeepTEA: Effective and Efficient Online Time-dependent Trajectory Outlier Detection Journal Article

In: Proc. VLDB Endow., vol. 15, no. 7, pp. 1493–1505, 2022.

Möller, Cedric; Lehmann, Jens; Usbeck, Ricardo

Survey on English Entity Linking on Wikidata: Datasets and approaches Journal Article

In: Semantic Web, vol. 13, no. 6, pp. 925–966, 2022.

Han, Xiaolin; Dell'Aglio, Daniele; Grubenmann, Tobias; Cheng, Reynold; Bernstein, Abraham

A framework for differentially-private knowledge graph embeddings Journal Article

In: J. Web Semant., vol. 72, pp. 100696, 2022.

Lukovnikov, Denis

Deep Learning Methods for Semantic Parsing and Question Answering over Knowledge Graphs PhD Thesis

University of Bonn, 2022.

2021

Mousavinezhad, Najmehsadat

Knowledge Extraction Methods for the Analysis of Contractual Agreements PhD Thesis

Rheinische Friedrich-Wilhelms-Universität Bonn, 2021.

Musyaffa, Fathoni Arief

Comparative Analysis of Open Linked Fiscal Data PhD Thesis

Rheinische Friedrich-Wilhelms-Universität Bonn, 2021.

@phdthesis{Musyaffa2021,

title = {Comparative Analysis of Open Linked Fiscal Data},

author = {Fathoni Arief Musyaffa},

url = {https://bonndoc.ulb.uni-bonn.de/xmlui/handle/20.500.11811/9114},

year  = {2021},

date = {2021-06-01},

school = {Rheinische Friedrich-Wilhelms-Universität Bonn},

abstract = {The open data movement within public administrations has made governance data publicly available. As public administrators and governments produce data and release it as open data, the volume of the data is increasing rapidly. One of these datasets is budget and spending data, which has been gaining interest to the extent that several working groups and CSOs/NGOs have started working on this particular open data domain. The majority of these datasets are part of the open budget and spending datasets, which lay out data regarding how public administrations plan, revise, allocate, and expense their governance funding. The disclosure of public administration budget and spending data is expected to improve governance transparency, accountability, law enforcement, and political participation.

Unfortunately, the analysis of budget and spending datasets is not a trivial task, for several reasons. First, the quality of open fiscal data varies. Standards and recommendations for publishing open data are available; however, these standards are often not met, and no framework specifically addresses fiscal data quality measurements. Second, the datasets are heterogeneous, since they are produced by different public administrations with different business processes, accounting practices, requirements, and languages. This leads to a challenging task in data integration across public budget and spending data. The structural and linguistic heterogeneity of open budget and spending data makes comparative analysis across datasets difficult to perform. Third, datasets within the budget and spending domain are complicated. To comprehend such data, expertise is needed both from the public accounting/budgeting domain and from the technical domain to digest the datasets properly. Fourth, a platform to transform, store, analyze, and visualize datasets is necessary, especially one that makes semantic analysis possible. Fifth, there is no conceptual association between datasets that can be used as a comparison point to analyze fiscal records between compared public administrations. Lastly, there is a lack of methodology to consume and compare linked open fiscal data records across different public administrations.

Our focus in this thesis is hence to perform research to help the community gain a better understanding of open fiscal data, provide analysis of their quality, suggest a way to publish open fiscal data in an improved manner, and analyze the heterogeneity of open fiscal data, while also laying out lessons learned regarding their current state and supporting data formats that are capable of open fiscal data integration. Consequently, a platform to digest, analyze, and visualize these datasets is devised, followed by experiments on multilingual fiscal data concept mapping, and wrapped up with a proof-of-concept description of comparative analysis over linked open fiscal data.},

keywords = {},

pubstate = {published},

tppubtype = {phdthesis}

}

Fathalla, Said

Towards Facilitating Scholarly Communication using Semantic Technologies PhD Thesis

Rheinische Friedrich-Wilhelms-Universität Bonn, 2021.

@phdthesis{said_thesis,

title = {Towards Facilitating Scholarly Communication using Semantic Technologies},

author = {Said Fathalla},

url = {https://hdl.handle.net/20.500.11811/9089},

year  = {2021},

date = {2021-05-20},

school = {Rheinische Friedrich-Wilhelms-Universität Bonn},

abstract = {Web technologies have substantially stimulated the submission of manuscripts, the publishing of scientific articles, and the organization of scholarly events, especially virtual events when a global crisis occurs and consequently restricts travel across the globe. Publication in scholarly events, such as conferences, workshops, and symposiums, is essential and pervasive in computer science, engineering, and natural sciences. The past years have witnessed significant growth in scholarly data published on the Web, mostly in unstructured formats, which sacrifice the embedded semantics and relationships between various entities. These formats restrict the reusability of the data, i.e., data analysis, retrieval, and mining. Therefore, managing, retrieving, and analyzing such data has become quite challenging. Consequently, there is a pressing need to represent this data in a semantic format, i.e., Linked Data, which significantly improves scholarly communication by supporting researchers in analyzing, retrieving, and exploring scholarly data. Notwithstanding the considerable advances in technology, publishing and exchanging scholarly data have not substantially changed (i.e., they still follow the document-based scheme), thus restricting both the development of research applications in various industries and data preservation and exploration. This thesis tackles the problem of facilitating scholarly communication using semantic technologies. The ultimate aim is to improve scholarly communication by facilitating the transformation from document-based to knowledge-based scholarly communication, which helps researchers examine science itself from a new perspective. Key steps towards the goal have been taken by proposing methodologies as well as a metrics suite for publishing and assessing the quality of scholarly events concerning several criteria, in particular in Computer Science as well as Physics, Mathematics, and Engineering.
Within the framework of these criteria, steps towards assessing the quality of scholarly events and recommendations to various stakeholders have been taken. Furthermore, we engineered the Scientific Events Ontology in order to enable the enriched semantic representation of scholarly event metadata. Currently, this ontology is in use on thousands of OpenResearch.org events wiki pages. These steps will have far-reaching implications for the various stakeholders involved in the scholarly communication domain, including authors, sponsors, reviewers, publishers, and libraries. Most of the scholarly data publishers, such as Springer Nature, have taken serious steps towards publishing research data in a semantic form by publishing collated information from across the research landscape, such as research articles, scholarly events, persons, and grants, as knowledge graphs. Interlinking this data will significantly enable the provision of better and more intelligent services for the discovery of scientific work, which opens new opportunities for both scholarly data exploration and analysis. Towards this goal, we proposed the Science Knowledge Graph Ontologies suite, which comprises four OWL ontologies for representing the scientific knowledge in various fields of science, including Computer Science, Physics, and Pharmaceutical science. Besides, we developed an upper ontology on top of them for modeling modern science branches and related concepts, such as scientific discovery, instruments, and phenomena.},

keywords = {},

pubstate = {published},

tppubtype = {phdthesis}

}

Dubey, Mohnish

Towards Complex Question Answering over Knowledge Graphs PhD Thesis

University of Bonn, Germany, 2021.

@phdthesis{DBLP:phd/dnb/Dubey21,

title = {Towards Complex Question Answering over Knowledge Graphs},

author = {Mohnish Dubey},

url = {https://hdl.handle.net/20.500.11811/9122},

year = {2021},

date = {2021-01-19},

school = {University of Bonn, Germany},

abstract = {Over the past decade, Knowledge Graphs (KG) have emerged as a prominent repository for storing facts about the world in a linked data architecture. Providing machines with the capability of exploring such Knowledge Graphs and answering natural language questions over them, has been an active area of research. The purpose of this work, is to delve further into the research of retrieving information stored in KGs, based on the natural language questions posed by the user. Knowledge Graph Question Answering (KGQA) aims to produce a concise answer to a user question, such that the user is exempt from using KG vocabulary and overheads of learning a formal query language. Existing KGQA systems have achieved excellent results over Simple Questions, where the information required is limited to a single triple and a single formal query pattern. Our motivation is to improve the performance of KGQA over Complex Questions, where formal query patterns significantly vary, and a single triple is not confining for all the required information. Complex KGQA provides several challenges such as understanding semantics and syntactic structure of questions, Entity Linking, Relation Linking and Answer Representation. Lack of suitable datasets for complex question answering further adds to research gaps. Hence, in this thesis, we focus the research objective of laying the foundations for the advancement of the state-of-the-art for Complex Question Answering over Knowledge Graphs, by providing techniques to solve various challenges and provide resources to fill the research gaps.

First, we propose Normalized Query Structure (NQS), which is a linguistic analyzer module that helps the QA system to detect inputs and intents and the relation between them in the users’ question. NQS acts like an intermediate language between natural language questions and formal expressions to ease the process of query formulation for complex questions. We then developed a framework named LC-QuAD to generate large scale question answering dataset by reversing the process of question answering, thereby translating natural language questions from the formal query using intermediate templates. Our goal is to use this framework for high variations in the query patterns and create a large size dataset with minimum human effort. The first version of the dataset consists of 5,000 complex questions. By extending the LC-QuAD framework to support Reified KGs and crowd-sourcing, we published the second version of the dataset as LC-QuAD 2.0, consisting of 30,000 questions with their paraphrases and has higher complexity and new variations in the questions. To overcome the problem of Entity Linking and Relation Linking in KGQA, we develop EARL, a module performing these two tasks as a single joint task for complex question answering. We develop approaches for this module, first by formalizing the task as an instance of the Generalized Traveling Salesman Problem (GTSP) and the second approach uses machine learning to exploit the connection density between nodes in the Knowledge Graph. Lastly, we create another large scale dataset to answer verbalization and provide results for multiple baseline systems on it. The Verbalization dataset is introduced to make the system’s response more human-like. The NQS based KGQA system was next to the best system in terms of accuracy on the QALD-5 dataset. We empirically prove that NQS is robust to tackle paraphrases of the questions. 
EARL achieves state-of-the-art results in Entity Linking and Relation Linking for question answering on several KGQA datasets. The datasets curated in this thesis have helped the research community improve the accuracy of complex question answering, as other researchers have developed several KGQA systems and modules around these published datasets. With the large-scale datasets, we have encouraged the use of large-scale machine learning and deep learning, and the emergence of new techniques, to advance the state of the art in complex question answering over knowledge graphs. We further developed core components for the KGQA pipeline to overcome the challenges of Question Understanding, Entity-Relation Linking and Answer Verbalization, and thus achieve our research objective. All the proposed approaches mentioned in this thesis and the published resources are available at https://github.com/AskNowQA and are released under the umbrella project AskNow.},

keywords = {},

pubstate = {published},

tppubtype = {phdthesis}

}


Nayyeri, Mojtaba; Vahdati, Sahar; Aykul, Can; Lehmann, Jens

5* Knowledge Graph Embeddings with Projective Transformations Proceedings Article

In: Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event, February 2-9, 2021, pp. 9064–9072, AAAI Press, 2021.

Links | BibTeX

Nayyeri, Mojtaba; Cil, Gökce Müge; Vahdati, Sahar; Osborne, Francesco; Kravchenko, Andrey; Angioni, Simone; Salatino, Angelo A.; Recupero, Diego Reforgiato; Motta, Enrico; Lehmann, Jens

Link Prediction of Weighted Triples for Knowledge Graph Completion Within the Scholarly Domain Journal Article

In: IEEE Access, vol. 9, pp. 116002–116014, 2021.

Links | BibTeX

Hogan, Aidan; Blomqvist, Eva; Cochez, Michael; d'Amato, Claudia; de Melo, Gerard; Gutiérrez, Claudio; Kirrane, Sabrina; Gayo, José Emilio Labra; Navigli, Roberto; Neumaier, Sebastian; Ngomo, Axel-Cyrille Ngonga; Polleres, Axel; Rashid, Sabbir M.; Rula, Anisa; Schmelzeisen, Lukas; Sequeda, Juan F.; Staab, Steffen; Zimmermann, Antoine

Knowledge Graphs Journal Article

In: ACM Comput. Surv., vol. 54, no. 4, pp. 71:1–71:37, 2021.

Links | BibTeX

