Funded Projects

SDA is currently funded through the following regional, national and European research projects (industry-funded projects are not included here).


The CLEOPATRA ITN, a Marie Skłodowska-Curie Innovative Training Network, aims to make sense of the massive digital coverage generated by the intense disruption in Europe over the past decade – including appalling terrorist incidents and the dramatic movement of refugees and economic migrants. Read more about Cleopatra.


The LAMBDA project will implement a thoroughly prepared plan of actions for knowledge exchange and transfer between the linked institutions and carry out a wide range of dissemination and outreach activities in the Western Balkan countries and across Europe. Read more about LAMBDA.


The MLwin consortium bundles the competences and the preparatory work of the participating partners from the different scientific and technical fields in order to substantially further develop the topic of machine learning with knowledge graphs and place it on a new foundation. A central topic is the further development and consolidation of KG learning algorithms. Furthermore, data mining algorithms are being developed for the analysis of the underlying structures in KGs, and KG-supported information extraction from texts, images and videos will be investigated. Finally, the relation to Artificial Intelligence (AI) and cognition research is examined.

Opertus Mundi

Opertus Mundi will deliver a trusted, secure, and highly scalable pan-European industrial geospatial data market, which will act as a single point for the streamlined and trusted discovery, sharing, trading, remuneration, and use of proprietary geospatial data assets, guaranteeing low cost and the flexibility to accommodate current and emerging needs of Data Economy stakeholders regardless of size, domain, and expertise. Read more about Opertus Mundi.


The digitalization of the energy sector enables higher levels of operational excellence through the adoption of disruptive technologies. The Energy Big Data framework of modern smart energy networks provides an ideal ecosystem for knowledge exploitation from data. PLATOON aims to deploy distributed/edge processing and data analytics technologies for optimized real-time energy system management in a way that is simple for the energy domain expert. Data governance among the different stakeholders for multi-party data exchange, coordination and cooperation in the energy value chain will be guaranteed through IDS-based connectors. Read more about PLATOON.


A textual & graphical domain-specific language for semantic data analytics workflows. Read more about Simple-ML.

SDA was formerly funded through the following regional, national and European research projects:

  • BigDataEurope – Big Data Europe will undertake the foundational work for enabling European companies to build innovative multilingual products and services based on semantically interoperable, large-scale, multilingual data assets and knowledge, available under a variety of licenses and business models.
  • BigDataOcean – The main objective of the BigDataOcean project is to enable maritime big data scenarios for EU-based companies, organizations and scientists, through a multi-segment platform that will combine data of different velocity, variety and volume under an inter-linked, trusted, multilingual engine to produce a big-data repository of value and veracity back to the participants and local communities.
  • Be-IoT – The business engine for IoT pilots: Turning the Internet of Things in Europe into an economically successful and socially accepted vibrant ecosystem.
  • Boost4.0 – The biggest European initiative in Big Data for Industry 4.0. With a 20M€ budget and leveraging 100M€ of private investment, Boost 4.0 will lead the construction of the European Industrial Data Space to improve the competitiveness of Industry 4.0 and will guide the European manufacturing industry in the introduction of Big Data in the factory, providing the industrial sector with the necessary tools to obtain the maximum benefit of Big Data.
  • DIACHRON – Preserving the Evolving Data Web: Making Open / Linked Data Diachronic.
  • EDSA – European Data Science Academy.
  • GEISER – GEISER is a European project that develops a holistic open-source platform for benchmarking sensor data to Internet-based Geo-Services.
  • GraDAna – Innovation and Entrepreneurship for Iranian HE Graduates through Data Analytics.
  • HOBBIT – HOBBIT is a European project that develops a holistic open-source platform and industry-grade benchmarks for benchmarking big linked data.
  • LiDaKrA – Linked-data-based crime analysis.
  • LinDA – Enabling Linked Data and Analytics for SMEs by renovating public sector information.
  • LUCID – Linked Value Chain Data.
  • LOD2 – Creating Knowledge out of Interlinked Data.
  • ODINE – Open Data Incubator for Europe.
  • – Financial Transparency Platform for the Public Sector.
  • OSCOSS – A shared platform for Opening Scholarly Communication in the Social Sciences.
  • QROWD – A European project that will deliver innovative solutions to improve transport and mobility in European cities combining the power of the Qrowd and RDF.
  • SeReCo – Semantics, Coordination and Reasoning.
  • SLIPO – A European project that develops scalable linking and integration of big POI data.
  • WDAqua – Answering Questions using Web Data.

Open Source Projects

SDA is working on a wide range of open source projects. You can find many of them on our GitHub organisation page. Please note that we have split off several projects into individual organisations to group them according to particular lines of research; please check the SDA GitHub README for those. Below is an incomplete selection of high-impact open source R&D projects:


AskNow is an umbrella project for our activities in question answering and conversational AI. Read more about AskNow.


DBpedia is a crowd-sourced community effort to extract structured content from the information created in various Wikimedia projects. This structured information resembles an open knowledge graph (OKG) which is available for everyone on the Web. A knowledge graph is a special kind of database which stores knowledge in a machine-readable form and provides a means for information to be collected, organised, shared, searched and utilised. Google uses a similar approach to create the knowledge cards shown in its search results. We hope that this work will make it easier for the huge amount of information in Wikimedia projects to be used in new and interesting ways. Read more about DBpedia.
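
To make the idea of a knowledge graph more concrete, the following is a minimal sketch in Python. It assumes that knowledge is stored as (subject, predicate, object) triples and searched by pattern matching. The sample facts, the `TRIPLES` list and the `match` helper are hypothetical and only illustrate the principle; they are not DBpedia's actual data model, identifiers or API.

```python
from typing import Iterator, List, Optional, Tuple

# A triple is (subject, predicate, object). All names here are illustrative.
Triple = Tuple[str, str, str]

# Hypothetical sample facts in the style of a knowledge graph.
TRIPLES: List[Triple] = [
    ("Jamaica_Inn", "type", "Film"),
    ("Jamaica_Inn", "director", "Alfred_Hitchcock"),
    ("Alfred_Hitchcock", "birthPlace", "London"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that agrees with the given pattern.

    A position left as None acts as a wildcard.
    """
    for s, p, o in triples:
        if subject is not None and s != subject:
            continue
        if predicate is not None and p != predicate:
            continue
        if obj is not None and o != obj:
            continue
        yield (s, p, o)


# Who directed "Jamaica_Inn"? Fix subject and predicate, leave the object open.
directors = [o for _, _, o in match(TRIPLES, subject="Jamaica_Inn", predicate="director")]
print(directors)
```

Real knowledge graphs are typically queried with a dedicated language such as SPARQL rather than a hand-written loop; the sketch only shows why triples make information easy to search and combine.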


DeFacto (Deep Fact Validation) is an algorithm for validating statements by finding confirming sources for them on the web. It takes a statement (such as “Jamaica Inn was directed by Alfred Hitchcock”) as input and then tries to find evidence for the truth of that statement by searching for information on the web. Read more about DeFacto.


DL-Learner is a tool for learning concepts in Description Logics (DLs) from user-provided examples. Equivalently, it can be used to learn classes in OWL ontologies from selected objects. The goal of DL-Learner is to support knowledge engineers in constructing knowledge and learning about the data they created. Read more about DL-Learner.

MEX Vocabulary

MEX Vocabulary: A Light-Weight Interchange Format for Machine Learning Experiments. Read more about MEX Vocabulary.


A Python library for learning and evaluating knowledge graph embeddings. Read more about pyKEEN.


An open-source platform for distributed batch processing of large-scale RDF datasets. Read more about SANSA Stack.


A SPARQL-SQL rewriter. Read more about Sparqlify.


The ultimate goal of SML-Bench is to foster research in machine learning from structured data and to increase the reproducibility and comparability of algorithms in that area. This is important, since a) the preparation of machine learning tasks in that area involves a significant amount of work and b) there are hardly any cross-comparisons across languages, as these require data conversion processes. Read more about SML-Bench.