Christmas Time at SDA – Time to look back at 2017

We are looking back at a busy and successful 2017, full of new members, inspirational discussions, exciting conferences, many accepted papers and awards, as well as new software releases.

Below is a short summary of the main cornerstones for 2017:


The growth of the group in 2017

SDA is a new group, but not new in the field :). As a group, it was founded by Prof. Dr. Jens Lehmann at the beginning of 2016. The group has members at the University of Bonn with associated researchers at the Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS) and the Institute for Applied Computer Science Leipzig. During 2017, the group grew from 20 members to around 55 members (1 Professor, 1 Akademischer Rat / Assistant Professor, 11 PostDocs, 31 PhD students, 11 master students), as you can see on our team page.

An interesting future for AI and knowledge graphs

Artificial intelligence / machine learning and semantic technologies / knowledge graphs are central topics for SDA. Throughout the year, we obtained a range of interesting research results. These range from internationally leading results in question answering over knowledge graphs, to scalable distributed querying, inference and analysis of large RDF datasets, as well as new perspectives on industrial data spaces and data integration. Amid the race for ever-improving achievements in AI, which go far beyond what many could have imagined 10 years ago, our researchers were able to deliver important contributions and continue to shape different sub-areas of the growing AI research landscape.

Papers accepted

We had 46 papers accepted at well-known conferences (e.g., The Web Conference 2018, WWW 2017, AAAI 2017, ISWC 2017, ESWC 2017, DEXA 2017, SEMANTiCS 2017, K-CAP 2017, WI 2017, KESW 2017, IEEE BigData 2017, NIPS 2017, TPDL 2017, ICSC 2018, ICEGOV 2018 and more). We estimate that our articles are cited 3000+ times per year (based on Google Scholar profiles).


We received several awards in 2017 – click on the posts to find out more:


Software releases

SANSA – an open source data flow processing engine for performing distributed computation over large-scale RDF datasets – had two successful releases during 2017 (SANSA 0.2 and SANSA 0.3).

From the funded projects we were happy to launch the final release of the Big Data Europe platform – an open source Big Data Processing Platform allowing users to install numerous big data processing tools and frameworks and create working data flow applications.

There were several other releases:

  • SML-Bench – A Structured Machine Learning benchmark framework 0.2 has been released.
  • WebVOWL – A Web-based Visualization of Ontologies had several releases in 2017.
  • – A Crowd-Sourcing platform for collaborative management of scholarly metadata reached coverage of more than 5K computer science conferences in 2017.

Furthermore, SDA deeply values team bonding activities. :-) We often try to introduce fun activities that involve teamwork and team building. At our X-mas party, we enjoyed a very international and lovely dinner together and played Secret Santa and a pantomime game.


Long-term team building through deeper discussions, genuine connections and healthy communication helps us to connect within the group!

Many thanks to all of you who have accompanied and supported us along the way!

Jens Lehmann on behalf of The SDA Research Team

SANSA 0.3 (Semantic Analytics Stack) Released

We are happy to announce SANSA 0.3 – the third release of the Scalable Semantic Analytics Stack. SANSA employs distributed computing via Apache Spark and Flink in order to allow scalable machine learning, inference and querying capabilities for large knowledge graphs.

You can find the FAQ and usage examples at

The following features are currently supported by SANSA:

  • Reading and writing RDF files in N-Triples, Turtle, RDF/XML and N-Quads formats
  • Reading OWL files in various standard formats
  • Support for multiple data partitioning techniques
  • SPARQL querying via Sparqlify (with some known limitations until the next Spark 2.3.* release)
  • SPARQL querying via conversion to Gremlin path traversals (experimental)
  • RDFS, RDFS Simple, OWL-Horst (all in beta status), EL (experimental) forward chaining inference
  • Automatic inference plan creation (experimental)
  • RDF graph clustering with different algorithms
  • Rule mining from RDF graphs based on AMIE+
  • Terminological decision trees (experimental)
  • Anomaly detection (beta)
  • Distributed knowledge graph embedding approaches: TransE (beta), DistMult (beta), several further algorithms planned
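
To give a rough idea of what a knowledge graph embedding approach such as TransE computes, here is a minimal single-machine sketch in plain Python. It is not SANSA code and does not use the SANSA API; the function name `transe_score` and the toy embedding values are made up for illustration. TransE models a triple (head, relation, tail) as a translation in vector space, so a plausible triple should have a small distance between head + relation and tail.

```python
import math

def transe_score(head, relation, tail):
    """Return the TransE distance ||h + r - t||_2 (lower = more plausible)."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

# Toy 2-dimensional embeddings (hypothetical values, not trained on any data).
entity = {"Bonn": [1.0, 0.0], "Germany": [1.0, 1.0], "France": [3.0, 4.0]}
relation = {"locatedIn": [0.0, 1.0]}

# Distance for a triple that fits the translation exactly.
good = transe_score(entity["Bonn"], relation["locatedIn"], entity["Germany"])
# Distance for a triple that does not fit.
bad = transe_score(entity["Bonn"], relation["locatedIn"], entity["France"])

print(good < bad)
```

In an actual embedding approach the vectors are learned so that triples from the knowledge graph get smaller distances than corrupted ones; the sketch only shows the scoring step.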

Deployment and getting started:

  • Template projects for SBT and Maven are available for both Apache Spark and Apache Flink to help you get started.
  • The SANSA jar files are in Maven Central, i.e., in most IDEs you can just search for “sansa” to include the dependencies in Maven projects.
  • Example code for various tasks is available.
  • We provide interactive notebooks for running and testing code via Docker.

We want to thank everyone who helped to create this release, in particular the projects Big Data Europe, HOBBIT, SAKE, Big Data Ocean, SLIPO, QROWD and BETTER.

Greetings from the SANSA Development Team



SANSA 0.1 (Semantic Analytics Stack) Released

Dear all,

The Smart Data Analytics group is very happy to announce SANSA 0.1 – the initial release of the Scalable Semantic Analytics Stack. SANSA combines distributed computing and semantic technologies in order to allow powerful machine learning, inference and querying capabilities for large knowledge graphs.


You can find the FAQ and usage examples at

The following features are currently supported by SANSA:

  • Support for reading and writing RDF files in N-Triples format
  • Support for reading OWL files in various standard formats
  • Querying and partitioning based on Sparqlify
  • Support for RDFS/RDFS Simple/OWL-Horst forward chaining inference
  • Initial RDF graph clustering support
  • Initial support for rule mining from RDF graphs
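
As an illustration of what forward chaining inference means, the following is a minimal in-memory sketch in plain Python. It is not SANSA code and not its distributed implementation; the function name `forward_chain` and the `ex:` example triples are hypothetical. It applies just two RDFS entailment rules (rdfs9 for type inheritance and rdfs11 for subclass transitivity) repeatedly until no new triples can be derived.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def forward_chain(triples):
    """Apply rdfs9 and rdfs11 until a fixed point is reached."""
    closure = set(triples)
    while True:
        new = set()
        for (s, p, o) in closure:
            if p != SUBCLASS:
                continue
            for (s2, p2, o2) in closure:
                # rdfs11: (s subClassOf o), (o subClassOf o2) => (s subClassOf o2)
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))
                # rdfs9: (s2 type s), (s subClassOf o) => (s2 type o)
                if p2 == RDF_TYPE and o2 == s:
                    new.add((s2, RDF_TYPE, o))
        if new <= closure:
            # Nothing new was derived, so the closure is complete.
            return closure
        closure |= new

# Hypothetical example data.
triples = {
    ("ex:PhDStudent", SUBCLASS, "ex:Researcher"),
    ("ex:Researcher", SUBCLASS, "ex:Person"),
    ("ex:alice", RDF_TYPE, "ex:PhDStudent"),
}
inferred = forward_chain(triples)
print(("ex:alice", RDF_TYPE, "ex:Person") in inferred)
```

The nested loop is quadratic in the number of triples, which is fine for a toy example; handling large RDF datasets is exactly where a distributed engine comes in.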

We want to thank everyone who helped to create this release, in particular, the projects Big Data Europe, HOBBIT and SAKE.

Kind regards,

The SANSA Development Team


DL-Learner 1.3 (Supervised Structured Machine Learning Framework) Released

Dear all,

the Smart Data Analytics group is happy to announce DL-Learner 1.3.

DL-Learner is a framework containing algorithms for supervised machine learning in RDF and OWL. DL-Learner can use various RDF and OWL serialization formats as well as SPARQL endpoints as input, can connect to most popular OWL reasoners and is easily and flexibly configurable. It extends concepts of Inductive Logic Programming and Relational Learning to the Semantic Web in order to allow powerful data analysis.

GitHub page:

DL-Learner is used for data analysis tasks within other tools such as ORE and RDFUnit. Technically, it uses refinement operator based, pattern-based and evolutionary techniques for learning on structured data. For a practical example, see It also offers a plugin for Protégé, which can give suggestions for axioms to add.

In the current release, we added a large number of new algorithms and features. For instance, DL-Learner now supports terminological decision tree learning and integrates the LEAP and EDGE systems as well as the BUNDLE probabilistic OWL reasoner. We migrated the system to Java 8, Jena 3, OWL API 4.2 and Spring 4.3. We want to point to some related efforts here:

We want to thank everyone who helped to create this release. In particular, we want to thank Giuseppe Cota, who visited the core developer team and significantly improved DL-Learner. We also acknowledge support by the recent SAKE project, in which DL-Learner will be applied to event analysis in manufacturing use cases, as well as by the Big Data Europe and HOBBIT projects.

Kind regards,

Lorenz Bühmann, Jens Lehmann, Patrick Westphal and Simon Bin