We are happy to announce SANSA 0.7.1, the seventh release of the Scalable Semantic Analytics Stack. SANSA uses distributed computing via Apache Spark and Flink to enable scalable machine learning, inference and querying for large knowledge graphs.
- Website: http://sansa-stack.net
- GitHub: https://github.com/SANSA-Stack
- Download: http://sansa-stack.net/downloads-usage/
- ChangeLog: https://github.com/SANSA-Stack/SANSA-Stack/releases
You can find usage guidelines and examples at http://sansa-stack.net/user-guide.
The following features are currently supported by SANSA:
- Reading and writing RDF files in N-Triples, Turtle, RDF/XML, N-Quads and TRIX formats
- Reading OWL files in various standard formats
- Querying heterogeneous sources (Data Lake) using SPARQL: CSV, Parquet, MongoDB, Cassandra and JDBC (MySQL, SQL Server, etc.) are supported
- Support for multiple data partitioning techniques
- SPARQL querying via Sparqlify, Ontop and tensors
- Graph-parallel querying of RDF using SPARQL (1.0) via GraphX traversals (experimental)
- RDFS, RDFS Simple and OWL-Horst forward chaining inference
- RDF graph clustering with different algorithms
- Terminological decision trees (experimental)
- Knowledge graph embedding approaches: TransE (beta), DistMult (beta)
Noteworthy changes or updates since the previous release are:
- TRIX support
- A new query engine over compressed RDF data
- OWL/XML support
Deployment and getting started:
- Template projects for SBT and Maven are available for both Apache Spark and Apache Flink to help you get started.
- The SANSA jar files are in Maven Central, i.e. in most IDEs you can simply search for “sansa” to include the dependencies in Maven projects.
- Example code is available for various tasks.
- We provide interactive notebooks for running and testing code via Docker.
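As a rough sketch of the Maven setup mentioned above, a dependency entry for the RDF layer on Spark might look like the following. The artifact name `sansa-rdf-spark_2.11`, including the Scala 2.11 suffix, is an assumption on our side and not stated in this announcement; please confirm the exact coordinates for the layer and Scala version you need on Maven Central or the download page.

```xml
<!-- Sketch only: the artifactId and Scala suffix are assumed, verify them on Maven Central -->
<dependency>
  <groupId>net.sansa-stack</groupId>
  <artifactId>sansa-rdf-spark_2.11</artifactId>
  <version>0.7.1</version>
</dependency>
```

The other layers (OWL, query, inference, ML) would be added in the same way with their own artifact names.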
We want to thank everyone who helped to create this release, in particular the projects Big Data Ocean, SLIPO, QROWD, BETTER, BOOST, MLwin, PLATOON and Simple-ML. Also check out our recent articles, in which we describe how to use SANSA for tensor-based querying, scalable RDB2RDF query execution, quality assessment and semantic partitioning.
Greetings from the SANSA Development Team