The Smart Data Analytics group (http://sda.tech) is happy to announce SANSA 0.8.0 RC, the release candidate for the eighth release of the Scalable Semantic Analytics Stack. SANSA uses distributed computing on Apache Spark to provide scalable machine learning, inference, and querying capabilities for large knowledge graphs.
We look forward to your comments on the new features so that we can finalize them in our upcoming release.
Please note that the release candidate is not available on Maven Central; please follow the README instead.
In this release candidate, we have included:
- Integrated Ontop as a new SPARQL engine
- Improved SPARQL query API
- Distributed TriG/Turtle record reader
- Support for writing out RDDs of OWL axioms in a variety of formats
- Distributed Data Summaries with ABstraction and STATistics (ABSTAT)
- Configurable mapping of RDDs of triples to DataFrames
- Initial support for RDD of Graphs and Datasets, executing queries on each entry and aggregating over the results
- SPARQL Transformer for ML pipelines
- AutoSPARQL Generation for Feature Extraction
- Distributed Feature-based Semantic Similarity Estimations
- Added a common R2RML abstraction layer for Ontop, Sparqlify and possible future query engines
- Consolidated SANSA layers into a single Git repository
- Retired the support for Apache Flink
View this announcement on Twitter and the SANSA blog: