AskNow 0.1 Released 🗓 2018-09-12 ✍ Prof. Dr. Jens Lehmann
Dear all,
the Smart Data Analytics group is happy to announce AskNow 0.1, the initial release of Question Answering Components and Tools over RDF Knowledge Graphs.
The following components with corresponding features are currently supported by AskNow:
- EARL 0.1: EARL performs entity linking and relation linking as a joint task. It uses machine learning to exploit the Connection Density between nodes in the knowledge graph, and relies on three base features and re-ranking steps to predict entities and relations.
ISWC 2018: https://arxiv.org/pdf/1801.03825.pdf
- SQG 0.1: This is a SPARQL Query Generator with a modular architecture. SQG enables easy integration with other components for the construction of a fully functional QA pipeline. Currently, entity relation, compound, count, and boolean questions are supported.
ESWC 2018: http://jens-lehmann.org/files/2018/eswc_qa_query_generation.pdf
- AskNow UI 0.1: The UI lets users pose their questions to the AskNow QA system. It displays answers according to their type: an entity, a list of entities, a boolean, or a literal. For entities, it shows the abstracts from DBpedia.
- SemanticParsingQA 0.1: The Semantic Parsing-based Question Answering system is built on the integration of EARL, SQG, and AskNow UI.
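To make the question types listed for SQG more concrete, here is a minimal Python sketch of how a linked entity and relation could be turned into SPARQL for list, count, and boolean questions. It is not SQG's implementation: the function name `build_query`, the single-triple templates, and the example DBpedia URIs are assumptions chosen purely for illustration.

```python
# Hypothetical sketch: NOT the SQG implementation, just one triple pattern per question.
def build_query(question_type, entity, relation, target=None):
    """Return a SPARQL string for a single triple pattern anchored at `entity`."""
    if question_type == "boolean":
        if target is None:
            raise ValueError("boolean questions need a target resource")
        # Boolean questions map to ASK, which returns true or false.
        return f"ASK WHERE {{ <{entity}> <{relation}> <{target}> }}"
    pattern = f"<{entity}> <{relation}> ?answer"
    if question_type == "count":
        # Count questions wrap the answer variable in an aggregate.
        return f"SELECT (COUNT(DISTINCT ?answer) AS ?count) WHERE {{ {pattern} }}"
    if question_type == "list":
        return f"SELECT DISTINCT ?answer WHERE {{ {pattern} }}"
    raise ValueError(f"unsupported question type: {question_type}")


# Illustrative URIs only; any linked entity/relation pair would do.
ENTITY = "http://dbpedia.org/resource/Germany"
RELATION = "http://dbpedia.org/ontology/capital"
TARGET = "http://dbpedia.org/resource/Berlin"

for qtype in ("list", "count", "boolean"):
    print(build_query(qtype, ENTITY, RELATION, TARGET if qtype == "boolean" else None))
```

In a real pipeline the entity and relation would come from a linking component such as EARL, and compound questions would need more than one triple pattern.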
View this announcement on Twitter: https://twitter.com/AskNowQA/status/1040205350853599233
The AskNow Development Team
A BETTER project for exploiting Big Data in Earth Observation 🗓 2018-07-05 ✍ Gezim Sejdiu
The SANSA Stack is one of the earmarked big data analytics components to be employed in the BETTER data pipelines.
Big-data Earth observation Technology and Tools Enhancing Research and development (BETTER) is an EU H2020 research and innovation project that runs from November 2017 to the end of October 2020.
The project’s main objective is to implement Big Data solutions (called Data Pipelines) built on large volumes of heterogeneous Earth Observation data. This should help address key Societal Challenges by letting users focus on extracting knowledge from the data rather than on processing the data itself.
To achieve that, BETTER is improving the way Big Data service developers interact with end-users. After defining the challenges, the promoters validate the pipeline requirements and co-design the solution with a dedicated development team in a workshop. During the implementation, promoters can continuously test and validate the pipelines. Later, the implemented pipelines will be used by the public in Hackathons, enabling the use of specific solutions in other areas and the collection of additional user feedback.
Project website: www.ec-better.eu
SANSA 0.4 (Semantic Analytics Stack) Released 🗓 2018-06-26 ✍ Prof. Dr. Jens Lehmann
We are happy to announce SANSA 0.4, the fourth release of the Scalable Semantic Analytics Stack. SANSA uses distributed computing via Apache Spark and Flink to provide scalable machine learning, inference, and querying capabilities for large knowledge graphs.
- Website: http://sansa-stack.net
- GitHub: https://github.com/SANSA-Stack
- Download: http://sansa-stack.net/downloads-usage/
- ChangeLog: https://github.com/SANSA-Stack/SANSA-Stack/releases
The following features are currently supported:
- Reading and writing RDF files in N-Triples, Turtle, RDF/XML, and N-Quads formats
- Reading OWL files in various standard formats
- Support for multiple data partitioning techniques
- SPARQL querying via Sparqlify
- Graph-parallel querying of RDF using SPARQL (1.0) via GraphX traversals (experimental)
- RDFS, RDFS Simple, OWL-Horst, EL (experimental) forward chaining inference
- Automatic inference plan creation (experimental)
- RDF graph clustering with different algorithms
- Terminological decision trees (experimental)
- Anomaly detection (beta)
- Knowledge graph embedding approaches: TransE (beta), DistMult (beta)
Noteworthy changes in this release:
- Parser performance has been improved significantly, e.g., DBpedia 2016-10 can be loaded in under 100 seconds on a 7-node cluster
- Support for a wider range of data partitioning strategies
- A better unified API across data representations (RDD, DataFrame, DataSet, Graph) for triple operations
- Improved unit test coverage
- Improved distributed statistics calculation (see ISWC paper)
- Initial scalability tests on 6-billion-triple Ethereum blockchain data on a 100-node cluster
- New SPARQL-to-GraphX rewriter aiming at providing better performance for queries exploiting graph locality
- Numeric outlier detection tested on DBpedia (en)
- Improved clustering tested on 20 GB RDF data sets
Getting started:
- Template projects for SBT and Maven are available for both Apache Spark and Apache Flink.
- The SANSA jar files are in Maven Central, i.e., in most IDEs you can just search for “sansa” to include the dependencies in Maven projects.
- Example code is available for various tasks.
- We provide interactive notebooks for running and testing code via Docker.
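
As a rough illustration of what the forward chaining inference listed among the features does, the following single-machine Python sketch applies two RDFS rules, subclass transitivity (rdfs11) and type inheritance (rdfs9), until no new triples appear. SANSA performs inference in a distributed fashion on Spark and Flink; this sketch does not use the SANSA API, and the function name `rdfs_forward_chain` and the `ex:` example triples are assumptions made for illustration.

```python
# Toy, single-machine sketch of RDFS forward chaining; NOT the SANSA API.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_forward_chain(triples):
    """Apply rules rdfs9 and rdfs11 to (s, p, o) triples until a fixpoint is reached."""
    inferred = set(triples)
    while True:
        subclass = {(s, o) for s, p, o in inferred if p == SUBCLASS_OF}
        new = set()
        # rdfs11: (a subClassOf b), (b subClassOf c) => (a subClassOf c)
        for a, b in subclass:
            for b2, c in subclass:
                if b == b2:
                    new.add((a, SUBCLASS_OF, c))
        # rdfs9: (x type a), (a subClassOf b) => (x type b)
        for s, p, o in inferred:
            if p == RDF_TYPE:
                for a, b in subclass:
                    if o == a:
                        new.add((s, RDF_TYPE, b))
        if new <= inferred:
            # Nothing new was derived, so the closure is complete.
            return inferred
        inferred |= new


# Hypothetical example data.
example = {
    ("ex:Dog", SUBCLASS_OF, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS_OF, "ex:Animal"),
    ("ex:rex", RDF_TYPE, "ex:Dog"),
}
closure = rdfs_forward_chain(example)
for triple in sorted(closure - example):
    print(triple)
```

A distributed engine would express the same rules as joins over partitioned triple sets rather than nested loops, but the fixpoint idea is the same.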