GezimSejdiu – Page 8 – Smart Data Analytics

Invited talk by Anisa Rula

2017-03-29 2017-03-30 GezimSejdiu

On Tuesday the 28th, Dr. Anisa Rula, a postdoctoral researcher at the University of Milano-Bicocca, visited SDA and gave a talk entitled “Enriching Knowledge Bases through Quality Assessment”.

Anisa's talk covered quality dimensions and their evolution, and how the anatomy of data representation and quality assessment in Knowledge Bases (KBs) could lead to the improvement of existing KBs, i.e., by enriching them. The trade-off between the enrichment and the quality of KGs was raised and discussed in detail. Some use cases were mentioned as well, with the main focus on Link Discovery. In particular, enriching KBs will help in better interlinking by eliminating noise and reducing the search space.
During the talk, she also introduced ABSTAT, an ontology-driven linked data summarization framework that generates summaries of Linked Data datasets that comprise a set of Abstract Knowledge patterns, statistics, and a subtype graph.

Prof. Dr. Jens Lehmann invited the speaker to the bi-weekly “SDA colloquium presentations”, so there was good representation from various students and researchers from our group.
The slides of our invited speaker Anisa Rula's talk were inspired by “Data Quality Issues in Linked Open Data”, a chapter of the book “Data and Information Quality” by Carlo Batini and Monica Scannapieco.

With this visit, we expect to strengthen our research collaboration networks with the Department of Computer Science, Systems and Communication, University of Milano-Bicocca, mainly on combining quality assessment metrics and distributed frameworks applied to SANSA.

Paper accepted at WWW 2017

2017-03-17 2017-03-17 GezimSejdiu

We are very pleased to announce that our group got a paper accepted for presentation at the 26th International World Wide Web Conference (WWW 2017), which will be held on the sunny shores of Perth, Western Australia, from 3 to 7 April 2017. The WWW is an important international forum for the evolution of the web, technical standards, the impact of the web on society, and its future.

“Neural Network-based Question Answering over Knowledge Graphs on Word and Character Level” by Denis Lukovnikov, Asja Fischer, Soeren Auer, and Jens Lehmann.

Abstract: Question Answering (QA) systems over Knowledge Graphs (KG) automatically answer natural language questions using facts contained in a knowledge graph. Simple questions, which can be answered by the extraction of a single fact, constitute a large part of questions asked on the web but still pose challenges to QA systems, especially when asked against a large knowledge resource. Existing QA systems usually rely on various components each specialised in solving different sub-tasks of the problem (such as segmentation, entity recognition, disambiguation, and relation classification etc.). In this work, we follow a quite different approach: We train a neural network for answering simple questions in an end-to-end manner, leaving all decisions to the model. It learns to rank subject-predicate pairs to enable the retrieval of relevant facts given a question. The network contains a nested word/character-level question encoder which allows to handle out-of-vocabulary and rare word problems while still being able to exploit word-level semantics. Our approach achieves results competitive with state-of-the-art end-to-end approaches that rely on an attention mechanism.

Acknowledgments
This work is supported in part by the European Union under the Horizon 2020 Framework Program for the project WDAqua (GA 642795).

Looking forward to seeing you at WWW.

Paper accepted at ICWE 2017

2017-03-08 2017-03-08 GezimSejdiu

We are very pleased to announce that our group got a paper accepted for presentation at the 17th International Conference on Web Engineering (ICWE 2017), which will be held on 5–8 June 2017 in Rome, Italy. The ICWE is an important international forum for the Web Engineering Community.

“The BigDataEurope Platform – Supporting the Variety Dimension of Big Data” by Sören Auer, Simon Scerri, Aad Versteden, Erika Pauwels, Angelos Charalambidis, Stasinos Konstantopoulos, Jens Lehmann, Hajira Jabeen, Ivan Ermilov, Gezim Sejdiu, Andreas Ikonomopoulos, Spyros Andronopoulos, Mandy Vlachogiannis, Charalambos Pappas, Athanasios Davettas, Iraklis A. Klampanos, Efstathios Grigoropoulos, Vangelis Karkaletsis, Victor de Boer, Ronald Siebes, Mohamed Nadjib Mami, Sergio Albani, Michele Lazzarini, Paulo Nunes, Emanuele Angiuli, Nikiforos Pittaras, George Giannakopoulos, Giorgos Argyriou, George Stamoulis, George Papadakis, Manolis Koubarakis, Pythagoras Karampiperis, Axel-Cyrille Ngonga Ngomo, Maria-Esther Vidal.

Abstract: The management and analysis of large-scale datasets – described with the term Big Data – involves the three classic dimensions volume, velocity and variety. While the former two are well supported by a plethora of software components, the variety dimension is still rather neglected. We present the BDE platform – an easy-to-deploy, easy-to-use and adaptable (cluster-based and standalone) platform for the execution of big data components and tools like Hadoop, Spark, Flink, Flume and Cassandra. The BDE platform was designed based upon the requirements gathered from seven of the societal challenges put forward by the European Commission in the Horizon 2020 programme and targeted by the BigDataEurope pilots. As a result, the BDE platform allows to perform a variety of Big Data flow tasks like message passing, storage, analysis or publishing. To facilitate the processing of heterogeneous data, a particular innovation of the platform is the Semantic Layer, which allows to directly process RDF data and to map and transform arbitrary data into RDF. The advantages of the BDE platform are demonstrated through seven pilots, each focusing on a major societal challenge.

Acknowledgments
This work is supported by the European Union’s Horizon 2020 research and innovation program under grant agreement no. 644564 – BigDataEurope.

“AskNow: A Framework for Natural Language Query Formalization in SPARQL” elected as Paper of the month

2017-03-06 2017-03-06 GezimSejdiu

We are very pleased to announce that our paper “AskNow: A Framework for Natural Language Query Formalization in SPARQL” by Mohnish Dubey, Sourish Dasgupta, Ankit Sharma, Konrad Höffner, Jens Lehmann has been elected as the Paper of the month at Fraunhofer IAIS. This award is given to publications that have a high innovation impact in the research field after a committee evaluation.

This research paper was accepted at the ESWC 2016 main conference, and its core work on Natural Language Query Formalization in SPARQL is based on the AskNow project.

Abstract: Natural Language Query Formalization involves semantically parsing queries in natural language and translating them into their corresponding formal representations. It is a key component for developing question-answering (QA) systems on RDF data. The chosen formal representation language in this case is often SPARQL. In this paper, we propose a framework, called AskNow, where users can pose queries in English to a target RDF knowledge base (e.g. DBpedia), which are first normalized into an intermediary canonical syntactic form, called Normalized Query Structure (NQS), and then translated into SPARQL queries. NQS facilitates the identification of the desire (or expected output information) and the user-provided input information, and establishing their mutual semantic relationship. At the same time, it is sufficiently adaptive to query paraphrasing. We have empirically evaluated the framework with respect to the syntactic robustness of NQS and semantic accuracy of the SPARQL translator on standard benchmark datasets.

The paper and authors were honored for this publication in a special event at Fraunhofer Schloss Birlinghoven, Sankt Augustin, Germany.

Invited talk by Paul Groth

2017-02-24 2017-02-28 GezimSejdiu

On Tuesday, 7th February, Paul Groth from Elsevier Labs visited SDA and gave a talk entitled “Applying Knowledge Graphs”.

Paul's talk was set in the context of building large knowledge graphs at Elsevier. He motivated the need for Knowledge Graph observatories, which would provide empirical evidence on how to deal with changes in Knowledge Bases.

Enjoyed visiting @SDA_Research and @JLehmann82 today. Combining #semantics distcompute #ml see https://t.co/LyVDbygBgC @ApacheSpark https://t.co/TBovCsKWpv

— Paul Groth (@pgroth) February 7, 2017

Prof. Dr. Jens Lehmann invited the speaker as part of the “Knowledge Graph Analysis” lectures, so there was good representation from various students and researchers from the SDA and EIS groups.

The Slides of the talk of our invited speaker Paul Groth can be found here:

With this visit, we expect to strengthen our research collaboration networks with Elsevier Labs, mainly on combining semantics and distributed machine learning applied to SANSA.

Paper accepted at ESWC 2017

2017-02-23 2017-02-23 GezimSejdiu

We are very pleased to announce that our group got one paper accepted for presentation at the 14th Extended Semantic Web Conference (ESWC 2017) research track, held in Portoroz, Slovenia from 28th of May to the 1st of June. The ESWC is an important international forum for the Semantic Web / Linked Data Community.

“WOMBAT – A Generalization Approach for Automatic Link Discovery” by Mohamed Ahmed Sherif, Axel-Cyrille Ngonga Ngomo, and Jens Lehmann.

Abstract. A significant portion of the evolution of Linked Data datasets lies in updating the links to other datasets. An important challenge when aiming to update these links automatically under the open-world assumption is the fact that usually only positive examples for the links exist. We address this challenge by presenting and evaluating WOMBAT, a novel approach for the discovery of links between knowledge bases that relies exclusively on positive examples. WOMBAT is based on generalisation via an upward refinement operator to traverse the space of link specification. We study the theoretical characteristics of WOMBAT and evaluate it on 8 different benchmark datasets. Our evaluation suggests that WOMBAT outperforms state-of-the-art supervised approaches while relying on less information. Moreover, our evaluation suggests that WOMBAT’s pruning algorithm allows it to scale well even on large datasets.

Acknowledgments
This work is supported by the European Union’s H2020 research and innovation action HOBBIT (GA no. 688227), the European Union’s H2020 research and innovation action SLIPO (GA no. 731581) and the BMWI Project GEISER (project no. 01MD16014).

Asja Fischer has won Best Paper Award

2017-02-16 2017-02-16 GezimSejdiu

We are very pleased to announce that our paper “Training restricted Boltzmann machines: An introduction” by Asja Fischer and Christian Igel (Pattern Recognition, Volume 47, Issue 1, Jan. 2014, Pages 25-39) was awarded the Pattern Recognition Journal Best Paper Award 2014. The biennial award is given to the best paper published in the journal Pattern Recognition, the official journal of the Pattern Recognition Society.

The idea behind the paper was to provide implementations of Restricted Boltzmann machines (RBMs), which are probabilistic graphical models that can be interpreted as stochastic neural networks. They have attracted much attention as building blocks of deep learning systems called deep belief networks, and variants and extensions of RBMs have found application in a wide range of pattern recognition tasks. The article introduces RBMs from the viewpoint of Markov random fields (undirected graphical models).

Abstract
Restricted Boltzmann machines (RBMs) are probabilistic graphical models that can be interpreted as stochastic neural networks. They have attracted much attention as building blocks for the multi-layer learning systems called deep belief networks, and variants and extensions of RBMs have found application in a wide range of pattern recognition tasks. This tutorial introduces RBMs from the viewpoint of Markov random fields, starting with the required concepts of undirected graphical models. Different learning algorithms for RBMs, including contrastive divergence learning and parallel tempering, are discussed. As sampling from RBMs, and therefore also most of their learning algorithms, are based on Markov chain Monte Carlo (MCMC) methods, an introduction to Markov chains and MCMC techniques is provided. Experiments demonstrate relevant aspects of RBM training.
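For readers unfamiliar with the model, the standard formulation of a binary RBM and the contrastive divergence update mentioned in the abstract can be written down compactly. This is a sketch of the textbook definitions, not a quotation from the paper; the notation (visible units v_i, hidden units h_j, weights w_ij, biases b_i and c_j, learning rate eta, and k Gibbs sampling steps) is chosen here for illustration.

```latex
% Energy of a joint configuration of binary visible units v and hidden units h
E(\mathbf{v}, \mathbf{h}) = -\sum_{i}\sum_{j} w_{ij}\, v_i h_j - \sum_{i} b_i v_i - \sum_{j} c_j h_j

% Joint distribution defined by the energy (Z is the partition function)
p(\mathbf{v}, \mathbf{h}) = \frac{1}{Z}\, e^{-E(\mathbf{v}, \mathbf{h})},
\qquad Z = \sum_{\mathbf{v}, \mathbf{h}} e^{-E(\mathbf{v}, \mathbf{h})}

% Conditional distributions factorize because there are no
% visible-visible or hidden-hidden connections
p(h_j = 1 \mid \mathbf{v}) = \sigma\Big(\sum_{i} w_{ij} v_i + c_j\Big),
\qquad
p(v_i = 1 \mid \mathbf{h}) = \sigma\Big(\sum_{j} w_{ij} h_j + b_i\Big)

% Contrastive divergence (CD-k) weight update: v^{(0)} is a training example,
% v^{(k)} is the sample obtained after k steps of Gibbs sampling
\Delta w_{ij} = \eta \Big( p(h_j = 1 \mid \mathbf{v}^{(0)})\, v_i^{(0)}
              - p(h_j = 1 \mid \mathbf{v}^{(k)})\, v_i^{(k)} \Big)
```

Here sigma denotes the logistic sigmoid. The factorized conditionals are what make block Gibbs sampling, and therefore the MCMC-based learning algorithms discussed in the tutorial, practical.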

Stay tuned for more news 🙂

SLIPO project kick-off & HOBBIT project meeting

2017-02-08 2017-02-09 GezimSejdiu

SLIPO, a new project within the EU’s “Horizon 2020” framework program, kicked off in Athens, Greece, on 18th and 20th of January 2017.

The main goal of SLIPO is to transfer the research output generated in our previous project GeoKnow to the specific challenges of POI data. In SLIPO we introduce validated and cost-effective innovations across the POI value chain. Beyond that, we are aiming to improve the scalability of our key research frameworks, such as LinkedGeoData, DL-Learner or LIMES.

Our partners in this project are:

Find out more at http://sda.tech/projects/slipo/.

This project has received funding from the European Union’s H2020 research and innovation action program under grant agreement number 731581.

Afterward, on 1st and 2nd February in Athens, Greece, the HOBBIT project successfully held its 3rd plenary meeting at the NCSR Demokritos premises.

Each of the project’s work packages presented its recent progress, and an important discussion was held on the upcoming release of the first version of the platform. Furthermore, the project is quickly approaching the realization of its accompanying challenges at the ESWC and DEBS conferences, for which technical and organizational agreements were made.

More information on the upcoming challenges can be found at https://project-hobbit.eu/challenges.

Paper accepted at AAAI 2017

2016-12-22 2017-11-13 GezimSejdiu

We are very pleased to announce that one paper from our group got accepted for presentation at the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), which will be held on February 4–9 at the Hilton San Francisco, San Francisco, California, USA.

“Radon – Rapid Discovery of Topological Relations” by Mohamed Ahmed Sherif, Kevin Dreßler, Panayiotis Smeros, and Axel-Cyrille Ngonga Ngomo.

Abstract. Datasets containing geo-spatial resources are increasingly being represented according to the Linked Data principles. Several time-efficient approaches for discovering links between RDF resources have been developed over the last years. However, the time-efficient discovery of topological relations between geospatial resources has been paid little attention to. We address this research gap by presenting Radon, a novel approach for the rapid computation of topological relations between geo-spatial resources. Our approach uses a sparse tiling index in combination with minimum bounding boxes to reduce the computation time of topological relations. Our evaluation of Radon’s runtime on 45 datasets and in more than 800 experiments shows that it outperforms the state of the art by up to 3 orders of magnitude while maintaining an F-measure of 100%. Moreover, our experiments suggest that Radon scales up well when implemented in parallel.
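The general idea of combining a tiling index with minimum bounding boxes can be illustrated with a small Python sketch. This is a simplified illustration of the filtering principle only, not Radon's actual algorithm or the LIMES implementation; the function names, the fixed tile size, and the toy geometries are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import product
from math import floor


def mbb(polygon):
    """Minimum bounding box (min_x, min_y, max_x, max_y) of a list of (x, y) points."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def tiles(box, size):
    """Return the (column, row) grid tiles that a bounding box overlaps."""
    min_x, min_y, max_x, max_y = box
    cols = range(floor(min_x / size), floor(max_x / size) + 1)
    rows = range(floor(min_y / size), floor(max_y / size) + 1)
    return product(cols, rows)


def boxes_intersect(a, b):
    """True if two bounding boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidate_pairs(source, target, size=1.0):
    """Return (source_name, target_name) pairs whose bounding boxes intersect.

    The index is sparse: only tiles occupied by a source geometry are stored,
    so target geometries are compared only with sources in shared tiles.
    """
    source_boxes = {name: mbb(poly) for name, poly in source.items()}
    index = defaultdict(list)
    for name, box in source_boxes.items():
        for tile in tiles(box, size):
            index[tile].append(name)

    pairs = set()
    for t_name, poly in target.items():
        t_box = mbb(poly)
        for tile in tiles(t_box, size):
            for s_name in index.get(tile, ()):
                if boxes_intersect(source_boxes[s_name], t_box):
                    pairs.add((s_name, t_name))
    return pairs


# Toy data (hypothetical): only the park and the bench are close to each other.
source = {
    "park": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "lake": [(10, 10), (11, 10), (11, 11)],
}
target = {
    "bench": [(1, 1), (1.5, 1), (1.5, 1.5)],
    "island": [(20, 20), (21, 21), (20, 21)],
}
print(sorted(candidate_pairs(source, target)))  # [('park', 'bench')]
```

The surviving pairs are only candidates: an exact and more expensive geometric test would still be needed to decide which topological relation, if any, actually holds between the two geometries.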

Acknowledgments
This work is implemented in the link discovery framework LIMES and has been supported by the European Union’s H2020 research and innovation action HOBBIT (GA no. 688227) as well as the BMWI Project GEISER (project no. 01MD16014E).

SDA@ESWC16

2016-06-07 2016-06-08 GezimSejdiu

The Extended Semantic Web Conference (ESWC) 2016 is one of the major venues for discussing the latest scientific results and technologies around semantic technologies. Our members actively participated in the 13th ESWC 2016, which took place in Crete, Greece, from May 29th to June 2nd.

We are very pleased to report that:

Two papers from our group were accepted for presentation as full research papers @ESWC16

AskNow: A Framework for Natural Language Query Formalization in SPARQL by Mohnish Dubey, Sourish Dasgupta, Ankit Sharma, Konrad Höffner, Jens Lehmann.
Mohnish Dubey presented his work on Natural Language Query Formalization in SPARQL, based on the AskNow project, at the main conference. The audience showed high interest in his presentation and appreciated the natural language understanding provided by the AskNow system. The ensuing discussion covered further challenges in QA systems and constructive suggestions for possible improvements.

@MohnishDubey presented #AskNow “A Framework for Natural Language Query Formalization in SPARQL” at #eswc2016 #QuestionAnswering #SemanticWeb

— SDA Research (@SDA_Research) June 2, 2016

Semantically Enhanced Quality Assurance in the JURION Business Use Case by Dimitris Kontokostas, Christian Mader, Christian Dirschl, Katja Eck, Michael Leuthold, Jens Lehmann, Sebastian Hellmann

A Workshop paper

DBtrends: Publishing and Benchmarking RDF Ranking Functions by Edgard Marx, Amrapali J. Zaveri, Mofeed Mohammed, Sandro Rautenberg, Jens Lehmann, Axel-Cyrille Ngonga Ngomo and Gong Cheng, SumPre2016 Workshop at ESWC 2016

The workshop Know@LOD was held by Prof. Heiko Paulheim and Prof. Dr. Jens Lehmann. It featured lively discussions on combinations of the Semantic Web and machine learning.

The 5th edition of the Know@LOD workshop starting at @eswc_conf today at 9am with nine presentations! #machinelearning #semanticweb

— Jens Lehmann (@JLehmann82) May 30, 2016

Prof. Jens Lehmann took part in the two-day HOBBIT project plenary, which started on the last day of the ESWC conference. HOBBIT deals with Big Linked Data benchmarks, and at the meeting 8 different datasets were discussed along with the HOBBIT benchmarking platform and the HOBBIT association. SDA will specifically focus on question answering and faceted browsing benchmarks within the project. Already during ESWC, there was a dedicated HOBBIT event in which requirements for benchmarks and the platform were discussed.

Prof @JLehmann82 gave his insight on benchmarking – “Visualization & Services” for @hobbit_project #eswc2016

— SDA Research (@SDA_Research) June 1, 2016

ESWC16 was a great venue to meet the community, create new connections, talk about current research challenges, share ideas and settle new collaborations. We look forward to the next ESWC conference.

Until then, meet us at sda.tech!
