Category archives: Articles

A study of the discovery and redundancy of link keys between two RDF datasets based on partition pattern structures

A link key between two RDF datasets D1 and D2 is a set of pairs of properties that allows identifying pairs of individuals x1 and x2 through an identity link such as x1 owl:sameAs x2. In this paper, relying on and extending previous work, we introduce an original formalization of link key discovery based on the framework of Partition Pattern Structures (pps). Our objective is to study and evaluate the redundancy of link keys based on the fact that owl:sameAs is an equivalence relation. In the pps concept lattice, every concept has an extent representing a link key candidate and an intent representing a partition of instances into sets of equivalent instances. Experiments show three main results. First, redundancy of link keys is not very significant in real-world datasets. Nevertheless, the link key discovery approach based on pps returns a reduced number of non-redundant link key candidates when compared to a standard approach. Moreover, the pps-based approach is efficient and returns link keys of high quality.
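As a rough illustration of the notions in this abstract, the following minimal sketch generates identity links from a link key candidate and derives the partition that those links induce once owl:sameAs is treated as an equivalence relation. It is not the pps-based method of the paper. The toy datasets, the property names, the helper names `links` and `partition`, and the choice that two individuals are linked when they share at least one value for every property pair are all assumptions made for this example.

```python
from itertools import product

def links(d1, d2, link_key):
    """Pairs (x1, x2) sharing at least one value for every property pair.

    Assumption: this 'shared value' reading of a link key is chosen for
    illustration only; it is not taken from the abstract.
    """
    result = set()
    for (x1, props1), (x2, props2) in product(d1.items(), d2.items()):
        if all(props1.get(p, set()) & props2.get(q, set()) for p, q in link_key):
            result.add((x1, x2))
    return result

def partition(pairs):
    """Equivalence classes obtained by closing the links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)
    classes = {}
    for x in list(parent):
        classes.setdefault(find(x), set()).add(x)
    return {frozenset(c) for c in classes.values()}

# Hypothetical toy datasets: individual -> property -> set of values.
d1 = {"a1": {"name": {"Ada"}, "born": {"1815"}},
      "a2": {"name": {"Alan"}, "born": {"1912"}}}
d2 = {"b1": {"label": {"Ada"}, "year": {"1815"}},
      "b2": {"label": {"Alan"}, "year": {"1912"}}}

k1 = {("name", "label")}
k2 = {("name", "label"), ("born", "year")}

# Two candidates inducing the same partition carry the same identity
# information, which is the intuition behind redundancy in this sketch.
same_partition = partition(links(d1, d2, k1)) == partition(links(d1, d2, k2))
```

On this toy data both candidates link a1 with b1 and a2 with b2, so they induce the same partition. Two different sets of links can also induce the same partition once transitivity is applied, which is why the comparison is made on partitions rather than on raw link sets.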

Towards Analyzing Variability in Space and Time of Products from a Product Line using Triadic Concept Analysis

In this paper, we report on ongoing work exploring the ability of Triadic Concept Analysis to provide a framework for analyzing the evolution of products in time and space, and we highlight possible usages in the lifecycle of a product line.

A Triadic Generalisation of the Boolean Concept Lattice

Boolean concept lattices are fundamental structures in formal concept analysis, both from a theoretical and an applied point of view. There are multiple ways to generalise them in the triadic concept analysis framework, and one of them, the so-called powerset trilattice, was already proposed by Biedermann in 1998. However, it lacks some interesting properties such as extremality in the number of triconcepts for tricontexts of a given size. In this paper, we discuss another generalisation of Boolean concept lattices that exhibits such properties. We argue that those structures form equivalence classes and should be studied as such, and we investigate the minimum number of objects required to produce them.
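For readers less familiar with the dyadic case that this abstract starts from, the sketch below enumerates by brute force the formal concepts of a small context whose incidence relation is "object differs from attribute" (the contranominal scale). Its concept lattice is Boolean, with 2^n concepts for n objects. The function name `concepts` and the choice n = 3 are assumptions made for this example; the triadic generalisation discussed in the paper is not covered here.

```python
from itertools import combinations

def concepts(objects, attributes, incidence):
    """All formal concepts (extent, intent) of a small context, by brute force."""
    def intent(objs):
        # Attributes shared by every object in objs.
        return frozenset(m for m in attributes
                         if all((g, m) in incidence for g in objs))

    def extent(attrs):
        # Objects having every attribute in attrs.
        return frozenset(g for g in objects
                         if all((g, m) in incidence for m in attrs))

    found = set()
    for r in range(len(objects) + 1):
        for subset in combinations(objects, r):
            b = intent(subset)
            found.add((extent(b), b))  # closing the subset yields a concept
    return found

n = 3
objs = list(range(n))
attrs = list(range(n))
# Contranominal scale: object g has attribute m exactly when g != m.
incidence = {(g, m) for g in objs for m in attrs if g != m}

boolean_concepts = concepts(objs, attrs, incidence)
print(len(boolean_concepts))  # 2**n concepts, i.e. 8 for n = 3
```

Every subset of objects is an extent in this context, which is what makes the number of concepts extremal for a context of this size.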

Non-Redundant Link Keys in RDF Data: Preliminary Steps

A link key between two RDF datasets D1 and D2 is a set of pairs of properties that allows identifying pairs of individuals, say x1 in D1 and x2 in D2, which can be materialized as an x1 owl:sameAs x2 identity link. There exist several ways to mine such link keys, but none of them takes into account the fact that owl:sameAs is an equivalence relation, which leads to the discovery of non-redundant link keys. Accordingly, in this paper, we present link key discovery based on Pattern Structures (PS). PS output a pattern concept lattice where every concept has an extent representing a set of pairs of individuals and an intent representing the related link key candidate. Then, we discuss the equivalence relation induced by a link key and we introduce the notion of non-redundant link key candidate.

A Hybrid Approach to Identifying the Most Predictive and Discriminant Features in Supervised Classification Problems

In this paper, we are interested in the predictive and discriminant nature of features in supervised classification problems. We discuss the notions of prediction and discrimination and propose a hybrid approach combining supervised classifiers, model explanation, multicriteria decision making and pattern mining for identifying the most predictive and discriminant features in a dataset. The explanation of models learned by supervised classifiers produces rankings of features according to various performance measures. Based on that, multicriteria decision making and pattern mining methods are used to, respectively, select the most important features and interpret their role in terms of prediction and discrimination. Finally, we present and discuss two experiments on public datasets illustrating the potential of the approach.

Sandwich: An Algorithm for Discovering Relevant Link Keys in an LKPS Concept Lattice

The discovery of link keys between two RDF datasets allows the identification of individuals which share common key characteristics. Actually, link keys correspond to closed sets of a specific Galois connection and can be discovered thanks to an FCA-based algorithm. In this paper, given a pattern concept lattice where each concept intent is a link key candidate, we aim at identifying the most relevant candidates w.r.t. adapted quality measures. To achieve this task, we introduce the "Sandwich" algorithm, which is based on a combination of two dual bottom-up and top-down strategies for traversing the pattern concept lattice. The output of the Sandwich algorithm is a poset of the most relevant link key candidates. We provide details about the quality measures applicable to the selection of link keys and about the Sandwich algorithm, as well as a discussion of the benefits of our approach.

Steps Towards Causal Formal Concept Analysis

Efficiently discovering causal relations from data and representing them in a way that facilitates their use is an important problem in science that has received much attention. In this paper, we propose an adaptation of the Formal Concept Analysis formalism to the problem of discovering and representing causal relations. We show that Formal Concept Analysis structures and algorithms are well-suited to this problem.

A Novel Framework for Unification of Association Rule Mining, Online Analytical Processing and Statistical Reasoning

Statistical reasoning was one of the earliest methods to draw insights from data. However, over the last three decades, association rule mining and online analytical processing have gained massive ground in practice and theory. Logically, both association rule mining and online analytical processing have some common objectives, but they have been introduced with their own set of mathematical formalizations and have developed their specific terminologies. Therefore, it is difficult to reuse results from one domain in another. Furthermore, it is not easy to unlock the potential of statistical results in their application scenarios. The goal of this paper is to bridge the artificial gaps between association rule mining, online analytical processing and statistical reasoning. We first provide an elaboration of the semantic correspondences between their foundations, i.e., itemset apparatus, relational algebra and probability theory. Subsequently, we propose a novel framework for the unification of association rule mining, online analytical processing and statistical reasoning. Additionally, an instance of the proposed framework is developed by implementing a sample decision support tool. The tool is compared with a state-of-the-art decision support tool and evaluated by a series of experiments using two real data sets and one synthetic data set. The results of the tool validate the framework for the unified usage of association rule mining, online analytical processing, and statistical reasoning. The tool clarifies to what extent the operations of association rule mining and online analytical processing can complement each other in understanding data, data visualization and decision making.

Some Notes on Polyadic Concept Analysis

Despite the popularity of Formal Concept Analysis (FCA) as a mathematical framework for data analysis, some of its extensions are still considered arcane. Polyadic Concept Analysis (PCA) is one of the most promising yet understudied of these extensions. This formalism offers many interesting open questions but is hindered in its dissemination by complex notations and a lack of agreed-upon basic definitions. In this paper, we discuss in a mostly informal way the fundamental differences between FCA and PCA in the relation between contexts, conceptual structures, and rules. We identify open questions, present partial results on the maximal size of concept n-lattices and suggest new research directions.

Explaining multicriteria decision making with formal concept analysis

Multicriteria decision making aims at helping a decision maker choose the best solutions among alternatives compared against multiple conflicting criteria. The reasons why an alternative is considered among the best are not always clearly explained. In this paper, we propose an approach that uses formal concept analysis and background knowledge on the criteria to explain the presence of alternatives on the Pareto front of a multicriteria decision problem.
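To make the notion of Pareto front concrete, here is a minimal sketch that computes the non-dominated alternatives of a small decision problem. It only illustrates the front itself, not the explanation approach of the paper. The alternative names, the scores, the helper names `dominates` and `pareto_front`, and the assumption that every criterion is to be maximized are all hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better once.

    Assumption: higher scores are better on every criterion.
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(alternatives):
    """Names of the alternatives that no other alternative dominates."""
    return {
        name
        for name, scores in alternatives.items()
        if not any(dominates(other_scores, scores)
                   for other, other_scores in alternatives.items()
                   if other != name)
    }

# Hypothetical alternatives scored on two conflicting criteria.
alternatives = {"A": (3, 1), "B": (1, 3), "C": (1, 1), "D": (2, 2)}
front = pareto_front(alternatives)
```

Here C is dominated by D, while A, B and D are mutually incomparable and therefore all belong to the front. The front lists the best alternatives without saying why each one is there, which is the gap the abstract addresses.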