Author archives: Alexandre Bazin

Distances Between Formal Concept Analysis Structures

In this paper, we study the notion of distance between the most important structures of formal concept analysis: formal contexts, concept lattices, and implication bases. We first define three families of Minkowski-like distances between these three structures. We then present experiments showing that the correlations of these measures are low and depend on the distance between formal contexts.
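The abstract does not give the definitions of the three families of distances. As a purely illustrative sketch, and not the paper's definition, a Minkowski distance of order p between two formal contexts of the same size can be computed by treating each context as a 0/1 matrix. The function name `minkowski_distance` and the matrix encoding are assumptions made for this example.

```python
def minkowski_distance(context_a, context_b, p=2):
    """Minkowski distance of order p between two equally sized 0/1 matrices.

    Assumption: each context is a list of rows (objects), each row a list of
    0/1 entries (attributes). This is an illustration, not the paper's measure.
    """
    same_shape = len(context_a) == len(context_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(context_a, context_b)
    )
    if not same_shape:
        raise ValueError("contexts must have the same shape")
    # Sum |a - b|^p over all cells, then take the p-th root.
    total = sum(
        abs(a - b) ** p
        for row_a, row_b in zip(context_a, context_b)
        for a, b in zip(row_a, row_b)
    )
    return total ** (1 / p)
```

With p = 1 this counts the cells on which the two contexts disagree. Comparable distances between concept lattices or implication bases would need their own encodings, which the abstract does not specify.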

Extraction de connaissances basée sur l’analyse formelle de concepts en vue de l’assistance aux débats en ligne (Knowledge Extraction Based on Formal Concept Analysis for Assisting Online Debates)

We present an automated process for supporting debates that extracts associations between terms from the keyword lists of arguments, lists that are co-constructed by the users and our indexing system. Indexing encourages users to complete or correct the keyword list, acting as an incentive to develop more structured points of view. The algorithm is based on formal concept analysis and relies on the JeuxDeMots (JDM) knowledge base. The procedure involves several modules leading to a knowledge extraction step that produces implications intended to be integrated into JDM. This cooperative approach allows the knowledge base to grow as debates are analyzed, improving the keywords suggested by the platform.

Discovery of link keys in resource description framework datasets based on pattern structures

In this paper, we present a detailed and complete study of data interlinking and the discovery of identity links between two RDF (Resource Description Framework) datasets over the web of data. Data interlinking is the task of discovering identity links between individuals across datasets. Link keys are constructions based on pairs of properties and classes that can be considered as rules for inferring identity links between subjects in two RDF datasets. Here we investigate how FCA (Formal Concept Analysis) and its extensions are well suited to supporting the discovery of link keys. Indeed, plain FCA makes it possible to discover the so-called link key candidates, while a specific pattern structure makes it possible to associate a pair of classes with every candidate. Different link key candidates can generate sets of identity links between individuals that can be considered equal when they are regarded as partitions of the identity relation, which involves a kind of redundancy. In this paper, this redundancy is studied in depth using partition pattern structures. In particular, experiments show that the redundancy of link key candidates, while not significant when based on identity of partitions, appears to be much more significant when based on similarity.

Polyadic Relational Concept Analysis

Formal concept analysis is a mathematical framework based on lattice theory that aims at representing the information contained in binary object-attribute datasets (called formal contexts) in the form of a lattice of so-called formal concepts. Since its introduction, it has been extended to more complex types of data. In this paper, we are interested in two of those extensions: relational concept analysis and polyadic concept analysis, which make it possible to process, respectively, relational data and $n$-ary relations. We present a framework for polyadic relational concept analysis that extends relational concept analysis to relational datasets made of $n$-ary relations. We show its basic properties and that it is a valid extension of relational concept analysis.
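To make the notions of formal context and formal concept concrete, here is a minimal sketch of the basic, dyadic case only. It does not cover the relational or polyadic extensions discussed in the paper. The toy data and the names `CONTEXT`, `extent`, `intent` and `formal_concepts` are assumptions made for this example. A formal concept is a pair (extent, intent) in which the extent is exactly the set of objects sharing all attributes of the intent, and the intent is exactly the set of attributes shared by all objects of the extent.

```python
from itertools import combinations

# Toy formal context (hypothetical data): object -> set of attributes.
CONTEXT = {
    "duck": {"flies", "swims"},
    "eagle": {"flies", "hunts"},
    "shark": {"swims", "hunts"},
}
ATTRIBUTES = set().union(*CONTEXT.values())


def extent(attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, a in CONTEXT.items() if attrs <= a)


def intent(objs):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set(ATTRIBUTES)
    for o in objs:
        shared &= CONTEXT[o]
    return frozenset(shared)


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every attribute subset.

    Brute force, exponential in the number of attributes: fine for a toy
    context, not a substitute for dedicated FCA algorithms.
    """
    concepts = set()
    attrs = sorted(ATTRIBUTES)
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            objs = extent(set(subset))
            concepts.add((objs, intent(objs)))
    return concepts
```

Ordering the resulting pairs by inclusion of their extents gives the concept lattice of the context.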

A study of the discovery and redundancy of link keys between two RDF datasets based on partition pattern structures

A link key between two RDF datasets D1 and D2 is a set of pairs of properties that makes it possible to identify pairs of individuals x1 and x2 through an identity link such as x1 owl:sameAs x2. In this paper, relying on and extending previous work, we introduce an original formalization of link key discovery based on the framework of Partition Pattern Structures (pps). Our objective is to study and evaluate the redundancy of link keys based on the fact that owl:sameAs is an equivalence relation. In the pps concept lattice, every concept has an extent representing a link key candidate and an intent representing a partition of instances into sets of equivalent instances. Experiments show three main results. First, redundancy of link keys is not very significant in real-world datasets. Nevertheless, the link key discovery approach based on pps returns a reduced number of non-redundant link key candidates when compared to a standard approach. Moreover, the pps-based approach is efficient and returns link keys of high quality.

A Triadic Generalisation of the Boolean Concept Lattice

Boolean concept lattices are fundamental structures in formal concept analysis, both from a theoretical and an applied point of view. There are multiple ways to generalise them in the triadic concept analysis framework and one of them, the so-called powerset trilattice, was already proposed by Biedermann in 1998. However, it lacks some interesting properties such as extremality in the number of triconcepts for tricontexts of a given size. In this paper, we discuss another generalisation of Boolean concept lattices that exhibits such properties. We argue that those structures form equivalence classes and should be studied as such, and investigate the minimum number of objects required to produce them.

Non-Redundant Link Keys in RDF Data: Preliminary Steps

A link key between two RDF datasets D1 and D2 is a set of pairs of properties that makes it possible to identify pairs of individuals, say x1 in D1 and x2 in D2, which can be materialized as an x1 owl:sameAs x2 identity link. There exist several ways to mine such link keys, but none of them takes into account the fact that owl:sameAs is an equivalence relation, a property that leads to the discovery of non-redundant link keys. Accordingly, in this paper, we present link key discovery based on Pattern Structures (PS). PS output a pattern concept lattice where every concept has an extent representing a set of pairs of individuals and an intent representing the related link key candidate. Then, we discuss the equivalence relation induced by a link key and we introduce the notion of non-redundant link key candidate.
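The idea of an equivalence relation induced by a link key can be sketched as follows. This is a simplified illustration under stated assumptions, not the formalization used in the paper. The toy datasets, the property names and the helper functions are hypothetical. Two individuals are linked here when they share at least one value on every property pair of the key, and the resulting links are then closed under reflexivity, symmetry and transitivity with a union-find structure.

```python
# Hypothetical toy datasets: individual -> {property: set of values}.
D1 = {
    "a1": {"name": {"Ada"}, "born": {"1815"}},
    "a2": {"name": {"Alan"}, "born": {"1912"}},
}
D2 = {
    "b1": {"label": {"Ada"}, "year": {"1815"}},
    "b2": {"label": {"Alan"}, "year": {"1912"}},
    "b3": {"label": {"Ada"}, "year": {"1815"}},
}

# Hypothetical link key: pairs (property in D1, property in D2).
LINK_KEY = {("name", "label"), ("born", "year")}


def identity_links(d1, d2, link_key):
    """Pairs (x1, x2) sharing at least one value on every property pair."""
    links = set()
    for x1, props1 in d1.items():
        for x2, props2 in d2.items():
            if all(props1.get(p, set()) & props2.get(q, set()) for p, q in link_key):
                links.add((x1, x2))
    return links


def equivalence_classes(links):
    """Partition the linked individuals into classes of the induced equivalence."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for x1, x2 in links:
        parent[find(x1)] = find(x2)

    classes = {}
    for x in list(parent):
        classes.setdefault(find(x), set()).add(x)
    return {frozenset(members) for members in classes.values()}
```

In this sketch, two link key candidates that produce different sets of links but the same partition carry the same identity information, which is the intuition behind treating one of them as redundant.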

A Hybrid Approach to Identifying the Most Predictive and Discriminant Features in Supervised Classification Problems

In this paper, we are interested in the predictive and discriminant nature of features in supervised classification problems. We discuss the notions of prediction and discrimination and propose a hybrid approach combining supervised classifiers, model explanation, multicriteria decision making and pattern mining for identifying the most predictive and discriminant features in a dataset. The explanation of models learned by supervised classifiers produces rankings of features according to various performance measures. Based on that, multicriteria decision making and pattern mining methods are used to, respectively, select the most important features and interpret their role in terms of prediction and discrimination. Finally, we present and discuss two experiments on public datasets illustrating the potential of the approach.

Sandwich: An Algorithm for Discovering Relevant Link Keys in an LKPS Concept Lattice

The discovery of link keys between two RDF datasets allows the identification of individuals that share common key characteristics. In fact, link keys correspond to closed sets of a specific Galois connection and can be discovered with an FCA-based algorithm. In this paper, given a pattern concept lattice where each concept intent is a link key candidate, we aim at identifying the most relevant candidates with respect to suitable quality measures. To achieve this task, we introduce the "Sandwich" algorithm, which is based on a combination of two dual strategies, bottom-up and top-down, for traversing the pattern concept lattice. The output of the Sandwich algorithm is a poset of the most relevant link key candidates. We provide details about the quality measures applicable to the selection of link keys and about the Sandwich algorithm, as well as a discussion of the benefits of our approach.