2020
Activity report
Project-Team
GRAPHIK
RNSR: 201019618K
Research center
In partnership with:
CNRS, INRAE, Université de Montpellier
Team name:
GRAPHs for Inferences and Knowledge representation
In collaboration with:
Laboratoire d'informatique, de robotique et de microélectronique de Montpellier (LIRMM)
Domain
Perception, Cognition and Interaction
Theme
Data and Knowledge Representation and Processing
Creation of the Project-Team: 2010 January 01

# Keywords

• A3.1.1. Modeling, representation
• A3.2.1. Knowledge bases
• A3.2.3. Inference
• A3.2.5. Ontologies
• A7.2. Logic in Computer Science
• A9.1. Knowledge
• A9.6. Decision support
• A9.7. AI algorithmics
• A9.8. Reasoning
• B3.1. Sustainable development
• B9.5.6. Data science
• B9.7.2. Open data

# 1 Team members, visitors, external collaborators

## Research Scientists

• Jean-François Baget [Inria, Researcher]
• Pierre Bisquert [Institut national de recherche pour l'agriculture, l'alimentation et l'environnement, Researcher]

## Faculty Members

• Marie-Laure Mugnier [Team leader, Univ de Montpellier, Professor, HDR]
• Michel Chein [Univ de Montpellier, Emeritus, HDR]
• Madalina Croitoru [Univ de Montpellier, Associate Professor, HDR]
• Jérôme Fortin [Univ de Montpellier, Associate Professor]
• Michel Leclère [Univ de Montpellier, Associate Professor]
• Federico Ulliana [Univ de Montpellier, Associate Professor]

## PhD Students

• Martin Jedwabny [Univ de Montpellier]
• Elie Najm [Inria]
• Guillaume Perution Kihli [Inria, from Sep 2020]
• Olivier Rodriguez [Inria]

## Technical Staff

• Florent Tornil [Inria, Engineer, from Sep 2020]

## Interns and Apprentices

• Guillaume Perution Kihli [Inria, from Feb 2020 until Jul 2020]
• Noel Rodriguez [Inria, from Jul 2020 until Aug 2020]

• Annie Aliaga [Inria]

## External Collaborators

• Meghyn Bienvenu [CNRS, HDR]
• Patrice Buche [Institut national de recherche pour l'agriculture, l'alimentation et l'environnement, HDR]
• Alain Gutierrez [CNRS]
• Rallou Thomopoulos [Institut national de recherche pour l'agriculture, l'alimentation et l'environnement, HDR]

# 2 Overall objectives

## 2.1 Logic and Graph-based KR

The main research domain of GraphIK is Knowledge Representation and Reasoning (KR), which studies paradigms and formalisms for representing knowledge and reasoning on these representations. A large part of our work is strongly related to data management and database theory.

We develop logical languages, which mainly correspond to fragments of first-order logic. However, we also use graphs and hypergraphs (in the graph-theoretic sense) as basic objects. Indeed, we view labelled graphs as an abstract representation of knowledge that can be expressed in many KR languages: different kinds of conceptual graphs —historically our main focus—, the Semantic Web language RDFS, expressive rules equivalent to so-called tuple-generating-dependencies in databases, some description logics dedicated to query answering, etc. For these languages, reasoning can be based on the structure of objects (thus on graph-theoretic notions) while being sound and complete with respect to entailment in the associated logical fragments. An important issue is to study trade-offs between the expressivity and the computational tractability of (sound and complete) reasoning in these languages.

## 2.2 From Theory to Applications, and Vice-versa

We study logic- and graph-based KR formalisms from three perspectives:

• theoretical (structural properties, expressiveness, translations between languages, problem complexity, algorithm design),
• software (developing tools to implement theoretical results),
• applications (formalizing practical issues and solving them with our techniques, which also feeds back into theoretical work).

## 2.3 Main Challenges

GraphIK focuses on some of the main challenges in KR:

• ontological query answering: querying large, complex or heterogeneous datasets, provided with an ontological layer;
• reasoning with rule-based languages;
• reasoning in the presence of inconsistencies;
• decision making.

## 2.4 Scientific Directions

Our research work is currently organized into two research lines, both with theoretical and applied sides:

1. Ontology-mediated query answering (OMQA). Modern information systems are often structured around an ontology, which provides a high-level vocabulary, as well as knowledge relevant to the target domain, and enables a uniform access to possibly heterogeneous data sources. As many complex tasks can be recast in terms of query answering, the question of querying data while taking into account inferences enabled by ontological knowledge has become a fundamental issue. This gives rise to the notion of a knowledge base, composed of an ontology and a factbase, both described using a KR language. The factbase can be seen as an abstraction of several data sources, and may actually remain virtual. The topical ontology-mediated query answering (OMQA) problem asks for all answers to queries that are logically entailed by the given knowledge base.
2. Reasoning with imperfect knowledge and decision support. To solve real-world problems we often need to consider features that cannot be expressed purely (or naturally) in classical logic. Indeed, information is often “imperfect”: it can be partially contradictory, vague or uncertain, etc. In recent years, we have mostly considered reasoning in the presence of conflicts, where contradictory information may come from the data or from the ontology. This requires defining appropriate semantics that provide meaningful answers to queries while taming the increase in computational complexity. Reasoning also becomes more complex from a conceptual viewpoint, hence explaining results to an end-user is an important issue as well. Such questions are natural extensions of those studied in the first axis. On the other hand, the work of this axis is also motivated by applications provided by our INRAE partners, where the knowledge to be represented intrinsically features several viewpoints and involves different stakeholders with divergent priorities, while a decision has to be made. Beyond the representation of conflicting knowledge itself, this raises arbitration issues. The aim here is to support decision making with tools that help elicit and represent relevant knowledge, including the stakeholders' preferences and motivations, compute syntheses of compatible options, and propose justified decisions.

# 3 Research program

## 3.1 Logic-based Knowledge Representation and Reasoning

We follow the mainstream logic-based approach to knowledge representation (KR). First-order logic (FOL) is the reference logic in KR and most formalisms in this area can be translated into fragments (i.e., particular subsets) of FOL. This is in particular the case for description logics and existential rules, two well-known KR formalisms studied in the team.

A large part of research in this domain can be seen as studying trade-offs between the expressivity of languages and the complexity of (sound and complete) reasoning in these languages. The fundamental problem in KR languages is entailment checking: is a given piece of knowledge entailed by other pieces of knowledge, for instance from a knowledge base (KB)? Another important problem is consistency checking: is a set of knowledge pieces (for instance the knowledge base itself) consistent, i.e., is it sure that nothing absurd can be entailed from it? The ontology-mediated query answering problem is a topical problem (see Section 3.3). It asks for the set of answers to a query in the KB. In the case of Boolean queries (i.e., queries with a yes/no answer), it can be recast as entailment checking.

## 3.2 Graph-based Knowledge Representation and Reasoning

Besides logical foundations, we are interested in KR formalisms that comply, or aim at complying with the following requirements: to have good computational properties and to allow users of knowledge-based systems to have a maximal understanding and control over each step of the knowledge base building process and use.

These two requirements are the core motivations for our graph-based approach to KR. We view labelled graphs as an abstract representation of knowledge that can be expressed in many KR languages: different kinds of conceptual graphs (historically our main focus), the Semantic Web language RDF (Resource Description Framework), its extension RDFS (RDF Schema), expressive rules equivalent to the so-called tuple-generating dependencies in databases, some description logics dedicated to query answering, etc. For these languages, reasoning can be based on the structure of objects, thus on graph-theoretic notions, while staying logically founded.

More precisely, our basic objects are labelled graphs (or hypergraphs) representing entities and relationships between these entities. These graphs have a natural translation into first-order logic. Our basic reasoning tool is graph homomorphism. The fundamental property is that graph homomorphism is sound and complete with respect to logical entailment, i.e., given two (labelled) graphs $G$ and $H$, there is a homomorphism from $G$ to $H$ if and only if the formula assigned to $G$ is entailed by the formula assigned to $H$. In other words, logical reasoning on these graphs can be performed by graph mechanisms. These knowledge constructs and the associated reasoning mechanisms can be extended (to represent rules for instance) while keeping this fundamental correspondence between graphs and logics.
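
As an illustration of this correspondence, the following minimal Python sketch searches for a homomorphism between two sets of atoms by backtracking. The representation (atoms as tuples, variables as strings starting with `?`) and the names `is_var` and `find_homomorphism` are assumptions made for this example; they do not come from any GraphIK tool.

```python
def is_var(term):
    """Assumed convention: variables are strings starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


def find_homomorphism(source, target, mapping=None):
    """Return a substitution mapping all `source` atoms into `target`, or None.

    Atoms are tuples (predicate, arg1, ..., argn). Constants must map to
    themselves; variables of `source` may map to any term of `target`.
    """
    mapping = dict(mapping or {})
    if not source:
        return mapping
    first, rest = source[0], source[1:]
    for candidate in target:
        if candidate[0] != first[0] or len(candidate) != len(first):
            continue
        extended = dict(mapping)
        compatible = True
        for s, t in zip(first[1:], candidate[1:]):
            if is_var(s):
                # Bind the variable, or check it agrees with an earlier binding.
                if extended.setdefault(s, t) != t:
                    compatible = False
                    break
            elif s != t:
                compatible = False
                break
        if compatible:
            result = find_homomorphism(rest, target, extended)
            if result is not None:
                return result
    return None


# G: "someone works for something located in Montpellier" (hypothetical data)
G = [("worksFor", "?x", "?y"), ("locatedIn", "?y", "montpellier")]
# H: a small fact graph
H = [("worksFor", "alice", "lirmm"), ("locatedIn", "lirmm", "montpellier")]

print(find_homomorphism(G, H))
```

Here a homomorphism from `G` to `H` exists, which corresponds to the formula of `G` being entailed by the formula of `H`. The search is exponential in the worst case, as expected for a problem of this kind; the sketch makes no attempt at optimization.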

## 3.3 Ontology-Mediated Query Answering

Querying knowledge bases has become a central problem in knowledge representation and in databases. A knowledge base is classically composed of a terminological part (metadata, ontology) and an assertional part (facts, data). Queries are supposed to be at least as expressive as the basic queries in databases, i.e., conjunctive queries, which can be seen as existentially closed conjunctions of atoms or as labelled graphs. The challenge is to define good trade-offs between the expressivity of the ontological language and the complexity of querying data in the presence of ontological knowledge. Description logics have so far been the prominent family of formalisms for representing and reasoning with ontological knowledge. However, classical description logics were not designed for efficient data querying. On the other hand, database languages are able to process complex queries on huge databases, but without taking the ontology into account. There is thus a need for new languages and mechanisms, able to cope with the ever-growing size of knowledge bases in the Semantic Web or in scientific domains.

This problem is related to two other problems identified as fundamental in KR:

• Query answering with incomplete information. Incomplete information means that it might be unknown whether a given assertion is true or false. Databases classically make the so-called closed-world assumption: every fact that cannot be retrieved or inferred from the base is assumed to be false. Knowledge bases classically make the open-world assumption: if something cannot be inferred from the base, and neither can its negation, then its truth status is unknown. The need to cope with incomplete information is a distinctive feature of querying knowledge bases with respect to querying classical databases (however, as explained above, this distinction tends to disappear). The presence of incomplete information makes the query answering task much more difficult.
• Reasoning with rules. Investigating types of rules and adequate ways of processing them is a mainstream topic in the Semantic Web and, more generally, a crucial issue for knowledge-based systems. For several years, we have been studying rules, both in their logical and their graph form, which are syntactically very simple but also very expressive. These rules, known as existential rules or Datalog$+$, can be seen as an abstraction of ontological knowledge expressed in the main languages used in the context of KB querying.
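
For illustration, a hypothetical existential rule stating that every researcher is a member of some team could be written as follows (the predicate names are assumptions chosen for this example):

$$
\forall x\, \big(\mathit{researcher}(x) \rightarrow \exists y\, (\mathit{memberOf}(x, y) \wedge \mathit{team}(y))\big)
$$

The existentially quantified variable $y$ in the rule head asserts the existence of an entity that need not be named in the factbase. This ability is what distinguishes existential rules from plain Datalog rules, and it is also a source of incomplete information in the open-world sense described above.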

## 3.4 Inconsistency and Decision Making

While classical FOL is the kernel of many KR languages, to solve real-world problems we often need to consider features that cannot be expressed purely (or not naturally) in classical logic. The logic- and graph-based formalisms used for the previous points thus have to be extended with such features. The following requirements have been identified from scenarios in decision making, with a focus on the agronomy domain:

• to cope with inconsistency;
• to cope with defeasible knowledge;
• to take into account different and potentially conflicting viewpoints;
• to integrate decision notions (priorities, gravity, risk, benefit).

Although the solutions we develop need to be validated on the applications that motivated them, we also want them to be sufficiently generic to be applied in other contexts. One angle of attack (but not the only possible one) consists in increasing the expressivity of our core languages, while trying to preserve their essential combinatorial properties, so that algorithmic optimizations can be transferred to these extensions.

# 4 Application domains

## 4.1 Agronomy

Agronomy is a strong expertise domain in the Montpellier area. Some members of GraphIK are INRAE researchers (computer scientists). We closely collaborate with the Montpellier research laboratory IATE, a joint unit of INRAE and other organizations. A major issue for INRAE, and more specifically for IATE applications, is modeling agrifood chains (i.e., the chain of all processes leading from the plants to the final products, including waste treatment). This modeling has several objectives. It provides a better understanding of the processes from beginning to end, which aids decision making, with the aim of improving the quality of the products and decreasing the environmental impact. It also facilitates knowledge sharing between researchers, as well as the capitalization of expert knowledge and “know-how”. This last point is particularly important in areas strongly related to local know-how (such as cheese or wine making), where knowledge is transmitted by experience, with the risk that specific skills are lost. An agrifood chain analysis is a highly complex procedure since it relies on numerous criteria of various types: environmental, economic, functional, sanitary, etc. Quality objectives involve different stakeholders: technicians, managers, professional organizations, end-users, public organizations, etc. Since the goals of the stakeholders involved may diverge, dedicated knowledge and representation techniques must be employed.

## 4.2 Data Journalism

One of today’s major issues in data science is to design techniques and algorithms that allow analysts to efficiently infer useful information and knowledge by inspecting heterogeneous information sources, from structured data to unstructured content. We take data journalism as an emblematic use case, which stands at the crossroads of multiple research fields: content analysis, data management, knowledge representation and reasoning, visualization and human-machine interaction. We are particularly interested in issues raised by the design of data and knowledge management systems that will support data journalism. These systems include an ontology (which typically expresses domain knowledge), heterogeneous data sources (provided with their own vocabulary and querying capabilities), and mappings that relate these data sources to the ontological vocabulary. Ontologies play a central role as they act both as a mediation layer that glues together pieces of knowledge extracted from data sources, and as an inference layer that allows new knowledge to be derived.

Besides pure knowledge representation and reasoning issues, querying such systems raises issues at the crossroads of data and knowledge management. In particular, although mappings have been widely investigated in databases, they need to be revisited in the light of the reasoning capabilities enabled by the ontology. More generally, the consistency and the efficiency of the system cannot be ensured by considering the components of the system in isolation (i.e., the ontology, data sources and mappings), but require studying the interactions between these components and considering the system as a whole.

# 5 Social and environmental responsibility

Since January 2020, Pierre Bisquert has been a member of the national INRAE DigigrAL thinking group. This group aims to provide reflections, in the form of reports, on the technological, societal and ethical impacts of digital technologies in agriculture. Some questions of interest are, among others: In what way might digitalization redefine power relations between citizens, consumers and industries? Where does the responsibility lie when using a decision support tool? How can massive data production be sustained? This group meets monthly and is composed of 13 researchers, each representing a department of the INRAE institute.

# 6 Highlights of the year

## 6.1 Awards

Maxime Buron, jointly supervised by François Goasdoué (IRISA/CEDAR), Ioana Manolescu (CEDAR) and Marie-Laure Mugnier (GraphIK), obtained the BDA 2020 PhD prize for his PhD thesis entitled “Efficient Reasoning on Large and Heterogeneous Graphs”. BDA is the conference of the French research community on data management. https://bda.lip6.fr/

# 7 New software and platforms

## 7.1 New software

### 7.1.1 GRAAL

• Keywords: Knowledge database, Ontologies, Querying, Data management
• Scientific Description: Graal is a Java toolkit dedicated to querying knowledge bases within the framework of existential rules, aka Datalog+/-.
• Functional Description: Graal has been designed in a modular way, in order to facilitate software reuse and extension. It should make it easy to test new scenarios and techniques, in particular by combining algorithms. The main features of Graal are currently the following: (1) a data layer that provides generic interfaces to store various kinds of data and query them with (unions of) conjunctive queries, currently: MySQL, PostgreSQL, SQLite, in-memory graph and linked list structures, (2) an ontological layer, where an ontology is a set of existential rules, (3) a knowledge base layer, where a knowledge base is composed of a fact base (abstraction of the data via generic interfaces) and an ontology, (4) algorithms to process ontology-mediated queries, based on query rewriting and/or forward chaining (or chase), (5) a rule analyzer, which performs a syntactic and structural analysis of an existential rule set, (6) several IO formats, including imports from OWL 2.
• Release Contributions:

Beta version (2020) provides improved chase algorithms. Available for internal use on gite.lirmm.fr

Previous versions: version 1.3.1 (2018), 1.3.0 (2017).

• News of the Year: 2020: beta version with improved chase algorithms. 2018: Version 1.3.1, with small bug fixes and minor improvements. Several new functionalities were developed during internships in 2018 but the code is not integrated into Graal yet. 2017: New stable version (1.3.0) released. Moreover, the Graal website has been deeply restructured and enriched with new tools, available online or for download, and documentation including tutorials, examples of use, and technical documentation about all Graal modules.
• Authors: Clément Sipieter, Marie-Laure Mugnier, Jean-François Baget, Mélanie König, Michel Leclère, Swan Rocher, Guillaume Perution Kihli
• Contacts: Marie-Laure Mugnier, Federico Ulliana
• Participants: Marie-Laure Mugnier, Jean-François Baget, Michel Leclère, Federico Ulliana, Guillaume Perution Kihli, Olivier Rodriguez, Florent Tornil

### 7.1.2 Obi-Wan

• Name: Obi-Wan
• Keywords: RDF, Databases
• Scientific Description: Obi-Wan provides query answering on heterogeneous data sources integrated through mappings into a (possibly virtual) RDFS factbase, provided with an RDFS ontology and RDFS entailment rules
• Functional Description: Integration system for heterogeneous databases with an RDFS ontology
• Contact: Maxime Buron
• Participants: Maxime Buron, François Goasdoué, Ioana Manolescu, Marie-Laure Mugnier

# 8 New results

## 8.1 Ontology-Mediated Query Answering

Participants: Jean-François Baget, Meghyn Bienvenu, Michel Leclère, Marie-Laure Mugnier, Elie Najm, Guillaume Pérution-Kihli, Olivier Rodriguez, Federico Ulliana, Pierre Bourhis, Maxime Buron, François Goasdoué, Ioana Manolescu, Sophie Tison.

Ontology-mediated query answering (OMQA) is the issue of querying data while taking into account inferences enabled by ontological knowledge. From an abstract viewpoint, this gives rise to knowledge bases, composed of an ontology and a factbase (in database terms: a database instance under the incomplete data assumption). Answers to queries are logically entailed from the knowledge base.

This year, we worked in two directions:

• deepening foundations of OMQA with existential rules, the main KR language developed by the team;
• moving from OMQA to a more general framework with explicit management of data sources and mappings from data to knowledge.

### 8.1.1 Fundamental Issues on OMQA with Existential Rules

Existential rules (a.k.a. datalog+, as this framework generalizes the deductive database language datalog) have emerged as a new expressive ontological language, well-suited to OMQA. The basic techniques for query answering under existential rules rely on the two classical ways of processing rules, namely forward chaining and backward chaining. In forward chaining, known as the chase in databases, the rules are applied to enrich the factbase, and query answering can then be solved by evaluating the query against the saturated factbase (as in a classical database system, i.e., ignoring the ontological knowledge). The backward chaining process is divided into two steps: first, the query is rewritten using the rules into a first-order query (typically a union of conjunctive queries, but it can be a more compact form); then the rewritten query is evaluated against the factbase (again, as in a classical database system). Depending on the considered class of existential rules, the chase and/or query rewriting may or may not terminate.
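
The forward chaining process described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Graal's implementation: atoms are tuples, variables are strings starting with `?`, each trigger is applied exactly once (in the style of the oblivious chase), fresh nulls are named `_:n0`, `_:n1`, ..., and a hypothetical `max_rounds` parameter guards against non-termination.

```python
from itertools import count


def is_var(term):
    """Assumed convention: variables are strings starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


def matches(atoms, facts, sub=None):
    """Yield every substitution mapping all `atoms` into `facts`."""
    sub = sub or {}
    if not atoms:
        yield sub
        return
    first, rest = atoms[0], atoms[1:]
    for fact in facts:
        if fact[0] != first[0] or len(fact) != len(first):
            continue
        new = dict(sub)
        if all(new.setdefault(s, t) == t if is_var(s) else s == t
               for s, t in zip(first[1:], fact[1:])):
            yield from matches(rest, facts, new)


def chase(facts, rules, max_rounds=10):
    """Saturate `facts` with `rules`, given as (body, head) pairs, breadth-first.

    Head variables that do not occur in the body are existential: each
    trigger (rule, body match) invents fresh nulls for them.
    """
    facts, fresh, applied = set(facts), count(), set()
    for _ in range(max_rounds):
        new_facts = set()
        for index, (body, head) in enumerate(rules):
            # Sorting the snapshot keeps the naming of nulls deterministic.
            for sub in matches(body, sorted(facts)):
                trigger = (index, tuple(sorted(sub.items())))
                if trigger in applied:
                    continue
                applied.add(trigger)
                full = dict(sub)
                for atom in head:
                    for term in atom[1:]:
                        if is_var(term) and term not in full:
                            full[term] = f"_:n{next(fresh)}"
                    new_facts.add((atom[0],) + tuple(full.get(t, t) for t in atom[1:]))
        if new_facts <= facts:
            break  # nothing new was produced: the factbase is saturated
        facts |= new_facts
    return facts


# Hypothetical rules: every researcher is a member of some team,
# and every member of something is affiliated.
rules = [
    ([("researcher", "?x")], [("memberOf", "?x", "?y"), ("team", "?y")]),
    ([("memberOf", "?x", "?y")], [("affiliated", "?x")]),
]
saturated = chase({("researcher", "alice")}, rules)

# Query answering reduces to evaluating the query on the saturated factbase.
query = [("memberOf", "alice", "?t"), ("team", "?t")]
print(any(True for _ in matches(query, saturated)))
```

On this rule set the chase stops after a few rounds. A rule such as "every person has a parent who is a person" would generate new nulls forever, which is why the sketch bounds the number of rounds.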

In 2018 and 2019, we carried out the first studies on the boundedness problem for existential rules. This problem asks whether a given set of existential rules is bounded, i.e., whether there is a predefined bound on the “depth” of the chase independently of any factbase. It has been deeply studied in the context of datalog, where it is key to query optimization, but has barely been considered for existential rules so far.

The boundedness problem is already undecidable in the specific case of datalog rules. However, even for decidable subclasses, knowing that a set of rules is bounded does not help much in practice if the bound is unknown. Hence, as part of Stathis Delivorias's PhD thesis (defended in October 2019), we investigated the decidability and complexity of the k-boundedness problem, which asks whether a given set of rules is bounded by an integer $k$; we proved that k-boundedness is decidable for some main chase variants [32]. We extended and deepened these results, which gave rise to a long paper version published in Theory and Practice of Logic Programming [11].

For datalog rules, boundedness is equivalent to a desirable property, namely first-order rewritability: a set of rules is called first-order rewritable if any conjunctive query can be rewritten into a first-order query, whose evaluation on any factbase yields the expected answers (i.e., the relevant part of the ontology can be compiled into the rewritten query, which allows one to reduce query answering to a simple query evaluation task). This equivalence does not hold for existential rules. Besides its potential practical use, the notion of boundedness is closely related to an interesting theoretical question on existential rules: what are the relationships between chase termination and first-order query rewritability? With respect to this question, we obtained the following salient result for two main chase variants (oblivious and skolem): a set of existential rules is bounded if and only if it ensures both chase termination for any factbase and first-order rewritability for any conjunctive query. This gave rise to a paper at IJCAI 2019. This year, we wrote an extended version with all proof details [29]. In collaboration with Pierre Bourhis and Sophie Tison.
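
As a small worked illustration of first-order rewritability (the predicate names are hypothetical), consider the single rule $R$ and the conjunctive query $q$ below. Rewriting $q$ with $R$ yields a union of conjunctive queries $q'$ whose evaluation on any factbase $F$ gives the answers to $q$ over the knowledge base composed of $F$ and $R$:

$$
\begin{aligned}
R &: \forall x\, (\mathit{professor}(x) \rightarrow \mathit{teacher}(x)) \\
q(x) &= \mathit{teacher}(x) \\
q'(x) &= \mathit{teacher}(x) \lor \mathit{professor}(x)
\end{aligned}
$$

By contrast, the rules $\mathit{edge}(x,y) \rightarrow \mathit{path}(x,y)$ and $\mathit{edge}(x,y) \wedge \mathit{path}(y,z) \rightarrow \mathit{path}(x,z)$ form the classical example of a datalog rule set that is neither bounded nor first-order rewritable: the number of rule applications needed to derive a $\mathit{path}$ atom grows with the length of the paths present in the factbase.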

Still on OMQA, we wrote an invited paper on the relationships between two prominent families of KR languages, namely existential rules and description logics, from the angle of data access. Generally speaking, existential rules and description logics are incomparable in terms of expressivity. However, existential rules generalize so-called Horn description logics, which are precisely those description logic dialects used in OMQA. In this paper, we compare salient Horn description logics and a decidable family of existential rules from semantic and complexity viewpoints [12] (KI - Künstliche Intelligenz).

Finally, the collective book “A Guided Tour of Artificial Intelligence Research”, to which we contributed a chapter on “Reasoning with Ontologies”, has appeared [27]. In collaboration with Meghyn Bienvenu and Marie-Christine Rousset.

### 8.1.2 Ontology-Based Data Access with RDFS

As part of Maxime Buron's PhD thesis (defended in October 2020 [31]), co-supervised with the Inria CEDAR team (Ioana Manolescu and François Goasdoué) within the iCODA IPL (Inria Project Lab), we considered the so-called Ontology-Based Data Access (OBDA) framework, which is composed of three components: the data level made of several independent data sources, the ontological level made of a knowledge base, and mappings that relate queries on the data sources to facts described in the vocabulary of the ontology. Roughly, the OMQA problem mentioned previously (Section 8.1.1) can be seen as a special case of query answering in the OBDA setting, where all mappings have been triggered to produce a set of facts, which allows one to do query answering on the knowledge base and ignore the data sources that gave rise to the facts. Our work is in the context of the Semantic Web, where knowledge is described in the RDFS language.

Specifically, our framework features heterogeneous data sources integrated through mappings into a (possibly virtual) RDFS factbase, provided with an RDFS ontology and RDFS entailment rules. The innovative aspects with respect to the state of the art are (i) SPARQL queries that extend classical conjunctive queries with the ability to query data and ontological triples together, namely Basic Graph Pattern Queries, and (ii) Global-Local-As-View (GLAV) mappings, which can be seen as source-to-target existential rules. GLAV mappings make it possible to create unknown entities (blank nodes), which increases the amount of information accessible through the integration system, e.g., to state the existence of some data whose values are not known in the sources.
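
To make the role of blank nodes concrete, here is a minimal Python sketch of how a GLAV-style mapping could produce triples from source rows. It is an illustration only and does not reflect Obi-Wan's actual interfaces: the function `apply_glav_mapping`, the row format, and the example vocabulary (`:worksFor`, `:Company`) are all assumptions.

```python
from itertools import count

_blank_ids = count()  # shared counter so that blank node labels never collide


def apply_glav_mapping(rows, head):
    """Turn source rows into RDF-like triples according to a mapping head.

    `rows` is the result of the mapping body (a query on the source), given
    as dicts from variable names to values. `head` is a list of triple
    patterns whose variables start with '?'. A head variable that the body
    does not bind is existential: it gets one fresh blank node per row.
    """
    triples = []
    for row in rows:
        binding = dict(row)
        for pattern in head:
            for term in pattern:
                if term.startswith("?") and term not in binding:
                    binding[term] = f"_:b{next(_blank_ids)}"
            triples.append(tuple(binding.get(term, term) for term in pattern))
    return triples


# Hypothetical source: a table listing employees, without their employer.
rows = [{"?name": "alice"}, {"?name": "bob"}]
# Hypothetical mapping head: each employee works for some (unknown) company.
head = [("?name", ":worksFor", "?c"), ("?c", "rdf:type", ":Company")]

triples = apply_glav_mapping(rows, head)
for triple in triples:
    print(triple)
```

Each row yields its own blank node, so the factbase states that Alice and Bob each work for some company without claiming that it is the same one, and without inventing a value absent from the source.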

We devised and experimentally compared several query answering techniques in this setting. These techniques can be seen as different ways of distributing the reasoning effort between preprocessing time and query time [16] (EDBT 2020). Moreover, the performance of query answering in an RDF database strongly depends on the data layout, that is, the way data is split in persistent data structures. We proposed a new layout (TCP), which combines two well-known layouts (T and CP). In exchange for occupying more storage space, e.g. on an inexpensive disk, TCP avoids the bad or even catastrophic performance that T and/or CP sometimes exhibit for queries. We also introduced summary-based pruning, a novel technique based on existing RDF quotient summaries, which improves query answering performance on the T, CP and the more robust TCP layouts [14] (SSWS 2020).

The whole framework and associated algorithms have been implemented in a prototype called Obi-Wan, developed on top of CEDAR and GraphIK software (Tatooine, OntoSQL and Graal), which was demonstrated at VLDB 2020 [15].

## 8.2 Reasoning with Conflicts and Decision Support

Participants: Pierre Bisquert, Patrice Buche, Madalina Croitoru, Jérôme Fortin, Martin Jedwabny, Rallou Thomopoulos.

In real-world applications, data is likely to generate inconsistencies in the presence of ontological knowledge, especially when it comes from several independent sources. In particular, data coming from different stakeholders, such as preferences and opinions, is generally conflicting. In order to use this data, for instance in a decision support setting, it is thus necessary to be able to reason in the presence of inconsistencies. In such a context, classical reasoning fails because any statement can be derived from a contradiction. Argumentation is one approach to this problem, where inference steps are represented as possibly conflicting arguments. A set of arguments is naturally associated with a graph in which arguments are nodes and conflicts are edges.

The argumentation framework has two main benefits. First, it allows one to define a variety of semantics for reasoning in the presence of inconsistencies, some of which have been shown to be semantically equivalent to repair-based approaches. Second, it naturally benefits from the explanatory potential of graphs, which is particularly useful in helping users better understand the results of the reasoning.
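
As one concrete instance of such a semantics, the sketch below computes the grounded extension of an abstract argumentation graph in the sense of Dung. It is chosen here only because it is simple and standard; it is not claimed to be the semantics used in the work reported in this section, and the function name and data representation are assumptions.

```python
def grounded_extension(arguments, attacks):
    """Compute the grounded extension of an abstract argumentation graph.

    `attacks` is a set of pairs (attacker, target). Starting from the empty
    set, repeatedly accept every argument all of whose attackers are
    attacked by an already accepted argument, until nothing changes.
    """
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    accepted = set()
    while True:
        defeated = {y for (x, y) in attacks if x in accepted}
        defended = {a for a in arguments if attackers[a] <= defeated}
        if defended == accepted:
            return accepted
        accepted = defended


# Hypothetical graph: a attacks b, and b attacks c.
print(sorted(grounded_extension({"a", "b", "c"}, {("a", "b"), ("b", "c")})))
```

In this example `a` is unattacked and therefore accepted, which defeats `b` and in turn reinstates `c`. With two arguments attacking each other, nothing is accepted, which reflects the cautious nature of this particular semantics.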

This year, we investigated the following questions:

• How to be expressive enough while controlling the computational cost of reasoning?
• How to represent stakeholders' conflicting opinions and preferences to practically support a decision?

### 8.2.1 Argumentation

Argumentation is an appealing reasoning tool in the presence of inconsistencies; however, a main concern lies in its real-world applicability. We basically face two challenges:

1. taming the combinatorial explosion of arguments associated with a knowledge base,
2. meeting the expressivity needs in applications.

#### Combinatorial Aspects

Regarding the combinatorial aspects, it is known that the number of arguments generated by argumentation-based methods can be prohibitively large, since they require the construction of one argument per inference step (i.e., per rule application). We started to investigate alternative methods still based on argumentation, but avoiding the combinatorial explosion at the graph construction phase. To that end, we focused on two main techniques: the use of argumentation hypergraphs and the deployment of backward chaining.

Argumentation hypergraphs extend argumentation graphs by considering hyperedges (as opposed to classically considered binary edges). They can encode in a much more compact form the inconsistencies arising from n-ary conflicts. This is especially important for the work of GraphIK, as our main foundational rule language, existential rules, can easily encode n-ary conflicts (called n-ary constraints), as opposed to other ontological languages such as Description Logics (e.g., DL-Lite), which can only directly capture binary constraints. In [25] (COMMA 2020), we provided an argumentation framework that considers sets of attacking arguments (n-ary attacks) and possesses arguments that are recursively built upon other arguments and n-ary attacks. We proved that this new framework retains desirable properties with fewer arguments and attacks compared to the existing frameworks. Based on this foundational work, we developed in [24] (AAAI 2020) the first ranking-based semantics applicable to n-ary attack relations. We generalised existing postulates for ranking-based semantics to fit this framework, proved that it converges for every argumentation framework and studied the postulates it satisfies.

In [23] (COMMA 2020), we addressed the problem of efficient generation of structured argumentation systems and considered a simplified variant of an ASPIC argumentation system. We provided a backward chaining mechanism for the generation of argumentation graphs and we empirically compared the efficiency of this new approach with existing approaches (which are based on forward chaining).

Finally, we studied the practical issue of computing repairs for inconsistent existential rule KBs, which is needed for tasks that require an enumeration of the repairs (such as inconsistency-based repair ranking frameworks or argumentation-based decision-making). Computing all repairs is very costly in practice. In 22 (ICCS 2020), we proposed and evaluated an incremental approach providing an efficient computation of all repairs when the conflicts have a cardinality of at most three.
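To make the notion of repair concrete, here is a deliberately naive enumeration, assuming that the conflicts (minimal inconsistent sets of facts) are already known. It is a brute-force baseline written for illustration, not the incremental algorithm of 22, and the function name `all_repairs` is an assumption. A repair is taken to be an inclusion-maximal subset of the facts that contains no conflict entirely.

```python
from itertools import combinations


def all_repairs(facts, conflicts):
    """Enumerate inclusion-maximal subsets of `facts` containing no conflict.

    Exponential in the number of facts: this is a naive baseline that only
    serves to define what is being computed.
    """
    facts = frozenset(facts)
    conflicts = [frozenset(c) for c in conflicts]
    repairs = []
    # Visit candidates from largest to smallest so maximal ones come first.
    for size in range(len(facts), -1, -1):
        for candidate in map(frozenset, combinations(sorted(facts), size)):
            consistent = not any(c <= candidate for c in conflicts)
            # Keep a consistent candidate only if no known repair contains it.
            if consistent and not any(candidate < r for r in repairs):
                repairs.append(candidate)
    return repairs


# Hypothetical KB with one binary and one ternary conflict.
for repair in all_repairs("abcd", [{"a", "b"}, {"b", "c", "d"}]):
    print(sorted(repair))
```

The exponential cost of this enumeration is what motivates dedicated algorithms for the task.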

#### Expressivity

Regarding the expressivity needs of argumentation for real-world scenarios, bipolar argumentation graphs extend argumentation graphs with an additional binary relation, support (expressing the fact that an argument supports another). Hence, a bipolar argumentation graph has bi-colored edges: attack and support. The notion of support is widely debated in the literature, and in our work we considered two main semantics: support in defeasible logics and logical necessities.

Defeasible Logics are a family of approaches to handle conflicts in situations where two types of rules are considered: strict rules expressing undeniable implications (if A is true then B is definitely true), and defeasible rules expressing weaker implications (if A is true then B is generally true). The use of Defeasible Logics allows for more expressivity, since contradictions may stem either from relying on incorrect facts or from exceptions to the defeasible implications. Our work in defeasible logics was initiated during the PhD thesis of Abdelraouf Hecham, who graduated in 2018. The underlying bipolar structure of defeasible logics and their expressivity have since been further investigated during his postdoc and beyond, as part of the PhD work of Martin Jedwabny: 19, 20, 18 (AAAI 2020, ECAI 2020, ICCS 2020).

Argumentation frameworks with necessities replace the classical deductive support relation between arguments (if argument A supports argument B, accepting A implies accepting B) with logical necessity support (if A supports B, accepting B implies accepting A), which makes it possible to express requirements between arguments. The role of necessities as a support relation in ranking semantics has been investigated in 17 (IJCAI 2020). To this end, we (1) introduced a set of postulates specifically designed for necessities and (2) proposed the first ranking-based semantics in the literature to be shown to respect these postulates and converge for certain argumentation graphs.
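The difference between the two readings of support can be made concrete with a small check on sets of accepted arguments. This is only a sketch of the two implications stated above, not the ranking-based semantics of 17; the function names and the toy support relation are assumptions.

```python
def respects_deductive_support(accepted, supports):
    """Deductive reading: if a supports b, accepting a implies accepting b."""
    return all(b in accepted for a, b in supports if a in accepted)


def respects_necessities(accepted, supports):
    """Necessity reading: if a supports b, accepting b implies accepting a."""
    return all(a in accepted for a, b in supports if b in accepted)


# Hypothetical support relation: A supports B.
supports = {("A", "B")}

print(respects_deductive_support({"A"}, supports))  # A alone violates deductive support
print(respects_necessities({"A"}, supports))        # A alone is fine under necessity
print(respects_necessities({"B"}, supports))        # B requires A under necessity
```

The same pair (A, B) therefore constrains acceptance in opposite directions depending on the reading chosen.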

### 8.2.2 Decision Support

This part of our work is concerned with the practical application of knowledge representation and argumentation to decision support.

In particular, we applied our work in the context of Life Cycle Analysis (LCA). LCA is a family of multi-criteria analyses of the environmental impact of a product at its different stages (such as manufacturing or transport), where the criteria relate to different dimensions of the environment (global warming, water ecotoxicity, etc.). LCA is, however, susceptible to collective disagreement or practitioner bias on the weighting of the different criteria. In 13 (Sustainability journal), we proposed a methodology using our software DAMN to represent arguments justifying preferences on impact criteria. Those preferences are then aggregated to produce a weighting profile that is used in the LCA. We applied this approach in the context of the European project NoAW, in which polyphenol extraction technologies were compared.

Complementary work on decision support was carried out by our associate collaborators at INRAE, see 21, 28, 26.

# 9 Partnerships and cooperations

## 9.1 International research visitors

Due to the health crisis, all visits were cancelled. In particular, Marie-Laure Mugnier had been invited to spend 3 months at Stanford University as part of a 6-month sabbatical.

## 9.2 European initiatives

### 9.2.1 FP7 & H2020 Projects

#### NoAW (H2020, Oct. 2016-Sept. 2020)

Participants: Patrice Buche, Pierre Bisquert, Madalina Croitoru, Rallou Thomopoulos.

NoAW (No Agricultural Waste) is led by INRAE (IATE laboratory). Driven by a “near zero-waste” society requirement, the goal of the NoAW project is to generate innovative and efficient approaches to convert growing agricultural waste issues into eco-efficient bio-based product opportunities with direct benefits for the environment, the economy and EU consumers. To achieve this goal, the NoAW concept relies on developing holistic life cycle thinking able to support environmentally responsible R&D innovations on agro-waste conversion at different TRLs, in the light of regional and seasonal specificities, without neglecting the risks emerging from circular management of agro-wastes (e.g. contaminant accumulation). GraphIK contributes on two aspects. On the one hand, we participate in the annotation effort of knowledge bases (using the @Web tool). On the other hand, we further investigate the interplay of argumentation with logically instantiated frameworks and its relation with social choice in the context of decision making.

#### GLOPACK (H2020, June 2018-July 2022)

Participants: Patrice Buche, Pierre Bisquert, Madalina Croitoru.

GLOPACK is led by the University of Montpellier (IATE laboratory). It proposes a cutting-edge strategy addressing the technical and societal barriers to the spread, in our social system, of innovative eco-efficient packaging able to reduce the environmental footprint of food. Focusing on accelerating the transition to a circular economy, GLOPACK aims to support users’ and consumers’ access to innovative packaging solutions enabling the reduction and circular management of agro-food waste, including packaging waste. Validation of the solutions, including compliance with legal requirements, economic feasibility and environmental impact, will push the tested technologies and the related decision-making tool to TRL 7 for rapid and easy market uptake, thereby strengthening the competitiveness of European companies in an ever more globalised and connected world.

### 9.2.2 Collaborations in European programs, except FP7 and H2020

#### FoodMC (European COST action, 2016-2020)

Participants: Patrice Buche, Madalina Croitoru, Rallou Thomopoulos.

COST actions aim to develop European cooperation in science and technology. FoodMC (CA 15118) is a COST action on Mathematical and Computer Science Methods for Food Science and Industry. Rallou Thomopoulos is co-leader of this action for France and a member of the action Management Committee; other members of GraphIK (Patrice Buche, Madalina Croitoru) are participants. The action is organised in four working groups, dealing respectively with the modelling of food products and food processes, modelling for eco-design of food processes, software tools for the food industry, and dissemination and knowledge transfer.

## 9.3 National initiatives

#### CQFD (ANR PRC, Jan. 2019-Dec. 2024)

Participants: Jean-François Baget, Michel Leclère, Marie-Laure Mugnier, Guillaume Pérution-Kihli, Olivier Rodriguez, Florent Tornil, Federico Ulliana.

CQFD (Complex ontological Queries over Federated heterogeneous Data), coordinated by Federico Ulliana (GraphIK), involves participants from Inria Saclay (CEDAR team), Inria Paris (VALDA team), Inria Nord Europe (SPIRALS team), IRISA, LIG, LTCI, and LaBRI. The aim of this project is to tackle two crucial challenges in OMQA (Ontology-Mediated Query Answering): heterogeneity, that is, the ability to deal with multiple types of data sources and database management systems, and federation, that is, the ability to cross-query a collection of heterogeneous data sources. By bringing together 8 different partners in France, this project aims at consolidating a national community of researchers around the OMQA issue.

#### ICODA (Inria Project Lab, 2017-2021)

Participants: Jean-François Baget, Michel Chein, Alain Gutierrez, Marie-Laure Mugnier.

The iCODA project (Knowledge-mediated Content and Data Interactive Analytics: The case of data journalism), coordinated by Guillaume Gravier and Laurent Amsaleg (LINKMEDIA), brings together four Inria teams (LINKMEDIA, CEDAR, ILDA and GraphIK) as well as three press partners: Ouest France, Le Monde (les décodeurs) and AFP.

Taking data journalism as an emblematic use-case, the goal of the project is to develop the scientific and technological foundations for knowledge-mediated user-in-the-loop big data analytics jointly exploiting data and content, and to demonstrate the effectiveness of the approach in realistic, high-visibility use-cases.

#### Docamex (CASDAR project, 2017-2020)

Participants: Patrice Buche, Madalina Croitoru, Jérôme Fortin.

DOCaMEx (Développement de prOgiciels de Capitalisation et de Mobilisation du savoir-faire et de l'Expérience fromagers en filière valorisant leur terroir), led by CFTC (centre technique des fromages de Franche-Comté), involves 7 research units (including IATE and LIRMM), 8 technical centers and 3 dairy product schools. It represents five cheese-making chains (Comté, Reblochon, Emmental de Savoie, Salers, Cantal).

Traditional cheese making requires a lot of knowledge, expertise, and experience, which are usually acquired over a long time. This know-how is today mainly transmitted by apprenticeship, and the evolution of practices in the sector creates a concrete risk of knowledge loss. The main goal of the project is to develop a new approach for expert knowledge elicitation and capitalization, together with dedicated software for decision making. The novelty of the decision-making tool lies in the representation power and reasoning efficiency of the logic used to describe the domain knowledge.

## 9.4 Regional initiatives

#### Convergence Institute #DigitAg (2017-2023)

Participants: Jean-François Baget, Patrice Buche, Madalina Croitoru, Marie-Laure Mugnier, Elie Najm, Rallou Thomopoulos, Federico Ulliana.

Located in Montpellier, #DigitAg (for Digital Agriculture) gathers 17 founding members: research institutes (including Inria), the University of Montpellier and higher-education institutes in agronomy, transfer structures and companies. Its objective is to support the development of digital agriculture. GraphIK is involved in this project on the issues of designing data and knowledge management systems adapted to agricultural information systems, and of developing methods for integrating different types of information and knowledge (generated from data, experts, models). A PhD thesis started in 2019 (Elie Najm) is investigating knowledge representation and reasoning for agro-ecological systems, in collaboration with the research laboratory UMR SYSTEM (Tropical and Mediterranean cropping system functioning and management).

## 9.5 Informal Partners

We continue to work informally with the following partners:

• Pierre Bourhis (SPIRALS Inria team) and Sophie Tison (LINKS Inria team) on Ontology-Mediated Query Answering 29.
• Michael Thomazo (VALDA Inria team) on Ontology-Mediated Query Answering.
• Maxime Buron (CEDAR Inria team), François Goasdoué (IRISA/CEDAR) and Ioana Manolescu (CEDAR) on Ontology-Based Data Access 15, 16, 14, 31.
• Srdjan Vesic (CRIL) and Bruno Yun (University of Aberdeen) on Argumentation Systems 24, 22, 17, 25, 23.

# 10 Dissemination

## 10.1 Promoting scientific activities

### 10.1.1 Scientific events: organisation

#### Member of the conference program committees

We regularly participate in the program committees of the top conferences in AI (IJCAI and ECAI for 2020) as senior PC members or PC members. We also regularly participate in the program committees of more focused international conferences and workshops, as well as national events.

### 10.1.3 Leadership within the scientific community

• Madalina Croitoru was a member of the steering committee for ICCS 2020 (25th International Conference on Conceptual Structures), https://iccs-conference.org/
• Rallou Thomopoulos has been co-leader of the trans-unit program InCom (Knowledge and Model Integration) of the TRANSFORM Division of INRAE since 2016.

• From September 2019 onwards, Madalina Croitoru has been deputy member of the CNU section 27 (Computer Science).
• Rallou Thomopoulos is an elected member of the Scientific Committee of the INRAE-CEPIA research division (2016-2020).

## 10.2 Teaching - Supervision - Juries

### 10.2.1 Teaching

The five faculty members teach an average of 200 hours per year at the Computer Science department of the Science Faculty. They are in charge of courses in Logic (Licence), Databases (Master), Artificial Intelligence (M), Knowledge Representation and Reasoning (M), Theory of Data and Knowledge Bases (M), Social and Semantic Web (M) and Multi-Agent Systems (M). Concerning full-time researchers in 2020, Jean-François Baget taught about 40 hours at Master level.

Moreover, some faculty members have specific teaching responsibilities:

• Madalina Croitoru has been in charge of international relations for the Computer Science department of the Science Faculty as well as of the management of industrial master internships (about 100 students each year) of the Master of Computer Science, since 2019.
• Federico Ulliana has been the head of the curriculum “Data, Knowledge and Natural Language Processing” (DECOL, about 30 students), part of the Master of Computer Science, since 2017.

### 10.2.2 Involvement in University Structures

• Marie-Laure Mugnier has been a member of the Council of the Scientific Department MIPS (Mathematics Informatics Physics and Systems) of the University of Montpellier, since 2016.

### 10.2.3 Supervision

• PhD Defended: Maxime Buron (CEDAR Inria team), “Efficient reasoning on large heterogeneous graphs”. Supervisors: François Goasdoué (IRISA/CEDAR), Ioana Manolescu (CEDAR) and Marie-Laure Mugnier. Institut Polytechnique de Paris, October 2020 31.
• PhD in progress: Olivier Rodriguez, “Querying key-value store under semantic constraints”. Supervisors: Federico Ulliana and Marie-Laure Mugnier. Started February 2019.
• PhD in progress: Elie Najm, “Knowledge Representation and Reasoning for innovating agroecological systems”. Supervisors: Marie-Laure Mugnier, Christian Gary (INRAE, UMR ABSys), Jean-François Baget and Raphaël Metral (Supagro, UMR ABSys). Started October 2019.
• PhD in progress: Martin Jedwabny, “Argumentation and ethical decision making”. Supervisors: Madalina Croitoru and Pierre Bisquert. Started October 2019.
• PhD in progress: Guillaume Pérution-Kihli, “Des données aux connaissances : un cadre unifié pour l’intégration sémantique de données hétérogènes et l’amélioration de leur qualité”. Supervisors: Michel Leclère and Marie-Laure Mugnier. Started September 2020.

### 10.2.4 Juries

• Jury reviewer for the PhD defense of Mélanie Munch (November 2020, U. Paris Saclay) - Madalina Croitoru
• Jury member for the PhD defense of Jose Luis Lozano (February 2020, U. Lille) - Federico Ulliana
• Jury reviewer for the PhD defense of Adrian Robert (November 2020, U. Angers) - Patrice Buche

Madalina Croitoru was vice-president of a recruitment jury for an assistant professor (MCF) position at the University of Montpellier.

# 11 Scientific production

## 11.1 Major publications

• 1 inproceedings Jean-François Baget, Meghyn Bienvenu, Marie-Laure Mugnier and Michaël Thomazo. 'Answering Conjunctive Regular Path Queries over Guarded Existential Rules'. IJCAI: International Joint Conference on Artificial Intelligence, Melbourne, Australia, August 2017
• 2 article Jean-François Baget, Michel Leclère, Marie-Laure Mugnier and Eric Salvat. 'On Rules with Existential Variables: Walking the Decidability Line'. Artificial Intelligence 175(9-10), March 2011, 1620-1654
• 3 inproceedings Meghyn Bienvenu, Pierre Bourhis, Marie-Laure Mugnier, Sophie Tison and Federico Ulliana. 'Ontology-Mediated Query Answering for Key-Value Stores'. IJCAI: International Joint Conference on Artificial Intelligence, Melbourne, Australia, August 2017
• 4 article Meghyn Bienvenu, Stanislav Kikot, Roman Kontchakov, Vladimir V. Podolskii and Michael Zakharyaschev. 'Ontology-Mediated Queries: Combined Complexity and Succinctness of Rewritings via Circuit Complexity'. Journal of the ACM (JACM) 65(5), September 2018, 1-51
• 5 inproceedings Pierre Bourhis, Michel Leclère, Marie-Laure Mugnier, Sophie Tison, Federico Ulliana and Lily Gallois. 'Oblivious and Semi-Oblivious Boundedness for Existential Rules'. IJCAI 2019 - International Joint Conference on Artificial Intelligence, Macao, China, August 2019
• 6 inproceedings Maxime Buron, François Goasdoué, Ioana Manolescu and Marie-Laure Mugnier. 'Ontology-Based RDF Integration of Heterogeneous Data'. EDBT/ICDT 2020 - 23rd International Conference on Extending Database Technology, Copenhagen, Denmark, March 2020
• 7 inproceedings Abdelraouf Hecham, Pierre Bisquert and Madalina Croitoru. 'On a Flexible Representation for Defeasible Reasoning Variants'. AAMAS: Autonomous Agents and MultiAgent Systems, Stockholm, Sweden, July 2018, 1123-1131
• 8 article Mélanie König, Michel Leclère, Marie-Laure Mugnier and Michaël Thomazo. 'Sound, Complete and Minimal UCQ-Rewriting for Existential Rules'. Semantic Web journal 6(5), 2015, 451-475
• 9 article Bruno Yun, Pierre Bisquert, Patrice Buche, Madalina Croitoru, Valérie Guillard and Rallou Thomopoulos. 'Choice of environment-friendly food packagings through argumentation systems and preferences'. Ecological Informatics 48, November 2018, 24-36
• 10 inproceedings Bruno Yun, Srdjan Vesic, Madalina Croitoru and Pierre Bisquert. 'Inconsistency Measures for Repair Semantics in OBDA'. IJCAI: International Joint Conference on Artificial Intelligence, Stockholm, Sweden, July 2018, 1977-1983

## 11.2 Publications of the year

### International journals

• 11 article Stathis Delivorias, Michel Leclère, Marie-Laure Mugnier and Federico Ulliana. 'Characterizing Boundedness in Chase Variants'. Theory and Practice of Logic Programming 21(1), August 2020, 51-79
• 12 article Marie-Laure Mugnier. 'Data Access With Horn Ontologies: Where Description Logics Meet Existential Rules'. KI - Künstliche Intelligenz 34(4), 2020, 475-489
• 13 article Joshua Sohn, Pierre Bisquert, Patrice Buche, Abdelraouf Hecham, Pradip P. Kalbar, Ben Goldstein, Morten Birkved and Stig Irving Olsen. 'Argumentation Corrected Context Weighting-Life Cycle Assessment: A Practical Method of Including Stakeholder Perspectives in Multi-Criteria Decision Support for LCA'. Sustainability 12(6), March 2020, 2170

### International peer-reviewed conferences

• 14 inproceedings Maxime Buron, François Goasdoué, Ioana Manolescu, Tayeb Merabti and Marie-Laure Mugnier. 'Revisiting RDF storage layouts for efficient query answering'. SSWS 2020 - 13th International Workshop on Scalable Semantic Web Knowledge Base Systems, Athens, Greece, August 2020
• 15 inproceedings Maxime Buron, François Goasdoué, Ioana Manolescu and Marie-Laure Mugnier. 'Obi-Wan: Ontology-Based RDF Integration of Heterogeneous Data'. VLDB 2020 - 46th International Conference on Very Large Data Bases, Tokyo, Japan, August 2020
• 16 inproceedings Maxime Buron, François Goasdoué, Ioana Manolescu and Marie-Laure Mugnier. 'Ontology-Based RDF Integration of Heterogeneous Data'. EDBT/ICDT 2020 - 23rd International Conference on Extending Database Technology, Copenhagen, Denmark, March 2020
• 17 inproceedings Dragan Doder, Srdjan Vesic and Madalina Croitoru. 'Ranking Semantics for Argumentation Systems With Necessities'. 29th International Joint Conference on Artificial Intelligence (IJCAI), Yokohama, Japan, January 2021, 1912-1918
• 18 inproceedings Abdelraouf Hecham, Pierre Bisquert and Madalina Croitoru. 'A formalism unifying Defeasible Logics and Repair Semantics for existential rules'. ICCS 2020 - 25th International Conference on Conceptual Structures, Lecture Notes in Computer Science 12277, Bolzano / Virtual, Italy, https://iccs-conference.org/?page_id=17, September 2020, 3-17
• 19 inproceedings Abdelraouf Hecham, Madalina Croitoru and Pierre Bisquert. 'DAMN: Defeasible Reasoning Tool for Multi-Agent Reasoning'. AAAI 2020 - 34th AAAI Conference on Artificial Intelligence, New York, United States, https://aaai.org/Conferences/AAAI-20/, 2020
• 20 inproceedings Martin Jedwabny, Madalina Croitoru and Pierre Bisquert. 'Gradual Semantics for Logic-Based Bipolar Graphs Using T-(Co)norms'. ECAI 2020 - 24th European Conference on Artificial Intelligence, Frontiers in Artificial Intelligence and Applications 325, Santiago de Compostela (virtual), Spain, http://ecai2020.eu/, 2020, 777-783
• 21 inproceedings Rallou Thomopoulos, Julien Cufi and Maxime Le Breton. 'A Generic Software to Support Collective Decision in Food Chains and in Multi-Stakeholder Situations'. FoodSim 2020 - 11th Biennial FOODSIM Conference, Proceedings of FoodSim 2020, Ghent, Belgium, September 2020
• 22 inproceedings Bruno Yun and Madalina Croitoru. 'An Incremental Algorithm for Computing All Repairs in Inconsistent Knowledge Bases'. ICCS 2020 - 25th International Conference on Conceptual Structures, Bolzano / Virtual, Italy, https://iccs-conference.org/?page_id=17, 2020
• 23 inproceedings Bruno Yun, Nir Oren and Madalina Croitoru. 'Efficient Construction of Structured Argumentation Systems'. COMMA 2020 - 8th International Conference on Computational Models of Argument, Frontiers in Artificial Intelligence and Applications 326, Perugia, Italy, 2020, 411-418
• 24 inproceedings Bruno Yun, Srdjan Vesic and Madalina Croitoru. 'Ranking-Based Semantics for Sets of Attacking Arguments'. AAAI 20 - 34th AAAI Conference on Artificial Intelligence 34(3), New York, United States, April 2020, 3033-3040
• 25 inproceedings Bruno Yun, Srdjan Vesic and Madalina Croitoru. 'Sets of Attacking Arguments for Inconsistent Datalog Knowledge Bases'. COMMA 2020 - 8th International Conference on Computational Models of Argument, Frontiers in Artificial Intelligence and Applications 326, Perugia / Virtual, Italy, September 2020, 419-430

### Conferences without proceedings

• 26 inproceedings Patrice Buche, Julien Cufi, Stéphane Dervaux, Juliette Dibie, Liliana L. Ibanescu, Alrick Oudot and Magalie Weber. 'A new alignment method based on FoodOn as pivot ontology to integrate nutritional legacy data sources'. ICBO 2020 - IFOW Integrated Food Ontology Workshop, Bolzano / Virtual, Italy, https://foodon.org/icbo-2020-food-workshop/, September 2020, 1-2

### Scientific book chapters

• 27 inbook Meghyn Bienvenu, Michel Leclère, Marie-Laure Mugnier and Marie-Christine Rousset. 'Reasoning with Ontologies'. A Guided Tour of Artificial Intelligence Research, Volume I: Knowledge Representation, Reasoning and Learning, May 2020, 185-215
• 28 inbook Rallou Thomopoulos, Nicolas Salliou, Patrick Taillandier and Alberto Tonda. 'Consumers' Motivations towards Environment-Friendly Dietary Changes: An Assessment of Trends Related to the Consumption of Animal Products'. Handbook of Climate Change Across the Food Supply Chain, 2020

### Reports & preprints

• 29 report Pierre Bourhis, Michel Leclère, Marie-Laure Mugnier, Sophie Tison, Federico Ulliana and Lily Gallois. 'Oblivious and Semi-Oblivious Boundedness for Existential Rules'. LIRMM (UM, CNRS), June 2020

## 11.3 Other

### Patents

• 30 patent R. Thomopoulos, J. Cufi, M. Le Breton and B. Thomas. 'MyChoice software'. 2020

## 11.4 Cited publications

• 31 phdthesis Maxime Buron. 'Efficient reasoning on large and heterogeneous graphs. (Raisonnement efficace sur des grands graphes hétérogènes)'. École Polytechnique, Palaiseau, France, 2020
• 32 inproceedings Stathis Delivorias, Michel Leclère, Marie-Laure Mugnier and Federico Ulliana. 'On the k-Boundedness for Existential Rules'. Rules and Reasoning - Second International Joint Conference, RuleML+RR 2018, Luxembourg, September 18-21, 2018, Proceedings, Lecture Notes in Computer Science 11092, Springer, 2018, 48-64