Abstract

Deep neural networks have shown remarkable performance in learning representations for visual object categorization. However, deep networks such as CNNs do not explicitly encode objects or the relations among them, which limits their success on tasks that require a deep logical understanding of visual scenes, such as Kandinsky patterns and Bongard problems. To overcome these limitations, we introduce αILP, a novel differentiable inductive logic programming framework that learns to represent scenes as logic programs: intuitively, logical atoms correspond to objects, attributes, and relations, and clauses encode high-level scene information. αILP provides an end-to-end reasoning architecture from visual inputs and performs differentiable inductive logic programming on complex visual scenes, i.e., the logical rules are learned by gradient descent. Our extensive experiments on Kandinsky patterns and the CLEVR-Hans benchmarks demonstrate the accuracy and efficiency of αILP in learning complex visual-logical concepts.
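To make the gradient-based rule-learning idea concrete, the sketch below is a minimal, hypothetical illustration (written with PyTorch), not the authors' αILP implementation: a scene is encoded as a valuation vector over ground atoms, candidate clauses are scored softly with a product t-norm, and softmax-normalized clause weights are trained by gradient descent so that the correct rule explains a positive scene but not a negative one. The atom vocabulary, the two candidate clauses, the t-norm, and the loss are illustrative assumptions made up for this sketch.

import torch

# Ground atoms for a tiny two-object scene (vocabulary is an assumption of this sketch).
atoms = ["red(o1)", "circle(o1)", "red(o2)", "square(o2)"]
idx = {a: i for i, a in enumerate(atoms)}

def clause_score(valuation, body):
    """Soft truth of a clause body under a product t-norm."""
    score = torch.ones(())
    for atom in body:
        score = score * valuation[idx[atom]]
    return score

# Two hypothetical candidate clause bodies for the concept "the scene contains a red circle".
candidate_bodies = [
    ["red(o1)", "circle(o1)"],   # the intended rule
    ["red(o2)", "square(o2)"],   # a distractor rule
]

# Valuation vectors (atom probabilities), e.g. as a perception module might output them:
# a positive scene containing a red circle, a negative scene without one.
pos = torch.tensor([0.9, 0.9, 0.1, 0.8])
neg = torch.tensor([0.1, 0.1, 0.9, 0.9])

# Clause weights, softmax-normalized so that rule selection stays differentiable.
weights = torch.zeros(len(candidate_bodies), requires_grad=True)
opt = torch.optim.Adam([weights], lr=0.1)

def predict(valuation, w):
    """Weighted soft disjunction over the candidate clauses."""
    scores = torch.stack([clause_score(valuation, b) for b in candidate_bodies])
    return (w * scores).sum()

for _ in range(200):
    w = torch.softmax(weights, dim=0)
    loss = (-torch.log(predict(pos, w) + 1e-8)
            - torch.log(1.0 - predict(neg, w) + 1e-8))
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, most of the weight mass sits on the red-circle clause.
print(torch.softmax(weights, dim=0))

Because the clause weights enter the prediction through a softmax and a soft conjunction, the rule choice itself is differentiable, which is the sense in which the abstract says logical rules are "learned by gradient descent".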

Details

Title
αILP: thinking visual scenes as differentiable logic programs
Author
Shindo, Hikaru (1); Pfanschilling, Viktor (1); Dhami, Devendra Singh (2); Kersting, Kristian (3)

(1) TU Darmstadt, Darmstadt, Germany (GRID:grid.6546.1) (ISNI:0000 0001 0940 1669)
(2) TU Darmstadt, Darmstadt, Germany (GRID:grid.6546.1) (ISNI:0000 0001 0940 1669); Hessian Center for AI (hessian.AI), Darmstadt, Germany (GRID:grid.6546.1)
(3) TU Darmstadt, Darmstadt, Germany (GRID:grid.6546.1) (ISNI:0000 0001 0940 1669); Hessian Center for AI (hessian.AI), Darmstadt, Germany (GRID:grid.6546.1); TU Darmstadt, Centre for Cognitive Science, Darmstadt, Germany (GRID:grid.6546.1) (ISNI:0000 0001 0940 1669)
Pages
1465-1497
Publication year
2023
Publication date
May 2023
Publisher
Springer Nature B.V.
ISSN
0885-6125
e-ISSN
1573-0565
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2809964661
Copyright
© The Author(s) 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.