Abstract

Deep neural networks have shown remarkable performance in learning representations for visual object categorization. However, deep networks such as CNNs do not explicitly encode objects or the relations among them, which limits their success on tasks that require a deep logical understanding of visual scenes, such as Kandinsky patterns and Bongard problems. To overcome these limitations, we introduce αILP, a novel differentiable inductive logic programming framework that learns to represent scenes as logic programs: intuitively, logical atoms correspond to objects, attributes, and relations, and clauses encode high-level scene information. αILP provides an end-to-end reasoning architecture from visual inputs and performs differentiable inductive logic programming on complex visual scenes, i.e., the logical rules are learned by gradient descent. Our extensive experiments on Kandinsky patterns and the CLEVR-Hans benchmarks demonstrate the accuracy and efficiency of αILP in learning complex visual-logical concepts.
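To make the gradient-based rule-learning idea concrete, the sketch below is a minimal, hypothetical illustration (written with PyTorch), not the authors' αILP implementation: a scene is encoded as a valuation vector over ground atoms, candidate clauses are scored softly with a product t-norm, and softmax-normalized clause weights are trained by gradient descent so that the correct rule explains a positive scene but not a negative one. The atom vocabulary, the two candidate clauses, the t-norm, and the loss are illustrative assumptions made up for this sketch.

import torch

# Ground atoms for a tiny two-object scene (vocabulary is an assumption of this sketch).
atoms = ["red(o1)", "circle(o1)", "red(o2)", "square(o2)"]
idx = {a: i for i, a in enumerate(atoms)}

def clause_score(valuation, body):
    """Soft truth of a clause body under a product t-norm."""
    score = torch.ones(())
    for atom in body:
        score = score * valuation[idx[atom]]
    return score

# Two hypothetical candidate clause bodies for the concept "the scene contains a red circle".
candidate_bodies = [
    ["red(o1)", "circle(o1)"],   # the intended rule
    ["red(o2)", "square(o2)"],   # a distractor rule
]

# Valuation vectors (atom probabilities), e.g. as a perception module might output them:
# a positive scene containing a red circle, a negative scene without one.
pos = torch.tensor([0.9, 0.9, 0.1, 0.8])
neg = torch.tensor([0.1, 0.1, 0.9, 0.9])

# Clause weights, softmax-normalized so that rule selection stays differentiable.
weights = torch.zeros(len(candidate_bodies), requires_grad=True)
opt = torch.optim.Adam([weights], lr=0.1)

def predict(valuation, w):
    """Weighted soft disjunction over the candidate clauses."""
    scores = torch.stack([clause_score(valuation, b) for b in candidate_bodies])
    return (w * scores).sum()

for _ in range(200):
    w = torch.softmax(weights, dim=0)
    loss = (-torch.log(predict(pos, w) + 1e-8)
            - torch.log(1.0 - predict(neg, w) + 1e-8))
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, most of the weight mass sits on the red-circle clause.
print(torch.softmax(weights, dim=0))

Because the clause weights enter the prediction through a softmax and a soft conjunction, the rule choice itself is differentiable, which is the sense in which the abstract says logical rules are "learned by gradient descent".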

Details

Title
αILP: thinking visual scenes as differentiable logic programs
Author
Shindo, Hikaru (1); Pfanschilling, Viktor (1); Dhami, Devendra Singh (2); Kersting, Kristian (3)

(1) TU Darmstadt, Darmstadt, Germany (GRID:grid.6546.1) (ISNI:0000 0001 0940 1669)
(2) TU Darmstadt, Darmstadt, Germany (GRID:grid.6546.1) (ISNI:0000 0001 0940 1669); Hessian Center for AI (hessian.AI), Darmstadt, Germany (GRID:grid.6546.1)
(3) TU Darmstadt, Darmstadt, Germany (GRID:grid.6546.1) (ISNI:0000 0001 0940 1669); Hessian Center for AI (hessian.AI), Darmstadt, Germany (GRID:grid.6546.1); TU Darmstadt, Centre for Cognitive Science, Darmstadt, Germany (GRID:grid.6546.1) (ISNI:0000 0001 0940 1669)
Pages
1465-1497
Publication year
2023
Publication date
May 2023
Publisher
Springer Nature B.V.
ISSN
0885-6125
e-ISSN
1573-0565
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2809964661
Copyright
© The Author(s) 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.