Deep Learning for Computer Vision: A comparison between Convolutional Neural Networks and Hierarchical Temporal Memories on object recognition tasks - Slides
This document describes a study comparing Convolutional Neural Networks (CNNs) and Hierarchical Temporal Memories (HTMs) on object recognition tasks. The study implements a CNN using Theano, creates a new benchmark of image sequences from the NORB dataset, and evaluates the performance of CNNs and HTMs on the original NORB dataset and new image sequences. The results show that while CNNs achieve higher accuracy on the original NORB data, HTMs are more competitive on the image sequences and can achieve comparable performance using less training data. The study proves that bio-inspired approaches like HTM can advance deep learning research.
1. Alma Mater Studiorum - University of Bologna
School of Science
Department of Computer Science and Engineering DISI
Deep Learning for Computer Vision
Candidate
dott. Vincenzo Lomonaco
Supervisor
prof. Davide Maltoni
Co-examiner
prof. Mauro Gaspari
A comparison between Convolutional Neural
Networks and Hierarchical Temporal Memories on
object recognition tasks
2. 08.09.15 Vincenzo Lomonaco 2
Contents
● Background & Motivations
– Objectives
– Introduction
● CNN and HTM
– Key features
– Implementations
● NORB-sequences
– Original NORB dataset
– New benchmark design
● Experiments and Results
– Experiments design
– Results
● Conclusions
4. Deep Learning
In the last decade, Deep Learning techniques have been shown to
perform remarkably well on a wide variety of problems in both
Computer Vision and Natural Language Processing, achieving
state-of-the-art results in many tasks.
5. Deep Learning advantages
Deep Learning is a branch of machine learning based on a set of
algorithms that attempt to model high-level abstractions in data by
using model architectures composed of multiple non-linear
transformations.
6. Deep Learning disadvantages
Possible limitations:
● Poorly understood surrounding theory
● Non-optimal method
● Very difficult to train
● Huge quantity of data needed
● High Performance Computing environment needed
7. Objectives
Proving that taking inspiration from biological learning
systems can once again help advance the field of DL.
Proving that good levels of accuracy can still be reached
with less data.
8. How
Comparing two very different deep learning algorithms on
object recognition tasks:
– CNN: classical approach, state-of-the-art for object
recognition
– HTM: new biologically inspired approach
We would like to show that, with a smaller quantity of available
data, HTM can outperform CNN on these tasks while remaining
comparable in terms of training time.
10. CNN
CNNs are MLP variants in which individual neurons are tiled in
such a way that they respond to overlapping regions of the
visual field. Their architecture is inspired by Hubel and
Wiesel’s early work on the cat’s visual cortex.
Key features:
● Pure supervised method
● Sparse Connectivity
● Shared Weights
Implementation:
● Python
● Using Theano
● 11 source files, 2550+ lines
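To make sparse connectivity and shared weights concrete, here is a minimal NumPy sketch of a single convolutional feature map. It is not the Theano implementation described on this slide: the function name `conv2d_valid`, the 3x3 kernel size and the random input are illustrative assumptions. Each output unit reads only a small patch of the input (sparse connectivity), and every unit reuses the same kernel (shared weights).

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Valid-mode 2D cross-correlation of one image with one kernel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            # Sparse connectivity: unit (i, j) sees only a kh x kw patch.
            patch = image[i:i + kh, j:j + kw]
            # Shared weights: the same kernel is reused at every position.
            out[i, j] = np.sum(patch * kernel)
    return out


rng = np.random.default_rng(0)
image = rng.random((96, 96))          # same spatial size as a NORB image
kernel = rng.standard_normal((3, 3))  # 9 shared weights (illustrative size)
feature_map = conv2d_valid(image, kernel)

# A fully connected layer producing the same number of outputs would need
# one weight per (input pixel, output unit) pair.
dense_weights = image.size * feature_map.size
print(feature_map.shape, kernel.size, dense_weights)
```

The explicit loops are kept for readability; a real implementation would rely on an optimized convolution routine.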
11. HTM
HTM is an emerging paradigm that is more biologically
inspired. During learning, it tries to incorporate concepts
typical of the human brain, such as time, context and attention.
Key features:
● Mainly unsupervised method
● Top-down and bottom-up information flow
● Bayesian probabilistic formulation
Implementations:
● C#, OpenMP version
● Provided by Biometric System Lab (DISI)
13. NORB-Sequences
Since the computer vision community is starting to investigate
object recognition algorithms on videos, we would like to move
our comparison in that direction.
For this purpose, a new benchmark consisting of a large collection
of image sequences has been created, starting from the well-known
small NORB dataset.
The original NORB dataset:
● Stores 48,600 96x96 images (5 categories, 10 instances, 6 lightings,
9 elevations, and 18 azimuths).
● Is well known and accepted by the research community in the
context of object recognition
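The image count follows directly from the factors of variation listed for the dataset. A quick check in Python (the variable names are illustrative):

```python
# Factors of variation in the small NORB dataset, as listed on the slide.
categories = 5
instances_per_category = 10
lightings = 6
elevations = 9
azimuths = 18

total_images = (categories * instances_per_category
                * lightings * elevations * azimuths)
print(total_images)  # 48600
```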
15. Java sequencer
NORB-sequences is made possible by a Java program that takes
the small NORB dataset as input and, given a number of tuning
parameters, returns a number of training and test image
sequences.
Key features:
● The sequences are created ad hoc to simulate a camera moving
around a specific object, including changes in the surrounding lighting.
● Integrated KNN baseline, GUI, 10 source files, 2600+ lines
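As a rough illustration of the idea (in Python rather than Java, and not the actual sequencer), a sequence can be sketched as a random walk over the lighting and pose indices of one fixed object instance. The function `make_sequence`, its parameters, the one-step move rule, and the choice to clamp elevation and lighting while letting azimuth wrap around are all assumptions made for this sketch.

```python
import random

N_LIGHTINGS, N_ELEVATIONS, N_AZIMUTHS = 6, 9, 18  # small NORB ranges


def make_sequence(category, instance, length, seed=None):
    """Return a list of (category, instance, lighting, elevation, azimuth)
    tuples in which consecutive frames differ by at most one step on a
    single factor, mimicking a camera moving around one object."""
    rng = random.Random(seed)
    lighting = rng.randrange(N_LIGHTINGS)
    elevation = rng.randrange(N_ELEVATIONS)
    azimuth = rng.randrange(N_AZIMUTHS)
    frames = []
    for _ in range(length):
        frames.append((category, instance, lighting, elevation, azimuth))
        factor = rng.choice(["lighting", "elevation", "azimuth"])
        step = rng.choice([-1, 1])
        if factor == "azimuth":
            # Azimuth is circular: the camera can go all the way around.
            azimuth = (azimuth + step) % N_AZIMUTHS
        elif factor == "elevation":
            # Elevation is clamped to the available range.
            elevation = min(max(elevation + step, 0), N_ELEVATIONS - 1)
        else:
            # Lighting is clamped as well (an assumption of this sketch).
            lighting = min(max(lighting + step, 0), N_LIGHTINGS - 1)
    return frames


sequence = make_sequence(category=2, instance=4, length=20, seed=0)
for frame in sequence[:5]:
    print(frame)
```

Each tuple is only an index into the dataset; loading the corresponding 96x96 image is left out of the sketch.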
18. Experiments design
1) Validate the CNN implementation on the NORB dataset
2) Evaluate the performance of both algorithms on the plain
NORB dataset
3) Evaluate the performance of both algorithms on the NORB
sequences
19. CNN validation
In order to validate the new implementation, the goal was to
reproduce Y. LeCun’s original results on the plain NORB
dataset.
20. Plain NORB results
Accuracy results comparison between CNN and HTM on the
plain NORB dataset.
21. Training times
Training times comparison between CNN and HTM on the
NORB sequences.
Training size       CNN times   HTM times
100 + 800jit        10.94 m     21.19 m
250 + 2000jit       31.15 m     23.13 m
500 + 4000jit       38.24 m     22.14 m
1000 + 4000jit      91.26 m     26.04 m
2500 + 4000jit      94.90 m     61.08 m
5000 + 4000jit      124.7 m     89.58 m
10000 + 4000jit     187.7 m     143.5 m
24300 + 4000jit     51.31 m     596.2 m
Architectures:
● CNN: GPU Tesla C2075 Fermi (GPU speedup x3.2)
● HTM: CPU Xeon W3550, 4 cores.
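One way to read the training times is as a per-row ratio of HTM time to CNN time. The short Python sketch below copies the values from the slide unchanged and assumes that `m` denotes minutes. The two columns were measured on different hardware, so the ratio is only indicative.

```python
# (training size label, CNN time, HTM time), copied from the slide.
# Times are assumed to be in minutes.
timings = [
    ("100 + 800jit", 10.94, 21.19),
    ("250 + 2000jit", 31.15, 23.13),
    ("500 + 4000jit", 38.24, 22.14),
    ("1000 + 4000jit", 91.26, 26.04),
    ("2500 + 4000jit", 94.90, 61.08),
    ("5000 + 4000jit", 124.7, 89.58),
    ("10000 + 4000jit", 187.7, 143.5),
    ("24300 + 4000jit", 51.31, 596.2),
]

ratios = [(size, htm / cnn) for size, cnn, htm in timings]
for size, ratio in ratios:
    # ratio < 1 means HTM trained faster than CNN for that row.
    print(f"{size:>16}  HTM/CNN = {ratio:.2f}")
```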
24. Conclusions
In this dissertation three different milestones have been
achieved:
1) A LeNet-7 with Theano has been successfully implemented.
2) A new benchmark for object recognition in image
sequences has been created.
3) HTM and CNN have been compared on different object
recognition tasks.
It has been proven that the HTM bio-inspired approach can
be highly competitive and could be instrumental for
advancing the field of Deep Learning.
25. The End
http://vincenzolomonaco.com
vincenzo.lomonaco@studio.unibo.it
“If we want machines to think, we need to teach them to see”
Fei-Fei Li, Stanford Computer Vision Lab
Thank you for your attention
Vincenzo Lomonaco