Deep Convolutional
Neural Networks
Lukáš Vrábel 
lukas.vrabel@firma.seznam.cz
www.seznam.cz
Neural Networks
Crashcourse
Neural Net Crashcourse
●
We want to classify images
●
Input: image
●
Output: which categories the image
belongs to
Seznam.cz applications
●
Car advert images
Front Left Side Steering
Wheel
Seznam.cz applications
●
Adult content detection
Porn Not porn
Seznam.cz applications
●
Detect photo rotation
●
Email topic classification
●
Basis of other, more complex
projects
Other applications
●
Handwritten character recognition
Other applications
●
Self-driving cars
Other applications
●
Self-driving mini cars
●
link
Other applications
●
Automatic drone navigation (link)
Other applications
●
Accessibility for blind people (link)
Part of bigger systems
●
Image segmentation (link)
Part of bigger systems
●
Generate image description (link)
"man in black shirt
is playing guitar."
Part of bigger systems
●
Generate art (link)
Part of bigger systems
●
Playing games (link)
Summary
●
Image → class / category
●
Direct application
●
Part of a bigger system
Neural Net Crashcourse
●
Tutorial Example: handwritten digits
[Figure: a handwritten digit image is fed into the neural network, which has ten outputs, one per digit 0-9. The output for the recognized digit is 1 and the other nine outputs are -1.]
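The digit example above uses one output per class, with 1 for the correct digit and -1 for every other digit. A minimal sketch of that encoding in plain Python follows; the function names `encode_target` and `decode_output` are hypothetical and only illustrate the idea.

```python
def encode_target(digit, num_classes=10):
    """Return the target vector: +1 for the true digit, -1 elsewhere."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit out of range")
    return [1 if i == digit else -1 for i in range(num_classes)]


def decode_output(outputs):
    """Pick the class whose output value is the largest."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


print(encode_target(1))
# [-1, 1, -1, -1, -1, -1, -1, -1, -1, -1]
print(decode_output([-0.9, -0.8, 0.7, -1.0, -0.6, -0.9, -0.7, -0.8, -1.0, -0.5]))
# 2
```

A real network rarely outputs exactly 1 or -1, which is why decoding takes the largest output rather than looking for an exact match.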
Neural Net Crashcourse
●
Neural network interactive example
Final Convolution Filters
●
Visualization of complex networks
●
deepvis from Jason Yosinski (video)
Networks in browser
●
networks in browser
Summary
●
Image classification applications
●
Fully-connected layers
●
Convolutional layers
●
Normalization
●
Backpropagation of errors
Homemade Image
Classifier
Task
●
We want to analyze our own images
●
We want to classify them into our own
classes
●
Example:
– Guess product category from image
– Guess room type for house photos
– Flower recognition, dog breed recognition
– ...
Training
●
To train a neural network from scratch you need:
– a lot of (GPU) processing power
– a lot of annotated images (millions)
– a lot of time (weeks)
Solution: Finetuning
●
Modify pre-trained general network
●
Retrain it on custom images
●
Fraction of images, days instead of weeks
●
Many of our image analysis projects at
Seznam.cz start with finetuning
Caffe Framework
●
http://caffe.berkeleyvision.org/
●
“Easy”, fast deep neural networks
●
Provides pre-trained “general” networks
●
Command line utilities for training
●
C++/Python/Matlab wrappers
Caffe pre-trained models
●
free for commercial usage
●
ImageNet Large Scale Visual Recognition
Challenge (ILSVRC)
●
1000 general categories
ILSVRC Categories
Finetune Dataset
● artistic style tutorial
● 20 categories
● Tutorial is very brief
● Does not explain
what changes to make
or why to make them
● Does not address
many pitfalls
Finetune Tips
●
We will explain the workflow
●
We will address some of the pitfalls
Finetune Workflow
●
Annotated Image dataset
●
At least 50k images
●
Shuffle them!
data/flickr_style/images/12123529133.jpg 16
data/flickr_style/images/11603781264.jpg 12
data/flickr_style/images/12852147194.jpg 9
data/flickr_style/images/8516303191.jpg 10
data/flickr_style/images/12223105575.jpg 2
...
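The listing above uses one `<filename> <class>` pair per line, and the slide stresses shuffling. A minimal sketch of shuffling such a listing and splitting it into train and test parts follows. The helper name `shuffle_and_split`, the fixed seed, and the 20% test fraction are assumptions for illustration (the fraction matches the 64000 / 16000 split used later in the talk).

```python
import random


def shuffle_and_split(lines, test_fraction=0.2, seed=0):
    """Shuffle '<filename> <class>' lines and split them into (train, test)."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(lines)         # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


listing = [
    "data/flickr_style/images/12123529133.jpg 16",
    "data/flickr_style/images/11603781264.jpg 12",
    "data/flickr_style/images/12852147194.jpg 9",
    "data/flickr_style/images/8516303191.jpg 10",
    "data/flickr_style/images/12223105575.jpg 2",
]
train, test = shuffle_and_split(listing)
print(len(train), len(test))  # 4 1
```

Writing `train` and `test` out with one entry per line would give the train.txt and test.txt files. Shuffling matters because a listing sorted by class would feed the network batches containing a single class each.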
Finetune workflow
●
bvlc_reference_caffenet model, 1000 general categories (Car, Beer, Dog, ...)
conv 1 → conv 2 → conv 3 → conv 4 → conv 5 → fc 6 → fc 7 → fc 8 (1000 outputs)
●
Remove the last layer
conv 1 → conv 2 → conv 3 → conv 4 → conv 5 → fc 6 → fc 7
●
Add a new last layer for the 20 style categories (Macro, Noir, Baroque, ...)
conv 1 → conv 2 → conv 3 → conv 4 → conv 5 → fc 6 → fc 7 → fc 8 style (20 outputs)
●
Slow learning for the pre-trained layers, fast learning for the new fc 8 style layer
Summary
●
Prepare dataset
●
Get general network
●
Replace last layer
●
Retrain
Tips on how to avoid
common pitfalls
Tips: CPU vs GPU
●
Nvidia GPU
●
Vit will talk about it more
●
300k images:
Intel CPU @ 1.70GHz ~3 days (!!)
GeForce GTX TITAN Black ~3 min
1000-1500x speedup
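The quoted speedup can be checked with simple arithmetic. The sketch below takes the rounded timings from the slide at face value (about 3 days versus about 3 minutes), so the result is only an order-of-magnitude estimate.

```python
cpu_minutes = 3 * 24 * 60   # ~3 days on the CPU, expressed in minutes
gpu_minutes = 3             # ~3 minutes on the GPU

speedup = cpu_minutes / gpu_minutes
print(speedup)  # 1440.0
```

The result falls inside the 1000-1500x range given on the slide.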
Tips: Training Process
●
Training runs in iterations
●
Each iteration processes one batch (50 images)
●
Weights are updated after each iteration
●
After a specified number of iterations:
– Test set is evaluated
– Snapshot is taken
Tips: Training Process
[Diagram: TRAIN ×4 → TEST → TRAIN ×4 → TEST → SNAPSHOT, then the same pattern repeats. Each TEST phase runs several test iterations.]
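The training process described above (train iterations, a test phase every few iterations, a snapshot every few more) can be sketched as a simple loop. This is a simplified illustration and not Caffe's actual solver code: the function name `training_schedule` is hypothetical, and details such as an initial test before the first iteration are left out.

```python
def training_schedule(max_iter, test_interval, snapshot_interval):
    """List the events of a training run in the order they happen."""
    events = []
    for iteration in range(1, max_iter + 1):
        events.append("TRAIN")                 # one batch, one weight update
        if iteration % test_interval == 0:
            events.append("TEST")              # evaluate the test set
        if iteration % snapshot_interval == 0:
            events.append("SNAPSHOT")          # save the current weights
    return events


print(" ".join(training_schedule(max_iter=8, test_interval=4, snapshot_interval=8)))
# TRAIN TRAIN TRAIN TRAIN TEST TRAIN TRAIN TRAIN TRAIN TEST SNAPSHOT
```

With a test every 4 iterations and a snapshot every 8, the output reproduces the pattern of four training steps, a test, four more training steps, a test, and a snapshot.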
Tips: Batch size
●
A larger batch size gives better learning
●
Faster computation (copy bottleneck)
●
Needs more GPU memory
●
Affects the iteration count config
Tips: Training Process
●
Number of iterations for the test phase
should be set to evaluate the whole test set
– 16000 images / 50 batch size = 320 iterations
●
It can be a good idea to test after the whole
training set has been processed
– 64000 images / 50 batch size = 1280 iterations
●
Snapshot size: 200-300 MB
●
Training can be resumed from a snapshot
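The iteration counts above follow directly from the dataset sizes and the batch size. A minimal sketch of that arithmetic follows. The helper `solver_intervals` and its parameter names are hypothetical; the returned keys mirror the solver.prototxt fields `test_iter`, `test_interval` and `snapshot`.

```python
import math


def solver_intervals(train_images, test_images, batch_size,
                     test_every_cycles=1, snapshot_every_cycles=1):
    """Derive solver settings from dataset sizes.

    One "cycle" is one full pass over the training set.
    """
    iters_per_cycle = math.ceil(train_images / batch_size)
    return {
        "test_iter": math.ceil(test_images / batch_size),   # covers the whole test set
        "test_interval": iters_per_cycle * test_every_cycles,
        "snapshot": iters_per_cycle * snapshot_every_cycles,
    }


print(solver_intervals(64000, 16000, 50))
# {'test_iter': 320, 'test_interval': 1280, 'snapshot': 1280}
print(solver_intervals(64000, 16000, 50, test_every_cycles=2, snapshot_every_cycles=4))
# {'test_iter': 320, 'test_interval': 2560, 'snapshot': 5120}
```

Rounding up with `math.ceil` is a design choice made here so that a final partial batch is still counted. Changing the batch size changes all three values, which is why the batch size affects the iteration count configuration.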
Summary of Tips
●
Shuffle train/test sets
●
Use Nvidia GPU
●
Set up batch size
●
Update testing and snapshotting based
on test/train set sizes
Big Summary
●
Image → category / class
●
FC/Convolution layers
●
Backpropagation of errors
●
Caffe for Finetuning of general models
●
Use GPU, shuffle images, set up the training
process
Lukáš Vrábel (lukas.vrabel@firma.seznam.cz)
Appendix: Changes to
Configuration Files
Quick n'Dirty Finetuning
●
Prepare dataset
– images, train.txt, test.txt
●
Configure network
– train_val.prototxt, deploy.prototxt
●
Configure training
– solver.prototxt
Prepare Dataset
●
Images
●
train.txt, test.txt
●
Format: <filename> <class>
data/flickr_style/images/12123529133.jpg 16
data/flickr_style/images/11603781264.jpg 12
data/flickr_style/images/12852147194.jpg 9
data/flickr_style/images/8516303191.jpg 10
data/flickr_style/images/12223105575.jpg 2
...
Prepare Dataset
●
Update train_val.prototxt data layers
layer {
  name: "data"
  type: "ImageData"
  ...
  include {
    phase: TRAIN  # do the same for the TEST data layer
  }
  ...
  image_data_param {
    source: "data/flickr_style/train.txt"  # path to train.txt
    batch_size: 50  # how many images to process in one iteration
    ...
  }
}
Configure Network
●
train_val.prototxt
conv 1: Conv 1, ReLU 1, Max pool 1, LRN 1
conv 2: Conv 2, ReLU 2, Max pool 2, LRN 2
conv 3: Conv 3, ReLU 3
conv 4: Conv 4, ReLU 4
conv 5: Conv 5, ReLU 5, Max pool 5
fc 6: FC 6, ReLU 6
fc 7: FC 7, ReLU 7
fc 8: FC 8
Configure Network
●
Replace “fc8” layer
layer {
  name: "fc8_flickr"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_flickr"
  ...
  inner_product_param {
    num_output: 20
    ...
  }
}
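The workflow calls for slow learning in the pre-trained layers and fast learning in the new layer. In Caffe this is usually expressed with per-layer `lr_mult` multipliers, and giving the layer a new name keeps Caffe from copying the old fc8 weights into it. The sketch below only renders such a layer definition as text. The helper name `finetune_fc_layer` is hypothetical, and the multiplier values 10 and 20 are illustrative assumptions, not values taken from the slides.

```python
def finetune_fc_layer(name, bottom, num_output, weight_lr_mult=10, bias_lr_mult=20):
    """Render a prototxt InnerProduct layer with boosted learning rates."""
    return "\n".join([
        "layer {",
        f'  name: "{name}"',
        '  type: "InnerProduct"',
        f'  bottom: "{bottom}"',
        f'  top: "{name}"',
        # lr_mult scales the solver's base learning rate for this layer only
        f"  param {{ lr_mult: {weight_lr_mult} decay_mult: 1 }}  # weights",
        f"  param {{ lr_mult: {bias_lr_mult} decay_mult: 0 }}  # biases",
        "  inner_product_param {",
        f"    num_output: {num_output}",
        "  }",
        "}",
    ])


print(finetune_fc_layer("fc8_flickr", "fc7", 20))
```

Combined with a low base learning rate in the solver, the pre-trained layers change slowly while the freshly initialized layer learns faster.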
Configure Network
●
Update “accuracy” layer for testing
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8_flickr"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
Configure Network
●
Update “loss” layer for training
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8_flickr"
  bottom: "label"
  top: "loss"
}
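Both the loss layer and the accuracy layer take the fc8_flickr scores and the labels as input. A rough sketch of what they compute follows, in plain Python for readability. It is an illustration of softmax cross-entropy and top-1 accuracy, not Caffe's implementation, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(batch_scores, labels):
    """Mean negative log-probability of the correct class over a batch."""
    losses = [-math.log(softmax(s)[y]) for s, y in zip(batch_scores, labels)]
    return sum(losses) / len(losses)


def top1_accuracy(batch_scores, labels):
    """Fraction of images whose highest-scoring class is the correct one."""
    hits = sum(
        1 for s, y in zip(batch_scores, labels)
        if max(range(len(s)), key=s.__getitem__) == y
    )
    return hits / len(labels)


scores = [[2.0, 0.0, 0.0], [0.0, 1.0, 3.0]]
labels = [0, 1]
print(top1_accuracy(scores, labels))  # 0.5
print(softmax_loss(scores, labels))
```

The loss drives the weight updates during training, while the accuracy is only reported in the test phase.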
Configure Training
●
solver.prototxt:
– update paths
net: "<netconf>"
test_iter: 320  # should cover the whole test set: 16000 / 50 = 320
test_interval: 2560  # test after two whole train cycles: 64000 / 50 = 1280, 1280 * 2 = 2560
...
snapshot: 5120  # save after four train cycles: 1280 * 4 = 5120
snapshot_prefix: "<snapshots>"
...
Training
●
Run training
./build/tools/caffe train \
  -solver <path to solver.prototxt> \
  -weights <path to bvlc.caffemodel> \
  -gpu 0
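The same command can be assembled from a script, which also makes the difference between finetuning (`-weights`) and resuming from a snapshot (`-snapshot`) explicit. This is a minimal sketch: the helper name `caffe_train_command` and the file names in the example are hypothetical placeholders, and it only builds the argument list without running Caffe.

```python
import shlex


def caffe_train_command(solver, weights=None, snapshot=None, gpu=0,
                        caffe_bin="./build/tools/caffe"):
    """Build the argument list for a `caffe train` call."""
    if weights and snapshot:
        # Finetuning starts from weights; resuming starts from a snapshot.
        raise ValueError("pass either weights or snapshot, not both")
    cmd = [caffe_bin, "train", "-solver", solver]
    if weights:
        cmd += ["-weights", weights]
    if snapshot:
        cmd += ["-snapshot", snapshot]
    cmd += ["-gpu", str(gpu)]
    return cmd


# Placeholder file names, for illustration only.
cmd = caffe_train_command("solver.prototxt", weights="bvlc.caffemodel")
print(shlex.join(cmd))
# ./build/tools/caffe train -solver solver.prototxt -weights bvlc.caffemodel -gpu 0
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once the paths point at real files.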
