Leveraging Social Media with Computer Vision
Informing Recommendations in Fashion and Retail
TJ Torres
Data Scientist, Stitch Fix
Big Data Applications in Fashion MeetUp
10/2016

Data Labs, Styling Algorithms, Research

MOTIVATION
Why Recommendations?
Inventory Scaling:
It is infeasible, from an efficiency perspective, to look through all inventory as it scales.

MOTIVATION
Inventory Scaling:
Human Ability:
Stylists can’t keep all products in their memories while
trying to locate the best items for each client.
Business Success:
Aid stylists in making the best decisions to better please
our clients.

MOTIVATION
Our goal at Stitch Fix
[Diagram: Total Inventory → Recommendation Algo → Filtered Items → Stylists → Final Items Sent (items 1 to 5)]

COMPUTER VISION
New Clients
New Clothing
Cold Start Problem
No or sparse purchasing information,
so how can we supplement this?
Perception
Fashion can be difficult to describe via text/categorization.
Many times it’s easier to show what you like.

TURN TO IMAGES
• Style/fashion is primarily visual.
• We wish to use images for modeling purposes.
• The heuristics for processing image data are unknown or quite complex.
• We don’t want to have to develop image features by hand.
• Turn to deep learning to learn the feature extraction.

OUTLINE
1. Brief Introduction to NNs
2. Deep Learning for Fashion Imagery
3. Recommendations and Social Media
4. Results
5. Conclusions

NEURAL NETWORKS
http://www.wired.com/2013/02/three-awesome-tools-scientists-may-use-to-map-your-brain-in-the-future/

http://googleresearch.blogspot.com/2015/06/inceptionism-going-deeper-into-neural.html

Whoa
Dude!
http://googleresearch.blogspot.com/2015/06/inceptionism-going-deeper-into-neural.html

Gatys et al.: https://arxiv.org/abs/1508.06576

INTRO TO NEURAL NETS
[Figure: network diagram with units numbered 1 to 6 arranged in layer 1 (input), layer 2, and layer 3 (output)]
Begin with input:
$$f_i^{(l)}(x) = \tanh\left( \sum_j W_{ij}^{(l)} x_j^{(l-1)} + b^{(l)} \right)$$
Transform data repeatedly with non-linear function.
$$f^{(1)} \cdots f^{(n)}(x)$$
Calculate loss function
$$L(x_{out}, y) = \overbrace{\frac{1}{m} \sum_{k=1}^{m} (x_k - y_k)^2}^{\text{MSE}}$$
and update weights
$$W_{ij}^{(l)*} = W_{ij}^{(l)} \left( 1 - \alpha \frac{\partial L}{\partial W_{ij}} \right)$$
$$\frac{\partial L}{\partial W_{ij}^{(l)}} = \left( \frac{\partial L}{\partial x_{out}} \right) \left( \frac{\partial x_{out}}{\partial f^{(n-1)}} \right) \cdots \left( \frac{\partial f^{(l)}}{\partial W_{ij}^{(l)}} \right)$$
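The forward pass, MSE loss, and chain-rule gradient from these slides can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names (`forward`, `mse`, `gradients`, `train_step`), the 4-2-1 layer sizes, the learning rate, and the toy data are all made up here, each unit gets its own bias, and the update is written as the usual gradient-descent step W − α ∂L/∂W.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, params):
    """Apply f(x) = tanh(W x + b) layer by layer and keep every activation."""
    activations = [x]
    for W, b in params:
        activations.append(np.tanh(W @ activations[-1] + b))
    return activations

def mse(x_out, y):
    """L(x_out, y) = (1/m) * sum_k (x_k - y_k)^2."""
    return np.mean((x_out - y) ** 2)

def gradients(activations, params, y):
    """Chain rule: propagate dL/dx_out backwards through the tanh layers."""
    grad_out = 2.0 * (activations[-1] - y) / y.size   # dL/dx_out
    grads = []
    for layer in reversed(range(len(params))):
        W, _ = params[layer]
        out, inp = activations[layer + 1], activations[layer]
        delta = grad_out * (1.0 - out ** 2)           # tanh'(z) = 1 - tanh(z)^2
        grads.append((np.outer(delta, inp), delta))   # (dL/dW, dL/db)
        grad_out = W.T @ delta                        # gradient w.r.t. the layer input
    return grads[::-1]

def train_step(x, y, params, alpha=0.1):
    """One gradient-descent update with learning rate alpha (assumed value)."""
    grads = gradients(forward(x, params), params, y)
    return [(W - alpha * dW, b - alpha * db)
            for (W, b), (dW, db) in zip(params, grads)]

# Hypothetical network: 4 inputs, 2 hidden units, 1 output.
sizes = [4, 2, 1]
params = [(rng.normal(scale=0.5, size=(n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]
x = np.array([0.5, -1.0, 0.25, 0.0])   # toy input
y = np.array([0.3])                    # toy target

loss_before = mse(forward(x, params)[-1], y)
for _ in range(200):
    params = train_step(x, y, params)
loss_after = mse(forward(x, params)[-1], y)
```

Comparing `loss_before` with `loss_after` shows whether the repeated updates are reducing the loss on the toy example.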

RECS AND SOCIAL MEDIA
Clients share a Pinterest board to visually indicate their fashion tastes.
Match pinned images to our own styles.

Strategies
Attribute extraction and matching.
Visual feature similarity.
Metric learning.
…or some combination.

VISUAL FEATURES
Use features extracted by a pre-trained network.
Compare image features with the metric of your choice (cosine, Euclidean, etc.).
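A minimal sketch of the comparison step, assuming the image features are already available as NumPy vectors. The feature arrays below are random stand-ins rather than real network outputs, and the helper names (`cosine_distance`, `euclidean_distance`, `top_k`) are hypothetical.

```python
import numpy as np

def cosine_distance(query, items):
    """1 - cosine similarity between one query vector and each row of items."""
    q = query / np.linalg.norm(query)
    x = items / np.linalg.norm(items, axis=1, keepdims=True)
    return 1.0 - x @ q

def euclidean_distance(query, items):
    """Euclidean distance between one query vector and each row of items."""
    return np.linalg.norm(items - query, axis=1)

def top_k(query, items, k=5, metric=cosine_distance):
    """Indices of the k items closest to the query, nearest first."""
    return np.argsort(metric(query, items))[:k]

# Random stand-ins for features from a pre-trained network (assumption).
rng = np.random.default_rng(0)
inventory_features = rng.normal(size=(100, 64))
# A query that is a slightly perturbed copy of inventory item 17.
query_features = inventory_features[17] + 0.01 * rng.normal(size=64)

nearest_cosine = top_k(query_features, inventory_features)
nearest_euclid = top_k(query_features, inventory_features,
                       metric=euclidean_distance)
```

Swapping the `metric` argument is enough to compare how the two distances rank the same inventory.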

EXAMPLES
Query Image
Top 5 Results

CHALLENGES
Query Image
Top 5 Results
Sometimes things don’t work out so well…
Need a system that can compare images across separate domains.

METRIC LEARNING
New Metric as Objective
Anchor, Positive, Negative
Triplet or Contrastive Loss
https://arxiv.org/abs/1404.4661
$$L_{\text{triplet}}(a, p, n) = \frac{1}{N} \left( \sum_{i=1}^{N} \max\left\{ d(f(a_i), f(p_i)) - d(f(a_i), f(n_i)) + m,\ 0 \right\} \right)$$
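The triplet loss can be written directly in NumPy. This sketch assumes d is the Euclidean distance (the slide leaves d unspecified) and takes the embeddings f(a_i), f(p_i), f(n_i) as precomputed rows; the numbers are made up for illustration.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin):
    """Mean over N triplets of max(d(a, p) - d(a, n) + m, 0), Euclidean d."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return np.mean(np.maximum(d_pos - d_neg + margin, 0.0))

# Made-up embeddings, one row per triplet.
anchor = np.array([[0.0, 0.0], [1.0, 1.0]])
positive = np.array([[0.0, 1.0], [1.0, 1.0]])
negative = np.array([[3.0, 4.0], [1.0, 1.5]])

# Triplet 1: d_pos = 1, d_neg = 5, so the margin is met and the term is 0.
# Triplet 2: d_pos = 0, d_neg = 0.5, so the term is 0 - 0.5 + 1 = 0.5.
loss = triplet_loss(anchor, positive, negative, margin=1.0)  # mean is 0.25
```

Only triplets that violate the margin contribute to the loss, so training effort goes to the cases where the negative is still too close to the anchor.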

METRIC LEARNING
[Figure: positive and negative examples relative to the margin m, before training and after training]
Learn an embedding that obeys the similarity constraints.
similarity score = d(query, inventory)
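As a rough illustration of both ideas, the sketch below checks the margin constraint d(a, p) + m ≤ d(a, n) for a triplet and ranks inventory items by the similarity score. Euclidean distance, the function names, and all of the coordinates are assumptions made for the example.

```python
import numpy as np

def distance(u, v):
    """Euclidean distance d between rows of u and v (choice of d is assumed)."""
    return np.linalg.norm(u - v, axis=-1)

def satisfies_margin(anchor, positive, negative, margin):
    """True for each triplet where d(a, p) + m <= d(a, n)."""
    return distance(anchor, positive) + margin <= distance(anchor, negative)

def rank_inventory(query, inventory):
    """Inventory indices sorted by d(query, inventory), closest first."""
    return np.argsort(distance(query, inventory))

# Made-up 2-D embeddings: the negative sits too close before training
# and outside the margin after training.
anchor = np.array([[0.0, 0.0]])
positive = np.array([[1.0, 0.0]])
negative_before = np.array([[1.2, 0.0]])
negative_after = np.array([[3.0, 0.0]])

ok_before = satisfies_margin(anchor, positive, negative_before, margin=1.0)
ok_after = satisfies_margin(anchor, positive, negative_after, margin=1.0)

# Rank three hypothetical inventory embeddings against a query at the origin.
inventory = np.array([[3.0, 0.0], [0.5, 0.0], [1.0, 1.0]])
ranking = rank_inventory(np.array([0.0, 0.0]), inventory)
```

A lower score means a closer match, so the first index in `ranking` is the inventory item most similar to the query.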

CONCLUSIONS
1. Social media images can help make better recommendations.
a) Alleviate cold start.
b) Provide new features/data for recommendations.
2. Cross-domain image matching can be difficult, but is made easier with deep learning.
3. There’s enormous potential moving forward with this type of work.
a) Attribute labeling and trend tracking.
b) Predictive models for purchasing probability.

THANK YOU!
@teejosaur
/in/tjtorres
@tjtorres
