Deep learning image classification applied to the world of fashion
1. Robert Figiel
Co-Founder & CTO
Deep Learning
Image Classification
Javier Abadía
Lead Developer
2. WHAT DO WE DO AT STYLESAGE?
Collect Data: web-crawling of 100M+ e-commerce products daily.
Analyze Products: analysis of text, machine learning, image recognition.
Visualize Insights: visualize insights for fashion brands & retailers.
12. CODE USING python/scikit-learn
"""Based on
http://scikit-learn.org/stable/auto_examples/linear_model/plot_iris_logistic.html
"""
import numpy as np
from sklearn import linear_model, metrics

# `data` is assumed to be a dict of image arrays and labels loaded earlier
N = 50000  # number of training samples to use
# flatten each 2-D image into a 1-D feature vector
X = np.array([x.flatten() for x in data['train_dataset'][:N]])
Y = data['train_labels'][:N]
solver = 'sag'
C = 0.001  # inverse of regularization strength

# train
logreg = linear_model.LogisticRegression(C=C, solver=solver)
logreg.fit(X, Y)

# test
VX = np.array([x.flatten() for x in data['test_dataset']])
predicted_labels = logreg.predict(VX)
print("%.3f" % metrics.accuracy_score(data['test_labels'], predicted_labels))
13. CODE WITH tensorflow
import tensorflow as tf

# TensorFlow 1.x-style graph API.
# batch_size, image_size and num_labels are assumed to be defined earlier.
graph = tf.Graph()
with graph.as_default():
    # Input data placeholders
    tf_train_dataset = tf.placeholder(tf.float32,
                                      shape=(batch_size, image_size * image_size))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))

    # Variables.
    weights = tf.Variable(
        tf.truncated_normal([image_size * image_size, num_labels]))
    biases = tf.Variable(tf.zeros([num_labels]))

    # Training computation.
    logits = tf.matmul(tf_train_dataset, weights) + biases
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels,
                                                logits=logits))

    # Optimizer.
    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
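The loss in the graph above is a softmax followed by a cross-entropy, averaged over the batch. As a rough sketch of what that computation does, here is a plain-Python version. The function names and the two-example batch are made up for illustration and are not part of the original slides or of TensorFlow.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, one_hot):
    """Cross-entropy between predicted probabilities and a one-hot label."""
    return -sum(t * math.log(p) for p, t in zip(probs, one_hot) if t > 0)

def mean_loss(batch_logits, batch_labels):
    """Average the per-example losses, as tf.reduce_mean does in the graph."""
    losses = [cross_entropy(softmax(z), y)
              for z, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)

# Hypothetical batch: 2 examples, 3 classes.
batch_logits = [[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]]
batch_labels = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
loss = mean_loss(batch_logits, batch_labels)
```

With all-zero logits the model is maximally unsure, so the loss for that example is log(3) for three classes. Training pushes the logit of the correct class up and the loss down.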
14. LINEAR METHODS ARE LIMITED TO LINEAR RELATIONSHIPS
[Diagram] X → ✕W1 + b1 → Y → s(Y) → P
[Diagram] X → ✕W1 + b1 → Y → s(Y) → ✕W2 + b2 → P
s = activation function (RELU)
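To make the point concrete, here is a small sketch using XOR, a classic relationship that no single linear layer can represent. The weights below are hand-picked for illustration (not trained, and not from the original slides). With a RELU between the two layers the network computes XOR; without it, the two layers collapse into one linear map.

```python
def relu(v):
    """Element-wise RELU activation."""
    return [max(0.0, x) for x in v]

def linear(x, W, b):
    """Compute x*W + b for a row vector x and a weight matrix W (list of rows)."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(len(b))]

# Hand-picked (not trained) weights for a 2-2-1 network.
W1 = [[1.0, 1.0], [1.0, 1.0]]
b1 = [0.0, -1.0]
W2 = [[1.0], [-2.0]]
b2 = [0.0]

def two_layer(x):
    """X -> *W1 + b1 -> RELU -> *W2 + b2."""
    return linear(relu(linear(x, W1, b1)), W2, b2)[0]

def two_layer_no_activation(x):
    """Same weights without the activation: still just a linear function of x."""
    return linear(linear(x, W1, b1), W2, b2)[0]

inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
xor_outputs = [two_layer(x) for x in inputs]
linear_outputs = [two_layer_no_activation(x) for x in inputs]
```

Any linear (affine) function f satisfies f(0,0) + f(1,1) = f(0,1) + f(1,0). XOR violates that equality, which is why the version without the activation cannot match it no matter which weights are chosen.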
19. CHOOSING A MODEL – OPEN SOURCE OPTIONS
• AlexNet (2012)
  • 8 layers, 16.4% error rate on ImageNet
• GoogLeNet (2014)
  • 22 layers, 6.66% error rate on ImageNet
• Google Inception v3 (2015)
  • 48 layers, 3.46% error rate
• Microsoft ResNet (2015)
  • 152 layers, 3.57% error rate
Yearly competition on ImageNet dataset with 1M images
across 1000 object classes – models available open source
Many models are open source. No need to reinvent the wheel.
20. FRAMEWORK – OPEN SOURCE OPTIONS
• Caffe
  • Developed by UC Berkeley, very efficient algorithms
  • Implemented GoogLeNet, ResNet
  • Large community
• TensorFlow
  • Released 2015 by Google
  • Ready-to-use implementations of GoogLeNet, Inception v3
  • TensorBoard for visualizing training progress
• Torch, Theano, Keras, ...
Many Python frameworks available, all with many examples,
good documentation and pre-implemented models
Choose a Python framework that fits your needs
21. IMPLEMENTING A CNN
MODEL – TRAIN - PREDICT
Select / Develop MODEL
TRAIN/TEST model with known images
PREDICT on new images
Feedback loop: additional training data
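The model, train, predict cycle with its feedback loop can be sketched in a few lines. The example below swaps the CNN for a toy nearest-centroid classifier so it runs without any framework. The two-number "images", the labels, and all function names are hypothetical and only illustrate the workflow.

```python
def train(examples):
    """Fit a nearest-centroid model: one mean feature vector per label."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}

def predict(model, features):
    """Return the label whose centroid is closest to the features."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: dist2(model[label]))

def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the known label."""
    hits = sum(predict(model, f) == y for f, y in examples)
    return hits / len(examples)

# Hypothetical 2-feature "images" with made-up labels.
train_set = [([0.0, 0.0], "dress"), ([0.2, 0.1], "dress"),
             ([1.0, 1.0], "shoe"), ([0.9, 1.1], "shoe")]
test_set = [([0.1, 0.0], "dress"), ([1.0, 0.9], "shoe")]

model = train(train_set)                    # TRAIN with known images
test_accuracy = accuracy(model, test_set)   # TEST on held-out known images
new_image = [0.6, 0.6]
guess = predict(model, new_image)           # PREDICT on a new image

# Feedback loop: a reviewer supplies the correct label, and the
# corrected example becomes additional training data.
train_set.append((new_image, "dress"))
model = train(train_set)
guess_after_feedback = predict(model, new_image)
```

In this toy run the first prediction for the new image is wrong, and retraining with the corrected example fixes it. A real CNN pipeline follows the same shape with far more data and a much longer training step.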
22. INFRASTRUCTURE – GPUS
Underlying CNN computations are mainly matrix multiplications
GPUs (Graphics Processing Units) are 30-50X faster than CPUs
1 CPU: 2 sec vs. 1 GPU: 50 ms
Use GPU-based servers for faster training and predictions
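A quick back-of-the-envelope calculation shows what the two timings on this slide mean at scale. The per-image times come from the slide; the workload of 100M images (one image per product) is an assumption made purely for illustration, and the estimate ignores batching, I/O, and parallel devices.

```python
CPU_SECONDS_PER_IMAGE = 2.0     # from the slide: 1 CPU, 2 sec
GPU_SECONDS_PER_IMAGE = 0.050   # from the slide: 1 GPU, 50 ms

speedup = CPU_SECONDS_PER_IMAGE / GPU_SECONDS_PER_IMAGE

def device_days(num_images, seconds_per_image):
    """Total processing time in days on a single device."""
    return num_images * seconds_per_image / 86400.0

# Assumption for illustration only: 100M images, one per product.
images = 100000000
cpu_days = device_days(images, CPU_SECONDS_PER_IMAGE)
gpu_days = device_days(images, GPU_SECONDS_PER_IMAGE)

print("speedup: %.0fX" % speedup)
print("single CPU: %.0f days, single GPU: %.0f days" % (cpu_days, gpu_days))
```

Under these assumptions the two timings give a 40X speedup, in the middle of the 30-50X range quoted above. Either way a single device is not enough for that volume, so the practical choice is how many servers are needed, and GPUs cut that number by the same factor.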
27. THANK YOU – WE ARE RECRUITING!
Team Slide - recruiting
www.stylesage.co/careers javier@stylesage.co