CAFFE (Convolution Architecture For Feature Extraction) is a deep learning framework made for speed. It was made on top of C/C++/CUDA supports CPU and GPU.

Originally Caffe could process over 60M images per day with a single NVIDIA K40 GPU (very old GPU now, Release Date October 8th, 2013). That is 1 ms/image for inference and 4 ms/image for learning.

The library was initially designed for machine vision tasks, but more recent versions support sequences, speech and text modules. Caffe sit on these four concepts:

  • blobs
  • layers
  • net
  • solver

Blob stores data and derivatives. Caffe moves data from CPU to GPU via blobs. Layer transforms bottom blobs to top blobs.

Net is simple multiple layers. It computes gradients via Forward and Backward methods.

Solver uses gradients to update weights.

Caffe supports Matlab and Python so C++ is not the only way to work with Caffe.


Caffe2 and Caffe are two completely different frameworks. Caffe is much older and almost completely replaced by Caffe2 (released in late 2017).

Early in 2018 when PyTorch integrated with Caffe2.

NOTE: You can think of Caffe2 now as part of PyTorch. Caffe2 and PyTorch join forces to create a Research + Production platform

So you could train your model using PyTorch and deploy it using Caffe2. ( PyTorch 1.1.0+ and ONNX 1.5.0+ with Python3.7+ seams to be the good match)

To export your model to ONNX you typically use torch.onnx:

import torch.onnx
torch.onnx.export(model, input, "model.onnx")

From there deploy model.onnx on a device and do inference in Caffe2.