For this workshop we invited the community to compete in three challenges intended to advance the field of representation learning:

1) The multi-modal learning challenge: Competitors design a system that maps both images and text into the same representation space. They are scored on their ability to perform information retrieval on held-out data. This challenge is designed to advance the ability of representation learning systems to discover semantic spaces that underlie multiple kinds of sensory input.
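The retrieval-based scoring for such a shared embedding space can be sketched roughly as follows. This is a minimal illustration only, not the official challenge metric: the function name, the use of recall@k, and cosine similarity as the ranking score are all our assumptions.

```python
import numpy as np

def recall_at_k(image_embs, text_embs, k=5):
    """Fraction of text queries whose paired image (same row index)
    appears among the k nearest images under cosine similarity.
    Illustrative metric only; the challenge's actual scoring may differ."""
    # Normalize rows so that a plain dot product equals cosine similarity.
    imgs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    txts = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = txts @ imgs.T                       # (n_text, n_image) similarity matrix
    topk = np.argsort(-sims, axis=1)[:, :k]    # indices of the k most similar images
    hits = [i in topk[i] for i in range(len(txts))]
    return float(np.mean(hits))

# Toy example: three paired (image, text) embeddings, with the text side
# perturbed slightly to mimic imperfect alignment of the two modalities.
rng = np.random.default_rng(0)
pairs = rng.normal(size=(3, 8))
noise = rng.normal(scale=0.1, size=(3, 8))
print(recall_at_k(pairs, pairs + noise, k=1))
```

In this framing, a better joint representation pulls each caption closer to its own image than to the others, which directly raises the retrieval score.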

2) The black-box learning challenge: Competitors train a classifier on a dataset that is not human readable, without knowledge of what the data consists of. They are scored on classification accuracy on a private test set. This challenge is designed to remove the advantage conferred by a human researcher working in the loop with the training algorithm.

3) The facial expression recognition challenge: One motivation for representation learning is that learning algorithms can design features better and faster than humans can. To test this claim, we will hold one challenge that does not explicitly require that entries use representation learning. Rather, we will introduce an entirely new dataset and invite competitors from all related communities to solve it. The dataset for this challenge is a facial expression classification dataset that we have assembled from the internet and that has not yet been distributed publicly.

The first-place winner of each contest will receive $300 and the second-place winner will receive $150. The prize money was generously provided by Google, Inc.