Welcome

Pylearn2 is still undergoing rapid development. Don’t expect a clean road without bumps! If you find a bug please write to pylearn-dev@googlegroups.com. If you’re a Pylearn2 developer and you find a bug, please write a unit test for it so the bug doesn’t come back!

Pylearn2 is a machine learning library. Most of its functionality is built on top of Theano. This means you can write Pylearn2 plugins (new models, algorithms, etc.) using mathematical expressions, and Theano will optimize and stabilize those expressions for you, and compile them to a backend of your choice (CPU or GPU).

Pylearn2 Vision

  • Researchers add features as they need them. We avoid getting bogged down by too much top-down planning in advance.
  • A machine learning toolbox for easy scientific experimentation.
  • All models/algorithms published by the LISA lab should have reference implementations in Pylearn2.
  • Pylearn2 may wrap other libraries such as scikit-learn when this is practical.
  • Pylearn2 differs from scikit-learn in that Pylearn2 aims to provide great flexibility and make it possible for a researcher to do almost anything, while scikit-learn aims to work as a “black box” that can produce good results even if the user does not understand the implementation.
  • Dataset interface for vectors, images, video, ...
  • A small framework containing everything needed for a typical MLP/RBM/SDA/convolution experiment.
  • Easy reuse of sub-components of Pylearn2.
  • Using one sub-component of the library does not force you to use / learn to use all of the other sub-components if you choose not to.
  • Support cross-platform serialization of learned models.
  • Remain approachable enough to be used in the classroom (IFT6266 at the University of Montreal).

Download and installation

There is no PyPI download yet. To get the bleeding-edge development version, you must check out the code from GitHub:

git clone git://github.com/lisa-lab/pylearn2.git

Once done, append the installation path to the PYTHONPATH variable to make the library accessible from Python. On Linux, you can add a line to your .bashrc file:

export PYTHONPATH=<new location to add>:$PYTHONPATH

You also need to set your PYLEARN2_DATA_PATH variable. On Linux, the best way to do this is to add a line to your .bashrc file:

export PYLEARN2_DATA_PATH=/data/lisa/data

Note that this is only an example, and if you are not in the LISA lab, you will need to choose a directory path that is valid on your filesystem. Simply choose a path where it will be convenient for you to store datasets for use with Pylearn2.
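As a quick sanity check of the two variables above, a minimal Python sketch might look like the following. The helper name check_pylearn2_env is hypothetical and is not part of Pylearn2; it only inspects the environment and never imports the library. It also assumes that the directory you cloned contains a pylearn2 package directory.

```python
import os


def check_pylearn2_env(environ=None):
    """Return a list of human-readable problems with the environment.

    Hypothetical helper, not part of Pylearn2. It only looks at the
    PYLEARN2_DATA_PATH and PYTHONPATH variables described above.
    """
    env = os.environ if environ is None else environ
    problems = []

    data_path = env.get("PYLEARN2_DATA_PATH")
    if not data_path:
        problems.append("PYLEARN2_DATA_PATH is not set")
    elif not os.path.isdir(data_path):
        problems.append("PYLEARN2_DATA_PATH is not a directory: %s" % data_path)

    # PYTHONPATH entries are separated by os.pathsep (":" on Linux).
    entries = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]
    # Assumption: the cloned checkout contains a "pylearn2" package directory.
    if not any(os.path.isdir(os.path.join(p, "pylearn2")) for p in entries):
        problems.append("no PYTHONPATH entry contains a 'pylearn2' directory")

    return problems


if __name__ == "__main__":
    for problem in check_pylearn2_env():
        print("warning: " + problem)
```

Run it from a fresh shell after editing .bashrc; if it prints nothing, both variables point at existing locations.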

Other methods

Vagrant (any OS)

Pylearn2 in a box uses Vagrant to easily create a new VM installed with Pylearn2 and the necessary packages.

  1. Download and install Vagrant http://www.vagrantup.com/
  2. Download and install VirtualBox and the VirtualBox Extension Pack https://www.virtualbox.org/wiki/Downloads
  3. Download the Vagrant scripts from Pylearn2 in a box
  4. Start the VM by running vagrant up in the directory from step 3

Dependencies

  • Theano and its dependencies are required to use Pylearn2. Since Pylearn2 is under rapid development, some of its features might depend on the bleeding-edge version of Theano.

  • PyYAML is required for most functionality.

  • PIL is required for some image-related functionality.

  • Some dependencies are optional:
    • Pylearn2 includes code for accessing several standard datasets, such as MNIST and CIFAR-10. However, if you wish to use one of these datasets, you must download the dataset itself manually.
    • The original Pylearn project is required for loading some datasets, such as the Unsupervised and Transfer Learning Challenge datasets.
    • Some features (SVMs) depend on scikit-learn.
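To see which of the dependencies above are visible to your Python interpreter, a small sketch along these lines may help. The function name missing_modules and the grouping into REQUIRED and OPTIONAL are illustrative choices, not part of Pylearn2. The import names used are the usual ones for these packages (PyYAML is imported as yaml, PIL as PIL, scikit-learn as sklearn). The original Pylearn project is left out because its import name is not stated here.

```python
import importlib.util

# Usual import names for the packages listed above.
REQUIRED = ["theano", "yaml"]
# Needed only for some functionality (image handling, SVMs).
OPTIONAL = ["PIL", "sklearn"]


def missing_modules(names):
    """Return the subset of `names` that cannot be located for import.

    find_spec only locates a top-level module; it does not import it,
    so this check is cheap and has no side effects.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for name in missing_modules(REQUIRED):
        print("missing required dependency: " + name)
    for name in missing_modules(OPTIONAL):
        print("missing optional dependency: " + name)
```

This only reports whether a module can be found; it says nothing about whether the installed version is recent enough for the bleeding-edge features mentioned above.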

License and Citations

Pylearn2 is released under the 3-clause BSD license, so it may be used for commercial purposes. The license does not require anyone to cite Pylearn2, but if you use Pylearn2 in published research work we encourage you to cite this article:

Pylearn2 is primarily developed by academics, and so citations matter a lot to us. As an added benefit, you increase Pylearn2’s exposure and potential user (and developer) base, which is to the benefit of all users of Pylearn2. Thanks in advance!

Documentation

Roughly in order of what you’ll want to check out:

  • Quick-start example – Learn the basics via an example.
  • YAML for Pylearn2 – A tutorial on YAML tags employed by Pylearn2.
  • IPython Notebook Tutorials – At this point, you might want to work through the IPython notebooks in the “scripts/tutorials” directory.
  • Features – A list of features available in the library.
  • Overview – A detailed but high-level overview of how Pylearn2 works. This is the place to start if you want to really learn the library.
  • Library Documentation – Documentation of the library modules.
  • Working with computer clusters – The tools we use at LISA for running Pylearn2 jobs on HPC clusters.
  • Pylearn2 Vision – Some more detailed elaboration of some points of the Pylearn2 vision.

Community

  • Register and post to pylearn-users for general inquiries and support questions or if you want to talk to other users.
  • Register and post to pylearn-dev if you want to talk to the developers. We don’t bite.
  • Register to pylearn2-github if you want to receive an email for all changes to the GitHub repository.
  • Register to theano-buildbot if you want to receive our daily buildbot email. This is the buildbot for Pylearn2, Theano, Pylearn and the Deep Learning Tutorial.
  • Ask and browse questions about machine learning in general at metaoptimize/qa/tags/theano (it’s like Stack Overflow for machine learning).
  • We use GitHub issues to stay organized.
  • Come visit us in Montreal! Most of the developers are students in the LISA group at the University of Montreal.

Developer