Pylearn2 is still undergoing rapid development. Don’t expect a clean road without bumps! If you find a bug please write to firstname.lastname@example.org. If you’re a Pylearn2 developer and you find a bug, please write a unit test for it so the bug doesn’t come back!
Pylearn2 is a machine learning library. Most of its functionality is built on top of Theano. This means you can write Pylearn2 plugins (new models, algorithms, etc.) using mathematical expressions; Theano will optimize and stabilize those expressions for you and compile them to a backend of your choice (CPU or GPU).
- Researchers add features as they need them. We avoid getting bogged down by too much top-down planning in advance.
- A machine learning toolbox for easy scientific experimentation.
- All models/algorithms published by the LISA lab should have reference implementations in Pylearn2.
- Pylearn2 may wrap other libraries such as scikit-learn when this is practical.
- Pylearn2 differs from scikit-learn in that Pylearn2 aims to provide great flexibility and make it possible for a researcher to do almost anything, while scikit-learn aims to work as a “black box” that can produce good results even if the user does not understand the implementation.
- Dataset interface for vectors, images, video, ...
- A small framework providing everything needed for a typical MLP/RBM/SDA/convolutional experiment.
- Easy reuse of sub-components of Pylearn2.
- Using one sub-component of the library does not force you to use / learn to use all of the other sub-components if you choose not to.
- Support cross-platform serialization of learned models.
- Remain approachable enough to be used in the classroom (IFT6266 at the University of Montreal).
Download and installation
There is no PyPI download yet, so Pylearn2 cannot be installed with a package installer such as pip.
You must check out the bleeding-edge/development version from GitHub:
git clone https://github.com/lisa-lab/pylearn2.git
To make Pylearn2 available in your Python installation, run the following command in the pylearn2 directory (which should have been created by the previous command):
python setup.py develop
You may need to use sudo to invoke this command with administrator privileges. If you do not have such access (or would rather not add Pylearn2 to the global site-packages for whatever reason), you can install the relevant links inside the user site-packages directory by issuing the command:
python setup.py develop --user
This command will also compile the Cython extensions that some parts of the library require.
Alternatively, you can make Pylearn2 available by adding the installation directory to your PYTHONPATH environment variable, but note that changing your PYTHONPATH alone won’t compile the Cython extensions (you will need to make sure the extension files are built and accessible within the source tree, e.g. with python setup.py build_ext --inplace).
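After either installation route, it can help to confirm which copy of a package Python will actually import. The following is a minimal sketch using only the standard library; the helper name `locate_package` is hypothetical and not part of Pylearn2.

```python
import importlib.util


def locate_package(name):
    """Return the file a top-level package would be imported from, or None.

    `name` must be a top-level package name (no dots).
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        # Nothing with this name is reachable through sys.path / PYTHONPATH.
        return None
    return spec.origin


if __name__ == "__main__":
    # "pylearn2" is the package name created by the clone step.
    print(locate_package("pylearn2"))
```

If the printed path is not inside your checkout, another copy of the package is probably shadowing it on the import path.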
For some tutorials and tests you will also need to set your PYLEARN2_DATA_PATH environment variable. On Linux, the best way to do this is to add a line to your shell startup file (for example ~/.bashrc) that exports the variable.
If you are not in the LISA lab, you will need to choose a directory path that is valid on your filesystem. Simply choose a path where it will be convenient for you to store datasets for use with Pylearn2.
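A script can check the data directory setting up front and fail with a clear message. This sketch assumes the variable is named PYLEARN2_DATA_PATH; the function `resolve_data_path` is a hypothetical helper, not a Pylearn2 API.

```python
import os


def resolve_data_path(env=None, var="PYLEARN2_DATA_PATH"):
    """Return the dataset directory named by `var`, or raise a clear error.

    `env` defaults to the process environment; passing a dict makes the
    function easy to exercise without touching real settings.
    """
    env = os.environ if env is None else env
    path = env.get(var)
    if not path:
        raise RuntimeError(
            "%s is not set; export it to a directory that holds your datasets"
            % var
        )
    # Expand "~" so that values such as "~/data" resolve to a real directory.
    return os.path.abspath(os.path.expanduser(path))
```

Calling `resolve_data_path()` with no arguments reads the real environment and raises `RuntimeError` when the variable is missing or empty.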
Vagrant (any OS)
Pylearn2 in a box uses Vagrant to easily create a new VM installed with Pylearn2 and the necessary packages.
- PyYAML is required for most functionality.
- PIL is required for some image-related functionality.
- matplotlib is required for some plotting functionality.
- Some dependencies are optional:
- Pylearn2 includes code for accessing several standard datasets, such as MNIST and CIFAR-10. However, if you wish to use one of these datasets, you must download the dataset itself manually.
- The original Pylearn project is required for loading some datasets, such as the Unsupervised and Transfer Learning Challenge datasets.
- Some features (SVMs) depend on scikit-learn.
- k-means depends on milk.
- Reading the SVHN dataset depends on PyTables.
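Because these dependencies are optional, it can be convenient to see which ones are missing before running an experiment. The sketch below is one way to do that; the module names (`sklearn`, `milk`, `tables`) are assumptions based on how these projects are commonly imported, and `missing_optional` is a hypothetical helper.

```python
import importlib.util

# Feature -> top-level module name. These names are assumptions about how
# each project is imported, not something defined by Pylearn2.
OPTIONAL_MODULES = {
    "SVMs (scikit-learn)": "sklearn",
    "k-means (milk)": "milk",
    "SVHN reading (PyTables)": "tables",
}


def missing_optional(modules=OPTIONAL_MODULES):
    """Return a sorted list of features whose backing module cannot be found."""
    return sorted(
        feature
        for feature, module in modules.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    for feature in missing_optional():
        print("unavailable:", feature)
```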
License and Citations
Pylearn2 is released under the 3-clause BSD license, so it may be used for commercial purposes. The license does not require anyone to cite Pylearn2, but if you use Pylearn2 in published research work we encourage you to cite this article:
- Ian J. Goodfellow, David Warde-Farley, Pascal Lamblin, Vincent Dumoulin, Mehdi Mirza, Razvan Pascanu, James Bergstra, Frédéric Bastien, and Yoshua Bengio. “Pylearn2: a machine learning research library”. arXiv preprint arXiv:1308.4214 (BibTeX)
Pylearn2 is primarily developed by academics, and so citations matter a lot to us. As an added benefit, you increase Pylearn2’s exposure and potential user (and developer) base, which is to the benefit of all users of Pylearn2. Thanks in advance!
Roughly in order of what you’ll want to check out:
- Quick-start example – Learn the basics via an example.
- YAML for Pylearn2 – A tutorial on YAML tags employed by Pylearn2.
- IPython Notebook Tutorials – At this point, you might want to work through the IPython notebooks in the “scripts/tutorials” directory.
- A First Experiment with Pylearn2 – A brief introduction to running experiments.
- Monitoring Experiments in Pylearn2 – An overview of monitoring experiments.
- Your models in Pylearn2 – A tutorial on porting Theano code to Pylearn2.
- Features – A list of features available in the library.
- Overview – A detailed but high-level overview of how Pylearn2 works. This is the place to start if you want to really learn the library.
- Library Documentation – Documentation of the library modules.
- Working with computer clusters – The tools we use at LISA for running Pylearn2 jobs on HPC clusters.
- Working with large datasets in Pylearn2 – A guide on how to deal with large datasets.
- Pylearn2 Vision – A more detailed elaboration of some points of the Pylearn2 vision.
- F.A.Q. – Please read the FAQ section before posting to the mailing lists.
- Register and post to pylearn-users for general inquiries and support questions or if you want to talk to other users.
- Register and post to pylearn-dev if you want to talk to the developers. We don’t bite.
- Register to pylearn2-github if you want to receive an email for all changes to the GitHub repository.
- Register to theano-buildbot if you want to receive our daily buildbot email. This is the buildbot for Pylearn2, Theano, Pylearn and the Deep Learning Tutorial.
- Ask or browse questions and answers about machine learning in general at metaoptimize/qa/tags/theano (it’s like Stack Overflow for machine learning).
- We use GitHub issues to stay organized.
- Come visit us in Montreal! Most of the developers are students in the LISA group at the University of Montreal.
- Register to everything listed in the Community section above
- Follow the LISA lab coding style guidelines: http://deeplearning.net/software/pylearn/v2_planning/API_coding_style.html
- Pylearn2 API Change Best Practices Guide – the best practices guide you should follow when changing any API in the library
- Developer Start Guide – how to contribute code to Pylearn2
- Pylearn2 Pull Request Checklist – Things you should check your pull request for before review.
- Data specifications, spaces, and sources – the interface different elements use to request and provide data