Welcome
Pylearn2 is still undergoing rapid development. Don’t expect a clean road without bumps!
If you find a bug please write to pylearn-dev@googlegroups.com.
If you’re a Pylearn2 developer and you find a bug, please write a unit test for it so
the bug doesn’t come back!
Pylearn2 is a machine learning library. Most of its functionality is built on top of
Theano.
This means you can write Pylearn2 plugins (new models, algorithms, etc) using
mathematical expressions, and theano will optimize and stabilize those expressions for
you, and compile them to a backend of your choice (CPU or GPU).
Pylearn2 Vision
- Researchers add features as they need them. We avoid getting bogged down by too
much top-down planning in advance.
- A machine learning toolbox for easy scientific experimentation.
- All models/algorithms published by the LISA lab should have reference implementations
in Pylearn2.
- Pylearn2 may wrap other libraries such as scikits.learn when this is practical
- Pylearn2 differs from scikits.learn in that Pylearn2 aims to provide great flexibility
and make it possible for a researcher to do almost anything, while scikits.learn
aims to work as a “black box” that can produce good results even if the user does not
understand the implementation
- Dataset interface for vector, images, video, ...
- Small framework for all what is needed for one normal MLP/RBM/SDA/Convolution experiments.
- Easy reuse of sub-component of Pylearn2.
- Using one sub-component of the library does not force you to use / learn to use all of
the other sub-components if you choose not to.
- Support cross-platform serialization of learned models.
- Remain approachable enough to be used in the classroom (IFT6266 at the University of Montreal).
Download and installation
No PyPI download yet. You must checkout the version in github for bleeding-edge/development
version, available via:
git clone git://github.com/lisa-lab/pylearn2.git
You also need to set your PYLEARN2_DATA_PATH variable. On linux, the best way to do this
is to add a line to your .bashrc file:
export PYLEARN2_DATA_PATH=/data/lisa/data
Note that this is only an example, and if you are not in the LISA lab, you will need to
choose a directory path that is valid on your filesystem. Simply choose a path where
it will be convenient for you to store datasets for use with Pylearn2.
Dependencies
Theano and its dependencies are required to use
Pylearn2.
PyYAML is required for most functionality.
PIL is required for some image-related functionality.
- Some dependencies are optional:
- Pylearn2 includes code for accessing several standard datasets, such as MNIST and CIFAR-10.
However, if you wish to use one of these datasets, you must download the dataset itself
manually.
- The original Pylearn project is required
for loading some datasets, such as the Unsupervised and Transfer Learning Challenge datasets
- Some features (SVMs) depend on scikits-learn.
Documentation
Roughly in order of what you’ll want to check out:
- Quick-start example – Learn the basics via an example.
- At this point, you might want to work through the ipython notebooks in the “scripts/tutorials”
directory.
- Features – A list of features available in the library.
- Overview – A detailed but high-level overview of how Pylearn2 works. This is the
place to start if you want to really learn the library.
- Library Documentation – Documentation of the libray modules.
- Working with computer clusters – The tools we use at LISA for running Pylearn2 jobs on HPC clusters.
- Pylearn2 Vision – Some more detailed elaboration of some points of the Pylearn2 vision.