Open Discussion of ICLR 2016 is now open

Open discussion of #ICLR2016 submissions is now open:

Access requires a CMT account. If you don’t have one already, go here:

Note that the assigned reviewers and area chair of each paper will be encouraged to consider the public comments in their evaluation of submissions.
Your comments will thus be very useful and appreciated!


Hugo Larochelle’s Google+ Post:


Software Developer Position at MILA

At MILA (Montreal Institute for Learning Algorithms), we are looking for a software developer to help us improve our software libraries (mostly Theano) and to assist with other related tasks.

This is a one-year, full-time contract position.
The duration of the contract could be extended depending on available funding.
If you are interested, please send your CV to Frédéric Bastien at “frederic.bastien AT” with “Software Developer Position at MILA” as the email subject.
Candidates need to be authorized to work in Canada.

Deep Learning Summer School 2015 Videos

The videos of the recently organized “Deep Learning Summer School 2015” in Montreal are now available online:



Long Short-Term Memory dramatically improves Google Voice etc – now available to a billion users

A type of recurrent neural network architecture initially developed in Juergen Schmidhuber’s research groups at the Swiss AI Lab IDSIA and TU Munich has greatly improved Google Voice (by 49%) and is now available to a billion users. You can find the recent Google Research Blog post on this, by Haşim Sak, Andrew Senior, Kanishka Rao, Françoise Beaufays and Johan Schalkwyk:

This is a speech recognition application of “Long Short-Term Memory (LSTM)” Recurrent Neural Networks (since 1997) [1] with “forget gates” (since 1999) [2] and “Connectionist Temporal Classification (CTC)” (since 2006) [3].
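
For readers unfamiliar with the architecture, here is a minimal NumPy sketch of a single LSTM time step with a forget gate. It is only an illustration under assumptions: the function name `lstm_step`, the stacked weight layout `W`, and the gate ordering are choices made for this example, not Google's implementation, and CTC training is not shown.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step with input, forget and output gates.

    x: input vector of size D; h_prev, c_prev: previous hidden and cell
    state of size H. W has shape (4*H, D+H) and b has shape (4*H,); the
    stacking order (input, forget, output, candidate) is an assumption
    of this sketch.
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0:H])          # input gate: how much new content to write
    f = sigmoid(z[H:2 * H])      # forget gate: how much old cell state to keep
    o = sigmoid(z[2 * H:3 * H])  # output gate: how much of the cell to expose
    g = np.tanh(z[3 * H:4 * H])  # candidate cell content
    c = f * c_prev + i * g       # new cell state
    h = o * np.tanh(c)           # new hidden state
    return h, c
```

With the forget gate saturated near 1 the cell state is carried over almost unchanged, which is what lets the network remember over long time lags; with it near 0 the old state is discarded.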

Google also uses LSTMs for numerous other applications, such as state-of-the-art machine translation [4], image caption generation [5], natural language processing, etc.


[1] S. Hochreiter and J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735-1780, 1997.

[2] F. Gers, N. Schraudolph, J. Schmidhuber. Learning precise timing with LSTM recurrent networks. Journal of Machine Learning Research 3:115-143, 2002.

[3] A. Graves, S. Fernandez, F. Gomez, J. Schmidhuber. Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. Proceedings of the International Conference on Machine Learning (ICML-06, Pittsburgh), 2006.

[4] I. Sutskever, O. Vinyals, Q. V. Le. Sequence to sequence learning with neural networks. Technical Report [cs.CL], Google. NIPS’2014.

[5] O. Vinyals, A. Toshev, S. Bengio, D. Erhan. Show and Tell: A Neural Image Caption Generator.

A Brief Summary of the Panel Discussion at DL Workshop @ICML 2015 by Kyunghyun Cho

You can also read the original post on Kyunghyun Cho’s blog, DeepRNN.


The finale of the Deep Learning Workshop at ICML 2015 was the panel discussion on the future of deep learning. After a couple of weeks of extensive discussion and exchange of emails among the workshop organizers, we invited six panelists: Yoshua Bengio (University of Montreal), Neil Lawrence (University of Sheffield), Juergen Schmidhuber (IDSIA), Demis Hassabis (Google DeepMind), Yann LeCun (Facebook, NYU) and Kevin Murphy (Google). As the recent deep learning revolution has come from both academia and industry, we tried our best to balance the panel so that the audience could hear from experts in both. Before I say anything more, I would like to thank the panelists for having accepted the invitation!

Max Welling (University of Amsterdam) moderated the discussion, and personally, I found his moderation to be perfect. A very tight schedule of one hour, with six amazing panelists, on the grand topic of the future of deep learning: I cannot think of anyone who could’ve done a better job than Max. On behalf of all the other organizers (note that Max Welling is also one of the workshop organizers), I thank him a lot!

Now that the panel discussion is over, I’d like to leave a brief note of what I heard from the six panelists here. Unfortunately, only as the panel discussion began did I realize that I didn’t have a notepad with me. I furiously went through my backpack and found a paper I needed to review. In other words, due to the lack of space, my record here is likely neither precise nor extensive.

I’m writing this on the plane, and forgive me for any error below (or above.) I wanted to write it down before the heat from the discussion cools down. Also, almost everything inside quotation marks is not an exact quote but a paraphrased one.

On the present and future of deep learning

Bengio began by noting that natural language processing (NLP) has not been revolutionized by deep learning, though there has been huge progress during the last year. He believes NLP has the potential to become the next big thing for deep learning. He also wants more effort invested in unsupervised learning, a view echoed by LeCun, Hassabis and Schmidhuber.

Interestingly, four of the six panelists (LeCun, Hassabis, Lawrence and Murphy) named medicine/healthcare as a next big thing for deep/machine learning. Some of the areas they expressed interest in were medical image analysis (LeCun) and drug discovery (Hassabis). Regarding this, I believe Lawrence is already pushing in this direction (DeepHealth, from his earlier talk on the same day), and it’ll be interesting to contrast his approach with those from Google DeepMind and Facebook later.

LeCun and Hassabis both picked Q&A and natural language dialogue systems as next big things. I especially liked how LeCun put these in the context of incorporating reasoning based on knowledge, its acquisition and planning into neural networks (or, as a matter of fact, any machine learning model). This was echoed by both Hassabis and Schmidhuber.

Schmidhuber and Hassabis named sequential decision making as the next important research topic. Schmidhuber’s example of Capuchin monkeys was both inspiring and fun (not only because he mistakenly pronounced it as a cappuccino monkey). In order to pick a fruit at the top of a tree, a Capuchin monkey effortlessly plans a sequence of sub-goals (e.g., walk to the tree, climb the tree, grab the fruit, …). Schmidhuber believes that we will have machines with animal-level intelligence (like a Capuchin smartphone?) in 10 years.

Slightly different from the other panelists, Lawrence and Murphy are more interested in transferring the recent success of deep learning to tasks/datasets that humans cannot solve well (let me just call these kinds of tasks ‘non-cognitive’ tasks for now). Lawrence noted that the success of deep learning so far has largely been confined to tasks humans can do effortlessly, but the future may lie with non-cognitive tasks. When it comes to these non-cognitive tasks, the interpretability of trained models will become more valuable, Murphy noted.

Hierarchical planning, knowledge acquisition and the ability to perform non-cognitive tasks naturally lead to the idea of an automated laboratory, as Murphy and Schmidhuber explained. In this automated laboratory, a machine will actively plan its goals to expand its knowledge of the world (by observation and experiments) and provide insights into the world (interpretability).

On Industry vs. Academia

One surprising remark from LeCun was that he believes the gap between the infrastructures at the industry labs and academic labs will shrink over time, not widen. This will be great, but I am more pessimistic than he is.

LeCun went on to explain the open research effort at Facebook AI Research (FAIR). According to him, there are three reasons why industry (not only FAIR) should push open science: (1) this is how research advances in general, (2) this makes a company more attractive to prospective employees/researchers and (3) there’s competition among different companies in research, and this is the way to stay ahead of others.

To my surprise, according to Hassabis, Google DeepMind (DeepMind from here on) and FAIR have agreed to share a research software framework based on Torch. I vaguely remember hearing that this was under discussion some weeks or months ago, but apparently it has happened. I believe this will further speed up research at both FAIR and DeepMind. It remains to be seen, though, whether it will be beneficial to other research facilities (like universities) for the two places with the highest concentration of deep learning researchers in the world to share and use the same code base.


Hassabis, Lawrence, Murphy and Bengio all believe that the enormous resources available in industry labs are not necessarily an issue for academic labs. Lawrence pointed out that, apart from data-driven companies (think of Google and Facebook), most companies in the world are suffering from the abundance of data rather than enjoying it, which opens a large opportunity for researchers in academic labs. Murphy compared research in academia these days to the Russians during the space race between the US and Russia. The lack of resources may prove useful, or even necessary, for algorithmic breakthroughs, which Bengio and Hassabis still find important. Furthermore, Hassabis suggested finding tasks or problems where one can readily generate artificial data, such as games.


Schmidhuber’s answer was the most unique one here. He believes that the code for truly working AI agents will be so simple and short that eventually high school students will play around with it. In other words, there won’t be any worry of industries monopolizing AI and its research. Nothing to worry at all!

On the Hype and the Potential Second NN Winter

As he is asked about overhyping every time he is interviewed by a journalist, LeCun started on this topic. Overhyping is dangerous, said LeCun, and there are four factors: (1) self-deluded academics who need funding, (2) startup founders who need funding, (3) program managers of funding agencies who manage funding and (4) failed journalism (which probably also needs funding/salary). Recently in the field of deep learning, the fourth factor has played a major role, and surprisingly, not all news articles have been the result of the PR machines at Google and Facebook. LeCun would prefer that journalists call researchers before writing potential nonsense.

LeCun and Bengio believe that a potential solution, both to avoid overhyping and to speed up progress in research, is the open review system, where (real) scientists/researchers put their work online and publicly comment on it so as to let people see both the upside and downside of a paper (and why this paper alone won’t cause the singularity). Pushing it further, Murphy pointed out the importance of open-sourcing research software, which lets other people more easily understand the weaknesses or limitations of newly proposed methods. He added, though, that it is important for authors themselves to clearly state the limitations of their approaches whenever they write a paper. Of course, this requires what Leon Bottou said in his plenary talk: reviewers should encourage the discussion of limitations, not kill the paper because of them.

Similarly, Lawrence proposed that we, researchers and scientists, should slowly but surely approach the public more. If we can’t trust journalists, then we may need to do it ourselves. A good example he pointed to is the “Talking Machines” podcast by Ryan Adams and Katherine Gorman.

Hassabis agrees that overhyping is dangerous, but also believes that there will be no third AI/NN winter, because we now know better what caused the previous AI/NN winters and we are better at not promising too much. If I may add my own opinion here, I agree with Hassabis, especially because neural networks are now widely deployed in commercial applications (think of Google Voice), which makes another NN winter even less likely (I mean, it works!).

Schmidhuber also agreed with all the other panelists that there won’t be any more AI/NN winters, but for yet another reason: the advances in hardware technology toward “more RNN-like (hence brain-like) architectures”. He believes it is time to move on to hardware technologies that are more suitable for neural networks, more specifically recurrent neural nets: “a small 3D volume with lots of processors connected by many short and few long wires.”

One comment from Murphy was my favourite; ‘it is simply human nature.’

On AI Fear and Singularity

Apparently Hassabis of DeepMind has been at the core of the recent AI fear voiced by prominent figures such as Elon Musk, Stephen Hawking and Bill Gates. Hassabis introduced AI to Musk, which may have alarmed him. However, in recent months, Hassabis has convinced Musk, and also had a three-hour-long chat with Hawking about this. According to him, Hawking is less worried now. However, he emphasized that we must be ready for the future rather than fear it.

Murphy found this kind of AI fear and discussion of the singularity to be a huge distraction. There are so many other major problems in the world that require much more immediate attention, such as climate change and growing inequality. This kind of AI fear is simply oversold speculation and needs to stop, a point on which both Bengio and LeCun agree. Similarly, Lawrence does not find the fear of AI the right problem to be worried about. Rather, he is more concerned with the issue of digital oligarchy and data inequality.

One interesting remark from LeCun was that we must be careful to distinguish between intelligence and quality. Most of the problematic human behaviours that make many people fear human-like AI are caused by human quality, not intelligence. An intelligent machine need not inherit human quality.

Schmidhuber had a unique view on this matter. He believes that we will see a community of AI agents consisting of both smart ones and dumb ones. They will be more interested in each other (just as ten-year-old girls are more interested in and hang out with other ten-year-old girls, and Capuchin monkeys are interested in hanging out with other Capuchin monkeys), and may not be very interested in humans. Furthermore, he believes AI agents will be significantly smarter than humans (or rather himself) without those human qualities that he does not like in himself, which is in line with LeCun’s remark.

Questions from the Audience

Unfortunately, I was carrying the microphone around during this time and consequently couldn’t take any notes. There were excellent questions (for instance, from Tijmen Tieleman) and responses from the panelists. If anyone reading this remembers those questions and answers, please share them in the comment section.

One question I remember came from Tieleman. He asked the panelists for their opinions on active learning/exploration as an option for efficient unsupervised learning. Schmidhuber and Murphy responded, and I really liked their response. In short (or as far as I can trust my memory), active exploration will happen naturally as a consequence of rewarding better explanations of the world. Knowledge of the surrounding world and its accumulation should be rewarded, and to maximize this reward, an agent or an algorithm will actively explore the surrounding area (even without supervision). According to Murphy, this may reflect how babies learn so quickly without much supervising signal or even without much unsupervised signal (their way of active exploration compensates for the lack of unsupervised examples by allowing a baby to collect high-quality unsupervised examples).
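
To make the idea of rewarding accumulated knowledge a little more concrete, here is a toy Python sketch. It is not something the panelists presented: the reward definition (reduction in prediction error at the visited place), the greedy choice rule, and the names `explore`, `true_values` and `lr` are all assumptions made for illustration.

```python
import numpy as np

def explore(true_values, steps, lr=0.5):
    """Greedy curiosity-driven exploration over a small set of 'places'.

    The agent keeps an estimate of each place's value. Its intrinsic reward
    for a visit is how much its prediction error shrinks there, and it always
    goes to the place where its last visit taught it the most.
    """
    true_values = np.asarray(true_values, dtype=float)
    estimate = np.zeros_like(true_values)               # the agent's model of the world
    expected_gain = np.full(len(true_values), np.inf)   # unvisited places look most promising
    visits, total_reward = [], 0.0
    for _ in range(steps):
        place = int(np.argmax(expected_gain))           # go where the most learning is expected
        error_before = abs(true_values[place] - estimate[place])
        estimate[place] += lr * (true_values[place] - estimate[place])
        error_after = abs(true_values[place] - estimate[place])
        reward = error_before - error_after             # knowledge gained is the only reward
        expected_gain[place] = reward
        total_reward += reward
        visits.append(place)
    return estimate, visits, total_reward
```

No external supervision tells the agent where to go; it first visits every place once and then keeps returning to the places it still predicts poorly, while a place it already predicts perfectly yields no reward and is left alone.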

I had the honor of asking the last question, directed mainly at Hassabis, LeCun and Murphy, on what companies would do if they (accidentally or intentionally) built a truly working AI agent (in whatever sense). Would they conceal it, thinking that the world is not ready for this? Would they keep it secret because of potential opportunities for commercialization? Let me give a brief summary of their responses (as I remember them, but again, I couldn’t write them down then).

All of them said that it won’t happen like that (one accident resulting in a thinking machine). Because of this, LeCun does not find it concerning, as this will happen gradually as a result of the joint efforts of many scientists in both industry and academia. Hassabis shares LeCun’s view, and also couldn’t imagine that this kind of discovery, had it happened, could be contained (it would probably be the best leak in human history). However, he argued for getting ready for a future where we humans will have access to truly thinking machines, a sentiment I share. Murphy agreed with both LeCun and Hassabis. Together with LeCun, he made a remark about a recently released movie, Ex Machina (which is, by the way, my favourite this year so far): it’s a beautifully filmed movie, but nothing like that will happen.

I agree with all the points they made. There was, though, another reason behind my question, which unfortunately they did not discuss (undoubtedly due to the time constraint). That is, once we have algorithms or machinery that are “thinking”, and say the few most important pieces were developed at a couple of commercial companies (like the ones Hassabis, LeCun and Murphy work at), who will have the right to those crucial components? Will they belong to those companies or individuals? Will they have to be made public (something like a universal right to artificial intelligence)? And, most importantly, who will decide any of this?


Obviously, there is no conclusion. It is an ongoing effort, and I, or we the organizers, hope that this panel discussion has been successful at shedding at least a bit of light on the path toward the future of deep learning as well as general artificial intelligence (though Lawrence pointed out the absurdity of this term by quoting Zoubin Ghahramani: ‘if a bird flying is flying, then is a plane flying artificial flying?’).

But let me point out a few things that I’ve personally found very interesting and inspiring:

  1.  Unsupervised learning as reinforcement learning and automated laboratory: instead of taking into account every single unlabeled example as it is, we should let a model selectively consider a subset of unlabeled examples to maximize a reward defined by the amount of accumulated knowledge.
  2.  Overhyping can be avoided largely by the active participation of researchers in distributing latest results and ideas, rather than by letting non-experts explain them to non-experts. Podcasting, open reviewing and blogging may help, but there’s probably no one right answer here.
  3.  I don’t think there was any one agreement on industry vs. academia. However, I felt that the academic panelists and the industry panelists all agree that each side has its own role (sometimes overlapping) toward a single grand goal.
  4.  Deep learning has been successful at what humans are good at (e.g., vision and speech), and in the future we as researchers should also explore tasks/datasets that humans are not particularly good at (or only become good at after years and years of special training). In this sense, medicine/healthcare seems to be one area that most of the panelists are interested in and are probably investing in.

When it comes to the format of the panel discussion, I liked it in general, but of course, as usual with anything, there were a few unsatisfactory things. The most unsatisfactory thing was the time constraint (1 hour) we set ourselves. We gathered six amazing panelists who have so much to share with the audience and the world, but on average only 10 minutes per panelist were allowed. As one of the organizers, I admit this is partly my fault in planning. It would’ve been even better if the panel discussion had been scheduled to last the whole day, with more panelists, more topics and more audience involvement (at least, I would’ve loved it!). But, of course, a three-day-long workshop would have been way out of our league.

Another thing I think can be improved is the one-time nature of the discussion. It may be possible to make this kind of panel discussion a yearly event. It could be co-located with a workshop, or even be held online. As many of the panelists pointed out, this can help us (and others) avoid overhyping our research results or the future of the whole field of machine learning, and would be a great way to reach a much larger audience, including both senior and junior researchers as well as the informed/interested public. Maybe I, or you who are reading this, should email the hosts of “Talking Machines” and suggest it.


A comment from Juergen Schmidhuber


Schmidhuber read this post and emailed me with his comment to clarify a few things. With his permission I am putting here his comment as it is:
Thanks for your summary! I think it would be good to publish the precise transcript. Let me offer a few clarifications for now:
1. Why no additional NN winter? The laws of physics force our hardware to become more and more 3D-RNN-like (and brain-like): densely packed processors connected by many short and few long wires, e.g., Nature seems to dictate such 3D architectures, and that’s why both fast computers and brains are the way they are.  That is, even without any biological motivation, RNN algorithms will become even more important – no new NN winter in sight.
2. On AI fear: I didn’t say “Nothing to worry at all!” I just said we may hope for some sort of protection from supersmart AIs of the far future through their widespread lack of interest in us, like in this comment:
And in the near future there will be intense commercial pressure to make very friendly, not so smart AIs that keep their users happy. Unfortunately, however, a child-like AI could also be trained by sick humans to become a child soldier, which sounds horrible. So I’d never say “Nothing to worry at all!” Nevertheless, silly goal conflicts between robots and humans in famous SF movie plots (Matrix, Terminator) don’t make any sense.


Recent Reddit AMA’s about Deep Learning

Recently Geoffrey Hinton, Yann LeCun and Yoshua Bengio held reddit AMAs in which subscribers of r/MachineLearning asked them questions. Each AMA contains interesting anecdotes about deep learning from the most prominent scientists in the field.

Yoshua Bengio’s reddit AMA:

Yann LeCun’s reddit AMA:

Geoffrey Hinton’s AMA:

In addition to those, Michael I Jordan, an influential scientist in the machine learning field, held a reddit AMA as well, in which he made some remarks about deep learning in response to questions.

Michael I Jordan’s AMA:

Google DeepMind Teams Up with Oxford University

DeepMind, the startup acquired by Google for $500M, has established a new collaboration with the University of Oxford. The news was announced in a blog post [1] by Demis Hassabis, co-founder of DeepMind and VP of engineering at Google. Deep learning researchers Prof Nando de Freitas, Prof Phil Blunsom, Dr Edward Grefenstette and Dr Karl Moritz Hermann from the University of Oxford, who teamed up earlier this year to co-found Dark Blue Labs, have been hired by DeepMind. Also joining DeepMind from the University of Oxford are Dr Karen Simonyan, Max Jaderberg and Prof Andrew Zisserman, one of the world’s foremost experts on computer vision systems, who recently founded a start-up called Vision Factory [1,2].

The three professors hired by DeepMind will hold joint appointments at Oxford University, where they will continue to spend part of their time.

[1] Teaming up with Oxford University on Artificial Intelligence. Last retrieved on: 24-10-2014.

[2] Google’s DeepMind Acqui-Hires Two AI Teams In The UK, Partners With Oxford. Last retrieved on: 24-10-2014.

Call For Papers: ICLR 2015

3rd International Conference on Learning Representations (ICLR2015)

Website:
Submission deadline: December 19, 2014
Location: Hilton San Diego Resort & Spa, May 7-9, 2015


It is well understood that the performance of machine learning methods is heavily dependent on the choice of data representation (or features) on which they are applied. The rapidly developing field of representation learning is concerned with questions surrounding how we can best learn meaningful and useful representations of data. We take a broad view of the field, and include in it topics such as deep learning and feature learning, metric learning, kernel learning, compositional models, non-linear structured prediction, and issues regarding non-convex optimization.

Despite the importance of representation learning to machine learning and to application areas such as vision, speech, audio and NLP, there was no venue for researchers who share a common interest in this topic. The goal of ICLR has been to help fill this void.

A non-exhaustive list of relevant topics:
– unsupervised, semisupervised, and supervised representation learning
– metric learning and kernel learning
– dimensionality expansion
– sparse modeling
– hierarchical models
– optimization for representation learning
– learning representations of outputs or states
– implementation issues, parallelization, software platforms, hardware
– applications in vision, audio, speech, natural language processing, robotics, neuroscience, or any other field

The program will include keynote presentations from invited speakers, oral presentations, and posters. This year, the program will also include a joint session with AISTATS.

ICLR’s Two Tracks

ICLR has two publication tracks.

Conference Track: These papers are reviewed as standard conference papers. Papers should be between 6-9 pages in length. Accepted papers will be presented at the main conference as either an oral or poster presentation and will be included in the official proceedings. A subset of accepted conference track papers will be selected to participate in a JMLR special topics issue on the subject of Representation Learning. Authors of the selected papers will be given an opportunity to extend their original submissions with supplementary material.

Workshop Track: Papers submitted to this track are ideally 2-3 pages long and describe late-breaking developments. This track is meant to carry on the tradition of the former Snowbird Learning Workshop. These papers are non-archival workshop papers, and therefore may be published elsewhere.

Note that submitted conference track papers that are not accepted to the conference proceedings are automatically considered for the workshop track.

ICLR Submission Instructions

1. Authors should post their submissions (both conference and workshop tracks) on arXiv:

2. Once the arXiv paper is publicly visible (there can be an approx. 30 hour delay), authors should go to the openreview ICLR2015 website to submit to either the conference track or the workshop track.

To register on the openreview ICLR2015 website, the submitting author must have a Google account.

For more information on paper preparation, including style files and the URL for the openreview ICLR2015 website, please see

Submission deadline: December 19, 2014

Notes:
i. Regarding the conference submission’s 6-9 page limits, these are really meant as guidelines and will not be strictly enforced. For example, figures should not be shrunk to illegible size to fit within the page limit. However, in order to ensure a reasonable workload for our reviewers, papers that go beyond the 9 pages should be formatted to include a 9 page submission and a separate supplementary material submission that will be optionally reviewed. If the paper is selected for the JMLR special topic issue, this supplementary material can be incorporated into the final journal version.
ii. Workshop track submissions should be formatted as a short paper, with introduction, problem statement, brief explanation of solution, figure(s) and references. They should not merely be abstracts.
iii. Paper revisions will be permitted, and in fact are encouraged, in response to comments from and discussions with the reviewers (see “An Open Reviewing Paradigm” below).
iv. Authors are encouraged to post their papers to arXiv early enough that the paper has an arXiv number and URL by the submission deadline of 19 Dec. 2014. However, if these are not yet available, authors have up to one week after the submission deadline to provide the arXiv number and URL. At submission time, simply provide the title, authors, abstract, and temporary arXiv number indicating that the paper has been submitted to arXiv.

An Open Reviewing Paradigm

1. Submissions to ICLR are posted on arXiv prior to being submitted to the conference.

2. Authors submit their paper to either the ICLR conference track or workshop track via the openreview ICLR2015 website.

3. After the authors have submitted their papers via, the ICLR program committee designates anonymous reviewers as usual.

4. The submitted reviews are published without the name of the reviewer, but with an indication that they are the designated reviews.

5. Anyone can openly (non-anonymously) write and publish comments on the paper. Anyone can ask the program chairs for permission to become an anonymous designated reviewer (open bidding). The program chairs have ultimate control over the publication of each anonymous review. Open commenters will have to use their real names, linked with their Google Scholar profiles.

6. Authors can post comments in response to reviews and comments. They can revise the paper as many times as they want, possibly citing some of the reviews. Reviewers are expected to revise their reviews in light of paper revisions.

7. The review calendar includes a generous amount of time for discussion between the authors, anonymous reviewers, and open commentators. The goal is to improve the quality of the final submissions.

8. The ICLR program committee will consider all submitted papers, comments, and reviews and will decide which papers are to be presented in the conference track, which are to be presented in the workshop track, and which will not appear at ICLR.

9. Papers that are presented in the workshop track or are not accepted will be considered non-archival, and may be submitted elsewhere (modified or not), although the ICLR site will maintain the reviews, the comments, and the links to the arXiv versions.

General Chairs

Yoshua Bengio, Université de Montreal
Yann LeCun, New York University and Facebook

Program Chairs

Brian Kingsbury, IBM Research
Samy Bengio, Google
Nando de Freitas, University of Oxford
Hugo Larochelle, Université de Sherbrooke


The organizers can be contacted at

Google’s Entry to ImageNet 2014 Challenge

The ImageNet 2014 competition is one of the largest and most challenging computer vision challenges. It is held annually, and each year it attracts top machine learning and computer vision researchers. Neural networks, specifically convolutional neural networks, again made a big impact on the results of this year’s challenge [1]. Google’s approach won the classification and object recognition challenges. Google used a new variant of convolutional neural network called “Inception” for classification, and R-CNN [5] for detection. The results and the approach that Google’s team took are summarized here [2, 3]. Google’s team was able to train a much smaller neural network and obtain much better results than those obtained with convolutional neural networks in the previous year’s challenge. Andrej Karpathy, one of the organizers of the competition, summarized his experience and the challenge itself in his blog post [4].

[1] Imagenet 2014 LSVRC results. Last retrieved on: 19-09-2014.

[2] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich, Going Deeper with Convolutions, Arxiv Link:

[3] GoogLeNet presentation. Last retrieved on: 19-09-2014.

[4] What I learned from competing against a convnet on imagenet. Last retrieved on: 19-09-2014.

[5] Girshick, Ross, et al. “Rich feature hierarchies for accurate object detection and semantic segmentation.” arXiv preprint arXiv:1311.2524 (2013).

Andrew Ng is hired by Baidu

Andrew Ng, who is one of the co-founders of Coursera, an ex-employee of Google, a professor at Stanford University and an important contributor to machine learning, has just been hired by Baidu [1,2,3]. Andrew Ng is going to take on the role of Chief Scientist at Baidu in Silicon Valley. Adam Coates, previously a PhD student and postdoc of Andrew Ng, is going to join Baidu as well, and his research is going to focus mainly on unsupervised learning algorithms.