Google’s new Deep Learning Algorithm Transcribes House Numbers

During his summer internship, Ian Goodfellow (currently a PhD student at the UdeM LISA Lab) and his collaborators from Google, Yaroslav Bulatov, Julian Ibarz, Sacha Arnoud, and Vinay Shet, submitted a paper to ICLR 2014 proposing a deep learning method that successfully transcribes house numbers from Google Street View images. The work received wide coverage in online media [1, 2, 3].

[1] http://www.wired.co.uk/news/archive/2014-01/07/google-street-view-house-numbers

[2] http://motherboard.vice.com/blog/how-google-knows-your-house-number

[3] http://www.technologyreview.com/view/523326/how-google-cracked-house-number-identification-in-street-view/

3 comments to Google’s new Deep Learning Algorithm Transcribes House Numbers

  • [...] The idea behind deep learning is that instead of explicitly teaching the algorithm “cats vs. lizards,” you allow computers to learn those simpler components and then build on them, the way a child would learn first sounds, then words, then complete sentences. It’s an approach that’s proven remarkably effective, and has the potential to transform many of the algorithms that power our day-to-day experiences on the net, from a search engine that can understand the web pages it crawls to a photo-sharing site that can recognize the faces of your friends in the photos you upload, to a street-view service that can read the numbers on people’s front doors. [...]

  • [...] The idea behind deep learning is that instead of explicitly teaching the algorithm "cats vs. lizards," you allow computers to learn those simpler components and then build on them, the way a child would learn first sounds, then words, then complete sentences. It’s an approach that’s proven remarkably effective, and it has the potential to transform many of the algorithms that power our day-to-day experiences on the net, from a search engine that can understand the web pages it crawls to a photo-sharing site that can recognize the faces of your friends in the photos you upload, to a street-view service that can read the numbers on people’s front doors. [...]

  • [...] The idea behind deep learning is that instead of explicitly teaching the algorithm “cats versus lizards,” you give the computer the ability to learn the individual components and then put them together, like a child who first learns to make sounds, then to form words, and finally complete sentences. It is a proven, extraordinarily effective approach that has the potential to transform many of the algorithms that enrich our experience on the net day after day, from a search engine that can recognize the web pages it browses to a photo-sharing site that identifies your friends’ faces, to a street-view service that can read house numbers. [...]