Thursday, October 04, 2012

scrape random movies from IMDB

I have written a scraper for IMDB. As it downloads pages, it builds a graph linking each title to everything the page refers to: actors/directors, genres, keywords, countries, languages, companies etc - basically features which could be fed into a recommender or a similar system.

See http://pastebin.com/ezkW0Ru1
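To give an idea of how such a graph could feed a recommender, here is a minimal sketch. It is not the code from the pastebin; the function names `build_graph` and `jaccard` are made up for illustration, and the edges are a hand-picked subset of the features of two titles from the sample output in this post. It stores the graph as adjacency sets and scores two titles by the Jaccard overlap of their features.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected title<->feature graph from (title, feature) pairs."""
    graph = defaultdict(set)
    for title, feature in edges:
        graph[title].add(feature)
        graph[feature].add(title)
    return graph


def jaccard(graph, a, b):
    """Jaccard similarity of the neighbour sets of nodes a and b."""
    fa = graph.get(a, set())
    fb = graph.get(b, set())
    union = fa | fb
    return len(fa & fb) / len(union) if union else 0.0


# Assumption: only a few features per title are listed here to keep it short.
edges = [
    ("tt1016481", "genre/Drama"),
    ("tt1016481", "genre/Romance"),
    ("tt1016481", "language/en"),
    ("tt1294723", "genre/Drama"),
    ("tt1294723", "genre/Short"),
    ("tt1294723", "language/en"),
]

graph = build_graph(edges)
similarity = jaccard(graph, "tt1016481", "tt1294723")
print(similarity)
```

Jaccard is just one simple choice; with the full feature sets one would probably want to weight rare features (a shared actor) higher than common ones (a shared language).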

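The output has one line per title: the title URL, a colon, then the space-separated features found for that title. It looks like this: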


http://www.imdb.com/title/tt1192995:genre/Animation genre/Family name/nm0784124/ name/nm1293791/ name/nm0265620/ name/nm0754781/ keyword/bear
http://www.imdb.com/title/tt1030901:genre/News country/jp language/ja
http://www.imdb.com/title/tt1016481:genre/Drama genre/Romance genre/Mystery name/nm0130215/ name/nm0280541/ name/nm0302384/ name/nm0309129/ name/nm0130191/ name/nm0560478/ name/nm0001607/ name/nm0908001/ name/nm0912604/ name/nm0133597/ name/nm0489010/ name/nm0005166/ name/nm0593411/ name/nm0929869/ keyword/soap keyword/tragedy keyword/betrayal keyword/shipper country/us language/en
http://www.imdb.com/title/tt1294723:genre/Short genre/Drama name/nm0430267/ name/nm3136900/ name/nm0018495/ name/nm0068168/ name/nm0231191/ name/nm0263099/ name/nm0341647/ name/nm0367731/ name/nm0792129/ name/nm0909848/ country/gb language/en company/co0248652/
http://www.imdb.com/title/tt1335935:genre/Documentary
http://www.imdb.com/title/tt1209362:
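Reading these lines back in is straightforward. The following is a rough sketch, not part of the scraper itself; `parse_line` is a hypothetical helper name, and it assumes the format is exactly what the sample above shows: features are separated by whitespace, each feature is `kind/value` with an optional trailing slash, and values contain no spaces.

```python
import re

# Title URL up to the tt-id, then a colon, then the (possibly empty) feature list.
LINE_RE = re.compile(r"^\S+/title/(tt\d+):(.*)$")


def parse_line(line):
    """Split one output line into (title_id, {kind: [values]})."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised line: %r" % line)
    title_id, rest = match.groups()
    features = {}
    for token in rest.split():
        # "name/nm0784124/" -> kind "name", value "nm0784124"
        kind, _, value = token.partition("/")
        features.setdefault(kind, []).append(value.rstrip("/"))
    return title_id, features


title_id, features = parse_line(
    "http://www.imdb.com/title/tt1030901:genre/News country/jp language/ja"
)
print(title_id, features)
```

A title for which nothing was found, like the last line above, simply yields an empty dictionary.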
