Deep Learning - Digital Imaging

Image Scraping with Google

 When doing deep learning, you sometimes need to collect images to use as training data. This post is a quick tip for doing that easily with Google Image Search.


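 Once Google Image Search is showing the list of images you want, run the JavaScript below in the console of your browser's developer tools (Google Chrome recommended). The source URLs of the images shown in the search results are then downloaded as a CSV file.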

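// Read the source URL of each result item from the search results page's HTML and collect them into an array
// (each .rg_meta element holds JSON whose "ou" field is the source image URL)
const urls = Array.from(document.querySelectorAll('.rg_di .rg_meta')).map(el => JSON.parse(el.textContent).ou);

// Emit the URLs as a CSV with one URL per line, joined by newline characters
// (encodeURIComponent replaces the deprecated escape, which does not produce valid percent-encoding for non-ASCII text)
window.open('data:text/csv;charset=utf-8,' + encodeURIComponent(urls.join('\n')));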

 Running the above downloads the list of source URLs for the displayed panda images as a file named download.csv.

https://upload.wikimedia.org/wikipedia/commons/0/0f/Grosser_Panda.JPG
https://ichef.bbci.co.uk/news/660/cpsprodpb/4FA0/production/_108848302_a0d15811-30d8-4a51-8dd3-ab45f3dbc387.jpg
https://cdn.livekindly.co/wp-content/uploads/2018/09/panda-1.jpg
https://c402277.ssl.cf1.rackcdn.com/photos/11551/images/hero_small/Bernard_de_wetter_wwf_canon_113974.jpg?1462218465
https://ichef.bbci.co.uk/news/660/cpsprodpb/169F6/production/_91026629_gettyimages-519508400.jpg
https://www.motherjones.com/wp-content/uploads/2018/06/panda-research-6-27-18-2.jpg?w=990
https://cbsnews1.cbsistatic.com/hub/i/2016/08/26/cdf56aa8-1f2a-4d44-8cac-ab5993ee7d18/gettyimages-594359398.jpg
https://scx1.b-cdn.net/csz/news/800/2019/giantpandame.jpg
https://ca-times.brightspotcdn.com/dims4/default/adc09a0/2147483647/strip/true/crop/2048x1152+0+0/resize/840x473!/quality/90/?url=https%3A%2F%2Fca-times.brightspotcdn.com%2F2a%2Fb3%2F0952d2af911f85896ec2d937de29%2Fsd-1553375117-bc1xhf1twl-snap-image
https://static.scientificamerican.com/sciam/cache/file/ACF0A7DC-14E3-4263-93F438F6DA8CE98A_source.jpg?w=590&h=800&896FA922-DF63-4289-86E2E0A5A8D76BE1
https://c402277.ssl.cf1.rackcdn.com/photos/13100/images/magazine_large/BIC_128.png?1485963152
https://www.nationalgeographic.com/content/dam/animals/2018/08/giant-pandas-vitale/panda-cubs-group.adapt.1900.1.jpg
https://static.independent.co.uk/s3fs-public/thumbnails/image/2019/01/31/14/panda-bamboo.jpg
https://cdn1.i-scmp.com/sites/default/files/styles/1200x800/public/images/methode/2019/01/25/f4e875a8-1ee6-11e9-9b66-f8d7b487d426_image_hires_010852.jpg?itok=jpygj9or&v=1548349734
https://www.sciencemag.org/sites/default/files/styles/inline__450w__no_aspect/public/panda_16x9.jpg?itok=8a0-7WSj
https://crosstalk.cell.com/hs-fs/hubfs/Images/Jennifer%20Levine/New%20Insights%20into%20pandas,/6990634-panda-hug.jpg?width=2560&height=1600&name=6990634-panda-hug.jpg
https://www.worldatlas.com/r/w728-h425-c728x425/upload/59/17/19/shutterstock-688280269.jpg
https://cdn.newsapi.com.au/image/v1/fae23ecd7187c894038be4907c079025
https://img.etimg.com/thumb/height-450,width-800,imgsize-126408,msid-49662050/scientists-decode-giant-panda-language-in-china.jpg
https://www.dw.com/image/42524546_303.jpg
https://media.tacdn.com/media/attractions-splice-spp-674x446/07/26/3c/87.jpg
https://i0.wp.com/www.redpandanetwork.org/wp-content/uploads/2019/07/Beautiful-SM-size-.png?resize=800%2C800&ssl=1
https://i.guim.co.uk/img/media/40c2aed00bdb51f8261a48e1c9dfab64ca5c4b15/263_460_3014_1809/master/3014.jpg?width=300&quality=85&auto=format&fit=max&s=9a094474706a8342ef92e473de350d12
https://cdn.mos.cms.futurecdn.net/3n8tRry6fYg7sNyhFDPQwR-320-80.jpg
https://ichef.bbci.co.uk/news/624/cpsprodpb/F58E/production/_109326826_gettyimages-1180602673-594x594.jpg
https://www.straitstimes.com/sites/default/files/styles/article_pictrure_780x520_/public/articles/2019/09/19/yq-pandathai-19092022.jpg?itok=RH3gsT_1&timestamp=1568879977
https://thumbs-prod.si-cdn.com/OfnYT3mpQGkbP9UFQjSbz3w_xYs=/800x600/filters:no_upscale()/https://public-media.si-cdn.com/filer/c5/dd/c5dd326f-7983-4b4e-a555-bf4f8ac40769/panda_cub_from_wolong_sichuan_china.jpg
http://img2.chinadaily.com.cn/images/201905/20/5ce1e456a3104842e4ae5bae.jpeg
http://blogs.discovermagazine.com/d-brief/files/2014/07/panda.jpg
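
 As a follow-up, the URL list still has to be turned into actual image files before it can serve as training data. The following is only a minimal sketch of one way to do that, not part of the original tip: it assumes Node.js 18 or later (for the built-in fetch), and the names parseUrlList, fileNameFor and downloadAll, as well as the numbered file naming scheme, are hypothetical choices made for illustration.

```javascript
// Sketch: download every image listed in download.csv.
// Assumption: Node.js 18+ so that fetch is available globally.
const fs = require('node:fs/promises');
const path = require('node:path');

// Turn the CSV text (one URL per line) into a de-duplicated list of http(s) URLs.
function parseUrlList(csvText) {
  const seen = new Set();
  for (const line of csvText.split(/\r?\n/)) {
    const candidate = line.trim();
    if (!candidate) continue; // skip blank lines
    try {
      const { protocol } = new URL(candidate);
      if (protocol === 'http:' || protocol === 'https:') seen.add(candidate);
    } catch {
      // The line is not a parseable URL, so it is skipped.
    }
  }
  return [...seen];
}

// Build a local file name: a zero-padded index plus the extension found in
// the URL path, falling back to .jpg when the path has no known image extension.
function fileNameFor(url, index) {
  const ext = path.posix.extname(new URL(url).pathname).toLowerCase();
  const safeExt = /^\.(jpe?g|png|gif|webp)$/.test(ext) ? ext : '.jpg';
  return String(index).padStart(4, '0') + safeExt;
}

// Download each URL in turn; a failed request is logged and skipped so that
// one dead link does not abort the whole run.
async function downloadAll(csvPath, outDir) {
  const urls = parseUrlList(await fs.readFile(csvPath, 'utf8'));
  await fs.mkdir(outDir, { recursive: true });
  for (const [i, url] of urls.entries()) {
    try {
      const res = await fetch(url);
      if (!res.ok) throw new Error(`HTTP ${res.status}`);
      const data = Buffer.from(await res.arrayBuffer());
      await fs.writeFile(path.join(outDir, fileNameFor(url, i)), data);
    } catch (err) {
      console.warn(`skipped ${url}: ${err.message}`);
    }
  }
}
```

 With the block saved in a script, calling downloadAll('download.csv', 'images') would write the files into an images directory (both names are placeholders). The extension is only guessed from the URL path, so it is worth checking the downloaded files before training, since some links in a list like the one above may no longer resolve or may return something other than an image.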