[Dev log] Python image crawling

Godwony 2020. 4. 22. 17:35

Let's crawl images from Naver.

See the code below for reference.

On any site, a careful look at the HTML structure is enough to make crawling feasible.
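
As a rough illustration of what "analyzing the HTML structure" means in practice, here is a minimal sketch that pulls image URLs out of a page using only the standard library's `html.parser`. The class name `ImgSrcCollector`, the helper `collect_img_sources`, and the sample markup are all hypothetical and only serve the example; real pages will need their own tag and class names.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src of every <img> tag, optionally filtered by CSS class."""

    def __init__(self, css_class=None):
        super().__init__()
        self.css_class = css_class
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag != 'img':
            return
        attrs = dict(attrs)
        classes = (attrs.get('class') or '').split()
        # Skip images that do not carry the requested class
        if self.css_class and self.css_class not in classes:
            return
        src = attrs.get('src')
        # Keep only real URLs; inline data: URIs are ignored
        if src and src.startswith('http'):
            self.sources.append(src)


def collect_img_sources(html, css_class=None):
    parser = ImgSrcCollector(css_class)
    parser.feed(html)
    return parser.sources


# Hypothetical markup, for illustration only
sample_html = '''
<div class="photo_grid">
  <img class="_img" src="https://example.com/a.jpg">
  <img class="_img" src="data:image/gif;base64,R0lGOD">
  <img class="logo" src="https://example.com/logo.png">
</div>
'''
print(collect_img_sources(sample_html, css_class='_img'))
# ['https://example.com/a.jpg']
```

Once the tag and class that wrap the images are known, the same pair can be used as a CSS selector in Selenium.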

Always take care when crawling.
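
One concrete form that care can take is checking a site's robots.txt rules before fetching a URL. The sketch below uses the standard library's `urllib.robotparser`; the helper name `is_allowed` and the sample rules are hypothetical, and the rules are passed in as lines so the example runs without network access.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, url, user_agent='*'):
    """Return True if the given robots.txt lines permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, for illustration only
sample_rules = [
    'User-agent: *',
    'Disallow: /private/',
]
print(is_allowed(sample_rules, 'https://example.com/images/a.jpg'))   # True
print(is_allowed(sample_rules, 'https://example.com/private/a.jpg'))  # False
```

Against a live site, `RobotFileParser.set_url()` followed by `read()` loads the rules from the site itself instead of a local list.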

Python Naver image crawling

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import time, random, os
from urllib.parse import quote
from urllib.request import urlretrieve
from tqdm import tqdm

def get_images(keyword):
    print('Loading')
    # Use chromedriver because the page has to be scrolled down
    # (Selenium 4 style; Selenium 3 took the driver path as the first argument)
    driver = webdriver.Chrome(service=Service(executable_path='./chromedriver'))
    driver.implicitly_wait(30)
    # URL-encode the keyword so spaces and non-ASCII text survive in the query
    url = 'https://search.naver.com/search.naver?where=image&sm=tab_jum&query={}'.format(quote(keyword))
    driver.get(url)

    # Select <body> so that PAGE_DOWN key presses scroll the page
    body = driver.find_element(By.CSS_SELECTOR, 'body')
    for i in range(3):
        body.send_keys(Keys.PAGE_DOWN)
        time.sleep(random.uniform(1, 3))    # wait a fresh random 1-3 seconds

    # Select the image tags
    imgs = driver.find_elements(By.CSS_SELECTOR, 'img._img')
    result = []
    for img in tqdm(imgs):
        src = img.get_attribute('src')
        # Skip missing src values and inline data: URIs
        if src and src.startswith('http'):
            result.append(src)

    driver.quit()
    print('Search End')

    # Create a folder named after the search keyword under images/
    path = os.path.join('images', keyword)
    if not os.path.isdir(path):
        os.makedirs(path)    # also creates images/ if it is missing
        print('{}: folder created'.format(keyword))
    else:
        print('{}: folder already exists'.format(keyword))

    # Download the images
    for index, link in enumerate(tqdm(result)):
        # The extension sits between the last '.' and the last '&' of the URL
        start = link.rfind('.')
        end = link.rfind('&')
        filetype = link[start:end] if end > start else link[start:]
        if not filetype[1:].isalnum():
            filetype = '.jpg'    # assumed fallback when no clean extension is found
        urlretrieve(link, os.path.join(path, '{}{}'.format(index, filetype)))
        time.sleep(random.uniform(1, 3))    # wait a fresh random 1-3 seconds

    print('DownLoad End')

if __name__ == "__main__":
    keyword = input('Search for: ')
    get_images(keyword)
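
The file-extension step in the download loop is the part most sensitive to the shape of the URL, so it can help to isolate it and try it on a few inputs. Below is a minimal sketch that keeps the same "last `.` to last `&`" rule and falls back to a default when the slice does not look like an extension. The function name `guess_extension`, the `.jpg` default, and the sample URLs are assumptions made for illustration.

```python
import re


def guess_extension(link, default='.jpg'):
    """Slice the extension between the last '.' and the last '&' of a URL."""
    start = link.rfind('.')
    end = link.rfind('&')
    # With no '&' after the last '.', take everything from the '.' onward
    candidate = link[start:end] if end > start else link[start:]
    # Accept only a dot followed by a short alphanumeric run
    if re.fullmatch(r'\.[A-Za-z0-9]{1,5}', candidate):
        return candidate.lower()
    return default


# Hypothetical URLs, for illustration only
print(guess_extension('https://example.com/view?src=http%3A%2F%2Fexample.com%2Fa.png&type=b400'))  # .png
print(guess_extension('https://example.com/photos/b.JPEG'))  # .jpeg
print(guess_extension('https://example.com/photos/c'))       # .jpg
```

The fallback matters because a slice such as `.com/photos/c` would otherwise end up inside the output file name.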

Source: https://www.youtube.com/watch?v=HJN28B7OSzw
