OCR: Converting an Image to Text


황TL 2017. 8. 2. 20:28

Read the English text in a photo and print it as plain text.

"""

Download tesseract-4.0.0-alpha.

Create a tessdata folder, then download the language files from
https://github.com/tesseract-ocr/tessdata and save them in that folder.

"""

from PIL import Image
from pytesseract import image_to_string

def OCR(imgfile, lang='eng'):  # 'eng' = recognize English text
    im = Image.open(imgfile)  # load the image file passed by the caller
    text = image_to_string(im, lang=lang)  # run Tesseract on the image
    print(text)  # print the recognized text

OCR('C:/Users/32goqudeo/text.jpg')  # path of the image to read
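The setup notes at the top of the listing ask for a tessdata folder holding the downloaded language files. Before calling OCR, it may help to check that the files are really there. This is only a minimal sketch, assuming the files follow the `<lang>.traineddata` naming used in the tessdata repository. `missing_languages` and the folder path are hypothetical names chosen for illustration.

```python
from pathlib import Path


def missing_languages(tessdata_dir, langs):
    """Return the language codes that have no <lang>.traineddata file in tessdata_dir."""
    folder = Path(tessdata_dir)
    return [lang for lang in langs if not (folder / f"{lang}.traineddata").is_file()]


# Hypothetical location: replace it with wherever you created the tessdata folder.
# An empty list means every requested language file was found.
print(missing_languages("C:/Tesseract-OCR/tessdata", ["eng"]))
```

If the list is not empty, download the missing files from the repository above before running OCR.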

Korean is not recognized, apparently because Korean syllables have final consonants (batchim).
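Note that the listing runs with lang='eng', so Tesseract only looks for English characters. Before ruling Korean out, it may be worth saving kor.traineddata from the same tessdata repository into the tessdata folder and passing lang='kor', or 'kor+eng' for mixed text. The sketch below assumes that file is in place. `build_lang` and `ocr_with_languages` are hypothetical helper names, and how well a given image is recognized is not guaranteed.

```python
def build_lang(*codes):
    """Join Tesseract language codes with '+', e.g. ('kor', 'eng') -> 'kor+eng'."""
    return "+".join(codes)


def ocr_with_languages(imgfile, *codes):
    """Run OCR on imgfile using one or more language codes and return the text."""
    # Imported here so build_lang can be used without the OCR libraries installed.
    from PIL import Image
    from pytesseract import image_to_string

    with Image.open(imgfile) as im:
        return image_to_string(im, lang=build_lang(*codes))


# Usage (assumes kor.traineddata has been saved in the tessdata folder):
# print(ocr_with_languages('C:/Users/32goqudeo/text.jpg', 'kor', 'eng'))
```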