We are going to use the pytesseract and Pillow libraries for this project!
Before you start coding, you need to complete three tasks:
1. Click on the link below and install Tesseract-OCR
https://github.com/UB-Mannheim/tesseract/wiki
After the installation completes, take note of the folder where Tesseract was installed, because we need that path in our code.
2. Install pytesseract by using the command: pip install pytesseract
3. Install Pillow by using the command: pip install pillow
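If you are not sure where Tesseract ended up, the check below is a minimal sketch that uses only the Python standard library. The function name find_tesseract and the default path are assumptions for illustration; change the path to match your own installation.

```python
import os
import shutil

# Assumed default install location; adjust it to wherever the installer put Tesseract.
DEFAULT_PATH = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

def find_tesseract(default_path=DEFAULT_PATH):
    """Return a usable path to the tesseract executable, or None if not found.

    This is a hypothetical helper, not part of pytesseract.
    """
    # First check whether tesseract is already on the system PATH.
    on_path = shutil.which('tesseract')
    if on_path:
        return on_path
    # Otherwise fall back to the given install location, if that file exists.
    if os.path.isfile(default_path):
        return default_path
    return None

print("Tesseract found at:", find_tesseract())
```

If this prints None, the installation folder is probably different from the assumed default, so look it up and pass it in as default_path.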
Source Code:
import pytesseract
from PIL import Image

# Tell pytesseract where the Tesseract executable was installed
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Open the image and extract its text
value = Image.open('logo.png')
text = pytesseract.image_to_string(value)
print("Extracted Data is: \n ", text)
Note that if you want to use a full path, you need to write a double backslash (\\) instead of a single backslash (\) in your path, because Python treats a single backslash inside a normal string as the start of an escape sequence.
For example, if your path looks like this: D:\Python_Program\my_Image.jpg
Then replace each \ with \\
so that it looks like this: 'D:\\Python_Program\\my_Image.jpg'
Or
Add the letter 'r' before your string to make it a raw string
Example: r'C:\Program Files\Tesseract-OCR\tesseract.exe'
and then you are good to go!
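Here is a small sketch that shows both options side by side. The path D:\new_folder\test.jpg is a made-up example, chosen because \n and \t are escape sequences and so show what goes wrong with single backslashes.

```python
# Option 1: double every backslash in a normal string.
escaped = 'D:\\Python_Program\\my_Image.jpg'

# Option 2: put r in front of the string to make it a raw string.
raw = r'D:\Python_Program\my_Image.jpg'

# Both options spell exactly the same path.
print(escaped == raw)  # True
print(raw)

# Hypothetical path with single backslashes in a normal string:
# \n becomes a newline and \t becomes a tab, so the path is corrupted.
broken = 'D:\new_folder\test.jpg'
print('\n' in broken)  # True
```

Either option works. The raw string is usually easier to read because the path looks the same as it does in File Explorer.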