By default, Tesseract does both detection and recognition.
Is it possible to have an API for recognize() which would just perform recognition and return the output text with confidence?
Or at least simulate it?
The pytesseract.image_to_string() call only returns the recognized text, without any confidence.
For image_recognize(), we could do something like this on top of image_to_data() with output_type='dict':

    import pytesseract

    def recognize(img, lang='eng'):
        # Word-level results: parallel lists under keys such as 'text' and 'conf'
        data = pytesseract.image_to_data(img, lang=lang, output_type='dict')
        texts = []
        total_confidence = 0.0
        total_bboxes = 0
        # assert len(data['text']) == 1  # Should contain only 1 bbox
        for i in range(len(data['text'])):
            text = data['text'][i].strip()
            conf = float(data['conf'][i]) / 100.0  # scale 0..100 to 0..1
            # conf is -1 for rows that are not recognized words
            if conf < 0 or not text:
                continue
            total_bboxes += 1
            total_confidence += conf
            texts.append(text)
        if not total_bboxes:
            return {}
        return {
            'text': ' '.join(texts),
            'confidence': total_confidence / total_bboxes,
        }

Could you please consider this as a feature request?
It would be helpful for anyone who uses their own detector and only wants Tesseract to perform recognition.
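
To make the use case concrete, here is a rough sketch of how I imagine simulating it today. It assumes the detector returns boxes as (left, top, right, bottom) tuples and that the image is a PIL image; `summarize` and `recognize_boxes` are hypothetical names, not part of pytesseract. It passes `--psm 7`, which asks Tesseract to treat the input as a single text line rather than run full page layout analysis, so it is an approximation of recognition-only mode and not a true one.

```python
def summarize(data):
    """Join non-empty words and average their confidences (scaled to 0..1).

    `data` is the dict returned by image_to_data(..., output_type='dict').
    """
    words, confs = [], []
    for text, conf in zip(data['text'], data['conf']):
        text, conf = text.strip(), float(conf)
        # conf is -1 for rows that are not recognized words
        if text and conf >= 0:
            words.append(text)
            confs.append(conf / 100.0)
    if not words:
        return {'text': '', 'confidence': 0.0}
    return {'text': ' '.join(words), 'confidence': sum(confs) / len(confs)}


def recognize_boxes(img, boxes, lang='eng', psm=7):
    """Recognize each detector box separately (hypothetical helper).

    Assumes `img` is a PIL image and each box is (left, top, right, bottom).
    """
    import pytesseract  # imported lazily so summarize() works without it

    results = []
    for box in boxes:
        crop = img.crop(box)
        data = pytesseract.image_to_data(
            crop,
            lang=lang,
            config=f'--psm {psm}',  # 7 = treat the image as a single text line
            output_type=pytesseract.Output.DICT,
        )
        results.append(summarize(data))
    return results
```

With this, `recognize_boxes(img, [(10, 20, 200, 60)])` would return one `{'text': ..., 'confidence': ...}` dict per box. For single-word crops, `psm=8` may be a better fit.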