I am trying to run OCR on multiple documents containing tables with black, red, and yellow text.
The red and black text is recognized, but the yellow text is missed in almost all cases.
I tried adding TextBlocks at the cell positions. However, the text is still recognized only about 25% of the time.
I had greater success when I extracted the yellow text as a Bitmap into a new ImageDocument. However, this costs an additional page for each ImageDocument, which is not feasible.
My guess is that the yellow text is lost during the binarization step of the OCR process.
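To illustrate why I suspect binarization, here is a minimal sketch. It assumes the engine converts to grayscale with the common Rec. 601 luma weights and then applies a fixed threshold of 128. I do not know what the engine actually does internally, and the helper names (`luma`, `binarize`, `by_luma`, `by_blue_channel`) are made up for this example.

```python
# Assumption: grayscale via Rec. 601 luma weights plus a fixed threshold.
# The real OCR engine may binarize differently (e.g. adaptively).

def luma(rgb):
    """Perceived brightness of an 8-bit RGB pixel (Rec. 601 weights)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(gray, threshold=128):
    """Return 0 (ink) for dark values and 255 (paper) for light ones."""
    return 0 if gray < threshold else 255


def by_luma(rgb):
    """Binarize a pixel using its luma value."""
    return binarize(luma(rgb))


def by_blue_channel(rgb):
    """Binarize a pixel using only its blue channel."""
    return binarize(rgb[2])


# Pure colors standing in for the text colors and the page background.
COLORS = {
    "black": (0, 0, 0),
    "red": (255, 0, 0),
    "yellow": (255, 255, 0),
    "white": (255, 255, 255),
}

if __name__ == "__main__":
    for name, rgb in COLORS.items():
        print(name, "luma:", by_luma(rgb), "blue:", by_blue_channel(rgb))
```

Under these assumptions, pure yellow is nearly as bright as white, so it lands on the paper side of the threshold, while red and black stay as ink. Yellow has almost no blue component, so thresholding the blue channel alone would keep all three text colors dark. Real scans contain less saturated colors, so the threshold would likely need tuning.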
Any ideas on how to overcome this problem?