In a character recognition system, a method and apparatus for segmenting a document image into text and non-text areas. Document segmentation in the present invention generally comprises the steps of: providing a bit-mapped representation of the document image, extracting run lengths...

Patent US5335290 - Segmentation of text, picture and lines of a document image
http://www.google.com.hk/patents/US5335290?utm_source=gb-gplus-share
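
The abstract is cut off after the run-length step, so the patent's actual procedure is not shown here. As a rough illustration of what "extracting run lengths" from a bit-mapped image usually means, the following minimal sketch scans each row of a binary image and records every maximal run of black pixels. The function names `extract_runs` and `extract_all_runs`, the `(start, length)` representation, and the convention that 1 means black are all assumptions made for this example, not details taken from the patent.

```python
from typing import List, Tuple

# Hypothetical representation: (start column, length) of one run of black pixels.
Run = Tuple[int, int]


def extract_runs(row: List[int]) -> List[Run]:
    """Return (start, length) for each maximal run of 1s (black pixels) in a row."""
    runs: List[Run] = []
    start = None  # column where the current run began, or None if not in a run
    for x, pixel in enumerate(row):
        if pixel and start is None:
            start = x  # a new run begins here
        elif not pixel and start is not None:
            runs.append((start, x - start))  # the run ended at column x - 1
            start = None
    if start is not None:
        # The row ended while still inside a run.
        runs.append((start, len(row) - start))
    return runs


def extract_all_runs(bitmap: List[List[int]]) -> List[List[Run]]:
    """Apply extract_runs to every row of a bit-mapped image."""
    return [extract_runs(row) for row in bitmap]


if __name__ == "__main__":
    image = [
        [0, 1, 1, 0, 1],
        [1, 1, 1, 1, 1],
        [0, 0, 0, 0, 0],
    ]
    for y, runs in enumerate(extract_all_runs(image)):
        print(y, runs)
```

Storing runs as `(start, length)` pairs per row keeps the data compact and lets later stages compare runs on adjacent rows without revisiting individual pixels. How the patent actually groups or classifies the runs is not recoverable from the truncated text.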