Two PhD Positions in Computer Science / Document Image Analysis at ESIEE Paris, France
Further to the acceptance of a project funded by the French Ministry of Industry, ESIEE Paris seeks applicants for two PhDs in the field of Computer Science and Document Image Analysis (DIA). For an overview of the activities of the Department, please visit:
http://www.esiee.fr/en/research/a2si.php
The overall goal of the project is to improve an existing document analysis system that can convert various types of documents, most of them originating from the French heritage and taking various physical forms (books, microfilms, postcards, civil deeds, etc.). The novelty of the PhD work lies in the proposed methodology, which processes images directly in grey levels rather than in their binarized version.
Strong knowledge of image analysis and processing, as well as technical mastery of a programming language (such as C++ or Java), is a must. Additional knowledge of mathematical morphology for the first position, and experience with an environment such as Matlab or R, or further knowledge of statistical classification/recognition methods (SVM, neural networks, Bayesian networks, …) for the second position, would be assets.
Interested applicants should be either French-speaking or fluent in English. Resumes, possibly including a publication or a manuscript, and motivation letters should be sent by e-mail only to:
l.najman@esiee.fr for position 1
x.hilaire@esiee.fr for position 2
Duration : 36 months
Salary : 39 k€ per annum.
Expected starting date : November 2008.
Position 1 : Document image enhancement using topological and morphological filters in simplicial complexes
The aim of the PhD is to improve the quality of document images by correcting, as far as possible, the various effects of noise, which appear either at pixel level (white noise, locally low contrast, blotting effect of the paper) or at character level (ink defects, paper defects, broken characters). Page-level defects other than white noise (e.g., rotation, uneven feeding of paper when using roll scanners), however, need not be addressed.
We propose to design a morphological and topological filtering method that operates within the framework of greylevel simplicial complexes. Our idea is to process images at subpixel level only, and to refrain from binarizing them. When processing printed text, the method aims to provide an image with characters as close as possible to the ground truth (although its performance cannot be formally stated, since the commercial OCR FineReader will be used as a black box in the real processing chain). When dealing with handwritten text, the method shall compute, word by word, a filtered skeleton of each word, to be used as input to an HCR.
In both cases, the formal performance of the filtering method will be established based on the specifications of a public OCR and HCR, and backed up by experimental validation using FineReader and DocumentReader within the framework of the project.
Position 2 : Contextual segmentation of document images using greylevel texture analysis
Document page segmentation consists in automatically recognizing and extracting the various components of a page (text and text blocks, mathematical formulas, halftones, captions, …).
Numerous segmentation methods are available in the literature. The usual taxonomy roughly sorts them into three families : top-down methods (one starts from an entire page, then recursively splits it until a criterion is satisfied on each region), bottom-up methods (the opposite approach), and hybrid methods. The latter family includes methods that take advantage of both top-down and bottom-up strategies, but also those which rely on texture analysis (Gabor analysis, co-occurrence, HTD, edge histograms, etc.).
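As an illustration of the top-down family only, the following is a minimal sketch of a recursive XY-cut style split. It is not the method proposed for this position. It assumes a page stored as a list of rows of grey levels (0 = black ink, 255 = white paper) and splits a region along its widest nearly empty band of rows or columns; the names `xy_cut`, `ink_threshold` and `min_gap`, and their default values, are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]          # grey levels: 0 = black ink, 255 = white paper
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def _profile(img: Image, box: Box, axis: int) -> List[float]:
    """Mean darkness (255 - grey) of each row (axis=0) or column (axis=1) of the box."""
    top, left, bottom, right = box
    if axis == 0:
        return [sum(255 - img[r][c] for c in range(left, right)) / (right - left)
                for r in range(top, bottom)]
    return [sum(255 - img[r][c] for r in range(top, bottom)) / (bottom - top)
            for c in range(left, right)]


def _widest_gap(profile: List[float], ink_threshold: float,
                min_gap: int) -> Optional[Tuple[int, int]]:
    """Widest run of nearly empty lines that touches neither end of the profile."""
    best = None
    start = None
    # A sentinel above the threshold closes a run that reaches the end.
    for i, value in enumerate(profile + [ink_threshold + 1]):
        if value <= ink_threshold:
            if start is None:
                start = i
        elif start is not None:
            interior = start > 0 and i < len(profile)
            if interior and i - start >= min_gap:
                if best is None or i - start > best[1] - best[0]:
                    best = (start, i)
            start = None
    return best


def xy_cut(img: Image, box: Optional[Box] = None,
           ink_threshold: float = 10.0, min_gap: int = 2) -> List[Box]:
    """Recursively split the box until no region contains an interior gap."""
    if box is None:
        box = (0, 0, len(img), len(img[0]))
    top, left, bottom, right = box
    candidates = []
    for axis in (0, 1):
        gap = _widest_gap(_profile(img, box, axis), ink_threshold, min_gap)
        if gap is not None:
            candidates.append((gap[1] - gap[0], axis, gap))
    if not candidates:           # stopping criterion: the region is homogeneous
        return [box]
    _, axis, (g0, g1) = max(candidates)
    if axis == 0:                # cut between two groups of rows
        first, second = (top, left, top + g0, right), (top + g1, left, bottom, right)
    else:                        # cut between two groups of columns
        first, second = (top, left, bottom, left + g0), (top, left + g1, bottom, right)
    return (xy_cut(img, first, ink_threshold, min_gap)
            + xy_cut(img, second, ink_threshold, min_gap))
```

The stopping criterion here is simply the absence of an interior gap at least `min_gap` lines wide. A bottom-up method would instead start from small components and merge them.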
One criticism that may be levelled at almost all existing methods is their inability to process anything but binary images, as most of them need to separate the background from the foreground of the document very early. The aim of the PhD is to design a texture-based method that would improve on the existing ones in three ways :
1. By using greylevel texture descriptors : the additional information conveyed by greylevels should result in a significant gain in the accuracy of the descriptors, and consequently in that of the segmentation itself. It would even be desirable to define or use color descriptors whenever color is available.
2. By contextualizing the segmentation : although heterogeneous, the document corpus remains rather well identified. Our idea is therefore to introduce document models that would make it possible not only to modify the probability laws that a pixel belongs to a class given the document and the corpus, but also to give an a posteriori explanation of these laws taken jointly. Bayesian networks, in particular, could constitute an appealing framework to solve this problem.
3. By explicitly modelling and explaining noise : it is highly desirable that the segmentation method model noise as a class of its own, and be able to explain it. Such an approach has already been proposed in the literature, for instance by Zheng et al. for distinguishing between handwritten and printed text in binary images, and has shown interesting results. Significant improvements are to be expected from extending a similar approach to grey levels.
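To make points 2 and 3 more concrete, here is a minimal sketch of how a document model could shift the class probabilities of a pixel, with noise treated as a class of its own. It applies plain Bayes' rule, not a full Bayesian network, and is not the method to be developed. The class names, the document models ("civil_deed", "postcard") and all numerical values are hypothetical.

```python
from typing import Dict

# Hypothetical priors P(class | document model); noise is a class of its own.
PRIORS: Dict[str, Dict[str, float]] = {
    "civil_deed": {"text": 0.60, "halftone": 0.05, "background": 0.30, "noise": 0.05},
    "postcard":   {"text": 0.10, "halftone": 0.60, "background": 0.25, "noise": 0.05},
}


def class_posterior(likelihood: Dict[str, float],
                    prior: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule: P(class | features, model) is proportional to
    P(features | class) * P(class | model)."""
    unnormalised = {c: likelihood[c] * prior[c] for c in prior}
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("all classes have zero probability")
    return {c: v / total for c, v in unnormalised.items()}


# Hypothetical likelihoods P(texture features | class) for a single pixel.
likelihood = {"text": 0.30, "halftone": 0.40, "background": 0.05, "noise": 0.25}

for model, prior in PRIORS.items():
    posterior = class_posterior(likelihood, prior)
    best = max(posterior, key=posterior.get)
    print(model, best, round(posterior[best], 3))
```

With the same texture evidence, the most probable class differs from one document model to the other, which is the kind of contextual effect described in point 2.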