Patent application number | Description | Published |
20080253657 | Geometric parsing of mathematical expressions - A processing device may parse a group of strokes representing a mathematical expression. The group of strokes may be examined to determine whether the group of strokes satisfies any of a finite set of rules. When the group of strokes, included in a region, satisfies any of the finite set of rules, the region may be partitioned according to a satisfied one of the finite set of rules. The group of strokes included in the region may be further examined to determine whether the group of strokes may be further partitioned according to any of the finite set of rules. After all regions have been examined and no further partitioning of regions may be performed, all mathematical symbols of the mathematical expression may be isolated in at least some of the regions and may be recognized. | 10-16-2008 |
20080260240 | User interface for inputting two-dimensional structure for recognition - In embodiments consistent with the subject matter of this disclosure, a user may input one or more strokes as digital ink to a processing device. The processing device may produce and present a recognition result, which may include a misrecognized portion. A user may indicate a desire to correct the misrecognized portion and may further select one or more strokes of the misrecognized portion. The processing device may then present one or more recognition alternates corresponding to the selected one or more strokes of the misrecognized portion. In some embodiments, the processing device may permit a user to rewrite the selected one or more strokes of the misrecognized portion with newly entered digital ink. Features such as rewriting and correction of the input digital ink may be discoverable in some embodiments. | 10-23-2008 |
20080260251 | Recognition of mathematical expressions - In embodiments consistent with the subject matter of this disclosure, a user may input strokes as digital ink to a processing device. The processing device may partition the input strokes into multiple regions of strokes. A first recognizer and a second recognizer may score grammar objects included in regions and represented by chart entries. The scores may be converted to a converted score, which may have at least a near standard normal distribution. The processing device may present a recognition result based on highest converted scores according to a recurrence formula. The processing device may receive a correction hint with respect to misrecognized strokes and may add a penalty score with respect to chart entries representing grammar objects breaking the correction hint. Incremental recognition may be performed when a pause is detected during inputting of strokes. | 10-23-2008 |
20090304282 | RECOGNITION OF TABULAR STRUCTURES - A number of regions and partitions may be created based on input handwritten atoms and a grammar parsing framework. Productions for tabular structures may be added to the grammar parsing framework to produce an extended grammar parsing framework. Each of the regions may be searched for a tabular structure. Upon finding a tabular structure, a type of tabular structure may be determined. Configuration partitions may be created, based on the added productions, and added to the created partitions. A set of configuration regions may be created based on the configuration partitions and added to the created regions. The productions for tabular structures and productions of the grammar parsing framework may be applied, as rewriting rules, to the atoms to produce possible recognition results. A best recognition result may be determined and displayed. A mechanism for correcting misrecognition errors, which may occur while recognizing tabular structures, may be provided. | 12-10-2009 |
20090304283 | CORRECTIONS FOR RECOGNIZERS - A processing device may recognize a number of input handwritten strokes, which may represent a mathematical expression, a chemical formula, or other two-dimensional structure. Rewriting rules of a grammar may be applied to the strokes to produce a number of possible recognition results. Each of the possible recognition results has a respective score based on a sum of rewriting rules applied to the strokes to produce respective ones of the possible recognition results. Input may be provided to identify misrecognized strokes and a correct terminal production, or symbol corresponding to the misrecognized strokes. Strokes may be misrecognized for many reasons, including parsing errors, over-grouping or under-grouping of matrices, and improper placement of a recognized terminal production, or symbol, with respect to a root structure. Correction hints may be leveraged for correcting types of errors mentioned above. | 12-10-2009 |
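Entry 20080260251 above mentions converting the scores of two recognizers into a converted score with a near standard normal distribution. The following is a minimal sketch of one simple way to put scores from different scales on a common footing, using z-score normalization. The choice of z-scores, the function names, and the sample scores are all assumptions made for illustration; the abstract does not state which conversion the application uses.

```python
from statistics import mean, pstdev


def to_standard_scores(raw_scores):
    """Rescale scores to zero mean and unit variance (z-scores).

    Assumption: z-scoring stands in for the unspecified conversion
    mentioned in the abstract.
    """
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)
    if sigma == 0:
        # All scores identical: no candidate is preferred.
        return [0.0 for _ in raw_scores]
    return [(s - mu) / sigma for s in raw_scores]


def best_candidate(first_scores, second_scores):
    """Pick the candidate with the highest sum of converted scores.

    Both arguments are dicts mapping the same candidate labels to the
    raw score given by one (hypothetical) recognizer.
    """
    labels = list(first_scores)
    z_first = to_standard_scores([first_scores[k] for k in labels])
    z_second = to_standard_scores([second_scores[k] for k in labels])
    combined = {k: a + b for k, a, b in zip(labels, z_first, z_second)}
    return max(combined, key=combined.get)


# Hypothetical raw scores on two very different scales.
shape_scores = {"x": 0.9, "+": 0.4, "t": 0.2}
context_scores = {"x": 20.0, "+": 30.0, "t": 3.0}
winner = best_candidate(shape_scores, context_scores)
```

Without the conversion, the second recognizer's larger numeric range would dominate a plain sum; after it, each recognizer contributes on a comparable scale.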
Patent application number | Description | Published |
20090144605 | PAGE CLASSIFIER ENGINE - Embodiments of the present invention relate to classifying pages of an electronic document, such as a scanned book page. OCR software is applied to the contents of the electronic document, revealing semantic information about the content of the electronic document. Software-based features are applied to the semantic information to determine the type of page the electronic document is. Page types may include table of contents (TOC), table of figures (TOF), bibliography, index, or other types of pages commonly found in a book, magazine, or other publication. Once determined, the determined page type is stored and used by other software engines. | 06-04-2009 |
20090144614 | DOCUMENT LAYOUT EXTRACTION - Computer-readable media, systems, and methods for document layout extraction are described. In embodiments, textual data in an electronic format is received and the textual data is converted from the electronic format to an independent interface format, the independent interface format including coordinates to one or more structural elements of the textual data. Further, in embodiments, a structure and layout analysis of the textual data is performed to generate a set of structure and layout information. Still further, in embodiments, the textual data and the set of structure and layout information are stored in an enriched interface format, the enriched interface format providing for search and navigation of the textual data. | 06-04-2009 |
20110222768 | TEXT ENHANCEMENT OF A TEXTUAL IMAGE UNDERGOING OPTICAL CHARACTER RECOGNITION - A method for enhancing a textual image for undergoing optical character recognition begins by receiving an image that includes native lines of text. A background line profile is determined which represents an average background intensity along the native lines in the image. Likewise, a foreground line profile is determined which represents an average foreground intensity along the native lines in the image. The pixels in the image are assigned to either a background or foreground portion of the image based at least in part on the background line profile and the foreground line profile. The intensity of the pixels designated to the background portion of the image is adjusted to a maximum brightness so as to represent a portion of the image that does not include text. | 09-15-2011 |
20110222772 | RESOLUTION ADJUSTMENT OF AN IMAGE THAT INCLUDES TEXT UNDERGOING AN OCR PROCESS - An optical character recognition process characterizes text lines in a textual image by their base-line, mean-line and x-height. The base-line for at least one text line in the image is determined by finding a parametric curve that maximizes a first fitness function that depends on the values of pixels through which the parametric curve passes and pixels below the parametric curve. The base-line corresponds to the parametric curve for which the first fitness function is maximized. The first fitness function is designed so that it increases with increasing lightness or brightness of pixels immediately below the parametric curve while also increasing with decreasing lightness of pixels through which the parametric curve passes. The mean-line is determined by incrementally shifting the base-line upward by predetermined amounts (e.g., a single pixel) until a second fitness function for the shifted base-line is maximized. The second fitness function is essentially the inverse of the first fitness function. Specifically, the second fitness function increases with increasing lightness of pixels immediately above the shifted base-line while also increasing with decreasing lightness of pixels through which the shifted base-line passes. The x-height is equal to the sum of the predetermined amounts by which the base-line is shifted upward in order to maximize the second fitness function. In some cases different groups of text-lines in the textual image may be characterized differently from one another. For example, each group may be characterized by a most probable x-height for that group. | 09-15-2011 |
20110243445 | DETECTING POSITION OF WORD BREAKS IN A TEXTUAL LINE IMAGE - Line segmentation in an OCR process is performed to detect the positions of words within an input textual line image by extracting features from the input to locate breaks and then classifying the breaks into one of two break classes which include inter-word breaks and inter-character breaks. An output including the bounding boxes of the detected words and a probability that a given break belongs to the identified class can then be provided to downstream OCR or other components for post-processing. Advantageously, by reducing line segmentation to the extraction of features, including the position of each break and the number of break features, and break classification, the task of line segmentation is made less complex but with no loss of generality. | 10-06-2011 |
20110280481 | USER CORRECTION OF ERRORS ARISING IN A TEXTUAL DOCUMENT UNDERGOING OPTICAL CHARACTER RECOGNITION (OCR) PROCESS - An electronic model of an image document is created as the document undergoes an OCR process. The electronic model includes elements (e.g., words, text lines, paragraphs, images) of the image document that have been determined by each of a plurality of sequentially executed stages in the OCR process. The electronic model serves as input information which is supplied to each of the stages by a previous stage that processed the image document. A graphical user interface is presented to the user so that the user can provide user input data correcting a mischaracterized item appearing in the document. Based on the user input data, the processing stage which produced the initial error that gave rise to the mischaracterized item corrects the initial error. Stages of the OCR process subsequent to this stage then correct any consequential errors arising in their respective stages as a result of the initial error. | 11-17-2011 |
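Entry 20110222768 above describes assigning pixels to background or foreground from two intensity profiles and then setting background pixels to maximum brightness. The sketch below illustrates that idea for a single text line under strong simplifying assumptions: each profile is reduced to one average value for the whole line rather than a curve along it, the initial split and the midpoint threshold are illustrative choices, and the function name `enhance_text_line` is hypothetical. It is not the method claimed in the application.

```python
from statistics import mean


def enhance_text_line(pixels, max_brightness=255):
    """Whiten the background of one text line.

    pixels: list of rows of grayscale values (0 = black) covering a
    single line of text. Returns a new list of rows.
    """
    flat = [value for row in pixels for value in row]

    # Crude first guess at which pixels are background (assumption).
    split = (min(flat) + max(flat)) / 2
    background = [v for v in flat if v > split]
    foreground = [v for v in flat if v <= split]
    if not background or not foreground:
        # Uniform line: nothing to separate, return an unchanged copy.
        return [row[:] for row in pixels]

    # One-value stand-ins for the background and foreground profiles.
    background_level = mean(background)
    foreground_level = mean(foreground)
    threshold = (background_level + foreground_level) / 2

    # Background pixels go to maximum brightness; text pixels are kept.
    return [
        [max_brightness if value > threshold else value for value in row]
        for row in pixels
    ]


# Hypothetical 2x3 patch: light paper with two dark text pixels.
patch = [[200, 210, 40], [205, 50, 215]]
cleaned = enhance_text_line(patch)
```

A profile that varies along the line, as the abstract describes, could be approximated by applying the same steps to short horizontal windows instead of the whole line.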