Patent application number | Description | Published |
20090015676 | Recognition and Tracking Using Invisible Junctions - The present invention uses invisible junctions which are a set of local features unique to every page of the electronic document to match the captured image to a part of an electronic document. The present invention includes: an image capture device, a feature extraction and recognition system and database. When an electronic document is printed, the feature extraction and recognition system captures an image of the document page. The features in the captured image are then extracted, indexed and stored in the database. Given a query image, usually a small patch of some document page captured by a low resolution image capture device, the features in the query image are extracted and compared against those stored in the database to identify the query image. The present invention also includes methods for recognizing and tracking the viewing region and look-at point corresponding to the input query image. This information is combined with a rendering of the original input document to generate a new graphical user interface to the user. This user interface can be displayed on a conventional browser or even on the display of an image capture device. | 01-15-2009 |
20090016564 | Information Retrieval Using Invisible Junctions and Geometric Constraints - The present invention uses invisible junctions which are a set of local features unique to every page of the electronic document to match the captured image to a part of an electronic document. The present invention includes: an image capture device, a feature extraction and recognition system and database. When an electronic document is printed, the feature extraction and recognition system captures an image of the document page. The features in the captured image are then extracted, indexed and stored in the database. Given a query image, usually a small patch of some document page captured by a low resolution image capture device, the features in the query image are extracted and compared against those stored in the database to identify the query image. The present invention advantageously uses geometric estimation to reduce the query results to a single one or a few candidate matches. In one embodiment, two separate geometric estimations are used to rank and verify matching candidates. | 01-15-2009 |
20090016604 | Invisible Junction Features for Patch Recognition - The present invention uses invisible junctions which are a set of local features unique to every page of the electronic document to match the captured image to a part of an electronic document. The present invention includes: an image capture device, a feature extraction and recognition system and database. When an electronic document is printed, the feature extraction and recognition system captures an image of the document page. The features in the captured image are then extracted, indexed and stored in the database. Given a query image, usually a small patch of some document page captured by a low resolution image capture device, the features in the query image are extracted and compared against those stored in the database to identify the query image. The present invention also includes methods for feature extraction, feature indexing, feature retrieval and geometric estimation. | 01-15-2009 |
20090016615 | Invisible Junction Feature Recognition For Document Security or Annotation - The present invention uses invisible junctions which are a set of local features unique to every page of the electronic document to match the captured image to a part of an electronic document. The present invention includes: an image capture device, a feature extraction and recognition system and database. When an electronic document is printed, the feature extraction and recognition system captures an image of the document page. The features in the captured image are then extracted, indexed and stored in the database. Given a query image, the features in the query image are extracted and compared against those stored in the database to identify the query image. The feature extraction and recognition system of the present invention is integrated into a multifunction peripheral. This allows the feature extraction and recognition system to be used in conjunction with other modules to provide security and annotation applications. | 01-15-2009 |
20090019402 | User interface for three-dimensional navigation - The present invention uses invisible junctions which are a set of local features unique to every page of the electronic document to match the captured image to a part of an electronic document. The present invention includes: an image capture device, a feature extraction and recognition system and database. When an electronic document is printed, the feature extraction and recognition system captures an image of the document page. The features in the captured image are then extracted, indexed and stored in the database. Given a query image, usually a small patch of some document page captured by a low resolution image capture device, the features in the query image are extracted and compared against those stored in the database to identify the query image. The present invention also includes methods for recognizing and tracking the viewing region and look-at point corresponding to the input query image. This information is combined with a rendering of the original input document to generate a new graphical user interface to the user. This user interface can be displayed on a conventional browser or even on the display of an image capture device. | 01-15-2009 |
20100095374 | GRAPH BASED BOT-USER DETECTION - Computer implemented methods are disclosed for detecting bot-user groups that send spam email over a web-based email service. Embodiments of the present system employ a two-prong approach to detecting bot-user groups. The first prong employs a historical-based approach for detecting anomalous changes in user account information, such as aggressive bot-user signups. The second prong of the present system entails constructing a large user-user relationship graph, which identifies bot-user sub-graphs through finding tightly connected subgraph components. | 04-15-2010 |
20100195914 | SCALABLE NEAR DUPLICATE IMAGE SEARCH WITH GEOMETRIC CONSTRAINTS - Methods are disclosed for finding images from a large corpus of images that at least partially match a query image. The present method makes use of feature detectors to bundle features into local groups or bundles. These bundled features are repeatable and much more discriminative than an individual SIFT feature. Equally importantly, the bundled features provide a flexible representation that allows simple and robust geometric constraints to be efficiently enforced when querying the index. | 08-05-2010 |
20100312777 | PARTIAL-MATCHING FOR WEB SEARCHES - An efficient manner of performing an M-out-of-N partial matching search of indexed documents (e.g., web pages) is provided herein. More particularly, indexed words are arranged into a global location space (GLS), providing for respective occurrences of words in indexed documents being searched to have continuous locations on a one-dimensional GLS. Documents within the GLS are separated by end of document word marking boundaries between consecutive documents. The query words are then separated into an active set, comprising the left-most query words, and a non-active set. A partial matching operator traverses the GLS, applying active geometric constraints, in a sequential manner, to words in the active set. This causes shifting of the active set along the GLS to comprise M left-most query words. If a document satisfies constraints associated with M words in an active set, the document comprises at least M-out-of-N words. | 12-09-2010 |
20110103699 | IMAGE METADATA PROPAGATION - Methods and computer-readable media for propagating content category information to images stored in a database are described. A seed image that is associated with a known content category is received. A content-based image retrieval is conducted using the seed image as a search query image. A number of search result images are identified. The content category is propagated to the search result images. Metadata associated with the search result images is aggregated and analyzed to identify domains that should also be associated with the content category. Additional images that are associated with the domain are identified and the content category propagated thereto. The process is iterated using the additional images as search query images for the content-based image retrieval. | 05-05-2011 |
20110106782 | CONTENT-BASED IMAGE SEARCH - Image descriptor identifiers are used for content-based search. A plurality of descriptors is determined for an image. The descriptors represent the content of the image at respective interest points identified in the image. The descriptors are mapped to respective descriptor identifiers. The image can thus be represented as a set of descriptor identifiers. A search is performed on an index using the descriptor identifiers as search elements. A method for efficiently searching the inverted index is also provided. Candidate images that include at least a predetermined number of descriptor identifiers that match those of the image are identified. The candidate images are ranked and at least a portion thereof are presented as content-based search results. | 05-05-2011 |
20110106798 | Search Result Enhancement Through Image Duplicate Detection - Systems, methods, and computer media for enhancing user search query results are provided. Upon receiving a user search query, relevant images are identified. Duplicate image information for the relevant images is accessed in an index. The index includes information extracted from individual images or duplicates and information aggregated according to groups comprised of images and duplicates of the images. The images identified as relevant to the user query are ranked based at least in part on the information accessed in the index. | 05-05-2011 |
20110208714 | LARGE SCALE SEARCH BOT DETECTION - A framework may be used for identifying low-rate search bot traffic within query logs by capturing groups of distributed, coordinated search bots. Search log data may be input to a history-based anomaly detection engine to determine if query-click pairs associated with a query are suspicious in view of historical query-click pairs for the query. Users associated with suspicious query-click pairs may be input to a matrix-based bot detection engine to determine correlations between queries submitted by the users. Those users indicating strong correlations may be categorized as bots, whereas those who do not may be categorized as part of flash crowd traffic. | 08-25-2011 |
20110235908 | PARTITION MIN-HASH FOR PARTIAL-DUPLICATE IMAGE DETERMINATION - Images in a database or collection of images are each divided into multiple partitions with each partition corresponding to an area of an image. The partitions in an image may overlap with each other. Min-hash sketches are generated for each of the partitions and stored with the images. A user may submit an image and request that an image that is a partial match for the submitted image be located in the image collection. The submitted image is similarly divided into partitions and min-hash sketches are generated from the partitions. The min-hash sketches are compared with the stored min-hash sketches for matches, and images having partitions whose sketches are matches are returned as partial matching images. | 09-29-2011 |
20110299743 | SCALABLE FACE IMAGE RETRIEVAL - A system for identifying individuals in digital images and for providing matching digital images is provided. A set of images that include faces of known individuals is received. Faces are detected in the images and facial components are identified in each face. Visual words corresponding to the facial components are generated, stored, and associated with identifiers of the individuals. At a later time, a user may provide an image that includes the face of one of the known individuals. Visual words are determined from the face of the individual in the provided image and matched against the stored visual words. Images associated with matching visual words are ranked and presented to the user. | 12-08-2011 |
20110310110 | SYNTHETIC IMAGE AND VIDEO GENERATION FROM GROUND TRUTH DATA - A system and a method are disclosed for generating video. Object information is received. A path of motion of the object relative to a reference point is generated. A series of images and ground truth for a reference frame are generated from the object information and the generated path. A system and a method are disclosed for generating an image. Object information is received. Image data and ground truth may be generated using position, the image description, the camera characteristics, and image distortion parameters. A positional relationship between the document and a reference point is determined. An image of the document and ground truth are generated from the object information and the positional relationship and in response to user specified environment of the document. | 12-22-2011 |
20120078936 | VISUAL-CUE REFINEMENT OF USER QUERY RESULTS - Methods and computer-storage media having computer-executable instructions embodied thereon that facilitate refining query results using visual cues are provided. Query results are determined in response to an indication of a user query. One or more groups of query results are generated from the query results based on categories of query results that share similar features. Visual cues are associated with each of the query result groups. Visual cues, in association with query result groups, are presented to a user. Query results associated with a selected visual cue may be presented to a user. A refined user query may be generated based on a selected visual cue. | 03-29-2012 |
20120117051 | MULTI-MODAL APPROACH TO SEARCH QUERY INPUT - Search queries containing multiple modes of query input are used to identify responsive results. The search queries can be composed of combinations of keyword or text input, image input, video input, audio input, or other modes of input. The multiple modes of query input can be present in an initial search request, or an initial request containing a single type of query input can be supplemented with a second type of input. In addition to providing responsive results, in some embodiments additional query refinements or suggestions can be made based on the content of the query or the initially responsive results. | 05-10-2012 |
20120155717 | IMAGE SEARCH INCLUDING FACIAL IMAGE - A method and apparatus are provided for performing image matching. The method includes comparing a face in a first image to a face in each of a set of stored images to identify one or more face-matching images that include similar facial features to the face in the first image. Next, the first image is compared to each of the face-matching images to identify one or more resulting images that are spatially similar to the first image. Accordingly, the resulting image or images have similar facial features and similar overall or background features to those in the first image. For example, if the query image is of a playground with a child swinging on a swing, the image matching technique can find other images of the same child in a setting that appears similar. | 06-21-2012 |
20120163707 | MATCHING TEXT TO IMAGES - Text in web pages or other text documents may be classified based on the images or other objects within the webpage. A system for identifying and classifying text related to an object may identify one or more web pages containing the image or similar images, determine topics from the text of the document, and develop a set of training phrases for a classifier. The classifier may be trained and then used to analyze the text in the documents. The training set may include both positive examples and negative examples of text taken from the set of documents. A positive example may include captions or other elements directly associated with the object, while negative examples may include text taken from the documents, but from a large distance from the object. In some cases, the system may iterate on the classification process to refine the results. | 06-28-2012 |
20120177294 | IMAGE RETRIEVAL USING DISCRIMINATIVE VISUAL FEATURES - Image search results are obtained by providing weights to visual features to emphasize features corresponding to objects of interest while simultaneously deemphasizing irrelevant or inconsistent features that lead to poor search results. In order to minimize the impact of visual features that are unreliable or irrelevant with respect to the objects of interest in the image, context-dependent weights are provided to detected visual features such that those visual features pertaining to the objects of interest are more heavily weighted than those visual features that pertain to irrelevant or unreliable portions of the image. Visual features may be weighted for images in a searchable database. Training data may be obtained and used in weighting visual features in a query image and, alternatively, in searchable database images. | 07-12-2012 |
20120246158 | CO-RANGE PARTITION FOR QUERY PLAN OPTIMIZATION AND DATA-PARALLEL PROGRAMMING MODEL - A co-range partitioning scheme that divides multiple static or dynamically generated datasets into balanced partitions using a common set of automatically computed range keys. A co-range partition manager minimizes the number of data partitioning operations for a multi-source operator (e.g., join) by applying a co-range partition on a pair of its predecessor nodes as early as possible in the execution plan graph. Thus, the amount of data being transferred is reduced. By using automatic range and co-range partition for data partitioning tasks, a programming API is enabled that abstracts explicit data partitioning from users to provide a sequential programming model for data-parallel programming in a computer cluster. | 09-27-2012 |
20120314941 | ACCURATE TEXT CLASSIFICATION THROUGH SELECTIVE USE OF IMAGE DATA - Product images are used in conjunction with textual descriptions to improve classifications of product offerings. By combining cues from both text and image descriptions associated with products, implementations enhance both the precision and recall of product description classifications within the context of web-based commerce search. Several implementations are directed to improving those areas where text-only approaches are most unreliable. For example, several implementations use image signals to complement text classifiers and improve overall product classification in situations where brief textual product descriptions use vocabulary that overlaps with multiple diverse categories. Other implementations are directed to using text and image “training sets” to improve automated classifiers including text-only classifiers. Certain implementations are also directed to learning a number of three-way image classifiers focused only on “confusing categories” of the text signals to improve upon those specific areas where text-only classification is weakest. | 12-13-2012 |
20120314961 | SCALABLE NEAR DUPLICATE IMAGE SEARCH WITH GEOMETRIC CONSTRAINTS - Methods are disclosed for finding images from a large corpus of images that at least partially match a query image. The present method makes use of feature detectors to bundle features into local groups or bundles. These bundled features are repeatable and much more discriminative than an individual SIFT feature. Equally importantly, the bundled features provide a flexible representation that allows simple and robust geometric constraints to be efficiently enforced when querying the index. | 12-13-2012 |
20130152057 | OPTIMIZING DATA PARTITIONING FOR DATA-PARALLEL COMPUTING - A data partitioning plan is automatically generated that—given a data-parallel program and a large input dataset, and without having to first run the program on the input dataset—substantially optimizes performance of the distributed execution system that explicitly measures and infers various properties of both data and computation to perform cost estimation and optimization. Estimation may comprise inferring the cost of a candidate data partitioning plan, and optimization may comprise generating an optimal partitioning plan based on the estimated costs of computation and input/output. | 06-13-2013 |
20130185791 | VOUCHING FOR USER ACCOUNT USING SOCIAL NETWORKING RELATIONSHIP - Trusted user accounts of an application provider are determined. Graphs, such as trees, are created with each node corresponding to a trusted account. Each of the nodes is associated with a vouching quota, or the nodes may share a vouching quota. Untrusted user accounts are determined. For each of these untrusted accounts, a trusted user account that has a social networking relationship is determined. If the node corresponding to the trusted user account has enough vouching quota to vouch for the untrusted user account, then the quota is debited, a node is added for the untrusted user account to the graph, and the untrusted user account is vouched for. If not, available vouching quota may be borrowed from other nodes in the graph. | 07-18-2013 |
20130204608 | IMAGE ANNOTATIONS ON WEB PAGES - An image in a web page may be annotated after deriving information about an image when the image may be displayed on multiple web pages. The web pages that show the image may be analyzed in light of each other to determine metadata about the image, then various additional content may be added to the image. The additional content may be hyperlinks to other webpages. The additional content may be displayed as annotations on top of the images and in other manners. Many embodiments may perform searching, analysis, and classification of images prior to the web page being served. | 08-08-2013 |
20130315480 | MATCHING TEXT TO IMAGES - Text in web pages or other text documents may be classified based on the images or other objects within the webpage. A system for identifying and classifying text related to an object may identify one or more web pages containing the image or similar images, determine topics from the text of the document, and develop a set of training phrases for a classifier. The classifier may be trained and then used to analyze the text in the documents. The training set may include both positive examples and negative examples of text taken from the set of documents. A positive example may include captions or other elements directly associated with the object, while negative examples may include text taken from the documents, but from a large distance from the object. In some cases, the system may iterate on the classification process to refine the results. | 11-28-2013 |
20140258295 | Approximate K-Means via Cluster Closures - A set of data points is divided into a plurality of subsets of data points. A set of cluster closures is generated based at least in part on the subset of data points. Each cluster closure envelops a corresponding cluster of a set of clusters and is comprised of data points of the enveloped cluster and data points neighboring the enveloped cluster. A k-Means approximator iteratively assigns data points to a cluster of the set of clusters and updates a set of cluster centroids corresponding to the set of clusters. The k-Means approximator assigns data points based at least in part on the set of cluster closures. | 09-11-2014 |
20140270497 | ACCURATE TEXT CLASSIFICATION THROUGH SELECTIVE USE OF IMAGE DATA - Product images are used in conjunction with textual descriptions to improve classifications of product offerings. By combining cues from both text and image descriptions associated with products, implementations enhance both the precision and recall of product description classifications within the context of web-based commerce search. Several implementations are directed to improving those areas where text-only approaches are most unreliable. For example, several implementations use image signals to complement text classifiers and improve overall product classification in situations where brief textual product descriptions use vocabulary that overlaps with multiple diverse categories. Other implementations are directed to using text and image “training sets” to improve automated classifiers including text-only classifiers. Certain implementations are also directed to learning a number of three-way image classifiers focused only on “confusing categories” of the text signals to improve upon those specific areas where text-only classification is weakest. | 09-18-2014 |
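
Illustrative note on application 20110235908 (partition min-hash): the abstract describes dividing each image into partitions, generating a min-hash sketch per partition, and returning stored images that share a matching partition sketch with the submitted image. The following is a minimal sketch of that idea, not the method claimed in the application. It assumes, purely for illustration, that each image is already represented as (x, y, visual_word) features, that partitions form a non-overlapping grid (the abstract notes partitions may also overlap), and that sketches are built from seeded BLAKE2b hashes. All function names and parameters are hypothetical.

```python
import hashlib


def _hash(word, seed):
    """Deterministic 64-bit hash of a visual word under a given seed."""
    data = f"{seed}:{word}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def min_hash_sketch(words, num_hashes=4):
    """One minimum hash value per seeded hash function, as a tuple."""
    return tuple(min(_hash(w, seed) for w in words) for seed in range(num_hashes))


def partition_features(features, width, height, grid=2):
    """Group (x, y, word) features into grid x grid non-overlapping cells."""
    cells = {}
    for x, y, word in features:
        col = min(int(x * grid / width), grid - 1)
        row = min(int(y * grid / height), grid - 1)
        cells.setdefault((row, col), set()).add(word)
    return cells


def partition_sketches(features, width, height, grid=2, num_hashes=4):
    """Set of min-hash sketches, one per non-empty partition of the image."""
    cells = partition_features(features, width, height, grid)
    return {min_hash_sketch(words, num_hashes) for words in cells.values()}


def is_partial_match(query_sketches, stored_sketches):
    """True when at least one partition sketch is shared by both images."""
    return not query_sketches.isdisjoint(stored_sketches)


# Hypothetical 100 x 100 images: 'stored' and 'query' share only the
# top-left region (words "a" and "b"); 'other' shares nothing.
stored = [(10, 10, "a"), (20, 30, "b"), (70, 10, "c"), (10, 80, "d"), (90, 90, "e")]
query = [(15, 12, "a"), (40, 45, "b"), (80, 20, "x"), (20, 70, "y"), (60, 60, "z")]
other = [(10, 10, "p"), (70, 10, "q"), (10, 80, "r"), (90, 90, "s")]

stored_sk = partition_sketches(stored, 100, 100)
query_sk = partition_sketches(query, 100, 100)
other_sk = partition_sketches(other, 100, 100)
```

With these inputs, is_partial_match(query_sk, stored_sk) reports a match because the top-left partitions contain the same visual words and therefore produce identical sketches. Partitions that only partly overlap in content would match with a probability that depends on their similarity and on num_hashes, which is the trade-off a real system would need to tune.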