Xiaoou Tang, Beijing CN

Patent application number	Description	Published
20080297621	Strategies for extracting foreground information using flash and no-flash image pairs - A flash-based strategy is used to separate foreground information from background information within image information. In this strategy, a first image is taken without the use of flash. A second image is taken of the same subject matter with the use of flash. The foreground information in the flash image is illuminated by the flash to a much greater extent than the background information. Based on this property, the strategy applies processing to extract the foreground information from the background information. The strategy supplements the flash information by also taking into consideration motion information and color information.	12-04-2008
20080298766	Interactive Photo Annotation Based on Face Clustering - An interactive photo annotation method uses clustering based on facial similarities to improve annotation experience. The method uses a face recognition algorithm to extract facial features of a photo album and cluster the photos into multiple face groups based on facial similarity. The method annotates a face group collectively using annotations, such as name identifiers, in one operation. The method further allows merging and splitting of face groups. Special graphical user interfaces, such as displays in a group view area and a thumbnail area and drag-and-drop features, are used to further improve the annotation experience.	12-04-2008
20080304735	Learning object cutout from a single example - Systems and methods are described for learning visual object cutout from a single example. In one implementation, an exemplary system determines the color context near each block in a model image to create an appearance model. The system also learns color sequences that occur across visual edges in the model image to create an edge profile model. The exemplary system then infers segmentation boundaries in unknown images based on the appearance model and edge profile model. In one implementation, the exemplary system minimizes the energy in a graph-cut model where the appearance model is used for data energy and the edge profile is used to modulate edges. The system is not limited to images with nearly identical foregrounds or backgrounds. Some variations in scale, rotation, and viewpoint are allowed.	12-11-2008
20080304740	Salient Object Detection - Methods for detecting a salient object in an input image are described. For this, the salient object in an image may be defined using a set of local, regional, and global features including multi-scale contrast, center-surround histogram, and color spatial distribution. These features are optimally combined through conditional random field learning. The learned conditional random field is then used to locate the salient object in the image. The methods can also use image segmentation, where the salient object is separated from the image background.	12-11-2008
20080304743	ACTIVE SEGMENTATION FOR GROUPS OF IMAGES - Systems and methods of segmenting images are disclosed herein. The similarity of images in a set of images is compared. A group of images is selected from the set of images. The images in the group of images are selected based on compared similarities among the images. An informative image is selected from the group of images. User-defined semantic information of the informative image is received. The group of images is modeled as a graph. Each image in the group of images denotes a node in the graph. Edges of the graph denote a foreground relationship between images or a background relationship between images. One or more images in the group of images are automatically segmented by propagating the semantic information of the informative image to images in the group of images having a corresponding graph node that is related to a graph node corresponding to the informative image. Segmentation results can be refined according to user-provided image semantics.	12-11-2008
20080304755	Face Annotation Framework With Partial Clustering And Interactive Labeling - Systems and methods are described for a face annotation framework with partial clustering and interactive labeling. In one implementation, an exemplary system automatically groups some images of a collection of images into clusters, each cluster mainly including images that contain a person's face associated with that cluster. After an initial user-labeling of each cluster with the person's name or other label, in which the user may also delete/label images that do not belong in the cluster, the system iteratively proposes subsequent clusters for the user to label, proposing clusters of images that when labeled, produce a maximum information gain at each iteration and minimize the total number of user interactions for labeling the entire collection of images.	12-11-2008
20090080774	Hybrid Graph Model For Unsupervised Object Segmentation - This disclosure describes an integrated framework for class-unsupervised object segmentation. The class-unsupervised object segmentation occurs by integrating top-down constraints and bottom-up constraints on object shapes using an algorithm in an integrated manner. The algorithm describes a relationship among object parts and superpixels. This process forms object shapes with object parts and oversegments pixel images into the superpixels, with the algorithm in conjunction with the constraints. This disclosure describes computing a mask map from a hybrid graph, segmenting the image into a foreground object and a background, and displaying the foreground object from the background.	03-26-2009
20090097772	Laplacian Principal Components Analysis (LPCA) - Systems and methods perform Laplacian Principal Components Analysis (LPCA). In one implementation, an exemplary system receives multidimensional data and reduces dimensionality of the data by locally optimizing a scatter of each local sample of the data. The optimization includes summing weighted distances between low dimensional representations of the data and a mean. The weights of the distances can be determined by a coding length of each local data sample. The system can globally align the locally optimized weighted scatters of the local samples and provide a global projection matrix. The LPCA improves performance of such applications as face recognition and manifold learning.	04-16-2009
20090099990	OBJECT DETECTION AND RECOGNITION WITH BAYESIAN BOOSTING - An efficient, effective and at times superior object detection and/or recognition (ODR) function may be built from a set of Bayesian stumps. Bayesian stumps may be constructed for each feature and object class, and the ODR function may be constructed from the subset of Bayesian stumps that minimize Bayesian error for a particular object class. That is, Bayesian error may be utilized as a feature selection measure for the ODR function. Furthermore, Bayesian stumps may be efficiently implemented as lookup tables with entries corresponding to unequal intervals of feature histograms. Interval widths and entry values may be determined so as to minimize Bayesian error, yielding Bayesian stumps that are optimal in this respect.	04-16-2009
20090132213	METHOD FOR MODELING DATA STRUCTURES USING LOCAL CONTEXTS - A method for modeling data affinities and data structures. In one implementation, a contextual distance may be calculated between a selected data point in a data sample and a data point in a contextual set of the selected data point. The contextual set may include the selected data point and one or more data points in the neighborhood of the selected data point. The contextual distance may be the difference between the selected data point's contribution to the integrity of the geometric structure of the contextual set and the data point's contribution to the integrity of the geometric structure of the contextual set. The process may be repeated for each data point in the contextual set of the selected data point. The process may be repeated for each selected data point in the data sample. A digraph may be created using a plurality of contextual distances generated by the process.	05-21-2009
20090254539	User Intention Modeling For Interactive Image Retrieval - A system performs user intention modeling for interactive image retrieval. In one implementation, the system uses a three-stage iterative technique to retrieve images from a database without using any image tags or text descriptors. First, the user submits a query image and the system models the user's search intention and configures a customized search to retrieve relevant images. Second, the system extends a user interface for the user to designate visual features across the retrieved images. The designated visual features refine the intention model and reconfigure the search to retrieve images that match the remodeled intention. Third, the system extends another user interface through which the user can give natural feedback about the retrieved images. The three stages can be iterated to quickly assemble a set of images that accurately fulfills the user's search intention. The system can be used for image searching without text tags, can be used for initial text tag generation, or can be used to complement a conventional tagged-image platform.	10-08-2009
20090297046	Linear Laplacian Discrimination for Feature Extraction - An exemplary method for extracting discriminant features of samples includes providing data for samples in a multidimensional space; based on the data, computing local similarities for the samples; mapping the local similarities to weights; based on the mapping, formulating an inter-class scatter matrix and an intra-class scatter matrix; and based on the matrices, maximizing the ratio of inter-class scatter to intra-class scatter for the samples to provide discriminant features of the samples. Such a method may be used for classifying samples, recognizing patterns, or other tasks. Various other methods, devices, systems, etc., are also disclosed.	12-03-2009
20090313239	Adaptive Visual Similarity for Text-Based Image Search Results Re-ranking - Described is a technology in which images initially ranked by some relevance estimate (e.g., according to text-based similarities) are re-ranked according to visual similarity with a user-selected image. A user-selected image is received and classified into an intention class, such as a scenery class, portrait class, and so forth. The intention class is used to determine how visual features of other images compare with visual features of the user-selected image. For example, the comparing operation may use different feature weighting depending on which intention class was determined for the user-selected image. The other images are re-ranked based upon their computed similarity to the user-selected image, and returned as query results. Retuning of the feature weights using actual user-provided relevance feedback is also described.	12-17-2009
20100067799	GLOBALLY INVARIANT RADON FEATURE TRANSFORMS FOR TEXTURE CLASSIFICATION - A “globally invariant Radon feature transform,” or “GIRFT,” generates feature descriptors that are both globally affine invariant and illumination invariant. These feature descriptors effectively handle intra-class variations resulting from geometric transformations and illumination changes to provide robust texture classification. In general, GIRFT considers images globally to extract global features that are less sensitive to large variations of material in local regions. Geometric affine transformation invariance and illumination invariance is achieved by converting original pixel represented images into Radon-pixel images by using a Radon Transform. Canonical projection of the Radon-pixel image into a quotient space is then performed using Radon-pixel pairs to produce affine invariant feature descriptors. Illumination invariance of the resulting feature descriptors is then achieved by defining an illumination invariant distance metric on the feature space of each feature descriptor.	03-18-2010
20100076723	TENSOR LINEAR LAPLACIAN DISCRIMINATION FOR FEATURE EXTRACTION - Tensor linear Laplacian discrimination for feature extraction is disclosed. One embodiment comprises generating a contextual distance based sample weight and class weight, calculating a within-class scatter using the at least one sample weight and a between-class scatter for multiple classes of data samples in a sample set using the class weight, performing a mode-k matrix unfolding on scatters and generating at least one orthogonal projection matrix.	03-25-2010
20100080450	CLASSIFICATION VIA SEMI-RIEMANNIAN SPACES - Described is using semi-Riemannian geometry in supervised learning to learn a discriminant subspace for classification, e.g., labeled samples are used to learn the geometry of a semi-Riemannian submanifold. For a given sample, the K nearest classes of that sample are determined, along with the nearest samples that are in other classes, and the nearest samples in that sample's same class. The distances between these samples are computed, and used in computing a metric matrix. The metric matrix is used to compute a projection matrix that corresponds to the discriminant subspace. In online classification, as a new sample is received, it is projected into a feature space by use of the projection matrix and classified accordingly.	04-01-2010
20100121792	Directed Graph Embedding - Directed graph embedding is described. In one implementation, a system explores the link structure of a directed graph and embeds the vertices of the directed graph into a vector space while preserving affinities that are present among vertices of the directed graph. Such an embedded vector space facilitates general data analysis of the information in the directed graph. Optimal embedding can be achieved by measuring local affinities among vertices via transition probabilities between the vertices, based on a stationary distribution of Markov random walks through the directed graph. For classifying linked web pages represented by a directed graph, the system can train a support vector machine (SVM) classifier, which can operate in a user-selectable number of dimensions.	05-13-2010
20100135584	Image-Based Face Search - A search includes comparing a query image provided by a user to a plurality of stored images of faces stored in a stored image database, and determining a similarity of the query image to the plurality of stored images. One or more resultant images of faces, selected from among the stored images, are displayed to the user based on the determined similarity of the stored images to the query image provided by the user. The resultant images are displayed based at least in part on one or more facial features.	06-03-2010
20100303367	Determining Intensity Similarity in Low-Light Conditions Using the Poisson-Quantization Noise Model - A Poisson-quantization noise model for modeling noise in low-light conditions is described. In one aspect, image information is received. A Poisson-quantization noise model is then generated from a Poisson noise model and a quantization noise model. Poisson-quantization noise is then estimated in the image information using the Poisson-quantization noise model.	12-02-2010
20110206276	HYBRID GRAPH MODEL FOR UNSUPERVISED OBJECT SEGMENTATION - This disclosure describes an integrated framework for class-unsupervised object segmentation. The class-unsupervised object segmentation occurs by integrating top-down constraints and bottom-up constraints on object shapes using an algorithm in an integrated manner. The algorithm describes a relationship among object parts and superpixels. This process forms object shapes with object parts and oversegments pixel images into the superpixels, with the algorithm in conjunction with the constraints. This disclosure describes computing a mask map from a hybrid graph, segmenting the image into a foreground object and a background, and displaying the foreground object from the background.	08-25-2011
