Shipeng Li, Palo Alto US

Shipeng Li, Palo Alto, CA US

Patent application number	Description	Published
20090252146	CONTINUOUS NETWORK CODING IN WIRELESS RELAY NETWORKS - Described is continuous network coding, in which a relay sends probability data comprising a continuous number for use as parity data. The node receives streams of bits sent from sources towards a destination, and computes the probability data based on current noise data and/or fading data. A selected set of the bits (all or some subset thereof) are combined, e.g., XOR-ed or concatenated, and sent to the destination. Phase modulation is performed to convey probability information based on the probability data. The destination demodulates the signal to obtain the probability information, and combines the probability information with the data directly received from sources to perform joint decoding. The number of bits in the set of selected bits may be adaptively chosen based on current channel conditions, e.g., increased when the channel conditions from the sources directly to a destination are poor relative to the channel conditions via the relay.	10-08-2009
20090292685	VIDEO SEARCH RE-RANKING VIA MULTI-GRAPH PROPAGATION - A video search re-ranking via multi-graph propagation technique employing multimodal fusion in video search is presented. It employs not only textual and visual features, but also semantic and conceptual similarity between video shots to rank or re-rank the search results received in response to a text-based search query. In one embodiment, the technique employs an object-sensitive approach to query analysis to improve the baseline result of text-based video search. The technique then employs a graph-based approach to text-based search result ranking or re-ranking. To better exploit the underlying relationship between video shots, the re-ranking scheme simultaneously leverages textual relevancy, semantic concept relevancy, and low-level-feature-based visual similarity. The technique constructs a set of graphs with the video shots as vertices, and the conceptual and visual similarity between video shots as hyperlinks. A modified topic-sensitive PageRank algorithm is then applied to these graphs to determine the overall relevancy ranking.	11-26-2009
20090310854	Multi-Label Multi-Instance Learning for Image Classification - Described is a technology by which an image is classified (e.g., grouped and/or labeled), based on multi-label multi-instance data learning-based classification according to semantic labels and regions. An image is processed in an integrated framework into multi-label multi-instance data, including region and image labels. The framework determines local association data based on each region of an image. Other multi-label multi-instance data is based on relationships between region labels of the image, relationships between image labels of the image, and relationships between the region and image labels. These data are combined to classify the image. Training is also described.	12-17-2009
20100076923	ONLINE MULTI-LABEL ACTIVE ANNOTATION OF DATA FILES - Online multi-label active annotation may include building a preliminary classifier from a pre-labeled training set included with an initial batch of annotated data samples, and selecting a first batch of sample-label pairs from the initial batch of annotated data samples. The sample-label pairs may be selected by using a sample-label pair selection module. The first batch of sample-label pairs may be provided to online participants to manually annotate the first batch of sample-label pairs based on the preliminary classifier. The preliminary classifier may be updated to form a first updated classifier based on an outcome of the providing the first batch of sample-label pairs to the online participants.	03-25-2010
20100106671	Comprehensive Human Computation Framework - Technologies for a human computation framework suitable for answering common sense questions that are difficult for computers to answer but easy for humans to answer. The technologies support solving general common sense problems without a priori knowledge of the problems; support for determining whether an answer is from a bot or human so as to screen out spurious answers from bots; support for distilling answers collected from human users to ensure high quality solutions to the questions asked; and support for preventing malicious elements in or out of the system from attacking other system elements or contaminating the solutions produced by the system, and preventing users from being compensated without contributing answers.	04-29-2010
20100241663	PROVIDING CONTENT ITEMS SELECTED BASED ON CONTEXT - Systems, methods, and computer storage media having computer-executable instructions embodied thereon that provide content items selected based on context are provided. Contextual indicators associated with a user are identified and utilized to determine one or more content items that the user is likely to desire to access at a particular point in time. Upon receiving an indication that the user desires to perform a context-aware search, the identified content items (or references thereto) are presented automatically to the user, that is, without the user having to input any search query terms. The indication that the user desires to perform a context-aware search may be received, for instance, upon receiving an indication that a selectable context-aware search button has been selected by the user. This single-button action is particularly useful for mobile computing devices, wherein alpha-numeric textual input is relatively difficult.	09-23-2010
20100284625	Computing Visual and Textual Summaries for Tagged Image Collections - Described is a technology for computing visual and textual summaries for tagged image collections. Heterogeneous affinity propagation is used to jointly identify both visual and textual exemplars. The heterogeneous affinity propagation finds the exemplars for relational heterogeneous data (e.g., images and words) by considering the relationships (e.g., similarities) within pairs of images, pairs of words, and relationships of words to images (affinity) in an integrated manner.	11-11-2010
20110191336	CONTEXTUAL IMAGE SEARCH - Techniques for image search using contextual information related to a user query are described. A user query including at least one of textual data or image data from a collection of data displayed by a computing device is received from a user. At least one other subset of data selected from the collection of data is received as contextual information that is related to and different from the user query. Data files such as image files are retrieved and ranked based on the user query to provide a pre-ranked set of data files. The pre-ranked data files are then ranked based on the contextual information to provide a re-ranked set of data files to be displayed to the user.	08-04-2011
20110196859	Visual Search Reranking - An initial ranked list of a first plurality of visual documents is obtained from a first source in response to a query, and a second plurality of visual documents relevant to the query is gathered from a plurality of second sources. Visual patterns identified from the second plurality of visual documents are compared with the first visual documents for reranking the first visual documents.	08-11-2011
20110208716	Image-Based CAPTCHA Exploiting Context in Object Recognition - Techniques for an image-based CAPTCHA for object recognition are described. The disclosure describes adding images to a database by collecting images by querying descriptive keywords to an image search engine or crawling images from the Internet.	08-25-2011
20110264700	ENRICHING ONLINE VIDEOS BY CONTENT DETECTION, SEARCHING, AND INFORMATION AGGREGATION - Many internet users consume content through online videos. For example, users may view movies, television shows, music videos, and/or homemade videos. It may be advantageous to provide additional information to users consuming the online videos. Unfortunately, many current techniques may be unable to provide additional information relevant to the online videos from outside sources. Accordingly, one or more systems and/or techniques for determining a set of additional information relevant to an online video are disclosed herein. In particular, visual, textual, audio, and/or other features may be extracted from an online video (e.g., original content of the online video and/or embedded advertisements). Using the extracted features, additional information (e.g., images, advertisements, etc.) may be determined based upon matching the extracted features with content of a database. The additional information may be presented to a user consuming the online video.	10-27-2011
20110267544	NEAR-LOSSLESS VIDEO SUMMARIZATION - Described is perceptually near-lossless video summarization for use in maintaining video summaries, which operates to substantially reconstruct an original video in a generally perceptually near-lossless manner. A video stream is summarized with little information loss by using a relatively very small piece of summary metadata. The summary metadata comprises an image set of synthesized mosaics and representative keyframes, audio data, and the metadata about video structure and motion. In one implementation, the metadata is computed and maintained (e.g., as a file) to summarize a relatively large video sequence, by segmenting a video shot into subshots, and selecting keyframes and mosaics based upon motion data corresponding to those subshots. The motion data is maintained as a semantic description associated with the image set. To reconstruct the video, the metadata is processed, including simulating motion using the image set and the semantic description, which recovers the audiovisual content without any significant information loss.	11-03-2011
20110288929	Enhancing Photo Browsing through Music and Advertising - Techniques for recommending music and advertising to enhance a user's experience while photo browsing are described. In some instances, songs and ads are ranked for relevance to at least one photo from a photo album. The songs, ads and photo(s) from the photo album are then mapped to a style and mood ontology to obtain vector-based representations. The vector-based representations can include real valued terms, each term associated with a human condition defined by the ontology. A re-ranking process generates a relevancy term for each song and each ad indicating relevancy to the photo album. The relevancy terms can be calculated by summing weighted terms from the ranking and the mapping. Recommended music and ads may then be provided to a user, as the user browses a series of photos obtained from the photo album. The ads may be seamlessly embedded into the music in a nonintrusive manner.	11-24-2011
20120095825	Incentive Selection of Region-of-Interest and Advertisements for Image Advertising - Techniques for image selection and region of interest analysis are described herein. A pair of two or more users is configured, and an image is displayed to the pair. The image can be a still image (i.e., a picture) or a moving image (i.e., video). In some instances, a plurality of advertisements is suggested for possible association with the image. Input is received from both users in the pair, indicating a positive or a negative association between each advertisement and the image. When the pair positively rates an advertisement, the advertisement is associated with the image. A plurality of regions of interest within the image may be suggested. In response, positive or negative input is received from the pair indicating whether each of the plurality of regions of interest is appropriately suggested for placement of an advertisement.	04-19-2012
20120109754	SPONSORED MULTI-MEDIA BLOGGING - The sponsored multi-media blogging technique is an advertising-driven service on a computing device, such as a mobile phone, that makes the multi-media micro-blog or blog an effective carrier for advertising. The data collected while employing the sponsored multi-media blogging technique is used for user intent mining and increasing advertisement relevance for mobile advertising projects. The benefits to the sponsored multi-media blogging technique's users are a natural interface for composing multi-media micro-blogs/blogs and instant experience sharing, while the benefit to advertisers is the promoted brand impression from the contextual advertising in rich media micro-blogs/blogs.	05-03-2012
20120110432	Tool for Automated Online Blog Generation - Techniques for the design and operation of a blogging tool for automated blog creation and automated upload to a server are described herein. A content capturing process may obtain a plurality of images, including still images or video, as well as audio capture of voices and other sound, according to direction of a user operating an image-capture device. One or more of the images may be annotated with metadata or with text, which may be derived from verbal content provided by the user. A template may be selected in either an automated or user-controlled manner. The images and other content may be assembled into the template to form a blog entry. The blog entry may be uploaded to a server or otherwise shared. In one example, the uploading may be in response to a single user command, obtained by operation of a physical user interface or from verbal user input.	05-03-2012
20120150871	Autonomous Mobile Blogging - An autonomous blog engine is implemented to enable the autonomous generation of a blog. The autonomous blog engine receives media objects that are captured by an electronic device during a trip session. The autonomous blog engine determines a place of interest based on photographs selected from the media objects. The autonomous blog engine then generates textual content using one or more pre-stored knowledge items that include information on the place of interest. The autonomous blog engine further autonomously publishes a blog entry on the place of interest that includes one or more photographs from the photograph cluster and the textual content.	06-14-2012
20120263433	Detecting Key Roles and Their Relationships from Video - Tools and techniques for acquiring key roles and their relationships from a video independent of metadata, such as cast lists and scripts, are described herein. These techniques include discovering key roles and their relationships by treating a video (e.g., a movie, television program, music video, or personal video) as a community. For instance, a video is segmented into a hierarchical structure that includes levels for scenes, shots, and key frames. In some implementations, the techniques include performing face detection and grouping on the detected key frames. In some implementations, the techniques include exploiting the key roles and their correlations in this video to discover a community. The discovered community provides for a wide variety of applications, including the automatic generation of visual summaries or video posters including acquired key roles.	10-18-2012
20120265802	Using A Proxy Server For A Mobile Browser - Techniques describe providing a web page for a proxy-based browser on a mobile device to enhance user experience. A proxy server receives a layout of the web page, extracts web elements from the web page, and captures images of the web elements of the web page. The web elements are incorporated with a background screen image to form a composite screen format to represent a display of the web page. The background screen image is compressed by splitting an encoded frame into fixed-size slices and splitting a previous screen frame into fixed-size slices. The proxy server provides the web page synchronized with the mobile device based on the composite screen format and the compressed background screen image. Furthermore, the proxy server receives input from a user to provide updates to web elements that are dynamic on the web page to be displayed on the screen of the mobile device.	10-18-2012
20120271833	HYBRID NEIGHBORHOOD GRAPH SEARCH FOR SCALABLE VISUAL INDEXING - A hybrid search method may be used to identify information responsive to a query. A search may be performed utilizing a neighborhood graph and a partitioning tree. The partitioning tree may be searched to select one or more pivots that may be used to guide a subsequent search in the neighborhood graph. Once the search in the neighborhood graph is unable to identify nearest neighbors in closer proximity to the query, the search may be switched to the partitioning tree. The partitioning tree may then be searched to select pivots that may be used to guide subsequent searches in the neighborhood graph. The searches performed in the partitioning tree and/or the neighborhood graph may be conducted utilizing an iterative algorithm.	10-25-2012
20120294520	GESTURE-BASED VISUAL SEARCH - A user may perform an image search on an object shown in an image. The user may use a mobile device to display an image. In response to displaying the image, the client device may send the image to a visual search system for image segmentation. Upon receiving a segmented image from the visual search system, the client device may display the segmented image to the user who may select one or more segments including an object of interest to instantiate a search. The visual search system may formulate a search query based on the one or more selected segments and perform a search using the search query. The visual search system may then return search results to the client device for display to the user.	11-22-2012
20120295640	User Behavior Model for Contextual Personalized Recommendation - A user behavior model provides personalized recommendations based in part on time and location, particularly to users of mobile devices. Entity types are ranked according to relevance to the user. Example entity types are restaurant, hotel, etc. The relevance may be based on reference to a large-scale database containing queries from other users. Additionally, entities within each entity type may be ranked based on relevance to the user and the time and location context. A user interface may display a ranked list of entity types, such as restaurant, hotel, etc., wherein each entity type is represented by a highest-ranked entity within the entity type. Thus, the user interface may display a highest-ranked restaurant, a highest-ranked hotel, etc. Upon user selection of one such entity type, the user interface is replaced with a second user interface, for example showing a ranked hierarchy of restaurants, headed by the highest-ranked restaurant.	11-22-2012
20120297038	Recommendations for Social Network Based on Low-Rank Matrix Recovery - Techniques describe analyzing users and groups of a social network to identify user interests and providing recommendations for a user based on the user's identified interests. A content-awareness application obtains a collection of images and tags associated with the images belonging to members in the social network. The content-awareness application decomposes the members into a representative matrix to identify users and groups in order to calculate a similarity matrix between the users and their images based on a visual content of the images and a textual content of the tags. The content-awareness application further constructs a graph Laplacian over the users and the groups to align with the representative matrix based at least in part on the similarity matrix and further provides recommendations of groups for a user to join in the social network based at least in part on the graph Laplacian identifying the user's interests.	11-22-2012
20120306769	MULTI-TOUCH TEXT INPUT - This document describes tools associated with symbol entry control functions. In some implementations, the tools identify a first finger that is in tactile contact with a touch screen. The first finger can select a subset of symbols from a plurality of symbols that can be entered via the touch screen. The tools can also identify whether one or more other fingers are in concurrent tactile contact with the first finger on the touch screen. The tools can select an individual symbol from the subset based on whether the one or more other fingers are in concurrent tactile contact with the first finger on the touch screen.	12-06-2012
20120323948	DIALOG-ENHANCED CONTEXTUAL SEARCH QUERY ANALYSIS - Embodiments of the present invention relate to systems, methods, and computer-storage media for a method of contextually analyzing terms within a search query. In one embodiment, a received search query is classified into a domain category. Additionally, information is assigned to a schema associated with the domain by analyzing the search query. Further, at least one search result that helps a user complete a task within the domain is provided based on the information in the schema.	12-20-2012
20130060726	COMPREHENSIVE HUMAN COMPUTATION FRAMEWORK - Technologies for a human computation framework suitable for answering common sense questions that are difficult for computers to answer but easy for humans to answer. The technologies support solving general common sense problems without a priori knowledge of the problems; support for determining whether an answer is from a bot or human so as to screen out spurious answers from bots; support for distilling answers collected from human users to ensure high quality solutions to the questions asked; and support for preventing malicious elements in or out of the system from attacking other system elements or contaminating the solutions produced by the system, and preventing users from being compensated without contributing answers.	03-07-2013
20130298195	Image-Based CAPTCHA Exploiting Context in Object Recognition - Techniques for an image-based CAPTCHA for object recognition are described. The disclosure describes adding images to a database by collecting images by querying descriptive keywords to an image search engine or crawling images from the Internet.	11-07-2013
20140003714	GESTURE-BASED VISUAL SEARCH	01-02-2014
20140044349	CONTEXTUAL DOMINANT COLOR NAME EXTRACTION - Dominant color names may be extracted from an image by analyzing spatial-context of pixels contained in the image. A dominant color region may be defined by taking a double-threshold approach that addresses ambiguous color regions and a degree of confidence that each pixel belongs in the dominant color region. Affiliation maps and binary maps may be used to generate the dominant color region. Images may be converted to a saliency map, from which a region of interest may be assigned a dominant color name. Image search results may be filtered by the dominant color name associated with the image.	02-13-2014
20140053054	Cooperative Web Browsing Using Multiple Devices - A proxy-based thin-client web browsing framework enables cooperative web browsing of multiple devices. The multiple devices may include devices that are not intended for web browsing and have limited or no web browsers and/or user input capabilities. The proxy-based thin client web browsing framework employs a virtual browser at a proxy server to perform all browser-engine logics, and retrieve, render and encode web pages on behalf of the multiple devices. The multiple devices therefore only need to have limited decoding and display capabilities to perform web browsing. The proxy-based thin client web browsing framework further includes a touch controller as a remote controller for a device that has no or limited user texting or manipulating capabilities.	02-20-2014
20140075393	Gesture-Based Search Queries - An image-based text extraction and searching system extracts an image selected by gesture input by a user, along with the associated image data and proximate textual data, in response to the image selection. Extracted image data and textual data can be utilized to perform or enhance a computerized search. The system can determine one or more database search terms based on the textual data and generate at least a first search query proposal related to the image data and the textual data.	03-13-2014
20140122729	HOME CLOUD WITH VIRTUALIZED INPUT AND OUTPUT ROAMING OVER NETWORK - A home cloud computing system employs a virtualization system to virtualize data of a device and adaptively transform type or format of the virtualized data for one or more other devices, thus leveraging resources of the device for the one or more other devices. Through data virtualization and adaptive transformation, devices of heterogeneous types are seamlessly connected to one another and can act as input or output devices for each other to create a home cloud network of devices.	05-01-2014
20140244614	Cross-Domain Topic Space - Some examples include receiving a microblog entry from a social stream domain. Further, some implementations include determining, based on a topic space associated with the social stream domain and a media domain, a topic that is associated with the microblog entry. Some implementations include determining, based on the topic space, one or more media items that are associated with the topic.	08-28-2014
20140254922	Salient Object Detection in Images via Saliency - An input image, which may include a salient object, is received by a salient object detection and localization system. The system may be trained to detect whether the input image includes a salient object. If the system fails to detect a salient object in the input image, the system may provide the sender of the input with a null result or an indication that the input image does not contain a salient object. If the system detects a salient object in the input image, the system may localize the salient object within the input image. The system may generate an output image based at least in part on the localization of the salient object. The system may provide the sender of the input image with information pertaining to the detected salient object.	09-11-2014
20140258295	Approximate K-Means via Cluster Closures - A set of data points is divided into a plurality of subsets of data points. A set of cluster closures is generated based at least in part on the subsets of data points. Each cluster closure envelops a corresponding cluster of a set of clusters and is comprised of data points of the enveloped cluster and data points neighboring the enveloped cluster. A k-Means approximator iteratively assigns data points to a cluster of the set of clusters and updates a set of cluster centroids corresponding to the set of clusters. The k-Means approximator assigns data points based at least in part on the set of cluster closures.	09-11-2014
20140289228	USER BEHAVIOR MODEL FOR CONTEXTUAL PERSONALIZED RECOMMENDATION - A user behavior model provides personalized recommendations based in part on time and location, particularly to users of mobile devices. Entity types are ranked according to relevance to the user. Example entity types are restaurant, hotel, etc. The relevance may be based on reference to a large-scale database containing queries from other users. Additionally, entities within each entity type may be ranked based on relevance to the user and the time and location context. A user interface may display a ranked list of entity types, such as restaurant, hotel, etc., wherein each entity type is represented by a highest-ranked entity within the entity type. Thus, the user interface may display a highest-ranked restaurant, a highest-ranked hotel, etc. Upon user selection of one such entity type, the user interface is replaced with a second user interface, for example showing a ranked hierarchy of restaurants, headed by the highest-ranked restaurant.	09-25-2014
20140314316	IMAGE COMPRESSION BASED ON PARAMETER-ASSISTED INPAINTING - Systems and methods provide image compression based on parameter-assisted inpainting. In one implementation of an encoder, an image is partitioned into blocks and the blocks classified as smooth or unsmooth, based on the degree of visual edge content and chromatic variation in each block. Image content of the unsmooth blocks is compressed, while image content of the smooth blocks is summarized by parameters, but not compressed. The parameters, once obtained, may also be compressed. At a decoder, the compressed image content of the unsmooth blocks and the compressed parameters of the smooth blocks are each decompressed. Each smooth block is then reconstructed by inpainting, guided by the parameters in order to impart visual detail from the original image that cannot be implied from the image content of neighboring blocks that have been decoded.	10-23-2014
20140354768	Socialized Mobile Photography - A system, method, or computer-readable storage device enables mobile devices to capture high-quality photos by using both the rich context available from mobile devices and crowd-sourced social media on the Web. Considering the flexible and adaptive adoption of photography principles with different content and context, composition rules and exposure principles are learned from community-contributed images. Leveraging a mobile device user's scene context and social context, the proposed socialized mobile photography system is able to suggest an optimal view enclosure to achieve appealing composition. Because of the complex scene content and the number of shooting-related contexts that affect exposure parameters, exposure learning is applied to suggest appropriate camera parameters.	12-04-2014
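
The abstract of application 20090252146 mentions that a relay may combine bits from several sources, e.g., by XOR-ing them, before forwarding them as parity. The following is a minimal hard-decision sketch of that single step only; it does not model the continuous probability data, phase modulation, or joint decoding that the abstract describes, and the function names and sample bit streams are hypothetical.

```python
def relay_parity(bits_a, bits_b):
    """XOR two equal-length bit streams, as a relay might before forwarding (illustrative)."""
    if len(bits_a) != len(bits_b):
        raise ValueError("streams must have equal length")
    return [a ^ b for a, b in zip(bits_a, bits_b)]


def recover(known_bits, parity_bits):
    """Recover the other source's stream from one known stream plus the XOR parity."""
    return [k ^ p for k, p in zip(known_bits, parity_bits)]


# Hypothetical sample data: two sources transmit toward one destination.
source_a = [1, 0, 1, 1]
source_b = [0, 0, 1, 0]

parity = relay_parity(source_a, source_b)

# If the destination received source_a directly but lost source_b,
# the relay's parity is enough to reconstruct source_b.
recovered_b = recover(source_a, parity)
```

Because XOR is its own inverse, the same operation serves for both combining and recovery; a soft-decision scheme such as the one in the abstract would replace the hard bits with probabilities.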
