Inventors list

Assignees list

Classification tree browser

Top 100 Inventors

Top 100 Assignees

Yinglian Xie, Cupertino US

Yinglian Xie, Cupertino, CA US

Patent application number	Description	Published
20080320119	Automatically identifying dynamic Internet protocol addresses - Dynamic IP addresses may be automatically identified and their dynamics patterns may be analyzed. Multi-user IP address blocks are determined as candidates for further analysis. An entropy score is determined for each IP address in every candidate block to distinguish between a dynamic IP and a static IP shared by multiple users. IP addresses with high entropy scores are grouped, and then analyzed, and may be used in various applications, such as spam filtering.	12-25-2008
20090138937	ENHANCED SECURITY AND PERFORMANCE OF WEB APPLICATIONS - A client-side enforcement mechanism may allow application security policies to be specified at a server in a programmatic manner. Servers may specify security policies as JavaScript functions included in a page returned by the server and run before other scripts. At runtime, and during initial loading, the functions are invoked by the client on each page modification to ensure the page conforms to the security policy. As such, before a mutation takes effect, the policy may transform that mutation and the code and data of the page. Replicated code execution may take place at both the client and the server where the server runs its own shadow copy of a client-side application in a trusted execution environment so that the server may check that the method calls coming from the client correspond to a correct execution of the client-side application The redundant execution at the client can be untrusted, but serves to improve the responsiveness and performance of the Web application.	05-28-2009
20090249480	MINING USER BEHAVIOR DATA FOR IP ADDRESS SPACE INTELLIGENCE - The claimed subject matter is directed to mining user behavior data for increasing Internet Protocol (“IP”) space intelligence. Specifically, the claimed subject matter provides a method and system of mining user behavior within an IP address space and the application of the IP address space intelligence derived from the mined user behavior.	10-01-2009
20090254989	CLUSTERING BOTNET BEHAVIOR USING PARAMETERIZED MODELS - Identification and prevention of email spam that originates from botnets may be performed by finding similarity in their host property and behavior patterns using a set of labeled data. Clustering models of host properties pertaining to previously identified and appropriately tagged botnet hosts may be learned. Given labeled data, each botnet may be examined individually and a clustering model learned to reflect upon a set of selected host properties. Once a model has been learned for every botnet, clustering behavior may be used to look for host properties that fit into a profile. Such traffic can be either discarded or tagged for subsequent analysis and can also be used to profile botnets preventing them from launching other attacks. In addition, models of individual botnets can be further clustered to form superclusters, which can help understand botnet behavior and detect future attacks.	10-08-2009
20090265786	AUTOMATIC BOTNET SPAM SIGNATURE GENERATION - A framework may be used for generating URL signatures to identify botnet spam and membership. The framework may take a set of unlabeled emails as input that are grouped based on URLs contained within the emails. The framework may return a set of spam URL signatures and a list of corresponding botnet host IP addresses by analyzing the URLs within the emails that are contained within the groups. Each URL signature may be in the form of either a complete URL string or a URL regular expression. The signatures may be used to identify spam emails launched from botnets, while the knowledge of botnet host identities can help filter other spam emails also sent by them.	10-22-2009
20100095374	GRAPH BASED BOT-USER DETECTION - Computer implemented methods are disclosed for detecting bot-user groups that send spam email over a web-based email service. Embodiments of the present system employ a two-prong approach to detecting bot-user groups. The first prong employs a historical-based approach for detecting anomalous changes in user account information, such as aggressive bot-user signups. The second prong of the present system entails constructing a large user-user relationship graph, which identifies bot-user sub-graphs through finding tightly connected subgraph components.	04-15-2010
20100223499	FINGERPRINTING EVENT LOGS FOR SYSTEM MANAGEMENT TROUBLESHOOTING - A technique for automatically detecting and correcting configuration errors in a computing system. In a learning process, recurring event sequences, including e.g., registry access events, are identified from event logs, and corresponding rules are developed. In a detecting phase, the rules are applied to detected event sequences to identify violations and to recover from failures. Event sequences across multiple hosts can be analyzed. The recurring event sequences are identified efficiently by flattening a hierarchical sequence of the events such as is obtained from the Sequitur algorithm. A trie is generated from the recurring event sequences and edges of nodes of the trie are marked as rule edges or non-rule edges. A rule is formed from a set of nodes connected by rule edges. The rules can be updated as additional event sequences are analyzed. False positive suppression policies include a violation- consistency policy and an expected event disappearance policy.	09-02-2010
20100312877	HOST ACCOUNTABILITY USING UNRELIABLE IDENTIFIERS - An IP (Internet Protocol) address is a directly observable identifier of host network traffic in the Internet and a host's IP address can dynamically change. Analysis of traffic (e.g., network activity or application request) logs may be performed and a host tracking graph may be generated that shows hosts and their bindings to IP addresses over time. A host tracking graph may be used to determine host accountability. To generate a host tracking graph, a host is represented. Host representations may be application-dependent. In an implementation, application-level identifiers (IDs) such as user email IDs, messenger login IDs, social network IDs, or cookies may be used. Each identifier may be associated with a human user. These unreliable IDs can be used to track the activity of the corresponding hosts.	12-09-2010
20110208714	LARGE SCALE SEARCH BOT DETECTION - A framework may be used for identifying low-rate search bot traffic within query logs by capturing groups of distributed, coordinated search bots. Search log data may be input to a history-based anomaly detection engine to determine if query-click pairs associated with a query are suspicious in view of historical query-click pairs for the query. Users associated with suspicious query-click pairs may be input to a matrix-based bot detection engine to determine correlations between queries submitted by the users. Those users indicating strong correlations may be categorized as bots, whereas those who do not may be categorized as part of flash crowd traffic.	08-25-2011
20110283360	IDENTIFYING MALICIOUS QUERIES - A framework identifies malicious queries contained in search logs to uncover relationships between the malicious queries and the potential attacks launched by attackers submitting the malicious queries. A small seed set of malicious queries may be used to identify an IP address in the search logs that submitted the malicious queries. The seed set may be expanded by examining all queries in the search logs submitted by the identified IP address. Regular expressions may be generated from the expanded set of queries and used for detecting yet new malicious queries. Upon identifying the malicious queries, the framework may be used to detect attacks on vulnerable websites, spamming attacks, and phishing attacks.	11-17-2011
20120102169	AUTOMATIC IDENTIFICATION OF TRAVEL AND NON-TRAVEL NETWORK ADDRESSES - A system to automatically classify types of IP addresses associated with a user. Information, such as user names, machine information, IP address, etc., may be obtained from logs. For each user or host in the logs, home IP addresses are identified from IP addresses where the user or host shows a predetermined level of activity. Travel IP addresses are identified, which are IP addresses at locations greater than a predetermined distance from the home IP addresses, as determined from geolocation data. A pattern analysis may be performed to determine which of the home IP addresses are work IP addresses associated with the user or host. The system may thus provide a classification of a user's or host's associated IP addresses as being one of travel, home, and work IP addresses. From this classification, mobility patterns may be derived, as well as applications to enhance security, advertising, search and network management.	04-26-2012
20120246720	USING SOCIAL GRAPHS TO COMBAT MALICIOUS ATTACKS - Detection of user accounts associated with spammer attacks may be performed by constructing a social graph of email users. Biggest connected components (BCC) of the social graph may be used to identify legitimate user accounts, as the majority of the users in the biggest connected components are legitimate users. BCC users may be used to identify more legitimate users. Using degree-based detection techniques and PageRank based detection techniques, the hijacked user accounts and spammer user accounts may be identified. The users' email sending and receiving behaviors may also be examined, and the subgraph structure may be used to detect stealthy attackers. From the social graph analysis, legitimate user accounts, malicious user accounts, and compromised user accounts can be identified.	09-27-2012
20120304287	AUTOMATIC DETECTION OF SEARCH RESULTS POISONING ATTACKS - Search result poisoning attacks may be automatically detected by identifying groups of suspicious uniform resource locators (URLs) containing multiple keywords and exhibiting patterns that deviate from other URLs in the same domain without crawling and evaluating the actual contents of each web page. Suspicious websites are identified and lexical features are extracted for each such website. The websites are clustered based on their lexical features, and group analysis is performed on each group to identify at least one suspicious group. Other implementations are directed to detecting a search engine optimization (SEO) attack by processing a large population of URLs to identify suspicious URLs based on the presence of a subset of keywords in each URL and the relative newness of each URL.	11-29-2012
20130152057	OPTIMIZING DATA PARTITIONING FOR DATA-PARALLEL COMPUTING - A data partitioning plan is automatically generated that—given a data-parallel program and a large input dataset, and without having to first run the program on the input dataset—substantially optimizes performance of the distributed execution system that explicitly measures and infers various properties of both data and computation to perform cost estimation and optimization. Estimation may comprise inferring the cost of a candidate data partitioning plan, and optimization may comprise generating an optimal partitioning plan based on the estimated costs of computation and input/output.	06-13-2013
20130185791	VOUCHING FOR USER ACCOUNT USING SOCIAL NETWORKING RELATIONSHIP - Trusted user accounts of an application provider are determined. Graphs, such as trees, are created with each node corresponding to a trusted account. Each of the nodes is associated with a vouching quota, or the nodes may share a vouching quota. Untrusted user accounts are determined. For each of these untrusted accounts, a trusted user account that has a social networking relationship is determined. If the node corresponding to the trusted user account has enough vouching quota to vouch for the untrusted user account, then the quota is debited, a node is added for the untrusted user account to the graph, and the untrusted user account is vouched for. If not, available vouching quota may be borrowed from other nodes in the graph.	07-18-2013
20130339158	DETERMINING LEGITIMATE AND MALICIOUS ADVERTISEMENTS USING ADVERTISING DELIVERY SEQUENCES - Known legitimate and malicious display advertisements are selected, and the ordered sequence of entities involved in the delivery of each display advertisement is observed and used to generate advertisement delivery sequences. The entities include the various servers, publishers, and advertising networks that are involved in the delivery of a display advertisement. Attributes of the entities in each sequence are determined and used to generate a set of rules that identify a display advertisement as legitimate or malicious based on the attributes of the advertising delivery sequence associated with the delivery of the display advertisement. The generated rules are used to identify possible malicious advertisements, and to identify one or more sources of malicious display advertisements.	12-19-2013

Patent applications by Yinglian Xie, Cupertino, CA US