Andrey Balmin

Andrey Balmin, San Jose, CA US

Patent application number	Description	Published
20090006314	INDEX EXPLOITATION - Various embodiments of a computer-implemented method, computer program product, and data processing system are provided that generate an index plan that produces a superset of data comprising the query result. In some embodiments, a computer-implemented method, computer program product, and data processing system produce a maximal-index-satisfiable query tree.	01-01-2009
20090006447	BETWEEN MATCHING - Various embodiments of a computer-implemented method, computer program product, and data processing system are provided that identify a range filter in a mark-up language query. In response to receiving a query of at least one mark-up language document, the query comprising a plurality of singleton filters, at least one group of the plurality of singleton filters is identified. Each group comprises at least two singleton filters, wherein each group is equivalent to a range filter having a start value and a stop value. The start value and stop value are based on at least two singleton filters of each group. A query plan is generated to process the query based on, at least in part, a range defined by the start value and the stop value of the at least two singleton filters of each group.	01-01-2009
20090063399	INDEX SELECTION FOR XML DATABASE SYSTEMS - A method, system, and computer program product for selecting indexes to be created over XML data are provided. The method, system, and computer program product provide for receiving a workload for the XML data, the workload including one or more database statements, and utilizing an optimizer to recommend a set of one or more path expressions based on the workload received, wherein the set of one or more path expressions is to be used to create one or more indexes over the XML data.	03-05-2009
20090240682	GRAPH SEARCH SYSTEM AND METHOD FOR QUERYING LOOSELY INTEGRATED DATA - A system, method and computer program product for executing a query on linked data sources. Embodiments of the invention generate an instance graph expressing relationships between objects in the linked data sources and receive a query including at least first and second search terms. The first search term is then executed on the instance graph and a summary graph is generated using the results of the executing step. A second search term is then executed on the summary graph.	09-24-2009
20090259641	OPTIMIZATION OF EXTENSIBLE MARKUP LANGUAGE PATH LANGUAGE (XPATH) EXPRESSIONS IN A DATABASE MANAGEMENT SYSTEM CONFIGURED TO ACCEPT EXTENSIBLE MARKUP LANGUAGE (XML) QUERIES - An apparatus, system, and method are disclosed for optimization of XPath expressions in a database management system configured to accept XML queries. Operations of the method include receiving an XQuery representation and partitioning XPath expressions within the XQuery representation into a plurality of XPath expression clusters. The XPath expression clusters may comprise one or more XPath expressions and those in each cluster may operate on a common document. Furthermore, the XPath expressions in each cluster are hierarchically related to each other such that branch nodes of the cluster are executable independent of nodes in other XPath expression clusters. The method also defines merging the one or more XPath expressions into one or more expression trees for each XPath expression cluster. The method generates one or more query execution plans from the one or more XPath expression blocks. The method includes, for each query execution plan, splitting each of the XPath expression blocks into one or more ordered fragments. The method determines a cardinality according to database statistics and an execution cost for each XPath expression block within each query execution plan. Finally, the method determines an aggregate cardinality for each query execution plan and an aggregate execution cost for each query execution plan. Therefore, an XQuery may be optimized at both the global XQuery and local XPath expression block level, improving performance and reducing overhead.	10-15-2009
20100223266	SCALING DYNAMIC AUTHORITY-BASED SEARCH USING MATERIALIZED SUBGRAPHS - According to one embodiment of the present invention, a method for processing a query is provided. The method includes generating a set of pre-computed materialized sub-graphs from a dataset and receiving a search query having one or more search query terms. A particular one of the pre-computed materialized sub-graphs is accessed and a dynamic authority-based keyword search is executed on the particular one of the pre-computed materialized sub-graphs. Nodes in the dataset are then retrieved based on the executing, and a response to the search query is provided which includes the retrieved nodes.	09-02-2010
20100223268	Searching Digital Information and Databases - This application describes methods for searching digital information such as digital documents (e.g., web pages) and computer databases, and specific search techniques such as authority ranking and information retrieval (IR) relevance ranking in keyword searches. In some implementations, the technique includes analyzing digital information viewed as a labeled graph, including nodes and edges, based on a flow of authority among the nodes along the edges, the flow of authority being derived at least in part from different authority transfer rates assigned to the edges based on edge type schema information. In some implementations, the system includes an object rank module configured to generate multiple initial rankings corresponding to multiple query keywords, each of the multiple initial rankings indicating authority of nodes in a graph with respect to each respective query keyword individually; and a query module configured to combine the multiple initial rankings in response to a query.	09-02-2010
20120304186	Scheduling Mapreduce Jobs in the Presence of Priority Classes - Techniques for scheduling one or more MapReduce jobs in a presence of one or more priority classes are provided. The techniques include obtaining a preferred ordering for one or more MapReduce jobs, wherein the preferred ordering comprises one or more priority classes, prioritizing the one or more priority classes subject to one or more dynamic minimum slot guarantees for each priority class, and iteratively employing a MapReduce scheduler, once per priority class, in priority class order, to optimize performance of the one or more MapReduce jobs.	11-29-2012
20120304188	Scheduling Flows in a Multi-Platform Cluster Environment - Techniques for scheduling multiple flows in a multi-platform cluster environment are provided. The techniques include partitioning a cluster into one or more platform containers associated with one or more platforms in the cluster, scheduling one or more flows in each of the one or more platform containers, wherein the one or more flows are created as one or more flow containers, scheduling one or more individual jobs into the one or more flow containers to create a moldable schedule of one or more jobs, flows and platforms, and automatically converting the moldable schedule into a malleable schedule.	11-29-2012
20120311581	ADAPTIVE PARALLEL DATA PROCESSING - Described herein are methods, systems, apparatuses and products for adaptive parallel data processing. An aspect provides providing a map phase in which at least one map function is applied in parallel on different partitions of input data at different mappers in a parallel data processing system; providing a communication channel between mappers using a distributed meta-data store, wherein said map phase comprises mapper data processing adapted responsive to communication with said distributed meta-data store; and providing data accessible by at least one reduce phase node in which at least one reduce function is applied. Other embodiments are disclosed.	12-06-2012
20130031558	Scheduling Mapreduce Jobs in the Presence of Priority Classes - Techniques for scheduling one or more MapReduce jobs in a presence of one or more priority classes are provided. The techniques include obtaining a preferred ordering for one or more MapReduce jobs, wherein the preferred ordering comprises one or more priority classes, prioritizing the one or more priority classes subject to one or more dynamic minimum slot guarantees for each priority class, and iteratively employing a MapReduce scheduler, once per priority class, in priority class order, to optimize performance of the one or more MapReduce jobs.	01-31-2013
20130031561	Scheduling Flows in a Multi-Platform Cluster Environment - Techniques for scheduling multiple flows in a multi-platform cluster environment are provided. The techniques include partitioning a cluster into one or more platform containers associated with one or more platforms in the cluster, scheduling one or more flows in each of the one or more platform containers, wherein the one or more flows are created as one or more flow containers, scheduling one or more individual jobs into the one or more flow containers to create a moldable schedule of one or more jobs, flows and platforms, and automatically converting the moldable schedule into a malleable schedule.	01-31-2013
20130080441	INDEX SELECTION FOR XML DATABASE SYSTEMS - A method, computer-implemented system, and computer program product for creating indexes over XML data managed by a database system are provided. The method, computer-implemented system, and computer program product provide for receiving a workload for the XML data, the workload including one or more database statements, utilizing an optimizer of the database system to enumerate a set of one or more path expressions by creating a virtual universal index based on the workload received and matching a path expression to the virtual universal index, and recommending one or more path expressions from the set of one or more candidate path expressions to create the indexes over the XML data.	03-28-2013
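
As an illustration of the idea summarized in the BETWEEN MATCHING entry (20090006447), the following is a minimal sketch of grouping singleton filters into range filters. It is not the method claimed in the application. The names `SingletonFilter`, `RangeFilter`, and `group_into_ranges` are hypothetical, and the sketch assumes every path yields a single value per document, so that a lower-bound filter and an upper-bound filter on the same path can safely be treated as one range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SingletonFilter:
    """Hypothetical one-sided comparison on a path, e.g. /order/price > 10."""
    path: str
    op: str       # one of ">", ">=", "<", "<="
    value: float


@dataclass(frozen=True)
class RangeFilter:
    """Hypothetical range filter with a start value and a stop value."""
    path: str
    start: float
    stop: float
    start_inclusive: bool
    stop_inclusive: bool


def group_into_ranges(filters):
    """Pair the tightest lower and upper bound on each path into a range.

    Assumption: each path is single-valued, so the two singleton filters
    are equivalent to one range filter. Paths with only one bound are skipped.
    """
    lower, upper = {}, {}
    for f in filters:
        if f.op in (">", ">="):
            cur = lower.get(f.path)
            # Keep the larger bound; on a tie, the strict ">" is tighter.
            if cur is None or f.value > cur.value or (
                f.value == cur.value and f.op == ">"
            ):
                lower[f.path] = f
        elif f.op in ("<", "<="):
            cur = upper.get(f.path)
            # Keep the smaller bound; on a tie, the strict "<" is tighter.
            if cur is None or f.value < cur.value or (
                f.value == cur.value and f.op == "<"
            ):
                upper[f.path] = f

    ranges = []
    for path, lo in lower.items():
        hi = upper.get(path)
        if hi is not None:
            ranges.append(
                RangeFilter(path, lo.value, hi.value, lo.op == ">=", hi.op == "<=")
            )
    return ranges
```

A query planner could then probe an index once per range using the start and stop values, instead of evaluating each singleton filter separately.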

Andrey Balmin, Mountain View, CA US

Patent application number	Description	Published
20080222087	System and Method for Optimizing Query Access to a Database Comprising Hierarchically-Organized Data - A cost-based optimizer optimizes access to at least a portion of hierarchically-organized documents, such as those formatted using eXtensible Markup Language (XML), by estimating a number of results produced by the access of the hierarchically-organized documents. Estimating the number of results comprises computing the cardinality of each operator executing query language expressions and further computing a sequence size of sequences of hierarchically-organized nodes produced by the query language expressions. Access to the hierarchically-organized documents is optimized using the structure of the query expression and/or path statistics involving the hierarchically-organized data. The cardinality and the sequence size are used to calculate a cost estimation for execution of alternate query execution plans. Based on the cost estimation, an optimal query execution plan is selected from among the alternate query execution plans.	09-11-2008