Patent application number | Description | Published |
20090006392 | DATA PROFILE COMPUTATION - Architecture that provides a data profile computation technique which employs key profile computation and data pattern profile computation. Key profile computation in a data table includes both exact keys as well as approximate keys, and is based on key strengths. A key strength of 100% is an exact key, and any other percentage is an approximate key. The key strength is estimated based on the number of table rows that have duplicated attribute values. Only column sets that exceed a threshold value are returned. Pattern profiling identifies a small set of regular expression patterns which best describe the patterns within a given set of attribute values. Pattern profiling includes three phases: a first phase for determining token regular expressions, a second phase for determining candidate regular expressions, and a third phase for identifying the best regular expressions of the candidates that match the attribute values. | 01-01-2009 |
20090254522 | DETECTING ESTIMATION ERRORS IN DISTINCT PAGE COUNTS - A database server may be configured to compute distinct page counts of pages accessed to execute operands of respective queries. The queries may be executed against a table comprised of the pages and having an index managed by the database server. The distinct page counts may be obtained by counting, as a part of the executing of the queries, distinct pages accessed during the execution of the queries. | 10-08-2009 |
20100235347 | TECHNIQUES FOR EXACT CARDINALITY QUERY OPTIMIZATION - An exact cardinality query optimization system and method for optimizing a query having a plurality of expressions to obtain a cardinality-optimal query execution plan for the query. Embodiments of the system and method use various techniques to shorten the time necessary to obtain the cardinality-optimal query execution plan, which contains the query execution plan when all cardinalities are exact. Embodiments of the system and method include a covering queries technique that leverages query execution feedback to obtain an unordered subset of relevant expressions for the query, an early termination technique that bounds the cardinality to determine whether the processing can be terminated before each of the expressions is executed, and an expressions ordering technique that finds an ordering of expressions that yields the greatest reduction in time to obtain the cardinality-optimal query execution plan. | 09-16-2010 |
20110225164 | GRANULAR AND WORKLOAD DRIVEN INDEX DEFRAGMENTATION - This patent application relates to granular and workload driven database index defragmentation techniques. These techniques allow for defragmenting individual index ranges, performing benefit analyses to estimate the impact of defragmenting indexes or index ranges, and leveraging such benefit analyses to provide automated workload-driven recommendations of index(es) or index range(s) to defragment. | 09-15-2011 |
20110314000 | TRANSFORMATION RULE PROFILING FOR A QUERY OPTIMIZER - Technology is described for transformation rule profiling for a query optimizer. The method can include obtaining a database query configured to be optimized by the query optimizer of a database system. An optimized query plan for the database query can be found using a host set of transformation rules. One transformation rule can be removed and checked at a time. Each transformation rule can be checked to determine whether the transformation rule affects an optimal query plan output. A test query plan can be generated after each transformation rule has been removed. The query optimizer can determine whether the test query plan is different than the optimized query plan in the absence of the removed transformation rule. An equivalent set of transformation rules can be created that includes transformation rules where the test query plan generated from the equivalent set of transformation rules is equivalent to the optimized plan. | 12-22-2011 |
20120323921 | DICTIONARY FOR HIERARCHICAL ATTRIBUTES FROM CATALOG ITEMS - A plurality of items included in a catalog may be obtained, each item associated with an item category. Brand indicators may be obtained, each brand indicator associated with the item category. Brand indicators associated with each of the items may be determined, and the each item may be assigned to a partition group associated with the brand indicator that is associated with the each item. Correlated string tokens that are correlated, greater than a predetermined correlation threshold value, with the brand indicator associated with the partition group that is associated with the each one of the items, the correlated string tokens associated with the each one of the plurality of items, may be determined. A dictionary hierarchy may be generated based on the one or more correlated string tokens. | 12-20-2012 |
20120323929 | COMPRESSION AWARE PHYSICAL DATABASE DESIGN - A plurality of indicators representing a plurality of respective candidate database configurations may be obtained, each of the candidate database configurations including a plurality of database queries and a plurality of candidate database indexes associated with a database table. A portion of the candidate database indexes included in the plurality of database indexes may be selected based on skyline selection. An enumeration of the portion of the plurality of the candidate database indexes may be determined based on a greedy algorithm. | 12-20-2012 |
20130151504 | QUERY PROGRESS ESTIMATION - The claimed subject matter provides a method for providing a progress estimate for a database query. The method includes determining static features of a query plan for the database query. The method also includes selecting an initial progress estimator based on the static features and a trained machine learning model. The model is trained using static features of a plurality of query plans, and dynamic features of the plurality of query plans. Further, the method includes determining dynamic features of the query plan for each of a plurality of candidate estimators. Additionally, the method includes selecting a revised progress estimator based on the static features, the dynamic features and a trained machine learning model for each of the candidate estimators. The method further includes producing the progress estimate based on the revised progress estimator. | 06-13-2013 |
20130159317 | HIGH PRECISION SET EXPANSION FOR LARGE CONCEPTS - A set expansion system is described herein that improves precision, recall, and performance of prior set expansion methods for large sets of data. The system maintains high precision and recall by 1) identifying the quality of particular lists and applying that quality through a weight, 2) allowing for the specification of negative examples in a set of seeds to reduce the introduction of bad entities into the set, and 3) applying a cutoff to eliminate lists that include a low number of positive matches. The system may perform multiple passes to first generate a good candidate result set and then refine the set to find a set with highest quality. The system may also apply Map Reduce or other distributed processing techniques to allow calculation in parallel. Thus, the system efficiently expands large concept sets from a potentially small set of initial seeds from readily available web data. | 06-20-2013 |
20140379924 | DYNAMIC ALLOCATION OF RESOURCES WHILE CONSIDERING RESOURCE RESERVATIONS - Described herein are technologies relating to computing resource allocation among multiple tenants. Each tenant may have a respective absolute reservation for rate-based computing resources, which is independent of computing resource reservations of other tenants. The multiple tenants vie for the rate-based computing resources, and tasks are scheduled based upon which tenants submit the tasks and the resource reservations of such tenants. | 12-25-2014 |
20150019540 | RETRIEVAL OF ATTRIBUTE VALUES BASED UPON IDENTIFIED ENTITIES - Various technologies that facilitate performance of a data finding data (DFD) search are described herein. A user specifies entities, for example, by entering the entities into a query field, selecting the entities from a computer-executable application, or the like. The user further specifies an attribute of the entities that is of interest. A query is constructed based upon the entities and the attribute, and a search for tables is performed based upon the entities and the attribute. Values of the attribute for the selected entities are identified in a table, and the values of the attribute are returned. | 01-15-2015 |
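
The key-strength idea summarized for application 20090006392 (DATA PROFILE COMPUTATION) can be illustrated with a minimal sketch. The abstract says only that strength is estimated from the number of rows with duplicated attribute values. The exact formula below (percentage of rows whose value combination is not duplicated), the function names `key_strength` and `candidate_keys`, the `max_size` limit, and the choice to keep column sets that meet or exceed the threshold are all assumptions made for illustration, not the method claimed in the application.

```python
from collections import Counter
from itertools import combinations


def key_strength(rows, columns):
    """Percentage of rows whose values on `columns` occur exactly once.

    Assumed formula: 100.0 means an exact key; anything lower is an
    approximate key.
    """
    if not rows:
        return 100.0
    projected = [tuple(row[c] for c in columns) for row in rows]
    counts = Counter(projected)
    unique_rows = sum(1 for values in projected if counts[values] == 1)
    return 100.0 * unique_rows / len(rows)


def candidate_keys(rows, all_columns, threshold, max_size=2):
    """Return (column set, strength) pairs whose strength is >= threshold.

    `max_size` is a hypothetical cap on column-set size that keeps the
    enumeration small; it does not come from the abstract.
    """
    results = []
    for size in range(1, max_size + 1):
        for cols in combinations(all_columns, size):
            strength = key_strength(rows, cols)
            if strength >= threshold:
                results.append((cols, strength))
    return results
```

With a four-row table in which `id` is unique, `zip` is unique for two of the rows, and `city` is always repeated, `key_strength` gives 100.0, 50.0, and 0.0 respectively, so a threshold of 50 keeps `id` and `zip` as single-column candidates.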
Patent application number | Description | Published |
20090094086 | AUTOMATIC ASSIGNMENT FOR DOCUMENT REVIEWING - Assignment algorithm for automatically making assignments between documents and document reviewers for a review process. If the automated assignments need adjusting, a coordinator can manually refine the assignment(s). The assignment algorithm facilitates the automated assignment process based on inputs related to a constraint and/or a preference. The constraints and preferences include, but are not limited to, a conflict of interest, a minimum number of reviews, a maximum number of submissions, a partial assignment, bidding preferences, and health metrics. Once the assignments have been made, histograms can be generated that present an overview of certain health metrics, further allowing refinement of the assignment process. | 04-09-2009 |
20090094191 | EXPLOITING EXECUTION FEEDBACK FOR OPTIMIZING CHOICE OF ACCESS METHODS - A proactive monitoring mechanism for correcting the choice of access methods (available query plans) for a given query, based on execution feedback from the same query. The mechanism exploits bypassing predicate short-circuiting inside the database server's predicate evaluation module to obtain expression cardinalities. The mechanism can also modify a plan to obtain expression cardinalities. These techniques are used judiciously by the query optimizer and/or a database administrator (DBA) so that the execution overheads are within acceptable limits. | 04-09-2009 |
20090106746 | APPLICATION AND DATABASE CONTEXT FOR DATABASE APPLICATION DEVELOPERS - Infrastructure for capturing and correlating application context and database context for tuning, profiling and debugging tasks. The infrastructure extends the DBMS and application profiling infrastructure making it easy for a developer to invoke and interact with a tool from inside the development environment. Three sources of information are employed when an application is executed: server tracing, data access layer tracing, and application tracing. The events obtained from each of these sources are written into a log file. An event log is generated on each machine that involves either an application process or the DBMS server process and the log file receives log traces from different processes on a machine to the same trace session. A post-processing step over the event log(s) correlates the application and database contexts. The output is a single view where both the application and database profile of each statement issued by the application are exposed. | 04-23-2009 |
20100262593 | AUTOMATED FILTERED INDEX RECOMMENDATIONS - The described implementations relate to filtered index recommendations. In one case a filtered index recommendation (FIR) tool is configured to recommend a final set of filtered indexes to use with a workload. The final set is selected from a first set of candidate filtered indexes and a second set of merged filtered indexes. | 10-14-2010 |
20100287214 | Static Analysis Framework for Database Applications - A tool facilitating static analysis for database applications, such that the static analysis tool (SAT) can significantly enhance the ability for developers to identify security, correctness and performance problems in database applications during the development phase of an application lifecycle. A static analysis tool for database applications presents a framework for database applications using the ADO.NET data access APIs. The SAT framework consists of a core set of static analysis services upon which verticals such as workload extraction, SQL injection detection, identifying data integrity violations, and SQL performance analysis are built using the core services. | 11-11-2010 |
20110208748 | Foreign-Key Detection - This patent application relates to foreign-key detection. One implementation obtains a set of data tables. This implementation automatically determines foreign-key relationships of columns from separate tables of the set. | 08-25-2011 |
20110295833 | Framework for Testing Query Transformation Rules - Described is a test framework for testing transformation rules of query optimizers. Rule patterns obtained as tree structures from a query optimizer are used to generate queries that are used to test the rule optimizer's transformation rules. The test framework tracks which rules are exercised for each query, and also determines the correctness of the transformation rule by comparing the results of the query processing with the rule and without the rule (by turning off the rule). The test framework creates a composite pattern corresponding to two or more rules, such as to test rules in a set (e.g., as pairs). Also described is the efficient execution of a test suite for correctness testing, in which queries of the test suite are selected based upon cost information. | 12-01-2011 |
20130346464 | Data Services for Enterprises Leveraging Search System Data Assets - A data service system is described herein which processes raw data assets from at least one network-accessible system (such as a search system), to produce processed data assets. Enterprise applications can then leverage the processed data assets to perform various environment-specific tasks. In one implementation, the data service system can generate any of: synonym resources for use by an enterprise application in providing synonyms for specified terms associated with entities; augmentation resources for use by an enterprise application in providing supplemental information for specified seed information; and spelling-correction resources for use by an enterprise application in providing spelling information for specified terms, and so on. | 12-26-2013 |
20140207740 | Isolating Resources and Performance in a Database Management System - Techniques for tenant performance isolation in a multiple-tenant database management system are described. These techniques may include providing a reservation of server resources. The server resources reservation may include a reservation of a central processing unit (CPU), a reservation of Input/Output throughput, and/or a reservation of buffer pool memory or working memory. The techniques may also include a metering mechanism that determines whether the resource reservation is satisfied. The metering mechanism may be independent of an actual resource allocation mechanism associated with the server resource reservation. | 07-24-2014 |
Patent application number | Description | Published |
20110313999 | SLICING RELATIONAL QUERIES USING SPOOL OPERATORS - A relational database server may concurrently execute many relational queries, but a complex relational query may cause performance delays in the fulfillment of other relational queries. Instead, the relational database server may generate a query plan for the relational query, and may endeavor to partition the relational query between a spool operator and a scan operator into two or more query slices, where each query slice may be executed within a query slice threshold. Many alternative candidate query plans may be considered, such as inserting spool and scan operators after various operators and parameterizing operators in order to partition the records of a relation into two or more ranges based on an attribute of the relation. A large search space of candidate query plans may be reviewed in order to select a query plan that respects the query slice threshold while efficiently executing the logic of the relational query. | 12-22-2011 |
20130091120 | INTEGRATED FUZZY JOINS IN DATABASE MANAGEMENT SYSTEMS - A fuzzy joins system that is integrated in a database system generates fuzzy joins between records from two datasets. The fuzzy joins system includes a tokenizer to generate tokens for data records and a transformer to find transforms for the tokens. The fuzzy joins system invokes a signature generator, running within a runtime layer of the database system, to generate signatures for data records based on the tokens and their transforms. Subsequently, an equi-join operation joins the records from the two datasets with at least one equal signature. A similarity calculator, running within a runtime layer of the database system, computes a similarity measure using the token information of the joined records. If the similarity measure for any two records is above a threshold, the fuzzy joins system generates a fuzzy join between such two records. | 04-11-2013 |
20130297655 | PERFORMANCE SERVICE LEVEL AGREEMENTS IN MULTI-TENANT DATABASE SYSTEMS - Various technologies described herein pertain to evaluating service provider compliance with terms of a performance service level agreement (SLA) for a tenant in a multi-tenant database system. The terms of the performance SLA can set a performance criterion as though a level of a resource of hardware of the multi-tenant database system is dedicated to the tenant. An actual performance metric of the resource can be tracked for a workload of the tenant. Further, a baseline performance metric of the resource can be determined for the workload of the tenant. The baseline performance metric can be based on a simulation as though the level of the resource as set in the performance SLA is dedicated to the workload of the tenant. Moreover, the actual performance metric can be compared with the baseline performance metric to evaluate compliance with the performance SLA. | 11-07-2013 |
20130332428 | Online and Workload Driven Index Defragmentation - The subject disclosure is directed towards defragmenting one or more ranges of a database index based upon actual usage statistics and policy. A range tracker tracks and uses statistics corresponding to actual I/O operations to determine whether the benefit of defragmenting a range sufficiently (based upon the policy) exceeds its cost. If so, the online range defragmenter automatically defragments the range in an online manner. The range tracker may be configurable to monitor less than all ranges of the index. | 12-12-2013 |
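
The pipeline summarized for application 20130091120 (INTEGRATED FUZZY JOINS IN DATABASE MANAGEMENT SYSTEMS) can be sketched as: tokenize, generate signatures, equi-join on a shared signature, then filter by similarity. The abstract does not name a signature scheme or a similarity measure. This sketch assumes that each lowercase whitespace-separated token serves as its own signature and that similarity is Jaccard overlap; it omits the token-transform step entirely. The names `tokenize`, `jaccard`, and `fuzzy_join` are hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Split a record into a set of lowercase tokens (assumed tokenizer)."""
    return frozenset(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def fuzzy_join(left, right, threshold=0.5):
    """Return (left record, right record, similarity) for pairs above threshold."""
    right_tokens = [tokenize(r) for r in right]

    # Signature index: each token acts as a signature pointing at right rows.
    index = defaultdict(set)
    for j, tokens in enumerate(right_tokens):
        for token in tokens:
            index[token].add(j)

    matches = []
    for record in left:
        tokens = tokenize(record)
        # Equi-join step: candidates share at least one signature.
        candidates = set()
        for token in tokens:
            candidates |= index.get(token, set())
        # Similarity step: keep only pairs strictly above the threshold.
        for j in sorted(candidates):
            similarity = jaccard(tokens, right_tokens[j])
            if similarity > threshold:
                matches.append((record, right[j], similarity))
    return matches
```

Joining on shared signatures first means the similarity measure is computed only for candidate pairs rather than for every pair of records, which is the reason the abstract places the equi-join ahead of the similarity calculator.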