Patent application number | Description | Published |
20100218031 | ROOT CAUSE ANALYSIS BY CORRELATING SYMPTOMS WITH ASYNCHRONOUS CHANGES - An indication of a problem in at least one component of a computing system is obtained. A relevant change set associated with a directed dependency graph is analyzed. The computing system is configured to proactively overcome a root cause of the problem. The relevant change set includes a list of past changes to the computing system which are potentially relevant to the problem. The directed dependency graph includes dependency information regarding given components of the computing system invoked by transactions in the computing system. The analyzing includes identifying at least one of the past changes to the computing system that is the root cause of the problem. | 08-26-2010 |
20110072319 | Parallel Processing of ETL Jobs Involving Extensible Markup Language Documents - Techniques for running an Extract Transform Load (ETL) job in parallel on one or more processors wherein the ETL job comprises use of an extensible markup language (XML) document are provided. The techniques include receiving an XML document input, identifying a node in the XML document at which partitioning of the XML document is to begin, sending partition information to each respective processor, performing a shallow parsing of the XML document in parallel on the one or more processors, wherein each processor performs shallow parsing using the identified partition node until it reaches its identified partition, using the shallow parsing to generate the partition of the input XML document, wherein each processor generates a different partition of the same XML document, and sending each partition in streaming format to an ETL job instance. | 03-24-2011 |
20110096074 | SYSTEM AND METHOD FOR PERFORMANCE MANAGEMENT OF LARGE SCALE SDP PLATFORMS - Arrangements and methods for employing empirical evidence to estimate the performance of applications with very few data samples, in complex environments such as dynamic SDP environments, using one or more effective data-plotting models. | 04-28-2011 |
20110191323 | EFFICIENT MULTIPLE TUPLE GENERATION OVER STREAMING XML DATA - Methods and arrangements for extracting tuples from a streaming XML document. A query twig is applied to the XML document stream, tuples are extracted from the XML document stream based on the query twig, and the quantity of extracted tuples is limited by forgoing extraction of duplicate tuples and of tuples that do not satisfy query twig criteria. | 08-04-2011 |
20120078929 | Utilizing Metadata Generated During XML Creation to Enable Parallel XML Processing - A method, computer program product, and system for enabling parallel processing of an XML document without pre-parsing, utilizing metadata associated with the XML document and created at the same time as the XML document. The metadata is used to generate partitions of the XML document at the time of parallel processing, without requiring system-intensive pre-parsing. | 03-29-2012 |
20120079364 | Finding Partition Boundaries for Parallel Processing of Markup Language Documents - A method, a computer program product and a system identify partition locations within an extensible markup language (XML) document without parsing so as to process portions of said document in parallel. The XML document includes sections required to remain continuous. The document is scanned for continuous sections without parsing, and boundaries of the initial partitions are adjusted to reside outside the continuous sections to determine resulting partitions for the document. The resulting partitions may be processed in parallel to provide the document information for storage. | 03-29-2012 |
20120166460 | Utilizing Metadata Generated During XML Creation to Enable Parallel XML Processing - A method, computer program product, and system for enabling parallel processing of an XML document without pre-parsing, utilizing metadata associated with the XML document and created at the same time as the XML document. The metadata is used to generate partitions of the XML document at the time of parallel processing, without requiring system-intensive pre-parsing. | 06-28-2012 |
20120311589 | SYSTEMS AND METHODS FOR PROCESSING HIERARCHICAL DATA IN A MAP-REDUCE FRAMEWORK - Methods and arrangements for processing hierarchical data in a map-reduce framework. Hierarchical data is accepted, and a map-reduce job is performed on the hierarchical data. This performing of a map-reduce job includes determining a cost of partitioning the data, determining a cost of redefining the job and thereupon selectively performing at least one step taken from the group consisting of: partitioning the data and redefining the job. | 12-06-2012 |
20120320059 | SYSTEM AND METHOD FOR PERFORMANCE MANAGEMENT OF LARGE SCALE SDP PLATFORMS - Arrangements and methods for employing empirical evidence to estimate the performance of applications with very few data samples, in complex environments such as dynamic SDP environments, using one or more effective data-plotting models. | 12-20-2012 |
20120324459 | PROCESSING HIERARCHICAL DATA IN A MAP-REDUCE FRAMEWORK - Methods and arrangements for processing hierarchical data in a map-reduce framework. Hierarchical data is accepted, and a map-reduce job is performed on the hierarchical data. This performing of a map-reduce job includes determining a cost of partitioning the data, determining a cost of redefining the job and thereupon selectively performing at least one step taken from the group consisting of: partitioning the data and redefining the job. | 12-20-2012 |
20130086116 | DECLARATIVE SPECIFICATION OF DATA INTEGRATION WORKFLOWS FOR EXECUTION ON PARALLEL PROCESSING PLATFORMS - A method for receiving a declarative specification including a plurality of stages. Each stage specifies an atomic operation, a data input to the atomic operation, and a data output from the atomic operation. The data input is characterized by a data type. Links between at least two of the stages are generated to create a data integration workflow. The data integration workflow is compiled to generate computer code for execution on a parallel processing platform. The computer code is configured to perform at least one of data preparation and data analysis. | 04-04-2013 |
20130254237 | DECLARATIVE SPECIFICATION OF DATA INTEGRATION WORKFLOWS FOR EXECUTION ON PARALLEL PROCESSING PLATFORMS - A method for receiving a declarative specification including a plurality of stages. Each stage specifies an atomic operation, a data input to the atomic operation, and a data output from the atomic operation. The data input is characterized by a data type. Links between at least two of the stages are generated to create a data integration workflow. The data integration workflow is compiled to generate computer code for execution on a parallel processing platform. The computer code is configured to perform at least one of data preparation and data analysis. | 09-26-2013 |
20130325826 | MATCHING TRANSACTIONS IN MULTI-LEVEL RECORDS - A method for identifying matching transactions between two log files where each transaction includes one or more statements. Each log file record records the execution of a statement and includes a transaction identifier. Each record in turn in one log file is compared to an advancing window of records in the other log file. A first table contains associations of statements to transactions and transactions to statements for records in the window. If a match is found between a record in the one file and a record in the window, information associating partial transactions in the one file to potential transactions of the records in the window is added to a second table. If an end-of-transaction record is read from the one file, a best match is found between the ended transaction and the potential transactions based on information in the first and second tables. | 12-05-2013 |
20140188783 | SAMPLING TRANSACTIONS FROM MULTI-LEVEL LOG FILE RECORDS - A log file contains operation records, each operation record is of a certain type, and each operation record is associated with a transaction. A plurality of operation records is read from the log file into a record store. Records of the plurality of operation records of each operation record type are sampled at a predefined sampling rate. Operation records in the plurality of operation records are identified that are associated with the completed transactions with which the sampled operation records are associated. The identified operation records are then extracted from the record store into a data store. | 07-03-2014 |
20140195505 | SAMPLING TRANSACTIONS FROM MULTI-LEVEL LOG FILE RECORDS - A log file contains operation records, each operation record is of a certain type, and each operation record is associated with a transaction. A plurality of operation records is read from the log file into a record store. Records of the plurality of operation records of each operation record type are sampled at a predefined sampling rate. Operation records in the plurality of operation records are identified that are associated with the completed transactions with which the sampled operation records are associated. The identified operation records are then extracted from the record store into a data store. | 07-10-2014 |
20140236976 | MATCH WINDOW SIZE FOR MATCHING MULTI-LEVEL TRANSACTIONS BETWEEN LOG FILES - A predefined number of matches is identified between records in a first file and records in a second file. For the matches, the span of the actual range of record positions in the second file, relative to the positions of the operation records in the first file, within which all matches were found is determined. If the actual span is smaller than the span of a current defined range of record positions by at least a first threshold value, the span of the current defined range is decreased. If the actual span is within a second threshold value of the span of the current defined range, the span of the current defined range is increased. If an amount above a third threshold value of operation records in the first file is not matched to operation records in the second file, the span of the current defined range is increased. | 08-21-2014 |
20140279945 | MATCHING TRANSACTIONS IN MULTI-LEVEL RECORDS - Identifying matching transactions. First and second log files contain operation records of transactions in a transaction workload, each file recording a respective execution of the transaction workload. A first record location in the first file and an associated window of a defined number of sequential second record locations in the second file are advanced one record location at a time. Whether each operation record of a complete transaction at a first record location has a matching operation record at one of the record locations in the associated window of second record locations is determined. If so, the complete transaction in the first file and the transaction that includes the matching operation records in the second file are identified as matching transactions. | 09-18-2014 |
20140337283 | COMPARING DATABASE PERFORMANCE WITHOUT BENCHMARK WORKLOADS - Database operation records are sequentially read from two or more log files. If the transaction identifier is new and the record is not an end-of-transaction record, an open transactions list entry is created. If the transaction identifier is new and the record is an end-of-transaction record, a transaction type list entry is created or updated. If the transaction identifier is not new and the record is not an end-of-transaction record, an open transactions list entry is updated. If the transaction identifier is not new and the record is an end-of-transaction record, a transaction type list entry is created or updated. When all log file records are read, analytical comparison between the information associated with two or more of the log files in data fields in the transaction type list entries is performed. | 11-13-2014 |
20140337364 | COMPARING DATABASE PERFORMANCE WITHOUT BENCHMARK WORKLOADS - Database operation records are sequentially read from two or more log files. If the transaction identifier is new and the record is not an end-of-transaction record, an open transactions list entry is created. If the transaction identifier is new and the record is an end-of-transaction record, a transaction type list entry is created or updated. If the transaction identifier is not new and the record is not an end-of-transaction record, an open transactions list entry is updated. If the transaction identifier is not new and the record is an end-of-transaction record, a transaction type list entry is created or updated. When all log file records are read, analytical comparison between the information associated with two or more of the log files in data fields in the transaction type list entries is performed. | 11-13-2014 |
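
The four-case record handling described in the last two entries (COMPARING DATABASE PERFORMANCE WITHOUT BENCHMARK WORKLOADS) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method claimed in the applications: the `Record` fields, the choice to define a transaction type as the ordered tuple of its statements, the use of elapsed time as the compared data field, and the names `summarize` and `compare` are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Record:
    """One log file operation record (all field names are hypothetical)."""
    txn_id: str
    statement: str
    elapsed_ms: float
    end_of_txn: bool = False


def summarize(records):
    """Build a transaction type list from one log file's records.

    Assumption: a transaction's "type" is the ordered tuple of its statements.
    """
    open_txns = {}  # open transactions list: txn_id -> (statements, elapsed)
    types = defaultdict(lambda: {"count": 0, "total_ms": 0.0})
    for r in records:
        # A transaction identifier is "new" if it has no open-list entry.
        stmts, elapsed = open_txns.get(r.txn_id, ([], 0.0))
        stmts = stmts + [r.statement]
        elapsed += r.elapsed_ms
        if r.end_of_txn:
            # New or not, an end-of-transaction record creates or updates
            # the transaction type list entry and closes the transaction.
            open_txns.pop(r.txn_id, None)
            entry = types[tuple(stmts)]
            entry["count"] += 1
            entry["total_ms"] += elapsed
        else:
            # New id: create an open-list entry; known id: update it.
            open_txns[r.txn_id] = (stmts, elapsed)
    return dict(types)


def compare(log_a, log_b):
    """Mean elapsed time per transaction type present in both logs."""
    a, b = summarize(log_a), summarize(log_b)
    return {
        t: (a[t]["total_ms"] / a[t]["count"], b[t]["total_ms"] / b[t]["count"])
        for t in a.keys() & b.keys()
    }
```

Keying the type list on the statement sequence lets two logs be compared even when they were not produced by the same benchmark run, since only transaction types that occur in both logs are reported.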