Patent application number | Description | Published |
20090019425 | DATA SPLITTING FOR RECURSIVE DATA STRUCTURES - Embodiments of the present invention provide a method, system and computer program product for the data splitting of recursive data structures. In one embodiment of the invention, a method for data splitting recursive data structures can be provided. The method can include identifying data objects of a recursive data structure type, such as a linked list, within source code, the recursive data structure type defining multiple different data fields. The method further can include grouping the data objects into some memory pool units, each of which can contain the same number of data objects. Each memory pool unit can be seen as an array of data objects. The method can include data splitting, which could be maximal array splitting in each different memory pool unit. Finally, the method can include three different approaches, including field padding, field padding and field splitting, to handle irregular field sizes in the data structure. | 01-15-2009 |
20090019433 | METHOD AND SYSTEM FOR DESCRIBING WHOLE-PROGRAM TYPE BASED ALIASING - A compilation system for whole-program type based aliasing, the system includes: a set of hardware and networking resources; a front-end, a whole-program optimization component; a backend; an algorithm implemented on the set of hardware and networking resources; wherein the algorithm configures the front-end to a specific programming language being compiled and processes one source file at a time; wherein the whole-program optimization component merges the aliasing information from multiple invocations of the front-end into a single aliasing representation of a whole program; and wherein the backend uses that information to optimize and generate executable code that is the output of the compilation system. | 01-15-2009 |
20090055798 | METHOD AND APPARATUS FOR ADDRESS TAKEN REFINEMENT USING CONTROL FLOW INFORMATION - A computer implemented method, apparatus, and computer program product for obtaining aliasing information for a target variable in a computer program. A control flow graph representing the computer program is partitioned into a taken address portion that includes all reachable nodes in which an address of the target variable is taken and an untaken address portion that includes all other reachable nodes. All references to the target variable are replaced with a temporary variable in the untaken address portion. The target variable is initialized with the value from the temporary variable at each intermediary node in a set of intermediary nodes in the taken address portion. An intermediary node is a node at which an address of a target variable is taken. The aliasing information for the target variable is generated using the modified computer program. | 02-26-2009 |
20090064119 | Systems, Methods, And Computer Products For Compiler Support For Aggressive Safe Load Speculation - Systems, methods and computer products for compiler support for aggressive safe load speculation. Exemplary embodiments include a method for aggressive safe load speculation for a compiler in a computer system, the method including building a control flow graph, identifying both countable and non-countable loops, gathering a set of candidate loops for load speculation, for each candidate loop in the set of candidate loops gathered for load speculation, computing an estimate of the iteration count, delay cycles, and code size, performing a profitability analysis and determining an unroll factor based on the delay cycles and the code size, transforming the loop by generating a prologue loop to achieve data alignment and an unrolled main loop with loop directives, indicating which loads can safely be executed speculatively and performing low-level instruction on the generated unrolled main loop. | 03-05-2009 |
20090064121 | SYSTEMS, METHODS, AND COMPUTER PRODUCTS FOR IMPLEMENTING SHADOW VERSIONING TO IMPROVE DATA DEPENDENCE ANALYSIS FOR INSTRUCTION SCHEDULING - Systems, methods and computer products for implementing shadow versioning to improve data dependence analysis for instruction scheduling. Exemplary embodiments include a method to identify loops within the code to be compiled, for each loop initializing a dependence matrix, for each loop identifying shadow symbols that are accessed by the loop, examining dependencies, storing, comparing and classifying the dependence vectors, generating new shadow symbols, replacing the old shadow symbols with the new shadow symbols, generating alias relationships between the newly created shadow symbols, scheduling instructions and compiling the code. | 03-05-2009 |
20100011339 | SINGLE INSTRUCTION MULTIPLE DATA (SIMD) CODE GENERATION FOR PARALLEL LOOPS USING VERSIONING AND SCHEDULING - Embodiments of the present invention address deficiencies of the art in respect to loop parallelization for a target architecture implementing a shared memory model and provide a novel and non-obvious method, system and computer program product for SIMD code generation for parallel loops using versioning and scheduling. In an embodiment of the invention, within a code compilation data processing system a parallel SIMD loop code generation method can include identifying a loop in a representation of source code as a parallel loop candidate, either through a user directive or through auto-parallelization. The method also can include selecting a trip count condition responsive to a scheduling policy set for the code compilation data processing system and also on a minimal simdizable threshold, determining a trip count and an alignment constraint for the selected loop, and generating a version of a parallel loop in the source code according to the alignment constraint and a comparison of the trip count to the trip count condition. | 01-14-2010 |
20110016460 | MULTIPLE PASS COMPILER INSTRUMENTATION INFRASTRUCTURE - A method includes configuring one or more processors to perform operations. The operations include instrumenting at least one code region of an application with at least one annotation for generating profile data when the at least one code region is executed. The operations include executing the application to generate profile data for the at least one code region. The operations also include identifying, from the generated profile data, a delinquent code region. The operations include instrumenting the delinquent code region with annotations for generating profile data when the code regions are executed. The operations include executing the application to generate additional profile data for the at least one code region, including the delinquent code region. | 01-20-2011 |
20110093838 | MANAGING SPECULATIVE ASSIST THREADS - An illustrative embodiment provides a computer-implemented process for managing speculative assist threads for data pre-fetching that analyzes collected source code and cache profiling information to identify a code region containing a delinquent load instruction and generates an assist thread, including a value for a local version number, at a program entry point within the identified code region. Upon activation of the assist thread the local version number of the assist thread is compared to the global unique version number of the main thread for the identified code region and an iteration distance between the assist thread relative to the main thread is compared to a predefined value. The assist thread is executed when the local version number of the assist thread matches the global unique version number of the main thread, and the iteration distance between the assist thread relative to the main thread is within a predefined range of values. | 04-21-2011 |
20110283095 | Hardware Assist Thread for Increasing Code Parallelism - Mechanisms are provided for offloading a workload from a main thread to an assist thread. The mechanisms receive, in a fetch unit of a processor of the data processing system, a branch-to-assist-thread instruction of a main thread. The branch-to-assist-thread instruction informs hardware of the processor to look for an already spawned idle thread to be used as an assist thread. Hardware implemented pervasive thread control logic determines if one or more already spawned idle threads are available for use as an assist thread. The hardware implemented pervasive thread control logic selects an idle thread from the one or more already spawned idle threads if it is determined that one or more already spawned idle threads are available for use as an assist thread, to thereby provide the assist thread. In addition, the hardware implemented pervasive thread control logic offloads a portion of a workload of the main thread to the assist thread. | 11-17-2011 |
20120254594 | Hardware Assist Thread for Increasing Code Parallelism - Mechanisms are provided for offloading a workload from a main thread to an assist thread. The mechanisms receive, in a fetch unit of a processor of the data processing system, a branch-to-assist-thread instruction of a main thread. The branch-to-assist-thread instruction informs hardware of the processor to look for an already spawned idle thread to be used as an assist thread. Hardware implemented pervasive thread control logic determines if one or more already spawned idle threads are available for use as an assist thread. The hardware implemented pervasive thread control logic selects an idle thread from the one or more already spawned idle threads if it is determined that one or more already spawned idle threads are available for use as an assist thread, to thereby provide the assist thread. In addition, the hardware implemented pervasive thread control logic offloads a portion of a workload of the main thread to the assist thread. | 10-04-2012 |
20130013899 | Using Hardware Transaction Primitives for Implementing Non-Transactional Escape Actions Inside Transactions - Mechanisms are provided for performing escape actions within transactions. These mechanisms execute a transaction comprising a transactional section and an escape action. The transactional section is comprised of one or more instructions that are to be executed in an atomic manner as part of the transaction. The escape action is comprised of one or more instructions to be executed in a non-transactional manner. These mechanisms further populate at least one actions list data structure, associated with a thread of the data processing system that is executing the transaction, with one or more actions associated with the escape action. Moreover, these mechanisms execute one or more actions in the actions list data structure based upon whether the transaction commits successfully or is aborted. | 01-10-2013 |
20130019232 | MANAGING ALIASING CONSTRAINTS - An illustrative embodiment of a computer-implemented process for managing aliasing constraints, identifies an object to form an identified object, identifies a scope of the identified object to form an identified scope, and assigns a unique value to the identified object within the identified scope. The computer-implemented process further demarcates an entrance to the identified scope, demarcates an exit to the identified scope, optimizes the identified object using a property of the identified scope and associated aliasing information, tracks the identified object state to form tracked state information; and uses the tracked state information to update the identified object. | 01-17-2013 |
20130061000 | SOFTWARE COMPILER GENERATED THREADED ENVIRONMENT - A computer-implemented method for creating a threaded package of computer executable instructions from software compiler generated code includes allocating, through a computer processor, the computer executable instructions into a plurality of stacks, differentiating between different types of computer executable instructions for each computer executable instruction allocated to each stack of the plurality of stacks, creating switch points for each stack of the plurality of stacks based upon the differentiating, and inserting the switch points within each stack of the plurality of stacks. | 03-07-2013 |
20140096141 | EFFICIENT ROLLBACK AND RETRY OF CONFLICTED SPECULATIVE THREADS USING DISTRIBUTED TOKENS - A method for rolling back speculative threads in symmetric-multiprocessing (SMP) environments is disclosed. In one embodiment, such a method includes detecting an aborted thread at runtime and determining whether the aborted thread is an oldest aborted thread. In the event the aborted thread is the oldest aborted thread, the method sets a high-priority request for allocation to an absolute thread number associated with the oldest aborted thread. The method further detects that the high-priority request is set and, in response, modifies a local allocation token of the oldest aborted thread. The modification prompts the oldest aborted thread to retry a work unit associated with its absolute thread number. The oldest aborted thread subsequently initiates the retry of a successor thread by updating the successor thread's local allocation token. A corresponding apparatus and computer program product are also disclosed. | 04-03-2014 |
20140096142 | EFFICIENT ROLLBACK AND RETRY OF CONFLICTED SPECULATIVE THREADS WITH HARDWARE SUPPORT - A method for rolling back speculative threads in symmetric-multiprocessing (SMP) environments is disclosed. In one embodiment, such a method includes detecting an aborted thread at runtime and determining whether the aborted thread is an oldest aborted thread. In the event the aborted thread is the oldest aborted thread, the method sets a high-priority request for allocation to an absolute thread number associated with the oldest aborted thread. The method further detects that the high-priority request is set and, in response, clears the high-priority request and sets an allocation token to the absolute thread number associated with the oldest aborted thread, thereby allowing the oldest aborted thread to retry a work unit associated with the absolute thread number. A corresponding apparatus and computer program product are also disclosed. | 04-03-2014 |
20140123152 | EFFICIENT ROLLBACK AND RETRY OF CONFLICTED SPECULATIVE THREADS WITH HARDWARE SUPPORT - A method for rolling back speculative threads in symmetric-multiprocessing (SMP) environments is disclosed. In one embodiment, such a method includes detecting an aborted thread at runtime and determining whether the aborted thread is an oldest aborted thread. In the event the aborted thread is the oldest aborted thread, the method sets a high-priority request for allocation to an absolute thread number associated with the oldest aborted thread. The method further detects that the high-priority request is set and, in response, clears the high-priority request and sets an allocation token to the absolute thread number associated with the oldest aborted thread, thereby allowing the oldest aborted thread to retry a work unit associated with the absolute thread number. A corresponding apparatus and computer program product are also disclosed. | 05-01-2014 |
20140123153 | EFFICIENT ROLLBACK AND RETRY OF CONFLICTED SPECULATIVE THREADS USING DISTRIBUTED TOKENS - A method for rolling back speculative threads in symmetric-multiprocessing (SMP) environments is disclosed. In one embodiment, such a method includes detecting an aborted thread at runtime and determining whether the aborted thread is an oldest aborted thread. In the event the aborted thread is the oldest aborted thread, the method sets a high-priority request for allocation to an absolute thread number associated with the oldest aborted thread. The method further detects that the high-priority request is set and, in response, modifies a local allocation token of the oldest aborted thread. The modification prompts the oldest aborted thread to retry a work unit associated with its absolute thread number. The oldest aborted thread subsequently initiates the retry of a successor thread by updating the successor thread's local allocation token. A corresponding apparatus and computer program product are also disclosed. | 05-01-2014 |
20140331210 | INSERTING IMPLICIT SEQUENCE POINTS INTO COMPUTER PROGRAM CODE TO SUPPORT DEBUG OPERATIONS - Arrangements described herein relate to inserting implicit sequence points into computer program code to support debug operations. Optimization of the computer program code can be performed during compilation of the computer program code and, during the optimization, implicit sequence points can be inserted into the computer program code. The implicit sequence points can be configured to provide virtual reads of symbols contained in the computer program code when the implicit sequence points are reached during execution of the computer program code during a debug operation performed on the computer program code after the computer program code is optimized and compiled. | 11-06-2014 |
20140331215 | INSERTING IMPLICIT SEQUENCE POINTS INTO COMPUTER PROGRAM CODE TO SUPPORT DEBUG OPERATIONS - Arrangements described herein relate to inserting implicit sequence points into computer program code to support debug operations. Optimization of the computer program code can be performed during compilation of the computer program code and, during the optimization, implicit sequence points can be inserted into the computer program code. The implicit sequence points can be configured to provide virtual reads of symbols contained in the computer program code when the implicit sequence points are reached during execution of the computer program code during a debug operation performed on the computer program code after the computer program code is optimized and compiled. | 11-06-2014 |
20150067260 | OPTIMIZING MEMORY BANDWIDTH CONSUMPTION USING DATA SPLITTING WITH SOFTWARE CACHING - A computer processor collects information for a dominant data access loop and reference code patterns based on data reference pattern analysis, and for pointer aliasing and data shape based on pointer escape analysis. The computer processor selects a candidate array for data splitting wherein the candidate array is referenced by a dominant data access loop. The computer processor determines a data splitting mode by which to split the data of the candidate array, based on the reference code patterns, the pointer aliasing, and the data shape information, and splits the data into two or more split arrays. The computer processor creates a software cache that includes a portion of the data of the two or more split arrays in a transposed format, and maintains the portion of the transposed data within the software cache and consults the software cache during an access of the split arrays. | 03-05-2015 |
20150067268 | OPTIMIZING MEMORY BANDWIDTH CONSUMPTION USING DATA SPLITTING WITH SOFTWARE CACHING - A computer processor collects information for a dominant data access loop and reference code patterns based on data reference pattern analysis, and for pointer aliasing and data shape based on pointer escape analysis. The computer processor selects a candidate array for data splitting wherein the candidate array is referenced by a dominant data access loop. The computer processor determines a data splitting mode by which to split the data of the candidate array, based on the reference code patterns, the pointer aliasing, and the data shape information, and splits the data into two or more split arrays. The computer processor creates a software cache that includes a portion of the data of the two or more split arrays in a transposed format, and maintains the portion of the transposed data within the software cache and consults the software cache during an access of the split arrays. | 03-05-2015 |
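
Several of the abstracts above (20090019425, 20150067260, 20150067268) refer to data splitting, in which an array of multi-field objects is reorganized into separate per-field arrays so that a loop touching only one field reads less memory. The Python sketch below is only a rough illustration of that general idea, not the method claimed in any of the applications. The names `Node`, `split_fields`, and `sum_keys` are hypothetical, and the sketch does not model memory pool units, field padding, or a software cache.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical record with two fields, standing in for a list node."""
    key: int
    payload: float


def split_fields(nodes):
    """Split an array of records into one array per field.

    Assumption for illustration: every record has exactly the
    fields `key` and `payload`.
    """
    keys = [n.key for n in nodes]
    payloads = [n.payload for n in nodes]
    return keys, payloads


def sum_keys(keys):
    """A loop that needs only the `key` field scans one dense array."""
    total = 0
    for k in keys:
        total += k
    return total


nodes = [Node(key=i, payload=i * 0.5) for i in range(4)]
keys, payloads = split_fields(nodes)
```

After the split, `sum_keys(keys)` walks only the `keys` array and never touches `payloads`, which is the locality benefit that data splitting aims for.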