Inventors list

Assignees list

Classification tree browser

Top 100 Inventors

Top 100 Assignees


Code restructuring

Subclass of:

717 - Data processing: software development, installation, and management

717100000 - SOFTWARE PROGRAM DEVELOPMENT TOOL (E.G., INTEGRATED CASE TOOL OR STAND-ALONE DEVELOPMENT TOOL)

717136000 - Translation of code

717140000 - Compiling code

717151000 - Optimization

Patent class list (only not empty are listed)

Deeper subclasses:

Class / Patent application numberDescriptionNumber of patent applications / Date published
717159000 Code restructuring 73
20120185836OPTIMIZING CODE USING A BI-ENDIAN COMPILER - In one embodiment, a method includes identifying a byte swap operation, building a domain including the byte swap operation and other expressions, identifying domain entries and domain exits associated with the domain, determining that a benefit will be obtained by performing a swap of the domain, and responsive to the determination performing the swap of the domain, and storing the swapped domain in a storage medium. Other embodiments are described and claimed.07-19-2012
20090328019DETECTING RACE CONDITIONS WITH A SOFTWARE TRANSACTIONAL MEMORY SYSTEM - A dynamic race detection system is provided that detects race conditions in code that executes concurrently in a computer system. The dynamic race detection system uses a modified software transactional memory (STM) system to detect race conditions. A compiler converts portions of the code that are not configured to operate with the STM system into pseudo STM code that operates with the STM system. The dynamic race detection system detects race conditions in response to either a pseudo STM transaction in the pseudo STM code failing to validate when executed or an actual STM transaction failing to validate when executed because of conflict with a concurrent pseudo STM transaction.12-31-2009
20090013315Method and Apparatus for Restructuring a Software Program Hierarchy - Method, apparatus, and computer readable medium for restructuring a software program hierarchy having interface files and implementation files that include the interface files are described. In one example, dependencies between program units in the interface files and the implementation files are determined. The dependencies are represented as a plurality of bit strings. Correlated bitstrings of the plurality of bit strings are clustered into a plurality of partitions. Each of the plurality of partitions is transformed into corresponding program units. New interface files are respectively created having the corresponding program units for each of the plurality of partitions.01-08-2009
20100050164PIPELINED PROCESSOR AND COMPILER/SCHEDULER FOR VARIABLE NUMBER BRANCH DELAY SLOTS - Different numbers of delay slots are assigned by a compiler/scheduler to each different type of jump operation in a pipelined processor system. The number of delay slots is variable and kept to the minimum needed by each type of jump operation. A compatible processor uses a corresponding number of branch delay slots to exploit the difference in predictability of different types of branch or jump operations. Different types of jump operations resolved their target addresses in different numbers of delay slots. As a result, the compiler/scheduler is able to generate more efficient code than for a processor with a fixed number of delay slots for all jump types, resulting in better processor performance.02-25-2010
20120324431PATTERN-BASED COMPILATION OF ASYNCHRONOUS CONSUMPTION - The present invention extends to methods, systems, and computer program products for transforming source code to await execution of asynchronous operations. Embodiments of the invention simplify authoring and use of asynchronous methods, by generating statements that use well-defined awaitable objects to await completion of asynchronous operations. For example, a computer system can transform a statement that requests to await the completion of an asynchronous operation into a plurality of statements that use a predefined pattern of members of an awaitable object corresponding the asynchronous operation. The pattern can include one or more members configured to return a completion status of the asynchronous operation, one or more members configured to resume execution of the asynchronous method at a resumption point when the asynchronous operation completes, and one or more members configured to retrieve completion results. Accordingly, the plurality of statements can use these members to await execution of the asynchronous operation.12-20-2012
20080263530METHOD AND SYSTEM FOR AUTOMATED CODE CONVERSION - A method and system for converting application code into optimized application code or into execution code suitable for execution on a computation engine with an architecture comprising at least a first and a second level of data memory units are disclosed. In one aspect, the method comprises obtaining application code, the application code comprising data transfer operations between the levels of memory units. The method further comprises converting at least a part of the application code. The converting of application code comprises scheduling of data transfer operations from a first level of memory units to a second level of memory units such that accesses of data accessed multiple times are brought closer together in time than in the original code. The converting of application code further comprises, after the scheduling of the data transfer operations, deciding on layout of the data in the second level of memory units to improve the data layout locality such that data which is accessed closer together in time is also brought closer together in the layout than in the original code.10-23-2008
20080295085INTEGRATED CODE REVIEW TOOL - A code review tool system includes a developer on a developer node, a reviewer on a reviewer node, and a server on a server node. The developer node and the reviewer node include an integrated code review tool. The integrated code review tool includes functionality for a developer to specify source code files to be reviewed and a list of reviewer identifiers. The integrated code review tool also includes functionality to associate the reviewer's comments and/or proposed code changes with context information identifying a location in the source code files. The reviewer's comments, proposed code changes, and associated context information is sent to the developer. The developer may then see the reviewer's comments and proposed code changes in context with the location in the source code to which the comments and code changes pertain.11-27-2008
20090276766RUNTIME PROFITABILITY CONTROL FOR SPECULATIVE AUTOMATIC PARALLELIZATION - A compilation method and mechanism for parallelizing program code. A method for compilation includes analyzing source code and identifying candidate code for parallelization. The method includes parallelizing the candidate code, in response to determining said profitability meets a predetermined criteria; and generating object code corresponding to the source code. The generated object code includes both a non-parallelized version of the candidate code and a parallelized version of the candidate code. During execution of the object code, a dynamic selection between execution of the non-parallelized version of the candidate code and the parallelized version of the candidate code is made. Changing execution from said parallelized version of the candidate code to the non-parallelized version of the candidate code, may be in response to determining a transaction failure count meets a pre-determined threshold. Additionally, changing execution from one version to the other may be in further response to determining an execution time of the parallelized version of the candidate code is greater than an execution time of the non-parallelized version of the candidate code.11-05-2009
20090199169METHOD AND SYSTEM FOR ARRAY OPTIMIZATION - A method for transforming access to a structure array, that includes compiling source code, wherein compiling the source code includes identifying the structure array in the source code, performing an object safety analysis to determine whether the structure array is safe for transformation, wherein the object safety analysis includes an inter-procedural alias class analysis, performing a profitability analysis on the structure array when the structure array is safe for transformation, wherein the profitability analysis includes selecting a transformation from a plurality of transformations, wherein the plurality of transformations includes a pointer based fully splitting transformation, a pointer based partially splitting transformation, and an address based fully splitting transformation, and performing the selected transformation on the structure array, and storing the compiled code.08-06-2009
20090064118SOFTWARE DEOBFUSCATION SYSTEM AND METHOD - A system and method are disclosed that enable automated deobfuscation of software. A method may include identifying at least one section of target software matching trigger criteria, either by using pattern matching or behavior analysis; emulating at least a portion of the identified section; and generating deobfuscated software by substituting a simplified section for the identified section. The method may further be iterated. Emulation includes simulating the effect of certain instructions on control flow and/or memory locations, such as the program stack, a register, cache memory, heap memory, or other memory. The simplified section may comprise a number of no operation (NOP) instructions replacing, which may then be jumped for further simplification.03-05-2009
20090055814JUST-IN-TIME COMPILER SUPPORT FOR INTERRUPTIBLE CODE - A computer implemented method for performing inlining in a just-in-time compiler. Compilation of a first code of a program is begun. The first code is one of an interruptible code and a non-interruptible code. A try region is established around a second code of the program to form a wrapped second code. The try region is a boundary between interruptible and non-interruptible code such that a third code that modifies an observable state of the program cannot be moved across the boundary. The second code is, relative to the first code, the other of the interruptible code and the non-interruptible code. The wrapped second code is inlined with the first code during compilation. Compilation of the first code is completed to form a resultant code. The resultant code is stored.02-26-2009
20110231829USE OF COMPILER-INTRODUCED IDENTIFIERS TO IMPROVE DEBUG INFORMATION PERTAINING TO USER VARIABLES - A method and system for improving debug information pertaining to user variables using a compiler. The method may include identifying a statement to be removed from its current position in an internal representation of a program by a compiler as part of the compiler optimization, replacing the statement to be removed with a debug annotation, adding references to the debug annotation in subsequent debug expressions referring to the removed statement, and emitting debug location information for a user variable using the debug annotation.09-22-2011
20100229163METHOD FOR MODIFYING THE ASSEMBLY OUTPUT OF A COMPILER - The present invention performs manipulations on the assembly file level. As a compiler outputs an assembly file, the assembly file may be inspected and modified before it is sent to the assembler. One or more of the following modifications may be made to the assembly file: rewrite certain symbols, scramble program symbols, reorganize declarations of global variables so that their layout and default values are known prior to linking, and identify initializer and de-initializer functions in order to make them callable through central initialization and de-initialization functions, respectively.09-09-2010
20100218175EXCEPTION DECLARATION REFACTORING TO REDUCE MEMORY FOOTPRINT - A system, method and program product for optimizing compiled Java code to reduce file size. A system is provided that includes: a first optimization that removes unnecessary exception declarations in the compiled Java code; a second optimization that converts checked exception declarations to unchecked exception declarations in the compiled Java code; and a third optimization that removes exception lists in the compiled Java code.08-26-2010
20090138863Method And Apparatus For Protecting .NET Programs - The present invention discloses a method and apparatus for protecting .net programs, relating to software protection. The method mainly includes: selecting a binary code segment from a .net program; transforming the binary code segment, and removing it from the .net program; writing the binary code segment to a shell of the .net program, and writing a shell calling instruction to the .net program; and executing the .net program, and calling a .net virtual machine to execute the binary code segment. The apparatus includes a selecting module, a transforming and removing module, a writing module, and an executing module. The programs running on the .net platform can be protected simply by being transformed.05-28-2009
20090313616Code reuse and locality hinting - A method and apparatus for improving parallelism through optimal code replication is herein described. An optimal replication factor for code is determined based on costs associated with a plurality of replication factors. The code is replicated by the optimal replication factor, and then the code is potentially executed in parallel to obtain parallelized efficient execution.12-17-2009
20090217252OPTIMIZING COMPILER TRANSFORMS FOR A HIGH LEVEL SHADER LANGUAGE - A high level shader language compiler incorporates transforms to optimize shader code for graphics processing hardware. An instruction reordering transform determines instruction encapsulations of dependent instructions that reduce concurrent register usage by the shader. A phase pulling transform re-organizes the shader's instructions into phases that reduce a measure of depth of texture loads. A register assigning transform assigns registers to lower register usage by the shader.08-27-2009
20100058303SYSTEM AND METHOD FOR CONDITIONAL EXPANSION OBFUSCATION - Disclosed herein are systems, methods, and computer readable-media for obfuscating code through conditional expansion obfuscation. The method includes identifying a conditional expression in a computer program, identifying a sequence of conditional expressions that is semantically equivalent to the conditional expression, and replacing the conditional expression with the semantically equivalent sequence of conditional expressions. One option replaces each like conditional expression in the computer program with a diverse set of sequences of semantically equivalent conditional expressions. A second option rearranges computer instructions that are to be processed after the sequence of conditional expression is evaluated so that a portion of the instructions is performed before the entire sequence of conditional expressions is evaluated. A third option performs conditional expansion obfuscation of a conditional statement in combination with branch extraction obfuscation.03-04-2010
20100058304TYPE DESCRIPTOR MANAGEMENT FOR FROZEN OBJECTS - The efficient use of type descriptors with frozen objects. A frozen object might actually include several type descriptors, a primary type descriptor that is canonical according to a set of canonicalization rules, and an auxiliary type descriptor that is not identical to the primary type descriptor. The auxiliary type descriptor may be used to access the canonical type descriptor. When performing an operation, if the auxiliary type descriptor can be used to perform the operation, then that auxiliary type descriptor may be used. If the canonical type descriptor is to be used to perform the operation, the auxiliary type descriptor is used to gain access to the canonical primary type descriptor. The primary type descriptor is then used to perform the operation.03-04-2010
20120311553HANDLING CROSS-THREAD METHOD CALLS - A method of compiling source code into object code for a multi-threaded runtime environment is disclosed. Source code is compiled into object code using a compilation engine. Marshalling attributes associated with method code intended for executing in a secondary thread are identified. The marshalling attributes and the method code are rewritten as marshaled method code for executing the method code in the secondary thread according to the identified marshalling attributes.12-06-2012
20110265070RESUMABLE METHODS - APIs are provided, that are external to a programming language but that provide functionality that can be plugged into a language compiler. The provided APIs tailor functionality associated with asynchronous programming, iterators or writing symmetric co-routines using a generalized pattern-based approach. Several types of resumable methods are provided in the APIs which can be applied to method bodies written in traditional program code. Syntactically distinguishable control points in method bodies written in traditional program code invoke transformation of the code by the compiler using the external APIs. The transformed code enables the pausing and resumption of the code sandwiched between control points in the transformed code. The source code contained within a method having control points in it is transformed so that code within the method can be executed in discrete parts, each part starting and ending at a control point in the transformed code.10-27-2011
20120117551OPTIMIZATION OF DECLARATIVE QUERIES - Source code is generated that includes one or more iterator-based expressions such as declarative queries. The source code is translated into an intermediate language that classifies operators making up the iterator-based expressions into classes based on whether the operators are aggregating, element-wise, or sink operators. The intermediate language, including the identified classes, is processed using an automaton to replace the iterator-based expressions with one or more equivalent non-iterator-based expressions. Where an iterator-based expression is nested, the nested expression is processed using an equivalent number of nested automatons. The resulting optimized source code may be compiled and executed using fewer virtual function calls than the equivalent non-optimized source code.05-10-2012
20110131561Memory Optimization of Virtual Machine Code by Partitioning Extraneous Information - A method, computer program product, and system for memory optimization by partitioning extraneous information from executable virtual machine code or interpreted code. The extraneous information may be stored separately, or accessed from the original code if needed for debugging or servicing the code in the field. This approach optimizes memory usage by reducing memory footprint while maintaining accessibility of the non-executable information for debugging and other processes necessary for servicing code in the field.06-02-2011
20120096448APPARATUS AND METHOD TO SELECTIVELY REMOVE MEMOIZING FUNCTIONS FROM PROGRAM CODE - A method to selectively remove memoizing functions from computer program code is disclosed herein. In one embodiment, such a method includes locating a memoizing function call in program code. The method then replaces the memoizing function call with a simple object allocation. Using escape analysis, the method determines whether the replacement is legal. If the replacement is not legal, the method removes the simple object allocation and reinserts the original memoizing function call in its place. If the replacement is legal, the method retains the simple object allocation in the program code. If desired, certain compiler optimizations, such as stack allocation and scalarization, may then be performed on the simple object allocation. A corresponding computer program product and apparatus are also disclosed herein.04-19-2012
717160000 Including loop 49
20090307673System and Method for Domain Stretching for an Advanced Dual-Representation Polyhedral Loop Transformation Framework - A system and method for domain stretching for an advanced dual-representation polyhedral loop transformation framework are provided. The mechanisms of the illustrative embodiments address the weaknesses of the known polyhedral loop transformation based approaches by providing mechanisms for performing code generation transformations on individual statement instances in an intermediate representation generated by the polyhedral loop transformation optimization of the source code. These code generation transformations have the important property that they do not change program order of the statements in the intermediate representation. This property allows the result of the code generation transformations to be provided back to the polyhedral loop transformation mechanisms in a program statement view, via a new re-entrance path of the illustrative embodiments, for additional optimization. In addition, mechanisms are provided for stretching the domains of statements in a program loop view of the source code to thereby normalize the domains.12-10-2009
20130125104REDUCING BRANCH MISPREDICTION IMPACT IN NESTED LOOP CODE - According to one aspect of the present disclosure, a method and technique for reducing branch misprediction impact for nested loop code is disclosed. The method includes: responsive to identifying code having an outer loop and an inner loop, determining a quantity of iterations of the inner loop for an initial number of iterations of the outer loop; determining a number of processor cycles for executing the quantity of iterations of the inner loop for the initial number of iterations of the outer loop; determining whether the number of processor cycles is less than a threshold; and responsive to determining that the number of processor cycles is less than the threshold, fully unrolling the inner loop for the initial number of iterations of the outer loop.05-16-2013
20130125105UNIFIED PARALLEL C WORK-SHARING LOOP CONSTRUCT TRANSFORMATION - Control flow information and data flow information associated with a program containing a upc_forall loop are built. A shared reference map data structure using the control flow information and the data flow information is created. All local shared accesses are hashed to facilitate a constant access stride after being rewritten. All local shared references in a hash entry having a longest list are privatized. The upc_forall loop is rewritten into a for loop. Responsive to a determination that an unprocessed upc_forall loop does not exist, dead store elimination is run. The control flow information and the data flow information associated with the program containing the for loop is rebuilt.05-16-2013
20090307675DATA DEPENDENCE TESTING FOR LOOP FUSION WITH CODE REPLICATION, ARRAY CONTRACTION, AND LOOP INTERCHANGE - Methods and apparatus to data dependence testing for loop fusion, e.g., with code replication, array contraction, and/or loop interchange, are described. In one embodiment, a compiler may optimize code for efficient execution during run-time by testing for dependencies associated with improving memory locality through code replication in loops that enable various loop transformations. Other embodiments are also described.12-10-2009
20090083724System and Method for Advanced Polyhedral Loop Transformations of Source Code in a Compiler - A system and method for advanced polyhedral loop transformations of source code in a compiler are provided. The mechanisms of the illustrative embodiments address the weaknesses of the known polyhedral loop transformation based approaches by providing mechanisms for performing code generation transformations on individual statement instances in an intermediate representation generated by the polyhedral loop transformation optimization of the source code. These code generation transformations have the important property that they do not change program order of the statements in the intermediate representation. This property allows the result of the code generation transformations to be provided back to the polyhedral loop transformation mechanisms in a program statement view, via a new re-entrance path of the illustrative embodiments, for additional optimization.03-26-2009
20090064119Systems, Methods, And Computer Products For Compiler Support For Aggressive Safe Load Speculation - Systems, methods and computer products for compiler support for aggressive safe load speculation. Exemplary embodiments include a method for aggressive safe load speculation for a compiler in a computer system, the method including building a control flow graph, identifying both countable and non-countable loops, gathering a set of candidate loops for load speculation, for each candidate loop in the set of candidate loops gathered for load speculation performing computing an estimate of the iteration count, delay cycles, and code size, performing a profitability analysis and determine an unroll factor based on the delay cycles and the code size, transforming the loop by generating a prologue loop to achieve data alignment and an unrolled main loop with loop directives, indicating which loads can safely be executed speculatively and performing low-level instruction on the generated unrolled main loop.03-05-2009
20090288075PARALLELIZING NON-COUNTABLE LOOPS WITH HARDWARE TRANSACTIONAL MEMORY - A system and method for speculatively parallelizing non-countable loops in a multi-threaded application. A multi-core processor receives instructions for a multi-threaded application. The application may contain non-countable loops. Non-countable loops have an iteration count value that cannot be determined prior to the execution of the non-countable loop, a loop index value that cannot be non-speculatively determined prior to the execution of an iteration of the non-countable loop, and control that is not transferred out of the loop body by a code line in the loop body. The compiler replaces the non-countable loop with a parallelized loop pattern that uses outlined function calls defined in a parallelization library (PL) in order to speculatively execute iterations of the parallelized loop. The parallelized loop pattern is configured to squash and re-execute any speculative thread of the parallelized loop pattern that is signaled to have a transaction failure.11-19-2009
20110271265METHOD OF AUTOMATIC GENERATION OF EXECUTABLE CODE FOR MULTI-CORE PARALLEL PROCESSING - A system, method and computer program product for optimizing the process of compilation of computer program code. The compiler transforms the program code written in a variety of languages and creates additional code performing parallel processing of program tasks on target hardware architecture. The transformation of code is performed to achieve optimization of various critical parameters such as the execution speed on a multi-core or cluster target hardware architecture.11-03-2011
20080244549METHOD AND APPARATUS FOR EXPLOITING THREAD-LEVEL PARALLELISM - According to one example embodiment, there is disclosed herein uses partial recurrence relaxation for parallelizing DOACROSS loops on multi-core computer architectures. By one example definition, a DOACROSS may be a loop that allows successive iterations executing by overlapping; that is, all iterations must impose a partial execution order. According to one embodiment, the inventive subject matter may be used to transform the dependence structure of a given loop with recurrences for maximal degree of thread-level parallelism (TLP), where the threads can be mapped on to either different logical processors (in a hyperthreaded processor) or can be mapped onto different physical cores (or processors) in a multi-core processor.10-02-2008
20080250401Tiling across loop nests with possible recomputation - Described is a technology by which a series of loop nests corresponding to source code are detected by a compiler, with the series of loop nests tiled together, (thereby increasing the ratio of cache hits to misses in a multi-processor environment). The compiler transforms the series of loop nests into a plurality of tile loops within a controller loop, including using dependency analysis to determine which results from a tile loop need to be pre-computed before another tile loop. For dependency analysis, the compiler may use a directed acyclic graph as a high-level intermediate representation, and split the graph into sub-graphs each representing an array. The compiler uses descriptors processed from the graph to determine the controller loop and the tile loops within that controller loop.10-09-2008
20090328020INTERFACE OPTIMIZATION IN A CLOSED SYSTEM - Interface optimization is provided using a closed system in which all the individual software components in the system are known to the compiler at a single point in time. This knowledge enables significant opportunities to optimize the implementation of interfaces on a set of implemented objects. When code is compiled, because the compiler knows the full list of interfaces and the objects which implement the interfaces, it can improve execution and working set (i.e., recently referenced pages in a program's virtual address space) when implementing the interfaces on objects. This improvement may be realized by reducing the size of interface lookup tables which map each interface to the object types which implement that particular interface.12-31-2009
20080271005SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR REDUCING NUMBER OF EXCEPTION CHECKS - Based on operations within an uncounted loop of source code, one or more calculations are generated for determining, at runtime, an expected number of iterations through which the uncounted loop can iterate before encountering an exception corresponding to at least one target exception check. A copy of the uncounted loop omitting each target exception check is generated. The uncounted loop, the copy of the uncounted loop, and the one or more calculations are arranged in compiled code so that at runtime program flow enters the copy of the uncounted loop. If a maximum number of iterations of the copy of the uncounted loop is reached, program flow proceeds from the copy of the uncounted loop to the uncounted loop. The maximum number of iterations is no more than the smallest member of a set consisting of the expected number of iterations for each target exception check.10-30-2008
20100146495METHOD AND SYSTEM FOR INTERPROCEDURAL PREFETCHING - A computing system has an amount of shared cache, and performs runtime automatic parallelization wherein when a parallelized loop is encountered, a main thread shares the workload with at least one other non-main thread. A method for providing interprocedural prefetching includes compiling source code to produce compiled code having a main thread including a parallelized loop. Prior to the parallelized loop in the main thread, the main thread includes prefetching instructions for the at least one other non-main thread that shares the workload of the parallelized loop. As a result, the main thread prefetches data into the shared cache for use by the at least one other non-main thread.06-10-2010
20110225573Computation Reuse for Loops with Irregular Accesses - A compiler selects a nested loop within software code that includes an outer loop and an inner loop. The outer loop includes an outer induction variable and the inner loop includes an inner induction variable. The compiler identifies a computation included in the nested loop that generates an irregular array access, which includes an expression of both the outer induction variable and the inner induction variable. Next, the compiler identifies a redundant calculation for the computation based upon the outer induction variable and the inner induction variable, and generates a temporary variable to correspond with the redundant calculation. The compiler replaces the computation with the temporary variable in the nested loop and, in turn, compiles the nested loop with the included temporary variable.09-15-2011
20090064120Method and apparatus to achieve maximum outer level parallelism of a loop - In one embodiment, the present invention includes a method for constructing a data dependency graph (DDG) for a loop to be transformed, performing statement shifting to transform the loop into a first transformed loop according to at least one of first and second algorithms, performing unimodular and echelon transformations of a selected one of the first or second transformed loops, partitioning the selected transformed loop to obtain maximum outer level parallelism (MOLP), and partitioning the selected transformed loop into multiple sub-loops. Other embodiments are described and claimed.03-05-2009
20110231830Loop Transformation for Computer Compiler Optimization - A new computer-compiler architecture includes code analysis processes in which loops present in an intermediate instruction set are transformed into more efficient loops prior to fully executing the intermediate instruction set. The compiler architecture starts by generating the equivalent intermediate instructions for the original high level source code. For each loop in the intermediate instructions, a total cycle cost is calculated using a cycle cost table associated with the compiler. The compiler then generates intermediate code for replacement loops in which all conversion instructions are removed. The cycle costs for these new transformed loops are then compared against the total cycle cost for the original loops. If the total cycle costs exceed the new cycle costs, the compiler will replace the original loops in the intermediate instructions with the new transformed loops prior to generation of final code using the instruction set of the processor.09-22-2011
20080313621SCALAR CODE REDUCTION USING SHORTEST PATH ROUTING - This document discusses, among other things, a system and method computing the shortest path expression in a loop having a plurality of expressions. Candidate expressions in the loop are identified and partitioned into sets. A cost matrix is computed as a function of the sets. Paths are found through the cost matrix and, if there are cycles in the paths, the cycles are broken. One or more shortest path expressions are generated as a function of the paths and one or more of the expressions in the loop are replaced with the shortest path expressions.12-18-2008
20090077544METHOD, SYSTEM AND PROGRAM PRODUCT FOR OPTIMIZING EMULATION OF A SUSPECTED MALWARE - A method, system and program product for optimizing emulation of a suspected malware. The method includes identifying, using an emulation optimizer tool, whether an instruction in a suspected malware being emulated by an emulation engine in a virtual environment signifies a long loop and, if so, generating a first hash for the loop. Further, the method includes ascertaining whether the first hash generated matches any long loop entries in a storage and, if so calculating a second hash for the long loop. Furthermore, the method includes inspecting any long loop entries ascertained to find an entry having a respective second hash matching the second hash calculated. If an entry matching the second hash calculated is found, the method further includes updating one or more states of the emulation engine, such that, execution of the long loop of the suspected malware is skipped, which optimizes emulation of the suspected malware.03-19-2009
20090077545PIPELINED PARALLELIZATION OF MULTI-DIMENSIONAL LOOPS WITH MULTIPLE DATA DEPENDENCIES - A mechanism for folding all the data dependencies in a loop into a single, conservative dependence. This mechanism leads to one pair of synchronization primitives per loop. This mechanism does not require complicated, multi-stage compile time analysis. This mechanism considers only the data dependence information in the loop. The low synchronization cost balances the loss in parallelism due to the reduced overlap between iterations. Additionally, a novel scheme is presented to implement required synchronization to enforce data dependences in a DOACROSS loop. The synchronization is based on an iteration vector, which identifies a spatial position in the iteration space of the loop. Multiple iterations executing in parallel have their own iteration vector for synchronization where they update their position in the iteration space. As no sequential updates to the synchronization variable exist, this method exploits a greater degree of parallelism.03-19-2009
20090307674IMPROVING DATA LOCALITY AND PARALLELISM BY CODE REPLICATION AND ARRAY CONTRACTION - Provided are a method, system, and article of manufacture improving data locality and parallelism by code replication and array contraction. Source code including an array of elements referenced using at least two indices is processed. The array is nested within multiple loops, wherein at least two of the loops perform iterations with respect to the indices of the array, wherein the index incremented in at least one innermost loop of the loops does not comprise a leftmost index in the array. The source code is transformed to object code by performing operations including fusing at least two innermost loops of the loops in object code generated by compiling the source code by replicating statements from at least one of the innermost loops into a fused innermost loop and performing loop interchange in the object code to have the fused innermost loop provide iterations with respect to the leftmost index in the array.12-10-2009
20110029962VECTORIZATION OF PROGRAM CODE - A method for vectorization of a block of code is provided. The method comprises receiving a first block of code as input; and converting the first block of code into at least a second block of code and a third block of code. The first block of code accesses a first set of memory addresses that are potentially misaligned. The second block of code performs conditional leaping address incrementation to selectively access a first subset of the first set of memory addresses. The third block of code accesses a second subset of the first set of memory addresses starting from an aligned memory address, simultaneously accessing multiple memory addresses at a time. No memory address belongs to both the first subset and the second subset of memory addresses.02-03-2011
20110047534PROACTIVE LOOP FUSION OF NON-ADJACENT LOOPS WITH INTERVENING CONTROL FLOW INSTRUCTIONS - A system and method for optimization of code with non-adjacent loops. A compiler builds a node tree, which is not a control flow graph, that represents parent-child relationships of nodes of a computer program. Each node represents a control flow statement or a straight-line block of statements of the computer program. If a non-adjacent loop pair of nodes satisfy predetermined conditions, the compiler may perform legal code transformations on the computer program and corresponding node transformations on the node tree. These transformations may make adjacent this pair of loop nodes. The compiler may be configured to perform legal code transformations, such as head and tail duplication, code motion, and if-merging, in order to make adjacent these two loop nodes. Then loop fusion may be performed on this loop pair in order to increase instruction level parallelism (ILP) within an optimized version of the original source code.02-24-2011
20090328021Multiversioning if statement merging and loop fusion - In one embodiment of the invention, a method for fusing a first loop nested in a first IF statement with a second loop nested in a second IF statement without the use of modified and referenced (mod-ref) information to determine if certain conditional statements in the IF statements retain variable values.12-31-2009
20080229298Compiler Method for Employing Multiple Autonomous Synergistic Processors to Simultaneously Operate on Longer Vectors of Data - A compiler includes a mechanism for employing multiple synergistic processors to execute long vectors. The compiler receives a single source program. The compiler identifies vectorizable loop code in the single source program and extracts the vectorizable loop code from the single source program. The compiler then compiles the extracted vectorizable loop code for a plurality of synergistic processors. The compiler also compiles a remainder of the single source program for a principal processor to form an executable main program such that the executable main program controls operation of the executable vectorizable loop code on the plurality of synergistic processors.09-18-2008
20080222623Efficient Code Generation Using Loop Peeling for SIMD Loop Code with Multiple Misaligned Statements - An approach is provided for vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores. In this framework, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirements of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residual iteration counts, and multiple statements with arbitrary alignment combinations. Loop peeling is used to reduce the computational overhead associated with misaligned data. A loop prologue and epilogue are peeled from individual iterations in the simdized loop, and vector-splicing instructions are applied to the peeled iterations, while the steady-state loop body incurs no additional computational overhead.09-11-2008
20100318980STATIC PROGRAM REDUCTION FOR COMPLEXITY ANALYSIS - Described is an analysis tool/techniques for determining the computational complexity of a computer program, including when the program includes procedures having nested loops and/or multi-path loops. First, multi-path loops are converted into code-fragments consisting of simpler loops via a transformation called control flow refinement. Progress invariants are determined for appropriate locations in the procedure to represent relationships between a state that can arise at that program location and the previous state at that location. A bound finding mechanism (such as one based on pattern matching) is then used to compute loop bounds from progress invariants. These bounds are then composed appropriately to determine a precise bound for the enclosing procedure.12-16-2010
20100318979VECTOR ATOMIC MEMORY OPERATION VECTOR UPDATE SYSTEM AND METHOD - A system and method of compiling program code, wherein the program code includes an operation on an array of data elements stored in memory of a computer system. The program code is scanned for an equation which may have recurring data points. The equation is then replaced with vectorized machine executable code, wherein the machine executable code comprises a nested loop and wherein the nested loop comprises an exterior loop and a virtual interior loop. The exterior loop decomposes the equation into a plurality of loops of length N, wherein N is an integer greater than one. The virtual interior loop executes vector operations corresponding to the N length loop to form a result vector resident in memory, wherein the virtual interior loop includes a vector atomic memory operation (AMO) instruction.12-16-2010
20100023932Efficient Software Cache Accessing With Handle Reuse - A mechanism for efficient software cache accessing with handle reuse is provided. The mechanism groups references in source code into a reference stream with the reference stream having a size equal to or less than a size of a software cache line. The source code is transformed into optimized code by modifying the source code to include code for performing at most two cache lookup operations for the reference stream to obtain two cache line handles. Moreover, the transformation involves inserting code to resolve references in the reference stream based on the two cache line handles. The optimized code may be output for generation of executable code.01-28-2010
20110314461IMPLEMENTING PARALLEL LOOPS WITH SERIAL SEMANTICS - The present invention extends to methods, systems, and computer program products for implementing parallel loops with serial semantics. Embodiments of the invention provide a semantic transforms and codegen patterns that provide more efficient parallel loop implementations with serial loop semantics. Embodiments of the invention support assignments within for-loop bodies, support break/return constructs within for-loop bodies, and run transformations to covert serial constructs to parallel constructs.12-22-2011
20090055815Eliminate Maximum Operation in Loop Bounds with Loop Versioning - A method and computer program product for eliminating maximum and minimum expressions within loop bounds are provided. A loop in a code is identified. The loop is determined to meet conditions, which require an upper loop bound and a lower loop bound to contain maximum and minimum expressions, loop-invariant operands, a predetermined size for a code size, and a total number of instructions to be greater than a predetermined constant. A profitability of loop versioning is determined based on a performance gain of a fast version of the loop, a probability of executing the fast version of the loop at runtime, and an overhead for performing loop versioning. A pair of lower loop bound and upper loop bound values resulting in a constant number is identified. A loop iteration value is checked to be a non-zero constant. Branches are identified, and loop versioning is performed to generate a versioned loop.02-26-2009
20120117552SPECULATIVE COMPILATION TO GENERATE ADVICE MESSAGES - Methods to improve optimization of compilation are presented. In one embodiment, a method includes identifying one or more optimization speculations with respect to a code region and speculatively performing transformation on an intermediate representation of the code region in accordance with an optimization speculation. The method includes generating an advice message corresponding to the optimization speculation and displaying the advice message if the optimization speculation results in an improved compilation result.05-10-2012
20120079469Systems And Methods For Compiler-Based Vectorization Of Non-Leaf Code - Systems and methods for the vectorization of software applications are described. In some embodiments, source code dependencies can be expressed in ways that can extend a compiler's ability to vectorize otherwise scalar functions. For example, when compiling a called function, a compiler may identify dependencies of the called function on variables other than parameters passed to the called function. The compiler may record these dependencies, e.g., in a dependency file. Later, when compiling a calling function that calls the called function, the same (or another) compiler may reference the previously-identified dependencies and use them to determine whether and how to vectorize the calling function. In particular, these techniques may facilitate the vectorization of non-leaf loops. Because non-leaf loops are relatively common, the techniques described herein can increase the amount of vectorization that can be applied to many applications.03-29-2012
20120167069LOOP PARALLELIZATION BASED ON LOOP SPLITTING OR INDEX ARRAY - Methods and apparatus to provide loop parallelization based on loop splitting and/or index array are described. In one embodiment, one or more split loops, corresponding to an original loop, are generated based on the mis-speculation information. In another embodiment, a plurality of subloops are generated from an original loop based on an index array. Other embodiments are also described.06-28-2012
20120167068SPECULATIVE REGION-LEVEL LOOP OPTIMIZATIONS - A system and method are configured to apply region level optimizations to a selected region of source code rather than loop level optimizations to a loop or loop nest. The region may include an outer loop, a plurality of inner loops and at least one control code. If the region includes an exceptional control flow statement and/or a procedure call, speculative region-level multi-versioning may be applied.06-28-2012
20100205592CONTROL STRUCTURE REFINEMENT OF LOOPS USING STATIC ANALYSIS - A system and method for discovering a set of possible iteration sequences for a given loop in a software program is described, to transform the loop representation. In a program containing a loop, the loop is partitioned into a plurality of portions based on splitting criteria. Labels are associated with the portions, and an initial loop automaton is constructed that represents the loop iterations as a regular language over the labels corresponding to the portions in the program. Subsequences of the labels are analyzed to determine infeasibility of the subsequences permitted in the automaton. The automaton is refined by removing all infeasible subsequences to discover a set of possible iteration sequences in the loop. The resulting loop automaton is used in a subsequent program verification or analysis technique to find violations of correctness properties in programs.08-12-2010
20110185347METHOD AND SYSTEM FOR EXECUTION PROFILING USING LOOP COUNT VARIANCE - A method for executing a computer program involving obtaining a statement of the source code, where the statement comprises a method call, and where the source code is composed in a statically-typed programming language. The method also involves, upon entry into a loop included in the computer program: incrementing an entry counter by one; and, for each iteration of the loop, incrementing an iteration counter by one, incrementing a local counter by one to obtain an incremented value of the local counter, incrementing a summation variable by the incremented value of the local counter, and executing the iteration of the loop.07-28-2011
20120331453VECTORIZATION OF PROGRAM CODE - A method for vectorization of a block of code is provided. The method comprises receiving a first block of code as input; and converting the first block of code into at least a second block of code and a third block of code. The first block of code accesses a first set of memory addresses that are potentially misaligned. The second block of code performs conditional leaping address incrementation to selectively access a first subset of the first set of memory addresses. The third block of code accesses a second subset of the first set of memory addresses starting from an aligned memory address, simultaneously accessing multiple memory addresses at a time. No memory address belongs to both the first subset and the second subset of memory addresses.12-27-2012
20120151463METHOD AND SYSTEM FOR UTILIZING PARALLELISM ACROSS LOOPS - A method for compiling application source code that includes selecting multiple loops for parallelization. The multiple loops include a first loop and a second loop. The method further includes partitioning the first loop into a first set of chunks, partitioning the second loop into a second set of chunks, and calculating data dependencies between the first set of chunks and the second set of chunks. A first chunk of the second set of chunks is dependent on a first chunk of the first set of chunks. The method further includes inserting, into the first loop and prior to completing compilation, a precedent synchronization instruction for execution when execution of the first chunk of the first set of chunks completes, and completing the compilation of the application source code to create an application compiled code.06-14-2012
717161000 Including scheduling instructions 11
20100107147COMPILER AND COMPILING METHOD - A compiler allocates an unroll_group_number conferred based on a sequence in which a loop body is replicated by loop unrolling to each loop body during loop unrolling based on the optimized number of loop unrolling. The allocated unroll_group_number is added to each instruction included in each loop body. A priority of an instruction is adjusted based on the allocated unroll_group_number during instruction scheduling.04-29-2010
20120192169Optimizing Libraries for Validating C++ Programs Using Symbolic Execution - Particular embodiments optimize a C++ function comprising one or more loops for symbolic execution, comprising for each loop, if there is a branching condition within the loop, then rewrite the loop to move the branching condition outside the loop. Particular embodiments may further optimize the C++ function through simplified symbolic expressions and adding constructs forcing delayed interpretation of symbolic expressions during the symbolic execution.07-26-2012
20090013316Extension of Swing Modulo Scheduling to Evenly Distribute Uniform Strongly Connected Components - A method, apparatus, and computer instructions for scheduling instructions for execution. Identify a series of instructions in a loop, wherein the series of instructions has a cyclic data dependency. Determine whether the series of instructions is a uniform series of instructions. Schedule execution of the uniform series of instructions within the loop to optimize execution of the loop in response to the identified series of instructions being the uniform series of instructions.01-08-2009
20080201699Efficient Data Reorganization to Satisfy Data Alignment Constraints - Vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores is presented. In the framework presented herein, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirement of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residue iteration counts, and multiple statements with arbitrary alignment combinations. Beyond generating a valid simdization, a preferred embodiment further improves the quality of the generated codes. Four stream-shift placement policies are disclosed, which minimize the number of data reorganization generated by the alignment handling.08-21-2008
20090064121SYSTEMS, METHODS, AND COMPUTER PRODUCTS FOR IMPLEMENTING SHADOW VERSIONING TO IMPROVE DATA DEPENDENCE ANALYSIS FOR INSTRUCTION SCHEDULING - Systems, methods and computer products for implementing shadow versioning to improve data dependence analysis for instruction scheduling. Exemplary embodiments include a method to identify loops within the code to be compiled, for each loop a dependence initializing a matrix, for each loop shadow identifying symbols that are accessed by the loop, examining dependencies, storing, comparing and classifying the dependence vectors, generating new shadow symbols, replacing the old shadow symbols with the new shadow symbols, generating alias relationships between the newly created shadow symbols, scheduling instructions and compiling the code.03-05-2009
20090138864Method and Apparatus for Automatic Second-Order Predictive Commoning - A method and apparatus for automatic second-order predictive commoning is provided by the present invention. During an analysis phase, the intermediate representation of a program code is analyzed to identify opportunities for second-order predictive commoning optimization. The analyzed information is used by the present invention for apply transformations to the program code, such that the number of memory access and the number of computations are reduced for loop iterations and performance of program code is improved.05-28-2009
20090254895Prefetching Irregular Data References for Software Controlled Caches - Prefetching irregular memory references into a software controlled cache is provided. A compiler analyzes source code to identify at least one of a plurality of loops that contain an irregular memory reference. The compiler determines if the irregular memory reference within the at least one loop is a candidate for optimization. Responsive to an indication that the irregular memory reference may be optimized, the compiler determines if the irregular memory reference is valid for prefetching. Responsive to an indication that the irregular memory reference is valid for prefetching, a store statement for an address of the irregular memory reference is inserted into the at least one loop. A runtime library call is inserted into a prefetch runtime library for the irregular memory reference. Data associated with the irregular memory reference is prefetched into the software controlled cache when the runtime library call is invoked.10-08-2009
20080313622Low-Level Connectivity Admission Control for Networked Consumer Storage Devices - The present invention relates to a device and a method of accessing a storage of a storage device by reading or writing data to said storage device, wherein said accessing is controlled by an external data controller via a low level in a software stack of said device, and wherein said storage device is accessed without hampering the operation of functionalities at a higher level of said software stack of said device, said device comprises an intermediate storage, and the method comprises the steps of storing commands related to said accessing of data in said intermediate storage as a command queue, executing said commands in said command queue when allowed by a command queue scheduler, said scheduler scheduling in dependence of at least one of the functionalities at said higher level of said software stack. Thereby full control is obtained on storage medium requests by the scheduler.12-18-2008
20090217253COMPILER FRAMEWORK FOR SPECULATIVE AUTOMATIC PARALLELIZATION WITH TRANSACTIONAL MEMORY - A computer program is speculatively parallelized with transactional memory by scoping program variables at compile time, and inserting code into the program at compile time. Determinations of the scoping can be based on whether scalar variables being scoped are involved in inter-loop non-reduction data dependencies, are used outside loops in which they were defined, and at what point in a loop a scalar variable is defined. The inserted code can include instructions for execution at a run time of the program to determine loop boundaries of the program, and issue checkpoint instructions and commit instructions that encompass transaction regions in the program. A transaction region can include an original function of the program and a spin-waiting loop with a non-transactional load, wherein the spin-waiting loop is configured to wait for a previous thread to commit before the current transaction commits.08-27-2009
20100251229PROCESSORS AND COMPILING METHODS FOR PROCESSORS - A compiling method compiles an object program to be executed by a processor having a plurality of execution units operable in parallel. In the method a first availability chain is created from a producer instruction (p09-30-2010
20080201698Reordering application code to improve processing performance - A method of reordering a sequence of code for processing by a target data processor in order to reduce an execution time for said code on said target data processor is disclosed. The method comprises the steps of: in response to a request to execute said sequence of code, loading said sequence of code into a volatile data store associated with said target data processor; analyzing said sequence of code in relation to properties of said target data processor; identifying interlocks within said sequence of code when executing on said target data processor, in which a portion of code would be stalled while waiting for an earlier portion to complete; reordering said sequence of code to remove at least some of said interlocks; and executing said reordered sequence of code; wherein said steps of analyzing, identifying, reordering and executing are performed by said target data processor.08-21-2008

Patent applications in class Code restructuring

Patent applications in all subclasses Code restructuring