Inventors list

Assignees list

Classification tree browser

Top 100 Inventors

Top 100 Assignees


Tong Chen, Yorktown Heights US

Tong Chen, Yorktown Heights, NY US

Patent application numberDescriptionPublished
20080229291Compiler Implemented Software Cache Apparatus and Method in which Non-Aliased Explicitly Fetched Data are Excluded - A compiler implemented software cache apparatus and method in which non-aliased explicitly fetched data are excluded are provided. With the mechanisms of the illustrative embodiments, a compiler uses a forward data flow analysis to prove that there is no alias between the cached data and explicitly fetched data. Explicitly fetched data that has no alias in the cached data are excluded from the software cache. Explicitly fetched data that has aliases in the cached data are allowed to be stored in the software cache. In this way, there is no runtime overhead to maintain the correctness of the two copies of data. Moreover, the number of lines of the software cache that must be protected from eviction is decreased. This leads to a decrease in the amount of computation cycles required by the cache miss handler when evicting cache lines during cache miss handling.09-18-2008
20090070753INCREASE THE COVERAGE OF PROFILING FEEDBACK WITH DATA FLOW ANALYSIS - The present invention provides a system and method for profiling based optimization of a computer program. The system includes an optimization module that profiles feedback from profiled part of a program to a part of the program that was not reached, an identical expressions model that identifies at least one identical expression in the program that have not been profiled and copies alias profiling result from a profiled reference to the reference that has not been profiled, a speculative identical expressions model that identifies at least one speculative identical expression in the program that have not been profiled and copies alias profiling result from a profiled speculative identical reference to the speculative identical reference that has not been profiled, and a similar expressions model that identifies at least one similar expression in the program that have not been profiled and copies alias profiling result from a similar profiled reference to the similar reference that has not been profiled.03-12-2009
20090254711Reducing Cache Pollution of a Software Controlled Cache - Reducing cache pollution of a software controlled cache is provided. A request is received to prefetch data into the software controlled cache. A first designator is set for a first cache access to a first value. If there is the second cache access to prefetch, a determination is made as to whether data associated with the second cache access exists in the software controlled cache. If the data is in the software controlled cache, a determination is made as to whether a second value of a second designator is greater than the first value of the first cache access. If the second value fails to be greater than the first value, the position of the first cache access and the second cache access in a cache line is swapped. The first value is decremented by a predetermined amount and the second value is replaced to equal the first value.10-08-2009
20090254733Dynamically Controlling a Prefetching Range of a Software Controlled Cache - Dynamically controlling a prefetching range of a software controlled cache is provided. A compiler analyzes source code to identify at least one of a plurality of loops that contain irregular memory references. For each irregular memory reference in the source code, the compiler determines whether the irregular memory reference is a candidate for optimization. Responsive to identifying an irregular memory reference that may be optimized, the complier determines whether the irregular memory reference is valid for prefetching. If the irregular memory reference is valid for prefetching, a store statement for an address of the irregular memory reference is inserted into the at least one loop. A runtime library call is inserted into a prefetch runtime library to dynamically prefetch the irregular memory references. Data associated with the irregular memory references are dynamically prefetched into the software controlled cache when the runtime library call is invoked.10-08-2009
20090254895Prefetching Irregular Data References for Software Controlled Caches - Prefetching irregular memory references into a software controlled cache is provided. A compiler analyzes source code to identify at least one of a plurality of loops that contain an irregular memory reference. The compiler determines if the irregular memory reference within the at least one loop is a candidate for optimization. Responsive to an indication that the irregular memory reference may be optimized, the compiler determines if the irregular memory reference is valid for prefetching. Responsive to an indication that the irregular memory reference is valid for prefetching, a store statement for an address of the irregular memory reference is inserted into the at least one loop. A runtime library call is inserted into a prefetch runtime library for the irregular memory reference. Data associated with the irregular memory reference is prefetched into the software controlled cache when the runtime library call is invoked.10-08-2009
20090293047Reducing Runtime Coherency Checking with Global Data Flow Analysis - Reducing runtime coherency checking using global data flow analysis is provided. A determination is made as to whether a call is for at least one of a DMA get operation or a DMA put operation in response to the call being issued during execution of a compiled and optimized code. A determination is made as to whether a software cache write operation has been issued since a last flush operation in response to the call being the DMA get operation. A DMA get runtime coherency check is then performed in response to the software cache write operation being issued since the last flush operation.11-26-2009
20090293048Computer Analysis and Runtime Coherency Checking - Compiler analysis and runtime coherency checking for reducing coherency problems is provided. Source code is analyzed to identify at least one of a plurality of loops that contains a memory reference. A determination is made as to whether the memory reference is an access to a global memory that should be handled by at least one of a software controlled cache or a direct buffer. A determination is made as to whether there is a data dependence between the memory reference and at least one reference from at least one of other direct buffers or other software controlled caches in response to an indication that the memory reference is an access to the global memory that should be handled by either the software controlled cache or the direct buffer. A direct buffer transformation is applied to the memory reference in response to a negative indication of the data dependence.11-26-2009
20100023700Dynamically Maintaining Coherency Within Live Ranges of Direct Buffers - Reducing coherency problems in a data processing system is provided. Source code that is to be compiled is received and analyzed to identify at least one of a plurality of loops that contain a memory reference. A determination is made as to whether the memory reference is an access to a global memory that should be handled by a direct buffer. Responsive to an indication that the memory reference is an access to the global memory that should be handled by the direct buffer, the memory reference is marked for direct buffer transformation. The direct buffer transformation is then applied to the memory reference.01-28-2010
20100088673Optimized Code Generation Targeting a High Locality Software Cache - Mechanisms for optimized code generation targeting a high locality software cache are provided. Original computer code is parsed to identify memory references in the original computer code. Memory references are classified as either regular memory references or irregular memory references. Regular memory references are controlled by a high locality cache mechanism. Original computer code is transformed, by a compiler, to generate transformed computer code in which the regular memory references are grouped into one or more memory reference streams, each memory reference stream having a leading memory reference, a trailing memory reference, and one or more middle memory references. Transforming of the original computer code comprises inserting, into the original computer code, instructions to execute initialization, lookup, and cleanup operations associated with the leading memory reference and trailing memory reference in a different manner from initialization, lookup, and cleanup operations for the one or more middle memory references.04-08-2010
20110093687MANAGING MULTIPLE SPECULATIVE ASSIST THREADS AT DIFFERING CACHE LEVELS - An illustrative embodiment provides a computer-implemented process for managing multiple speculative assist threads for data pre-fetching that sends a command from an assist thread of a first processor to second processor and a memory, wherein parameters of the command specify a processor identifier of the second processor, responsive to receiving the command, reply by the second processor indicating an ability to receive a cache line that is a target of a pre-fetch, responsive to receiving the command replying by the memory indicating a capability to provide the cache line, responsive to receiving replies from the second processor and the memory, sending, by the first processor, a combined response to the second processor and the memory, wherein the combined response indicates an action, and responsive to the action indicating a transaction can continue sending the requested cache line, by the memory, to the second processor into a target cache level on the second processor.04-21-2011
20110093838MANAGING SPECULATIVE ASSIST THREADS - An illustrative embodiment provides a computer-implemented process for managing speculative assist threads for data pre-fetching that analyzes collected source code and cache profiling information to identify a code region containing a delinquent load instruction and generates an assist thread, including a value for a local version number, at a program entry point within the identified code region. Upon activation of the assist thread the local version number of the assist thread is compared to the global unique version number of the main thread for the identified code region and an iteration distance between the assist thread relative to the main thread is compared to a predefined value. The assist thread is executed when the local version number of the assist thread matches the global unique version number of the main thread, and the iteration distance between the assist thread relative to the main thread is within a predefined range of values.04-21-2011

Patent applications by Tong Chen, Yorktown Heights, NY US