Hierarchical caches

Subclass of:

711 - Electrical computers and digital processing systems: memory

711100000 - STORAGE ACCESSING AND CONTROL

711117000 - Hierarchical memories

711118000 - Caching

711119000 - Multiple caches

Patent class list (only non-empty classes are listed)

Deeper subclasses:

Entries
Document - Title - Date
20090210625 - Self Prefetching L3/L4 Cache Mechanism - Embodiments of the invention provide a look-aside-look-aside buffer (LLB) configured to retain a portion of the real addresses in a translation look-aside buffer (TLB) to allow prefetching of data from a cache. A subset of real address bits associated with an effective address may be retrieved relatively quickly from the LLB, thereby allowing access to the cache before the complete address translation is available and reducing cache access latency. (08-20-2009)
20090157966 - CACHE INJECTION USING SPECULATION - A method, system, and computer program product for cache injection using speculation are provided. The method includes creating a cache line indirection table at an input/output (I/O) hub, which includes fields and entries for addresses, processor ID, and cache type, as well as cache level line limit (CLL) fields. The method also includes setting cache line limits in the CLL fields and receiving a stream of contiguous addresses at the table. For each address in the stream, the method includes: looking up the address in the table; if the address is present in the table, injecting the cache line corresponding to the address into the processor complex; if the address is not present in the table, searching limit values from the lowest level cache to the highest level cache; and injecting addresses not present in the table into the cache hierarchy of the processor last injected from the contiguous address stream. (06-18-2009)
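The indirection-table lookup described in the 20090157966 entry above can be pictured in a few lines of C. This is a minimal software sketch, not the patented hardware: the table layout, sizes, and per-level line limits are all illustrative assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical cache line indirection table, loosely following the
 * 20090157966 abstract: entries hold an address, the owning processor,
 * and the target cache level; per-level line limits cap injections. */
typedef struct {
    uint64_t addr;      /* cache-line-aligned address              */
    unsigned proc_id;   /* processor complex to inject into        */
    unsigned cache_lvl; /* target cache level (1 = L1, 2 = L2 ...) */
    bool     valid;
} cit_entry;

#define CIT_SIZE   256
#define MAX_LEVELS 3

static cit_entry table[CIT_SIZE];
static const unsigned cll_limit[MAX_LEVELS] = { 8, 64, 512 };

/* Look the address up; on a hit, report where to inject the line. */
static bool cit_lookup(uint64_t addr, unsigned *proc, unsigned *lvl)
{
    for (size_t i = 0; i < CIT_SIZE; i++)
        if (table[i].valid && table[i].addr == addr) {
            *proc = table[i].proc_id;
            *lvl  = table[i].cache_lvl;
            return true;
        }
    return false;
}

/* On a miss, search limit values from the lowest to the highest cache
 * level and pick the first level with room under its configured limit. */
static int pick_level_by_limit(const unsigned lines_in_use[MAX_LEVELS])
{
    for (int lvl = 0; lvl < MAX_LEVELS; lvl++)
        if (lines_in_use[lvl] < cll_limit[lvl])
            return lvl + 1;
    return -1; /* every level is at its line limit */
}

int main(void)
{
    unsigned proc, lvl;
    unsigned in_use[MAX_LEVELS] = { 8, 10, 0 }; /* L1 already at its limit */

    table[0] = (cit_entry){ 0x4000, 2, 1, true };
    if (cit_lookup(0x4000, &proc, &lvl))
        printf("hit: inject into processor %u, L%u\n", proc, lvl);
    printf("miss: allocate at L%d\n", pick_level_by_limit(in_use));
    return 0;
}
```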
20130031308 - DEVICE DRIVER FOR USE IN A DATA STORAGE SYSTEM - A device driver includes an aggregator that aggregates data blocks into one or more container objects suited for storage in an object store, and a logger that maintains, in at least one log file, for each data block an identification of the container object in which the data block is stored, together with an identification of the location of the data block within the container object. (01-31-2013)
20130086324 - INTELLIGENCE FOR CONTROLLING VIRTUAL STORAGE APPLIANCE STORAGE ALLOCATION - A change in workload characteristics detected at one tier of a multi-tiered cache is communicated to another tier of the multi-tiered cache. Multiple caching elements exist at different tiers, and at least one tier includes a cache element that is dynamically resizable. The communicated change in workload characteristics causes the receiving tier to adjust at least one aspect of cache performance in the multi-tiered cache. In one aspect, at least one dynamically resizable element in the multi-tiered cache is resized responsive to the change in workload characteristics. (04-04-2013)
20110202727 - Apparatus and Methods to Reduce Duplicate Line Fills in a Victim Cache - Techniques and methods are used to reduce allocations to a higher level cache of cache lines displaced from a lower level cache. The allocations of the displaced cache lines are prevented for displaced cache lines that are determined to be redundant in the next level cache, whereby castouts are reduced. To such ends, a line is selected to be displaced in a lower level cache. Information associated with the selected line is identified which indicates that the selected line is present in a higher level cache or the selected line is a write-through line. An allocation of the selected line in the higher level cache is prevented based on the identified information. Preventing an allocation of the selected line saves power that would be associated with the allocation. (08-18-2011)
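A toy C model of the castout filter in the 20110202727 entry above: skip the victim-cache allocation when the displaced line is known to be redundant. The `present_in_l2` and `write_through` flags are assumed bookkeeping for the example, not the patent's actual tag bits.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Per-line state assumed to travel with an L1 line being displaced. */
typedef struct {
    uint64_t tag;
    bool present_in_l2;  /* hint that the next level already holds the line */
    bool write_through;  /* write-through lines are already current beyond L1 */
} l1_line;

/* Returns true if the displaced line should be allocated in the victim cache. */
static bool should_castout(const l1_line *victim)
{
    /* Redundant if the next level already holds it, or the line is
     * write-through; preventing the allocation saves the castout power. */
    if (victim->present_in_l2 || victim->write_through)
        return false;
    return true;
}

int main(void)
{
    l1_line a = { 0x1000, true,  false };
    l1_line b = { 0x2000, false, false };
    printf("line a castout? %d\n", should_castout(&a)); /* 0: redundant */
    printf("line b castout? %d\n", should_castout(&b)); /* 1: allocate  */
    return 0;
}
```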
20110202726 - Apparatus and method for handling data in a cache - A data processing apparatus for forming a portion of a coherent cache system comprises at least one master device for performing data processing operations, and a cache coupled to the at least one master device and arranged to store data values for access by that at least one master device when performing the data processing operations. Cache coherency circuitry is responsive to a coherency request from another portion of the coherent cache system to cause a coherency action to be taken in respect of at least one data value stored in the cache. Responsive to an indication that the coherency action has resulted in invalidation of that at least one data value in the cache, refetch control circuitry is used to initiate a refetch of that at least one data value into the cache. Such a mechanism causes the refetch of data into the cache to be triggered by the coherency action performed in response to a coherency request from another portion of the coherent cache system, rather than relying on any actions taken by the at least one master device, thereby providing a very flexible and efficient mechanism for reducing cache latency in a coherent cache system. (08-18-2011)
20120246407 - METHOD AND SYSTEM TO IMPROVE UNALIGNED CACHE MEMORY ACCESSES - A method and system to improve unaligned cache memory accesses. In one embodiment of the invention, a processing unit has logic to facilitate access of at least two cache memory lines of a cache memory in a single read operation. By doing so, it avoids additional read operations or cycles to read the required data that is cached in more than one cache memory line. Embodiments of the invention facilitate the streaming of unaligned vector loads that does not require substantially more power than streaming aligned vector loads. For example, in one embodiment of the invention, the streaming of unaligned vector loads consumes less than two times the power requirements of streaming aligned vector loads. (09-27-2012)
20130086326 - SYSTEM AND METHOD FOR SUPPORTING A TIERED CACHE - A computer-implemented method and system can support a tiered cache, which includes a first cache and a second cache. The first cache operates to receive a request to at least one of update and query the tiered cache; and the second cache operates to perform at least one of an updating operation and a querying operation with respect to the request via at least one of a forward strategy and a listening scheme. (04-04-2013)
20130086325 - DYNAMIC CACHE SYSTEM AND METHOD OF FORMATION - Embodiments of the present invention provide a dynamic cache system comprising: a multi-level inspector design that handles multi-level data formats; a cache function design that handles multi-level data formats; a cache size controller design that is able to handle the varying cache sizes based on characteristics such as hit-rates, usage patterns, etc.; a cache behavior controller design that handles different types of files; and a heterogeneous storage controller design that is configured to handle volumes of the storage based on the types of storage (RAM Disk, flash, HDD, etc.). Advantages of the system include (among others): caching for different types of data when different types of data need to be cached, and/or cache size that can be allocated based on the cache level (which itself can be established). (04-04-2013)
20130080705 - MANAGING IN-LINE STORE THROUGHPUT REDUCTION - Various embodiments of the present invention manage a hierarchical store-through memory cache structure. A store request queue is associated with a processing core in multiple processing cores. At least one blocking condition is determined to have occurred at the store request queue. Multiple non-store requests and a set of store requests associated with a remaining set of processing cores in the multiple processing cores are dynamically blocked from accessing a memory cache in response to the blocking condition having occurred. (03-28-2013)
20130036269 - PLACEMENT OF DATA IN SHARDS ON A STORAGE DEVICE - A method, system and computer program product for placing data in shards on a storage device may include determining placement of a data set in one of a plurality of shards on the storage device. Each one of the shards may include a different at least one performance feature. Each different at least one performance feature may correspond to a different at least one predetermined characteristic associated with a particular set of data. The data set is cached in the one of the plurality of shards on the storage device that includes the at least one performance feature corresponding to the at least one predetermined characteristic associated with the data set being cached. (02-07-2013)
20120265938 - PERFORMING A PARTIAL CACHE LINE STORAGE-MODIFYING OPERATION BASED UPON A HINT - Analyzing pre-processed code includes identifying at least one storage-modifying construct specifying a storage-modifying memory access to a memory hierarchy of a data processing system and determining if more than one granule of a cache line of data containing multiple granules that is targeted by the storage-modifying construct is subsequently referenced by said pre-processed code. Post-processed code including a storage-modifying instruction corresponding to the at least one storage-modifying construct in the pre-processed code is generated and stored. Generating the post-processed code includes marking the storage-modifying instruction with a partial cache line hint indicating that said storage-modifying instruction targets less than a full cache line of data within a memory hierarchy if the analyzing indicates only one granule of the target cache line will be accessed while the cache line is held in the cache memory, and otherwise refraining from marking the storage-modifying instruction with the partial cache line hint. (10-18-2012)
20130042068 - SHADOW REGISTERS FOR LEAST RECENTLY USED DATA IN CACHE - A cache for use in a central processing unit (CPU) of a computer includes a data array; a tag array configured to hold a list of addresses corresponding to each data entry held in the data array; a least recently used (LRU) array configured to hold data indicating least recently used data entries in the data array; a line fill buffer configured to receive data from an address in main memory that is located external to the cache in the event of a cache miss; and a shadow register associated with the line fill buffer, wherein the shadow register is configured to hold LRU data indicating a current state of the LRU array. (02-14-2013)
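The arrays named in the 20130042068 entry map naturally onto C structures. A minimal sketch, assuming a 4-way, 256-set cache with 64-byte lines; the sizes and the LRU encoding are illustrative.

```c
#include <stdint.h>
#include <stdio.h>

#define WAYS 4
#define SETS 256

typedef struct {
    uint64_t tag[SETS][WAYS];      /* tag array: address per data entry  */
    uint8_t  data[SETS][WAYS][64]; /* data array: 64-byte lines          */
    uint8_t  lru[SETS];            /* LRU array: encoded recency per set */
} cache_arrays;

typedef struct {
    uint8_t  line[64];   /* data arriving from main memory on a miss */
    uint32_t set;        /* set the pending fill targets             */
    uint8_t  shadow_lru; /* shadow register: LRU state at fill time  */
    int      busy;
} line_fill_buffer;

/* On a miss, snapshot the LRU state alongside the pending fill so the
 * victim way is chosen from a consistent view when the data returns. */
static void start_fill(cache_arrays *c, line_fill_buffer *lfb, uint32_t set)
{
    lfb->set        = set;
    lfb->shadow_lru = c->lru[set];
    lfb->busy       = 1;
}

int main(void)
{
    static cache_arrays c;        /* static keeps the arrays off the stack */
    line_fill_buffer lfb = { { 0 }, 0, 0, 0 };

    c.lru[5] = 0xE4;              /* some current recency ordering */
    start_fill(&c, &lfb, 5);
    printf("shadow LRU for set %u: %#x\n", (unsigned)lfb.set, lfb.shadow_lru);
    return 0;
}
```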
20130042069 - Apparatus And A Method For Obtaining A Blur Image - The present invention provides an apparatus and a method for obtaining a blur image in computer graphics that reduce the required memory. (02-14-2013)
20130046936 - DATA PROCESSING SYSTEM OPERABLE IN SINGLE AND MULTI-THREAD MODES AND HAVING MULTIPLE CACHES AND METHOD OF OPERATION - Systems and methods are disclosed for a computer system that includes a first load/store execution unit ... (02-21-2013)
20090043966 - Adaptive Mechanisms and Methods for Supplying Volatile Data Copies in Multiprocessor Systems - In a computer system with a memory hierarchy, when a high-level cache supplies a data copy to a low-level cache, the shared copy can be either volatile or non-volatile. When the data copy is later replaced from the low-level cache, if the data copy is non-volatile, it needs to be written back to the high-level cache; otherwise it can be simply flushed from the low-level cache. The high-level cache can employ a volatile-prediction mechanism that adaptively determines whether a volatile copy or a non-volatile copy should be supplied when the high-level cache needs to send data to the low-level cache. An exemplary volatile-prediction mechanism suggests use of a non-volatile copy if the cache line has been accessed consecutively by the low-level cache. Further, the low-level cache can employ a volatile-promotion mechanism that adaptively changes a data copy from volatile to non-volatile according to some promotion policy, or changes a data copy from non-volatile to volatile according to some demotion policy. (02-12-2009)
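The volatile-prediction heuristic in the 20090043966 entry (hand out a non-volatile copy after consecutive accesses by the same low-level cache) can be sketched as follows. The per-line state and the two-access threshold are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t tag;
    int      last_requester;   /* low-level cache that accessed the line last */
    unsigned consecutive_hits; /* consecutive accesses by that requester      */
} l2_line_state;

static bool supply_nonvolatile(l2_line_state *s, int requester)
{
    if (s->last_requester == requester) {
        s->consecutive_hits++;
    } else {
        s->last_requester   = requester;
        s->consecutive_hits = 1;
    }
    /* Repeated use by one cache suggests the copy is worth writing back
     * on replacement, so supply it non-volatile; otherwise keep it
     * volatile and let the low-level cache simply flush it. */
    return s->consecutive_hits >= 2;
}

int main(void)
{
    l2_line_state s = { 0x40, -1, 0 };
    printf("first access:  non-volatile? %d\n", supply_nonvolatile(&s, 3));
    printf("second access: non-volatile? %d\n", supply_nonvolatile(&s, 3));
    return 0;
}
```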
20120191916 - OPTIMIZING TAG FORWARDING IN A TWO LEVEL CACHE SYSTEM FROM LEVEL ONE TO LEVEL TWO CONTROLLERS FOR CACHE COHERENCE PROTOCOL FOR DIRECT MEMORY ACCESS TRANSFERS - A second level memory controller uses shadow tags ... (07-26-2012)
20120191915 - EFFICIENT LEVEL TWO MEMORY BANKING TO IMPROVE PERFORMANCE FOR MULTIPLE SOURCE TRAFFIC AND ENABLE DEEPER PIPELINING OF ACCESSES BY REDUCING BANK STALLS - The level two memory of this invention supports coherency data transfers with level one cache and DMA data transfers. The width of DMA transfers is 16 bytes. The width of level one instruction cache transfers is 32 bytes. The width of level one data transfers is 64 bytes. The width of level two allocates is 128 bytes. DMA transfers are interspersed with CPU traffic and have similar requirements of efficient throughput and reduced latency. An additional challenge is that these two data streams (CPU and DMA) require access to the level two memory at the same time. This invention is a banking technique for the level two memory to facilitate efficient data transfers. (07-26-2012)
20120191914 - PERFORMANCE AND POWER IMPROVEMENT ON DMA WRITES TO LEVEL TWO COMBINED CACHE/SRAM THAT IS CACHED IN LEVEL ONE DATA CACHE AND LINE IS VALID AND DIRTY - This invention optimizes DMA writes to directly addressable level two memory that is cached in level one when the line is valid and dirty. When the level two controller detects that a line is valid and dirty in level one, the level two memory need not update its copy of the data. Level one memory will replace the level two copy with a victim writeback at a future time. Thus the level two memory need not store a copy of the write. This limits the number of DMA writes to level two directly addressable memory and thus improves performance and minimizes dynamic power. This also frees the level two memory for other masters/requestors. (07-26-2012)
20120191913 - Distributed User Controlled Multilevel Block and Global Cache Coherence with Accurate Completion Status - This invention permits user controlled cache coherence operations with the flexibility to do these operations on all levels of cache together or each level independently. In the case of an all level operation, the user does not have to monitor and sequence each phase of the operation. This invention also provides a way for users to track completion of these operations. This is critical for multi-core/multi-processor devices. Multiple cores may be accessing the end point, and the user/application needs to be able to identify when the operation from one core is complete before permitting other cores to access that data or code. (07-26-2012)
20130067169 - DYNAMIC CACHE QUEUE ALLOCATION BASED ON DESTINATION AVAILABILITY - An apparatus for controlling operation of a cache includes a first command queue, a second command queue, and an input controller configured to receive requests having a first command type and a second command type, and to assign a first request having the first command type to the first command queue and a second request having the first command type to the second command queue in the event that the first command queue has not received an indication that a first dedicated buffer is available. (03-14-2013)
20120117326 - APPARATUS AND METHOD FOR ACCESSING CACHE MEMORY - The present invention relates to an apparatus and a method for accessing a cache memory. The cache memory comprises a level-one memory and a level-two memory. The apparatus for accessing the cache memory according to the present invention comprises a register unit and a control unit. The control unit receives a first read command and a reject datum of the level-one memory and stores the reject datum of the level-one memory to the register unit. Then the control unit reads a stored datum from the level-two memory and stores it to the level-one memory according to the first read command. (05-10-2012)
20110022803 - Two Partition Accelerator and Application of Tiered Flash to Cache Hierarchy in Partition Acceleration - An approach is provided to identify a disabled processing core and an active processing core from a set of processing cores included in a processing node. Each of the processing cores is assigned a cache memory. The approach extends a memory map of the cache memory assigned to the active processing core to include the cache memory assigned to the disabled processing core. A first amount of data that is used by a first process is stored by the active processing core to the cache memory assigned to the active processing core. A second amount of data is stored by the active processing core to the cache memory assigned to the inactive processing core using the extended memory map. (01-27-2011)
20090248983 - TECHNIQUE TO SHARE INFORMATION AMONG DIFFERENT CACHE COHERENCY DOMAINS - A technique to enable information sharing among agents within different cache coherency domains. In one embodiment, a graphics device may use one or more caches used by one or more processing cores to store or read information, which may be accessed by one or more processing cores in a manner that does not affect programming and coherency rules pertaining to the graphics device. (10-01-2009)
20110208915 - Fused Store Exclusive/Memory Barrier Operation - In an embodiment, a processor may be configured to detect a store exclusive operation followed by a memory barrier operation in a speculative instruction stream being executed by the processor. The processor may fuse the store exclusive operation and the memory barrier operation, creating a fused operation. The fused operation may be transmitted and globally ordered, and the processor may complete both the store exclusive operation and the memory barrier operation in response to the fused operation. As the fused operation progresses through the processor and one or more other components (e.g. caches in the cache hierarchy) to the ordering point in the system, the fused operation may push previous memory operations to effect the memory barrier operation. In some embodiments, the latency for completing the store exclusive operation and the subsequent data memory barrier operation may be reduced if the store exclusive operation is successful at the ordering point. (08-25-2011)
20090235027 - CACHE MEMORY SYSTEM, DATA PROCESSING APPARATUS, AND STORAGE APPARATUS - A cache memory system includes a plurality of first storage hierarchical units provided individually to a plurality of processors. A second storage hierarchical unit is provided commonly to the plurality of processors. A control unit controls data transfer between the plurality of first storage hierarchical units and the second storage hierarchical unit. Each of the plurality of processors is capable of executing a no-data transfer store command as a store command that does not require data transfer from the second storage hierarchical unit to the corresponding first storage hierarchical unit, and each of the plurality of first storage hierarchical units outputs a transfer-control signal in response to occurrence of a cache miss when executing the no-data transfer store command by the corresponding processor. (09-17-2009)
20090006753 - DESIGN STRUCTURE FOR ACCESSING A CACHE WITH AN EFFECTIVE ADDRESS - A design structure embodied in a machine readable storage medium for designing, manufacturing, and/or testing a design for accessing a processor cache is provided. The design structure comprises a processor having a processor core, a level one cache, and circuitry. The circuitry is configured to execute an access instruction in the processor's core, wherein the access instruction provides an untranslated effective address of data to be accessed by the access instruction, determine whether the processor core's level one cache includes the data corresponding to the effective address of the access instruction, wherein the effective address of the access instruction is used without address translation to determine whether the processor core's level one cache includes the data corresponding to the effective address, and provide the data for the access instruction from the level one cache if the level one cache includes the data corresponding to the effective address. (01-01-2009)
20090006752 - High Capacity Memory Subsystem Architecture Employing Hierarchical Tree Configuration of Memory Modules - A high-capacity memory subsystem architecture utilizes multiple memory modules arranged in a hierarchical tree configuration, in which at least some communications from an external source traverse successive levels of the tree to reach memory modules at the lowest level. Preferably, the memory system employs buffered memory chips having dual-mode operation, one of which supports a tree configuration in which data is interleaved and the communications buses operate at reduced bus width and/or reduced bus frequency to match the level of interleaving. (01-01-2009)
20130166846 - Hierarchy-aware Replacement Policy - Some implementations disclosed herein provide techniques and arrangements for a hierarchy-aware replacement policy for a last-level cache. A detector may be used to provide the last-level cache with information about blocks in a lower-level cache. For example, the detector may receive a notification identifying a block evicted from the lower-level cache. The notification may include a category associated with the block. The detector may identify a request that caused the block to be filled into the lower-level cache. The detector may determine whether one or more statistics associated with the category satisfy a threshold. In response to determining that the one or more statistics associated with the category satisfy the threshold, the detector may send an indication to the last-level cache that the block is a candidate for eviction from the last-level cache. (06-27-2013)
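A rough C rendering of the detector logic in the 20130166846 entry: per-category statistics gate whether an evicted block is reported to the last-level cache as an eviction candidate. The categories, the counters, and the threshold rule below are all illustrative assumptions.

```c
#include <stdbool.h>
#include <stdio.h>

#define NUM_CATEGORIES 4

typedef struct {
    unsigned evictions; /* blocks of this category evicted from the lower level */
    unsigned reuses;    /* blocks of this category re-referenced after filling  */
} category_stats;

static category_stats stats[NUM_CATEGORIES];

/* Called on each eviction notification from the lower-level cache. */
static bool llc_eviction_candidate(int category, unsigned threshold)
{
    stats[category].evictions++;
    /* Many evictions with little reuse suggest dead-on-arrival blocks:
     * tell the LLC the block is a candidate for early eviction. */
    return stats[category].evictions >= threshold &&
           stats[category].reuses * 2 < stats[category].evictions;
}

int main(void)
{
    stats[1].reuses = 1;
    for (unsigned i = 1; i <= 6; i++)
        printf("eviction %u: candidate? %d\n", i, llc_eviction_candidate(1, 4));
    return 0;
}
```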
20130166845 - METHOD AND DEVICE FOR RECOVERING DESCRIPTION INFORMATION, AND METHOD AND DEVICE FOR CACHING DATA IN DATABASE - A method for recovering description information, or a method for caching data in a database, includes: judging whether a database is closed normally after the last operation; if the database is not closed normally, traversing each data block in a level-2 cache, where corresponding disk location information is saved in a header of each data block; obtaining a data block in a disk according to the disk location information; and when the obtained data block in the disk is the same as a corresponding data block in the level-2 cache, establishing description information according to location information of the data block in the disk and location information of the data block in the level-2 cache, where the description information is used to describe correspondence between the location information of data in the disk and the location information of data in the level-2 cache. (06-27-2013)
20120221793 - SYSTEMS AND METHODS FOR RECONFIGURING CACHE MEMORY - A microprocessor system is disclosed that includes a first data cache that is shared by a first group of one or more program threads in a multi-thread mode and used by one program thread in a single-thread mode. A second data cache is shared by a second group of one or more program threads in the multi-thread mode and is used as a victim cache for the first data cache in the single-thread mode. (08-30-2012)
20100268885 - SPECIFYING AN ACCESS HINT FOR PREFETCHING LIMITED USE DATA IN A CACHE HIERARCHY - A system and method for specifying an access hint for prefetching limited use data. A processing unit receives a data cache block touch (DCBT) instruction having an access hint indicating to the processing unit that a program executing on the data processing system may soon access a cache block addressed within the DCBT instruction. The access hint is contained in a code point stored in a subfield of the DCBT instruction. In response to detecting that the code point is set to a specific value, the data addressed in the DCBT instruction is prefetched into an entry in the lower level cache. The entry may then be updated as a least recently used entry of a plurality of entries in the lower level cache. In response to a new cache block being fetched to the cache, the prefetched cache block is cast out of the cache. (10-21-2010)
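The code-point idea in the 20100268885 entry, a hint packed into a subfield of a DCBT-style instruction, can be illustrated with plain bit-field extraction. The field position, width, and code-point value here are assumptions made for the example, not the actual PowerPC DCBT encoding.

```c
#include <stdint.h>
#include <stdio.h>

#define HINT_SHIFT 21
#define HINT_MASK  0x1Fu
#define HINT_PREFETCH_LIMITED_USE 0x10u /* assumed code point */

static unsigned dcbt_hint(uint32_t insn)
{
    return (insn >> HINT_SHIFT) & HINT_MASK; /* extract the hint subfield */
}

int main(void)
{
    uint32_t insn = (HINT_PREFETCH_LIMITED_USE << HINT_SHIFT) | 0x0003E2Cu;
    if (dcbt_hint(insn) == HINT_PREFETCH_LIMITED_USE)
        printf("prefetch block into lower level cache as the LRU entry\n");
    return 0;
}
```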
20100268882 - LOAD REQUEST SCHEDULING IN A CACHE HIERARCHY - A system and method for tracking core load requests and providing arbitration and ordering of requests. When a core interface unit (CIU) receives a load operation from the processor core, a new entry is allocated in a queue of the CIU. In response to allocating the new entry in the queue, the CIU detects contention between the load request and another memory access request. In response to detecting contention, the load request may be suspended until the contention is resolved. Received load requests may be stored in the queue and tracked using a least recently used (LRU) mechanism. The load request may then be processed when the load request resides in the least recently used entry in the load request queue. The CIU may also suspend issuing an instruction unless a read claim (RC) machine is available. In another embodiment, the CIU may issue stored load requests in a specific priority order. (10-21-2010)
20090006754 - DESIGN STRUCTURE FOR L2 CACHE/NEST ADDRESS TRANSLATION - A design structure embodied in a machine readable storage medium for designing, manufacturing, and/or testing a design for accessing a processor's cache memory is provided. The design structure comprises a processor having one or more level one caches, a lookaside buffer configured to include a corresponding entry for each cache line placed in each of the processor's one or more level one caches. The corresponding entry indicates a translation from the effective addresses to the real addresses for the cache line. The processor also comprises circuitry configured to access requested data in the processor's one or more level one caches using requested effective addresses of the requested data, translate the requested effective addresses to real addresses if the processor's one or more level one caches do not contain requested data corresponding to the requested effective addresses, and use the translated real addresses to access the level two cache. (01-01-2009)
20110302372 - SMT/ECO MODE BASED ON CACHE MISS RATE - A computer implemented method for managing an execution mode for a parallel processor is provided. A monitor identifies a first efficiency rate for a first contested resource of the parallel processor operating in a first operating mode. Responsive to identifying the first efficiency rate for the first contested resource, the monitor identifies whether the first efficiency rate for the contested resource of the parallel processor operating in the first operating mode exceeds a threshold. Responsive to identifying that the efficiency rate for the contested resource exceeds the threshold, an operation of the parallel processor is changed to a second operating mode. (12-08-2011)
20110276762 - COORDINATED WRITEBACK OF DIRTY CACHELINES - A data processing system includes a processor core and a cache memory hierarchy coupled to the processor core. The cache memory hierarchy includes at least one upper level cache and a lowest level cache. A memory controller is coupled to the lowest level cache and to a system memory and includes a physical write queue from which the memory controller writes data to the system memory. The memory controller initiates accesses to the lowest level cache to place into the physical write queue selected cachelines having spatial locality with data present in the physical write queue. (11-10-2011)
20100223429 - Hybrid Caching Techniques and Garbage Collection Using Hybrid Caching Techniques - Hybrid caching techniques and garbage collection using hybrid caching techniques are provided. A determination of a measure of a characteristic of a data object is performed, the characteristic being indicative of an access pattern associated with the data object. A selection of one caching structure, from a plurality of caching structures, is performed in which to store the data object based on the measure of the characteristic. Each individual caching structure in the plurality of caching structures stores data objects having a similar measure of the characteristic with regard to each of the other data objects in that individual caching structure. The data object is stored in the selected caching structure and at least one processing operation is performed on the data object stored in the selected caching structure. (09-02-2010)
20100146211 - Shader Complex with Distributed Level One Cache System and Centralized Level Two Cache - A shader pipe texture filter utilizes a level one cache system as a primary method of storage but with the ability to have the level one cache system read from and write to a level two cache system when necessary. The level one cache system communicates with the level two cache system via a wide channel memory bus. In addition, the level one cache system can be configured to support dual shader pipe texture filters while maintaining access to the level two cache system. A method utilizing a level one cache system as a primary method of storage, with the ability to have the level one cache system read from and write to a level two cache system when necessary, is also presented. In addition, level one cache systems can allocate a defined area of memory to be sharable amongst other resources. (06-10-2010)
20100268883 - Information Handling System with Immediate Scheduling of Load Operations and Fine-Grained Access to Cache Memory - An information handling system (IHS) includes a processor with a cache memory system. The processor includes a processor core with an L1 cache memory that couples to an L2 cache memory. The processor includes an arbitration mechanism that controls load and store requests to the L2 cache memory. The arbitration mechanism includes control logic that enables a load request to interrupt a store request that the L2 cache memory is currently servicing. When the L2 cache memory finishes servicing the interrupting load request, the L2 cache memory may return to servicing the interrupted store request at the point of interruption. The control logic determines the size requirement of each load operation or store operation. When the cache memory system performs a store operation or load operation, the memory system accesses the portion of a cache line it needs to perform the operation instead of accessing an entire cache line. (10-21-2010)
20100268886 - SPECIFYING AN ACCESS HINT FOR PREFETCHING PARTIAL CACHE BLOCK DATA IN A CACHE HIERARCHY - A system and method for specifying an access hint for prefetching only a subsection of cache block data, for more efficient system interconnect usage by the processor core. A processing unit receives a data cache block touch (DCBT) instruction containing an access hint and identifying a specific size portion of data to be prefetched. Both the access hint and a value corresponding to an amount of data to be prefetched are contained in separate subfields of the DCBT instruction. In response to detecting that the code point is set to a specific value, only the specific size of data identified in a sub-field of the DCBT and addressed in the DCBT instruction is prefetched into an entry in the lower level cache. (10-21-2010)
20100169576 - SYSTEM AND METHOD FOR SIFT IMPLEMENTATION AND OPTIMIZATION - A method implements a Scale Invariant Feature Transform (SIFT) algorithm in a shared memory multiprocessing system. The method comprises building differences of Gaussian (DoG) images for an input image, detecting keypoints in the DoG images, assigning orientations to the keypoints, computing keypoint descriptors, and performing matrix operations. In the method, building the DoG images for an input image and detecting keypoints in the DoG images are executed for all scales of the input image in parallel. Orientation assignment and keypoint descriptor computation are executed for all octaves of the input image in parallel. (07-01-2010)
20110197030 - Latency Reduction for Cache Coherent Bus-Based Cache - In one embodiment, a system comprises a plurality of agents coupled to an interconnect and a cache coupled to the interconnect. The plurality of agents are configured to cache data. A first agent of the plurality of agents is configured to initiate a transaction on the interconnect by transmitting a memory request, and other agents of the plurality of agents are configured to snoop the memory request from the interconnect. The other agents provide a response in a response phase of the transaction on the interconnect. The cache is configured to detect a hit for the memory request and to provide data for the transaction to the first agent prior to the response phase and independent of the response. (08-11-2011)
20080215815 - SYSTEM AND METHOD OF IMPROVING TASK SWITCHING AND PAGE TRANSLATION PERFORMANCE UTILIZING A MULTILEVEL TRANSLATION LOOKASIDE BUFFER - A system and method of improved task switching in a data processing system. First, a first-level cache memory casts out an invalidated page table entry and an associated first page directory base address to a second-level cache memory. Then, the second-level cache memory determines if a task switch has occurred. If a task switch has not occurred, the first-level cache memory sends the invalidated page table entry to a current running task directory. If a task switch has occurred, the first-level cache memory loads from the second-level cache directory a collection of page table entries related to a new task to enable improved task switching without requiring access to a page table stored in main memory to retrieve the collection of page table entries. (09-04-2008)
20100082905 - DISABLING CACHE PORTIONS DURING LOW VOLTAGE OPERATIONS - Methods and apparatus relating to disabling one or more cache portions during low voltage operations are described. In some embodiments, one or more extra bits may be used for a portion of a cache that indicate whether the portion of the cache is capable of operating at or below Vccmin levels. Other embodiments are also described and claimed. (04-01-2010)
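The extra capability bits in the 20100082905 entry reduce to simple bit masking in software terms. A sketch assuming 64 cache portions tracked in a single 64-bit mask; the granularity is an assumption for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_PORTIONS 64

static uint64_t vccmin_capable;          /* bit i: portion i works at low voltage */
static uint64_t portion_enabled = ~0ull; /* all portions enabled at nominal Vcc   */

static void enter_low_voltage_mode(void)
{
    portion_enabled &= vccmin_capable; /* disable portions that cannot keep up */
}

static bool portion_usable(int i)
{
    return (portion_enabled >> i) & 1u;
}

int main(void)
{
    vccmin_capable = 0x0000FFFFFFFFFFFFull; /* top 16 portions fail at Vccmin */
    enter_low_voltage_mode();
    printf("portion 10 usable: %d, portion 60 usable: %d\n",
           portion_usable(10), portion_usable(60));
    return 0;
}
```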
20100100683 - Victim Cache Prefetching - A processing unit for a multiprocessor data processing system includes a processor core and a cache hierarchy coupled to the processor core to provide low latency data access. The cache hierarchy includes an upper level cache coupled to the processor core and a lower level victim cache coupled to the upper level cache. In response to a prefetch request of the processor core that misses in the upper level cache, the lower level victim cache determines whether the prefetch request misses in the directory of the lower level victim cache and, if so, allocates a state machine in the lower level victim cache that services the prefetch request by issuing the prefetch request to at least one other processing unit of the multiprocessor data processing system. (04-22-2010)
20100100682 - Victim Cache Replacement - A data processing system includes a processor core having an associated upper level cache and a lower level victim cache. In response to a memory access request of the processor core that specifies a non-modifying access to a target coherency granule, a determination is made whether the memory access request hits or misses in a directory of the lower level victim cache. In response to determining that the memory access request hits in the lower level victim cache in a data-valid coherence state, the lower level victim cache provides the target coherency granule of the memory access request to the upper level cache. The lower level victim cache preserves the target coherency granule in the lower level victim cache in a shared coherence state if the memory access request is of a first type and invalidates the target coherency granule if the memory access request is of a second type. (04-22-2010)
20110271057 - CACHE ACCESS FILTERING FOR PROCESSORS WITHOUT SECONDARY MISS DETECTION - The disclosed embodiments provide a system that filters duplicate requests from an L1 ... (11-03-2011)
20120297139 - MEMORY MANAGEMENT UNIT, APPARATUSES INCLUDING THE SAME, AND METHOD OF OPERATING THE SAME - A method of operating a memory management unit includes accessing a translation lookaside buffer (TLB), translating a page number of a virtual address into a frame number of a physical address when there is a match for the page number of the virtual address in the TLB, and executing a miss process when there is no match for the page number of the virtual address in the TLB. The miss process includes accessing a page table translation (PTT) cache, checking whether access information of a k-th level page table corresponding to a k-th page number that will be accessed in the virtual address is in the PTT cache, acquiring a base address of a physical page using the access information, and determining the frame number of the physical address corresponding to the page number of the virtual address using a page offset in the physical page. (11-22-2012)
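The miss process in the 20120297139 entry can be sketched as a page-table walk that first consults a PTT cache so it can start at an already-known table level instead of the root. Assumed here, purely for illustration: a 4-level table, 9 index bits per level, 4 KiB pages, and one cached base per level.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define LEVELS 4

typedef struct { uint64_t vpn_prefix, table_base; bool valid; } ptt_entry;

static ptt_entry ptt[LEVELS];

/* Stand-in for the memory access performed at each step of a real walk. */
static uint64_t read_table(uint64_t base, unsigned index)
{
    return base + index * 8;
}

static uint64_t translate_on_miss(uint64_t vaddr, uint64_t root_base)
{
    uint64_t base  = root_base;
    int      start = 0;

    /* Check the PTT cache deepest-level-first and skip the levels whose
     * table base is already known. */
    for (int k = LEVELS - 1; k >= 0; k--)
        if (ptt[k].valid &&
            ptt[k].vpn_prefix == vaddr >> (12 + 9 * (LEVELS - k))) {
            base  = ptt[k].table_base;
            start = k;
            break;
        }

    /* Walk the remaining levels down to the leaf table. */
    for (int k = start; k < LEVELS; k++) {
        unsigned idx = (vaddr >> (12 + 9 * (LEVELS - 1 - k))) & 0x1FF;
        base = read_table(base, idx);
    }
    return (base & ~0xFFFull) | (vaddr & 0xFFF); /* frame base + page offset */
}

int main(void)
{
    printf("pa = %#llx\n",
           (unsigned long long)translate_on_miss(0x7F12345678ull, 0x80000000ull));
    return 0;
}
```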
20110173393 - CACHE MEMORY, MEMORY SYSTEM, AND CONTROL METHOD THEREFOR - A cache memory according to the present invention includes: a first port for input of a command from the processor; a second port for input of a command from a master other than the processor; a hit determining unit which, when a command is input to said first port or said second port, determines whether or not data corresponding to an address specified by the command is stored in said cache memory; and a first control unit which performs a process for maintaining coherency of the data stored in the cache memory and corresponding to the address specified by the command and data stored in the main memory, and outputs the input command to the main memory as a command output from the master, when the command is input to the second port and said hit determining unit determines that the data is stored in said cache memory. (07-14-2011)
20110173392 - EVICT ON WRITE, A MANAGEMENT STRATEGY FOR A PREFETCH UNIT AND/OR FIRST LEVEL CACHE IN A MULTIPROCESSOR SYSTEM WITH SPECULATIVE EXECUTION - In a multiprocessor system with at least two levels of cache, a speculative thread may run on a core processor in parallel with other threads. When the thread seeks to do a write to main memory, this access is to be written through the first level cache to the second level cache. After the write through, the corresponding line is deleted from the first level cache and/or prefetch unit, so that any further accesses to the same location in main memory have to be retrieved from the second level cache. The second level cache keeps track of multiple versions of data where more than one speculative thread is running in parallel, while the first level cache does not have any of the versions during speculation. A switch allows choosing between modes of operation of a speculation-blind first level cache. (07-14-2011)
20110173391 - System and Method to Access a Portion of a Level Two Memory and a Level One Memory - A system and method to access data from a portion of a level two memory or from a level one memory is disclosed. In a particular embodiment, the system includes a level one cache and a level two memory. A first portion of the level two memory is coupled to an input port and is addressable in parallel with the level one cache. (07-14-2011)
20100299481 - HIERARCHICAL READ-COMBINING LOCAL MEMORIES - The present disclosure relates to a system for hierarchical read-combining memory having a multicore processor operably coupled to a memory controller. The memory controller is configured for receiving a plurality of requests for data from one or more processing cores of the multicore processor, selectively holding a request for data from the plurality of requests for an undetermined or indefinite amount of time, and selectively combining a plurality of requests for the same data into a single read-combined data request. The present disclosure further relates to a method for hierarchical read-combining data requests of a multicore processor and a computer accessible medium having stored thereon computer executable instructions for performing a procedure for hierarchical read-combining data requests of a multicore processor. (11-25-2010)
20080313405 - Coherency maintaining device and coherency maintaining method - A second-level cache device stores part of registration information of data for a first-level cache device in a second-level cache-tag unit in association with registration information in a second-level-cache data unit, and stores the registration information of data for the first-level cache device in a first-level cache-tag copying unit. A coherency maintaining processor maintains coherency between the first-level cache device and the second-level cache device based on the information stored in the second-level cache-tag unit and the first-level cache-tag copying unit. (12-18-2008)
20100262783 - Mode-Based Castout Destination Selection - In response to a data request of a first of a plurality of processing units, the first processing unit selects a victim cache line to be castout from the lower level cache of the first processing unit and determines whether a mode is set. If not, the first processing unit issues on the interconnect fabric an LCO command identifying the victim cache line and indicating that a lower level cache is the intended destination. If the mode is set, the first processing unit issues a castout command with an alternative intended destination. In response to a coherence response to the LCO command indicating success of the LCO command, the first processing unit removes the victim cache line from its lower level cache, and the victim cache line is held elsewhere in the data processing system. The mode can be set to inhibit castouts to system memory, for example, for testing. (10-14-2010)
20080235453 - SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR EXECUTING A CACHE REPLACEMENT ALGORITHM - A system, method and computer program product for executing a cache replacement algorithm. A system includes a computer processor having an instruction processor, a cache and one or more useful indicators. The instruction processor processes instructions in a running program. The cache includes two or more cache levels including a level one (L1) cache level and one or more higher cache levels. Each cache level includes one or more cache lines and has an associated directory having one or more directory entries. A useful indicator is located within one or more of the directory entries and is associated with a particular cache line. The useful indicator is set to provide an indication that the associated cache line contains one or more instructions that are required by the running program and cleared to provide lack of such an indication. (09-25-2008)
20080307163 - METHOD FOR ACCESSING MEMORY - A method for accessing memory is provided. The memory includes many multi-level cells each having at least a storage capable of storing 2 ... (12-11-2008)
20120084511 - INEFFECTIVE PREFETCH DETERMINATION AND LATENCY OPTIMIZATION - A processor of an information handling system (IHS) initiates an L3 cache prefetch operation in response to a demand load during instruction processing. The processor selects an L3 cache prefetch at random for tracking as a target prefetched instruction. The processor initiates an L1 cache target prefetch operation and stores the resultant target prefetched instruction in the L1 cache. If a demand load arrives, the processor analyzes the target prefetched instruction for effectiveness and determines the source of the prefetch data. If a demand does not arrive, the processor tests to determine whether the particular prefetched instruction timed out in the cache and identifies the ineffectiveness of the prefetch operation. The processor samples multiple prefetch operations at random and generates a history of prefetch effectiveness and other useful prefetch information. The processor stores the prefetch effectiveness information to enable reduction or removal of ineffective prefetch operations. (04-05-2012)
20080209126 - METHOD FOR ACHIEVING VERY HIGH BANDWIDTH BETWEEN THE LEVELS OF A CACHE HIERARCHY IN 3-DIMENSIONAL STRUCTURES, AND A 3-DIMENSIONAL STRUCTURE RESULTING THEREFROM - A method of electronic computing, and more specifically, a method of design of cache hierarchies in 3-dimensional chips, and a cache hierarchy resulting therefrom, including a physical arrangement of bits in cache hierarchies implemented in 3 dimensions such that the planar wiring required in the busses connecting the levels of the hierarchy is minimized. In this way, the data paths between the levels are primarily the vias themselves, which leads to very short, hence fast and low power busses. (08-28-2008)
20120198163 - Level One Data Cache Line Lock and Enhanced Snoop Protocol During Cache Victims and Writebacks to Maintain Level One Data Cache and Level Two Cache Coherence - This invention assures cache coherence in a multi-level cache system upon eviction of a higher level cache line. A victim buffer stores data of evicted lines. On a DMA access that may be cached in the higher level cache, the lower level cache sends a snoop write. The address of this snoop write is compared with the victim buffer. On a hit in the victim buffer, the write completes in the victim buffer. When the victim data passes to the next cache level, it is written into a second victim buffer to be retired when the data is committed to cache. DMA write addresses are compared to addresses in this second victim buffer. On a match, the write takes place in the second victim buffer. On a failure to match, the controller sends a snoop write. (08-02-2012)
20090024796 - High Performance Multilevel Cache Hierarchy - A digital system is provided with a hierarchical memory system having at least a first and second level cache and a higher level memory. If a requested data item misses in both the first cache level and in the second cache level, a line of data containing the requested data is obtained from a higher level of the hierarchical memory system. The line of data is allocated to both the first cache level and to the second cache level simultaneously. (01-22-2009)
20090063772 - Methods and apparatus for controlling hierarchical cache memory - Methods and apparatus for controlling hierarchical cache memories permit controlling a first level cache memory including a plurality of cache lines and controlling a next lower level cache memory including a plurality of cache lines. An additional memory may be associated with the next lower level cache memory and include a plurality of memory lines, the number of memory lines corresponding to the number of cache lines in a way set of the first level cache memory. Alternatively, the memory lines may include L-flags for multiple cache lines of each way set of the next lower level cache memory. L-flags associated with a given index plus any index offset from the first level cache memory may be contained in a single memory line of the additional memory. (03-05-2009)
20120079203 - Transaction Info Bypass for Nodes Coupled to an Interconnect Fabric - A shared resource within a module may be accessed by a request from an external requester. An external transaction request may be received from an external requester outside the module for access to the shared resource that includes control information, not all of which is needed to access the shared resource. The external transaction request may be modified to form a modified request by removing a portion of the locally unneeded control information and storing the unneeded portion of control information as an entry in a bypass buffer. A reply received from the shared resource may be modified by appending the stored portion of control information from the entry in the bypass buffer before sending the modified reply to the external requester. (03-29-2012)
20120079202 - MULTISTREAM PREFETCH BUFFER - A prefetching system receives a memory read request having an associated address. In response to a determination that a most significant portion of the associated address is not present within slots of an array for storing the most significant portion of predicted addresses, a prefetch FIFO (First In-First Out) counter is modified to point to a next slot of the array and a new predicted address is generated in response to the received most significant portion of the associated address and is placed in the next slot of the array. The prefetch FIFO counter cycles through the slots of the array before wrapping around to a first slot of the array for storing the most significant portion of predicted addresses. (03-29-2012)
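The wrap-around slot allocation in the 20120079202 entry is easy to mirror in C. The slot count, the address split point, and the next-sequential prediction below are assumptions for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SLOTS  8
#define MSP(a) ((a) >> 12) /* most significant portion of an address */

static uint64_t slot[SLOTS];
static bool     used[SLOTS];
static unsigned fifo; /* prefetch FIFO counter */

static void on_read_request(uint64_t addr)
{
    uint64_t msp = MSP(addr);

    for (int i = 0; i < SLOTS; i++)
        if (used[i] && slot[i] == msp)
            return; /* stream already tracked: no new allocation */

    /* Not present: place a new predicted address (here, the next
     * sequential MSP) and advance the counter with wraparound. */
    slot[fifo] = msp + 1;
    used[fifo] = true;
    fifo = (fifo + 1) % SLOTS;
}

int main(void)
{
    on_read_request(0x12345000ull);
    printf("slot 0 predicts MSP %#llx\n", (unsigned long long)slot[0]);
    return 0;
}
```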
20120198165 - Mechanism to Update the Status of In-Flight Cache Coherence In a Multi-Level Cache Hierarchy - Separate buffers store snoop writes and direct memory access writes. A multiplexer selects one of these for input to a FIFO buffer. The FIFO buffer is split into multiple FIFOs including: a command FIFO; an address FIFO; and a write data FIFO. Each snoop command is compared with an allocated line set and way, and deleted on a match to avoid data corruption. Each snoop command is also compared with a victim address. If the snoop address matches the victim address, logic redirects the snoop command to a victim buffer and the snoop write is completed in the victim buffer. (08-02-2012)
20120198164 - Programmable Address-Based Write-Through Cache Control - This invention is a cache system with a memory attribute register having plural entries. Each entry stores a write-through or a write-back indication for a corresponding memory address range. On a write to cached data, the cache consults the memory attribute register for the corresponding address range. Writes to addresses in regions marked as write-through always update all levels of the memory hierarchy. Writes to addresses in regions marked as write-back update only the first cache level that can service the write. The memory attribute register is preferably a memory mapped control register writable by the central processing unit. (08-02-2012)
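The memory attribute register in the 20120198164 entry amounts to a range-to-policy lookup on every write. A minimal sketch with assumed ranges and an assumed write-back default.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t base, limit;   /* covers addresses in [base, limit)      */
    bool     write_through; /* true: update all levels on every write */
} mar_entry;

#define MAR_ENTRIES 4

static const mar_entry mar[MAR_ENTRIES] = {
    { 0x00000000, 0x10000000, true  }, /* e.g. shared buffers: write-through */
    { 0x10000000, 0x80000000, false }, /* ordinary data: write-back          */
};

static bool is_write_through(uint64_t addr)
{
    for (int i = 0; i < MAR_ENTRIES; i++)
        if (addr >= mar[i].base && addr < mar[i].limit)
            return mar[i].write_through;
    return false; /* unmapped: assume write-back */
}

int main(void)
{
    printf("0x08000000 write-through? %d\n", is_write_through(0x08000000));
    printf("0x20000000 write-through? %d\n", is_write_through(0x20000000));
    return 0;
}
```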
20120198160 - Efficient Cache Allocation by Optimizing Size and Order of Allocate Commands Based on Bytes Required by CPU - This invention is a data processing system having a multi-level cache system. The multi-level cache system includes at least a first level cache and a second level cache. Upon a cache miss in both the at least one first level cache and the second level cache, the data processing system evicts and allocates a cache line within the second level cache. The data processing system determines from the miss address whether the request falls within the low half or the high half of the allocated cache line. The data processing system first requests from external memory the half of the cache line containing the miss. Upon receipt, the data is supplied to the at least one first level cache and the CPU. The data processing system then requests from external memory the other half of the second level cache line. (08-02-2012)
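The miss-half-first ordering in the 20120198160 entry is a small address computation, sketched here for an assumed 128-byte L2 line fetched as two 64-byte halves (the widths the neighboring 20120191915 entry mentions).

```c
#include <stdint.h>
#include <stdio.h>

#define L2_LINE 128u
#define HALF    (L2_LINE / 2)

static void order_allocate(uint64_t miss_addr, uint64_t req[2])
{
    uint64_t line = miss_addr & ~(uint64_t)(L2_LINE - 1);
    int      high = (miss_addr & (L2_LINE - 1)) >= HALF; /* miss in high half? */

    req[0] = high ? line + HALF : line; /* critical half first */
    req[1] = high ? line : line + HALF; /* remaining half next */
}

int main(void)
{
    uint64_t req[2];
    order_allocate(0x10C8, req); /* offset 0x48 lies in the high half */
    printf("fetch %#llx then %#llx\n",
           (unsigned long long)req[0], (unsigned long long)req[1]);
    return 0;
}
```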
20090248984 - METHOD AND DEVICE FOR PERFORMING COPY-ON-WRITE IN A PROCESSOR - There are disclosed a method and device for performing Copy-on-Write in a processor. The processor comprises: processor cores, L1 ... (10-01-2009)
20090259813 - MULTI-PROCESSOR SYSTEM AND METHOD OF CONTROLLING THE MULTI-PROCESSOR SYSTEM - A multi-processor system has a plurality of processor cores, a plurality of level-one caches, and a level-two cache. The level-two cache has a level-two cache memory which stores data, a level-two cache tag memory which stores, line by line, a line bit indicating whether an instruction code included in data stored in the level-two cache memory is also stored in the plurality of level-one cache memories, and a level-two cache controller which refers to the line bit stored in the level-two cache tag memory and releases a line in which data including the same instruction code as that stored in the level-one cache memory is stored, among the lines in the level-two cache memory. (10-15-2009)
20100153646 - MEMORY HIERARCHY WITH NON-VOLATILE FILTER AND VICTIM CACHES - Various embodiments of the present invention are generally directed to an apparatus and method for non-volatile caching of data in a memory hierarchy of a data storage device. In accordance with some embodiments, a pipeline memory structure is provided to store data for use by a controller. The pipeline has a plurality of hierarchical cache levels each with an associated non-volatile filter cache and a non-volatile victim cache. Data retrieved from each cache level are respectively promoted to the associated non-volatile filter cache. Data replaced in each cache level are respectively demoted to the associated non-volatile victim cache. (06-17-2010)
20130219122 - MULTI-STAGE CACHE DIRECTORY AND VARIABLE CACHE-LINE SIZE FOR TIERED STORAGE ARCHITECTURES - A method in accordance with the invention includes providing first, second, and third storage tiers, wherein the first storage tier acts as a cache for the second storage tier, and the second storage tier acts as a cache for the third storage tier. The first storage tier uses a first cache line size corresponding to an extent size of the second storage tier. The second storage tier uses a second cache line size corresponding to an extent size of the third storage tier. The second cache line size is significantly larger than the first cache line size. The method further maintains, in the first storage tier, a first cache directory indicating which extents from the second storage tier are cached in the first storage tier, and a second cache directory indicating which extents from the third storage tier are cached in the second storage tier. (08-22-2013)
20100161904 - CACHE HIERARCHY WITH BOUNDS ON LEVELS ACCESSED - The present invention is directed to a system managing data in a multilevel cache memory system. Certain cache data is designated and stored only in particular levels of the multilevel cache, bypassing other levels of the multilevel cache. In a multiprocessor environment, the present invention includes cache coherency operations or messages that pertain to data stored only in certain levels of a multilevel cache. (06-24-2010)
20100262784 - Empirically Based Dynamic Control of Acceptance of Victim Cache Lateral Castouts - A second lower level cache receives an LCO command issued by a first lower level cache on an interconnect fabric. The LCO command indicates an address of a victim cache line to be castout from the first lower level cache and indicates that the second lower level cache is an intended destination of the victim cache line. The second lower level cache determines whether to accept the victim cache line from the first lower level cache based at least in part on the address of the victim cache line indicated by the LCO command. In response to determining not to accept the victim cache line, the second lower level cache provides a coherence response to the LCO command refusing the identified victim cache line. In response to determining to accept the victim cache line, the second lower level cache updates an entry corresponding to the identified victim cache line. (10-14-2010)
20120198166 - Memory Attribute Sharing Between Differing Cache Levels of Multilevel Cache - The level one memory controller maintains a local copy of the cacheability bit of each memory attribute register. The level two memory controller is the initiator of all configuration read/write requests from the CPU. Whenever a configuration write is made to a memory attribute register, the level one memory controller updates its local copy of the memory attribute register. (08-02-2012)
20100262782 - Lateral Castout Target Selection - In response to a data request of a first processing unit among a plurality of processing units, the first processing unit selects a victim cache line to be castout from the lower level cache of the first processing unit and selects the lower level cache of a second of the plurality of processing units as an intended destination of a lateral castout (LCO) command by randomized round-robin selection. The first processing unit issues on the interconnect fabric an LCO command identifying the victim cache line and the intended destination. In response to a coherence response to the LCO command indicating success of the LCO command, the first processing unit removes the victim cache line from its lower level cache, and the victim cache line is held in the lower level cache of one of the plurality of processing units other than the first processing unit. (10-14-2010)
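The randomized round-robin selection in the 20100262782 entry might look like the following; the unit count and the occasional random re-seed are illustrative choices for the example, not the patent's exact scheme.

```c
#include <stdio.h>
#include <stdlib.h>

#define UNITS 8

static unsigned rr_next; /* round-robin pointer over candidate destinations */

static unsigned pick_lco_target(unsigned self)
{
    /* Occasionally re-seed the pointer at random so castouts from many
     * sources do not converge on one fixed rotation pattern. */
    if ((rand() & 0xF) == 0)
        rr_next = (unsigned)rand() % UNITS;

    do {
        rr_next = (rr_next + 1) % UNITS;
    } while (rr_next == self); /* never target the castout source itself */
    return rr_next;
}

int main(void)
{
    srand(42);
    for (int i = 0; i < 4; i++)
        printf("LCO target: unit %u\n", pick_lco_target(0));
    return 0;
}
```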
20100191914 - REGION COHERENCE ARRAY HAVING HINT BITS FOR A CLUSTERED SHARED-MEMORY MULTIPROCESSOR SYSTEM - A system and method for a multilevel region coherence protocol for use in Region Coherence Arrays (RCAs) deployed in clustered shared-memory multiprocessor systems which optimize cache-to-cache transfers (interventions) by using region hint bits in each RCA to allow memory requests for lines of a region of the memory to be optimally sent to only a determined portion of the clustered shared-memory multiprocessor system without broadcasting the requests to all processors in the system. A sufficient number of region hint bits are used to uniquely identify each level of the system's interconnect hierarchy to optimally predict which level of the system likely includes a processor that has cached copies of lines of data from the region. (07-29-2010)
20100191913RECONFIGURATION OF EMBEDDED MEMORY HAVING A MULTI-LEVEL CACHE - A method of operating an embedded memory having (i) a local memory, (ii) a system memory, and (iii) a multi-level cache memory coupled between a processor and the system memory. According to one embodiment of the method, a two-level cache memory is configured to function as a single-level cache memory by excluding the level-two (L2) cache from the cache-transfer path between the processor and the system memory. The excluded L2-cache is then mapped as an independently addressable memory unit within the embedded memory that functions as an extension of the local memory, a separate additional local memory, or an extension of the system memory.07-29-2010
20100185816Multiple Cache Line Size - A mechanism which allows pages of flash memory to be read directly into cache. The mechanism enables different cache line sizes for different cache levels in a cache hierarchy, and optionally, multiple line size support, simultaneously or as an initialization option, in the highest level (largest/slowest) cache. Such a mechanism improves performance and reduces cost for some applications.07-22-2010
20120198162Hazard Prevention for Data Conflicts Between Level One Data Cache Line Allocates and Snoop Writes - A comparator compares the address of DMA writes in the final entry of the FIFO stack to all pending read addresses in a monitor memory. If there is no match, then the DMA access is permitted to proceed. If the DMA write is to a cache line with a pending read, the DMA write access is stalled together with any DMA accesses behind the DMA write in the FIFO stack. DMA read accesses are not compared but may stall behind a stalled DMA write access. These stalls occur if the cache read was potentially cacheable. This is possible for some monitored accesses but not all. If a DMA write is stalled, the comparator releases it to complete once there are no pending reads to the same cache line.08-02-2012
20120198161NON-BLOCKING, PIPELINED WRITE ALLOCATES WITH ALLOCATE DATA MERGING IN A MULTI-LEVEL CACHE SYSTEM - This invention handles write request cache misses. The cache controller stores write data, sends a read request to external memory for a corresponding cache line, merges the write data with data returned from the external memory and stores the merged data in the cache. The cache controller includes buffers with plural entries storing the write address, the write data, the position of the write data within a cache line, and a unique identification number. This stored data enables the cache controller to proceed to servicing other access requests while waiting for a response from the external memory.08-02-2012
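The merge step lends itself to a short sketch. A minimal C version, assuming a 64-byte line and a per-byte valid mask; the structure layout and names are illustrative, not taken from the patent.

    #include <stdint.h>

    #define LINE_BYTES 64

    /* One write-allocate buffer entry: the buffered write data, a mask
     * marking which bytes the store actually wrote, and the ID of the
     * outstanding external read for the line. */
    struct alloc_entry {
        uint8_t  data[LINE_BYTES];
        uint8_t  valid[LINE_BYTES];
        uint32_t id;
    };

    /* When the line returns from external memory, the newer write bytes
     * win; the remaining bytes come from the fetched copy. The merged
     * line is then stored in the cache. */
    static void merge_line(const struct alloc_entry *e,
                           const uint8_t fetched[LINE_BYTES],
                           uint8_t merged[LINE_BYTES])
    {
        for (int i = 0; i < LINE_BYTES; i++)
            merged[i] = e->valid[i] ? e->data[i] : fetched[i];
    }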
20100153647Cache-To-Cache Cast-In - A data processing system includes a first processing unit and a second processing unit coupled by an interconnect fabric. The first processing unit has a first processor core and associated first upper and first lower level caches, and the second processing unit has a second processor core and associated second upper and lower level caches. In response to a data request, a victim cache line is selected for castout from the first lower level cache. The first processing unit issues on the interconnect fabric a lateral castout (LCO) command that identifies the victim cache line to be castout from the first lower level cache and indicates that a lower level cache is an intended destination. In response to a coherence response indicating success of the LCO command, the victim cache line is removed from the first lower level cache and held in the second lower level cache.06-17-2010
20100250853Prefetch engine based translation prefetching - A method and system for prefetching in a computer system are provided. The method in one aspect includes using a prefetch engine to perform prefetch instructions and to translate unmapped data. Misses to address translations during the prefetch are handled and resolved. The method also includes storing the resolved translations in a respective cache translation table. A system for prefetching in one aspect includes a prefetch engine operable to receive instructions to prefetch data from the main memory. The prefetch engine is also operable to search cache address translations for prefetch data and perform address mapping translation if the prefetch data is unmapped. The prefetch engine is further operable to prefetch the data and store the address mapping in one or more cache memories if the data is unmapped.09-30-2010
20100235576Handling Castout Cache Lines In A Victim Cache - A victim cache memory includes a cache array, a cache directory of contents of the cache array, and a cache controller that controls operation of the victim cache memory. The cache controller, responsive to receiving a castout command identifying a victim cache line castout from another cache memory, causes the victim cache line to be held in the cache array. If the other cache memory is a higher level cache in the cache hierarchy of the processor core, the cache controller marks the victim cache line in the cache directory so that it is less likely to be evicted by a replacement policy of the victim cache, and otherwise, marks the victim cache line in the cache directory so that it is more likely to be evicted by the replacement policy of the victim cache.09-16-2010
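The marking scheme amounts to choosing an insertion age for the incoming line. A minimal C sketch, assuming an age-based replacement policy where age 0 is MRU; the encoding is an assumption for illustration.

    /* Castouts from the core's own higher-level cache are installed
     * near MRU (kept longer); cast-ins from elsewhere are installed
     * near LRU (evicted sooner). */
    enum castout_source { FROM_OWN_HIERARCHY, FROM_ELSEWHERE };

    static unsigned insertion_age(enum castout_source src,
                                  unsigned max_age)
    {
        return (src == FROM_OWN_HIERARCHY) ? 0 : max_age - 1;
    }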
20100235578Cached Memory System and Cache Controller for Embedded Digital Signal Processor - A cached memory system that can handle high-rate input data and ensure that an embedded DSP can meet real-time constraints is described. The cached memory system includes a cache memory located close to a processor core, an on-chip memory at the next higher memory level, and an external main memory at the topmost memory level. A cache controller handles paging of instructions and data between the cache memory and the on-chip memory for cache misses. A direct memory exchange (DME) controller handles user-controlled paging between the on-chip memory and the external memory. A user/programmer can arrange to have the instructions and data required by the processor core to be present in the on-chip memory well in advance of when they are actually needed by the processor core.09-16-2010
20100235577VICTIM CACHE LATERAL CASTOUT TARGETING - A data processing system includes a plurality of processing units coupled by an interconnect fabric. In response to a data request, a victim cache line is selected for castout from a first lower level cache of a first processing unit, and a target lower level cache of one of the plurality of processing units is selected based upon architectural proximity of the target lower level cache to a home system memory to which the address of the victim cache line is assigned. The first processing unit issues on the interconnect fabric a lateral castout (LCO) command that identifies the victim cache line to be castout from the first lower level cache and indicates that the target lower level cache is an intended destination. In response to a coherence response indicating success of the LCO command, the victim cache line is removed from the first lower level cache and held in the target lower level cache.09-16-2010
20120144118METHOD AND APPARATUS FOR SELECTIVELY PERFORMING EXPLICIT AND IMPLICIT DATA LINE READS ON AN INDIVIDUAL SUB-CACHE BASIS - A method and apparatus are described for selectively performing explicit and implicit data line reads. A controller, located in a cache, individually monitors the data resource availability for each of a plurality of sub-caches also located in the cache. The controller receives a data line request, generates an individual implicit tag request for each of the sub-caches that currently have sufficient data resources to perform an implicit data line read, and generates an individual explicit tag request for each of the sub-caches that do not currently have sufficient data resources to perform an implicit data line read. Each tag request includes an address of the requested data line and an indicator, (represented by at least one bit), of whether the tag request is an explicit or implicit tag request.06-07-2012
20080313404MULTIPROCESSOR SYSTEM AND OPERATING METHOD OF MULTIPROCESSOR SYSTEM - According to one aspect of embodiments, a multiprocessor system includes a cache memory corresponding to each of the processors, a hierarchy setting register in which the hierarchical level of each cache memory is set, and an access control unit that controls access among the cache memories. The hierarchical level of the cache memory for each processor is stored in a rewritable hierarchy setting register. Each processor handles a cache memory corresponding to another processor as a cache memory having a deeper hierarchy than its own cache memory. As a result, each processor can access all the cache memories, and therefore the efficiency of cache memory utilization can be improved and the hierarchical level can be set so that the latency becomes optimal for each application.12-18-2008
20110131377MULTI-CORE PROCESSING CACHE IMAGE MANAGEMENT - A multi-core processor chip comprises at least one shared cache having a plurality of ports and a plurality of address spaces and a plurality of processor cores. Each processor core is coupled to one of the plurality of ports such that each processor core is able to access the at least one shared cache simultaneously with another of the plurality of processor cores. Each processor core is assigned one of a unique application or a unique application task and the multi-core processor is operable to execute a partitioning operating system that temporally and spatially isolates each unique application and each unique application task such that each of the plurality of processor cores does not attempt to write to the same address space of the at least one shared cache at the same time as another of the plurality of processor cores.06-02-2011
20120198167SYNCHRONIZING ACCESS TO DATA IN SHARED MEMORY VIA UPPER LEVEL CACHE QUEUING - A processing unit includes a store-in lower level cache having reservation logic that determines presence or absence of a reservation and a processor core including a store-through upper level cache, an instruction execution unit, a load unit that, responsive to a hit in the upper level cache on a load-reserve operation generated through execution of a load-reserve instruction by the instruction execution unit, temporarily buffers a load target address of the load-reserve operation, and a flag indicating that the load-reserve operation bound to a value in the upper level cache. If a storage-modifying operation is received that conflicts with the load target address of the load-reserve operation, the processor core sets the flag to a particular state, and, responsive to execution of a store-conditional instruction, transmits an associated store-conditional operation to the lower level cache with a fail indication if the flag is set to the particular state.08-02-2012
20110078381Cache Operations and Policies For A Multi-Threaded Client - A method for managing a parallel cache hierarchy in a processing unit. The method including receiving an instruction that includes a cache operations modifier that identifies a level of the parallel cache hierarchy in which to cache data associated with the instruction; and implementing a cache replacement policy based on the cache operations modifier.03-31-2011
20100306470Methods and Apparatus for Issuing Memory Barrier Commands in a Weakly Ordered Storage System - Efficient techniques are described for enforcing order of memory accesses. A memory access request is received from a device which is not configured to generate memory barrier commands. A surrogate barrier is generated in response to the memory access request. A memory access request may be a read request. In the case of a memory write request, the surrogate barrier is generated before the write request is processed. The surrogate barrier may also be generated in response to a memory read request conditional on a preceding write request to the same address as the read request. Coherency is enforced within a hierarchical memory system as if a memory barrier command was received from the device which does not produce memory barrier commands.12-02-2010
20100306474CACHE LINE USE HISTORY BASED DONE BIT MODIFICATION TO I-CACHE REPLACEMENT SCHEME - A method of providing history based done logic for instructions includes receiving an instruction in a cache line in an L2 cache; and loading the cache line into an L1 cache with a history count that indicates the number of read references during the previous access.12-02-2010
20100306473CACHE LINE USE HISTORY BASED DONE BIT MODIFICATION TO D-CACHE REPLACEMENT SCHEME - A method of providing history based done logic includes receiving a cache line in an L2 cache; determining if the cache line has a history of access at least three times on a previous call into the L2 cache; providing the cache line directly to a processor if the history of access was less than the at least three times; and loading the cache line into an L1 cache if the history of access was the at least three times.12-02-2010
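The decision reduces to a threshold test on the per-line history. A minimal C sketch; the threshold of three comes from the abstract, everything else is an illustrative assumption.

    #include <stdbool.h>

    /* Per-line history kept by the L2: reads observed during the
     * line's previous residency. */
    struct line_history { unsigned prev_reads; };

    /* true: install the line in L1; false: bypass L1 and hand the
     * data straight to the processor. */
    static bool load_into_l1(const struct line_history *h)
    {
        return h->prev_reads >= 3;
    }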
20100306472I-CACHE LINE USE HISTORY BASED DONE BIT BASED ON SUCCESSFUL PREFETCHABLE COUNTER - A method of providing history based done logic for an I-cache includes receiving an I-cache line in an L2 cache; determining if the I-cache line is unprefetchable; aging the I-cache line without a delay if the I-cache line is prefetchable; and aging the I-cache line with a delay if the I-cache line is unprefetchable.12-02-2010
20130138887SELECTIVELY DROPPING PREFETCH REQUESTS BASED ON PREFETCH ACCURACY INFORMATION - The disclosed embodiments relate to a system that selectively drops a prefetch request at a cache. During operation, the system receives the prefetch request at the cache. Next, the system identifies a prefetch source for the prefetch request, and then uses accuracy information for the identified prefetch source to determine whether to drop the prefetch request. In some embodiments, the accuracy information includes accuracy information for different prefetch sources. In this case, determining whether to drop the prefetch request involves first identifying a prefetch source for the prefetch request, and then using accuracy information for the identified prefetch source to determine whether to drop the prefetch request.05-30-2013
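A minimal C sketch of one plausible accuracy test; the warm-up count, the 25% threshold, and the definition of a "useful" prefetch (one later hit by a demand access) are assumptions, not values from the disclosure.

    #include <stdbool.h>

    /* Accuracy bookkeeping for one prefetch source. */
    struct source_stats {
        unsigned issued;    /* prefetches issued by this source */
        unsigned useful;    /* of those, later hit by demand accesses */
    };

    static bool drop_prefetch(const struct source_stats *s)
    {
        if (s->issued < 64)
            return false;                   /* too little history to judge */
        return 4u * s->useful < s->issued;  /* accuracy below 25%: drop */
    }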
20100325358Data storage protocols to determine items stored and items overwritten in linked data stores - A storage apparatus and method for storing a plurality of items is disclosed. The storage apparatus is configured to receive a first access request and a second access request for accessing respective items in a same clock cycle. The storage apparatus comprises: two stores each for storing a subset of the plurality of items, the first access request being routed to a first store and said second access request to a second store; miss detecting circuitry for detecting a miss where a requested item is not stored in the accessed store; item retrieving circuitry for retrieving an item whose access generated a miss from a further store; updating circuitry for selecting an item to overwrite in a respective one of the two stores in dependence upon an access history of the respective store, the updating circuitry being responsive to the miss detecting circuitry detecting the miss in an access to the first store and to at least one further condition to update both of the two stores with the item retrieved from the further store by overwriting the selected items.12-23-2010
20100191912Systems and Methods for Memory Management on Print Devices - Systems and methods disclosed permit flexible optimization of printer cache memories by specifying criteria for determining cache membership for objects derived from a print data stream, wherein the objects may be associated with distinct reference counts. In some embodiments, the method may comprise the steps of: assigning an initial value to the reference count associated with an object, if the object is not present in the cache; incrementing the reference count by a first weight, if the object is already present in the cache; decrementing the reference count by a second weight, in response to an end-of-page event; and removing the object from the cache if the reference count is below a threshold.07-29-2010
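The four steps map directly onto two event handlers. A minimal C sketch; the initial value, weights, and threshold are illustrative assumptions.

    struct cached_object {
        int ref_count;
        int cached;       /* nonzero while the object is in the cache */
    };

    enum { INITIAL = 4, HIT_WEIGHT = 2, PAGE_WEIGHT = 1, THRESHOLD = 1 };

    static void object_referenced(struct cached_object *o)
    {
        if (!o->cached) {
            o->ref_count = INITIAL;       /* assign initial value */
            o->cached = 1;
        } else {
            o->ref_count += HIT_WEIGHT;   /* increment by first weight */
        }
    }

    static void end_of_page(struct cached_object *o)
    {
        o->ref_count -= PAGE_WEIGHT;      /* decrement by second weight */
        if (o->ref_count < THRESHOLD)
            o->cached = 0;                /* remove from the cache */
    }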
20110022802Controlling data accesses to hierarchical data stores to retain access order - Data storage circuitry for controlling access to data stored in a memory is disclosed. The data storage circuitry comprises: a data store for storing a subset of the data stored in the memory; access circuitry for receiving access requests and for outputting the requested data, at least some of the received access requests being ordered access requests requiring the accessed data to be output in the same order as the access requests were received; control circuitry for controlling access to the data; and retrieval circuitry for retrieving the data from the memory. In response to an access request received from the access circuitry, the control circuitry accesses the data store and, on detecting a miss (the requested data is not stored in the data store), transmits the access request to the retrieval circuitry. The retrieval circuitry retrieves the requested data from the memory and stores it in the data store; if no output inhibit signal associated with the data access request is asserted, the retrieval circuitry transmits the retrieved data to the access circuitry for output, and if one is asserted, it does not. The data storage circuitry further comprises detection circuitry for detecting an earlier ordered access request that misses in the data store and a later ordered access request that hits while the earlier request is pending. In that case the data storage circuitry halts the later ordered access request, asserts an output inhibit signal associated with any subsequent ordered access request received while the earlier request is still pending, and deasserts the output inhibit signal on detecting completion of the earlier ordered access request.01-27-2011
20110119445HEAP/STACK GUARD PAGES USING A WAKEUP UNIT - A method and system for providing a memory access check on a processor including the steps of detecting accesses to a memory device including level-1 cache using a wakeup unit. The method includes invalidating level-1 cache ranges corresponding to a guard page, and configuring a plurality of wakeup address compare (WAC) registers to allow access to selected WAC registers. The method selects one of the plurality of WAC registers, and sets up a WAC register related to the guard page. The method configures the wakeup unit to interrupt on access of the selected WAC register. The method detects access of the memory device using the wakeup unit when a guard page is violated. The method generates an interrupt to the core using the wakeup unit, and determines the source of the interrupt. The method detects the activated WAC registers assigned to the violated guard page, and initiates a response.05-19-2011
20110087841PROCESSOR AND CONTROL METHOD - A processor includes a first processing unit that has a first memory and performs processing, a second processing unit that performs processing, a second memory that holds status information specifying the status of data held in the first memory, and a control unit. Upon receiving from the second processing unit a first access request for data of a first address, when the status information for that data indicates that it is held in the first memory in an exclusive state or an owned state, the control unit outputs a request to the first processing unit to read out the data of the first address. Upon receiving a no-data-modification notification indicating that the data of the first address has not been modified by the first processing unit, the control unit allows the second processing unit to access the data of the first address held in the second memory.04-14-2011
20100070711Techniques for Cache Injection in a Processor System Using a Cache Injection Instruction - A technique for performing cache injection includes monitoring addresses on a bus in response to a cache injection instruction. Ownership of input/output data on the bus is acquired by a cache when an address on the bus (that is associated with the input/output data) corresponds to an address of a data block associated with the cache injection instruction.03-18-2010
20090216950Push for Sharing Instruction - In one embodiment, a system comprises a first processor, a main memory system, and a cache hierarchy coupled between the first processor and the main memory system. The cache hierarchy comprises at least a first cache. The first processor is configured to execute a first instruction, including forming an address responsive to one or more operands of the first instruction. The system is configured to push a first cache block that is hit by the first address in the first cache to a target location within the cache hierarchy or the main memory system, wherein the target location is unspecified in a definition of the first instruction within an instruction set architecture implemented by the first processor, and wherein the target location is implementation-dependent.08-27-2009
20090216949METHOD AND SYSTEM FOR A MULTI-LEVEL VIRTUAL/REAL CACHE SYSTEM WITH SYNONYM RESOLUTION - Method and system for a multi-level virtual/real cache system with synonym resolution. An exemplary embodiment includes a multi-level cache hierarchy, including a set of L1 caches associated with one or more processor cores and a set of L2 caches, wherein the set of L1 caches is a subset of the set of L2 caches, and wherein the L1 caches underneath a given L2 cache are associated with one or more of the processor cores.08-27-2009
20100070712Techniques for Cache Injection in a Processor System with Replacement Policy Position Modification - A technique for performing cache injection includes monitoring, at a cache, addresses on a bus. Ownership of input/output data on the bus is then acquired by the cache when an address on the bus (that is associated with the input/output data) corresponds to an address of a data block stored in the cache. A replacement policy position of the data block is then modified (to increase a probability that the data block is consumed prior to ejection from the cache).03-18-2010
20100070710Techniques for Cache Injection in a Processor System - A technique for performing cache injection includes monitoring addresses on a bus. Ownership of input/output data on the bus is acquired by a cache when an address on the bus (that is associated with the input/output data) corresponds to an address of a data block stored in the cache.03-18-2010
20100070709CACHE FILTERING METHOD AND APPARATUS - A method and apparatus used within memory and data processing that reduces the number of references allowed in processor cache by using active rows to reject references that are less frequently used from the cache. Comparators within a memory controller are used to generate a signal indicative of a row hit or miss, which signal is then applied to one or more demultiplexers to enable or disable transfer of a memory reference to processor cache locations. The cache may be a level one (L1) or level two (L2) cache, including data and/or instructions, or some combination of L1, L2, data, and instructions.03-18-2010
20110153944Secure Cache Memory Architecture - A variety of circuits, methods and devices are implemented for secure storage of sensitive data in a computing system. A first dataset that is stored in main memory is accessed and a cache memory is configured to maintain logical consistency between the main memory and the cache. In response to determining that a second dataset is a sensitive dataset, the cache memory is directed to store the second dataset in a memory location of the cache memory without maintaining logical consistency between that dataset and main memory.06-23-2011
20110153945Apparatus and Method for Controlling the Exclusivity Mode of a Level-Two Cache - A method of controlling the exclusivity mode of a level-two cache includes generating level-two cache exclusivity control information at a processor in response to an exclusivity mode indicator, and utilizing the level-two cache exclusivity control information to configure the exclusivity mode of the level-two cache.06-23-2011
20110153943Aggregate Data Processing System Having Multiple Overlapping Synthetic Computers - A first SMP computer has first and second processing units and a first system memory pool, a second SMP computer has third and fourth processing units and a second system memory pool, and a third SMP computer has at least fifth and sixth processing units and third, fourth and fifth system memory pools. The fourth system memory pool is inaccessible to the third, fourth and sixth processing units and accessible to at least the second and fifth processing units, and the fifth system memory pool is inaccessible to the first, second and sixth processing units and accessible to at least the fourth and fifth processing units. A first interconnect couples the second processing unit for load-store coherent, ordered access to the fourth system memory pool, and a second interconnect couples the fourth processing unit for load-store coherent, ordered access to the fifth system memory pool.06-23-2011
20110153942REDUCING IMPLEMENTATION COSTS OF COMMUNICATING CACHE INVALIDATION INFORMATION IN A MULTICORE PROCESSOR - A processor may include several processor cores, each including a respective higher-level cache, wherein each higher-level cache includes higher-level cache lines; and a lower-level cache including lower-level cache lines, where each of the lower-level cache lines may be configured to store data that corresponds to multiple higher-level cache lines. In response to invalidating a given lower-level cache line, the lower-level cache may be configured to convey a sequence including several invalidation packets to the processor cores via an interface, where each member of the sequence of invalidation packets corresponds to a respective higher-level cache line to be invalidated, and where the interface is narrower than an interface capable of concurrently conveying all invalidation information corresponding to the given lower-level cache line. Each invalidation packet may include invalidation information indicative of a location of the respective higher-level cache line within different ones of the processor cores.06-23-2011
20110082982CONTENT DELIVERY NETWORK CACHE GROUPING - One or more content delivery networks (CDNs) that deliver content objects for others are disclosed. Content is propagated to edge servers through hosting and/or caching. End user computers are directed to an edge server for delivery of a requested content object by a universal resource indicator (URI). When a particular edge server does not have a copy of the content object from the URI, information is passed to another server, the ancestor or parent server, to find the content object. There can be different parent servers designated for different URIs. The parent server looks for the content object and, if it is not found, will go to another server, the grandparent server, and so on up a hierarchy within the group. Eventually, the topmost server in the hierarchy goes to the origin server to find the content object. The origin server may be hosted in the CDN or at a content provider across the Internet. Once the content object is located in the hierarchical chain, the content object is passed back down the chain to the edge server for delivery. Optionally, the various servers in the chain may cache or host the content object as it is relayed.04-07-2011
20110078380MULTI-LEVEL CACHE PREFETCH - Methods and apparatus relating to multi-level cache prefetch are described. In some embodiments, a data parking logic updates a prefetch request with one or more bits based on the status of a request queue. The one or more bits may in turn cause the corresponding prefetched data to be stored in one of at least two caches. Other embodiments are also described and claimed.03-31-2011
20110072213INSTRUCTIONS FOR MANAGING A PARALLEL CACHE HIERARCHY - A method for managing a parallel cache hierarchy in a processing unit. The method includes receiving an instruction from a scheduler unit, where the instruction comprises a load instruction or a store instruction; determining that the instruction includes a cache operations modifier that identifies a policy for caching data associated with the instruction at one or more levels of the parallel cache hierarchy; and executing the instruction and caching the data associated with the instruction based on the cache operations modifier.03-24-2011
20120066455HYBRID PREFETCH METHOD AND APPARATUS - A hybrid prefetch method and apparatus is disclosed. A processor includes a hybrid prefetch unit configured to generate addresses for accessing data from a system memory. The hybrid prefetch unit includes a first prediction unit configured to generate a first memory address according to a first prefetch algorithm and a second prediction unit configured to generate a second memory address according to a second prefetch algorithm. The hybrid prefetcher further includes an arbitration unit configured to select one of the first and second memory addresses and further configured to provide the selected one of the first and second memory addresses during a prefetch operation.03-15-2012
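A minimal C sketch of one way the arbitration unit might choose between the two prediction units, assuming each exposes a candidate address and a confidence value; the confidence-based rule and all names are assumptions.

    #include <stdint.h>

    /* Output of one prediction unit (e.g. stride or stream based). */
    struct prediction {
        uint64_t addr;          /* candidate prefetch address */
        unsigned confidence;    /* recent success of this algorithm */
    };

    static uint64_t arbitrate_prefetch(const struct prediction *first,
                                       const struct prediction *second)
    {
        /* Select the memory address of the currently more confident
         * algorithm for the prefetch operation. */
        return (first->confidence >= second->confidence)
                   ? first->addr
                   : second->addr;
    }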
20130159627SYSTEM AND METHOD FOR MANAGING A CACHE USING FILE SYSTEM METADATA - Systems and methods for management of a cache are disclosed. In general, embodiments described herein store access counts in file system metadata associated with files in the cache. Encoding access counts in the file system metadata reduces file I/O operations. Preferably, the reference count is encoded in an access count timestamp in the file system metadata. The access counts can be decoded based on the difference between the access count timestamp and a base time value, with larger differences reflecting a larger access count. The cache can be aged by advancing the base time value, thereby causing the access count for a file to drop. The base time value can also be stored in file system metadata, thereby reducing file I/O operations when performing aging.06-20-2013
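The encoding makes aging an O(1) operation. A minimal C sketch, assuming one second of timestamp distance per access; the granularity and all names are assumptions.

    #include <time.h>

    enum { STEP = 1 };   /* seconds of timestamp distance per access */

    /* Record one more access: advance the stamp by STEP. */
    static time_t bump(time_t stamp, time_t base)
    {
        return (stamp < base ? base : stamp) + STEP;
    }

    /* Decode the access count from the stamp/base difference. */
    static unsigned decode_count(time_t stamp, time_t base)
    {
        return stamp > base ? (unsigned)((stamp - base) / STEP) : 0;
    }

    /* Age the whole cache: advancing the base by k*STEP subtracts k
     * from every file's decoded count without touching the files. */
    static time_t age_cache(time_t base, unsigned k)
    {
        return base + (time_t)k * STEP;
    }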
20120203969MEMORY BUS WRITE PRIORITIZATION - A data processing system includes a multi-level cache hierarchy including a lowest level cache, a processor core coupled to the multi-level cache hierarchy, and a memory controller coupled to the lowest level cache and to a memory bus of a system memory. The memory controller includes a physical read queue that buffers data read from the system memory via the memory bus and a physical write queue that buffers data to be written to the system memory via the memory bus. The memory controller grants priority to write operations over read operations on the memory bus based upon a number of dirty cachelines in the lowest level cache memory.08-09-2012
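One plausible reading of granting priority "based upon a number of dirty cachelines" is a hysteresis policy, sketched below in C; the two watermark values are assumptions for illustration.

    #include <stdbool.h>

    enum { HIGH_WATER = 1024, LOW_WATER = 256 };

    static bool writes_have_priority;   /* current memory bus policy */

    static void update_bus_policy(unsigned llc_dirty_lines)
    {
        if (llc_dirty_lines >= HIGH_WATER)
            writes_have_priority = true;    /* drain dirty lines */
        else if (llc_dirty_lines <= LOW_WATER)
            writes_have_priority = false;   /* favor demand reads */
        /* between the marks, keep the current policy (hysteresis) */
    }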
20100005242Efficient Processing of Data Requests With The Aid Of A Region Cache - A method and system for configuring a cache memory system in order to efficiently process processor requests. A group of cache elements, which includes a Region Cache, a Region Coherence Array, and a lowest level cache, is configured based on a tradeoff of latency and power consumption requirements. A selected cache configuration differs from other feasible configurations in the order in which cache elements are accessed relative to each other. The Region Cache is employed in a number of configurations to reduce the power consumption, latency, and bandwidth requirements of the Region Coherence Array. The Region Cache is accessed by processor requests before (or in parallel with) the larger Region Coherence Array, providing the region coherence state power-efficiently to requests that hit in the Region Cache.01-07-2010
20100005241DETECTION OF STREAMING DATA IN CACHE - An apparatus to detect streaming data in memory is presented. In one embodiment the apparatus uses reuse bits and S-bit status for cache lines, wherein an S-bit status indicates the data in the cache line are potentially streaming data. To enhance the efficiency of a cache, different measures can be applied to make the streaming data the next victim during replacement.01-07-2010
20120203968COORDINATED WRITEBACK OF DIRTY CACHELINES - A data processing system includes a processor core and a cache memory hierarchy coupled to the processor core. The cache memory hierarchy includes at least one upper level cache and a lowest level cache. A memory controller is coupled to the lowest level cache and to a system memory and includes a physical write queue from which the memory controller writes data to the system memory. The memory controller initiates accesses to the lowest level cache to place into the physical write queue selected cachelines having spatial locality with data present in the physical write queue.08-09-2012
20080215816APPARATUS AND METHOD FOR FILTERING UNUSED SUB-BLOCKS IN CACHE MEMORIES - A memory system and method include a cache having a filtered portion and an unfiltered portion. The unfiltered portion is divided into block sized components, and the filtered portion is divided into sub-block sized components. Blocks evicted from the unfiltered portion have selected sub-blocks thereof cached in the filtered portion for servicing requests.09-04-2008
20080320224Multiprocessor system, processor, and cache control method - A multiprocessor system includes processors each having a primary cache and a secondary cache shared by the processors. The processors each include a read unit that reads data from the primary cache, a request unit that makes a write request when the data to be read is not stored in the primary cache, a measuring unit that measures an elapsed time since the write request is made, a receiving unit that receives a read command from an external device, a comparing unit that compares specific information for specifying data, for which the read command has been received, with specific information for specifying data, for which the write request has been made, and a controller that suspends reading of the data according to the read command, when pieces of specific information are the same, and the elapsed time measured is less than a predetermined time.12-25-2008
20090055591Hierarchical cache memory system - A hierarchical cache memory system having first and second cache memories includes: a controller which outputs dirty data stored in the first cache memory to write back to a main memory; and a controller which processes the write-back to the main memory of the dirty data outputted from the first cache memory in parallel with the write-back to the main memory of dirty data stored in the second cache memory.02-26-2009
20080229020Systems and Methods of Providing A Multi-Tier Cache - The present solution provides a variety of techniques for accelerating and optimizing network traffic, such as HTTP based network traffic. The solution described herein provides techniques in the areas of proxy caching, protocol acceleration, domain name resolution acceleration as well as compression improvements. In some cases, the present solution provides various prefetching and/or prefreshening techniques to improve intermediary or proxy caching, such as HTTP proxy caching. In other cases, the present solution provides techniques for accelerating a protocol by improving the efficiency with which data is obtained from an originating server and served to clients. In still other cases, the present solution accelerates domain name resolution. As every HTTP access starts with a URL that includes a hostname that must be resolved via domain name resolution into an IP address, the present solution helps accelerate HTTP access. In some cases, the present solution improves compression techniques by prefetching non-cacheable and cacheable content to use for compressing network traffic, such as HTTP. The acceleration and optimization techniques described herein may be deployed on the client as a client agent or as part of a browser, as well as on any type and form of intermediary device, such as an appliance, proxying device or any type of interception caching and/or proxying device.09-18-2008
20110161585PROCESSING NON-OWNERSHIP LOAD REQUESTS HITTING MODIFIED LINE IN CACHE OF A DIFFERENT PROCESSOR - Methods and apparatus to efficiently process non-ownership load requests hitting a modified line (M-line) in the cache of a different processor are described. In one embodiment, a first agent changes the state of a first data and forwards it to a second, requesting agent, which stores the first data in an alternative modified state. Other embodiments are also described.06-30-2011
20080222359STORAGE SYSTEM AND DATA MANAGEMENT METHOD - The present invention comprises a CHA 09-11-2008
20100306471D-CACHE LINE USE HISTORY BASED DONE BIT BASED ON SUCCESSFUL PREFETCHABLE COUNTER - A method of providing history based done logic for a D-cache includes receiving a D-cache line in an L2 cache; determining if the D-cache line is unprefetchable; aging the D-cache line without a delay if the D-cache line is prefetchable; and aging the D-cache line with a delay if the D-cache line is unprefetchable.12-02-2010
20110047332STORAGE SYSTEM, CACHE CONTROL DEVICE, AND CACHE CONTROL METHOD - A storage system includes a storage device that stores data, a cache memory that caches the data, an information storage unit that stores data configuration information indicating a configuration of the data and state information indicating a cache state of the data in the cache memory, a candidate data selection unit, a first determination unit, and a data-to-be-written selection unit. The candidate data selection unit selects, according to the state information, candidate data from the data cached in the cache memory. The first determination unit determines, according to the data configuration information, whether data relating to the candidate data is cached in the cache memory. The data-to-be-written selection unit selects, according to the determination made by the first determination unit, data to be written into the storage device from the data cached in the cache memory.02-24-2011
20080256297Multi-Port High-Level Cache Unit and a Method For Retrieving Information From a Multi-Port High-Level Cache Unit - A device that includes multiple processors that are connected to multiple level-one cache units. The device also includes a multi-port high-level cache unit that includes a first modular interconnect, a second modular interconnect, and multiple high-level cache paths, wherein the multiple high-level cache paths comprise multiple concurrently accessible interleaved high-level cache units. Conveniently, the device also includes at least one non-cacheable path. A method for retrieving information from a cache that includes: concurrently receiving, by a first modular interconnect of a multi-port high-level cache unit, requests to retrieve information. The method is characterized by providing information from at least two paths out of the multiple high-level cache paths if at least two high-level cache hits occur, and providing information via a second modular interconnect if a high-level cache miss occurs.10-16-2008
20110055482SHARED CACHE RESERVATION - Various example embodiments are disclosed. According to an example embodiment, a shared cache may be configured to determine whether a word requested by one of the L1 caches is currently stored in the L2 shared cache, read the requested word from the main memory based on determining that the requested word is not currently stored in the L2 shared cache, determine whether at least one line in a way reserved for the requesting L1 cache is unused, store the requested word in the at least one line based on determining that the at least one line in the reserved way is unused, and store the requested word in a line of the L2 shared cache outside the reserved way based on determining that the at least one line in the reserved way is not unused.03-03-2011
20110264861METHODS AND SYSTEMS FOR UTILIZING BYTECODE IN AN ON-DEMAND SERVICE ENVIRONMENT INCLUDING PROVIDING MULTI-TENANT RUNTIME ENVIRONMENTS AND SYSTEMS - Execution of code in a multitenant runtime environment. A request to execute code corresponding to a tenant identifier (ID) is received in a multitenant environment. The multitenant database stores data for multiple client entities each identified by a tenant ID having one of one or more users associated with the tenant ID. Users of each of multiple client entities can only access data identified by a tenant ID associated with the respective client entity. The multitenant database is a hosted database provided by an entity separate from the client entities, and provides on-demand database service to the client entities. Source code corresponding to the code to be executed is retrieved from a multitenant database. The retrieved source code is compiled. The compiled code is executed in the multitenant runtime environment. The memory used by the compiled code is freed in response to completion of the execution of the compiled code.10-27-2011
20110119446CONDITIONAL LOAD AND STORE IN A SHARED CACHE - A method, system and computer program product are disclosed for implementing load-reserve and store-conditional instructions in a multi-processor computing system. The computing system includes a multitude of processor units and a shared memory cache, and each of the processor units has access to the memory cache. In one embodiment, the method comprises providing the memory cache with a series of reservation registers, and storing in these registers addresses reserved in the memory cache for the processor units as a result of issuing load-reserve requests. In this embodiment, when one of the processor units makes a request to store data in the memory cache using a store-conditional request, the reservation registers are checked to determine if an address in the memory cache is reserved for that one of the processor units. If an address in the memory cache is reserved for that one of the processors, the data are stored at this reserved address.05-19-2011
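A minimal C sketch of the reservation-register idea, assuming one register per processor unit and reservations tracked at full-address granularity; the table size and the rule that a successful store clears matching reservations are assumptions.

    #include <stdbool.h>
    #include <stdint.h>

    enum { NUM_PROCS = 16 };

    static uint64_t resv_addr[NUM_PROCS];   /* reserved addresses */
    static bool     resv_valid[NUM_PROCS];

    static void load_reserve(int proc, uint64_t addr)
    {
        resv_addr[proc]  = addr;
        resv_valid[proc] = true;
    }

    /* Store-conditional succeeds only if the issuing processor still
     * holds a reservation on the address; success then cancels every
     * reservation on that address. */
    static bool store_conditional(int proc, uint64_t addr)
    {
        if (!resv_valid[proc] || resv_addr[proc] != addr)
            return false;
        for (int p = 0; p < NUM_PROCS; p++)
            if (resv_valid[p] && resv_addr[p] == addr)
                resv_valid[p] = false;
        return true;
    }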
20110138124TRACE MODE FOR CACHE MEMORY SYSTEM - A cache, including a cache memory, is configurable to operate in a cache mode and a trace mode. When the cache is operating in the cache mode, the cache memory stores a copy of a portion of data that is stored in another memory external to the cache, and a received data access request is processed by retrieving a copy of a portion of data identified in the received data access request from the cache memory (if the cache memory stores a copy of the portion of data), or by forwarding the data access request to a data access request processing means external to the cache (if the cache memory does not store a copy of the portion of data). When the cache is operating in the trace mode, data access requests received by the cache are monitored and information relating to a received data access request is captured and stored in the cache memory.06-09-2011
20120311267EXTERNAL CACHE OPERATION BASED ON CLEAN CASTOUT MESSAGES - A processor transmits clean castout messages indicating that a cache line is not dirty and is no longer being stored by a lowest level cache of the processor. An external cache receives the clean castout messages and manages cache lines based in part on the clean castout messages.12-06-2012
20100180081Adaptive Data Prefetch System and Method - A data processing system includes a processor, a unit that includes a multi-level cache, a prefetch system and a memory. The data processing system can operate in a first mode and a second mode. The prefetch system can change behavior in response to a desired power consumption policy set by an external agent or automatically via hardware based on on-chip power/performance thresholds.07-15-2010
20110010501EFFICIENT DATA PREFETCHING IN THE PRESENCE OF LOAD HITS - A BIU prioritizes L1 requests above L2 requests. The L2 generates a first request to the BIU and detects the generation of a snoop request and L1 request to the same cache line. The L2 determines whether a bus transaction to fulfill the first request may be retried and, if so, generates a miss, and otherwise generates a hit. Alternatively, the L2 detects the L1 generated a request to the L2 for the same line and responsively requests the BIU to refrain from performing a transaction on the bus to fulfill the first request if the BIU has not yet been granted the bus. Alternatively, a prefetch cache and the L2 allow the same line to be simultaneously present. If an L1 request hits in both the L2 and in the prefetch cache, the prefetch cache invalidates its copy of the line and the L2 provides the line to the L1.01-13-2011
20100122031SPIRAL CACHE POWER MANAGEMENT, ADAPTIVE SIZING AND INTERFACE OPERATIONS - A spiral cache memory provides low access latency for frequently-accessed values by self-organizing to always move a requested value to a front-most storage tile of the spiral. If the spiral cache needs to eject a value to make space for a value moved to the front-most tile, space is made by ejecting a value from the cache to a backing store. A buffer along with flow control logic is used to prevent overflow of writes of ejected values to the generally slow backing store. The tiles in the spiral cache may be single storage locations or be organized as some form of cache memory such as direct-mapped or set-associative caches. Power consumption of the spiral cache can be reduced by dividing the cache into an active and inactive partition, which can be adjusted on a per-tile basis. Tile-generated or global power-down decisions can set the size of the partitions.05-13-2010
20100122032SELECTIVELY PERFORMING LOOKUPS FOR CACHE LINES - Embodiments of the present invention provide a system that selectively performs lookups for cache lines. During operation, the system maintains a lower-level cache and a higher-level cache in accordance with a set of rules that dictate conditions under which cache lines are held in the lower-level cache and the higher-level cache. The system next performs a lookup for cache line A in the lower-level cache. The system then discovers that the lookup for cache line A missed in the lower-level cache, but that cache line B is present in the lower-level cache. Next, in accordance with the set of rules, the system determines, without performing a lookup for cache line A in the higher-level cache, that cache line A is guaranteed not to be present and valid in the higher-level cache because cache line B is present in the lower-level cache.05-13-2010
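The shortcut can be shown as a small predicate. A C sketch under an assumed rule (lines A and B are never simultaneously valid with B below and A above, as with lines that share a victim slot); the rule itself is an illustration, not the patent's rule set.

    #include <stdbool.h>

    /* Decide whether the higher-level lookup for line A is needed,
     * given what the lower-level lookup already revealed. */
    static bool must_probe_higher_level(bool a_hits_lower,
                                        bool b_hits_lower)
    {
        if (a_hits_lower)
            return false;   /* A already found below */
        if (b_hits_lower)
            return false;   /* rule: B below implies A absent above */
        return true;        /* no rule applies; perform the lookup */
    }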
20090198897CACHE MANAGEMENT DURING ASYNCHRONOUS MEMORY MOVE OPERATIONS - A data processing system includes a mechanism for completing an asynchronous memory move (AMM) operation in which the processor receives an AMM ST instruction and processes a processor-level move of data in virtual address space and an asynchronous memory mover then completes a physical move of the data within the real address space (memory). A status/control field of the AMM ST instruction includes an indication of a requested treatment of the lower level cache(s) on completion of the AMM operation. When the status/control field indicates an update to at least one cache should be performed, the asynchronous memory mover automatically forwards a copy of the data from the data move to the lower level cache, and triggers an update of a coherency state for a cache line in which the copy of the data is placed.08-06-2009
20110264860MULTI-MODAL DATA PREFETCHER - A microprocessor includes first and second cache memories occupying distinct hierarchy levels, the second backing the first. A prefetcher monitors load operations, maintains a recent history of the load operations from a cache line, and determines whether the recent history indicates a clear direction. The prefetcher prefetches one or more cache lines into the first cache memory when the recent history indicates a clear direction and otherwise prefetches the one or more cache lines into the second cache memory. The prefetcher also determines whether the recent history indicates that the load operations are large and, other things being equal, prefetches a greater number of cache lines when they are large than when they are small. The prefetcher also determines whether the recent history indicates that the load operations are received on consecutive clock cycles and, other things being equal, prefetches a greater number of cache lines when they arrive on consecutive clock cycles than when they do not.10-27-2011
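The "clear direction" test can be sketched as a scan over a small history window, shown below in C; the window size of four and the strict monotonicity test are assumptions.

    enum { WINDOW = 4 };

    /* Offsets (within a cache line) of the most recent loads. */
    struct load_history { int offset[WINDOW]; int count; };

    /* +1 ascending, -1 descending, 0 no clear direction. On a clear
     * direction the prefetcher targets the first cache memory,
     * otherwise the larger second cache memory. */
    static int clear_direction(const struct load_history *h)
    {
        if (h->count < WINDOW)
            return 0;
        int up = 1, down = 1;
        for (int i = 1; i < WINDOW; i++) {
            if (h->offset[i] <= h->offset[i - 1]) up = 0;
            if (h->offset[i] >= h->offset[i - 1]) down = 0;
        }
        return up ? 1 : down ? -1 : 0;
    }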
20100030966Cache memory and cache memory control apparatus - Disclosed herein is a cache memory including: a tag storage section including entries each including a tag address and a pending indication portion, at least one of the entries being to be referred to by a first address portion of an access address; a data storage section; a tag control section configured to compare a second address portion of the access address with the tag address included in each of the entries referred to to detect an entry whose tag address matches the second address portion, and, when the pending indication portion included in the detected entry indicates pending, cause an access related to the access address to be suspended; and a data control section configured to select data corresponding to the detected entry from among the data storage section, when the pending indication portion included in the detected entry does not indicate pending.02-04-2010
20100030965DISOWNING CACHE ENTRIES ON AGING OUT OF THE ENTRY - Caching in which portions of data are stored in slower main memory and are transferred to faster memory positioned between one or more processors and the main memory. The cache is such that an individual cache system must communicate with other associated cache systems, or check with such cache systems, to determine whether they contain a copy of a given cached location prior to or upon modification or appropriation of data at that location. The cache further includes provisions for determining when the data stored in a particular memory location may be replaced.02-04-2010
20100023695Victim Cache Replacement - A data processing system includes a processor core having an associated upper level cache and a lower level victim cache. In response to a memory access request of the processor core, the lower level cache victim determines whether the memory access request hits or misses in the directory of the lower level victim cache, and the upper level cache determines whether a castout from the upper level cache is to be performed and selects a victim coherency granule for eviction from the upper level cache. In response to determining that a castout from the upper level cache is to be performed, the upper level cache evicts the selected victim coherency granule. In the eviction, the upper level cache reads out the victim coherency granule from the data array of the upper level cache only in response to an indication that the memory access request misses in the directory of the lower level victim cache.01-28-2010
20080263281CACHE MEMORY SYSTEM USING TEMPORAL LOCALITY INFORMATION AND A DATA STORAGE METHOD - A cache memory system using temporal locality information and a data storage method are provided. The cache memory system including: a main cache which stores data accessed by a central processing unit; an extended cache which stores the data if the data is evicted from the main cache; and a separation cache which stores the data of the extended cache when the data of the extended cache is evicted from the extended cache and temporal locality information corresponding to the data of the extended cache satisfies a predetermined condition.10-23-2008
20120042126METHOD FOR CONCURRENT FLUSH OF L1 AND L2 CACHES - The present invention provides a method and apparatus for use with a hierarchical cache system. The method may include concurrently flushing one or more first caches and a second cache of a multi-level cache. Each first cache is smaller and at a lower level in the multi-level cache than the second cache.02-16-2012
20120042127CACHE PARTITIONING - A method and apparatus for partitioning a cache includes determining an allocation of a subcache out of a plurality of subcaches within the cache for association with a compute unit out of a plurality of compute units. Data is processed by the compute unit, and the compute unit evicts a line. The evicted line is written to the subcache associated with the compute unit.02-16-2012
20120151144METHOD AND SYSTEM FOR DETERMINING A CACHE MEMORY CONFIGURATION FOR TESTING - A method and computer device for determining the cache memory configuration. The method includes allocating an amount of cache memory from a first memory level of the cache memory, and determining a read transfer time for the allocated amount of cache memory. The allocated amount of cache memory then is increased and the read transfer time for the increased allocated amount of cache memory is determined. The allocated amount of cache memory continues to be increased and the read transfer time determined for the each allocated amount until all of the cache memory in all of the cache memory levels has been allocated. The cache memory configuration is determined based on the read transfer times from the allocated portions of the cache memory. The determined cache memory configuration includes the number of cache memory levels and the respective capacities of each cache memory level.06-14-2012
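The same idea can be demonstrated from user space: time dependent (pointer-chasing) reads over growing buffers and look for jumps in nanoseconds per read, which mark the level boundaries and capacities. A self-contained C sketch; buffer sizes and iteration counts are illustrative.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    static volatile void *sink;    /* defeats dead-code elimination */

    static double ns_per_read(size_t bytes)
    {
        size_t n = bytes / sizeof(void *), i;
        void **buf = malloc(n * sizeof(void *));
        size_t *order = malloc(n * sizeof(size_t));
        for (i = 0; i < n; i++)
            order[i] = i;
        for (i = n - 1; i > 0; i--) {          /* Sattolo shuffle: */
            size_t j = (size_t)rand() % i;     /* one big cycle,   */
            size_t t = order[i];               /* which defeats    */
            order[i] = order[j];               /* stride           */
            order[j] = t;                      /* prefetchers      */
        }
        for (i = 0; i < n; i++)
            buf[order[i]] = &buf[order[(i + 1) % n]];
        const long reads = 5000000L;
        void **p = buf;
        clock_t t0 = clock();
        for (long r = 0; r < reads; r++)
            p = (void **)*p;                   /* dependent loads */
        clock_t t1 = clock();
        sink = p;
        free(order);
        free(buf);
        return (double)(t1 - t0) / CLOCKS_PER_SEC * 1e9 / reads;
    }

    int main(void)
    {
        for (size_t kb = 4; kb <= 65536; kb *= 2)
            printf("%6zu KiB: %5.1f ns/read\n", kb, ns_per_read(kb * 1024));
        return 0;
    }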
20130185512MANAGEMENT OF PARTIAL DATA SEGMENTS IN DUAL CACHE SYSTEMS - For movement of partial data segments within a computing storage environment having lower and higher levels of cache by a processor, a whole data segment containing one of the partial data segments is promoted to both the lower and higher levels of cache. Requested data of the whole data segment is split and positioned at a Most Recently Used (MRU) portion of a demotion queue of the higher level of cache. Unrequested data of the whole data segment is split and positioned at a Least Recently Used (LRU) portion of the demotion queue of the higher level of cache. The unrequested data is pinned in place until a write of the whole data segment to the lower level of cache completes.07-18-2013
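The queue placement can be sketched with a doubly linked list, below in C; the structure names and the single pinned flag are assumptions.

    #include <stddef.h>

    struct piece { struct piece *prev, *next; int pinned; };
    struct dq    { struct piece *mru, *lru; };   /* demotion queue */

    static void push_mru(struct dq *q, struct piece *p)
    {
        p->prev = NULL;
        p->next = q->mru;
        if (q->mru) q->mru->prev = p; else q->lru = p;
        q->mru = p;
    }

    static void push_lru(struct dq *q, struct piece *p)
    {
        p->next = NULL;
        p->prev = q->lru;
        if (q->lru) q->lru->next = p; else q->mru = p;
        q->lru = p;
    }

    /* On promotion of a whole segment: requested data to MRU, the
     * unrequested remainder pinned at LRU until the write of the
     * whole segment to the lower level completes. */
    static void place_segment(struct dq *q, struct piece *requested,
                              struct piece *unrequested)
    {
        push_mru(q, requested);
        unrequested->pinned = 1;
        push_lru(q, unrequested);
    }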
20120210069SHARED CACHE FOR A TIGHTLY-COUPLED MULTIPROCESSOR - Computing apparatus (08-16-2012
20120210068SYSTEMS AND METHODS FOR A MULTI-LEVEL CACHE - A multi-level cache comprises a plurality of cache levels, each configured to cache I/O request data pertaining to I/O requests of a different respective type and/or granularity. A cache device manager may allocate cache storage space to each of the cache levels. Each cache level maintains respective cache metadata that associates I/O request data with a respective cache address. The cache levels monitor I/O requests within a storage stack, apply selection criteria to identify cacheable I/O requests, and service cacheable I/O requests using the cache storage device.08-16-2012
20110167224CACHE MEMORY, MEMORY SYSTEM, DATA COPYING METHOD, AND DATA REWRITING METHOD - A cache memory according to an aspect of the present invention includes entries, each of which includes a tag address, line data, and a dirty flag. The cache memory includes: a command execution unit which, when a first command is instructed by a processor, rewrites the tag address included in at least one entry specified by the processor among the entries to a tag address corresponding to an address specified by the processor, and sets the dirty flag corresponding to the entry; and a write-back unit which writes, back to a main memory, the line data included in the entry in which the dirty flag is set.07-07-2011
20120017049METHOD AND APPARATUS FOR IMPLEMENTING CACHE COHERENCY OF A PROCESSOR - An advanced processor comprises a plurality of multithreaded processor cores each having a data cache and instruction cache. A data switch interconnect is coupled to each of the processor cores and configured to pass information among the processor cores. A messaging network is coupled to each of the processor cores and a plurality of communication ports. In one aspect of an embodiment of the invention, the data switch interconnect is coupled to each of the processor cores by its respective data cache, and the messaging network is coupled to each of the processor cores by its respective message station. Advantages of the invention include the ability to provide high bandwidth communications between computer systems and memory in an efficient and cost-effective manner.01-19-2012
20120059995Apparatus and Methods to Reduce Castouts in a Multi-Level Cache Hierarchy - Techniques and methods are used to reduce allocations to a higher level cache of cache lines displaced from a lower level cache. The allocations of the displaced cache lines are prevented for displaced cache lines that are determined to be redundant in the next level cache, whereby castouts are reduced. To such ends, a line is selected to be displaced in a lower level cache. Information associated with the selected line is identified which indicates that the selected line is present in a higher level cache. An allocation of the selected line in the higher level cache is prevented based on the identified information. Preventing an allocation of the selected line saves power that would be associated with the allocation.03-08-2012
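A toy version of the allocation filter this abstract outlines, assuming the lower level cache tracks a presence/inclusion bit and a write-through attribute per line; the Line type and field names are illustrative, not the patent's structures.

```python
class Line:
    """Hypothetical victim-line descriptor with the two tracked attributes."""
    def __init__(self, tag, write_through=False, present_in_l2=False):
        self.tag = tag
        self.write_through = write_through
        self.present_in_l2 = present_in_l2  # e.g. an inclusion/presence bit

def should_allocate_in_l2(victim: Line) -> bool:
    # The castout is redundant if L2 already holds the line, and unnecessary
    # for a write-through line whose data has already reached memory.
    return not (victim.present_in_l2 or victim.write_through)
```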
20120159074METHOD, APPARATUS, AND SYSTEM FOR ENERGY EFFICIENCY AND ENERGY CONSERVATION INCLUDING DYNAMIC CACHE SIZING AND CACHE OPERATING VOLTAGE MANAGEMENT FOR OPTIMAL POWER PERFORMANCE - Embodiments of the invention relate to increased energy efficiency and conservation by reducing and increasing an amount of cache available for use by a processor, and an amount of power supplied to the cache and to the processor, based on the amount of cache actually being used by the processor to process data. For example, a power control unit (PCU) may monitor a last level cache (LLC) to identify the size or amount of the cache being used by a processor to process data and to determine heuristics based on that amount. Based on the monitored amount of cache being used and the heuristics, the PCU causes a corresponding decrease or increase in the amount of the cache available for use by the processor, and a corresponding decrease or increase in the amount of power supplied to the cache and to the processor.06-21-2012
20120159073METHOD AND APPARATUS FOR ACHIEVING NON-INCLUSIVE CACHE PERFORMANCE WITH INCLUSIVE CACHES - An apparatus and method for improving cache performance in a computer system having a multi-level cache hierarchy. For example, one embodiment of a method comprises: selecting a first line in a cache at level N for potential eviction; querying a cache at level M in the hierarchy to determine whether the first cache line is resident in the cache at level M, wherein M<N.06-21-2012
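A sketch of the query-before-evict step, assuming presence queries to the inner (level M < N) caches are available; the classes and the fallback-to-LRU choice are illustrative, not the patent's method.

```python
class InnerCache:
    """Toy stand-in for a level-M cache that answers presence queries."""
    def __init__(self, tags=()):
        self.tags = set(tags)
    def contains(self, tag):
        return tag in self.tags

def choose_victim(candidates, inner_caches):
    """candidates: level-N tags ordered least- to most-recently used."""
    for tag in candidates:
        if not any(c.contains(tag) for c in inner_caches):
            return tag        # evicting this line back-invalidates nothing
    return candidates[0]      # all candidates live in inner caches: plain LRU
```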
20120159075CACHE LINE USE HISTORY BASED DONE BIT MODIFICATION TO D-CACHE REPLACEMENT SCHEME - A method of providing history based done logic includes receiving a cache line in an L2 cache; determining if the cache line has a history of access at least three times on a previous call into the L2 cache; providing the cache line directly to a processor if the history of access was less than the at least three times; and loading the cache line into an L1 cache if the history of access was the at least three times.06-21-2012
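The three-access test reduces to a small decision function. A minimal sketch, assuming a per-tag history table survives between L2 residencies; the table and names are assumptions, not the patent's done-bit wiring.

```python
access_history = {}  # tag -> access count seen during the previous L2 residency

def on_l2_hit(tag, l1_tags):
    """Decide whether an L2 hit also allocates the line into L1."""
    if access_history.get(tag, 0) >= 3:
        l1_tags.add(tag)                     # reuse proven: allocate in L1
        return "loaded into L1"
    return "provided directly to the core"   # low reuse: avoid L1 pollution
```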
20120072668SLOT/SUB-SLOT PREFETCH ARCHITECTURE FOR MULTIPLE MEMORY REQUESTORS - A prefetch unit generates a prefetch address in response to an address associated with a memory read request received from the first or second cache. The prefetch unit includes a prefetch buffer that is arranged to store the prefetch address in an address buffer of a selected slot of the prefetch buffer, where each slot of the prefetch buffer includes a buffer for storing a prefetch address and two sub-slots. Each sub-slot includes a data buffer for storing data that is prefetched using the prefetch address stored in the slot, and one of the two sub-slots of the slot is selected in response to a portion of the generated prefetch address. Subsequent hits on the prefetcher result in returning prefetched data to the requestor in response to a subsequent memory read request received after the initial received memory read request.03-22-2012
20120072667VARIABLE LINE SIZE PREFETCHER FOR MULTIPLE MEMORY REQUESTORS - A prefetch unit generates prefetch addresses in response to an initial received memory read request, an address associated with the initial received memory read request, a line length of the requestor of the initial received memory read request, and a request type width of the initial received memory read request. Prefetch operations are generated using the generated prefetch addresses, wherein each generated prefetch address is stored in a prefetch buffer slot that is selected by a prefetch FIFO (First In First Out) prefetch counter. Subsequent hits on the prefetcher result in returning prefetched data to the requestor in response to a subsequent memory read request received after the initial received memory read request.03-22-2012
20120110266DISABLING CACHE PORTIONS DURING LOW VOLTAGE OPERATIONS - Methods and apparatus relating to disabling one or more cache portions during low voltage operations are described. In some embodiments, one or more extra bits may be used for a portion of a cache that indicate whether the portion of the cache is capable at operating at or below Vccmin levels. Other embodiments are also described and claimed.05-03-2012
20100122033MEMORY SYSTEM INCLUDING A SPIRAL CACHE - An integrated memory system with a spiral cache responds to requests for values at a first external interface coupled to a particular storage location in the cache in a time period determined by the proximity of the requested values to the particular storage location. The cache supports multiple outstanding in-flight requests directed to the same address using an issue table that tracks multiple outstanding requests and control logic that applies the multiple requests to the same address in the order received by the cache memory. The cache also includes a backing store request table that tracks push-back write operations issued from the cache memory when the cache memory is full and a new value is provided from the external interface; the control logic prevents multiple copies of the same value from being loaded into the cache and prevents a copy from being loaded before a pending push-back has completed.05-13-2010
20110107031Extended Cache Capacity - A method, programmed medium and system are provided for enabling a core's cache capacity to be increased by using the caches of the disabled or non-enabled cores on the same chip. Caches of disabled or non-enabled cores on a chip are made accessible to store cachelines for those chip cores that have been enabled, thereby extending cache capacity of enabled cores.05-05-2011
20110107032CACHE RECONFIGURATION BASED ON RUN-TIME PERFORMANCE DATA OR SOFTWARE HINT - A method for reconfiguring a cache memory is provided. The method in one aspect may include analyzing one or more characteristics of an execution entity accessing a cache memory and reconfiguring the cache based on the one or more characteristics analyzed. Examples of analyzed characteristic may include but are not limited to data structure used by the execution entity, expected reference pattern of the execution entity, type of an execution entity, heat and power consumption of an execution entity, etc. Examples of cache attributes that may be reconfigured may include but are not limited to associativity of the cache memory, amount of the cache memory available to store data, coherence granularity of the cache memory, line size of the cache memory, etc.05-05-2011
20100095065FIELD DEVICE COMMUNICATIONS - The present invention is a system and method for caching data in mobile devices to improve field assessment capabilities in a relief management system. The system analyzes captured geolocation information associated with one or more mobile field devices to identify a set of data to be cached by a mobile field device. The system then communicates the set of data to be cached to the mobile field device. The data cached is predicated upon the predicted likely location of field assessment operations and the data includes server-side image tiles.04-15-2010
20120124291Secondary Cache Memory With A Counter For Determining Whether to Replace Cached Data - A selective cache includes a set configured to receive data evicted from a number of primary sets of a primary cache. The selective cache also includes a counter associated with the set. The counter is configured to indicate a frequency of access to data within the set. A decision whether to replace data in the set with data from one of the primary sets is based on a value of the counter.05-17-2012
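The counter-gated replacement decision might look like the following sketch; the threshold and the decrement-on-rejection aging rule are assumptions added for illustration, not the patent's design.

```python
class SelectiveSet:
    """One selective-cache set guarded by an access-frequency counter."""
    def __init__(self, threshold=4):
        self.data = None
        self.counter = 0
        self.threshold = threshold

    def access(self, tag):
        if self.data == tag:
            self.counter += 1   # resident data keeps proving its usefulness
            return True
        return False

    def offer_evicted(self, tag):
        """A primary-cache victim arrives; replace only cold resident data."""
        if self.data is None or self.counter < self.threshold:
            self.data, self.counter = tag, 0
            return True
        self.counter = max(0, self.counter - 1)  # assumed aging rule
        return False
```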
20130173860NEAR NEIGHBOR DATA CACHE SHARING - Parallel computing environments, where threads executing in neighboring processors may access the same set of data, may be designed and configured to share one or more levels of cache memory. Before a processor forwards a request for data to a higher level of cache memory following a cache miss, the processor may determine whether a neighboring processor has the data stored in a local cache memory. If so, the processor may forward the request to the neighboring processor to retrieve the data. Because access to the cache memories for the two processors is shared, the effective size of the memory is increased. This may advantageously decrease cache misses for each level of shared cache memory without increasing the individual size of the caches on the processor chip.07-04-2013
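The neighbor-probe miss path can be sketched in a few lines; representing each cache as a set of tags is a simplification, and the probe ordering follows the abstract rather than any hardware interface.

```python
def load(tag, local_cache, neighbor_cache, next_level):
    """Miss path that probes a neighbor before the next cache level."""
    if tag in local_cache:
        return "local hit"
    if tag in neighbor_cache:        # neighbor probe avoids the slower trip
        return "served by neighbor"  # to the shared next level of cache
    if tag in next_level:
        local_cache.add(tag)
        return "filled from next level"
    return "miss to memory"
```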
20120166729SUBCACHE AFFINITY - A method and apparatus for controlling affinity of subcaches is disclosed. When a core compute unit evicts a line of victim data, a prioritized search for space allocation on available subcaches is executed, in order of proximity between the subcache and the compute unit. The victim data may be injected into an adjacent subcache if space is available. Otherwise, a line may be evicted from the adjacent subcache to make room for the victim data or the victim data may be sent to the next closest subcache. To retrieve data, a core compute unit sends a Tag Lookup Request message directly to the nearest subcache as well as to a cache controller, which controls routing of messages to all of the subcaches. A Tag Lookup Response message is sent back to the cache controller to indicate if the requested data is located in the nearest sub-cache.06-28-2012
20120317360Cache Streaming System - A system, having a stream cache and a storage. The stream cache includes a stream cache controller adapted to control or mediate input data transmitted through the stream cache; and a stream cache memory. The stream cache memory is adapted to both store at least first portions of the input data, as determined by the stream cache controller, and to further output the stored first portions of the input data to a processor. The storage is adapted to receive and store second portions of the input data, as determined by the stream cache controller, and to further transmit the stored second portions of the input data for output to the processor.12-13-2012
20120215983DATA CACHING METHOD - Data caching for use in a computer system including a lower cache memory and a higher cache memory. The higher cache memory receives a fetch request. It is then determined by the higher cache memory the state of the entry to be replaced next. If the state of the entry to be replaced next indicates that the entry is exclusively owned or modified, the state of the entry to be replaced next is changed such that a following cache access is processed at a higher speed compared to an access processed if the state would stay unchanged.08-23-2012
20120137075System and Method for a Cache in a Multi-Core Processor - The invention relates to a multi-core processor system, in particular a single-package multi-core processor system, comprising at least two processor cores, preferably at least four processor cores, each having a local LEVEL-1 cache; a tree communication structure combining the multiple LEVEL-1 caches, the tree having at least one node, preferably at least three nodes for a four-core multi-core processor; and TAG information associated with data managed within the tree, usable in the treatment of the data.05-31-2012
20120137074METHOD AND APPARATUS FOR STREAM BUFFER MANAGEMENT INSTRUCTIONS - A method and system to perform stream buffer management instructions in a processor. The stream buffer management instructions facilitate the creation and usage of a dedicated memory space or stream buffer of the processor in one embodiment of the invention. The dedicated memory space is a contiguous memory space and has a sequential or linear addressing scheme in one embodiment of the invention. The processor has logic to execute a stream buffer management instruction to copy data from a source memory address to a destination memory address that is specified with a desired level of memory hierarchy.05-31-2012
20110185125RESOURCE SHARING TO REDUCE IMPLEMENTATION COSTS IN A MULTICORE PROCESSOR - A processor may include several processor cores, each including a respective higher-level cache; a lower-level cache including several tag units each including several controllers, where each controller corresponds to a respective cache bank configured to store data, and where the controllers are concurrently operable to access their respective cache banks; and an interconnect network configured to convey data between the cores and the lower-level cache. The controllers may share access to an interconnect egress port coupled to the interconnect network, and may generate multiple concurrent requests to convey data via the shared port, where each of the requests is destined for a corresponding core, and where a datapath width of the port is less than a combined width of the multiple requests. The given tag unit may arbitrate among the controllers for access to the shared port, such that the requests are transmitted to corresponding cores serially rather than concurrently.07-28-2011
20120221794Computer Cache System With Stratified Replacement - Methods for selecting a line to evict from a data storage system are provided. A computer system implementing a method for selecting a line to evict from a data storage system is also provided. The methods include selecting an uncached class line for eviction prior to selecting a cached class line for eviction.08-30-2012
20120226867Binary tree based multilevel cache system for multicore processors - A binary tree based multi-level cache system for multi-core processors is described, together with its two possible implementations, the LogN and LogN+1 models, which maintain a true pyramid.09-06-2012
20120226866DYNAMIC MIGRATION OF VIRTUAL MACHINES BASED ON WORKLOAD CACHE DEMAND PROFILING - A computer-implemented method comprises obtaining a cache hit ratio for each of a plurality of virtual machines, and identifying, from among the plurality of virtual machines, a first virtual machine having a cache hit ratio that is less than a threshold ratio. The identified first virtual machine is then migrated from the first physical server having a first cache size to a second physical server having a second cache size that is greater than the first cache size. Optionally, a virtual machine having a cache hit ratio that is less than a threshold ratio is identified on a class-specific basis, such as for L1 cache, L2 cache and L3 cache.09-06-2012
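As a sketch of the migration rule, assuming a single aggregate hit ratio per VM and a 0.8 threshold (the abstract also allows per-level L1/L2/L3 thresholds); all names and numbers are illustrative.

```python
def plan_migrations(vms, servers, threshold=0.8):
    """vms: [(name, hit_ratio, host)]; servers: {name: cache_bytes}."""
    plans = []
    for name, ratio, host in vms:
        if ratio >= threshold:
            continue                 # hit ratio acceptable: leave in place
        bigger = [s for s in servers if servers[s] > servers[host]]
        if bigger:
            target = max(bigger, key=servers.get)  # largest-cache server
            plans.append((name, host, target))
    return plans

# vm1 at a 0.62 hit ratio moves from the 16 MiB host to the 32 MiB host.
print(plan_migrations([("vm1", 0.62, "A"), ("vm2", 0.91, "A")],
                      {"A": 16 << 20, "B": 32 << 20}))
```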
20100299482METHOD AND APPARATUS FOR DETERMINING CACHE STORAGE LOCATIONS BASED ON LATENCY REQUIREMENTS - A method for determining whether to store binary information in a fast way or a slow way of a cache is disclosed. The method includes receiving a block of binary information to be stored in a cache memory having a plurality of ways. The plurality of ways includes a first subset of ways and a second subset of ways, wherein a cache access by a first execution core from one of the first subset of ways has a lower latency time than a cache access from one of the second subset of ways. The method further includes determining, based on a predetermined access latency and one or more parameters associated with the block of binary information, whether to store the block of binary information into one of the first set of ways or one of the second set of ways.11-25-2010
20100274971Multi-Core Processor Cache Coherence For Reduced Off-Chip Traffic - Technologies are generally described herein for maintaining cache coherency within a multi-core processor. A first cache entry to be evicted from a first cache may be identified. The first cache entry may include a block of data and a first tag indicating an owned state. An owner eviction message for the first cache entry may be broadcasted from the first cache. A second cache entry in a second cache may be identified. The second cache entry may include the block of data and a second tag indicating a shared state. The broadcasted owner eviction message may be detected with the second cache. An ownership acceptance message for the second cache entry may be broadcasted from the second cache. The broadcasted ownership acceptance message may be detected with the first cache. The second tag in the second cache entry may be transformed from the shared state to the owned state.10-28-2010
20120179872GLOBAL INSTRUCTIONS FOR SPIRAL CACHE MANAGEMENT - A method of operation of a pipelined cache memory supports global operations within the cache. The cache may be a spiral cache, with a move-to-front M2F network for moving values from a backing store to a front-most tile coupled to a processor or lower-order level of a memory hierarchy and a spiral push-back network for pushing out modified values to the backing-store. The cache controller manages application of global commands by propagating individual commands to the tiles. The global commands may provide zeroing, flushing and reconciling of the given tiles. Commands for interrupting and resuming interrupted global commands may be implemented, to reduce halting or slowing of processing while other global operations are in process. A line detector within each tile supports reconcile and flush operations, and a line patcher in the controller provides for initializing address ranges with no processor intervention.07-12-2012
20120254541METHODS AND APPARATUS FOR UPDATING DATA IN PASSIVE VARIABLE RESISTIVE MEMORY - Methods and apparatus for updating data in passive variable resistive memory (PVRM) are provided. In one example, a method for updating data stored in PVRM is disclosed. The method includes updating a memory block of a plurality of memory blocks in a cache hierarchy without invalidating the memory block. The updated memory block may be copied from the cache hierarchy to a write through buffer. Additionally, the method includes writing the updated memory block to the PVRM, thereby updating the data in the PVRM.10-04-2012
20120254540METHOD AND SYSTEM FOR OPTIMIZING PREFETCHING OF CACHE MEMORY LINES - A method and system to optimize prefetching of cache memory lines in a processing unit. The processing unit has logic to determine whether a vector memory operand is cached in two or more adjacent cache memory lines. In one embodiment of the invention, the determination of whether the vector memory operand is cached in two or more adjacent cache memory lines is based on the size and the starting address of the vector memory operand. In one embodiment of the invention, the pre-fetching of the two or more adjacent cache memory lines that cache the vector memory operand is performed using a single instruction that uses one issue slot and one data cache memory execution slot. By doing so, it avoids additional software prefetching instructions or operations to read a single vector memory operand when the vector memory operand is cached in more than one cache memory line.10-04-2012
20100268884Updating Partial Cache Lines in a Data Processing System - A processing unit for a data processing system includes a processor core having one or more execution units for processing instructions and a register file for storing data accessed in processing of the instructions. The processing unit also includes a multi-level cache hierarchy coupled to and supporting the processor core. The multi-level cache hierarchy includes at least one upper level of cache memory having a lower access latency and at least one lower level of cache memory having a higher access latency. The lower level of cache memory, responsive to receipt of a memory access request that hits only a partial cache line in the lower level cache memory, sources the partial cache line to the at least one upper level cache memory to service the memory access request. The at least one upper level cache memory services the memory access request without caching the partial cache line.10-21-2010
20120260041SIMULTANEOUS EVICTION AND CLEANING OPERATIONS IN A CACHE - Embodiments provide a method comprising receiving, at a cache associated with a central processing unit that is disposed on an integrated circuit, a request to perform a cache operation on the cache; in response to receiving and processing the request, determining that first data cached in a first cache line of the cache is to be written to a memory that is coupled to the integrated circuit; identifying a second cache line in the cache, the second cache line being complementary to the first cache line; transmitting a single memory instruction from the cache to the memory to write to the memory (i) the first data from the first cache line and (ii) second data from the second cache line; and invalidating the first data in the first cache line, without invalidating the second data in the second cache line.10-11-2012
20120233407CACHE PHASE DETECTOR AND PROCESSOR CORE - A cache phase detector included in a processor core according to example embodiments includes a counting unit and a signal generating unit. The counting unit generates a critical section miscount by counting a request from the processor core resulting in a tag miss and a valid cache line based on a tag miss signal and a cache line valid signal. The signal generating unit compares the critical section miscount from the counting unit with a reference value, and generates a cache phase change signal if the critical section miscount is greater than the reference value.09-13-2012
20120239883RESOURCE SHARING TO REDUCE IMPLEMENTATION COSTS IN A MULTICORE PROCESSOR - A processor may include several processor cores, each including a respective higher-level cache; a lower-level cache including several tag units each including several controllers, where each controller corresponds to a respective cache bank configured to store data, and where the controllers are concurrently operable to access their respective cache banks; and an interconnect network configured to convey data between the cores and the lower-level cache. The controllers in a given tag unit may share access to a resource that may include one or more of an interconnect egress port coupled to the interconnect network, an interconnect ingress port coupled to the interconnect network, a test controller, or a data storage structure.09-20-2012
20120272004EFFICIENT DATA PREFETCHING IN THE PRESENCE OF LOAD HITS - A memory subsystem in a microprocessor includes a first-level cache, a second-level cache, and a prefetch cache configured to speculatively prefetch cache lines from a memory external to the microprocessor. The second-level cache and the prefetch cache are configured to allow the same cache line to be simultaneously present in both. If a request by the first-level cache for a cache line hits in both the second-level cache and in the prefetch cache, the prefetch cache invalidates its copy of the cache line and the second-level cache provides the cache line to the first-level cache.10-25-2012
20120272003EFFICIENT DATA PREFETCHING IN THE PRESENCE OF LOAD HITS - A microprocessor configured to access an external memory includes a first-level cache, a second-level cache, and a bus interface unit (BIU) configured to interface the first-level and second-level caches to a bus used to access the external memory. The BIU is configured to prioritize requests from the first-level cache above requests from the second-level cache. The second-level cache is configured to generate a first request to the BIU to fetch a cache line from the external memory. The second-level cache is also configured to detect that the first-level cache has subsequently generated a second request to the second-level cache for the same cache line. The second-level cache is also configured to request the BIU to refrain from performing a transaction on the bus to fulfill the first request if the BIU has not yet been granted ownership of the bus to fulfill the first request.10-25-2012
20120089782METHOD FOR MANAGING AND TUNING DATA MOVEMENT BETWEEN CACHES IN A MULTI-LEVEL STORAGE CONTROLLER CACHE - A method for managing data movement in a multi-level cache system having a primary cache and a secondary cache. The method includes determining whether an unallocated space of the primary cache has reached a minimum threshold; selecting at least one outgoing data block from the primary cache when the primary cache reached the minimum threshold; initiating a de-stage process for de-staging the outgoing data block from the primary cache; and terminating the de-stage process when the unallocated space of the primary cache has reached an upper threshold. The de-stage process further includes determining whether a cache hit has occurred in the secondary cache before; storing the outgoing data block in the secondary cache when the cache hit has occurred in the secondary cache before; generating and storing metadata regarding the outgoing data block; and deleting the outgoing data block from the primary cache.04-12-2012
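The two thresholds describe a classic watermark loop. A minimal sketch, assuming 10%/25% free-space watermarks and dictionaries for the caches; the outgoing-block selection policy and metadata handling are stubbed, and all names are illustrative.

```python
def destage(primary, secondary, hit_history, capacity, low=0.10, high=0.25):
    """primary/secondary: dict tag->block; capacity: primary slot count."""
    def free_frac():
        return 1 - len(primary) / capacity

    if free_frac() > low:
        return                          # minimum threshold not yet reached
    while free_frac() < high and primary:
        tag, block = primary.popitem()  # stand-in for the selection policy
        if hit_history.get(tag, 0):     # block hit in the secondary before
            secondary[tag] = block      # keep it; metadata is also recorded
        # otherwise the block is simply deleted from the primary cache
```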
20110276763MEMORY BUS WRITE PRIORITIZATION - A data processing system includes a multi-level cache hierarchy including a lowest level cache, a processor core coupled to the multi-level cache hierarchy, and a memory controller coupled to the lowest level cache and to a memory bus of a system memory. The memory controller includes a physical read queue that buffers data read from the system memory via the memory bus and a physical write queue that buffers data to be written to the system memory via the memory bus. The memory controller grants priority to write operations over read operations on the memory bus based upon a number of dirty cachelines in the lowest level cache memory.11-10-2011
20110320721DYNAMIC TRAILING EDGE LATENCY ABSORPTION FOR FETCH DATA FORWARDED FROM A SHARED DATA/CONTROL INTERFACE - A computer-implemented method for managing data transfer in a multi-level memory hierarchy that includes receiving a fetch request for allocation of data in a higher level memory, determining whether a data bus between the higher level memory and a lower level memory is available, bypassing an intervening memory between the higher level memory and the lower level memory when it is determined that the data bus is available, and transferring the requested data directly from the higher level memory to the lower level memory.12-29-2011
20110320720Cache Line Replacement In A Symmetric Multiprocessing Computer - Cache line replacement in a symmetric multiprocessing computer, the computer having a plurality of processors, a main memory that is shared among the processors, a plurality of cache levels including at least one high level of private caches and a low level shared cache, and a cache controller that controls the shared cache, including receiving in the cache controller a memory instruction that requires replacement of a cache line in the low level shared cache; and selecting for replacement by the cache controller a least recently used cache line in the low level shared cache that has no copy stored in any higher level cache.12-29-2011
20120102269USING SPECULATIVE CACHE REQUESTS TO REDUCE CACHE MISS DELAYS - The disclosed embodiments provide a system that uses speculative cache requests to reduce cache miss delays for a cache in a multi-level memory hierarchy. During operation, the system receives a memory reference which is directed to a cache line in the cache. Next, while determining whether the cache line is available in the cache, the system determines whether the memory reference is likely to miss in the cache, and if so, simultaneously sends a speculative request for the cache line to a lower level of the multi-level memory hierarchy.04-26-2012
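The overlap between the cache lookup and the speculative lower-level request can be mimicked with a thread pool. This sketch assumes a toy predictor (a set of recently missing tags); a real design would track confidence and bandwidth budgets, and all names are invented for illustration.

```python
import concurrent.futures as cf

recent_misses = set()  # toy miss predictor: tags that missed recently

def access(tag, cache, lower_level_fetch):
    """Overlap the lookup with a speculative request on a predicted miss."""
    with cf.ThreadPoolExecutor(max_workers=1) as pool:
        speculative = None
        if tag in recent_misses:                  # predicted to miss
            speculative = pool.submit(lower_level_fetch, tag)
        if tag in cache:                          # lookup resolved as a hit
            if speculative:
                speculative.cancel()              # speculation wasted, not wrong
            return "hit"
        recent_misses.add(tag)
        data = speculative.result() if speculative else lower_level_fetch(tag)
        cache.add(tag)
        return data
```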
20130013863Hybrid Caching Techniques and Garbage Collection Using Hybrid Caching Techniques - Hybrid caching techniques and garbage collection using hybrid caching techniques are provided. A determination of a measure of a characteristic of a data object is performed, the characteristic being indicative of an access pattern associated with the data object. A selection of one caching structure, from a plurality of caching structures, is performed in which to store the data object based on the measure of the characteristic. Each individual caching structure in the plurality of caching structures stores data objects has a similar measure of the characteristic with regard to each of the other data objects in that individual caching structure. The data object is stored in the selected caching structure and at least one processing operation is performed on the data object stored in the selected caching structure.01-10-2013
20100131712PSEUDO CACHE MEMORY IN A MULTI-CORE PROCESSOR (MCP) - Specifically, under the present invention, a cache memory unit can be designated as a pseudo cache memory unit for another cache memory unit within a common hierarchical level. For example, in case of a cache miss at cache memory unit “X” on cache level L2 of a hierarchy, a request is sent to a cache memory unit on cache level L3 (external), as well as to one or more other cache memory units on cache level L2. The L2 level cache memory units return search results as a hit or a miss. They typically do not search L3 nor write back with the L3 result (e.g., if the result is a miss). To this extent, only the immediate origin of the request is written back with L3 results if all L2s miss. As such, the other L2 level cache memory units serve the original L2 cache memory unit as pseudo caches.05-27-2010
20110161591INCREASED NAND FLASH MEMORY READ THROUGHPUT - A method of reading sequential pages of flash memory from alternating memory blocks comprises loading data from a first page into a first primary data cache and a second page into a second primary data cache simultaneously, the first and second pages loaded from different blocks of flash memory. Data from the first primary data cache is stored in a first secondary data cache, and data from the second primary data cache is stored in a second secondary data cache. Data is sequentially provided from the first and second secondary data caches by a multiplexer coupled to the first and second data caches.06-30-2011
20110161590SYNCHRONIZING ACCESS TO DATA IN SHARED MEMORY VIA UPPER LEVEL CACHE QUEUING - A processing unit includes a store-in lower level cache having reservation logic that determines presence or absence of a reservation and a processor core including a store-through upper level cache, an instruction execution unit, a load unit that, responsive to a hit in the upper level cache on a load-reserve operation generated through execution of a load-reserve instruction by the instruction execution unit, temporarily buffers a load target address of the load-reserve operation, and a flag indicating that the load-reserve operation bound to a value in the upper level cache. If a storage-modifying operation is received that conflicts with the load target address of the load-reserve operation, the processor core sets the flag to a particular state, and, responsive to execution of a store-conditional instruction, transmits an associated store-conditional operation to the lower level cache with a fail indication if the flag is set to the particular state.06-30-2011
20110161589SELECTIVE CACHE-TO-CACHE LATERAL CASTOUTS - A data processing system includes first and second processing units and a system memory. The first processing unit has first upper and first lower level caches, and the second processing unit has second upper and lower level caches. In response to a data request, a victim cache line to be castout from the first lower level cache is selected, and the first lower level cache selects between performing a lateral castout (LCO) of the victim cache line to the second lower level cache and a castout of the victim cache line to the system memory based upon a confidence indicator associated with the victim cache line. In response to selecting an LCO, the first processing unit issues an LCO command on the interconnect fabric and removes the victim cache line from the first lower level cache, and the second lower level cache holds the victim cache line.06-30-2011
20110161588FORMATION OF AN EXCLUSIVE OWNERSHIP COHERENCE STATE IN A LOWER LEVEL CACHE - In response to a memory access request of a processor core that targets a target cache line, the lower level cache of a vertical cache hierarchy associated with the processor core supplies a copy of the target cache line to an upper level cache in the vertical cache hierarchy and retains a copy in a shared coherence state. The upper level cache holds the copy of the target cache line in a private shared ownership coherence state indicating that each cached copy of the target memory block is cached within the vertical cache hierarchy associated with the processor core. In response to the upper level cache signaling replacement of the copy of the target cache line in the private shared ownership coherence state, the lower level cache updates its copy of the target cache line to the exclusive ownership coherence state without coherency messaging with other vertical cache hierarchies.06-30-2011
20110161587PROACTIVE PREFETCH THROTTLING - According to a method of data processing, a memory controller receives a plurality of data prefetch requests from multiple processor cores in the data processing system, where the plurality of prefetch load requests include a data prefetch request issued by a particular processor core among the multiple processor cores. In response to receipt of the data prefetch request, the memory controller provides a coherency response indicating an excess number of data prefetch requests. In response to the coherency response, the particular processor core reduces a rate of issuance of data prefetch requests.06-30-2011
20110161586Shared Memories for Energy Efficient Multi-Core Processors - Technologies are described herein related to multi-core processors that are adapted to share processor resources. An example multi-core processor can include a plurality of processor cores. The multi-core processor further can include a shared register file selectively coupled to two or more of the plurality of processor cores, where the shared register file is adapted to serve as a shared resource among the selected processor cores.06-30-2011
20080222358METHOD AND SYSTEM FOR PROVIDING AN IMPROVED STORE-IN CACHE - A system and method of providing a cache system having a store-in policy and affording the advantages of store-in cache operation, while simultaneously providing protection against soft errors in locally modified data, which would normally preclude the use of a store-in cache when reliability is paramount. The improved store-in cache mechanism includes a store-in L1 cache; at least one higher-level storage hierarchy; an ancillary store-only cache (ASOC) that holds the most recently stored-to lines of the store-in L1 cache; and a cache controller that controls storing of data to the ASOC and recovering of data from it such that the data from the ASOC is used only if parity errors are encountered in the store-in L1 cache.09-11-2008
20130179639TECHNIQUE FOR PRESERVING CACHED INFORMATION DURING A LOW POWER MODE - A technique to retain cached information during a low power mode, according to at least one embodiment. In one embodiment, information stored in a processor's local cache is saved to a shared cache before the processor is placed into a low power mode, such that other processors may access information from the shared cache instead of causing the low power mode processor to return from the low power mode to service an access to its local cache.07-11-2013
20130173861NEAR NEIGHBOR DATA CACHE SHARING - Parallel computing environments, where threads executing in neighboring processors may access the same set of data, may be designed and configured to share one or more levels of cache memory. Before a processor forwards a request for data to a higher level of cache memory following a cache miss, the processor may determine whether a neighboring processor has the data stored in a local cache memory. If so, the processor may forward the request to the neighboring processor to retrieve the data. Because access to the cache memories for the two processors is shared, the effective size of the memory is increased. This may advantageously decrease cache misses for each level of shared cache memory without increasing the individual size of the caches on the processor chip.07-04-2013
20130145095METHOD AND SYSTEM FOR INTEGRATING THE FUNCTIONS OF A CACHE SYSTEM WITH A STORAGE TIERING SYSTEM - A tiered data storage system having a cache employs a tiering management subsystem to analyze data access patterns over time, and a cache management subsystem to monitor individual input/output operations and replicate data in the cache. The tiering management subsystem determines a distribution of data between tiers and determines what data should be cached while the cache management subsystem moves data into the cache. The tiered data storage system may analyze individual input/output operations to determine if data should be consolidated from multiple regions in one or more data storage tiers into a single region.06-06-2013
20130097383METHODS FOR PROVIDING A RESPONSE AND SYSTEMS THEREOF - A method, computer readable medium, and system for generating a response include determining from which of a plurality of levels of cache to retrieve a response. The determination is based on a number of matches between current user session data associated with a current request and stored user session data rewritten into each of one or more metadata variables for the response when a current request for the response matches at least one prior stored request for the response. The response from the determined level of the plurality of levels of cache is provided.04-18-2013
20130124800APPARATUS AND METHOD FOR REDUCING PROCESSOR LATENCY - There is provided a data processing system comprising a central processing unit, a processor cache memory operably coupled to the central processing unit and an external connection operably coupled to the central processing unit and processor cache memory in which a portion of the data processing system is arranged to load data directly from the external connection into the processor cache memory and modify a source address of said directly loaded data. There is also provided a method of improving latency in a data processing system having a central processing unit operably coupled to a processor cache memory and an external connection operably coupled to the central processing unit and processor cache memory, comprising loading data directly from the external connection into the processor cache memory and modifying a source address for said data to become indicative of a location other than from the external connection.05-16-2013
20130132675DATA PROCESSING APPARATUS HAVING A CACHE CONFIGURED TO PERFORM TAG LOOKUP AND DATA ACCESS IN PARALLEL, AND A METHOD OF OPERATING THE DATA PROCESSING APPARATUS - A data processing apparatus has a cache with a data array and a tag array. The tag array stores address tag portions associated with the data values in the data array. The cache performs a tag lookup, comparing a tag portion of a received address with a set of tag entries in the tag array. The data array includes a partial tag store storing a partial tag value in association with each data entry. In parallel with the tag lookup, a partial tag value of the received address is compared with partial tag values stored in association with a set of data entries in said data array. A data value is read out if a match condition occurs. Exclusivity circuitry ensures that at most one partial tag value of said partial tag values stored in association with said set of data entries can generate said match condition.05-23-2013
20130132674METHOD AND SYSTEM FOR DISTRIBUTING TIERED CACHE PROCESSING ACROSS MULTIPLE PROCESSORS - A data storage system having at least one cache and at least two processors balances the load of data access operations by directing certain processes in each data access operation to one of the processors. Each processor may be optimized for its specific processes. One processor may be dedicated to receiving and servicing data access requests; another processor may be dedicated to background tasks and cache management.05-23-2013
20110219190CACHE WITH RELOAD CAPABILITY AFTER POWER RESTORATION - A method and apparatus for repopulating a cache are disclosed. At least a portion of the contents of the cache are stored in a location separate from the cache. Power is removed from the cache and is restored some time later. After power has been restored to the cache, it is repopulated with the portion of the contents of the cache that were stored separately from the cache.09-08-2011
20130151780Weighted History Allocation Predictor Algorithm in a Hybrid Cache - A mechanism is provided for weighted history allocation prediction. For each member in a plurality of members in a lower level cache, an associated reference counter is initialized to an initial value based on an operation type that caused data to be allocated to a member location of the member. For each access to the member in the lower level cache, the associated reference counter is incremented. Responsive to a new allocation of data to the lower level cache and responsive to the new allocation of data requiring the victimization of another member in the lower level cache, a member of the lower level cache is identified that has a lowest reference count value in its associated reference counter. The member with the lowest reference count value in its associated reference counter is then evicted.06-13-2013
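A sketch of the predictor's bookkeeping; the per-operation initial weights are invented for illustration (the abstract only says the initial value depends on the operation type that caused the allocation).

```python
INITIAL_WEIGHT = {"demand_load": 3, "write": 2, "prefetch": 1}  # assumed values

class HybridCacheSet:
    """One set whose members carry reference counters guiding eviction."""
    def __init__(self, ways):
        self.ways = ways
        self.members = {}  # tag -> reference counter

    def access(self, tag):
        if tag in self.members:
            self.members[tag] += 1  # each access strengthens the member
            return True
        return False

    def allocate(self, tag, op_type):
        if len(self.members) >= self.ways:
            victim = min(self.members, key=self.members.get)
            del self.members[victim]  # evict the lowest-count member
        self.members[tag] = INITIAL_WEIGHT.get(op_type, 1)
```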
20130151777Dynamic Inclusive Policy in a Hybrid Cache Hierarchy Using Hit Rate - A mechanism is provided for dynamic cache allocation using a cache hit rate. A first cache hit rate is monitored in a first subset utilizing a first allocation policy of N sets of a lower level cache. A second cache hit rate is also monitored in a second subset utilizing a second allocation policy different from the first allocation policy of the N sets of the lower level cache. A periodic comparison of the first cache hit rate to the second cache hit rate is made to identify a third allocation policy for a third subset of the N-sets of the lower level cache. The third allocation policy for the third subset is then periodically adjusted to at least one of the first allocation policy or the second allocation policy based on the comparison of the first cache hit rate to the second cache hit rate.06-13-2013
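This reads like set-dueling: two sampled subsets run competing allocation policies and the periodic winner steers the remaining follower sets. A toy scorekeeper, with the sampling interval and policy names assumed:

```python
class PolicyDuel:
    """Track hit counts of two sampled policies; followers copy the winner."""
    def __init__(self, interval=100_000):
        self.hits = {"inclusive": 0, "non_inclusive": 0}
        self.accesses = 0
        self.interval = interval
        self.follower_policy = "inclusive"

    def record(self, sampled_policy, hit):
        self.hits[sampled_policy] += int(hit)
        self.accesses += 1
        if self.accesses % self.interval == 0:
            # The periodic comparison adjusts the follower sets' policy.
            self.follower_policy = max(self.hits, key=self.hits.get)
            self.hits = dict.fromkeys(self.hits, 0)
```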
20130151778Dynamic Inclusive Policy in a Hybrid Cache Hierarchy Using Bandwidth - A mechanism is provided for dynamic cache allocation using bandwidth. A bandwidth between a higher level cache and a lower level cache is monitored. Responsive to bandwidth usage between the higher level cache and the lower level cache being below a predetermined low bandwidth threshold, the higher level cache and the lower level cache are set to operate in accordance with a first allocation policy. Responsive to bandwidth usage between the higher level cache and the lower level cache being above a predetermined high bandwidth threshold, the higher level cache and the lower level cache are set to operate in accordance with a second allocation policy.06-13-2013
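The two-threshold rule implies hysteresis between the watermarks; a compact sketch, with the bytes-per-second thresholds and policy labels invented for illustration:

```python
def select_policy(bw_bytes_per_s, current, low=1e9, high=8e9):
    """Pick the allocation policy from measured cache-to-cache bandwidth."""
    if bw_bytes_per_s < low:
        return "first_policy"    # light traffic: e.g. inclusive allocation
    if bw_bytes_per_s > high:
        return "second_policy"   # heavy traffic: e.g. bypassing allocation
    return current               # between thresholds: avoid policy thrashing
```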
20130151779Weighted History Allocation Predictor Algorithm in a Hybrid Cache - A mechanism is provided for weighted history allocation prediction. For each member in a plurality of members in a lower level cache, an associated reference counter is initialized to an initial value based on an operation type that caused data to be allocated to a member location of the member. For each access to the member in the lower level cache, the associated reference counter is incremented. Responsive to a new allocation of data to the lower level cache and responsive to the new allocation of data requiring the victimization of another member in the lower level cache, a member of the lower level cache is identified that has a lowest reference count value in its associated reference counter. The member with the lowest reference count value in its associated reference counter is then evicted.06-13-2013
20100318741MULTIPROCESSOR COMPUTER CACHE COHERENCE PROTOCOL - A multiprocessor computer system comprises a processing node having a plurality of processors and a local memory shared among processors in the node. An L12-16-2010
20110289276CACHE MEMORY APPARATUS - A cache memory apparatus includes an L1 cache memory, an L2 cache memory coupled to the L1 cache memory, an arithmetic logic unit (ALU) within the L2 cache memory, the combined ALU and L2 cache memory being configured to perform therewithin at least one of: an arithmetic operation, a logical bit mask operation; the cache memory apparatus being further configured to interact with at least one processor such that atomic memory operations bypass the L1 cache memory and go directly to the L2 cache memory.11-24-2011
20120030429METHOD FOR COORDINATING UPDATES TO DATABASE AND IN-MEMORY CACHE - A computer method and system of caching. In a multi-threaded application, different threads execute respective transactions accessing a data store (e.g., a database) from a single server. The method and system represent the status of data store transactions using respective parameters (e.g., Future objects).02-02-2012
20120030428INFORMATION PROCESSING DEVICE, MEMORY MANAGEMENT DEVICE AND MEMORY MANAGEMENT METHOD - According to one embodiment, an information processing device includes a first determination section and a setting section. The first determination section determines inconsistency between first data and second data. The first data is stored in a nonvolatile semiconductor memory. The second data is corresponding to the first data and stored in a semiconductor memory. The setting section sets execution timing of write back based on access frequency information associated with the second data.02-02-2012
20130205088MULTI-STAGE CACHE DIRECTORY AND VARIABLE CACHE-LINE SIZE FOR TIERED STORAGE ARCHITECTURES - A method in accordance with the invention includes providing first, second, and third storage tiers, wherein the first storage tier acts as a cache for the second storage tier, and the second storage tier acts as a cache for the third storage tier. The first storage tier uses a first cache line size corresponding to an extent size of the second storage tier. The second storage tier uses a second cache line size corresponding to an extent size of the third storage tier. The second cache line size is significantly larger than the first cache line size. The method further maintains, in the first storage tier, a first cache directory indicating which extents from the second storage tier are cached in the first storage tier, and a second cache directory indicating which extents from the third storage tier are cached in the second storage tier.08-08-2013
20130205090MULTI-CORE PROCESSOR HAVING HIERARCHICAL COMMUNICATION ARCHITECTURE - Disclosed is a multi-core processor having a hierarchical communication architecture. The multi-core processor is configured to include clusters in which cores are grouped; a lowest level memory shared among the cores included in the clusters; a middle level memory shared among the clusters; and a highest level memory shared by all the clusters. In accordance with an exemplary embodiment of the present invention, it is possible to improve application performance by reducing the communication overhead between respective cores and supporting data and functional parallelization.08-08-2013
20130205089Cache Device and Methods Thereof - A cache device, coupled to a processing device, a plurality of system components and an external memory control module, capable of exchanging all types of traffic streams from the processing device and the plurality of system components to the external memory control module. The cache device includes a plurality of cache units, comprising a plurality of cache lines and corresponding to a plurality of cache sets; a data accessing unit, coupled to the processing device, the plurality of system components, the plurality of cache units and the external memory control module, capable of exchanging data of the processing device, the plurality of cache units and an external memory device coupled to the external memory control module according to at least one request signal from the processing device and the plurality of system components.08-08-2013

Patent applications in class Hierarchical caches