Patent application number | Description | Published (MM-DD-YYYY) |
20100287561 | DEVICE FOR AND METHOD OF WEIGHTED-REGION CYCLE ACCOUNTING FOR MULTI-THREADED PROCESSOR CORES - An aspect of the present invention improves the accuracy of measuring processor utilization of multi-threaded cores by providing a calibration facility that derives utilization in the context of the overall dynamic operating state of the core by assigning weights to idle threads and assigning weights to run threads, depending on the status of the core. Previous chip designs have established that, in a simultaneous multithreading (SMT) core, not all idle cycles in a hardware thread can be equally converted into useful work. Competition for core resources reduces the conversion efficiency of one thread's idle cycles when any other thread is running on the same core. | 11-11-2010 |
20130151777 | Dynamic Inclusive Policy in a Hybrid Cache Hierarchy Using Hit Rate - A mechanism is provided for dynamic cache allocation using a cache hit rate. A first cache hit rate is monitored in a first subset utilizing a first allocation policy of N sets of a lower level cache. A second cache hit rate is also monitored in a second subset utilizing a second allocation policy different from the first allocation policy of the N sets of the lower level cache. A periodic comparison of the first cache hit rate to the second cache hit rate is made to identify a third allocation policy for a third subset of the N sets of the lower level cache. The third allocation policy for the third subset is then periodically adjusted to at least one of the first allocation policy or the second allocation policy based on the comparison of the first cache hit rate to the second cache hit rate. | 06-13-2013 |
20130151778 | Dynamic Inclusive Policy in a Hybrid Cache Hierarchy Using Bandwidth - A mechanism is provided for dynamic cache allocation using bandwidth. A bandwidth between a higher level cache and a lower level cache is monitored. Responsive to bandwidth usage between the higher level cache and the lower level cache being below a predetermined low bandwidth threshold, the higher level cache and the lower level cache are set to operate in accordance with a first allocation policy. Responsive to bandwidth usage between the higher level cache and the lower level cache being above a predetermined high bandwidth threshold, the higher level cache and the lower level cache are set to operate in accordance with a second allocation policy. | 06-13-2013 |
20130151779 | Weighted History Allocation Predictor Algorithm in a Hybrid Cache - A mechanism is provided for weighted history allocation prediction. For each member in a plurality of members in a lower level cache, an associated reference counter is initialized to an initial value based on an operation type that caused data to be allocated to a member location of the member. For each access to the member in the lower level cache, the associated reference counter is incremented. Responsive to a new allocation of data to the lower level cache and responsive to the new allocation of data requiring the victimization of another member in the lower level cache, a member of the lower level cache is identified that has a lowest reference count value in its associated reference counter. The member with the lowest reference count value in its associated reference counter is then evicted. | 06-13-2013 |
20130151780 | Weighted History Allocation Predictor Algorithm in a Hybrid Cache - A mechanism is provided for weighted history allocation prediction. For each member in a plurality of members in a lower level cache, an associated reference counter is initialized to an initial value based on an operation type that caused data to be allocated to a member location of the member. For each access to the member in the lower level cache, the associated reference counter is incremented. Responsive to a new allocation of data to the lower level cache and responsive to the new allocation of data requiring the victimization of another member in the lower level cache, a member of the lower level cache is identified that has a lowest reference count value in its associated reference counter. The member with the lowest reference count value in its associated reference counter is then evicted. | 06-13-2013 |
20140089902 | MONITORING SOFTWARE PERFORMANCE - Systems, methods and computer program products may provide monitoring of software performance on a computer. A method of monitoring software performance in a computer may include marking at least one of a load request and a store request, the marked request including an effective instruction address and an effective data address, recording the effective instruction and data addresses in a processor core and sending the marked request to a memory subsystem. The method may also include receiving a fabric response for the marked request, recording the fabric response in the core and tying the effective instruction and data addresses and the fabric response together in a sample. | 03-27-2014 |
20140372704 | LEAST-RECENTLY-USED (LRU) TO FIRST-DIRTY-MEMBER DISTANCE-MAINTAINING CACHE CLEANING SCHEDULER - A technique for scheduling cache cleaning operations maintains a clean distance between a set of least-recently-used (LRU) clean lines and the LRU dirty (modified) line for each congruence class in the cache. The technique is generally employed at a victim cache at the highest-order level of the cache memory hierarchy, so that write-backs to system memory are scheduled to avoid having to generate a write-back in response to a cache miss in the next lower-order level of the cache memory hierarchy. The clean distance can be determined by counting all of the LRU clean lines in each congruence class that have a reference count that is less than or equal to the reference count of the LRU dirty line. | 12-18-2014 |
20140372705 | LEAST-RECENTLY-USED (LRU) TO FIRST-DIRTY-MEMBER DISTANCE-MAINTAINING CACHE CLEANING SCHEDULER - A technique for scheduling cache cleaning operations maintains a clean distance between a set of least-recently-used (LRU) clean lines and the LRU dirty (modified) line for each congruence class in the cache. The technique is generally employed at a victim cache at the highest-order level of the cache memory hierarchy, so that write-backs to system memory are scheduled to avoid having to generate a write-back in response to a cache miss in the next lower-order level of the cache memory hierarchy. The clean distance can be determined by counting all of the LRU clean lines in each congruence class that have a reference count that is less than or equal to the reference count of the LRU dirty line. | 12-18-2014 |
20140379953 | CONTINUOUS IN-MEMORY ACCUMULATION OF HARDWARE PERFORMANCE COUNTER DATA - In-memory accumulation of hardware counts in a computer system is carried out by continuously sending count values from full-speed hardware counter units to a memory controller. A sending unit periodically samples performance data from the hardware counter units, and transmits count values to a bus interface for an interconnection bus which communicates with the memory controller. The memory controller responsively updates an accumulated count value stored in system memory using the current count value, e.g., incrementing the accumulated count value. A count value can be sent with a pointer to a memory location and an instruction on how the location is to be updated. The instruction may be an atomic read-modify-write operation, and the memory controller can include a dedicated arithmetic logic unit to carry out that operation. A data harvester can then be used to harvest accumulated count values by reading them from a table in system memory. | 12-25-2014 |
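
The eviction rule described in the abstracts of 20130151779 and 20130151780 (seed a per-member reference counter from the allocating operation type, increment it on each access, evict the member with the lowest count) can be illustrated with a minimal sketch. The operation-type names, the initial values in `INITIAL_COUNT`, the class name, and the tie-breaking order are assumptions made for illustration; none of them come from the applications.

```python
# Hypothetical initial counter values per allocating operation type.
# The abstracts say only that the initial value depends on the operation type.
INITIAL_COUNT = {"demand_load": 2, "store": 1, "prefetch": 0}


class WeightedHistorySet:
    """One set of a lower level cache with a reference counter per member."""

    def __init__(self, ways):
        self.ways = ways
        self.ref_count = {}  # tag -> reference counter

    def access(self, tag, op_type):
        """Access a tag; return the evicted tag, or None if nothing was evicted."""
        if tag in self.ref_count:
            # Hit: each access increments the member's counter.
            self.ref_count[tag] += 1
            return None

        victim = None
        if len(self.ref_count) >= self.ways:
            # A new allocation needs a victim: pick the lowest reference count.
            # Ties fall to the oldest inserted member (an assumption).
            victim = min(self.ref_count, key=self.ref_count.get)
            del self.ref_count[victim]

        # Allocate the new member with a counter seeded by operation type.
        self.ref_count[tag] = INITIAL_COUNT[op_type]
        return victim


cache_set = WeightedHistorySet(ways=2)
cache_set.access(0xA, "demand_load")          # allocate A with count 2
cache_set.access(0xB, "prefetch")             # allocate B with count 0
cache_set.access(0xA, "demand_load")          # hit on A, count becomes 3
evicted = cache_set.access(0xC, "store")      # set is full, B has the lowest count
```

In this sketch a prefetched line that is never referenced again is the first candidate for eviction, which is one plausible reason to weight the initial value by operation type.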
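
The two-threshold selection described in the abstract of 20130151778 might be sketched as follows. The abstract specifies only what happens below the low threshold and above the high threshold; keeping the current policy in between is an assumption of this sketch, as are the function name, the policy labels, and the normalized bandwidth values.

```python
def select_policy(bandwidth, current, low, high,
                  first="first_policy", second="second_policy"):
    """Pick an allocation policy from measured inter-cache bandwidth usage."""
    if bandwidth < low:
        # Below the low threshold: operate under the first allocation policy.
        return first
    if bandwidth > high:
        # Above the high threshold: operate under the second allocation policy.
        return second
    # Between the thresholds the abstract is silent; this sketch assumes
    # the current policy is kept, which avoids flapping near one threshold.
    return current


policy = "first_policy"
for sample in (0.2, 0.5, 0.9, 0.5, 0.1):   # hypothetical normalized bandwidth
    policy = select_policy(sample, policy, low=0.3, high=0.7)
```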
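
The clean-distance count described in the abstracts of 20140372704 and 20140372705 can be illustrated with a small sketch for a single congruence class. It assumes each line is represented as a `(reference_count, is_dirty)` pair and that the LRU dirty line is the dirty line with the lowest reference count; both the representation and the function name are hypothetical.

```python
def clean_distance(lines):
    """Count clean lines at or below the LRU dirty line's reference count.

    `lines` is a list of (reference_count, is_dirty) pairs for one
    congruence class. Returns None when the class holds no dirty line.
    """
    dirty_counts = [count for count, is_dirty in lines if is_dirty]
    if not dirty_counts:
        return None
    # Assumption: the LRU dirty line has the smallest reference count
    # among the dirty lines.
    lru_dirty_count = min(dirty_counts)
    return sum(1 for count, is_dirty in lines
               if not is_dirty and count <= lru_dirty_count)


# Hypothetical congruence class: three clean lines, two dirty lines.
example_class = [(0, False), (1, False), (2, True), (3, False), (5, True)]
distance = clean_distance(example_class)
```

A scheduler following the abstract would compare this count against a target and issue write-backs when it falls short, so that a miss in the next lower-order cache level can find a clean victim.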