Patent application number | Description | Published |
--- | --- | --- |
20090006693 | Apparatus and Method for Fairness Arbitration for a Shared Pipeline in a Large SMP Computer System - A modification of rank-priority arbitration for access to computer system resources through a shared pipeline that provides more equitable arbitration by allowing a higher-ranked requester access to the shared resource ahead of a lower-ranked requester only once. If multiple requests are active at the same time, the rank priority will first select the highest priority active request and grant it access to the resource. It will also set a ‘blocking latch’ to prevent that higher priority requester from regaining access to the resource until the rest of the outstanding lower priority active requesters have had a chance to access the resource. (A sketch of this blocking-latch scheme follows the table.) | 01-01-2009 |
20090210661 | METHOD, SYSTEM AND COMPUTER PROGRAM PRODUCT FOR AN IMPLICIT PREDICTED RETURN FROM A PREDICTED SUBROUTINE - A method, system and computer program product for performing an implicit predicted return from a predicted subroutine are provided. The system includes a branch history table/branch target buffer (BHT/BTB) to hold branch information, including a target address of a predicted subroutine and a branch type. The system also includes instruction buffers, and instruction fetch controls to perform a method including fetching a branch instruction at a branch address and a return-point instruction. The method also includes receiving the target address and the branch type, and fetching a fixed number of instructions in response to the branch type. The method further includes referencing the return-point instruction within the instruction buffers such that the return-point instruction is available upon completing the fetching of the fixed number of instructions absent a re-fetch of the return-point instruction. | 08-20-2009 |
20090210684 | METHODS, SYSTEMS, AND COMPUTER PROGRAM PRODUCTS FOR RECOVERING FROM BRANCH PREDICTION LATENCY - A branch prediction algorithm is used to generate a prediction of whether or not a branch will be taken. One or more instructions are fetched such that, for each of the fetched instructions, the prediction initiates a fetch of an instruction at a predicted target of the branch. A test is performed to ascertain whether or not the prediction was generated late relative to the fetched instructions, so that if the branch is later detected as mispredicted, that detection can be correlated to the late prediction. When the prediction is generated late relative to the fetched instructions, the latent prediction is selected by utilizing the fetching it already initiated, so that a new fetch is not started. | 08-20-2009 |
20090216915 | METHOD, SYSTEM, AND COMPUTER PROGRAM PRODUCT FOR MERGING DATA - A method for merging data including receiving a request from an input/output device to merge data, wherein a merge of the data includes a manipulation of the data, determining whether the data exists in a local cache memory that is in local communication with the input/output device, fetching the data to the local cache memory from a remote cache memory or a main memory if the data does not exist in the local cache memory, merging the data according to the request to obtain merged data, and storing the merged data in the local cache, wherein the merging of the data is performed without using a memory controller within a control flow or a data flow of the merging of the data. A corresponding system and computer program product are also provided. (See the merge-mask sketch after the table.) | 08-27-2009 |
20090216952 | METHOD, SYSTEM, AND COMPUTER PROGRAM PRODUCT FOR MANAGING CACHE MEMORY - A method for managing cache memory including receiving an instruction fetch for an instruction stream in a cache memory, wherein the instruction fetch includes an instruction fetch reference tag for the instruction stream and the instruction stream is at least partially included within a cache line, comparing the instruction fetch reference tag to a previous instruction fetch reference tag, maintaining a cache replacement status of the cache line if the instruction fetch reference tag is the same as the previous instruction fetch reference tag, and upgrading the cache replacement status of the cache line if the instruction fetch reference tag is different from the previous instruction fetch reference tag, whereby the cache replacement status of the cache line is upgraded if the instruction stream is independently fetched more than once. A corresponding system and computer program product are also provided. (See the replacement-status sketch after the table.) | 08-27-2009 |
20090217003 | METHOD, SYSTEM, AND COMPUTER PROGRAM PRODUCT FOR REDUCING CACHE MEMORY POLLUTION - A method for reducing cache memory pollution including fetching an instruction stream from a cache line, preventing a fetching for the instruction stream from a sequential cache line, searching for a next predicted taken branch instruction, determining whether a length of the instruction stream extends beyond a length of the cache line based on the next predicted taken branch instruction, continuing to prevent the fetching for the instruction stream from the sequential cache line if the length of the instruction stream does not extend beyond the length of the cache line, and allowing the fetching for the instruction stream from the sequential cache line if the length of the instruction stream extends beyond the length of the cache line, whereby the fetching from the sequential cache line, and the resulting pollution of the cache memory that stores the instruction stream, is minimized. A corresponding system and computer program product are also provided. | 08-27-2009 |
20090217015 | SYSTEM AND METHOD FOR CONTROLLING RESTARTING OF INSTRUCTION FETCHING USING SPECULATIVE ADDRESS COMPUTATIONS - A system and method for controlling restarting of instruction fetching using speculative address computations in a processor are provided. The system includes a predicted target queue to hold branch prediction logic (BPL) generated target address values. The system also includes target selection logic including a recycle queue. The target selection logic selects a saved branch target value between a previously speculatively calculated branch target value from the recycle queue and an address value from the predicted target queue. The system further includes a compare block to identify a wrong target in response to a mismatch between the saved branch target value and a current calculated branch target, where instruction fetching is restarted in response to the wrong target. | 08-27-2009 |
20090217017 | METHOD, SYSTEM AND COMPUTER PROGRAM PRODUCT FOR MINIMIZING BRANCH PREDICTION LATENCY - A method, system, and computer program product for minimizing branch prediction latency in a pipelined computer processing environment are provided. The method includes detecting a branch loop utilizing branch instruction addresses and corresponding target addresses stored in a branch target buffer (BTB). The method also includes fetching the branch loop into a pre-decode instruction buffer and qualifying the branch loop for loop lockdown. The method further includes locking an instruction stream that forms the branch loop in the pre-decode instruction buffer, processing the qualified branch loop instructions from the buffer, and powering down instruction fetching and branch prediction logic (BPL) associated with the BTB. | 08-27-2009 |
20090240891 | METHOD, SYSTEM AND COMPUTER PROGRAM PRODUCT FOR DATA BUFFERS PARTITIONED FROM A CACHE ARRAY - Systems, methods and computer program products for data buffers partitioned from a cache array. An exemplary embodiment includes a method in a processor for providing data buffers partitioned from a cache array, the method including clearing cache directories associated with the processor to an initial state, obtaining a selected directory state from a control register preloaded by a service processor, and, in response to the control register including the desired cache state, sending load commands with an address and data, loading cache lines and cache line directory entries into the cache, and storing the specified data in the corresponding cache line. | 09-24-2009 |
20110314183 | SYSTEM AND METHOD FOR MANAGING DATAFLOW IN A TEMPORARY MEMORY - A method of managing a temporary memory includes: receiving a request to transfer data from a source location to a destination location, the data transfer request associated with an operation to be performed, the operation being either an input into an intermediate temporary memory or an output from it; checking a two-state indicator associated with the temporary memory, the two-state indicator having a first state indicating that an immediately preceding operation on the temporary memory was an input to the temporary memory and a second state indicating that the immediately preceding operation was an output from the temporary memory; and performing the operation responsive to one of: the operation being an input operation and the two-state indicator being in the second state, indicating that the immediately preceding operation was an output; and the operation being an output operation and the two-state indicator being in the first state, indicating that the immediately preceding operation was an input. (See the alternation sketch after the table.) | 12-22-2011 |
20110314211 | RECOVER STORE DATA MERGING - Various embodiments of the present invention merge data in a cache memory. In one embodiment a set of store data is received from a processing core. A store merge command and a merge mask are also received from the processing core. A portion of the store data to perform a merging operation thereon is identified based on the store merge command. A sub-portion of the portion of the store data to be merged with a corresponding set of data from a cache memory is identified based on the merge mask. The sub-portion is merged with the corresponding set of data from the cache memory. (See the merge-mask sketch after the table.) | 12-22-2011 |
20110314212 | MANAGING IN-LINE STORE THROUGHPUT REDUCTION - Various embodiments of the present invention manage a hierarchical store-through memory cache structure. A store request queue is associated with a processing core in multiple processing cores. At least one blocking condition is determined to have occurred at the store request queue. Multiple non-store requests and a set of store requests associated with a remaining set of processing cores in the multiple processing cores are dynamically blocked from accessing a memory cache in response to the blocking condition having occurred. | 12-22-2011 |
20110320657 | CONTROLLING DATA STREAM INTERRUPTIONS ON A SHARED INTERFACE - A mechanism for controlling data stream interruptions on a shared bus is provided. A first request is received to transfer data. High priority data components and low priority data components are determined for the first request. The high priority data components are transferred without interruptions. Any requests received while the high priority data components are being transferred are rejected. | 12-29-2011 |
20110320659 | DYNAMIC MULTI-LEVEL CACHE INCLUDING RESOURCE ACCESS FAIRNESS SCHEME - An apparatus for controlling access to a resource includes a shared pipeline configured to communicate with the resource, a plurality of command queues configured to form instructions for the shared pipeline, and an arbiter coupled between the shared pipeline and the plurality of command queues configured to grant access to the shared pipeline to one of the plurality of command queues based on a first priority scheme in a first operating mode. The apparatus also includes interface logic coupled to the arbiter and configured to determine that contention for access to the resource exists among the plurality of command queues and to cause the arbiter to grant access to the shared pipeline based on a second priority scheme in a second operating mode. (The fairness scheme is sketched after the table.) | 12-29-2011 |
20110320694 | CACHED LATENCY REDUCTION UTILIZING EARLY ACCESS TO A SHARED PIPELINE - A method of performing operations in a shared cache coupled to a first requestor and a second requestor includes receiving at the shared cache a first request from the second requestor; assigning the request to a state machine; transmitting a first pipe pass request from the state machine to an arbiter; providing a first instruction from the first pipe pass request to a cache pipeline, the first instruction causing a first pipe pass; and providing a second pipe pass request to the arbiter before the first pipe pass is completed. | 12-29-2011 |
20110320695 | MITIGATING BUSY TIME IN A HIGH PERFORMANCE CACHE - Various embodiments of the present invention mitigate busy time in a hierarchical store-through memory cache structure. In one embodiment, a cache directory associated with a memory cache is divided into a plurality of portions, each associated with a portion of the memory cache. Simultaneous cache lookup operations and cache write operations between the plurality of portions of the cache directory are supported. Two or more store commands are simultaneously processed in a shared cache pipeline communicatively coupled to the plurality of portions of the cache directory. | 12-29-2011 |
20110320696 | EDRAM REFRESH IN A HIGH PERFORMANCE CACHE ARCHITECTURE - A memory refresh requestor, a memory request interpreter, a cache memory, and a cache controller are provided on a single chip. The cache controller is configured to receive a memory access request for a memory address range in the cache memory, detect that the cache memory located at the memory address range is available, and send the memory access request to the memory request interpreter when the memory address range is available. The memory request interpreter is configured to receive the memory access request from the cache controller, determine whether the memory access request is a request to refresh the contents of the memory address range, and refresh data in the memory address range when the memory access request is a request to refresh memory. | 12-29-2011 |
20110320716 | LOADING AND UNLOADING A MEMORY ELEMENT FOR DEBUG - A method of debugging a memory element is provided. The method includes initializing a line fetch controller with at least one of write data and read data; utilizing at least two separate clocks for performing at least one of write requests and read requests based on the at least one of the write data and the read data; and debugging the memory element based on the at least one of write requests and read requests. | 12-29-2011 |
20110320721 | DYNAMIC TRAILING EDGE LATENCY ABSORPTION FOR FETCH DATA FORWARDED FROM A SHARED DATA/CONTROL INTERFACE - A computer-implemented method for managing data transfer in a multi-level memory hierarchy that includes receiving a fetch request for allocation of data in a higher level memory, determining whether a data bus between the higher level memory and a lower level memory is available, bypassing an intervening memory between the higher level memory and the lower level memory when it is determined that the data bus is available, and transferring the requested data directly from the higher level memory to the lower level memory. | 12-29-2011 |
20110320722 | MANAGEMENT OF MULTIPURPOSE COMMAND QUEUES IN A MULTILEVEL CACHE HIERARCHY - An apparatus for controlling access to a pipeline includes a plurality of command queues, with a first subset of the plurality of command queues assigned to process commands of a first command type, a second subset assigned to process commands of a second command type, and a third subset assigned to neither command type. The apparatus also includes an input controller configured to receive requests having the first command type and the second command type and to assign requests having the first command type to command queues in the first subset until all command queues in the first subset are filled, and then to assign requests having the first command type to command queues in the third subset. (See the queue-assignment sketch after the table.) | 12-29-2011 |
20110320725 | DYNAMIC MODE TRANSITIONS FOR CACHE INSTRUCTIONS - A method of providing requests to a cache pipeline includes receiving a plurality of requests from one or more state machines at an arbiter, selecting one of the plurality of requests as a selected request, the selected request having been provided by a first state machine, determining that the selected request includes a mode that requires a first step and a second step, the first step including an access to a location in a cache, determining that the location in the cache is unavailable, and replacing the mode with a modified mode that only includes the second step. | 12-29-2011 |
20110320727 | DYNAMIC CACHE QUEUE ALLOCATION BASED ON DESTINATION AVAILABILITY - An apparatus for controlling operation of a cache includes a first command queue, a second command queue, and an input controller configured to receive requests having a first command type and a second command type and to assign a first request having the first command type to the first command queue and a second request having the first command type to the second command queue in the event that the first command queue has not received an indication that a first dedicated buffer is available. | 12-29-2011 |
20110320728 | PERFORMANCE OPTIMIZATION AND DYNAMIC RESOURCE RESERVATION FOR GUARANTEED COHERENCY UPDATES IN A MULTI-LEVEL CACHE HIERARCHY - A cache includes a cache pipeline, a request receiver configured to receive off-chip coherency requests from an off-chip cache, and a plurality of state machines coupled to the request receiver. The cache also includes an arbiter coupled between the plurality of state machines and the cache pipeline and configured to give priority to off-chip coherency requests, as well as a counter configured to count the number of coherency requests sent from the cache pipeline to a lower level cache. The cache pipeline is halted from sending coherency requests when the counter exceeds a predetermined limit. (See the throttling sketch after the table.) | 12-29-2011 |
20110320735 | DYNAMICALLY ALTERING A PIPELINE CONTROLLER MODE BASED ON RESOURCE AVAILABILITY - A mechanism for dynamically altering a request received at a hardware component is provided. The request is received at the hardware component, and the request includes a mode option. It is determined whether an action of the request requires an unavailable resource and whether the mode option is for the action requiring the unavailable resource. In response to the mode option being for the action requiring the unavailable resource, the action is automatically removed from the request. The request is passed for pipeline arbitration without the action requiring the unavailable resource. | 12-29-2011 |
20110320736 | PREEMPTIVE IN-PIPELINE STORE COMPARE RESOLUTION - A computer-implemented method that includes receiving, via a processor, a plurality of stores in a store queue, comparing a fetch request against the store queue to search for a target store having a same memory address as the fetch request, determining whether the target store is ahead of the fetch request in a same pipeline, and processing the fetch request when it is determined that the target store is ahead of the fetch request. | 12-29-2011 |
20110320743 | MEMORY ORDERED STORE SYSTEM IN A MULTIPROCESSOR COMPUTER SYSTEM - A system and computer-implemented method for storing data to the memory of a computer system in order, at a fast rate, are provided. The method includes launching a first store to memory. A wait counter is initiated. A second store to memory is speculatively launched when the wait counter expires. The second store to memory is cancelled when the second store achieves coherency prior to the first store to memory. | 12-29-2011 |
20110320779 | PERFORMANCE MONITORING IN A SHARED PIPELINE - A pipelined processing device includes: a device controller configured to receive a request to perform an operation; a plurality of subcontrollers configured to receive at least one instruction associated with the operation, each of the plurality of subcontrollers including a counter configured to generate an active time value indicating at least a portion of a time taken to process the at least one instruction; a pipeline processor configured to receive and process the at least one instruction, the pipeline processor configured to receive the active time value; and a shared pipeline storage area configured to store the active time value for each of the plurality of subcontrollers. | 12-29-2011 |
20110320855 | ERROR DETECTION AND RECOVERY IN A SHARED PIPELINE - A pipelined processing device includes: a processor configured to receive a request to perform an operation; a plurality of processing controllers configured to receive at least one instruction associated with the operation, each of the plurality of processing controllers including a memory to store at least one instruction therein; a pipeline processor configured to receive and process the at least one instruction, the pipeline processor including shared error detection logic configured to detect a parity error in the at least one instruction as the at least one instruction is processed in a pipeline and generate an error signal; and a pipeline bus connected to each of the plurality of processing controllers and configured to communicate the error signal from the error detection logic. | 12-29-2011 |
20120215995 | PREEMPTIVE IN-PIPELINE STORE COMPARE RESOLUTION - A computer-implemented method that includes receiving, via a processor, a plurality of stores in a store queue, comparing a fetch request against the store queue to search for a target store having a same memory address as the fetch request, determining whether the target store is ahead of the fetch request in a same pipeline, and processing the fetch request when it is determined that the target store is ahead of the fetch request. | 08-23-2012 |
20130046926 | EDRAM REFRESH IN A HIGH PERFORMANCE CACHE ARCHITECTURE - A method for implementing embedded dynamic random access memory (eDRAM) refreshing in a high performance cache architecture. The method includes receiving a memory access request, via a cache controller, from a memory refresh requestor, the memory access request for a memory address range in a cache memory. The method also includes detecting that the cache memory located at the memory address range is available to receive the memory access request and sending the memory access request to a memory request interpreter. The method further includes receiving the memory access request from the cache controller, determining that the memory access request is a request to refresh contents of the memory address range in the cache memory, and refreshing data in the memory address range. | 02-21-2013 |
20130060997 | MITIGATING BUSY TIME IN A HIGH PERFORMANCE CACHE - Various embodiments of the present invention mitigate busy time in a hierarchical store-through memory cache structure. In one embodiment, a cache directory associated with a memory cache is divided into a plurality of portions, each associated with a portion of the memory cache. Simultaneous cache lookup operations and cache write operations between the plurality of portions of the cache directory are supported. Two or more store commands are simultaneously processed in a shared cache pipeline communicatively coupled to the plurality of portions of the cache directory. | 03-07-2013 |
20130080705 | MANAGING IN-LINE STORE THROUGHPUT REDUCTION - Various embodiments of the present invention manage a hierarchical store-through memory cache structure. A store request queue is associated with a processing core in multiple processing cores. At least one blocking condition is determined to have occurred at the store request queue. Multiple non-store requests and a set of store requests associated with a remaining set of processing cores in the multiple processing cores are dynamically blocked from accessing a memory cache in response to the blocking condition having occurred. | 03-28-2013 |
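
The sketches below illustrate, in ordinary Python, the mechanisms several of these abstracts describe; all class, function, and parameter names are illustrative assumptions, not taken from the patents.

Application 20090006693 (revisited by the dual-mode arbiter of 20110320659) grants a shared pipeline to the highest-ranked active requester but latches the winner out of contention until every other active requester has had a turn. A minimal sketch of that blocking-latch scheme, assuming a fixed rank order where a lower index means a higher rank:

```python
class FairArbiter:
    """Rank-priority arbitration with per-requester blocking latches."""

    def __init__(self, num_requesters):
        self.blocked = [False] * num_requesters  # the blocking latches

    def grant(self, active):
        """Pick a winner among the active requesters, or return None."""
        candidates = [i for i, a in enumerate(active)
                      if a and not self.blocked[i]]
        if not candidates:
            # Every active requester has won once this round: clear the
            # latches so a new round of arbitration can begin.
            for i, a in enumerate(active):
                if a:
                    self.blocked[i] = False
            candidates = [i for i, a in enumerate(active) if a]
        if not candidates:
            return None
        winner = min(candidates)     # highest rank (lowest index) wins...
        self.blocked[winner] = True  # ...but only once per round
        return winner


arb = FairArbiter(3)
# With all three requesters held active, grants rotate 0, 1, 2 instead
# of the starvation-prone pure rank-priority result 0, 0, 0, ...
assert [arb.grant([True, True, True]) for _ in range(6)] == [0, 1, 2, 0, 1, 2]
```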
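Application 20090216952 promotes a cache line only when its instruction stream is fetched again independently, which it detects by a change of fetch reference tag; sequential fetches within one pass reuse the tag and leave the status alone. A hedged sketch, with a saturating counter standing in for the cache replacement status:

```python
class CacheLine:
    def __init__(self):
        self.last_fetch_tag = None   # reference tag of the last fetch seen
        self.replacement_status = 0  # higher = more protected from eviction


def on_instruction_fetch(line, fetch_tag, max_status=3):
    """Upgrade the replacement status only on an independent re-fetch."""
    if fetch_tag != line.last_fetch_tag:
        # New reference tag: the stream was independently fetched again.
        line.replacement_status = min(line.replacement_status + 1, max_status)
        line.last_fetch_tag = fetch_tag
    # Same tag: the fetch belongs to the current pass; maintain status.


line = CacheLine()
on_instruction_fetch(line, fetch_tag=7)  # first pass: promoted to 1
on_instruction_fetch(line, fetch_tag=7)  # same pass: status maintained
on_instruction_fetch(line, fetch_tag=9)  # independent re-fetch: promoted to 2
assert line.replacement_status == 2
```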
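Application 20110314183's two-state indicator enforces strict alternation on the temporary memory: an input is performed only if the immediately preceding operation was an output, and an output only if it was an input. A minimal sketch under that reading:

```python
class TemporaryMemory:
    """Intermediate buffer guarded by a two-state input/output indicator."""

    def __init__(self):
        self.last_was_input = False  # the two-state indicator
        self.buffer = None

    def input_op(self, data):
        """Input is allowed only after an output (or initially)."""
        if self.last_was_input:
            raise RuntimeError("input refused: prior input not yet drained")
        self.buffer = data
        self.last_was_input = True

    def output_op(self):
        """Output is allowed only after an input."""
        if not self.last_was_input:
            raise RuntimeError("output refused: nothing was input")
        self.last_was_input = False
        return self.buffer


tmp = TemporaryMemory()
tmp.input_op(b"payload")             # indicator said last op was an output
assert tmp.output_op() == b"payload"
tmp.input_op(b"next")                # legal again after the output
```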
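Applications 20090216915 and 20110314211 both merge incoming store data with cached data; in the latter, a merge mask selects which pieces of the identified sub-portion come from the store and which stay as cached. A sketch assuming the mask carries one bit per byte (set bit: take the store byte):

```python
def merge_store_data(store_bytes, cache_bytes, merge_mask):
    """Merge store data into cached data under a per-byte mask."""
    assert len(store_bytes) == len(cache_bytes)
    return bytes(
        s if (merge_mask >> i) & 1 else c   # set bit: store byte wins
        for i, (s, c) in enumerate(zip(store_bytes, cache_bytes))
    )


# Merge bytes 0 and 2 of a 4-byte store into the cached data.
merged = merge_store_data(b"\xAA\xBB\xCC\xDD", b"\x00\x11\x22\x33", 0b0101)
assert merged == b"\xAA\x11\xCC\x33"
```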
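Application 20110320722's input controller fills the queues dedicated to a command type first and spills into the unassigned third subset only once every dedicated queue for that type is occupied. A sketch of that assignment policy:

```python
def assign_queue(command_type, dedicated, shared, busy):
    """Pick a free queue id for a command, or None if all are occupied.

    dedicated: dict mapping command type -> list of dedicated queue ids
    shared:    queue ids in the third, unassigned subset
    busy:      set of queue ids currently holding a command
    """
    # Try the subset dedicated to this command type first.
    for q in dedicated[command_type]:
        if q not in busy:
            return q
    # All dedicated queues are filled: overflow into the shared subset.
    for q in shared:
        if q not in busy:
            return q
    return None


dedicated = {"fetch": [0, 1], "store": [2, 3]}
assert assign_queue("fetch", dedicated, shared=[4, 5], busy={0, 1}) == 4
```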
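Application 20110320728 bounds the number of coherency requests in flight from the cache pipeline to the lower-level cache with a counter, halting the pipeline at a predetermined limit. A minimal sketch, assuming each response from the lower-level cache decrements the counter:

```python
class CoherencyThrottle:
    """Halts the pipeline once too many coherency requests are in flight."""

    def __init__(self, limit):
        self.limit = limit
        self.in_flight = 0  # coherency requests awaiting a response

    def try_send(self):
        """Return True if the pipeline may issue another request."""
        if self.in_flight >= self.limit:
            return False  # pipeline halted until responses drain
        self.in_flight += 1
        return True

    def on_response(self):
        self.in_flight -= 1


throttle = CoherencyThrottle(limit=2)
assert throttle.try_send() and throttle.try_send()
assert not throttle.try_send()  # limit reached: pipeline halted
throttle.on_response()
assert throttle.try_send()      # drained below the limit: resume
```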