| Patent application number | Description | Published |
| --- | --- | --- |
| 20080276074 | SIMPLE LOAD AND STORE DISAMBIGUATION AND SCHEDULING AT PREDECODE - Embodiments of the invention provide a processor for executing instructions. In one embodiment, the processor includes circuitry to receive a load instruction and a store instruction to be executed in the processor and detect a conflict between the load instruction and the store instruction. Detecting the conflict includes determining if load-store conflict information indicates that the load instruction previously conflicted with the store instruction. The load-store conflict information is stored for both the load instruction and the store instruction. The processor further includes circuitry to schedule execution of the load instruction and the store instruction so that execution of the load instruction and the store instruction does not result in a conflict. | 11-06-2008 |
| 20080276075 | SIMPLE LOAD AND STORE DISAMBIGUATION AND SCHEDULING AT PREDECODE - Embodiments of the invention provide a method and processor for executing instructions. In one embodiment, the method includes receiving a load instruction and a store instruction to be executed in the processor and detecting a conflict between the load instruction and the store instruction. Detecting the conflict includes determining if load-store conflict information indicates that the load instruction previously conflicted with the store instruction. The load-store conflict information is stored for both the load instruction and the store instruction. The method further includes scheduling execution of the load instruction and the store instruction so that execution of the load instruction and the store instruction does not result in a conflict. | 11-06-2008 |
| 20080276079 | MECHANISM TO MINIMIZE UNSCHEDULED D-CACHE MISS PIPELINE STALLS - A method and apparatus for minimizing unscheduled D-cache miss pipeline stalls is provided. In one embodiment, execution of an instruction in a processor is scheduled. The processor may have at least one cascaded delayed execution pipeline unit having two or more execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The method includes receiving an issue group of instructions, determining if a first instruction in the issue group is a load instruction, and if so, scheduling the first instruction to be executed in a pipeline in which execution is not delayed with respect to another pipeline in the cascaded delayed execution pipeline unit. | 11-06-2008 |
| 20090006754 | DESIGN STRUCTURE FOR L2 CACHE/NEST ADDRESS TRANSLATION - A design structure embodied in a machine readable storage medium for designing, manufacturing, and/or testing a design for accessing a processor's cache memory is provided. The design structure comprises a processor having one or more level one caches, a lookaside buffer configured to include a corresponding entry for each cache line placed in each of the processor's one or more level one caches. The corresponding entry indicates a translation from the effective addresses to the real addresses for the cache line. The processor also comprises circuitry configured to access requested data in the processor's one or more level one caches using requested effective addresses of the requested data, translate the requested effective addresses to real addresses if the processor's one or more level one caches do not contain requested data corresponding to the requested effective addresses, and use the translated real addresses to access the level two cache. | 01-01-2009 |
| 20090138690 | LOCAL AND GLOBAL BRANCH PREDICTION INFORMATION STORAGE - Embodiments of the invention provide an apparatus for storing branch prediction information. In one embodiment, an integrated circuit device includes a first table for storing local branch prediction information, a second table for storing global branch prediction information, and circuitry. The circuitry is configured to receive a branch instruction and store local branch prediction information for the branch instruction in the first table. The local branch prediction information includes a local predictability value for the local branch prediction information. The circuitry is further configured to store global branch prediction information for the branch instruction in the second table only if the local predictability value is below a threshold value of predictability. | 05-28-2009 |
| 20090204787 | Butterfly Physical Chip Floorplan to Allow an ILP Core Polymorphism Pairing - Improved techniques for executing instructions in a pipelined manner that may reduce stalls that occur when executing dependent instructions are provided. Stalls may be reduced by utilizing a cascaded arrangement of pipelines with execution units that are delayed with respect to each other. This cascaded delayed arrangement allows dependent instructions to be issued within a common issue group by scheduling them for execution in different pipelines to execute at different times. Separate processor cores may be morphed to appear differently for different applications. For example, two processor cores each capable of executing N-wide issue groups of instructions may be morphed to appear as a single processor core capable of executing 2N-wide issue groups. | 08-13-2009 |
| 20090204791 | Compound Instruction Group Formation and Execution - A method and apparatus for forming compound issue groups containing instructions from multiple cache lines of instructions are provided. By pre-fetching instruction lines containing instructions targeted by a conditional branch statement, if it is predicted that the conditional branch will be taken, a compound issue group may be formed with instructions from the I-line containing the branch statement and the I-line containing instructions targeted by the branch. | 08-13-2009 |
| 20090204792 | Scalar Processor Instruction Level Parallelism (ILP) Coupled Pair Morph Mechanism - Improved techniques for executing instructions in a pipelined manner that may reduce stalls that occur when executing dependent instructions are provided. Stalls may be reduced by utilizing a cascaded arrangement of pipelines with execution units that are delayed with respect to each other. This cascaded delayed arrangement allows dependent instructions to be issued within a common issue group by scheduling them for execution in different pipelines to execute at different times. Separate processor cores may be morphed to appear differently for different applications. For example, two processor cores each capable of executing N-wide issue groups of instructions may be morphed to appear as a single processor core capable of executing 2N-wide issue groups. | 08-13-2009 |
| 20090210624 | 3-Dimensional L2/L3 Cache Array to Hide Translation (TLB) Delays - Embodiments of the invention provide a look-aside-look-aside buffer (LLB) configured to retain a portion of the real addresses in a translation look-aside buffer (TLB) to allow prefetching of data from a cache. A subset of real address bits associated with an effective address may be retrieved relatively quickly from the LLB, thereby allowing access to the cache before the complete address translation is available and reducing cache access latency. | 08-20-2009 |
| 20090210625 | Self Prefetching L3/L4 Cache Mechanism - Embodiments of the invention provide a look-aside-look-aside buffer (LLB) configured to retain a portion of the real addresses in a translation look-aside buffer (TLB) to allow prefetching of data from a cache. A subset of real address bits associated with an effective address may be retrieved relatively quickly from the LLB, thereby allowing access to the cache before the complete address translation is available and reducing cache access latency. | 08-20-2009 |
| 20090210664 | System and Method for Issue Schema for a Cascaded Pipeline - The present invention provides a system and method for a group priority issue schema for a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having four or more execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) schedule the instructions in the program order received; and (3) execute the issue group of instructions in the cascaded delayed execution pipeline unit. The present invention can also be viewed as providing a method for a group priority issue schema for a cascaded pipeline. The method includes: (1) receiving an issue group of instructions; (2) scheduling the instructions in the program order received; and (3) executing the issue group of instructions in the cascaded delayed execution pipeline unit. | 08-20-2009 |
| 20090210665 | System and Method for a Group Priority Issue Schema for a Cascaded Pipeline - The present invention provides a system and method for a group priority issue schema for a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to receive an issue group of instructions, reorder the issue group of instructions using instruction type priority, and execute the reordered issue group of instructions in the cascaded delayed execution pipeline unit. The method, among others, can be broadly summarized by the following steps: receiving an issue group of instructions, reordering the issue group of instructions using instruction type priority, and executing the reordered issue group of instructions in the cascaded delayed execution pipeline unit. | 08-20-2009 |
| 20090210666 | System and Method for Resolving Issue Conflicts of Load Instructions - The present invention provides a system and method for a group priority issue schema for a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: receive an issue group of instructions; determine if at least one load instruction is in the issue group and, if so, schedule the at least one load instruction in one of the plurality of execution pipelines based upon a first prioritization scheme; determine if there is an issue conflict for one of the plurality of execution pipelines and resolve the issue conflict by scheduling the at least one load instruction in a different execution pipeline; and schedule execution of the issue group of instructions in the cascaded delayed execution pipeline unit. | 08-20-2009 |
| 20090210667 | System and Method for Optimization Within a Group Priority Issue Schema for a Cascaded Pipeline - The present invention provides a system and method for a group priority issue schema for a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions, (2) determine the dependency chain depth of all the instructions in the issue group, (3) schedule the instructions in an order of the longest dependency chain depth to shortest dependency chain depth, and (4) execute the issue group of instructions in the cascaded delayed execution pipeline unit. | 08-20-2009 |
| 20090210668 | System and Method for Optimization Within a Group Priority Issue Schema for a Cascaded Pipeline - The present invention provides a system and method for a group priority issue schema for a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if a plurality of load instructions are in the issue group and, if so, schedule the plurality of load instructions in order of longest dependency chain depth to shortest dependency chain depth in the shortest to longest available execution pipelines; and (3) execute the issue group of instructions in the cascaded delayed execution pipeline unit. | 08-20-2009 |
| 20090210669 | System and Method for Prioritizing Floating-Point Instructions - The present invention provides a system and method for prioritizing floating-point instructions in a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: ( | 08-20-2009 |
| 20090210670 | System and Method for Prioritizing Arithmetic Instructions - The present invention provides a system and method for prioritizing arithmetic instructions in a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if at least one arithmetic instruction is in the issue group and, if so, schedule the at least one arithmetic instruction in one of the plurality of execution pipelines based upon a first prioritization scheme; (3) determine if there is an issue conflict for one of the plurality of execution pipelines and resolve the issue conflict by scheduling the at least one arithmetic instruction in a different execution pipeline; and (4) schedule execution of the issue group of instructions in the cascaded delayed execution pipeline unit. | 08-20-2009 |
| 20090210671 | System and Method for Prioritizing Store Instructions - The present invention provides a system and method for prioritizing store instructions in a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if at least one store instruction is in the issue group and, if so, schedule the at least one store instruction in one of the plurality of execution pipelines based upon a first prioritization scheme; (3) determine if there is an issue conflict for one of the plurality of execution pipelines and resolve the issue conflict by scheduling the at least one store instruction in a different execution pipeline; and (4) schedule execution of the issue group of instructions in the cascaded delayed execution pipeline unit. | 08-20-2009 |
| 20090210672 | System and Method for Resolving Issue Conflicts of Load Instructions - The present invention provides a system and method for a group priority issue schema for a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: receive an issue group of instructions; determine if at least one load instruction is in the issue group and, if so, schedule the at least one load instruction in one of the plurality of execution pipelines based upon a first prioritization scheme; determine if there is an issue conflict for one of the plurality of execution pipelines and resolve the issue conflict by determining which load involved in the issue conflict has the closest dependency; and schedule any load involved in the issue conflict not having the closest dependency in a delayed execution pipeline. | 08-20-2009 |
| 20090210673 | System and Method for Prioritizing Compare Instructions - The present invention provides a system and method for prioritizing compare instructions in a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if at least one compare instruction is in the issue group and, if so, schedule the at least one compare instruction in one of the plurality of execution pipelines based upon a first prioritization scheme; (3) determine if there is an issue conflict for one of the plurality of execution pipelines and resolve the issue conflict by scheduling the at least one compare instruction in a different execution pipeline; and (4) schedule execution of the issue group of instructions in the cascaded delayed execution pipeline unit. | 08-20-2009 |
| 20090210674 | System and Method for Prioritizing Branch Instructions - The present invention provides a system and method for prioritizing branch instructions in a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if at least one branch instruction is in the issue group and, if so, schedule the at least one branch instruction in one of the plurality of execution pipelines based upon a first prioritization scheme; (3) determine if there is an issue conflict for one of the plurality of execution pipelines and resolve the issue conflict by scheduling the at least one branch instruction in a different execution pipeline; and (4) schedule execution of the issue group of instructions in the cascaded delayed execution pipeline unit. | 08-20-2009 |
| 20090210676 | System and Method for the Scheduling of Load Instructions Within a Group Priority Issue Schema for a Cascaded Pipeline - The present invention provides a system and method for a group priority issue schema for a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) determine if at least one load instruction is in the issue group and, if so, schedule the at least one load instruction in a first pipeline based upon a priority list; and (3) schedule execution of the issue group of instructions in the cascaded delayed execution pipeline unit. | 08-20-2009 |
| 20090210677 | System and Method for Optimization Within a Group Priority Issue Schema for a Cascaded Pipeline - The present invention provides a system and method for a group priority issue schema for a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having a plurality of execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions, (2) determine a stall penalty of all the instructions in the issue group, (3) schedule the instructions in an order of the longest stall penalty to shortest stall penalty, and (4) execute the issue group of instructions in the cascaded delayed execution pipeline unit. | 08-20-2009 |
| 20090265527 | Multiport Execution Target Delay Queue Fifo Array - One embodiment provides a method of forwarding data in a processor. The method generally includes providing at least one cascaded delayed execution pipeline unit having at least a first pipeline and a second pipeline for executing first and second instructions in a common issue group, wherein the second pipeline executes the second instruction in a delayed manner relative to the execution of the first instruction in the first pipeline, storing results generated by an execution unit of the first pipeline in a first-in first-out (FIFO) storage target delay queue, determining if the target delay queue contains source data for executing the second instruction, and if the target delay queue contains source data for the second instruction, forwarding the source data for the second instruction from the target delay queue to an execution unit of the second pipeline. | 10-22-2009 |
| 20100077177 | Multiple Processor Core Vector Morph Coupling Mechanism - One embodiment of the invention provides a processor. The processor generally includes a first and second processor core, each having a plurality of pipelined execution units for executing an issue group of multiple instructions and scheduling logic configured to issue a first issue group of instructions to the first processor core for execution and a second issue group of instructions to the second processor core for execution when the processor is in a first mode of operation and configured to issue one or more vector instructions for concurrent execution on the first and second processor cores when the processor is in a second mode of operation. | 03-25-2010 |
| 20100306471 | D-CACHE LINE USE HISTORY BASED DONE BIT BASED ON SUCCESSFUL PREFETCHABLE COUNTER - A method of providing history based done logic for a D-cache includes receiving a D-cache line in an L2 cache; determining if the D-cache line is unprefetchable; aging the D-cache line without a delay if the D-cache line is prefetchable; and aging the D-cache line with a delay if the D-cache line is unprefetchable. | 12-02-2010 |
| 20100306472 | I-CACHE LINE USE HISTORY BASED DONE BIT BASED ON SUCCESSFUL PREFETCHABLE COUNTER - A method of providing history based done logic for an I-cache includes receiving an I-cache line in an L2 cache; determining if the I-cache line is unprefetchable; aging the I-cache line without a delay if the I-cache line is prefetchable; and aging the I-cache line with a delay if the I-cache line is unprefetchable. | 12-02-2010 |
| 20100306473 | CACHE LINE USE HISTORY BASED DONE BIT MODIFICATION TO D-CACHE REPLACEMENT SCHEME - A method of providing history based done logic includes receiving a cache line in an L2 cache; determining if the cache line has a history of being accessed at least three times on a previous call into the L2 cache; providing the cache line directly to a processor if the history of access was less than three times; and loading the cache line into an L1 cache if the history of access was at least three times. | 12-02-2010 |
| 20100306474 | CACHE LINE USE HISTORY BASED DONE BIT MODIFICATION TO I-CACHE REPLACEMENT SCHEME - A method of providing history based done logic for instructions includes receiving an instruction in a cache line in an L2 cache; and loading the cache line into an L1 cache with a history count that indicates the number of read references of the previous access. | 12-02-2010 |
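
Several of the abstracts above (for example 20090210667 and 20090210668) describe ordering an issue group by dependency chain depth before placing it in a cascaded delayed execution pipeline. The following is a minimal Python sketch of that idea, not the patented circuitry. It makes several assumptions that the abstracts do not state: the `Instruction` class, the `chain_depths` and `schedule` helpers, the definition of depth as the longest run of dependents following an instruction within the group, the choice to give pipeline 0 the least delay, and the tie-break by program order are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    """Hypothetical model of one instruction in an issue group."""
    name: str
    # Names of instructions in the same issue group whose results this one reads.
    sources: tuple = ()


def chain_depths(group):
    """Map each instruction name to the length of the longest chain of
    dependents that follows it inside the group (assumed definition)."""
    consumers = {ins.name: [] for ins in group}
    for ins in group:
        for src in ins.sources:
            if src in consumers:
                consumers[src].append(ins.name)

    depths = {}

    def depth(name):
        # Memoized longest-path search; assumes the dependencies are acyclic.
        if name not in depths:
            depths[name] = max((1 + depth(c) for c in consumers[name]), default=0)
        return depths[name]

    for ins in group:
        depth(ins.name)
    return depths


def schedule(group):
    """Assign pipeline indices (0 = least delayed, an assumption) in order
    of longest to shortest dependency chain depth."""
    depths = chain_depths(group)
    # sorted() is stable, so instructions with equal depth keep program order.
    ordered = sorted(group, key=lambda ins: -depths[ins.name])
    return {ins.name: pipe for pipe, ins in enumerate(ordered)}


group = [
    Instruction("add"),
    Instruction("load"),
    Instruction("mul", ("load",)),
    Instruction("store", ("mul",)),
]
print(schedule(group))
```

Under this definition a producer always has a greater depth than any of its consumers, so each dependent instruction lands in a more delayed pipeline than the instruction it reads from. That matches the cascaded arrangement the abstracts describe, in which dependent instructions share an issue group but execute at different times.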