Class / Patent application number | Description | Number of patent applications / Date published |
712237000 | Prefetching a branch target (i.e., look ahead) | 60 |
20090049286 | DATA PROCESSING SYSTEM, PROCESSOR AND METHOD OF DATA PROCESSING HAVING IMPROVED BRANCH TARGET ADDRESS CACHE - A processor includes an execution unit and instruction sequencing logic that fetches instructions from a memory system for execution. The instruction sequencing logic includes branch logic that outputs predicted branch target addresses for use as instruction fetch addresses. The branch logic includes a level one branch target address cache (BTAC) and a level two BTAC each having a respective plurality of entries each associating at least a tag with a predicted branch target address. The branch logic accesses the level one and level two BTACs in parallel with a tag portion of a first instruction fetch address to obtain a first predicted branch target address from the level one BTAC for use as a second instruction fetch address in a first processor clock cycle and a second predicted branch target address from the level two BTAC for use as a third instruction fetch address in a later second processor clock cycle. | 02-19-2009 |
20090070568 | Computation parallelization in software reconfigurable all digital phase lock loop - A novel and useful apparatus for and method of software based phase locked loop (PLL). The software based PLL incorporates a reconfigurable calculation unit (RCU) that is optimized and programmed to sequentially perform all the atomic operations of a PLL or any other desired task in a time sharing manner. An application specific instruction-set processor (ASIP) incorporating the RCU includes an instruction set whose instructions are optimized to perform the atomic operations of a PLL. A multi-stage data stream based processor incorporates a parallel/pipelined architecture optimized to perform data stream processing efficiently. The multi-stage parallel/pipelined processor provides significantly higher processing speeds by combining multiple RCUs wherein input data samples are input in parallel to all RCUs while computation results from one RCU are used by adjacent downstream RCUs. A register file provides storage for historical values while local storage in each RCU provides storage for temporary results. | 03-12-2009 |
20090265532 | ANTI-PREFETCH INSTRUCTION - Embodiments of the present invention execute an anti-prefetch instruction. These embodiments start by decoding instructions in a decode unit in a processor to prepare the instructions for execution. Upon decoding an anti-prefetch instruction, these embodiments stall the decode unit to prevent decoding subsequent instructions. These embodiments then execute the anti-prefetch instruction, wherein executing the anti-prefetch instruction involves: (1) sending a prefetch request for a cache line in an L1 cache; (2) determining if the prefetch request hits in the L1 cache; (3) if the prefetch request hits in the L1 cache, determining if the cache line contains a predetermined value; and (4) conditionally performing subsequent operations based on whether the prefetch request hits in the L1 cache or the value of the data in the cache line. | 10-22-2009 |
20090300340 | Accuracy of Correlation Prefetching Via Block Correlation and Adaptive Prefetch Degree Selection - A method for prefetching data and/or instructions from a main memory to a cache memory may include generating control flow information by storing respective information for each retired branch instruction. The method may further include storing respective one or more cache miss addresses for each retired instruction that incurs one or more cache misses, with the respective one or more cache miss addresses corresponding respectively to the one or more cache misses. A correlation table may be maintained based on the generated control flow information and the stored cache miss addresses. Each respective correlation table entry may correspond to a respective index, and may contain a respective tag and a respective correlation list. The correlation list may consist of a specified number of cache miss addresses that most frequently follow the cache miss address used in generating the index to which the respective correlation table entry corresponds. A prefetch operation may be performed for each cache miss based on the contents of the correlation table entry corresponding to the index generated using a combination of bits of a given cache miss address corresponding to the cache miss, and at least a subset of bits of the program control flow information corresponding to the given cache miss address. | 12-03-2009 |
20100146248 | METHODS AND APPARATUS FOR PERFORMING JUMP OPERATIONS IN A DIGITAL PROCESSOR - Methods and apparatus are provided for performing a jump operation in a pipelined digital processor. The method includes writing target addresses of jump instructions to be executed to a memory table, detecting a first jump instruction being executed by the processor, the first jump instruction referencing a pointer to a first target address in the memory table, the processor executing the first jump instruction by jumping to the first target address and modifying the pointer to point to a second target address in the memory table, the second target address corresponding to a second jump instruction. The execution of the first jump instruction may include prefetching at least one future target address from the memory table and writing the future target address in a local memory. The second target address may be accessed in the local memory in response to detection of the second jump instruction. | 06-10-2010 |
20100169624 | Adaptive Fetch Advance Control for a Low Power Processor - A digital signal processor (DSP) includes an instruction buffer queue (IBQ) with multiple lines, as well as a modifiable fetch advance parameter to specify a fetch advance setting for the IBQ. The DSP also has a control flow module. In response to execution of a program in the DSP, the control flow module may automatically determine whether a branch has been predicted for the program, or for a portion of the program. The control flow module may automatically reduce the fetch advance parameter in response to determining that a branch has been predicted for the program. Also, the control flow module may automatically increase the fetch advance setting in response to determining that no branch has been predicted for a portion of the program. Other embodiments are described and claimed. | 07-01-2010 |
20100306513 | Processor Core and Method for Managing Program Counter Redirection in an Out-of-Order Processor Pipeline - A processor core and method for managing program counter redirection in an out-of-order processor pipeline. In one embodiment, the pipeline of the processor core includes a front-end instruction fetch portion, a back-end instruction execution portion, and pipeline control logic. Operation of the instruction fetch portion is decoupled from operation of the instruction execution portion. Following detection of a control transfer misprediction, operation of the instruction fetch portion is halted and instructions residing in the instruction fetch portion are invalidated. When the instruction associated with the misprediction reaches a selected pipeline stage, instructions residing in the instruction execution portion of the pipeline are invalidated and the flow of instructions from the instruction fetch portion to the instruction execution portion of the processor pipeline is restarted. A mispredict instruction identification checker and instruction identification tags are used to determine if a control transfer instruction is permitted to redirect instruction fetching. | 12-02-2010 |
20100332811 | SPECULATIVE MULTI-THREADING FOR INSTRUCTION PREFETCH AND/OR TRACE PRE-BUILD - The latencies associated with retrieving instruction information for a main thread are decreased through the use of a simultaneous helper thread. The helper thread is a speculative prefetch thread to perform instruction prefetch and/or trace pre-build for the main thread. | 12-30-2010 |
20120124345 | CUMULATIVE CONFIDENCE FETCH THROTTLING - A method and apparatus to utilize a fetching scheme for instructions in a processor to limit the expenditure of power caused by the speculative execution of branch instructions is provided. Also provided is a computer readable storage device encoded with data for adapting a manufacturing facility to create an apparatus. The method includes calculating a cumulative confidence measure based on one or more outstanding conditional branch instructions. The method also includes reducing prefetching operations in response to detecting that the cumulative confidence measure is below a first threshold level. | 05-17-2012 |
20140019736 | Embedded Branch Prediction Unit - In accordance with some embodiments of the present invention, a branch prediction unit for an embedded controller may be placed in association with the instruction fetch unit instead of the decode stage. In addition, the branch prediction unit may include no branch predictor. Also, the return address stack may be associated with the instruction decode stage and is structurally separate from the branch prediction unit. In some cases, this arrangement reduces the area of the branch prediction unit, as well as power consumption. | 01-16-2014 |
20140122846 | BRANCH TARGET ADDRESS CACHE USING HASHED FETCH ADDRESSES - An integrated circuit | 05-01-2014 |
20140164748 | PRE-FETCHING INSTRUCTIONS USING PREDICTED BRANCH TARGET ADDRESSES - The present application describes a method and apparatus for prefetching instructions based on predicted branch target addresses. Some embodiments of the method include providing a second cache line to a second cache when a target address for a branch instruction in a first cache line of a first cache is included in the second cache line of the first cache and when the second cache line is not resident in the second cache. | 06-12-2014 |
20140195788 | REDUCING INSTRUCTION MISS PENALTIES IN APPLICATIONS - Embodiments include systems and methods for reducing instruction cache miss penalties during application execution. Application code is profiled to determine “hot” code regions likely to experience instruction cache miss penalties. The application code can be linearized into a set of traces that include the hot code regions. Embodiments traverse the traces in reverse, keeping track of instruction scheduling information, to determine where an accumulated instruction latency covered by the code blocks exceeds an amount of latency that can be covered by prefetching. Each time the accumulated latency exceeds the amount of latency that can be covered by prefetching, a prefetch instruction can be scheduled in the application code. Some embodiments insert additional prefetches, merge prefetches, and/or adjust placement of prefetches to account for scenarios, such as loops, merging or forking branches, edge confidence values, etc. | 07-10-2014 |
20140372734 | User-Level Hardware Branch Records - A processor, a method and a computer-readable medium for recording branch addresses are provided. The processor comprises hardware registers and first and second circuitry. The first circuitry is configured to store a first address associated with a branch instruction in the hardware registers. The first circuitry is further configured to store a second address that indicates where the processor execution is redirected to as a result of the branch instruction in the hardware registers. The second circuitry is configured to, in response to a second instruction, retrieve a value of at least one of the registers. The second instruction can be a user-level instruction. | 12-18-2014 |
20140372735 | SOFTWARE CONTROLLED INSTRUCTION PREFETCH BUFFERING - The invention relates to a method of prefetching instructions into a microprocessor buffer under software control. | 12-18-2014 |
20160378661 | INSTRUCTION BLOCK ALLOCATION - Apparatus and methods are disclosed for throttling processor operation in block-based processor architectures. In one example of the disclosed technology, a block-based instruction set architecture processor includes a plurality of processing cores configured to fetch and execute a sequence of instruction blocks. Each of the processing cores includes functional resources for performing operations specified by the instruction blocks. The processor further includes a core scheduler configured to allocate functional resources for performing the operations. The functional resources are allocated for executing the instruction blocks based, at least in part, on a performance metric. The performance metric can be generated dynamically or statically based on branch prediction accuracy, energy usage tolerance, and other suitable metrics. | 12-29-2016 |
20220137974 | BRANCH DENSITY DETECTION FOR PREFETCHER - In one embodiment, a microprocessor, comprising: first logic configured to dynamically adjust a maximum prefetch count based on a total count of predicted taken branches over a predetermined quantity of cache lines; and second logic configured to prefetch instructions based on the adjusted maximum prefetch count. | 05-05-2022 |
712238000 | Branch target buffer | 43 |
20080201564 | DATA PROCESSOR - An object of the present invention is to achieve fast data processing. A unit (FF) is included for selecting whether a central processing unit (CPU) performs instruction reading in units of 16 bits (a first word length) or in units of 32 bits (a second word length). Depending on whether instruction reading is performed in units of 16 bits or 32 bits, increment values (+2 and +4) by which a program counter (PC) is incremented are switched. Data reading or writing is performed in units of a given data length irrespective of the selecting unit. When the CPU issues a request for instruction reading in units of 16 bits or 32 bits or for data reading or writing, a bus control unit performs reading or writing a predetermined number of times according to a bus width designated for a resource located at an address specified in the request. The bus control unit causes the CPU to wait until an instruction of 16 or 32 bits long (read data) requested by the CPU gets ready. | 08-21-2008 |
20080288760 | BRANCH TARGET PREDICTION FOR MULTI-TARGET BRANCHES BY IDENTIFYING A REPEATED PATTERN - An information processing system for branch target prediction includes: a first memory for storing entries for multi-target branch, wherein each entry includes a plurality of target addresses representing a history of target addresses for each single branch in the multi-target branch, and wherein said first memory stores an entry for the branch only if the branch is a multi-target branch; hardware logic for reading the memory and identifying a repeated pattern in each of the plurality of target addresses for the multi-target branch; logic for predicting a next target address for the multi-target branch based on the repeated pattern that was identified, using a pattern matching algorithm; and a second memory for storing information regarding whether a branch is a multi-target branch; wherein the logic for reading and the logic for predicting are executed only if the branch is the multi-target branch. | 11-20-2008 |
20090037708 | TARGET BRANCH PREDICTION USING CORRELATION OF LOCAL TARGET HISTORIES - A system for predicting multiple targets for a single branch includes: a branch target buffer that includes a previous next address for an instruction and that receives an indirect instruction address to provide a first branch target prediction; a first branch table for capturing local past target information of an indirect branch in an encoded form; a second branch table which is a correlation table for storing potential branch targets based on a local branch history and which provides a second branch target prediction when the first branch target prediction is not successful; an exclusion predictor for inhibiting updates of inefficient entries; and a multiplexer to select the predicted target as output. | 02-05-2009 |
20090106540 | APPARATUS AND METHOD FOR REMANIPULATING INSTRUCTIONS - An apparatus for modifying instructions of a machine readable program according to remanipulation rules includes a remanipulation unit, which is configured to identify a manipulated instruction and to remanipulate the manipulated instruction according to the remanipulation rules. The apparatus further includes a processor unit configured to process a predetermined instruction set, wherein the predetermined instruction set includes manipulated instructions and remanipulated instructions. | 04-23-2009 |
20090119493 | Using Branch Instruction Counts to Facilitate Replay of Virtual Machine Instruction Execution - A method and computer program product for logging non-deterministic events of a virtual machine executing a sequence of guest instructions, the method including tracking an execution point in the sequence of executing guest instructions, the tracking of the execution point including determining a branch count of executed branch instructions; and detecting an occurrence of a non-deterministic event directed to the virtual machine during execution of the sequence of guest instructions, and recording information which includes an identifier of a current execution point, wherein the identifier includes the branch count. | 05-07-2009 |
20090177875 | BRANCH TARGET BUFFER ADDRESSING IN A DATA PROCESSOR - A branch target buffer (BTB) receives, from a processor, a current fetch group address which corresponds to a current fetch group including a plurality of instructions. In response to the current fetch group address resulting in a group hit in the BTB, the BTB provides to the processor a branch target address corresponding to a branch instruction within the current fetch group which is indicated by a control field as valid and predicted taken. The BTB generates the branch target address using an unshared lower order target portion, corresponding to the branch instruction and located within the entry of the BTB which caused the group hit, and one of a shared higher order target portion located within the entry of the BTB which caused the group hit or a higher order portion of the current fetch group address based on a value of the control field. | 07-09-2009 |
20090198981 | DATA PROCESSING SYSTEM, PROCESSOR AND METHOD OF DATA PROCESSING HAVING BRANCH TARGET ADDRESS CACHE STORING DIRECT PREDICTIONS - In at least one embodiment, a processor includes at least one execution unit and instruction sequencing logic that fetches instructions for execution by the execution unit. The instruction sequencing logic includes branch logic that outputs predicted branch target addresses for use as instruction fetch addresses. The branch logic includes a branch target address cache (BTAC) having at least one direct entry providing storage for a direct branch target address prediction associating a first instruction fetch address with a branch target address to be used as a second instruction fetch address immediately after the first instruction fetch address and at least one indirect entry providing storage for an indirect branch target address prediction associating a third instruction fetch address with a branch target address to be used as a fourth instruction fetch address subsequent to both the third instruction fetch address and an intervening fifth instruction fetch address. | 08-06-2009 |
20090222648 | SELECTIVE POSTPONEMENT OF BRANCH TARGET BUFFER (BTB) ALLOCATION - A system and method provides branch target buffer (BTB) allocation. When a branch instruction is received, a branch target address that corresponds to the branch instruction is determined. A determination is made whether the branch target address is presently stored in a branch target buffer (BTB). When the branch target address is not presently stored in the branch target buffer, an entry in the branch target buffer is identified to receive the branch target address. A value in a field within the identified entry in the branch target buffer, such as a postponement flag (PF), is used to selectively override a replacement decision defined by predetermined branch target buffer allocation criteria. In one form, if a branch is taken, the identified entry is replaced with the branch target address in response to determining that the value in the field within the identified entry has a predetermined value. | 09-03-2009 |
20090249048 | BRANCH TARGET BUFFER ADDRESSING IN A DATA PROCESSOR - A data processing system includes a branch target buffer (BTB) including a plurality of entries, each entry comprising a tag portion and a long branch indicator. The system also includes segment target address storage circuitry which stores a plurality of segment target addresses, index storage circuitry which stores a plurality of indices for indexing into the segment target address storage circuitry, and control circuitry which receives an instruction address and determines whether the instruction address matches a valid entry in the BTB. When the instruction address matches a valid entry in the BTB and the long branch indicator of the valid entry indicates a long branch, the index storage circuitry provides a selected index of the plurality of indices selected by the received instruction address. In response to the selected index, the segment target address storage circuitry provides a selected segment target address as a higher order target address portion. | 10-01-2009 |
20090249049 | PRECISE BRANCH COUNTING IN VIRTUALIZATION SYSTEMS - A method for precisely counting guest branch instructions in a virtualized computer system is described. In one embodiment, guest instructions execute in a direct execution mode of the virtualized computer system. The direct execution mode operates at a first privilege level having a lower privilege than a second privilege level. A branch count of previously executed first privilege level branch instructions is maintained as instructions execute. Execution of a first privilege level branch instruction caused by a control transfer to the direct execution mode is detected. Responsive to the detection, a guest branch instruction count is determined based on the first privilege level branch count. | 10-01-2009 |
20100017586 | FETCHING ALL OR PORTION OF INSTRUCTIONS IN MEMORY LINE UP TO BRANCH INSTRUCTION BASED ON BRANCH PREDICTION AND SIZE INDICATOR STORED IN BRANCH TARGET BUFFER INDEXED BY FETCH ADDRESS - The invention provides a method and apparatus for branch prediction in a processor. A fetch-block branch target buffer is used in an early stage of pipeline processing before the instruction is decoded, which stores information about a control transfer instruction for a “block” of instruction memory. The block of instruction memory is represented by a block entry in the fetch-block branch target buffer. The block entry represents one recorded control-transfer instruction (such as a branch instruction) and a set of sequentially preceding instructions, up to a fixed maximum length N. Indexing into the fetch-block branch target buffer yields an answer whether the block entry represents memory that contains a previously executed control-transfer instruction, a length value representing the amount of memory that contains the instructions represented by the block, and an indicator for the type of control-transfer instruction that terminates the block, its target and outcome. Both the decode and execution pipelines include correction capabilities for modifying the block branch target buffer dependent on the results of the instruction decode and execution and can include a mechanism to correct malformed instructions. | 01-21-2010 |
20100031010 | BRANCH TARGET BUFFER ALLOCATION - A data processing system and method are provided for allocating an entry in a branch target buffer (BTB). The method comprises: receiving a branch instruction to be executed in a data processor; determining that the BTB does not include an entry corresponding to the branch instruction; identifying an entry in the BTB for allocation, the identified entry in the BTB comprising a target identifier and a first prediction value for a previously received branch instruction; determining whether to allocate the branch instruction to the identified entry in the BTB based on a comparison of the first prediction value to a second prediction value, wherein the second prediction value is generated from a branch history table (BHT); and allocating the branch instruction to the identified entry if the second prediction value indicates a more strongly taken prediction than the first prediction value. | 02-04-2010 |
20100058038 | Branch Target Buffer System And Method For Storing Target Address - A branch target buffer (BTB) system and method for storing a target address is provided, applicable to a 16-bit, 32-bit, 64-bit or higher processor architecture. When storing the target address of the branch instruction, the BTB stores the variation range, carry bit and sub/add bit of the target address without having to store all the bits of the target address. Because the BTB of the present invention does not need to store the identical part of the branch instruction address and the target address, the present invention reduces the number of bits of the target address field for the BTB of the processor. Although the present invention uses fewer bits for the target address field, the present invention is able to generate a complete target address without affecting the computation performance. | 03-04-2010 |
20100191943 | COORDINATION BETWEEN A BRANCH-TARGET-BUFFER CIRCUIT AND AN INSTRUCTION CACHE - A digital signal processor (DSP) having (i) a processing pipeline for processing instructions received from an instruction cache (I-cache) and (ii) a branch-target-buffer (BTB) circuit for predicting branch-target instructions corresponding to received branch instructions. The DSP reduces the number of I-cache misses by coordinating its BTB and instruction pre-fetch functionalities. The coordination is achieved by tying together an update of branch-instruction information in the BTB circuit and a pre-fetch request directed at a branch-target instruction implicated in the update. In particular, if an update of the branch-instruction information is being performed, then, before the branch instruction implicated in the update reenters the processing pipeline, the DSP initiates a pre-fetch of the corresponding branch-target instruction. | 07-29-2010 |
20100228957 | Systems and Methods for Branch Prediction Override During Process Execution - Various embodiments of the present invention provide systems and methods for branch prediction. As an example, some embodiments of the present invention provide processor circuits that include a program address circuit, a branch target buffer, a branch prediction replacement circuit, and an execution pipeline. The branch target buffer includes a plurality of entries each associated with a respective change of flow instruction. Each entry includes an indication of an entry source and a next program address corresponding to the respective change of flow instruction. The branch prediction replacement circuit is operable to determine replacement priorities of the plurality of entries based at least in part on the entry source for each of the plurality of entries. The execution pipeline receives an executable instruction corresponding to one of the next program addresses. | 09-09-2010 |
20110055529 | EFFICIENT BRANCH TARGET ADDRESS CACHE ENTRY REPLACEMENT - A microprocessor includes a branch target address cache (BTAC), each entry thereof configured to store branch prediction information for at most N branch instructions. An execution unit executes a branch instruction previously fetched in a fetch quantum. Update logic determines whether the BTAC is already storing information for N branch instructions within the fetch quantum (N is at least two), updates the BTAC for the branch instruction if the BTAC is not already storing information for N branch instructions, determines whether a type of the branch instruction has a higher replacement priority than a type of the N branch instructions if the BTAC is already storing information for N branch instructions, and updates the BTAC for the branch instruction if the type of the branch instruction has a higher replacement priority than the type of the N branch instructions already stored in the BTAC. | 03-03-2011 |
20110107071 | SYSTEM AND METHOD FOR USING A BRANCH MIS-PREDICTION BUFFER - A system and method is provided for executing a conditional branch instruction. The system and method may include a branch predictor to predict one or more instructions that depend on the conditional branch instruction and a branch mis-prediction buffer to store correct instructions that were not predicted by the branch predictor during a branch mis-prediction. | 05-05-2011 |
20110113223 | BRANCH TARGET BUFFER FOR EMULATION ENVIRONMENTS - Branch instructions are managed in an emulation environment that is executing a program. A plurality of entries is populated in a branch target buffer that resides within an emulated environment in which the program is executing. Each of the entries comprises an instruction address and a target address of a branch instruction of the program. When an indirect branch instruction of the program is encountered a processor analyzes one of the entries in the branch target buffer to determine if the instruction address of the one entry is associated with a target address of the indirect branch instruction. If the instruction address of the one entry is associated with the target address of the indirect branch instruction a branch to the target address of the one entry is performed. | 05-12-2011 |
20110320789 | Method and Apparatus for High Performance Cache Translation Look-Aside Buffer TLB Lookups Using Multiple Page Size Prediction - A computer processing system method and apparatus having a processor employing an operating system (O/S) multi-task control between multiple user programs and which ensures that the programs do not interfere with each other, said computing processing system having a branch multiple page size prediction mechanism which predicts a page size along with a branch direction and a branch target of a branch for instructions of a processing pipeline, having a branch target buffer (BTB) predicting the branch target, said branch prediction mechanism storing recently used instructions close to the processor in a local cache, and having a translation look-aside buffer TLB mechanism which tracks the translation of the most recent pages and supports multiple page sizes. | 12-29-2011 |
20120042155 | METHODS AND APPARATUS FOR PROACTIVE BRANCH TARGET ADDRESS CACHE MANAGEMENT - A multiple stage branch prediction system includes a branch target address cache (BTAC) and a branch predictor circuit. The BTAC is configured to store a BTAC entry. The branch predictor circuit is configured to store state information. The branch predictor circuit utilizes the state information to predict the direction of a branch instruction and to manage the BTAC entry based on modified state information prior to resolution of the branch instruction. | 02-16-2012 |
20120151193 | OPTIMIZED BUFFER PLACEMENT BASED ON TIMING AND CAPACITANCE ASSERTIONS - A method is provided for optimized buffer placement based on timing and capacitance assertions in a functional chip unit including a single source and multiple macros, each having a sink. Placement of the source and macros with the sinks is pre-designed and buffers are placed in branches connecting the source with the multiple sinks. The method includes: calculating an estimated slack for each branch based on cycle reach, calculating a minimum slack for each branch, arranging branches according to the calculated slack to evaluate at least one most critical branch, inserting decoupling buffers in all branches except the most critical branch(es) and placing decoupling buffers close to the source, globally routing the most critical branch(es) and fixing slew conditions within this branch, globally routing at least one subsequent branch as arranged according to the calculated slack and fixing slew conditions within this branch(es), and routing all remaining branches. | 06-14-2012 |
20130332712 | BRANCH PREDICTION TABLE INSTALL SOURCE TRACKING - Embodiments relate to branch prediction table install source tracking. An aspect includes a system for branch prediction table install source tracking. The system includes memory configured to store instructions accessible by a processor. The processor includes a branch target buffer, where the processor is configured to perform a method. The method includes receiving at the branch target buffer a request to install a branch target buffer entry corresponding to a branch instruction for branch prediction, and identifying a source of the request as an install source of the branch target buffer entry. The method further includes storing an install source identifier in the branch target buffer based on the install source. | 12-12-2013 |
20130339691 | BRANCH PREDICTION PRELOADING - Embodiments relate to branch prediction preloading. An aspect includes a system for branch prediction preloading. The system includes an instruction cache and branch target buffer (BTB) coupled to a processing circuit, the processing circuit configured to perform a method. The method includes fetching a plurality of instructions in an instruction stream from the instruction cache, and decoding a branch prediction preload instruction in the instruction stream. An address of a predicted branch instruction is determined based on the branch prediction preload instruction. A predicted target address is determined based on the branch prediction preload instruction. A mask field is identified in the branch prediction preload instruction, and a branch instruction length is determined based on the mask field. Based on executing the branch prediction preload instruction, the BTB is preloaded with the address of the predicted branch instruction, the branch instruction length, the branch type, and the predicted target address. | 12-19-2013 |
20130339692 | MITIGATING INSTRUCTION PREDICTION LATENCY WITH INDEPENDENTLY FILTERED PRESENCE PREDICTORS - Embodiments of the disclosure include mitigating instruction prediction latency with independently filtered instruction prediction presence predictors coupled to the processor pipeline. The prediction presence predictor includes a plurality of presence predictors configured to each receive an instruction address in parallel and to generate an unfiltered indication of an associated instruction prediction. The prediction presence predictor includes a plurality of dynamic filters that are each coupled to one of the plurality of presence predictors. Each dynamic filter is configured to block the unfiltered indications based on a performance of the presence predictor it is coupled to. The prediction presence predictor further includes stall determination logic coupled to the plurality of dynamic filters. The stall determination logic is configured to generate a combined indication that will stall instruction delivery, allowing potentially latent instruction predictions to be accounted for, based upon one or more non-blocked indications received from the plurality of dynamic filters. | 12-19-2013 |
20130339693 | SECOND-LEVEL BRANCH TARGET BUFFER BULK TRANSFER FILTERING - Embodiments relate to second-level branch target buffer bulk transfer filtering. An aspect includes a system for second-level branch target buffer bulk transfer filtering. The system includes a first-level branch target buffer and a second-level branch target buffer coupled to a processing circuit. The processing circuit is configured to perform a method. The method includes receiving branch target buffer miss indicators, receiving instruction cache miss indicators, and recording information about the branch target buffer miss indicators and the instruction cache miss indicators in search trackers. Based on detecting, by the processing circuit, a search tracker representing a correlated pair of the branch target buffer miss indicators and the instruction cache miss indicators, the search tracker is activated by the processing circuit to perform a bulk transfer from the second-level branch target buffer to the first-level branch target buffer. | 12-19-2013 |
20130339694 | SEMI-EXCLUSIVE SECOND-LEVEL BRANCH TARGET BUFFER - Embodiments relate to a semi-exclusive second-level branch target buffer. An aspect includes a system for a semi-exclusive second-level branch target buffer. The system includes a first-level branch target buffer (BTB1), a branch target buffer preload table (BTBP), and a second-level branch target buffer (BTB2) coupled to a processing circuit. The processing circuit is configured to perform a method. The method includes performing a search to locate entries in the BTB2 having a memory region corresponding to a search request. Based on locating entries in the BTB2, a bulk transfer of located entries is performed from the BTB2 to the BTBP. A state associated with the located entries is updated to encourage exclusivity between the BTB1 and the BTB2. Based on transferring a BTBP entry from the BTBP to the BTB1, a BTB1 entry is evicted from the BTB1. The evicted BTB1 entry is transferred from the BTB1 to the BTB2. | 12-19-2013 |
20140059331 | BRANCH TARGET BUFFER FOR EMULATION ENVIRONMENTS - Branch instructions are managed in an emulation environment that is executing a program. A plurality of entries is populated in a branch target buffer that resides within an emulated environment in which the program is executing. Each of the entries comprises an instruction address and a target address of a branch instruction of the program. When an indirect branch instruction of the program is encountered a processor analyzes one of the entries in the branch target buffer to determine if the instruction address of the one entry is associated with a target address of the indirect branch instruction. If the instruction address of the one entry is associated with the target address of the indirect branch instruction a branch to the target address of the one entry is performed. | 02-27-2014 |
20140059332 | BRANCH TARGET BUFFER FOR EMULATION ENVIRONMENTS - Branch instructions are managed in an emulation environment that is executing a program. A plurality of slots in a Polymorphic Inline Cache is populated. A plurality of entries is populated in a branch target buffer residing within an emulated environment in which the program is executing. When an indirect branch instruction associated with the program is encountered, a target address associated with the instruction is identified from the indirect branch instruction. At least one address in each of the slots of the Polymorphic Inline Cache is compared to the target address associated with the indirect branch instruction. If none of the addresses in the slots of the Polymorphic Inline Cache matches the target address associated with the indirect branch instruction, the branch target buffer is searched to identify one of the entries in the branch target buffer that is associated with the target address of the indirect branch instruction. | 02-27-2014 |
20140082336 | TARGET BUFFER ADDRESS REGION TRACKING - Embodiments relate to target buffer address region tracking. An aspect includes receiving a restart address, and comparing, by a processing circuit, the restart address to a first stored address and to a second stored address. The processing circuit determines which of the first and second stored addresses is identified as a same range and a different range to form a predicted target address range defining an address region associated with an entry in the target buffer. Based on determining that the restart address matches the first stored address, the first stored address is identified as the same range and the second stored address is identified as the different range. Based on determining that the restart address matches the second stored address, the first stored address is identified as the different range and the second stored address is identified as the same range. | 03-20-2014 |
20140082337 | BRANCH TARGET BUFFER PRELOAD TABLE - Embodiments relate to using a branch target buffer preload table. An aspect includes receiving a search request to locate branch prediction information associated with a branch instruction. Searching is performed for an entry corresponding to the search request in a branch target buffer and a branch target buffer preload table in parallel. Based on locating a matching entry in the branch target buffer preload table corresponding to the search request and failing to locate the matching entry in the branch target buffer, a victim entry is selected to overwrite in the branch target buffer. Branch prediction information of the matching entry is received from the branch target buffer preload table at the branch target buffer. The victim entry in the branch target buffer is overwritten with the branch prediction information of the matching entry. | 03-20-2014 |
20140181486 | BRANCH PREDICTION TABLE INSTALL SOURCE TRACKING - Embodiments relate to branch prediction table install source tracking. An aspect includes a computer-implemented method for branch prediction table install source tracking. The method includes receiving at a branch target buffer a request to install a branch target buffer entry corresponding to a branch instruction for branch prediction. The method further includes identifying, by a computer, a source of the request as an install source of the branch target buffer entry. The method also includes storing, by the computer, an install source identifier in the branch target buffer based on the install source. | 06-26-2014 |
20140281438 | METHOD FOR A DELAYED BRANCH IMPLEMENTATION BY USING A FRONT END TRACK TABLE - A method for a delayed branch implementation by using a front end track table. The method includes receiving an incoming instruction sequence using a global front end, wherein the instruction sequence includes at least one branch, creating a delayed branch in response to receiving the one branch, and using a front end track table to track both the delayed branch and the one branch. | 09-18-2014 |
20150019848 | ASYNCHRONOUS LOOKAHEAD HIERARCHICAL BRANCH PREDICTION - Embodiments relate to asynchronous lookahead hierarchical branch prediction. An aspect includes a computer-implemented method for asynchronous lookahead hierarchical branch prediction using a second-level branch target buffer. The method includes receiving a search request to locate branch prediction information associated with a search address. The method further includes searching, by a processing circuit, for an entry corresponding to the search request in a first-level branch target buffer. The method also includes, based on failing to locate a matching entry in the first-level branch target buffer corresponding to the search request, initiating, by the processing circuit, a secondary search to locate entries in the second-level branch target buffer having a memory region corresponding to the search request. The method additionally includes, based on locating the entries in the second-level branch target buffer, performing a bulk transfer of the entries from the second-level branch target buffer. | 01-15-2015 |
20150019849 | SEMI-EXCLUSIVE SECOND-LEVEL BRANCH TARGET BUFFER - Embodiments relate to a semi-exclusive second-level branch target buffer. An aspect includes a computer-implemented method for a semi-exclusive second-level branch target buffer. The method includes performing a search to locate entries in a BTB | 01-15-2015 |
20150039870 | SYSTEMS AND METHODS FOR LOCKING BRANCH TARGET BUFFER ENTRIES - A data processing system includes a processor configured to execute processor instructions and a branch target buffer having a plurality of entries. Each entry is configured to store a branch target address and a lock indicator, wherein the lock indicator indicates whether the entry is a candidate for replacement, and wherein the processor is configured to access the branch target buffer during execution of the processor instructions. The data processing system further includes control circuitry configured to determine a fullness level of the branch target buffer, wherein in response to the fullness level reaching a fullness threshold, the control circuitry is configured to assert the lock indicator of one or more of the plurality of entries to indicate that the one or more of the plurality of entries is not a candidate for replacement. | 02-05-2015 |
20150121050 | BANDWIDTH INCREASE IN BRANCH PREDICTION UNIT AND LEVEL 1 INSTRUCTION CACHE - A processor, a device, and a non-transitory computer readable medium for performing branch prediction in a processor are presented. The processor includes a front end unit. The front end unit includes a level 1 branch target buffer (BTB), a BTB index predictor (BIP), and a level 1 hash perceptron (HP). The BTB is configured to predict a target address. The BIP is configured to generate a prediction based on a program counter and a global history, wherein the prediction includes a speculative partial target address, a global history value, a global history shift value, and a way prediction. The HP is configured to predict whether a branch instruction is taken or not taken. | 04-30-2015 |
20150301829 | SYSTEMS AND METHODS FOR MANAGING BRANCH TARGET BUFFERS IN A MULTI-THREADED DATA PROCESSING SYSTEM - A data processing system includes a processor configured to execute processor instructions of a first thread and processor instructions of a second thread, a first branch target buffer (BTB) corresponding to the first thread, a second BTB corresponding to the second thread, storage circuitry configured to store a borrow enable indicator corresponding to the first thread which indicates whether borrowing is enabled for the first thread, and control circuitry configured to allocate an entry for a branch instruction executed within the first thread in the first branch target buffer but not the second branch target buffer if borrowing is not enabled by the borrow enable indicator and in the first branch target buffer or the second branch target buffer if borrowing is enabled by the borrow enable indicator and the second thread is not enabled. | 10-22-2015 |
20150339124 | SYSTEM AND METHOD FOR SELECTIVELY ALLOCATING ENTRIES AT A BRANCH TARGET BUFFER - A branch instruction and a corresponding branch instruction address are received at a data processing system. A first value is received and is compared to a portion of the branch instruction address. An entry at a branch target buffer corresponding to the branch instruction is selectively allocated based on a result of the comparing. | 11-26-2015 |
20150339125 | BRANCH PROCESSING METHOD AND SYSTEM - A method for branch processing is provided. The method includes determining an instruction type of an instruction written into a cache memory and recording the instruction type. The method also includes calculating a branch target instruction address of the branch instruction and recording target address information corresponding to the branch target instruction address when the instruction is a branch instruction, where the target address information corresponds to one instruction segment containing at least the branch target instruction. Further, the method includes filling the instruction segment containing at least the branch target instruction into the position corresponding to the target address information in the cache memory based on the branch target instruction address when the branch target instruction is not stored in the cache memory, such that before a CPU core executes the branch instruction, a next instruction following the branch instruction in a program sequence and the branch target instruction of the branch instruction are stored in the cache memory. | 11-26-2015 |
20150347147 | ABSOLUTE ADDRESS BRANCHING IN A FIXED-WIDTH REDUCED INSTRUCTION SET COMPUTING ARCHITECTURE - Embodiments relate to a method and computer program product for absolute address branching in a reduced instruction set computing (RISC) architecture. One aspect is a method that includes fetching a branch instruction from an instruction stream having a fixed instruction width. A branch target address value is acquired from the instruction stream. The branch target address value represents a target address of the branch instruction. The branch target address value is formatted as an absolute address and sized as a multiple of the fixed instruction width. The branch target address value is loaded into a program counter based on the branch instruction. Execution of the instruction stream is redirected to a next instruction based on the branch target address value in the program counter. | 12-03-2015 |
20150347148 | RELATIVE OFFSET BRANCHING IN A FIXED-WIDTH REDUCED INSTRUCTION SET COMPUTING ARCHITECTURE - Embodiments relate to a method and computer program product for relative offset branching in a reduced instruction set computing (RISC) architecture. One aspect is a method that includes fetching a branch instruction from an instruction stream having a fixed instruction width. A relative offset value is acquired from the instruction stream. The relative offset value is formatted as an offset relative to a program counter value and sized as a multiple of the fixed instruction width. The relative offset value is added to the program counter value to form a branch target address value. The branch target address value is loaded into a program counter based on the branch instruction. Execution of the instruction stream is redirected to a next instruction based on the branch target address value in the program counter. | 12-03-2015 |
20160092229 | SYSTEMS AND METHODS FOR MANAGING RETURN STACKS IN A MULTI-THREADED DATA PROCESSING SYSTEM - A processor is configured to execute instructions of a first thread and a second thread. A first return stack corresponds to the first thread, and a second return stack to the second thread. Control circuitry pushes a return address to the first return stack in response to a branch to subroutine instruction in the first thread. If the first return stack is full and borrowing is not enabled by a borrow enable indicator, the control circuitry removes an oldest return address from the first return stack and does not store the removed oldest return address in the second return stack. If the first return stack is full and borrowing is enabled by the borrow enable indicator and the second thread is not enabled, the control circuitry removes the oldest return address from the first return stack and pushes the removed oldest return address onto the second return stack. | 03-31-2016 |
20160110199 | DEVICE AND METHOD FOR PROCESSING COUNTER DATA - Provided are a device and a method of processing counter data, the method including receiving pieces of state data and counter data which is a transition condition between the pieces of state data, the pieces of state data and the counter data being expressed as a state machine, determining whether or not the counter data is generated based on the state machine, storing state data, which is transitioned from the counter data in response to the generation of the counter data, and outputting the stored state data. | 04-21-2016 |