Patent application number | Description | Published |
20080313431 | Method and System for Altering Processor Execution of a Group of Instructions - An embodiment of the invention is a processor for detecting one or more groups of instructions and initiating a processor action upon detecting one or more groups of instructions. The processor includes an instruction unit for fetching and decoding a group of instructions. An instruction register receives the group of instructions having at least one instruction opcode. A control register includes a control word including a control opcode and an action field defining a processor action. An execution unit includes compare logic for comparing the instruction opcode and the control opcode. The execution unit initiates the processor action upon the compare logic detecting a hit between the instruction opcode and the control opcode. | 12-18-2008 |
20090198980 | FACILITATING PROCESSING IN A COMPUTING ENVIRONMENT USING AN EXTENDED DRAIN INSTRUCTION - An extended DRAIN instruction is used to stall processing within a computing environment. The instruction includes an indication of the one or more processing stages at which processing is to be stalled. It also includes a control that allows processing to be stalled for additional cycles, as desired. | 08-06-2009 |
20090204794 | METHODS, COMPUTER PROGRAM PRODUCTS, AND SYSTEMS FOR UNIFYING PROGRAM EVENT RECORDING FOR BRANCHES AND STORES IN THE SAME DATAFLOW - The present invention relates to a method for the unification of PER branch and PER store operations within the same dataflow. The method comprises determining a PER range, the PER range comprising a storage area defined by a designated storage starting area and a designated storage ending area, wherein the storage starting area is designated by a value of the contents of a first control register and the storage ending area is designated by a value of the contents of a second control register. The method also comprises retrieving register field content values that are stored at a plurality of registers, wherein the retrieved content values comprise a length field content value, and setting the length field content value to zero for a PER branch instruction, thereby enabling a PER branch instruction to be performed similarly to a PER storage instruction. | 08-13-2009 |
20090210656 | METHOD AND SYSTEM FOR OVERLAPPING EXECUTION OF INSTRUCTIONS THROUGH NON-UNIFORM EXECUTION PIPELINES IN AN IN-ORDER PROCESSOR - A system and method for overlapping execution (OE) of instructions through non-uniform execution pipelines in an in-order processor are provided. The system includes a first execution unit to perform instruction execution in a first execution pipeline. The system also includes a second execution unit to perform instruction execution in a second execution pipeline, where the second execution pipeline includes a greater number of stages than the first execution pipeline. The system further includes an instruction dispatch unit (IDU), the IDU including OE registers and logic for dispatching an OE-capable instruction to the first execution unit such that the instruction completes execution prior to completion of an instruction previously dispatched to the second execution unit. The system additionally includes a latch to hold a result of the execution of the OE-capable instruction until after the second execution unit completes the execution of the previously dispatched instruction. | 08-20-2009 |
20090210675 | METHOD AND SYSTEM FOR EARLY INSTRUCTION TEXT BASED OPERAND STORE COMPARE REJECT AVOIDANCE - A method and system for early instruction text based operand store compare avoidance in a processor are provided. The system includes a processor pipeline for processing instruction text in an instruction stream, where the instruction text includes operand address information. The system also includes delay logic to monitor the instruction stream. The delay logic performs a method that includes detecting a load instruction following a store instruction in the instruction stream and comparing the operand address information of the store instruction with that of the load instruction. The method also includes delaying the load instruction in the processor pipeline in response to detecting a common field value between the operand address information of the store instruction and the load instruction. | 08-20-2009 |
20090210775 | METHOD AND SYSTEM FOR INSTRUCTION ADDRESS PARITY COMPARISON - A method and system for instruction address parity comparison are provided. The method includes calculating an instruction address parity value for an instruction, and distributing the instruction address parity value to one or more functional units in processing circuitry. The method also includes receiving the distributed instruction address parity value from the one or more functional units, and calculating a completing instruction address (CIA) parity value associated with completing the instruction. The method further includes generating an error indicator in response to a mismatch between the received instruction address parity value and the CIA parity value. | 08-20-2009 |
20090217005 | METHOD, SYSTEM, AND COMPUTER PROGRAM PRODUCT FOR SELECTIVELY ACCELERATING EARLY INSTRUCTION PROCESSING - A method for selectively accelerating early instruction processing including receiving an instruction data that is normally processed in an execution stage of a processor pipeline, wherein a configuration of the instruction data allows a processing of the instruction data to be accelerated from the execution stage to an address generation stage that occurs earlier in the processor pipeline than the execution stage, determining whether the instruction data can be dispatched to the address generation stage to be processed without being delayed due to an unavailability of a processing resource needed for the processing of the instruction data in the address generation stage, dispatching the instruction data to be processed in the address generation stage if it can be dispatched without being delayed due to the unavailability of the processing resource, and dispatching the instruction data to be processed in the execution stage if it cannot be dispatched without being delayed due to the unavailability of the processing resource, wherein the processing of the instruction data is selectively accelerated using an address generation interlock scheme. A corresponding system and computer program product are also provided. | 08-27-2009 |
20090217009 | SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR TRANSLATING STORAGE ELEMENTS - A system, method and computer program product for translations in a computer system. The system includes a general purpose register containing a base address of an address translation table. The system also includes a millicode accessible special displacement register configured to receive a plurality of elements to be translated. The system further includes a multiplexer for selecting a particular one of the plurality of elements from the millicode accessible special displacement register and for generating a displacement or offset value. The system further includes an address generator for creating a combined address containing the base address from the general purpose register and the generated displacement or offset value. | 08-27-2009 |
20090217077 | METHOD, SYSTEM, AND COMPUTER PROGRAM PRODUCT FOR PROCESSOR ERROR CHECKING - A method for processor error checking including receiving an instruction data, generating a pre-processing parity data based on the instruction data, maintaining the pre-processing parity data, processing the instruction data, generating a post-processing parity data based on the processed instruction data, checking for an error related to processing the instruction data by comparing the post-processing parity data to the pre-processing parity data, and transmitting an error signal that indicates the error related to processing the instruction data occurred if the post-processing parity data does not match the pre-processing parity data, wherein checking for the error related to processing the instruction data is performed without using duplicate processing circuitry. | 08-27-2009 |
20090217135 | METHOD, SYSTEM, AND COMPUTER PROGRAM PRODUCT FOR ADDRESS GENERATION CHECKING - A method for address generation checking including receiving a starting memory address for a data, an ending memory address for the data, a length value of the data, and an address wrap indicator value that indicates if the data wraps from an end of a memory block to a start of the memory block, determining whether the ending memory address is equal to the sum of the starting memory address and the difference of the length value and the address wrap indicator value, and transmitting an error signal that indicates an error occurred in a generation of the starting memory address or the ending memory address if the ending memory address is not equal to the sum. | 08-27-2009 |
20090240914 | RECYCLING LONG MULTI-OPERAND INSTRUCTIONS - A pipelined microprocessor configured for long operand instructions is disclosed. The microprocessor includes a memory unit and a load-store unit. The load-store unit is coupled to the memory unit and includes a data formatter receiving information from the memory unit and including an operand selector and a shift register portion. The microprocessor also includes an execution unit coupled to the load-store unit and receiving operand information therefrom. The execution unit includes output latches coupled to a storage location within the execution unit for storing output information from the execution unit. | 09-24-2009 |
20090240918 | METHOD, COMPUTER PROGRAM PRODUCT, AND HARDWARE PRODUCT FOR ELIMINATING OR REDUCING OPERAND LINE CROSSING PENALTY - Eliminating or reducing an operand line crossing penalty by performing an initial fetch for an operand from a data cache of a processor. The initial fetch is performed by allowing or permitting the initial fetch to occur unaligned with reference to a quadword boundary. A plurality of subsequent fetches for a corresponding plurality of operands from the data cache are performed wherein each of the plurality of subsequent fetches is aligned to any of a plurality of quadword boundaries to prevent each of a plurality of individual fetch requests from spanning a plurality of lines in the data cache. A steady stream of data is maintained by placing an operand buffer at an output of the data cache to store and merge data from the initial fetch and the plurality of subsequent fetches, and to return the stored and merged data to the processor. | 09-24-2009 |
20090240919 | PROCESSOR AND METHOD FOR SYNCHRONOUS LOAD MULTIPLE FETCHING SEQUENCE AND PIPELINE STAGE RESULT TRACKING TO FACILITATE EARLY ADDRESS GENERATION INTERLOCK BYPASS - A pipelined processor including an architecture for address generation interlocking, the processor including: an instruction grouping unit to detect a read-after-write dependency and to resolve instruction interdependency; an instruction dispatch unit (IDU) including address generation interlock (AGI) and operand fetching logic for dispatching an instruction to at least one of a load store unit and an execution unit; wherein the load store unit is configured with access to a data cache and to return fetched data to the execution unit; wherein the execution unit is configured to write data into a general purpose register bank; and wherein the architecture provides support for bypassing of results of a load multiple instruction for address generation while such instruction is executing in the execution unit before the general purpose register bank is written. A method and a computer system are also provided. | 09-24-2009 |
20090240921 | METHOD, SYSTEM AND COMPUTER PROGRAM PRODUCT FOR SUPPORTING PARTIAL RECYCLE IN A PIPELINED MICROPROCESSOR - A computer processing system is provided. The computer processing system includes a first datastore that stores a subset of information associated with an instruction. A first stage of a processor pipeline writes the subset of information to the first datastore based on an execution of an operation associated with the instruction. A second stage of the pipeline initiates reprocessing of the operation associated with the instruction based on the subset of information stored in the first datastore. | 09-24-2009 |
20090240922 | METHOD, SYSTEM, COMPUTER PROGRAM PRODUCT, AND HARDWARE PRODUCT FOR IMPLEMENTING RESULT FORWARDING BETWEEN DIFFERENTLY SIZED OPERANDS IN A SUPERSCALAR PROCESSOR - Result and operand forwarding is provided between differently sized operands in a superscalar processor by grouping a first set of instructions for operand forwarding, and grouping a second set of instructions for result forwarding, the first set of instructions comprising a first source instruction having a first operand and a first dependent instruction having a second operand, the first dependent instruction depending from the first source instruction; the second set of instructions comprising a second source instruction having a third operand and a second dependent instruction having a fourth operand, the second dependent instruction depending from the second source instruction, performing operand forwarding by forwarding the first operand, either whole or in part, as it is being read to the first dependent instruction prior to execution; performing result forwarding by forwarding a result of the second source instruction, either whole or in part, to the second dependent instruction, after execution; wherein the operand forwarding is performed by executing the first source instruction together with the first dependent instruction; and wherein the result forwarding is performed by executing the second source instruction together with the second dependent instruction. | 09-24-2009 |
20090240929 | METHOD, SYSTEM AND COMPUTER PROGRAM PRODUCT FOR REDUCED OVERHEAD ADDRESS MODE CHANGE MANAGEMENT IN A PIPELINED, RECYCLING MICROPROCESSOR - A method, system, and computer program product for reduced overhead address mode change management in a pipelined, recycling microprocessor are provided. The recycling microprocessor includes logic executing thereon. The microprocessor also includes an instruction fetch unit (IFU) supporting computation of address adds in selected address modes and reporting non-equal comparison of the computation to the logic. The microprocessor further includes a fixed point unit determining whether the mode has changed and reporting changes to the logic. Upon determining the comparison yields an equal result but the mode has changed, a recycle event is triggered to ensure subsequent fetches are relaunched in the correct mode and that no execution writebacks occur from work performed in an incorrect mode. For comparisons yielding a non-equal result and a changed mode, the logic clears bits set in response to the determinations, and a serialization event is taken to reset a corresponding pipeline for operation in the correct mode. | 09-24-2009 |
20090241084 | METHOD, SYSTEM AND COMPUTER PROGRAM PRODUCT FOR EXPLOITING ORTHOGONAL CONTROL VECTORS IN TIMING DRIVEN SYSTEMS - Systems, methods and computer program products for exploiting orthogonal control vectors in timing driven systems. An exemplary embodiment includes running an initial logic synthesis run on the system, identifying critical inputs to a logic cone related to the run, identifying orthogonal vectors in the logic cone, adding vectors to the logic cone, obtaining logical solutions and selecting a solution from the logical solutions. | 09-24-2009 |
20110167244 | EARLY INSTRUCTION TEXT BASED OPERAND STORE COMPARE REJECT AVOIDANCE - A method and system for early instruction text based operand store compare avoidance in a processor are provided. The system includes a processor pipeline for processing instruction text in an instruction stream, where the instruction text includes operand address information. The system also includes delay logic to monitor the instruction stream. The delay logic performs a method that includes detecting a load instruction following a store instruction in the instruction stream and comparing the operand address information of the store instruction with that of the load instruction. The method also includes delaying the load instruction in the processor pipeline in response to detecting a common field value between the operand address information of the store instruction and the load instruction. | 07-07-2011 |
20110320782 | PROGRAM STATUS WORD DEPENDENCY HANDLING IN AN OUT OF ORDER MICROPROCESSOR DESIGN - A computer implemented method of processing instructions of a computer program. The method comprises providing at least two copies of program status data; identifying a first update instruction of the instructions that writes to at least one field of the program status data; and associating the first update instruction with a first copy of the at least two copies of program status data. | 12-29-2011 |
20120036338 | FACILITATING PROCESSING IN A COMPUTING ENVIRONMENT USING AN EXTENDED DRAIN INSTRUCTION - An extended DRAIN instruction is used to stall processing within a computing environment. The instruction includes an indication of the one or more processing stages at which processing is to be stalled. It also includes a control that allows processing to be stalled for additional cycles, as desired. | 02-09-2012 |
20130191599 | CACHE SET REPLACEMENT ORDER BASED ON TEMPORAL SET RECORDING - A technique is provided for cache management of a cache. The processing circuit determines a miss count and a hit position field during a previous execution of an instruction requesting that a data element be stored in a cache. The miss count and the hit position field are stored for a data element corresponding to an instruction that requests storage of the data element. The processing circuit places the data element in a hierarchical order based on the miss count and/or the hit position field. The hit position field includes a hierarchical position related to the data element in the cache. | 07-25-2013 |
20130191678 | IN SITU PROCESSOR RE-CHARACTERIZATION - A re-characterization process is provided that adjusts one or more operating parameters of a processor to improve the health (e.g., reduce errors) of the processor. The parameters include voltage and/or clock frequency, as examples. The processor can be an inactive or active processor for which the re-characterization process is performed. It is performed, in one instance, by a hardware controller in real-time. | 07-25-2013 |
20130191684 | HARDWARE RECOVERY IN MULTI-THREADED PROCESSOR - A computer system includes a simultaneous multi-threading processor and memory in operable communication with the processor. The processor is configured to perform a method including running multiple threads simultaneously, detecting a hardware error in one or more hardware structures of the processor, and identifying one or more victim threads of the multiple threads. The processor is further configured to identify a plurality of hardware structures associated with execution of the one or more victim threads, isolate the one or more victim threads from the rest of the multiple threads by preventing access to the plurality of hardware structures by the multiple threads, flush the one or more victim threads by resetting hardware states of the plurality of hardware structures, and restore the one or more victim threads by restoring the plurality of hardware structures to a known safe state. | 07-25-2013 |
20130191690 | IN SITU PROCESSOR RE-CHARACTERIZATION - A re-characterization process is provided that adjusts one or more operating parameters of a processor to improve the health (e.g., reduce errors) of the processor. The parameters include voltage and/or clock frequency, as examples. The processor can be an inactive or active processor for which the re-characterization process is performed. It is performed, in one instance, by a hardware controller in real-time. | 07-25-2013 |
20130191832 | MANAGEMENT OF THREADS WITHIN A COMPUTING ENVIRONMENT - Threads of a computing environment are managed to improve system performance. Threads are migrated between processors to take advantage of single thread processing mode, when possible. As an example, inactive threads are migrated from one or more processors, potentially freeing-up one or more processors to execute an active thread. Active threads are migrated from one processor to another to transform multiple threading mode processors to single thread mode processors. | 07-25-2013 |
20130191844 | MANAGEMENT OF THREADS WITHIN A COMPUTING ENVIRONMENT - Threads of a computing environment are managed to improve system performance. Threads are migrated between processors to take advantage of single thread processing mode, when possible. As an example, inactive threads are migrated from one or more processors, potentially freeing-up one or more processors to execute an active thread. Active threads are migrated from one processor to another to transform multiple threading mode processors to single thread mode processors. | 07-25-2013 |
20130198491 | MAJOR BRANCH INSTRUCTIONS WITH TRANSACTIONAL MEMORY - Major branch instructions are provided that enable execution of a computer program to branch from one segment of code to another segment of code. These instructions also create a new stream of processing at the other segment of code enabling execution of the other segment of code to be performed in parallel with the segment of code from which the branch was taken. In one example, the other stream of processing starts a transaction for processing instructions of the other stream of processing. | 08-01-2013 |
20130198492 | MAJOR BRANCH INSTRUCTIONS - Major branch instructions are provided that enable execution of a computer program to branch from one segment of code to another segment of code. These instructions also create a new stream of processing at the other segment of code enabling execution of the other segment of code to be performed in parallel with the segment of code from which the branch was taken. In one example, the other stream of processing starts a transaction for processing instructions of the other stream of processing. | 08-01-2013 |
20130198496 | MAJOR BRANCH INSTRUCTIONS - Major branch instructions are provided that enable execution of a computer program to branch from one segment of code to another segment of code. These instructions also create a new stream of processing at the other segment of code enabling execution of the other segment of code to be performed in parallel with the segment of code from which the branch was taken. In one example, the other stream of processing starts a transaction for processing instructions of the other stream of processing. | 08-01-2013 |
20130198497 | MAJOR BRANCH INSTRUCTIONS WITH TRANSACTIONAL MEMORY - Major branch instructions are provided that enable execution of a computer program to branch from one segment of code to another segment of code. These instructions also create a new stream of processing at the other segment of code enabling execution of the other segment of code to be performed in parallel with the segment of code from which the branch was taken. In one example, the other stream of processing starts a transaction for processing instructions of the other stream of processing. | 08-01-2013 |
20130275801 | RECONFIGURABLE RECOVERY MODES IN HIGH AVAILABILITY PROCESSORS - A computer program product for performing error recovery is configured to perform a method that includes creating, by a processor, a software recovery checkpoint. The processor is dynamically switched into a non-recoverable processing mode of operation based on creating the software recovery checkpoint. The non-recoverable processing mode of operation is a mode in which a subset of hardware error recovery resources are powered-down or re-purposed for instruction processing. It is determined, during the non-recoverable processing mode of operation, that a new software recovery checkpoint is required. Based on the determining that a new software recovery checkpoint is required, the processor is dynamically switched into a recoverable processing mode of operation. The recoverable processing mode of operation is a mode in which hardware error recovery resources, including at least one of the hardware error recovery resources in the subset, are purposed for hardware error recovery operations. | 10-17-2013 |
20130275806 | RECONFIGURABLE RECOVERY MODES IN HIGH AVAILABILITY PROCESSORS - A method for performing error recovery includes creating, by a processor, a software recovery checkpoint. The processor is dynamically switched into a non-recoverable processing mode of operation based on creating the software recovery checkpoint. The non-recoverable processing mode of operation is a mode in which a subset of hardware error recovery resources are powered-down or re-purposed for instruction processing. It is determined, during the non-recoverable processing mode of operation, that a new software recovery checkpoint is required. Based on the determining that a new software recovery checkpoint is required, the processor is dynamically switched into a recoverable processing mode of operation. The recoverable processing mode of operation is a mode in which hardware error recovery resources, including at least one of the hardware error recovery resources in the subset, are purposed for hardware error recovery operations. | 10-17-2013 |
20130290802 | VARIABLE ACKNOWLEDGE RATE TO REDUCE BUS CONTENTION IN PRESENCE OF COMMUNICATION ERRORS - A variable write back indicator control is provided to control the amount of data to be re-transmitted when a packet error occurs. A hardware controller obtains an indication that an acknowledge rate or an amount of set write back indicators of a data frame is to be adjusted. The indication is based on an error rate of data transmission over a communication bus. Based on obtaining the indication that the amount of set write back indicators is to be adjusted, one or more write back indicators are adjusted. | 10-31-2013 |
20130290803 | VARIABLE ACKNOWLEDGE RATE TO REDUCE BUS CONTENTION IN PRESENCE OF COMMUNICATION ERRORS - A variable write back indicator control is provided to control the amount of data to be re-transmitted when a packet error occurs. A hardware controller obtains an indication that an acknowledge rate or an amount of set write back indicators of a data frame is to be adjusted. The indication is based on an error rate of data transmission over a communication bus. Based on obtaining the indication that the amount of set write back indicators is to be adjusted, one or more write back indicators are adjusted. | 10-31-2013 |
20130332670 | PROCESS IDENTIFIER-BASED CACHE DATA TRANSFER - Embodiments of the invention relate to process identifier (PID) based cache information transfer. An aspect of the invention includes sending, by a first core of a processor, a PID associated with a cache miss in a first local cache of the first core to a second cache of the processor. Another aspect of the invention includes determining that the PID associated with the cache miss is listed in a PID table of the second cache. Yet another aspect of the invention includes based on the PID being listed in the PID table of the second cache, determining a plurality of entries in a cache directory of the second cache that are associated with the PID. Yet another aspect of the invention includes pushing cache information associated with each of the determined plurality of entries in the cache directory from the second cache to the first local cache. | 12-12-2013 |
20130332672 | PROCESS IDENTIFIER-BASED CACHE INFORMATION TRANSFER - Embodiments of the invention relate to process identifier (PID) based cache information transfer. An aspect of the invention includes sending, by a first core of a processor, a PID associated with a cache miss in a first local cache of the first core to a second cache of the processor. Another aspect of the invention includes determining that the PID associated with the cache miss is listed in a PID table of the second cache. Yet another aspect of the invention includes based on the PID being listed in the PID table of the second cache, determining a plurality of entries in a cache directory of the second cache that are associated with the PID. Yet another aspect of the invention includes pushing cache information associated with each of the determined plurality of entries in the cache directory from the second cache to the first local cache. | 12-12-2013 |
20130339666 | SPECIAL CASE REGISTER UPDATE WITHOUT EXECUTION - A method of changing a value associated with a logical address in a computing device. The method includes: receiving an instruction at an instruction decoder, the instruction including a target register expressed as a logical value; determining at the instruction decoder that a result of the instruction is to set the target register to a constant value, the target register being in a physical register file associated with an execution unit; and mapping, in a register mapper, the logical address to a location represented by a special register tag. | 12-19-2013 |
20130339667 | SPECIAL CASE REGISTER UPDATE WITHOUT EXECUTION - A method of changing a value associated with a logical address in a computing device. The method includes: receiving an instruction at an instruction decoder, the instruction including a target register expressed as a logical value; determining at the instruction decoder that a result of the instruction is to set the target register to a constant value, the target register being in a physical register file associated with an execution unit; and mapping, in a register mapper, the logical address to a location represented by a special register tag. | 12-19-2013 |
20130339688 | MANAGEMENT OF MULTIPLE NESTED TRANSACTIONS - Embodiments relate to implementing processor management of transactions. An aspect includes receiving an instruction from a thread. The instruction includes an instruction type, and executes within a transaction. The transaction effectively delays committing stores to memory until the transaction has completed. A processor manages transaction nesting for the instruction based on the instruction type of the instruction. The transaction nesting includes a maximum processor capacity. The transaction nesting management enables executing a sequence of nested transactions within a transaction, supports multiple nested transactions in a processor pipeline, or generates and maintains a set of effective controls for controlling a pipeline. The processor prevents the transaction nesting from exceeding the maximum processor capacity. | 12-19-2013 |
20130339959 | DYNAMIC MANAGEMENT OF A TRANSACTION RETRY INDICATION - Embodiments relate to dynamic management of a transaction retry indication. One aspect is a system that includes a transactional facility configured to support transactions that effectively delay committing stores to memory or results to an architectural state until transaction completion, and a processor configured to identify a transaction abort reason associated with an aborted transaction of an initiating program. Transaction success and transaction abort history are tracked. Based on determining by the processor that the transaction abort reason was caused by the initiating program, a retry indication is assigned based on a static mapping of the transaction abort reason. Based on determining by the processor that the transaction abort reason was not caused by the initiating program, the retry indication is assigned based on a retry process using the transaction abort reason, the transaction abort history, and a current processor configuration. | 12-19-2013 |
20130339975 | MANAGEMENT OF SHARED TRANSACTIONAL RESOURCES - Embodiments relate to management of shared transactional resources. A system includes a transactional facility configured to support transactions that effectively delay committing stores to memory or results to an architectural state until transaction completion. The system includes a processor configured to perform an allocation or arbitration of processing resources to instructions of a transaction within a thread. The processor detects that the transaction has exceeded a manageable capacity of a resource or that a potential collision of a transactional instruction storage access has occurred, resulting in a transaction abort. A transaction abort reason and a current configuration are examined to determine whether the transaction abort was based on an initiating program exceeding a restricted limit on the manageable capacity of the resource or an allocation. A processor state is updated to increase a likelihood of success upon retrying the transaction. | 12-19-2013 |
20140019803 | HARDWARE RECOVERY IN MULTI-THREADED PROCESSOR - A computer system includes a simultaneous multi-threading processor and memory in operable communication with the processor. The processor is configured to perform a method including running multiple threads simultaneously, detecting a hardware error in one or more hardware structures of the processor, and identifying one or more victim threads of the multiple threads. The processor is further configured to identify a plurality of hardware structures associated with execution of the one or more victim threads, isolate the one or more victim threads from the rest of the multiple threads by preventing access to the plurality of hardware structures by the multiple threads, flush the one or more victim threads by resetting hardware states of the plurality of hardware structures, and restore the one or more victim threads by restoring the plurality of hardware structures to a known safe state. | 01-16-2014 |
20140082625 | MANAGEMENT OF RESOURCES WITHIN A COMPUTING ENVIRONMENT - Resources in a computing environment are managed, for example, by a hardware controller controlling dispatching of resources from one or more pools of resources to be used in execution of threads. The controlling includes conditionally dispatching resources from the pool(s) to one or more low-priority threads of the computing environment based on current usage of resources in the pool(s) relative to an associated resource usage threshold. The management further includes monitoring resource dispatching from the pool(s) to one or more high-priority threads of the computing environment, and based on the monitoring, dynamically adjusting the resource usage threshold used in the conditionally dispatching of resources from the pool(s) to the low-priority thread(s). | 03-20-2014 |
20140082626 | MANAGEMENT OF RESOURCES WITHIN A COMPUTING ENVIRONMENT - Resources in a computing environment are managed, for example, by a hardware controller controlling dispatching of resources from one or more pools of resources to be used in execution of threads. The controlling includes conditionally dispatching resources from the pool(s) to one or more low-priority threads of the computing environment based on current usage of resources in the pool(s) relative to an associated resource usage threshold. The management further includes monitoring resource dispatching from the pool(s) to one or more high-priority threads of the computing environment, and based on the monitoring, dynamically adjusting the resource usage threshold used in the conditionally dispatching of resources from the pool(s) to the low-priority thread(s). | 03-20-2014 |
20140089732 | THREAD SPARING BETWEEN CORES IN A MULTI-THREADED PROCESSOR - Embodiments relate to thread sparing between cores in a processor. An aspect includes determining that a number of recovery attempts made by a first thread on the first core has exceeded a recovery attempt threshold, and sending a request to transfer the first thread. Another aspect includes, selecting a second core from a plurality of cores to receive the first thread from the first core, wherein the second core is selected based on the second core having an idle thread. Another aspect includes transferring a last good architected state of the first thread from the first core to the second core. Another aspect includes loading the last good architected state of the first thread by the idle thread on the second core. Yet another aspect includes resuming execution of the first thread on the second core from the last good architected state of the first thread by the idle thread. | 03-27-2014 |
20140089734 | THREAD SPARING BETWEEN CORES IN A MULTI-THREADED PROCESSOR - Embodiments relate to thread sparing between cores in a processor. An aspect includes determining that a number of recovery attempts made by a first thread on the first core has exceeded a recovery attempt threshold, and sending a request to transfer the first thread. Another aspect includes, selecting a second core from a plurality of cores to receive the first thread from the first core, wherein the second core is selected based on the second core having an idle thread. Another aspect includes transferring a last good architected state of the first thread from the first core to the second core. Another aspect includes loading the last good architected state of the first thread by the idle thread on the second core. Yet another aspect includes resuming execution of the first thread on the second core from the last good architected state of the first thread by the idle thread. | 03-27-2014 |
20140201501 | DYNAMIC ACCESSING OF EXECUTION ELEMENTS THROUGH MODIFICATION OF ISSUE RULES - Embodiments of the invention relate to dynamically routing instructions to execution units based on detected errors in the execution units. An aspect of the invention includes a computer system including a processor having an instruction issue unit and a plurality of execution units. The processor is configured to detect an error in a first execution unit among the plurality of execution units and adjust instruction dispatch rules of the instruction issue unit based on detecting the error in the first execution unit to restrict access to the first execution unit while leaving un-restricted access to the remaining execution units of the plurality of execution units. | 07-17-2014 |
20140201508 | CONFIDENCE THRESHOLD-BASED OPPOSING BRANCH PATH EXECUTION FOR BRANCH PREDICTION - Embodiments relate to confidence threshold-based opposing path execution for branch prediction. An aspect includes determining a branch prediction for a first branch instruction that is encountered during execution of a first thread, wherein the branch prediction indicates a primary path and an opposing path for the first branch instruction. Another aspect includes executing the primary path by the first thread. Another aspect includes determining a confidence of the branch prediction and comparing the confidence of the branch prediction to a confidence threshold. Yet another aspect includes, based on the confidence of the branch prediction being less than the confidence threshold, starting a second thread that executes the opposing path of the first branch instruction, wherein the second thread is executed in parallel with the first thread. | 07-17-2014 |
20140258629 | SPECIFIC PREFETCH ALGORITHM FOR A CHIP HAVING A PARENT CORE AND A SCOUT CORE - Embodiments relate to a method, system, and computer program product for prefetching data on a chip having at least one scout core and a parent core. The method includes saving a prefetch code start address by the parent core. The prefetch code start address indicates where a prefetch code is stored. The prefetch code is specifically configured for monitoring the parent core based on a specific application being executed by the parent core. The method includes sending a broadcast interrupt signal by the parent core to the at least one scout core. The broadcast interrupt signal is sent based on the prefetch code start address being saved. The method includes monitoring the parent core by the prefetch code executed by the at least one scout core. The scout core executes the prefetch code based on receiving the broadcast interrupt signal. | 09-11-2014 |
20140258630 | PREFETCHING FOR MULTIPLE PARENT CORES IN A MULTI-CORE CHIP - Embodiments relate to a method, system, and computer program product for prefetching data on a chip. The chip has at least one scout core, multiple parent cores that cooperate together to execute various tasks, and a shared cache that is common between the scout core and the multiple parent cores. An aspect of the embodiments includes monitoring the multiple parent cores by the at least one scout core through the shared cache for a shared cache access occurring in a base parent core. The method includes saving a fetch address by the at least one scout core based on the shared cache access occurring. The fetch address indicates a location of a specific line of cache requested by the base parent core. | 09-11-2014 |
20140258640 | PREFETCHING FOR A PARENT CORE IN A MULTI-CORE CHIP - Embodiments of the invention relate to prefetching data on a chip having at least one scout core, at least one parent core, and a shared cache that is common between the at least one scout core and the at least one parent core. A prefetch code is executed by the scout core for monitoring the parent core. The prefetch code executes independently from the parent core. The scout core determines that at least one specified data pattern has occurred in the parent core based on monitoring the parent core. A prefetch request is sent from the scout core to the shared cache. The prefetch request is sent based on the at least one specified pattern being detected by the scout core. A data set indicated by the prefetch request is sent to the parent core by the shared cache. | 09-11-2014 |
20140258681 | ANTICIPATED PREFETCHING FOR A PARENT CORE IN A MULTI-CORE CHIP - Embodiments relate to prefetching data on a chip having a scout core and a parent core coupled to the scout core. The method includes determining that a program executed by the parent core requires content stored in a location remote from the parent core. The method includes sending a fetch table address determined by the parent core to the scout core. The method includes accessing a fetch table that is indicated by the fetch table address by the scout core. The fetch table indicates how many pieces of content are to be fetched by the scout core and a location of the pieces of content. The method includes, based on the fetch table, fetching the pieces of content by the scout core. The method includes returning the fetched pieces of content to the parent core. | 09-11-2014 |
20150019819 | PREFETCHING FOR MULTIPLE PARENT CORES IN A MULTI-CORE CHIP - Embodiments relate to a method and computer program product for prefetching data on a chip. The chip has at least one scout core, multiple parent cores that cooperate together to execute various tasks, and a shared cache that is common between the scout core and the multiple parent cores. An aspect of the embodiments includes monitoring the multiple parent cores by the at least one scout core through the shared cache for a shared cache access occurring in a base parent core. The method includes saving a fetch address by the at least one scout core based on the shared cache access occurring. The fetch address indicates a location of a specific line of cache requested by the base parent core. | 01-15-2015 |
20150019820 | PREFETCHING FOR A PARENT CORE IN A MULTI-CORE CHIP - Embodiments of the invention relate to prefetching data on a chip having at least one scout core, at least one parent core, and a shared cache that is common between the at least one scout core and the at least one parent core. A prefetch code is executed by the scout core for monitoring the parent core. The prefetch code executes independently from the parent core. The scout core determines that at least one specified data pattern has occurred in the parent core based on monitoring the parent core. A prefetch request is sent from the scout core to the shared cache. The prefetch request is sent based on the at least one specified pattern being detected by the scout core. A data set indicated by the prefetch request is sent to the parent core by the shared cache. | 01-15-2015 |
20150019821 | SPECIFIC PREFETCH ALGORITHM FOR A CHIP HAVING A PARENT CORE AND A SCOUT CORE - Embodiments relate to a method and computer program product for prefetching data on a chip having at least one scout core and a parent core. The method includes saving a prefetch code start address by the parent core. The prefetch code start address indicates where a prefetch code is stored. The prefetch code is specifically configured for monitoring the parent core based on a specific application being executed by the parent core. The method includes sending a broadcast interrupt signal by the parent core to the at least one scout core. The broadcast interrupt signal is sent based on the prefetch code start address being saved. The method includes monitoring the parent core by the prefetch code executed by the at least one scout core. The scout core executes the prefetch code based on receiving the broadcast interrupt signal. | 01-15-2015 |
20150019841 | ANTICIPATED PREFETCHING FOR A PARENT CORE IN A MULTI-CORE CHIP - Embodiments relate to prefetching data on a chip having a scout core and a parent core coupled to the scout core. A method includes determining that a program executed by the parent core requires content stored in a location remote from the parent core. The method includes sending a fetch table address determined by the parent core to the scout core. The method includes accessing a fetch table that is indicated by the fetch table address by the scout core. The fetch table indicates how many pieces of content are to be fetched by the scout core and a location of the pieces of content. The method includes, based on the fetch table, fetching the pieces of content by the scout core. The method includes returning the fetched pieces of content to the parent core. | 01-15-2015 |
20150019907 | DYNAMIC ACCESSING OF EXECUTION ELEMENTS THROUGH MODIFICATION OF ISSUE RULES - Embodiments of the invention relate to dynamically routing instructions to execution units based on detected errors in the execution units. An aspect of the invention includes a computer system including a processor having an instruction issue unit and a plurality of execution units. The processor is configured to detect an error in a first execution unit among the plurality of execution units and adjust instruction dispatch rules of the instruction issue unit based on detecting the error in the first execution unit to restrict access to the first execution unit while leaving un-restricted access to the remaining execution units of the plurality of execution units. | 01-15-2015 |
20150058607 | CONFIDENCE THRESHOLD-BASED OPPOSING BRANCH PATH EXECUTION FOR BRANCH PREDICTION - Embodiments relate to confidence threshold-based opposing path execution for branch prediction. An aspect includes determining a branch prediction for a first branch instruction that is encountered during execution of a first thread, wherein the branch prediction indicates a primary path and an opposing path for the first branch instruction. Another aspect includes executing the primary path by the first thread. Another aspect includes determining a confidence of the branch prediction and comparing the confidence of the branch prediction to a confidence threshold. Yet another aspect includes, based on the confidence of the branch prediction being less than the confidence threshold, starting a second thread that executes the opposing path of the first branch instruction, wherein the second thread is executed in parallel with the first thread. | 02-26-2015 |
20150089152 | MANAGING HIGH-CONFLICT CACHE LINES IN TRANSACTIONAL MEMORY COMPUTING ENVIRONMENTS - Cache lines in a computing environment with transactional memory are configurable with a coherency mode. Cache lines in full-line coherency mode are operated or managed with full-line granularity. Cache lines in sub-line coherency mode are operated or managed as sub-cache line portions of a full cache line. When a transaction accessing a cache line in full-line coherency mode results in a transactional abort, the cache line may be placed in sub-line coherency mode if the cache line is a high-conflict cache line. The cache line may be associated with a counter in a conflict address detection table that is incremented whenever a transaction conflict is detected for the cache line. The cache line may be a high-conflict cache line when the counter satisfies a high-conflict criterion, such as reaching a threshold value. The cache line may be returned to full-line coherency mode when a reset criterion is satisfied. | 03-26-2015 |
20150089153 | IDENTIFYING HIGH-CONFLICT CACHE LINES IN TRANSACTIONAL MEMORY COMPUTING ENVIRONMENTS - Cache lines in a computing environment with transactional memory are configurable with a coherency mode and are associated with a high-conflict indicator. Cache lines in full-line coherency mode are operated or managed with full-line granularity. Cache lines in sub-line coherency mode are operated or managed as sub-cache line portions of a full cache line. A cache line is placed in sub-line coherency mode based on examining the high-conflict indicator. A transaction accessing a memory address in a cache line in sub-line coherency mode marks only the sub-cache line portion associated with the memory address as transactionally accessed. The high-conflict indicator may be included in a set of descriptive bits associated with the cache line. A copy of the high-conflict indicator for a cache line in a first cache may be updated with the high-conflict indicator for the cache line in a second cache. | 03-26-2015 |
20150089154 | MANAGING HIGH-COHERENCE-MISS CACHE LINES IN MULTI-PROCESSOR COMPUTING ENVIRONMENTS - Cache lines in a multi-processor computing environment are configurable with a coherency mode. Cache lines in full-line coherency mode are operated or managed with full-line granularity. Cache lines in sub-line coherency mode are operated or managed as sub-cache line portions of a full cache line. A high-coherence-miss cache line may be placed in sub-line coherency mode. A cache line may be associated with a counter in a coherence miss detection table that is incremented whenever an access of the cache line results in a coherence request. The cache line may be a high-coherence-miss cache line when the counter satisfies a high-coherence-miss criterion, such as reaching a threshold value. The cache line may be returned to full-line coherency mode when a reset criterion is satisfied. | 03-26-2015 |
20150089155 | CENTRALIZED MANAGEMENT OF HIGH-CONTENTION CACHE LINES IN MULTI-PROCESSOR COMPUTING ENVIRONMENTS - Cache lines in a multi-processor computing environment are configurable with a coherency mode. Cache lines in full-line coherency mode are operated or managed with full-line granularity. Cache lines in sub-line coherency mode are operated or managed as sub-cache line portions of a full cache line. Communications detected on a coherence interconnect may indicate that a cache line is associated with performance-reducing events. A high-contention cache line may be placed in sub-line coherency mode. Caches accessing the cache line are notified that the cache line is in sub-line coherency mode. The cache line may be associated with a counter in a centralized detection table that is incremented based on detecting the communications. The cache line may be a high-contention cache line when the counter satisfies a high-contention criterion, such as reaching a threshold value. The cache line may be returned to full-line coherency mode when a reset criterion is satisfied. | 03-26-2015 |
20150089159 | MULTI-GRANULAR CACHE MANAGEMENT IN MULTI-PROCESSOR COMPUTING ENVIRONMENTS - Cache lines in a multi-processor computing environment are configurable with a coherency mode. Cache lines in full-line coherency mode are operated or managed with full-line granularity. Cache lines in sub-line coherency mode are operated or managed as sub-cache line portions of a full cache line. Each cache is associated with a directory having a number of directory entries and with a side table having a smaller number of entries. The directory entry for a cache line associates the cache line with a tag and a set of full-line descriptive bits. Creating a side table entry for the cache line places the cache line in sub-line coherency mode. The side table entry associates each of the sub-cache line portions of the cache line with a set of sub-line descriptive bits. Removing the side table entry may return the cache line to full-line coherency mode. | 03-26-2015 |
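
The address generation check summarized in application 20090217135 above reduces to a single relation between the generated addresses. The sketch below is a rough illustration of that relation only; the field widths, the 64-bit modular arithmetic, and the shape of the error signal are assumptions, not details taken from the application.

```cpp
#include <cstdint>
#include <iostream>

// Illustrative only: checks whether the ending address equals the starting
// address plus the difference of the length value and the wrap indicator,
// as described in the abstract. An inequality signals an address
// generation error.
bool addressGenerationError(uint64_t startAddr, uint64_t endAddr,
                            uint64_t lengthValue, uint64_t wrapIndicator) {
    // Expected relation: end == start + (length - wrap), with wrapIndicator
    // assumed to be 0 (no wrap) or 1 (operand wraps to the block start).
    uint64_t expectedEnd = startAddr + (lengthValue - wrapIndicator);
    return endAddr != expectedEnd;
}

int main() {
    // Consistent addresses: start 0x1000, length 0x10, wrap indicator 1.
    std::cout << "error=" << addressGenerationError(0x1000, 0x100F, 0x10, 1) << "\n";
    // Corrupted ending address: the check raises the error signal.
    std::cout << "error=" << addressGenerationError(0x1000, 0x2000, 0x10, 1) << "\n";
    return 0;
}
```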
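
For the process-identifier-based transfer in applications 20130332670 and 20130332672 above, the described flow is: an L1 miss forwards the PID to the second cache, the PID table is consulted, and the directory entries tagged with that PID are pushed back to the requesting core. The sketch below assumes a simplified directory keyed directly by PID and models the push as a returned list of line addresses; neither detail comes from the applications.

```cpp
#include <cstdint>
#include <iostream>
#include <set>
#include <unordered_map>
#include <vector>

// Simplified model of the second (shared) cache: a PID table plus a
// directory that maps a PID to the cache lines associated with it.
struct SecondCache {
    std::set<uint32_t> pidTable;
    std::unordered_map<uint32_t, std::vector<uint64_t>> directoryByPid;
};

// On a miss in the first core's local cache, the PID is sent here; if it is
// listed in the PID table, the associated lines are "pushed" to the first
// core's local cache (modeled as the returned vector of line addresses).
std::vector<uint64_t> handleLocalMiss(const SecondCache &cache, uint32_t pid) {
    if (cache.pidTable.count(pid) == 0) return {};
    auto it = cache.directoryByPid.find(pid);
    return it != cache.directoryByPid.end() ? it->second : std::vector<uint64_t>{};
}

int main() {
    SecondCache l2;
    l2.pidTable.insert(42);
    l2.directoryByPid[42] = {0x1000, 0x1040, 0x2080};  // lines owned by PID 42
    for (uint64_t line : handleLocalMiss(l2, 42))
        std::cout << "push line 0x" << std::hex << line << std::dec << "\n";
    return 0;
}
```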
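
The thread-sparing flow in applications 20140089732 and 20140089734 above turns on three concrete decisions: the recovery-attempt threshold, the choice of a destination core with an idle thread, and the transfer of the last good architected state. The sketch below is a minimal model of those decisions; the contents of the architected state and the core/thread layout are assumptions.

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// Minimal model of a hardware thread slot and its last good architected state.
struct ArchState { uint64_t pc = 0; /* registers, PSW, etc. omitted */ };

struct ThreadSlot {
    bool idle = true;
    int recoveryAttempts = 0;
    ArchState lastGoodState;
};

struct Core { std::vector<ThreadSlot> threads; };

// Pick a destination core (other than the source) that has an idle slot.
int findCoreWithIdleThread(const std::vector<Core> &cores, int excludeCore) {
    for (int c = 0; c < static_cast<int>(cores.size()); ++c) {
        if (c == excludeCore) continue;
        for (const ThreadSlot &t : cores[c].threads)
            if (t.idle) return c;
    }
    return -1;
}

// Spare the victim thread once its recovery attempts exceed the threshold:
// move its last good architected state to an idle slot on another core and
// resume it there.
bool spareThread(std::vector<Core> &cores, int srcCore, int srcSlot, int threshold) {
    ThreadSlot &victim = cores[srcCore].threads[srcSlot];
    if (victim.recoveryAttempts <= threshold) return false;  // keep retrying locally
    int dstCore = findCoreWithIdleThread(cores, srcCore);
    if (dstCore < 0) return false;                           // no idle slot anywhere
    for (ThreadSlot &t : cores[dstCore].threads) {
        if (t.idle) {
            t.lastGoodState = victim.lastGoodState;  // transfer architected state
            t.idle = false;                          // resume on the new core
            victim.idle = true;                      // retire the failing slot
            return true;
        }
    }
    return false;
}

int main() {
    std::vector<Core> cores(2);
    cores[0].threads.resize(2);
    cores[1].threads.resize(2);
    cores[0].threads[0].idle = false;
    cores[0].threads[0].recoveryAttempts = 4;
    cores[0].threads[0].lastGoodState.pc = 0x4000;
    std::cout << "spared=" << spareThread(cores, 0, 0, 3) << "\n";
    return 0;
}
```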
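
For the confidence-threshold policy in applications 20140201508 and 20150058607 above, the decision point is a comparison of prediction confidence against a threshold before deciding whether to start a second thread on the opposing path. The sketch below stands in OS threads for hardware threads and uses placeholder path bodies and an arbitrary threshold; it illustrates the policy only, not the applications' implementation.

```cpp
#include <iostream>
#include <thread>

// Placeholder prediction: a direction plus a confidence estimate (for
// example, derived from a saturating counter in a real predictor).
struct Prediction {
    bool takenPredicted;
    double confidence;
};

// Stand-in for fetching and executing instructions down one path.
void executePath(const char *label, bool taken) {
    std::cout << label << " path, taken=" << taken << "\n";
}

// If confidence is below the threshold, start a second thread that executes
// the opposing path in parallel with the primary path.
void handleBranch(const Prediction &p, double confidenceThreshold) {
    std::thread opposing;
    if (p.confidence < confidenceThreshold)
        opposing = std::thread(executePath, "opposing", !p.takenPredicted);
    executePath("primary", p.takenPredicted);  // primary path always executes
    if (opposing.joinable()) opposing.join();
}

int main() {
    handleBranch({true, 0.55}, 0.75);  // low confidence: both paths run
    handleBranch({true, 0.95}, 0.75);  // high confidence: primary path only
    return 0;
}
```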
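
The mode-switching policy in applications 20150089152 and 20150089153 above keys off a per-line conflict counter and a threshold. The sketch below is a loose model of that bookkeeping; the table organization, the threshold value, and the reset rule are assumptions rather than details from the applications.

```cpp
#include <cstdint>
#include <iostream>
#include <unordered_map>

enum class CoherencyMode { FullLine, SubLine };

// Loose model of a conflict address detection table: a counter per cache
// line address, incremented on each detected transactional conflict. Once
// the counter reaches the threshold, the line is treated as high-conflict
// and placed in sub-line coherency mode.
class ConflictTracker {
public:
    explicit ConflictTracker(unsigned threshold) : threshold_(threshold) {}

    CoherencyMode recordConflict(uint64_t lineAddress) {
        unsigned count = ++conflicts_[lineAddress];
        return count >= threshold_ ? CoherencyMode::SubLine : CoherencyMode::FullLine;
    }

    // Satisfying the reset criterion returns the line to full-line mode.
    void reset(uint64_t lineAddress) { conflicts_.erase(lineAddress); }

private:
    unsigned threshold_;
    std::unordered_map<uint64_t, unsigned> conflicts_;
};

int main() {
    ConflictTracker tracker(3);
    const uint64_t hotLine = 0x80;  // cache-line address of a contended line
    for (int i = 1; i <= 3; ++i) {
        CoherencyMode mode = tracker.recordConflict(hotLine);
        std::cout << "conflict " << i << ": "
                  << (mode == CoherencyMode::SubLine ? "sub-line" : "full-line")
                  << " mode\n";
    }
    tracker.reset(hotLine);  // reset criterion satisfied: back to full-line mode
    return 0;
}
```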