Patent application number | Description | Published |
--- | --- | --- |
20080229069 | System, Method And Software To Preload Instructions From An Instruction Set Other Than One Currently Executing - An instruction preload instruction executed in a first processor instruction set operating mode is operative to correctly preload instructions in a different, second instruction set. The instructions are pre-decoded according to the second instruction set encoding in response to an instruction set preload indicator (ISPI). In various embodiments, the ISPI may be set prior to executing the preload instruction, or may comprise part of the preload instruction or the preload target address. | 09-18-2008 |
20080250229 | System, Method and Software to Preload Instructions from a Variable-Length Instruction Set with Proper Pre-Decoding - In a processor executing instructions from a variable-length instruction set, a preload instruction is operative to retrieve from memory a data block corresponding to an instruction cache line, pre-decode instructions from a variable-length instruction set in the data block, and load the instructions and pre-decode information into the instruction cache. An instruction execution unit indicates to a pre-decoder the position within the data block of a first valid instruction. The pre-decoder successively determines the length of each instruction and hence the instruction boundaries. An instruction cache line offset indicator that identifies the position of the first valid instruction may be generated and provided to the pre-decoder in a variety of ways. | 10-09-2008 |
20080281996 | Latency Insensitive FIFO Signaling Protocol - Data from a source domain operating at a first data rate is transferred to a FIFO in another domain operating at a different data rate. The FIFO buffers data before transfer to a sink for further processing or storage. A source side counter tracks space available in the FIFO. In disclosed examples, the initial counter value corresponds to FIFO depth. The counter decrements in response to a data ready signal from the source domain, without delay. The counter increments in response to signaling from the sink domain of a read of data off the FIFO. Hence, incrementing is subject to the signaling latency between domains. The source may send one more beat of data when the counter indicates the FIFO is full. The last beat of data is continuously re-sent from the source until it is indicated that a FIFO position has become available, effectively providing one more FIFO position. (A hedged sketch of this credit counter appears after the table.) | 11-13-2008 |
20080288753 | Methods and Apparatus for Emulating the Branch Prediction Behavior of an Explicit Subroutine Call - An apparatus for emulating the branch prediction behavior of an explicit subroutine call is disclosed. The apparatus includes a first input configured to receive an instruction address and a second input configured to receive predecode information which describes the instruction address as being related to an implicit subroutine call to a subroutine. The apparatus also includes an adder configured, in response to the predecode information, to add a constant to the instruction address to define a return address; the return address is stored to an explicit subroutine resource, facilitating subsequent branch prediction of a return call instruction. | 11-20-2008 |
20090210663 | Power Efficient Instruction Prefetch Mechanism - A processor includes a conditional branch instruction prediction mechanism that generates weighted branch prediction values. For weakly weighted predictions, which tend to be less accurate than strongly weighted predictions, the power associated with speculatively filling and subsequently flushing the cache is saved by halting instruction prefetching. Instruction fetching continues when the branch condition is evaluated in the pipeline and the actual next address is known. Alternatively, prefetching may continue out of a cache. To avoid displacing good cache data with instructions prefetched based on a mispredicted branch, prefetching may be halted in response to a weakly weighted prediction in the event of a cache miss. (A hedged sketch of this prediction-strength gate appears after the table.) | 08-20-2009 |
20090313695 | Methods and Systems for Checking Run-Time Integrity of Secure Code - Methods and systems to guard against attacks designed to replace authenticated, secure code with non-authentic, unsecure code, using existing hardware resources in the CPU's memory management unit (MMU), are disclosed. In certain embodiments, permission entries indicating that pages in memory have been previously authenticated as secure are maintained in a translation lookaside buffer (TLB) and checked upon encountering an instruction residing at an external page. A TLB permission entry indicating the permission is invalid causes on-demand authentication of the accessed page. Upon authentication, the permission entry in the TLB is updated to reflect that the page has been authenticated. As another example, in certain embodiments, a list of recently authenticated pages is maintained and checked upon encountering an instruction residing at an external page. | 12-17-2009 |
20100023696 | Methods and System for Resolving Simultaneous Predicted Branch Instructions - A method of resolving simultaneous branch predictions prior to validation of the predicted branch instruction is disclosed. The method includes processing two or more predicted branch instructions, with each predicted branch instruction having a predicted state and a corrected state. The method further includes selecting one of the corrected states. Should one of the predicted branch instructions be mispredicted, the selected corrected state is used to direct future instruction fetches. | 01-28-2010 |
20100169615 | Preloading Instructions from an Instruction Set Other than a Currently Executing Instruction Set - A preload instruction in a first instruction set is executed at a processor. The preload instruction causes the processor to preload one or more instructions into an instruction cache. The pre-loaded instructions are pre-decoded according to a second instruction set that is different from the first instruction set. The preloaded instructions are pre-decoded according to the second instruction set in response to an instruction set preload indicator (ISPI). | 07-01-2010 |
20100306470 | Methods and Apparatus for Issuing Memory Barrier Commands in a Weakly Ordered Storage System - Efficient techniques are described for enforcing order of memory accesses. A memory access request is received from a device which is not configured to generate memory barrier commands. A surrogate barrier is generated in response to the memory access request. A memory access request may be a read request. In the case of a memory write request, the surrogate barrier is generated before the write request is processed. The surrogate barrier may also be generated in response to a memory read request conditional on a preceding write request to the same address as the read request. Coherency is enforced within a hierarchical memory system as if a memory barrier command was received from the device which does not produce memory barrier commands. | 12-02-2010 |
20110202727 | Apparatus and Methods to Reduce Duplicate Line Fills in a Victim Cache - Techniques and methods are used to reduce allocations to a higher level cache of cache lines displaced from a lower level cache. The allocations of the displaced cache lines are prevented for displaced cache lines that are determined to be redundant in the next level cache, whereby castouts are reduced. To such ends, a line is selected to be displaced in a lower level cache. Information associated with the selected line is identified which indicates that the selected line is present in a higher level cache or the selected line is a write-through line. An allocation of the selected line in the higher level cache is prevented based on the identified information. Preventing an allocation of the selected line saves power that would be associated with the allocation. | 08-18-2011 |
20110283083 | Configuring Surrogate Memory Accessing Agents Using Non-Privileged Processes - Configuring a surrogate memory accessing agent using an instruction for translating and storing a data value is described. In one embodiment, the instruction is received that includes a first operand specifying a data value to be translated and a second operand specifying a virtual address associated with a location of a surrogate memory accessing agent register in which to store the data value. The data value can be translated to a first physical address. The virtual address can be translated to a second physical address. The first physical address is stored in the surrogate memory accessing agent register based on the second physical address. | 11-17-2011 |
20120059995 | Apparatus and Methods to Reduce Castouts in a Multi-Level Cache Hierarchy - Techniques and methods are used to reduce allocations to a higher level cache of cache lines displaced from a lower level cache. The allocations of the displaced cache lines are prevented for displaced cache lines that are determined to be redundant in the next level cache, whereby castouts are reduced. To such ends, a line is selected to be displaced in a lower level cache. Information associated with the selected line is identified which indicates that the selected line is present in a higher level cache. An allocation of the selected line in the higher level cache is prevented based on the identified information. Preventing an allocation of the selected line saves power that would be associated with the allocation. | 03-08-2012 |
20120226888 | Memory Management Unit With Pre-Filling Capability - Systems and methods for memory management units (MMUs) configured to automatically pre-fill a translation lookaside buffer (TLB) with address translation entries expected to be used in the future, thereby reducing the TLB miss rate and improving performance. The TLB may be pre-filled with translation entries, wherein addresses corresponding to the pre-fill may be selected based on predictions. Predictions may be derived from external devices, or based on stride values, wherein the stride values may be a predetermined constant or dynamically altered based on access patterns. Pre-filling the TLB may effectively remove the latency involved in determining address translations for TLB misses from the critical path. (A hedged sketch of stride-based pre-fill appears after the table.) | 09-06-2012 |
20130145137 | Methods and Apparatus for Saving Conditions Prior to a Reset for Post Reset Evaluation - A processor reset control circuit is configured to automatically capture a pre-reset value of processor information stored in one or more hardware registers, as part of a reset operation state machine and prior to changing the processor information to its architecturally required post-reset value. Such pre-reset processor information includes, for example, one or more pre-reset values of the processor program counter (PC) and one or more pre-reset values of an operating-state mode register, both of which may be captured in one or more pre-reset capture storage devices which are then made available for evaluation purposes. Such pre-reset capture storage devices store pre-reset information in response to the reset and maintain the stored pre-reset information until another reset occurs. | 06-06-2013 |
20130151799 | Auto-Ordering of Strongly Ordered, Device, and Exclusive Transactions Across Multiple Memory Regions - Efficient techniques are described for controlling ordered accesses in a weakly ordered storage system. A stream of memory requests is split into two or more streams of memory requests, and a memory access counter is incremented for each memory request. A memory request requiring ordered memory accesses is identified in one of the two or more streams of memory requests. The memory request requiring ordered memory accesses is stalled upon determining that a previous memory request from a different stream of memory requests is pending. The memory access counter is decremented for each memory request guaranteed to complete. A count value in the memory access counter that differs from the counter's initialized state indicates that there are pending memory requests. The memory request requiring ordered memory accesses is processed upon determining that there are no further pending memory requests. (A hedged sketch of this pending-access counter appears after the table.) | 06-13-2013 |
20140006752 | Qualifying Software Branch-Target Hints with Hardware-Based Predictions | 01-02-2014 |
20140040593 | MULTIPLE SETS OF ATTRIBUTE FIELDS WITHIN A SINGLE PAGE TABLE ENTRY - A first processing unit and a second processing unit can access a system memory that stores a common page table that is common to the first processing unit and the second processing unit. The common page table can store virtual-to-physical memory address mappings for memory chunks accessed by a job of an application. A page entry, within the common page table, can include a first set of attribute bits that defines accessibility of the memory chunk by the first processing unit, a second set of attribute bits that defines accessibility of the same memory chunk by the second processing unit, and physical address bits that define a physical address of the memory chunk. (A hedged sketch of such a dual-attribute entry appears after the table.) | 02-06-2014 |
20140281332 | EXTERNALLY PROGRAMMABLE MEMORY MANAGEMENT UNIT - A method includes reading, by a processor, one or more configuration values from a storage device or a memory management unit. The method also includes loading the one or more configuration values into one or more registers of the processor. The one or more registers are useable by the processor to perform address translation. | 09-18-2014 |
20140282508 | SYSTEMS AND METHODS OF EXECUTING MULTIPLE HYPERVISORS - An apparatus includes a primary hypervisor that is executable on a first set of processors and a secondary hypervisor that is executable on a second set of processors. The primary hypervisor may define settings of a resource and the secondary hypervisor may use the resource based on the settings defined by the primary hypervisor. For example, the primary hypervisor may program memory address translation mappings for the secondary hypervisor. The primary hypervisor and the secondary hypervisor may include their own schedulers. | 09-18-2014 |
20140282580 | METHOD AND APPARATUS TO SAVE AND RESTORE SYSTEM MEMORY MANAGEMENT UNIT (MMU) CONTEXTS - A wireless mobile device includes a graphics processing unit (GPU) that has a system memory management unit (MMU) for saving and restoring system MMU translation contexts. The system MMU is coupled to a memory and the GPU. The system MMU includes a set of hardware resources. The hardware resources may be context banks, with each of the context banks having a set of hardware registers. The system MMU also includes a hardware controller that is configured to restore a hardware resource associated with an access stream of content issued by an execution thread of the GPU. The associated hardware resource may be restored from the memory into a physical hardware resource when the hardware resource associated with the access stream of content is not stored within one of the hardware resources. | 09-18-2014 |
20140310468 | METHODS AND APPARATUS FOR IMPROVING PERFORMANCE OF SEMAPHORE MANAGEMENT SEQUENCES ACROSS A COHERENT BUS - Techniques are described for a multi-processor having two or more processors that increases the opportunity for a load-exclusive command to take a cache line in an Exclusive state, which results in increased performance when a store-exclusive is executed. A new bus operation read prefer exclusive is used as a hint to other caches that a requesting master is likely to store to the cache line, and, if possible, the other cache should give the line up. In most cases, this will result in the other master giving the line up and the requesting master taking the line Exclusive. In most cases, two or more processors are not performing a semaphore management sequence to the same address at the same time. Thus, a requesting master's load-exclusive is able to take a cache line in the Exclusive state an increased number of times. | 10-16-2014 |
20140331023 | MULTI-CORE PAGE TABLE SETS OF ATTRIBUTE FIELDS - A device includes a memory that stores a first page table that includes a first page table entry, wherein the first page table entry further includes a physical address, an alternative location associated with the page table entry, and a physical page of memory associated with the physical address. A first processing unit is configured to: read the first page table entry, and determine the physical address from the first page table entry. A second processing unit is configured to: read the physical address from the first page table entry, determine second page attribute data from the alternative location, wherein the second page attribute data define one or more accessibility attributes of the physical page of memory for the second processing unit, and access the physical page of memory associated with the physical address according to the one or more accessibility attributes. | 11-06-2014 |
20150019843 | METHOD AND APPARATUS FOR SELECTIVE RENAMING IN A MICROPROCESSOR - A method and apparatus for allowing an out-of-order processor to reuse an in-use physical register is disclosed herein. The method and apparatus uses identifiers, such as tokens and/or other identifiers in a rename map table (RMT) and a physical register file (PRF), to indicate whether an instruction result is allowed or disallowed to be written into a physical register. | 01-15-2015 |
20150046604 | FLEXIBLE HARDWARE MODULE ASSIGNMENT FOR ENHANCED PERFORMANCE - A system is disclosed for mapping operating-system-identified addresses for substantially identical hardware modules into performance-parameter-based addresses for the hardware modules. The mapping is accomplished by configuring a flexible I/O interface responsive to a characterization of at least one performance parameter for each hardware module. | 02-12-2015 |
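
The credit-counter behavior described in application 20080281996 can be illustrated with a small C sketch. This is a minimal behavioral model, not the application's implementation; the FIFO depth, the `fifo_credits`/`send_beat`/`on_sink_read_ack` names, and the single-extra-beat handling are assumptions for illustration.

```c
/* Hedged sketch of the source-side credit counter of 20080281996.
 * All identifiers are illustrative; this models behavior, not hardware. */
#include <stdbool.h>
#include <stdint.h>

#define FIFO_DEPTH 8u               /* assumed depth; counter starts at FIFO depth */

static uint32_t fifo_credits = FIFO_DEPTH;
static bool     resending_last_beat = false;

/* Source domain: called when a beat of data is ready to transfer. */
static void send_beat(uint32_t beat)
{
    if (fifo_credits > 0) {
        fifo_credits--;             /* decrement immediately, no cross-domain delay */
        resending_last_beat = false;
        /* ... drive beat and data-ready toward the FIFO ... */
    } else {
        /* Counter says "full": the source may still push one more beat and
         * keeps re-sending it until a credit returns, effectively adding one
         * extra FIFO position. */
        resending_last_beat = true;
        /* ... keep driving the last beat ... */
    }
    (void)beat;
}

/* Sink-domain read signal, observed after the cross-domain signaling latency. */
static void on_sink_read_ack(void)
{
    fifo_credits++;                 /* a FIFO slot became available */
    if (resending_last_beat) {
        fifo_credits--;             /* the re-sent beat now occupies that slot */
        resending_last_beat = false;
    }
}
```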
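
For application 20090210663, the idea of gating prefetch on prediction strength can be sketched as below. The 2-bit saturating-counter encoding and the `prediction_is_strong`/`allow_prefetch` helpers are assumptions, not details taken from the abstract; the cache-hit qualification mirrors the abstract's alternative embodiment.

```c
/* Hedged sketch: halt prefetch on weakly weighted predictions (20090210663). */
#include <stdbool.h>
#include <stdint.h>

/* Assumed 2-bit counter: 0 = strong not-taken, 1 = weak not-taken,
 * 2 = weak taken, 3 = strong taken. */
typedef uint8_t bimodal_t;

static bool prediction_is_strong(bimodal_t ctr)
{
    return ctr == 0 || ctr == 3;
}

/* Decide whether to keep prefetching past a predicted conditional branch. */
static bool allow_prefetch(bimodal_t ctr, bool icache_hit)
{
    if (prediction_is_strong(ctr))
        return true;                 /* strong prediction: prefetch as usual */
    /* Weak prediction: optionally keep prefetching only while it hits in the
     * cache, so a likely misprediction cannot displace good cache lines. */
    return icache_hit;
}
```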
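
Application 20120226888's stride-based pre-fill might look roughly like the following C sketch, assuming hypothetical `tlb_lookup`, `tlb_fill`, and `page_table_walk` helpers and a simple last-difference stride policy; the application itself leaves the prediction source open (external devices or stride values).

```c
/* Hedged sketch of stride-based TLB pre-fill along the lines of 20120226888. */
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12

extern bool     tlb_lookup(uint64_t vpn, uint64_t *pfn);  /* assumed helper */
extern uint64_t page_table_walk(uint64_t vpn);            /* assumed helper */
extern void     tlb_fill(uint64_t vpn, uint64_t pfn);     /* assumed helper */

static uint64_t last_vpn;
static int64_t  stride_pages = 1;   /* fixed constant or adapted from the pattern */

uint64_t translate(uint64_t va)
{
    uint64_t vpn = va >> PAGE_SHIFT;
    uint64_t pfn;

    if (!tlb_lookup(vpn, &pfn)) {   /* demand fill on a TLB miss */
        pfn = page_table_walk(vpn);
        tlb_fill(vpn, pfn);
    }

    /* Adapt the stride from the observed pattern, then pre-fill the entry we
     * expect to need next so its page walk stays off the critical path. */
    int64_t observed = (int64_t)(vpn - last_vpn);
    if (observed != 0)
        stride_pages = observed;
    last_vpn = vpn;

    uint64_t next_vpn = vpn + (uint64_t)stride_pages, next_pfn;
    if (!tlb_lookup(next_vpn, &next_pfn))
        tlb_fill(next_vpn, page_table_walk(next_vpn));    /* speculative pre-fill */

    return (pfn << PAGE_SHIFT) | (va & ((1ull << PAGE_SHIFT) - 1));
}
```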
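
The pending-access counter of application 20130151799 reduces to a small amount of bookkeeping; the sketch below is a hedged simplification with invented names (`pending_accesses`, `ordered_request_may_proceed`) and ignores the multi-stream plumbing.

```c
/* Hedged sketch of the pending-access counter used to auto-order strongly
 * ordered and device transactions (20130151799). */
#include <stdbool.h>
#include <stdint.h>

#define COUNTER_INIT 0u

static uint32_t pending_accesses = COUNTER_INIT;

static void on_request_issued(void)   { pending_accesses++; }  /* any stream */
static void on_request_complete(void) { pending_accesses--; }  /* guaranteed to complete */

/* An ordered (e.g. strongly ordered or device) request must wait until the
 * counter has drained back to its initialized state. */
static bool ordered_request_may_proceed(void)
{
    return pending_accesses == COUNTER_INIT;  /* no pending requests in any stream */
}
```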
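
A page table entry carrying separate attribute fields per processing unit, as in application 20140040593, can be modeled as a C bit-field union. The field widths and bit positions below are invented for illustration; the application does not specify a layout.

```c
/* Hedged sketch of a shared page table entry with two attribute fields,
 * one per processing unit (20140040593). Layout is illustrative only. */
#include <stdbool.h>
#include <stdint.h>

typedef union {
    uint64_t raw;
    struct {
        uint64_t cpu_attrs : 4;   /* accessibility bits for the first unit  */
        uint64_t gpu_attrs : 4;   /* accessibility bits for the second unit */
        uint64_t pfn       : 40;  /* physical address (frame number) bits   */
        uint64_t reserved  : 16;
    } f;
} common_pte_t;

enum { ATTR_READ = 1u << 0, ATTR_WRITE = 1u << 1 };

/* Each unit consults only its own attribute field in the shared entry. */
static bool cpu_may_write(common_pte_t pte) { return (pte.f.cpu_attrs & ATTR_WRITE) != 0; }
static bool gpu_may_write(common_pte_t pte) { return (pte.f.gpu_attrs & ATTR_WRITE) != 0; }
```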