Patent application number | Description | Published |
20080240426 | Flexible architecture and instruction for advanced encryption standard (AES) - A flexible aes instruction set for a general purpose processor is provided. The instruction set includes instructions to perform a “one round” pass for aes encryption or decryption and also includes instructions to perform key generation. An immediate may be used to indicate round number and key size for key generation for 128/192/256 bit keys. The flexible aes instruction set enables full use of pipelining capabilities because it does not require tracking of implicit registers. | 10-02-2008 |
20090052659 | METHOD AND APPARATUS FOR GENERATING AN ADVANCED ENCRYPTION STANDARD (AES) KEY SCHEDULE - An Advanced Encryption Standard (AES) key generation assist instruction is provided. The AES key generation assist instruction assists in generating round keys used to perform AES encryption and decryption operations. The AES key generation instruction operates independent of the size of the cipher key and performs key generation operations in parallel on four 32-bit words thereby increasing the speed at which the round keys are generated. This instruction is easy to use in software. Hardware implementation of this instruction removes potential threats of software (cache access based) side channel attacks on this part of the AES algorithm. | 02-26-2009 |
20090168998 | EXECUTING AN ENCRYPTION INSTRUCTION USING STORED ROUND KEYS - Embodiments of an invention for executing an encryption instruction using stored round keys are disclosed. In one embodiment, an apparatus includes instruction logic, encryption logic, a storage region, and control logic. The instruction logic is to receive an encryption instruction. The encryption logic is to perform, in response to the instruction logic receiving the encryption instruction, an encryption operation including a plurality of rounds, each round using a corresponding round key from a plurality of round keys. The storage region is to store the plurality of round keys. The control logic is to fetch, for use during each of the plurality of rounds, the corresponding round key from the storage region. | 07-02-2009 |
20090172357 | USING A PROCESSOR IDENTIFICATION INSTRUCTION TO PROVIDE MULTI-LEVEL PROCESSOR TOPOLOGY INFORMATION - Embodiments of an invention for using a processor identification instruction to provide multi-level processor topology information are disclosed. In one embodiment, a processor includes decode logic and control logic. The decode logic is to receive an identification instruction having an associated topological level value. The control logic is to provide, in response to the decode logic receiving the identification instruction, processor identification information corresponding to the associated topological level value. | 07-02-2009 |
20100332574 | Digital random number generator - A hardware-based digital random number generator is provided. The digital random number generator is a randomly behaving random number generator based on a set of nondeterministic behaviors. The nondeterministic behaviors include temporal asynchrony between subunits, entropy source “extra” bits, entropy measurement, autonomous deterministic random bit generator reseeding and consumption from a shared resource. | 12-30-2010 |
20100332578 | Method and apparatus for performing efficient side-channel attack resistant reduction - A time-invariant method and apparatus for performing modular reduction that is protected against cache-based and branch-based attacks is provided. The modular reduction technique adds no performance penalty and is side-channel resistant. The side-channel resistance is provided through the use of lazy evaluation of carry bits, elimination of data-dependent branches and use of even cache accesses for all memory references. | 12-30-2010 |
20110087843 | Monitoring cache usage in a distributed shared cache - An apparatus, method, and system are disclosed. In one embodiment the apparatus includes a cache memory, which a number of sets. Each of the sets in the cache memory have several cache lines. The apparatus also includes at least one process resource table. The process resource table maintains a cache line occupancy count of a number of cache lines. Specifically, the cache line occupancy count for each cache line describes the number of cache lines in the cache storing information utilized by a process running on a computer system. Additionally, the process resource table stores the occupancy count of less cache lines than the total number of cache lines in the cache memory. | 04-14-2011 |
20110153700 | Method and apparatus for performing a shift and exclusive or operation in a single instruction - Method and apparatus for performing a shift and XOR operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources perform a shift and XOR on at least one value. | 06-23-2011 |
20110153952 | SYSTEM, METHOD, AND APPARATUS FOR A CACHE FLUSH OF A RANGE OF PAGES AND TLB INVALIDATION OF A RANGE OF ENTRIES - Systems, methods, and apparatus for performing the flushing of a plurality of cache lines and/or the invalidation of a plurality of translation look-aside buffer (TLB) entries is described. In one such method, for flushing a plurality of cache lines of a processor a single instruction including a first field that indicates that the plurality of cache lines of the processor are to be flushed and in response to the single instruction, flushing the plurality of cache lines of the processor. | 06-23-2011 |
20110153960 | TRANSACTIONAL MEMORY IN OUT-OF-ORDER PROCESSORS WITH XABORT HAVING IMMEDIATE ARGUMENT - Methods, systems, and apparatuses to provide an XABORT in a transactional memory access system are described. In one embodiment, the stored value is a context value indicating the context in which a transactional memory execution was aborted. A fallback handler may use the context value to perform a series of operations particular to the context in which the abort occurred. | 06-23-2011 |
20110153983 | Gathering and Scattering Multiple Data Elements - According to a first aspect, efficient data transfer operations can be achieved by: decoding by a processor device, a single instruction specifying a transfer operation for a plurality of data elements between a first storage location and a second storage location; issuing the single instruction for execution by an execution unit in the processor; detecting an occurrence of an exception during execution of the single instruction; and in response to the exception, delivering pending traps or interrupts to an exception handler prior to delivering the exception. | 06-23-2011 |
20110153993 | Add Instructions to Add Three Source Operands - A method in one aspect may include receiving an add instruction. The add instruction may indicate a first source operand, a second source operand, and a third source operand. A sum of the first, second, and third source operands may be stored as a result of the add instruction. The sum may be stored partly in a destination operand indicated by the add instruction and partly a plurality of flags. Other methods are also disclosed, as are apparatus, systems, and instructions on machine-readable medium. | 06-23-2011 |
20110153994 | Multiplication Instruction for Which Execution Completes Without Writing a Carry Flag - A method in one aspect may include receiving a multiply instruction. The multiply instruction may indicate a first source operand and a second source operand. A product of the first and second source operands may be stored in one or more destination operands indicated by the multiply instruction. Execution of the multiply instruction may complete without writing a carry flag. Other methods are also disclosed, as are apparatus, systems, and instructions on machine-readable medium. | 06-23-2011 |
20110153997 | Bit Range Isolation Instructions, Methods, and Apparatus - Receiving an instruction indicating a source operand and a destination operand. Storing a result in the destination operand in response to the instruction. The result operand may have: (1) first range of bits having a first end explicitly specified by the instruction in which each bit is identical in value to a bit of the source operand in a corresponding position; and (2) second range of bits that all have a same value regardless of values of bits of the source operand in corresponding positions. Execution of instruction may complete without moving the first range of the result relative to the bits of identical value in the corresponding positions of the source operand, regardless of the location of the first range of bits in the result. Execution units to execute such instructions, computer systems having processors to execute such instructions, and machine-readable medium storing such an instruction are also disclosed. | 06-23-2011 |
20110154079 | Instruction For Enabling A Procesor Wait State - In one embodiment, the present invention includes a processor having a core with decode logic to decode an instruction prescribing an identification of a location to be monitored and a timer value, and a timer coupled to the decode logic to perform a count with respect to the timer value. The processor may further include a power management unit coupled to the core to determine a type of a low power state based at least in part on the timer value and cause the processor to enter the low power state responsive to the determination. Other embodiments are described and claimed. | 06-23-2011 |
20110154090 | Controlling Time Stamp Counter (TSC) Offsets For Mulitple Cores And Threads - In one embodiment, the present invention includes a method for recording a time stamp counter (TSC) value of a first TSC counter of a processor before a system suspension, accessing the stored TSC value after the system suspension, and directly updating a thread offset value associated with a first thread executing on a first core of the processor with the stored TSC value, without performing a synchronization between a plurality of cores of the processor. Other embodiments are described and claimed. | 06-23-2011 |
20110161635 | Rotate instructions that complete execution without reading carry flag - A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag. | 06-30-2011 |
20110161639 | Event counter checkpointing and restoring - A method of one aspect may include storing an event count of an event counter that counts events that occur during execution within a logic device. The method may further include restoring the event counter to the stored event count after the event counter has counted additional events. Other methods are also disclosed. Apparatus, systems, and machine-readable medium having software are also disclosed. | 06-30-2011 |
20120151183 | ENHANCING PERFORMANCE BY INSTRUCTION INTERLEAVING AND/OR CONCURRENT PROCESSING OF MULTIPLE BUFFERS - An embodiment may include circuitry to execute, at least in part, a first list of instructions and/or to concurrently process, at least in part, first and second buffers. The execution of the first list of instructions may result, at least in part, from invocation of a first function call. The first list of instructions may include at least one portion of a second list of instructions interleaved, at least in part, with at least one other portion of a third list of instructions. The portions may be concurrently carried out, at least in part, by one or more sets of execution units of the circuitry. The second and third lists of instructions may implement, at least in part, respective algorithms that are amenable to being invoked by separate respective function calls. The concurrent processing may involve, at least in part, complementary algorithms. | 06-14-2012 |
20120166767 | SYSTEM, APPARATUS, AND METHOD FOR SEGMENT REGISTER READ AND WRITE REGARDLESS OF PRIVILEGE LEVEL - Embodiments of systems, apparatuses, and methods for performing privilege agnostic segment base register read or write instruction are described. An exemplary method may include fetching the privilege agnostic segment base register write instruction, wherein the privilege agnostic write instruction includes a 64-bit data source operand, decoding the fetched privilege agnostic segment base register write instruction, and executing the decoded privilege agnostic segment base register write instruction to write the 64-bit data of the source operand into the segment base register identified by the opcode of the privilege agnostic segment base register write instruction. | 06-28-2012 |
20120227045 | METHOD, APPARATUS, AND SYSTEM FOR SPECULATIVE EXECUTION EVENT COUNTER CHECKPOINTING AND RESTORING - An apparatus, method, and system are described herein for providing programmable control of performance/event counters. An event counter is programmable to track different events, as well as to be checkpointed when speculative code regions are encountered. So when a speculative code region is aborted, the event counter is able to be restored to it pre-speculation value. Moreover, the difference between a cumulative event count of committed and uncommitted execution and the committed execution, represents an event count/contribution for uncommitted execution. From information on the uncommitted execution, hardware/software may be tuned to enhance future execution to avoid wasted execution cycles. | 09-06-2012 |
20130205119 | INSTRUCTION AND LOGIC TO TEST TRANSACTIONAL EXECUTION STATUS - Novel instructions, logic, methods and apparatus are disclosed to test transactional execution status. Embodiments include decoding a first instruction to start a transactional region. Responsive to the first instruction, a checkpoint for a set of architecture state registers is generated and memory accesses from a processing element in the transactional region associated with the first instruction are tracked. A second instruction to detect transactional execution of the transactional region is then decoded. An operation is executed, responsive to decoding the second instruction, to determine if an execution context of the second instruction is within the transactional region. Then responsive to the second instruction, a first flag is updated. In some embodiments, a register may optionally be updated and/or a second flag may optionally be updated responsive to the second instruction. | 08-08-2013 |
20130227252 | Add Instructions to Add Three Source Operands - A method in one aspect may include receiving an add instruction. The add instruction may indicate a first source operand, a second source operand, and a third source operand. A sum of the first, second, and third source operands may be stored as a result of the add instruction. The sum may be stored partly in a destination operand indicated by the add instruction and partly a plurality of flags. Other methods are also disclosed, as are apparatus, systems, and instructions on machine-readable medium. | 08-29-2013 |
20130246824 | Instruction For Enabling A Processor Wait State - In one embodiment, the present invention includes a processor having a core with decode logic to decode an instruction prescribing an identification of a location to be monitored and a timer value, and a timer coupled to the decode logic to perform a count with respect to the timer value. The processor may further include a power management unit coupled to the core to determine a type of a low power state based at least in part on the timer value and cause the processor to enter the low power state responsive to the determination. Other embodiments are described and claimed. | 09-19-2013 |
20130275722 | METHOD AND APPARATUS TO PROCESS KECCAK SECURE HASHING ALGORITHM - A processor includes a plurality of registers, an instruction decoder to receive an instruction to process a KECCAK state cube of data representing a KECCAK state of a KECCAK hash algorithm, to partition the KECCAK state cube into a plurality of subcubes, and to store the subcubes in the plurality of registers, respectively, and an execution unit coupled to the instruction decoder to perform the KECCAK hash algorithm on the plurality of subcubes respectively stored in the plurality of registers in a vector manner. | 10-17-2013 |
20130283064 | METHOD AND APPARATUS TO PROCESS SHA-1 SECURE HASHING ALGORITHM - A processor includes an instruction decoder to receive a first instruction to process a SHA-1 hash algorithm, the first instruction having a first operand to store a SHA-1 state, a second operand to store a plurality of messages, and a third operand to specify a hash function, and an execution unit coupled to the instruction decoder to perform a plurality of rounds of the SHA-1 hash algorithm on the SHA-1 state specified in the first operand and the plurality of messages specified in the second operand, using the hash function specified in the third operand. | 10-24-2013 |
20130305019 | Instruction and Logic to Control Transfer in a Partial Binary Translation System - A dynamic optimization of code for a processor-specific dynamic binary translation of hot code pages (e.g., frequently executed code pages) may be provided by a run-time translation layer. A method may be provided to use an instruction look-aside buffer (iTLB) to map original code pages and translated code pages. The method may comprise fetching an instruction from an original code page, determining whether the fetched instruction is a first instruction of a new code page and whether the original code page is deprecated. If both determinations return yes, the method may further comprise fetching a next instruction from a translated code page. If either determinations returns no, the method may further comprise decoding the instruction and fetching the next instruction from the original code page. | 11-14-2013 |
20130311756 | ROTATE INSTRUCTIONS THAT COMPLETE EXECUTION WITHOUT READING CARRY FLAG - A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag. | 11-21-2013 |
20130326201 | PROCESSOR-BASED APPARATUS AND METHOD FOR PROCESSING BIT STREAMS - An apparatus and method are described for processing bit streams using bit-oriented instructions. For example, a method according to one embodiment includes the operations of: executing an instruction to get bits for an operation, the instruction identifying a start bit address and a number of bits to be retrieved; retrieving the bits identified by the start bit address and number of bits from a bit-oriented register or cache; and performing a sequence of specified bit operations on the retrieved bits to generate results. | 12-05-2013 |
20140003602 | Flexible Architecture and Instruction for Advanced Encryption Standard (AES) | 01-02-2014 |
20140006739 | Systems, Apparatuses, and Methods for Implementing Temporary Escalated Privilege | 01-02-2014 |
20140006753 | MATRIX MULTIPLY ACCUMULATE INSTRUCTION | 01-02-2014 |
20140013086 | ADDITION INSTRUCTIONS WITH INDEPENDENT CARRY CHAINS - A number of addition instructions are provided that have no data dependency between each other. A first addition instruction stores its carry output in a first flag of a flags register without modifying a second flag in the flags register. A second addition instruction stores its carry output in the second flag of the flags register without modifying the first flag in the flags register. | 01-09-2014 |
20140016773 | INSTRUCTIONS PROCESSORS, METHODS, AND SYSTEMS TO PROCESS BLAKE SECURE HASHING ALGORITHM - A method of an aspect includes receiving an instruction indicating a first source having at least one set of four state matrix data elements, which represent a complete set of four inputs to a G function of a cryptographic hashing algorithm. The algorithm uses a sixteen data element state matrix, and alternates between updating data elements in columns and diagonals. The instruction also indicates a second source having data elements that represent message and constant data. In response to the instruction, a result is stored in a destination indicated by the instruction. The result includes updated state matrix data elements including at least one set of four updated state matrix data elements. Each of the four updated state matrix data elements represents a corresponding one of the four state matrix data elements of the first source, which has been updated by the G function. | 01-16-2014 |
20140016774 | INSTRUCTIONS TO PERFORM GROESTL HASHING - A method is described. The method includes executing an instruction to perform one or more Galois Field (GF) multiply by 2 operations on a state matrix and executing an instruction to combine results of the one or more GF multiply by 2 operations with exclusive or (XOR) functions to generate a result matrix. | 01-16-2014 |
20140053000 | INSTRUCTIONS TO PERFORM JH CRYPTOGRAPHIC HASHING - A method is described. The method includes executing one or more JH_SBOX_L instruction to perform S-Box mappings and a linear (L) transformation on a JH state and executing one or more JH_Permute instruction to perform a permutation function on the JH state once the S-Box mappings and the L transformation have been performed | 02-20-2014 |
20140059333 | METHOD, APPARATUS, AND SYSTEM FOR SPECULATIVE ABORT CONTROL MECHANISMS - An apparatus and method is described herein for providing robust speculative code section abort control mechanisms. Hardware is able to track speculative code region abort events, conditions, and/or scenarios, such as an explicit abort instruction, a data conflict, a speculative timer expiration, a disallowed instruction attribute or type, etc. And hardware, firmware, software, or a combination thereof makes an abort determination based on the tracked abort events. As an example, hardware may make an initial abort determination based on one or more predefined events or choose to pass the event information up to a firmware or software handler to make such an abort determination. Upon determining an abort of a speculative code region is to be performed, hardware, firmware, software, or a combination thereof performs the abort, which may include following a fallback path specified by hardware or software. And to enable testing of such a fallback path, in one implementation, hardware provides software a mechanism to always abort speculative code regions. | 02-27-2014 |
20140122839 | APPARATUS AND METHOD OF EXECUTION UNIT FOR CALCULATING MULTIPLE ROUNDS OF A SKEIN HASHING ALGORITHM - An apparatus is described that includes an execution unit within an instruction pipeline. The execution unit has multiple stages of a circuit that includes a) and b) as follows. a) a first logic circuitry section having multiple mix logic sections each having: i) a first input to receive a first quad word and a second input to receive a second quad word; ii) an adder having a pair of inputs that are respectively coupled to the first and second inputs; iii) a rotator having a respective input coupled to the second input; iv) an XOR gate having a first input coupled to an output of the adder and a second input coupled to an output of the rotator. b) permute logic circuitry having inputs coupled to the respective adder and XOR gate outputs of the multiple mix logic sections. | 05-01-2014 |
20140164467 | APPARATUS AND METHOD FOR VECTOR INSTRUCTIONS FOR LARGE INTEGER ARITHMETIC - An apparatus is described that includes a semiconductor chip having an instruction execution pipeline having one or more execution units with respective logic circuitry to: a) execute a first instruction that multiplies a first input operand and a second input operand and presents a lower portion of the result, where, the first and second input operands are respective elements of first and second input vectors; b) execute a second instruction that multiplies a first input operand and a second input operand and presents an upper portion of the result, where, the first and second input operands are respective elements of first and second input vectors; and, c) execute an add instruction where a carry term of the add instruction's adding is recorded in a mask register. | 06-12-2014 |
20140195782 | METHOD AND APPARATUS TO PROCESS SHA-2 SECURE HASHING ALGORITHM - A processor includes an instruction decoder to receive a first instruction to process a secure hash algorithm 2 (SHA-2) hash algorithm, the first instruction having a first operand associated with a first storage location to store a SHA-2 state and a second operand associated with a second storage location to store a plurality of messages and round constants. The processor further includes an execution unit coupled to the instruction decoder to perform one or more iterations of the SHA-2 hash algorithm on the SHA-2 state specified by the first operand and the plurality of messages and round constants specified by the second operand, in response to the first instruction. | 07-10-2014 |
20140195817 | THREE INPUT OPERAND VECTOR ADD INSTRUCTION THAT DOES NOT RAISE ARITHMETIC FLAGS FOR CRYPTOGRAPHIC APPLICATIONS - A method is described that includes performing the following within an instruction execution pipeline implemented on a semiconductor chip: summing three input vector operands through execution of a single instruction; and, not raising any arithmetic flags even though a result of the summing creates more bits than circuitry designed to transport the summation is able to transport. | 07-10-2014 |
20140205084 | INSTRUCTIONS TO PERFORM JH CRYPTOGRAPHIC HASHING IN A 256 BIT DATA PATH - A method is described. The method includes executing one or more JH_SBOX_L instructions to perform S-Box mappings and a linear (L) transformation on a JH state and executing one or more JH_P instructions to perform a permutation function on the JH state once the S-Box mappings and the L transformation have been performed. | 07-24-2014 |
20140237218 | SIMD INTEGER MULTIPLY-ACCUMULATE INSTRUCTION FOR MULTI-PRECISION ARITHMETIC - A multiply-and-accumulate (MAC) instruction allows efficient execution of unsigned integer multiplications. The MAC instruction indicates a first vector register as a first operand, a second vector register as a second operand, and a third vector register as a destination. The first vector register stores a first factor, and the second vector register stores a partial sum. The MAC instruction is executed to multiply the first factor with an implicit second factor to generate a product, and to add the partial sum to the product to generate a result. The first factor, the implicit second factor and the partial sum have a same data width and the product has twice the data width. The most significant half of the result is stored in the third vector register, and the least significant half of the result is stored in the second vector register. | 08-21-2014 |
20140281008 | QOS BASED BINARY TRANSLATION AND APPLICATION STREAMING - In one embodiment, Quality of Service (QoS) criteria based server side binary translation and execution of applications is performed on multiple servers utilizing distributed translation and execution in either a virtualized or native execution environment. The translated applications are executed to generate output display data, the output display data is encoded in a media format suitable for video streaming, and the video stream is delivered over a network to a client device. In one embodiment, one or more graphics processors assist the central processors of the servers by accelerating the rendering of the application output, and a media encoder encodes the application output into a media format. | 09-18-2014 |
20140281196 | PROCESSORS, METHODS, AND SYSTEMS TO RELAX SYNCHRONIZATION OF ACCESSES TO SHARED MEMORY - A processor of an aspect includes a plurality of logical processors. A first logical processor of the plurality is to execute software that includes a memory access synchronization instruction that is to synchronize accesses to a memory. The processor also includes memory access synchronization relaxation logic that is to prevent the memory access synchronization instruction from synchronizing accesses to the memory when the processor is in a relaxed memory access synchronization mode. | 09-18-2014 |
20140281398 | INSTRUCTION EMULATION PROCESSORS, METHODS, AND SYSTEMS - A processor of an aspect includes decode logic to receive a first instruction and to determine that the first instruction is to be emulated. The processor also includes emulation mode aware post-decode instruction processor logic coupled with the decode logic. The emulation mode aware post-decode instruction processor logic is to process one or more control signals decoded from an instruction. The instruction is one of a set of one or more instructions used to emulate the first instruction. The one or more control signals are to be processed differently by the emulation mode aware post-decode instruction processor logic when in an emulation mode than when not in the emulation mode. Other apparatus are also disclosed as well as methods and systems. | 09-18-2014 |
20140283032 | INTER-PROCESSOR ATTESTATION HARDWARE - Embodiments of an invention for inter-processor attestation hardware are disclosed. In one embodiment, an apparatus includes first attestation hardware associated with a first portion of a system. The first attestation hardware is to attest to a second portion of the system that the first portion of the system is secure. | 09-18-2014 |
20140344553 | Gathering and Scattering Multiple Data Elements - According to a first aspect, efficient data transfer operations can be achieved by: decoding by a processor device, a single instruction specifying a transfer operation for a plurality of data elements between a first storage location and a second storage location; issuing the single instruction for execution by an execution unit in the processor; detecting an occurrence of an exception during execution of the single instruction; and in response to the exception, delivering pending traps or interrupts to an exception handler prior to delivering the exception. | 11-20-2014 |
20140379996 | METHOD, APPARATUS, AND SYSTEM FOR TRANSACTIONAL SPECULATION CONTROL INSTRUCTIONS - An apparatus and method is described herein for providing speculative escape instructions. Specifically, an explicit non-transactional load operation is described herein. During execution of a speculative code region (e.g. a transaction or critical section) loads are normally tracked in a read set. However, a programmer or compiler may utilize the explicit non-transactional read to load from a memory address into a destination register, while not adding the read/load to the transactional read set. Similarly, a non-transactional store is also provided. Here, a transactional store is performed and not added to a write set during speculative code execution. And the store may be immediately globally visible and/or persistent (even after an abort of the speculative code region). In other words, speculative escape operations are provided to ‘escape’ a speculative code region to perform non-transactional memory accesses without causing the speculative code region to abort or fail. | 12-25-2014 |
20150032998 | METHOD, APPARATUS, AND SYSTEM FOR TRANSACTIONAL SPECULATION CONTROL INSTRUCTIONS - An apparatus and method is described herein for providing speculation control instructions. An xAcquire and xRelease instruction are provided to define a critical section. In one embodiment, the xAcquire instruction includes a lock instruction with an elision prefix and the xRelease instruction includes a lock release instruction with an elision prefix. As a result, a processor is able to elide locks and transactionally execute a critical section defined in software by xAcquire and xRelease. But by adding only prefix hints, legacy processor are able to execute the same code by just ignoring the hints and executing the critical section traditionally with locks to guarantee mutual exclusion. Moreover, xBegin and xEnd are similarly provided for in an Instruction Set Architecture (ISA) to define a transactional code region. In addition, other control speculation instructions, such as xAbort to enable explicit abort of a critical or transactional code section and xTest to test a state of speculative execution is also provided in the ISA. | 01-29-2015 |
20150055778 | METHOD AND APPARATUS FOR A NON-DETERMINISTIC RANDOM BIT GENERATOR (NRBG) - A hardware-based digital random number generator is provided. In one embodiment, a processor includes a digital random number generator (DRNG) to condition entropy data provided by an entropy source, to generate a plurality of deterministic random bit (DRB) strings, and to generate a plurality of nondeterministic random bit (NRB) strings, and an execution unit coupled to the DRNG, in response to a first instruction to read a seed value, to retrieve one of the NRB strings from the DRNG and to store the NRB string in a destination register specified by the first instruction. | 02-26-2015 |
20150082047 | EFFICIENT MULTIPLICATION, EXPONENTIATION AND MODULAR REDUCTION IMPLEMENTATIONS - In one embodiment, the present disclosure provides a method that includes segmenting an n-bit exponent e into a first segment e | 03-19-2015 |
20150089195 | METHOD AND APPARATUS FOR PERFORMING A SHIFT AND EXCLUSIVE OR OPERATION IN A SINGLE INSTRUCTION - Method and apparatus for performing a shift and XOR operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources perform a shift and XOR on at least one value. | 03-26-2015 |
20150089196 | METHOD AND APPARATUS FOR PERFORMING A SHIFT AND EXCLUSIVE OR OPERATION IN A SINGLE INSTRUCTION - Method and apparatus for performing a shift and XOR operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources perform a shift and XOR on at least one value. | 03-26-2015 |
20150089197 | METHOD AND APPARATUS FOR PERFORMING A SHIFT AND EXCLUSIVE OR OPERATION IN A SINGLE INSTRUCTION - Method and apparatus for performing a shift and XOR operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources perform a shift and XOR on at least one value. | 03-26-2015 |
20150089199 | ROTATE INSTRUCTIONS THAT COMPLETE EXECUTION EITHER WITHOUT WRITING OR READING FLAGS - A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag. | 03-26-2015 |
20150089200 | ROTATE INSTRUCTIONS THAT COMPLETE EXECUTION EITHER WITHOUT WRITING OR READING FLAGS - A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag. | 03-26-2015 |
20150089201 | ROTATE INSTRUCTIONS THAT COMPLETE EXECUTION EITHER WITHOUT WRITING OR READING FLAGS - A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag. | 03-26-2015 |
20150089286 | EVENT COUNTER CHECKPOINTING AND RESTORING - Event counter checkpointing and restoring is disclosed. In one implementation, a processor includes a first event counter to count events that occur during execution within the processor, event counter checkpoint logic, communicably coupled with the first event counter, to store, prior to a transactional execution of the processor, a value of the first event counter, a second event counter to count events prior to and during the transactional execution, wherein the second event counter is to increment without resetting after the transactional execution is aborted, event count restore logic to restore the first event counter to the stored value after the transactional execution is aborted, and tuning logic to determine, in response to aborting of the transactional execution, a number of the events that occurred during the transactional execution based on the stored value of the first event counter and a value of the second event counter. | 03-26-2015 |