Patent application number | Description | Published |
20090070602 | System and Method of Executing Instructions in a Multi-Stage Data Processing Pipeline - A device is disclosed that includes an instruction execution pipeline having multiple stages for executing an instruction. The device also includes a control logic circuit coupled to the instruction execution pipeline. The control logic circuit is adapted to skip at least one stage of the instruction execution pipeline during execution of the instruction. The control logic circuit is also adapted to execute at least one non-skipped stage during execution of the decoded instruction. | 03-12-2009 |
20090119477 | Configurable Translation Lookaside Buffer - The disclosure includes a method and system of configuring a translation lookaside buffer (TLB). In an embodiment, the TLB includes a first portion and a second portion. The first portion or the second portion may be selectively disabled in response to a value of a TLB configuration indicator. | 05-07-2009 |
20090132733 | Selective Preclusion of a Bus Access Request - A system and method for selective preclusion of bus access requests are disclosed. In an embodiment, a method includes determining a bus unit access setting at a logic circuit of a processor. The method also includes selectively precluding a bus unit access request based on the bus unit access setting. | 05-21-2009 |
20090222626 | Systems and Methods for Cache Line Replacements - A system for determining a cache line to replace is described. In one embodiment, the system includes a cache comprising a plurality of cache lines. The system further includes an identifier configured to identify a cache line for replacement. The system also includes a control logic configured to determine a value of the identifier selected from an incrementer, a cache maintenance instruction, or remains the same. | 09-03-2009 |
20090235051 | System and Method of Selectively Committing a Result of an Executed Instruction - In a particular embodiment, a method is disclosed that includes receiving an instruction packet including a first instruction and a second instruction that is dependent on the first instruction at a processor having a plurality of parallel execution pipelines, including a first execution pipeline and a second execution pipeline. The method further includes executing in parallel at least a portion of the first instruction and at least a portion of the second instruction. The method also includes selectively committing a second result of executing the at least a portion of the second instruction with the second execution pipeline based on a first result related to execution of the first instruction with the first execution pipeline. | 09-17-2009 |
20090327647 | Memory Management Unit Directed Access to System Interfaces - A memory management unit (MMU) for servicing transaction requests from one or more processor threads is described. The MMU can include a translation lookaside buffer (TLB). The TLB can include a storage module and a logic circuit. The storage module can store a bit indicating one of a plurality of interfaces. The bit can be associated with a physical address range. The logic circuit can route a physical address within the physical address range to the one of the plurality of interfaces. | 12-31-2009 |
20100228941 | Configurable Cache and Method to Configure Same - In a particular embodiment, a cache is disclosed that includes a tag state array that includes a tag area addressable by a set index. The tag state array also includes a state area addressable by a state address, where the set index and the state address include at least one common bit. | 09-09-2010 |
20100228944 | Apparatus and Method to Translate Virtual Addresses to Physical Addresses in a Base Plus Offset Addressing Mode - An apparatus and method to translate virtual addresses to physical addresses in a base plus offset addressing mode are disclosed. In an embodiment, a method includes performing a first translation lookaside buffer (TLB) lookup based on a base address value to retrieve a speculative physical address. While performing the TLB lookup based on the base address value, the base address value is added to an offset value to generate an effective address value. The method also includes performing a comparison of the base address value and the effective address value based on a variable page size to determine whether the speculative physical address corresponds to the effective address. | 09-09-2010 |
20110125987 | Dedicated Arithmetic Decoding Instruction - A dedicated arithmetic decoding instruction is disclosed. In a particular embodiment, an apparatus includes a memory and a processor coupled to the memory. The processor is configured to execute general purpose instructions and to execute a dedicated arithmetic decoding instruction retrieved from the memory. | 05-26-2011 |
20110179242 | Multi-Stage Multiplexing Operation Including Combined Selection and Data Alignment or Data Replication - A multi-stage multiplexing operation that includes combined selection and data alignment or data replication is disclosed. In a particular embodiment, a method includes performing a first stage of a multi-stage multiplexing operation. During the first stage, a first data source is selected from a first plurality of data sources. At least one of a first data alignment operation and a first data replication operation is also performed on first data from the selected first data source during the first stage. | 07-21-2011 |
20110219212 | System and Method of Processing Hierarchical Very Long Instruction Packets - A system and method of processing a hierarchical very long instruction word (VLIW) packet is disclosed. In a particular embodiment, a method of processing instructions is disclosed. The method includes receiving a hierarchical VLIW packet of instructions and decoding an instruction from the packet to determine whether the instruction is a single instruction or whether the instruction includes a subpacket that includes a plurality of sub-instructions. The method also includes, in response to determining that the instruction includes the subpacket, executing each of the sub-instructions. | 09-08-2011 |
20120011342 | System and Method to Manage a Translation Lookaside Buffer - A system and method to manage a translation lookaside buffer (TLB) is disclosed. In a particular embodiment, a method of managing a first TLB includes in response to starting execution of a memory instruction, setting a first field associated with an entry of the first TLB to indicate use of the entry. The method also includes setting a second field to indicate that the entry in the first TLB matches a corresponding entry in a second TLB. | 01-12-2012 |
20120083912 | ARITHMETIC LOGIC AND SHIFTING DEVICE FOR USE IN A PROCESSOR - An arithmetic logic and shifting device is disclosed and includes an arithmetic logic unit that has a first input to receive a first operand from a first register port, a second input to receive a second operand from a second register port, and an output to selectively provide a memory address to a memory unit in a first mode of operation and to selectively provide an arithmetic output in a second mode of operation. Further, the arithmetic logic and shifting device includes a programmable shifter device that has a first input to receive data from the memory unit, a second input to receive the arithmetic output, a third input to receive an operation code of a computer execution instruction, and a shifted output to provide shifted data. | 04-05-2012 |
20120110367 | Architecture and Method for Eliminating Store Buffers in a DSP/Processor with Multiple Memory Accesses - A method and apparatus for controlling system access to a memory that includes receiving first and second instructions, and evaluating whether both instructions can architecturally complete. When at least one instruction cannot architecturally complete, delaying both instructions. When both instructions can architecturally complete and at least one is a write instruction, adjusting a write control of the memory to account for an evaluation delay. The evaluation delay can be sufficient to evaluate whether both instructions can architecturally complete. The evaluation delay can be input to the write control and not the read control of the memory. A precharge clock of the memory can be adjusted to account for the evaluation delay. Evaluating whether both instructions can architecturally complete can include determining whether data for each instruction is located in a cache, and whether the instructions are memory access instructions. | 05-03-2012 |
20120265943 | Configurable Cache and Method to Configure Same - A method includes receiving an address at a tag state array of a cache. The cache is configurable to have a first size or a second size that is larger than the first size. The method includes identifying a first portion of the address as a set index and using the set index to locate at least one tag field of the tag state array. The method also includes identifying a second portion of the address to compare to a value stored at the at least one tag field and locating at least one state field of the tag state array associated with a particular tag field that matches the second portion. The method further includes identifying a cache line based on a comparison of a third portion of the address to at least two status bits of the at least one state field and retrieving the cache line. | 10-18-2012 |
20120284488 | Methods and Apparatus for Constant Extension in a Processor - Programs often require constants that cannot be encoded in a native instruction format, such as | 11-08-2012 |
20120284489 | Methods and Apparatus for Constant Extension in a Processor - Programs often require constants that cannot be encoded in a native instruction format, such as 32-bits. To provide an extended constant, an instruction packet is formed with constant extender information and a target instruction. The constant extender information encoded as a constant extender instruction provides a first set of constant bits, such as 26-bits for example, and the target instruction provides a second set of constant bits, such as 6-bits. The first set of constant bits are combined with the second set of constant bits to generate an extended constant for execution of the target instruction. The extended constant may be used as an extended source operand, an extended address for memory access instructions, an extended address for branch type of instructions, and the like. Multiple constant extender instructions may be used together to provide larger constants than can be provided by a single extension instruction. | 11-08-2012 |
20130024663 | Table Call Instruction for Frequently Called Functions - An apparatus includes a memory that stores an instruction including an opcode and an operand. The operand specifies an immediate value or a register indicator of a register storing the immediate value. The immediate value is usable to identify a function call address. The function call address is selectable from a plurality of function call addresses. | 01-24-2013 |
20130086360 | FIFO Load Instruction - An instruction identifies a register and a memory location. Upon execution of the instruction by a processor, an item is loaded from the memory location and a shift and insert operation is performed to shift data in the register and to insert the item into the register. | 04-04-2013 |
20130097187 | Determining Top N or Bottom N Data Values and Positions - A method includes executing an instruction at a processor, where executing the instruction includes comparing a data value of a plurality of data values to a first element stored at a first location of a storage device. When the data value satisfies a condition with respect to the first element, the method includes moving the first element to a second location of the storage device and inserting the data value into the first location of the storage device. | 04-18-2013 |
20130145097 | Selective Access of a Store Buffer Based on Cache State - An apparatus includes a cache memory that includes a state array configured to store state information. The state information includes a state that indicates updated corresponding to a particular address of the cache memory is not stored in the cache memory but is available from at least one of multiple sources external to the cache memory, where at least one of the multiple sources is a store buffer. | 06-06-2013 |
20130148694 | Dual Fixed Geometry Fast Fourier Transform (FFT) - A method includes executing a first instruction at a processor to perform a first fast Fourier transform (FFT) operation on a set of inputs in a time domain to produce data in a frequency domain, where the set of inputs is in a first order and where the data in the frequency domain is in a second order. The method also includes performing an operation on the data in the frequency domain to produce data in the frequency domain, where the data in the frequency domain is in the second order. The method includes executing a second instruction at the processor to perform a second FFT operation on the data in the frequency domain to produce data in the time domain, where the data in the time domain is in the first order. | 06-13-2013 |
20130179642 | Non-Allocating Memory Access with Physical Address - Systems and methods for performing non-allocating memory access instructions with physical address. A system includes a processor, one or more levels of caches, a memory, a translation look-aside buffer (TLB), and a memory access instruction specifying a memory access by the processor and an associated physical address. Execution logic is configured to bypass the TLB for the memory access instruction and perform the memory access with the physical address, while avoiding allocation of one or more intermediate levels of caches where a miss may be encountered. | 07-11-2013 |
20130254489 | SYSTEMS AND METHODS FOR CACHE LINE REPLACEMENT - A computer readable storage medium includes instructions that, when executed by a processor, cause the processor to receive an index value included in a cache invalidate by index instruction, an encoded way value, and an incrementer output value. The instructions further cause the processor to assign the index value as an identifier value in response to receiving the cache invalidate by index instruction. The identifier value indicates a cache line for replacement. | 09-26-2013 |
20130304994 | Per Thread Cacheline Allocation Mechanism in Shared Partitioned Caches in Multi-Threaded Processors - Systems and methods for allocation of cache lines in a shared partitioned cache of a multi-threaded processor. A memory management unit is configured to determine attributes associated with an address for a cache entry associated with a processing thread to be allocated in the cache. A configuration register is configured to store cache allocation information based on the determined attributes. A partitioning register is configured to store partitioning information for partitioning the cache into two or more portions. The cache entry is allocated into one of the portions of the cache based on the configuration register and the partitioning register. | 11-14-2013 |
20140059323 | SYSTEMS AND METHODS OF DATA EXTRACTION IN A VECTOR PROCESSOR - Systems and methods of data extraction in a vector processor are disclosed. In a particular embodiment a method of data extraction in a vector processor includes copying at least one data element to a source register of a permutation network. The method includes reordering multiple data elements of the source register, populating a destination register of the permutation network with the reordered data elements, and copying the reordered data elements from the destination register to a memory. | 02-27-2014 |
20140068225 | CONFIGURABLE TRANSLATION LOOKASIDE BUFFER - A particular method includes receiving at least one translation lookaside buffer (TLB) configuration indicator. The at least one TLB configuration indicator indicates a specific number of entries to be enabled at a TLB. The method further includes modifying a number of searchable entries of the TLB in response to the at least one TLB configuration indicator. | 03-06-2014 |
20140115227 | SELECTIVE COUPLING OF AN ADDRESS LINE TO AN ELEMENT BANK OF A VECTOR REGISTER FILE - A method includes selectively coupling a first address line of a plurality of address lines and a second address line of the plurality of address lines to a first element bank of a plurality of element banks of a vector register file according to a selection pattern. The method also includes accessing data stored within the first element bank that is selectively addressed by the first address line via a single read port. | 04-24-2014 |
20140208027 | CONFIGURABLE CACHE AND METHOD TO CONFIGURE SAME - A method includes receiving an address at a tag state array of a cache, wherein the cache is configurable to have a first size and a second size that is smaller than the first size. The method further includes identifying a first portion of the address as a set index, wherein the first portion has a same number of bits when the cache has the first size as when the cache has the second size. The method further includes using the set index to locate at least one tag field of the tag state array, identifying a second portion of the address to compare to a value stored at the at least one tag field, locating at least one state field of the tag state array that is associated with a particular tag field that matches the second portion, identifying a cache line based on a comparison of a third portion of the address to at least one status bit of the at least one state field when the cache has the second size, and retrieving the cache line. | 07-24-2014 |
20140258680 | PARALLEL DISPATCH OF COPROCESSOR INSTRUCTIONS IN A MULTI-THREAD PROCESSOR - Techniques are addressed for parallel dispatch of coprocessor and thread instructions to a coprocessor coupled to a threaded processor. A first packet of threaded processor instructions is accessed from an instruction fetch queue (IFQ) and a second packet of coprocessor instructions is accessed from the IFQ. The IFQ includes a plurality of thread queues that are each configured to store instructions associated with a specific thread of instructions. A dispatch circuit is configured to select the first packet of thread instructions from the IFQ and the second packet of coprocessor instructions from the IFQ and send the first packet to a threaded processor and the second packet to the coprocessor in parallel. A data port is configured to share data between the coprocessor and a register file in the threaded processor. Data port operations are accomplished without affecting operations on any thread executing on the threaded processor. | 09-11-2014 |
20140270017 | DEVICE AND METHOD FOR COMPUTING A CHANNEL ESTIMATE - An apparatus includes selection logic configured to select a first subset of a first set of samples stored at a first set of registers. The first subset includes a first sample stored at a first register of the first set of registers and further includes a second sample stored at a second register of the first set of registers. The apparatus further includes shift logic configured to shift a second set of samples stored at a second set of registers. The apparatus further includes a channel estimator configured to generate a first value associated with a channel estimate based on the first subset and further based on a second subset of the shifted second set of samples. | 09-18-2014 |
20140281368 | CYCLE SLICED VECTORS AND SLOT EXECUTION ON A SHARED DATAPATH - An example method for executing multiple instructions in one or more slots includes receiving a packet including multiple instructions and executing the multiple instructions in one or more slots in a time shared manner. Each slot is associated with an execution data path or a memory data path. An example method for executing at least one instruction in a plurality of phases includes receiving a packet including an instruction, splitting the instruction into a plurality of phases, and executing the instruction in the plurality of phases. | 09-18-2014 |
20140281372 | VECTOR INDIRECT ELEMENT VERTICAL ADDRESSING MODE WITH HORIZONTAL PERMUTE - An example method for placing one or more element data values into an output vector includes identifying a vertical permute control vector including a plurality of elements, each element of the plurality of elements including a register address. The method also includes for each element of the plurality of elements, reading a register address from the vertical permute control vector. The method further includes retrieving a plurality of element data values based on the register address. The method also includes identifying a horizontal permute control vector including a set of addresses corresponding to an output vector. The method further includes placing at least some of the retrieved element data values of the plurality of element data values into the output vector based on the set of addresses in the horizontal permute control vector. | 09-18-2014 |
20140281421 | ARBITRARY SIZE TABLE LOOKUP AND PERMUTES WITH CROSSBAR - An example method of updating an output data vector includes identifying a data value vector including element data values. The method also includes identifying an address value vector including a set of elements. The method further includes applying a conditional operator to each element of the set of elements in the address value vector. The method also includes for each element data value in the data value vector, determining whether to update an output data vector based on applying the conditional operator. | 09-18-2014 |
20150052330 | VECTOR ARITHMETIC REDUCTION - In a particular embodiment, a method includes executing a vector instruction at a processor. The vector instruction includes a vector input that includes a plurality of elements. Executing the vector instruction includes providing a first element of the plurality of elements as a first output. Executing the vector instruction further includes performing an arithmetic operation on the first element and a second element of the plurality of elements to provide a second output. Executing the vector instruction further includes storing the first output and the second output in an output vector. | 02-19-2015 |