Inventors list

Assignees list

Classification tree browser

Top 100 Inventors

Top 100 Assignees


MIPS Technologies, Inc.

MIPS Technologies, Inc. Patent applications
Patent application numberTitlePublished
20120082167Method and Apparatus for Predicting Characteristics of Incoming Data Packets to Enable Speculative Processing to Reduce Processor Latency - A system for processing data packets in a data packet network has at least one input port for receiving data packets, at least one output port for sending out data packets, a processor for processing packet data, and a packet predictor for predicting a future packet based on a received packet, such that at least some processing for the predicted packet may be accomplished before the predicted packet actually arrives at the system. The system is used in preferred embodiments in Internet routers.04-05-2012
20120036380Clock Ratio Controller For Dynamic Voltage and Frequency Scaled Digital Systems, and Applications Thereof - The present invention provides a clock ratio controller for dynamic voltage and frequency scaled digital systems, and applications thereof. In an embodiment, a digital system is provided that includes a first digital circuit that operates at a first rate determined by a first clock signal and a second digital circuit that operates at a second rate determined by a second clock signal. The first digital circuit is coupled to the second digital circuit by a bus that is used for communications between the first digital circuit and the second digital circuit. A clock ratio controller is used to adjust the frequency of the first clock signal and/or the second clock signal in response to a power management signal without causing a loss of synchronization between the first digital circuit and the second digital circuit.02-09-2012
20120030392System and Method for Automatic Hardware Interrupt Handling - A processing system is provided consisting of an interrupt pin, multiple registers, a stack pointer, and an automatic interrupt system. The multiple registers store a number of processor states values. When the system detects an interrupt on the interrupt pin the system prepares to enter an exception mode where the automatic interrupt system causes an interrupt vector to be fetched, the stack pointer to be updated, and the processor state values to be read in parallel from the registers and stored in memory locations based on the updated stack pointer, prior to the execution of an interrupt service routine. A method for automatic hardware interrupt handling is also presented02-02-2012
20110153945Apparatus and Method for Controlling the Exclusivity Mode of a Level-Two Cache - A method of controlling the exclusivity mode of a level-two cache includes generating level-two cache exclusivity control information at a processor in response to an exclusivity mode indicator, and utilizing the level-two cache exclusivity control information to configure the exclusivity mode of the level-two cache.06-23-2011
20110138349Automated Digital Circuit Design Tool That Reduces or Eliminates Adverse Timing Constraints Due To An Inherent Clock Signal Skew, and Applications Thereof - The present invention provides an automated digital circuit design tool that reduces or eliminates adverse timing constraints due to an inherent clock signal skew, and applications thereof In an embodiment, an automated design tool according to the invention generates a clocking system that includes a clock signal generator, control logic, enable logic, and at least one clock gater. The clock signal generator generates a clock signal that is distributed to various logic blocks of the digital circuit using a buffered clock tree. The enable logic receives input values from the control logic and provides a control signal to the clock gater. When enabled, the clock gater allows a clock signal to pass through to multiple registers. An early clock signal is provided to register(s) in the control logic, which allows for an increased clock frequency while still meeting timing constraints.06-09-2011
20110099353System and Method for Extracting Fields from Packets Having Fields Spread Over More Than One Register - Systems and methods that allow for extracting a field from data stored in a pair of registers using two instructions. A first instruction extracts any part of the field from a first register designated as a first source register, and executes a second instruction extracting any part of the field from a second general register designated as a second source register. The second instruction inserts any extracted field parts in a result register.04-28-2011
20110055497Alignment and Ordering of Vector Elements for Single Instruction Multiple Data Processing - The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register. Then, a subset of elements are selected from the first register and the second register. The elements from the subset are then replicated into the elements in the third register in a particular order suitable for subsequent SIMD vector processing.03-03-2011
20110055488HORIZONTALLY-SHARED CACHE VICTIMS IN MULTIPLE CORE PROCESSORS - A processor includes multiple processor core units, each including a processor core and a cache memory. Victim lines evicted from a first processor core unit's cache may be stored in another processor core unit's cache, rather than written back to system memory. If the victim line is later requested by the first processor core unit, the victim line is retrieved from the other processor core unit's cache. The processor has low latency data transfers between processor core units. The processor transfers victim lines directly between processor core units' caches or utilizes a victim cache to temporarily store victim lines while searching for their destinations. The processor evaluates cache priority rules to determine whether victim lines are discarded, written back to system memory, or stored in other processor core units' caches. Cache priority rules can be based on cache coherency data, load balancing schemes, and architectural characteristics of the processor.03-03-2011
20110040956Symmetric Multiprocessor Operating System for Execution On Non-Independent Lightweight Thread Contexts - A multiprocessing system is disclosed. The system includes a multithreading microprocessor, including a plurality of thread contexts (TCs), each comprising a first control indicator for controlling whether the TC is exempt from servicing interrupt requests to an exception domain for the plurality of TCs, and a virtual processing element (VPE), comprising the exception domain, configured to receive the interrupt requests, wherein the interrupt requests are non-specific to the plurality of TCs, wherein the VPE is configured to select a non-exempt one of the plurality of TCs to service each of the interrupt requests, the VPE further comprising a second control indicator for controlling whether the VPE is enabled to select one of the plurality of TCs to service the interrupt requests. The system also includes a multiprocessor operating system (OS), configured to initially set the second control indicator to enable the VPE to service the interrupts, and further configured to schedule execution of threads on the plurality of TCs, wherein each of the threads is configured to individually disable itself from servicing the interrupts by setting the first control indicator, rather than by clearing the second control indicator.02-17-2011
20100312991Microprocessor with Compact Instruction Set Architecture - A re-encoded instruction set architecture (ISA) provides smaller bit-width instructions or a combination of smaller and larger bit-width instructions to improve instruction execution efficiency and reduce code footprint. The ISA can be re-encoded from a legacy ISA having larger bit-width instructions, and the re-encoded ISA can maintain assembly-level compatibility with the ISA from which it is derived. In addition, the re-encoded ISA can have new and different types of additional instructions, including instructions with encoded arguments determined by statistical analysis and instructions that have the effect of combinations of instructions.12-09-2010
20100306513Processor Core and Method for Managing Program Counter Redirection in an Out-of-Order Processor Pipeline - A processor core and method for managing program counter redirection in an out-of-order processor pipeline. In one embodiment, the pipeline of the processor core includes a front-end instruction fetch portion, a back-end instruction execution portion, and pipeline control logic. Operation of the instruction fetch portion is decoupled from operation of the instruction execution portion. Following detection of a control transfer misprediction, operation of the instruction fetch portion is halted and instructions residing in the instruction fetch portion are invalidated. When the instruction associated with the misprediction reaches a selected pipeline stage, instructions residing in the instruction execution portion of the pipeline are invalidated and the flow of instructions from the instruction fetch portion to the instruction execution portion of the processor pipeline is restarted. A mispredict instruction identification checker and instruction identification tags are used to determine if a control transfer instruction is permitted to redirect instruction fetching.12-02-2010
20100287359VARIABLE REGISTER AND IMMEDIATE FIELD ENCODING IN AN INSTRUCTION SET ARCHITECTURE - A method and apparatus provide means for compressing instruction code size. An Instruction Set Architecture (ISA) encodes instructions compact, usual or extended bit lengths. Commonly used instructions are encoded having both compact and usual bit lengths, with compact or usual bit length instructions chosen based on power, performance or code size requirements. Instructions of the ISA can be used in both privileged and non-privileged operating modes of a microprocessor. The instruction encodings can be used interchangeably in software applications. Instructions from the ISA may be executed on any programmable device enabled for the ISA, including a single instruction set architecture processor or a multi-instruction set architecture processor.11-11-2010
20100199054System and Method for Improving Memory Transfer - System and method for performing a high-bandwidth memory copy. Memory transfer instructions allow for copying of data from a first memory location to a second memory location without the use of load and store word instructions thereby achieving a high-bandwidth copy. In one embodiment, the method includes the steps of (1) decoding a destination address from a first memory transfer instruction, (2) storing the destination address in a register in the bus interface unit, (3) decoding a source address from a second memory transfer instruction, and (4) copying the contents of a memory location specified by the source memory address to a memory location specified by the contents of the register. Other methods and a microprocessor system are also presented.08-05-2010
20100115244MULTITHREADING MICROPROCESSOR WITH OPTIMIZED THREAD SCHEDULER FOR INCREASING PIPELINE UTILIZATION EFFICIENCY - A multithreading processor for concurrently executing multiple threads is provided. The processor includes an execution pipeline and a thread scheduler that dispatches instructions of the threads to the execution pipeline. The execution pipeline execution pipeline is configured for generating a thread context (TC) flush indicator associated with a thread context when one or more instructions of the thread context would stall in the execution pipeline. One or more instructions in the pipeline of the thread context associated with the thread context flush signal can be flushed or nullified.05-06-2010
20100115243Apparatus, Method and Instruction for Initiation of Concurrent Instruction Streams in a Multithreading Microprocessor - A fork instruction for execution on a multithreaded microprocessor and occupying a single instruction issue slot is disclosed. The fork instruction, executing in a parent thread, includes a first operand specifying the initial instruction address of a new thread and a second operand. The microprocessor executes the fork instruction by allocating context for the new thread, copying the first operand to a program counter of the new thread context, copying the second operand to a register of the new thread context, and scheduling the new thread for execution. If no new thread context is free for allocation, the microprocessor raises an exception to the fork instruction. The fork instruction is efficient because it does not copy the parent thread general purpose registers to the new thread. The second operand is typically used as a pointer to a data structure in memory containing initial general purpose register set values for the new thread.05-06-2010
20100103938Context Sharing Between A Streaming Processing Unit (SPU) and A Packet Management Unit (PMU) In A Packet Processing Environment - A context-selection mechanism is provided for selecting a best context from a pool of contexts for processing a data packet. The context selection mechanism comprises, an interface for communicating with a multi-streaming processor; circuitry for computing input data into a result value according to logic rule and for selecting a context based on the computed value and a loading mechanism for preloading the packet information into the selected context for subsequent processing. The computation of the input data functions to enable identification and selection of a best context for processing a data packet according to the logic rule at the instant time such that a multitude of subsequent context selections over a period of time acts to balance load pressure on functional units housed within the multi-streaming processor and required for packet processing. In preferred aspects, programmable singular or multiple predictive rules of logic are utilized in the selection process.04-29-2010
20100070257Methods, Systems, and Computer Program Products for Evaluating Electrical Circuits From Information Stored in Simulation Dump Files - Disclosed are methods, systems, and computer program products for evaluating performance aspects of electrical circuits, and particularly digital logic circuits. An exemplary method comprises obtaining access to a simulation dump file comprising state indications of the values of a plurality of signals of an electrical circuit at a plurality of simulation time points, and receiving an evaluation task that defines an output based on one or more input signals, with each input signal being a signal for which state indications are provided in the simulation dump file. The method further comprises generating, from the simulation dump file, one or more state representations for the input signals of the evaluation task, with each state representation being representative of the state of an input signal over a period of simulation time, and generating values of the output of the evaluation task at a plurality of simulation time points from the state representations.03-18-2010
20100049953DATA CACHE RECEIVE FLOP BYPASS - A microprocessor includes an N-way cache and a logic block that selectively enables and disables the N-way cache for at least one clock cycle if a first register load instructions and a second register load instruction, following the first register load instruction, are detected as pointing to the same index line in which the requested data is stored. The logic block further provides a disabling signal to the N-way cache for at least one clock cycle if the first and second instructions are detected as pointing to the same cache way.02-25-2010
20100049912DATA CACHE WAY PREDICTION - A microprocessor includes one or more N-way caches and a way prediction logic that selectively enables and disables the cache ways so as to reduce the power consumption. The way prediction logic receives an address and predicts in which one of the cache ways the data associated with the address is likely to be stored. The way prediction logic causes an enabling signal to be supplied only to the way predicted to contain the requested data. The remaining (N−1) of the cache ways do not receive the enabling signal. The power consumed by the cache is thus significantly reduced.02-25-2010
20100031005Instruction Encoding For System Register Bit Set And Clear - An instruction encoding architecture is provided for a microprocessor to allow atomic modification of privileged architecture registers. The instructions include an opcode that designates to the microprocessor that the instructions are to execute in privileged (kernel) state only, and that the instructions are to communicate with privileged control registers, a field for designating which of a plurality of privileged architecture registers is to be modified, a field for designating which bit fields within the designated privileged architecture register is to be modified, and a field to designate whether the whether the designated bit fields are to be set or cleared. The instruction encoding allows a single instruction to atomically set or clear bit fields within privileged architecture registers, without reading the privileged architecture registers into a general purpose register. In addition, the instruction encoding allows a programmer to specify whether the previous content of a privileged architecture register is to be saved to a general purpose register during the atomic modification.02-04-2010
20100011166Data Cache Virtual Hint Way Prediction, and Applications Thereof - A virtual hint based data cache way prediction scheme, and applications thereof. In an embodiment, a processor retrieves data from a data cache based on a virtual hint value or an alias way prediction value and forwards the data to dependent instructions before a physical address for the data is available. After the physical address is available, the physical address is compared to a physical address tag value for the forwarded data to verify that the forwarded data is the correct data. If the forwarded data is the correct data, a hit signal is generated. If the forwarded data is not the correct data, a miss signal is generated. Any instructions that operate on incorrect data are invalidated and/or replayed.01-14-2010
20100005247Method and Apparatus for Global Ordering to Insure Latency Independent Coherence - A method and apparatus is described for insuring coherency between memories in a multi-agent system where the agents are interconnected by one or more fabrics. A global arbiter is used to segment coherency into three phases: request; snoop; and response, and to apply global ordering to the requests. A bus interface having request, snoop, and response logic is provided for each agent. A bus interface having request, snoop and response logic is provided for the global arbiter, and a bus interface is provided to couple the global arbiter to each type of fabric it is responsible for. Global ordering and arbitration logic tags incoming requests from the multiple agents and insures that snoops are responded to according to the global order, without regard to latency differences in the fabrics.01-07-2010
20090327649Three-Tiered Translation Lookaside Buffer Hierarchy in a Multithreading Microprocessor - A three-tiered TLB architecture in a multithreading processor that concurrently executes multiple instruction threads is provided. A macro-TLB caches address translation information for memory pages for all the threads. A micro-TLB caches the translation information for a subset of the memory pages cached in the macro-TLB. A respective nano-TLB for each of the threads caches translation information only for the respective thread. The nano-TLBs also include replacement information to indicate which entries in the nano-TLB/micro-TLB hold recently used translation information for the respective thread. Based on the replacement information, recently used information is copied to the nano-TLB if evicted from the micro-TLB.12-31-2009
20090313457System and Method for Extracting Fields from Packets Having Fields Spread Over More Than One Register - Systems and methods that allow for extracting a field from data stored in a pair of registers using two instructions. A first instruction extracts any part of the field from a first register designated as a first source register, and executes a second instruction extracting any part of the field from a second general register designated as a second source register. The second instruction inserts any extracted field parts in a result register.12-17-2009
20090282220Microprocessor with Compact Instruction Set Architecture - A re-encoded instruction set architecture (ISA) provides smaller bit-width instructions or a combination of smaller and larger bit-width instructions to improve instruction execution efficiency and reduce code footprint. The ISA can be re-encoded from a legacy ISA having larger bit-width instructions and can be used to unify one or more ISA extensions such as application specific ASEs. The re-encoded ISA maintains assembly-level compatibility with the ISA from which it is derived. In addition, the re-encoded ISA can have new and different types of additional instructions.11-12-2009
20090271592Apparatus For Storing Instructions In A Multithreading Microprocessor - A circuit for selecting one of N requesters in a round-robin fashion is disclosed. The circuit 1-bit left rotatively increments a first addend by a second addend to generate a sum that is ANDed with the inverse of the first addend to generate a 1-hot vector indicating which of the requestors is selected next. The first addend is an N-bit vector where each bit is false if the corresponding requester is requesting access to a shared resource. The second addend is a 1-hot vector indicating the last selected requestor. A multithreading microprocessor dispatch scheduler employs the circuit for N concurrent threads each thread having one of P priorities. The dispatch scheduler generates P N-bit 1-hot round-robin bit vectors, and each thread's priority is used to select the appropriate round-robin bit from P vectors for combination with the thread's priority and an issuable bit to create a dispatch level used to select a thread for instruction dispatching.10-29-2009
20090249351Round-Robin Apparatus and Instruction Dispatch Scheduler Employing Same For Use In Multithreading Microprocessor - An apparatus for selecting one of N requesters of a shared resource in a round-robin fashion is disclosed. One or more of the N requestors may be disabled from being selected in a selection cycle. The apparatus includes a first input that receives a first value specifying which of the N requestors was last selected. A second input receives a second value specifying which of the N requestors is enabled to be selected. A barrel incrementer, coupled to receive the first and second inputs, 1-bit left-rotatively increments the second value by the first value to generate a sum. Combinational logic, coupled to the barrel incrementer, generates a third value specifying which of the N requestors is selected next.10-01-2009
20090249046APPARATUS AND METHOD FOR LOW OVERHEAD CORRELATION OF MULTI-PROCESSOR TRACE INFORMATION - A method of coordinating trace information in a multiprocessor system includes receiving processor trace information from a set of processors. The processor trace information from each processor includes a processor identity and a coherence indicator that demarks selective shared memory transactions. Coherence manager trace information is generated for each of the processors. The coherence manager trace information for each processor includes trace metrics and a coherence indicator.10-01-2009
20090249045APPARATUS AND METHOD FOR CONDENSING TRACE INFORMATION IN A MULTI-PROCESSOR SYSTEM - A computer readable storage medium includes executable instructions to characterize a coherency controller. The executable instructions define ports to receive processor trace information from a set of processors. The processor trace information from each processor includes a processor identity and a condensed coherence indicator. Circuitry produces a trace stream with trace metrics and condensed coherence indicators.10-01-2009
20090249039Providing Extended Precision in SIMD Vector Arithmetic Operations - The present invention provides extended precision in SIMD arithmetic operations in a processor having a register file and an accumulator. A first set of data elements and a second set of data elements are loaded into first and second vector registers, respectively. Each data element comprises N bits. Next, an arithmetic instruction is fetched from memory. The arithmetic instruction is decoded. Then, the first vector register and the second vector register are read from the register file. The present invention executes the arithmetic instruction on corresponding data elements in the first and second vector registers. The resulting element of the execution is then written into the accumulator. Then, the resulting element is transformed into an N-bit width element and written into a third register for further operation or storage in memory. The transformation of the resulting element can include, for example, rounding, clamping, and/or shifting the element.10-01-2009
20090248988MECHANISM FOR MAINTAINING CONSISTENCY OF DATA WRITTEN BY IO DEVICES - A multi-core microprocessor includes, in part, a cache coherence manager that maintains coherence among the multitude of microprocessor cores, and an I/O coherence unit that maintains coherent traffic between the I/O devices and the multitude of processing cores of the microprocessor. The I/O coherence unit stalls non-coherent I/O write requests until it receives acknowledgement that all pending coherent I/O write requests issued prior to the non-coherence I/O write requests have been made visible to the processing cores. The I/O coherence unit ensures that MMIO read responses are not delivered to the processing cores until after all previous I/O write requests are made visible to the processing cores. Deadlock conditions are prevented by limiting MMIO requests in such a way that they can never block I/O write requests from completing.10-01-2009
20090198986Configurable Instruction Sequence Generation - A configurable instruction set architecture is provided whereby a single virtual instruction may be used to generate a sequence of instructions. Dynamic parameter substitution may be used to substitute parameters specified by a virtual instruction into instructions within a virtual instruction sequence.08-06-2009
20090193177Virtual Processor Based Security For On-Chip Memory, and Applications Thereof - A processor-based method, system and apparatus to comprise a method, system and apparatus to access a memory location in an on-chip memory based on a virtual processing element identification associated with an instruction. The system comprises multiple virtual processing elements, an access list and a comparator coupled to the memory and the access list. In response to an instruction from a virtual processing element to access a memory location in the memory, the comparator compares a first virtual processing identification associated with the instruction to a second virtual processing identification stored in the access list and grants access to the virtual processing element to read from or write to the memory location if the first virtual processing element identification is equal to the second virtual processing element identification. The data in the memory is allocated and de-allocated by software. In one embodiment, the access list is instantiated in hardware and cannot be read from or written to by software. A virtual processing element comprises multiple hardware thread contexts with each thread context being associated with a distinct register file.07-30-2009
20090164733APPARATUS AND METHOD FOR CONTROLLING THE EXCLUSIVITY MODE OF A LEVEL-TWO CACHE - A method of controlling the exclusivity mode of a level-two cache includes generating level-two cache exclusivity control information at a processor in response to an exclusivity mode indicator, and utilizing the level-two cache exclusivity control information to configure the exclusivity mode of the level-two cache.06-25-2009
20090158078Clock ratio controller for dynamic voltage and frequency scaled digital systems, and applications thereof - The present invention provides a clock ratio controller for dynamic voltage and frequency scaled digital systems, and applications thereof. In an embodiment, a digital system is provided that includes a first digital circuit that operates at a first rate determined by a first clock signal and a second digital circuit that operates at a second rate determined by a second clock signal. The first digital circuit is coupled to the second digital circuit by a bus that is used for communications between the first digital circuit and the second digital circuit. A clock ratio controller is used to adjust the frequency of the first clock signal and/or the second clock signal in response to a power management signal without causing a loss of synchronization between the first digital circuit and the second digital circuit.06-18-2009
20090157981COHERENT INSTRUCTION CACHE UTILIZING CACHE-OP EXECUTION RESOURCES - A multiprocessor system maintains cache coherence among processors in a coherent domain. Within the coherent domain, a first processor can receive a command to perform a cache maintenance operation. The first processor can determine whether the cache maintenance operation is a coherent operation. For coherent operations, the first processor sends a coherent request message for distribution to other processors in the coherent domain and can cancel execution of the cache maintenance operation pending receipt of intervention messages corresponding to the coherent request. The intervention messages can reflect a global ordering of coherence traffic in the multiprocessor system and can include instructions for maintaining a data cache and an instruction cache of the first processor. Cache maintenance operations that are determined to be non-coherent can be executed at the first processor without sending the coherent request.06-18-2009
20090132841Processor Accessing A Scratch Pad On-Demand To Reduce Power Consumption - The present invention provides processing systems, apparatuses, and methods that access a scratch pad on-demand to reduce power consumption. In an embodiment, an instruction fetch unit initiates an instruction fetch. When a scratch pad is enabled, an instruction is retrieved from the scratch pad in parallel with a translation of a virtual address to a physical address. If the physical address is associated with the scratch pad, the retrieved instruction is provided to an execution unit. Otherwise, the scratch pad is disabled to reduce power consumption and the instruction fetch is re-initiated. When the scratch pad is disabled, an instruction is retrieved from another instruction source, such as an instruction cache, in parallel with the translation of the virtual address to the physical address. If the physical address is associated with the scratch pad, the scratch pad is enabled and the instruction fetch is re-initiated.05-21-2009
20090125660Interrupt and Exception Handling for Multi-Streaming Digital Processors - A multi-streaming processor has a plurality of streams for streaming one or more instruction threads, a set of functional resources for processing instructions from streams, and interrupt handler logic. The logic detects and maps interrupts and exceptions to one or more specific streams. In some embodiments, one interrupt or exception may be mapped to two or more streams, and in others two or more interrupts or exceptions may be mapped to one stream. Mapping may be static and determined at processor design, programmable, with data stored and amendable, or conditional and dynamic, the interrupt logic executing an algorithm sensitive to variables to determine the mapping. Interrupts may be external interrupts generated by devices external to the processor software (internal) interrupts generated by active streams, or conditional, based on variables. After interrupts are acknowledged, streams to which interrupts or exceptions are mapped are vectored to appropriate service routines. In a synchronous method, no vectoring occurs until all streams to which an interrupt is mapped acknowledge the interrupt.05-14-2009
20090113365Automated digital circuit design tool that reduces or eliminates adverse timing constraints due to an inherent clock signal skew, and applications thereof - The present invention provides an automated digital circuit design tool that reduces or eliminates adverse timing constraints due to an inherent clock signal skew, and applications thereof. In an embodiment, an automated design tool according to the invention generates a clocking system that includes a clock signal generator, control logic, enable logic, and at least one clock gater. The clock signal generator generates a clock signal that is distributed to various logic blocks of the digital circuit using a buffered clock tree. The enable logic receives input values from the control logic and provides a control signal to the clock gater. When enabled, the clock gater allows a clock signal to pass through to multiple registers. An early clock signal is provided to register(s) in the control logic, which allows for an increased clock frequency while still meeting timing constraints.04-30-2009
20090113180Fetch Director Employing Barrel-Incrementer-Based Round-Robin Apparatus For Use In Multithreading Microprocessor - A fetch director in a multithreaded microprocessor that concurrently executes instructions of N threads is disclosed. The N threads request to fetch instructions from an instruction cache. In a given selection cycle, some of the threads may not be requesting to fetch instructions. The fetch director includes a circuit for selecting one of threads in a round-robin fashion to provide its fetch address to the instruction cache. The circuit 1-bit left rotatively increments a first addend by a second addend to generate a sum that is ANDed with the inverse of the first addend to generate a 1-hot vector indicating which of the threads is selected next. The first addend is an N-bit vector where each bit is false if the corresponding thread is requesting to fetch instructions from the instruction cache. The second addend is a 1-hot vector indicating the last selected thread. In one embodiment threads with an empty instruction buffer are selected at highest priority; a last dispatched but not fetched thread at middle priority; all other threads at lowest priority. The threads are selected round-robin within the highest and lowest priorities.04-30-2009
20090089510SPECULATIVE READ IN A CACHE COHERENT MICROPROCESSOR - A cache coherence manager, disposed in a multi-core microprocessor, includes a request unit, an intervention unit, a response unit and an interface unit. The request unit receives coherent requests and selectively issues speculative requests in response. The interface unit selectively forwards the speculative requests to a memory. The interface unit includes at least three tables. Each entry in the first table represents an index to the second table. Each entry in the second table represents an index to the third table. The entry in the first table is allocated when a response to an associated intervention message is stored in the first table but before the speculative request is received by the interface unit. The entry in the second table is allocated when the speculative request is stored in the interface unit. The entry in the third table is allocated when the speculative request is issued to the memory.04-02-2009
20090083493SUPPORT FOR MULTIPLE COHERENCE DOMAINS - A number of coherence domains are maintained among the multitude of processing cores disposed in a microprocessor. A cache coherency manager defines the coherency relationships such that coherence traffic flows only among the processing cores that are defined as having a coherency relationship. The data defining the coherency relationships between the processing cores is optionally stored in a programmable register. For each source of a coherent request, the processing core targets of the request are identified in the programmable register. In response to a coherent request, an intervention message is forwarded only to the cores that are defined to be in the same coherence domain as the requesting core. If a cache hit occurs in response to a coherent read request and the coherence state of the cache line resulting in the hit satisfies a condition, the requested data is made available to the requesting core from that cache line.03-26-2009
20090080651SEMICONDUCTOR WITH HARDWARE LOCKED INTELLECTUAL PROPERTY AND RELATED METHODS - A computer readable medium includes executable instructions to describe an intellectual property core with a key check mechanism configured to compare an external key with an internal key in response to a specified event. A pending instruction is executed in response to a match between the external key and the internal key. An unexpected act is performed in response to a mismatch between the external key and the internal key.03-26-2009
20090077321Microprocessor with Improved Data Stream Prefetching - A microprocessor coupled to a system memory by a bus includes an instruction decode unit that decodes an instruction that specifies a data stream in the system memory and a stream prefetch priority. The microprocessor also includes a load/store unit that generates load/store requests to transfer data between the system memory and the microprocessor. The microprocessor also includes a stream prefetch unit that generates a plurality of prefetch requests to prefetch the data stream from the system memory into the microprocessor. The prefetch requests specify the stream prefetch priority. The microprocessor also includes a bus interface unit (BIU) that generates transaction requests on the bus to transfer data between the system memory and the microprocessor in response to the load/store requests and the prefetch requests. The BIU prioritizes the bus transaction requests for the prefetch requests relative to the bus transaction requests for the load/store requests based on the stream prefetch priority.03-19-2009
20090073974Method and Apparatus for Predicting Characteristics of Incoming Data Packets to Enable Speculative Processing To Reduce Processor Latency - A system for processing data packets in a data packet network has at least one input port for receiving data packets, at least one output port for sending out data packets, a processor for processing packet data, and a packet predictor for predicting a future packet based on a received packet, such that at least some processing for the predicted packet may be accomplished before the predicted packet actually arrives at the system. The system is used in preferred embodiments in Internet routers.03-19-2009
20090063881Low-overhead/power-saving processor synchronization mechanism, and applications thereof - A low-overhead/power-saving processor synchronization mechanism, and applications thereof. In an embodiment, the present invention provides a processor having a load-linked register. The processor implements instructions related to the load-linked register. A first instruction, when executed by the processor, causes the processor to load a first value specified by the first instruction in a first register of a register file and to load a second value in the load-linked register. A second instruction, when executed by the processor, causes the processor to suspend execution of a stream of instructions associated with the load-linked register if the second value in the load-linked register is unaltered until the second value in the load-linked register is altered. A third instruction, when executed by the processor, causes the processor to conditionally move a third value to a memory location specified by the third instruction and to move a value representing the state of the load-linked register to the third register.03-05-2009
20090049312Power Management for System Having One or More Integrated Circuits - Power management control software including power management policies is provided with those policies divided into observation code and response code. When predetermined execution points within the operating system 02-19-2009
20090037886APPARATUS AND METHOD FOR EVALUATING A FREE-RUNNING TRACE STREAM - A system includes a processor to generate a free-running trace stream and a probe with a real-time decoder to dynamically detect a trigger included in the free-running trace stream.02-05-2009
20090037704TRACE CONTROL FROM HARDWARE AND SOFTWARE - A system and method for program counter and data tracing is disclosed. The tracing mechanism of the present invention enables increased visibility into the hardware and software state of the processor core.02-05-2009
20080320233Reduced Handling of Writeback Data - The complexity of the logic of the cache coherency manager unit is reduced by leveraging the data path for intervention messages and responses to carry data associated with writeback requests. A processor core unit sends a writeback request to the cache coherency manager unit. The request does not include the writeback data. Upon receiving an intervention message associated with the writeback request, the processor core unit provides an intervention message response to the cache coherency manager unit indicating that the writeback operation should not be cancelled. The intervention message response includes the writeback data. Because the cache coherency manager already requires a data path to handle data transfers between processor core units, little or no additional overhead needs to be added to the cache coherency manager to handle data associated with writeback request.12-25-2008
20080320232Preventing Writeback Race in Multiple Core Processors - A processor prevents writeback race condition errors by maintaining responsibility for data until the writeback request is confirmed by an intervention message from a cache coherency manager. If a request for the same data arrives before the intervention message, the processor core unit provides the requested data and cancels the pending writeback request. The cache coherency data associated with cache lines indicates whether a request for data has been received prior to the intervention message associated with the writeback request. The cache coherency data of a cache line has a value of “modified” when the writeback request is initiated. When the intervention message associated with the writeback request is received, the cache lines's cache coherency data is examined. A change in the cache coherency data from the value of “modified” indicates that the request for data has been received prior to the intervention and the writeback request should be cancelled.12-25-2008
20080320231Avoiding Livelock Using Intervention Messages in Multiple Core Processors - Livelocks are prevented in multiple core processors by canceling data access requests upon determining that they conflict with other data access requests. A requesting processor core sends a data access request potentially causing livelock to a cache coherency manager. A cache coherency manager receives data access requests from multiple processor. The cache coherency manager sends intervention messages to all of the processor cores in response to all data access requests that may cause livelock. Upon receiving an intervention message from the cache coherency manager, the processor core determines if the intervention message corresponds with any of its own pending data access requests. If the intervention message is associated with a data access request conflicting with one of its own pending data access requests, the processor core responds to the invention message by directing the cache coherency manager to cancel its own conflicting pending data access request.12-25-2008
20080320230Avoiding Livelock Using A Cache Manager in Multiple Core Processors - Livelocks are prevented in multiple core processors by verifying that a data access request is still valid before sending messages to processor cores that may cause other data access requests to fail. A cache coherency manager receives data access requests from multiple processor cores. Upon receiving a data access request that may cause a livelock, the cache coherency manager first sends an intervention message back to the requesting processor core to confirm that this data access request will succeed. If the requesting processor core determines that the data access request is still valid, it directs the cache coherency manager to proceed with the data access request. The cache coherency manager may then send intervention messages to other processor cores to complete the data access request. If the requesting processor core determines that the data access request is invalid, it directs the cache coherency manager to abandon the data access request.12-25-2008
20080282087System debug and trace system and method, and applications thereof - An embedded system or system on chip (SoC) includes a secure JTAG system and method to provide secure on-chip control, capture, and export of on chip information in an embedded environment to a probe. In one embodiment, the system comprises encryption logic associated with a JTAG subsystem and decryption logic in the probe for encrypted JTAG read traffic. Inverted encryption/decryption logic provides bi-directional encryption and decryption of JTAG traffic. Encrypted information includes both authentication of valid probe/target interface and encryption of debug data.11-13-2008
20080270757Fetch and Dispatch Disassociation Apparatus for Multistreaming Processors - A dynamic multistreaming processor has instruction queues, each instruction queue corresponding to an instruction stream, and execution units. The dynamic multistreaming processor also has a dispatch stage to select at least one instruction from one of the instruction queues and to dispatch the selected at least one instruction to one of the execution units. Lastly the dynamic multistreaming processor has a queue counter, associated with each instruction queue, for indicating the number of instructions in each queue, and a fetch counter, associated with each instruction queue, for indicating an address from which to obtain instructions when the associated instruction queue is not full. The dynamic multistreaming processor might also have fetch counters for indicating a next instruction address from which to obtain at least one instruction when the associated instruction queue is not full. The dynamic multistreaming processor could also have a second counter for indicating a next instruction address.10-30-2008
20080222589Protecting Trade Secrets During the Design and Configuration of an Integrated Circuit Semiconductor Design - A system and method for facilitating the design process of an integrated circuits (IC) is described. The system and method utilizes a plurality of repositories, rules engines and design and verification tools to analyze the workload and automatically produce a hardened GDSII description or other representation of the IC. Synthesizable RTL is securely maintained on a server in a data center while providing designers graphical access to customizable IP block by way of a network portal.09-11-2008
20080222581Remote Interface for Managing the Design and Configuration of an Integrated Circuit Semiconductor Design - A software system for facilitating the design process and minimizing the time and effort required to complete the design and fabrication of an integrated circuits (IC) is described. The software system utilizes a data center having a plurality of repositories, rules engines and design and verification tools to automatically produce a hardened GDSII description or other representation of the device in response to the formation of a electronic license agreement. Designers select contractual terms for incorporating third party intellectual property and then design and initiate manufacture of the IC by way of a network portal.09-11-2008
20080222580SYSTEM AND METHOD FOR MANAGING THE DESIGN AND CONFIGURATION OF AN INTEGRATED CIRCUIT SEMICONDUCTOR DESIGN - A system and methods that facilitate the design process and minimize the time and effort required to complete the design and fabrication of an integrated circuits (IC) are described. The system and method utilize a plurality of repositories, rules engines and design and verification tools to analyze the workload and automatically produce a hardened GDSII description or other representation of the device. The system and method securely maintains synthesizable RTL on a server in a data center while providing designers access to portions of the mechanism by way of a network portal.09-11-2008
20080215857Method For Latest Producer Tracking In An Out-Of-Order Processor, And Applications Thereof - Methods for latest producer tracking in a processor. In one embodiment, the method includes the steps of (1) writing a physical register identification value in a first register rename map location specified by a first instruction, (2) writing a first in-register status value in a second register rename map location specified by the first instruction, (3) writing a producer tracking status value at a producer tracking map location specified by the physical register identification value, and (4) modifying, upon graduation of the first instruction, the first in-register status value only if the producer tracking map location stores the producer tracking status value written in step (3). Other methods are also presented.09-04-2008

Patent applications by MIPS Technologies, Inc.