Patent application number | Description | Published |
20080307276 | Memory Controller with Loopback Test Interface - In one embodiment, an apparatus comprises an interconnect; at least one processor coupled to the interconnect; and at least one memory controller coupled to the interconnect. The memory controller is programmable by the processor into a loopback test mode of operation and, in the loopback test mode, the memory controller is configured to receive a first write operation from the processor over the interconnect. The memory controller is configured to route write data from the first write operation through a plurality of drivers and receivers connected to a plurality of data pins that are capable of connection to one or more memory modules. The memory controller is further configured to return the write data as read data on the interconnect for a first read operation received from the processor on the interconnect. | 12-11-2008 |
20090055568 | Non-blocking Address Switch with Shallow Per Agent Queues - In one embodiment, a switch is configured to be coupled to an interconnect. The switch comprises a plurality of storage locations and an arbiter control circuit coupled to the plurality of storage locations. The plurality of storage locations are configured to store a plurality of requests transmitted by a plurality of agents. The arbiter control circuit is configured to arbitrate among the plurality of requests stored in the plurality of storage locations. A selected request is the winner of the arbitration, and the switch is configured to transmit the selected request from one of the plurality of storage locations onto the interconnect. In another embodiment, a system comprises a plurality of agents, an interconnect, and the switch coupled to the plurality of agents and the interconnect. In another embodiment, a method is contemplated. | 02-26-2009 |
20090119531 | Digital Phase Relationship Lock Loop - In one embodiment, an apparatus comprises a first clocked storage device operable in a first clock domain corresponding to a first clock signal. The first clocked storage device has an input coupled to receive one or more bits transmitted on the input from a second clock domain corresponding to a second clock signal. The apparatus further comprises control circuitry configured to ensure that a change in a value of the one or more bits transmitted on the input meets setup and hold time requirements of the first clocked storage device. The control circuitry is responsive to a sample history of one of the first clock signal or the second clock signal to detect a phase relationship between the first clock signal and the second clock signal on each clock cycle to ensure the change meets the setup and hold time requirements. | 05-07-2009 |
20090177846 | Retry Mechanism - An interface unit may comprise a buffer configured to store requests that are to be transmitted on an interconnect and a control unit coupled to the buffer. In one embodiment, the control unit is coupled to receive a retry response from the interconnect during a response phase of a first transaction for a first request stored in the buffer. The control unit is configured to record an identifier supplied on the interconnect with the retry response that identifies a second transaction that is in progress on the interconnect. The control unit is configured to inhibit reinitiation of the first transaction at least until detecting a second transmission of the identifier. In another embodiment, the control unit is configured to assert a retry response during a response phase of a first transaction responsive to a snoop hit of the first transaction on a first request stored in the buffer for which a second transaction is in progress on the interconnect. The control unit is further configured to provide an identifier of the second transaction with the retry response. | 07-09-2009 |
20100064120 | Replay Reduction for Power Saving - In one embodiment, a processor comprises a scheduler configured to issue a first instruction operation to be executed and an execution core coupled to the scheduler. Configured to execute the first instruction operation, the execution core comprises a plurality of replay sources configured to cause a replay of the first instruction operation responsive to detecting at least one of a plurality of replay cases. The scheduler is configured to inhibit issuance of the first instruction operation subsequent to the replay for a subset of the plurality of replay cases. The scheduler is coupled to receive an acknowledgement indication corresponding to each of the plurality of replay cases in the subset, and is configured to inhibit issuance of the first instruction operation until the acknowledgement indication is asserted that corresponds to an identified replay case of the subset. | 03-11-2010 |
20100235670 | Fast L1 Flush Mechanism - In one embodiment, a processor comprises a data cache configured to store a plurality of cache blocks and a control unit coupled to the data cache. The control unit is configured to flush the plurality of cache blocks from the data cache responsive to an indication that the processor is to transition to a low power state in which one or more clocks for the processor are inhibited. | 09-16-2010 |
20100235675 | Non-blocking Address Switch with Shallow Per Agent Queues - In one embodiment, a switch is configured to be coupled to an interconnect. The switch comprises a plurality of storage locations and an arbiter control circuit coupled to the plurality of storage locations. The plurality of storage locations are configured to store a plurality of requests transmitted by a plurality of agents. The arbiter control circuit is configured to arbitrate among the plurality of requests stored in the plurality of storage locations. A selected request is the winner of the arbitration, and the switch is configured to transmit the selected request from one of the plurality of storage locations onto the interconnect. In another embodiment, a system comprises a plurality of agents, an interconnect, and the switch coupled to the plurality of agents and the interconnect. In another embodiment, a method is contemplated. | 09-16-2010 |
20110010502 | Cache Implementing Multiple Replacement Policies - In an embodiment, a cache stores tags for cache blocks stored in the cache. Each tag may include an indication identifying which of two or more replacement policies supported by the cache is in use for the corresponding cache block, and a replacement record indicating the status of the corresponding cache block in the replacement policy. Requests may include a replacement attribute that identifies the desired replacement policy for the cache block accessed by the request. If the request is a miss in the cache, a cache block storage location may be allocated to store the corresponding cache block. The tag associated with the cache block storage location may be updated to include the indication of the desired replacement policy, and the cache may manage the block in accordance with the policy. For example, in an embodiment, the cache may support both an LRR and an LRU policy. | 01-13-2011 |
20110010504 | Combined Transparent/Non-Transparent Cache - In one embodiment, a memory that is delineated into transparent and non-transparent portions. The transparent portion may be controlled by a control unit coupled to the memory, along with a corresponding tag memory. The non-transparent portion may be software controlled by directly accessing the non-transparent portion via an input address. In an embodiment, the memory may include a decoder configured to decode the address and select a location in either the transparent or non-transparent portion. Each request may include a non-transparent attribute identifying the request as either transparent or non-transparent. In an embodiment, the size of the transparent portion may be programmable. Based on the non-transparent attribute indicating transparent, the decoder may selectively mask bits of the address based on the size to ensure that the decoder only selects a location in the transparent portion. | 01-13-2011 |
20110010520 | Block-Based Non-Transparent Cache - In an embodiment, a non-transparent memory unit is provided which includes a non-transparent memory and a control circuit. The control circuit may manage the non-transparent memory as a set of non-transparent memory blocks. Software executing on one or more processors may request a non-transparent memory block in which to process data. The control circuit may allocate a first block, and may return an address (or other indication) of the allocated block so that the software can access the block. The control circuit may also provide automatic data movement between the non-transparent memory and a main memory system to which the non-transparent memory unit is coupled. For example, the automatic data movement may include filling data from the main memory system to the allocated block, or flushing the data in the allocated block to the main memory system after the processing of the allocated block is complete. | 01-13-2011 |
20110035518 | Digital Phase Relationship Lock Loop - In one embodiment, an apparatus comprises a first clocked storage device operable in a first clock domain corresponding to a first clock signal. The first clocked storage device has an input coupled to receive one or more bits transmitted on the input from a second clock domain corresponding to a second clock signal. The apparatus further comprises control circuitry configured to ensure that a change in a value of the one or more bits transmitted on the input meets setup and hold time requirements of the first clocked storage device. The control circuitry is responsive to a sample history of one of the first clock signal or the second clock signal to detect a phase relationship between the first clock signal and the second clock signal on each clock cycle to ensure the change meets the setup and hold time requirements. | 02-10-2011 |
20110035560 | Memory Controller with Loopback Test Interface - In one embodiment, an apparatus comprises an interconnect; at least one processor coupled to the interconnect; and at least one memory controller coupled to the interconnect. The memory controller is programmable by the processor into a loopback test mode of operation and, in the loopback test mode, the memory controller is configured to receive a first write operation from the processor over the interconnect. The memory controller is configured to route write data from the first write operation through a plurality of drivers and receivers connected to a plurality of data pins that are capable of connection to one or more memory modules. The memory controller is further configured to return the write data as read data on the interconnect for a first read operation received from the processor on the interconnect. | 02-10-2011 |
20110252165 | Retry Mechanism - An interface unit may comprise a buffer configured to store requests that are to be transmitted on an interconnect and a control unit coupled to the buffer. In one embodiment, the control unit is coupled to receive a retry response from the interconnect during a response phase of a first transaction for a first request stored in the buffer. The control unit is configured to record an identifier supplied on the interconnect with the retry response that identifies a second transaction that is in progress on the interconnect. The control unit is configured to inhibit reinitiation of the first transaction at least until detecting a second transmission of the identifier. In another embodiment, the control unit is configured to assert a retry response during a response phase of a first transaction responsive to a snoop hit of the first transaction on a first request stored in the buffer for which a second transaction is in progress on the interconnect. The control unit is further configured to provide an identifier of the second transaction with the retry response. | 10-13-2011 |
20120072787 | Memory Controller with Loopback Test Interface - In one embodiment, an apparatus comprises an interconnect; at least one processor coupled to the interconnect; and at least one memory controller coupled to the interconnect. The memory controller is programmable by the processor into a loopback test mode of operation and, in the loopback test mode, the memory controller is configured to receive a first write operation from the processor over the interconnect. The memory controller is configured to route write data from the first write operation through a plurality of drivers and receivers connected to a plurality of data pins that are capable of connection to one or more memory modules. The memory controller is further configured to return the write data as read data on the interconnect for a first read operation received from the processor on the interconnect. | 03-22-2012 |
20120182902 | Hierarchical Fabric Control Circuits - In an embodiment, one or more fabric control circuits may be inserted in a communication fabric to control various aspects of the communications by components in the system. The fabric control circuits may be included on the interface of the components to the communication fabric, in some embodiments. In other embodiments that include a hierarchical communication fabric, fabric control circuits may alternatively or additionally be included. The fabric control circuits may be programmable, and thus may provide the ability to tune the communication fabric to meet performance and/or functionality goals. | 07-19-2012 |
20120185062 | Fabric Limiter Circuits - In an embodiment, one or more fabric control circuits may be inserted in a communication fabric to control various aspects of the communications by components in the system. The fabric control circuits may be included on the interface of the components to the communication fabric, in some embodiments. In other embodiments that include a hierarchical communication fabric, fabric control circuits may alternatively or additionally be included. The fabric control circuits may be programmable, and thus may provide the ability to tune the communication fabric to meet performance and/or functionality goals. | 07-19-2012 |
20120278557 | Combined Transparent/Non-Transparent Cache - In one embodiment, a memory that is delineated into transparent and non-transparent portions. The transparent portion may be controlled by a control unit coupled to the memory, along with a corresponding tag memory. The non-transparent portion may be software controlled by directly accessing the non-transparent portion via an input address. In an embodiment, the memory may include a decoder configured to decode the address and select a location in either the transparent or non-transparent portion. Each request may include a non-transparent attribute identifying the request as either transparent or non-transparent. In an embodiment, the size of the transparent portion may be programmable. Based on the non-transparent attribute indicating transparent, the decoder may selectively mask bits of the address based on the size to ensure that the decoder only selects a location in the transparent portion. | 11-01-2012 |
20130151781 | Cache Implementing Multiple Replacement Policies - In an embodiment, a cache stores tags for cache blocks stored in the cache. Each tag may include an indication identifying which of two or more replacement policies supported by the cache is in use for the corresponding cache block, and a replacement record indicating the status of the corresponding cache block in the replacement policy. Requests may include a replacement attribute that identifies the desired replacement policy for the cache block accessed by the request. If the request is a miss in the cache, a cache block storage location may be allocated to store the corresponding cache block. The tag associated with the cache block storage location may be updated to include the indication of the desired replacement policy, and the cache may manage the block in accordance with the policy. For example, in an embodiment, the cache may support both an LRR and an LRU policy. | 06-13-2013 |
20130275720 | ZERO CYCLE MOVE - A system and method for reducing the latency of data move operations. A register rename unit within a processor determines whether a decoded move instruction is eligible for a zero cycle move operation. If so, control logic assigns a physical register identifier associated with a source operand of the move instruction to the destination operand of the move instruction. Additionally, the register rename unit marks the given move instruction to prevent it from proceeding in the processor pipeline. Further maintenance of the particular physical register identifier may be done by the register rename unit during commit of the given move instruction. | 10-17-2013 |
20130290680 | OPTIMIZING REGISTER INITIALIZATION OPERATIONS - A system and method for efficiently reducing the latency of initializing registers. A register rename unit within a processor determines whether prior to an execution pipeline stage it is known a decoded given instruction writes a particular numerical value in a destination operand. An example is a move immediate instruction that writes a value of 0 in its destination operand. Other examples may also qualify. If the determination is made, a given physical register identifier is assigned to the destination operand, wherein the given physical register identifier is associated with the particular numerical value, but it is not associated with an actual physical register in a physical register file. The given instruction is marked to prevent it from proceeding to an execution pipeline stage. When the given physical register identifier is used to read the physical register file, no actual physical register is accessed. | 10-31-2013 |
20130290681 | REGISTER FILE POWER SAVINGS - A system and method for efficiently reducing the power consumption of register file accesses. A processor is operable to execute instructions with two or more data types, each with an associated size and alignment. Data operands for a first data type use operand sizes equal to an entire width of a physical register within a physical register file. Data operands for a second data type use operand sizes less than an entire width of a physical register. Accesses of the physical register file for operands associated with a non-full-width data type do not access a full width of the physical registers. A given numerical value may be bypassed for the portion of the physical register that is not accessed. | 10-31-2013 |
20140025900 | Combined Transparent/Non-Transparent Cache - In one embodiment, a memory that is delineated into transparent and non-transparent portions. The transparent portion may be controlled by a control unit coupled to the memory, along with a corresponding tag memory. The non-transparent portion may be software controlled by directly accessing the non-transparent portion via an input address. In an embodiment, the memory may include a decoder configured to decode the address and select a location in either the transparent or non-transparent portion. Each request may include a non-transparent attribute identifying the request as either transparent or non-transparent. In an embodiment, the size of the transparent portion may be programmable. Based on the non-transparent attribute indicating transparent, the decoder may selectively mask bits of the address based on the size to ensure that the decoder only selects a location in the transparent portion. | 01-23-2014 |
20140086070 | Bandwidth Management - In some embodiments, a system includes a shared, high bandwidth resource (e.g. a memory system), multiple agents configured to communicate with the shared resource, and a communication fabric coupling the multiple agents to the shared resource. The communication fabric may be equipped with limiters configured to limit bandwidth from the various agents based on one or more performance metrics measured with respect to the shared, high bandwidth resource. For example, the performance metrics may include one or more of latency, number of outstanding transactions, resource utilization, etc. The limiters may dynamically modify their limit configurations based on the performance metrics. In an embodiment, the system may include multiple thresholds for the performance metrics, and exceeding a given threshold may include modifying the limiters in the communication fabric. There may be hysteresis implemented in the system as well in some embodiments, to reduce the frequency of transitions between configurations. | 03-27-2014 |
20140089600 | SYSTEM CACHE WITH DATA PENDING STATE - Methods and apparatuses for utilizing a data pending state for cache misses in a system cache. To reduce the size of a miss queue that is searched by subsequent misses, a cache line storage location is allocated in the system cache for a miss and the state of the cache line storage location is set to data pending. A subsequent request that hits to the cache line storage location will detect the data pending state and as a result, the subsequent request will be sent to a replay buffer. When the fill for the original miss comes back from external memory, the state of the cache line storage location is updated to a clean state. Then, the request stored in the replay buffer is reactivated and allowed to complete its access to the cache line storage location. | 03-27-2014 |
20140089617 | Trust Zone Support in System on a Chip Having Security Enclave Processor - An SOC implements a security enclave processor (SEP). The SEP may include a processor and one or more security peripherals. The SEP may be isolated from the rest of the SOC (e.g. one or more central processing units (CPUs) in the SOC, or application processors (APs) in the SOC). Access to the SEP may be strictly controlled by hardware. For example, a mechanism in which the CPUs/APs can only access a mailbox location in the SEP is described. The CPU/AP may write a message to the mailbox, which the SEP may read and respond to. The SEP may include one or more of the following in some embodiments: secure key management using wrapping keys, SEP control of boot and/or power management, and separate trust zones in memory. | 03-27-2014 |
20140089638 | Multi-Destination Instruction Handling - Various techniques for processing instructions that specify multiple destinations. A first portion of a processor pipeline is configured to split a multi-destination instruction into a plurality of single-destination operations. A second portion of the pipeline is configured to process the plurality of single-destination operations. A third portion of the pipeline is configured to merge the plurality of single-destination operations into one or more multi-destination operations. The one or more multi-destination operations may be performed. The first portion of the pipeline may include a decode unit. The second portion of the pipeline may include a map unit, which may in turn include circuitry configured to maintain a list of free architectural registers and a mapping table that maps physical registers to architectural registers. The third portion of the pipeline may comprise a dispatch unit. In some embodiments, this may provide certain advantages such as reduced area and/or power consumption. | 03-27-2014 |
20140089647 | Branch Predictor for Wide Issue, Arbitrarily Aligned Fetch - In an embodiment, a processor may be configured to fetch N instruction bytes from an instruction cache (a “fetch group”), even if the fetch group crosses a cache line boundary. A branch predictor may be configured to produce branch predictions for up to M branches in the fetch group, where M is a maximum number of branches that may be included in the fetch group. In an embodiment, a branch direction predictor may be updated responsive to a misprediction and also responsive to the branch prediction being within a threshold of transitioning between predictions. To avoid a lookup to determine if the threshold update is to be performed, the branch predictor may detect the threshold update during prediction, and may transmit an indication with the branch. | 03-27-2014 |