Patent application number | Description | Published |
20080288744 | DETECTING MEMORY-HAZARD CONFLICTS DURING VECTOR PROCESSING - A method for performing parallel operations in a computer system when one or more memory hazards may be present, which may be implemented by a processor, is described. During operation, the processor receives instructions for detecting conflict between memory addresses in vectors when memory operations are performed in parallel using at least a portion of the vectors, and tracking positions in at least one of the vectors of any detected conflict between the memory addresses. Next, the processor executes the instructions for detecting the conflict between the memory addresses and tracking the positions. | 11-20-2008 |
20080288745 | GENERATING PREDICATE VALUES DURING VECTOR PROCESSING - A method for performing parallel operations in a computer system when one or more memory hazards may be present, which may be implemented by a processor, is described. During operation, the processor receives instructions for detecting conflict between memory addresses in vectors when operations are performed in parallel using at least a portion of the vectors, and generating one or more predicate values corresponding to any detected conflict between the memory addresses, where a given predicate value indicates elements in at least the portion of the vector that can be processed in parallel. Next, the processor executes the instructions for detecting the conflict between the memory addresses and generating the one or more predicate values. | 11-20-2008 |
20080288754 | GENERATING STOP INDICATORS DURING VECTOR PROCESSING - A method for performing parallel operations in a computer system when one or more memory hazards may be present, which may be implemented by a processor, is described. During operation, the processor receives instructions for detecting conflict between memory addresses in vectors when operations are performed in parallel using at least a portion of the vectors, and generating one or more stop indicators corresponding to any detected conflict between the memory addresses, where a given stop indicator indicates a memory hazard. Next, the processor executes the instructions for detecting the conflict between the memory addresses and generating the one or more stop indicators. | 11-20-2008 |
20080288759 | Memory-hazard detection and avoidance instructions for vector processing - A processor that is configured to perform parallel operations in a computer system where one or more memory hazards may be present is described. An instruction fetch unit within the processor is configured to fetch instructions for detecting one or more critical memory hazards between memory addresses if memory operations are performed in parallel on multiple addresses corresponding to at least a partial vector of addresses. Note that critical memory hazards include memory hazards that lead to different results when the memory addresses are processed in parallel than when the memory addresses are processed sequentially. Furthermore, an execution unit within the processor is configured to execute the instructions for detecting the one or more critical memory hazards. | 11-20-2008 |
20090077321 | Microprocessor with Improved Data Stream Prefetching - A microprocessor coupled to a system memory by a bus includes an instruction decode unit that decodes an instruction that specifies a data stream in the system memory and a stream prefetch priority. The microprocessor also includes a load/store unit that generates load/store requests to transfer data between the system memory and the microprocessor. The microprocessor also includes a stream prefetch unit that generates a plurality of prefetch requests to prefetch the data stream from the system memory into the microprocessor. The prefetch requests specify the stream prefetch priority. The microprocessor also includes a bus interface unit (BIU) that generates transaction requests on the bus to transfer data between the system memory and the microprocessor in response to the load/store requests and the prefetch requests. The BIU prioritizes the bus transaction requests for the prefetch requests relative to the bus transaction requests for the load/store requests based on the stream prefetch priority. | 03-19-2009 |
20100042789 | CHECK-HAZARD INSTRUCTIONS FOR PROCESSING VECTORS - The described embodiments provide a system that determines data dependencies between two vector memory operations or two memory operations that use vectors of memory addresses. During operation, the system receives a first input vector and a second input vector. The first input vector includes a number of elements containing memory addresses for a first memory operation, while the second input vector includes a number of elements containing memory addresses for a second memory operation, wherein the first memory operation occurs before the second memory operation in program order. The system then determines elements in the first and second input vectors where the memory addresses indicate that a dependency exists between the memory operations. The system next generates a result vector, wherein the result vector indicates the elements where dependencies exist between the memory operations. | 02-18-2010 |
20100042815 | METHOD AND APPARATUS FOR EXECUTING PROGRAM CODE - The described embodiments provide a system that executes program code. While executing program code, the processor encounters at least one vector instruction and at least one vector-control instruction. The vector instruction includes a set of elements, wherein each element is used to perform an operation for a corresponding iteration of a loop in the program code. The vector-control instruction identifies elements in the vector instruction that may be operated on in parallel without causing an error due to a runtime data dependency between the iterations of the loop. The processor then executes the loop by repeatedly executing the vector-control instruction to identify a next group of elements that can be operated on in the vector instruction and selectively executing the vector instruction to perform the operation for the next group of elements in the vector instruction, until the operation has been performed for all elements of the vector instruction. | 02-18-2010 |
20100042816 | BREAK, PRE-BREAK, AND REMAINING INSTRUCTIONS FOR PROCESSING VECTORS - The described embodiments provide a system that sets elements in a result vector based on an input vector. During operation, the system determines a location of a key element within the input vector. Next, the system generates a result vector. When generating the result vector, the system sets one or more elements of the result vector based on the location of the key element in the input vector. | 02-18-2010 |
20100042817 | SHIFT-IN-RIGHT INSTRUCTIONS FOR PROCESSING VECTORS - The described embodiments provide a processor for generating a result vector with shifted values from an input vector. During operation, the processor receives an input vector and a control vector. Using these vectors, the processor generates the result vector, which can contain shifted values or propagated values from the input vector, depending on the value of the control vector. In addition, a predicate vector can be used to control the values that are written to the result vector. | 02-18-2010 |
20100042818 | COPY-PROPAGATE, PROPAGATE-POST, AND PROPAGATE-PRIOR INSTRUCTIONS FOR PROCESSING VECTORS - The described embodiments provide a processor for generating a result vector with copied or propagated values from an input vector. During operation, the processor receives at least one input vector and a control vector. Using these vectors, the processor generates the result vector, which can contain copied propagated values from the input vector(s), depending on the value of the control vector. In addition, a predicate vector can be used to control the values that are written to the result vector. | 02-18-2010 |
20100049950 | RUNNING-SUM INSTRUCTIONS FOR PROCESSING VECTORS - The described embodiments provide a processor for generating a result vector with summed values from a first input vector. During operation, the processor receives the first input vector, a second input vector, and a control vector. When generating the result vector, the processor first captures a base value from a key element in the second input vector. The processor then writes the sum of the base value and values from relevant elements in the first input vector into selected elements in the result vector. In addition, a predicate vector can be used to control the values that are written to the result vector. | 02-25-2010 |
20100049951 | RUNNING-AND, RUNNING-OR, RUNNING-XOR, AND RUNNING-MULTIPLY INSTRUCTIONS FOR PROCESSING VECTORS - The described embodiments provide a processor for generating a result vector with shifted values. During operation, the processor receives a first input vector, a second input vector, and a control vector. When generating the result vector, the processor first captures a base value from a key element position in the second input vector. The processor then writes the product of the base value and values from relevant elements in the first input vector into selected elements in the result vector. In addition, a predicate vector can be used to control the values that are written to the result vector. | 02-25-2010 |
20100058037 | RUNNING-SHIFT INSTRUCTIONS FOR PROCESSING VECTORS - The described embodiments provide a processor for generating a result vector with shifted values. During operation, the processor receives a first input vector, a second input vector, and a control vector. When generating the result vector, the processor first captures a base value from a key element position in the second input vector. The processor then determines a number of bit positions to shift the base value using selected relevant elements in the first input vector. The processor then shifts the copy of the base value by the number of bit positions and writes the value into a corresponding element in the result vector. In addition, a predicate vector can be used to control the values that are written to the result vector. | 03-04-2010 |
20100077180 | GENERATING PREDICATE VALUES BASED ON CONDITIONAL DATA DEPENDENCY IN VECTOR PROCESSORS - Embodiments of a method for performing parallel operations in a computer system when one or more conditional dependencies may be present, where a given conditional dependency includes a dependency associated with at least two data elements based on a pair of conditions. During operation, a processor receives instructions for generating one or more predicate values based on actual dependencies, where a given predicate value indicates data elements that may be safely evaluated in parallel, and where the given actual dependency occurs when the pair of conditions matches one or more criteria. Then, the processor executes the instructions for generating the one or more predicate values. | 03-25-2010 |
20100077182 | GENERATING STOP INDICATORS BASED ON CONDITIONAL DATA DEPENDENCY IN VECTOR PROCESSORS - Embodiments of a method for performing parallel operations in a computer system when one or more conditional dependencies may be present, where a given conditional dependency includes a dependency associated with at least two data elements based on a pair of conditions. During operation, a processor receives instructions for generating one or more stop indicators based on actual dependencies, where a given stop indicator indicates the position of a given actual dependency that can lead to different results when the data elements are processed in parallel than when the data elements are processed sequentially, and where the given actual dependency occurs when the pair of conditions matches one or more criteria. Then, the processor executes the instructions for generating the one or more stop indicators. | 03-25-2010 |
20100077183 | CONDITIONAL DATA-DEPENDENCY RESOLUTION IN VECTOR PROCESSORS - Embodiments of a method for performing parallel operations in a computer system when one or more conditional dependencies may be present, where a given conditional dependency includes a dependency associated with at least two data elements based on a pair of conditions. During operation, a processor receives instructions for generating a vector of tracked positions of actual dependencies, where a given tracked position indicates the position of a given actual dependency, and where the given actual dependency occurs when the pair of condition matches one or more criteria. Then, the processor executes the instructions for generating the vector of tracked positions. | 03-25-2010 |
20100325398 | RUNNING-MIN AND RUNNING-MAX INSTRUCTIONS FOR PROCESSING VECTORS - The described embodiments provide a processor for generating a result vector that contains results from a comparison operation. During operation, the processor receives a first input vector, a second input vector, and a control vector. When subsequently generating a result vector, the processor first captures a base value from a key element position in the first input vector. For selected elements in the result vector, processor compares the base value and values from relevant elements to the left of a corresponding element in the second input vector, and writes the result into the element in the result vector. In the described embodiments, the key element position and the relevant elements can be defined by the control vector and an optional predicate vector. | 12-23-2010 |
20100325399 | VECTOR TEST INSTRUCTION FOR PROCESSING VECTORS - The described embodiments provide a processor that executes a vector instruction. The processor starts by receiving a vector instruction that uses at least one vector of values that includes N elements as an input. In addition, the processor optionally receives a predicate vector that includes N elements. The processor then executes the vector instruction. In the described embodiments, when executing the vector instruction, if the predicate vector is received, for one or more selected elements in the vector of values for which a corresponding element in the predicate vector is active, otherwise, for one or more selected elements in the vector of values, the processor checks the one or more selected elements to determine if the selected elements contain a predetermined value. When the selected elements contain the predetermined value, the processor sets a corresponding status flag. | 12-23-2010 |
20100325483 | NON-FAULTING AND FIRST-FAULTING INSTRUCTIONS FOR PROCESSING VECTORS - The described embodiments include a processor that handles faults during execution of a vector instruction. The processor starts by receiving a vector instruction that uses at least one vector of values that includes N elements as an input. In addition, the processor optionally receives a predicate vector that includes N elements. The processor then executes the vector instruction. In the described embodiments, when executing the vector instruction, if the predicate vector is received, for each element in the vector of values for which a corresponding element in the predicate vector is active, otherwise, for each element in the vector of values, the processor performs an operation for the vector instruction for the element in the vector of values. While performing the operation, the processor conditionally masks faults encountered (i.e., faults caused by an illegal operation). | 12-23-2010 |
20110035567 | ACTUAL INSTRUCTION AND ACTUAL-FAULT INSTRUCTIONS FOR PROCESSING VECTORS - The described embodiments include a processor that executes a vector instruction. The processor starts by receiving a vector instruction that optionally receives a predicate vector (which has N elements) as an input. The processor then executes the vector instruction. In the described embodiments, executing the vector instruction causes the processor to generate a result vector. When generating the result vector, if the predicate vector is received, for each element in the result vector for which a corresponding element of the predicate vector is active, otherwise, for each element of the result vector, the processor determines element positions for which a fault was masked during a prior operation. The processor then updates elements in the result vector to identify a leftmost element for which a fault was masked. | 02-10-2011 |
20110035568 | SELECT FIRST AND SELECT LAST INSTRUCTIONS FOR PROCESSING VECTORS - The described embodiments include a processor that executes a vector instruction. The processor starts by receiving a vector instruction that uses a first input vector, a second input vector, and a control vector, and optionally a predicate vector as inputs, wherein each of the vectors includes N elements. The processor then executes the vector instruction. In the described embodiments, when executing the vector instruction, the processor determines a key element position. If the predicate vector is received, the key element position is a predetermined active element position in the predicate vector, otherwise, the key element position is in a predetermined element position. The processor then uses the key element position to copy a result value into a result variable. When copying the result value into the result variable, if an element in the key element position of the control vector contains a predetermined value, the processor copies a value from the key element position in the second input vector into the result variable. Otherwise, the processor copies a value from the key element position in the first input vector into the result variable. | 02-10-2011 |
20110040941 | Microprocessor with Improved Data Stream Prefetching - A microprocessor coupled to a system memory by a bus includes an instruction decode unit that decodes an instruction that specifies a data stream in the system memory and a stream prefetch priority. The microprocessor also includes a load/store unit that generates load/store requests to transfer data between the system memory and the microprocessor. The microprocessor also includes a stream prefetch unit that generates a plurality of prefetch requests to prefetch the data stream from the system memory into the microprocessor. The prefetch requests specify the stream prefetch priority. The microprocessor also includes a bus interface unit (BIU) that generates transaction requests on the bus to transfer data between the system memory and the microprocessor in response to the load/store requests and the prefetch requests. The BIU prioritizes the bus transaction requests for the prefetch requests relative to the bus transaction requests for the load/store requests based on the stream prefetch priority. | 02-17-2011 |
20110093681 | REMAINING INSTRUCTION FOR PROCESSING VECTORS - The described embodiments include a processor that executes a vector instruction. The processor starts by receiving an input vector and optionally receiving a predicate vector as inputs. The processor then executes the vector instruction, which causes the processor to determine a key element position in the input vector and generate a result vector. When generating the result vector, if the predicate vector is received, for each element in the result vector for which a corresponding element of the predicate vector is active, otherwise, for each element of the result vector, the processor sets each element of the result vector to the right of the key element to a first predetermined value and sets each element of the result vector at or to the left of the key element to a second predetermined value. The processor then sets one or more processor status flags based on the values in the result vector. | 04-21-2011 |
20110113217 | GENERATE PREDICTES INSTRUCTION FOR PROCESSING VECTORS - The described embodiments include a processor that executes a vector instruction. The processor starts by receiving a first input vector, a second input vector, and optionally receiving a predicate vector (each of which includes N elements) as inputs. The processor then executes the vector instruction. Executing the vector instruction causes the processor to generate a result vector. When generating the result vector, if the predicate vector was received, for each element of the result vector for which the corresponding element of the predicate vector is active, otherwise, for each element of the result vector, the processor determines elements that are to be set in the result vector based on values in elements in the first input vector and the second input vector. The processor then sets the determined elements of the result vector to a first predetermined value. | 05-12-2011 |
20110276782 | RUNNING SUBTRACT AND RUNNING DIVIDE INSTRUCTIONS FOR PROCESSING VECTORS - The described embodiments provide a processor for generating a result vector with subtracted or mathematically divided values from a first input vector. During operation, the processor receives the first input vector, a second input vector, and a control vector, and optionally receives a predicate vector. The processor then records a value from an element at a key element position in the second input vector into a base value. Next, the processor generates a result vector. When generating the result vector, for each active element in the result vector to the right of the key element position, the processor is configured to set the element in the result vector equal to the base value minus a total of the values in each relevant element of the first input vector or to set the element in the result vector equal to the result of dividing the base value by a value in each relevant element of the first input vector, wherein the relevant elements include relevant elements from an element at the key element position to and including a predetermined element in the first input vector. | 11-10-2011 |
20110320883 | MEMORY-HAZARD DETECTION AND AVOIDANCE INSTRUCTIONS FOR VECTOR PROCESSING - A processor that is configured to perform parallel operations in a computer system where one or more memory hazards may be present is described. An instruction fetch unit within the processor is configured to fetch instructions for detecting one or more critical memory hazards between memory addresses if memory operations are performed in parallel on multiple addresses corresponding to at least a partial vector of addresses. Note that critical memory hazards include memory hazards that lead to different results when the memory addresses are processed in parallel than when the memory addresses are processed sequentially. Furthermore, an execution unit within the processor is configured to execute the instructions for detecting the one or more critical memory hazards. | 12-29-2011 |
20120060020 | VECTOR INDEX INSTRUCTION FOR PROCESSING VECTORS - The described embodiments include a processor that executes a vector instruction. The processor starts by receiving a start value and an increment value, and optionally receiving a predicate vector with N elements as inputs. The processor then executes the vector instruction. Executing the vector instruction causes the processor to generate a result vector. When generating the result vector, if the predicate vector is received, for each element in the result vector for which a corresponding element of the predicate vector is active, otherwise, for each element in the result vector, the processor sets the element in the result vector equal to the start value plus a product of the increment value multiplied by a specified number of elements to the left of the element in the result vector. | 03-08-2012 |
20120317441 | NON-FAULTING AND FIRST FAULTING INSTRUCTIONS FOR PROCESSING VECTORS - The described embodiments include a processor that handles faults during execution of a vector instruction. The processor starts by receiving a vector instruction that uses at least one vector of values that includes N elements as an input. In addition, the processor optionally receives a predicate vector that includes N elements. The processor then executes the vector instruction. In the described embodiments, when executing the vector instruction, if the predicate vector is received, for each element in the vector of values for which a corresponding element in the predicate vector is active, otherwise, for each element in the vector of values, the processor performs an operation for the vector instruction for the element in the vector of values. While performing the operation, the processor conditionally masks faults encountered (i.e., faults caused by an illegal operation). | 12-13-2012 |