| Patent application number | Description | Published |
| 20090063529 | METHOD AND STRUCTURE FOR FAST IN-PLACE TRANSFORMATION OF STANDARD FULL AND PACKED MATRIX DATA FORMATS - A computerized method provides for an in-place transformation of matrix A data including a New Data Structure (NDS) format and a transformation T having a compact representation. The NDS represents data of the matrix A in a format other than a row major format or a column major format, such that the data for the matrix A is stored as contiguous sub matrices of size MB by NB in an order predetermined to provide the data for a matrix processing. The transformation T is applied to the MB by NB blocks, using an in-place transformation processing, thereby replacing data of the block A | 03-05-2009 |
| 20090063607 | METHOD AND STRUCTURE FOR FAST IN-PLACE TRANSFORMATION OF STANDARD FULL AND PACKED MATRIX DATA FORMATS - A method and structure for an in-place transformation of matrix data. For a matrix A stored in one of a standard full format or a packed format and a transformation T having a compact representation, blocking parameters MB and NB are chosen, based on a cache size. A sub-matrix A | 03-05-2009 |
| 20090144736 | Performance Evaluation of Algorithmic Tasks and Dynamic Parameterization on Multi-Core Processing Systems - A method for evaluating performance of DMA-based algorithmic tasks on a target multi-core processing system includes the steps of: inputting a template for a specified task, the template including DMA-related parameters specifying DMA operations and computational operations to be performed; evaluating performance for the specified task by running a benchmark on the target multi-core processing system, the benchmark being operative to generate data access patterns using DMA operations and invoking prescribed computation routines as specified by the input template; and providing results of the benchmark indicative of a measure of performance of the specified task corresponding to the target multi-core processing system. | 06-04-2009 |
| 20090144738 | Performance Evaluation of Algorithmic Tasks and Dynamic Parameterization on Multi-Core Processing Systems - Apparatus for evaluating the performance of DMA-based algorithmic tasks on a target multi-core processing system includes a memory and at least one processor coupled to the memory. The processor is operative: to input a template for a specified task, the template including DMA-related parameters specifying DMA operations and computational operations to be performed; to evaluate performance for the specified task by running a benchmark on the target multi-core processing system, the benchmark being operative to generate data access patterns using DMA operations and invoking prescribed computation routines as specified by the input template; and to provide results of the benchmark indicative of a measure of performance of the specified task corresponding to the target multi-core processing system. | 06-04-2009 |
| 20090144744 | Performance Evaluation of Algorithmic Tasks and Dynamic Parameterization on Multi-Core Processing Systems - A method for evaluating performance of DMA-based algorithmic tasks on a target multi-core processing system includes the steps of: inputting a template for a specified task, the template including DMA-related parameters specifying DMA operations and computational operations to be performed; evaluating performance for the specified task by running a benchmark on the target multi-core processing system, the benchmark being operative to generate data access patterns using DMA operations and invoking prescribed computation routines as specified by the input template; and providing results of the benchmark indicative of a measure of performance of the specified task corresponding to the target multi-core processing system. | 06-04-2009 |
| 20090144745 | Performance Evaluation of Algorithmic Tasks and Dynamic Parameterization on Multi-Core Processing Systems - Apparatus for evaluating the performance of DMA-based algorithmic tasks on a target multi-core processing system includes a memory and at least one processor coupled to the memory. The processor is operative: to input a template for a specified task, the template including DMA-related parameters specifying DMA operations and computational operations to be performed; to evaluate performance for the specified task by running a benchmark on the target multi-core processing system, the benchmark being operative to generate data access patterns using DMA operations and invoking prescribed computation routines as specified by the input template; and to provide results of the benchmark indicative of a measure of performance of the specified task corresponding to the target multi-core processing system. | 06-04-2009 |
| Patent application number | Description | Published |
| 20090024830 | Executing Multiple Instructions Multiple Data ('MIMD') Programs on a Single Instruction Multiple Data ('SIMD') Machine - Executing Multiple Instructions Multiple Data (‘MIMD’) programs on a Single Instruction Multiple Data (‘SIMD’) machine, the SIMD machine including a plurality of compute nodes, each compute node capable of executing only a single thread of execution, the compute nodes initially configured exclusively for SIMD operations, the SIMD machine further comprising a data communications network, the network comprising synchronous data communications links among the compute nodes, including establishing a SIMD partition comprising a plurality of the compute nodes; booting the SIMD partition in MIMD mode; executing by launcher programs a plurality of MIMD programs on compute nodes in the SIMD partition; and re-executing a launcher program by an operating system on a compute node in the SIMD partition upon termination of the MIMD program executed by the launcher program. | 01-22-2009 |
| 20090024831 | Executing Multiple Instructions Multiple Data ('MIMD') Programs on a Single Instruction Multiple Data ('SIMD') Machine - Executing MIMD programs on a SIMD machine, including establishing on the SIMD machine a plurality of SIMD partitions; booting a first SIMD partition in MIMD mode; executing, on a compute node of the first SIMD partition booted in MIMD mode, a MIMD accelerator program; executing a SIMD program in a second SIMD partition, one instance of the SIMD program executing on each compute node of the second SIMD partition, each instance of the SIMD program carrying out a portion of the data processing effected by the SIMD program; and accelerating, by an instance of the SIMD program through the MIMD accelerator program, a portion of the data processing of the instance of the SIMD program. | 01-22-2009 |