Patent application number | Description | Published |
20090119556 | METHOD AND APPARATUS FOR TESTING LOGIC CIRCUIT DESIGNS - Disclosed is a logic testing system that includes a decompressor and a tester in communication with the decompressor. The tester is configured to store a seed and locations of scan inputs and is further configured to transmit the seed and the locations of scan inputs to the decompressor. The decompressor is configured to generate a test pattern from the seed and the locations of scan inputs. The decompressor includes a first test pattern generator, a second test pattern generator, and a selector configured to select the test pattern generated by the first test pattern generator or the test pattern generated by the second test pattern generator using the locations of scan inputs. | 05-07-2009 |
20090119563 | METHOD AND APPARATUS FOR TESTING LOGIC CIRCUIT DESIGNS - Disclosed is a logic testing system that includes a decompressor and a tester in communication with the decompressor. The tester is configured to store a seed and locations of scan inputs and is further configured to transmit the seed and the locations of scan inputs to the decompressor. The decompressor is configured to generate a test pattern from the seed and the locations of scan inputs. The decompressor includes a first test pattern generator, a second test pattern generator, and a selector configured to select the test pattern generated by the first test pattern generator or the test pattern generated by the second test pattern generator using the locations of scan inputs. | 05-07-2009 |
20090210762 | Method for Blocking Unknown Values in Output Response of Scan Test Patterns for Testing Circuits - A method includes compressing control patterns describing values required at the control signals of blocking logic gates, by linear feedback shift register LFSR reseeding; bypassing blocking logic gates for some groups of scan chains that do not capture unknown values in output response of scan test patterns for testing circuits; and reducing numbers of specified bits in densely specified ones of the control patterns for further reducing the size of a seed of the LFSR. | 08-20-2009 |
20090304268 | System and Method for Parallelizing and Accelerating Learning Machine Training and Classification Using a Massively Parallel Accelerator - A method system for training an apparatus to recognize a pattern includes providing the apparatus with a host processor executing steps of a machine learning process; providing the apparatus with an accelerator including at least two processors; inputting training pattern data into the host processor; determining coefficient changes in the machine learning process with the host processor using the training pattern data; transferring the training data to the accelerator; determining kernel dot-products with the at least two processors of the accelerator using the training data; and transferring the dot-products back to the host processor. | 12-10-2009 |
20100088490 | METHODS AND SYSTEMS FOR MANAGING COMPUTATIONS ON A HYBRID COMPUTING PLATFORM INCLUDING A PARALLEL ACCELERATOR - In accordance with exemplary implementations, application computation operations and communications between operations on a host processing platform may be adapted to conform to the memory capacity of a parallel accelerator. Computation operations may be split and scheduled such that the computation operations fit within the memory capacity of the accelerator. Further, the operations may be automatically adapted without any modification to the code of an application. In addition, data transfers between a host processing platform and the parallel accelerator may be minimized in accordance with exemplary aspects of the present principles to improve processing performance. | 04-08-2010 |
20120188263 | Method and system to dynamically bind and unbind applications on a general purpose graphics processing unit - A system for dynamically binding and unbinding of graphics processing unit GPU applications, the system includes a memory management for tracking memory of a GPU used by an application, and a source-to-source compiler for identifying nested structures allocated on the GPU so that the virtual memory management can track these nested structures, and identifying all instances where nested structures on the GPU are modified inside kernels. | 07-26-2012 |
20120192198 | Method and System for Memory Aware Runtime to Support Multitenancy in Heterogeneous Clusters - The invention solves the problem of sharing many-core devices (e.g. GPUs) among concurrent applications running on heterogeneous clusters. In particular, the invention provides transparent mapping of applications to many-core devices (that is, the user does not need to be aware of the many-core devices present in the cluster and of their utilization), time-sharing of many-core devices among applications also in the presence of conflicting memory requirements, and dynamic binding/binding of applications to/from many-core devices (that is, applications do not need to be statically mapped to the same many-core device for their whole life-time). | 07-26-2012 |
20130091507 | OPTIMIZING DATA WAREHOUSING APPLICATIONS FOR GPUS USING DYNAMIC STREAM SCHEDULING AND DISPATCH OF FUSED AND SPLIT KERNELS - Systems and methods for managing a processor and one or more co-processors for a database application whose queries have been processed into an intermediate form (IR) containing kernels of the database application that have been fused and split; dynamically scheduling such kernels on CUDA streams and further dynamically dispatching kernels to GPU devices by estimating execution time in order to achieve high performance. | 04-11-2013 |
20130097593 | Computer-Guided Holistic Optimization of MapReduce Applications - A method for compiler-guided optimization of MapReduce type applications that includes applying transformations and optimizations to Java bytecode of an original application by an instrumenter which carries out static analysis to determine application properties depending on the optimization being performed and provides an output of optimized Java bytecode, and executing the application and analyzing generated trace and feeds information back into the instrumenter by a trace analyzer, the trace analyzer and instrumenter invoking each other iteratively and exchanging information through files. | 04-18-2013 |
20130191612 | INTERFERENCE-DRIVEN RESOURCE MANAGEMENT FOR GPU-BASED HETEROGENEOUS CLUSTERS - Systems and methods are disclosed that share coprocessor resources between two or more applications in a computing cluster using a job selector to receive jobs from a job queue; a node selector coupled to the job selector; an off line profiler with an interference prediction model; a coprocessor dynamic interference detection module; and a coprocessor interference response module. | 07-25-2013 |
20130298130 | AUTOMATIC PIPELINING FRAMEWORK FOR HETEROGENEOUS PARALLEL COMPUTING SYSTEMS - Systems and methods for automatic generation of software pipelines for heterogeneous parallel systems (AHP) include pipelining a program with one or more tasks on a parallel computing platform with one or more processing units and partitioning the program into pipeline stages, wherein each pipeline stage contains one or more tasks. The one or more tasks in the pipeline stages are scheduled onto the one or more processing units, and execution times of the one or more tasks in the pipeline stages are estimated. The above steps are repeated until a specified termination criterion is reached. | 11-07-2013 |
20140047422 | COMPILER-GUIDED SOFTWARE ACCELERATOR FOR ITERATIVE HADOOP JOBS - Various methods are provided directed to a compiler-guided software accelerator for iterative HADOOP jobs. A method includes identifying intermediate data, generated by an iterative HADOOP application, below a predetermined threshold size and used less than a predetermined threshold time period. The intermediate data is stored in a memory device. The method further includes minimizing input, output, and synchronization overhead for the intermediate data by selectively using at any given time any one of a Message Passing Interface and Distributed File System as a communication layer. The Message Passing Interface is co-located with the HADOOP Distributed File System. | 02-13-2014 |
20140208072 | USER-LEVEL MANAGER TO HANDLE MULTI-PROCESSING ON MANY-CORE COPROCESSOR-BASED SYSTEMS - A method is disclosed to manage a multi-processor system with one or more multiple-core coprocessors by intercepting coprocessor offload infrastructure application program interface (API) calls; scheduling user processes to run on one of the coprocessors; scheduling offloads within user processes to run on one of the coprocessors; and affinitizing offloads to predetermined cores within one of the coprocessors by selecting and allocating cores to an offload, and obtaining a thread-to-core mapping from a user. | 07-24-2014 |
20140208327 | METHOD FOR SIMULTANEOUS SCHEDULING OF PROCESSES AND OFFLOADING COMPUTATION ON MANY-CORE COPROCESSORS - A method is disclosed to manage a multi-processor system with one or more manycore devices, by managing real-time bag-of-tasks applications for a cluster, wherein each task runs on a single server node, and uses the offload programming model, and wherein each task has a deadline and three specific resource requirements: total processing time, a certain number of manycore devices and peak memory on each device; when a new task arrives, querying each node scheduler to determine which node can best accept the task and each node scheduler responds with an estimated completion time and a confidence level, wherein the node schedulers use an urgency-based heuristic to schedule each task and its offloads; responding to an accept/reject query phase, wherein the cluster scheduler send the task requirements to each node and queries if the node can accept the task with an estimated completion time and confidence level; and scheduling tasks and offloads using a aging and urgency-based heuristic, wherein the aging guarantees fairness, and the urgency prioritizes tasks and offloads so that maximal deadlines are met. | 07-24-2014 |
20140208331 | METHODS OF PROCESSING CORE SELECTION FOR APPLICATIONS ON MANYCORE PROCESSORS - A runtime method is disclosed that dynamically sets up core containers and thread-to-core affinity for processes running on manycore coprocessors. The method is completely transparent to user applications and incurs low runtime overhead. The method is implemented within a user-space middleware that also performs scheduling and resource management for both offload and native applications using the manycore coprocessors. | 07-24-2014 |