Class / Patent application number | Description | Number of patent applications / Date published |
712012000 | Cube or hypercube | 11 |
20080209163 | DATA PROCESSING SYSTEM WITH BACKPLANE AND PROCESSOR BOOKS CONFIGURABLE TO SUPPORT BOTH TECHNICAL AND COMMERCIAL WORKLOADS - A processor book designed to support both commercial workloads and technical workloads based on a dynamic or static mechanism of reconfiguring the external wiring interconnect. The processor book is configured as a building block for commercial workload processing systems with external connector buses (ECBs). The processor book is also provided with routing logic to enable the ECBs to be utilized for either book-to-book routing or routing within the same processor book. A table-specific wiring scheme is provided for coupling the ECBs running off the chips of one MCM to the chips of the second MCM on the processor book so that the chips of the first MCM are connected directly to the chips of a second MCM that is logically furthest away and vice versa. Once the wiring of the ECBs is completed according to the wiring scheme, the operational and functional characteristics reflect those of a processor book configured for technical workloads. | 08-28-2008 |
20090006808 | ULTRASCALABLE PETAFLOP PARALLEL SUPERCOMPUTER - A novel massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. Novel use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node. | 01-01-2009 |
20090024829 | MIXED TORUS AND HYPERCUBE MULTI-RANK TENSOR EXPANSION METHOD - The present invention provides a mixed torus and hypercube multi-rank tensor expansion method which can be applied to the communication subsystem of a parallel processing system. The said expansion method is based on the conventional torus and hypercube topologies. A mixed torus and hypercube multi-rank tensor expansion interconnection network is built up by means of supernodes equipped with expansion interfaces. This method not only provides more bisection bandwidth to the entire system but also improves the long-range communication and global operations. Affirmatively, this expansion method can achieve better scalability and flexibility for the parallel system for a given system size. | 01-22-2009 |
20100023728 | METHOD AND SYSTEM FOR IN-PLACE MULTI-DIMENSIONAL TRANSPOSE FOR MULTI-CORE PROCESSORS WITH SOFTWARE-MANAGED MEMORY HIERARCHY - A method and system for transposing a multi-dimensional array for a multi-processor system having a main memory for storing the multi-dimensional array and a local memory is provided. One implementation involves partitioning the multi-dimensional array into a number of equally sized portions in the local memory, in each processor performing a transpose function including a logical transpose on one of said portions and then a physical transpose of said portion, and combining the transposed portions and storing back in their original place in the main memory. | 01-28-2010 |
20110131391 | Integrated Circuit with Stacked Computational Units and Configurable through Vias - A technique for manufacturing a three-dimensional integrated circuit includes stacking a memory unit on a first die that includes a first computational unit. In this case, the memory unit is included in a second die. A second computational unit that is included in a third die is stacked on the second die. Sets of vertical vias that extend through the first, second, and third dies are connected to connect components of the first and second computational units and the memory unit. Multiplexers of the first and second computational units are configured to selectively couple the components to different ones of the sets of vertical vias responsive to respective control words for each of the first and third dies. | 06-02-2011 |
20110219208 | MULTI-PETASCALE HIGHLY EFFICIENT PARALLEL SUPERCOMPUTER - A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application by application basis, and preferably, enabling adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that optimally maximizes the throughput of packet communications between nodes and minimizes latency. | 09-08-2011 |
20120272040 | Enhanced Modularity in Heterogeneous 3D Stacks - A computer program product for generating and implementing a three-dimensional (3D) computer processing chip stack plan. The computer readable program code includes computer readable program code configured for receiving system requirements from a plurality of clients, identifying common processing structures and technologies from the system requirements, and assigning the common processing structures and technologies to at least one layer in the 3D computer processing chip stack plan. The computer readable program code is also configured for identifying uncommon processing structures and technologies from the system requirements and assigning the uncommon processing structures and technologies to a host layer in the 3D computer processing chip stack plan. The computer readable program code is further configured for determining placement and wiring of the uncommon structures on the host layer, storing placement information in the plan, and transmitting the plan to manufacturing equipment. The manufacturing equipment forms the 3D computer processing chip stack. | 10-25-2012 |
20130232319 | INFORMATION PROCESSING SYSTEM, ROUTING METHOD AND PROGRAM - A disclosed information processing system includes 2 | 09-05-2013 |
20130283005 | 3-D STACKED MULTIPROCESSOR STRUCTURES AND METHODS FOR MULTIMODAL OPERATION OF SAME - Three-dimensional (3-D) processor devices are provided, which are constructed by connecting processors in a stacked configuration. For instance, a semiconductor device includes a first processor chip comprising one or more processors, a second processor chip comprising one or more processors, and a plurality of input/output ports. The first and second processor chips are connected in a stacked configuration and commonly share the plurality of input/output ports. Methods are also provided to selectively operate the semiconductor device in one of a plurality of operating modes to control power of the semiconductor device. | 10-24-2013 |
20130283006 | 3-D STACKED MULTIPROCESSOR STRUCTURES AND METHODS FOR MULTIMODAL OPERATION OF SAME - Three-dimensional (3-D) processor structures are provided which are constructed by connecting processors in a stacked configuration. For example, a processor system includes a first processor chip comprising a first processor, and a second processor chip comprising a second processor. The first and second processor chips are connected in a stacked configuration with the first and second processors connected through vertical connections between the first and second processor chips. The processor system further includes a mode control circuit to selectively configure the first and second processors of the first and second processor chips to operate in one of a plurality of operating modes, wherein the processors can be selectively configured to operate independently, to aggregate resources, to share resources, and/or be combined to form a single processor image. | 10-24-2013 |
20150039856 | Efficient Complex Multiplication and Fast Fourier Transform (FFT) Implementation on the ManArray Architecture - Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described. | 02-05-2015 |
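
Several abstracts in this class refer to hypercube and torus interconnects without spelling out how nodes are addressed. The following is a minimal illustrative sketch of the textbook neighbor rules for the two topologies, not the method of any application listed above. The function names `hypercube_neighbors` and `torus_neighbors` are hypothetical and chosen only for this example.

```python
from typing import List, Tuple


def hypercube_neighbors(node: int, dims: int) -> List[int]:
    """Return the neighbors of `node` in a `dims`-dimensional binary hypercube.

    Nodes are labeled 0 .. 2**dims - 1; two nodes are adjacent when their
    labels differ in exactly one bit.
    """
    if not 0 <= node < (1 << dims):
        raise ValueError("node label out of range for this hypercube")
    # Flip each of the `dims` address bits in turn.
    return [node ^ (1 << d) for d in range(dims)]


def torus_neighbors(coord: Tuple[int, ...],
                    shape: Tuple[int, ...]) -> List[Tuple[int, ...]]:
    """Return the neighbors of `coord` in a torus with the given `shape`.

    Each axis wraps around, so a node has one neighbor in each direction
    along every axis (duplicates are dropped when an axis has size 1 or 2).
    """
    if len(coord) != len(shape):
        raise ValueError("coord and shape must have the same length")
    neighbors: List[Tuple[int, ...]] = []
    for axis, size in enumerate(shape):
        for step in (-1, 1):
            nxt = list(coord)
            nxt[axis] = (coord[axis] + step) % size  # wraparound link
            candidate = tuple(nxt)
            if candidate != coord and candidate not in neighbors:
                neighbors.append(candidate)
    return neighbors


if __name__ == "__main__":
    print(hypercube_neighbors(5, 3))
    print(torus_neighbors((0, 0), (4, 4)))
```

In a hypercube the node degree equals the number of dimensions, whereas in a torus it is at most twice the number of axes regardless of how large each axis is. Read loosely, that trade-off is the background for the mixed torus and hypercube expansion entry above.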