Patent application number | Description | Published |
20090327937 | Methods and apparatus for analyzing SIMD code - A method for analyzing and presenting in a graphical manner single instruction, multiple data (SIMD) instructions involves disassembling a stream of machine instructions into a stream of assembly language instructions. Instruction objects “M” and “N” are created to represent SIMD instructions “M” and “N” from the stream of instructions. Instruction objects “M” and “N” include multiple data objects corresponding to the multiple data items of the respective SIMD instruction. Different colors are assigned to at least two of the multiple data objects of instruction object “M.” If a data item of SIMD instruction “N” is based on a data item of SIMD instruction “M,” the color from the source object is automatically assigned to the target object. Dependencies between data items of instruction “M” and “N” are annotated by arrows between corresponding data objects. Other embodiments are described and claimed. | 12-31-2009 |
20100153934 | Prefetch for systems with heterogeneous architectures - A compiler for a heterogeneous system that includes both one or more primary processors and one or more parallel co-processors is presented. For at least one embodiment, the primary processors(s) include a CPU and the parallel co-processor(s) include a GPU. Source code for the heterogeneous system may include code to be performed on the CPU but also code segments, referred to as “foreign macro-instructions”, that are to be performed on the GPU. An optimizing compiler for the heterogeneous system comprehends the architecture of both processors, and generates an optimized fat binary that includes machine code instructions for both the primary processor(s) and the co-processor(s). The optimizing compiler compiles the foreign macro-instructions as if they were predefined functions of the CPU, rather than as remote procedure calls. The binary is the result of compiler optimization techniques, and includes prefetch instructions to load code and/or data into the GPU memory concurrently with execution of other instructions on the CPU. Other embodiments are described and claimed. | 06-17-2010 |
20110145798 | DEBUGGING MECHANISMS IN A CACHE-BASED MEMORY ISOLATION SYSTEM - Debugging software in systems with architecturally significant processor caches. A method may be practiced in a computing environment. The method includes acts for debugging a software application, wherein the software application is configured to use one or more architecturally significant processor caches coupled to a processor. The method includes beginning execution of the software application. A debugger is run while executing the software application. The software application causes at least one of reads or writes to be made to the cache in an architecturally significant fashion. The reads or writes made to the cache in an architecturally significant fashion are preserved while performing debugging operations that would ordinarily disturb the reads or writes made to the cache in an architecturally significant fashion. | 06-16-2011 |
20110153983 | Gathering and Scattering Multiple Data Elements - According to a first aspect, efficient data transfer operations can be achieved by: decoding by a processor device, a single instruction specifying a transfer operation for a plurality of data elements between a first storage location and a second storage location; issuing the single instruction for execution by an execution unit in the processor; detecting an occurrence of an exception during execution of the single instruction; and in response to the exception, delivering pending traps or interrupts to an exception handler prior to delivering the exception. | 06-23-2011 |
20110154303 | Endian Conversion Tool - In one embodiment of the invention code (e.g., compiler, tool) may generate information so a first code portion, which includes a pointer value in a first endian format (e.g., big endian), can be properly initialized and executed on a platform having a second endian format (e.g., little endian). Also, various embodiments of the invention may identify problematic regions of code (e.g., source code) where a particular byte order is cast away through void pointers. | 06-23-2011 |
20110197182 | DEBUGGING PARALLEL SOFTWARE USING SPECULATIVELY EXECUTED CODE SEQUENCES IN A MULTIPLE CORE ENVIRONMENT - Methods and apparatus relating to debugging parallel software using speculatively executed code sequences in a multiple core environment are described. In an embodiment, occurrence of a speculative code debug event is detected and a speculative code execution debug module is executed in response to occurrence of the event. Other embodiments are also disclosed and claimed. | 08-11-2011 |
20120030518 | LAST BRANCH RECORD INDICATORS FOR TRANSACTIONAL MEMORY - In one embodiment, a processor includes an execution unit and at least one last branch record (LBR) register to store address information of a branch taken during program execution. This register may further store a transaction indicator to indicate whether the branch was taken during a transactional memory (TM) transaction. This register may further store an abort indicator to indicate whether the branch was caused by a transaction abort. Other embodiments are described and claimed. | 02-02-2012 |
20130179668 | LAST BRANCH RECORD INDICATORS FOR TRANSACTIONAL MEMORY - In one embodiment, a processor includes an execution unit and at least one last branch record (LBR) register to store address information of a branch taken during program execution. This register may further store a transaction indicator to indicate whether the branch was taken during a transactional memory (TM) transaction. This register may further store an abort indicator to indicate whether the branch was caused by a transaction abort. Other embodiments are described and claimed. | 07-11-2013 |
20130263093 | OPTIONAL LOGGING OF DEBUG ACTIVITIES IN A REAL TIME INSTRUCTION TRACING LOG - In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for implementing optional logging of debug activities in a real time instruction tracing log. For example, in one embodiment, such means may include an integrated circuit having means for initiating instruction tracing for instructions of a traced application, mode, or code region, as the instructions are executed by the integrated circuit; means for generating a plurality of packets to a debug log describing the instruction tracing; means for initiating an alternative mode of execution within the integrated circuit; and means for suppressing indication of entering the alternative mode of execution. Additional and alternative means may be implemented for selectively causing an integrated circuit to operate in accordance with an invisible trace mode or a visible trace mode upon transition to the alternative mode of execution. | 10-03-2013 |
20140258695 | Last Branch Record Indicators For Transactional Memory - In one embodiment, a processor includes an execution unit and at least one last branch record (LBR) register to store address information of a branch taken during program execution. This register may further store a transaction indicator to indicate whether the branch was taken during a transactional memory (TM) transaction. This register may further store an abort indicator to indicate whether the branch was caused by a transaction abort. Other embodiments are described and claimed. | 09-11-2014 |
20140344552 | PROVIDING STATUS OF A PROCESSING DEVICE WITH PERIODIC SYNCHRONIZATION POINT IN INSTRUCTION TRACING SYSTEM - In accordance with embodiments disclosed herein, there is provided systems and methods for providing status of a processing device with a periodic synchronization point in an instruction tracing system. For example, the method may include generating a boundary packet based on a unique byte pattern in a packet log. The boundary packet provides a starting point for packet decode. The method may also include generating a plurality of state packets based on status information of the processor. The plurality of state packets follows the boundary packet when outputted into the packet log. | 11-20-2014 |
20140344553 | Gathering and Scattering Multiple Data Elements - According to a first aspect, efficient data transfer operations can be achieved by: decoding by a processor device, a single instruction specifying a transfer operation for a plurality of data elements between a first storage location and a second storage location; issuing the single instruction for execution by an execution unit in the processor; detecting an occurrence of an exception during execution of the single instruction; and in response to the exception, delivering pending traps or interrupts to an exception handler prior to delivering the exception. | 11-20-2014 |
20140372987 | Processor That Records Tracing Data In Non Contiguous System Memory Slices - A method is described that involves referring to first information from a directory table in system memory. The first information includes location information and size information of a first slice of system memory where first tracing data is to be stored. The method also includes tracking the amount of tracing data stored in the first slice of system memory and comparing the amount against the size information. The method also includes, before the first slice of system memory is filled, referring to second information from the directory table in system memory, where, the second information includes location information and size information of a second slice of system memory where second tracing data is to be stored. The first slice is not contiguous with the second slice of system memory. | 12-18-2014 |