Patent application number | Description | Published |
20090089537 | APPARATUS AND METHOD FOR MEMORY ADDRESS TRANSLATION ACROSS MULTIPLE NODES - A method for translating memory addresses in a plurality of nodes that includes receiving a first memory access request initiated by a processor of a first node of the plurality of nodes, wherein the first memory access request comprises a process virtual address and a first memory operation, translating the process virtual address to a global system address, wherein the global system address corresponds to a physical memory location on a second node of the plurality of nodes, translating the global system address to an identifier corresponding to the second node, and sending a first message requesting the first memory operation to the second node based on the identifier, wherein the second node performs the first memory operation on the physical memory location. | 04-02-2009 |
20090089767 | METHOD AND SYSTEM FOR IMPLEMENTING A JUST-IN-TIME COMPILER - A method for implementing a just-in-time compiler involves obtaining high-level code templates in a high-level programming language, where the high-level programming language is designed for compilation to an intermediate language capable of execution by a virtual machine, and where each high-level code template represents an instruction in the intermediate language. The method further involves compiling the high-level code templates to native code to obtain optimized native code templates, where compiling the high-level code templates is performed, prior to runtime, using an optimizing static compiler designed for runtime use with the virtual machine. The method further involves implementing the just-in-time compiler using the optimized native code templates, where the just-in-time compiler is configured to substitute an optimized native code template when a corresponding instruction in the intermediate language is encountered at runtime. | 04-02-2009 |
20090313612 | METHOD AND APPARATUS FOR ENREGISTERING MEMORY LOCATIONS - One embodiment of the present invention provides a system that improves program performance by enregistering memory locations. During operation, the system receives program object code which has been generated for a given hardware implementation, and hence is optimized to use a specified number of registers that are available in that hardware implementation. Next, the system translates this object code to execute on a second hardware implementation which includes more registers than the first hardware implementation. The system makes use of these additional registers to improve the performance of the translated object code for the second hardware implementation. More specifically, the system identifies a memory access in the object code, where the memory access is associated with a memory location. The system then rewrites an instruction associated with this memory access to access the available register instead of the memory location. To preserve program semantics, the system subsequently moderates accesses to the memory location to ensure that no threads access a stale value in the enregistered memory location. | 12-17-2009 |
20090327374 | METHOD AND APPARATUS FOR PERFORMING CONCURRENT GARBAGE COLLECTION - One embodiment of the present invention provides a system that facilitates performing concurrent garbage collection. Note that the system uses hardware-supported GC barriers. During operation, the system executes a first mutator thread. While executing the first mutator thread, the system performs a garbage-collection operation using a garbage-collector thread. Performing the garbage-collection operation involves: discovering a live object in a from-space, which is being collected; creating a copy of the live object in a to-space, to which live objects are copied during garbage collection; and replacing the live object in the from-space with a forwarding pointer which points to a location of the copy of the live object in the to-space. Note that in some embodiments, the system marks cache lines comprising the live object in from-space as “forwarded,” which prevents any mutator threads from touching the cache lines. Additionally, in some embodiments, the system determines if the first mutator thread holds any additional references to the from-space. If so, the system leaves the first mutator thread marked as “dirty,” wherein dirty is the initial state for mutator threads. If not, the system marks the first mutator thread as “clean.” | 12-31-2009 |
20090327666 | METHOD AND SYSTEM FOR HARDWARE-BASED SECURITY OF OBJECT REFERENCES - A method for managing data, including obtaining a first instruction for moving a first data item from a first source to a first destination, determining a data type of the first data item, determining a data type supported by the first destination, comparing the data type of the first data item with the data type supported by the first destination to test a validity of the first instruction, and moving the first data item from the first source to the first destination based on the validity of the first instruction. | 12-31-2009 |
20100042980 | CROSS-DOMAIN INLINING IN A SYSTEM VIRTUAL MACHINE - A system and method are provided for inlining across protection domain boundaries with a system virtual machine. A protection domain comprises a unique combination of a privilege level and a memory address space. The system virtual machine interprets or dynamically compiles not only application code executing under guest operating systems, but also the guest operating systems. For a program call that crosses a protection domain boundary, the virtual machine assembles an intermediate representation (IR) graph that spans the boundary. Region nodes corresponding to code on both sides of the call are enhanced with information identifying the applicable protection domains. The IR is optimized and used to generate instructions in a native ISA (Instruction Set Architecture) of the virtual machine. Individual instructions reveal the protection domain in which they are to operate, and instructions corresponding to different domains may be interleaved. | 02-18-2010 |
20100042983 | CROSS-ISA INLINING IN A SYSTEM VIRTUAL MACHINE - A system and method are provided for inlining a program call between processes executing under separate ISAs (Instruction Set Architectures) within a system virtual machine. The system virtual machine hosts any number of virtual operating system instances, each of which may execute any number of applications. The system virtual machine interprets or dynamically compiles not only application code executing under virtual operating systems, but also the virtual operating systems. For a program call that crosses ISA boundaries, the virtual machine assembles an intermediate representation (IR) graph that spans the boundary. Region nodes corresponding to code on both sides of the call are enhanced with information identifying the virtual ISA of the code. The IR is optimized and used to generate instructions in a native ISA (Instruction Set Architecture) of the virtual machine. Individual instructions are configured and executed (or emulated) to perform as they would within the virtual ISA. | 02-18-2010 |
20100153662 | FACILITATING GATED STORES WITHOUT DATA BYPASS - One embodiment of the present invention provides a system that facilitates precise exception semantics for a virtual machine. During operation, the system executes a program in the virtual machine using a processor that includes a gated store buffer that stores values to be written to a memory. This gated store buffer is configured to delay a store to the memory until after a speculatively-optimized region of the program commits. The processor signals an exception when it detects that a load following the store is attempting to access the same memory region being written by the store prior to the commitment of the speculatively-optimized region. | 06-17-2010 |
20100153690 | USING REGISTER RENAME MAPS TO FACILITATE PRECISE EXCEPTION SEMANTICS - One embodiment of the present invention provides a system that facilitates precise exception semantics. The system includes a processor that uses register rename maps to support out-of-order execution, where the register rename maps track mappings between native architectural registers and physical registers for a program executing on the processor. These register rename maps include: 1) a working rename map that maps architectural registers associated with a decoded instruction to corresponding physical registers; 2) a retire rename map that tracks and preserves a set of physical registers that are associated with retired instructions; and 3) a checkpoint rename map that stores a mapping between a set of architectural registers and a set of physical registers for a preceding checkpoint in the program. When the program signals an exception, the processor uses the checkpoint rename map to roll back program execution to the preceding checkpoint. | 06-17-2010 |
20100153776 | USING SAFEPOINTS TO PROVIDE PRECISE EXCEPTION SEMANTICS FOR A VIRTUAL MACHINE - One embodiment of the present invention provides a system that provides precise exception semantics for a virtual machine. During operation, the system receives a program comprised of instructions that are specified in a machine instruction set architecture of the virtual machine, and translates these instructions into native instructions for the processor that the virtual machine is executing upon. While performing this translation, the system inserts one or more safepoints into the translated native instructions. The system then executes these native instructions on the processor. During execution, if the system detects that an exception was signaled by a native instruction, the system reverts the virtual machine to a previous safepoint to ensure that the virtual machine will precisely emulate the exception behavior of the virtual machine's instruction set architecture. The system uses a gated store buffer to ensure that any stores that occurred after the previous safepoint are discarded when reverting the virtual machine to the previous safepoint. | 06-17-2010 |
20100205344 | UNIFIED CACHE STRUCTURE THAT FACILITATES ACCESSING TRANSLATION TABLE ENTRIES - One embodiment provides a system that includes a processor with a unified cache structure that facilitates accessing translation table entries (TTEs). This unified cache structure can simultaneously store program instructions, program data, and TTEs. During a memory access, the system receives a virtual memory address. The system then uses this virtual memory address to identify one or more cache lines in the unified cache structure which are associated with the virtual memory address. Next, the system compares a tag portion of the virtual memory address with the tags for the identified cache line(s) to identify a cache line that matches the virtual memory address. The system then loads a translation table entry that corresponds to the virtual memory address from the identified cache line. | 08-12-2010 |
20100228936 | ACCESSING MEMORY LOCATIONS FOR PAGED MEMORY OBJECTS IN AN OBJECT-ADDRESSED MEMORY SYSTEM - One embodiment of the present invention provides a system that accesses memory locations in an object-addressed memory system. During a memory access in the object-addressed memory system, the system receives an object identifier and an address. The system then uses the object identifier to identify a paged memory object associated with the memory access. Next, the system uses the address and a page table associated with the paged memory object to identify a memory page associated with the memory access. After determining the memory page, the system uses the address to access a memory location in the memory page. | 09-09-2010 |
20100250870 | METHOD AND APPARATUS FOR TRACKING ENREGISTERED MEMORY LOCATIONS - One embodiment of the present invention provides a system that tracks enregistered memory locations. During operation, the system receives program object code that enregisters a memory location (e.g., a set of data at a given memory address). Next, the system executes this program object code using a thread. After enregistering the memory location, the system tracks the associated memory address and a thread identifier for the thread in a table that identifies enregistered memory locations. The system checks this table during memory accesses to ensure that other threads attempting to access an enregistered memory location receive a current value for the enregistered memory location. | 09-30-2010 |
20100333090 | METHOD AND APPARATUS FOR PROTECTING TRANSLATED CODE IN A VIRTUAL MACHINE - One embodiment provides a system that protects translated guest program code in a virtual machine that supports self-modifying program code. While executing a guest program in the virtual machine, the system uses a guest shadow page table associated with the guest program and the virtual machine to map a virtual memory page for the guest program to a physical memory page on the host computing device. The system then uses a dynamic compiler to translate guest program code in the virtual memory page into translated guest program code (e.g., native program instructions for the computing device). During compilation, the dynamic compiler stores in a compiler shadow page table and the guest shadow page table information that tracks whether the guest program code in the virtual memory page has been translated. The compiler subsequently uses the information stored in the guest shadow page table to detect attempts to modify the contents of the virtual memory page. Upon detecting such an attempt, the system invalidates the translated guest program code associated with the virtual memory page. | 12-30-2010 |
20130073883 | Dynamic Power Optimization For Computing Devices - In the various aspects, virtualization techniques may be used to reduce the amount of power consumed by execution of applications by power-optimizing the code prior to execution. A dynamic binary translator operating at the machine layer may use a power consumption model to identify code segments that can benefit from optimization and to perform an instruction-sequence to instruction-sequence translation of object code to generate power-optimized object code. Execution hardware may be instrumented with additional circuitry to measure the power consumption characteristics of executing code. The power consumption models may be updated and object code may be regenerated based on the measured power consumption characteristics of previously executed code. In an aspect, power optimization may be accomplished when the computing device is connected to a battery charger. | 03-21-2013 |
20130080805 | DYNAMIC PARTITIONING FOR HETEROGENEOUS CORES - In the various aspects, a virtual machine operating at the machine layer may use power consumption models to partition object code into portions, identify the relative power efficiencies of the mobile device processors for the various code portions, and route the code portions to the mobile device processors that can perform the operations using the least amount of energy. A dynamic binary translator process may translate the object code portions into an instruction set language supported by the hardware component identified as being preferred. The code portions may be executed and the amount of power consumed may be measured, with the measurements used to generate and/or update performance and power consumption models. | 03-28-2013 |
20130198495 | Method and Apparatus For Register Spill Minimization - The aspects enable a computing device to allocate memory space to variables during runtime compilation of a software application. A compiler may be modified to identify operations that can be performed on either a main pipe or an alternative pipe, identify chains of related operations that can be performed on either the main pipe or the alternative pipe, identify points in the execution of code at which the number of live values will exceed the number of registers, and choose a chain of operations as a candidate to be moved to the alternative pipe in order to reduce the number of live values at identified points in the execution of code. The entire chosen chain of operations may be moved to the alternative pipe. The alternative pipe may perform the computations and return the results to the main pipe for execution. | 08-01-2013 |
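
The selection step in application 20130198495 (find the points where live values outnumber registers, then pick a whole chain to move to the alternative pipe) can be illustrated with a small sketch. This is not the patented method: the abstract does not say how a chain is chosen, so the greedy "most relief at over-subscribed points" score below is an assumption, as are all names (`Value`, `live_counts`, `pressure_points`, `choose_chain`, `move_chain`) and the representation of liveness as inclusive point ranges.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Value:
    """A value with an inclusive live range over numbered program points.

    `chain` is a hypothetical label for a chain of related operations that
    could run on either pipe; None means the value must stay on the main pipe.
    """
    name: str
    start: int
    end: int
    chain: Optional[str] = None


def live_counts(values: List[Value], n_points: int) -> List[int]:
    """Count how many values are live at each program point."""
    counts = [0] * n_points
    for v in values:
        for p in range(v.start, v.end + 1):
            counts[p] += 1
    return counts


def pressure_points(values: List[Value], n_points: int, num_registers: int) -> List[int]:
    """Program points where live values exceed the available registers."""
    counts = live_counts(values, n_points)
    return [p for p, c in enumerate(counts) if c > num_registers]


def choose_chain(values: List[Value], n_points: int, num_registers: int) -> Optional[str]:
    """Pick the movable chain whose values cover the most pressure points.

    Assumed heuristic: score = number of (value, pressure point) overlaps.
    """
    hot = set(pressure_points(values, n_points, num_registers))
    relief = {}
    for v in values:
        if v.chain is None:
            continue
        overlap = sum(1 for p in range(v.start, v.end + 1) if p in hot)
        if overlap:
            relief[v.chain] = relief.get(v.chain, 0) + overlap
    if not relief:
        return None
    # Iterate in sorted order so ties resolve to the alphabetically first chain.
    return max(sorted(relief), key=relief.get)


def move_chain(values: List[Value], chain: str) -> List[Value]:
    """Move the entire chain off the main pipe; return what remains on it."""
    return [v for v in values if v.chain != chain]


# Made-up example: 3 registers, 6 program points, two movable chains X and Y.
values = [
    Value("a", 0, 5),
    Value("b", 1, 4),
    Value("c", 2, 3, chain="X"),
    Value("d", 2, 4, chain="X"),
    Value("e", 3, 3, chain="Y"),
]
chosen = choose_chain(values, n_points=6, num_registers=3)
remaining = move_chain(values, chosen)
```

In this made-up example, points 2 and 3 are over-subscribed, chain `X` overlaps them most, and moving the whole chain leaves no point above the register limit. A real compiler would also have to weigh the cost of returning results from the alternative pipe to the main pipe, which this sketch ignores.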