Patent application number | Description | Published |
20090089537 | APPARATUS AND METHOD FOR MEMORY ADDRESS TRANSLATION ACROSS MULTIPLE NODES - A method for translating memory addresses in a plurality of nodes, that includes receiving a first memory access request initiated by a processor of a first node of the plurality of nodes, wherein the first memory access request comprises a process virtual address and a first memory operation, translating the process virtual address to a global system address, wherein the global system address corresponds to a physical memory location on a second node of the plurality of nodes, translating the global system address to an identifier corresponding to the second node, and sending a first message requesting the first memory operation to the second node based on the identifier, wherein the second node performs the first memory operation on the physical memory location. | 04-02-2009 |
20090089790 | METHOD AND SYSTEM FOR COORDINATING HYPERVISOR SCHEDULING - A method for executing an application on a plurality of nodes, that includes synchronizing a first clock of a first node of the plurality of nodes and a second clock of a second node of the plurality of nodes, configuring a first hypervisor on the first node to execute a first application domain and a first privileged domain, wherein configuring the hypervisor comprises allocating a first number of cycles of the first clock to the first privileged domain, configuring a second hypervisor on the second node to execute a second application domain and a second privileged domain, wherein configuring the second hypervisor comprises allocating the first number of cycles of the first clock to the second privileged domain, and executing the application in the first application domain and the second application domain, wherein the first application domain and the second application domain execute semi-synchronously and the first privileged domain and the second privileged domain execute semi-synchronously. | 04-02-2009 |
20090216811 | DYNAMIC COMPOSITION OF AN EXECUTION ENVIRONMENT FROM MULTIPLE IMMUTABLE FILE SYSTEM IMAGES - A virtual file system is formed that is configured to enable the dynamic composition of immutable file system images. A file system containing a software distribution is divided into a plurality of mutually exclusive sub-trees. Each sub-tree includes a portion of the software distribution. An immutable file system image is formed for each sub-tree. During the booting of an operating system, a virtualization engine intercedes in the boot process to mount the immutable file system images to independent directories of the root file system. Upon request, the virtualization engine, during run-time, combines virtual entries corresponding to immutable file system images so as to resemble the original software distribution. | 08-27-2009 |
20090216990 | DYNAMIC TRANSACTIONAL INSTANTIATION OF SYSTEM CONFIGURATION USING A VIRTUAL FILE SYSTEM LAYER - A virtual configuration system, comprising a virtualization engine and a configuration engine, for the dynamic instantiation of configuration files is disclosed. A mechanism is disclosed that allows for transactional updates to a repository of configuration settings comprising multiple files. Configuration entries are stored in a first memory location and a copy of the entries is stored in a second memory location. A virtual configuration file that includes a virtual configuration for each entry is created and used to provide the operating system with path and location information regarding the configuration entries. Simultaneously and during run-time of the computer, the configuration entries stored in the second memory location can be modified. Once the modifications are complete, a second virtual configuration file is created referencing the configuration entries stored at the second memory location. The first virtual configuration file is thereafter atomically replaced by the second virtual configuration file. | 08-27-2009 |
20090217262 | PLUGGABLE EXTENSIONS TO VIRTUAL MACHINE MONITORS - The functionality of a virtualization layer interposed between computer system hardware and a plurality of applications can be altered by pluggable extensions. According to one embodiment of the present invention, a virtualization layer is divided into a privileged portion and an unprivileged portion. While the privileged portion remains untouched, the functionality of the unprivileged portion can be modified by one or more pluggable extensions. Furthermore, file images operating on top of the virtualization layer, and in some cases unaware of the virtual nature of the virtualization layer, can be supplemented using pluggable extensions. | 08-27-2009 |
20090313612 | METHOD AND APPARATUS FOR ENREGISTERING MEMORY LOCATIONS - One embodiment of the present invention provides a system that improves program performance by enregistering memory locations. During operation, the system receives program object code which has been generated for a given hardware implementation, and hence is optimized to use a specified number of registers that are available in that hardware implementation. Next, the system translates this object code to execute on a second hardware implementation which includes more registers than the first hardware implementation. The system makes use of these additional registers to improve the performance of the translated object code for the second hardware implementation. More specifically, the system identifies a memory access in the object code, where the memory access is associated with a memory location. The system then rewrites an instruction associated with this memory access to access the available register instead of the memory location. To preserve program semantics, the system subsequently moderates accesses to the memory location to ensure that no threads access a stale value in the enregistered memory location. | 12-17-2009 |
20100042980 | CROSS-DOMAIN INLINING IN A SYSTEM VIRTUAL MACHINE - A system and method are provided for inlining across protection domain boundaries with a system virtual machine. A protection domain comprises a unique combination of a privilege level and a memory address space. The system virtual machine interprets or dynamically compiles not only application code executing under guest operating systems, but also the guest operating systems. For a program call that crosses a protection domain boundary, the virtual machine assembles an intermediate representation (IR) graph that spans the boundary. Region nodes corresponding to code on both sides of the call are enhanced with information identifying the applicable protection domains. The IR is optimized and used to generate instructions in a native ISA (Instruction Set Architecture) of the virtual machine. Individual instructions reveal the protection domain in which they are to operate, and instructions corresponding to different domains may be interleaved. | 02-18-2010 |
20100042983 | CROSS-ISA INLINING IN A SYSTEM VIRTUAL MACHINE - A system and method are provided for inlining a program call between processes executing under separate ISAs (Instruction Set Architectures) within a system virtual machine. The system virtual machine hosts any number of virtual operating system instances, each of which may execute any number of applications. The system virtual machine interprets or dynamically compiles not only application code executing under virtual operating systems, but also the virtual operating systems. For a program call that crosses ISA boundaries, the virtual machine assembles an intermediate representation (IR) graph that spans the boundary. Region nodes corresponding to code on both sides of the call are enhanced with information identifying the virtual ISA of the code. The IR is optimized and used to generate instructions in a native ISA (Instruction Set Architecture) of the virtual machine. Individual instructions are configured and executed (or emulated) to perform as they would within the virtual ISA. | 02-18-2010 |
20100153662 | FACILITATING GATED STORES WITHOUT DATA BYPASS - One embodiment of the present invention provides a system that facilitates precise exception semantics for a virtual machine. During operation, the system executes a program in the virtual machine using a processor that includes a gated store buffer that stores values to be written to a memory. This gated store buffer is configured to delay a store to the memory until after a speculatively-optimized region of the program commits. The processor signals an exception when it detects that a load following the store is attempting to access the same memory region being written by the store prior to the commitment of the speculatively-optimized region. | 06-17-2010 |
20100153690 | USING REGISTER RENAME MAPS TO FACILITATE PRECISE EXCEPTION SEMANTICS - One embodiment of the present invention provides a system that facilitates precise exception semantics. The system includes a processor that uses register rename maps to support out-of-order execution, where the register rename maps track mappings between native architectural registers and physical registers for a program executing on the processor. These register rename maps include: 1) a working rename map that maps architectural registers associated with a decoded instruction to corresponding physical registers; 2) a retire rename map that tracks and preserves a set of physical registers that are associated with retired instructions; and 3) a checkpoint rename map that stores a mapping between a set of architectural registers and a set of physical registers for a preceding checkpoint in the program. When the program signals an exception, the processor uses the checkpoint rename map to roll back program execution to the preceding checkpoint. | 06-17-2010 |
20100153776 | USING SAFEPOINTS TO PROVIDE PRECISE EXCEPTION SEMANTICS FOR A VIRTUAL MACHINE - One embodiment of the present invention provides a system that provides precise exception semantics for a virtual machine. During operation, the system receives a program comprised of instructions that are specified in a machine instruction set architecture of the virtual machine, and translates these instructions into native instructions for the processor that the virtual machine is executing upon. While performing this translation, the system inserts one or more safepoints into the translated native instructions. The system then executes these native instructions on the processor. During execution, if the system detects that an exception was signaled by a native instruction, the system reverts the virtual machine to a previous safepoint to ensure that the virtual machine will precisely emulate the exception behavior of the virtual machine's instruction set architecture. The system uses a gated store buffer to ensure that any stores that occurred after the previous safepoint are discarded when reverting the virtual machine to the previous safepoint. | 06-17-2010 |
20100161950 | SEMI-ABSOLUTE BRANCH INSTRUCTIONS FOR EFFICIENT COMPUTERS - Apparatus and methods are disclosed for a computation processor that can execute a semi-absolute branch instruction, as well as methods of operation and of generating the semi-absolute branch instruction. | 06-24-2010 |
20100228936 | ACCESSING MEMORY LOCATIONS FOR PAGED MEMORY OBJECTS IN AN OBJECT-ADDRESSED MEMORY SYSTEM - One embodiment of the present invention provides a system that accesses memory locations in an object-addressed memory system. During a memory access in the object-addressed memory system, the system receives an object identifier and an address. The system then uses the object identifier to identify a paged memory object associated with the memory access. Next, the system uses the address and a page table associated with the paged memory object to identify a memory page associated with the memory access. After determining the memory page, the system uses the address to access a memory location in the memory page. | 09-09-2010 |
20100235615 | METHOD AND SYSTEM FOR DISCOVERY OF A ROOT FILE SYSTEM - A method for discovery of a root file system that includes obtaining a tag corresponding to a boot image for an operating system, identifying, by a boot loader, a location of the boot image having a predefined value matching the tag, loading a kernel of the operating system retrieved from the boot image, and transferring execution to the kernel, wherein the boot loader provides the tag for the location to the kernel. The method further includes identifying, by the kernel, the location of the root file system based on the tag provided by the boot loader, and executing the operating system on a processor using the root file system identified by the kernel. | 09-16-2010 |
20100235813 | METHOD AND SYSTEM FOR CONFIGURING SOFTWARE MODULES TO EXECUTE IN AN EXECUTION ENVIRONMENT - A method for configuring software modules that includes accessing a properties repository that includes a plurality of properties of the execution environment of the computer system. The method further includes generating a configuration file for each software module. Generating a configuration file includes obtaining a generator module defined for the software module, and executing the generator module to instantiate the configuration file for the software module. The generator module is configured to identify a property required for the configuration file, obtain the value for the property from the properties repository, and store the value for the property in the configuration file in accordance with a customized format required by the software module. The method further includes storing the configuration file for each of the software modules. | 09-16-2010 |
20100250870 | METHOD AND APPARATUS FOR TRACKING ENREGISTERED MEMORY LOCATIONS - One embodiment of the present invention provides a system that tracks enregistered memory locations. During operation, the system receives program object code that enregisters a memory location (e.g., a set of data at a given memory address). Next, the system executes this program object code using a thread. After enregistering the memory location, the system tracks the associated memory address and a thread identifier for the thread in a table that identifies enregistered memory locations. The system checks this table during memory accesses to ensure that other threads attempting to access an enregistered memory location receive a current value for the enregistered memory location. | 09-30-2010 |
20100333090 | METHOD AND APPARATUS FOR PROTECTING TRANSLATED CODE IN A VIRTUAL MACHINE - One embodiment provides a system that protects translated guest program code in a virtual machine that supports self-modifying program code. While executing a guest program in the virtual machine, the system uses a guest shadow page table associated with the guest program and the virtual machine to map a virtual memory page for the guest program to a physical memory page on the host computing device. The system then uses a dynamic compiler to translate guest program code in the virtual memory page into translated guest program code (e.g., native program instructions for the computing device). During compilation, the dynamic compiler stores in a compiler shadow page table and the guest shadow page table information that tracks whether the guest program code in the virtual memory page has been translated. The compiler subsequently uses the information stored in the guest shadow page table to detect attempts to modify the contents of the virtual memory page. Upon detecting such an attempt, the system invalidates the translated guest program code associated with the virtual memory page. | 12-30-2010 |
20120180050 | METHOD AND SYSTEM FOR COORDINATING HYPERVISOR SCHEDULING - A method for executing an application on multiple nodes includes synchronizing a first clock of a first node and a second clock of a second node, configuring a first hypervisor on the first node to execute a first application domain and a first privileged domain, and configuring a second hypervisor on the second node to execute a second application domain and a second privileged domain. Configuring the hypervisor includes allocating a first number of cycles of the first clock to the first privileged domain. Configuring the second hypervisor includes allocating the first number of cycles of the first clock to the second privileged domain. The method further includes executing the application in the first application domain and the second application domain. The first application domain and the second application domain execute semi-synchronously and the first privileged domain and the second privileged domain execute semi-synchronously. | 07-12-2012 |
20130047077 | CONCURRENT PARSING AND PROCESSING OF SERIAL LANGUAGES - The aspects enable a processor to concurrently execute a first serial language code embedding a second serial language code during a page load by a browser. A parser parses the first serial language code until a segment of the embedded second serial language code is encountered. The segment of embedded second serial language code is extracted for execution by an execution engine, which proceeds concurrently with speculative parsing of the first serial language code. Code generated by execution of second serial language code is evaluated to determine if it is well-formed, and partial rollback and re-parsing of the first serial language code is performed if the code is not well-formed. Concurrent parsing of first serial language code and execution of second language code, with partial roll back and reparsing when necessary, continues until the first language code has been parsed and the second serial language code has been executed. | 02-21-2013 |
20130073883 | Dynamic Power Optimization For Computing Devices - In the various aspects, virtualization techniques may be used to reduce the amount of power consumed by execution of applications by power-optimizing the code prior to execution. A dynamic binary translator operating at the machine layer may use a power consumption model to identify code segments that can benefit from optimization and to perform an instruction-sequence to instruction-sequence translation of object code to generate power-optimized object code. Execution hardware may be instrumented with additional circuitry to measure the power consumption characteristics of executing code. The power consumption models may be updated and object code may be regenerated based on the measured power consumption characteristics of previously executed code. In an aspect, power optimization may be accomplished when the computing device is connected to a battery charger. | 03-21-2013 |
20130080805 | DYNAMIC PARTITIONING FOR HETEROGENEOUS CORES - In the various aspects, a virtual machine operating at the machine layer may use power consumption models to partition object code into portions, identify the relative power efficiencies of the mobile device processors for the various code portions, and route the code portions to the mobile device processors that can perform the operations using the least amount of energy. A dynamic binary translator process may translate the object code portions into an instruction set language supported by the hardware component identified as being preferred. The code portions may be executed and the amount of power consumed may be measured, with the measurements used to generate and/or update performance and power consumption models. | 03-28-2013 |
20130198495 | Method and Apparatus For Register Spill Minimization - The aspects enable a computing device to allocate memory space to variables during runtime compilation of a software application. A compiler may be modified to identify operations that can be performed on either a main pipe or an alternative pipe, identify chains of related operations that can be performed on either the main pipe or the alternative pipe, identify points in the execution of code at which the number of live values will exceed the number of registers, and choose a chain of operations as a candidate to be moved to the alternative pipe in order to reduce the number of live values at identified points in the execution of code. The entire chosen chain of operations may be moved to the alternative pipe. The alternative pipe may perform the computations and return the results to the main pipe for execution. | 08-01-2013 |
20130198728 | METHOD AND APPARATUS FOR AVOIDING REGISTER INTERFERENCE - The aspects enable a computing device to allocate memory space to variables during runtime compilation of a software application. A first variable associated with a code segment within code being compiled may be identified and assigned a priority tag. A second variable associated with another code segment within the code being compiled may also be assigned a priority tag. A determination may be made regarding whether the first and second variables are contemporaneously live during execution, and whether legal storage location sets for the first and second variables overlap. The assigned priority tags may be used for assigning storage locations to the first and second variables based on the determination. | 08-01-2013 |
20150046661 | Dynamic Address Negotiation for Shared Memory Regions in Heterogeneous Multiprocessor Systems - Mobile computing devices may be configured to compile and execute portions of a general purpose software application in an auxiliary processor (e.g., a DSP) of a multiprocessor system by reading and writing information to a shared memory. A first process (P1) on the applications processor may request address negotiation with a second process (P2) on the auxiliary processor, obtain a first address map from a first operating system, and send the first address map to the auxiliary processor. The second process (P2) may receive the first address map, obtain a second address map from a second operating system, identify matching addresses in the first and second address maps, store the matching addresses as common virtual addresses, and send the common virtual addresses back to the applications processor. The first and second processes (i.e., P1 and P2) may each use the common virtual addresses to map physical pages to the memory. | 02-12-2015 |
20150046679 | Energy-Efficient Run-Time Offloading of Dynamically Generated Code in Heterogeneous Multiprocessor Systems - Mobile computing devices may be configured to intelligently select, compile, and execute portions of a general purpose software application in an auxiliary processor (e.g., a DSP) of a multiprocessor system. A processor of the mobile device may be configured to determine whether portions of a software application are suitable for execution in an auxiliary processor, monitor operating conditions of the system, determine a historical context based on the monitoring, and determine whether the portions that were determined to be suitable for execution in an auxiliary processor should be compiled for execution in the auxiliary processor based on the historical context. The processor may also be configured to continue monitoring the system, update the historical context information, and determine whether code previously compiled for execution on the auxiliary processor should be invoked or executed in the auxiliary processor based on the updated historical context information. | 02-12-2015 |
20150046912 | Method for Controlling Inlining in a Code Generator - The various aspects leverage the novel observation that the number of call sites in code is directly correlated with the code's compile time and provide methods implemented by a compiler operating on a computing device (e.g., a smartphone) for performing inline throttling based on a projected number of call sites in the code that would exist after performing inline expansion. The various aspects enable the compiler to improve the performance of the generated code by aggressive inlining while carefully managing increases in compile time, thereby decreasing the power required to compile the code while increasing performance of the computing device. Thus, by inlining enough call sites to reduce the costs of handling calls while accounting for the costs of inlining, the various aspects provide for an effective balance of short compile times and effective code performance. | 02-12-2015 |
20150052331 | Efficient Directed Acyclic Graph Pattern Matching To Enable Code Partitioning and Execution On Heterogeneous Processor Cores - Methods, devices, and systems for automatically determining how an application program may be partitioned and offloaded for execution by a general purpose applications processor and an auxiliary processor (e.g., a DSP, GPU, etc.) within a mobile device. The mobile device may determine the portions of the application code that are best suited for execution on the auxiliary processor based on pattern-matching of directed acyclic graphs (DAGs). In particular, the mobile device may identify one or more patterns in the code, particularly in a data flow graph of the code, comparing each identified code pattern to predefined graph patterns known to have a certain benefit when executed on the auxiliary processor (e.g., a DSP). The mobile device may determine the costs and/or benefits of executing the portions of code on the auxiliary processor, and may offload portions that have low costs and/or high benefits related to the auxiliary processor. | 02-19-2015 |
20150089484 | Fast, Combined Forwards-Backwards Pass Global Optimization Framework for Dynamic Compilers - The various aspects provide a dynamic compilation framework that includes a machine-independent optimization module operating on a computing device and methods for optimizing code with the machine-independent optimization module using a single, combined-forwards-backwards pass of the code. In the various aspects, the machine-independent optimization module may generate a graph of nodes from the IR, optimize nodes in the graph using forwards and backwards optimizations, and propagate the forwards and backwards optimizations to nodes in a bounded subgraph recognized or defined based on the position of the node currently being optimized. In the various aspects, the machine-independent optimization module may optimize the graph by performing forwards and/or backwards optimizations during a single pass through the graph, thereby achieving an effective degree of optimization and shorter overall compile times. Thus, the various aspects may provide a global optimization framework for dynamic compilers that is faster and more efficient than existing solutions. | 03-26-2015 |
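
The lookup sequence described in the entry for application 20100228936 above (object identifier, then per-object page table, then location within the page) can be sketched roughly as follows. This is only an illustrative sketch and not the patented implementation: the class names, the 4096-byte page size, the split of the address into page number and offset, and the allocate-on-store behavior are all assumptions made for the example, since the abstract does not specify them.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size; the abstract does not give one


@dataclass
class PagedObject:
    """Hypothetical paged memory object with its own page table.

    The page table maps a page number to the bytes backing that page.
    """
    page_table: dict = field(default_factory=dict)


class ObjectAddressedMemory:
    """Hypothetical object-addressed memory: accesses carry (object_id, address)."""

    def __init__(self):
        self.objects = {}  # object identifier -> PagedObject

    def _locate(self, object_id, address, allocate):
        # Step 1: use the object identifier to find the paged memory object.
        obj = self.objects.get(object_id)
        if obj is None:
            if not allocate:
                raise KeyError(f"unknown object {object_id}")
            obj = self.objects[object_id] = PagedObject()

        # Step 2: use the address and the object's page table to find the page.
        # Splitting the address with divmod is an assumption of this sketch.
        page_number, offset = divmod(address, PAGE_SIZE)
        page = obj.page_table.get(page_number)
        if page is None:
            if not allocate:
                raise KeyError(f"page {page_number} of object {object_id} is unmapped")
            page = obj.page_table[page_number] = bytearray(PAGE_SIZE)

        # Step 3: the remaining address bits select the location in the page.
        return page, offset

    def store(self, object_id, address, value):
        page, offset = self._locate(object_id, address, allocate=True)
        page[offset] = value

    def load(self, object_id, address):
        page, offset = self._locate(object_id, address, allocate=False)
        return page[offset]


mem = ObjectAddressedMemory()
mem.store(7, 5000, 0xAB)   # object 7, address 5000 falls in page 1 at offset 904
value = mem.load(7, 5000)
```

Keeping a separate page table per object means two objects can use the same address without colliding, which is the property the (object identifier, address) pair is meant to provide in this sketch.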