Patent application number | Description | Published |
20130300752 | SYSTEM AND METHOD FOR COMPILER SUPPORT FOR KERNEL LAUNCHES IN DEVICE CODE - A system and method for compiling source code (e.g., with a compiler). The method includes accessing a portion of device source code and determining whether the portion of the device source code comprises a piece of work to be launched on a device from the device. The method further includes determining a plurality of application programming interface (API) calls based on the piece of work to be launched on the device and generating compiled code based on the plurality of API calls. The compiled code comprises a first portion operable to execute on a central processing unit (CPU) and a second portion operable to execute on the device (e.g., GPU). | 11-14-2013 |
20130304996 | METHOD AND SYSTEM FOR RUN TIME DETECTION OF SHARED MEMORY DATA ACCESS HAZARDS - A system and method for detecting shared memory hazards are disclosed. The method includes, for a unit of hardware operating on a block of threads, mapping a plurality of shared memory locations assigned to the unit to a tracking table. The tracking table comprises an initialization bit as well as access type information, collectively called the state tracking bits for each shared memory location. The method also includes, for an instruction of a program within a barrier region, identifying a second access to a location in shared memory within a block of threads executed by the hardware unit. The second access is identified based on a status of the state tracking bits. The method also includes determining a hazard based on a first type of access and a second type of access to the shared memory location. Information related to the first access is provided in the table. | 11-14-2013 |
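The state-tracking scheme described above can be illustrated with a minimal Python sketch. This is a hypothetical simplification, not the patented implementation; the names `TrackingTable` and `access`, and the use of a dictionary in place of per-location state tracking bits, are all assumptions made for illustration.

```python
# Hypothetical sketch of per-location state tracking for hazard detection.
# Each shared-memory location gets an "initialized" bit plus the type of its
# first access; a later conflicting access within the same barrier region is
# reported as a hazard.

READ, WRITE = "read", "write"

class TrackingTable:
    def __init__(self):
        self.state = {}  # location -> (first_access_type, thread_id)

    def access(self, location, access_type, thread_id):
        """Record an access; return a hazard description or None."""
        if location not in self.state:          # initialization bit clear
            self.state[location] = (access_type, thread_id)
            return None
        first_type, first_thread = self.state[location]
        # Two reads never conflict; any pair involving a write does.
        if first_type == READ and access_type == READ:
            return None
        return (f"{first_type}-{access_type} hazard at {location}: "
                f"thread {first_thread} then thread {thread_id}")

    def barrier(self):
        """Clear tracking state at a synchronization barrier."""
        self.state.clear()

table = TrackingTable()
assert table.access(0x10, WRITE, thread_id=0) is None   # first access
hazard = table.access(0x10, READ, thread_id=1)          # write-read hazard
assert "write-read hazard" in hazard
table.barrier()
assert table.access(0x10, READ, thread_id=2) is None    # state cleared
```

Clearing the table at each barrier mirrors the abstract's restriction of hazard detection to accesses within a single barrier region.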
20130305233 | METHOD AND SYSTEM FOR SEPARATE COMPILATION OF DEVICE CODE EMBEDDED IN HOST CODE - Embodiments of the present invention provide a novel solution that supports the separate compilation of host code and device code used within a heterogeneous programming environment. Embodiments of the present invention are operable to link device code embedded within multiple host object files using a separate device linking operation. Embodiments of the present invention may extract device code from the respective host object files and then link it together to form linked device code. This linked device code may then be embedded back into a host object generated by embodiments of the present invention, which may then be passed to a host linker to form a host executable file. As such, device code may be split into multiple files and then linked together to form a final executable file by embodiments of the present invention. | 11-14-2013 |
20130305234 | METHOD AND SYSTEM FOR MULTIPLE EMBEDDED DEVICE LINKS IN A HOST EXECUTABLE - Embodiments of the present invention provide a novel solution to generate multiple linked device code portions within a final executable file. Embodiments of the present invention are operable to extract device code from the respective host object filesets and then link them together to form multiple linked device code portions. Also, using the identification process described by embodiments of the present invention, device code embedded within host objects may also be uniquely identified and linked in accordance with the protocols of conventional programming languages. Furthermore, these multiple linked device code portions may then be converted into distinct executable forms of code that may be encapsulated within a single executable file. | 11-14-2013 |
20150149987 | METHOD AND APPARATUS FOR COMPILER PROCESSING FOR A FUNCTION MARKED WITH MULTIPLE EXECUTION SPACES - A method for processing a function with a plurality of execution spaces is disclosed. The method comprises creating an internal compiler representation for the function. Creating the internal compiler representation comprises copying substantially all lexical tokens corresponding to a body of the function. Further, the creating comprises inserting the lexical tokens into a plurality of conditional if-statements, wherein a conditional if-statement is generated for each corresponding execution space of said plurality of execution spaces, and wherein each conditional if-statement determines which execution space the function is executing in. During compilation, the method finally comprises performing overload resolution at a call site of an overloaded function by checking for compatibility with a first execution space specified by one of the plurality of conditional if-statements, wherein the overloaded function is called within the body of the function. | 05-28-2015 |
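The body-duplication step described above can be sketched at the source level. The following Python fragment is purely illustrative; `EXECUTION_SPACE` and `expand_execution_spaces` are hypothetical names, and real compilers would operate on lexical tokens in an internal representation rather than on text.

```python
# Hypothetical sketch: copy a function body under one conditional per
# execution space, so overload resolution can later be performed
# separately within each conditional's space.

def expand_execution_spaces(body_lines, spaces):
    """Return source in which the body is duplicated into one
    if-statement per execution space."""
    out = []
    for space in spaces:
        out.append(f'if EXECUTION_SPACE == "{space}":')
        out.extend("    " + line for line in body_lines)
    return "\n".join(out)

src = expand_execution_spaces(["x = helper(1)", "return x"],
                              ["host", "device"])
assert src.splitlines()[0] == 'if EXECUTION_SPACE == "host":'
assert src.count("x = helper(1)") == 2   # body duplicated per space
```

Each copy of the call to `helper` can then be resolved against the overload set that is compatible with that copy's execution space, which is the essence of the per-space overload resolution the abstract describes.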
20120254875 | Method for Transforming a Multithreaded Program for General Execution - A technique is disclosed for executing a program designed for multi-threaded operation on a general purpose processor. Original source code for the program is transformed from a multi-threaded structure into a computationally equivalent single-threaded structure. A transform operation modifies the original source code to insert code constructs for serial thread execution. The transform operation also replaces synchronization barrier constructs in the original source code with synchronization barrier code that is configured to facilitate serialization. The transformed source code may then be conventionally compiled and advantageously executed on the general purpose processor. | 10-04-2012 |
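The barrier-preserving serialization described above can be sketched as follows. This is a hypothetical simplification: each thread's work is pre-split into phases at its barrier points, and the single-threaded runner (`run_serialized`, an illustrative name) executes every thread's current phase before any thread's next phase, reproducing the ordering a barrier would enforce.

```python
# Hypothetical sketch of barrier-preserving serialization: each thread's
# body is split into phases at its synchronization barriers, then the
# phases run in lockstep on a single thread, preserving the observable
# ordering across the barrier.

def run_serialized(thread_phases):
    """thread_phases: one list of callables per thread, where phase
    boundaries correspond to synchronization barriers."""
    num_phases = max(len(p) for p in thread_phases)
    for phase in range(num_phases):          # emulated barrier boundary
        for phases in thread_phases:
            if phase < len(phases):
                phases[phase]()

log = []
t0 = [lambda: log.append("t0.a"), lambda: log.append("t0.b")]
t1 = [lambda: log.append("t1.a"), lambda: log.append("t1.b")]
run_serialized([t0, t1])
# All pre-barrier work completes before any post-barrier work:
assert log == ["t0.a", "t1.a", "t0.b", "t1.b"]
```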
20140129783 | SYSTEM AND METHOD FOR ALLOCATING MEMORY OF DIFFERING PROPERTIES TO SHARED DATA OBJECTS - A system and method for allocating shared memory of differing properties to shared data objects and a hybrid stack data structure. In one embodiment, the system includes: (1) a hybrid stack creator configured to create, in the shared memory, a hybrid stack data structure having a lower portion having a more favorable property and a higher portion having a less favorable property and (2) a data object allocator associated with the hybrid stack creator and configured to allocate storage for a shared data object in the lower portion if the lower portion has a sufficient remaining capacity to contain the shared data object and alternatively allocate storage for the shared data object in the higher portion if the lower portion has an insufficient remaining capacity to contain the shared data object. | 05-08-2014 |
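The allocation policy in the abstract above reduces to a simple fallback rule, sketched here in Python. The class name `HybridStack` and the capacity numbers are illustrative assumptions; the "lower" and "higher" portions stand in for memory with more and less favorable properties (for example, on-chip versus off-chip memory).

```python
# Hypothetical sketch of the hybrid-stack policy: allocate in the more
# favorable lower portion while it has room, otherwise spill to the
# less favorable higher portion.

class HybridStack:
    def __init__(self, lower_capacity, higher_capacity):
        self.lower_free = lower_capacity    # more favorable property
        self.higher_free = higher_capacity  # less favorable property

    def allocate(self, size):
        """Return which portion the shared data object was placed in."""
        if self.lower_free >= size:
            self.lower_free -= size
            return "lower"
        if self.higher_free >= size:
            self.higher_free -= size
            return "higher"
        raise MemoryError("hybrid stack exhausted")

stack = HybridStack(lower_capacity=64, higher_capacity=1024)
assert stack.allocate(48) == "lower"    # fits in the favorable portion
assert stack.allocate(48) == "higher"   # lower portion now too small
```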
20140129812 | SYSTEM AND METHOD FOR EXECUTING SEQUENTIAL CODE USING A GROUP OF THREADS AND SINGLE-INSTRUCTION, MULTIPLE-THREAD PROCESSOR INCORPORATING THE SAME - A system and method for executing sequential code in the context of a single-instruction, multiple-thread (SIMT) processor. In one embodiment, the system includes: (1) a pipeline control unit operable to create a group of counterpart threads of the sequential code, one of the counterpart threads being a master thread, remaining ones of the counterpart threads being slave threads and (2) lanes operable to: (2 | 05-08-2014 |
20140130021 | SYSTEM AND METHOD FOR TRANSLATING PROGRAM FUNCTIONS FOR CORRECT HANDLING OF LOCAL-SCOPE VARIABLES AND COMPUTING SYSTEM INCORPORATING THE SAME - A system and method of translating functions of a program. In one embodiment, the system includes: (1) a local-scope variable identifier operable to identify local-scope variables employed in the at least some of the functions as being either thread-shared local-scope variables or thread-private local-scope variables and (2) a function translator associated with the local-scope variable identifier and operable to translate the at least some of the functions to cause thread-shared memory to be employed to store the thread-shared local-scope variables and thread-private memory to be employed to store the thread-private local-scope variables. | 05-08-2014 |
20140130052 | SYSTEM AND METHOD FOR COMPILING OR RUNTIME EXECUTING A FORK-JOIN DATA PARALLEL PROGRAM WITH FUNCTION CALLS ON A SINGLE-INSTRUCTION-MULTIPLE-THREAD PROCESSOR - A system and method for compiling or runtime executing a fork-join data parallel program with function calls. In one embodiment, the system includes: (1) a partitioner operable to partition groups into a master group and at least one worker group and (2) a thread designator associated with the partitioner and operable to designate only one thread from the master group for execution and all threads in the at least one worker group for execution. | 05-08-2014 |
20150143347 | SOFTWARE DEVELOPMENT ENVIRONMENT AND METHOD OF COMPILING INTEGRATED SOURCE CODE - A software development environment (SDE) and a method of compiling integrated source code. One embodiment of the SDE includes: (1) a parser configured to partition an integrated source code into a host code partition and a device code partition, the host code partition including a reference to a device variable, (2) a translator configured to: (2a) embed device machine code, compiled based on the device code partition, into a modified host code, (2b) define a pointer in the modified host code configured to be initialized, upon execution of the integrated source code, to a memory address allocated to the device variable, and (2c) replace the reference with a dereference to the pointer, and (3) a host compiler configured to employ a host library to compile the modified host code. | 05-21-2015 |
20090146915 | MULTIPLE VIEW DISPLAY DEVICE - A multiple view display device may employ an array of lenticular lenses to concurrently display a plurality of images at different viewing angles. The display device may include a plurality of pixels, one or more display drivers, and a lenticular lens array comprising a plurality of lenticular lenses. The pixels may be arranged in a matrix having rows and columns. A first display driver may display a first image using a first set of the columns, and a second display driver may display a second image using a second set of the columns. Each lenticular lens may be configured to direct, in a first direction, the light emitted by a first column of the first set of columns and to direct, in a second direction different than the first direction, the light emitted by a second column of the second set of columns. | 06-11-2009 |
20090147070 | PROVIDING PERSPECTIVE-DEPENDENT VIEWS TO VIDEO CONFERENCE PARTICIPANTS - During a video conference between a local endpoint and a remote endpoint, a display at the local endpoint may be configured to provide perspective-dependent views to local video conference participants. A local endpoint may receive a plurality of video streams and identify a first video stream that provides a first view of a remote participant and a second video stream that provides a second view of that participant taken concurrently from a different angle. A display at the local endpoint may display the first video stream at a first viewing angle that only allows the first view of the remote participant to be seen from a first region. The display may also concurrently display the second video stream at a second viewing angle that only allows the second view of the remote participant to be seen from a second region different than the first region. | 06-11-2009 |
20150127834 | OPTIMIZING PLACEMENT OF VIRTUAL MACHINES - Systems and methods are described for allocating resources in a cloud computing environment. The method includes receiving a computing request, the request for use of at least one virtual machine and a portion of memory. In response to the request, a plurality of hosts is identified and a cost function is formulated using at least a portion of those hosts. Based on the cost function, at least one host that is capable of hosting the virtual machine and memory is selected. | 05-07-2015 |
20100293123 | COMPLEX SITUATION ANALYSIS SYSTEM - A system for generating a representation of a situation is disclosed. The system comprises one or more computer-readable media including computer-executable instructions that are executable by one or more processors to implement a method of generating a representation of a situation. The method comprises receiving input data regarding a target population. The method further comprises constructing a synthetic data set including a synthetic population based on the input data. The synthetic population includes a plurality of synthetic entities. Each synthetic entity has a one-to-one correspondence with an entity in the target population. Each synthetic entity is assigned one or more attributes based on information included in the input data. The method further comprises receiving activity data for a plurality of entities in the target population. The method further comprises generating activity schedules for each synthetic entity in the synthetic population. Each synthetic entity is assigned at least one activity schedule based on the attributes assigned to the synthetic entity and information included in the activity data. An activity schedule describes the activities of the synthetic entity and includes a location associated with each activity. The method further comprises receiving additional data relevant to the situation being represented. The additional data is received from at least two distinct information sources. The method further comprises modifying the synthetic data set based on the additional data. Modifying the synthetic data set includes integrating at least a portion of the additional data received from each of the at least two distinct information sources into the synthetic data set based on one or more behavioral theories related to the synthetic population. The method further comprises generating a social contact network based on the synthetic data set. The social contact network is used to generate the representation of the situation. | 11-18-2010 |
20130191312 | COMPLEX SITUATION ANALYSIS SYSTEM - Systems, methods, and computer-readable media for generating a data set are provided. One method includes generating a data set based on input data using a plurality of brokers. The method further includes receiving a request from a user and determining whether the request can be fulfilled using data currently in the data set. When the request can be fulfilled using data currently in the data set, the data is accessed using broker(s) configured to provide access to data within the data set. When the request cannot be fulfilled using data currently in the data set, at least one new broker is spawned using existing broker(s) and additional data needed to fulfill the request is added to the data set using the new broker. The method further includes generating a response to the request using one or more of the plurality of brokers. | 07-25-2013 |
20140201119 | COMPLEX SITUATION ANALYSIS SYSTEM - Systems, methods, and computer-readable media for generating a data set are provided. One method includes generating a data set based on input data using a plurality of brokers. The method further includes receiving a request from a user and determining whether the request can be fulfilled using data currently in the data set. When the request can be fulfilled using data currently in the data set, the data is accessed using broker(s) configured to provide access to data within the data set. When the request cannot be fulfilled using data currently in the data set, at least one new broker is spawned using existing broker(s) and additional data needed to fulfill the request is added to the data set using the new broker. The method further includes generating a response to the request using one or more of the plurality of brokers. | 07-17-2014 |
20150178621 | COMPLEX SITUATION ANALYSIS SYSTEM - Systems, methods, and computer-readable media for generating a data set are provided. One method includes generating a data set based on input data using a plurality of brokers. The method further includes receiving a request from a user and determining whether the request can be fulfilled using data currently in the data set. When the request can be fulfilled using data currently in the data set, the data is accessed using broker(s) configured to provide access to data within the data set. When the request cannot be fulfilled using data currently in the data set, at least one new broker is spawned using existing broker(s) and additional data needed to fulfill the request is added to the data set using the new broker. The method further includes generating a response to the request using one or more of the plurality of brokers. | 06-25-2015 |
20140040196 | System and Method for Event-Based Synchronization of Remote and Local File Systems - A method for synchronizing a file system (FS) and a remote file system (RFS) includes monitoring the FS for FS events, generating FS event records, receiving RFS event records of RFS events, generating file system operations (FSOs) based on the FS and RFS event records, and communicating the FSOs to the FS and RFS to synchronize them. A method for generating the FSOs includes accessing a plurality of FS and/or RFS event records, processing the accessed records to generate processed event records, generating the FSOs based on the processed event records, and outputting the FSOs to cause synchronization of the FS and RFS. Systems are also described. The invention facilitates event-based, steady-state synchronization of local and remote file systems. | 02-06-2014 |
20140040197 | System and Method for Event-Based Synchronization of Remote and Local File Systems - A method for synchronizing a file system (FS) and a remote file system (RFS) includes monitoring the FS for FS events, generating FS event records, receiving RFS event records of RFS events, generating file system operations (FSOs) based on the FS and RFS event records, and communicating the FSOs to the FS and RFS to synchronize them. A method for generating the FSOs includes accessing a plurality of FS and/or RFS event records, processing the accessed records to generate processed event records, generating the FSOs based on the processed event records, and outputting the FSOs to cause synchronization of the FS and RFS. Systems are also described. The invention facilitates event-based, steady-state synchronization of local and remote file systems. | 02-06-2014 |
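The event-based synchronization flow in the two abstracts above can be sketched in a few lines of Python. This is a hypothetical simplification: the record fields (`time`, `side`, `action`, `path`) and the function name `generate_operations` are illustrative, and a real implementation would also process and coalesce event records before generating operations.

```python
# Hypothetical sketch of event-based synchronization: merge local (FS)
# and remote (RFS) event records in time order, then emit a file-system
# operation against the *other* side for each event.

def generate_operations(fs_events, rfs_events):
    """Return file-system operations that bring both sides into sync."""
    ops = []
    merged = sorted(fs_events + rfs_events, key=lambda e: e["time"])
    for event in merged:
        target = "RFS" if event["side"] == "FS" else "FS"
        ops.append((target, event["action"], event["path"]))
    return ops

ops = generate_operations(
    fs_events=[{"time": 1, "side": "FS", "action": "create", "path": "/a"}],
    rfs_events=[{"time": 2, "side": "RFS", "action": "delete", "path": "/b"}],
)
assert ops == [("RFS", "create", "/a"), ("FS", "delete", "/b")]
```

A local create is replayed remotely and a remote delete is replayed locally, which is the steady-state, event-driven behavior the abstracts describe.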
20140149461 | FLEXIBLE PERMISSION MANAGEMENT FRAMEWORK FOR CLOUD ATTACHED FILE SYSTEMS - A method of managing file permissions in a remote file storage system includes defining permissions for the remote file storage system and controlling access to objects on the remote file storage system according to the permissions of the remote file storage system. The permissions are transferred to a client file storage system remote from the remote file storage system, and access to objects on the client file storage system is controlled according to the permissions of the remote file storage system. A remote file storage system includes a permissions file generator operative to generate a permissions file, which is transmitted to a client file storage system for enforcement at the client file storage system. | 05-29-2014 |
20150058730 | GAME EVENT DISPLAY WITH A SCROLLABLE GRAPHICAL GAME PLAY FEED - A method is disclosed for receiving a plurality of play events associated with a sporting event, wherein each play event of the plurality of play events comprises a game clock time, a description, and identifies a sports team of a plurality of sports teams; for each play event in the plurality of play events: generating a graphical tile that is associated with the play event; configuring an appearance of the graphical tile based, at least in part, on the description and the sports team of the play event; causing to display a graphical tile list in a graphical user interface of a mobile computing device, wherein the graphical tile list includes one or more of the graphical tiles listed in a chronological order based on the game clock time in the play event associated with each graphical tile. | 02-26-2015 |
20150058780 | GAME EVENT DISPLAY WITH SCROLL BAR AND PLAY EVENT ICONS - A method is disclosed for receiving a plurality of play events associated with a sporting event, wherein each play event of the plurality of play events comprises a timestamp; for each of the play events, associating the play event with a sports team of a plurality of sports teams; for each of the play events, associating with the play event a particular icon from among a plurality of different icons based, at least in part, on the sports team that is associated with the play event; causing to display a bar in a graphical user interface of a mobile computing device, wherein the bar represents at least a portion of a time duration of the sporting event; for each of the play events, causing to display the particular icon at a position in the bar, wherein the position is based, at least in part, on the timestamp of the event. | 02-26-2015 |
20150058781 | PROVIDING GAME AND FACILITY INFORMATION TO IN-STADIUM SPECTATORS - Techniques for providing play-by-play game information and stadium facility load information to in-stadium spectator devices and for using the provided information at the spectator devices. | 02-26-2015 |
20080260119 | SYSTEMS, METHODS, AND COMPUTER PROGRAM PRODUCTS FOR PROVIDING SERVICE INTERACTION AND MEDIATION IN A COMMUNICATIONS NETWORK - Systems, methods, and computer program products for providing service interaction and mediation in a communications network are disclosed. According to one aspect, the subject matter described herein includes a system for providing service interaction and mediation in a communications network. The system includes a communications interface for receiving a client-to-SCIM message from a service client; and a service capability interaction manager (SCIM) module for providing service interaction between the service client and multiple application servers providing different types of services. Providing the service interaction includes receiving, from the communications interface, the client-to-SCIM service interaction message, and, in response to receiving the client-to-SCIM message, generating multiple SCIM-to-server messages and sending the SCIM-to-server messages to multiple application servers. Providing the service interaction also includes receiving multiple server-to-SCIM service interaction messages from at least some of the application servers that received the SCIM-to-server messages, and, in response to receiving the server-to-SCIM messages, generating a SCIM-to-client message containing an aggregation of at least a portion of data from at least some of the server-to-SCIM messages, and sending the SCIM-to-client message containing the aggregation to the service client via the communications interface. | 10-23-2008 |
20080285438 | METHODS, SYSTEMS, AND COMPUTER PROGRAM PRODUCTS FOR PROVIDING FAULT-TOLERANT SERVICE INTERACTION AND MEDIATION FUNCTION IN A COMMUNICATIONS NETWORK - Methods, systems, and computer program products for providing fault-tolerant service interaction and mediation function in a communications network are disclosed. According to one aspect, the subject matter described herein includes a method for providing fault-tolerant service interaction and mediation capability. The method includes providing an active instance of a service capability interaction manager (SCIM) function for providing service interaction and mediation between entities that request network services and entities that provide network services in a communications network. The method also includes providing a standby instance of the SCIM function. The active instance of the SCIM function performs service interaction and mediation between the entities that request network services and the entities that provide network services. In response to failure of the active SCIM function, the standby instance of the SCIM function takes over the service interaction and mediation previously performed by the active instance of the SCIM function. | 11-20-2008 |
20080311917 | Methods, systems, and computer program products for identifying a serving home subscriber server (HSS) in a communications network - Methods, systems, and computer program products for determining a serving home subscriber server (HSS) in a communications network are described. One method includes obtaining a subscriber identifier from a query message. An exceptions-based data structure contained in a database is accessed to locate a database entry associated with the subscriber identifier. Similarly, a range-based data structure contained in the database is accessed to locate the database entry associated with the subscriber identifier if the exceptions-based data structure does not contain the database entry. The method also includes acquiring serving HSS data corresponding to the located entry from either the exceptions-based data structure or the range-based data structure. | 12-18-2008 |
20110138135 | Fast and Efficient Reacquisition of Locks for Transactional Memory Systems - A system and method is disclosed for fast lock acquisition and release in a lock-based software transactional memory system. The method includes determining that a group of shared memory areas are likely to be accessed together in one or more atomic memory transactions executed by one or more threads of a computer program in a transactional memory system. In response to determining this, the system associates the group of memory areas with a single software lock that is usable by the transactional memory system to coordinate concurrent transactional access to the group of memory areas by the threads of the computer program. Subsequently, a thread of the program may gain access to a plurality of the memory areas of the group by acquiring the single software lock. | 06-09-2011 |
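The grouping step described above can be sketched as a table mapping memory areas to a shared lock. The class name `GroupLockTable` and the string area identifiers are illustrative assumptions; the point is only that every area in a group resolves to the same lock object, so one acquisition covers the whole group.

```python
# Hypothetical sketch: memory areas determined to be accessed together
# are associated with a single lock, so one acquisition grants a
# transaction access to the entire group.

import threading

class GroupLockTable:
    def __init__(self):
        self.lock_for_area = {}

    def group(self, areas):
        """Associate every area in the group with one shared lock."""
        lock = threading.Lock()
        for area in areas:
            self.lock_for_area[area] = lock
        return lock

table = GroupLockTable()
table.group(["node.left", "node.right", "node.parent"])
# One acquisition now protects all three areas:
assert (table.lock_for_area["node.left"]
        is table.lock_for_area["node.right"])
with table.lock_for_area["node.parent"]:
    pass  # transaction may access the whole group here
```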
20110202907 | METHOD AND SYSTEM FOR OPTIMIZING CODE FOR A MULTI-THREADED APPLICATION - In modern multi-threaded environments, threads often work cooperatively toward providing collective or aggregate throughput for an application as a whole. Optimizing in the small for “thread local” common path latency is often but not always the best approach for a concurrent system composed of multiple cooperating threads. Some embodiments provide a technique for augmenting traditional code emission with thread-aware policies and optimization strategies for a multi-threaded application. During operation, the system obtains information about resource contention between executing threads of the multi-threaded application. The system analyzes the resource contention information to identify regions of the code to be optimized. The system recompiles these identified regions to produce optimized code, which is then stored for subsequent execution. | 08-18-2011 |
20110246724 | System and Method for Providing Locale-Based Optimizations In a Transactional Memory - The system and methods described herein may reduce read/write fence latencies and cache pressure related to STM metadata accesses. These techniques may leverage locality information (as reflected by the value of a respective locale guard) associated with each of a plurality of data partitions (locales) in a shared memory to elide various operations in transactional read/write fences when transactions access data in locales owned by their threads. The locale state may be disabled, free, exclusive, or shared. For a given memory access operation of an atomic transaction targeting an object in the shared memory, the system may implement the memory access operation using a contention mediation mechanism selected based on the value of the locale guard associated with the locale in which the target object resides. For example, a traditional read/write fence may be employed in some memory access operations, while other access operations may employ an optimized read/write fence. | 10-06-2011 |
20120005530 | System and Method for Communication Between Concurrent Transactions Using Transaction Communicator Objects - Transactional memory implementations may be extended to include special transaction communicator objects through which concurrent transactions can communicate. Changes by a first transaction to a communicator may be visible to concurrent transactions before the first transaction commits. Although isolation of transactions may be compromised by such communication, the effects of this compromise may be limited by tracking dependencies among transactions, and preventing any transaction from committing unless every transaction whose changes it has observed also commits. For example, mutually dependent or cyclically dependent transactions may commit or abort together. Transactions that do not communicate with each other may remain isolated. The system may provide a communicator-isolating transaction that ensures isolation even for accesses to communicators, which may be implemented using nesting transactions. True (e.g., read-after-write) dependencies, ordering (e.g., write-after-write) dependencies, and/or anti-dependencies (e.g., write-after-read dependencies) may be tracked, and a resulting dependency graph may be perused by the commit protocol. | 01-05-2012 |
20120204163 | System and Method for Optimizing Software Transactional Memory Operations Using Static Caching of Memory Objects - Systems and methods for optimizing transactional memory operations may employ static analysis of source code and static caching of memory objects to elide redundant transactional accesses. For example, a compiler (or an optimizer thereof) may be configured to analyze code that includes an atomic transaction to determine if any read accesses to shared memory locations are dominated by a previous read or write access to the same locations and/or any write accesses to shared memory locations are post-dominated by a subsequent write access to the same locations. Any access within a transaction that is determined to be redundant (e.g., any access other than the first read of a given shared memory location from within the transaction or the last write to a given shared memory location from within the transaction) may be replaced (by the compiler/optimizer) with a non-transactional access to a cached shadow copy of the shared memory location. | 08-09-2012 |
20120311273 | System and Method for Synchronization Between Concurrent Transactions Using Transaction Condition Variables - The systems and methods described herein may extend transactional memory implementations to support transaction communicators and/or transaction condition variables for which transaction isolation is relaxed, and through which concurrent transactions can communicate and be synchronized with each other. Transactional accesses to these objects may not be isolated unless called within communicator-isolating transactions. A waiter transaction may invoke a wait method of a transaction condition variable, be added to a wait list for the variable, and be suspended pending notification of a notification event from a notify method of the variable. A notifier transaction may invoke a notify method of the variable, which may remove the waiter from the wait list, schedule the waiter transaction for resumed execution, and notify the waiter of the notification event. A waiter transaction may commit only if the corresponding notifier transaction commits. If the waiter transaction aborts, the notification may be forwarded to another waiter. | 12-06-2012 |
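The wait/notify protocol with commit dependencies described above can be sketched single-threadedly. This is a hypothetical simplification: `TxConditionVariable`, its method names, and the use of strings for transactions are illustrative, and real transactions would actually suspend and resume rather than being passed around as values.

```python
# Hypothetical single-threaded sketch of a transaction condition
# variable: wait() parks a transaction on the wait list, and notify()
# wakes the oldest waiter while recording that the waiter may commit
# only if its notifier commits.

from collections import deque

class TxConditionVariable:
    def __init__(self):
        self.wait_list = deque()
        self.commit_depends_on = {}   # waiter -> notifier

    def wait(self, waiter_tx):
        self.wait_list.append(waiter_tx)

    def notify(self, notifier_tx):
        if self.wait_list:
            waiter = self.wait_list.popleft()
            self.commit_depends_on[waiter] = notifier_tx
            return waiter             # scheduled for resumed execution
        return None

cv = TxConditionVariable()
cv.wait("tx-waiter")
resumed = cv.notify("tx-notifier")
assert resumed == "tx-waiter"
# The waiter can commit only if its notifier commits:
assert cv.commit_depends_on["tx-waiter"] == "tx-notifier"
```

If the waiter later aborts, the abstract notes the notification would be forwarded to another waiter; in this sketch that would amount to re-running `notify` for the next entry on the wait list.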
20120311606 | System and Method for Implementing Hierarchical Queue-Based Locks Using Flat Combining - The system and methods described herein may be used to implement a scalable, hierarchal, queue-based lock using flat combining. A thread executing on a processor core in a cluster of cores that share a memory may post a request to acquire a shared lock in a node of a publication list for the cluster using a non-atomic operation. A combiner thread may build an ordered (logical) local request queue that includes its own node and nodes of other threads (in the cluster) that include lock requests. The combiner thread may splice the local request queue into a (logical) global request queue for the shared lock as a sub-queue. A thread whose request has been posted in a node that has been combined into a local sub-queue and spliced into the global request queue may spin on a lock ownership indicator in its node until it is granted the shared lock. | 12-06-2012 |
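The combining step in the abstract above can be sketched as follows. This is a hypothetical simplification: the publication list is modeled as a list of dictionaries, `combine_and_splice` is an illustrative name, and the real mechanism uses non-atomic posts and per-node spinning that a sequential sketch cannot show.

```python
# Hypothetical sketch of the flat-combining step: threads post lock
# requests to a per-cluster publication list; one combiner thread
# collects the posted nodes into an ordered local queue and splices
# it onto the global request queue as a sub-queue.

def combine_and_splice(publication_list, global_queue):
    """Move all posted requests into global_queue as one sub-queue."""
    local_queue = [node for node in publication_list if node["posted"]]
    for node in local_queue:
        node["posted"] = False        # request consumed by the combiner
    global_queue.extend(local_queue)  # splice local sub-queue at tail
    return len(local_queue)

pub = [{"thread": 0, "posted": True},
       {"thread": 1, "posted": False},
       {"thread": 2, "posted": True}]
global_q = [{"thread": 9, "posted": False}]   # an earlier sub-queue
spliced = combine_and_splice(pub, global_q)
assert spliced == 2
assert [n["thread"] for n in global_q] == [9, 0, 2]
```

Keeping each cluster's requests adjacent in the global queue is what gives the lock its hierarchical, locality-friendly hand-off order.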
20130047011 | System and Method for Enabling Turbo Mode in a Processor - The systems and methods described herein may enable a processor core to run at higher speeds than other processor cores in the same package. A thread executing on one processor core may begin waiting for another thread to complete a particular action (e.g., to release a lock). While waiting, the thread/core may enter an inactive state. A data structure may store information indicating which threads are waiting on which other threads. In response to determining that a quorum of threads/cores are in an inactive state, one of the threads/cores may enter a turbo mode in which it executes at a higher speed than the baseline speed for the cores. A thread holding a lock and executing in turbo mode may perform work delegated by waiting threads at the higher speed. A thread may exit the inactive state when the waited-for action is completed. | 02-21-2013 |
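The bookkeeping portion of this scheme — tracking which threads wait on which, counting inactive cores, and granting turbo once a quorum is reached — can be sketched in software. The actual frequency change is a hardware mechanism and is not modeled; all names here (`TurboArbiter`, `begin_wait`, `end_wait`) are hypothetical.

```python
import threading

class TurboArbiter:
    """Sketch of the waiter-tracking data structure and quorum decision
    from the abstract; the speed change itself is hardware-level."""
    def __init__(self, num_cores, quorum):
        self.quorum = quorum
        self.waiting_on = {}   # waiter thread id -> thread id it waits for
        self.inactive = set()  # cores currently in the inactive state
        self.turbo = None      # thread id currently granted turbo mode
        self._lock = threading.Lock()

    def begin_wait(self, waiter, holder):
        with self._lock:
            self.waiting_on[waiter] = holder
            self.inactive.add(waiter)            # waiter's core goes inactive
            if len(self.inactive) >= self.quorum and self.turbo is None:
                self.turbo = holder              # lock holder may enter turbo

    def end_wait(self, waiter):
        with self._lock:
            self.waiting_on.pop(waiter, None)
            self.inactive.discard(waiter)        # core exits the inactive state
            if self.turbo is not None and len(self.inactive) < self.quorum:
                self.turbo = None                # quorum lost; leave turbo mode
```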
20130047163 | Systems and Methods for Detecting and Tolerating Atomicity Violations Between Concurrent Code Blocks - The system and methods described herein may be used to detect and tolerate atomicity violations between concurrent code blocks and/or to generate code that is executable to detect and tolerate such violations. A compiler may transform program code in which the potential for atomicity violations exists into alternate code that tolerates these potential violations. For example, the compiler may inflate critical sections, transform non-critical sections into critical sections, or coalesce multiple critical sections into a single critical section. The techniques described herein may utilize an auxiliary lock state for locks on critical sections to enable detection of atomicity violations in program code by enabling the system to distinguish between program points at which lock acquisition and release operations appeared in the original program, and the points at which these operations actually occur when executing the transformed program code. Filtering and analysis techniques may reduce false positives induced by the transformations. | 02-21-2013 |
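One of the transformations named in this abstract — coalescing multiple critical sections into a single critical section — can be shown with a classic check-then-act example. This is an illustrative sketch of the hazard and the fix, not the patented compiler analysis; the function names and the bank-balance scenario are invented for illustration.

```python
import threading

lock = threading.Lock()
balance = {"amount": 100}

def withdraw_unsafe(n):
    """Two separate critical sections: another thread may interleave between
    the check and the act (a potential atomicity violation)."""
    with lock:
        ok = balance["amount"] >= n    # critical section 1: check
    # <-- interleaving point: a concurrent withdrawal may run here
    if ok:
        with lock:
            balance["amount"] -= n     # critical section 2: act
    return ok

def withdraw_safe(n):
    """Transformed code: the two critical sections are coalesced so the
    check and the act execute under one lock acquisition."""
    with lock:
        if balance["amount"] >= n:
            balance["amount"] -= n
            return True
        return False
```

With `withdraw_unsafe`, ten concurrent withdrawals of 30 can drive the balance negative; with the coalesced version they cannot.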
20130086348 | Lock-Clustering Compilation for Software Transactional Memory - A lock-clustering compiler is configured to compile program code for a software transactional memory system. The compiler determines that a group of data structures are accessed together within one or more atomic memory transactions defined in the program code. In response to determining that the group is accessed together, the compiler creates an executable version of the program code that includes clustering code, which is executable to associate the data structures of the group with the same software transactional memory lock. The lock is usable by the software transactional memory system to coordinate concurrent transactional access to the group of data structures by multiple concurrent threads. | 04-04-2013 |
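The clustering step this abstract describes — mapping every data structure in a group to one shared lock — can be sketched as a simple table keyed by object identity. The names (`LockClusterTable`, `cluster`, `lock_for`) are hypothetical, and the compiler analysis that decides which objects belong together is assumed to have already run.

```python
import threading

class LockClusterTable:
    """Sketch of lock clustering: data structures determined to be accessed
    together within the same atomic transactions share one lock. Keys are
    id()s, so clustered objects must outlive their table entries."""
    def __init__(self):
        self._locks = {}

    def cluster(self, *objects):
        shared = threading.Lock()        # one lock for the whole group
        for obj in objects:
            self._locks[id(obj)] = shared
        return shared

    def lock_for(self, obj):
        # unclustered objects fall back to a private, per-object lock
        return self._locks.setdefault(id(obj), threading.Lock())
```

A transaction touching any member of the group then acquires a single lock instead of one lock per data structure, which is the contention/overhead trade-off the patent targets.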
20130290583 | System and Method for NUMA-Aware Locking Using Lock Cohorts - The system and methods described herein may be used to implement NUMA-aware locks that employ lock cohorting. These lock cohorting techniques may reduce the rate of lock migration by relaxing the order in which the lock schedules the execution of critical code sections by various threads, allowing lock ownership to remain resident on a single NUMA node longer than under strict FIFO ordering, thus reducing coherence traffic and improving aggregate performance. A NUMA-aware cohort lock may include a global shared lock that is thread-oblivious, and multiple node-level locks that provide cohort detection. The lock may be constructed from non-NUMA-aware components (e.g., spin-locks or queue locks) that are modified to provide thread-obliviousness and/or cohort detection. Lock ownership may be passed from one thread that holds the lock to another thread executing on the same NUMA node without releasing the global shared lock. | 10-31-2013 |
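The cohort-lock structure in this abstract — a thread-oblivious global lock, per-node locks with cohort detection, and in-node handoff that keeps the global lock resident — can be sketched as follows. This is a simplified illustration, not the patented construction; field names and the `meta` bookkeeping locks are assumptions of the sketch.

```python
import threading

class CohortLock:
    """Cohort-lock sketch: threads first contend on their node-level lock;
    on release, if a same-node thread is waiting (cohort detection), only
    the node lock is handed over and the global lock stays with the node."""
    def __init__(self, num_nodes):
        self.global_lock = threading.Lock()
        self.node_locks = [threading.Lock() for _ in range(num_nodes)]
        self.meta = [threading.Lock() for _ in range(num_nodes)]
        self.waiting = [0] * num_nodes              # waiters per node
        self.node_owns_global = [False] * num_nodes

    def acquire(self, node):
        with self.meta[node]:
            self.waiting[node] += 1
        self.node_locks[node].acquire()             # contend only on-node
        with self.meta[node]:
            self.waiting[node] -= 1
        if not self.node_owns_global[node]:
            self.global_lock.acquire()              # node must win the global lock
            self.node_owns_global[node] = True

    def release(self, node):
        with self.meta[node]:
            cohort_waiting = self.waiting[node] > 0
        if cohort_waiting:
            self.node_locks[node].release()         # in-node handoff; keep global
        else:
            self.node_owns_global[node] = False
            self.global_lock.release()
            self.node_locks[node].release()
```

The handoff path is what reduces lock migration: consecutive same-node owners never touch the global lock's cache line.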
20130290967 | System and Method for Implementing NUMA-Aware Reader-Writer Locks - NUMA-aware reader-writer locks may leverage lock cohorting techniques to band together writer requests from a single NUMA node. The locks may relax the order in which the lock schedules the execution of critical sections of code by reader threads and writer threads, allowing lock ownership to remain resident on a single NUMA node for long periods, while also taking advantage of parallelism between reader threads. Threads may contend on node-level structures to get permission to acquire a globally shared reader-writer lock. Writer threads may follow a lock cohorting strategy of passing ownership of the lock in write mode from one thread to a cohort writer thread without releasing the shared lock, while reader threads from multiple NUMA nodes may simultaneously acquire the shared lock in read mode. The reader-writer lock may follow a writer-preference policy, a reader-preference policy or a hybrid policy. | 10-31-2013 |
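The node-level permission structure this abstract mentions — readers contend only on their own node's state before touching the shared lock — can be sketched with per-node reader counters. This sketch follows a writer-preference policy and omits the writer cohorting described above; the class shape and names are hypothetical.

```python
import threading

class NumaReaderWriterLock:
    """Sketch: a reader takes permission via its node-local counter, so the
    read path touches only node-local state; a writer takes the global write
    lock, then waits for every node's readers to drain."""
    def __init__(self, num_nodes):
        self.write_lock = threading.Lock()
        self.node_guards = [threading.Lock() for _ in range(num_nodes)]
        self.node_readers = [0] * num_nodes
        self.drained = threading.Condition()

    def acquire_read(self, node):
        while True:
            with self.node_guards[node]:
                if not self.write_lock.locked():    # no writer active or pending
                    self.node_readers[node] += 1
                    return

    def release_read(self, node):
        with self.node_guards[node]:
            self.node_readers[node] -= 1
        with self.drained:
            self.drained.notify_all()

    def acquire_write(self):
        self.write_lock.acquire()                   # block new readers
        for guard in self.node_guards:              # barrier: flush in-flight readers
            with guard:
                pass
        with self.drained:
            while any(self.node_readers):           # wait for readers to drain
                self.drained.wait()

    def release_write(self):
        self.write_lock.release()
```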
20140282574 | System and Method for Implementing Constrained Data-Driven Parallelism - Systems and methods for implementing constrained data-driven parallelism may provide programmers with mechanisms for controlling the execution order and/or interleaving of tasks spawned during execution. For example, a programmer may define a task group that includes a single task, and the single task may define a direct or indirect trigger that causes another task to be spawned (e.g., in response to a modification of data specified in the trigger). Tasks spawned by a given task may be added to the same task group as the given task. A deferred keyword may control whether a spawned task is to be executed in the current execution phase or its execution is to be deferred to a subsequent execution phase for the task group. Execution of all tasks in the current phase may need to complete before execution of tasks in the next phase can begin. | 09-18-2014 |
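The phased execution model this abstract describes — spawned tasks run in the current phase unless deferred, and a phase must fully drain before the next begins — can be sketched with a sequential scheduler. Trigger-based spawning is reduced to an explicit `spawn` call here, and the class and keyword names (`TaskGroup`, `deferred=`) are hypothetical.

```python
from collections import deque

class TaskGroup:
    """Sketch of phased task-group execution: tasks spawned during a phase
    run in that phase unless marked deferred; deferred tasks form the next
    phase, which starts only after the current phase fully completes."""
    def __init__(self):
        self.current = deque()
        self.deferred = deque()

    def spawn(self, fn, deferred=False):
        (self.deferred if deferred else self.current).append(fn)

    def run(self):
        phases = 0
        while self.current or self.deferred:
            phases += 1
            while self.current:                 # drain the current phase fully
                self.current.popleft()(self)
            # phase boundary: deferred tasks become the next phase
            self.current, self.deferred = self.deferred, deque()
        return phases
```

Each task receives the group so it can spawn follow-on work, mimicking the triggers in the abstract; parallel execution within a phase is elided.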