Inventors list

Assignees list

Classification tree browser

Top 100 Inventors

Top 100 Assignees

Borthakur, CA

Apolak Borthakur, Palo Alto, CA US

Patent application number	Description	Published
20150324215	MIGRATION OF APPLICATIONS BETWEEN AN ENTERPRISE-BASED NETWORK AND A MULTI-TENANT NETWORK - A method of migrating applications from an enterprise-based network to a multi-tenant network of a compute service provider may include receiving a request to migrate an application running on a first virtual machine instance within the enterprise-based network. Dependencies of the application may be determined by identifying at least a second virtual machine instance within the enterprise-based network, where the at least second virtual machine instance associated with the application. Resource monitoring metrics associated with hardware resources used by the first virtual machine instance and the at least second virtual machine instance may be received. The first and at least second virtual machine instances may be migrated from the enterprise-based network to at least one virtual machine at a server within the multi-tenant network based on the monitoring metrics, thereby migrating the application from the enterprise-based network to the multi-tenant network.	11-12-2015

Dhruba Borthakur, Sunnyvale, CA US

Patent application number	Description	Published
20140067778	SELECTIVE OFFLINING STORAGE MEDIA FILESYSTEM - A method of operation of a storage control system includes: configuring a state change policy on a data server, the state change policy including an online duration for a storage device; activating the storage device based on the state change policy; mounting the storage device based on the state change policy; and scheduling a filesystem maintenance task to be performed on the storage device based on the state change policy.	03-06-2014
20140215007	MULTI-LEVEL DATA STAGING FOR LOW LATENCY DATA ACCESS - Techniques for facilitating and accelerating log data processing are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. When the aggregating cluster is not available, the front-clusters write the log data to local filers and send the data when the aggregating cluster recovers. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further sends the aggregated log data stream to centralized NFS filers or a data warehouse cluster. The local filers and the aggregating cluster stage the log data for access by applications, so that the applications do not wait until the data reach the centralized NFS filers or data warehouse cluster.	07-31-2014

Dhrubajyoti Borthakur, Sunnyvale, CA US

Patent application number	Description	Published
20130290805	Distributed System for Fault-Tolerant Data Storage - Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, so long as sufficient undamaged elements are still accessible. The L1EC blocks and the data blocks are divided into distribution sets and these sets are stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. The L2EC blocks enable reconstruction of damaged elements at a data storage location without requiring communication with the other data storage locations.	10-31-2013
20140019405	AUTOMATED FAILOVER OF A METADATA NODE IN A DISTRIBUTED FILE SYSTEM - Switching an active metadata node is disclosed. An indication that a standby metadata node of a distributed file system should replace an active metadata node of the distributed file system as a new active metadata node of the distributed file system is received. The standby metadata node is included in a server. A request that indicates that the standby metadata node would like to become an exclusive metadata node writer of a transaction log is sent. A confirmation that the standby metadata node is the exclusive metadata node writer of the transaction log is received. Based at least in part on the confirmation, an update that the standby metadata node has become the new active metadata node of the distributed file system is provided.	01-16-2014
20140019495	PROCESSING A FILE SYSTEM OPERATION IN A DISTRIBUTED FILE SYSTEM - Processing a file system operation is disclosed. An indication of a desired operation of a distributed file system is received. A metadata node for the desired operation is identified. It is indicated to the identified metadata node to process the desired operation. In the event the identified metadata node becomes not fully functional before the processing by the identified metadata node is confirmed, the distributed file system is analyzed to determine whether to indicate again to process the desired operation.	01-16-2014
20140047266	Distributed System for Fault-Tolerant Data Storage - Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, and the L1EC blocks and the data blocks are divided into distribution sets and stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. Upon detecting a data disruption event, an inaccessible data storage location is identified and the elements that were stored at the inaccessible data storage location are reconstructed.	02-13-2014
20140188870	LSM CACHE - A variety of methods for improving efficiency in a database system are provided. In one embodiment, a method may comprise: generating multiple levels of data according to how recently the data have been updated, whereby most recently updated data are assigned to the newest level; storing each level of data in a specific storage tier; splitting data stored in a particular storage tier into two or more groups according to access statistics of each specific data; during compaction, storing data from different groups in separate data blocks of the particular storage tier; and when a particular data in a specific data block is requested, reading the specific data block into a low-latency storage tier.	07-03-2014
20140317448	INCREMENTAL CHECKPOINTS - A method and system on failure recovery in a storage system are disclosed. In the storage system, user data streams (e.g., log data) are collected by a scribeh system. The scribeh system may include a plurality of Calligraphus servers, HDFS and Zookeeper. The Calligraphus servers may shard the user data streams based on keys (e.g., category and bucket pairs) and stream the user data streams to Puma nodes. Sharded user data streams may be aggregated according to the keys in memory of a specific Puma node. Periodically, aggregated user data streams cached in memory of the specific Puma node, together with a Incremental checkpoint, are persisted to HBase. When a specific process on the specific Puma node fails, Ptail retrieves the Incremental checkpoint from HBase and then restores the specific process by requesting user data streams processed by the specific process from the scribeh system according to the Incremental checkpoint.	10-23-2014
20150134602	ATOMIC UPDATE OPERATIONS IN A DATA STORAGE SYSTEM - Technology is disclosed for performing atomic update operations in a storage system (“the technology”). The technology can receive an update command to update a value associated with a key stored in the storage system as a function of an input value; store the input value in a log stored at the storage system but not updating the value stored in the storage system; and update the value associated with the key with the received input values value based on the a function to generate an updated value, the updating occurring asynchronously with respect to receiving the update command.	05-14-2015
20150134692	QUERYING A SPECIFIED DATA STORAGE LAYER OF A DATA STORAGE SYSTEM - Technology is disclosed for retrieving data from a specific storage layer of a storage system (“the technology”). A query application programming interface (API) is provided that allows an application to specify a storage layer on which the query should be executed. The query API can be used in a multi-threaded environment which employs a combination of fast threads and slow threads to serve read/write requests from applications. The fast threads are configured to query on a first set of storage layers, e.g., storage layers in a primary storage, while the slow threads are configured to query on a second set of storage layers, e.g., storage layers in a secondary storage. If a fast thread does not find the requested data in the first set, the request is transferred to a slow thread and the fast thread is allocated to another request while the slow thread is serving the current request.	05-14-2015
20150293958	SCALABLE DATA STRUCTURES - The technology is directed to providing sequential access to data using scalable data structures. In some embodiments, the scalable data structures include a first data structure, e.g., hash map, and a second data structure, e.g., tree data structure (“tree”). The technology receives multiple key-value pairs representing data associated with an application. A key includes a prefix and a suffix. While the suffixes of the keys are distinct, some of the keys can have the same prefix. The technology stores the keys having the same prefix in a tree, and stores the root node of the tree in the first data structure. To retrieve values of a set of input keys with a given prefix, the technology retrieves a root node of a tree corresponding to the given prefix from the first data structure using the given prefix, and traverses the tree to obtain the values in a sequence.	10-15-2015

Patent applications by Dhrubajyoti Borthakur, Sunnyvale, CA US

Dhrubajyoti Borthakur, Palo Alto, CA US

Patent application number	Description	Published
20140214752	DATA STREAM SPLITTING FOR LOW-LATENCY DATA ACCESS - Techniques for facilitating and accelerating log data processing by splitting data streams are disclosed herein. The front-end clusters generate large amount of log data in real time and transfer the log data to an aggregating cluster. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further splits the log data into a plurality of data streams so that the data streams are sent to a receiving application in parallel. In one embodiment, the log data are randomly split to ensure the log data are evenly distributed in the split data streams. In another embodiment, the application that receives the split data streams determines how to split the log data.	07-31-2014