Patent application number | Description | Published |
20090113167 | DATA PROCESSING APPARATUS AND METHOD OF PROCESSING DATA - Data processing apparatus comprising: a chunk store containing specimen data chunks, a manifest store containing at least one manifest that represents at least a part of a data set and that comprises at least one reference to at least one of said specimen data chunks, a sparse chunk index containing information on only those specimen data chunks having a predetermined characteristic, the processing apparatus being operable to process input data into input data chunks and to use the sparse chunk index to identify at least one of said at least one manifest that includes at least one reference to one of said specimen data chunks that corresponds to one of said input data chunks having the predetermined characteristic. | 04-30-2009 |
20110040763 | DATA PROCESSING APPARATUS AND METHOD OF PROCESSING DATA - One embodiment is a data processing apparatus that has a chunk store containing specimen data chunks, a manifest store containing a plurality of manifests, each of which represents at least a part of previously processed data and includes at least one reference to at least one of the specimen data chunks, and a sparse chunk index containing information on only some specimen data chunks. Input data is processed into a plurality of input data segments. Each manifest of the first set has at least one reference to one of said specimen data chunks that corresponds to one of the input data chunks of a first input data segment. Specimen data chunks corresponding to other input data chunks of the first input data segment are identified by using the identified first set of manifests and at least one manifest identified when processing previous data. | 02-17-2011 |
20120143715 | SPARSE INDEX BIDDING AND AUCTION BASED STORAGE - Illustrated is a system and method that includes a receiving module, which resides on a back end node, to receive a set of hashes that is generated from a set of chunks associated with a segment of data. Additionally, the system and method further includes a lookup module, which resides on the back end node, to search for at least one hash in the set of hashes as a key value in a sparse index. The system and method also includes a bid module, which resides on the back end node, to generate a bid, based upon a result of the search. | 06-07-2012 |
20120239815 | DISTRIBUTED DIFFERENTIAL STORE WITH NON-DISTRIBUTED OBJECTS AND COMPRESSION-ENHANCING DATA-OBJECT ROUTING - One embodiment of the present invention provides a distributed, differential electronic-data storage system that includes client computers, component data-storage systems, and a routing component. Client computers direct data objects to component data-storage systems within the distributed, differential electronic-data storage system. Component data-storage systems provide data storage for the distributed, differential electronic-data storage system. The routing component directs data objects, received from the client computers, through logical bins to component data-storage systems by a compression-enhancing routing method. | 09-20-2012 |
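
The sparse chunk index described in entries 20090113167 and 20110040763 above can be illustrated with a minimal Python sketch. It is not taken from the applications: the names `SparseIndex`, `is_sampled`, and `candidate_manifests`, the choice of SHA-256, and the "low bits of the digest are zero" test standing in for the "predetermined characteristic" are all assumptions made for illustration.

```python
import hashlib


def chunk_hash(chunk: bytes) -> str:
    """Hex digest used to identify a chunk (SHA-256 is an assumption)."""
    return hashlib.sha256(chunk).hexdigest()


def is_sampled(digest: str, bits: int) -> bool:
    """Hypothetical 'predetermined characteristic': the low `bits` bits are zero."""
    return int(digest, 16) & ((1 << bits) - 1) == 0


class SparseIndex:
    """Maps the digests of sampled chunks only to the manifests that reference them."""

    def __init__(self, bits: int = 2):
        self.bits = bits
        self.entries = {}  # digest -> set of manifest ids

    def add_manifest(self, manifest_id: str, chunks) -> None:
        # Only chunks with the characteristic are recorded, keeping the index small.
        for chunk in chunks:
            digest = chunk_hash(chunk)
            if is_sampled(digest, self.bits):
                self.entries.setdefault(digest, set()).add(manifest_id)

    def candidate_manifests(self, input_chunks) -> set:
        # Look up only the sampled input chunks; return manifests worth loading.
        found = set()
        for chunk in input_chunks:
            digest = chunk_hash(chunk)
            if is_sampled(digest, self.bits):
                found |= self.entries.get(digest, set())
        return found
```

With `bits=0` every chunk is sampled and the structure degenerates into a full index; larger values trade lookup coverage for a smaller index.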
Patent application number | Description | Published |
20090037456 | Providing an index for a data store - Chunks are stored in a data store, where respective collections of chunks form respective files. An index that maps digests of chunks to pages containing information to recreate the chunks is provided, where the index is stored in persistent storage. | 02-05-2009 |
20100114832 | Forensic snapshot - Systems, methods, and other embodiments associated with forensic snapshots are described. One example method includes creating a snapshot of an operational data. The example method may also include creating a hash tree by hashing lowest level data blocks of the snapshot to produce lowest level hashes. Creating a hash tree may also include repeatedly growing the hash tree bottom up by selectively hashing lower level hashes into higher level hashes until a root node is produced. The example method may also include providing a forensic data associated with the hash tree, where the forensic data is used to verify the integrity of the snapshot. | 05-06-2010 |
20130018855 | DATA DEDUPLICATION - A method for data deduplication includes receiving a set of hashes derived from a data chunk of a set of input data chunks | 01-17-2013 |
20140257919 | REWARD POPULATION GROUPING - An apparatus and method convert user behavior data into rewards. The method and apparatus determine an undiluted reward for a user of a population of users. The method and apparatus divide the population of users into groups. The apparatus and method determine a dilution factor for each of the groups and assign and/or transmit a reward to the user based upon the undiluted reward and the dilution factor for the group to which that user belongs. | 09-11-2014 |
20140344229 | SYSTEMS AND METHODS FOR DATA CHUNK DEDUPLICATION - A method includes receiving information about a plurality of data chunks and determining if one or more of a plurality of back-end nodes already stores more than a threshold amount of the plurality of data chunks where one of the plurality of back-end nodes is designated as a sticky node. The method further includes, responsive to determining that none of the plurality of back-end nodes already stores more than a threshold amount of the plurality of data chunks, deduplicating the plurality of data chunks against the back-end node designated as the sticky node. Finally, the method includes, responsive to an amount of data being processed, designating a different back-end node as the sticky node. | 11-20-2014 |
20150066877 | SEGMENT COMBINING FOR DEDUPLICATION - A non-transitory computer-readable storage device includes instructions that, when executed, cause one or more processors to receive a sequence of hashes. Next, the one or more processors are further caused to determine locations of previously stored copies of a subset of the data chunks corresponding to the hashes. The one or more processors are further caused to group hashes and corresponding data chunks into segments based in part on the determined information. The one or more processors are caused to choose, for each segment, a store to deduplicate that segment against. Finally, the one or more processors are further caused to combine two or more segments chosen to be deduplicated against the same store and deduplicate them as a whole using a second index. | 03-05-2015 |
20150088840 | DETERMINING SEGMENT BOUNDARIES FOR DEDUPLICATION - A sequence of hashes is received. Each hash corresponds to a data chunk of data to be deduplicated. Locations of previously stored copies of the data chunks are determined, the locations determined based on the hashes. A breakpoint in the sequence of data chunks is determined based on the locations, the breakpoint forming a boundary of a segment of data chunks. | 03-26-2015 |
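
The bottom-up hash tree described in entry 20100114832 above (Forensic snapshot) can be sketched in a few lines of Python. This is an illustrative sketch only, not the method claimed in the application: the function names `build_hash_tree` and `root_hash`, the use of SHA-256, the pairing of two hashes per parent, and the handling of an unpaired hash (hashed on its own) are assumptions.

```python
import hashlib


def build_hash_tree(blocks):
    """Return all levels of the tree, lowest-level hashes first, root level last."""
    if not blocks:
        raise ValueError("at least one data block is required")
    # Lowest level: one hash per data block of the snapshot.
    level = [hashlib.sha256(block).digest() for block in blocks]
    levels = [level]
    # Grow the tree bottom up: hash pairs of lower-level hashes into
    # higher-level hashes until a single root remains.
    while len(level) > 1:
        level = [
            hashlib.sha256(b"".join(level[i:i + 2])).digest()
            for i in range(0, len(level), 2)
        ]
        levels.append(level)
    return levels


def root_hash(blocks) -> bytes:
    """Root of the hash tree, usable as a compact integrity check."""
    return build_hash_tree(blocks)[-1][0]
```

Recomputing `root_hash` over the snapshot's blocks and comparing it with a previously recorded root is one way the integrity check could work: a change to any block changes the root.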