Patent application number | Description | Published |
20120131082 | COMPUTATION OF A REMAINDER BY DIVISION USING PSEUDO-REMAINDERS - Methods, computer systems, and computer program products for calculating a remainder by division of a sequence of bytes interpreted as a first number by a second number are provided. A pseudo-remainder by division associated with a first subsequence of the sequence of bytes is calculated. A property of this pseudo-remainder is that the first subsequence of the sequence of bytes, interpreted as a third number, and the pseudo-remainder by division have the same remainder by division when divided by the second number. A second subsequence of the sequence of bytes interpreted as the first number is appended to the pseudo-remainder, interpreted as a sequence of bytes, so as to create a sequence of bytes interpreted as a fourth number. The first number and the fourth number have the same remainder by division when divided by the second number. | 05-24-2012 |
20120143835 | EFFICIENT CONSTRUCTION OF SYNTHETIC BACKUPS WITHIN DEDUPLICATION STORAGE SYSTEM - Various embodiments are provided for facilitating construction of a synthetic backup in a deduplication storage system. In one embodiment, a deduplication storage system enables new input data to be deduplicated with data of synthetic backups already constructed, and for this purpose efficiently calculates deduplication digests for synthetic backups being constructed, based on already existing digests of data referenced by the synthetic backups. For each input data segment of the plurality of input data segments of a synthetic backup being constructed, a plurality of deduplication digests of stored data segments, referenced by the input data segment, is retrieved from an index. Each input data segment is partitioned into a plurality of fixed-sized data sub-segments. A calculation is performed producing a deduplication digest for a data sub-segment, where the calculation is based on the retrieved deduplication digests of the plurality of stored data sub-segments referenced by the input data sub-segment. | 06-07-2012 |
20120158812 | PARALLEL COMPUTATION OF A REMAINDER BY DIVISION OF A SEQUENCE OF BYTES - Methods, computer systems, and computer program products for calculating a remainder by division of a sequence of bytes interpreted as a first number by a second number are provided. A first remainder by division associated with a first subset of the sequence of bytes is calculated with a first processor. A second remainder by division associated with a second subset of the sequence of bytes is calculated with a second processor. The calculating of the second remainder by division may occur at least partially during the calculating of the first remainder by division. A third remainder by division is calculated based on the calculating of the first remainder by division and the calculating of the second remainder by division. | 06-21-2012 |
20120239625 | EFFICIENT CONSTRUCTION OF SYNTHETIC BACKUPS WITHIN DEDUPLICATION STORAGE SYSTEM - A deduplication storage system enables new input data to be deduplicated with data of synthetic backups already constructed, and for this purpose efficiently calculates deduplication digests for synthetic backups being constructed, based on already existing digests of data referenced by the synthetic backups. For each input data segment of the plurality of input data segments of a synthetic backup being constructed, a plurality of deduplication digests of stored data segments, referenced by the input data segment, is retrieved from an index. Each input data segment is partitioned into a plurality of fixed-sized data sub-segments. A calculation is performed producing a deduplication digest for a data sub-segment, where the calculation is based on the retrieved deduplication digests of the plurality of stored data sub-segments referenced by the input data sub-segment. | 09-20-2012 |
20120271873 | PARALLEL COMPUTATION OF A REMAINDER BY DIVISION OF A SEQUENCE OF BYTES - A remainder by division of a sequence of bytes interpreted as a first number by a second number is calculated. A first remainder by division associated with a first subset of the sequence of bytes is calculated with a first processor. A second remainder by division associated with a second subset of the sequence of bytes is calculated with a second processor. The calculating of the second remainder by division may occur at least partially during the calculating of the first remainder by division. A third remainder by division is calculated based on the calculating of the first remainder by division and the calculating of the second remainder by division. | 10-25-2012 |
20130073528 | SCALABLE DEDUPLICATION SYSTEM WITH SMALL BLOCKS - For scalable data deduplication working with small data chunks in a computing environment, for each of the small data chunks, a signature is generated based on a combination of a representation of characters that appear in the small data chunks with a representation of frequencies of the small data chunks. The signature is used to help in selecting the data to be deduplicated. | 03-21-2013 |
20130073529 | SCALABLE DEDUPLICATION SYSTEM WITH SMALL BLOCKS - For scalable data deduplication working with small data chunks in a computing environment, for each of the small data chunks, a signature is generated based on a combination of a representation of characters that appear in the small data chunks with a representation of frequencies of the small data chunks. The signature is used to help in selecting the data to be deduplicated. | 03-21-2013 |
20130179759 | INCREMENTAL MODIFICATION OF AN ERROR DETECTION CODE - Exemplary method, system, and computer program product embodiments for an incremental modification of an error detection code operation are provided. In one embodiment, by way of example only, for a data block that requires a first error detection code (EDC) value to be calculated and verified and that is undergoing modification of at least one randomly positioned sub-block that becomes available and is modified in independent time intervals, a second EDC value is calculated for each of the randomly positioned sub-blocks. An incremental effect of the second EDC value is applied for calculating the first EDC value and for recalculating the first EDC value upon replacing at least one of the randomly positioned sub-blocks. The resource consumption is proportional to the size of at least one of the randomly positioned sub-blocks that are added and modified. Additional system and computer program product embodiments are disclosed and provide related advantages. | 07-11-2013 |
20130232116 | CALCULATING DEDUPLICATION DIGESTS FOR A SYNTHETIC BACKUP BY A DEDUPLICATION STORAGE SYSTEM - Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digests references stored data that is included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found between the input data and the data in the synthetic backup. | 09-05-2013 |
20130232117 | CREATION OF SYNTHETIC BACKUPS WITHIN DEDUPLICATION STORAGE SYSTEM BY A BACKUP APPLICATION - A deduplication storage system and a backup application create a synthetic backup. Metadata instructions are provided to the deduplication storage system. Each of the metadata instructions specifies the data segment of an originating backup and a designated location of the data segment in the synthetic backup. A set of metadata instructions is transformed into a transformed set of metadata instructions. | 09-05-2013 |
20130232119 | CREATION OF SYNTHETIC BACKUPS WITHIN DEDUPLICATION STORAGE SYSTEM - A deduplication storage system and a backup application create a synthetic backup. Metadata instructions are provided to the deduplication storage system. Each of the metadata instructions specifies the data segment of an originating backup and a designated location of the data segment in the synthetic backup. Each of the metadata instructions is processed by locating those data sub-segments in the deduplication storage system specified by the data segment in each of the metadata instructions, and creating metadata references to each of the data sub-segments and adding the metadata references to metadata of the synthetic backup being created. | 09-05-2013 |
20130232120 | DEDUPLICATING INPUT BACKUP DATA WITH DATA OF A SYNTHETIC BACKUP PREVIOUSLY CONSTRUCTED BY A DEDUPLICATION STORAGE SYSTEM - Input backup data is deduplicated with data of a synthetic backup previously constructed by a deduplication storage system. A synthetic backup is constructed by processing metadata instructions provided by a backup application. Deduplication digests are calculated based on the data of the synthetic backup and the deduplication digests are stored in a digests index. When new backup data is processed, deduplication digests of the new data are calculated and searched in the digests index. Matching digests of previously constructed synthetic backups are located in the digests index. Each of the located matching digests references stored data that is included in the synthetic backup, and the stored data is similar to the input backup data. Data matches are found between the input backup data and the data in the synthetic backup. | 09-05-2013 |
20130290278 | SCALABLE DEDUPLICATION SYSTEM WITH SMALL BLOCKS - Exemplary method, system, and computer program product embodiments for scalable data deduplication working with small data chunks in a computing environment are provided. In one embodiment, by way of example only, for each of the small data chunks, a signature is generated based on a combination of a representation of characters that appear in the small data chunks with a representation of frequencies of the small data chunks. The signature is used to help in selecting the data to be deduplicated. Additional system and computer program product embodiments are disclosed and provide related advantages. | 10-31-2013 |
20130290279 | SCALABLE DEDUPLICATION SYSTEM WITH SMALL BLOCKS - Exemplary method, system, and computer program product embodiments for scalable data deduplication working with small data chunks in a computing environment are provided. In one embodiment, by way of example only, for each of the small data chunks, a signature is generated based on a combination of a representation of characters that appear in the small data chunks with a representation of frequencies of the small data chunks. The signature is used to help in selecting the data to be deduplicated. Additional system and computer program product embodiments are disclosed and provide related advantages. | 10-31-2013 |
20140188828 | CONTROLLING SEGMENT SIZE DISTRIBUTION IN HASH-BASED DEDUPLICATION - Segment sizes are controlled by setting the size of a segment boundary in a hash-based deduplication system. A subsequence of size K of a sequence of characters S is set. An increasing sequence of n probabilities and a corresponding sequence of n decreasingly restrictive logical tests are chosen to be applied on the sequence of characters S. Segment boundaries are set by using the sequence of the decreasingly restrictive logical tests by deciding to declare a segment boundary at a current position if one of the sequence of the decreasingly restrictive logical tests, with a corresponding probability of the sequence of n probabilities, returns a true value when applied on the sequence of characters S. | 07-03-2014 |
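
The remainder-by-division entries above (20120131082, 20120158812, 20120271873) rest on a simple arithmetic fact: the remainder of a long byte sequence can be assembled from remainders of its parts. The Python sketch below illustrates that fact only; it is not taken from the applications themselves. It assumes the bytes are read as a big-endian unsigned integer, and the names `remainder`, `combine`, and `chunked_remainder` are hypothetical.

```python
def remainder(data: bytes, modulus: int) -> int:
    """Remainder of `data`, read as a big-endian unsigned integer (assumption)."""
    rem = 0
    for byte in data:
        # Appending one byte multiplies the number by 256 and adds the byte.
        rem = (rem * 256 + byte) % modulus
    return rem


def combine(rem_head: int, rem_tail: int, tail_len: int, modulus: int) -> int:
    """Merge the remainders of two adjacent chunks, head first, tail second."""
    # The head is shifted left by tail_len bytes, i.e. multiplied by 256**tail_len.
    shift = pow(256, tail_len, modulus)
    return (rem_head * shift + rem_tail) % modulus


def chunked_remainder(data: bytes, modulus: int, chunk_size: int) -> int:
    """Compute the remainder chunk by chunk (hypothetical helper)."""
    total = 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # Each remainder(chunk, ...) call depends only on its own chunk, so these
        # calls could run on separate processors; here they run sequentially.
        total = combine(total, remainder(chunk, modulus), len(chunk), modulus)
    return total
```

The same identity explains the pseudo-remainder idea: any number congruent to a prefix modulo the divisor can stand in for that prefix before the remaining bytes are appended, without changing the final remainder.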