# Shmuel T. Klein, Rehovot IL

Patent application number | Description | Published |
---|---|---|
20090234821 | Systems and Methods for Efficient Data Searching, Storage and Reduction - Systems and methods enabling search of a repository for the location of data that is similar to input data, using a defined measure of similarity, in a time that is independent of the size of the repository and linear in the size of the input data, and a space that is proportional to a small fraction of the size of the repository. The similar data segments thus located are further analyzed to determine their common (identical) data sections, regardless of the order and position of the common data sections in the repository and input, and in a time that is linear in the segment size and in constant space. | 09-17-2009 |
20090234855 | Systems and Methods for Efficient Data Searching, Storage and Reduction - Systems and methods enabling search of a repository for the location of data that is similar to input data, using a defined measure of similarity, in a time that is independent of the size of the repository and linear in the size of the input data, and a space that is proportional to a small fraction of the size of the repository. The similar data segments thus located are further analyzed to determine their common (identical) data sections, regardless of the order and position of the common data sections in the repository and input, and in a time that is linear in the segment size and in constant space. | 09-17-2009 |
20120131082 | COMPUTATION OF A REMAINDER BY DIVISION USING PSEUDO-REMAINDERS - Methods, computer systems, and computer program products for calculating a remainder by division of a sequence of bytes interpreted as a first number by a second number are provided. A pseudo-remainder by division associated with a first subsequence of the sequence of bytes is calculated. A property of this pseudo-remainder is that the first subsequence of the sequence of bytes, interpreted as a third number, and the pseudo-remainder by division have the same remainder by division when divided by the second number. A second subsequence of the sequence of bytes interpreted as the first number is appended to the pseudo-remainder, interpreted as a sequence of bytes, so as to create a sequence of bytes interpreted as a fourth number. The first number and the fourth number have the same remainder by division when divided by the second number. | 05-24-2012 |
20120158812 | PARALLEL COMPUTATION OF A REMAINDER BY DIVISION OF A SEQUENCE OF BYTES - Methods, computer systems, and computer program products for calculating a remainder by division of a sequence of bytes interpreted as a first number by a second number are provided. A first remainder by division associated with a first subset of the sequence of bytes is calculated with a first processor. A second remainder by division associated with a second subset of the sequence of bytes is calculated with a second processor. The calculating of the second remainder by division may occur at least partially during the calculating of the first remainder by division. A third remainder by division is calculated based on the calculating of the first remainder by division and the calculating of the second remainder by division. | 06-21-2012 |
20120271873 | PARALLEL COMPUTATION OF A REMAINDER BY DIVISION OF A SEQUENCE OF BYTES - A remainder by division of a sequence of bytes interpreted as a first number by a second number is calculated. A first remainder by division associated with a first subset of the sequence of bytes is calculated with a first processor. A second remainder by division associated with a second subset of the sequence of bytes is calculated with a second processor. The calculating of the second remainder by division may occur at least partially during the calculating of the first remainder by division. A third remainder by division is calculated based on the calculating of the first remainder by division and the calculating of the second remainder by division. | 10-25-2012 |
20130073528 | SCALABLE DEDUPLICATION SYSTEM WITH SMALL BLOCKS - For scalable data deduplication working with small data chunks in a computing environment, for each of the small data chunks, a signature is generated based on a combination of a representation of characters that appear in the small data chunks with a representation of frequencies of the small data chunks. The signature is used to help in selecting the data to be deduplicated. | 03-21-2013 |
20130073529 | SCALABLE DEDUPLICATION SYSTEM WITH SMALL BLOCKS - For scalable data deduplication working with small data chunks in a computing environment, for each of the small data chunks, a signature is generated based on a combination of a representation of characters that appear in the small data chunks with a representation of frequencies of the small data chunks. The signature is used to help in selecting the data to be deduplicated. | 03-21-2013 |
20130290278 | SCALABLE DEDUPLICATION SYSTEM WITH SMALL BLOCKS - Exemplary method, system, and computer program product embodiments for scalable data deduplication working with small data chunks in a computing environment are provided. In one embodiment, by way of example only, for each of the small data chunks, a signature is generated based on a combination of a representation of characters that appear in the small data chunks with a representation of frequencies of the small data chunks. The signature is used to help in selecting the data to be deduplicated. Additional system and computer program product embodiments are disclosed and provide related advantages. | 10-31-2013 |
20130290279 | SCALABLE DEDUPLICATION SYSTEM WITH SMALL BLOCKS - Exemplary method, system, and computer program product embodiments for scalable data deduplication working with small data chunks in a computing environment are provided. In one embodiment, by way of example only, for each of the small data chunks, a signature is generated based on a combination of a representation of characters that appear in the small data chunks with a representation of frequencies of the small data chunks. The signature is used to help in selecting the data to be deduplicated. Additional system and computer program product embodiments are disclosed and provide related advantages. | 10-31-2013 |
20140188818 | OPTIMIZING A PARTITION IN DATA DEDUPLICATION - For optimizing a partition of a data block into matching and non-matching segments in data deduplication using a processor device in a computing environment, an optimal calculation operation is applied in polynomial time to the matching segments for selecting a globally optimal subset of a set of matching segments according to overhead considerations for minimizing an overall size of a deduplicated file by determining a trade-off between a time complexity and a space complexity. | 07-03-2014 |
20140188828 | CONTROLLING SEGMENT SIZE DISTRIBUTION IN HASH-BASED DEDUPLICATION - Segment sizes are controlled by setting the size of a segment boundary in a hash-based deduplication system. A subsequence of size K of a sequence of characters S is set. An increasing sequence of n probabilities and a corresponding sequence of n decreasingly restrictive logical tests are chosen to be applied on the sequence of characters S. Segment boundaries are set by using the sequence of the decreasingly restrictive logical tests by deciding to declare a segment boundary at a current position if one of the sequence of the decreasingly restrictive logical tests, with a corresponding probability of the sequence of n probabilities, returns a true value when applied on the sequence of characters S. | 07-03-2014 |
20150088843 | OPTIMIZING A PARTITION IN DATA DEDUPLICATION - For optimizing a partition of a data block into matching and non-matching segments in data deduplication using a processor device in a computing environment, a sequence of matching segments is split into sub-parts for obtaining a globally optimal subset, to which an optimal calculation is applied. The solutions of optimal calculations for the entire range of the sequence are combined, and a globally optimal subset is built by means of a first two-dimensional table represented by a matrix C[i, j]. A representation of the globally optimal subset is stored in a second two-dimensional table represented by a matrix PS[i, j] that holds, at entry [i, j], the globally optimal subset for a plurality of parameters in the form of a bit string of length j−i+1, where i and j are indices of bit positions corresponding to segments. | 03-26-2015 |
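
The three remainder-by-division entries (20120131082, 20120158812, 20120271873) can be read against a standard modular-arithmetic identity: if a byte sequence splits into a first part A and a second part B of k bytes, the whole sequence as a number is A·256^k + B, so its remainder modulo m depends only on the remainders of A and B. The Python sketch below illustrates that identity only. It is not taken from the applications; the function names, the big-endian byte order, and the sequential (rather than two-processor) execution are all assumptions made for illustration.

```python
def remainder(data: bytes, m: int) -> int:
    """Remainder of `data`, read as a big-endian integer, divided by m (Horner's rule)."""
    r = 0
    for b in data:
        r = (r * 256 + b) % m
    return r


def combine(r_first: int, r_second: int, len_second: int, m: int) -> int:
    """Merge the remainders of two adjacent parts into the remainder of the whole.

    Uses (A * 256**k + B) mod m == ((A mod m) * (256**k mod m) + (B mod m)) mod m,
    where k is the byte length of the second part.
    """
    return (r_first * pow(256, len_second, m) + r_second) % m


def append_to_pseudo_remainder(pseudo: int, rest: bytes) -> int:
    """Number obtained by appending the bytes of `rest` to `pseudo`."""
    return pseudo * 256 ** len(rest) + int.from_bytes(rest, "big")


if __name__ == "__main__":
    data = bytes(range(1, 50))       # hypothetical input
    m = 1_000_003                    # hypothetical divisor
    first, second = data[:20], data[20:]

    expected = int.from_bytes(data, "big") % m

    # The two partial remainders are independent, so they could be computed
    # concurrently; here they are computed one after the other.
    r1 = remainder(first, m)
    r2 = remainder(second, m)
    assert combine(r1, r2, len(second), m) == expected

    # Any number congruent to the first part modulo m can stand in for it,
    # even if it is not smaller than m (a "pseudo-remainder" in that sense).
    pseudo = r1 + 3 * m
    assert append_to_pseudo_remainder(pseudo, second) % m == expected
```

Because `combine` needs only the two partial remainders and the length of the second part, the parts can be processed independently and merged at the end; the second assertion shows that a value merely congruent to the first part modulo m works equally well as a stand-in.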