Patent application number | Description | Published |
20110016095 | Integrated Approach for Deduplicating Data in a Distributed Environment that Involves a Source and a Target - One aspect of the present invention includes a configuration of a storage management system that enables the performance of deduplication activities at both the client (source) and at the server (target) locations. The location of deduplication operations can then be optimized based on system conditions or predefined policies. In one embodiment, seamless switching of deduplication activities between the client and the server is enabled by utilizing uniform deduplication process algorithms and accessing the same deduplication index (containing information on the hashed data chunks). Additionally, any data transformations on the chunks are performed subsequent to identification of the data chunks. Accordingly, with use of this storage configuration, the storage system can find and utilize matching chunks generated with either client- or server-side deduplication. | 01-20-2011 |
20110040732 | APPROACH FOR SECURING DISTRIBUTED DEDUPLICATION SOFTWARE - The various embodiments of the present invention include techniques for securing the use of data deduplication activities occurring in a source-deduplicating storage management system. These techniques are intended to prevent fake data backup, target data contamination, and data spoofing attacks initiated by a source. In one embodiment, one technique includes limiting chunk querying to authorized users. Another technique provides detection of attacks and unauthorized access to keys within the target system. Additional techniques include the combination of validating the existence of data from the source by validating the data chunk, validating a data sample of the data chunk, or validating a hash value of the data chunk. A further embodiment involves the use of policies to provide authorization levels for chunk sharing and linking within the target. These techniques separately and in combination provide a comprehensive strategy to avoid unauthorized access to data within the target storage system. | 02-17-2011 |
20110218969 | APPROACH FOR OPTIMIZING RESTORES OF DEDUPLICATED DATA - Various techniques for improving the performance of restoring deduplicated data files from a server to a client within a storage management system are disclosed. In one embodiment, a chunk index is maintained on the client that tracks the chunks remaining on the client for each data file that is stored to and restored from the storage server. When a specific file is selected for restore from the storage server to the client, the client determines if any local copies of this specific file's chunks are stored in files already existing on the client data store. The file is then reconstructed from a combination of these local copies of the file chunks and chunks retrieved from the storage server. Therefore, only chunks that are not stored or are inaccessible to the client are retrieved from the server, reducing server-side processing requirements and the bandwidth required for data restore operations. | 09-08-2011 |
20120158664 | RESTORING DATA OBJECTS FROM SEQUENTIAL BACKUP DEVICES - Provided are a computer program product, system, and method for restoring deduplicated data objects from sequential backup devices. A server stores data objects of extents having deduplicated data in the at least one sequential backup device. The server receives from a client a request for data objects. The server determines extents stored in the at least one sequential backup device for the requested data objects. The server or client sorts the extents according to an order in which they are stored in the at least one sequential backup device to generate a sort list. The server retrieves the extents from the at least one sequential backup device according to the order in the sort list to access the extents sequentially from the sequential backup device in the order in which they were stored. The server returns the retrieved extents to the client and the client reconstructs the requested data objects from the received extents. | 06-21-2012 |
20120158666 | RESTORING A RESTORE SET OF FILES FROM BACKUP OBJECTS STORED IN SEQUENTIAL BACKUP DEVICES - Provided are a computer program product, system, and method for restoring a restore set of files from backup objects stored in sequential backup devices. Backup objects are stored in at least one sequential backup device. A client initiates a restore request to restore a restore set of data in a volume as of a restore point-in-time. A determination is made of backup objects stored in at least one sequential backup device including the restore set of data for the restore point-in-time, wherein the determined backup objects are determined from a set of backup objects including a full volume backup and delta backups providing data in the volume at different points-in-time, and wherein extents in different backup objects providing data for blocks in the volume at different points-in-time are not stored contiguously in the sequential backup device. A determination is made of extents stored in the at least one sequential backup device for the determined backup objects. The determined extents are sorted according to an order in which they are stored in the at least one sequential backup device to generate a sort list. The extents are retrieved from the at least one sequential backup device according to the order in the sort list to access the extents sequentially from the sequential backup device in the order in which they were stored. The retrieved extents are returned to the client and the client reconstructs the restore data set from the received extents. | 06-21-2012 |
20120233131 | RESTORING DATA OBJECTS FROM SEQUENTIAL BACKUP DEVICES - Provided are a computer program product, system, and method for restoring deduplicated data objects from sequential backup devices. A server stores data objects of extents having deduplicated data in the at least one sequential backup device. The server receives from a client a request for data objects. The server determines extents stored in the at least one sequential backup device for the requested data objects. The server or client sorts the extents according to an order in which they are stored in the at least one sequential backup device to generate a sort list. The server retrieves the extents from the at least one sequential backup device according to the order in the sort list to access the extents sequentially from the sequential backup device in the order in which they were stored. The server returns the retrieved extents to the client and the client reconstructs the requested data objects from the received extents. | 09-13-2012 |
20120290537 | IDENTIFYING MODIFIED CHUNKS IN A DATA SET FOR STORAGE - Provided are a computer program product, system, and method for identifying modified chunks in a data set for storage. Information is maintained on a data set of variable length chunks, including a digest of each chunk and information to locate the chunk in the data set. Modifications are received to at least one of the chunks in the data set. A determination is made of at least one range of at least one of the chunks including data affected by the modifications, wherein each range identifies one chunk or sequential chunks having data affected by the modifications. The at least one chunk in each range is processed to determine at least one new chunk in each range, and for each determined new chunk, a digest of the new chunk. A determination is made as to whether at least one chunk outside of the at least one range has changed. For each determined at least one chunk outside of the at least one range that has changed, a determination is made of at least one new chunk and a new digest of the at least one new chunk. The new digest information on the at least one new chunk and information to locate the new chunk in the data set are added to the set information. | 11-15-2012 |
20120290546 | IDENTIFYING MODIFIED CHUNKS IN A DATA SET FOR STORAGE - Provided are a computer program product, system, and method for identifying modified chunks in a data set for storage. Modifications are received to at least one of the chunks in the data set. A determination is made of at least one range of at least one of the chunks including data affected by the modifications. A determination is made as to whether at least one chunk outside of the at least one range has changed. For each determined at least one chunk outside of the at least one range that has changed, a determination is made of at least one new chunk and a new digest of the at least one new chunk, and information is added on the at least one new chunk, along with information to locate the new chunk in the data set. | 11-15-2012 |
20130101113 | ENCRYPTING DATA OBJECTS TO BACK-UP - Provided are a computer program product, system, and method for encrypting data objects to back-up to a server. A client private key is intended to be maintained only by the client. A data object of chunks to store at the server is generated. A first portion of the chunks in the data object is encrypted with the client private key, and the first portion of the chunks in the data object encrypted with the client private key is sent to the server to store. A second portion of the chunks in the data object not encrypted with the client private key is sent to the server to store. | 04-25-2013 |
20130103945 | ENCRYPTING DATA OBJECTS TO BACK-UP - Provided are a computer program product, system, and method for encrypting data objects to back-up to a server. A client private key is intended to be maintained only by the client. A data object of chunks to store at the server is generated. A first portion of the chunks in the data object is encrypted with the client private key, and the first portion of the chunks in the data object encrypted with the client private key is sent to the server to store. A second portion of the chunks in the data object not encrypted with the client private key is sent to the server to store. | 04-25-2013 |
20130138620 | OPTIMIZATION OF FINGERPRINT-BASED DEDUPLICATION - Described are embodiments of an invention for identifying chunk boundaries for optimization of fingerprint-based deduplication in a computing environment. Storage objects that are backed up in a computing environment are often compound storage objects which include many individual storage objects. The computing device of the computing environment breaks the storage objects into chunks of data by determining a hash value on a range of data. The computing device creates an artificial chunk boundary when the end of data of the storage object is reached. When an artificial chunk boundary is created for the end of data of a storage object, the computing device stores a pseudo fingerprint for the artificial chunk boundary. If a hash value matches a fingerprint or a pseudo fingerprint, then the computing device determines that the range of data corresponds to a chunk and the computing system defines the chunk boundaries. | 05-30-2013 |
20130144840 | OPTIMIZING RESTORES OF DEDUPLICATED DATA - For restoring deduplicated data, a method maintains a chunk index on a client computing system coupled to a client data store. The chunk index tracks chunks within files remaining on the client data store after storage of the files to a deduplicated server data store coupled to a server computing system. The method determines whether a valid entry for a first chunk exists in the chunk index. In addition, the method retrieves the first chunk from the server data store responsive to determining the valid entry for the first chunk does not exist in the chunk index. The method further retrieves the first chunk from the client data store specified in the valid entry of the chunk index responsive to determining that the valid entry exists in the chunk index and the first chunk resides in a first file at a first offset. | 06-06-2013 |
20140149699 | IDENTIFYING MODIFIED CHUNKS IN A DATA SET FOR STORAGE - Provided are a computer program product, system, and method for identifying modified chunks in a data set for storage. Information is maintained on a data set of variable length chunks, including a digest of each chunk and information to locate the chunk in the data set. Modifications are received to at least one of the chunks in the data set. A determination is made of chunks including data affected by the modifications. The determined chunks including data affected by the modifications are processed to determine new chunks and, for each determined new chunk, new digest information of the new chunk. The new digest information on the at least one new chunk and information to locate the new chunk in the data set are added to the set information. | 05-29-2014 |
20140351822 | CONTROLLING SOFTWARE PROCESSES THAT ARE SUBJECT TO COMMUNICATIONS RESTRICTIONS - Controlling a software process by causing the execution of a first software process on a computer, where the first software process is configured to exclusively access a resource on the computer, causing the execution of a second software process on the computer when the first software process has exclusive access to the resource, where the second software process is configured to perform a first predefined action that is independent of the second software process accessing the resource, attempt to access the resource, and perform a second predefined action that is dependent on the second software process accessing the resource, and causing the first software process to terminate its exclusive access to the resource, thereby causing the second software process to access the resource and perform the second predefined action. | 11-27-2014 |
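
The sort-list restore described in applications 20120158664 and 20120233131 above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Extent` class, the `tape_offset` field, and the `catalog` and `objects` dictionaries are hypothetical names chosen for illustration, and the sequential device is simulated in memory. The sketch only shows the core idea of sorting the needed extents by their stored order so the device is read in one forward pass, after which the client reassembles each object in its logical order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    extent_id: str
    tape_offset: int  # hypothetical: position of the extent on the sequential device
    data: bytes


def build_sort_list(extents):
    """Order extents by where they are stored on the sequential device."""
    return sorted(extents, key=lambda e: e.tape_offset)


def restore_objects(objects, catalog):
    """Restore objects from a simulated sequential device.

    objects: dict mapping object name -> list of extent ids in logical order
    catalog: dict mapping extent id -> Extent (stand-in for the device contents)
    """
    # Deduplicated extents may be shared, so collect each needed id only once.
    needed = {eid for ids in objects.values() for eid in ids}
    sort_list = build_sort_list(catalog[eid] for eid in needed)

    # Single forward pass over the device in stored order.
    retrieved = {}
    for extent in sort_list:
        retrieved[extent.extent_id] = extent.data

    # Client side: rebuild every object in its own logical extent order.
    return {
        name: b"".join(retrieved[eid] for eid in ids)
        for name, ids in objects.items()
    }


catalog = {
    "a": Extent("a", 30, b"AAA"),
    "b": Extent("b", 10, b"BB"),
    "c": Extent("c", 20, b"C"),
}
objects = {"file1": ["a", "b"], "file2": ["b", "c"]}
restored = restore_objects(objects, catalog)
```

In this example extent `b` is shared by both objects but is read from the simulated device only once, which is the benefit of combining deduplication with a sorted read order.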
Patent application number | Description | Published |
20100131487 | HTTP CACHE WITH URL REWRITING - URL rewriting is a common technique for allowing users to interact with internet resources using easy-to-remember and search-engine-friendly URLs. When URL rewriting involves conditions derived from sources other than the URL, inconsistencies in HTTP kernel cache and HTTP user output cache may arise. Methods and a system for rewriting a URL while preserving cache integrity are disclosed herein. Conditions used by a rule set to rewrite a URL may be determined as cache friendly conditions or cache unfriendly conditions. If cache unfriendly conditions exist, the HTTP kernel cache is disabled and the HTTP user output cache is varied based upon a key. If no cache unfriendly conditions exist, then the HTTP kernel cache is not disabled and the HTTP user output cache is not varied. A rule set is applied to the URL and a URL rewrite is performed to create a rewritten URL. | 05-27-2010 |
20110178973 | Web Content Rewriting, Including Responses - A content rewriting system is described herein that allows web site administrators to set up rewriting of web responses in an easy and efficient manner. The system provides a configuration schema and an efficient workflow that enables web administrators to easily set up rules to modify HTML or other content without incurring a high performance penalty or losing flexibility. The content rewriting system applies regular expressions or wildcard patterns to a response to locate and replace the content parts based on the rewriting logic expressed by outbound rewrite rules. The system parses an initial response generated by a web application, applies one or more outbound rules to rewrite the response, and provides the rewritten response to a client that submitted a request for the response. | 07-21-2011 |
20140115444 | Web Content Rewriting, Including Responses - A content rewriting system is described herein that allows web site administrators to set up rewriting of web responses in an easy and efficient manner. The system provides a configuration schema and an efficient workflow that enables web administrators to easily set up rules to modify HTML or other content without incurring a high performance penalty or losing flexibility. The content rewriting system applies regular expressions or wildcard patterns to a response to locate and replace the content parts based on the rewriting logic expressed by outbound rewrite rules. The system parses an initial response generated by a web application, applies one or more outbound rules to rewrite the response, and provides the rewritten response to a client that submitted a request for the response. | 04-24-2014 |
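
The outbound rewriting described in the two "Web Content Rewriting" entries above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the configuration schema or workflow those applications describe: the function name `apply_outbound_rules` and the representation of a rule as a `(pattern, replacement)` pair are hypothetical, and only regular-expression rules are shown (wildcard patterns are omitted).

```python
import re


def apply_outbound_rules(response_body, rules):
    """Apply each (regex pattern, replacement) rule to the response, in order."""
    for pattern, replacement in rules:
        response_body = re.sub(pattern, replacement, response_body)
    return response_body


# Hypothetical rule: point links under /old/ at /new/ before the response is sent.
rules = [(r'href="/old/', 'href="/new/')]
initial_response = '<a href="/old/page">link</a>'
rewritten_response = apply_outbound_rules(initial_response, rules)
```

Rules are applied sequentially, so a later rule sees the output of an earlier one; a real system would also need to decide which response content types are eligible for rewriting.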