Patent application number | Description | Published |
20090077004 | Data Recovery in a Hierarchical Data Storage System - Systems and methods for retrieving data from a storage system having a plurality of storage pools are provided. The system comprises a logic unit for processing configurable data retrieval instructions to determine a first storage pool from which target backup data is to be retrieved, in response to a data restore request; and a logic unit for retrieving the target backup data from the first storage pool to satisfy the restore request. The configurable data retrieval instructions are managed by a source external to the storage system with administrative authority to change the configurable data retrieval instructions to optimize data restoration from the storage system. | 03-19-2009 |
20090077140 | Data Recovery in a Hierarchical Data Storage System - Systems and methods for retrieving data from a storage system having a plurality of storage pools are provided. The method comprises processing configurable data retrieval instructions to determine a first storage pool from which target backup data is to be retrieved, in response to a data restore request; and retrieving the target backup data from the first storage pool to satisfy the restore request. The configurable data retrieval instructions are managed by a source external to the storage system with administrative authority to change the configurable data retrieval instructions to optimize data restoration from the storage system. | 03-19-2009 |
20090234892 | METHOD AND SYSTEM FOR ASSURING INTEGRITY OF DEDUPLICATED DATA - The present invention provides for a system and method for assuring integrity of deduplicated data objects stored within a storage system. A data object is copied to secondary storage media, and a digital signature such as a checksum is generated of the data object. Then, deduplication is performed upon the data object and the data object is split into chunks. The chunks are combined when the data object is subsequently accessed, and a signature is generated for the reassembled data object. The reassembled data object is provided if the newly generated signature is identical to the originally generated signature, and otherwise a backup copy of the data object is provided from secondary storage media. | 09-17-2009 |
20090271454 | Enhanced method and system for assuring integrity of deduplicated data - The present invention provides for an enhanced method and system for assuring integrity of deduplicated data objects stored within a storage system. A digital signature of the data object is generated to determine if the data object reassembled from a deduplicated state is identical to its pre-deduplication state. In one embodiment, generating the object signature of a data object before deduplication comprises generating an object signature from intermediate hash values computed from a hash function operating on each data chunk within the data object, the hash function also used to determine duplicate data chunks. In an alternative embodiment, generating the object signature of a data object before deduplication comprises generating an object signature on a portion of each data chunk of the data object. | 10-29-2009 |
20100036887 | EFFICIENT TRANSFER OF DEDUPLICATED DATA - One aspect of the present invention includes enabling the efficient transfer of deduplicated data between storage pools in a storage management system without unnecessary reassembly and deduplication of data objects. In one embodiment, the storage management system tracks deduplication information for the data chunks of data objects within an index at the storage management system level, in addition to tracking storage information for each data object within another index at the storage management system level. The data chunk deduplication information is then accessible by any storage pool. Accordingly, transfers of the data objects and data chunks of the data object are easily facilitated, even between non-deduplicating and deduplicating storage pools. | 02-11-2010 |
20100070478 | RETRIEVAL AND RECOVERY OF DATA CHUNKS FROM ALTERNATE DATA STORES IN A DEDUPLICATING SYSTEM - One aspect of the present invention includes retrieving and recovering data chunks from alternate data stores in a storage management system which utilizes deduplication. In one embodiment, deduplication information for data chunks of data objects is stored at a system-wide level to enable the transfer and access of data chunks stored among multiple storage pools. When a data object is accessed on a first storage pool that contains damaged or inaccessible data chunks, the undamaged and accessible chunks may be retrieved from the first storage pool, in addition to retrieving an undamaged copy of the damaged or inaccessible data chunks from alternate data storage pools. Thus, a complete data object can be retrieved or recovered with a combination of chunks from the first storage pool and other storage pools within the storage management system, without requiring the entire data object to be retrieved from a backup source. | 03-18-2010 |
20100082558 | POLICY-BASED SHARING OF REDUNDANT DATA ACROSS STORAGE POOLS IN A DEDUPLICATING SYSTEM - One aspect of the present invention includes enabling data chunks to be shared among different storage pools within a storage management system, according to the use of deduplication and storage information kept at the system level, and applied with policy-based rules that define the scope of deduplication. In one embodiment, the parameters of performing deduplication are defined within the policy, particularly which of the plurality of storage pools allow deduplication to which other pools. Accordingly, a data object may be linked to deduplicated data chunks existent within other storage pools, and the transfer of a data object may occur by simply creating references to existing data chunks in other pools provided the policy allows the pool to reference chunks in these other pools. Additionally, a group of storage pools may be defined within the policy to perform a common set of deduplication activities across all pools within the group. | 04-01-2010 |
20100174881 | OPTIMIZED SIMULTANEOUS STORING OF DATA INTO DEDUPLICATED AND NON-DEDUPLICATED STORAGE POOLS - One aspect of the present invention includes an optimized simultaneous storage operation for data objects onto a combination of deduplicated and non-deduplicated storage pools. In one embodiment, a data object is provided for storage onto destination storage pools in a storage management system, and placed into a source buffer. The data object is first divided into data chunks if the data object has not previously been chunked within the storage management system. The data object is then simultaneously copied from the source buffer to each destination storage pool (deduplicating and non-deduplicating) with the following operation. If the destination pool utilizes deduplication, then the individual data chunks are only transferred if copies of the individual data chunks do not already exist on the destination storage pool. If the destination pool does not utilize deduplication, then all chunks of the data object are transferred to the destination storage pool. | 07-08-2010 |
20100299311 | METHOD AND SYSTEM FOR ASSURING INTEGRITY OF DEDUPLICATED DATA - The present invention provides for a system and method for assuring integrity of deduplicated data objects stored within a storage system. A data object is copied to secondary storage media, and a digital signature such as a checksum is generated of the data object. Then, deduplication is performed upon the data object and the data object is split into chunks. The chunks are combined when the data object is subsequently accessed, and a signature is generated for the reassembled data object. The reassembled data object is provided if the newly generated signature is identical to the originally generated signature, and otherwise a backup copy of the data object is provided from secondary storage media. | 11-25-2010 |
20110016095 | Integrated Approach for Deduplicating Data in a Distributed Environment that Involves a Source and a Target - One aspect of the present invention includes a configuration of a storage management system that enables the performance of deduplication activities at both the client (source) and at the server (target) locations. The location of deduplication operations can then be optimized based on system conditions or predefined policies. In one embodiment, seamless switching of deduplication activities between the client and the server is enabled by utilizing uniform deduplication process algorithms and accessing the same deduplication index (containing information on the hashed data chunks). Additionally, any data transformations on the chunks are performed subsequent to identification of the data chunks. Accordingly, with use of this storage configuration, the storage system can find and utilize matching chunks generated with either client- or server-side deduplication. | 01-20-2011 |
20110040732 | APPROACH FOR SECURING DISTRIBUTED DEDUPLICATION SOFTWARE - The various embodiments of the present invention include techniques for securing the use of data deduplication activities occurring in a source-deduplicating storage management system. These techniques are intended to prevent fake data backup, target data contamination, and data spoofing attacks initiated by a source. In one embodiment, one technique includes limiting chunk querying to authorized users. Another technique provides detection of attacks and unauthorized access to keys within the target system. Additional techniques include the combination of validating the existence of data from the source by validating the data chunk, validating a data sample of the data chunk, or validating a hash value of the data chunk. A further embodiment involves the use of policies to provide authorization levels for chunk sharing and linking within the target. These techniques separately and in combination provide a comprehensive strategy to avoid unauthorized access to data within the target storage system. | 02-17-2011 |
20110218969 | APPROACH FOR OPTIMIZING RESTORES OF DEDUPLICATED DATA - Various techniques for improving the performance of restoring deduplicated data files from a server to a client within a storage management system are disclosed. In one embodiment, a chunk index is maintained on the client that tracks the chunks remaining on the client for each data file that is stored to and restored from the storage server. When a specific file is selected for restore from the storage server to the client, the client determines if any local copies of this specific file's chunks are stored in files already existing on the client data store. The file is then reconstructed from a combination of these local copies of the file chunks and chunks retrieved from the storage server. Therefore, only chunks that are not stored or are inaccessible to the client are retrieved from the server, reducing server-side processing requirements and the bandwidth required for data restore operations. | 09-08-2011 |
20120005171 | Deduplication of data object over multiple passes - In each of a number of passes to deduplicate a data object, a transaction is started. Where an offset into the object has previously been set, the offset is retrieved; otherwise, the offset is set to reference a beginning of the object. A portion of the object beginning at the offset is deduplicated until an end-of-transaction criterion has been satisfied. The transaction is ended to commit deduplication; where the object has not yet been completely deduplicated, the offset is moved just past where deduplication has already occurred. The object is locked during each pass; other processes cannot access the object during each pass, but can access the object between passes. Each pass is relatively short, so the length of time in which the object is inaccessible is relatively short. By comparison, deduplicating an object within a single pass prevents other processes from accessing the object for a longer time. | 01-05-2012 |
20120158664 | RESTORING DATA OBJECTS FROM SEQUENTIAL BACKUP DEVICES - Provided are a computer program product, system, and method for restoring deduplicated data objects from sequential backup devices. A server stores data objects of extents having deduplicated data in the at least one sequential backup device. The server receives from a client a request for data objects. The server determines extents stored in the at least one sequential backup device for the requested data objects. The server or client sorts the extents according to an order in which they are stored in the at least one sequential backup device to generate a sort list. The server retrieves the extents from the at least one sequential backup device according to the order in the sort list to access the extents sequentially from the sequential backup device in the order in which they were stored. The server returns the retrieved extents to the client and the client reconstructs the requested data objects from the received extents. | 06-21-2012 |
20120158666 | RESTORING A RESTORE SET OF FILES FROM BACKUP OBJECTS STORED IN SEQUENTIAL BACKUP DEVICES - Provided are a computer program product, system, and method for restoring a restore set of files from backup objects stored in sequential backup devices. Backup objects are stored in at least one sequential backup device. A client initiates a restore request to restore a restore set of data in a volume as of a restore point-in-time. A determination is made of backup objects stored in at least one sequential backup device including the restore set of data for the restore point-in-time, wherein the determined backup objects are determined from a set of backup objects including a full volume backup and delta backups providing data in the volume at different points-in-time, and wherein extents in different backup objects providing data for blocks in the volume at different points-in-time are not stored contiguously in the sequential backup device. A determination is made of extents stored in the at least one sequential backup device for the determined backup objects. The determined extents are sorted according to an order in which they are stored in the at least one sequential backup device to generate a sort list. The extents are retrieved from the at least one sequential backup device according to the order in the sort list to access the extents sequentially from the sequential backup device in the order in which they were stored. The retrieved extents are returned to the client and the client reconstructs the restore data set from the received extents. | 06-21-2012 |
20120215742 | DATA RETENTION USING LOGICAL OBJECTS - Various embodiments are provided for facilitation of data retention using logical objects. Following an operation creating a redundant copy of the data performed on a scheduled interval, a logical object containing a number of managed file versions, represented by a number of member objects for a recovery point, is created. The logical object is assigned a policy of a data retention policy construct associated with the recovery point. The logical object is adapted for reassignment between policies of the data retention policy construct associated with various recovery points. During the reassignment, the plurality of member objects representing the plurality of managed file versions are logically retained instead of performing a data copy operation to associate the plurality of managed file versions with another recovery point. | 08-23-2012 |
20120233131 | RESTORING DATA OBJECTS FROM SEQUENTIAL BACKUP DEVICES - Provided are a computer program product, system, and method for restoring deduplicated data objects from sequential backup devices. A server stores data objects of extents having deduplicated data in the at least one sequential backup device. The server receives from a client a request for data objects. The server determines extents stored in the at least one sequential backup device for the requested data objects. The server or client sorts the extents according to an order in which they are stored in the at least one sequential backup device to generate a sort list. The server retrieves the extents from the at least one sequential backup device according to the order in the sort list to access the extents sequentially from the sequential backup device in the order in which they were stored. The server returns the retrieved extents to the client and the client reconstructs the requested data objects from the received extents. | 09-13-2012 |
20130013573 | RETRIEVAL AND RECOVERY OF DATA CHUNKS FROM ALTERNATE DATA STORES IN A DEDUPLICATING SYSTEM - For recovery of data chunks from alternate data stores, a method detects a damaged copy of a first data chunk of a deduplicated data object within a first storage pool of a plurality of storage pools storing data chunks. The method further locates an undamaged copy of the first data chunk in an alternate storage pool within the plurality of storage pools from a system-wide deduplication index that indexes each data chunk in each storage pool. In addition, the method creates a new object holding the undamaged copy in the first storage pool, the new object linked to the damaged copy through the system-wide deduplication index. | 01-10-2013 |
20130054518 | APPLYING REPLICATION RULES TO DETERMINE WHETHER TO REPLICATE OBJECTS - A source server maintains a replication rule specifying a condition for a replication attribute and a replication action to take if the condition with respect to the replication attribute is satisfied, wherein the replication action indicates to include or exclude the object having an attribute value for the replication attribute that satisfies the condition. For each of the objects, the replication rule is applied by determining an attribute value of the object corresponding to the replication attribute in the replication rule and determining whether the determined attribute value satisfies the condition for the replication attribute defined in the determined replication rule. The replication action is performed on the object in response to determining that the determined attribute value satisfies the condition for the replication attribute. | 02-28-2013 |
20130054523 | REPLICATION OF DATA OBJECTS FROM A SOURCE SERVER TO A TARGET SERVER - Data objects are replicated from a source storage managed by a source server to a target storage managed by a target server. A source list is built of objects at the source server to replicate to the target server. The target server is queried to obtain a target list of objects at the target server. A replication list is built indicating objects on the source list not included on the target list to transfer to the target server. For each object in the replication list, data for the object not already at the target storage is sent to the target server and metadata on the object is sent to the target server to cause the target server to include the metadata in an entry for the object in a target server replication database. An entry for the object is added to a source server replication database. | 02-28-2013 |
20130054524 | REPLICATION OF DATA OBJECTS FROM A SOURCE SERVER TO A TARGET SERVER - Data objects are replicated from a source storage managed by a source server to a target storage managed by a target server. A source list is built of objects at the source server to replicate to the target server. The target server is queried to obtain a target list of objects at the target server. A replication list is built indicating objects on the source list not included on the target list to transfer to the target server. For each object in the replication list, data for the object not already at the target storage is sent to the target server and metadata on the object is sent to the target server to cause the target server to include the metadata in an entry for the object in a target server replication database. An entry for the object is added to a source server replication database. | 02-28-2013 |
20130054545 | MANAGING DEREFERENCED CHUNKS IN A DEDUPLICATION SYSTEM - A chunk index has information on chunks in a storage space referenced in objects in the storage space. The chunk index includes a reference count for each chunk indicating a number of objects in which the chunk is referenced and a reference measurement representing a level of data object references to the chunk. One chunk is selected to remove from the storage space based on a criterion applied to the reference measurements of chunks having reference counts indicating that the chunks are not referenced in one object in the storage space. | 02-28-2013 |
20130054906 | MANAGING DEREFERENCED CHUNKS IN A DEDUPLICATION SYSTEM - A chunk index has information on chunks in a storage space referenced in objects in the storage space. The chunk index includes a reference count for each chunk indicating a number of objects in which the chunk is referenced and a reference measurement representing a level of data object references to the chunk. One chunk is selected to remove from the storage space based on a criterion applied to the reference measurements of chunks having reference counts indicating that the chunks are not referenced in one object in the storage space. | 02-28-2013 |
20130124487 | Deduplication of data object over multiple passes - In each of a number of passes to deduplicate a data object, a transaction is started. Where an offset into the object has previously been set, the offset is retrieved; otherwise, the offset is set to reference a beginning of the object. A portion of the object beginning at the offset is deduplicated until an end-of-transaction criterion has been satisfied. The transaction is ended to commit deduplication; where the object has not yet been completely deduplicated, the offset is moved just past where deduplication has already occurred. The object is locked during each pass; other processes cannot access the object during each pass, but can access the object between passes. Each pass is relatively short, so the length of time in which the object is inaccessible is relatively short. By comparison, deduplicating an object within a single pass prevents other processes from accessing the object for a longer time. | 05-16-2013 |
20130144840 | OPTIMIZING RESTORES OF DEDUPLICATED DATA - For restoring deduplicated data, a method maintains a chunk index on a client computing system coupled to a client data store. The chunk index tracks chunks within files remaining on the client data store after storage of the files to a deduplicated server data store coupled to a server computing system. The method determines whether a valid entry for a first chunk exists in the chunk index. In addition, the method retrieves the first chunk from the server data store responsive to determining the valid entry for the first chunk does not exist in the chunk index. The method further retrieves the first chunk from the client data store specified in the valid entry of the chunk index responsive to determining that the valid entry exists in the chunk index and the first chunk resides in a first file at a first offset. | 06-06-2013 |
20130159645 | DATA SELECTION FOR MOVEMENT FROM A SOURCE TO A TARGET - In one aspect of the present description, in connection with storing a first deduplicated data object in a primary storage pool, described operations include determining the duration of time that the first data object has resided in the primary storage pool, and comparing the determined duration of time to a predetermined time interval. In addition, described operations include, after the determined duration of time meets or exceeds the predetermined time interval, determining if the first data object has an extent referenced by another data object, and determining whether to move the first data object from the primary storage pool to a secondary storage pool as a function of whether the first data object has an extent referenced by another data object after the determined duration of time meets or exceeds the predetermined time interval. Other features and aspects may be realized, depending upon the particular application. | 06-20-2013 |
20130159648 | DATA SELECTION FOR MOVEMENT FROM A SOURCE TO A TARGET - In one aspect of the present description, in connection with storing a first deduplicated data object in a primary storage pool, described operations include determining the duration of time that the first data object has resided in the primary storage pool, and comparing the determined duration of time to a predetermined time interval. In addition, described operations include, after the determined duration of time meets or exceeds the predetermined time interval, determining if the first data object has an extent referenced by another data object, and determining whether to move the first data object from the primary storage pool to a secondary storage pool as a function of whether the first data object has an extent referenced by another data object after the determined duration of time meets or exceeds the predetermined time interval. Other features and aspects may be realized, depending upon the particular application. | 06-20-2013 |
20140052952 | MANAGING DEREFERENCED CHUNKS IN A DEDUPLICATION SYSTEM - A chunk index has information on chunks in a storage space referenced in objects in the storage space. The chunk index includes a reference count for each chunk indicating a number of objects in which the chunk is referenced and a reference measurement representing a level of data object references to the chunk. One chunk is selected to remove from the storage space based on a criterion applied to the reference measurements of chunks having reference counts indicating that the chunks are not referenced in one object in the storage space. | 02-20-2014 |
20140279912 | CLIENT OBJECT REPLICATION BETWEEN A FIRST BACKUP SERVER AND A SECOND BACKUP SERVER - Provided are a computer program product, system, and method for client object replication between a first backup server and a second backup server. Objects are backed up from a client to a first backup server. The first backup server generates metadata on the objects from the client and transmits the metadata to a second backup server, wherein the objects backed up at the first backup server are not copied to the second backup server. A determination is made that the first backup server is unavailable after transmitting the metadata. The metadata at the second backup server is used to determine modifications to the objects at the client since the metadata was last generated in response to determining that the first backup server is unavailable. The client backs up the determined modifications to the objects to the second backup server. | 09-18-2014 |
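
Several entries above (for example 20090234892 and 20100299311) describe verifying a deduplicated object by comparing a signature taken before deduplication with one computed after reassembly, and falling back to a backup copy on mismatch. The following is a minimal sketch of that idea only, not the method claimed in any of the applications. The class `DedupStore`, the fixed `CHUNK_SIZE`, the use of SHA-256 for both chunk identity and the object signature, and the in-memory dictionaries standing in for storage pools and secondary media are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical fixed chunk size, kept tiny for illustration


def signature(data: bytes) -> str:
    """Digest used here both as chunk identity and as object signature (assumption)."""
    return hashlib.sha256(data).hexdigest()


class DedupStore:
    """Toy in-memory stand-in for a deduplicating store with a backup copy."""

    def __init__(self):
        self.chunks = {}   # chunk digest -> chunk bytes (each unique chunk stored once)
        self.objects = {}  # object name -> (ordered chunk digests, original signature)
        self.backups = {}  # object name -> full copy (stand-in for secondary media)

    def store(self, name: str, data: bytes) -> None:
        # Copy the object to "secondary storage" and record its signature
        # before deduplication takes place.
        self.backups[name] = data
        original_sig = signature(data)
        refs = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = signature(chunk)
            # Store the chunk only if an identical one is not already present.
            self.chunks.setdefault(digest, chunk)
            refs.append(digest)
        self.objects[name] = (refs, original_sig)

    def retrieve(self, name: str) -> bytes:
        refs, original_sig = self.objects[name]
        reassembled = b"".join(self.chunks[d] for d in refs)
        # Return the reassembled object only if its signature matches the
        # one recorded before deduplication; otherwise use the backup copy.
        if signature(reassembled) == original_sig:
            return reassembled
        return self.backups[name]


if __name__ == "__main__":
    store = DedupStore()
    store.store("report", b"abcdabcdxyz")
    print(len(store.chunks))                          # unique chunks stored
    print(store.retrieve("report") == b"abcdabcdxyz")
```

In this sketch a corrupted chunk changes the reassembled signature, so `retrieve` returns the backup copy instead of the damaged data. A real system would keep the backup on separate media and would likely use variable-size chunking; both are outside what this example shows.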