This enables ImgArchive to use cloud backup solutions or remote drives for off-site backups.
This allows backups of the complete archive to be physically stored remotely. In the event of a disaster, they can be restored locally to repair any damage to the local archive or, in the worst case, to create a new local archive if the original has been destroyed.
A question many people ask is: why have a local archive at all? Why not just store the complete archive in the cloud? However, don’t forget that the cloud is just someone else’s computer and is not immune to data loss. The best way to guard against loss of the archive is to keep a number of copies in geographically different places, along with the convenience of a local archive completely under your control.
The process of backing up to remote devices.
To back up to remote storage, the data must be transferred and delivered to that storage. The delivery system may be Google Drive, Microsoft OneDrive, Dropbox, etc. It may also be a mapped drive or FTP. These delivery systems are generally slower than local storage.
The two-part solution
When ImgArchive accesses remote delivery systems, it does so through a virtual file system (VFS). This VFS contains all the information about the data that needs to be delivered remotely but none of the contents. The contents are kept in the main local archive and (once delivered) in the remote storage. In other words, the VFS is a metadata file system: it has the form of a file system with files and folders, but holds only the information about those files and folders.
The VFS is an interface between the main archive and the remote storage. This is where the two parts come into play.
When the main archive is required to transfer a file or folder, it merely sends the information about that file to the VFS. In the case of a folder, it likewise sends only the information about where the folder fits in the file system and does not create a real folder. This is the first part.
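As a rough illustration of this first part, the sketch below models a VFS as a metadata-only tree. The names `VfsEntry`, `VfsFolder` and `Vfs`, and the fields recorded for each file, are assumptions made for this example and are not ImgArchive's actual data structures.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VfsEntry:
    """Metadata for one file; the file contents are never stored here."""
    name: str
    size: int
    checksum: str


@dataclass
class VfsFolder:
    """A virtual folder: it holds entries and sub-folders, not a real directory."""
    files: Dict[str, VfsEntry] = field(default_factory=dict)
    folders: Dict[str, "VfsFolder"] = field(default_factory=dict)


class Vfs:
    def __init__(self) -> None:
        self.root = VfsFolder()

    def add_folder(self, path: str) -> VfsFolder:
        """Record where a folder sits in the tree without creating it on disk."""
        node = self.root
        for part in filter(None, path.split("/")):
            node = node.folders.setdefault(part, VfsFolder())
        return node

    def add_file(self, folder: str, entry: VfsEntry) -> None:
        """Record the information about a file, not its contents."""
        self.add_folder(folder).files[entry.name] = entry


# Example: register one image under a dated folder.
vfs = Vfs()
vfs.add_file("2012/2012-12-09/images", VfsEntry("1234.jpg", 2048, "abc123"))
```

Nothing here touches the disk or the remote storage, which is the point of the first part: the main archive only describes what needs to be delivered.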
The second part is the actual transfer of the contents of the files to be backed up. This enables ImgArchive to hand over the actual backup process to a second application. ImgArchive can then carry on with other actions without being held up by slow delivery systems.
This second application is the delivery system and is called IA Remote backup (IARBK.exe).
IA Remote backup (IARBK.exe)
This application is responsible for managing the transfer of data to the remote storage.
IARBK runs side-by-side with iavault.exe. However, there is a master-slave relationship between the two applications. The master is iavault.exe, the producer of information. The slave is IARBK, which consumes this information.
For example, when images are imported into ImgArchive (iavault.exe) and remote storage needs to be accessed, ImgArchive will update the VFS with the files and folders that need transferring. In addition, it will generate a Remote Job Journal file.
The aim of this file is to allow the Remote Backup process to carry out the updates without needing to traverse the complete VFS to find changes.
This file contains a list of all the files and folders that need to be transferred to the remote site for a particular backup session. Iarbk.exe will read this file and update it as the files are transferred.
By default, Iarbk.exe will copy the files to the remote site and then transfer them back as a verification step, to guarantee that the files have been copied over without corruption. Once all the files have been transferred, the Journal file will show a list of completed file transfers. At this point a summary file will be generated detailing the result of the Remoting process. Iarbk.exe will then look for any new Journal files that need servicing; otherwise it will exit.
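The servicing of a single Journal file might look something like the following sketch. The journal layout (a list of entries with `path` and `status` keys), the callback parameters and the use of SHA-256 are all assumptions for illustration; the source does not specify the journal format or the checksum algorithm.

```python
import hashlib
from typing import Callable, Dict, List


def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def service_journal(
    journal: List[Dict[str, str]],
    read_local: Callable[[str], bytes],
    put_remote: Callable[[str, bytes], None],
    get_remote: Callable[[str], bytes],
) -> Dict[str, int]:
    """Transfer every pending journal entry and verify it by reading it back."""
    summary = {"completed": 0, "failed": 0}
    for entry in journal:
        if entry["status"] == "completed":
            continue  # already transferred in an earlier run
        data = read_local(entry["path"])
        put_remote(entry["path"], data)
        # Verification step: fetch the copy back and compare checksums.
        if sha256_of(get_remote(entry["path"])) == sha256_of(data):
            entry["status"] = "completed"
            summary["completed"] += 1
        else:
            entry["status"] = "failed"
            summary["failed"] += 1
    return summary


# Example with in-memory dictionaries standing in for local and remote storage.
local = {"a.jpg": b"one", "b.jpg": b"two"}
remote: Dict[str, bytes] = {}
journal = [
    {"path": "a.jpg", "status": "pending"},
    {"path": "b.jpg", "status": "pending"},
]
summary = service_journal(
    journal, local.__getitem__, remote.__setitem__, remote.__getitem__
)
```

The returned dictionary plays the role of the summary, and because each entry's status is updated in place, an interrupted run can be resumed without repeating completed transfers.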
The Remote Job Journal (RJJ)
The VFS contains all the information required to validate that the remote transfers are complete and consistent with the files and directories in the local archive. It does this by generating checksums of the local versions of the files. This information is then stored in the VFS. When the remote files are transferred, they can be validated by comparing the checksums of the remote files with the ones contained in the VFS.
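A checksum for a local file could be generated along the following lines. This is a minimal sketch: SHA-256 and the function names `file_checksum` and `is_consistent` are assumptions, since the source does not say which checksum ImgArchive uses.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_consistent(vfs_checksum: str, remote_checksum: str) -> bool:
    """A remote file is valid when its checksum equals the one held in the VFS."""
    return vfs_checksum == remote_checksum
```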
Remote devices
ImgArchive can manage a number of Remote storage devices. These devices can be of different types. For example, one device can store files in a Dropbox account and another may store files on a server using FTP.
Note that Remote storage devices share the VFS, but an RJJ needs to be generated for each device. The reason is that each device is independent: it will transfer files at a different rate and may sometimes be off-line, in which case it uses its RJJ files to catch up. The VFS is a common resource and each device must be kept up-to-date with it. The aim is for each Remote storage device to be a complete copy of the VFS.
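The relationship between the shared VFS and the per-device journals can be sketched as follows. The function names and the journal entry layout are hypothetical; the sketch only shows that the same set of VFS changes yields an independent journal for each device.

```python
from typing import Dict, List

Journal = List[Dict[str, str]]


def create_journals(changed_paths: List[str], devices: List[str]) -> Dict[str, Journal]:
    """Build a separate journal for each device from the same list of VFS changes."""
    return {
        device: [{"path": path, "status": "pending"} for path in changed_paths]
        for device in devices
    }


def pending(journal: Journal) -> List[str]:
    """Paths a device still has to transfer to catch up with the VFS."""
    return [entry["path"] for entry in journal if entry["status"] != "completed"]


# Example: the Dropbox device finishes one file while the FTP device is off-line.
journals = create_journals(["a.jpg", "b.jpg"], ["dropbox", "ftp"])
journals["dropbox"][0]["status"] = "completed"
```

Because each device has its own entries, progress on one device does not affect what another device still has to transfer.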
Validating files
This can be done in two ways. The first applies if the machine that hosts the remote storage can run a checksum generator: it can create the checksum for each file transferred and send the checksum back for comparison with the VFS. If the checksums match, the file is valid and has been transferred correctly. The second method is to copy the files to be validated back to the local disk and generate the required checksums locally. If the checksums match those in the VFS, the file is valid. The local copy can then be deleted. Unfortunately this takes longer, as each file has to be transferred to the remote storage and then back to the local file system for a checksum to be generated.
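The two validation methods could be combined in one routine, as in the sketch below. The function name, its parameters and the choice of SHA-256 are assumptions for illustration: `remote_checksum` stands for whatever mechanism a remote machine offers to compute a checksum, and `download` for whatever copies the remote file to a local path.

```python
import hashlib
import os
import tempfile
from typing import Callable, Optional


def validate_remote_file(
    expected: str,
    download: Callable[[str], None],
    remote_checksum: Optional[Callable[[], str]] = None,
) -> bool:
    """Validate a transferred file against the checksum held in the VFS."""
    if remote_checksum is not None:
        # Method 1: the remote machine generates the checksum itself.
        return remote_checksum() == expected
    # Method 2: copy the file back, hash it locally, then delete the copy.
    handle, local_copy = tempfile.mkstemp()
    os.close(handle)
    try:
        download(local_copy)
        with open(local_copy, "rb") as copy:
            return hashlib.sha256(copy.read()).hexdigest() == expected
    finally:
        os.remove(local_copy)
```

The first method avoids the second transfer entirely, so it is worth using whenever the remote side supports it.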
Remote transfer plug-in
IARemote needs to transfer files and create folders on the remote drive to create and maintain the remote copy of the archive. In order to support any number of cloud backup solutions, these functions may be implemented using different Remote Transfer Plug-ins (RTP), depending on the requirements of the cloud backup solution and how it provides these functions.
Functions to be implemented:
Open connection
Close connection
Log in to connection
Log out of connection
Change folder
Put file
Get file
File size
File date
Delete file
Make folder
Delete folder
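One way to express this list of functions is as an abstract interface that each plug-in implements. The class name, method names and signatures below are assumptions chosen to mirror the list; the actual plug-in API is not described in the source.

```python
from abc import ABC, abstractmethod
from datetime import datetime


class RemoteTransferPlugin(ABC):
    """Hypothetical interface; one subclass per cloud backup solution."""

    @abstractmethod
    def open_connection(self) -> None: ...

    @abstractmethod
    def close_connection(self) -> None: ...

    @abstractmethod
    def login(self, username: str, password: str) -> None: ...

    @abstractmethod
    def logout(self) -> None: ...

    @abstractmethod
    def change_folder(self, path: str) -> None: ...

    @abstractmethod
    def put_file(self, local_path: str, remote_name: str) -> None: ...

    @abstractmethod
    def get_file(self, remote_name: str, local_path: str) -> None: ...

    @abstractmethod
    def file_size(self, remote_name: str) -> int: ...

    @abstractmethod
    def file_date(self, remote_name: str) -> datetime: ...

    @abstractmethod
    def delete_file(self, remote_name: str) -> None: ...

    @abstractmethod
    def make_folder(self, name: str) -> None: ...

    @abstractmethod
    def delete_folder(self, name: str) -> None: ...
```

With an interface of this kind, the transfer logic can be written once against `RemoteTransferPlugin`, and a plug-in that omits any of the functions cannot be instantiated.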
Each directory is a file containing the file information for the remote directory. For example, if there are three files, 1234.jpg, 2345.jpg and 3456.jpg, in a directory “2012/2012-12-09/images”
Structure of the Mirror system
Virtual File System
The VFS is the metadata describing how the file system is structured, for example where the folders are and what those folders contain. Note that only the information about the files is stored, not their contents.
Initialising the VFS.
When first created, the VFS is empty. The VFS can be populated from the real FS by using the generate function, or by explicitly building the VFS from the real FS using the same underlying functions that the generate function uses.
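A generate step of this kind might be sketched as a walk over the real file system that records metadata only. The function name `generate_vfs` and the dictionary layout (folder path mapped to file names and sizes) are assumptions for this example, not the actual generate function.

```python
import os
from typing import Dict


def generate_vfs(root: str) -> Dict[str, Dict[str, int]]:
    """Walk the real file system and record metadata only, keyed by folder."""
    vfs: Dict[str, Dict[str, int]] = {}
    for folder, _subfolders, files in os.walk(root):
        # Store folder paths relative to the root, using "/" on every platform.
        relative = os.path.relpath(folder, root).replace(os.sep, "/")
        vfs[relative] = {
            name: os.path.getsize(os.path.join(folder, name)) for name in files
        }
    return vfs
```

The root folder itself appears under the key `"."`, and empty folders are recorded with an empty mapping so that the folder structure is preserved.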
Remote Device driver
Each Remote Device driver (RDD) needs the following information:
Location of the primary file system root – this will normally be the location of the master and derivative archives.
Location of the VFS.
Remote Device Type information.
Location of the Remote Job Journals
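These four items could be held in a small configuration record such as the one below. The class and field names are hypothetical, and the set of valid device type values is an assumption.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RemoteDeviceConfig:
    primary_root: Path      # root of the primary file system (master and derivative archives)
    vfs_location: Path      # where the VFS is stored
    device_type: str        # for example "ftp"; valid values are an assumption
    journal_location: Path  # where this device's Remote Job Journals are kept


# Example configuration with placeholder paths.
config = RemoteDeviceConfig(
    primary_root=Path("/archive"),
    vfs_location=Path("/archive/vfs"),
    device_type="ftp",
    journal_location=Path("/archive/journals"),
)
```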
FTP Remote Device driver
This driver is used to back up the archive to an external FTP server.
To do this, a number of configuration items need to be provided. These are as follows:
Username / Password of the account on the FTP server.
Host name of FTP server.
Root folder on the FTP server where the backup is to be placed.
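As an illustration of how these configuration items might be used, the sketch below uploads a single file with Python's standard `ftplib` module. The class `FtpDeviceConfig` and the functions `remote_target` and `upload` are hypothetical names for this example, and the upload assumes the target folder already exists on the server.

```python
import posixpath
from dataclasses import dataclass
from ftplib import FTP


@dataclass(frozen=True)
class FtpDeviceConfig:
    host: str
    username: str
    password: str
    root_folder: str


def remote_target(config: FtpDeviceConfig, relative_path: str) -> str:
    """Full server-side path of an archive file under the backup root."""
    return posixpath.join(config.root_folder, relative_path)


def upload(config: FtpDeviceConfig, local_path: str, relative_path: str) -> None:
    """Send one file to the server; assumes the target folder already exists."""
    with FTP(config.host) as ftp, open(local_path, "rb") as source:
        ftp.login(user=config.username, passwd=config.password)
        ftp.storbinary(f"STOR {remote_target(config, relative_path)}", source)
```

Plain FTP sends the username and password unencrypted, so a real driver may prefer `ftplib.FTP_TLS` where the server supports it.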
Iarbk as a service or daemon
When remote backup is enabled, the iarbk application needs to run side-by-side with the iavault application in order to push the data to be backed up to the remote storage. The iavault application will normally complete before Iarbk.exe.
The iarbk application can be started by the iavault application, or run as a service on Windows or as a daemon on Linux or iOS.
Job Files
These files contain details of the work to be carried out by Iarbk.exe. They are generated by iavault when the VFS is updated with new items that need to be backed up as a result of iavault activities. The job file records which files have been added or modified in the VFS and now need to be updated in the remote storage. As the jobs are done, the job file is modified to reflect the new status of each job.
For example, a new image is imported into the archive. It is added to the master archive and any specified backups, the VFS is updated with the new image details, and the job file is updated to reflect that a new image has been added. The iarbk application then needs to be triggered to read the VFS and the job file. Iarbk reads the VFS and job file, updates the external remote file storage with the new image, and updates both the VFS and the job file. Initially the unprocessed job file is stored in a pending folder; once completed, the file is moved into the completed folder.
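The pending-to-completed life cycle of a job file could be sketched as follows. The JSON layout, the `items` and `status` keys and the function name `complete_job` are assumptions for illustration; the actual job file format is not described here, and the real transfer is left out.

```python
import json
import shutil
from pathlib import Path


def complete_job(job_file: Path, completed_dir: Path) -> Path:
    """Mark every item in a pending job file as done and move it to the completed folder."""
    job = json.loads(job_file.read_text())
    for item in job["items"]:
        item["status"] = "completed"  # the real transfer would happen before this
    job_file.write_text(json.dumps(job, indent=2))
    completed_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(job_file), str(completed_dir / job_file.name)))
```

Updating the status before moving the file means a job found in the pending folder always reflects work that is still outstanding.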