SQLite is an open source database that is used in both large and small systems. Adobe uses it for Lightroom, its archiving application. Another user of SQLite is the programming language Python: a SQLite database can be easily accessed using Python.
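As a minimal sketch of that access, the following uses the `sqlite3` module from Python's standard library. The table name `images`, its columns, and the sample rows are hypothetical and chosen only for illustration; they are not part of SIA.

```python
import sqlite3


def names_taken_since(rows, since):
    """Load rows into an in-memory SQLite database and query them with SQL."""
    conn = sqlite3.connect(":memory:")  # in-memory, so no file is left behind
    try:
        # Hypothetical table, used only for this illustration.
        conn.execute("CREATE TABLE images (name TEXT, taken TEXT)")
        conn.executemany("INSERT INTO images VALUES (?, ?)", rows)
        # Search (WHERE) and sort (ORDER BY) are done by the database.
        cursor = conn.execute(
            "SELECT name FROM images WHERE taken >= ? ORDER BY name",
            (since,),
        )
        return [row[0] for row in cursor.fetchall()]
    finally:
        conn.close()


sample = [
    ("b.jpg", "2014-02-01"),
    ("a.jpg", "2013-07-15"),
    ("c.jpg", "2014-05-09"),
]
print(names_taken_since(sample, "2014-01-01"))
```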
Because this database is popular and, unlike a number of other databases, completely free, it has a large following. The main advantage of a database is that the data it holds can be searched and sorted much more quickly than in a flat-file database. Data can be queried using SQL, and applications can be built to use the data quickly. The disadvantage is that database systems are more complex to set up.
Damage to a database also affects virtually all the systems using it.
Sequence Identifiers
Each image in the archive is uniquely identified by a sequence number. This number is then used to cross-reference images within the databases. The databases generate this number in two ways. The SQLite database generates it as a unique numeric primary key in the Asset Properties table. All other tables in the database then contain this number as their primary key.
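The shared-key arrangement can be sketched as follows. This is an illustration under stated assumptions, not SIA's actual schema: the table names `asset_properties` and `camera_info`, their columns, and the example paths are all hypothetical. It relies on the SQLite behaviour that an `INTEGER PRIMARY KEY` column is assigned automatically when a row is inserted without a value for it.

```python
import sqlite3


def build_archive_db():
    """Create a tiny in-memory database with a shared sequence-number key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE asset_properties (          -- hypothetical names
            sequence_id  INTEGER PRIMARY KEY,    -- generated by SQLite
            archive_path TEXT NOT NULL
        );
        CREATE TABLE camera_info (               -- hypothetical second table
            sequence_id  INTEGER PRIMARY KEY
                REFERENCES asset_properties (sequence_id),
            camera_model TEXT
        );
        """
    )
    return conn


def add_image(conn, archive_path, camera_model=None):
    """Insert an image and return the sequence number SQLite generated."""
    cur = conn.execute(
        "INSERT INTO asset_properties (archive_path) VALUES (?)",
        (archive_path,),
    )
    sequence_id = cur.lastrowid
    if camera_model is not None:
        # The second table reuses the same number as its own primary key.
        conn.execute(
            "INSERT INTO camera_info VALUES (?, ?)",
            (sequence_id, camera_model),
        )
    return sequence_id


def camera_for(conn, sequence_id):
    """Cross-reference the two tables through the shared sequence number."""
    row = conn.execute(
        "SELECT c.camera_model FROM asset_properties a "
        "JOIN camera_info c ON c.sequence_id = a.sequence_id "
        "WHERE a.sequence_id = ?",
        (sequence_id,),
    ).fetchone()
    return row[0] if row else None
```

With this layout, one number obtained from the first insert is enough to find the matching row in any other table.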
Each image in the database must be referenced in the Asset Properties table; in all other tables an entry is optional. The Asset Properties table contains the full path to the image in the archive, and an index that performs the reverse: given an image path, it returns the sequence number. In the case of the SQLite database, the database itself can generate this unique sequence number, carry out the indexing into the other tables, and maintain a link to the actual image in the archive.
Flat-file databases such as the XML and CSV databases cannot do this directly. The reason is that there is a set of CSV files per day, and the sequence numbers are generated at the time the image is placed in the archive, not on the date the image was taken. The sequence numbers are therefore not guaranteed to be in any order. To solve this problem, SIA maintains a file-based sequence number lookup: given a sequence number, the lookup returns the full archive path. To carry out the reverse, you start with the archive path, so the folder holding the correct CSV file is known; the Asset Properties CSV file is in image file name order, so finding the sequence number is trivial.
Archive integrity
One main function of an image archive is to safeguard the images within it. The archive can be damaged, either intentionally or unintentionally, at any time. If damage is done to the archive, the first thing the archive must do is inform you, the user, as soon as possible that the damage has taken place. The next thing is to tell you what damage has been done, and lastly to help you fix it. SIA has mechanisms to monitor the integrity of the archive by recording the times that images are modified. In addition, it maintains a file map of the archive with both CRC and MD5 checksums of each file in the archive. If the file map does not match the contents of the archive, these differences can be listed.
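The file map comparison described above can be sketched roughly as follows. This is not SIA's implementation: the function names and the in-memory dictionary format are hypothetical, and CRC-32 is assumed as the CRC variant because the text does not say which one SIA uses.

```python
import hashlib
import zlib
from pathlib import Path


def checksums(path):
    """Return (CRC-32, MD5 hex digest) for one file."""
    # Reads the whole file at once, which is fine for a sketch.
    data = Path(path).read_bytes()
    return zlib.crc32(data), hashlib.md5(data).hexdigest()


def build_file_map(root):
    """Map each file's path (relative to root) to its checksums."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): checksums(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def list_differences(file_map, root):
    """Compare a stored file map with what is on disk now."""
    current = build_file_map(root)
    return {
        "missing": sorted(set(file_map) - set(current)),
        "changed": sorted(
            name
            for name in file_map.keys() & current.keys()
            if file_map[name] != current[name]
        ),
        "added": sorted(set(current) - set(file_map)),
    }
```

A stored map would be built once with `build_file_map` while the archive is known to be good, then passed to `list_differences` at each later check.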
Sometimes these differences are relatively harmless, such as an image being modified without being marked as checked out; on the other hand, whole years’ worth of images may be missing. The file map will highlight this change. From the user's point of view, a missing year may not be noticed until images from that year are needed, so a long period of time may pass before the damage becomes apparent. Once the damage is identified, a list of damaged or missing files can be generated, and the archive can be repaired from an archive mirror by copying the files back into the archive. A full integrity check of the archive can then be made to verify that the repair was successful.
Hook scripts
A hook script is a program triggered by an archive repository event, such as an image being about to be processed for placement in the archive. This is, for example, a point where, if the image is a RAW type, a picture type may be generated so that both can be archived as a RAW/Picture pair.
Views