Digital preservation encompasses all the activities required to ensure continued long-term access to information held in digital formats. It is therefore the ability to ensure that digital information remains accessible and retains sufficient attributes to establish its authenticity.

Since its creation, KEEP SOLUTIONS has played a pioneering role in building tools and scientific knowledge in the area of digital preservation. Its activity is not limited to the sale of services and products; it extends to research, development and building a greater understanding of the risks that threaten the longevity of digital information, as well as the most appropriate strategies to mitigate them.

KEEP SOLUTIONS’ activities in this field include, but are not limited to:

Development of tools and services

KEEP SOLUTIONS has actively contributed to the development of some of the most popular tools in the field of digital preservation, including DSpace, FITS and RODA.

Involvement in European projects

KEEP SOLUTIONS has participated in several R&D projects at European level, working side by side with some of the leading experts in the field of digital preservation and with internationally renowned institutions. We would like to highlight our participation in the SCAPE, 4C and e-ARK projects.

Participation in scientific events

Our employees have participated as speakers in all major scientific events in the field of digital preservation, including iPRES, the International Conference on Theory and Practice of Digital Libraries, the International Conference on Asia-Pacific Digital Libraries, Open Repositories, the BAD Congress and the Luso-Brazilian Conference on Open Access.

Drawing on its extensive knowledge of digital preservation, KEEP SOLUTIONS provides its clients with products and services specially designed to assist them in implementing digital preservation solutions.

ISO 16363 consultancy services – Audit and certification of trustworthy digital repositories

The Trustworthy Repositories Audit & Certification: Criteria & Checklist report was published in 2007. This document, generally known as TRAC, set out the requirements for assessing the ability of a digital repository to store digital collections and ensure continued access to them. In June 2012, TRAC was promoted to an international standard, having been improved and republished under the name ISO 16363 – Audit and certification of trustworthy digital repositories.

The implementation of ISO 16363 enhances trust among repository stakeholders (e.g. producers, consumers, operators and managers) and is crucial for establishing a climate of transparency regarding the underlying processes that support the digital repository. This assessment framework identifies and defines the requirements that a digital repository must meet to be considered a “trustworthy” repository.

The organisation responsible for a trustworthy repository must be able to demonstrate that it has all the necessary procedures in place to identify and mitigate threats to digital information. These threats may differ in nature (e.g. organisational, financial, technological, social, physical or environmental), so activities such as monitoring the repository environment, preservation planning, system maintenance and financial sustainability are constant concerns for anyone who takes responsibility for a digital repository.

Based on our extensive experience and know-how in the field of digital preservation, we provide a set of consultancy services for the implementation of the ISO 16363 standard. These include compliance diagnosis, internal audits of repositories and preservation processes, and advice on the acquisition of the necessary services and systems.

Our consulting process for certification consists of the following stages:

  • Diagnosis – A preliminary assessment to determine the repository's level of compliance with ISO 16363.
  • Action plan – Development of an action plan aimed at increasing the repository's compliance with the ISO 16363 standard.
  • Implementation of actions – Implementation of the actions set out in the action plan.
  • Internal audit – Internal audit of the repository, its internal processes, infrastructure and software to ensure that they meet the requirements of the standard.
  • Audit report – Final audit report listing all the non-conformities detected, together with suggestions for improvement.

Format characterisation and risk analysis

A digital repository is generally composed of metadata (descriptive, technical, preservation, etc.) and digital information. The digital information is generally materialized in the form of files or sets of files that are commonly called “digital representations”.

Often those responsible for a digital repository do not know with precision the size and shape of their digital content. This can happen due to a wide variety of reasons, e.g. the management software is not able to generate technical metadata about the stored information, formats are incorrectly identified during the ingest process, corrupted or malformed data renders the identification process impossible, etc.

The lack of knowledge about the technical characteristics of digital assets is considered a major preservation risk. If one is not fully aware of the characteristics of one's collections, one cannot design a proper preservation plan that takes into consideration all the risks inherent to each file format. An effective action plan depends directly on the formats that actually exist in the repository, so the lack of this knowledge prevents the repository from acting to properly protect its materials.

KEEP SOLUTIONS provides tools and services to support the characterisation of digital materials, including:

  • Identification of formats (e.g. PRONOM ID, MIME type).
  • Validation of formats according to their specification.
  • Extraction of technical properties (e.g. width, height, compression scheme, colour scheme, number of pages, number of figures, number of tables).

The tools developed by KEEP SOLUTIONS are able to connect to any content management system, whether it is based on a file system, a relational database, a service-oriented architecture or another technology. KEEP SOLUTIONS also offers a risk analysis service that suggests actions to increase the long-term preservability of the data.
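As an illustration of what format identification involves, the following is a minimal sketch that recognises a handful of formats by their leading "magic" bytes and produces a format census of a collection. It is not the KEEP SOLUTIONS tooling: the signature table, the function names and the in-memory collection are all assumptions made for the example, and real characterisation tools rely on far richer signature registries such as PRONOM.

```python
from collections import Counter

# Hypothetical, deliberately tiny signature table (magic bytes -> MIME type).
# Note that the ZIP signature is shared by container formats such as DOCX or EPUB,
# which is one reason why real tools need more than a prefix check.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"PK\x03\x04", "application/zip"),
    (b"II*\x00", "image/tiff"),  # little-endian TIFF
    (b"MM\x00*", "image/tiff"),  # big-endian TIFF
]

UNKNOWN = "application/octet-stream"


def identify(data: bytes) -> str:
    """Return the MIME type whose signature matches the start of the data."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return UNKNOWN


def format_census(files: dict[str, bytes]) -> Counter:
    """Count how many files of each identified format a collection holds."""
    return Counter(identify(content) for content in files.values())
```

A census of this kind is the starting point for a risk analysis: unidentified files and unexpected formats are the first candidates for closer inspection.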

Quality assurance and validation of digitisation projects

In order to guarantee continued access to digital information, it is necessary to ensure that the information one holds is the right information. Every digitisation project has requirements to be met. Generally, these include the formats in which the materials should be delivered, the compression algorithms and colour schemes, the metadata that should be produced and an assortment of quality attributes needed to meet established institutional policy.

In a large digitisation project that involves scanning several thousand items, mistakes and omissions are common. Quality control is therefore a fundamental activity for those who outsource digitisation services. If the quality criteria are not properly defined at the beginning of the project and are not subsequently validated when the materials are received, there is a high risk that a significant percentage of the digitised materials will not conform to what was previously defined (or contracted).

KEEP SOLUTIONS provides a set of tools and services for validating digitised materials. These tools perform the following checks:

  • Validation of formats according to their specification (e.g. PDF/A, TIFF, JP2).
  • Validation of materials according to a pre-defined digitisation profile (e.g. resolution, colour scheme, compression).
  • Validation of metadata (e.g. completeness, format compliance).
  • Detection of duplicated pages.
  • Detection of blank pages.

The tools we provide produce compliance reports that enable the identification of files that do not meet the pre-defined digitisation agreements. Based on this information the contractor may request the partial or global repetition of the digitisation process.
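The checks listed above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the profile fields, the metadata dictionary layout and the function names are hypothetical, and the duplicate check only catches byte-identical pages (two separate scans of the same page would require image comparison instead).

```python
import hashlib
from collections import defaultdict

# Hypothetical digitisation profile agreed with the digitisation supplier.
PROFILE = {
    "format": "image/tiff",
    "min_dpi": 300,
    "colour_scheme": "RGB",
    "compression": "none",
}


def check_profile(meta: dict, profile: dict = PROFILE) -> list[str]:
    """Return the list of non-conformities of one file against the profile."""
    issues = []
    if meta.get("format") != profile["format"]:
        issues.append(f"format is {meta.get('format')!r}, expected {profile['format']!r}")
    if meta.get("dpi", 0) < profile["min_dpi"]:
        issues.append(f"resolution is {meta.get('dpi', 0)} dpi, minimum is {profile['min_dpi']}")
    for key in ("colour_scheme", "compression"):
        if meta.get(key) != profile[key]:
            issues.append(f"{key} is {meta.get(key)!r}, expected {profile[key]!r}")
    return issues


def find_duplicates(pages: dict[str, bytes]) -> list[list[str]]:
    """Group page names whose content is byte-for-byte identical."""
    by_digest = defaultdict(list)
    for name, content in pages.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    return [names for names in by_digest.values() if len(names) > 1]
```

Collecting the output of `check_profile` for every delivered file yields a simple compliance report: files with an empty list conform, and the others carry the reasons for rejection.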

Large-scale format migration

The vast majority of information management systems do not impose any restrictions on the type of information that one may ingest into them. This flexibility, coupled with the absence of a proper ingest policy, inevitably leads to a scenario where the variety of formats in the repository is so wide that it becomes virtually impossible, from a technical and/or financial standpoint, to implement the actions needed to ensure continued access to these assets.

One way of mitigating this phenomenon is to reduce the number of formats in the repository through a process commonly known as “normalisation”. This process entails converting all the materials from less suitable formats to more preservation-oriented ones.

For example, compressed formats (e.g. JPEG or ZIP) are more vulnerable to bit rot (i.e. accidental modification of the bits of a file) than uncompressed formats. Changing a single bit in a ZIP file could mean losing the ability to decode the file entirely. Modifying a single bit in an uncompressed file, by contrast, usually produces only a small change in the decoded content, e.g. one pixel may change its colour or a letter in a text document may be replaced by another.
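The difference can be observed with a small experiment. The sketch below, which uses only the Python standard library, flips one bit in an uncompressed byte string and one bit in its zlib-compressed counterpart; the helper names are assumptions made for the example. With zlib, a flipped bit typically makes the stream undecodable or fails the built-in integrity check, whereas the uncompressed data differs in exactly one byte.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of the data with a single bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def try_decompress(data: bytes):
    """Return the decompressed bytes, or None if the stream cannot be decoded."""
    try:
        return zlib.decompress(data)
    except zlib.error:
        return None


if __name__ == "__main__":
    original = b"Digital preservation keeps information accessible. " * 20
    compressed = zlib.compress(original)

    # One flipped bit in the uncompressed data: a single byte is affected.
    damaged_raw = flip_bit(original, 80)
    changed = sum(a != b for a, b in zip(original, damaged_raw))
    print("uncompressed: bytes changed =", changed)

    # One flipped bit in the middle of the compressed stream.
    damaged_zip = flip_bit(compressed, len(compressed) * 4)
    recovered = try_decompress(damaged_zip)
    print("compressed: recovered intact =", recovered == original)
```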

KEEP SOLUTIONS offers a format migration/normalisation service that can integrate with any document management system in your organisation. The service is exposed through Web services, so it can easily be invoked by existing software. It can be extended with new plugins, enabling it to adapt quickly and seamlessly to new customer requirements.
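To give an idea of how a plugin-based migration service can be organised, here is a minimal sketch of a converter registry. It does not describe the actual KEEP SOLUTIONS architecture: the `plugin` decorator, the `migrate` function, the format labels and the sample text converter are all hypothetical.

```python
from typing import Callable

Converter = Callable[[bytes], bytes]

# Registry of available conversions, keyed by (source format, target format).
_PLUGINS: dict[tuple[str, str], Converter] = {}


def plugin(source: str, target: str):
    """Decorator that registers a converter for a given format pair."""
    def register(func: Converter) -> Converter:
        _PLUGINS[(source, target)] = func
        return func
    return register


@plugin("text/latin-1", "text/utf-8")
def latin1_to_utf8(data: bytes) -> bytes:
    """Example plugin: re-encode ISO 8859-1 text as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


def migrate(data: bytes, source: str, target: str) -> bytes:
    """Convert data using the plugin registered for the format pair."""
    try:
        converter = _PLUGINS[(source, target)]
    except KeyError:
        raise LookupError(f"no plugin registered for {source} -> {target}") from None
    return converter(data)
```

Supporting a new migration path then amounts to writing one more decorated function; the calling code does not change.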

Database preservation

Sometimes the systems that support vital business activities reach the end of their life. The reasons vary, e.g. the system is replaced by a more capable one, the system undergoes an update, or the service or activity in the organisation simply ceases to exist.

However, it is often necessary to maintain access to the authentic records stored within these databases long after the database systems have been decommissioned. The preservation of these records is critical to the extent that they constitute material evidence that a certain activity was performed. These records may not have been migrated to the new system (e.g. because the corresponding functionality ceased to exist) or may have been tampered with during the migration process, so having an independent way to assess the original records is fundamental in these cases.

KEEP SOLUTIONS provides tools that extract the information held in databases and transform it into neutral, system-independent formats, ensuring continued access to the data decades after the database management system has been decommissioned. We also provide navigation, search and viewing capabilities on the preserved data, with the ability to cope with terabyte-size databases.
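The principle of extracting a database into neutral formats can be sketched with the Python standard library. The example below is an illustration only, not the KEEP SOLUTIONS tools: it assumes a SQLite database, and it uses plain SQL text for the table definitions and CSV for the rows as stand-ins for a dedicated preservation format. The function name and output layout are hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def export_database(conn: sqlite3.Connection, out_dir: Path) -> list[str]:
    """Write every user table as <table>.sql (definition) and <table>.csv (rows)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    # Skip SQLite's internal bookkeeping tables.
    tables = [(name, ddl) for name, ddl in rows if not name.startswith("sqlite_")]
    for name, ddl in tables:
        (out_dir / f"{name}.sql").write_text(ddl + ";\n", encoding="utf-8")
        cursor = conn.execute(f'SELECT * FROM "{name}"')
        with open(out_dir / f"{name}.csv", "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow([column[0] for column in cursor.description])
            writer.writerows(cursor)
    return [name for name, _ in tables]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer TEXT)")
    connection.execute("INSERT INTO invoice (customer) VALUES ('ACME')")
    with tempfile.TemporaryDirectory() as folder:
        print(export_database(connection, Path(folder)))
```

Because the output is plain text, it can still be read when neither the original application nor its database engine is available.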

The illustrations included on this page were created by Tom Woolley of the Curve Agency for the Jisc-funded Digital Preservation Business Case Toolkit and are used under the Creative Commons Attribution-NonCommercial 3.0 Unported License.