1. Data Model for EWIG
1.1. Introduction
This document describes the data model for the EWIG long-term preservation system. The data model follows the Information Model described in the Reference Model for an Open Archival Information System (OAIS; ISO 14721:2012). To distinguish EWIG terms from terms and entities used and defined in the OAIS, the latter are set in italics.
The purpose of the project is to preserve usability and utility of information over the long term.
Publisher: Digital Preservation Working Group (Zuse Institute Berlin)
License: CC0
Turtle Files:
1.1.1. Namespaces
Namespace Prefix | Namespace URI |
ewig | http://ewig.zib.de/ontologies/ewig# |
ewigvocab | http://ewig.zib.de/ontologies/vocab/ |
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs | http://www.w3.org/2000/01/rdf-schema# |
owl | http://www.w3.org/2002/07/owl# |
skos | http://www.w3.org/2004/02/skos/core# |
dct | http://purl.org/dc/terms/ |
schema | http://schema.org/ |
pcdm | http://pcdm.org/models# |
pcdmrights | http://pcdm.org/rights# |
ore | http://www.openarchives.org/ore/terms/ |
edm | http://www.europeana.eu/schemas/edm/ |
premis | http://www.loc.gov/premis/rdf/v1# |
premis3 | http://www.loc.gov/premis/rdf/v3/ |
sh | http://www.w3.org/ns/shacl# |
xsd | http://www.w3.org/2001/XMLSchema# |
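The prefixes above are used throughout this document as shorthand for full URIs. As a minimal sketch of how a prefixed name resolves, the following Python snippet expands names using a subset of the table. The helper name expand_curie is hypothetical and not part of EWIG.

```python
# Subset of the namespace table above; extend as needed.
PREFIXES = {
    "ewig": "http://ewig.zib.de/ontologies/ewig#",
    "ewigvocab": "http://ewig.zib.de/ontologies/vocab/",
    "dct": "http://purl.org/dc/terms/",
    "pcdm": "http://pcdm.org/models#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def expand_curie(curie: str) -> str:
    """Expand a prefixed name such as 'dct:rights' into a full URI.

    Hypothetical helper for illustration only.
    """
    prefix, separator, local_name = curie.partition(":")
    if not separator or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local_name


print(expand_curie("ewig:InformationPackage"))
```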
2. Classes
2.1. Information Packages (ewig:InformationPackage)
The Information Package in the data model follows definition 4.2.2.1 in OAIS: “The conceptual structure for supporting Long Term Preservation of information is the Information Package. An Information Package is a container that contains two types of Information Objects, the Content Information and the Preservation Description Information (PDI); [...]”. The OAIS Package Description is modelled through the aggregation of data from the SubmissionManifest and one or more Information Objects (which ones is yet to be specified).
Information Packages aggregate Information Objects and serve as containers during the different stages within the preservation workflow.
Information Packages of type TransferAggregation can aggregate *Submission or Archival* Information Packages. Submission or Archival Information Packages aggregate Information Objects comprising Content and Preservation Description Information.
Information Packages MUST include a RightsStatement as a fallback statement for the Access Functional Entity and the SubmissionManifest as a record for the Administration Functional Entity.
Information Packages also include status messages for API access.
2.1.1. Property Usage
Property | Expected Object Type (Range) | Cardinality | Scope Note |
rdf:type | pcdm:Collection ewig:InformationPackage | 0..n | Subclass of ore:Aggregation. |
ewig:submissionManifest | ewig:SubmissionManifest | 1 | Submission metadata is mandatory for all IPs. |
dct:identifier | Literal (String) | 0..1 | Auto-generated by the API. For Transfer Aggregations: <contractId>,<submissionName>. For Information Packages: <contractId>,<submissionName>,<ieName>. |
ewig:type | ewigvocab:packagetype# | 1 | TransferAggregation (TA), Submission (SIP), Archival (AIP), ArchivalCollection (AIC). |
ewig:memberOfAIC | ewig:InformationPackage | 0..1 | TAs/SIPs/AIPs can be a member of an AIC. |
ewig:memberOfTA | ewig:InformationPackage | 0..1 | SIPs/AIPs can be a member of a TA. |
owl:sameAs | URI to Fedora Resource | 0..1 | Links to Package equivalent in Fedora. |
ewig:status | ewigvocab:status# | 0..1 | Status for API. |
ewig:statusMessage | Literal (String) | 0..1 | Optional message contextualizing API Status. |
ewig:stage | ewigvocab:stage# | 0..1 | Workflow stage. |
skos:prefLabel | Literal (String) | 0..1 | Optional label/title for package for Dashboard. Identifier will be used if absent. |
dct:description | Literal (String) | 0..1 | Optional description |
dct:rights | ewig:RightsStatement | 1 | Mandatory rights information in every package. |
dct:dateAccepted | xsd:dateTime | 0..1 | Package processing has finished successfully. |
dct:dateSubmitted | xsd:dateTime | 0..1 | Package received. |
ewig:archivematicaUuid | Literal | 0..1 | Archivematica UUID of an AIP. |
premis3:size | Literal (xsd:decimal) | 0..1 | Size of the InformationPackage in bytes at the time of ingest. Mandatory after ingest. |
ewig:callbackStatus | Literal (String) | 0..1 | Notification Status (Date / Response to external API) |
ewig:publisherUri | URI to ewig:Agent (Organization) | 1 | EWIG-URI of SubmittingOrganization |
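The dct:identifier row above describes two comma-separated patterns that the API generates. The following Python sketch shows how such identifiers could be assembled. The function name build_identifier is hypothetical, and the rejection of empty parts or parts containing a comma is an assumption made here to keep the result unambiguous; the actual API may behave differently.

```python
from typing import Optional


def build_identifier(contract_id: str,
                     submission_name: str,
                     ie_name: Optional[str] = None) -> str:
    """Assemble an identifier following the patterns in the table above.

    Transfer Aggregations: <contractId>,<submissionName>
    Information Packages:  <contractId>,<submissionName>,<ieName>
    """
    parts = [contract_id, submission_name]
    if ie_name is not None:
        parts.append(ie_name)
    for part in parts:
        # Assumption: parts must be non-empty and must not contain the separator.
        if not part or "," in part:
            raise ValueError(f"invalid identifier part: {part!r}")
    return ",".join(parts)


# Example values are invented for illustration.
print(build_identifier("C-001", "maps-2020"))
print(build_identifier("C-001", "maps-2020", "sheet-17"))
```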
2.2. Information Objects (ewig:InformationObject)
Information Objects follow the definition in 4.2.1.1 of the OAIS Reference Model. The Physical Object specialization of the Data Object is modelled as edm:aggregatedCHO.
Information Objects are categorized (ewig:use) according to a vocabulary (ewigvocab:use#) including IE (PREMIS Intellectual Entity), PDI, SubmissionDocumentation, Transcripts (OCR/TEI), Service/Intermediate files and so on.
Information Objects MUST be memberOf a single Information Package within EWIG and MAY contain one or more files (Digital Data Objects).
ObjectPreservationTypes define sets of significant properties for IEs through a separate EWIG ontology, which may be developed in the future.
2.2.1. Property Usage
Property | Expected Object Type (Range) | Cardinality | Scope Note |
rdf:type | pcdm:Object ewig:InformationObject | 2..n | Subclass of ore:Aggregation. |
pcdm:memberOf | ewig:InformationPackage | 1 | Every InformationObject MUST belong to exactly one package. |
pcdm:hasFile | pcdm:File | 0..n | |
ewig:objectPreservationType | Literal (String) | 0..n | Signifies significant properties to be preserved. Not used yet. |
skos:prefLabel | Literal (String) | 0..1 | Optional label/title for object for Dashboard. Use# will be used if absent |
dct:description | Literal (String) | 0..1 | Optional description. |
edm:aggregatedCHO | edm:ProvidedCHO | 1 | Description of IE. See Europeana Mapping Guidelines 2.3: http://pro.europeana.eu/files/Europeana_Professional/Share_your_data/Technical_requirements/EDM_Documentation/EDM_Mapping_Guidelines_v2.3_112016.pdf |
ewig:structure | ore:Proxy | 0..n | Optional structuring information of object. |
dct:rights | ewig:RightsStatement | 0..1 | Rights statement overrides package rights. |
ewig:use | ewigvocab:use# | 1 | Categorizes files contained in object according to usage in LTDPS. |
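The cardinalities in the table above can be checked mechanically. The sh prefix in the namespace table suggests that SHACL shapes may be intended for this purpose; the snippet below is only a plain-Python sketch of the same idea. The resource representation (a dict mapping property names to lists of values) and the function name check_cardinalities are assumptions for illustration, and rdf:type is left out for brevity.

```python
from typing import Dict, List, Optional, Tuple

# (min, max) per property, copied from the table above; None means unbounded.
INFORMATION_OBJECT_CARDINALITIES: Dict[str, Tuple[int, Optional[int]]] = {
    "pcdm:memberOf": (1, 1),
    "pcdm:hasFile": (0, None),
    "ewig:objectPreservationType": (0, None),
    "skos:prefLabel": (0, 1),
    "dct:description": (0, 1),
    "edm:aggregatedCHO": (1, 1),
    "ewig:structure": (0, None),
    "dct:rights": (0, 1),
    "ewig:use": (1, 1),
}


def check_cardinalities(resource: Dict[str, List[str]]) -> List[str]:
    """Return one message per property whose value count is out of range."""
    problems = []
    for prop, (low, high) in INFORMATION_OBJECT_CARDINALITIES.items():
        count = len(resource.get(prop, []))
        if count < low:
            problems.append(f"{prop}: expected at least {low}, found {count}")
        elif high is not None and count > high:
            problems.append(f"{prop}: expected at most {high}, found {count}")
    return problems


# Hypothetical object that belongs to two packages and lacks edm:aggregatedCHO.
candidate = {
    "pcdm:memberOf": ["package-a", "package-b"],
    "ewig:use": ["intellectualEntity"],
}
for message in check_cardinalities(candidate):
    print(message)
```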
2.3. SubmissionManifest (ewig:SubmissionManifest)
The SubmissionManifest contains the administrative (including rights) information for the Administration Functional Entity. Semantics are according to the submission agreement...
Every SubmissionManifest MUST include a reference to a Contract.
2.3.1. Property Usage
Property | Expected Object Type (Range) | Cardinality | Scope Note |
rdf:type | ewig:SubmissionManifest | 1 | |
dct:identifier | Literal (String) | 0..1 | SubmissionName |
ewig:submissionManifestVersion | Literal (String) | 1 | Version of submission-manifest as given at time of delivery |
dct:isPartOf | Literal (String) | 1 | Name of SubmissionSet (AIC) |
dct:accrualPolicy | ewig:Contract | 1 | URI of Contract |
dct:publisher | Literal (String) | 1 | SubmittingOrganization as given at time of delivery |
ewig:publisherUri | URI to ewig:Agent (Organization) | 1 | EWIG-URI of SubmittingOrganization |
dct:creator | Literal (String) | 1 | Contact (Responsible Person) as given at time of delivery |
ewig:creatorUri | URI to ewig:Agent (Person) | 1 | EWIG-URI of Contact |
dct:contributor | Literal (String) | 1 | TransferCurator as given at time of delivery |
ewig:contributorUri | URI to ewig:Agent (Person) | 1 | EWIG-URI of TransferCurator Resource |
ewig:metadataFile | Literal (String) | 1 | Metadata File |
ewig:metadataFileFormat | URI | 1 | Metadata File Format |
ewig:dataSourceSystem | Literal (String) | 0..1 | System where data originates from. |
skos:prefLabel | Literal (String) | 0..1 | Optional label/title. |
dct:description | Literal (String) | 0..1 | Optional SubmissionDescription |
ewig:callbackParams | Literal (String) | 0..1 | Parameters for Transfer-/Ingest-Status Responses via callback URLs |
URL-Template: http://host/api.endpoint/?param1=<params>
Placeholders in < > will be replaced with the following parameters:
<code>: short success or error code
<message>: longer explanation of success or error condition
<ewig_id>: EWIG identifier of the information package
…
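To illustrate the placeholder substitution described for ewig:callbackParams, the following Python sketch fills a URL template with the code, message and ewig_id parameters. The function name, the query parameter names in the example and the choice to percent-encode the values are assumptions; the actual EWIG implementation is not specified here.

```python
import re
from urllib.parse import quote


def fill_callback_template(template: str, values: dict) -> str:
    """Replace <name> placeholders in a callback URL template.

    Values are percent-encoded (an assumption). Placeholders without a
    matching value are left untouched.
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            return match.group(0)
        return quote(str(values[name]), safe="")

    return re.sub(r"<(\w+)>", substitute, template)


# Hypothetical template and values.
url = fill_callback_template(
    "http://host/api.endpoint/?status=<code>&id=<ewig_id>",
    {"code": "200", "ewig_id": "C-001,maps-2020"},
)
print(url)
```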
2.4. Agent (ewig:Agent)
The Agent contains information about Persons or Organizations relevant to administrative workflows. Agents do not act as premis:Agent within the PDI. An Agent cannot be a Person and an Organization simultaneously.
2.4.1. Property Usage
Property | Expected Object Type (Range) | Cardinality | Scope Note |
rdf:type | dct:Agent premis:Agent schema:Person|Organization | 1..n | Premis or dct agents allow software agents. |
dct:identifier | ISIL... ORCID? | 1 (Organization), 0..1 (Person) | Unique identifier for agent within EWIG. Use ISIL or ORCID if available. |
schema:name | Literal (String) | 1 (Organization), 0 (Person) | Name of organization. |
schema:alternateName | Literal (String) | 0..1 | Optional alternative or abbreviation. Will be displayed within ()s in dashboard. |
schema:email | Literal (String) | 0..1 | Personal or functional email-address. |
schema:familyName | Literal (String) | 0 (Organization), 1 (Person) | |
schema:givenName | Literal (String) | 0 (Organization), 1 (Person) | |
schema:honorificPrefix | Literal (String) | 0 (Organization), 1 (Person) | Prof./Dr./... |
schema:honorificSuffix | Literal (String) | 0 (Organization), 1 (Person) | Phd./MA/MDB/... |
schema:affiliation | ewig:Agent (Organization) Literal (String) | 0..1 | Parent organization or organization the person is loosely affiliated with at the time of recording. For work relation use worksFor. |
schema:jobTitle | Literal (String) | 0 (Organization), 1 (Person) | Administrative or functional role within organization regarding data submissions. |
schema:worksFor | ewig:Agent (Organization) | 0 (Organization), 1 (Person) | Employer (Organization). |
skos:prefLabel | Literal (String) | 0..1 | Optional label/title. |
dct:description | Literal (String) | 0..1 | Optional description/comments. |
ewig:login | Literal (String) | 0..1 (Organization), 0 (Person) | Transfer-Server Login of Organization |
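The rule that an Agent cannot be both a Person and an Organization, together with the type-dependent cardinalities in the table above, can be sketched as a small check. Only a subset of the properties is covered, and the dict-based representation and the function name check_agent are assumptions for illustration.

```python
# Subset of the type-dependent cardinalities from the table above.
REQUIRED = {
    "schema:Organization": ["dct:identifier", "schema:name"],
    "schema:Person": ["schema:jobTitle", "schema:worksFor"],
}
FORBIDDEN = {
    "schema:Organization": ["schema:jobTitle", "schema:worksFor"],
    "schema:Person": ["schema:name", "ewig:login"],
}


def check_agent(agent: dict) -> list:
    """Return a list of problems found for one agent description."""
    types = [t for t in agent.get("rdf:type", []) if t in REQUIRED]
    if len(types) > 1:
        return ["agent is typed as both Person and Organization"]
    problems = []
    for agent_type in types:
        for prop in REQUIRED[agent_type]:
            if not agent.get(prop):
                problems.append(f"{agent_type} requires {prop}")
        for prop in FORBIDDEN[agent_type]:
            if agent.get(prop):
                problems.append(f"{agent_type} must not carry {prop}")
    return problems


# Hypothetical organization; the identifier value is invented.
organization = {
    "rdf:type": ["dct:Agent", "schema:Organization"],
    "dct:identifier": ["DE-0000"],
    "schema:name": ["Example Library"],
}
print(check_agent(organization))
```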
2.5. RightsStatement (ewig:RightsStatement)
RightsStatements MUST include a rights declaration, a rights holder if applicable, and licensing information (including PublicDomainMark). The semantics of accessRights are according to the submission agreement. If certain reuse restrictions cannot be expressed through rights and license alone, a human-readable legal note can be given in the description. Embargos are modelled through pcdmrights:rightsOverride.
If not explicitly stated, RightsStatements are inherited through the hierarchy.
RightsStatements take precedence over each other through the hierarchy bottom up (pcdm:File, InformationObject, InformationPackage), except for pcdmrights:rightsOverride.
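The inheritance and precedence rules above can be sketched as a lookup over the hierarchy. The text does not spell out how the rightsOverride exception interacts with several levels, so the sketch assumes that an unexpired override found anywhere in the chain wins over plain accessRights, searching bottom up. The function name and the dict-based statements are hypothetical.

```python
from datetime import datetime, timezone


def effective_access_scope(chain, now):
    """Resolve the access scope for a file.

    chain: rights statements ordered bottom up (file, object, package),
           with None where a level has no statement of its own.
    """
    # Assumption: an unexpired embargo at any level overrides accessRights.
    for statement in chain:
        if statement is None:
            continue
        override = statement.get("pcdmrights:rightsOverride")
        expires = statement.get("pcdmrights:rightsOverrideExpiration")
        if override and (expires is None or now < expires):
            return override
    # Otherwise the most specific statement wins; missing levels inherit.
    for statement in chain:
        if statement is not None:
            return statement["dct:accessRights"]
    raise ValueError("no rights statement found; packages must carry one")


package_rights = {
    "dct:accessRights": "public",
    "pcdmrights:rightsOverride": "institution",
    "pcdmrights:rightsOverrideExpiration": datetime(2030, 1, 1, tzinfo=timezone.utc),
}
chain = [None, {"dct:accessRights": "public"}, package_rights]
print(effective_access_scope(chain, datetime(2025, 1, 1, tzinfo=timezone.utc)))
```

During the embargo the example resolves to the embargo scope; after the expiration date the object-level accessRights applies again.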
2.5.1. Property Usage
Property | Expected Object Type (Range) | Cardinality | Scope Note |
rdf:type | dct:RightsStatement | 1 | |
dct:rights | URI (rightsstatements.org) | 1 | In Copyright, PublicDomain etc. |
dct:license | URI (creativecommons.org ...) | 0..1 | Usage permissions and restrictions (license information), or rights reserved. |
dct:accessRights | ewigvocab:rightsScope# | 1 | Potentially available to the public or restricted to access by the submitting institution. |
dct:rightsHolder | Literal (String) or ewig:Agent | 0..n | Owner (legal body) of intellectual property/data that is entitled to select license. |
pcdmrights:rightsOverride | ewigvocab:rightsScope# | 0..1 | Embargo scope, e.g. access rights “institution” until expiration. |
pcdmrights:rightsOverrideExpiration | xsd:dateTime | 0..1 if rightsOverride | Embargo end. |
skos:prefLabel | Literal (String) | 0..1 | Human readable (not necessarily understandable) expression for display. |
dct:description | Literal (String) | 0..1 if rights NoC-CR | Can be used to express restrictions in case of “No Copyright - Contractual Restrictions” or information/instructions on how to get permission in case of rights reserved. |
2.6. Contract (ewig:Contract)
A Contract MUST contain an identifier and information about the contract length and the size of storage (in bytes).
2.6.1. Property Usage
Property | Expected Object Type (Range) | Cardinality | Scope Note |
rdf:type | dct:Policy | 1 | |
dct:identifier | Literal (String) | 1 | Contract Number |
dct:contributor | ewig:Agent (Organization) | 1..n | Contracting party other than ZIB. |
schema:startDate | xsd:dateTime | 1 | |
schema:endDate | xsd:dateTime | 0..1 | Optional if open-ended. |
premis3:size | Literal (xsd:decimal) | 1 | Net storage allowance in bytes. -1 if unlimited. |
skos:prefLabel | Literal (String) | 0..1 | |
dct:description | Literal (String) | 0..1 | |
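The premis3:size convention above (-1 meaning unlimited) needs special handling wherever the allowance is compared with actual usage. A minimal sketch, assuming that usage is the sum of the premis3:size values of the packages already ingested under the contract; the function name fits_allowance is hypothetical.

```python
from decimal import Decimal

UNLIMITED = Decimal(-1)  # sentinel from the Contract table


def fits_allowance(allowance_bytes: Decimal,
                   used_bytes: Decimal,
                   package_bytes: Decimal) -> bool:
    """Check whether a further package fits into the contract allowance."""
    if allowance_bytes == UNLIMITED:
        return True
    return used_bytes + package_bytes <= allowance_bytes


print(fits_allowance(Decimal(1000), Decimal(900), Decimal(200)))
print(fits_allowance(UNLIMITED, Decimal(900), Decimal(200)))
```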
2.7. File (pcdm:File)
2.7.1. Property Usage
Property | Expected Object Type (Range) | Cardinality | Scope Note |
rdf:type | pcdm:File ewigvocab:use# | 1..n | Usage per pcdm:use# or ewigvocab:use# subclasses. |
skos:prefLabel | Literal (String) | 0..1 | |
dct:description | Literal (String) | 0..1 | |
premis: | ... |
2.8. Vocabularies
An ontology has been developed in RDF, RDFS and OWL to provide terms where no suitable term was available in an existing vocabulary.
2.8.1. ewigvocab:packagetype#
There are four types of Information Packages within EWIG: TransferAggregations, SIP, AIP, AIC. DIPs are not relevant for this data model.
Label | Scope Note |
TA | Transfer Aggregation |
SIP | Submission Information Package |
AIP | Archival Information Package |
AIC | Archival Information Collection |
2.8.2. ewigvocab:rightsScope#
Label | Scope Note |
public | Everyone/the public is allowed to access. |
institution | Only submitting institution is allowed to access. |
license | License determines access (open/closed). |
2.8.3. ewigvocab:stage#
Different stages an Information Package can pass through. Will be reported by the API.
Label | Scope Note |
quarantine | Information Package (TransferAggregation) has been (logically) created and is in the process of transferring to a storage area. Archive hasn’t done anything yet. |
pre-ingest | IP has been transferred successfully and is in the process of being prepared for ingest into the Archive. |
backlog | An SIP has been prepared for ingest and is waiting for Ingest. |
ingest | SIP is going through the ingest workflow. |
storage | An AIP has been created and stored. |
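The scope notes above suggest that a package moves through the stages in the order listed. Assuming that linear order (it is not stated explicitly), a stage progression could be sketched as follows; the function name next_stage is hypothetical.

```python
# Stage labels from ewigvocab:stage#, in the order of the table above.
# Assumption: packages pass through them linearly in this order.
STAGES = ["quarantine", "pre-ingest", "backlog", "ingest", "storage"]


def next_stage(current: str):
    """Return the stage following `current`, or None after the last stage."""
    index = STAGES.index(current)  # raises ValueError for unknown labels
    if index + 1 < len(STAGES):
        return STAGES[index + 1]
    return None


print(next_stage("backlog"))
```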
2.8.4. ewigvocab:status#
Status of Information Packages within the different stages. Semantics depend on stage. Will be reported by the API.
Label | Scope Note |
incomplete | Stage is unable to proceed due to incomplete data. |
success | Stage has been completed without errors. |
failed | Stage has been terminated due to unrecoverable errors. |
interrupted | Stage is halted for (manual) data checks. |
deleted | IP has been deleted. |
2.8.5. ewigvocab:use#
Label | Scope Note |
submissionDocumentation | Contextual information from the Producer. Not actively monitored within the LTDPS. |
intellectualEntity | Primary Content Information. Focus of Preservation Actions. |
preservationDescription | Preservation Description Information enabling Management and Preservation Watch and Actions. |
preservationDerivative | Normalized/migrated derivative as new preservation master file |
metadataContainer | Metadata container files |