Table 2 PAV provenance properties

From: PAV ontology: provenance, authoring and versioning

pav:createdBy

An agent primarily responsible for encoding the digital artifact or resource representation. This creation is distinct from forming the content, which is indicated with pav:contributedBy or its subproperties.

pav:createdBy is more specific than its superproperty dct:creator, which might or might not be interpreted to also cover the creation of the content of the artifact.

For instance, the author wrote ‘this species has bigger wings than normal’ in his log book. The curator, going through the log book and identifying important knowledge, formalizes this as ‘locus perculus has wingspan > 0.5 m’. The artifact creator enters this knowledge as a digital resource in the knowledge system, thus creating the digital artifact (say as JSON, RDF, XML or HTML).

A different example is a news article. pav:authoredBy indicates the journalist who wrote the article. pav:contributedBy can indicate the artist who added an illustration. pav:curatedBy can indicate the editor who made the article conform to the newspaper’s language style. pav:createdBy can indicate who put the article on the web site.

The software tool used by the creator to make the digital resource (say Protege, Wordpress or OpenOffice) can be indicated with pav:createdWith.
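The news article example above can be written down as a handful of subject-property-object statements. The sketch below does this in plain Python, assuming the PAV namespace `http://purl.org/pav/`. All `example.org` URIs and the helper names `pav` and `to_ntriples` are hypothetical and chosen only for illustration.

```python
# Minimal sketch: the news article example as RDF-style triples.
# All example.org URIs are hypothetical placeholders.
PAV = "http://purl.org/pav/"


def pav(term: str) -> str:
    """Expand a PAV local name such as 'createdBy' to a full URI."""
    return PAV + term


ARTICLE = "http://example.org/news/article-1"

article_triples = [
    (ARTICLE, pav("authoredBy"), "http://example.org/staff/journalist"),
    (ARTICLE, pav("contributedBy"), "http://example.org/staff/illustrator"),
    (ARTICLE, pav("curatedBy"), "http://example.org/staff/editor"),
    (ARTICLE, pav("createdBy"), "http://example.org/staff/web-publisher"),
    (ARTICLE, pav("createdWith"), "http://example.org/tools/wordpress"),
]


def to_ntriples(triples) -> str:
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(article_triples))
```

Each role gets its own statement, so the journalist, illustrator, editor and web publisher remain distinguishable instead of being collapsed into a single generic creator.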

pav:createdOn

The date of creation of the digital artifact or resource representation. The agents responsible can be indicated with pav:createdBy.

This property is normally used in a functional way, indicating the time of creation, although PAV does not formally restrict this. pav:lastUpdateOn can be used to indicate minor updates that did not affect the creation date.

pav:createdWith

The software/tool used by the creator (pav:createdBy) when making the digital resource, for instance a word processor or an annotation tool. A more independent software agent that creates the resource without direct interactions by a human creator should instead be indicated using pav:createdBy.

pav:createdAt

The geo-location of the agents when creating the resource (pav:createdBy). For instance, a photographer takes a picture of the Eiffel Tower while standing in front of it.

pav:retrievedFrom

The URI where a resource has been retrieved from. Retrieval indicates that this resource has the same representation as the original resource. If the resource has been somewhat transformed, pav:importedFrom should be used instead. This property is normally used in a functional way, although PAV does not formally restrict this.

pav:retrievedBy

An entity responsible for retrieving the data from an external source. The retrieving agent is usually a software entity, which has done the retrieval from the original source without performing any transcription.

Retrieval indicates that this resource has the same representation as the original resource. If the resource has been somewhat transformed, use pav:importedFrom instead.

pav:retrievedOn

The date the source for this resource was retrieved. This property is normally used in a functional way, although PAV does not formally restrict this.

pav:importedFrom

The original source of imported information. Import means that the content has been preserved, but transcribed somehow, for instance to fit a different representation model by converting formats. The imported resource does not have to be complete but should be consistent with the knowledge conveyed by the original resource.
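The choice between pav:retrievedFrom and pav:importedFrom can be illustrated with a small helper. This is only a sketch: it assumes that byte-for-byte equality (compared via SHA-256 digests) is an acceptable stand-in for "same representation", which is an assumption of this example and not something PAV prescribes. The function name `source_property` and the sample data are hypothetical.

```python
import hashlib


def source_property(original: bytes, stored: bytes) -> str:
    """Pick the PAV property linking a stored copy to its source.

    Assumption: identical bytes stand in for "same representation".
    If the bytes differ, the copy is treated as an import; whether the
    content was actually preserved cannot be checked here.
    """
    same = hashlib.sha256(original).digest() == hashlib.sha256(stored).digest()
    return "pav:retrievedFrom" if same else "pav:importedFrom"


# Hypothetical sample data: a CSV source and a JSON conversion of it.
csv_bytes = b"id,wingspan_m\n1,0.6\n"
json_bytes = b'[{"id": 1, "wingspan_m": 0.6}]'

print(source_property(csv_bytes, csv_bytes))   # unchanged copy
print(source_property(csv_bytes, json_bytes))  # converted format
```

A real pipeline would normally know whether it transformed the data and could record the property directly. The digest comparison is only useful when that information is unavailable.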

pav:importedBy

An agent responsible for importing data from a source given by pav:importedFrom. The importer is usually a software agent which has done the transcription from the original source. Note that pav:importedBy may overlap with pav:createdWith.

pav:importedOn

The date the resource was imported from a source given by pav:importedFrom. This property is normally used in a functional way, indicating the first import date, although PAV does not formally restrict this.

If the resource is later reimported, this should instead be indicated with pav:lastRefreshedOn.

pav:lastRefreshedOn

The date of the last import of the resource. This property is used if this version has been updated due to a re-import, rather than the import creating new resources related using pav:previousVersion.
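The interplay of pav:importedOn and pav:lastRefreshedOn can be sketched as follows, for the case where a re-import updates the existing resource and does not create a new version. The dictionary-based record and the function name `record_import` are hypothetical; a real system would more likely store these values as RDF literals.

```python
from datetime import datetime, timezone


def record_import(metadata: dict, when: datetime) -> dict:
    """Return a copy of metadata with the import date recorded.

    The first import sets pav:importedOn; every later re-import leaves
    it untouched and overwrites pav:lastRefreshedOn instead.
    """
    updated = dict(metadata)
    stamp = when.isoformat()
    if "pav:importedOn" not in updated:
        updated["pav:importedOn"] = stamp
    else:
        updated["pav:lastRefreshedOn"] = stamp
    return updated


# Hypothetical dates, for illustration only.
first = record_import({}, datetime(2013, 1, 1, tzinfo=timezone.utc))
second = record_import(first, datetime(2013, 6, 1, tzinfo=timezone.utc))
print(second)
```

Keeping pav:importedOn fixed is what allows it to be used in the functional way described above, with a single value per resource.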

pav:providedBy

The original provider of the encoded information (e.g. PubMed, UniProt, Science Commons).

The provider might not coincide with the dct:publisher, which would describe the current publisher of the resource. For instance if the resource was retrieved, imported or derived from a source, that source was published by the original provider. pav:providedBy provides a shortcut to indicate that original provider on the new resource.

pav:sourceAccessedAt

A source which was accessed or consulted (but not retrieved, imported or derived from). For instance, a curator (pav:curatedBy) might have consulted figures in a published paper to confirm that a dataset was correctly pav:importedFrom the paper’s supplementary CSV file.

Another example: I can access the page for tomorrow’s weather in Boston (http://www.weather.com/weather/tomorrow/Boston+MA+02143) and I can blog ‘tomorrow is going to be nice’. The source does not make any claims about the nice weather, that is my interpretation; therefore the blog post has pav:sourceAccessedAt the weather page.

pav:sourceAccessedBy

The agent who accessed the source given by pav:sourceAccessedAt.

pav:sourceAccessedOn

The date when the original source given by pav:sourceAccessedAt was accessed to create the resource.

For instance, if the source accessed described the weather forecast for the next day, the time of source access can be crucial information.

This property is normally used in a functional way, although PAV does not formally restrict this. If the source is subsequently checked again (say to verify validity), this should be indicated with pav:sourceLastAccessedOn.

pav:sourceLastAccessedOn

The date when the original source given by pav:sourceAccessedAt was last accessed and verified, especially when the source was already accessed (pav:sourceAccessedOn) when creating the resource. This property is normally used in a functional way, although PAV does not formally restrict this.

This property can be useful together with pav:lastRefreshedOn or pav:lastUpdateOn, but could also be used alone, for instance when a source was verified and no further action was taken for the resource.
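As a rough illustration of using pav:sourceLastAccessedOn alone, the sketch below takes the weather blog post from the pav:sourceAccessedAt example and records a later verification of the source without touching the post itself. The function names `new_post` and `verify_source`, the dictionary layout and the dates are all hypothetical.

```python
from datetime import datetime, timezone

WEATHER_PAGE = "http://www.weather.com/weather/tomorrow/Boston+MA+02143"


def new_post(source: str, accessed: datetime) -> dict:
    """Record the consulted source and when it was first accessed."""
    return {
        "pav:sourceAccessedAt": source,
        "pav:sourceAccessedOn": accessed.isoformat(),
    }


def verify_source(post: dict, checked: datetime) -> dict:
    """Record a later check of the source; nothing else is changed."""
    if "pav:sourceAccessedAt" not in post:
        raise ValueError("no accessed source recorded for this resource")
    updated = dict(post)
    updated["pav:sourceLastAccessedOn"] = checked.isoformat()
    return updated


# Hypothetical dates, for illustration only.
post = new_post(WEATHER_PAGE, datetime(2013, 5, 1, 9, 0, tzinfo=timezone.utc))
checked = verify_source(post, datetime(2013, 5, 1, 18, 0, tzinfo=timezone.utc))
print(checked)
```

The original access time is preserved, which matters for time-sensitive sources such as a forecast for the next day.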