The paper expands on a generic model of digital editing in which several layers of perception and documentation can be distinguished: a visual layer, a textual layer, and the content, i.e. the world represented by the text. As I have argued elsewhere (2014, 2015a, 2015b), all these levels can be important for work with historical accounting documents. For the visual and the textual layers, well-established technologies exist: digital imaging, Unicode, and XML markup, in particular following the guidelines of the TEI. Using the appropriate technology for each level realises the basic software development principle of separation of concerns. In the case of economic and accounting documents, it is the content layer which needs deeper discussion. A look at the technologies first considered for the digital representation of this information shows that the modern business world has developed a wide set: spreadsheet software formats, SQL, dedicated business software, etc. These technologies often violate the principle of separation of concerns, as they usually do not use a separate method for the formal representation of logical concepts. The W3C Semantic Web recommendation for a Resource Description Framework (RDF) helps to solve this problem, as it offers technologies dedicated to representing conceptualisations of “world facts”. The XML interoperability standard for business reporting, XBRL, adds the last facet in the separation of concerns necessary for encoding historical economic and accounting documents: it clearly distinguishes between basic business reporting concepts like monetary values and their categorisation. It offers a framework to describe these kinds of taxonomies, thus allowing historians to develop more than one model of the world represented in the text. Unfortunately, XBRL is so complex that taxonomies are usually built by highly paid consultants.
I thus conclude that we should follow the need to separate several concerns in editing and doing research with historical accounts by developing an RDF-based ontology of the economic facts that we think the monks, accountants, clerks, politicians, etc. had in mind when they wrote the documents we are studying.
2. Edition
[Diagram. Node labels: object, URI, scan, transcription, markup, ontology; text as trace, text as language, text as meaning, text as image. Relation labels: looks, is read as, talks about, is about.]
3. Edition
[Diagram. Node labels: object, URI, scan, transcription, markup, ontology; text as trace, text as language, text as meaning, text as image; Historians; People in the Past and their Actions. Relation labels: looks, is read as, talks about, is about.]
4. Separation of Concerns 1
• Visual
image
• Textual
transcript, textual markup
• Content
formal data representation
6. Digital technologies in accounting and economic reporting
• SQL
• Excel
• dedicated accounting software
– SAP, Oracle E-Business Suite, Sage, …
– Quicken, GnuCash, …
• XBRL
• W3C RDF Data Cube Vocabulary
7. W3C: RDF
(Resource Description Framework)
• Statements about resources in triples:
<resource> <predicate> <object> .
Gams:srbas.1536#bs_AllgemeinEmpfangen-10 bk:hasAccount bk:Income .
• Directed Graphs => networks
• Internationalized Resource Identifiers („IRI“) as globally unique identifiers
• RDFS (RDF Schema) allows class relationships
• OWL (Web Ontology Language) allows more
complex class definitions
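The triple model on this slide can be sketched in a few lines of plain Python. This is only an illustration of the data model, not an RDF library: prefixed names stand in for full IRIs, the first triple is the one from the slide, and the second triple (`bk:Account`, linked with `rdfs:subClassOf`) is an assumed class relationship added for the example.

```python
# Triples as (subject, predicate, object) tuples; prefixed names stand in for full IRIs.
triples = {
    # the statement shown on the slide
    ("Gams:srbas.1536#bs_AllgemeinEmpfangen-10", "bk:hasAccount", "bk:Income"),
    # hypothetical RDFS-style class relationship, assumed for illustration
    ("bk:Income", "rdfs:subClassOf", "bk:Account"),
}

def outgoing(graph, subject):
    """Return the labelled edges leaving one node of the directed graph."""
    return sorted((p, o) for s, p, o in graph if s == subject)

edges = outgoing(triples, "Gams:srbas.1536#bs_AllgemeinEmpfangen-10")
print(edges)
```

Because every statement has the same three-part shape, new kinds of statements can be added without changing a table layout, which is the practical sense in which the graph forms a network.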
8. Why use RDF?
• W3C standard
• Basis for the development of data
interoperability on the web, the so-called
“Semantic Web”
• Graph data model is easier to handle for
complex data structures
10. Looking for vineyard prices in the area of Devils Cliff?
• ?parcel <isNeighbourOf> <TheDevilsCliff> .
• ?person <sellsWhat> ?parcel .
• ?person <receives> ?amount .
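How the three patterns above are evaluated can be sketched with a tiny pattern matcher in Python. It is a simplified stand-in for a SPARQL engine, and all data (parcels, persons, amounts) are invented for illustration; only the predicate names and `TheDevilsCliff` come from the slide.

```python
def match(pattern, triple, binding):
    """Try to extend a variable binding so that the pattern equals the triple."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):          # variable: bind it or compare with earlier binding
            if new.setdefault(p, t) != t:
                return None
        elif p != t:                   # constant: must be identical
            return None
    return new

def query(graph, patterns):
    """Evaluate the patterns one after another, joining on shared variables."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in graph
                    if (b2 := match(pattern, t, b)) is not None]
    return bindings

# Invented sample data, not taken from any edited account.
graph = [
    ("parcel1", "isNeighbourOf", "TheDevilsCliff"),
    ("parcel2", "isNeighbourOf", "Riverside"),
    ("Hans", "sellsWhat", "parcel1"),
    ("Hans", "receives", "12 lb"),
    ("Anna", "sellsWhat", "parcel2"),
    ("Anna", "receives", "7 lb"),
]
patterns = [
    ("?parcel", "isNeighbourOf", "TheDevilsCliff"),
    ("?person", "sellsWhat", "?parcel"),
    ("?person", "receives", "?amount"),
]
results = query(graph, patterns)
print(results)
```

Note that the slide's patterns attach the amount to the person rather than to the sale; with several sales per person, a real model would probably need a transaction node that links seller, parcel and payment.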
11. RDF/SPARQL
• allows federation over different resources
• e.g.
– reusing currency conversion recorded in other
accounts
– reusing measurements and prices recorded in
other accounts
– reuse taxonomies of commodities / services
– aggregate information in different sources
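The reuse described above can be approximated in Python by merging two sets of triples and answering a question that neither source can answer alone. This is a sketch under stated assumptions: the `ex:` names, the units and the conversion rate are hypothetical, and a plain set union stands in for real SPARQL federation (which in SPARQL 1.1 is expressed with the `SERVICE` keyword).

```python
# Account A records an amount in some unit; all names and values are hypothetical.
account_a = {
    ("a:entry1", "ex:amount", 3),
    ("a:entry1", "ex:unit", "ex:UnitA"),
}
# Account B happens to record how UnitA converts to UnitB.
account_b = {
    ("ex:UnitA", "ex:inUnitB", 20),
}

def value_of(graph, subject, predicate):
    """Return the object of the first triple with this subject and predicate."""
    return next(o for s, p, o in graph if s == subject and p == predicate)

merged = account_a | account_b   # shared identifiers make the union meaningful

amount = value_of(merged, "a:entry1", "ex:amount")
unit = value_of(merged, "a:entry1", "ex:unit")
converted = amount * value_of(merged, unit, "ex:inUnitB")
print(converted)
```

The merge only works because both sources use the same identifier for the unit, which is the role globally unique IRIs play in RDF.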
14. XBRL
eXtensible Business Reporting Language
• Models the reporting, not the economic action itself
• Taxonomy of Concepts with constraints and
relationships
=> What kind of economic action is recorded is
defined by the taxonomy
• Facts (concept-value pairs in a context of time,
entity, dimension) are collected in an instance
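The structure of a fact as described above can be sketched with Python dataclasses. This is a simplified reading of the XBRL model, not its actual XML serialisation; the concept names, the entity and the values are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class Context:
    """Who reports, for which period, and under which optional dimensions."""
    entity: str
    period: str
    dimensions: tuple = ()

@dataclass(frozen=True)
class Fact:
    """A concept-value pair that is only meaningful inside a context."""
    concept: str
    value: Decimal
    context: Context

# Hypothetical instance: two facts reported by the same entity for the same year.
ctx = Context(entity="ex:SomeCity", period="1536")
instance = [
    Fact("ex:Income", Decimal("100"), ctx),
    Fact("ex:Expenditure", Decimal("80"), ctx),
]
by_concept = {fact.concept: fact.value for fact in instance}
print(by_concept)
```

What the concepts `ex:Income` and `ex:Expenditure` mean is not stated in the instance itself; that is the job of the taxonomy.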
15. Taxonomy building
e.g.
• @balance
– the reported fact can be integrated into a
credit/debit comparison
• <calculationArc>
– connects concepts as calculations: tests whether the target
concept is the result of a summation operation
• datatypes: monetaryItemType /
sharesItemType
– explicitly money-related / company-share-related
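The summation test performed by a calculation arc can be sketched as follows. The sketch assumes that each arc carries a weight used as a multiplier (typically +1 or -1), as XBRL calculation arcs do; the concept names and figures are hypothetical.

```python
from decimal import Decimal

# (parent concept, child concept, weight); hypothetical taxonomy fragment.
arcs = [
    ("ex:Balance", "ex:Income", Decimal("1")),
    ("ex:Balance", "ex:Expenditure", Decimal("-1")),
]
# Reported facts, keyed by concept; hypothetical values.
facts = {
    "ex:Income": Decimal("100"),
    "ex:Expenditure": Decimal("80"),
    "ex:Balance": Decimal("20"),
}

def check(total, arcs, facts):
    """Test whether the reported total equals the weighted sum of its children."""
    computed = sum(
        (weight * facts[child] for parent, child, weight in arcs if parent == total),
        Decimal("0"),
    )
    return computed == facts[total]

consistent = check("ex:Balance", arcs, facts)
print(consistent)
```

The arc only tests a reported value against the others; it does not compute a missing one, which suits historical accounts where the recorded sum and the arithmetic sum may differ.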
16. XBRL Global Ledger
• extensible?
“The steps involved in creating a public extension are as follows (note
that in the following xxx is the 3 character code for the module being
created, yyyy-mm-dd is the desired publication date for the module):
1. Select a palette taxonomy (gl-plt-2015-03-25.xsd) and the all the gl-xxx-content-2015-03-25.xsd content model declaration schemas
from one of the provided combinations, choosing the combination that most closely resembles the desired end product of the
exercise.
2. Create a new subdirectory of “plt” (called case-x-y-z where x, y and z, and additional letters if necessary, represent the modules
involved) and copy these files into this directory.
3. Create a new taxonomy representing the module you wish to create, add concept definitions and create the lang folder and the
linkbases. Note that complexType definitions must be defined as global complexTypes. If the tool has the capability, save the element
declarations and the complexType definitions into separate files in separate directories (i.e., as....xxxgl-xxx-yyyy-mm-dd.xsd and .gl-
xxx-content=yyyy-mm-dd.xsd), save the linkbase files into the directory ....xxxx and go to step 8, otherwise save them in the same file
.gl-xxx-yyy-mm-dd.xsd and proceed to perform steps 4-7 manually.
4. Separate the taxonomy into gl-xxx-yyyy-mm-dd.xsd and gl-xxx-content-yyyy-mm-dd.xsd with the former containing the element
declarations and the latter the content model declarations that are relevant to the new module.
5. Create the directory ....xxx and move gl-xxx-yyyy-mm-dd.xsd and the linkbase files into that directory.
6. Add an <include> statement in gl-xxx-content-yyyy.mm-dd.xsd to include gl-xxx-yyyy-mm-dd.xsd
7. Change all necessary relative paths in the linkbase files and the gl-xxx-yyyy-mm-dd.xsd schema file.
8. Edit each gl-xxx-content-yyyy-mm-dd.xsd in the palette directory as necessary to incorporate any concepts from the new module into
the appropriate content models (this will usually be content models for elements declared in gl-cor-content-yyyy-mm-dd.xsd).
9. Ensure that presentation links in the newly created presentation linkbase reflect the content model modifications made in step 8.”
http://www.xbrl.org/int/gl/2007-04-17/GLIS-REC-2007-04-17.htm
17. Historical vs. historians' accounting
Historical taxonomy
• „rubrics“
– Income and expenditure
– Territorial
– …
• „accounts“
– Company partners
– Counterparties
– …
=> Defined by legal, business
and social practice
Historians' taxonomy
• Costs by type of labor
(slave, contractual)
• Staple prices
• ...
=> Defined by research
question
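The point of keeping both taxonomies can be illustrated by classifying the same entries twice. In this Python sketch the entries, rubrics and research categories are all invented; it only shows that one set of recorded amounts can be aggregated under the historical rubric and, independently, under a category defined by a research question.

```python
from collections import defaultdict

# Invented entries with the rubric under which the source records them.
entries = [
    {"id": "e1", "rubric": "Expenditure", "amount": 5},
    {"id": "e2", "rubric": "Expenditure", "amount": 3},
    {"id": "e3", "rubric": "Income", "amount": 10},
]
# Invented assignment of the same entries to a historian's categories.
research_category = {
    "e1": "labour costs (contractual)",
    "e2": "staple prices",
    "e3": "staple prices",
}

def totals(entries, classify):
    """Sum the amounts per category, using any classification function."""
    result = defaultdict(int)
    for entry in entries:
        result[classify(entry)] += entry["amount"]
    return dict(result)

historical = totals(entries, lambda e: e["rubric"])
research = totals(entries, lambda e: research_category[e["id"]])
print(historical, research)
```

Neither classification changes the recorded amounts, so further research taxonomies can be added later without touching the edition of the source.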
18. Separation of Concerns
Conceptual
• „digital representation“ of the written document
– image
– transcription („text“)
– linguistic representation („text“)
• „facts“ represented
– in the contemporary taxonomy
– in a „modern“ taxonomy, i.e. by the historians (economic, socioeconomic, ...)
Technical
• „digital representation“: in image formats and TEI/XML
• „facts“: in RDF with ontologies to be developed, reusing basic concepts from XBRL/XBRL-GL?
You probably know this because I have talked about it several times.
This can be implemented with TEI->fs, explicit SemWeb/RDF extension of the TEI
Object => URI
-> Scan -> Coordinates
-> Transcription <- Coordinates
-> Ontology <- Words, Coordinates, Metadata
-> Metadata <- Transcription, Scan
And it is in the area of ontologies that the Digital Humanities still have the most catching up to do
The first two points have luckily established their standards, and in the case of images even james conversion.
So let's stick to the last point.
It has, from my point of view, two dimensions: (1) Which technology is best adapted to represent the content of accounts digitally? And
(2) which data schema should be used?
SQL: Structured Query Language
… I would suggest using a generic model: RDF
http://www.w3.org/TR/vocab-data-cube/
I must admit that this of course only works if wine prices are encoded...
http://linkeddata.uriburner.com/sparql?default-graph-uri=&query=PREFIX+gl-bus%3A+%3Chttp%3A%2F%2Fwww.xbrl.org%2Fint%2Fgl%2Fbus%2F2010-04-12%3E+PREFIX+bk%3A+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Frem%2Fbookkeeping%2F%3E+PREFIX+srbas%3A+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Fsrbas%2F%3E+PREFIX+g2o%3A%3Chttp%3A%2F%2Fgams.uni-graz.at%2Fonto%2F%23%3E+SELECT+%3Fdate+%3Fmenge+%3Fpreis+%3Finhalt+FROM+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Farchive%2Fobjects%2Fo%3Asrbas.1535%2Fdatastreams%2FRDF%2Fcontent%3E+FROM+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Farchive%2Fobjects%2Fo%3Asrbas.1536%2Fdatastreams%2FRDF%2Fcontent%3E+FROM+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Farchive%2Fobjects%2Fo%3Asrbas.1537%2Fdatastreams%2FRDF%2Fcontent%3E+FROM+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Farchive%2Fobjects%2Fo%3Asrbas.1539%2Fdatastreams%2FRDF%2Fcontent%3E+FROM+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Farchive%2Fobjects%2Fo%3Asrbas.1540%2Fdatastreams%2FRDF%2Fcontent%3E+FROM+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Farchive%2Fobjects%2Fo%3Asrbas.1541%2Fdatastreams%2FRDF%2Fcontent%3E+FROM+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Farchive%2Fobjects%2Fo%3Asrbas.1542%2Fdatastreams%2FRDF%2Fcontent%3E+FROM+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Farchive%2Fobjects%2Fo%3Asrbas.1543%2Fdatastreams%2FRDF%2Fcontent%3E+FROM+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Farchive%2Fobjects%2Fo%3Asrbas.1544%2Fdatastreams%2FRDF%2Fcontent%3E+FROM+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Farchive%2Fobjects%2Fo%3Asrbas.1545%2Fdatastreams%2FRDF%2Fcontent%3E+FROM+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Farchive%2Fobjects%2Fo%3Asrbas.1548%2Fdatastreams%2FRDF%2Fcontent%3E+FROM+%3Chttp%3A%2F%2Fgams.uni-graz.at%2Farchive%2Fobjects%2Fo%3Asrbas.3155%2Fdatastreams%2FONTOLOGY%2Fcontent%3E+WHERE+{+%3Fp+bk%3AmentionedAt+%3Fsource+%3B++++gl-bus%3AmeasurableQuantity+%3Fmenge+%3B++++gl-bus%3AmeasurableCostPerUnit+%3Fpreis+.+%3Fsource+bk%3Ainhalt+%3Finhalt+.+%3Fsource+g2o%3ApartOf+%3Fo+.+%3Fo+srbas%3Afrom+%3Fdate+.+}+ORDER+BY+%3Fdate&should-sponge=soft&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=30000000
From „Technology“ to „Taxonomy“
SAP export to XBRL: http://www.sap.com/solutions/sapbusinessobjects/large/enterprise-performance-management/xbrl-publishing/index.epx
balance: debit/credit-able
calculation: connects concepts as calculations (for testing only!); @weight acts as a multiplier in the calculation, e.g. +1 / -1