[Unclear] words are denoted in square brackets.
FAIR Data webinar series #3:
I for Interoperable – ANDS Webinar
13 September 2017
Video & slides available from ANDS website
START OF TRANSCRIPT
Keith Russell: My name's Keith Russell, I work for the Australian National Data
Service, I am your host for today. My colleague, Susannah Sabine, is
behind the scenes co-hosting the webinar with me. Just the usual
little bit of background, the Australian National Data Service works
with research organisations around Australia to establish trusted
partnerships, reliable services, and enhanced capability in
the research sector. We work together with two other NCRIS funded
projects - Research - RDS, Research Data Services, and Nectar - to
create an aligned set of joint investments to deliver transformation in
the research sector.
So this webinar is part of a series of activities we are undertaking
which aim to support the Australian research community in increasing
our ability to manage our research data as a national asset. So as I
mentioned earlier, this is the third in a series of webinars around FAIR.
So we've already had the webinars on findable and accessible, and
today interoperable, next week the reusable.
Transcript-FAIR-3-I-for-Interoperable-13-9-17 Page 2 of 14
So today I will give a brief introduction about what is interoperable as
described under the FAIR data principles in FORCE11. Then I'm very
grateful that Simon and Jonathan have - are available to talk about
what they did in practice in the OzNome project to make their data
interoperable. I think it's a great example to show how this quite
complex topic can actually be carried forward in practice.
So this is what FORCE11 says about interoperable, and first of all a
few things to keep in mind. So just reiterating a few things I
mentioned in the very first webinar. So when they talk about data,
and as you look at these headings you'll see that they talk about data
and metadata, so interoperable applies both to the metadata,
describing the data collection, and the actual data itself. Another point
to keep in mind is throughout the FAIR principles they think a lot
around not only data being usable for humans, but also for machines.
That provides huge benefits in bringing together disparate datasets, in
bringing together bits of knowledge that are distributed over different
datasets.
Interoperable is a key element there to make sure that data can be
brought together, and that we can actually get those
benefits out of bringing data together which will enable new
knowledge discovery, new relationships to be discovered, new
patterns to be recognised. All those pieces of work.
So as we look at these three headings that they have listed under
interoperable, the first one there is that data and metadata use a
formal accessible shared and broadly applicable language for
knowledge representation. The thing to keep in mind there is that not only for
you as the researcher that has created the data, but also for
another researcher that wants to understand the data and use the
data, it's useful that they understand the language you have used.
That that is a standardised language, something that other users can
also pick up and use. So ideally that is the case for the metadata -
sorry, that is definitely the case for the metadata, and ideally that
would also be used in the actual data itself.
A very basic example, if a researcher has observed that they saw a
magpie they can write in, I saw a magpie. But it's much more useful
for a researcher somewhere else on the other side of the world that
you write in that it's an Australian magpie and that is a Cracticus
tibicen. That means that a researcher on the other side of the world,
using a standard language, will actually be able to better
understand what you meant and what that description is about.
Now it's not just in the actual wording used, in the vocabulary used,
but it's also useful to have a framework around that which will
allow the data to also be machine readable and picked up by
machines and used and interpreted. Now one obvious example which
gets mentioned quite a lot is using RDF and ontologies. That is quite
common in the life sciences, and a number of life science researchers
were quite active in the FORCE11 group. But one thing they
emphasise is that it doesn't just have to be through RDF and
ontologies. There might be other solutions for this, and they don't
want to make it exclusively through those technologies. So that's
something to keep in mind.
Regarding the making of data interoperable, that's what I've invited
Simon and Jonathan to come and talk about, and they'll be able to talk
about it in much more detail.
The second point here is around vocabularies and using vocabularies.
They emphasise that if you use a vocabulary, well, first of all try and
use one that already exists, and is agreed on by the community. If
you have terms in there that are not in that vocabulary, but otherwise
it fits, try and get them added to that vocabulary. Finally, if that is not
possible, then, and only then, start creating your own vocabulary. So
please don't go out and create vocabularies for everything. Rather
look if there is already a community agreed vocabulary. Also make
sure that that vocabulary itself is FAIR. So findable, accessible,
interoperable, reusable.
So in your dataset you should have a reference to that vocabulary you
are referring to, and make sure that that vocabulary can be found just
as long as your dataset can also be found.
Final point they make is that the data and the metadata should include
qualified references to other data and metadata. So what they mean
there is that there shouldn't just be a reference to another dataset, for
example, but also an indication what that relationship is. So it's not
just it's related somehow to this other dataset, but perhaps it is a
subset of another dataset or it builds on another dataset using
standardised terminology.
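A minimal sketch of what a qualified reference could look like in a metadata record, in Python. The relation names (`isPartOf`, `isDerivedFrom`, `isSupplementTo`) and the example.org identifiers are illustrative assumptions, not terms quoted in the webinar; a real record would take its relation types from a community-agreed vocabulary.

```python
from dataclasses import dataclass

# Hypothetical relation types; a real record would draw these from a
# community-agreed vocabulary rather than a hard-coded set.
RELATION_TYPES = {"isPartOf", "isDerivedFrom", "isSupplementTo"}


@dataclass(frozen=True)
class QualifiedReference:
    relation: str  # what the relationship is
    target: str    # identifier of the other dataset

    def __post_init__(self):
        # Reject relations that are not in the agreed list.
        if self.relation not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.relation!r}")


def describe(ref: QualifiedReference) -> str:
    """Render the reference as a short human-readable statement."""
    return f"this dataset {ref.relation} {ref.target}"


# An unqualified reference only says "related somehow":
unqualified = "https://example.org/dataset/continental"

# A qualified reference also says how the two datasets are related:
qualified = QualifiedReference("isPartOf", "https://example.org/dataset/continental")
print(describe(qualified))
```

Because the relation is drawn from a fixed list, a machine can act on it (for example, follow every `isPartOf` link to find the parent collection) instead of guessing what "related" means.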
A little more on qualified references, from the perspective of the
metadata especially, it's valuable to not only refer to other players or
other elements around your dataset, but to do that using identifiers.
So for example, if you are describing your dataset and saying, well, it
was created - somebody was involved in creating that dataset.
Provide a qualified identifier that that person was, for example, the
author of that dataset, and if possible also use an identifier to identify
that person. That allows other relationships to be made, and it allows
further connections to be made, and that information to be picked up
and used, especially when being analysed by machines.
So just a list here of possible identifiers, these are just examples there
are more identifiers out there. But for example, if you are referring to
an author include their ORCID, if you are referring to a publication use
the DOI that is related to that publication. If you are referring to
software nowadays you can assign a DOI to a software package and
refer to that DOI, et cetera.
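As a rough illustration of referring to people, publications and software by identifier, the sketch below stores a role, an identifier scheme and an identifier for each related entity, then turns each into a resolvable link. The record layout and the identifier values are placeholders chosen for illustration (they are not from the webinar); only the `https://orcid.org/` and `https://doi.org/` resolver prefixes are real.

```python
# Resolver prefixes for two widely used identifier schemes.
RESOLVERS = {
    "orcid": "https://orcid.org/",
    "doi": "https://doi.org/",
}

# Hypothetical metadata record: every related entity carries its role,
# the identifier scheme, and the identifier itself (placeholder values).
related = [
    {"role": "author", "scheme": "orcid", "id": "0000-0002-1825-0097"},
    {"role": "publication", "scheme": "doi", "id": "10.5555/example-paper"},
    {"role": "software", "scheme": "doi", "id": "10.5555/example-software"},
]


def to_link(entry):
    """Build a resolvable URL from the scheme and the identifier."""
    return RESOLVERS[entry["scheme"]] + entry["id"]


# Map each role to the link a machine (or a person) can follow.
links = {entry["role"]: to_link(entry) for entry in related}
for role, link in links.items():
    print(role, link)
```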
Well, I think I've rambled on enough for now. So I would like to hand
over to Simon and Jonathan, I'm very grateful that they have made
their time available. So just a brief introduction. Simon is a research
scientist at CSIRO Land and Water's Environment Information
Systems research program. He specialises in distributed
architectures and information standards for environmental data
focusing on geosciences and water.
Jonathan Yu is a research computer scientist specialising in
information architectures, data integration, linked data, semantic web,
data analytics and visualisation. He's part of the Environmental
Informatics Group in CSIRO Land and Water.
So together they have been very active in applying their thinking
around making data interoperable in the OzNome project. Now one
thing I want to point out is that in the OzNome project they did a whole
series of work around the FAIR data principles in all different aspects.
Today I have asked them specially to focus on interoperable. But
please keep in mind that they have also done a whole bunch of other
work.
So without any further ado I'd like to hand over to Simon and
Jonathan. I'm very intrigued how they've picked up interoperability
and used that in the OzNome project.
Jonathan Yu: Okay, thanks Keith. So thanks for the introductions as well. So today
we'll be presenting on some of the work we did in the OzNome
initiative. Particularly looking at Land and Water and the data that we
have in CSIRO, and how to make that interoperable according to
some of the principles that FAIR espouses. We will also talk about
some of the implementations that we have explored to turn the FAIR
principles into actionable questions to address how FAIR your data is.
So if you haven't come across OzNome, this is a CSIRO-led initiative
aiming to connect information ecosystems throughout Australia. The
OzNome name was coined echoing the genome project. So Oz being
Australia, and the Nome being a genome kind of inspired project. But
really what we're looking at here is tools, services, products, methods,
approaches and practices, and infrastructure to support having more
connected information infrastructures.
In the previous year, as Keith mentioned, we focused on
environmental information infrastructures. There's a couple of links
there you can follow. Today we'll be talking about an example in the
water space.
Simon Cox: Okay, so as part of establishing the OzNome architecture, OzNome
infrastructure, we felt that we needed to assist potential data providers
to understand what good data was or, in the context of this seminar
series, what FAIR data is; we called it OzNome data. Basically we
developed a rating - a set of rating criteria and a tool to allow
assessment by data providers of the data that they are providing.
This is just on the right-hand side of the screen here, you can see a
screen capture of the sort of the kick-off page of the tool.
You'll also notice that we've got a slightly adapted version of the FAIR
criteria - findable, accessible, interoperable and reusable - but we also
add in the last line there, trusted. Which appears to go a little bit
beyond what has been conceived in FAIR until now, but we suggest
would be a useful addition. We're kind of bundling the interoperable
and reusable together, we see those as being very closely related.
Obviously, it's teasing out some of the issues around what it is that
makes data interoperable. Keith's given a sort of high level overview
and indicated what some of the concerns might be.
We've done our own take on this, a bit - actually fairly strongly leaning
on our experience over a number of years, more than a decade now
actually of working in the data standards communities, in particular the
geospatial data standards communities. Some of the learning which
we've got from there which we're applying directly in here. Obviously
environmental data, which is what we're largely - what our heritage is,
where we've largely been working. A lot of that is geospatial so it
makes sense to be building on that.
Just a bit of a reminder, the FORCE11 FAIR principles, this is a
summary slide from Michel Dumontier, who's one of the original
authors of the papers and the developers of the FAIR principles. They
got the guiding principles with the four key words, which are
teased out into three or four sub-principles in each case with the F-A-I
and R letters.
We're looking at the interoperable set here, which Keith has already
shown. It's interesting that Michel has recently done a study
evaluating a number of repositories, particularly in Europe and some
of them are broader than that, but here's the list of repositories that
were evaluated. He scored those on the FAIR principles. The data's
available in this form actually; this table shoots off to the right of the
screen and there's lots more going on there. But looking at the
summary of the results it's fairly notable that the tallest red bar here is
in the interoperable category. So what this is saying is, of the FAIR
data principles this is the one which is hardest to meet, the one that's
hardest to conform to.
So really that's the focus of the approach that we've taken, which is to
kind of lead people through how they can make their data more FAIR,
more OzNomic, more interoperable. The particular way in which
we've broken out the question of interoperability is on, if you look at
the numbered terms here, is it loadable, is it usable, is it
comprehensible, is it linked, as well as is it licenced.
I'm just going to go through some of the details of those, and you'll see
that in a sense it's fairly repetitive of some of the concerns that Keith
explained at the beginning. But we're putting some more concrete
examples onto these criteria just to indicate to our data providers that
when we say a standard data format we mean something like, CSV or
JSON or XML or netCDF. These are all important file formats towards
the left, and then they're kind of general, but netCDF is one that's
used a lot in the remote sensing and environmental science
communities.
So we've got a bit of a ladder here of different levels of conformance
which you can reach about whether a dataset would be loadable. Is it
in a unique file format? Well, that means that you've got to have some
unique software to load it. Or is it in a standard data format, and
normally that would be denoted by one of the standard MIME types.
Best of all would be for data to be provided in multiple standard
formats, giving a choice to the user so that whatever their favourite
platform for loading data they can use.
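The loadable ladder described here can be sketched as a small function. This is an illustration only: the numeric levels and the set of MIME types treated as "standard" are assumptions made for the example, not the scoring used by the OzNome tool.

```python
# Assumed set of standard MIME types for CSV, JSON, XML and netCDF.
STANDARD_MIME_TYPES = {
    "text/csv",
    "application/json",
    "application/xml",
    "application/x-netcdf",
}


def loadable_level(mime_types):
    """Return 0 for a unique format, 1 for one standard format,
    2 for multiple standard formats."""
    standard = set(mime_types) & STANDARD_MIME_TYPES
    if len(standard) >= 2:
        return 2
    if len(standard) == 1:
        return 1
    return 0


print(loadable_level(["application/x-my-lab-format"]))        # unique format
print(loadable_level(["text/csv"]))                           # one standard format
print(loadable_level(["text/csv", "application/x-netcdf"]))   # choice of formats
```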
Next question, even when you've loaded it can you use it? If it's - if
the structures within the dataset, even if it's loaded, if the structures
are unclear then it's not going to be very usable. That comes down to
the matter of, is there a schema that's provided which makes explicit
the structures within the datasets. A lot of sort of traditional data,
yeah, there's a structure in there but the schema's not available
independently of the data, if you like the schema is implicit. It's not
formalised. The schema maybe is different every time.
A lot of spreadsheets are done that way, a spreadsheet has got a lot
of boxes. But if every time you use it you add different columns and
use the pages in a spreadsheet in a different way, then it takes a little
while for the users to get their heads around what's going on before
they can use it. So there's various explicit schema languages like
DDL, which is loaded and used for relational systems, XML schema.
There's something coming out in the open knowledge world these
days called data packaging, which allows you essentially to describe a
schema for a CSV file. Then you've got in the RDF, the semantic web
space, RDFS and OWL. JSON even has a schema language these
days, although it's not broadly used.
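To make the difference between an implicit and an explicit schema concrete, here is a minimal sketch that loads a CSV file against a schema declared separately from the data. The column names, the sample values and the dictionary-based schema format are hypothetical; this is not the data packaging specification itself, only the same idea in plain Python.

```python
import csv
import io

# Explicit schema, kept apart from the data: column name -> type.
SCHEMA = {"station_id": str, "date": str, "e0_avg": float}


def load_with_schema(text, schema):
    """Parse CSV text, check the header against the schema, cast each value."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != list(schema):
        raise ValueError(f"header {reader.fieldnames} does not match schema")
    return [
        {name: cast(row[name]) for name, cast in schema.items()}
        for row in reader
    ]


sample = "station_id,date,e0_avg\nA1,2017-09-13,4.2\n"
rows = load_with_schema(sample, SCHEMA)
print(rows)
```

A file whose columns drift from the declared schema fails at load time, which is the spreadsheet problem described above made visible to the user straight away.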
So it's nice to provide data with a schema, but best of all would be to
say, the data I am using I am using this community schema. This
community, and for example the Open Geospatial Consortium
provides a number of community schemas for observations, for time
series, for hydrology, for geoscience. If you're publishing or
attempting to share data in any of these disciplines then best to go off
and find a community schema.
Then even when you've got it loaded and you understand what the
structures are you've still got the question about what the words and
numbers are inside the boxes. Are the column headings explicit enough
to understand, or are they just shorthand for something which the
project leader, when he or she was developing the data, knew that he
or she would understand the next week? Even he or she, if they came
back to it the next year, may not understand it. Better of course is if
the field labels are linked and do have explanations, probably in plain
text. Better still is to use standard labels, for example the Unified
Code for Units of Measure (UCUM) unit codes, or the Climate and
Forecast conventions coming out of the fluid earth science
community.
So the ladder that we've got here says, oh, you're using standard
labels. Is it just some of the field names are linked to standard
externally managed vocabularies, or are all the field names linked to
standard externally managed vocabularies? You get this ladder better
and better and better.
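A small sketch of that ladder for field names, assuming each field is recorded together with the vocabulary URI it is linked to (or `None` when it has no link). The field names, the example.org URIs and the level names are hypothetical and are not the wording of the OzNome tool.

```python
def label_level(field_links):
    """field_links maps a field name to a vocabulary URI, or None if unlinked."""
    linked = [name for name, uri in field_links.items() if uri]
    if not linked:
        return "no fields linked"
    if len(linked) < len(field_links):
        return "some fields linked"
    return "all fields linked"


# Hypothetical dataset where only one of two fields points at a vocabulary.
fields = {
    "e0_avg": "http://example.org/def/potential-evapotranspiration",
    "qf": None,
}
print(label_level(fields))
```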
Then the question about how well linked is your data? Well, if it's just
a file sitting on a server somewhere and there's no links in or out,
yeah, you're lucky to find it. What this community would be expecting
for most datasets is that they are indexed in a
catalogue or they are available from a landing page. That's the
situation where you've got inbound links to the dataset. Best of all is
when there are outbound links embedded or implicit in the data
structures in a dataset which says exactly how it's related. This links
in with some of the previous concerns that we had there about field
names and these kinds of things.
So I'm going to hand back to Jonathan to tease through a
case study that we've got here really based on the AWRA-L - the
Australian Water Resources Assessment datasets. So Jonathan.
Jonathan Yu: Yes, so as mentioned earlier, in the OzNome project we looked at a
practical example and a case study in the AWRA-L dataset. This is a
continental-scale dataset that has historical time series from 1911. The
Bureau published an operational version online, and you can find that
on their website. But often scientists have to basically deal with this
dataset by knowing where it is and knowing how to use it implicitly.
Knowing how to reference the requisite geospatial features and
understand the field name values.
So I've got an example in the - oh sorry, so the next slide shows the
assessment of it using our tool. Just focusing on the interoperable
side of things we have rated it as a web service, you can get it by the
web. However the reference definitions are text only, and they are
localised in the dataset itself. Now I'll give an example in the next
slide.
So this is coming out of the netCDF metadata for this dataset; you
can access this online through THREDDS or via netCDF
tools. But this is a summary of the metadata that comes along with
the data. So we've got long name here, Potential evapotranspiration,
we've got the name which is a label for the field, e0_avg. Units, mm,
and a standard name which is a convention in netCDF to refer to the
actual - to a property which is e0_avg, which in this case isn't part of
the CF conventions that's often used with this format.
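The metadata just described can be written down as a plain dictionary of variable attributes, using the values read out from the slide. The check below compares the `standard_name` against a tiny illustrative subset of CF standard names; that subset and the helper function are assumptions for the example, so a `False` here only means "not in this subset", not a full CF lookup.

```python
# Variable attributes as read out from the slide.
variable = {
    "name": "e0_avg",
    "long_name": "Potential evapotranspiration",
    "units": "mm",
    "standard_name": "e0_avg",
}

# Tiny illustrative subset of CF standard names (the real table is much longer).
CF_STANDARD_NAMES_SUBSET = {"air_temperature", "precipitation_flux"}


def uses_known_standard_name(attrs, known_names):
    """True if the variable's standard_name appears in the given name table."""
    return attrs.get("standard_name") in known_names


# The local label e0_avg is not a community-defined name, so this prints False.
print(uses_known_standard_name(variable, CF_STANDARD_NAMES_SUBSET))
```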
So if you are an expert in this area and you've used this dataset many
times you will know what this is. If you are a newcomer you have to
do a lot of work to - well, a little bit of work to understand what actually
this data field means.
In the OzNome project what we did was enrich this with external
variables. So if you go to the next slide Simon, so this is the same
field. We've added - these added lines at the bottom here, they tease
out what this particular data field means in the context of externally
defined vocabularies. So we've now enriched this with a scaled
quantity kind identifier, Potential evapotranspiration. It's an http URI
where you can resolve it and get a definition. So similarly for
substance or taxon, unit ID and feature of interest.
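A hedged sketch of this kind of enrichment: the original attributes are kept, and extra attributes holding resolvable URIs are added alongside them. The attribute keys and the example.org URIs are placeholders invented for the example; the identifiers actually used in the OzNome work are not reproduced in this transcript.

```python
# Original attributes for the field, as read out from the slide.
variable = {
    "name": "e0_avg",
    "long_name": "Potential evapotranspiration",
    "units": "mm",
    "standard_name": "e0_avg",
}

# Hypothetical enrichment: keys and URIs are placeholders only.
enrichment = {
    "scaled_quantity_kind_id": "http://example.org/def/potential-evapotranspiration",
    "substance_or_taxon_id": "http://example.org/def/water",
    "unit_id": "http://example.org/def/unit/millimetre",
    "feature_of_interest_id": "http://example.org/def/feature/land-surface",
}

# The enriched record keeps every original attribute and adds the links.
enriched = {**variable, **enrichment}


def outbound_links(attrs):
    """Return the attributes whose values are http(s) URIs."""
    return {
        key: value
        for key, value in attrs.items()
        if isinstance(value, str) and value.startswith(("http://", "https://"))
    }


print(len(outbound_links(variable)), len(outbound_links(enriched)))
```

The original record has no outbound links, while the enriched one has four, each of which a newcomer or a machine could resolve to a definition.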
We'll just talk about what they are. So this is what - a part of the
project was to explore, could we define vocabularies for these from
which we could reference outbound links from the data to the
definition. This is just a summary of what we did in the context of the
AWRA-L dataset. This is an example of potential evapotranspiration.
We've got a conceptual model here where we've got broader notions
of potential evapotranspiration. We've got linked relationships out to
things like feature of interest, object of interest, and unit of measure.
So this view provides a vocabulary entry for potential
evapotranspiration, not only the identifier for it, not only the description
for it, but a richer model than you would get from if you just had
something inline. So you've got outbound relationships from this
concept to its related concepts essentially. So this is a demonstration
of defining the concepts externally, having them quite richly explained
through this medium, but having the ability to link that from the dataset
itself to this definition to make it more interoperable.
So that if we have another dataset that talked about potential
evapotranspiration it could potentially be linked and interoperable. With a
revised OzNome maturity estimation using the OzNome five-star tool,
and just focusing on the interoperable field, we see that, using
the same tool and assessing it based on the criteria, we've gone up
from two stars to more than four stars in the interoperable space. The
reason for that is that we now have reference definitions as linked
data and externally hosted observed property vocabulary definitions.
Rather than just inline labels of what it is.
It provides more interoperability and if the vocabulary was
standardised then we would have a higher estimation in that field. But
it's just a demonstration of how we went about making something
more interoperable through the OzNome project.
Simon Cox: Yeah, I'll just pick up at the end here and just comment that when we
were starting this data ratings exercise we actually didn't look at FAIR
at the beginning. We developed our own set of criteria, these key
words here, and then subsequently correlated them with the FAIR
principles. One of the interesting things was there were three lines in
this table here, the ones in red, which didn't correlate with concerns
that had been identified within FAIR.
The first one might be seen as trivial, but we thought it was a question
that was worth asking, particularly when working with research
scientists and talking about making their data available, which was the
question about, the first question, is your data intended to be used by
anybody else? There's lots of data generated which is never shared.
Now that's not necessarily a good thing, and to a certain extent having
the question there highlights the fact that there is a question to be
asked and that some scientists, researchers, need to be
encouraged to think about making their data available, about
publishing it.
So I think in terms of the FAIR principles this one was kind of the
implicit starting point. If it's published, yes, it's implicitly FAIR.
A couple of other rows, one concern which comes up, particularly
we've worked a lot with agencies that have sort of systematic data
collection processes with systematic curation and maintenance
revisiting. A dataset is refreshed every day or every month or every
year, all that. That concern didn't seem to be particularly addressed in
the FAIR principles as they stand. So we'd say the concern about
whether the data is expected to be updated and maintained is
maybe a bit more than FAIR.
The bottom row there as well is, if you
like, an elaboration of the assessment of data that you might do,
which is to get some information about how well trusted it is. Now a
lot of that is about who else is using it, how much it's - well, that's often
the criteria you'll use. Who else is using it, how many times is it being
used, what other products have been generated from this dataset and
so can I trust it?
So just emphasising that row there is the interoperable, it corresponds
with the interoperability which is what we've really been focusing on
today. The use of standards I guess. Standards is a funny word, you
have to be a bit careful with it. Capital S standard, sometimes people
think that's just to do with ISO or Australian Standards or whatever.
Really the point of standards is that they are community
agreements. They are community agreements which are available for
additional members of the community to join in. But it's important to
think of them as agreements - agreements to do things in a common
way.
So finally just a slide with some links to some of the material that
we've been showing today. We'll say thank you for listening.
Keith Russell: Thank you Simon, thank you Jonathan. That was really interesting
and a really useful way to see what it actually means in practice.
Because I think interoperable can be quite a complex difficult subject,
sometimes also one that requires much more knowledge of the actual
field of research that you're talking about. So I think this
is a great example of where you've been working in a specific field to
try and make that data more interoperable.
Thanks very much for your time, and this is a really interesting
discussion and really starting to tease out a number of the issues, and
a number of the things that probably will need developing further.
I've just put up a slide which links off to a number of resources, and
some of these Simon already mentioned. So ANDS has a service,
Research Vocabularies Australia, which anybody around the country -
or actually internationally also - can use if you don't have your own
tool to set up a vocabulary. That is a possible way of doing it. There
are also already existing vocabularies in there. So have a look at that
if that's of interest. We also have an interest group that works in this
space.
If you are looking at the metadata and having qualified relationships
within the metadata and using identifiers, there's a few links there to
places where you can find information about possible identifiers. We
are also trying to pull that metadata, describing datasets, together and
sharing that internationally through a number of hubs. That's taking
place through the Scholix project. Research Data Australia is sort
of an Australian hub contributing into that international hub -
international effort. So have a look there if you're interested.
We did 23 research starter things last year, and two of the things are
relevant for our discussion today. If you are interested in digging into
it a little further and discovering a little bit more about it, and
discovering what the vocabularies mean in practice, have a go at
Thing 12. Or if you are more interested in the identifiers and linked data
have a look at Thing 14.
Finally I would like to first of all thank Simon and Jonathan again for their
time and for the excellent presentation and the insights that they
brought to the table. Finally we would like to acknowledge NCRIS, the
National Collaborative Research Infrastructure Strategy Program that provides
the funding for ANDS.
So thanks again and look forward to seeing you all next week.
END OF TRANSCRIPT

Transcript _Rise of drones in Australian research spaceTranscript _Rise of drones in Australian research space
Transcript _Rise of drones in Australian research space
 
Metaphic or the art of looking another way.
Metaphic or the art of looking another way.Metaphic or the art of looking another way.
Metaphic or the art of looking another way.
 
Transcript - Tracking Research Data Footprints via Integration with Research ...
Transcript - Tracking Research Data Footprints via Integration with Research ...Transcript - Tracking Research Data Footprints via Integration with Research ...
Transcript - Tracking Research Data Footprints via Integration with Research ...
 
2014 11-17 crichton institute talk on open data
2014 11-17 crichton institute talk on open data2014 11-17 crichton institute talk on open data
2014 11-17 crichton institute talk on open data
 
Information Retrieval thru Cellular Devices
Information Retrieval thru Cellular DevicesInformation Retrieval thru Cellular Devices
Information Retrieval thru Cellular Devices
 
Database Essay
Database EssayDatabase Essay
Database Essay
 
The OpenCon Intro to Open Data
The OpenCon Intro to Open DataThe OpenCon Intro to Open Data
The OpenCon Intro to Open Data
 
What are the FAIR data principles?
What are the FAIR data principles?What are the FAIR data principles?
What are the FAIR data principles?
 
Ands National Identifier Solution
Ands National Identifier SolutionAnds National Identifier Solution
Ands National Identifier Solution
 
20170313 mr - gss presentation
20170313   mr - gss presentation20170313   mr - gss presentation
20170313 mr - gss presentation
 
Spark Social Media
Spark Social Media Spark Social Media
Spark Social Media
 
Research proposal attic media
Research proposal attic mediaResearch proposal attic media
Research proposal attic media
 
Research Proposal Attic Media
Research Proposal Attic MediaResearch Proposal Attic Media
Research Proposal Attic Media
 
Research Proposal Attic Media
Research Proposal  Attic  MediaResearch Proposal  Attic  Media
Research Proposal Attic Media
 
Research Proposal Attic Media
Research Proposal Attic MediaResearch Proposal Attic Media
Research Proposal Attic Media
 
Research Proposal Attic Media
Research Proposal Attic MediaResearch Proposal Attic Media
Research Proposal Attic Media
 
Fair - Interoperability - Keith Russell
Fair  - Interoperability - Keith RussellFair  - Interoperability - Keith Russell
Fair - Interoperability - Keith Russell
 
Text Analytics for Dummies 2010
Text Analytics for Dummies 2010Text Analytics for Dummies 2010
Text Analytics for Dummies 2010
 

Mehr von ARDC

Introduction to ADA
Introduction to ADAIntroduction to ADA
Introduction to ADAARDC
 
Architecture and Standards
Architecture and StandardsArchitecture and Standards
Architecture and StandardsARDC
 
Data Sharing and Release Legislation
Data Sharing and Release Legislation   Data Sharing and Release Legislation
Data Sharing and Release Legislation ARDC
 
Australian Dementia Network (ADNet)
Australian Dementia Network (ADNet)Australian Dementia Network (ADNet)
Australian Dementia Network (ADNet)ARDC
 
Investigator-initiated clinical trials: a community perspective
Investigator-initiated clinical trials: a community perspectiveInvestigator-initiated clinical trials: a community perspective
Investigator-initiated clinical trials: a community perspectiveARDC
 
NCRIS and the health domain
NCRIS and the health domainNCRIS and the health domain
NCRIS and the health domainARDC
 
International perspective for sharing publicly funded medical research data
International perspective for sharing publicly funded medical research dataInternational perspective for sharing publicly funded medical research data
International perspective for sharing publicly funded medical research dataARDC
 
Clinical trials data sharing
Clinical trials data sharingClinical trials data sharing
Clinical trials data sharingARDC
 
Clinical trials and cohort studies
Clinical trials and cohort studiesClinical trials and cohort studies
Clinical trials and cohort studiesARDC
 
Introduction to vision and scope
Introduction to vision and scopeIntroduction to vision and scope
Introduction to vision and scopeARDC
 
FAIR for the future: embracing all things data
FAIR for the future: embracing all things dataFAIR for the future: embracing all things data
FAIR for the future: embracing all things dataARDC
 
ARDC 2018 state engagements - Nov-Dec 2018 - Slides - Ian Duncan
ARDC 2018 state engagements - Nov-Dec 2018 - Slides - Ian DuncanARDC 2018 state engagements - Nov-Dec 2018 - Slides - Ian Duncan
ARDC 2018 state engagements - Nov-Dec 2018 - Slides - Ian DuncanARDC
 
Skilling-up-in-research-data-management-20181128
Skilling-up-in-research-data-management-20181128Skilling-up-in-research-data-management-20181128
Skilling-up-in-research-data-management-20181128ARDC
 
Research data management and sharing of medical data
Research data management and sharing of medical dataResearch data management and sharing of medical data
Research data management and sharing of medical dataARDC
 
Findable, Accessible, Interoperable and Reusable (FAIR) data
Findable, Accessible, Interoperable and Reusable (FAIR) dataFindable, Accessible, Interoperable and Reusable (FAIR) data
Findable, Accessible, Interoperable and Reusable (FAIR) dataARDC
 
Applying FAIR principles to linked datasets: Opportunities and Challenges
Applying FAIR principles to linked datasets: Opportunities and ChallengesApplying FAIR principles to linked datasets: Opportunities and Challenges
Applying FAIR principles to linked datasets: Opportunities and ChallengesARDC
 
How to make your data count webinar, 26 Nov 2018
How to make your data count webinar, 26 Nov 2018How to make your data count webinar, 26 Nov 2018
How to make your data count webinar, 26 Nov 2018ARDC
 
Ready, Set, Go! Join the Top 10 FAIR Data Things Global Sprint
Ready, Set, Go! Join the Top 10 FAIR Data Things Global SprintReady, Set, Go! Join the Top 10 FAIR Data Things Global Sprint
Ready, Set, Go! Join the Top 10 FAIR Data Things Global SprintARDC
 
How FAIR is your data? Copyright, licensing and reuse of data
How FAIR is your data? Copyright, licensing and reuse of dataHow FAIR is your data? Copyright, licensing and reuse of data
How FAIR is your data? Copyright, licensing and reuse of dataARDC
 
Peter neish DMPs BoF eResearch 2018
Peter neish DMPs BoF eResearch 2018Peter neish DMPs BoF eResearch 2018
Peter neish DMPs BoF eResearch 2018ARDC
 

Mehr von ARDC (20)

Introduction to ADA
Introduction to ADAIntroduction to ADA
Introduction to ADA
 
Architecture and Standards
Architecture and StandardsArchitecture and Standards
Architecture and Standards
 
Data Sharing and Release Legislation
Data Sharing and Release Legislation   Data Sharing and Release Legislation
Data Sharing and Release Legislation
 
Australian Dementia Network (ADNet)
Australian Dementia Network (ADNet)Australian Dementia Network (ADNet)
Australian Dementia Network (ADNet)
 
Investigator-initiated clinical trials: a community perspective
Investigator-initiated clinical trials: a community perspectiveInvestigator-initiated clinical trials: a community perspective
Investigator-initiated clinical trials: a community perspective
 
NCRIS and the health domain
NCRIS and the health domainNCRIS and the health domain
NCRIS and the health domain
 
International perspective for sharing publicly funded medical research data
International perspective for sharing publicly funded medical research dataInternational perspective for sharing publicly funded medical research data
International perspective for sharing publicly funded medical research data
 
Clinical trials data sharing
Clinical trials data sharingClinical trials data sharing
Clinical trials data sharing
 
Clinical trials and cohort studies
Clinical trials and cohort studiesClinical trials and cohort studies
Clinical trials and cohort studies
 
Introduction to vision and scope
Introduction to vision and scopeIntroduction to vision and scope
Introduction to vision and scope
 
FAIR for the future: embracing all things data
FAIR for the future: embracing all things dataFAIR for the future: embracing all things data
FAIR for the future: embracing all things data
 
ARDC 2018 state engagements - Nov-Dec 2018 - Slides - Ian Duncan
ARDC 2018 state engagements - Nov-Dec 2018 - Slides - Ian DuncanARDC 2018 state engagements - Nov-Dec 2018 - Slides - Ian Duncan
ARDC 2018 state engagements - Nov-Dec 2018 - Slides - Ian Duncan
 
Skilling-up-in-research-data-management-20181128
Skilling-up-in-research-data-management-20181128Skilling-up-in-research-data-management-20181128
Skilling-up-in-research-data-management-20181128
 
Research data management and sharing of medical data
Research data management and sharing of medical dataResearch data management and sharing of medical data
Research data management and sharing of medical data
 
Findable, Accessible, Interoperable and Reusable (FAIR) data
Findable, Accessible, Interoperable and Reusable (FAIR) dataFindable, Accessible, Interoperable and Reusable (FAIR) data
Findable, Accessible, Interoperable and Reusable (FAIR) data
 
Applying FAIR principles to linked datasets: Opportunities and Challenges
Applying FAIR principles to linked datasets: Opportunities and ChallengesApplying FAIR principles to linked datasets: Opportunities and Challenges
Applying FAIR principles to linked datasets: Opportunities and Challenges
 
How to make your data count webinar, 26 Nov 2018
How to make your data count webinar, 26 Nov 2018How to make your data count webinar, 26 Nov 2018
How to make your data count webinar, 26 Nov 2018
 
Ready, Set, Go! Join the Top 10 FAIR Data Things Global Sprint
Ready, Set, Go! Join the Top 10 FAIR Data Things Global SprintReady, Set, Go! Join the Top 10 FAIR Data Things Global Sprint
Ready, Set, Go! Join the Top 10 FAIR Data Things Global Sprint
 
How FAIR is your data? Copyright, licensing and reuse of data
How FAIR is your data? Copyright, licensing and reuse of dataHow FAIR is your data? Copyright, licensing and reuse of data
How FAIR is your data? Copyright, licensing and reuse of data
 
Peter neish DMPs BoF eResearch 2018
Peter neish DMPs BoF eResearch 2018Peter neish DMPs BoF eResearch 2018
Peter neish DMPs BoF eResearch 2018
 

Kürzlich hochgeladen

An Overview of the Calendar App in Odoo 17 ERP
An Overview of the Calendar App in Odoo 17 ERPAn Overview of the Calendar App in Odoo 17 ERP
An Overview of the Calendar App in Odoo 17 ERPCeline George
 
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...Osopher
 
ARTERIAL BLOOD GAS ANALYSIS........pptx
ARTERIAL BLOOD  GAS ANALYSIS........pptxARTERIAL BLOOD  GAS ANALYSIS........pptx
ARTERIAL BLOOD GAS ANALYSIS........pptxAneriPatwari
 
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...Association for Project Management
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptxmary850239
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDhatriParmar
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17Celine George
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdfMr Bounab Samir
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQuiz Club NITW
 
Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Celine George
 
Comparative Literature in India by Amiya dev.pptx
Comparative Literature in India by Amiya dev.pptxComparative Literature in India by Amiya dev.pptx
Comparative Literature in India by Amiya dev.pptxAvaniJani1
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseCeline George
 
How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17Celine George
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research DiscourseAnita GoswamiGiri
 
Indexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdfIndexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdfChristalin Nelson
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfPrerana Jadhav
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...Nguyen Thanh Tu Collection
 

Kürzlich hochgeladen (20)

An Overview of the Calendar App in Odoo 17 ERP
An Overview of the Calendar App in Odoo 17 ERPAn Overview of the Calendar App in Odoo 17 ERP
An Overview of the Calendar App in Odoo 17 ERP
 
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...
 
ARTERIAL BLOOD GAS ANALYSIS........pptx
ARTERIAL BLOOD  GAS ANALYSIS........pptxARTERIAL BLOOD  GAS ANALYSIS........pptx
ARTERIAL BLOOD GAS ANALYSIS........pptx
 
Spearman's correlation,Formula,Advantages,
Spearman's correlation,Formula,Advantages,Spearman's correlation,Formula,Advantages,
Spearman's correlation,Formula,Advantages,
 
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdf
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
 
Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17
 
Introduction to Research ,Need for research, Need for design of Experiments, ...
Introduction to Research ,Need for research, Need for design of Experiments, ...Introduction to Research ,Need for research, Need for design of Experiments, ...
Introduction to Research ,Need for research, Need for design of Experiments, ...
 
Comparative Literature in India by Amiya dev.pptx
Comparative Literature in India by Amiya dev.pptxComparative Literature in India by Amiya dev.pptx
Comparative Literature in India by Amiya dev.pptx
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 Database
 
How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research Discourse
 
Indexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdfIndexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdf
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of EngineeringFaculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...
 

Transcript FAIR 3 -I-for-interoperable-13-9-17

[Unclear] words are denoted in square brackets.
FAIR Data webinar series #3:
I for Interoperable – ANDS Webinar
13 September 2017
Video & slides available from ANDS website
START OF TRANSCRIPT

Keith Russell: My name's Keith Russell, I work for the Australian National Data Service, I am your host for today. My colleague, Susannah Sabine, is behind the scenes co-hosting the webinar with me. Just a usual little bit of background, the Australian National Data Service works with research organisations around Australia to establish - or have them trusted partnerships, reliable services, and enhance capability in the research sector. We work together with two other NCRIS funded projects - Research - RDS, Research Data Services, and Nectar - to create an aligned set of joint investments to deliver transformation in the research sector.

So this webinar is part of a series of activities we are undertaking to - which aim to support the Australian research community in increasing our ability to manage our research data as a national asset. So as I mentioned earlier, this is the third in a series of webinars around FAIR. So we've already had the webinars on findable and accessible, and today interoperable, next week the reusable.
So today I will give a brief introduction about what is interoperable as described under the FAIR data principles in FORCE11. Then I'm very grateful that Simon and Jonathan have - are available to talk about what they did in practice in the OzNome project to make their data interoperable. I think it's a great example to show how this quite complex topic can actually be carried forward in practice.

So this is what FORCE11 says about interoperable, and first of all a few things to keep in mind. So just reiterating a few things I mentioned in the very first webinar. So when they talk about data, and as you look at these headings you'll see that they talk about data and metadata, so interoperable applies both to the metadata, describing the data collection, and the actual data itself.

Another point to keep in mind is throughout the FAIR principles they think a lot around not only data being usable for humans, but also for machines. That provides huge benefits in bringing together disparate datasets, in bringing together bits of knowledge that are distributed over different datasets. Interoperable is a key element there to make sure that data can be brought together, and actually can be - you can - we can get those benefits out of bringing data together which will enable new knowledge discovery, new relationships to be discovered, new patterns to be recognised. All those pieces of work.

So as we look at these three headings that they have listed under interoperable, the first one there is that data and metadata use a formal, accessible, shared and broadly applicable language for knowledge representation. To keep in mind there is that not only for you as the - or the researcher that has created the data, but also for another researcher that wants to understand the data and use the data, it's useful that they understand the language you have used.
That that is a standardised language, something that other users can also pick up and use. So ideally that is the case for the metadata - sorry, that is definitely the case for the metadata, and ideally that would also be used in the actual data itself. A very basic example, if a researcher has observed that they saw a magpie they can write in, I saw a magpie. But it's much more useful for a researcher somewhere else on the other side of the world that you write in that it's an Australian magpie and that is a Cracticus tibicen. That means that a researcher on the other side of the world has - using a standard language will actually be able to better understand what you meant and what that description is about.

Now it's not just in the actual wording used, in the vocabulary used, but it's also in - it's useful to have a framework around that which will allow the data to also be machine readable and picked up by machines and used and interpreted. Now one obvious example which gets mentioned quite a lot is using RDF and ontologies. That is quite common in the life sciences, and a number of life science researchers and that were quite active in the FORCE11 group. But one thing they emphasise is that it doesn't just have to be through RDF and ontologies. There might be other solutions for this, and they don't want to make it exclusively through those technologies. So that's something to keep in mind. Regarding the making of data interoperable, that's what I've invited Simon and Jonathan to come and talk about, and they'll be able to talk about it in much more detail.

The second point here is around vocabularies and using vocabularies. They emphasise that if you use a vocabulary, well, first of all try and use one that already exists, and is agreed on by the community. If you have terms in there that are not in that vocabulary, but otherwise it fits, try and get them added to that vocabulary. Finally, if that is not possible, then, and only then, start creating your own vocabulary.
So please don't go out and create vocabularies for everything. Rather look if there is already a community agreed vocabulary. Also make sure that that vocabulary itself is FAIR. So findable, accessible, interoperable, reusable. So in your dataset you should have a reference to that vocabulary you are referring to, and make sure that that vocabulary can be found just as long as your dataset can also be found.

Final point they make is that the data and the metadata should include qualified references to other data and metadata. So what they mean there is that it shouldn't just be a reference to another dataset, for example, but also an indication what that relationship is. So it's not just it's related somehow to this other dataset, but perhaps it is a subset of another dataset or it builds on another dataset using standardised terminology.

A little more on qualified references, from the perspective of the metadata especially, it's valuable to not only refer to other players or other elements around your dataset, but to do that using identifiers. So for example, if you are describing your dataset and saying, well, it was created - somebody was involved in creating that dataset. Provide a qualified identifier that that person was, for example, the author of that dataset, and if possible also use an identifier to identify that person. That allows other relationships to be made, and it allows further connections to be made, and that information to be picked up and used especially in machine - when being analysed by machines.

So just a list here of possible identifiers, these are just examples, there are more identifiers out there. But for example, if you are referring to an author include their ORCID, if you are referring to a publication use the DOI that is related to that publication. If you are referring to software nowadays you can assign a DOI to a software package and refer to that DOI, et cetera. Well, I think I've rambled on enough for now.
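As an illustration of the qualified references described above (this sketch is not from the webinar), a metadata record might pair every related identifier with a statement of what the relationship is. The field names, relation terms and identifier values below are all hypothetical placeholders, not a real schema or real identifiers.

```python
import json

# Hypothetical metadata record: field names, relation terms and
# identifier values are placeholders invented for this illustration.
record = {
    "title": "Example water quality observations",
    "creators": [
        {
            "name": "Example Researcher",
            "role": "author",  # qualifies how the person relates to the dataset
            "identifier": {"scheme": "ORCID", "value": "0000-0000-0000-0000"},
        }
    ],
    "related": [
        # Each link states the relationship, not just that one exists.
        {"relation": "isPartOf",
         "identifier": {"scheme": "DOI", "value": "10.0000/example-parent"}},
        {"relation": "isReferencedBy",
         "identifier": {"scheme": "DOI", "value": "10.0000/example-paper"}},
    ],
}


def unqualified_links(rec):
    """Return related entries that lack a relation type or an identifier."""
    return [
        link for link in rec.get("related", [])
        if not link.get("relation") or not link.get("identifier")
    ]


print(json.dumps(record, indent=2))
print(unqualified_links(record))
```

A check like `unqualified_links` is one possible way to flag references that say only "related somehow" without naming the relationship.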
So I would like to hand over to Simon and Jonathan, I'm very grateful that they have made their time available. So just a brief introduction. Simon is a research scientist at CSIRO Land and Water's Environment Information Systems research program. He specialises in distributed architectures and information standards for environmental data focusing on geosciences and water. Jonathan Yu is a research computer scientist specialising in information architectures, data integration, linked data, semantic web, data analytics and visualisation. He's part of the Environmental Informatics Group in CSIRO Land and Water. So together they have been very active in applying their thinking around making data interoperable in the OzNome project.

Now one thing I want to point out is that in the OzNome project they did a whole series of work around the FAIR data principles in all different aspects. Today I have asked them specially to focus on interoperable. But please keep in mind that they have also done a whole bunch of other work. So without any further ado I'd like to hand over to Simon and Jonathan. I'm very intrigued how they've picked up interoperability and used that in the OzNome project.

Jonathan Yu: Okay, thanks Keith. So thanks for the introductions as well. So today we'll be presenting on some of the work we did in the OzNome initiative. Particularly looking at Land and Water and the data that we have in CSIRO, and how to make that interoperable according to some of the principles that FAIR espouses. But as we will talk about, some of the implementations that we have explored around the FAIR principles into actionable questions to address how FAIR your data is.

So if you haven't come across OzNome, this is a CSIRO-led initiative aiming to connect information ecosystems throughout Australia. The OzNome name was coined echoing the genome project. So Oz being Australia, and the Nome being a genome kind of inspired project. But really what we're looking at here is tools, services, products, methods, approaches and practices, and infrastructure to support having more connected information infrastructures.
In the previous year, as Keith mentioned, we focused on environmental information infrastructures. There's a couple of links there you can follow. Today we'll be talking about an example in the water space.

Simon Cox: Okay, so as part of establishing the OzNome architecture, OzNome infrastructure, we felt that we needed to assist potential data providers to understand what good data was, what in the context of this seminar series, what FAIR data is, we called it OzNome data. Basically we developed a rating - a set of rating criteria and a tool to allow assessment by data providers of the data that they are providing. This is just on the right-hand side of the screen here, you can see a screen capture of the sort of the kick-off page of the tool.

You'll also notice that we've got a slightly adapted version of the FAIR criteria - findable, accessible, interoperable and reusable - but we also add in the last line there, trusted. Which appears to go a little bit beyond what has been conceived in FAIR until now, but we suggest would be a useful addition. We're kind of bundling the interoperable and reusable together, we see those as being very closely related.

Obviously, it's teasing out some of the issues around what it is that makes data interoperable. Keith's given a sort of high level overview and indicated what some of the concerns might be. We've done our own take on this, a bit - actually fairly strongly leaning on our experience over a number of years, more than a decade now actually of working in the data standards communities, in particular the geospatial data standards communities. Some of the learning which we've got from there which we're applying directly in here. Obviously environmental data, which is what we're largely - what our heritage is, where we've largely been working. A lot of that is geospatial so it makes sense to be building on that.
Just a bit of a reminder, the FORCE11 FAIR principles, this is a summary slide from Michel Dumontier, who's one of the original authors of the papers and the developers of the FAIR principles. They
got these - the guiding principles with the four key words, which are teased out into three or four sub-principles in each case under the F, A, I and R letters. We're looking at the interoperable set here, which Keith has already shown.

It's interesting that Michel has recently done a study evaluating a number of repositories, particularly in Europe, though some of them are broader than that, and here's the list of repositories that were evaluated. He scored those on the FAIR principles. The data's available in this form actually, this table, which shoots off to the right of the screen, and there's lots more going on there. But looking at the summary of the results it's fairly notable that the tallest red bar here is in the interoperable category. So what this is saying is, of the FAIR data principles this is the one which is hardest to meet, the one that's hardest to conform to. So really that's the focus of the approach that we've taken, which is to kind of lead people through how they can make their data more FAIR, more OzNomic, more interoperable.

The particular way in which we've broken out the question of interoperability, if you look at the numbered terms here, is: is it loadable, is it usable, is it comprehensible, is it linked, as well as is it licensed. I'm just going to go through some of the details of those, and you'll see that in a sense it's fairly repetitive of some of the concerns that Keith explained at the beginning. But we're putting some more concrete examples onto these criteria, just to indicate to our data providers that when we say a standard data format we mean something like CSV or JSON or XML or netCDF. These are all important file formats; the ones towards the left are kind of general, but netCDF is one that's used a lot in the remote sensing and environmental science communities.
So we've got a bit of a ladder here of different levels of conformance which you can reach about whether a dataset would be loadable. Is it in a unique file format? Well, that means that you've got to have some
unique software to load it. Or is it in a standard data format, and normally that would be denoted by one of the standard MIME types. Best of all would be for data to be provided in multiple standard formats, giving a choice to the user so that whatever their favourite platform for loading data is, they can use it.

Next question, even when you've loaded it can you use it? If the structures within the dataset are unclear, even once it's loaded, then it's not going to be very usable. That comes down to the matter of, is there a schema provided which makes explicit the structures within the dataset. With a lot of sort of traditional data, yeah, there's a structure in there but the schema's not available independently of the data; if you like, the schema is implicit. It's not formalised. The schema maybe is different every time. A lot of spreadsheets are done that way; a spreadsheet has got a lot of boxes. But if every time you use it you add different columns and use the pages in a spreadsheet in a different way, then it takes a little while for the users to get their heads around what's going on before they can use it.

So there's various explicit schema languages like DDL, which is used for relational systems, and XML Schema. There's something coming out in the open knowledge world these days called data packaging, which allows you essentially to describe a schema for a CSV file. Then you've got, in the RDF, the semantic web space, RDFS and OWL. JSON even has a schema language these days, although it's not broadly used. So it's nice to provide data with a schema, but best of all would be to say, the data I am using uses this community schema. For example the Open Geospatial Consortium provides a number of community schemas for observations, for time series, for hydrology, for geoscience.
If you're publishing or attempting to share data in any of these disciplines then best to go off and find a community schema.
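As a rough illustration of the difference between an implicit and an explicit schema, the following is a minimal Python sketch using only the standard library. The column names, the sample data and the `load_with_schema` helper are all hypothetical and are not taken from the OzNome tooling or from any of the schema languages named above; it only shows the idea of a schema that lives apart from the data and is checked when the data is loaded.

```python
import csv
import io

# Hypothetical explicit schema: column name -> type used to parse each value.
# In practice this would be expressed in a schema language rather than in code.
SCHEMA = {"station_id": str, "date": str, "rainfall_mm": float}


def load_with_schema(text, schema):
    """Parse CSV text, checking the header and value types against the schema."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != list(schema):
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    # Casting each value raises ValueError if it does not fit the declared type.
    return [
        {name: cast(row[name]) for name, cast in schema.items()}
        for row in reader
    ]


# Hypothetical sample data that conforms to the schema.
SAMPLE = "station_id,date,rainfall_mm\nA001,2017-09-13,4.2\n"
rows = load_with_schema(SAMPLE, SCHEMA)
```

A spreadsheet whose columns change from one use to the next would fail the header check here, which is the practical cost of an implicit schema that the talk describes.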
Then even when you've got it loaded and you understand what the structures are, you've still got the question about what the words and numbers are inside the boxes. Are the column headings explicit enough to understand, or are they just shorthand for something which the project leader, when he or she was developing the data, knew that he or she would understand the next week? But even he or she, if they came back to it the next year, may not understand it. Better of course is if the field labels are linked and do have explanations, probably in plain text. Better still is to use standard labels, for example the Unified Code for Units of Measure (UCUM) unit codes, or the Climate and Forecast (CF) conventions coming out of the fluid earth community. So the ladder that we've got here says, oh, you're using standard labels. Is it just some of the field names that are linked to standard externally managed vocabularies, or are all the field names linked to standard externally managed vocabularies? You get this ladder, better and better and better.

Then the question about how well linked is your data? Well, if it's just a file sitting on a server somewhere and there's no links in or out, yeah, you're lucky to find it. Most of the datasets that this community would be expecting are indexed in a catalogue or available from a landing page. That's the situation where you've got inbound links to the dataset. Best of all is when there are outbound links, embedded or implicit in the data structures in a dataset, which say exactly how it's related. This links in with some of the previous concerns that we had there about field names and these kinds of things.

So I'm going to hand back to Jonathan to tease through a case study that we've got here, really based on the AWRA-L - the Australian Water Resources Assessment datasets. So Jonathan.
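The "some field names linked, or all field names linked" ladder described above can be sketched as a small function. This is only an illustration under stated assumptions: the rung labels, the `label_rating` function, the field names and the `example.org` URIs are hypothetical and do not reproduce the actual criteria or wording of the OzNome rating tool.

```python
def label_rating(field_links):
    """Rate a dataset's field labels on a three-rung ladder.

    field_links maps each field name to the URI of an externally managed
    vocabulary term, or to None when the field has only a local label.
    """
    linked = [uri for uri in field_links.values() if uri]
    if not linked:
        return "no field names linked"
    if len(linked) < len(field_links):
        return "some field names linked"
    return "all field names linked"


# Hypothetical datasets; the URIs are placeholders, not real vocabulary terms.
local_only = {"e0_avg": None, "rain_day": None}
partly_linked = {"e0_avg": "http://example.org/def/pet", "rain_day": None}
fully_linked = {
    "e0_avg": "http://example.org/def/pet",
    "rain_day": "http://example.org/def/rainfall",
}
```

Calling `label_rating` on the three dictionaries walks up the ladder from the bottom rung to the top one.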
Jonathan Yu: Yes, so as mentioned earlier, in the OzNome project we looked at a practical example and a case study in the AWRA-L dataset. This is a
continental cell dataset that has historical time series from 1911. The Bureau published an operational version online, and you can find that on their website. But often scientists have to basically deal with this dataset by knowing where it is and knowing how to use it implicitly, knowing how to reference the requisite geospatial features and understand the field name values.

So I've got an example in the - oh sorry, so the next slide shows the assessment of it using our tool. Just focusing on the interoperable side of things, we have rated it as a web service; you can get it by the web. However the reference definitions are text only, and they are localised in the dataset itself.

Now I'll give an example in the next slide. So this is coming out from the netCDF metadata of this dataset; you can access this online through THREDDS or via their netCDF tools. But this is a summary of the metadata that comes along with the data. So we've got long name here, Potential evapotranspiration; we've got the name, which is a label for the field, e0_avg; units, mm; and a standard name, which is a convention in netCDF to refer to a property, which is e0_avg, and which in this case isn't part of the CF conventions that are often used with this format. So if you are an expert in this area and you've used this dataset many times you will know what this is. If you are a newcomer you have to do a lot of work to - well, a little bit of work to understand what this data field actually means.

In the OzNome project what we did was enrich this with external variables. So if you go to the next slide Simon, so this is the same field. These added lines at the bottom here tease out what this particular data field means in the context of externally defined vocabularies. So we've now enriched this with a scaled quantity kind identifier, Potential evapotranspiration.
It's an HTTP URI which you can resolve to get a definition. Similarly for substance or taxon, unit ID and feature of interest.
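To make the enrichment step concrete, here is a minimal Python sketch that models the variable's metadata as plain dictionaries. The four original attributes are the ones read out in the talk. The attribute names for the added links, the `enrich` helper and every `example.org` URI are assumptions made for illustration only; the real attribute names and vocabulary URIs used in the OzNome project are not given in this transcript.

```python
# Attributes of the variable as described in the talk (text-only definitions).
original = {
    "long_name": "Potential evapotranspiration",
    "name": "e0_avg",
    "units": "mm",
    "standard_name": "e0_avg",
}

# Hypothetical attribute names and placeholder URIs for the outbound links.
VOCAB = "http://example.org/def/"
links = {
    "scaled_quantity_kind_id": VOCAB + "potential-evapotranspiration",
    "substance_or_taxon_id": VOCAB + "substance-or-taxon-placeholder",
    "unit_id": VOCAB + "unit-placeholder",
    "feature_of_interest_id": VOCAB + "feature-of-interest-placeholder",
}


def enrich(attrs, extra):
    """Return a copy of attrs with link attributes added, never overwritten."""
    clash = set(attrs) & set(extra)
    if clash:
        raise ValueError(f"attributes already present: {sorted(clash)}")
    return {**attrs, **extra}


enriched = enrich(original, links)
```

The existing text attributes are left untouched, so software that only understands the original labels keeps working, while software that follows links can resolve each URI to a definition.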
We'll just talk about what they are. So part of the project was to explore, could we define vocabularies for these, to which we could reference outbound links from the data to the definition. This is just a summary of what we did in the context of the AWRA-L dataset.

This is an example of potential evapotranspiration. We've got a conceptual model here where we've got broader notions of potential evapotranspiration. We've got linked relationships out to things like feature of interest, object of interest, and unit of measure. So this view provides a vocabulary entry for potential evapotranspiration: not only the identifier for it, not only the description for it, but a richer model than you would get if you just had something inline. So you've got outbound relationships from this concept to its related concepts essentially. So this is a demonstration of defining the concepts externally, having them quite richly explained through this medium, but having the ability to link from the dataset itself to this definition to make it more interoperable. So that if we had another dataset that talked about potential evapotranspiration it could potentially be linked and interoperable.

Here is a revised OzNome maturity estimation using the OzNome five-star tool. Just focusing on the interoperable field, using the same tool and assessing it based on the same criteria, we've gone up from two stars to more than four stars in the interoperable space. The reason for that is that we now have reference definitions as linked data and externally hosted observed property vocabulary definitions, rather than just inline labels of what it is. It provides more interoperability, and if the vocabulary was standardised then we would have a higher estimation in that field. But it's just a demonstration of how we went about making something more interoperable through the OzNome project.
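The point that another dataset talking about potential evapotranspiration "could potentially be linked" can be sketched as follows. This is a hedged illustration, not the project's method: the second dataset, its field names, the `shared_concepts` function and the `example.org` URIs are all hypothetical. It only shows that two fields with different local labels can be matched once both point at the same externally defined concept.

```python
def shared_concepts(fields_a, fields_b):
    """Match fields of two datasets that link to the same concept URI.

    Each argument maps a local field name to the URI of the concept it
    measures. Returns {concept_uri: (field_in_a, field_in_b)}.
    """
    by_uri_b = {uri: name for name, uri in fields_b.items()}
    return {
        uri: (name, by_uri_b[uri])
        for name, uri in fields_a.items()
        if uri in by_uri_b
    }


# Placeholder concept URIs, not real vocabulary entries.
PET = "http://example.org/def/potential-evapotranspiration"
RAIN = "http://example.org/def/rainfall"

dataset_a = {"e0_avg": PET}                      # label used in the talk
dataset_b = {"pet_daily": PET, "rain_mm": RAIN}  # hypothetical second dataset

matches = shared_concepts(dataset_a, dataset_b)
```

Matching on the local labels `e0_avg` and `pet_daily` would find nothing in common; matching on the linked concept finds the pair.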
Simon Cox: Yeah, I'll just pick up at the end here and just comment that when we were starting this data ratings exercise we actually didn't look at FAIR at the beginning. We developed our own set of criteria, these key
words here, and then subsequently correlated them with the FAIR principles. One of the interesting things was there were three lines in this table here, the ones in red, which didn't correlate with concerns that had been identified within FAIR.

The first one might be seen as trivial, but we thought it was a question that was worth asking, particularly when working with research scientists and talking about making their data available, which was the first question: is your data intended to be used by anybody else? There's lots of data generated which is never shared. Now that's not necessarily a good thing, and to a certain extent having the question there highlights the fact that there is a question to be asked and that some scientists, some researchers, need to be encouraged to think about making their data available, about publishing it. So I think in terms of the FAIR principles this one was kind of the implicit starting point. If it's published, yes, it's implicitly FAIR.

A couple of other rows. One concern which comes up, particularly as we've worked a lot with agencies that have sort of systematic data collection processes with systematic curation and maintenance revisiting: a dataset is refreshed every day or every month or every year, all that. That concern didn't seem to be particularly addressed in the FAIR principles as they stand. So we'd say the concern about whether the data is expected to be updated and maintained is maybe a bit more than FAIR.

The bottom row there as well is the concern about, if you like, an elaboration of the assessment of data that you might do, which is to get some information about how well trusted it is. Now a lot of that is about who else is using it, how much it's - well, that's often the criteria you'll use.
Who else is using it, how many times is it being used, what other products have been generated from this dataset and so can I trust it?
So just emphasising that this row there is the interoperable one; it corresponds with the interoperability which is what we've really been focusing on today. The use of standards I guess. Standards is a funny word, you have to be a bit careful with it. Capital S standard, sometimes people think that's just to do with ISO or Australian Standards or whatever. Really the point of standards is that they are community agreements. They are community agreements which are available for additional members of the community to join in. But it's important to think of them as agreements - agreements to do things in a common way.

So finally just a slide with some links to some of the material that we've been showing today. We'll say thank you for listening.

Keith Russell: Thank you Simon, thank you Jonathan. That was really interesting and a really useful way to see what it actually means in practice. Because I think interoperable can be quite a complex, difficult subject, sometimes also one that requires much more knowledge of the actual field of research that you're talking about. So I think this is a great example of where you've been working in a specific field to try and make that data more interoperable. Thanks very much for your time; this is a really interesting discussion, really starting to tease out a number of the issues and a number of the things that probably will need developing further.

I've just put up a slide which links off to a number of resources, and some of these Simon already mentioned. So ANDS has a service, Research Vocabularies Australia, which anybody around the country - or actually internationally also - can use if you don't have your own tool to set up a vocabulary. That is a possible way of doing it. There are also already existing vocabularies in there. So have a look at that if that's of interest. We also have an interest group that works in this space.
If you are looking at the metadata and having qualified relationships within the metadata and using identifiers, there's a few links there to
places where you can find information about possible identifiers. We are also trying to pull that metadata, describing datasets, together and share that internationally through a number of hubs. That's taking place through the Scholix project. Research Data Australia is sort of an Australian hub contributing into that international hub - international effort. So have a look there if you're interested.

We did 23 research starter things last year, and two of the things are relevant for our discussion today. If you are interested in digging into it a little further and discovering a little bit more about what the vocabularies mean in practice, have a go at Thing 12. Or if you are more interested in the identifiers and linked data have a look at Thing 14.

Finally I would like to first of all thank Simon and Jonathan again for their time and for the excellent presentation and the insights that they brought to the table. Finally we would like to acknowledge NCRIS, the National Collaborative Research Infrastructure Strategy program, that provides the funding for ANDS. So thanks again and look forward to seeing you all next week.

END OF TRANSCRIPT