1. Abstract: TM351 is a new course on databases, data exploration and simple data visualisation, due for first presentation in 2015J. The course will require students to make use of several different databases and work with them in an interactive fashion. The course team are proposing to use a virtual machine (VM) to deliver all the course software to students as an integrated package, allowing the same "student software lab" configuration to run on different platforms (Windows, OS X, Linux) or even on a cloud server. The main user interface to the services being run on the VM will be via a web browser on the student's host operating system (that is, their "normal" one) in the form of an IPython Notebook. This interface provides a blend of text cells (written using markdown) that can display HTML text, images, etc., and executable code cells. Code cells can be edited by students and then the code fragments contained within them executed on a cell by cell basis, the output from the code execution being inserted within the document. In this presentation, I will review the rationale for adopting the use of a virtual machine to deliver software environments to students, and demonstrate how we intend to make use of IPython notebooks for the delivery of interactive course materials. Possible applications to other courses will also be reviewed, along with a consideration of possible workflows associated with production, maintenance and student use.
2. TM351 – a new course, currently in production…
- Level 3
- 30 points
- First presentation slated for October 2015 (15J)
3. It’s replacing a “traditional” databases course, but we’re planning quite a few twists… What those twists are in content terms, though, is the subject of another presentation…
4. What I am going to talk about are two new things we’re exploring in the context of the course, and which we’re hopeful might also prove attractive to other course teams.
5. The first thing is virtual machines. These have already been used on a couple of other OU courses – TM128 and M812 both use virtual machines – but we are taking a more fundamental view about how to use notebooks to deliver interactive teaching material as well as software application services. So what is a virtual machine?
6. We’re all familiar with the idea that a student can run OU-supplied software on their own desktop: either third-party software or OU-created software, or a combination of both in the case of open source applications where we take some open code and then modify it ourselves.
7. We may even require students to install more than one piece of software, perhaps further requiring that these applications can interoperate. With a move to be “open” and agnostic towards a particular operating system, there are considerable challenges to be faced:
- software libraries should ideally be cross-platform rather than multiple native implementations of ostensibly the same application;
- software versions across applications should update in sync with each other;
- the UI, or look and feel, should be the same across platforms – or we have more writing to do;
- support issues are likely to scale badly for us as we have to cope with more variations on the configuration of individual student machines (for example, different operating systems, different versions of the same operating system).
8. One way of mitigating change is to settle on a single UI space – such as a browser. Applications can be built solely within the browser, and made available to the user requiring little more desktop (or server) application support other than a web server. Application front ends written in HTML5 and JavaScript can provide an experience rich enough to rival that of a native application. Application front ends can also be created for applications running as services either on the student’s desktop or via a remote server. Applications can draw on files in a folder on the student’s desktop machine, and the browser can be used to save files (e.g. from the internet) into that folder.
9. To get round the problem of having to install software onto multiple different possible system configurations, how much easier would it be if we knew exactly what operating system each student was running and they were all running exactly the same operating system? Virtualisation platforms such as VirtualBox and VMware are cross-platform applications that can be downloaded to a student’s own machine and that then allow an additional guest operating system to be installed in its own container running on the student’s own computer (the host) via the virtualisation platform. The guest operating system and the software that runs on the guest operating system are said to define a virtual machine or “VM”. The virtual machine can be defined by a central service and then delivered to the students in such a way that each receives a copy of exactly the same virtual machine in terms of its operating system and the applications preinstalled onto it.
10. What this means is that we can define a VM, preinstall software onto it, and ship it to students so they can run it via a virtualisation platform installed onto their machine. The VM can run applications as services, exposing their UIs via a browser. Files can easily be shared between the host and guest machines. As far as students are concerned, all they need to do is install a virtualisation system onto their computer, and then the same OU virtual machine into that system, irrespective of the operating system they happen to be running.
11. It is also possible to run the VM on a remote server, with the students accessing the services running in that VM via their browser. This means that students can access services using computers that themselves may not be capable of installing or running particular applications – such as some tablet computers.
12. Notebook computing is my great hope for the future. Notebook computing is like spreadsheet computing, a democratisation of access to and the process of practically based, task oriented computing. Spreadsheets help you get stuff done, even if you don’t consider yourself to be a programmer. My hope is that the notebook metaphor – and it’s actually quite an old one – can similarly encourage people who don’t consider themselves programmers to do and to use programmy things.
13. Notebook computing buys us in to two ways of thinking that I think are useful from a pedagogical perspective – that is, pedagogy not just as a way of teaching but also as a way of learning, in the sense of learning about something through investigating it. Here, I’m thinking of an investigation as a form of problem based learning – I’m not up enough on educational or learning theory to know whether there is a body of theory, or even just a school of thought, about “investigative learning”. These two ways of thinking are literate programming and reproducible research.
14. In case you haven’t already realised it, code is an expressive medium. Code has its poets, and artists, as well as its architects, engineers and technicians. One of the grand masters of code is Don – Donald – Knuth. Don Knuth said “A literate programmer is an essayist who writes programs for humans to understand” as part of a longer quote. Here’s that longer quote:
“Literate programming is a programming methodology that combines a programming language with a documentation language, making programs more robust, more portable, and more easily maintained than programs written only in a high-level language.
“Computer programmers already know both kinds of languages; they need only learn a few conventions about alternating between languages to create programs that are works of literature. A literate programmer is an essayist who writes programs for humans to understand, instead of primarily writing instructions for machines to follow. When programs are written in the recommended style they can be transformed into documents by a document compiler and into efficient code by an algebraic compiler.”
Notebooks are environments that encourage the practice of writing literate code. Notebooks encourage you to write prose and illustrate it with code – and the
15. The other idea that the notebooks buy us into is reproducible research. I love this idea and think you should too. It lets archiving make sense. Do I really have to say any more than just show that quote? Now you may say that that’s all very well for, I don’t know, physics or biology, or science, or economics. Or social science in general, where they do all sorts of inexplicable things with statistics and probably should try to keep track of what they’re doing. But not the humanities. But that’s not quite right, because in the digital humanities there are computational tools that you can use. Particularly in the areas of text analysis and visualisation. Such as some of the visualisations we saw in the first part of this presentation. But you need a tool that democratises access to this technology. You need an environment like the one that the social scientists found in the form of a spreadsheet. But better.
16. (I also like to think of notebooks as a place where I can have a conversation with data.)
17. So how do notebooks help? The tool I want to describe is – are – called IPython Notebooks. IPython Notebooks let you execute code written in the Python programming language in an interactive way. But they also work with other languages – JavaScript, Ruby, R, and so on, as well as other applications. I use a notebook for drawing diagrams using Graphviz, for example. They also include words – of introduction, of analysis, of conclusion, of reflection. And they also include the things the code wants to tell us, or that the data wants to tell us via the code. The code outputs. (Or more correctly, the code+data outputs.)
21. The first thing notebooks let you do is write text for the non-coding reader. Words. In English. (Or Spanish. Or French. I would say Chinese, but I haven’t checked what character sets are supported, so I can’t say that for definite until I check!) “Literate programming is a programming methodology that combines a programming language with a documentation language”. That’s what Knuth said. But we can take it further. Past code. Past documentation. To write up. To story. The medium in which we can write our human words is a simple text markup language called markdown. If you’ve ever written HTML, it’s not that hard. If you’ve ever written an email and wrapped asterisks around a word or phrase to emphasise it, or written a list of items down by putting each new item onto a new line and preceding it with a dash, it’s that easy.
22. Here’s a notebook, and here’s some text. There’s also some code. But note the text – we have a header, and then some “human text”. You might also notice some up and down arrows in the notebook toolbar. These allow us to rearrange the order of the cells in the notebook in a straightforward way. In a sense, we are encouraged to rearrange the sequence of cells into an order that makes more sense as a narrative for the reader of the document, or in the execution of an investigation. The upside of this is that we can author a document in a ‘non-linear’ way and then linearise it for final distribution simply by changing the order in which the cells are presented. There are constraints though – if a cell computationally depends on the result of, or state change resulting from, the execution of a prior cell, their relative ordering cannot be changed.
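The ordering constraint can be sketched in plain Python by treating each cell as a snippet of code run against one shared namespace. The cell contents and the `run_cells` helper below are hypothetical, made up for illustration; this is not how the notebook server itself is implemented.

```python
# Each "cell" is a snippet of code; all cells share one namespace,
# loosely mimicking how notebook cells share state.
cell_a = "readings = [3, 5, 8]"          # sets up some state
cell_b = "total = sum(readings)"         # depends on the state set by cell_a
cell_c = "note = 'independent of the others'"


def run_cells(cells):
    """Run hypothetical cells in the given order and return the shared namespace."""
    namespace = {}
    for source in cells:
        exec(source, namespace)
    return namespace


# cell_c can go anywhere, but cell_a must stay ahead of cell_b.
ok = run_cells([cell_c, cell_a, cell_b])
print(ok["total"])

try:
    run_cells([cell_b, cell_a])
except NameError as err:
    print("Reordering broke the notebook:", err)
```

Moving `cell_c` is harmless because nothing reads or changes what it needs; swapping `cell_a` and `cell_b` fails because `readings` does not exist yet when `cell_b` runs.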
23. As well as human readable text cells – markdown cells or header cells at a variety of levels – there are also code cells. Code cells allow you to write (or copy and paste in) code and then run it. Applications give you menu options that in the background copy, paste and execute the code you want to run, or apply to some particular set of data, or text. Code cells work the same way, but they’re naked. They show you the code. At this point it’s important to remember that code can call code. Thousands of lines of code that do really clever and difficult things can be called from a single line of code. Often code with a sensible function name, just like a sensible menu item label. A self-describing name that calls the masses of really clever code that someone else has written behind the scenes. But you know which code because you just called it. Explicitly. Let’s see an example – not a brilliant example, but an example nonetheless.
24. Here’s some code. It’s actually two code cells – in one, I define a function. In the second, I call it. (Already this is revisionist. I developed the function by not wrapping it in a function. It was just a series of lines of code that I wrote to perform a particular task. But it was a useful task. So I wrapped the lines of code in a function, and now I can call those lines of code just by calling the function name.) I can also hide the function in another file, outside of the notebook, then just include it in any notebook I want to… …or within a notebook, I could just copy a set of lines of code and repeatedly paste them into the notebook, applying them to a different set of data each time… but that just gets messy, and that’s what being able to call a bunch of lines of code wrapped up in a function call avoids.
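A minimal sketch of that two-cell pattern follows. The code shown on the slide is not reproduced in these notes, so a made-up task (summarising word lengths) stands in for it; `summarise_lengths` and the sample word lists are hypothetical.

```python
# --- Cell 1: wrap the lines of code that did the task in a function ---
def summarise_lengths(words):
    """Return the shortest, longest and mean word length for a list of words."""
    lengths = [len(word) for word in words]
    return {
        "shortest": min(lengths),
        "longest": max(lengths),
        "mean": sum(lengths) / len(lengths),
    }


# --- Cell 2: call the function by name, once per dataset ---
course_words = ["databases", "data", "exploration"]
tool_words = ["notebook", "browser"]

course_summary = summarise_lengths(course_words)
tool_summary = summarise_lengths(tool_words)
print(course_summary)
print(tool_summary)
```

If the function were moved into a file alongside the notebook, say a hypothetical `helpers.py`, the first cell would shrink to `from helpers import summarise_lengths` and the second cell would stay the same.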
25. As far as reproducible research goes, the ability of a notebook to execute a code element and display the output from executing that code means that there is a one-to-one binding between a code fragment and the data on which it operates and the output obtained from executing just that code on just that data.
26. The output of the code is not a human copied and pasted artefact. The output of the code – in this case, the result of executing a particular function – is only and exactly the output from executing that function on a specified dataset.
27. The output of a code cell is not limited to the arcane outputs of a computational function. We can display data table results as data tables.
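One way a notebook can show a result as a table rather than as raw text is through IPython's rich display hooks: when an object defines a `_repr_html_` method, the notebook renders the HTML it returns. The sketch below is a dependency-free illustration of that hook; the `SimpleTable` class is hypothetical and is not the table display shown on the slide.

```python
from html import escape


class SimpleTable:
    """Hypothetical helper: a list of row dicts that displays as an HTML table."""

    def __init__(self, rows):
        self.rows = rows
        self.columns = list(rows[0]) if rows else []

    def _repr_html_(self):
        # IPython looks for this method when displaying an object in a notebook.
        head = "".join(f"<th>{escape(str(col))}</th>" for col in self.columns)
        body = "".join(
            "<tr>"
            + "".join(f"<td>{escape(str(row[col]))}</td>" for col in self.columns)
            + "</tr>"
            for row in self.rows
        )
        return f"<table><tr>{head}</tr>{body}</table>"


table = SimpleTable([
    {"course": "TM351", "level": 3, "points": 30},
])
html = table._repr_html_()
print(html)
```

In a notebook, leaving `table` as the last expression in a code cell would display the rendered table; outside a notebook, the `print` call simply shows the generated markup.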
28. We can also generate rich HTML outputs – in this case an interactive map overlaid with markers corresponding to locations specified in a dataset, and with lines connecting markers as defined by connections described in the original dataset. We can also delete the outputs of all the code cells, and then rerun the code, one step – one cell – after the other. Reproducing results becomes simply a matter of rerunning the code in the notebook against the data loaded in by the notebook – and then comparing the code cell outputs to the code cell outputs of the original document. Tools are also under development that help spot differences between those outputs, at least in cases where the outputs are text based.
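For text-based outputs, the comparison step can be approximated with Python's standard `difflib` module. This is only a sketch of the idea, not one of the tools under development; the `diff_outputs` helper and the sample output strings are hypothetical.

```python
import difflib


def diff_outputs(original, rerun):
    """Return unified-diff lines between two text outputs (empty list if identical)."""
    return list(
        difflib.unified_diff(
            original.splitlines(),
            rerun.splitlines(),
            fromfile="original output",
            tofile="rerun output",
            lineterm="",
        )
    )


# Hypothetical saved output of a cell, and two possible outputs from rerunning it.
original_output = "rows: 3\nmean: 5.0"
same_rerun = "rows: 3\nmean: 5.0"
changed_rerun = "rows: 3\nmean: 5.5"

print(diff_outputs(original_output, same_rerun))  # identical outputs give an empty list
for line in diff_outputs(original_output, changed_rerun):
    print(line)
```

An empty result means the rerun reproduced the original output exactly; any lines prefixed with `-` or `+` mark where the two runs disagree.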
29. So can we run virtual machines and IPython notebooks together?
30. The IPython notebooks are actually browser based front end applications being powered by an IPython server…
31. It’s easy enough to run the IPython server on a virtual machine, either running as a guest VM on a student’s host computer, or running as an online service accessed by the student via the web using their own web browser.
32. There is a lot more that could be said – for example:
- workflows around the building/provisioning of virtual machines,
- how we might be able to host such machines either centrally or as a self-service option,
- the parallels between notebook style computing and spreadsheets,
- the notion of conversations with data,
- etc.