4. Source: Patrick Kirch, On the Road of the Winds: An Archaeological History of the Pacific Islands before European Contact (Oakland: University of California Press, 2017), p. 6.
5. Sources: Fishhook map and photo: Val Attenbrow, ‘Aboriginal fishing in Port Jackson’, in The Natural History of Sydney (Sydney, 2010); Dingo photograph: Henry Whitehead - Original photograph, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=12057483; Sahul and Sundaland map: Kirch, p. 57; Linguistic data: Rachel Hendery (personal communication), and POLLEX database.
Language                  Word
Hawaiian                  kapu
NZ Maori                  tapu
Proto southern Vanuatu    *tabur
Gugu Yimidhirr            thabul
7. Dunlop’s transcription:
Nge a runba wonung bulkirra umbilinto bulwarra;
Pital burra kultan wirripang buntoa
Modern reconstruction (Wafer 2017, p. 204):
ngayaranpa wanang palkirr yampilintu pulwarra
pital para katan wiripang pantuwa
Dunlop’s poetic translation:
Our home is the gibber-gunyah
Where hill joins hill on high;
…
And the rushing of wings, as the wangas pass,
Sweeps the wallaby’s print from the glistening grass.
Modern literal translation (Wafer 2017, p. 206):
Ours is the place where the mountains cohabit with the heights
The eaglehawks and wallabies are happy
The Sydney Morning Herald, 11 Oct 1848
9. Harvest Clean Decipher
[Pipeline diagram. Recoverable labels: Public API; Local PostgreSQL Database; Tokenised text; Clean; Cleaned tokens; Infer; Untold Riches.]
Encoder-Decoder Text Correction Model. Train: ALTA dataset of 6000 hand-corrected articles (Cassidy and Mollá, 2017)
Language Classification Model. Train: ???
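11. RNN Basics: A single time-step
[Diagram: three RNN cells in a row (previous time-step, this time-step, next time-step), reading the letters k, i, n. The cell for this time-step receives the letter i as a ‘one-hot’ vector (0, 0, 0, … 1, … 0, 0, 0) and the vector c<t-1> from the previous cell, drawn as (0.99, 0.24, 0.01, ..., c<t-1>(n-1), c<t-1>(n)). It emits o<t> and passes c<t> to the next cell.]
c<t>: the ‘memory cell’ at time-step t. It is updated at each
time-step and saved for the next one.
o<t>: the ‘output’ at time-step t.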
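A single time-step can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the slide does not say which kind of cell is used, so the sketch uses the simplest tanh recurrence, in which the output o<t> is just the new memory c<t>. The names one_hot, rnn_step and init_params, the hidden size of 4, and the random untrained weights are all hypothetical.

```python
import math
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def one_hot(ch, alphabet=ALPHABET):
    """Return a vector of zeros with a single 1 at the letter's index."""
    vec = [0.0] * len(alphabet)
    vec[alphabet.index(ch)] = 1.0
    return vec

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def rnn_step(x, c_prev, W_x, W_c, b):
    """One time-step of a plain tanh cell (an assumption, see lead-in).

    x      : one-hot input for this time-step
    c_prev : memory cell c<t-1> from the previous time-step
    returns (c_t, o_t)
    """
    zx = matvec(W_x, x)
    zc = matvec(W_c, c_prev)
    c_t = [math.tanh(a + r + bias) for a, r, bias in zip(zx, zc, b)]
    o_t = c_t  # in this simple cell the output equals the new memory
    return c_t, o_t

def init_params(n_hidden, n_in, seed=0):
    """Small random weights; a trained model would learn these."""
    rng = random.Random(seed)
    W_x = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    W_c = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_hidden)]
    b = [0.0] * n_hidden
    return W_x, W_c, b

# Feed the letters k, i, n one at a time, carrying the memory forward.
W_x, W_c, b = init_params(n_hidden=4, n_in=len(ALPHABET))
c = [0.0] * 4
for ch in "kin":
    c, o = rnn_step(one_hot(ch), c, W_x, W_c, b)
```

The loop is the whole idea: the same cell is applied at every time-step, and only c changes as each letter is read.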
12. Model design #1: A general English-language model
[Diagram: a chain of nine RNN cells, one per symbol (the letters k, i, n, d, y, n, e plus the S and E markers). The output of each cell passes through a softmax (σ).]
The ‘softmax activation function’ guesses the next letter based on
the output of the cell, returning a vector of probabilities, e.g.:
(P(a)=0.01, P(b)=0.4, P(c)=0.02, … P(z)=0.001, P(S)=0.1, P(E)=0.002)
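The softmax step can be illustrated as follows. This is a minimal sketch, not the model's actual code: it assumes the cell output has already been mapped to one raw score per symbol (the slide does not show that layer), and the vocabulary of 26 lowercase letters plus S and E is inferred from the example vector above. The function names and the example scores are hypothetical.

```python
import math

# 26 letters plus the S and E markers that appear in the example vector.
VOCAB = list("abcdefghijklmnopqrstuvwxyz") + ["S", "E"]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_letter_distribution(scores, vocab=VOCAB):
    """Pair each symbol with its probability of coming next."""
    return dict(zip(vocab, softmax(scores)))

# Made-up scores in which 'b' is strongly favoured.
scores = [0.0] * len(VOCAB)
scores[VOCAB.index("b")] = 3.0

dist = next_letter_distribution(scores)
best = max(dist, key=dist.get)  # the model's single best guess
```

Because every entry is positive and the entries sum to 1, the result can be read directly as the model's belief about which letter follows.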
14. Model design #2: Binary classification model
[Diagram: pairs of RNN cells read the symbols S, k, i, n, e, …, E. Their outputs are joined (‘Concatenate’) and passed through a single softmax (σ).]
The ‘softmax activation function’ predicts whether the
whole word is English or Australian and simply outputs
a two-vector, e.g.:
(P(English)=0.37, P(Australian)=0.63)
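The second design can be sketched in the same style. Several things here are assumptions rather than facts from the slide: the paired cells are read as one pass over the word in each direction, the final memory of each pass is what gets concatenated, the cell is a plain tanh recurrence, and the S and E markers are left out for brevity. The weights are random and untrained, so the probabilities mean nothing; the point is only the shape of the computation. All function names are hypothetical.

```python
import math
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
CLASSES = ("English", "Australian")

def one_hot(ch):
    vec = [0.0] * len(ALPHABET)
    vec[ALPHABET.index(ch)] = 1.0
    return vec

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def run_rnn(chars, W_x, W_c):
    """Read the characters in order and return the final memory cell."""
    c = [0.0] * len(W_c)
    for ch in chars:
        zx = matvec(W_x, one_hot(ch))
        zc = matvec(W_c, c)
        c = [math.tanh(a + r) for a, r in zip(zx, zc)]
    return c

def init_params(n_hidden=4, seed=0):
    """Random, untrained weights for both passes and the output layer."""
    rng = random.Random(seed)
    n_in = len(ALPHABET)
    return {
        "fwd": (rand_matrix(n_hidden, n_in, rng), rand_matrix(n_hidden, n_hidden, rng)),
        "bwd": (rand_matrix(n_hidden, n_in, rng), rand_matrix(n_hidden, n_hidden, rng)),
        "W_out": rand_matrix(len(CLASSES), 2 * n_hidden, rng),
    }

def classify(word, params):
    """Return {class: probability} for the whole word."""
    fwd = run_rnn(word, *params["fwd"])            # left to right
    bwd = run_rnn(reversed(word), *params["bwd"])  # right to left
    features = fwd + bwd                           # the 'Concatenate' step
    scores = matvec(params["W_out"], features)     # one score per class
    return dict(zip(CLASSES, softmax(scores)))

probs = classify("kine", init_params())  # arbitrary example string
```

Unlike design #1, the softmax is applied once per word rather than once per letter, so the output is a single two-entry vector.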