C* Summit 2013: Data Modelers Still Have Jobs - Adjusting For the NoSQL Environment by Joe Maguire
1. Data Modelers Save Their Careers: Surviving and Thriving with NoSQL
Joe Maguire
Data Quality Strategies, LLC
http://www.DataQualityStrategies.com/
© 2013 Data Quality Strategies, LLC
2. Thesis
• Relational DBMS’s have dominated,
• ...so relational modeling subsumed other forms, including conceptual modeling.
• As R-DBMS wanes, so does relational modeling – and sadly, whatever it subsumed.
• Conceptual modeling must be saved.
• Relational modelers can step in to save it...
• ...with some significant effort.
#Cassandra13
3. My Perspective
• Over three decades in industry
• Career is a three-legged stool
  – Product development for software vendors
  – Solution design for enterprises
  – Author, Industry Analyst, Thought Leader
• Specialize in
  – Modeling
  – Requirements analysis
  – Data architecture
  – Data quality
• Joe.Maguire@DataQualityStrategies.com
4. Agenda
• History
• Current Events
• Your Future as a Data Modeler
• Q&A
5. A Big-Picture Framework
[Diagram: Meta-model / Data Perspective]
Conceptual
• Entities
• Attributes
• Relationships
• Identifiers
Logical
• Tables
• Columns
• Primary and foreign keys
Physical
• Indexes
• Table spaces
• Vertical and horizontal partitioning
• Denormalizations
6. Good Ideas in the Framework
• Information Hiding
  – e.g., conceptual excludes implementation details
• The Type/Instance distinction
  – Models describe categories, data describes members
• Application/Data Independence
  – Data modeling is separate from process modeling
• User Requirements ≠ System Requirements
  – Users should not participate in logical and physical
• Model-Driven Development
  – Forward and reverse engineering across model levels
7. A Big-Picture Framework, distorted
[Diagram: Meta-model / Data Perspective]
Relational
• Entities / Tables
• Attributes / Columns
• Relationships / FKs
• Identifiers / PKs
Physical
• Indexes
• Table spaces
• Vertical and horizontal partitioning
• Denormalizations
8. How the Distortion Happens
• Tool Vendors Dismiss Conceptual Modeling
  – Because their tools cannot support it anyway
• Info Mgmt Specialists Confuse Models with Reality
  – E.g., believing the relational model suffices to describe the universe
• Institutionalized Expediency
  – We know about conceptual modeling, but to save time, we combine it with relational modeling...
  – ...then we formalize that into our dev processes...
  – ...and eventually, that becomes the “best practices.”
9. Distortions, Revisited
• Summary of Distortions:
  – Distortion: Conceptual means vague
  – Distortion: Logical implies relational
    • Rather than XML, OO, KV Store, Array Database, Graph Database
• Results of Distortions:
  – Two levels only: relational and physical
  – Relational modeling used for user requirements
10. Agenda
• History
• Current Events
• Your Future as a Data Modeler
• Q&A
11. Current Events: NoSQL
• The “Just Say No” Interpretation
[Diagram: Meta-model / Data Perspective]
Logical Relational
• Entities / Tables
• Attributes / Columns
• Relationships / FKs
• Identifiers / PKs
Physical
NO LONGER RELATIONAL:
• Schemas Based on Big Table Implementations
• Alien DDL language
• Limited Support from Modeling Tools
12. Current Events: NoSQL
• The “Not Only SQL” Interpretation
  – Okay, so there might be some work for you
  – But you’re at risk of being marginalized
13. Agenda
• History
• Current Events
• Your Future as a Data Modeler
• Summary
• Q&A
14. Your Future as a Modeler
• Remaining Relevant
  – Selfishly: Saving your career
  – Nobly: Serving your client / company / customer
• What you can do:
  – Wait for relational projects
  – Become a NoSQL database designer
  – Help your client choose data platforms
    • That starts with understanding the problems
      – which starts with CONCEPTUAL MODELING.
15. A New (?) Modeling Framework
• Conceptual Modeling
• Choosing a Logical Meta-model
• Logical Modeling
• Physical Modeling
• Tool Support?
16. Conceptual Modeling
• Behaviors and constructs will compare to Relational Modeling:
  – Keep some
  – Discard some
  – Stress some
  – Change some
18. Keep Some
• Keep Entities
• Keep Attributes
• Keep Relationships
• Keep Identifiers
• Keep Maximum Cardinality of Relationships
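One way to picture the constructs kept above is as a small set of plain data types. The following is only a minimal sketch under stated assumptions: the class and field names, and the Student/Course example, are hypothetical and do not come from the talk. It records entities, attributes, identifiers, relationships, and maximum cardinality, and deliberately has no place for tables, columns, foreign keys, or minimum cardinality.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class MaxCardinality(Enum):
    """Maximum cardinality only; minimum cardinality is intentionally not modeled."""
    ONE = "one"
    MANY = "many"


@dataclass
class Entity:
    name: str  # user vocabulary, not a table name
    attributes: List[str] = field(default_factory=list)
    identifier: List[str] = field(default_factory=list)  # how users tell instances apart


@dataclass
class Relationship:
    name: str
    source: str  # name of one participating entity
    target: str  # name of the other participating entity
    max_targets_per_source: MaxCardinality
    max_sources_per_target: MaxCardinality

    def is_many_to_many(self) -> bool:
        # Many-many relationships are allowed at the conceptual level.
        return (self.max_targets_per_source is MaxCardinality.MANY
                and self.max_sources_per_target is MaxCardinality.MANY)


# Hypothetical example domain, used only for illustration.
student = Entity("Student", ["name", "student number"], ["student number"])
course = Entity("Course", ["title", "course code"], ["course code"])
enrolls = Relationship("enrolls in", "Student", "Course",
                       MaxCardinality.MANY, MaxCardinality.MANY)
```

Nothing in the sketch commits to a relational, Big Table, or any other logical meta-model, which is the point of keeping the conceptual level separate.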
19. Keep Entities
• Minimum Expressiveness
• Entities, Not Tables
  – Don’t express Horizontal or Vertical Partitioning for performance
    • But yes if motivated by privacy/security/risk
• Entity names, not table names
  – Honor user vocabulary, not IT naming standards
20. Keep Attributes
• Honor User Phenomenon
  – Attributes are part of user discourse
• Attributes, not columns
  – Worry about scale (nominal, numeric, ordinal, Boolean, cyclic), not data type
  – Attribute names, not column names
• Support in-progress models
  – During which attributes can become entities
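The "scale, not data type" point can be sketched in a few lines. This is an illustration under assumptions, not something shown in the talk: the `Scale` and `Attribute` names are hypothetical, and the choice of which scales count as orderable is a simplifying assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Scale(Enum):
    """The scales named on the slide; no storage data types appear here."""
    NOMINAL = "nominal"
    NUMERIC = "numeric"
    ORDINAL = "ordinal"
    BOOLEAN = "boolean"
    CYCLIC = "cyclic"


# Assumption for this sketch: only numeric and ordinal values have a
# meaningful "less than". Cyclic values such as day of week wrap around.
ORDERABLE = {Scale.NUMERIC, Scale.ORDINAL}


@dataclass(frozen=True)
class Attribute:
    name: str  # the users' name for it, not a column name
    scale: Scale

    def supports_ordering(self) -> bool:
        return self.scale in ORDERABLE


# Hypothetical attributes, for illustration only.
shirt_size = Attribute("shirt size", Scale.ORDINAL)
day_of_week = Attribute("day of week", Scale.CYCLIC)
eye_color = Attribute("eye color", Scale.NOMINAL)
```

Whether "shirt size" ends up as an integer, a string, or an enumerated type is a later, logical-level decision; the conceptual model only records that its values have an order.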
21. Keep Relationships
• Minimum Expressiveness
  – Attributes are part of user discourse
• Allow many-many and collection entities
  – If the latter seem strange, you’ve been in IT too long
• Relationships, not FKs
22. Keep Identifiers
• Identifiers, not PKs
  – IDs are not motivated by computerization, but by typography
  – IDs predate the information revolution
    • and the automotive revolution, for that matter
• Support in-process modeling
  – IDs help the modeler ferret out the homonym problem
23. Discard Some
• Discard Foreign Keys
  – They’re relational
• Discard Minimum Cardinality
  – A function of process or policy, not data
  – Over-reported by users
• Discard Most Constraints
  – A function of process or policy, not data
  – Are over-reported by users
24. Keep/Discard Rule of Thumb
• Keep
  – Anything that helps you and the users together discover and name the user categories
• Discard
  – Anything else
26. Stress Some
• Stress Consistency Requirements
  – Relational modelers (of non-distributed databases) have not been asking about these.
• Stress Data Volume / Velocity Requirements
  – Can lead or force you to relax application-data independence
27. Change Some
• Change your process
  – From math-y normalization to English-y conversation with users
  – Very difficult to achieve rigor conversationally
• More help:
  – Mastering Data Modeling: A User-Driven Approach by Carlis & Maguire
  – DataStax Webinar: 25 June
28. A New Modeling Framework
• Conceptual Modeling
• Choosing a Logical Meta-Model
• Logical Modeling
• Physical Modeling
• Tool Support?
29. Choosing a Logical Meta-Model
• Don’t Assume Relational (Duh...)
• Don’t Assume Big Table
• Lots of Choices
  – Relational
  – Big Table
  – XML/Document Database
  – Graph database
  – Array database
  – ...
30. A New Modeling Framework
• Conceptual Modeling
• Choosing a Logical Meta-Model
• Logical Modeling
• Physical Modeling
• Tool Support?
31. Logical, Physical, and Tool Support
• Community needs to develop a roster of shapes
  – And the attendant transformations from conceptual shapes to Big-Table shapes
• During Logical Big-Table modeling, process requirements will infiltrate
  – including things like minimum cardinality
• Minimal support from modeling tools
  – Because few tools support conceptual modeling
  – Because vendors have not caught up to NoSQL yet
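The talk does not list any specific shapes or transformations. As a hedged illustration of what one entry in such a roster might look like, the sketch below turns a conceptual one-to-many relationship into a single Big-Table-style wide row, rendered as CQL text. The function name, the table and column names, and the use of `text` as a placeholder type for every column are all assumptions made for this example.

```python
from typing import List


def one_to_many_as_wide_row(table: str, parent_id: str, child_id: str,
                            child_attributes: List[str]) -> str:
    """Render CQL for one hypothetical shape transformation.

    The parent's identifier becomes the partition key and the child's
    identifier becomes a clustering column, so all children of one parent
    live in the same partition. Every column is typed as `text` because the
    conceptual model carries scales, not data types.
    """
    columns = [parent_id, child_id] + list(child_attributes)
    column_defs = ",\n".join(f"    {c} text" for c in columns)
    return (
        f"CREATE TABLE {table} (\n"
        f"{column_defs},\n"
        f"    PRIMARY KEY (({parent_id}), {child_id})\n"
        f");"
    )


# Hypothetical "Customer places Order" relationship.
ddl = one_to_many_as_wide_row(
    "orders_by_customer", "customer_id", "order_id", ["order_date", "total"]
)
print(ddl)
```

Choosing which entity supplies the partition key already depends on how the data will be read, which is one concrete way process requirements infiltrate logical Big-Table modeling.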
32. Agenda
• History
• Current Events
• Your Future as a Data Modeler
• Summary
• Q&A
33. Summary
• Re-commit to conceptual modeling for requirements analysis
  – Some but not all relational-modeling skills will apply
  – Must learn to focus on user communication, not nerdy stuff like intermediate normal forms
34. Summary
• Remember the fundamentals, so that you can make informed decisions about relaxing them
  – Application-data independence
  – Consistency level as a user requirement
  – Declarative data retrieval (from information hiding)
• Additional benefits
  – Users will like you better
  – Agile developers will like you better
  – This framework works in traditional, all-SQL environments