1. Introduction Spatio Clustering of Suicide Data in Japan Application of RnavGraph Summary and Future Studies
Visualization of high dimensional and large data set
by RnavGraph and its application of suicide data in
Japan
Takafumi Kubota¹, Makoto Tomita², Fumio Ishioka³ and Toshiharu Fujita¹
¹ The Institute of Statistical Mathematics
² Tokyo Medical and Dental University
³ Okayama University
December 26, 2011
Kubota, T., Tomita, M., Ishioka, F. and Fujita, T.
Visualization of high dimensional and large data set by RnavGraph and its application of suicide data in Japan
1. Introduction
    Statistics of Suicide in Japan
    Objective
2. Spatio Clustering of Suicide Data in Japan
    Statistics of Community for the Death from Suicide
    Hierarchical cluster analysis
3. Application of RnavGraph
    Install
    Application of the Suicide data
4. Summary and Future Studies
Statistics of Suicide in Japan
We briefly introduce suicide statistics in Japan from three viewpoints:
When?
Where?
Who?
    Sex
    Age group
Age group is highlighted in red on the slide because it is the focus of
this presentation.
When? (Time series of the number of suicides)
White paper of suicide prevention (2011)
The number of suicides increased rapidly from 1997 to 1998
    Burst of the economic bubble (1990-1992)
    Economic recession (1993-1997)
    → Bankruptcy, corporate downsizing, unemployment, ...
In this study, we use the period 1988-1992, before the rapid increase
For our future studies, we will use other time periods:
→ (1988-1992), 1993-1997, 1998-2002, 2003-2007, ...
    Individually (purely spatial clustering)
    Simultaneously (spatio-temporal clustering)
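The two ways of using the five-year periods can be sketched in base R. This is only an illustration: the `records` data frame, its column names and its values are hypothetical, and the sketch shows how the data could be arranged, not the clustering itself.

```r
# Hypothetical yearly records: one row per zone and year (toy values).
records <- data.frame(
  zone   = rep(c(101, 102), each = 4),
  year   = rep(c(1988, 1992, 1993, 2007), times = 2),
  deaths = c(3, 4, 5, 6, 1, 2, 2, 3)
)

# Assign each year to its five-year period: 1988-1992, 1993-1997, ...
breaks <- seq(1988, 2008, by = 5)
labels <- paste(head(breaks, -1), head(breaks, -1) + 4, sep = "-")
records$period <- cut(records$year, breaks = breaks,
                      labels = labels, right = FALSE)

# Individually: one data set per period, each analysed on its own.
by_period <- split(records, records$period)

# Simultaneously: a zone-by-period table that keeps the time dimension.
zone_by_period <- xtabs(deaths ~ zone + period, data = records)
print(zone_by_period)
```

Splitting first matches the purely spatial setting, while the zone-by-period table is the kind of input a spatio-temporal method would start from.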
Where? (Hotspots and coolspots)
Results of spatial clustering. The color legend is as follows:
Hotspot
    Most likely cluster
    Second most likely cluster
Coolspot
    Most likely cluster
    Second most likely cluster
Otherwise
Kubota et al. (2011)
Hotspots and coolspots for males, 1988-1992
Who? (Number of suicides by sex and age group)
White paper of suicide prevention (2011)
Objective
The bar chart shows that the proportions of suicides differ between
age groups.
→ Our goal is to characterize the age-grouped spatial data on
suicide in Japan.
Analysis Procedure
1. Dendrogram: the result of hierarchical clustering
2. Dynamic tree cut
3. Reasoning for each cluster
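The three steps above can be sketched in base R. This is a minimal sketch under stated assumptions: the `rates` matrix is random stand-in data, Euclidean distance and Ward linkage are assumptions (the slides do not name them), and `cutree()` with a fixed `k` is used as a simple substitute for the dynamic tree cut.

```r
# Hypothetical stand-in data: 354 zones x 4 age-group suicide rates.
set.seed(1)
rates <- matrix(rexp(354 * 4, rate = 1 / 30), ncol = 4,
                dimnames = list(NULL, c("age1", "age2", "age3", "age4")))

# Step 1: hierarchical clustering.
# Euclidean distance and Ward linkage are assumptions, not from the slides.
d  <- dist(rates)
hc <- hclust(d, method = "ward.D2")
# plot(hc, labels = FALSE)  # uncomment to draw the dendrogram

# Step 2: cut the tree into clusters.
# cutree() with a fixed k stands in for the dynamic tree cut here.
group <- cutree(hc, k = 4)

# Step 3: inspect the cluster sizes before reasoning about each cluster.
cluster_sizes <- table(group)
print(cluster_sizes)
```

A fixed `k` keeps the sketch dependency-free; a dynamic cut chooses the branches adaptively from the shape of the dendrogram instead.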
How do we apply RnavGraph to the results of clustering?
By visualizing the clustering results, we look for the features that
areas in the same cluster have in common.
Statistics of Community for the Death from Suicide
Statistics of Community for the Death from Suicide (Fujita, 2009)
was updated from the demographic survey of death of the Ministry of
Health, Labour and Welfare
Population Survey Death Report of the Ministry of Health,
Labour and Welfare
Time: (73-77, 78-82, 83-87,) 88-92, (93-97, 98-02, 03-07, 08-09)
Place: 354 secondary medical care zones
Sex: Male (, Female)
16 age groups
→ 4 age groups (weighted average)
    10-29 (10-14, 15-19, 20-24, 25-29)
    30-49 (30-34, 35-39, 40-44, 45-49)
    50-69 (50-54, 55-59, 60-64, 65-69)
    70+ (70-74, 75-79, 80-84, 85+)
(Ways, Marriage and Job)
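The step from 16 to 4 age groups is described only as a weighted average. One plausible reading is a population-weighted average of the five-year rates. The sketch below assumes this; the choice of weights and all numbers are hypothetical.

```r
# Hypothetical rates (per 100,000) and populations for one zone, for the
# four five-year groups that make up the merged 10-29 group.
rate <- c("10-14" = 1.0, "15-19" = 6.0, "20-24" = 14.0, "25-29" = 18.0)
pop  <- c("10-14" = 9000, "15-19" = 10000, "20-24" = 8000, "25-29" = 7000)

# Population-weighted average rate (population weights are an assumption).
rate_10_29 <- weighted.mean(rate, w = pop)

# The same value obtained by pooling: total deaths over total population.
deaths <- rate * pop / 1e5
pooled <- sum(deaths) / sum(pop) * 1e5

print(c(weighted = rate_10_29, pooled = pooled))
```

With population weights the merged rate equals the pooled rate of the combined age band, which is why this weighting is a natural candidate.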
Hierarchical cluster analysis
Result: males, 1990 (1988-1992)
The result suggests that there are four groups.
Choropleth map
4 clusters cut by dynamicTreeCut (Langfelder et al.)
Install
What is RnavGraph?
RnavGraph provides interactive visualization tools for
exploring high dimensional space through lower dimensional
trajectories, based on the concepts first presented in Hurley
and Oldford (2011).
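To make "lower dimensional trajectories" concrete, the sketch below builds a navigation graph by hand in base R, without RnavGraph. Each node is a pair of variables (a 2D scatterplot), and two nodes are joined when they share exactly one variable, so that moving along an edge is a transition through a 3D space. The variable names are illustrative, and this is my reading of the concept rather than the package's internal code.

```r
# Illustrative variable names for a 4-dimensional data set.
vars  <- c("age1", "age2", "age3", "age4")

# Nodes: all pairs of variables, i.e. all 2D scatterplots.
nodes <- combn(vars, 2, simplify = FALSE)
node_names <- sapply(nodes, paste, collapse = ":")

# shared[i, j] = number of variables that node i and node j have in common.
shared <- sapply(nodes, function(a)
  sapply(nodes, function(b) length(intersect(a, b))))
dimnames(shared) <- list(node_names, node_names)

# 3D transition graph: joined nodes share exactly one variable.
adj3d <- shared == 1
# 4D transition graph: joined nodes share no variable.
adj4d <- shared == 0

n_edges_3d <- sum(adj3d) / 2
n_edges_4d <- sum(adj4d) / 2
print(c(nodes = length(nodes), edges_3d = n_edges_3d, edges_4d = n_edges_4d))
```

For four variables this gives 6 nodes, 12 three-dimensional transitions and 3 four-dimensional transitions, which together cover all 15 possible node pairs.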
Install
Environment
Windows 7 (64 bit)
R 2.14.0 (execute as Administrator(?))
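# CRAN dependencies
install.packages(c("PairViz", "scagnostics",
    "rgl", "grid", "MASS", "RGtk2", "hexbin", "vegan"),
    dependencies = TRUE)
# Bioconductor dependencies (biocLite was the installer for R 2.14;
# current R versions use BiocManager::install() instead)
source("http://www.bioconductor.org/biocLite.R")
biocLite("graph")
biocLite("RBGL")
biocLite("RDRToolbox")
# RnavGraph itself and its example image data
install.packages("RnavGraph")
install.packages("RnavGraphImageData")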
Hello RnavGraph World!
library(RnavGraph)
# Wrap the four iris measurements in an ng_data object
ng.iris <- ng_data(name = "iris", data = iris[,1:4],
    shortnames = c('s.L', 's.W', 'p.L', 'p.W'),
    group = iris$Species,
    labels = substr(iris$Species,1,2))
# Start the interactive navGraph session
navGraph(ng.iris)
Application of the Suicide data
Data
suigm90_int.csv
     secid   age1   age2   age3   age4  group
1      101  11.91  28.50  32.98  50.45      1
2      102   9.70  40.21  46.79  36.73      2
3      103  18.93  27.49  34.52  49.23      1
...    ...    ...    ...    ...    ...    ...
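The layout above can be checked with a small self-contained sketch. It assumes the file is comma-separated with a header row (the slide does not show the delimiter) and uses only the three rows printed above, passed as inline text instead of a file.

```r
# The three rows shown on the slide, as inline CSV text.
# A comma delimiter and a header row are assumptions.
csv_text <- "secid,age1,age2,age3,age4,group
101,11.91,28.50,32.98,50.45,1
102,9.70,40.21,46.79,36.73,2
103,18.93,27.49,34.52,49.23,1"

sui <- read.csv(text = csv_text)

# Columns 2 to 5 hold the rates of the four age groups,
# column 6 holds the cluster label from the tree cut.
rate_cols <- sui[, 2:5]
group     <- factor(sui[, 6])

str(sui)
```

Separating the rate columns from the cluster label in this way mirrors the column indexing `[,2:5]` and `[,6]`.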
Application of Suicide Data
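library(RnavGraph)
# Columns: secid, age1..age4 (rates of the 4 age groups), group (cluster)
sui.m90c <- read.csv("suigm90_int.csv")
ng.suim90c <- ng_data(name = "SuicideMale90",
    data = sui.m90c[,2:5])
# Attach the cluster labels as the group variable
ng_set(ng.suim90c, "group") <- sui.m90c[,6]
navGraph(ng.suim90c)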
Output of navGraph
Application of Suicide Data (scagNav)
# sui.m90c is the data frame read from suigm90_int.csv
ng.sui <- ng_data(name = "suicide",
    data = sui.m90c[,2:5],
    shortnames = c("a1","a2","a3","a4"),
    group = sui.m90c[,6])
# Build navigation graphs from the scagnostics measures
nav.sui <- scagNav(data = ng.sui,
    scags = c("Monotonic", "NotMonotonic", "Clumpy",
        "NotClumpy", "Convex", "NotConvex",
        "Stringy", "NotStringy", "Skinny",
        "NotSkinny", "Outlying", "NotOutlying",
        "Sparse", "NotSparse", "Striated",
        "NotStriated", "Skewed", "NotSkewed"),
    topFrac = 0.2, combineFn = max,
    glyphs = shortnames(ng.sui), sep = ':')
Outputs of scagNav
Reasoning
group 3 (purple): High rate → Large square
group 4 (orange): Low rate → Small square
group 2 (green): High rate of age 1 (10-29) → Long right hand
group 1 (blue): Others
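Statements such as "high rate" or "low rate" for a cluster can be backed by comparing each cluster's mean rates with the overall means. The sketch below shows one way to tabulate this in base R. The data are hypothetical toy values, constructed only to mimic the pattern described above, and are not the study's data.

```r
# Hypothetical toy data in the same layout as the data file
# (four age-group rates plus a cluster label).
sui <- data.frame(
  age1  = c(10, 12, 25, 27, 30, 32, 5, 6),
  age2  = c(30, 28, 31, 29, 50, 52, 15, 14),
  age3  = c(35, 33, 34, 36, 60, 58, 18, 17),
  age4  = c(45, 47, 44, 46, 80, 78, 25, 24),
  group = c(1, 1, 2, 2, 3, 3, 4, 4)
)

# Mean rate of every age group within each cluster.
means <- aggregate(cbind(age1, age2, age3, age4) ~ group,
                   data = sui, FUN = mean)

# Ratio of each cluster mean to the overall mean:
# values above 1 are above average, values below 1 are below average.
overall <- colMeans(sui[, 1:4])
ratio <- sweep(as.matrix(means[, -1]), 2, overall, "/")
rownames(ratio) <- paste0("group", means$group)

print(round(ratio, 2))
```

In this toy table a cluster with all ratios above 1 corresponds to a "high rate" group, and a cluster that is elevated only in `age1` corresponds to a group with a high rate among the youngest ages.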
For our future studies, we will use other time periods:
→ (1988-1992), 1993-1997, 1998-2002, 2003-2007, ...
    Individually (purely spatial clustering)
    Simultaneously (spatio-temporal clustering)
REFERENCES (1)
Fujita, T. (2009). Statistics of Community for the Death from Suicide. National Institute of Mental Health, National Center of Neurology and Psychiatry, Japan.
Hurley, C. and Oldford, R.W. (2011). Graphs as navigational infrastructure for high dimensional data spaces. Computational Statistics, to appear.
Kubota, T., Tomita, M., Ishioka, F. and Fujita, T. (2011). Spatial Autocorrelation Statistics and Spatial Clustering in the Areas in Japan with Low Suicide Rates. Joint2011, pp. ???
Waddell, A. and Oldford, W. (2011). RnavGraph: an R package to visualize high dimensional data using graphs as navigational infrastructure. http://cran.r-project.org/web/packages/RnavGraph/vignettes/RnavGraph.pdf (Dec. 26, 2011)
White paper of suicide prevention (2011). Cabinet Office (in Japanese). http://www8.cao.go.jp/jisatsutaisaku/whitepaper/index-w.html (Dec. 17, 2011)
REFERENCES (2)
Langfelder, P., Zhang, B. and Horvath, S. Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut library for R. http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting/ (Dec. 26, 2011)
Q&A
Thank you very much for
your kind attention.
Takafumi Kubota (The Institute of Statistical Mathematics)
tkubota@ism.ac.jp