BIG DATA ANALYTICS ON
THE INTERNET
Dr. Shaozhong SHI
drshishaozhong@gmail.com
Drawing data from geographically
dispersed data stores over the Internet
 A showcase of remote, international access to and
use of open data and applications over the Internet is
presented.
 It shows how automation in big data analytics can be
achieved on the Internet.
 It shows the importance of standardisation and
accessibility of data.
 It illustrates, with a live example, how Open Source
tools can be utilised for advancing big data analytics.
Drawing data from geographically
dispersed data stores over the Internet
 It shows the design of a new application that uses
Open Source tools such as Pandas, NumPy and
Matplotlib.
 It explains how full automation in sourcing and
processing data and generating analytical output can
be achieved.
 It shows the importance of the standardisation of data
and the role of geographical identifiers in automated
data processing.
Some key solutions for working
across multiple Pandas dataframes
(tables)
 This presentation covers some key points which
are important to data linkage, data integration,
working across multiple Pandas dataframes
(tables), and automation in processing.
 These are key solutions for automated exact
processing of records.
 The showcase implementation is provided in an
IPython notebook at the link below:
 http://dev.mapofagriculture.com:9999/ipython/notebooks/sshaozhong/2016-05-16_Automatic_Aggregation_Disaggregation_Showcase.ipynb
Original Online Data from USGS
 The original data used is a large, well-structured
Excel sheet at the following USGS website:
 http://water.usgs.gov/pubs/sir/2006/5012/excel/Nutrient_Inputs_1982-2001jan06.xls
 It is used as the input to the program. It is geo-indexed
with Federal Information Processing
Standards (FIPS) codes.
 The data are read into the newly developed program
and stored as a Pandas dataframe table.
 A subset of the data was extracted to create a
Pandas dataframe table that serves as the input table.
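The sourcing step described above can be sketched as follows. This is a minimal sketch, not the notebook's code: the helper names `load_sheet` and `extract_subset`, the column names and the sample values are assumptions made for illustration, and the USGS URL is the one quoted on the slide and may no longer resolve.

```python
import pandas as pd

# URL quoted on the slide; it may no longer be reachable.
USGS_URL = ("http://water.usgs.gov/pubs/sir/2006/5012/excel/"
            "Nutrient_Inputs_1982-2001jan06.xls")


def load_sheet(source=USGS_URL, sheet_name=0):
    """Read one worksheet into a DataFrame, keeping every cell as text.

    dtype=str preserves leading zeros in FIPS codes. Reading a legacy
    .xls file also needs an Excel engine such as xlrd to be installed.
    """
    return pd.read_excel(source, sheet_name=sheet_name, dtype=str)


def extract_subset(table, id_columns, value_columns):
    """Keep the identifier columns plus the chosen value columns as numbers."""
    subset = table[list(id_columns) + list(value_columns)].copy()
    for column in value_columns:
        subset[column] = pd.to_numeric(subset[column], errors="coerce")
    return subset


# Tiny in-memory stand-in for the sheet (hypothetical columns and values).
sample = pd.DataFrame({
    "FIPS": ["01001", "01003"],
    "farmfert87": ["100", "250"],
    "other": ["x", "y"],
})
subset = extract_subset(sample, ["FIPS"], ["farmfert87"])
```

Calling `load_sheet()` with no arguments would attempt the download; passing a local file path instead works the same way.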
Original data:
Nitrogen Input from Fertilizer Use (kilograms)
in each year between 1987 and 2001
A subset of a large spreadsheet
Characterisation of the new algorithm for spatial
statistical aggregation and disaggregation
 The primary questions that this work set out to answer are
whether automated means can be designed and developed
for data integration and integrated processing of
agricultural census datasets,
 and whether values can be automatically aggregated by
state and state-level values dis-aggregated into county-level
values.
 To this end, exploratory design, development and testing
were carried out. An integrated set of algorithms was
researched, designed, implemented and tested on the Map
of Agriculture platform.
 The integrated algorithms are collectively called Data
Linkage for Data Integration and Automated Aggregation
and Dis-aggregation.
Characterisation of the new algorithm for spatial
statistical aggregation and disaggregation
 The Automated Aggregation and Dis-aggregation is a
prototype program that was developed in order to
enable rapid development of data integration and
integrated processing with Open Source Python tools
and libraries.
 The Automated Aggregation and Dis-aggregation uses
Python with the Pandas and NumPy libraries.
 It is characterised by efficient, exact data integration, data
inflow to and outflow from Pandas dataframe tables, and
integrated processing.
Characterisation of the new algorithm for spatial
statistical aggregation and disaggregation
 The two sets of algorithmic solutions implemented are
automatic online sourcing of structured data and Automated
Aggregation and Dis-aggregation itself.
 The first accesses, reads in and takes a set of data.
 The second carries out integrated processing for
aggregating county level statistics into state level
statistics and dis-aggregating state level statistics into
county level statistics by rule.
 Automated aggregation uses addition and summation.
 A loop sums farm and non-farm statistics at
county level for each year from 1987 to 2001.
 Aggregated state level statistics are produced by using
the State FIPS codes as the key.
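The two aggregation steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the column names (`farmYY`, `nonfarmYY`, `totalYY`) and all values are invented for illustration, and only two years are shown where the showcase covers 1987 to 2001.

```python
import pandas as pd

# Invented county rows; the column naming scheme is hypothetical.
counties = pd.DataFrame({
    "StateFIPS": ["01", "01", "04"],
    "CountyFIPS": ["001", "003", "001"],
    "farm87": [100.0, 200.0, 50.0],
    "nonfarm87": [10.0, 20.0, 5.0],
    "farm88": [110.0, 210.0, 60.0],
    "nonfarm88": [11.0, 21.0, 6.0],
})

years = ["87", "88"]  # shortened; the showcase runs from 1987 to 2001

# Step 1: add farm and non-farm inputs for every county, one year at a time.
for year in years:
    counties["total" + year] = counties["farm" + year] + counties["nonfarm" + year]

# Step 2: sum the county totals within each state, keyed on the State FIPS code.
total_columns = ["total" + year for year in years]
states = counties.groupby("StateFIPS")[total_columns].sum()
```

The resulting `states` table is indexed by `StateFIPS`, so a state's total can be retrieved directly by its code, for example `states.loc["01", "total87"]`.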
Working of the processing
Aggregation: input
Working of the processing
Aggregation: output of adding farm and non-farm statistics, repeated for all years
Working of the processing
Aggregation: output of applying groupby with StateFIPS as the key
Characterisation of the new algorithm for spatial
statistical aggregation and disaggregation
 The output of automatic aggregation is a Pandas dataframe
table which is indexed with the State FIPS codes.
 Dis-aggregation:
 The showcase uses a rule that assumes county level
statistics contribute to state level statistics in proportion
to each county's share of the area within the state.
 The total land areas of the states are collected from the
aggregated output table through vLookup and
stored in a Python dictionary as a geo-referenced
dataset.
 These totals are mapped exactly into the right positions in a new
column in the intermediary table for producing
dis-aggregated statistics.
 Output of dis-aggregating statistics on Nitrogen Input
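The dictionary look-up described on this slide can be sketched as follows. This is a minimal sketch, assuming a state table indexed by State FIPS and a county table that carries a `StateFIPS` column; the table layout, column names and values are invented for illustration.

```python
import pandas as pd

# Aggregated state table, indexed by State FIPS (invented values).
states = pd.DataFrame(
    {"area": [300.0, 500.0]},
    index=pd.Index(["01", "04"], name="StateFIPS"),
)

# County-level intermediary table; the row order is deliberately mixed.
counties = pd.DataFrame({
    "StateFIPS": ["04", "01", "01"],
    "CountyFIPS": ["001", "001", "003"],
    "area": [500.0, 100.0, 200.0],
})

# Store the state totals in a plain Python dictionary keyed by State FIPS.
state_area = states["area"].to_dict()

# Look each county's state up in the dictionary and write the result
# into a new column, so every row receives its own state's total.
counties["state_area"] = counties["StateFIPS"].map(state_area)
```

Because the look-up is keyed on the FIPS code rather than on row position, the result is the same whatever order the county rows arrive in.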
Characterisation of the new
algorithm
 Then the ratio between each county and its state is
calculated.
 A loop is used to calculate dis-aggregated statistics for all
counties for each year from 1987 to 2001.
 This results in a dis-aggregated Pandas dataframe table.
Characterisation of the new algorithm for spatial
statistical aggregation and disaggregation
 New approach to dis-aggregating tabular statistics into
smaller geographical units (no intersection of geometric
objects is required):
 The ratio between each county and its state is
calculated. A loop is used to calculate dis-aggregated
nitrogen input statistics for all counties for each year
between 1987 and 2001. The state total multiplied by the
ratio yields the dis-aggregated value for the county. This
logic of dis-aggregation has been used in areal
interpolation as a technique for spatial disaggregation
(Flowerdew and Green, 1992, 1994; Goodchild, Anselin
and Deichmann, 1993). This results in a dis-aggregated
Pandas dataframe table.
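The rule "state total times ratio" can be sketched as follows. This is a minimal sketch, assuming the ratio is the county's area divided by its state's area; the column names (`area`, `ratio`, `N1987`, `N1988`) and all values are invented for illustration, and only two years are shown.

```python
import pandas as pd

# Invented county areas.
counties = pd.DataFrame({
    "StateFIPS": ["01", "01", "04"],
    "CountyFIPS": ["001", "003", "001"],
    "area": [100.0, 300.0, 500.0],
})

# Invented state-level totals, one column per year, indexed by State FIPS.
state_totals = pd.DataFrame(
    {"N1987": [400.0, 50.0], "N1988": [800.0, 60.0]},
    index=pd.Index(["01", "04"], name="StateFIPS"),
)

# Ratio between each county and its state (here: share of the state's area).
state_area = counties.groupby("StateFIPS")["area"].transform("sum")
counties["ratio"] = counties["area"] / state_area

# Loop over the years: county value = state total * county ratio.
for column in state_totals.columns:
    lookup = state_totals[column].to_dict()
    counties[column] = counties["StateFIPS"].map(lookup) * counties["ratio"]
```

A useful check on any such table is that the county values of a state add back up to the state total, which holds here because the ratios within each state sum to one.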
Characterisation of the new algorithm for spatial
statistical aggregation and disaggregation
 Hitherto, areal interpolation and dasymetric mapping
(Flowerdew and Green, 1992, 1994; Goodchild, Anselin
and Deichmann, 1993) have been the only known approaches
for spatially dis-aggregating statistics that are
relevant to the current work, particularly regarding
the processing of tabular statistics in vector GIS
datasets. The current work uses the logic of areal
interpolation, as far as the datasets involved
currently allow. The difference between the current
implementation of calculations and areal interpolation
is that the current implementation does not involve
intersection of area features/polygons.
Characterisation of the new algorithm for
spatial statistical aggregation and
disaggregation
 There is a degree of uncertainty related to the
estimates. Improvement in estimation requires further
research in the future. Nevertheless, it is a step
forward in enabling estimation given the situation
where no data are collected at county level. It offers a
means to provide a quantitative indication. It is
particularly useful to the processing of tabular
statistics or when patterns need to be visualised at
large scales.
Characterisation of the new algorithm for
spatial statistical aggregation and
disaggregation
 The algorithmic solutions are characterised by their
capabilities to track the geo-referenced data entries
throughout cycles of processing, and exact geo-
referenced data retrieval and mapping, namely data
inflow and outflow from Pandas dataframe tables.
 The dis-aggregation algorithm/procedure can be used
for direct processing of tabular statistics without
involving intersection of polygons, particularly in
situations when neatly nested geospatial boundary
files of US states and counties are used.
Characterisation of the new algorithm for
spatial statistical aggregation and
disaggregation
 The new algorithm can carry out automatic online
sourcing of datasets and integrated processing with
Open Source Python libraries. The new algorithm
can be further extended for linking geodata from
various sources, and for creation of indexed tabular
datasets with geographical identifiers.
 It can carry out automatic aggregation and dis-
aggregation of agricultural census datasets for all
states and counties in the USA.
Characterisation of the new algorithm for
spatial statistical aggregation and
disaggregation
 The Federal Information Processing Standards (FIPS)
codes were used as geographical identifiers for geo-referenced
data entries. They play a critical role in retrieving
data from databases and mapping data into the right
positions. They also enable efficient vLookup
solutions for retrieving data and mapping them to exact
positions in tables as desired.
 Geographical identifiers serve as the key and are critically
important in linking data between tables and creating geo-
indexed tabular datasets. Geographical identifiers track
attribute data entries in reference to geospatial objects.
 This vLookup solution can be modified and used for other
geodata projects.
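Linking two tables on a geographical identifier can be sketched as follows. This is a minimal sketch, not the showcase's code: the helper `fips_key`, the table layouts and the nitrogen values are assumptions made for illustration. It relies on the standard layout of county FIPS codes, a two-digit state code followed by a three-digit county code.

```python
import pandas as pd


def fips_key(state_code, county_code):
    """Build a five-character county key: 2-digit state + 3-digit county."""
    return str(int(state_code)).zfill(2) + str(int(county_code)).zfill(3)


# Two tables that store the same codes in different forms (invented values).
nitrogen = pd.DataFrame({
    "state": [1, 1, 4],            # integers, leading zeros lost
    "county": [1, 3, 1],
    "n_input": [100.0, 300.0, 50.0],
})
names = pd.DataFrame({
    "state": ["04", "01", "01"],   # zero-padded text, different row order
    "county": ["001", "003", "001"],
    "name": ["Apache", "Baldwin", "Autauga"],
})

# Give both tables the same normalised key column.
for table in (nitrogen, names):
    table["FIPS"] = [fips_key(s, c) for s, c in zip(table["state"], table["county"])]

# Link on the key; validate raises an error if a key is duplicated on either side.
linked = nitrogen.merge(names[["FIPS", "name"]], on="FIPS",
                        how="left", validate="one_to_one")
```

Normalising the key before the merge is the design choice that matters here: once both tables agree on the text form of the code, row order and original data type no longer affect the result.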
Output
 The output of the program includes an
aggregated statistical table by states and a
dis-aggregated table by counties.
Dis-aggregating wheat statistics into
all counties
 The data columns StateFIPS, State
Abbreviation, County name, County FIPS
and ratio are taken from the table of dis-aggregated
nitrogen input to form a new
Pandas DataFrame table.
 Data on wheat extracted from
QuickStats are used. These data are state
level statistics. The data are dis-aggregated
into all counties.
Dis-aggregating wheat statistics into
all counties
 Output of dis-aggregating wheat statistics
Issues encountered
 Data type issues were encountered and resolved.
 A clear understanding of data types and of the methods for
changing and handling them is required.
 After the groupby command is applied to a Pandas
dataframe, the original indexing is no longer meaningful.
The use of FIPS codes ensures that data indexing and
linkage in records are maintained throughout
processing cycles. Mapping geo-referenced data into
exact positions in columns is very important.
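Both issues can be illustrated with a small example. This is a minimal sketch with invented column names and values; it shows one possible way of handling the data types and the groupby index, not necessarily the approach taken in the showcase notebook.

```python
import pandas as pd

counties = pd.DataFrame({
    "StateFIPS": [1, 1, 4],            # read as integers: leading zeros are lost
    "n_input": ["100", "300", "50"],   # read as text: cannot be summed as numbers
})

# Data type fixes: codes become zero-padded text, values become numbers.
counties["StateFIPS"] = counties["StateFIPS"].astype(str).str.zfill(2)
counties["n_input"] = pd.to_numeric(counties["n_input"])

# groupby discards the original 0..n-1 row index. With as_index=False the
# State FIPS code stays as an ordinary column, so it remains available
# as the key for later look-ups and merges.
states = counties.groupby("StateFIPS", as_index=False)["n_input"].sum()
```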
Update Geo-databases and Create
digital models in Geographical Information
Systems to visualise spatial variation
 A standard Geographical Information System has a digital
map associated with a tabular database of records.
 Areal interpolation and dasymetric mapping techniques
have gained popularity for combining tabular records
with area boundary files to create
map models.
 The approach presented in this talk is based on the use
of neatly nested area boundary files in the
administrative hierarchy of areas of the USA.
 No intersection of digital boundaries is needed.
Analytical example: Change over time
Analytical example: Rate of Change
References
 https://www.nass.usda.gov/Quick_Stats/
 https://www.python.org/downloads/
 https://www.scipy.org/scipylib/download.html
 http://matplotlib.org/downloads.html
 https://pypi.python.org/pypi/pylab
 Contact
 4 Haythrop Close, Downhead Park, Milton Keynes,
Buckinghamshire, United Kingdom, MK15 9DD
 Mobile: +44-7909844462
 EMail: drshishaozhong@gmail.com
Weitere ähnliche Inhalte

Was ist angesagt?

2017 GIS in Emergency Management Track: Situational Awareness: Building an O...
2017 GIS in Emergency Management Track:  Situational Awareness: Building an O...2017 GIS in Emergency Management Track:  Situational Awareness: Building an O...
2017 GIS in Emergency Management Track: Situational Awareness: Building an O...GIS in the Rockies
 
Hive Correlation Optimizer
Hive Correlation OptimizerHive Correlation Optimizer
Hive Correlation OptimizerYin Huai
 
Project on nypd accident analysis using hadoop environment
Project on nypd accident analysis using hadoop environmentProject on nypd accident analysis using hadoop environment
Project on nypd accident analysis using hadoop environmentSiddharth Chaudhary
 
Using R to Visualize Spatial Data: R as GIS - Guy Lansley
Using R to Visualize Spatial Data: R as GIS - Guy LansleyUsing R to Visualize Spatial Data: R as GIS - Guy Lansley
Using R to Visualize Spatial Data: R as GIS - Guy LansleyGuy Lansley
 
Geolocation analysis using HiveQL
Geolocation analysis using HiveQLGeolocation analysis using HiveQL
Geolocation analysis using HiveQLPriyanka Kale
 
2017 GIS in Development Track: USGS POD Implementation in USGS Cloud to Suppo...
2017 GIS in Development Track: USGS POD Implementation in USGS Cloud to Suppo...2017 GIS in Development Track: USGS POD Implementation in USGS Cloud to Suppo...
2017 GIS in Development Track: USGS POD Implementation in USGS Cloud to Suppo...GIS in the Rockies
 
Timmons Group ESRI Replication Solutions
Timmons Group ESRI Replication SolutionsTimmons Group ESRI Replication Solutions
Timmons Group ESRI Replication SolutionsTimmons Group
 
Dr Richard Fry - Using R as a GIS
Dr Richard Fry - Using R as a GISDr Richard Fry - Using R as a GIS
Dr Richard Fry - Using R as a GISShaun Lewis
 
Reactive Databases for Big Data applications
Reactive Databases for Big Data applicationsReactive Databases for Big Data applications
Reactive Databases for Big Data applicationsGraph-TA
 
CKANへの空間情報機能拡張実装の試み
CKANへの空間情報機能拡張実装の試みCKANへの空間情報機能拡張実装の試み
CKANへの空間情報機能拡張実装の試みYoichi Kayama
 
congress_project_w205_conference-FINAL
congress_project_w205_conference-FINALcongress_project_w205_conference-FINAL
congress_project_w205_conference-FINALAmir Ziai
 
Merging statistics and geospatial information - demography / commuting / spat...
Merging statistics and geospatial information - demography / commuting / spat...Merging statistics and geospatial information - demography / commuting / spat...
Merging statistics and geospatial information - demography / commuting / spat...Mirosław Migacz
 
Predictive geospatial analytics using principal component regression
Predictive geospatial analytics using principal component regression Predictive geospatial analytics using principal component regression
Predictive geospatial analytics using principal component regression IJECEIAES
 
2004-09-12 Data and Tools for Web-Based Monitoring and Analysis
2004-09-12 Data and Tools for Web-Based Monitoring and Analysis2004-09-12 Data and Tools for Web-Based Monitoring and Analysis
2004-09-12 Data and Tools for Web-Based Monitoring and AnalysisRudolf Husar
 
An Introduction to Mapping, GIS and Spatial Modelling in R (presentation)
An Introduction to Mapping, GIS and Spatial Modelling in R (presentation)An Introduction to Mapping, GIS and Spatial Modelling in R (presentation)
An Introduction to Mapping, GIS and Spatial Modelling in R (presentation)Rich Harris
 
Graphalytics: A big data benchmark for graph processing platforms
Graphalytics: A big data benchmark for graph processing platformsGraphalytics: A big data benchmark for graph processing platforms
Graphalytics: A big data benchmark for graph processing platformsGraph-TA
 
Field Data Collecting, Processing and Sharing: Using web Service Technologies
Field Data Collecting, Processing and Sharing: Using web Service TechnologiesField Data Collecting, Processing and Sharing: Using web Service Technologies
Field Data Collecting, Processing and Sharing: Using web Service TechnologiesNiroshan Sanjaya
 

Was ist angesagt? (19)

2017 GIS in Emergency Management Track: Situational Awareness: Building an O...
2017 GIS in Emergency Management Track:  Situational Awareness: Building an O...2017 GIS in Emergency Management Track:  Situational Awareness: Building an O...
2017 GIS in Emergency Management Track: Situational Awareness: Building an O...
 
Hive Correlation Optimizer
Hive Correlation OptimizerHive Correlation Optimizer
Hive Correlation Optimizer
 
Project on nypd accident analysis using hadoop environment
Project on nypd accident analysis using hadoop environmentProject on nypd accident analysis using hadoop environment
Project on nypd accident analysis using hadoop environment
 
Using R to Visualize Spatial Data: R as GIS - Guy Lansley
Using R to Visualize Spatial Data: R as GIS - Guy LansleyUsing R to Visualize Spatial Data: R as GIS - Guy Lansley
Using R to Visualize Spatial Data: R as GIS - Guy Lansley
 
Geolocation analysis using HiveQL
Geolocation analysis using HiveQLGeolocation analysis using HiveQL
Geolocation analysis using HiveQL
 
2017 GIS in Development Track: USGS POD Implementation in USGS Cloud to Suppo...
2017 GIS in Development Track: USGS POD Implementation in USGS Cloud to Suppo...2017 GIS in Development Track: USGS POD Implementation in USGS Cloud to Suppo...
2017 GIS in Development Track: USGS POD Implementation in USGS Cloud to Suppo...
 
Timmons Group ESRI Replication Solutions
Timmons Group ESRI Replication SolutionsTimmons Group ESRI Replication Solutions
Timmons Group ESRI Replication Solutions
 
Dr Richard Fry - Using R as a GIS
Dr Richard Fry - Using R as a GISDr Richard Fry - Using R as a GIS
Dr Richard Fry - Using R as a GIS
 
TYBSC IT SEM 6 GIS
TYBSC IT SEM 6 GISTYBSC IT SEM 6 GIS
TYBSC IT SEM 6 GIS
 
Reactive Databases for Big Data applications
Reactive Databases for Big Data applicationsReactive Databases for Big Data applications
Reactive Databases for Big Data applications
 
CKANへの空間情報機能拡張実装の試み
CKANへの空間情報機能拡張実装の試みCKANへの空間情報機能拡張実装の試み
CKANへの空間情報機能拡張実装の試み
 
congress_project_w205_conference-FINAL
congress_project_w205_conference-FINALcongress_project_w205_conference-FINAL
congress_project_w205_conference-FINAL
 
Merging statistics and geospatial information - demography / commuting / spat...
Merging statistics and geospatial information - demography / commuting / spat...Merging statistics and geospatial information - demography / commuting / spat...
Merging statistics and geospatial information - demography / commuting / spat...
 
Predictive geospatial analytics using principal component regression
Predictive geospatial analytics using principal component regression Predictive geospatial analytics using principal component regression
Predictive geospatial analytics using principal component regression
 
2004-09-12 Data and Tools for Web-Based Monitoring and Analysis
2004-09-12 Data and Tools for Web-Based Monitoring and Analysis2004-09-12 Data and Tools for Web-Based Monitoring and Analysis
2004-09-12 Data and Tools for Web-Based Monitoring and Analysis
 
An Introduction to Mapping, GIS and Spatial Modelling in R (presentation)
An Introduction to Mapping, GIS and Spatial Modelling in R (presentation)An Introduction to Mapping, GIS and Spatial Modelling in R (presentation)
An Introduction to Mapping, GIS and Spatial Modelling in R (presentation)
 
Maps with leafletR
Maps with leafletRMaps with leafletR
Maps with leafletR
 
Graphalytics: A big data benchmark for graph processing platforms
Graphalytics: A big data benchmark for graph processing platformsGraphalytics: A big data benchmark for graph processing platforms
Graphalytics: A big data benchmark for graph processing platforms
 
Field Data Collecting, Processing and Sharing: Using web Service Technologies
Field Data Collecting, Processing and Sharing: Using web Service TechnologiesField Data Collecting, Processing and Sharing: Using web Service Technologies
Field Data Collecting, Processing and Sharing: Using web Service Technologies
 

Andere mochten auch

Ppt for Application of big data
Ppt for Application of big dataPpt for Application of big data
Ppt for Application of big dataPrashant Sharma
 
Big data ppt
Big data pptBig data ppt
Big data pptYash Raj
 
big data overview ppt
big data overview pptbig data overview ppt
big data overview pptVIKAS KATARE
 
Big Data in Manufacturing Final PPT
Big Data in Manufacturing Final PPTBig Data in Manufacturing Final PPT
Big Data in Manufacturing Final PPTNikhil Atkuri
 
GI2015 programme+proceedings
GI2015 programme+proceedingsGI2015 programme+proceedings
GI2015 programme+proceedingsIGN Vorstand
 
GI2010 symposium-kubicek+stachon+stampach+geryk (visual healthdata)
GI2010 symposium-kubicek+stachon+stampach+geryk (visual healthdata)GI2010 symposium-kubicek+stachon+stampach+geryk (visual healthdata)
GI2010 symposium-kubicek+stachon+stampach+geryk (visual healthdata)IGN Vorstand
 
GI2010 symposium-klosa (explorers pal-amateurvermessungstechnik_osm)
GI2010 symposium-klosa (explorers pal-amateurvermessungstechnik_osm)GI2010 symposium-klosa (explorers pal-amateurvermessungstechnik_osm)
GI2010 symposium-klosa (explorers pal-amateurvermessungstechnik_osm)IGN Vorstand
 
GI2013 ppt iliev_tto_general_eng_final_reduced
GI2013 ppt iliev_tto_general_eng_final_reducedGI2013 ppt iliev_tto_general_eng_final_reduced
GI2013 ppt iliev_tto_general_eng_final_reducedIGN Vorstand
 
GI2012 pekarek+hoffmann-poster inmap
GI2012 pekarek+hoffmann-poster inmapGI2012 pekarek+hoffmann-poster inmap
GI2012 pekarek+hoffmann-poster inmapIGN Vorstand
 
僕が銀座のキャバ嬢と付き合えた方法
僕が銀座のキャバ嬢と付き合えた方法僕が銀座のキャバ嬢と付き合えた方法
僕が銀座のキャバ嬢と付き合えた方法大和 金太郎
 
QM2011_MobileStrategies
QM2011_MobileStrategiesQM2011_MobileStrategies
QM2011_MobileStrategiesHeather Zink
 
Effective planning and delivery of virtual classes meetings
Effective planning and delivery of virtual classes meetingsEffective planning and delivery of virtual classes meetings
Effective planning and delivery of virtual classes meetingsHeather Zink
 
Final bio of aids presentation
Final bio of aids presentationFinal bio of aids presentation
Final bio of aids presentationGaby Rivera
 

Andere mochten auch (20)

Ets train ppt_big_data_basics_v2.0
Ets train ppt_big_data_basics_v2.0Ets train ppt_big_data_basics_v2.0
Ets train ppt_big_data_basics_v2.0
 
Ppt for Application of big data
Ppt for Application of big dataPpt for Application of big data
Ppt for Application of big data
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Big Data ppt
Big Data pptBig Data ppt
Big Data ppt
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
big data overview ppt
big data overview pptbig data overview ppt
big data overview ppt
 
Big data Ppt
Big data PptBig data Ppt
Big data Ppt
 
Big Data in Manufacturing Final PPT
Big Data in Manufacturing Final PPTBig Data in Manufacturing Final PPT
Big Data in Manufacturing Final PPT
 
A Brand New Bag
A Brand New BagA Brand New Bag
A Brand New Bag
 
GI2015 programme+proceedings
GI2015 programme+proceedingsGI2015 programme+proceedings
GI2015 programme+proceedings
 
GI2010 symposium-kubicek+stachon+stampach+geryk (visual healthdata)
GI2010 symposium-kubicek+stachon+stampach+geryk (visual healthdata)GI2010 symposium-kubicek+stachon+stampach+geryk (visual healthdata)
GI2010 symposium-kubicek+stachon+stampach+geryk (visual healthdata)
 
GI2010 symposium-klosa (explorers pal-amateurvermessungstechnik_osm)
GI2010 symposium-klosa (explorers pal-amateurvermessungstechnik_osm)GI2010 symposium-klosa (explorers pal-amateurvermessungstechnik_osm)
GI2010 symposium-klosa (explorers pal-amateurvermessungstechnik_osm)
 
GI2013 ppt iliev_tto_general_eng_final_reduced
GI2013 ppt iliev_tto_general_eng_final_reducedGI2013 ppt iliev_tto_general_eng_final_reduced
GI2013 ppt iliev_tto_general_eng_final_reduced
 
GI2012 pekarek+hoffmann-poster inmap
GI2012 pekarek+hoffmann-poster inmapGI2012 pekarek+hoffmann-poster inmap
GI2012 pekarek+hoffmann-poster inmap
 
僕が銀座のキャバ嬢と付き合えた方法
僕が銀座のキャバ嬢と付き合えた方法僕が銀座のキャバ嬢と付き合えた方法
僕が銀座のキャバ嬢と付き合えた方法
 
QM2011_MobileStrategies
QM2011_MobileStrategiesQM2011_MobileStrategies
QM2011_MobileStrategies
 
Effective planning and delivery of virtual classes meetings
Effective planning and delivery of virtual classes meetingsEffective planning and delivery of virtual classes meetings
Effective planning and delivery of virtual classes meetings
 
Final bio of aids presentation
Final bio of aids presentationFinal bio of aids presentation
Final bio of aids presentation
 

Ähnlich wie GI2016 ppt shi (big data analytics on the internet)

Data Imputation by Soft Computing
Data Imputation by Soft ComputingData Imputation by Soft Computing
Data Imputation by Soft Computingijtsrd
 
Analysis of parking citations mapreduce techniques
Analysis of parking citations   mapreduce techniquesAnalysis of parking citations   mapreduce techniques
Analysis of parking citations mapreduce techniquesSindhujanDhayalan
 
Final Report_798 Project_Nithin_Sharmila
Final Report_798 Project_Nithin_SharmilaFinal Report_798 Project_Nithin_Sharmila
Final Report_798 Project_Nithin_SharmilaNithin Kakkireni
 
Map reduce advantages over parallel databases report
Map reduce advantages over parallel databases reportMap reduce advantages over parallel databases report
Map reduce advantages over parallel databases reportAhmad El Tawil
 
A Study on Data Visualization Techniques of Spatio Temporal Data
A Study on Data Visualization Techniques of Spatio Temporal DataA Study on Data Visualization Techniques of Spatio Temporal Data
A Study on Data Visualization Techniques of Spatio Temporal DataIJMTST Journal
 
Components of gis
Components of gisComponents of gis
Components of gisPramoda Raj
 
The Role of Data Science in Real Estate
The Role of Data Science in Real EstateThe Role of Data Science in Real Estate
The Role of Data Science in Real EstateCARTO
 
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...dbpublications
 
Spatial Data Integrator - Software Presentation and Use Cases
Spatial Data Integrator - Software Presentation and Use CasesSpatial Data Integrator - Software Presentation and Use Cases
Spatial Data Integrator - Software Presentation and Use Casesmathieuraj
 
Association Rule Mining using RHadoop
Association Rule Mining using RHadoopAssociation Rule Mining using RHadoop
Association Rule Mining using RHadoopIRJET Journal
 
Analysis of S2 (Spherical) Geometry Library Algorithm for GIS Geocoding Engin...
Analysis of S2 (Spherical) Geometry Library Algorithm for GIS Geocoding Engin...Analysis of S2 (Spherical) Geometry Library Algorithm for GIS Geocoding Engin...
Analysis of S2 (Spherical) Geometry Library Algorithm for GIS Geocoding Engin...TELKOMNIKA JOURNAL
 
Big Data on Implementation of Many to Many Clustering
Big Data on Implementation of Many to Many ClusteringBig Data on Implementation of Many to Many Clustering
Big Data on Implementation of Many to Many Clusteringpaperpublications3
 
What is GIS (PDF).pdf
What is GIS (PDF).pdfWhat is GIS (PDF).pdf
What is GIS (PDF).pdfKartikBhatt43
 
TYBSC IT PGIS Unit I Chapter I- Introduction to Geographic Information Systems
TYBSC IT PGIS Unit I  Chapter I- Introduction to Geographic Information SystemsTYBSC IT PGIS Unit I  Chapter I- Introduction to Geographic Information Systems
TYBSC IT PGIS Unit I Chapter I- Introduction to Geographic Information SystemsArti Parab Academics
 
SHAHBAZ_TECHNICAL_SEMINAR.docx
SHAHBAZ_TECHNICAL_SEMINAR.docxSHAHBAZ_TECHNICAL_SEMINAR.docx
SHAHBAZ_TECHNICAL_SEMINAR.docxShahbazKhan77289
 
R programming language in spatial analysis
R programming language in spatial analysisR programming language in spatial analysis
R programming language in spatial analysisAbhiram Kanigolla
 
A REVIEW PAPER ON BIG DATA ANALYTICS
A REVIEW PAPER ON BIG DATA ANALYTICSA REVIEW PAPER ON BIG DATA ANALYTICS
A REVIEW PAPER ON BIG DATA ANALYTICSSarah Adams
 
Performance Analysis of Hashing Mathods on the Employment of App
Performance Analysis of Hashing Mathods on the Employment of App Performance Analysis of Hashing Mathods on the Employment of App
Performance Analysis of Hashing Mathods on the Employment of App IJECEIAES
 

Ähnlich wie GI2016 ppt shi (big data analytics on the internet) (20)

Data Imputation by Soft Computing
Data Imputation by Soft ComputingData Imputation by Soft Computing
Data Imputation by Soft Computing
 
survey paper 2
survey paper 2survey paper 2
survey paper 2
 
Analysis of parking citations mapreduce techniques
Analysis of parking citations   mapreduce techniquesAnalysis of parking citations   mapreduce techniques
Analysis of parking citations mapreduce techniques
 
Final Report_798 Project_Nithin_Sharmila
Final Report_798 Project_Nithin_SharmilaFinal Report_798 Project_Nithin_Sharmila
Final Report_798 Project_Nithin_Sharmila
 
Map reduce advantages over parallel databases report
Map reduce advantages over parallel databases reportMap reduce advantages over parallel databases report
Map reduce advantages over parallel databases report
 
A Study on Data Visualization Techniques of Spatio Temporal Data
A Study on Data Visualization Techniques of Spatio Temporal DataA Study on Data Visualization Techniques of Spatio Temporal Data
A Study on Data Visualization Techniques of Spatio Temporal Data
 
Components of gis
Components of gisComponents of gis
Components of gis
 
Data Dimensional Reduction by Order Prediction in Heterogeneous Environment
Data Dimensional Reduction by Order Prediction in Heterogeneous EnvironmentData Dimensional Reduction by Order Prediction in Heterogeneous Environment
Data Dimensional Reduction by Order Prediction in Heterogeneous Environment
 
The Role of Data Science in Real Estate
The Role of Data Science in Real EstateThe Role of Data Science in Real Estate
The Role of Data Science in Real Estate
 
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
GI2016 ppt shi (big data analytics on the internet)

  • 1. BIG DATA ANALYTICS ON THE INTERNET Dr. Shaozhong SHI drshishaozhong@gmail.com
  • 2. Drawing data from geographically dispersed data stores over the Internet
      • A showcase of internationally remote access to and use of open data and applications over the Internet is presented.
      • It shows how automation in big data analytics can be achieved on the Internet.
      • It shows the importance of standardisation and accessibility of data.
      • It illustrates, with a live example, how Open Source tools can be utilised for advancing big data analytics.
  • 3. Drawing data from geographically dispersed data stores over the Internet
      • It shows the design of a new application built with Open Source tools such as Pandas, NumPy and Matplotlib.
      • It explains how full automation in sourcing and processing data and generating analytical output can be achieved.
      • It shows the importance of the standardisation of data and the role of geographical identifiers in automated data processing.
  • 4. Some key solutions for working across multiple Pandas dataframes (tables)
      • This PowerPoint show covers some keys which are important to data linkage, data integration, working across multiple Pandas dataframes (tables), and automation in processing.
      • These are key solutions for automated exact processing of records.
      • The showcase implementation is provided in an IPython notebook at the link below:
      • http://dev.mapofagriculture.com:9999/ipython/notebooks/sshaozhong/2016-05-16_Automatic_Aggregation_Disaggregation_Showcase.ipynb
  • 5. Original Online Data from USGS
      • The original data is a large, well-structured Excel sheet on the following USGS website:
      • http://water.usgs.gov/pubs/sir/2006/5012/excel/Nutrient_Inputs_1982-2001jan06.xls
      • It is used as the input to the program. It is geo-indexed with Federal Information Processing Standards (FIPS) codes.
      • The data is read by the newly developed program and stored as a Pandas dataframe table.
      • A subset of the data was extracted to create a Pandas dataframe table that serves as the input table.
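The subset extraction described on this slide could look roughly like the sketch below. The column names (`farmN1987` and so on) and the small in-memory frame are assumptions made for illustration; the real workbook's headers are not shown in the slides, and in practice the frame would come from `pd.read_excel` on the USGS file.

```python
import pandas as pd

# Stand-in for the frame that pd.read_excel would return for the USGS workbook.
# All column names and values here are hypothetical.
full = pd.DataFrame({
    "StateFIPS": ["01", "06"],
    "CountyFIPS": ["01001", "06001"],
    "farmN1986": [9.0, 4.0],
    "farmN1987": [10.0, 5.0],
    "farmN1988": [11.0, 6.0],
    "other": ["x", "y"],
})

# Keep the geographical identifiers plus the year columns of interest.
id_cols = ["StateFIPS", "CountyFIPS"]
year_cols = [f"farmN{year}" for year in range(1987, 1989)]
subset = full[id_cols + year_cols].copy()
print(subset)
```

Building the column list programmatically keeps the selection consistent if the year range changes.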
  • 6. Original data: nitrogen input from fertilizer use (kilograms) in each year between 1987 and 2001. A subset of a large spreadsheet.
  • 7. Characterisation of the new algorithm for spatial statistical aggregation and disaggregation
      • The primary questions this work set out to answer are whether automated means can be designed and developed for data integration and integrated processing of agricultural census datasets,
      • and whether aggregation by state, and dis-aggregation of state-level values into county-level values, can be automated.
      • To this end, exploratory design, development and testing were carried out. An integrated set of algorithms was researched, designed, implemented and tested on the Map of Agriculture platform.
      • The integrated algorithms are collectively called Data Linkage for Data Integration and Automated Aggregation and Dis-aggregation.
  • 8. Characterisation of the new algorithm for spatial statistical aggregation and disaggregation
      • The Automated Aggregation and Dis-aggregation is a prototype program developed to enable rapid development of data integration and integrated processing with Open Source Python tools and libraries.
      • It uses Python with the Pandas and NumPy libraries.
      • Its characteristics are efficient, exact data integration; data inflow to and outflow from Pandas dataframe tables; and integrated processing.
  • 9. Characterisation of the new algorithm for spatial statistical aggregation and disaggregation
      • The two sets of algorithmic solutions implemented are automatic online sourcing of structured data and the Automated Aggregation and Dis-aggregation itself.
      • The first accesses, reads in and takes a set of data.
      • The second carries out integrated processing: aggregating county-level statistics into state-level statistics, and dis-aggregating state-level statistics into county-level statistics by rule.
      • Automated aggregation uses addition and summing.
      • A loop sums farm and non-farm statistics at county level for each year from 1987 to 2001.
      • Aggregated state-level statistics are produced by using the State FIPS codes as the key.
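The aggregation steps above can be sketched as follows. This is a minimal illustration, not the notebook's code: the column names (`farm_1987`, `nonfarm_1987`, `total_1987`) and the numbers are made up, and only two years are shown instead of 1987 to 2001.

```python
import pandas as pd

years = [1987, 1988]  # the showcase covers 1987 to 2001

# Hypothetical county-level input table, geo-indexed with FIPS codes.
county = pd.DataFrame({
    "StateFIPS": ["01", "01", "06"],
    "CountyFIPS": ["01001", "01003", "06001"],
    "farm_1987": [10.0, 20.0, 5.0],
    "nonfarm_1987": [1.0, 2.0, 0.5],
    "farm_1988": [11.0, 21.0, 6.0],
    "nonfarm_1988": [1.5, 2.5, 1.0],
})

# Step 1: add farm and non-farm inputs for every year.
for year in years:
    county[f"total_{year}"] = county[f"farm_{year}"] + county[f"nonfarm_{year}"]

# Step 2: sum the county totals within each state, keyed by the state FIPS code.
total_cols = [f"total_{year}" for year in years]
state = county.groupby("StateFIPS")[total_cols].sum()
print(state)
```

The result is indexed by `StateFIPS`, which is what allows later steps to look values up by key rather than by row position.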
  • 10. Working of the processing. Aggregation: input
  • 11. Working of the processing. Aggregation: output of adding farm and non-farm statistics, repeated for all years
  • 12. Working of the processing. Aggregation: output of applying groupby with StateFIPS as the key
  • 13. Characterisation of the new algorithm for spatial statistical aggregation and disaggregation
      • The output of automatic aggregation is a Pandas dataframe table which is indexed with the State FIPS codes.
      • Dis-aggregation:
      • The showcase uses a rule which assumes that county-level statistics contribute to state-level statistics proportionally, as determined by the area within the state.
      • The totals of land areas of the states are collected from the aggregated output table through vLookup. They are stored in a Python dictionary as a geo-referenced dataset.
      • These are mapped exactly into the right positions in a new column in the intermediary table for producing dis-aggregated statistics.
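The vLookup step above, where state totals are held in a dictionary and written into a new county-level column, might be sketched like this. The generic `value` column stands in for whatever quantity is being totalled (land area in the slide's description); the names and numbers are assumptions, and `Series.map` is one way to do the lookup, not necessarily the one used in the notebook.

```python
import pandas as pd

# Hypothetical county-level table with made-up values.
county = pd.DataFrame({
    "StateFIPS": ["01", "01", "06"],
    "CountyFIPS": ["01001", "01003", "06001"],
    "value": [11.0, 22.0, 5.5],
})

# Aggregated series indexed by state FIPS code.
state = county.groupby("StateFIPS")["value"].sum()

# Store the state totals in a plain dictionary keyed by FIPS code.
state_total = state.to_dict()

# Look up each county's state total by key, not by row position.
county["state_total"] = county["StateFIPS"].map(state_total)

# A missing key would surface as NaN, so check before continuing.
assert county["state_total"].notna().all()
print(county)
```

Because the lookup goes through the FIPS key, the result does not depend on the order of rows in either table.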
  • 14. Output of dis-aggregating statistics on nitrogen input
  • 15. Characterisation of the new algorithm
      • Then the ratio between each county and its state is calculated.
      • A loop is used to calculate dis-aggregated statistics for all counties for each year from 1987 to 2001.
      • This results in a Pandas dataframe table as a dis-aggregated table.
  • 16. Characterisation of the new algorithm for spatial statistical aggregation and disaggregation
      • New approach to dis-aggregating tabular statistics into smaller geographical units (no intersection of geometric objects is required):
      • The ratio between each county and its state is calculated.
      • A loop is used to calculate dis-aggregated nitrogen input statistics for all counties for each year between 1987 and 2001.
      • The total of a state times the ratio yields a dis-aggregated sum for the county.
      • This logic of dis-aggregation has been used in areal interpolation as a technique for spatial disaggregation (Flowerdew and Green, 1992, 1994; Goodchild, Anselin and Deichmann, 1993).
      • This results in a Pandas dataframe table as a dis-aggregated table.
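The ratio calculation and the "state total times ratio" rule can be sketched as below. The `weight` column is a hypothetical stand-in for whatever quantity determines a county's share of its state (for example area); the column names and figures are illustrative only.

```python
import pandas as pd

years = [1987, 1988]  # the showcase covers 1987 to 2001

county = pd.DataFrame({
    "StateFIPS": ["01", "01", "06"],
    "CountyFIPS": ["01001", "01003", "06001"],
    "weight": [1.0, 3.0, 2.0],  # hypothetical county weight, e.g. area
})
# Hypothetical state-level totals, indexed by state FIPS code.
state = pd.DataFrame(
    {"total_1987": [40.0, 8.0], "total_1988": [44.0, 10.0]},
    index=pd.Index(["01", "06"], name="StateFIPS"),
)

# Ratio of each county to its state: weight / sum of weights in that state.
state_weight = county.groupby("StateFIPS")["weight"].transform("sum")
county["ratio"] = county["weight"] / state_weight

# County estimate = state total x county ratio, repeated for every year.
for year in years:
    state_value = county["StateFIPS"].map(state[f"total_{year}"])
    county[f"est_{year}"] = state_value * county["ratio"]

print(county)
```

Since the ratios within a state sum to one, the county estimates add back up to the state total, which gives a simple consistency check on the output table.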
  • 17. Characterisation of the new algorithm for spatial statistical aggregation and disaggregation
      • Hitherto, areal interpolation and dasymetric mapping (Flowerdew and Green, 1992, 1994; Goodchild, Anselin and Deichmann, 1993) are the only known approaches relevant to the current work for spatially dis-aggregating statistics, particularly regarding the processing of tabular statistics in vector GIS datasets.
      • The current work uses the logic of areal interpolation, as far as the datasets involved currently allow.
      • The difference is that the current implementation of the calculations does not involve intersection of area features or polygons.
  • 18. Characterisation of the new algorithm for spatial statistical aggregation and disaggregation
      • There is a degree of uncertainty related to the estimates. Improvement in estimation requires further research.
      • Nevertheless, it is a step forward in enabling estimation where no data are collected at county level. It offers a means to provide a quantitative indication.
      • It is particularly useful for the processing of tabular statistics or when patterns need to be visualised at large scales.
  • 19. Characterisation of the new algorithm for spatial statistical aggregation and disaggregation
      • The algorithmic solutions are characterised by their capability to track geo-referenced data entries throughout cycles of processing, and by exact geo-referenced data retrieval and mapping, namely data inflow to and outflow from Pandas dataframe tables.
      • The dis-aggregation procedure can be used for direct processing of tabular statistics without intersection of polygons, particularly when neatly nested geospatial boundary files of US states and counties are used.
  • 20. Characterisation of the new algorithm for spatial statistical aggregation and disaggregation
      • The new algorithm can carry out automatic online sourcing of datasets and integrated processing with Open Source Python libraries.
      • It can be further extended for linking geodata from various sources, and for creation of indexed tabular datasets with geographical identifiers.
      • It can carry out automatic aggregation and dis-aggregation of agricultural census datasets for all states and counties in the USA.
  • 21. Characterisation of the new algorithm for spatial statistical aggregation and disaggregation
      • The Federal Information Processing Standards (FIPS) codes were used as geographical identifiers for geo-referenced data entries. They play a critical role in retrieving data from databases and mapping data into the right positions. They enable efficient vLookup solutions for retrieving data and mapping it to exact positions in tables as desired.
      • Geographical identifiers serve as the key and are critically important in linking data between tables and creating geo-indexed tabular datasets. Geographical identifiers track attribute data entries in reference to geospatial objects.
      • This vLookup solution can be modified and used for other geodata projects.
  • 22. Output
      • The output of the program includes an aggregated statistical table by state and a dis-aggregated table by county.
  • 23. Dis-aggregating wheat statistics into all counties
      • The data columns StateFIPS, State Abbreviation, County name, county FIPS and ratio are taken from the table of dis-aggregated nitrogen input to form a new Pandas DataFrame table.
      • Data on wheat extracted from QuickStats are used. These data are state-level statistics. The data are dis-aggregated into all counties.
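Reusing the ratio table for a second state-level dataset could be done with a keyed merge, as in this sketch. The wheat figures are invented for illustration and are not QuickStats data; the column names are assumptions, and `merge` is one possible way to link the tables.

```python
import pandas as pd

# Ratio table carried over from the nitrogen dis-aggregation (hypothetical values).
ratios = pd.DataFrame({
    "StateFIPS": ["01", "01", "06"],
    "CountyFIPS": ["01001", "01003", "06001"],
    "ratio": [0.25, 0.75, 1.0],
})
# Hypothetical state-level wheat figures (made-up numbers).
wheat = pd.DataFrame({"StateFIPS": ["01", "06"], "wheat": [200.0, 50.0]})

# Many counties match one state row; validate guards against duplicate state keys.
merged = ratios.merge(wheat, on="StateFIPS", how="left", validate="many_to_one")

# County estimate = state-level wheat figure x county ratio.
merged["wheat_est"] = merged["wheat"] * merged["ratio"]
print(merged)
```

The `validate="many_to_one"` argument makes pandas raise an error if a state code appears more than once in the state-level table, which would otherwise silently duplicate county rows.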
  • 24. Dis-aggregating wheat statistics into all counties
      • Output of dis-aggregating wheat statistics
  • 25. Issues encountered
      • Data type issues were encountered and resolved.
      • A clear understanding of data types, and of the methods for converting and handling them, is required.
      • After the groupby command is applied to a Pandas dataframe, the original indexing is no longer meaningful. The use of FIPS codes ensures that data indexing and record linkage are maintained throughout processing cycles. Mapping geo-referenced data into exact positions in columns is very important.
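The two issues named above can be reproduced on a toy table. This sketch assumes, for illustration, that the state codes were read as integers (losing the leading zero) while a lookup table uses string keys; the slides do not say which type mismatches actually occurred.

```python
import pandas as pd

county = pd.DataFrame({
    "StateFIPS": [1, 1, 6],  # read as integers, so the leading zero is lost
    "value": [11.0, 22.0, 5.5],
})
lookup = {"01": "Alabama", "06": "California"}

# Integer keys do not match string keys: every lookup misses and yields NaN.
missed = county["StateFIPS"].map(lookup)

# Converting to zero-padded strings restores the match.
county["StateFIPS"] = county["StateFIPS"].astype(str).str.zfill(2)
matched = county["StateFIPS"].map(lookup)

# After groupby the row index is the key itself, not the original 0..n-1 index.
by_state = county.groupby("StateFIPS")["value"].sum()

# as_index=False keeps the key as an ordinary column instead.
flat = county.groupby("StateFIPS", as_index=False)["value"].sum()
print(by_state, flat, sep="\n")
```

Either form of the grouped output works, as long as later steps join on the FIPS column or index rather than on row position.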
  • 26. Update geo-databases and create digital models in Geographical Information Systems to visualise spatial variation
      • A standard Geographical Information System has a digital map associated with a tabular database of records.
      • Areal interpolation and dasymetric mapping techniques have gained popularity for combining tabular records with area boundary files to create map models.
      • The approach presented in this talk is based on the use of neatly nested area boundary files in the administrative hierarchy of areas of the USA.
      • No intersection of digital boundaries is needed.
  • 29. References
      • https://www.nass.usda.gov/Quick_Stats/
      • https://www.python.org/downloads/
      • https://www.scipy.org/scipylib/download.html
      • http://matplotlib.org/downloads.html
      • https://pypi.python.org/pypi/pylab
    Contact
      • 4 Haythrop Close, Downhead Park, Milton Keynes, Buckinghamshire, United Kingdom, MK15 9DD
      • Mobile: +44-7909844462
      • Email: drshishaozhong@gmail.com