Rafael Gimenez – Scaling up in a world of geolocated data
While the implementation of analytic operations on distributed computing frameworks has been widely described, equipping the computational core of a Big Data system with support for geospatial querying remains a challenging issue. This session targets that specific aspect by reviewing how researchers at BDigital Technology Centre have designed and implemented a stack for advanced Machine Learning on Urban Data, providing a way to geoquery massive amounts of HDFS data from Spark processes without hindering overall system performance.

The geospatial dimension of data is emerging as the most natural, powerful and intuitive way to explore the expanding world of data and services. The ability to rely on real-world axes such as places, people, events and things can provide better answers to everyday tasks for individuals, as well as a deeper understanding for businesses and administrations.

The research efforts of the Urban Data Analytics team at BDigital are focused on that scenario, with an offering built upon the ability to rapidly deploy pre-defined (but also arbitrary) analytic functions on geospatial time series. Currently available developments already support characterization, classification, clustering, anomaly detection and trajectory mining, while multivariate analytics and predictive functions (on both single and combined time series) are targeted for the near future.

To enable such analytic operations on geospatial data, the underlying computational infrastructure must give the distributed computational processes a tool that supports high-speed, highly dynamic geoquerying operations on massive amounts of data. The combination of end-to-end geoquerying components and GIS enhancements for HDFS data has been implemented and tested by BDigital as the most promising solution for these requirements.
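
As a rough illustration of what a geoquerying operation involves, the sketch below buckets point records into a fixed latitude/longitude grid and answers a bounding-box query by scanning only the cells that overlap the box. This is not BDigital's implementation, which the abstract does not detail; the grid index is a generic technique chosen for illustration, and every name here (`CELL_DEG`, `cell_of`, `build_index`, `bbox_query`, the record fields) is hypothetical.

```python
from collections import defaultdict
from math import floor

# Hypothetical grid cell size in degrees (an assumption, not from the talk).
CELL_DEG = 0.01


def cell_of(lat, lon, size=CELL_DEG):
    """Map a coordinate to the (row, col) of the grid cell containing it."""
    return (floor(lat / size), floor(lon / size))


def build_index(records, size=CELL_DEG):
    """Group records by grid cell so a query can skip unrelated cells."""
    index = defaultdict(list)
    for rec in records:
        index[cell_of(rec["lat"], rec["lon"], size)].append(rec)
    return index


def bbox_query(index, min_lat, min_lon, max_lat, max_lon, size=CELL_DEG):
    """Return records inside the bounding box (edges inclusive)."""
    lo_row, lo_col = cell_of(min_lat, min_lon, size)
    hi_row, hi_col = cell_of(max_lat, max_lon, size)
    hits = []
    for row in range(lo_row, hi_row + 1):
        for col in range(lo_col, hi_col + 1):
            # Cells on the border of the box may hold points outside it,
            # so each candidate is still checked exactly.
            for rec in index.get((row, col), ()):
                if (min_lat <= rec["lat"] <= max_lat
                        and min_lon <= rec["lon"] <= max_lon):
                    hits.append(rec)
    return hits


if __name__ == "__main__":
    # Made-up sample points, for illustration only.
    sample = [
        {"id": "a", "lat": 41.3874, "lon": 2.1686},
        {"id": "b", "lat": 41.4036, "lon": 2.1744},
        {"id": "c", "lat": 40.4168, "lon": -3.7038},
    ]
    idx = build_index(sample)
    print([r["id"] for r in bbox_query(idx, 41.38, 2.16, 41.39, 2.17)])
```

In a distributed setting the cell key could plausibly serve as a partitioning key, so that each worker only reads the cells relevant to a query; that mapping is an assumption about how such an index might be used, not something stated in the abstract.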