Increasingly, the size, variety, update rate and complexity of location-based
datasets exceed the capacity of common spatial computing platforms to manage,
process, and analyze the data with reasonable effort. Such data is known as
Spatial Big Data (SBD). Examples include cell-phone trajectories, location-based
service requests, social media check-ins, sensor-measurements, temporally-detailed
SBD has transformative potential. For example, a 2011 McKinsey Global
Institute report estimates savings of about $600 billion annually by 2020 in
terms of fuel and time saved by helping vehicles avoid congestion and idling.
Geo-social media is leveraged for timely detection of tornadoes and outbreaks.
Scientists are investigating SBD for spatio-temporal hypothesis generation
as well as for complex questions on which progress has been hampered by data paucity.
SBD challenges, opportunities and debates arise at the level of platforms,
analytics and scientific methods.
Platforms (e.g., Hadoop, SQL/OGIS) are challenged by iterative and interdependent
spatial algorithms as well as increasing variety (e.g., Lagrangian frame of
reference). Opportunities include both adaptation to current platforms
(e.g., non-iterative algorithms, Mahout) and exploration of alternative platforms.
Analytics methods are challenged by spatial auto-correlation, geographic
heterogeneity and the need to reduce user burden by estimating neighborhood relationships.
A current debate juxtaposes the rise of both simpler models (e.g., data as the model)
and more complex models (e.g., ensembles).
Scientific method debates include data quality measures (e.g., bias vs. timeliness),
the impact of corporate ownership of SBD on transparency and reproducibility,
and the effect of data-intensive science (fourth paradigm) on classical methods
(e.g., theory-based, hypothesis testing).
Keywords: Spatial, Spatio-temporal, Big Data, Data Analytics.
Some of the ideas discussed in this talk appeared in the following publications:
Enabling Spatial Big Data via CyberGIS: Challenges and Opportunities,
under review as a book chapter in
CyberGIS: Fostering a New Wave of Geospatial Innovation and Discovery
(Ed. S. Wang and M. Goodchild), Springer, 2014 (expected).
Spatial Big Data: Platforms, Analytics and Science,
under review for
GeoJournal Special Issue on Big Data (planned).
Spatial Big Data: Case Studies on Volume, Velocity, and Variety,
Big Data: Techniques and Technologies in Geoinformatics (Ed. H. Karimi),
ISBN 978-1-46-658651-2, CRC Press, 2014.
Spatial Big-Data Challenges Intersecting Mobility and Cloud Computing,
11th International ACM SIGMOD Workshop on Data Engineering for Wireless and Mobile Access, 2012.
(A summary appeared in
NSF Workshop on Social Networks and Mobility in the Cloud, 2012.)
This talk has been presented at the following forums:
- August 2013:
Aalto University, Helsinki, Finland,
Department of Surveying and Planning,
Course on Spatial Data Mining
- July 2013:
Korea Advanced Institute of Science and Technology (KAIST),
Computer Science Department,
Global Lecture on Spatial Data Analytics:
kaist link,
google link.
- May 2013: Arizona State University,
School of Computing, Informatics and Decision Systems Eng.
Distinguished Colloquium on Spatial Big Data
- May 2013: Ohio State University,
Institute of Population Research (IPR)
Workshop on Big Data for Social Sciences
- Mar. 2013:
NSF Workshops on Big Data and Urban Informatics:
e-Infrastructure for Social Science Research on Sustainable Urban Systems, Chicago.
- Nov. 2012:
1st ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data (BigSpatial-2012)
- June 2012:
ARO Workshop on Big Data at Large: Applications and Algorithms
- May 2012:
NSF/SDSC Workshop on Big Data Benchmarking, San Jose, CA.
- Feb. 2012:
NSF Workshop on Social Networks and Mobility in the Cloud