SLIDES:
ABSTRACT:
Increasingly, the size, variety, update rate and complexity of location-based
datasets exceed the capacity of common spatial computing platforms to manage,
process, and analyze the data with reasonable effort. Such data is known as
Spatial Big Data (SBD). Examples include cell-phone trajectories, location-based
service requests, social media check-ins, sensor-measurements, temporally-detailed
road-maps, etc.
SBD has transformative potential. For example, a 2011 McKinsey Global
Institute report estimates savings of about $600 billion annually by 2020 in
terms of fuel and time saved by helping vehicles avoid congestion and idling.
Geo-social media is leveraged for timely detection of tornadoes and outbreaks.
Sciences are investigating SBDs for spatio-temporal hypothesis generation
as well as for complex questions, where progress was hampered by data paucity.
SBD challenges, opportunities and debates arise at the level of platforms,
analytics and scientific methods.
Platforms (e.g., Hadoop, SQL/OGIS) are challenged by iterative and interdependent
spatial algorithms as well as increasing variety (e.g., Lagrangian frame of
reference). Opportunities include both adaptation to current platforms
(e.g., non-iterative algorithms, Mahout) and explorations of alternative platforms.
Analytics methods are challenged by spatial auto-correlation, geographic
heterogeneity, and the need to reduce user burden by estimating neighborhood relationships.
A current debate juxtaposes the rise of both simpler models (e.g., data as the model)
and more complex models (e.g., ensembles).
Scientific method debates include data quality measures (e.g., bias vs. timeliness),
impact of corporate ownership of SBD on transparency and reproducibility,
and the effect of data-intensive science (the fourth paradigm) on classical methods
(e.g., theory-based, hypothesis testing).
KEYWORDS:
Spatial, Spatio-temporal, Big Data, Data Analytics.
NOTE 1:
Some of the ideas discussed in this talk appeared in the following publications:
- Sushil K. Prasad et al.,
Parallel Processing over Spatial-Temporal Datasets from Geo, Bio, Climate
and Social Science Communities: A Research Roadmap.
IEEE BigData Congress, 2017, pages 232-250.
-
Agriculture Big Data (AgBD) Challenges and Opportunities From Farm To Table:
A Midwest Big Data Hub Community Whitepaper,
NSF Midwest Big Data Hub, December, 2017.
-
Transdisciplinary Foundations of Geospatial Data Science,
ISPRS International Journal of Geo-Information, 6(12), 2017.
doi:10.3390/ijgi6120395.
(with Y. Xie, E. Eftelioglu, R. Ali, X. Tang, Y. Li, and R. Doshi)
-
Spatiotemporal Data Mining: A Computational Perspective,
ISPRS International Journal of Geo-Information, 4(4):2306-2338, 2015.
doi:10.3390/ijgi4042306.
(with Z. Jiang, R. Ali, E. Eftelioglu, X. Tang, V. Gunturi, and X. Zhou)
-
Identifying patterns in spatial information: a survey of methods,
S. Shekhar, M. R. Evans, J. M. Kang and P. Mohan,
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery,
1(3):193-214, May/June 2011. doi:10.1002/widm.25.
-
M. Evans, D. Oliver, K. Yang, X. Zhou, R. Ali, and S. Shekhar,
Enabling Spatial Big Data via CyberGIS: Challenges and Opportunities,
CyberGIS for Geospatial Discovery and Innovation
(Ed. S. Wang and M. Goodchild), Springer, 2019,
ISBN 978-94-024-1529-2.
-
Spatial Big Data: Platforms, Analytics and Science,
under review for the GeoJournal Special Issue on Big Data (planned).
-
Spatial Big Data: Case Studies on Volume, Velocity, and Variety,
in Big Data: Techniques and Technologies in Geoinformatics (Ed. H. Karimi),
CRC Press, 2014, ISBN 978-1-46-658651-2.
-
Spatiotemporal data mining in the era of big spatial data: Algorithms and applications,
Proceedings ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data,
pages 1-10, November 2012. doi:10.1145/2447481.2447482.
(with R. Vatsavai, A. Ganguly, V. Chandola, A. Stefanidis, S. Klasky)
-
Spatial Big-Data Challenges Intersecting Mobility and Cloud Computing,
11th International ACM SIGMOD Workshop on Data Engineering for Wireless and Mobile Access, 2012.
A summary appeared in
NSF Workshop on Social Networks and Mobility in the Cloud, 2012.
(with Michael R. Evans, Viswanath Gunturi, KwangSoo Yang)
NOTE 2:
This talk has been presented at the following forums:
- August 2013:
Aalto University, Helsinki, Finland,
Department of Surveying and Planning,
Course on Spatial Data Mining
- July 2013:
Korea Advanced Institute of Science and Technology (KAIST),
Computer Science Department,
Global Lecture on Spatial Data Analytics
- May 2013: Arizona State University,
School of Computing, Informatics and Decision Systems Eng.
Distinguished Colloquium on Spatial Big Data
- May 2013: Ohio State University,
Institute of Population Research (IPR)
Workshop on Big Data for Social Sciences,
Columbus, OH.
- Mar. 2013:
NSF Workshops on Big Data and Urban Informatics:
e-Infrastructure for Social Science Research on Sustainable Urban Systems, Chicago.
- Nov. 2012:
1st ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data (BigSpatial-2012).
- June 2012:
ARO Workshop on Big Data at Large: Applications and Algorithms
- May 2012:
NSF/SDSC Workshop on Big Data Benchmarking, San Jose, CA.
- Feb. 2012:
NSF Workshop on Social Networks and Mobility in the Cloud