S1: Spatial Access Methods
Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S2: 4.1 Storage: Disks and Files
  • Secondary Storage - Disks
    • hardware: platters, arms, spindle, cache (?)
    • data layout: sectors, blocks, tracks, cylinders
    • access time: seek , rotational delay, transfer
    • avg. access time is 10-15 msec vs. microsec for memory
    • ... sequential I/O vs. random I/O
  • File (Table) organization
    • Field, Records, Files
    • Blocking factor = number of Records per disk block
    • Blobs for large fields (e.g. polygon, image)
    • May factor out large fields in separate file
    • ... to keep blocking factor high


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S3: Access Methods
  • General Issues
    • 2 types: file-structure, separate index file
    • Access Method Interface (AMI):
    • ... open, close, search, bulk load, insert, delete, update
    • Functionality - partition records into buckets
  • Evaluation is function of queries not data
    • point/ range query, spatial queries (e.g. nearest neighbor)
    • ... a sample of spatial queries on pp. 119
    • unbuffered (cold) vs. buffered (hot) searches
    • Metrics: # of I/Os to answer queries
    • Implementation issues: concurrency & recovery,
    • ... cost estimation model for query processing strategies


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S4: 4.2 Spatial Indexing Overview
  • Clustering
    • Space Filling Curves
  • Access Methods
    • B+ tree with a space filling curve
    • Grid Files
    • R-trees
    • Hundreds of variations
  • Spatial Queries from GIS, VLSI, Data Mining, ...
    • OGIS predicates: inside, Overlap, Contains, Nearest ...
    • Ex. of Point, Range and Nearest neighbor queries
    • ... Get name of a city with given point coordinates
    • ... Get all cities in given area
    • ... Find nearest city to a given point


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S5: Clustering
  • Levels of clustering
    • A. Intra-object
    • B. Within a disk page or a segment
    • C. Across disk pages and or segments
  • Relational solutions for 1-d keys
    • A. OODBMS clusters objects in a part-hierarchy
    • B. Sorting records
    • C. Contiguous allocation by tracks, cylinders
  • New Issues in Spatial Databases
    • B. Sorting -- > space filling curves (e.g. Z, Hilbert)
    • ... captures about half neighborhood relationships
    • ... natural for point data, point queries
    • ... line/polygons -- > MOBR -- > points in 4-d space
    • ... rectangle query -- > a set of range queries
    • ... Nearest neighbor query not explored.


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S6: B+trees Basics
  • Query: point, range queries, 1-dim keys
  • B tree data-structure
    • tree: balanced, high fanout, page utilization (50 percent)
    • Pointer to data only in leaf nodes
    • Compression for keys, pointers
    • B+tree: threaded leaf pages for range queries
  • Concurrency & Recovery issues
    • level: leaf, internal node (sub-tree), root (tree)
    • predicate locking
    • Solution similar to R-tree (Section 4.3.1)
    • high concurrency locking: need not be 2PL


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S7: B+trees with a space filling curve
  • Query: point, range queries, N-dim keys
  • Two component to indexing spatial datasets
    • A space filling curve (e.g. Z) orders spatial keys
    • Keys in B+tree are Z-values for spatial keys
  • Processing Point Query - Fast
    • Compute z-order of query point
    • Search B+tree with z-order value
  • Processing Range Query [Orenstein] - Tedious
    • Compute set of z-order ranges inside query rectangle
    • Each z-order range is a range query on B+tree
  • Processing Nearest Neighbor Query
    • Z-order based may not effective (?)
    • If MOBR(child node) is kept with keys then
    • ... adapt algorithm for R-tree [Rossopolous]


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S8: R-trees
  • Spatial queries (e.g. point/ range queries in N dim. )
  • Data Structure: like B-tree
    • But keys are MOBR(child), one key per child
    • MOBR(child) inside MOBR(parent)
  • Performance study
    • RISC-II VLSI data set (Guttman 84)
    • Sequoia 2000 (Postgres & SHORE) w/ M/2, quadratic split
    • Content based retrieval, data mining
  • Point/Range Search
    • Recurse if overlap( MOBR(child), query point/MOBR)
    • Traverse multiple paths! (DFS)
  • Q? Will the algo. work for all OGIS operators?


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S9: R-trees - Spatial Query Processing
  • OGIS Predicates - inside, touch, overlap, ...
    • Overlap is a nice filter for many topological predicates
    • Eact predicate checked in refine step
  • General strategy for other OGIS predicates
    • child J eliminates child I iff
    • ... some condition C is satisfied
    • Recurse on childs which can not be eliminated
  • Nearest Neighbor Search for point QP
    • C = min-dist(MOBR(I), QP) > max-dist(MOBR(J), QP)
    • min-dist(rectangle R, QP) = distance(QP, closest point in R)
    • min-dist(rectangle R, QP) = distance(QP, farthest point in R)
    • ... consider corner point and sides of rectangle R
  • Q? Will general strategy work for all OGIS predicates?


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S10: Update Processing with R-tree
  • Insertion: Where: ChooseLeaf. Least enlargement.
    • Ties broken by picking smallest key.
    • Node splitting as in B+-tree.
    • Q? what goes in each partition?
    • AdjustTree required. Why not in B+-tree?
  • Deletion: Seek and destroy
    • CondenseTree: throw entries from under-full pages
    • ... back into tree while preserving level.
    • Why different from B+-tree?
  • Update: delete, update, reinsert
  • Split algorithms: exhaustive
    • quadratic: start with most distant seeds,
    • ... greedily add based on maximal preference
    • linear: like quadratic, but picks seeds differently


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S11: R-tree Variants
  • R*-tree - several optimizations
    • Faster split algorithm (n * logn)
    • Reduces sibling overlap, Smaller MOBR for parents
    • Improves page utilization, reduces splits
    • Out-performs R-tree on point/range search.
    • But Concurrency problems (reinsertion of extreme entries).
  • R+-Tree
    • no sibling overlap
    • But replicate items across leafs
    • Faster search but slow insert/delete/update
    • R-trees partition data. R+-trees partition space.
    • ... like K-D-B trees, Quad trees


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S12: Grid Files
  • Spatial queries (e.g. point/ range queries in N dim. )
  • Data Structure: Grid directory
    • N dimension vectors: cross-product is non-uniform grid
    • Grid directory : cell to data page
    • ... many cells may share a data page
    • Grid directory can be quite large!
  • Point Search
    • Binary search of each dimension to identify a cell
    • Fetch grid-cell and then the data-page pointed to
    • 2 I/Os for poit query (very fast)
  • Insert of a point key
    • Binary search of each dimension to identify a cell
    • If the page pointed has no page then split
    • ... insert a whole new row or column
    • ... induces more sharing of pages across cells


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S13: Grid Files
  • Q? How does Grid Files deal with extended objects?
    • ? Replicate across cells (pages)
    • ? Decompose large objects into sub-objects
  • Range Search
    • Binary search of each dimension - > a set of cells
    • Duplicate elimination on pages pointed to
    • Fetch unique data-page pointed to
    • #I/Os proportional to result size (very fast)
  • Nearest Neighbor Search
    • Extend R-tree algorithms on the unique pages in grid
  • Q? Will the algo. work for all OGIS operators?


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S14: 4.3 Related Topics
  • Topics
    • Concurrency Control
    • Join Index
    • TR* tree for object decomposition


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)

The views and opinions expressed in this page are strictly those of the page author.
The contents of this page have not been reviewed or approved by the University of Minnesota.