Complexity theory seems to be unimportant in our daily work, until it hits one on the head.
Pretty much every computer science
student has to learn a little bit about computation complexity
theory -- those big-O notations. Well, most of us forget about
it pretty quickly afterwards. Here is an incident that
reminded me the peril of such practice.
We were working on
some thing that uses an algorithm that require O(N^2) in space and
and O(N^3) in time on input with N records. I programmed it
and tested the algorithm on a few sample data sets. Everything
seems to work fine. Until one day we gave it a larger data
set. After quite a few hours of running, the program crashed
without much indication of what went wrong. I checked the
program and rerun some of the smaller tests, nothing obvously wrong
was spotted -- it should have worked. Don't know what to do, I
rerun the program on the larger data set. Sure enough, after a
few hours it crashed again. After a few days of head
scratching, I eventually realized that the data set was too large
and it was running out of computer memory. The computer had
1GB of memory, which is about 256 M words. Assuming that the
algorithm requires exactly N^2 words, N can be no more than 19,000,
which is much smaller than I was expecting. The larger data
set had more than 50,000 data points. Of course, it would run
out of memory.
Thank goodness, it ran out of memory,
otherwise I would have to wait for a long time for the algorithm to
finish.
PS: This particular algorithm was used in producing a
number of published papers including "Minimizing
I/O Costs of Multi-Dimensional Queries with Bitmap Indices"
and "Optimizing
I/O Costs of Multi-Dimensional Queries using Bitmap Indices."
| ||||||
© John Wu
The views and opinions expressed in this page are strictly those of the page author.
The contents of this page have not been reviewed or approved by the University of Minnesota.