Overview
Recently I spent some time on a SIGMOD 2019 paper. It optimizes an old algorithm based on the architecture of modern computers, and we can learn some programming techniques for optimizing algorithms from it.
The paper is: Van Sandt, Peter, Yannis Chronis, and Jignesh M. Patel. "Efficiently Searching In-Memory Sorted Arrays: Revenge of the Interpolation Search?" SIGMOD 2019.
It is about the classical algorithm "Interpolation Search". There are some interesting papers discussing the complexity analysis of this algorithm; you can find more in their references. I list some of them here and will discuss a few in the following sections.
- Perl, Yehoshua, Alon Itai, and Haim Avni. “Interpolation search—a log log N search.” Communications of the ACM 21.7 (1978): 550-553.
- Yuan, Simon. "Understanding the complexity of interpolation search." Retrieved on April 20 (2019).
Throughout this blog, I will focus only on the uniform-distribution assumption on the data; it covers the main techniques of the paper.
Classical problem and Ideas
Given a vector of numeric data of length N that is already sorted, and a target value T that may or may not be in the vector, return the index of the target if found, otherwise return None.
This classical searching problem over a sorted data set is familiar to every programmer, and I am sure binary search is the first thing that comes to mind. Binary search cuts off at least half of the data with each probe, so its worst-case time complexity is O(log(n)).
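For reference, here is a minimal binary search sketch (my own illustration for this blog, not code from the paper) over a sorted `std::vector<double>`:

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Classic binary search: each probe discards at least half of the
// remaining range, so the number of probes is O(log N).
std::optional<std::size_t> binary_search(const std::vector<double>& data, double target) {
    std::size_t left = 0, right = data.size();      // search [left, right)
    while (left < right) {
        std::size_t mid = left + (right - left) / 2;
        if (data[mid] == target) return mid;
        if (data[mid] < target) left = mid + 1;      // drop the left half
        else                    right = mid;         // drop the right half
    }
    return std::nullopt;                             // target not in the vector
}
```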
But computer scientists always ask: can we do better? Sometimes, if we take statistical information into consideration, we can break through some boundaries. (Diversion: bucket sort is such an example that breaks the n log(n) barrier of comparison-based sorting.)
So, assuming the data is uniformly sampled from some random variable, how about predicting the index of the target directly and then jumping to that position to probe?
This trick can be viewed as one concrete implementation of a probe-and-cut search algorithm. Perl, Itai, and Avni (1978) gave an abstract description of the search procedure.
Based on the strategy for choosing the probe index, we jump into a new state under an invariant assumption (the data is uniformly distributed). A new state is just a state with the data range changed. (Diversion: can we do better, or do more work on this?)
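To make the idea concrete, here is a minimal sketch of classical interpolation search (my own illustration, not the paper's code), assuming roughly uniform keys: instead of probing the middle, probe where the straight line between the two endpoints predicts the target to be.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Classical interpolation search: predict the target's index by linear
// interpolation between data[left] and data[right], then cut the range.
std::optional<std::size_t> interpolation_search(const std::vector<double>& data, double target) {
    if (data.empty()) return std::nullopt;
    std::size_t left = 0, right = data.size() - 1;
    while (left <= right && target >= data[left] && target <= data[right]) {
        if (data[left] == data[right]) break;        // avoid division by zero
        // Linear prediction of the target's position within [left, right].
        double slope = static_cast<double>(right - left) / (data[right] - data[left]);
        std::size_t probe = left + static_cast<std::size_t>((target - data[left]) * slope);
        if (probe > right) probe = right;            // guard against rounding
        if (data[probe] == target) return probe;
        if (data[probe] < target) left = probe + 1;
        else                      right = probe - 1;
    }
    if (left < data.size() && data[left] == target) return left;
    return std::nullopt;
}
```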
The log log(N) complexity is interesting; for the detailed analysis please refer to the papers listed above. Yuan, Simon (2019) gives a time-complexity bound in the form of the recurrence T(n) \le \hat{C} + T(\sqrt{n}). This formula is straightforward to derive from the tree model; I am still working through its asymptotic solution with analytic combinatorics, and I believe it must have been studied.
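For what it is worth, here is the standard way to read off the asymptotics (my own sketch, not taken from either paper): substitute n = 2^m and set S(m) = T(2^m). Since \sqrt{n} = 2^{m/2}, the recurrence becomes S(m) \le \hat{C} + S(m/2), which unrolls after about \log_2 m halvings to S(m) = O(\log m). Translating back with m = \log_2 n gives T(n) = O(\log \log n), the familiar bound.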
Skills of this paper to improve performance
This section is the core of this blog: what we can learn from this SIGMOD paper. The following picture shows all of the paper's techniques. They try their best to reduce the cost of the loop body: cache-friendly access and optimized floating-point computation.

And we focus on the algorithm for uniformly distributed data:

Guard
This idea is quite simple: when we are already very close to the target, just do a sequential scan. Sequential scanning is much friendlier to the modern cache hierarchy. In addition, for the sequential scan the algorithm uses a trick to save a compare-and-branch instruction; a minimal sketch follows.
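Here is how I picture the guard idea (my own illustration; `GUARD`, `scan_forward`, and `scan_backward` are hypothetical names, not the paper's exact code): once the interpolation probe lands within a small distance of the target, fall back to a sequential scan. Because the keys are sorted and the target is known to lie inside the remaining range, the scan can loop on a single key comparison, with no extra index-bounds compare and branch.

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t GUARD = 8;   // hypothetical cutoff; the paper tunes this

// Scan forward from `pos` until data[pos] >= target. The caller guarantees
// that some key >= target exists at or after `pos`, so no bounds check is
// needed inside the loop: the key comparison alone terminates it.
std::size_t scan_forward(const std::vector<double>& data, std::size_t pos, double target) {
    while (data[pos] < target) ++pos;
    return pos;
}

// Scan backward until data[pos] <= target, under the mirror-image guarantee.
std::size_t scan_backward(const std::vector<double>& data, std::size_t pos, double target) {
    while (data[pos] > target) --pos;
    return pos;
}
```

In the full search loop, once the predicted index moves by fewer than GUARD positions (or the remaining range itself is smaller than GUARD), the cache-friendly scan above finishes the job and a final equality check decides between the found index and None.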

This part can be summarized in one sentence: write data-cache-friendly, locality-friendly code.
Another technique is loop unrolling, but I did not get much out of it in the SIP algorithm itself.

Slope reuse and Fixed-point arithmetic
This kind of optimization is about numeric computation. Slope reuse is an approximation that breaks the classical algorithm's analysis. Not computing the slope in every iteration certainly saves a lot of work, but what are we giving up? Is the bound of the algorithm still log log(N)? That is not as easy to prove and analyze as it is to propose the idea, and I do not have the answer to the question now.
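A rough sketch of slope reuse as I understand it (my own illustration, not the paper's code): compute the slope once from the initial endpoints and keep correcting the probe with that same slope, instead of recomputing (right - left) / (data[right] - data[left]) in every iteration.

```cpp
#include <cstddef>
#include <vector>

// One slope-reuse step: given a fixed, precomputed slope (indexes per key unit),
// move the current probe by the predicted offset instead of recomputing the
// slope from the shrinking search range.
std::size_t reuse_slope_probe(const std::vector<double>& data,
                              std::size_t probe, double target, double slope) {
    double diff = target - data[probe];              // how far the key is off
    long long step = static_cast<long long>(diff * slope);
    long long next = static_cast<long long>(probe) + step;
    if (next < 0) next = 0;                          // clamp to the array bounds
    if (next >= static_cast<long long>(data.size()))
        next = static_cast<long long>(data.size()) - 1;
    return static_cast<std::size_t>(next);
}
```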
Fixed-point arithmetic is again a kind of approximation, and how it affects the speed of convergence is not analyzed.
These two techniques combined reduce the number of div instructions in the core loop and thus improve performance.
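Here is a hedged sketch of the fixed-point idea for integer keys (my own illustration with an assumed 32.32 fixed-point scale, not the paper's exact code): store the slope as a scaled integer once, so every later probe is an integer multiply plus a shift rather than a division.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Precompute the slope (indexes per key unit) as a 32.32 fixed-point integer.
// The single division happens here, outside the search loop.
// Assumes fewer than 2^32 elements so the shifted value fits in 64 bits.
std::uint64_t fixed_point_slope(const std::vector<std::uint64_t>& data) {
    std::uint64_t key_range = data.back() - data.front();
    if (key_range == 0) return 0;
    return (static_cast<std::uint64_t>(data.size() - 1) << 32) / key_range;
}

// One probe: multiply and shift only, no division in the hot loop.
// Assumes data.front() <= target <= data.back(), so the product cannot overflow.
std::size_t fixed_point_probe(const std::vector<std::uint64_t>& data,
                              std::uint64_t target, std::uint64_t slope) {
    std::uint64_t offset = target - data.front();
    return static_cast<std::size_t>((offset * slope) >> 32);   // index estimate
}
```

The single division survives only in the setup (or in an occasional slope recomputation), which is exactly the point of combining fixed-point arithmetic with slope reuse.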
My own thoughts
Lack of math
The paper relies only on experiments to "prove" its conclusions; it lacks mathematical analysis, and that part may be very difficult. For example, when we are close to the target we just switch to a sequential scan. This is intuitive, but precisely how much improvement does it bring?
What will never be out-of-date?
This algorithm was studied long ago (in the late 1950s, I believe), and now it appears again at a top conference in computer science. I believe there are many such classical algorithms that we could and should study again. In other words, as engineers we should always keep spending some time on fundamentals; the analysis will sharpen our minds, and combined with new techniques it may create new value.