Steve O'Hara-Smith wrote:
That brings up an interesting point. What really constitutes a pass, and
why do we care about how many passes?
If it's only about being sure you don't access the *original* data in
its original location, then the queue method will work fine. But then,
so will my algorithm that I hereby propose as an alternative:
Step 1: make a copy of the list; this can easily be done in one pass.
Step 2: use any algorithm you like to find the 5th element from the end.
If there is something unsatisfying about the algorithm, I think it indicates
that the specification "in one pass" has failed to capture what is really
desired. If you want, you can take the make-a-copy-first approach and turn
any algorithm that merely reads a data structure into a one-pass algorithm.
So I think it's important to be specific about someone is really after when
asking for something that's one-pass. Perhaps the goal is to minimize the
times we hit main memory. After all, in today's processors, L1 and L2 cache
are many times faster than main memory, and linked lists are notorious for
being spread all over memory nearly at random, which means just about every
node is a cache miss. So, one motivation for doing it in one pass is to
avoid paying that penalty in the case where the whole linked list won't
fit in the cache. (If the whole linked list does fit, then every node is
a cache hit the second time around, so no big deal. In which case,
incidentally, the cache has effectively done a make-a-copy-first optimization
on your algorithm.)
Anyway, if the point is to avoid extra cache misses (or hits to disk), then
the two pointers, one which trails the other by 5 nodes, solves the problem
quite well, provided that the cache is large enough to hold 5 nodes in it,
which is a pretty fair bet.
Getting back to the original puzzle, another way to solve this is to use
a recursive function: you can build up a list of addresses on the stack,
and when you hit the end, you merely return a number which indicates the
length of the sublist that you saw. Building it up, you could first start
with a recursive function to find the length of a list:
int linked_list_length (Node *list)
{
if (list == NULL) { return 0; }
return (1 + linked_list_length (list->next));
}
Now, to find the pointer to the Kth node from the end, you simply
need to modify the above function to check if it ever sees a sublist
of the appropriate length:
int kth_from_end (int k, Node *list, Node **result)
{
int sublist_length =
(list == NULL)
? 0
: (1 + kth_from_end (k, list->next, result));
if (sublist_length == k) { *result = list; }
return sublist_length;
}
That will waste O(n) stack, but it has the advantage that it doesn't
have a bunch of special cases like an algorithm that first moves a
pointer past 5 nodes before it starts another pointer moving. It
really wouldn't need very many test cases.
Actually, come to think of it, the point of the original question in
the interview may have been not so much the algorithm, but realizing
what test cases would needed for it. When you have special cases,
you need to come up with the right list of test cases to cover all
the cases, and that's a skill in itself.
- Logan