Tech Support > Computers & Technology > Programming > Optimization problem
Optimization problem
Posted by Padu on September 8th, 2003


Simple problem that's making my hair fall off!

The optimization in question is to reduce the number of multiplications in
an expression
i.e.:

x = a*b*c-a*d+b*c (5 multiplications)

i = b*c
x = a*(i-d)+i (optimized, 2 multiplications, same result)


Following the same concept, I want to do the same thing with these two
expressions:

x = a*b*c-a*d+b*c
y = a*b*c+a*c-b*d (8 multiplications total)


So far, this is what I got:

i = b*c
x = a*(i-d) + i
y = a*(i+c) - b*d (4 multiplications)

My professor says that it can be done with just 3 multiplications.... I
don't see how. Any hints?

Padu


Posted by Roger Willcocks on September 8th, 2003


"Padu" <padu@merlotti.com> wrote in message
news:MaicnbGvz4inM8GiU-KYvg@iswest.net...
x - y

--
Roger



Posted by Padu on September 8th, 2003


"Roger Willcocks" wrote

What do you mean by that? Could you further explain?

Padu



Posted by Corey Murtagh on September 8th, 2003


Padu wrote:

Without solving it myself (might be beyond me - been many years since
Algebra :>) I would guess that he's advising you to search for solutions
to y that use the value of x. You've used two of your multiplies on
getting the value of x, and both x and y rely on the same variables, so
see if you can construct a single-multiply equation for y that includes x.

Just a theory :>

--
Corey Murtagh
The Electric Monk
"Quidquid latine dictum sit, altum viditur!"


Posted by Thad Smith on September 9th, 2003


Roger Willcocks wrote:
Padu wrote:

You asked for a HINT, not a solution. It looks like a good hint to me.

Hint #2: factor.

Thad

Posted by Padu on September 9th, 2003


"Thad Smith" wrote:

The "x-y" and the "factor" hints were fundamental.... Half of my neurons are
dead by now (give me a discount, I've finished college almost 10 years ago)

here is my solution... now you guys can tell me if there is any other way to
do that....

Original non-optimized algorithm using 8 multiplications
x = abc-ad+bc
y = abc+ac-bd

Optimized algorithm using only 3 multiplications
i = b.c
x = a.(i-d) + i
D = (-a + b).(d + c)
y = x - D

Thank you all

Padu




Posted by Randy Crawford on September 10th, 2003


Now, for extra credit, contrast the runtime of the original problem vs.
the "optimized" expressions on a modern RISC processor with three
floating point units.

It's an enlightening exercise. And it demonstrates why the automation
of compiler optimization is damned hard.

Randy


Padu wrote:
--
Randy Crawford http://cac.engin.umich.edu
crwfrdATumich.edu http://www-personal.engin.umich.edu/~crwfrd


Posted by Padu on September 10th, 2003


"Randy Crawford" wrote

Ok, now I'm really intrigued.... Implemented both algorithms using Delphi on
a Pentium IV 1.6Ghz, here are the odd results I got:

"Optimized Algorithm"
i = b.c
x = a.(i-d) + i
D = (-a + b).(d + c)
y = x - D

Here it is the assembly code generated: (to simplify your reading, I will
modify the parameters of instructions)

*i=b.c
MOV eax, (b) //by (b) I mean the value of variable b. in the original we
would have something like [ebp - $4]
IMUL eax, (c) //first IMUL
MOV (i), eax
*x=a.(i-d)+i
MOV eax, (i)
SUB eax, (c)
IMUL eax, (a) //second IMUL
ADD eax, (i)
MOV (x), eax
*D=(-a+b).(d+c)
MOV eax,(a)
NEG eax
ADD eax, (b)
MOV edx, (d)
ADD edx, (c)
IMUL eax, edx //third IMUL
MOV (D), eax

Summary: 18 instructions being that they are: 9 MOV, 3 IMUL, 3 ADD, 2 SUB
and 1 NEG
Run a loop of 1Billion calculations and got it done in 38.938 seconds

Now, from the premisse that IMUL's are "much" slower than ADD's and SUB's, I
would expect the non optimized algorithm to be much slower.... let's see
what happens


"Non-Optimized Algorithm"
x = abc-ad+bc
y = abc+ac-bd

*x = abc-ad+bc
MOV eax, (a)
IMUL eax, (b)
IMUL eax, (c)
MOV edx, (a)
IMUL edx, (d)
SUB eax, edx
MOV edx, (b)
IMUL edx, (c)
ADD eax, edx
MOV (x), eax
*y = abc+ac-bd
MOV eax, (a)
IMUL eax, (b)
IMUL eax, (c)
MOV edx, (a)
IMUL edx, (c)
ADD eax, edx
MOV edx, (b)
IMUL edx, (d)
SUB edx, eax
MOV (y), eax

Summary: 20 instructions being that they are: 8 IMUL, 8 MOV, 2 ADD, 2 SUB
Run a loop of 1Billion calculations and got it done in 35.094 seconds


So, the non-optimized actually runs "FASTER!!!!" How's that?
If you want I can send the source.... Delphi 6

Padu



Posted by William on September 11th, 2003


"Padu" <padu@merlotti.com> wrote in message
news:cd-cnS-W7Mj65sKiXTWJkQ@iswest.net...

Probably because CPUs are usually optimized for common coding
practices and tricky (unexpected) code can defeat them. (There
are exceptions, sometimes a lucky quirk in a CPU can be taken
advantage of.)

Same holds true of many compilers. Sometimes code that doesn't
appear to be optimized as source comes out highly optimized, and
sometimes source code you spent hours tweaking runs really
slowly. Once upon a time replacing X*4 with X<<2 made sense,
but that was quite a while ago. If a shift op instead of a multiply
makes sense, any decent compiler will substitute it - or find an
even better way (on at least one CPU, there IS a faster way to
multiply an integer by 4, and MSVC++ knows what it is). -Wm





Similar Posts