- MD5 Alive?
- Posted by JuiceMan on August 20th, 2007
I'm hoping I can find somebody who might know about the guts of MD5 ,
really on a platform or implementation independent way.
Background: I've got an existing web app (Perl/CGI) chugging along
doing its thing. It's been happy for some time now. I'm relying on MD5
to give me a unique value for a password, data (m/d/y), some other
flags, prefereneces, or status as to what the user is doing all told
about 70-80 characters. I'm not relying solely on MD5 for password
authentication. I have other alogrithmns being used for that. This has
nothing to do with on-line commerce. If the user submits a form either
one or all of the following could be wrong: username, not todays date,
my code may not be able to understand where they are. I find something
wrong I reset user has to relog in and starts from the beginning. What
has become a nice little artifact of this is that users have to relog
in everyday.
So I was young back then I choose to use authentication over the use
of cookies. I'm not going to update this to change, but as I ponder my
next design for the current client I do have opportunity to try
something new. Though I'd rather not. So MD5:
risk that anybody is go looking into the source and see the hidden
fields. It's an even less of a risk that somebody would even think I'm
doing MD5 for this. I was young back then
Also in the grand scheme
things not that major user log in is not impacted password is
maintained under another encryption function of the database itself.
Worse case user will avoid logging in each day or confuse my script as
to what there doing. There's nothing for them to see that they can't
see now.
So now as I ponder my new design I would be interested in answers to
the following questions, and it sounds like I just stumbled onto a
site that might be able to answer them:
What's even better (I think?) is that I've got a twist on MD5. I'm
running the digest 25 times on itself, breaking the digest into 2
parts and inserting a known string (like a salt) in between the parts
and rerunning it another 25 times all 3 reassembled pieces(original
first, my salt, and original second). Am I interfering with the nature
MD5 by doing this and losing anything it's giving me, by doing this do
I run the risk of not getting a unique value from MD5? (Q2 Yes or No).
I've visited one of those cracker sites. I've ran MD5 a 5-6 times on
itself and they were able to tell me what my previous digest was and
eventually back to the original Ok fine. I ran it 1250 times on itself
and they weren't so successful.
I've gotten a detail implmentation of MD5 which matches what the
cracker site tells me, and the results both match for "abc" (just for
fun). I also have a detailed write up as to MD5's working. I might
just fall in love with this. Now instead being in some library
somewhere I have a sense of control of it and I know my host won't
take it away from me. As I read it there are 4 functions that are used
each "round" summarized below:
1. AND(x,y) OR AND(Notx, y)
2. AND(x,z) OR AND(Notz,y)
3. EXOR (X,Y,Z)
4. EXOR(Y, (AND (X, NotZ)
Sorry, notation's not the best but hopefully you can decipher. So what
if I changed the order of the functions like did 2,1,3,4. Can I louse
up MD5 so that it won't be effective in producing unique digest (Q3.
Yes or No)? What about if changed or scrambled what x, y, z like kept
the calculations the same like made my z into x, y into z, and x
became y? (Q4. Rearrange the letters hurts MD5?).
I'm guessing I could really louse up MD5 -- if I really got in there
and started playing around with it. As I do gaze into the future I do
like authentication over cookies although the latter will probably win/
has won out. I don't need a 128 characters of output. If I could
shorten that maybe do some different operations within the MD5
algorihtmn and not compromise it I might be able to make good use of
it.
Sorry, this inevitably brings up a debate between the use of cookies
and authentication to maintain state in HTML documents. I do admit
that cookies are better and they will win :<, but please don't blame a
guy for trying or thinking. In my brief of your site sounds like this
is what you folks do anyway.
- Posted by Ertugrul Soeylemez on August 22nd, 2007
JuiceMan <jaysgeneral@snet.net> (07-08-19 17:07:22):
MD5 is weak regarding collision resistance. Whether this is a threat to
your application depends on your specific usage scenario. If I
understood your scenario properly, then you should be safe. Still
switch to a stronger hash function for future applications.
Probably yes. Say, you're running MD5 five times, and let F = G = MD5.
You're calculating the hash value with the following formula:
H = (G . G . G . G . F) P
where P is the plaintext and `.' means function composition. Although F
and G are the same function, there is a slight difference. While F's
input space is infinite (the set of all finite messages), G's input
space is finite and in fact the same as its output space. So if MD5 is
not a bijection, then you're losing entropy by applying it multiple
times without changing the intermediate hash values. On the other hand,
you add complexity by adding computation time, but this may be a small
benefit.
Put short: Probably you shouldn't do this.
MD5 does not, and in fact cannot, guarantee uniqueness. If, like stated
above, MD5 with the same input space as output space is a bijection,
then it does guarantee uniqueness for inputs of exactly 128 bits, but we
don't know that even for the original MD5. So reasoning about how a
modified version behaves is going to be even more difficult without
proper research.
Consider that there are reasons for the way MD5 works, you shouldn't
change anything at all. Your function might be more secure, but history
has shown that in general such customizations rather lead to weaker
primitives. If you find MD5 too weak, use another hash function, but I
urge you to refrain from custom adjustments without background knowledge
and a lot of research.
There are many other methods of maintaining state in query-response
protocols. For non-public sites I like the idea of using SSL
certificate-based authentication. Unfortunately the average user won't
be able to comprehend this, and will eventually look for alternatives,
so this method really isn't suitable for public sites.
Regards,
Ertugrul Söylemez.
--
Security is the one concept, which makes things in your life stay as
they are. Otto is a man, who is afraid of changes in his life; so
naturally he does not employ security.
- Posted by Sebastian G. on August 22nd, 2007
Ertugrul Soeylemez wrote:
Actually we should assume quite the contrary: If MD5 is a pseudorandom
function, than it's a pseudorandom mapping and therefore about 1/e of all
outputs will not occur, and about the same number will be collisions.
- Posted by JuiceMan on August 23rd, 2007
On Aug 22, 7:53 am, "Sebastian G." <se...@seppig.de> wrote:
Thanks for the feedback. I figured monkeying around with the internals
of MD5 would not help me.
However, some alarming points were brought up and I wouldn't mind
clarification on them.
It sounds like I got a big NO to my 25 times, split, 25 more times. As
in this is NOT making MD5 any more secure and in fact might be making
it less secure.
Running the Digest on itself say upwards of 1000 times is -- in the
circles that I've been in kind of an accepted thing to do.
For example if I take "abc" and run MD5 on it, take the answer run MD5
on that, and do so a 1000 more times what I get is no more secure then
if I ran it once?
If I'm reading correctly it even sounds that what I get utlimately at
the end of 1000 times might be the same as if it had been done with
"xyz" 1000 times?
Is this true?
- Posted by Ertugrul Soeylemez on August 26th, 2007
JuiceMan <jaysgeneral@snet.net> (07-08-22 19:36:55):
But we don't know that. Some attacks were published, which show
evidence, that MD5 might not be a pseudo-random function.
It wouldn't, unless you're a researcher. But you may use one of the
established methods, if you really need more strength. However, mostly
just using another hash function is probably the best method.
The idea isn't extremely bad, but you shouldn't try to invent your own
strengthening function, without putting a lot of research effort on it.
Basically you're doing something quite similar to HMAC, and probably
that is part of what you're looking for.
PBKDF2 may be the answer. You can use it together with HMAC for exactly
the purpose of strengthening.
Yes, but not the way you're doing it. Look up PBKDF2.
Yes, this is true. But it's not related to the fact that you're
applying MD5 multiple times. However, doing that directly as you do it
_might_ increase the probablility for this to happen.
Regards,
Ertugrul Söylemez.
--
Security is the one concept, which makes things in your life stay as
they are. Otto is a man, who is afraid of changes in his life; so
naturally he does not employ security.
- Posted by Unruh on August 28th, 2007
JuiceMan <jaysgeneral@snet.net> writes:
Help you with what? MD5 is a cryptographic hash. Its primary purpose is to
make finding the preimage from the output very very hard. Its purpose is
NOT to prevent output collisions. What are you suing it for that you would
worry about the collisions.
Really bad idea. On each level, you loose about as he says, 1/e outputs.
Ie, aftr 1000 times,. you will have only something like 1/e^1000 unique
outputs. (I am sure it is more than that, but lets stick with the random
mapping assumption).
Probably much less secure. What are you trying to do?
- Posted by Sebastian G. on August 28th, 2007
Unruh wrote:
This would mean you loose about 0.66 bits per iteration, so after 194
iterations you would have created a unique collision...
You assume that the iterations are independent. However, for MD5 you're very
likely not not stumble upon an additional collision after the first
iteration. This is a property inherited by the design.
Be careful about your assumptions.
- Posted by Unruh on September 4th, 2007
"Sebastian G." <seppi@seppig.de> writes:
AGreed, but what makes you think that there are far MD5(MD5(random)) has
fewer collisions than MD5(random)?
Ie, what makes you think that the output of MD5 does not act like a random
input to MD5?
- Posted by Ertugrul Soeylemez on September 5th, 2007
Unruh <unruh-spam@physics.ubc.ca> (07-09-04 23:12:55):
You must have misunderstood something.
Actually it's exactly the fact that MD5 output _does_ look like random
input to MD5. Read again carefully.
Regards,
Ertugrul Söylemez.
--
Security is the one concept, which makes things in your life stay as
they are. Otto is a man, who is afraid of changes in his life; so
naturally he does not employ security.
- Posted by Sebastian G. on September 5th, 2007
Unruh wrote:
Because MD5 has been designed to resist this kind of attack. As I described,
it would be trivial to create a collision if it behaved otherwise.
A design decision.
- Posted by Sebastian G. on September 5th, 2007
Ertugrul Soeylemez wrote:
And maybe you should read again as well. If it really behaved like this,
then 192 iterations would be sufficient to create a collision in MD5 with
probability 1/e. So about 8 bit bit of security. That's clearly not a
cryptographic hash.
- Posted by Ertugrul Soeylemez on September 5th, 2007
"Sebastian G." <seppi@seppig.de> (07-09-05 06:09:16):
Uhm, just a typo. Add the "not".
Regards,
Ertugrul Söylemez.
--
Security is the one concept, which makes things in your life stay as
they are. Otto is a man, who is afraid of changes in his life; so
naturally he does not employ security.
- Posted by Unruh on September 6th, 2007
"Sebastian G." <seppi@seppig.de> writes:
No idea what you are talking about. It IS trivial to create collisions.
Take 10^64 inputs and the probablility is high that there will be a
collision (Birthday calculation). That helps one not at all in any
security application.
Exactly which design decision are you talking about?
- Posted by Sebastian G. on September 7th, 2007
Unruh wrote:
1. 2^64 inputs are already sufficient, and still 2^64 are not so trivial at all.
2. According to the iterated attack you described, 2^8 (!) inputs would be
sufficient.
Collision resistance on iteration
- Posted by ciphex on September 20th, 2007
On 7 Sep, 17:45, "Sebastian G." <se...@seppig.de> wrote:
The only reason i can think of for 'fiddling' with MD5 in order to
protect
authentication code in an online application is to defeat reverse
lookup
databases. In this case it would be better to mangle the input before
hashing in some way that doesn't decrease the randomness of the
initial input text - for example, a simple substitution/transposition
scheme might work. The cracker would then have a problem, there is
less chance that the resulting hash would appear in the database
(assuming the resulting input is long enough).
On the whole though, if you don't understand the implications of this
kind of step then I'd avoid it and, as many have said, use a stronger
hashing algorithm.