How to stymie a disassembler
Posted by Jonathan Kirwan on February 21st, 2006


On Fri, 17 Feb 2006 17:02:24 -0800, David R Brooks
<davebXXX@iinet.net.au> wrote:

The technique definitely predates the Z80 as "standard fare." I used it
in 8080 coding in 1976 (POP HL comes to mind). But I did that because
I'd learned it years before from other people's code running on the
PDP-11, where this style of coding was actually in the PDP-11 manuals
as an example (the JSR was wonderfully adapted to this kind of thing).
It was more than common at the time. And I suspect that, as a standard
technique, it well predated the PDP-11 experiences I had, probably
dating at least back to the very earliest hardware JSR which
put the return address into a register. (I even tried it out on an
HP-2116, which didn't even have a stack and instead poked the return
address into the first location of the subroutine.)

Jon

Posted by Paul Keinanen on February 21st, 2006


On Tue, 21 Feb 2006 06:56:53 GMT, Jonathan Kirwan
<jkirwan@easystreet.com> wrote:

The in-line parameters were frequently used on early PDP-11 operating
system run time libraries.

Later on, with processors and operating systems supporting separate I
and D spaces (to get a 64 KiB + 64 KiB addressing range), the in-line
parameters would sit in the I space, while the data manipulation
instructions would try to access them in the D space. Thus, in-line
parameters could not be used with separate I/D spaces enabled.
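
A rough sketch of that I/D problem, in Python rather than PDP-11 assembly. The two small memories, the address 4, and the function name are invented for illustration. It models only the idea that a data read and an instruction fetch of the same address can land in different memories, not the real memory management hardware.

```python
# Toy model of split instruction/data space; not a model of the real PDP-11 MMU.
# i_space and d_space are hypothetical 8-word memories sharing the same addresses.

def fetch_inline_param(return_addr, i_space, d_space, split_id):
    """Read the word that follows the call, the way a callee would.

    The callee uses an ordinary data access, so with split I/D enabled
    the read goes to D space even though the parameter was assembled
    into I space right after the call instruction.
    """
    memory = d_space if split_id else i_space
    return memory[return_addr]

i_space = [0] * 8
d_space = [0] * 8
i_space[4] = 0o14      # in-line parameter assembled right after the call
d_space[4] = 0o7777    # unrelated data that happens to live at address 4

unified = fetch_inline_param(4, i_space, d_space, split_id=False)
split = fetch_inline_param(4, i_space, d_space, split_id=True)
print(oct(unified), oct(split))
```

With a single address space the callee sees the parameter it expects. With the split enabled it silently picks up whatever sits at the same address in D space.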

One lightweight Fortran IV compiler generated some kind of threaded
code using one of the general purpose registers as a pseudo "program
counter". Trying to make any sense of the code with a disassembler was
very hard.

Paul


Posted by Didi on February 21st, 2006


Paul Keinanen wrote:
This reminds me of an almost-horror story from about 5 years ago that I had
with a 5420 DSP. I had written some code which modified its program
memory - it was OK to have it accessed both as dsct and psct.
A day or two later - and after a lot of pulled hair, I suppose - I
found a documented problem in the errata sheet which said that a write to
the psct might go unnoticed (I guess because of some hidden cache)
prior to execution of this or that opcode...
I remember it particularly well because I had just begun to code
for this DSP, and I was using the assembler I had written for the
purpose the month before that... so there were many doubt-raising
variables - apart from the errata sheet, which I had had around
all the time after all... :-)

But here it is - try to catch _that_ kind of thing with a
disassembler....
(not that I would imagine any use out of doing so, to be sure).

Dimiter

------------------------------------------------------
Dimiter Popoff Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------


Posted by Jonathan Kirwan on February 21st, 2006


On Tue, 21 Feb 2006 12:07:35 +0200, Paul Keinanen <keinanen@sci.fi>
wrote:

I can imagine.

This also reminds me of the examples provided in the PDP-11 manuals
for coroutine execution. Ah, here it is in this copy of the PDP-11/70
manual:

JSR PC, @(SP)+

It exchanges the program counter with the top entry on the stack. If
you split your main thread into two pieces and combined this with other
techniques as well, it could complicate automated disassembly.

Jon

Posted by randyhyde@earthlink.net on February 21st, 2006



Richard H. wrote:
I'm more interested in the concept of disassembly rather than hacking
(or anti-hacking). For example, the purpose of this current thread is
not so much to discover ways to thwart hackers (though that thread will
be coming up in a couple months) as it is to explain why disassemblers
cannot do a perfect job of disassembling code.

Actually, I've got just the guy in mind. But the publisher would see to
this anyway.

Actually, I've been really blessed with a set of copy editors from No
Starch Press who've done a good job of making stuff more readable, even
without completely understanding everything that I'm writing about.

I usually feed No Starch a completed first draft. In my experience, too
many things change over the course of writing a technical book and I
don't believe in wasting the editor's time on material that gets tossed
or rewritten because of material appearing later in the book.

This being my sixth or seventh book project (published, anyway), I know
all about that one. :-) It's amazing how many people think I'm being
"unethical" making money off books. They don't have a clue how little
you really make relative to all the work you put into it. Quite
honestly, the main reason I go through the process rather than simply
posting PDFs to my web site is because you get a much higher-quality
result when you've gone through publication. You don't get to take
shortcuts. The work gets refereed, and it gets improved in ways you
can't possibly do yourself by having a team of editors pore over the
work (and having a publisher invest money in the project with an
expectation of getting a return on their investment).

Which disassembly is not :-)

The first book or two was good for the ego. After that, not so. Indeed,
I've found that you start to make people jealous after a series of
these things and having the books with your name on them starts to work
against you.

As for the resume, I was laid off in 2000 at the height of the tech
bubble burst and let me tell you that having several books on my resume
didn't help one bit when searching for a new job.
Cheers,
Randy Hyde


Posted by Clifford Heath on February 22nd, 2006


Paul Keinanen wrote:
Not really. The code consisted of a sequence of thread pointers
and arguments, like any assembly code. You needed to look up
the actual routine that each pointer pointed to by its content,
but nearly all of them were short sequences of instructions
ending with JMP @(R4)+ (they fetched operands with MOV @(R4)+),
and they were all in the system library with names, so it wasn't
that hard to trace.
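
A minimal sketch of that threaded-code layout, in Python instead of PDP-11 code. The routine names (`push`, `add`, `halt`), the `data` table, and the `Machine` class are invented for illustration; the compiler's real run-time library had its own names. The attribute `r4` plays the role of R4 as the pseudo program counter.

```python
# Toy threaded-code interpreter: the "program" is a list of routine
# references with operand pointers stored in line, walked by r4.

class Machine:
    def __init__(self, thread, data):
        self.thread = thread   # routine references and operand pointers
        self.data = data       # variable storage the pointers refer to
        self.r4 = 0            # pseudo program counter, like R4
        self.stack = []
        self.running = True

    def next_word(self):
        """Fetch the next thread word and advance r4, in the spirit of @(R4)+."""
        word = self.thread[self.r4]
        self.r4 += 1
        return word

    def run(self):
        while self.running:
            routine = self.next_word()   # like JMP @(R4)+ at the end of a routine
            routine(self)
        return self.stack

def push(m):
    # The operand pointer follows the routine pointer in the thread,
    # and is fetched through r4 (like MOV @(R4)+).
    m.stack.append(m.data[m.next_word()])

def add(m):
    b = m.stack.pop()
    a = m.stack.pop()
    m.stack.append(a + b)

def halt(m):
    m.running = False

data = {"A": 2, "B": 3}
thread = [push, "A", push, "B", add, halt]
result = Machine(thread, data).run()
```

A disassembler pointed at `thread` would see only a table of addresses, which is why the listing looks opaque until the library routines behind the pointers are identified.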

geez, that's a blast from the past - I last used that compiler
in 1981 :-).

Posted by oldfogie on February 23rd, 2006


"randyhyde@earthlink.net" <spamtrap@crayne.org> wrote in message
I spent years writing anti-virus software "back in the day" when they were
all hand-written in assembly, and I've seen about every trick in the book
and found my way around all of them.

It seems that your target of "stymie a disassembler" is actually too
limited. As well as being hard to disassemble, the code should be
able to stop debugging / tracing of the code. The code (virus in this
discussion) should run normally, but not be disassembled or run in
a debugger.

Encryption and misdirection are the two most common and successful techniques.

For example, one virus "Whale" used subroutines that were encrypted.
When a routine was called, it first decrypted the routine and re-encrypted
it at routine exit.
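
A small sketch of the decrypt-on-entry, re-encrypt-on-exit idea, in Python. The one-byte XOR key, the placeholder routine bytes, and the function names are assumptions made for the example; nothing here reflects the cipher Whale actually used.

```python
# Sketch of "decrypt on entry, re-encrypt on exit". XOR with a one-byte key
# stands in for whatever cipher the real virus used (an assumption).

KEY = 0x5A  # hypothetical key

def xor_region(memory, start, length, key=KEY):
    """Toggle a region between ciphertext and plaintext in place."""
    for i in range(start, start + length):
        memory[i] ^= key

def call_encrypted(memory, start, length, run):
    """Decrypt the routine, run it, then re-encrypt it."""
    xor_region(memory, start, length)        # plaintext exists only from here...
    try:
        return run(bytes(memory[start:start + length]))
    finally:
        xor_region(memory, start, length)    # ...to here

plain = b"\xB8\x01\x00\xC3"                  # placeholder bytes for a routine body
memory = bytearray(b ^ KEY for b in plain)   # what a static disassembler gets to see
at_rest_before = bytes(memory)
seen_while_running = call_encrypted(memory, 0, len(plain), lambda code: code)
at_rest_after = bytes(memory)
```

The plaintext exists only while the routine is executing, so a disassembler working from a file or a memory snapshot taken at any other time decodes ciphertext.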

Polymorphic encrypted viruses used algorithms that generated "random"
code that was actually a decryptor for the rest of the virus. Each
infection had a different decryptor, and each looked like you
were disassembling random bytes.

Another virus, I've forgotten the name, took extreme steps to make sure
that it was not running in a disassembler, making sure that it was not being
traced. I'd have to look up its exact mechanism.

One virus trick was to move the stack. That doesn't sound like a problem
until you realize where they had moved it. It wound up in the middle of the
interrupt table. The virus turned off interrupts (so the timer tick would
not occur), but any attempt at debugging or tracing (INT 3, TRAP flag) would
push registers onto the stack and destroy the interrupt table.

Viruses can also be extremely sneaky in how they play with the flags
register. For example, push parameters onto the stack and call a routine.
Everything looks normal, but the routine ends with an IRET instead of a RET,
and the flags (like TRAP) have changed.

The way to get around tricks like this is to do simulation, not disassembly.
I wrote an x86 interpreter that set up its own memory space and used that,
so none of the virus tricks would actually stop the debugger/interpreter. I
also, later, wrote a disassembler based on the interpreter engine.

The source and executables to the interpreter and disassembler are available
online, at
http://www.datapackrat.com/source/source.html

If you are interested in my digging back into the old virus tricks, contact me
at the email address on the website.

Bill




Posted by Heikki Orsila on March 1st, 2006


randyhyde@earthlink.net <randyhyde@earthlink.net> wrote:
Have you already mentioned tricks taking advantage of CCRs (condition
code registers) in your article? I updated the article at:

http://en.wikipedia.org/wiki/Trace_vector_decoder

The trace vector that does decrypting is affected by the CCR too. A zero
integer result in the main program would cause the Z-flag bit to be set in
CCR, which would in turn affect the result of the decryption.

This kind of obfuscation is easily overcome by an instruction set simulator
that records all executed instructions in plain text.

--
Heikki Orsila Barbie's law:
heikki.orsila@iki.fi "Math is hard, let's go shopping!"
http://www.iki.fi/shd

Posted by randyhyde@earthlink.net on March 1st, 2006



Heikki Orsila wrote:
No, I hadn't considered this, but it is an interesting article.
Thanks.

Any interpreter/emulator or virtual machine execution of the code can
overcome just about any software scheme (i.e., short of external
hardware that does part of the calculation) I've seen. But for now, I'm
just happy with ideas that foul up disassemblers. You'll never stop a
determined hacker with decent tools, of course.
Cheers,
Randy Hyde


Posted by Matthias Melcher on March 2nd, 2006



Of course these are derived somewhat from what was written in the thread
before, but I have seen those used and they worked well.

- generate a division-by-zero exception and continue execution in the
exception handler, then add garbled code after the faulting instruction
that will keep the disassembler busy

- this can be improved by using some high-resolution timer to calculate
the divisor, so that running the code will generate a div-zero, but
tracing it will not

- write a byte code interpreter. Then use the byte code to write yet
another byte code interpreter. Then write your code in bytecode-bytecode.
The disassembler will decode the first-level interpreter, but now a
hacker will have to understand the interpreter and write a disassembler
for it, etc.

- Scattered jump tables that use calculated modulo values and ranges are
also a neat thing to throw disassemblers off, since they will have a
hard time finding true machine code entry points.

- old CPUs use logic instead of microprograms to execute instructions.
That way, the Z80 for example has a few hundred undocumented
instructions that do very wacky stuff. However, that wacky stuff is
repeatable across the whole Z80 family, so these instructions are used
to throw off disassemblers.
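
The first item in the list can be sketched with a toy instruction set in Python. Everything here is invented for the illustration: the four opcodes, their lengths, and the `trap_resume` address that stands in for wherever the exception handler continues. It is meant only to show how one junk byte after a trapping instruction can desynchronize a disassembler that decodes bytes in sequence.

```python
# Toy instruction set, invented for this sketch: opcode -> (mnemonic, length).
OPCODES = {
    0x00: ("NOP", 1),
    0x01: ("DIV0", 1),    # always traps, never falls through
    0x02: ("LOADI", 2),   # opcode plus one immediate byte
    0x03: ("HALT", 1),
}

def linear_sweep(code):
    """Decode bytes one after another, as a simple disassembler would."""
    out, pc = [], 0
    while pc < len(code):
        name, length = OPCODES.get(code[pc], ("DB", 1))
        out.append((pc, name))
        pc += length
    return out

def executed_path(code, trap_resume):
    """Follow what actually runs: DIV0 traps and the handler resumes elsewhere."""
    out, pc = [], 0
    while pc < len(code):
        name, length = OPCODES.get(code[pc], ("DB", 1))
        out.append((pc, name))
        if name == "HALT":
            break
        pc = trap_resume if name == "DIV0" else pc + length
    return out

code = bytes([0x01, 0x02, 0x03])   # DIV0, one junk byte, then the real HALT
static_view = linear_sweep(code)
dynamic_view = executed_path(code, trap_resume=2)
```

In the static view the junk byte is decoded as a two-byte LOADI that swallows the real HALT, while the executed path never touches the junk at all.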

I want to mention two more hardware-based crazy techniques that are
obviously not detectable by a disassembler.

The Sinclair ZX80 and ZX81 actually *jump* right into the screen buffer
memory, abusing the PC (=IP) as a video scan line counter. External
hardware reads the data from RAM, but pulls the CPU data bus low,
making the Z80 execute "nop", no-operation.

In the Sinclair Spectrum, there was no memory paging. When they added
external "mass" storage, the new hardware would again watch the address
bus and override the internal ROM when the PC reached 0x0008,
overriding the "error code" function.

Posted by Rufus V. Smith on March 7th, 2006



"Jonathan Kirwan" <jkirwan@easystreet.com> wrote in message
news:30dlv1lvj5t3niiokn9kmv52du5qbrmhf9@4ax.com...
Put return address into a register? Luxury!

On the PDP-8, the return address was placed at the first memory
address of your subroutine (no stack). So forget re-entrancy without some
fancy footwork.

Years and braincells have long passed, but it was something like this


        JMS PCCHAR      ; Print a constant character
        14              ; formfeed character (octal, IIRC)
        ....            ; code continues


And the subroutine was something like:

PCCHAR,                 ; subroutine to print character in caller's memory
        0               ; storage for return address
        CLA CLL         ; clear ACC (a data fetch is really an ADD)
        TAD I PCCHAR    ; get value indirectly through address stored at PCCHAR (add to ACC)
        JMS PCHAR       ; some other routine to print the character in ACC(-umulator)
        ISZ PCCHAR      ; bump the address (past the character) in PCCHAR
        NOP             ; defensive programming, ISZ shouldn't skip on zero
        JMP I PCCHAR    ; jump indirectly through the address at PCCHAR (this is the return)
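
For readers without a PDP-8 handy, here is a loose Python model of the listing above. The addresses (0o100, 0o200), the `printed` list, and the helper names are invented for the sketch, and 12-bit wraparound is ignored. It shows the return address being poked into the subroutine's first word, the in-line parameter being fetched through it, and why a second call before the first returns loses the original return address.

```python
# Minimal model of PDP-8-style linkage: JMS stores the return address in the
# first word of the subroutine. Addresses and names are invented for the sketch.

PCCHAR = 0o200           # hypothetical address of the subroutine's first word
memory = [0] * 0o400
printed = []             # stand-in for the output device

def jms(target, return_addr):
    """JMS target: poke the return address into memory[target]."""
    memory[target] = return_addr

def pcchar_body():
    """Body of PCCHAR, mirroring the listing step by step."""
    acc = 0                            # CLA CLL
    acc += memory[memory[PCCHAR]]      # TAD I PCCHAR: fetch the in-line word
    printed.append(acc)                # JMS PCHAR (stand-in for the print routine)
    memory[PCCHAR] += 1                # ISZ PCCHAR: step past the parameter
    return memory[PCCHAR]              # JMP I PCCHAR: address to resume at

# Caller at 0o100 executes JMS PCCHAR; the in-line parameter sits at 0o101.
memory[0o101] = 0o14
jms(PCCHAR, return_addr=0o101)
resume_at = pcchar_body()

# Re-entrancy problem: a second JMS before the first returns overwrites the slot.
jms(PCCHAR, return_addr=0o101)
jms(PCCHAR, return_addr=0o301)         # e.g. from an interrupt routine
first_return_lost = memory[PCCHAR] != 0o101
```

Because there is only one storage slot per subroutine, the "fancy footwork" for re-entrancy amounts to saving that slot somewhere else before it can be overwritten.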


Rufus



Posted by Tauno Voipio on March 7th, 2006


Rufus V. Smith wrote:
This was the customary means in the early minis, including
PDP-8 and HP 2116, and also HP 2114, Honeywell DDP-112, DDP-316
and DDP-516.

It made re-entrant code interesting to write. One solution was to
have a helper routine which pushed the return address and which
took the address of the final called routine as a parameter. In the
custom of that time, the parameter was a word in code following the
JMS/JSR (or whatever it was called).

--

Tauno Voipio
tauno voipio (at) iki fi

Posted by CBFalconer on March 7th, 2006


"Rufus V. Smith" wrote:
Well, the 8080 was an elegant architecture, when compared to the
PDP8. The advantage of this technique was twofold - it
automatically embedded the constant string, and it simplified the
output routine. I.E.:

        call    dump
        db      "string to dump",0
        more code

; disturbs nothing, outputs zero terminated inline string at sp^.
dump:   xthl
        push    psw
dump1:  mov     a,m
        inx     h
        ora     a
        cnz     putch           ; outputs the char in a, preserving it.
        ora     a
        jnz     dump1
        pop     psw
        xthl
        ret

If putch preserves flags, or sets them on the content of a, then the
ora a after cnz can be eliminated. The net combination is very
usable, and doesn't require much remembering about registers
disturbed. Not bad for a 15 byte subroutine.
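
The same convention can be modelled in a few lines of Python, for anyone who doesn't read 8080 mnemonics. The flat `memory` array, the addresses 0x10 and 0x13, and the `out` list are assumptions made for the example. The function only mirrors the loop in `dump`: the return address doubles as the string pointer and ends up just past the terminating zero.

```python
# Sketch of the "call dump / db string,0" convention on a flat byte memory.
# The addresses and the `out` list are invented for illustration.

def dump(memory, return_addr, out):
    """Emit the zero-terminated string at return_addr; return the resume address.

    Mirrors the mov a,m / inx h / ora a loop: read a byte, advance the
    pointer, stop once the byte just read was zero.
    """
    hl = return_addr
    while True:
        a = memory[hl]
        hl += 1
        if a == 0:
            return hl          # execution resumes just past the terminator
        out.append(chr(a))     # stand-in for putch

memory = bytearray(64)
text = b"string to dump\x00"
memory[0x13:0x13 + len(text)] = text   # bytes following a 3-byte call placed at 0x10
out = []
resume = dump(memory, 0x13, out)
```

A disassembler that does not know about this convention will try to decode the string bytes at 0x13 as instructions, which is the property the thread is about.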

Note that variations of this allow the output device (or file) to
be specified by the bc or de register. The only code affected is
putch, and something outside the call dump and its associated
string. It could be advantageous to have two versions of dump, say
dump and dumpf, where dump calls putc to go to stdout, and dumpf
calls putcd, to output to file identified by de. That handles
almost all the constant string output needed in an application.

I used another variant of it to load multibyte binary constants in
my PascalP code generator. Here the calling code looked like:

; de holds destination address
call load
dw length
db data of length length
more code

Notice that all these things did not foul up normal stack usage, in
that they could be interrupted, and the common subroutines
reentered if needed. By disturbing nothing, or only a register to
return a value, the routines became easily reusable without
thought.

When Intel developed the 8086/8 they destroyed the elegance by
omitting the xthl instruction, or its equivalent. This made it
impossible to manipulate the registers without destroying
something. The elimination of conditional calls was a nuisance,
but not fatal to really taut assembly code.

The only disadvantage was that it became almost impossible to
combine identical strings, or identical large constants.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>



Posted by Jonathan Kirwan on March 7th, 2006


On Tue, 07 Mar 2006 22:19:26 GMT, Tauno Voipio
<tauno.voipio@INVALIDiki.fi> wrote:

I coded on both the PDP-8 and the HP 21xx series (x16 and x14, both,
since one was usually the main proc and the other the i/o proc on
their timesharing system.) The others I didn't experience.

I avoided function recursion and reentrancy, at the time.

Jon

Posted by Jim Stewart on March 7th, 2006


Jonathan Kirwan wrote:
To further amplify the "old geek" part of the thread, years
ago I worked at a tiny company that made S100 peripherals.
We had 2 programmers and the target machines were IMSAIs.
The IMSAIs had an LED on the front panel that indicated a
"stack" bus cycle.

Since one of the programmers was more of a hardware type and
his experience was with minis that didn't have stacks, he
never used the stack and his program never lit the stack LED.
The other programmer used the stack. You could always tell
who wrote the program currently running by looking at the light.

Posted by Jonathan Kirwan on March 7th, 2006


On Tue, 07 Mar 2006 16:41:50 -0800, Jim Stewart <jstewart@jkmicro.com>
wrote:

Yup. What was most memorable for me about the IMSAIs was their nice,
flat, plastic switches (I have a box of them here, both colors!) Much
better than the metal bat handles of the Altair 8800 and much easier
on the fingers.

Hehe. I'll add that I could tell which program area of the code was
running and what it was doing by listening to the AM radio I kept
nearby.

Jon

Posted by Paul Keinanen on March 8th, 2006


On Tue, 07 Mar 2006 23:34:13 GMT, Jonathan Kirwan
<jkirwan@easystreet.com> wrote:

Recursion was/is overrated anyway, since it consumes excessive stack
space storing local variables in inactive stack frames. When writing
an arithmetic expression evaluator for the DDP-316/516, I never even
considered using recursion for that.

The much worse flaw with those kinds of architectures, which stored the
return address at the entry point of the routine, is that the code is not
re-entrant. This is not a problem in a typical batch system, in which
a single job was executed to completion, no matter how long it lasted.

However, it would be impossible to use shared libraries in some
multitasking systems or even use the same subroutine from main code
and from an interrupt service routine. In a time sharing system, the
whole process, including common libraries, had to be swapped out when
the time quantum had elapsed.

The problem with in-line parameters is that in order for the caller to
call a function with different parameter values, memory locations just
after the call instruction would have to be modified. This would
prevent code sharing in common libraries, make it impossible to have
write protection on code or have different address spaces for code and
data, etc. In practice, the in-line parameters should either be
constants (and the subroutine should know that it is a constant) or
pointers to the actual parameter.

At least the old Fortran versions always passed all parameters by
reference; thus, the caller would put a list of pointers after the
call, which pointed to the actual variable storage or to constants set up
in another area.

Paul


