- How to locate compiled binary code relative to source code?
- Posted by John Doe on July 13th, 2004
Win32. I am learning about the guts of the PE format so that I can modify
an EXE after distribution. As an exercise I want to solve the following
problem: I need to be able to identify, in the compiled EXE, a specific
location relative to the source code.
I.e., I need to write code like this:
void f ()
{
    some code;
    some code;
    IDENTIFIER1;
    some code a;
    some code b;
    some code c;
    IDENTIFIER2;
    some code;
}
When it compiles, I need to be able to locate in the binary exe the offsets
of IDENTIFIER1 and IDENTIFIER2 such that in between these offsets in the
compiled exe image are the instructions for "some code a, b and c". I don't
really care about the IDENTIFIERs themselves. All I really want to know is
exactly where in the binary the instructions between them (a, b, c) are.
Constant strings do not work (at least directly) because the compiler
relocates them to a data(?) section as a lump with all the other constant
strings and indirectly references them. "#pragma comment" does the same
thing.
I thought about using a unique series of inline assembly (that does
effectively nothing), and searching for the compiled opcodes that
correspond - but this is very awkward since I may need numerous sequences
which all have to be unique.
Any ideas?
TIA.
M.
- Posted by Phlip on July 13th, 2004
John Doe wrote:
You are asking for trouble.
If you really want trouble, try posting this to
news:microsoft.public.win32.programmer.kernel
--
Phlip
http://industrialxp.org/community/bi...UserInterfaces
- Posted by Thomas Matthews on July 13th, 2004
John Doe wrote:
Your best bet is assembly language. Put in labels where you need
them and export them (make them public).
One can always get the location of a function by using a function
pointer. However, to get at a certain location within the
function is not possible.
Have you tried having the compiler print out the assembly language
listing?
In the old days of assembly language and simple operating systems,
we could write programs that modified themselves. But alas, how
times change.
--
Thomas Matthews
C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book
- Posted by Phlip on July 13th, 2004
Thomas Matthews wrote:
Modern CPUs pipeline instructions, so modifying them causes chip-level
undefined behavior.
I think the OP wants install-time modifications, not run-time.
- Posted by Gerry Quinn on July 13th, 2004
In article <10f7vjmj90o9173@corp.supernews.com>, nope@nospam.com says...
All the same, that's probably what you need. However, I don't think the
problem is as bad as you think. Let's say you identify one unique
series of assembly that does nothing, and that can get compiled reliably
by the compiler (you might have to switch off optimisations around it).
Then it should be easy to make as many unique sequences as you want.
For the sake of argument, say you can identify the following sequence:
Push register A on stack
Load A, 555555555
Pop register A from stack
You probably can, because constants of 555555555 will not appear very
often in code.
Then if you add another line after loading 555555555:
Load A, n
...you have a distinguishable identifier for every value of n.
I don't know if the exact example above will work, but it should be
possible to do something of the kind.
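The search side of this idea can be sketched as follows. This is only an illustration under stated assumptions: 32-bit x86 code in which the compiler emits the marker verbatim as push eax (50), mov eax, 555555555 (B8 plus four little-endian bytes), mov eax, n (B8 plus four bytes) and pop eax (58). The names find_markers, kMagic and kMarkerSize are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Assumed marker layout (32-bit x86 encodings):
//   50              push eax
//   B8 <magic:4>    mov  eax, 555555555
//   B8 <n:4>        mov  eax, n
//   58              pop  eax
constexpr std::uint32_t kMagic = 555555555u;  // constant from the post
constexpr std::size_t kMarkerSize = 12;       // total bytes in the layout above

// Reads a 32-bit little-endian value, independent of host byte order.
static std::uint32_t read_le32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Returns (file offset, n) for every marker found in the image bytes.
std::vector<std::pair<std::size_t, std::uint32_t>>
find_markers(const std::vector<std::uint8_t>& image) {
    std::vector<std::pair<std::size_t, std::uint32_t>> hits;
    for (std::size_t i = 0; i + kMarkerSize <= image.size(); ++i) {
        const std::uint8_t* p = image.data() + i;
        if (p[0] == 0x50 && p[1] == 0xB8 && read_le32(p + 2) == kMagic &&
            p[6] == 0xB8 && p[11] == 0x58) {
            hits.emplace_back(i, read_le32(p + 7));
        }
    }
    return hits;
}
```

The instructions of interest would then start kMarkerSize bytes after the offset reported for IDENTIFIER1 and end at the offset reported for IDENTIFIER2. A byte scan like this can in principle match data that merely looks like the marker, so the hits are worth checking against a disassembly.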
- Gerry Quinn
- Posted by Corey Murtagh on July 13th, 2004
Gerry Quinn wrote:
<snip>
Just as an extension to that, to aid in searching and so on, how about
adding a DoNothing subroutine that just returns immediately and call that?
i.e.:
    PUSH EAX
    MOV EAX, 0
    CALL DoNothing
    POP EAX
A little longer, but you can #define it (compiler-specific ASM blocks
aside):
#define MARK_CODE(n) \
    asm { \
        PUSH EAX \
        MOV EAX, n \
        CALL DoNothing \
        POP EAX \
    };
Then you just add MARK_CODE(0), etc. to your program to add
easily-searchable markers... assuming you don't call DoNothing from
anywhere else, and you have debug symbols in your disassembled code.
Might work :>
--
Corey Murtagh
The Electric Monk
"Quidquid latine dictum sit, altum viditur!"
- Posted by TLOlczyk on July 14th, 2004
On Tue, 13 Jul 2004 08:28:04 -0700, "John Doe" <nope@nospam.com>
wrote:
Hmmm. I think I-mode publications is out of business, but it's worth a
shot. They make a CD of the last five years' publications of about
twenty magazines. Other than that you might want to check out
whether MSJ (and whatever it turned into) keeps CDs of old issues.
Look for Matt Pietrek's columns, and you should be able to find
something. I know he wrote a column about printing out a stack trace
of an unhandled exception, but I think he only read the symbols
of the system functions (using dbghelp).
If you are looking at Borland, I heard a rumor that they were
making a dll available to read their symbol tables, but I haven't seen
it. ( Not enough present interest. ) If they don't, then they are
very proprietary about their debug info. So you may be out of luck.
The reply-to email address is olczyk2002@yahoo.com.
This is an address I ignore.
To reply via email, remove 2002 and change yahoo to
interaccess,
**
Thaddeus L. Olczyk, PhD
There is a difference between
*thinking* you know something,
and *knowing* you know something.
- Posted by Programmer Dude on July 14th, 2004
Corey Murtagh writes:
Heh! Cool ideas. One more: you can just JMP around an arbitrary block
of bytes, which would allow you to use just about anything as a marker.
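A rough sketch of what the patching tool would search for with this approach, assuming the marker is emitted as an x86 short jump (opcode EB followed by an 8-bit forward displacement) and then the skipped tag bytes. The function names and the tag text are hypothetical, and getting the compiler to emit these bytes inline is a separate, compiler-specific problem.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Builds "EB <len>" (x86 short JMP, rel8) followed by the tag bytes that the
// jump skips. Returns an empty vector if the tag cannot be skipped by a
// short forward jump (empty, or longer than 127 bytes).
std::vector<std::uint8_t> make_jmp_marker(const std::string& tag) {
    std::vector<std::uint8_t> bytes;
    if (tag.empty() || tag.size() > 127) return bytes;
    bytes.push_back(0xEB);
    bytes.push_back(static_cast<std::uint8_t>(tag.size()));
    for (char c : tag) bytes.push_back(static_cast<std::uint8_t>(c));
    return bytes;
}

// Returns the offset of the first occurrence of the marker, or -1 if absent.
std::ptrdiff_t find_jmp_marker(const std::vector<std::uint8_t>& image,
                               const std::string& tag) {
    const std::vector<std::uint8_t> marker = make_jmp_marker(tag);
    if (marker.empty()) return -1;
    auto it = std::search(image.begin(), image.end(),
                          marker.begin(), marker.end());
    return it == image.end() ? std::ptrdiff_t{-1} : it - image.begin();
}
```

Because the tag is never executed, it can be any readable string, such as "MARK0001", which makes each marker unique without inventing a new do-nothing instruction sequence.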
- Posted by Thomas G. Marshall on July 15th, 2004
Programmer Dude <Chris@Sonnack.com> coughed up the following:
Are the days gone when you could simply place things into a function abc()
in program X, and then have your modifier program look up the value (the
location) of abc in the symbol table generated by the compiler when
compiling X?
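If a symbol table does supply the address of abc, the modifier program still has to turn that address into an offset within the EXE file. A minimal sketch of that step, assuming the address has already been reduced to an RVA (address minus image base) and the section table has been read into the hypothetical Section struct below, whose fields mirror VirtualAddress, VirtualSize, PointerToRawData and SizeOfRawData from the PE section header:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the PE section header fields that matter here.
struct Section {
    std::uint32_t virtual_address;      // RVA at which the section is mapped
    std::uint32_t virtual_size;         // size of the section in memory
    std::uint32_t pointer_to_raw_data;  // file offset of the section's bytes
    std::uint32_t size_of_raw_data;     // size of the section on disk
};

// Maps an RVA to a file offset. Returns false when no section backs the RVA
// with bytes in the file (for example, zero-filled data past the raw size).
bool rva_to_file_offset(const std::vector<Section>& sections,
                        std::uint32_t rva, std::uint32_t& offset) {
    for (const Section& s : sections) {
        if (rva < s.virtual_address) continue;
        const std::uint32_t delta = rva - s.virtual_address;
        if (delta < s.virtual_size && delta < s.size_of_raw_data) {
            offset = s.pointer_to_raw_data + delta;
            return true;
        }
    }
    return false;
}
```

For example, with a code section mapped at RVA 0x1000 and stored at file offset 0x400, an RVA of 0x1234 corresponds to file offset 0x634.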
- Posted by Arthur J. O'Dwyer on July 15th, 2004
On Thu, 15 Jul 2004, Thomas G. Marshall wrote:
Yes. A lot of people will compile their programs with different
compilers, and/or simply turn off debugging information. Makes for
smaller binaries if you leave out the symbol table. To take an
extreme example, anyone who's ever watched IE crash on a machine with
the MSVC++ debugger installed knows that IE's binary doesn't have any
symbols attached. 
Of course, the OP /is/ asking for trouble; what happens if the compiler
realizes that the JMP or the PUSH/MOV/POP is useless, and optimizes it
out? Or inlines the function call to 'DoNothing'? Then he loses.
-Arthur
- Posted by TLOlczyk on July 16th, 2004
On Thu, 15 Jul 2004 09:58:06 -0400 (EDT), "Arthur J. O'Dwyer"
<ajo@nospam.andrew.cmu.edu> wrote:
The release compiler is notably buggier than the
debug one (at least up to v6). So by toggling to a release version,
you make your program buggier.
- Posted by Gerry Quinn on July 16th, 2004
In article <49sef05u9e3a6mq0ubcv1aqftrbuomni6i@4ax.com>, olczyk2002
@yahoo.com says...
Evidence? (Of course changing to a release version sometimes makes
array boundary bugs show up because there is less padding.)
- Gerry Quinn
- Posted by Gerry Quinn on July 16th, 2004
In article <Pine.LNX.4.58-035.0407150953380.1705@unix43.andrew.cmu.edu>,
ajo@nospam.andrew.cmu.edu says...
There's usually a pragma available to turn off optimisation at a given
point. Alternatively, one can do certain tricks that will frighten away
the optimiser, e.g. changing a global variable that is accessed at the
end of the program and printed on screen for a microsecond.
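One way to sketch the global-variable trick in portable C++ is a store to a volatile global, which the compiler is not allowed to remove. The names g_code_marker and MARK_CODE and the 0x5A5A tag scheme are hypothetical, and the exact instruction emitted for the store depends on the compiler and target.

```cpp
#include <cstdint>

// Hypothetical marker variable. 'volatile' obliges the compiler to emit
// every store to it, even though nothing in the program reads the value.
volatile std::uint32_t g_code_marker = 0;

// Hypothetical tag scheme: a fixed magic value in the high 16 bits and the
// marker id in the low 16 bits, so each store carries a searchable constant.
#define MARK_CODE(n) \
    do { g_code_marker = 0x5A5A0000u | ((n) & 0xFFFFu); } while (0)

int f(int x) {
    x += 1;
    MARK_CODE(1);
    x *= 3;  // stands in for "some code a, b, c"
    x -= 2;
    MARK_CODE(2);
    return x;
}
```

The two stores are kept and stay in order relative to each other, but an optimiser may still move ordinary, non-volatile computations across them, so the bytes between the two stores are not guaranteed to be exactly the statements written between the macros. Checking the assembly listing, or switching off optimisation around the region, remains necessary.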
- Gerry Quinn