Author Topic: Re: IDA2PAT Reloaded doesn't seem to log all signatures.  (Read 13669 times)

Offline Administrator

Re: IDA2PAT Reloaded doesn't seem to log all signatures.
« on: May 03, 2015, 12:27:19 AM »
From a question PM'ed to me on unknowncheats.me:

Quote
Originally Posted by BackDoorMan
Hello sir

I've just downloaded your IDA plugin for generating .pat files however it doesn't export sigs for all of the functions.
What could be the problem?

----------------log------------------------------------

== IDA2PAT-Reloaded plug-in: v: 1.1 - Mar 28 2011, By Sirmabus ==

Support forum: http://www.macromonkey.com/bb/viewforum.php?f=65

Functions to process: 14835


Working, <Press Pause/Break key to abort>...

Patterns created: 79

Process time: 11.41 seconds.

Finished.
-------------------------------------------------------------

As you can see 79 patterns of 14835 functions.

I also used idb2pat python script which exports fine but it doesn't seem to mask pointers... ipso facto generated .sig file doesn't match all the functions.

IDA2PAT reloaded signature:
51A1........8B086A01688F00000050FF91E4000000A1........8B106A0068

IDB2PAT.PY signature:
51A1949A70008B086A01688F00000050FF91E4000000A1949A70008B106A0068

Thanks in advance!
Best regards.

Re: IDA2PAT Reloaded doesn't seem to log all signatures.

I'll add that to the list of the ones I need to update and make a 64-bit version of.
I've updated most of them locally and should have them up on my website soon.
That reminds me, there is an update to FLAIR that should add a little more integrity to sigs, which I could look into.

First issue:
I made my version save only user-named functions and reject all the default ones like "sub_402230", etc.
So if most of your 14835 functions still have default names, a low count like 79 is expected.
The original tool tried to use flags to determine whether functions were "auto-generated", but those flags are unreliable, so I use a string filter instead.
Something could be going wrong there.
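A string filter of that kind could look roughly like the sketch below. This is not the plug-in's actual code: the prefix list and the name `is_default_name` are assumptions for illustration, and IDA has more auto-generated prefixes than the few listed here.

```python
import re

# Assumed prefixes for IDA auto-generated names; illustrative, not exhaustive.
_DEFAULT_NAME = re.compile(r"^(sub|nullsub|loc|unknown_libname)_[0-9A-Fa-f]+$")

def is_default_name(name: str) -> bool:
    """Return True if the name looks auto-generated, e.g. "sub_402230"."""
    return _DEFAULT_NAME.match(name) is not None

# Hypothetical function names, only to exercise the filter.
names = ["sub_402230", "nullsub_1", "CreatePlayer", "sub_routine_init"]
user_named = [n for n in names if not is_default_name(n)]
print(user_named)
```

Note that a filter like this would also throw away a user name such as "sub_dead", since it is a default prefix followed by hex digits. That kind of false reject is one way a string filter can go wrong.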

Second issue:
The whole pattern matching thing is a loaded issue.
First of all, IDA's FLAIR system is pretty ambiguous. That's why you'll see IDA placing bad signatures for its runtime set, etc.
See: https://www.hex-rays.com/products/id...in_depth.shtml

I've done research in this area and have my own simple signature tool in progress that I haven't released yet.
The biggest problem with FLAIR, and with similar simple signature matching in general, is that it doesn't consider things like uniqueness, commonality, etc.

Imagine you are interested in a "std::string" constructor function. The problem is that even though you name one of them, there could be many copies of the exact same function.
To make matters worse, it could be a small function with very few details (like immediate values, etc.) to distinguish it from any number of similar functions.

Then sure, your named one should end up in your .PAT file (from an IDA2PAT tool), but it will fail in sigmake.exe if there is a conflict (because two or more functions have the same signature, etc.).
If you resolve the conflict (by editing the result file) and the signature makes it into your .SIG file, imagine now applying it to a new/updated target.
It's going to apply the name to the first match it finds, which is probably not the one you wanted it to match.
On top of that, it will match a bunch more (all with serialized names).
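The conflict can be illustrated by grouping functions by their pattern string. This is only a sketch: the function names and patterns are made up, and it is not how sigmake.exe works internally. It just shows why two functions with identical patterns cannot be told apart.

```python
from collections import defaultdict

def find_collisions(patterns):
    """Map each pattern shared by 2+ functions to the names that share it.

    `patterns` is a dict of function name -> pattern string (hypothetical data).
    """
    by_pattern = defaultdict(list)
    for name, pat in patterns.items():
        by_pattern[pat].append(name)
    return {pat: names for pat, names in by_pattern.items() if len(names) > 1}

patterns = {
    "string_ctor":  "558BEC51A1........8B08",
    "sub_401000":   "558BEC51A1........8B08",  # same bytes as string_ctor
    "InitRenderer": "558BEC83EC10A1........",
}
for pat, names in find_collisions(patterns).items():
    print(pat, "->", names)
```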

See the problem?
FLAIR is pretty dumb about the whole scene. It doesn't know about degree of significance, similarity, relative positions, etc.

Then another issue is knowing how to properly mask out the non-exclusive features during PAT generation.
So when you have something like "mov dword_1B4B52C,eax", the tool should keep most of the details, maybe masking out only the "dword_1B4B52C" part and keeping the rest.
The address (probably) won't carry over when the signature is applied to another target, so it needs to be masked.
But when you think about it, it actually is a significant detail. If nothing else, there is a relative position hint in it somewhere.
It's just that AFAIK no existing tool stores any meaningful context from it.
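To make the masking concrete, here is a minimal sketch that turns the unmasked IDB2PAT.PY bytes quoted above into the masked IDA2PAT form. The helper name `mask_pattern` and the span offsets are assumptions: the offsets were picked by hand to match the two quoted signatures, while a real tool would get the operand locations from the disassembler.

```python
def mask_pattern(raw_hex, spans):
    """Replace the bytes in each (offset, length) span with '..' wildcards."""
    data = bytes.fromhex(raw_hex)
    masked = set()
    for offset, length in spans:
        masked.update(range(offset, offset + length))
    return "".join(".." if i in masked else f"{b:02X}" for i, b in enumerate(data))

# Unmasked bytes from the IDB2PAT.PY signature quoted in this thread.
raw = "51A1949A70008B086A01688F00000050FF91E4000000A1949A70008B106A0068"
# Assumed operand locations: the 4-byte address after each A1 opcode.
spans = [(2, 4), (23, 4)]
print(mask_pattern(raw, spans))
```

Only the two address operands are wildcarded; the immediate in "688F000000" stays because it does not depend on where the module is loaded.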

What is needed is a new type of tool that considers the whole picture.
And some things will just have to go: for functions that can't be tracked reliably, it will say so and reject them, etc.
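As a rough idea of what "say so and reject them" could mean, the sketch below rejects patterns that have too few fixed (non-wildcard) bytes and records the reason. Everything here is hypothetical: the names `fixed_byte_count` and `triage`, the threshold of 8 bytes, and the sample patterns are placeholders, not part of any existing tool.

```python
def fixed_byte_count(pattern):
    """Count the bytes in a pattern string that are not '..' wildcards."""
    return sum(1 for i in range(0, len(pattern), 2) if pattern[i:i + 2] != "..")

def triage(patterns, min_fixed=8):
    """Split patterns into accepted ones and rejected ones with a reason."""
    accepted, rejected = {}, {}
    for name, pat in patterns.items():
        n = fixed_byte_count(pat)
        if n < min_fixed:
            rejected[name] = f"only {n} fixed bytes, need {min_fixed}"
        else:
            accepted[name] = pat
    return accepted, rejected

# Hypothetical patterns: one tiny function, one with enough detail.
patterns = {
    "tiny_getter":  "A1........C3",
    "InitRenderer": "558BEC83EC10A1........8B086A0150",
}
accepted, rejected = triage(patterns)
print("accepted:", list(accepted))
print("rejected:", rejected)
```

A byte count is a crude stand-in for "degree of significance"; a real tool would also have to weigh uniqueness and relative position.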

I'll update my variation to a newer IDA SDK, as that might help.
But I think any more time spent on FLAIR (a broken system from my perspective) would be kind of a waste.