Free Or Cheap OCR Software For Scanned PDF Pages
Posted by Martin on December 24th, 2007


Hello,

I receive a monthly document, about 150 pages long, whose file extension is
pdf.

Adobe Reader's Find function can't locate any words within it.

When I spoke with the person who sent it to me, he indicated that the
document consisted of scanned pages, and that Adobe Reader's Find function
would be useless.

He does not have the original document that could possibly be converted into
a true PDF.

Rather than printing a monthly document of 150 pages or so, is there any
*free* or relatively inexpensive OCR software that I could use to convert
the document into a readable format?

As always, I very much appreciate all assistance offered in this NG.

Regards,

Martin


Posted by Mike Easter on December 24th, 2007


Martin wrote:

Some .pdf 'documents' are simply graphics of a scan like a .gif or .png
which do not contain any text, just graphic pixels.
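
One rough way to check whether a given .pdf is this kind of image-only scan is to look for any font reference in the file, since a page with real text normally needs at least one font. The sketch below is only a heuristic under that assumption: the function name `has_font_reference` is made up for illustration, it uses nothing beyond the Python standard library, and it can be fooled by encrypted files or unusual stream filters.

```python
import re
import zlib

# Matches the raw bytes between "stream" and "endstream" in a PDF file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def has_font_reference(data: bytes) -> bool:
    """Heuristic: True if the PDF bytes mention a /Font anywhere.

    No /Font at all suggests the pages are only scanned images,
    so Find has nothing to search and OCR would be needed.
    """
    # Uncompressed dictionaries can be searched directly.
    if b"/Font" in data:
        return True

    # Newer PDFs may pack dictionaries into Flate-compressed streams,
    # so try to inflate each stream and search the result as well.
    for match in STREAM_RE.finditer(data):
        try:
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (for example a JPEG image)
        if b"/Font" in inflated:
            return True
    return False
```

To try it on a real file, read the file in binary mode and pass the bytes, for example `has_font_reference(open("report.pdf", "rb").read())`, where `report.pdf` is a placeholder name. A result of `False` only suggests that OCR is needed; it does not prove it.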

The last time a discussion about a particular .pdf came up in the ng
alt.comp.freeware (a.c.f.), the discussant posted a link to the actual
.pdf, and the resolution of the target was so poor that I didn't think
OCR would be able to read the text. However, that was a copy of a
19th century newspaper article.

The most recent discussion of freeware OCR in a.c.f was this one
http://groups.google.com/group/alt.c...5083f7d34966d1
or http://snipr.com/1vrld

<snip>
Message-Id: <474a612c$0$8794$4c368faf@roadrunner.com>
Subject: Ocr software
Newsgroups: alt.comp.freeware
Date: Mon, 26 Nov 2007 01:01:53 -0500

is there any excellent freeware ocr application ?
</snip>

That discussion resulted in quite a number of suggestions and links.
It also pointed out that, besides the freeware available for download,
most scanners come with an OCR program, sometimes of fairly good
quality, sometimes not.


--
Mike Easter


Posted by Martin on December 26th, 2007


Hello Mike,

Thank you very much for your reply.

I'll be trying some of the software in question; I hope that at least one
will be able to read a "fake" pdf file.

With regard to Excel spreadsheets, perhaps you or other NG readers can
assist me.

I converted many HTML pages from the same site into Excel. One HTML page =
one Excel page.

Is there a way to import or merge the pages so that, rather than having many
of them, I can have only one, regardless of length?

If it's relevant, I have Excel 2002 and the format of each HTML page is
identical to all others on the site.

Once again, thank you for any assistance offered.

Regards,

Martin
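
One possible approach to the merge question, sketched outside Excel: assuming each Excel page is first saved as a CSV file (File > Save As), and that every page starts with the same header row, a short Python script can append them into one file. The function name `merge_csv_pages` and the file names in the usage note are hypothetical, and the sketch assumes the CSV files are UTF-8 or plain ASCII.

```python
import csv


def merge_csv_pages(page_paths, merged_path):
    """Append the rows of every page under one header row.

    Returns the number of data rows written (header not counted).
    """
    rows_written = 0
    header = None
    with open(merged_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for path in page_paths:
            with open(path, newline="", encoding="utf-8") as in_file:
                reader = csv.reader(in_file)
                page_header = next(reader, None)
                if page_header is None:
                    continue  # skip empty files
                if header is None:
                    # The first page supplies the header for the merged file.
                    header = page_header
                    writer.writerow(header)
                elif page_header != header:
                    # Identical format is assumed; stop if a page differs.
                    raise ValueError(f"{path} has a different header row")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

For example, `merge_csv_pages(["page1.csv", "page2.csv"], "merged.csv")` would write one combined file that Excel can open again. Keep the merged file out of the list of input pages so it is not read back into itself.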


