PDF types

Printed From: Foxit's Planet PDF Forum
Category: Lets Talk PDF
Forum Name: General
Forum Description: This is for experienced PDF users. Here you can talk about any topics not set aside for Developers, Prepress and PDF Forms.
Printed Date: 15 Nov 2019 at 12:44am

Topic: PDF types
Posted By: cowboy
Subject: PDF types
Date Posted: 12 Mar 2012 at 1:06pm
Hello, my PDF background is simply that of a user: I have Reader X to view PDF files, mostly books.

From what I can tell, there are 2 types that I run across. The first is a real text file that I can copy/paste from and/or "save as" text. The second is an image file that you can't copy/paste from or "save as" text.

I seem to have run across a third type: an image that I can copy/paste from (I haven't tried to "save as" yet, as it is a large file).

Can someone fill me in on these different types ??


Posted By: aandi
Date Posted: 12 Mar 2012 at 1:44pm
They aren't really different types of PDF.
Every PDF can have some text, or none. And some pictures or none.
Sometimes what looks like text is just a picture.
Sometimes a picture has invisible text behind it, so that you can still copy/paste.
A PDF can have any or all of these in any mixture.
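
As a rough illustration of that mixture, here is a minimal Python sketch that labels each page by what it contains. It assumes the third-party pypdf library is installed (it is not mentioned in this thread), and the function names `describe_page` and `survey_pdf` are made up for this example. It cannot tell whether a text layer is visible or hidden behind a picture; it only reports that text and pictures are both present.

```python
def describe_page(extracted_text: str, image_count: int) -> str:
    """Label a page by whether it holds extractable text, pictures, or both."""
    has_text = bool(extracted_text.strip())
    has_images = image_count > 0
    if has_text and has_images:
        # Could be an illustrated page, or a scan with invisible text behind it.
        return "text and pictures"
    if has_text:
        return "text only"
    if has_images:
        # What looks like text here is probably just a picture.
        return "pictures only"
    return "empty"


def survey_pdf(path: str) -> list[str]:
    """Return one label per page of the PDF at `path` (hypothetical helper)."""
    # Imported here so describe_page() stays usable without pypdf installed.
    from pypdf import PdfReader  # assumption: a recent pypdf release

    reader = PdfReader(path)
    return [
        describe_page(page.extract_text() or "", len(page.images))
        for page in reader.pages
    ]
```

Calling `survey_pdf("book.pdf")` (the file name is only a placeholder) would give one label per page. A "pictures only" page is the kind where copy/paste is likely to fail.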

Posted By: aandi
Date Posted: 12 Mar 2012 at 1:57pm

Perhaps I can answer a different question: "how can I tell if copy/paste will work for a particular PDF". Sadly, you can't. You can just try it and see.

Posted By: Rowan
Date Posted: 12 Mar 2012 at 1:58pm
Anecdotally speaking, I've seen people refer to PDFs that have been generated from a scanned paper document (without subsequent OCR'ing) as image-based PDFs and PDFs generated from Microsoft Word (or similar authoring tools) as text-based PDFs.

I guess they are called image-based PDFs because although they look like they might have selectable text, they actually do not.

As aandi has already said, these aren't different types of PDFs -- just a simple (perhaps too simple, to the point of being misleading) way of describing the state of the text that you can see in a PDF.

Posted By: cowboy
Date Posted: 12 Mar 2012 at 2:28pm
I understand everything that has been said here, except for the one about an image having text behind it...

To clarify, I realize "pdf document" is a broad term.
For my use and question, it is for those documents that appear to be text, like books.

I get some from Google Books and other places.
Clearly, especially with Google, they are images, as when you try to copy/paste
it won't work. Others are just the opposite, in that they will copy/paste....

It was the 3rd type that threw me: it would let me copy/paste, yet when I tried
to save a portion via "cutepdf" it came out as an image. I guess the "text behind the image"
explains that.

The gist of my needs/problem is to have a way to save portions of a PDF text file.
Sometimes I may want to save 20 pages of a file, and there doesn't seem to be
a way to do that via Reader (i.e., it is save all or none).
Are there tools where you can do that? "cutepdf" was one I used.

The second part is dealing with imaged text: what would be the best way to convert an image to text?

Posted By: Rowan
Date Posted: 23 Mar 2012 at 7:40am
I recently wrote a tips and tricks article for Planet PDF which describes how to use Google Docs to convert a scanned PDF to text. You can read it here: "Convert scanned PDFs to text documents using Google Docs"

As far as free solutions go, this is probably as good as you will get at this stage.

As for saving 20 pages of a PDF file, I presume you mean you want to extract 20 pages from a PDF? Just search for a free tool that does PDF splitting. Most PDF splitting tools will let you split by page range. This is technically the same as "extracting"; it's just a different term.
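
For anyone comfortable with a little scripting, splitting by page range can be sketched in Python as below. This is only an illustration, not a recommendation of a particular tool. It assumes the third-party pypdf library is installed, and the function names `parse_page_range` and `extract_pages`, as well as the "5-24" style of range, are made up for this example.

```python
def parse_page_range(spec: str, page_count: int) -> list[int]:
    """Turn a 1-based spec such as "5-24" or "1-3,7" into 0-based page indices."""
    indices: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        first, sep, last = part.partition("-")
        start = int(first)
        end = int(last) if sep else start  # a single number means one page
        if not 1 <= start <= end <= page_count:
            raise ValueError(f"page range {part!r} is outside 1..{page_count}")
        indices.extend(range(start - 1, end))
    return indices


def extract_pages(src_path: str, dst_path: str, spec: str) -> None:
    """Copy the pages named by `spec` from one PDF into a new PDF."""
    # Imported here so parse_page_range() works without pypdf installed.
    from pypdf import PdfReader, PdfWriter  # assumption: pypdf is available

    reader = PdfReader(src_path)
    writer = PdfWriter()
    for index in parse_page_range(spec, len(reader.pages)):
        writer.add_page(reader.pages[index])
    with open(dst_path, "wb") as out_file:
        writer.write(out_file)
```

For example, `extract_pages("book.pdf", "excerpt.pdf", "5-24")` would copy 20 pages into a new file (both file names are placeholders). The pages are copied as they are, so a page that is text stays text and a scanned page stays an image.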
