Planet PDF Forum

How to get internal structure of the PDF file?

Saravanan6 (New Member; Joined: 10 Jan 2012; Location: India; Points: 2)
Posted: 10 Jan 2012 at 5:42am
Hi All,

I would like to know whether any tool is available that exposes the internal structure of a PDF file in an XML-based form, similar to the Open XML representation used by MS Office 2007.

Please enlighten me on this.


Thanks & Regards,
P.SARAVANAN

aandi (Senior Member; Joined: 07 Jul 2011; Points: 18358)
Posted: 10 Jan 2012 at 8:50am
What sort of structure? The object tree? Tagging (which is optional)? A pseudo-structure of visual elements only?

If you aren't sure of the answer, try this instead: what is the purpose of the exercise?
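
To make "the object tree" concrete, here is a minimal sketch in Python that lists the indirect objects in a PDF and the objects each one refers to. The SAMPLE bytes are a hypothetical hand-written fragment, not a complete valid file, and the regex scan is an assumption that only holds for uncompressed objects. Files that use object streams or cross-reference streams (PDF 1.5 and later) need a real PDF parser instead.

```python
import re

# Hypothetical, hand-written fragment: catalog -> page tree -> page.
SAMPLE = b"""%PDF-1.4
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages /Kids [3 0 R] /Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>
endobj
"""

# "N G obj ... endobj" delimits one indirect object.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
# "/Type /Name" gives the object's declared type, when it has one.
TYPE_RE = re.compile(rb"/Type\s*/(\w+)")
# "N G R" is a reference to another indirect object.
REF_RE = re.compile(rb"(\d+)\s+(\d+)\s+R\b")


def list_objects(data: bytes) -> dict:
    """Map object number -> (type name or None, referenced object numbers)."""
    objects = {}
    for match in OBJ_RE.finditer(data):
        number = int(match.group(1))
        body = match.group(3)
        type_match = TYPE_RE.search(body)
        type_name = type_match.group(1).decode("ascii") if type_match else None
        refs = [int(r.group(1)) for r in REF_RE.finditer(body)]
        objects[number] = (type_name, refs)
    return objects


if __name__ == "__main__":
    for number, (type_name, refs) in list_objects(SAMPLE).items():
        print(number, type_name, refs)
```

Following the references from the Catalog object downward is what gives the tree: catalog, then the page tree, then individual pages and their resources.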

Saravanan6 (New Member; Joined: 10 Jan 2012; Location: India; Points: 2)
Posted: 10 Jan 2012 at 11:43am

Hi,

     Thanks for your reply.

I am expecting something like the Open XML representation used by MS Office 2007, because I want to parse (get or extract) every paragraph, image, table, and graph in the PDF document. I then want to select some of the paragraphs, split the original document, and build a new document that contains only the selected paragraphs, without changing any formatting.

Please enlighten me on this...


Thanks & Regards,
P.SARAVANAN



Edited by Rowan - 10 Jan 2012 at 12:15pm

aandi (Senior Member; Joined: 07 Jul 2011; Points: 18358)
Posted: 10 Jan 2012 at 1:22pm
You may be disappointed in what is inside a PDF. There are images, there is text, and there is vector art, drawn a line at a time. There are no tables, no paragraphs, no charts: only collections of text, images and lines which look like them.
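
As a rough illustration of that point, the sketch below tokenizes a tiny page content stream and reports what it actually contains. The CONTENT bytes are a hypothetical, uncompressed example written for this post. Real content streams are usually Flate-compressed and also use constructs this simple tokenizer ignores (TJ arrays, hex strings, inline images), so treat it as a sketch under those assumptions rather than a parser.

```python
import re

# Hypothetical content stream: one text run, then a box drawn as a path
# (move, three line segments, close, stroke). Nothing says "table".
CONTENT = b"""BT
/F1 12 Tf
72 720 Td
(Quarterly totals) Tj
ET
72 700 m
300 700 l
300 650 l
72 650 l
h
S
"""

# A token is either a (literal string) or a run of non-space characters.
TOKEN_RE = re.compile(rb"\((?:\\.|[^\\()])*\)|[^\s()]+")


def operations(stream: bytes):
    """Yield (operator, operands) pairs; operands precede their operator."""
    operands = []
    for token in TOKEN_RE.findall(stream):
        text = token.decode("latin-1")
        if text[0] in "(/+-." or text[0].isdigit():
            operands.append(text)   # string, name, or number
        else:
            yield text, operands    # anything else is treated as an operator
            operands = []


def summarize(stream: bytes):
    """Return (text runs shown with Tj, number of line segments drawn)."""
    text_runs, segments = [], 0
    for operator, operands in operations(stream):
        if operator == "Tj":
            text_runs.append(operands[0][1:-1])  # strip the parentheses
        elif operator == "l":
            segments += 1
    return text_runs, segments


if __name__ == "__main__":
    print(summarize(CONTENT))
```

The page description holds positioned text and path segments only. Any notion of a table cell or a paragraph has to be inferred from coordinates by whatever tool reads the file.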
 
The exception is a tagged PDF file, which carries an optional logical structure. Are you dealing with tagged files?
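
One rough way to answer that question for a given file is to look in the document catalog for a /MarkInfo dictionary with /Marked true and for a /StructTreeRoot entry, which is how a tagged PDF declares its structure tree. The sketch below does this with a plain byte scan. The two sample catalogs are hypothetical, and the scan assumes the catalog is stored uncompressed with /MarkInfo written inline, which is not guaranteed in real files.

```python
import re

# /MarkInfo << ... /Marked true ... >> written inline in the catalog.
MARKED_RE = re.compile(rb"/MarkInfo\s*<<[^>]*/Marked\s+true")
# /StructTreeRoot followed by an indirect reference "N G R".
STRUCT_RE = re.compile(rb"/StructTreeRoot\s+\d+\s+\d+\s+R")

# Hypothetical catalog dictionaries for illustration only.
TAGGED = (b"<< /Type /Catalog /Pages 2 0 R "
          b"/MarkInfo << /Marked true >> /StructTreeRoot 5 0 R >>")
UNTAGGED = b"<< /Type /Catalog /Pages 2 0 R >>"


def looks_tagged(data: bytes) -> bool:
    """Heuristic: True if both tagged-PDF markers appear in the raw bytes."""
    return bool(MARKED_RE.search(data)) and bool(STRUCT_RE.search(data))


if __name__ == "__main__":
    print(looks_tagged(TAGGED), looks_tagged(UNTAGGED))
```

A negative result from this scan does not prove the file is untagged, since the catalog may sit inside a compressed object stream. A PDF library that resolves objects properly gives a more reliable answer.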