Search for text on specific lines and within specific columns
We need to be able to programmatically search PDF files for text that is on line x, y, or z and between columns a and b. Our current solution provides the location of each word in the document, which lets us define searches in a manner that closely approximates columns and lines. We would like to find an alternative solution. Do any of your tools have the ability to provide line and column information about the text within a PDF?
It will help during the actual search. But we need to know the location of all text prior to a search. This allows the user to define a document by indicating that text "x" will be found on line "y" starting in column "z". With this information the program can determine which document definition a document belongs to by doing a search for "x" and then comparing the coordinates returned with each of the document definitions.
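To make the matching step concrete, here is a minimal sketch in Python of what I mean by comparing a document against document definitions. It assumes the text locations have already been reduced to line and column numbers. The names `Word`, `Anchor`, `matches`, and `classify` are made up for illustration and are not part of any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A word extracted from a page (hypothetical structure)."""
    text: str
    line: int    # 1-based line number on the page
    column: int  # 1-based starting column


@dataclass(frozen=True)
class Anchor:
    """One rule in a document definition: text "x" on line "y" at column "z"."""
    text: str
    line: int
    column: int


def matches(words, anchors):
    """Return True if every anchor text appears at its expected line and column."""
    positions = {(w.text, w.line, w.column) for w in words}
    return all((a.text, a.line, a.column) in positions for a in anchors)


def classify(words, definitions):
    """Return the name of the first definition whose anchors all match, else None."""
    for name, anchors in definitions.items():
        if matches(words, anchors):
            return name
    return None


# Sample data, invented for this sketch.
words = [Word("INVOICE", 1, 30), Word("Total", 12, 5)]
definitions = {
    "invoice": [Anchor("INVOICE", 1, 30)],
    "statement": [Anchor("STATEMENT", 1, 28)],
}
print(classify(words, definitions))
```

An exact match on line and column is the simplest rule. In practice a small tolerance on the column is probably needed, since the positions only approximate a character grid.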
Bottom line: we need to be able to run a document through a process that will give us the coordinates for all of the text on the page.
Hello,
Using the current version of the PDF Creator, you can go over all the objects in a PDF file and retrieve each object's parameters. So you can loop through all the objects, check whether each one is a text object, and retrieve its coordinates.
There are also two functions that might be useful for you: GetObjectXY() returns the object located at a given coordinate, and DelimitedText() returns all the text within a specific area.
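Once the text objects and their coordinates have been collected, mapping them to lines and columns could look roughly like the Python sketch below. It does not call the PDF Creator API. `TextObject` is a hypothetical stand-in for whatever the object loop yields, and the top-left origin, 12-point line height, and 6-point character width are assumptions to adjust for your documents. PDF's native coordinate system has its origin at the bottom-left, so y may need to be flipped first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextObject:
    """Hypothetical stand-in for a text object read from the PDF."""
    text: str
    x: float  # left edge in points, measured from the left of the page
    y: float  # distance in points from the top of the page (assumed origin)


def to_line_col(obj, line_height=12.0, char_width=6.0):
    """Map point coordinates onto an assumed fixed grid of lines and columns."""
    line = int(obj.y // line_height) + 1
    column = int(obj.x // char_width) + 1
    return line, column


def find_text(objects, lines, col_from, col_to, **grid):
    """Return (line, column, text) for objects on the given lines whose
    starting column lies between col_from and col_to, inclusive."""
    hits = []
    for obj in objects:
        line, column = to_line_col(obj, **grid)
        if line in lines and col_from <= column <= col_to:
            hits.append((line, column, obj.text))
    return sorted(hits)


# Sample data, invented for this sketch.
objects = [
    TextObject("INVOICE", x=180.0, y=5.0),
    TextObject("Total", x=24.0, y=134.0),
    TextObject("Page 1", x=500.0, y=5.0),
]
print(find_text(objects, lines={1, 12}, col_from=1, col_to=40))
# prints [(1, 31, 'INVOICE'), (12, 5, 'Total')]
```

A fixed grid only works for monospaced, evenly spaced text. For proportional fonts, comparing raw coordinates against a tolerance is likely more reliable than converting to columns.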