Strange behaviour while exporting pdf file into text file.

The Amyuni PDF Converter is our Printer Driver that enables you to convert any documents to PDF format. If you have any questions about installing and using the Amyuni PDF Converter please post them here.
Awadhendra
Posts: 2
Joined: Thu Oct 30 2014

Strange behaviour while exporting pdf file into text file.

Post by Awadhendra »

Hi,

I am using the Interop.ACPDFCREACTIVEX.dll assembly to export a PDF document to a text file. It behaves differently on different machines. All of the content is converted correctly, but the line position of some content shifts upward depending on the machine. For example, when I convert the PDF on an XP machine, the line "Place the barcode here or write the serial number and check digit of the electronic file in the spaces provided." appears at the correct position in the text file, but when I convert the same file on a Windows Server machine, that line is shifted toward the top of the text file. I don't know why it behaves like this. Is the PDF export functionality machine-dependent, or am I missing something while exporting? I am using the following code.

StringBuilder pdfTextData = new StringBuilder();
PDFCreactiveX pdfCreater = new PDFCreactiveX();
pdfCreater.SetLicenseKey("companyName", "somelicensekey");

if (File.Exists(pdfFileName))
{
    pdfCreater.Open(pdfFileName, "");
    pdfCreater.Refresh();
    pdfCreater.ExportToRTF(pdfFileName.Replace(".PDF", ".U.TXT"), acRtfExportOptions.acRtfExportOptionText, 1);
}

I need help resolving this issue, as my unit tests produce different results on different machines.

Thanks,
Awadhendra
Devteam
Posts: 119
Joined: Fri Oct 14 2005
Location: Montreal

Re: Strange behaviour while exporting pdf file into text file.

Post by Devteam »

Hello Awadhendra,

The text export functionality is generally not machine-dependent, but it is file-dependent. We tried to reproduce this issue on our side using the limited information you provided and the latest version of the library (5.0.1.3), but we were not able to: the library produces the same results on every system we tried.

Note that for a PDF file that doesn't have the fonts embedded, you may get different results if you export it to RTF on a system that has all those fonts installed versus a system that does not have them.

In order to provide further assistance, we need to be able to reproduce the issue on our end. Can you please double-check that you are using the latest available version of the product? At this moment our latest version is 5.0.1.3. If you are already using this version, can you please email us the PDF file that you are having issues with, together with the two outputs that you are getting? You can send the files to support@amyuni.com.

Regarding the sample code you posted, please note that PDF files may draw characters on a page in any order, using just x-y coordinates. It is therefore recommended to call the method OptimizeDocument(int optimizationLevel) before exporting a PDF to RTF or plain text. This method groups the characters into pieces of text according to their proximity. You can find more information about calling this method on the following page:

http://www.amyuni.com/WebHelp/Amyuni_PD ... Method.htm
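
As a rough illustration of that call order, here is a minimal sketch that reuses the names from the code you posted. Several details are assumptions rather than confirmed values: the ACPDFCREACTIVEX namespace is inferred from the interop assembly name, the optimization level of 1 is only a placeholder (the page linked above lists the supported levels), and the PdfTextExport class and its TextOutputPath helper are hypothetical names introduced for this example.

```csharp
using System.IO;
using ACPDFCREACTIVEX; // assumed namespace, inferred from Interop.ACPDFCREACTIVEX.dll

static class PdfTextExport
{
    // Hypothetical helper: builds "<name>.U.TXT" next to the PDF.
    // Path.ChangeExtension works whatever the case of the original ".pdf" extension.
    public static string TextOutputPath(string pdfFileName)
    {
        return Path.ChangeExtension(pdfFileName, ".U.TXT");
    }

    public static void Export(string pdfFileName)
    {
        if (!File.Exists(pdfFileName))
        {
            return;
        }

        PDFCreactiveX pdfCreator = new PDFCreactiveX();
        pdfCreator.SetLicenseKey("companyName", "somelicensekey");
        pdfCreator.Open(pdfFileName, "");

        // Group individually positioned characters into pieces of text
        // before exporting. The level value 1 is an assumption; check the
        // OptimizeDocument documentation for the levels your version supports.
        pdfCreator.OptimizeDocument(1);

        pdfCreator.ExportToRTF(TextOutputPath(pdfFileName),
            acRtfExportOptions.acRtfExportOptionText, 1);
    }
}
```

The helper is used here because a case-sensitive Replace(".PDF", ".U.TXT") leaves the name unchanged when the file ends in lowercase ".pdf", in which case the export would target the PDF file itself.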

Also note that for certain PDF files, text extraction might not be possible at all. Some PDF files contain incomplete font information, which prevents text extraction from working properly. You can tell whether you have encountered one of these files by opening it in Acrobat Reader and copying a piece of text into Notepad: you will get garbled characters, or no text at all.

Best regards,
Amyuni Support.
Amyuni Development Team
