Memory increase until crash

The PDF Creator .NET Library enables you to create secure PDF documents on the fly and to view, print and process existing PDF documents. If you have any questions about PDF Creator .Net, please post them here.
kensands
Posts: 9
Joined: Wed Dec 19 2012

Memory increase until crash

Post by kensands »

I've got a very simple loop that looks for a picture object on each page of a PDF that is a few thousand pages long. When I run it, memory use keeps climbing until it gets close to 2 GB, somewhere near 1800 pages in, and at that point the library throws an exception.

I removed all the other bits of my loop and this by itself will cause the issue.

Do While MyPageNumber < document.PageCount
    document.ObjectByXY(MyPageNumber, 8388, 850)
    MyPageNumber += 1 ' advance to the next page
Loop

So my next move is to kill the document and reopen the PDF every few hundred pages. I shouldn't have to do that; is there something I could be doing differently? I've tried the same code using getobjectsinrectangle and getpage().objectbyXY(), and both produce the same crash because memory usage spirals out of control.

I'm using Creator.Net 4.5

Exception details:

Error -2147467259
Message: "External component has thrown an exception."
Stack Trace: " at f(UInt32 ) at l(at_apv* ) at b(at_apv* , at_aw7* ) at a(at_apv* , at_aw7* , UInt32 , Boolean ) at b(at_anv* , SByte* , Int32 ) at a(at_aig* , SByte* , Int32 ) at a(at_aig* , SByte* ) at a(at_aig* , UInt32 ) at f(at_aig* ) at a(at_as1* , Byte* , UInt32 , UInt32 , Int32 , at_atw* , at_ars* , at_am* ) at b(at_fy* , at_apv* , Int32 ) at e(at_as1* , Int32 ) at b(at_as1* , UInt32 , Int32 , Int32 ) at Amyuni.PDFCreator.IacDocument.ObjectByXY(Int32 pageIndex, Int32 x, Int32 y) at Program.Form1.count_logos() ..."
kensands
Posts: 9
Joined: Wed Dec 19 2012

Re: Memory increase until crash

Post by kensands »

As it may help others who encounter this problem: until a real solution is given, resetting the document every 500 pages "solved" it for me.

Put this in the loop. Memory gets up to about 500 MB before this kicks in and resets it. It's slow and annoying to have to reopen a document every 500 pages, but it does let me do what I need.

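If MyPageNumber > 499 And MyPageNumber Mod 500 = 0 Then
    ' Release the current document and everything it holds
    document.Dispose()
    document = Nothing
    GC.Collect()
    GC.WaitForPendingFinalizers()
    ' Reopen the same file and carry on from the current page
    document = New IacDocument()
    document.Open(readdoc, "")
End If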
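
For anyone who wants the reset logic out of the main loop, here is a rough sketch of the same workaround as a reusable helper. It is only an illustration: IPagedDocument, FakeDocument, InspectPage and ScanInBatches are hypothetical names made up for this example, not part of the Amyuni API. In real code the interface would wrap IacDocument, and InspectPage would call something like ObjectByXY. The first page number is passed in so the sketch makes no assumption about how the library numbers pages.

```vbnet
Imports System

' Hypothetical stand-in for the document type (IacDocument in this thread).
Public Interface IPagedDocument
    Inherits IDisposable
    ReadOnly Property PageCount As Integer
    Sub InspectPage(pageNumber As Integer)
End Interface

' Fake document so the sketch runs without any PDF library installed.
Public Class FakeDocument
    Implements IPagedDocument

    Public ReadOnly Property PageCount As Integer Implements IPagedDocument.PageCount
        Get
            Return 1200
        End Get
    End Property

    Public Sub InspectPage(pageNumber As Integer) Implements IPagedDocument.InspectPage
        ' A real implementation would look for the picture object here.
    End Sub

    Public Sub Dispose() Implements IDisposable.Dispose
    End Sub
End Class

Public Module BatchedScan
    ' Visits every page, disposing and reopening the document after each
    ' batchSize pages. Returns how many times the document was opened.
    Public Function ScanInBatches(openDocument As Func(Of IPagedDocument),
                                  firstPage As Integer,
                                  batchSize As Integer) As Integer
        If batchSize < 1 Then Throw New ArgumentOutOfRangeException(NameOf(batchSize))
        Dim opens As Integer = 1
        Dim document As IPagedDocument = openDocument()
        Try
            Dim lastPage As Integer = firstPage + document.PageCount - 1
            Dim visited As Integer = 0
            For page As Integer = firstPage To lastPage
                If visited > 0 AndAlso visited Mod batchSize = 0 Then
                    ' Same reset sequence as the workaround above.
                    document.Dispose()
                    document = Nothing
                    GC.Collect()
                    GC.WaitForPendingFinalizers()
                    document = openDocument()
                    opens += 1
                End If
                document.InspectPage(page)
                visited += 1
            Next
        Finally
            ' document may be Nothing if reopening failed part-way through.
            document?.Dispose()
        End Try
        Return opens
    End Function
End Module

Module Program
    Sub Main()
        ' 1200 fake pages in batches of 500: opened at the start, then
        ' reopened after 500 and after 1000 pages, so this prints 3.
        Dim opens As Integer = BatchedScan.ScanInBatches(Function() New FakeDocument(), 1, 500)
        Console.WriteLine(opens)
    End Sub
End Module
```

The batch size trades speed against peak memory: smaller batches reopen the file more often but keep less in memory at once.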
Yoisel
Amyuni Team
Posts: 5
Joined: Thu Sep 10 2009

Re: Memory increase until crash

Post by Yoisel »

Hello,

The behaviour you are describing is not a memory leak; if it were, even disposing the document object would not free the allocated memory.
What you are experiencing is the 2 GB limit that Windows places by default on the address space of a 32-bit process.
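
To check whether this limit applies to a given application, a small diagnostic along these lines may help. It is a minimal sketch that uses only standard .NET calls (Environment.Is64BitProcess, IntPtr.Size, Process.PrivateMemorySize64) and nothing from the PDF library. The module name MemoryCheck is just a placeholder.

```vbnet
Imports System
Imports System.Diagnostics

Module MemoryCheck
    Sub Main()
        ' False here means the process is 32-bit and subject to the limit.
        Console.WriteLine("64-bit process: {0}", Environment.Is64BitProcess)

        ' 4 bytes in a 32-bit process, 8 bytes in a 64-bit one.
        Console.WriteLine("Pointer size (bytes): {0}", IntPtr.Size)

        ' Private memory currently committed by this process, in megabytes.
        Using proc As Process = Process.GetCurrentProcess()
            Console.WriteLine("Private memory (MB): {0}", proc.PrivateMemorySize64 \ (1024L * 1024L))
        End Using
    End Sub
End Module
```

Logging the last value every few hundred pages inside the processing loop would show how quickly memory grows.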

By default, Amyuni PDF Creator will apply all the modifications of the active document in-memory.
However, when processing large files you can use page-by-page mode. In page-by-page mode we will only keep a pre-defined number of pages in memory.
When you move along the document, some pages will be saved to disk and the memory that was allocated for them will be freed.
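
The idea of keeping only a fixed number of pages resident can be pictured with a toy model like the one below. This is a conceptual sketch only, not how Amyuni PDF Creator is implemented. BoundedPageCache, Touch and Evicted are hypothetical names, and "evicting" a page here simply records its number where a real library would write the page to disk and free its memory.

```vbnet
Imports System
Imports System.Collections.Generic

' Toy model: at most `capacity` pages are resident at any time.
Public Class BoundedPageCache
    Private ReadOnly _capacity As Integer
    Private ReadOnly _resident As New Queue(Of Integer)()

    ' Pages that were pushed out of memory, in eviction order.
    Public ReadOnly Evicted As New List(Of Integer)()

    Public Sub New(capacity As Integer)
        If capacity < 1 Then Throw New ArgumentOutOfRangeException(NameOf(capacity))
        _capacity = capacity
    End Sub

    Public ReadOnly Property ResidentCount As Integer
        Get
            Return _resident.Count
        End Get
    End Property

    ' Marks a page as loaded, evicting the oldest resident page if full.
    Public Sub Touch(pageNumber As Integer)
        If _resident.Contains(pageNumber) Then Return
        If _resident.Count = _capacity Then
            Evicted.Add(_resident.Dequeue())
        End If
        _resident.Enqueue(pageNumber)
    End Sub
End Class

Module CacheDemo
    Sub Main()
        Dim cache As New BoundedPageCache(10)
        For page As Integer = 1 To 25
            cache.Touch(page)
        Next
        ' With a capacity of 10 and 25 pages visited in order,
        ' 10 pages stay resident and 15 have been evicted.
        Console.WriteLine("Resident: {0}, evicted: {1}", cache.ResidentCount, cache.Evicted.Count)
    End Sub
End Module
```

In this model memory use depends on the cache capacity and not on the length of the document, which is the point of page-by-page mode.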

Please refer to the documentation for more details on this feature:

Processing large PDF files:
http://www.amyuni.com/WebHelp/Amyuni_PD ... erview.htm

StartSave method:
http://www.amyuni.com/WebHelp/Amyuni_PD ... Method.htm

SavePage method:
http://www.amyuni.com/WebHelp/Amyuni_PD ... Method.htm

EndSave method:
http://www.amyuni.com/WebHelp/Amyuni_PD ... Method.htm
kensands
Posts: 9
Joined: Wed Dec 19 2012

Re: Memory increase until crash

Post by kensands »

I never said there was a memory leak, just that memory use increases until you get a crash. I also thought those methods would solve the issue, but they don't. I've been told that the key is to use OpenEx and set the number of pages that will be held in cache, which is wonderfully undocumented; it requires setting the 'PagesInCache' property.

e.g.

document.AttributeByName("PagesInCache").Value = 10

However, I have yet to try this, as my method of closing and reopening every so often is working OK. When I next need to work on the code I'll try it out.
kensands
Posts: 9
Joined: Wed Dec 19 2012

Re: Memory increase until crash

Post by kensands »

Well, I've found the perfect solution for this: the PDFSharp library. I'm absolutely blown away by how well a totally free library performed at batch appending many files. I threw over 3000 half-megabyte PDFs at it and it appended and saved them in a few minutes. The same thing took 3 hours of churning away in Amyuni before it crashed from using too much memory. While the Amyuni suite is brilliant for picking up text and displaying PDFs, it is hopeless when it comes to batch processing. Lesson learned: pick the right tool for the job.
Yoisel
Amyuni Team
Posts: 5
Joined: Thu Sep 10 2009

Re: Memory increase until crash

Post by Yoisel »

Hello,

From your previous posts I didn’t realize that you were interested in appending a great volume of PDF files. As with most computational tools, there are different ways of using Amyuni PDF Creator or Amyuni PDF Converter that better suit different scenarios. Our support team continuously helps customers to find the best way of accomplishing particular tasks with our libraries.

You started by mentioning the method ObjectByXY(...). Operations like finding an object by its position on a page are a lot more resource-consuming than appending documents, because the former involves parsing the contents of the page, processing its dependencies, maybe decompressing/decoding streams, etc. For appending documents only, it’s enough to use OpenEx, then Append, avoiding other unnecessary and more costly operations. Once you try that, you will probably have a more accurate idea of the library’s performance for this particular task. If it’s still too slow for your requirements/expectations, we can always provide a way of disabling some additional processing we perform inside Append, which factors out resources like fonts or raster images that could be common to several of the concatenated documents. That should improve speed at the expense of file size.

Finding the optimal way to use a tool for accomplishing a particular task can always be harder with a powerful, multi-purpose tool than with a more limited one. To help our customers with that, we provide our support and our online documentation, which evolves as we come across customers’ concerns like yours. I would like to add that our libraries support advanced PDF features that you might not find in other PDF libraries, either commercial or free, for example:
- Support for reading files with compressed object streams.
- Support for appending/concatenating files that comply with the PDF/A specification without breaking the PDF/A compliance.
- Support for appending/concatenating PDF AcroForms (with possibly colliding field names).
- Support for appending/concatenating PDF files that contain optional content groups (PDF layers).

Thank you for using Amyuni libraries and for giving us the opportunity to clarify your concerns.