I am using version 4.5.2.9 and have adapted the VB.NET sample code to extract all the text elements from a PDF file, but some of the extracted text is rubbish, e.g. "W\DQGÀQDQFLDODI".
Here is an extract of my code, which tries to read all the text objects from page 1:
For Each obj As Amyuni.PDFCreator.IacObject In arList
    'you can access all properties of each object
    Dim attr As IacAttribute = obj.Attribute("ObjectType")
    Dim oPage As Integer
    oPage = obj.PageNumber
    If oPage = i Then
        Dim oType As Integer
        oType = CInt(attr.Value)
        Dim oTypeMR As String = ""
        Select Case oType
            Case 5 : oTypeMR = "Text"
                Console.WriteLine(obj.Attribute("Text").Value)
                If Pass = 1 Then
                    Dim oTextText As String
                    oTextText = obj.Attribute("Text").Value
                    Dim oTextColor As String
                    oTextColor = obj.Attribute("TextColor").Value
                    Dim oTextFont As String
                    oTextFont = obj.Attribute("TextFont").Value
All of the fonts in the PDF are font subsets, e.g. "AZVOLY+Arial Black,20.0000,400,0,0,0,0".
It appears that, depending on the subset, text in some fonts is read correctly and text in others is not.
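To narrow down which subsets are affected, the TextFont value could be split into its subset tag and font name and logged next to each extracted string. The following is only a minimal sketch using plain .NET string handling, not anything from the Amyuni API. The helper name SplitSubsetFont is hypothetical, and it assumes the attribute always has the "TAG+FontName,size,..." layout shown above, with a six-letter uppercase tag.

```vb
Imports System
Imports System.Text.RegularExpressions

Module SubsetFontDemo
    ' Hypothetical helper (not part of the Amyuni API).
    ' Splits a TextFont value into {subset tag, font name}.
    ' Assumes the layout "TAG+FontName,size,..." with a six-letter uppercase tag.
    Function SplitSubsetFont(ByVal textFont As String) As String()
        ' Keep only the part before the first comma (the font name).
        Dim fontName As String = textFont.Split(","c)(0)
        Dim m As Match = Regex.Match(fontName, "^([A-Z]{6})\+(.+)$")
        If m.Success Then
            Return New String() {m.Groups(1).Value, m.Groups(2).Value}
        End If
        ' No subset prefix found: return an empty tag and the name unchanged.
        Return New String() {"", fontName}
    End Function

    Sub Main()
        Dim parts As String() = SplitSubsetFont("AZVOLY+Arial Black,20.0000,400,0,0,0,0")
        ' Prints: Subset tag: AZVOLY, font: Arial Black
        Console.WriteLine("Subset tag: " & parts(0) & ", font: " & parts(1))
    End Sub
End Module
```

Writing the tag and font name alongside oTextText for every text object should show whether the rubbish strings always come from the same subsets.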
Any help would be appreciated.
Regards
Wes