Font File Structures

In addition to their technological differences, fonts can also be categorized according to how they are structured as PDF objects. Generally, fonts can be structured as:

• Simple Fonts

• Composite Fonts

PDFs contain font objects (see Figure 5 on page 7) that essentially act as wrappers for embedded font programs that contain the actual font data. Font programs can be TrueType, OpenType, Type 1, and so forth. Font objects also contain a number of properties and descriptions of the font data in order to enable PDF applications and viewers to use the font in the document.

Simple Fonts

Simple Fonts use a single byte of information to represent a glyph. As a result, a maximum of 256 (2^8) different glyph representations are possible. The Simple Font category includes the original instances of Type 1 and TrueType fonts.
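The single-byte limit can be illustrated with a minimal sketch. The function name `decode_simple_font_string` and the partial table `toy_encoding` are hypothetical, made up for illustration; a real encoding table covers all codes from 0 to 255.

```python
# A single byte can take 2**8 distinct values, so a simple font
# can address at most 256 glyphs.
MAX_SIMPLE_FONT_CODES = 2 ** 8

def decode_simple_font_string(data: bytes, encoding: dict[int, str]) -> list[str]:
    """Map each byte of a text string to a glyph name via the encoding table."""
    # Codes without an entry fall back to ".notdef", the usual name
    # for a font's missing-glyph placeholder.
    return [encoding.get(code, ".notdef") for code in data]

# Hypothetical, partial encoding table (illustration only).
toy_encoding = {0x20: "space", 0x41: "A", 0x42: "B"}

glyphs = decode_simple_font_string(b"AB Z", toy_encoding)
print(MAX_SIMPLE_FONT_CODES, glyphs)
```

Every character in the string costs exactly one byte, which is why no more than 256 glyphs can ever be distinguished, however many glyphs the underlying font program contains.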

Composite Fonts

Because of their 256-character encoding limit, Simple Fonts could not support complex Asian scripts, in which a typical Japanese font can have over 7,000 Kanji, Katakana, and Hiragana characters. Nor could they support non-horizontal writing.

The solution was the development of Composite Fonts (or CID fonts). Unlike Simple Fonts, Composite Fonts use multi-byte character codes and can therefore address far more than 256 glyphs, which lets them cover the large character sets that such scripts require.

Composite Font technologies enable developers to use any number of base fonts and create new composite fonts. Composite font technologies also enable developers to include two sets of character spacing details (metrics) in fonts. One metric can be used for horizontal writing mode and another for vertical writing mode.

Aside from their ability to handle complex glyphs, Composite Fonts are also flexible and expandable.

CMap File

A CMap is an ASCII text file containing the PostScript language instructions that map character codes to the Character Identifier numbers (CIDs) used by Composite Fonts. For example, after a character code is processed (say, from keyboard input), the CMap maps it to the corresponding CID. The CID is then passed on to the Composite Font, which in turn selects the appropriate glyph. As we shall see in the next document, CMap files can also be missing, which interferes with proper PDF processing.
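The lookup a CMap performs can be sketched in a few lines. This is a simplified illustration and not a CMap parser: it assumes fixed two-byte character codes (real CMaps may define variable-length codes), and the table `toy_cmap`, its CID values, and the function name `codes_to_cids` are hypothetical.

```python
def codes_to_cids(data: bytes, cmap: dict[bytes, int], code_length: int = 2) -> list[int]:
    """Split a byte string into fixed-length character codes and look up each CID."""
    cids = []
    for start in range(0, len(data), code_length):
        code = data[start:start + code_length]
        # Assumption: unmapped codes fall back to CID 0, which CID fonts
        # conventionally reserve for the missing-glyph placeholder.
        cids.append(cmap.get(code, 0))
    return cids

# Hypothetical mapping from two-byte character codes to CIDs (values invented).
toy_cmap = {
    b"\x30\x42": 843,
    b"\x30\x44": 845,
}

cids = codes_to_cids(b"\x30\x42\x30\x44\xff\xff", toy_cmap)
print(cids)
```

If the mapping is absent altogether, every lookup falls through to the fallback value, which is one way to picture why a missing CMap prevents text from being processed correctly.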

Font Embedding

For a PDF to be displayed, printed, or processed accurately, it must contain the necessary font information. If font information is missing, recipients may not be able to display or edit the document properly or, worse, applications may not be able to process the PDF at all.

Embedding fonts in a PDF ensures that the document displays and prints on every system exactly as the author intended. The following sections look at how fonts are embedded in PDFs and introduce the subject matter of the next document.

Full Font Embedding

The first method of embedding fonts is full font embedding. Full font embedding effectively makes the font part of the PDF, thereby preventing font substitution when recipients need to display or print the PDF. In other words, recipients do not need to have the same fonts installed to view or edit the document. This method is advisable in situations in which modifications to the PDF are expected.

Full font embedding can also potentially help avoid some of the problems associated with missing system fonts and ensure optimal viewing regardless of the system and platform. In an ideal PDF world, fully embedding all fonts would reduce many development woes.

The main drawbacks to full font embedding are file size and licensing issues. Every embedded font makes the document larger, and the increase can become problematic when the document contains Chinese, Japanese, or Korean (CJK) fonts. In fact, CJK fonts are rarely fully embedded because of their large character sets. In addition, fully embedded fonts can be extracted and used outside of the PDF file. Such extraction opens the door to unlimited font distribution and can violate the licensing policy of the font manufacturer. The solution, then, is to partially embed fonts in a document.

Partial Font Embedding (Subsetting Fonts)

Unlike full font embedding, subsetting a font embeds only the glyph definitions for the characters actually used (i.e., those displayed in the PDF).
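The idea behind subsetting can be sketched as a filter over a font's glyph table. This is a conceptual illustration only: the dictionary `toy_font`, its placeholder outline bytes, and the function `subset_glyphs` are hypothetical, and a real subsetter must also rewrite the font's internal tables.

```python
def subset_glyphs(font_glyphs: dict[str, bytes], document_text: str) -> dict[str, bytes]:
    """Keep only the glyph definitions for characters that appear in the text."""
    used_characters = set(document_text)
    return {
        character: outline
        for character, outline in font_glyphs.items()
        if character in used_characters
    }

# Hypothetical font with four glyphs; the byte strings stand in for outline data.
toy_font = {
    "a": b"<outline a>",
    "b": b"<outline b>",
    "c": b"<outline c>",
    "d": b"<outline d>",
}

embedded = subset_glyphs(toy_font, "abba")
print(sorted(embedded))
```

In this sketch the glyphs for "c" and "d" never reach the document, so a recipient who later tries to type those characters has nothing to render them with unless the font is also installed locally.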

There are three main reasons one should subset fonts. First, as previously stated, PDFs are primarily for content exchange and viewing. PDF is not an ideal editing format, despite the popularity of PDF editing programs available on the Internet, and it is generally assumed (rightfully or wrongfully) by the PDF’s creator that the recipient will not modify the document’s contents. As we shall see in the following document, editing a PDF is not always a straightforward affair.

Second, subsetting fonts reduces document size. For example, the size of the font “Arial Unicode MS” is nearly 20 MB; however, subsetting this font to show 10 Kanji characters would add only approximately 25 KB to the PDF. In cases where CJK fonts are used, fully embedding all fonts would result in very large, unwieldy files.
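The size difference quoted above can be worked through as simple arithmetic. The two figures come from the text; treating one megabyte as 1024 kilobytes is an assumption made for this calculation.

```python
# Figures quoted in the text. Assumption: 1 MB = 1024 KB.
full_font_kb = 20 * 1024   # "nearly 20 MB" for the full font
subset_kb = 25             # "approximately 25 KB" for a 10-character subset

ratio = full_font_kb / subset_kb
saving_percent = (1 - subset_kb / full_font_kb) * 100

print(f"Full embedding is roughly {ratio:.0f} times larger than the subset.")
print(f"Subsetting saves about {saving_percent:.1f}% of the font payload.")
```

Under these assumptions the subset is a fraction of one percent of the full font, which is why subsetting is the usual choice when large CJK fonts are involved.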

Third, subsetting fonts avoids licensing issues because the font becomes unusable for any purpose other than rendering the document, a use that font licensors often permit.

The drawback of partially embedded fonts is that if recipients do not have the fonts on their system, they will not be able to edit the document or will be very limited in their ability to edit text. This is where the problem of missing fonts begins to emerge.

When Fonts Go Missing

Now that some of the key PDF and font concepts have been reviewed, the different problems that can occur when font information is missing can be addressed.

The following document (Part 2) will explore how problems associated with missing font information can start right at the source, with the creation of the PDF document itself. These problems include full and partial font embedding, incomplete font information in TrueType fonts, and missing CMap files.

References:

Walsh, Norman. "Frequently Asked Questions About Fonts." 14 August 1996. <http://nwalsh.com/comp.fonts/FAQ/index.html>

 

© Amyuni Technologies Inc. All rights reserved. All trademarks are property of their respective owners.

 
