Document Structuring Conventions


Document Structuring Conventions, or DSC, is a set of standards for PostScript, based on the use of comments, which primarily specifies a way to structure a PostScript file and a way to expose that structure in a machine-readable way. A PostScript file that conforms to DSC is called a conforming document.
The need for a structuring convention arises because PostScript is a Turing-complete programming language. There is thus no guaranteed method, short of actually printing the document, to determine how many pages a given document has, how large a given page is, or how to skip to a particular page. Adding structure, with DSC comments exposing that structure, allows, for example, an intelligent print spooler to rearrange the pages for printing, or a page layout program to find the bounding box of a PostScript file used as a graphic image. Any program that takes PostScript files as input data is called a document manager.
For a PostScript print file to distill properly to PDF using Adobe tools, it should conform to basic DSC standards.
Some DSC comments serve a second function: they instruct the document manager to perform certain actions, such as inserting a font or other PostScript code into the file. DSC comments that serve this second function are more akin to preprocessing directives and are not purely comments. Documents using those kinds of DSC comments require a functioning document manager to come out as intended; sending them directly to a printer will not work.
DSC is the basis for encapsulated PostScript; EPS files are conforming documents with further restrictions.
The set of DSC comments can be expanded by a mechanism called the Open Structuring Conventions, which, together with the EPS specification, forms the basis of early versions of the Adobe Illustrator Artwork file format.

DSC at a glance

The basic premise of DSC is the separation of prolog and script, plus the disallowing of certain PostScript operators deemed inappropriate for page descriptions. This ensures a basic level of predictability in the PostScript code, thus forming the basis of document manageability.
An optional, additional layer of document manageability is provided by separating the script into a document setup section, zero or more functionally independent pages, and an optional trailer. The functional independence between pages, plus the disallowing of more PostScript operators in the pages section, forms the basis for page independence, which allows pages to be reordered and to be accessed independently and at random.
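As an illustration of what page independence makes possible, the following Python sketch splits a document at its page boundaries and emits the pages in a different order. It assumes that each page starts with a "%%Page:" comment (a label followed by an ordinal) and that the trailer starts with "%%Trailer", as in DSC version 3.0. The names split_pages, reorder_pages, and SAMPLE are hypothetical, and the sketch ignores nested documents and page labels that contain spaces.

```python
def split_pages(ps_text):
    """Split a document into (head, pages, trailer) at %%Page: and %%Trailer."""
    head, pages, trailer = [], [], []
    current = head
    for line in ps_text.splitlines():
        if line.startswith("%%Page:"):
            # A new page begins; everything up to the next marker belongs to it.
            current = []
            pages.append(current)
        elif line.startswith("%%Trailer"):
            current = trailer
        current.append(line)
    return head, pages, trailer


def reorder_pages(ps_text, order):
    """Return the document with its pages emitted in the given 0-based order."""
    head, pages, trailer = split_pages(ps_text)
    out = list(head)
    for new_ordinal, index in enumerate(order, start=1):
        page = pages[index]
        # Keep the page label, but renumber the ordinal to match the new position.
        label = page[0][len("%%Page:"):].split()[0]
        out.append(f"%%Page: {label} {new_ordinal}")
        out.extend(page[1:])
    out.extend(trailer)
    return "\n".join(out) + "\n"


# Hypothetical two-page document used only for illustration.
SAMPLE = """%!PS-Adobe-3.0
%%Pages: 2
%%EndComments
%%Page: 1 1
72 72 moveto 144 144 lineto stroke
showpage
%%Page: 2 2
72 144 moveto 144 72 lineto stroke
showpage
%%Trailer
%%EOF
"""

reversed_doc = reorder_pages(SAMPLE, [1, 0])
```

Reordering by plain text manipulation is only safe because each page is functionally independent; if a page relied on state left behind by an earlier page, the rearranged file could render differently.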
This imposed structure is then exposed by delimiting the PostScript file with DSC comments, which normally begin with two percent signs followed by a keyword. Some keywords need to be followed by a colon, an optional space character, and then a series of arguments.
Finally, the document is marked as conforming by beginning it with a comment that starts with “%!PS-Adobe-” followed by the DSC version number.
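A document manager might detect this marker roughly as follows. This is a minimal Python sketch: the function name dsc_version and the regular expression are illustrative assumptions, and the sketch assumes the version is written as digits, a period, and digits.

```python
import re

# Assumed shape of the marker: "%!PS-Adobe-" followed by a version such as 2.0 or 3.0.
_VERSION_LINE = re.compile(r"^%!PS-Adobe-(\d+\.\d+)")


def dsc_version(first_line):
    """Return the DSC version claimed on the first line, or None if there is none."""
    match = _VERSION_LINE.match(first_line)
    return match.group(1) if match else None


claimed = dsc_version("%!PS-Adobe-2.0")
```

A match only shows that the file claims conformance; it does not verify that the rest of the file actually follows the conventions.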
Sections of reusable PostScript code can be modularized into procsets, in order to ease the generation of PostScript code. Procsets and other PostScript resources can be omitted from the PostScript file itself, and externally referenced by a directive-like DSC comment; such external referencing, however, can only work with a document manager that understands such DSC comments.
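The external referencing described above can be sketched in Python as a simple substitution pass. The sketch assumes the DSC version 3.0 comments "%%IncludeResource:" for the reference and "%%BeginResource:" / "%%EndResource" for the inserted copy; the function name include_resources, the library dictionary, and the procset named ExampleProcs are hypothetical.

```python
def include_resources(ps_text, library):
    """Replace %%IncludeResource: references with the resource text from library.

    library maps a resource specification, e.g. "procset ExampleProcs 1.0 0",
    to the PostScript code of that resource. Unknown references are left as-is.
    """
    prefix = "%%IncludeResource:"
    out = []
    for line in ps_text.splitlines():
        if line.startswith(prefix):
            spec = line[len(prefix):].strip()
            if spec in library:
                # Wrap the inserted code so that later tools can still find it.
                out.append(f"%%BeginResource: {spec}")
                out.extend(library[spec].splitlines())
                out.append("%%EndResource")
                continue
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical resource library and document fragment, for illustration only.
LIBRARY = {"procset ExampleProcs 1.0 0": "/inch { 72 mul } bind def"}
FRAGMENT = "%%BeginProlog\n%%IncludeResource: procset ExampleProcs 1.0 0\n%%EndProlog\n"

resolved = include_resources(FRAGMENT, LIBRARY)
```

Without such a pass, the reference remains an ordinary comment to the PostScript interpreter and the procedures it names are never defined, which is why such files cannot be sent directly to a printer.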
DSC version 3.0 was released on September 25, 1992. The specification states, "Even though the DSC comments are a layer of communication beyond the PostScript language and do not affect the final output, their use is considered to be good PostScript language programming style." Thus, most PostScript-producing programs output DSC-conformant comments along with the code, although some such programs do not actually produce conforming documents.

Example

A DSC-conforming document might begin:

%!PS-Adobe-2.0
%%Creator: dvips 5.95a Copyright 2005 Radical Eye Software
%%Title: texput.dvi
%%Pages: 1
%%PageOrder: Ascend
%%BoundingBox: 0 0 612 792
%%DocumentPaperSizes: Letter
%%EndComments

which has the following meaning:
  1. marks the document as conforming to version 2.0 of the DSC
  2. identifies the PostScript-producing program as dvips 5.95a
  3. identifies the document title
  4. tells the document manager that the document consists of one page
  5. tells the document manager that pages are independent and appear in ascending order in the document; in this example, since the document only consists of one page, this information is not usually relevant, but will be needed if additional pages are to be inserted by a document manager
  6. tells the document manager the coordinates, measured in PostScript points, of the bounding box for all the pages taken together; 0 0 612 792 are the coordinates of a US Letter–sized page
  7. tells the document manager what kind of paper sizes are used in the whole document; in this example only one size is used, namely the US Letter size
  8. marks the end of the header comments, the first part of the prolog
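The way a document manager might read such a header can be sketched in Python. The sketch follows the comment syntax described earlier (two percent signs, a keyword, then optionally a colon, an optional space, and arguments) and stops at "%%EndComments" or at the first line that does not look like a DSC comment, which is a simplification of the actual rules. The names parse_header and EXAMPLE_HEADER are hypothetical, and values deferred to the trailer with "(atend)" are not handled.

```python
import re

# Two percent signs, a keyword, then optionally a colon, an optional space, and arguments.
_DSC_COMMENT = re.compile(r"^%%([A-Za-z]+)(?::[ \t]?(.*))?$")

EXAMPLE_HEADER = """%!PS-Adobe-2.0
%%Creator: dvips 5.95a Copyright 2005 Radical Eye Software
%%Title: texput.dvi
%%Pages: 1
%%PageOrder: Ascend
%%BoundingBox: 0 0 612 792
%%DocumentPaperSizes: Letter
%%EndComments
"""


def parse_header(ps_text):
    """Collect the header comments into a dict mapping keyword to argument string."""
    header = {}
    # The first line is the "%!PS-Adobe-" version line, so start from the second.
    for line in ps_text.splitlines()[1:]:
        match = _DSC_COMMENT.match(line)
        if match is None or match.group(1) == "EndComments":
            break
        header[match.group(1)] = (match.group(2) or "").strip()
    return header


header = parse_header(EXAMPLE_HEADER)
page_count = int(header["Pages"])
llx, lly, urx, ury = (int(value) for value in header["BoundingBox"].split())
# A PostScript point is 1/72 inch, so the bounding box here is 8.5 by 11 inches.
width_in, height_in = (urx - llx) / 72, (ury - lly) / 72
```

None of this requires interpreting any PostScript code, which is the point of exposing the structure through comments.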