16. Typesetting and Word Processing

If you're coming to Linux with a Microsoft Windows or Apple MacOS background, or from some other non-Unix computing environment, you are likely used to one approach to "word processing." In these environments, most writing is done in word processors--large programs that offer a vast array of formatting options and that store their output in proprietary file formats. Most people use word processors no matter where the intended output will go (even if it's just your diary).

Word processors, from complete suites like StarOffice to commercial favorites like WordPerfect, are available for Linux--and have been for years. However, the standard personal-computing paradigm known as "word processing" has never really taken off on Linux--or, for that matter, on Unix-like operating systems in general. With Linux, most writing is done in a text editor, and files are kept in plain text.

When you keep a file in plain text, you can use command-line tools to format the pages and paragraphs; add page numbers and headers; check the spelling, style, and usage; count the lines, words, and characters it contains; convert it to HTML and other formats; and even print the text in a font of your choosing--all of which are described in the recipes in this book. The text can be formatted, analyzed, cut, chopped, sliced, diced, and otherwise processed by the vast array of Linux command-line tools that work on text--over 750 in an average installation.
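As a small illustration of the tools approach, the following sketch counts the lines, words, and characters of a plain text file with wc. The sample file is a temporary one made only for this demonstration, and the variable names are illustrative.

```shell
# Create a throwaway sample file so the example is self-contained.
sample=$(mktemp)
printf 'one two three\nfour five\n' > "$sample"

# wc reads standard input here, so it prints only the counts; the
# $(( )) arithmetic strips the padding that some versions of wc add.
# Note that `wc -c' counts bytes, which equal characters in ASCII text.
lines=$(( $(wc -l < "$sample") ))
words=$(( $(wc -w < "$sample") ))
chars=$(( $(wc -c < "$sample") ))

echo "lines=$lines words=$words chars=$chars"
rm -f "$sample"
```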

This approach may seem primitive at first--especially to those weaned in a computing environment that dictates that all writing must be set in a typeface from the moment of creation--but the word-processing approach can be excessive compared to what Linux provides. You can, if you like, view or print plain text in a font, with a single command--which is what ninety percent of people want to do with a word processor ninety percent of the time, anyway; to do this, see Converting Plain Text for Output.

It's my opinion that word processing is not a forward-thinking direction for the handling of text, especially on Linux systems and especially now that text is not always destined for printed output: text can end up on a Web page, in an "eBook,"(25) in an email message, or possibly in print. The best common source for these formats is plain text. Word processing programs, and the special file formats they require, are anathema to the generalized, tools-based and plain-text philosophy of Unix and Linux (see section Unix and the Tools Philosophy). "Word processing" itself may be an obsolete idea of the 1980s personal computing environment, and it may no longer be a necessity in the age of the Web and email--mediums in which plain text content is more native than proprietary word processor formats.

If you do need to design a special layout for hardcopy, you can typeset the text. One could write a book on the subject of Linux typesetting; unfortunately, no such book has yet been written, but this chapter contains recipes for producing typeset text. They were selected as being the easiest to prepare or most effective for their purpose.

NOTE: For more information on this subject, I recommend Christopher B. Browne's excellent overview, "Word Processors for Linux".

16.1 Choosing the Right Typesetting System for the Job  Choosing the typesetting system to use.
16.2 Converting Plain Text for Output  Converting plain text to PostScript.
16.3 LyX Document Processing  LyX, a document processor.
16.4 Typesetting with TeX and Friends  TeX and friends.
16.5 Writing Documents with SGMLtools  SGML and markup language.
16.6 Other Word Processors and Typesetting Systems  Other typesetting systems.

16.1 Choosing the Right Typesetting System for the Job

Choosing the proper typesetting system to use when you are about to begin a project can be daunting: each has its own drawbacks and abilities, and to the less experienced it may not be immediately clear which is most appropriate for a particular document or project.

The following table can help you determine which system is best for a particular task. There isn't one way of doing such things, of course--these are only my recommendations. The first column lists the kind of output you intend, the second gives examples of the kind of documents, and the third suggests the typesetting system(s) to use. These systems are described in the remaining sections of this chapter.

Output                                                    Example documents                                        System(s)
Printed, typeset output and electronic HTML or text file  Internet FAQ, white paper, dissertation                  enscript; Texinfo; SGMLtools
Printed, typeset output and text file                     man page, command reference card
Printed, typeset output                                   Letter or other correspondence, report, book manuscript  LaTeX or LyX
Printed, typeset output                                   Brochure or newsletter with multiple columns and images
Printed, typeset output                                   Envelope, mailing label, other specialized document
Printed text output in a font                             Grocery list, saved email message, to-do list
Printed, typeset output                                   Poster, sign                                             enscript; HTML; LyX; TeX
Large printed text output                                 Long banners for parties or other occasions              banner

NOTE: If you really don't need a document to be typeset, then don't bother! Just keep it a plain text file, and use a text editor to edit it (see section Text Editing). Do this for writing notes, email messages, Web pages, Usenet articles, and so forth. If you ever do need to typeset it later, you will still be able to do so. And you can, if you like, view or print plain text in nice fonts (see section Outputting Text in a Font).

16.2 Converting Plain Text for Output

Debian: `enscript'
WWW: http://www.iki.fi/~mtr/genscript/

The simplest way to typeset plain text is to convert it to PostScript. This is often done to prepare text for printing; the original source text file remains as unformatted text, but the text of the printed output is formatted in basic ways, such as being set in a font.

The main tool for converting text to PostScript is called enscript; it converts the text file that is specified as an argument into PostScript, making any number of formatting changes in between. It's great for quickly making nice output from a plain text file--you can use it to do things such as output text in a font of your choosing, or paginate text with graphical headers at the top of each page.

By default, enscript paginates its input, outputs it in a 10-point Courier font, and puts a simple header at the top of each page containing the file name, date and time, and page number in bold. Use the `-B' option to omit this header.

If you have a PostScript printer connected to your system, enscript can be set up to spool its output right to the printer. You can verify if your system is set up this way by looking at the enscript configuration file, `/etc/enscript.cfg'. The line

DefaultOutputMethod: printer

specifies that output is spooled directly to the printer; changing it to `stdout' instead of `printer' sends the output to the standard output instead.
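One way to check this setting without opening the file in an editor is sketched below. The function name `enscript_output_method' is made up for this example, and the default path is simply the one named above; your system may keep the file elsewhere.

```shell
# Print the DefaultOutputMethod value from an enscript configuration file.
# The default path is the one named in the text; pass another path as $1.
enscript_output_method() {
    cfg=${1:-/etc/enscript.cfg}
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg" >&2
        return 1
    fi
    # Print only what follows the "DefaultOutputMethod:" keyword.
    sed -n 's/^DefaultOutputMethod:[[:space:]]*//p' "$cfg"
}
```

Calling `enscript_output_method' with no argument reads `/etc/enscript.cfg' and prints `printer' or `stdout', depending on how the file is set up.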

Even if your default printer does not natively understand PostScript, it may be able to take enscript output, anyway. Most Linux installations these days have print filters set up so that PostScript spooled for printing is automatically converted to a format the printer understands (if your system doesn't have this setup for some reason, convert the PostScript to a format recognized by your printer with the gs tool, and then print that--see Converting PostScript).

  • To convert the text file `saved-mail' to PostScript, with default formatting, and spool the output right to the printer, type:

    $ enscript saved-mail RET

To write the output to a file instead of spooling it, give the name of the output file as an argument to the `-p' option. This is useful when you don't have a PostScript printer and need to convert the output first, when you just want to make a PostScript image file from some text, or when you want to preview the output before you print it. In the last case, you can view it on the display screen with a PostScript viewer application such as ghostview (see section Previewing a PostScript File).

  • To write the text file `saved-mail' to a PostScript file, `saved-mail.ps', and then preview it in X, type:

    $ enscript -p saved-mail.ps saved-mail RET
    $ ghostview saved-mail.ps RET
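If you do this often, the two steps can be wrapped in a small shell function. The following is only a sketch: the names `ps_name' and `preview_text' are invented for this example, and it assumes that enscript and ghostview are both installed.

```shell
# Derive the PostScript file name for a text file: saved-mail -> saved-mail.ps
ps_name() {
    printf '%s.ps\n' "$1"
}

# Convert a text file to PostScript and preview the result.
# Nothing is sent to the printer, because -p names an output file.
preview_text() {
    out=$(ps_name "$1")
    enscript -p "$out" "$1" && ghostview "$out"
}
```

With these definitions, `preview_text saved-mail' writes `saved-mail.ps' and opens that same file in the viewer, so the two file names cannot drift apart.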

The following recipes show how to use enscript to output text with different effects and properties.

NOTE: Once you make a PostScript file from text input, you can use any of the tools to format this new PostScript file, including rearranging and resizing its pages (see section PostScript).

16.2.1 Outputting Text in a Font  Outputting text in a font.
16.2.2 Outputting Text as a Poster or Sign  Outputting text as posters or signs.
16.2.3 Outputting Text with Language Highlighting  Highlighting text based on syntax.
16.2.4 Outputting Text with Fancy Headers  Making fancy headers.
16.2.5 Outputting Text in Landscape Orientation  Outputting text in landscape orientation.
16.2.6 Outputting Multiple Copies of Text  Outputting multiple copies of text.
16.2.7 Selecting the Pages of Text to Output  Selecting which pages of text to output.
16.2.8 Additional PostScript Output Options  More ways to output PostScript from text.

16.2.1 Outputting Text in a Font

To output text in a particular PostScript font, use enscript and give the name of the font you want to use as a quoted argument to the `-f' option.

Specify both the font family and size in points: give the capitalized name of the font family (with hyphens to indicate spaces between words) followed by the size in points. For example, `Courier14' outputs text in the Courier font at 14 points, and `Times-Roman12.2' outputs text in the Times Roman font at 12.2 points. Some of the available font names are listed in the file `/usr/share/enscript/afm/font.map'; the enscript man page describes how to use additional fonts that might be installed on your system.

  • To print the contents of the text file `saved-mail' on a PostScript printer, with text set in the Helvetica font at 12 points, type:

    $ enscript -B -f "Helvetica12" saved-mail RET

  • To make a PostScript file called `saved-mail.ps' containing the contents of the text file `saved-mail', with text set in the Helvetica font at 12 points, type:

    $ enscript -B -f "Helvetica12" -p saved-mail.ps saved-mail RET

The `-B' option was used in the preceding examples to omit the output of a header on each page. When headers are used, they're normally output in 10-point Courier Bold; to specify a different font for the text in the header, give its name as an argument to the `-F' option.

  • To print the contents of the text file `saved-mail' to a PostScript printer, with text set in 10-point Times Roman and header text set in 18-point Times Bold, type:

    $ enscript -f "Times-Roman10" -F "Times-Bold18" saved-mail RET

  • To make a PostScript file called `saved-mail.ps' containing the contents of the text file `saved-mail', with text and headers both set in 16-point Palatino Roman, type:

    $ enscript -f "Palatino-Roman16" -F "Palatino-Roman16" -p saved-mail.ps saved-mail RET
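The font argument is just the family name and the point size run together, so it can be assembled by a small helper. This is a sketch only; `font_spec' is a name made up for this example and is not part of enscript.

```shell
# Build an enscript font argument from a family name and a point size.
# Spaces in the family name become hyphens, as described above.
font_spec() {
    family=$(printf '%s' "$1" | tr ' ' '-')
    printf '%s%s\n' "$family" "$2"
}

font_spec "Times Roman" 12.2     # prints Times-Roman12.2
font_spec Helvetica 12           # prints Helvetica12
```

The result can be passed straight to the `-f' or `-F' option, as in `enscript -B -f "$(font_spec Helvetica 12)" -p saved-mail.ps saved-mail'.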

16.2.2 Outputting Text as a Poster or Sign

You can output any text you type directly to the printer (or to a PostScript file) by omitting the name of the input file; enscript will read the text on the standard input until you type C-d on a new line.

This is especially useful for making a quick-and-dirty sign or poster--to do this, specify a large font for the text, such as Helvetica Bold at 72 points, and omit the display of default headers.

  • To print a sign in 72-point Helvetica Bold type to a PostScript printer, type:

    $ enscript -B -f "Helvetica-Bold72" RET

72-point type is very large; use the `--word-wrap' option with longer lines of text to wrap lines at word boundaries if necessary. You might need this option because at these larger font sizes, you run the risk of making lines that are longer than could fit on the page. You can also use the `-r' option to print the text in landscape orientation, as described in Outputting Text in Landscape Orientation.

  • To print a sign in 63-point Helvetica Bold across the long side of the page, type:

    $ enscript -B -r --word-wrap -f "Helvetica-Bold63" RET

NOTE: To make a snazzier or more detailed message or sign, you would create a file in a text editor and justify the words on each line in the file as you want them to print, with blank lines where necessary. If you're getting that complicated with it, it would also be wise to use the `-p' option once to output to a file first, and preview the file before printing it (see section Previewing a PostScript File).
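For a sign you expect to reprint, the text can be kept in a script rather than typed at the terminal each time. In the sketch below, a here-document stands in for the typed text and C-d; the function name `make_sign' and the wording of the sign are invented for this example.

```shell
# Write a two-line sign to the PostScript file named by $1, instead of
# sending it to the printer, so that it can be previewed first.
make_sign() {
    enscript -B -r --word-wrap -f "Helvetica-Bold63" -p "$1" <<'EOF'
GARAGE SALE
SATURDAY 9 AM
EOF
}
```

Running `make_sign sign.ps' produces a file that you can preview and then print once it looks right.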

16.2.3 Outputting Text with Language Highlighting

The enscript tool currently recognizes the formatting of more than forty languages and formats, from the Perl and C programming languages to HTML, email, and Usenet news articles; enscript can highlight portions of the text based on its syntax. In Unix-speak, this is called pretty-printing.

The following table lists the names of some of the language filters that are available at the time of this writing and describes the languages or formats they're used for.

ada         Ada95 programming language.
asm         Assembler listings.
awk         AWK programming language.
bash        Bourne-Again shell programming language.
c           C programming language.
changelog   ChangeLog files.
cpp         C++ programming language.
csh         C-Shell script language.
delphi      Delphi programming language.
diff        Normal "difference reports" made from diff.
diffu       Unified "difference reports" made from diff.
elisp       Emacs Lisp programming language.
fortran     Fortran77 programming language.
haskell     Haskell programming language.
html        HyperText Markup Language (HTML).
idl         IDL (CORBA Interface Definition Language).
java        Java programming language.
javascript  JavaScript programming language.
ksh         Korn shell programming language.
m4          M4 macro processor programming language.
mail        Electronic mail and Usenet news articles.
makefile    Rule files for make.
nroff       Manual pages formatted with nroff.
objc        Objective-C programming language.
pascal      Pascal programming language.
perl        Perl programming language.
postscript  PostScript programming language.
python      Python programming language.
scheme      Scheme programming language.
sh          Bourne shell programming language.
skill       Cadence Design Systems Lisp-like language.
sql         Sybase 11 SQL.
states      Definition files for states.
synopsys    Synopsys dc shell scripting language.
tcl         Tcl programming language.
tcsh        TC-Shell script language.
vba         Visual Basic (for Applications).
verilog     Verilog hardware description language.
vhdl        VHSIC Hardware Description Language (VHDL).
vrml        Virtual Reality Modeling Language (VRML97).
zsh         Z-shell programming language.

To pretty-print a file, give the name of the filter to use as an argument to the `-E' option, without any whitespace between the option and argument.

  • To pretty-print the HTML file `index.html', type:

    $ enscript -Ehtml index.html RET

  • To pretty-print an email message saved to the file `important-mail', and output it with no headers to a file named `important-mail.ps', type:

    $ enscript -B -Email -p important-mail.ps important-mail RET

Use the special `--help-pretty-print' option to list the languages supported by the copy of enscript you have.

  • To peruse a list of currently supported languages, type:

    $ enscript --help-pretty-print | less RET
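When you pretty-print many kinds of files, a script can pick the filter name from the file name. The sketch below covers only a handful of the filters in the table above; the function name `filter_for' and the choice of extensions are assumptions made for this example.

```shell
# Map a file name to one of the enscript filter names listed above.
# Prints nothing and returns 1 when the name is not recognized here.
filter_for() {
    case "$1" in
        *.html|*.htm)  echo html ;;
        *.c|*.h)       echo c ;;
        *.cpp|*.cc)    echo cpp ;;
        *.pl)          echo perl ;;
        *.py)          echo python ;;
        *.sh)          echo sh ;;
        *.java)        echo java ;;
        [Mm]akefile)   echo makefile ;;
        *)             return 1 ;;
    esac
}
```

A typical use is `f=$(filter_for index.html) && enscript -E"$f" -p index.ps index.html', which falls through without running enscript when no filter is known for the file.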

16.2.4 Outputting Text with Fancy Headers

To output text with fancy graphic headers, where the header text is set in blocks of various shades of gray, use enscript with the `-G' option.

  • To print the contents of the text file `saved-mail' with fancy headers on a PostScript printer, type:

    $ enscript -G saved-mail RET

  • To make a PostScript file called `saved-mail.ps' containing the contents of the text file `saved-mail', with fancy headers, type:

    $ enscript -G -p saved-mail.ps saved-mail RET

Without the `-G' option, enscript outputs text with a plain header in bold text, printing the file name and the time it was last modified. The `-B' option, as described earlier, omits all headers.

You can customize the header text by quoting the text you want to use as an argument to the `-b' option. Use the special symbol `$%' to specify the current page number in the header text.

  • To print the contents of the text file `saved-mail' with a custom header label containing the current page number, type:

    $ enscript -b "Page $% of the saved email archive" saved-mail RET

NOTE: You can create your own custom fancy headers, too--this is described in the `CUSTOMIZATION' section of the enscript man page.

16.2.5 Outputting Text in Landscape Orientation

To output text in landscape orientation, where text is rotated 90 degrees counter-clockwise, use the `-r' option.

  • To print the contents of the text file `saved-mail' to a PostScript printer, with text set in 28-point Times Roman and oriented in landscape orientation, type:

    $ enscript -f "Times-Roman28" -r saved-mail RET

The `-r' option is useful for making horizontal banners by passing output of the figlet tool to enscript (see section Horizontal Text Fonts).

  • To output the text `A long banner' in a figlet font and write it to the default printer with text set at 18-point Courier and in landscape orientation, type:

    $ figlet "A long banner" | enscript -B -r -f "Courier18" RET

16.2.6 Outputting Multiple Copies of Text

To output multiple copies of text when sending to the printer with enscript, give the number as an argument to the `-#' option. This option doesn't work when sending to a file, but note that lpr takes the same option (see section Printing Multiple Copies of a Job).

  • To print three copies of the text file `saved-mail' to a PostScript printer with the default enscript headers, type:

    $ enscript -#3 saved-mail RET

16.2.7 Selecting the Pages of Text to Output

To specify which pages of a text are output with enscript, give the range of page number(s) as an argument to the `-a' option.

  • To print pages two through ten of file `saved-mail' with the default enscript headers, type:

    $ enscript -a2-10 saved-mail RET

To print just the odd or even pages, use the special `odd' and `even' arguments. This is good for printing double-sided pages: first print the odd-numbered pages, and then feed the output pages back into the printer and print the even-numbered pages.

  • To print the odd-numbered pages of the file `saved-mail' with the default headers, type:

    $ enscript -a odd saved-mail RET

  • To print the even-numbered pages of the file `saved-mail' with the default headers, type:

    $ enscript -a even saved-mail RET
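The two passes for double-sided printing can be combined into one function that waits while you reload the paper. This is a sketch; `print_duplex' is a name invented for this example, and whether the stack must be turned over or reordered depends on your printer.

```shell
# Print a file double-sided on a single-sided printer: odd pages first,
# then, after the stack has been fed back in, the even pages.
print_duplex() {
    enscript -a odd "$1" || return 1
    printf 'Reload the printed pages, then press RET: ' >&2
    read -r _reply
    enscript -a even "$1"
}
```

For example, `print_duplex saved-mail' prints the odd-numbered pages, pauses for the reload, and then prints the even-numbered pages.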

16.2.8 Additional PostScript Output Options

The following table describes some of enscript's other options.

-number Specify number of columns per page; for example, to specify four columns per page, use `-4'.
-apages Specify the page numbers to be printed, where pages is a comma-separated list of page numbers. Specify individual pages by their numbers, and specify a range of pages by giving the first and last page numbers in the range separated by a hyphen (`-'). The special `odd' prints odd-numbered pages and `even' prints even-numbered pages.
-dprinter Spool output to the printer named printer.
-Elanguage "Pretty-print" the text written in the specified language with context highlighting.
-Hnumber Specify the height of highlight bars, in lines (without number, the value of 2 is used).
-inumber Indent lines by number characters, or follow number with a letter denoting the unit to use: `c' for centimeters, `i' for inches, or `p' for PostScript points (1/72 inch).
-Ifilter Pass input files through filter, which can be a tool or quoted command.
-j Print borders around columns.
-Lnumbers Specify the number of lines per page.
-utext Specify a quoted string "underlay" to print underneath every page.
-Unumber Specify the number of logical pages to print on each page of output.
--highlight-bar-gray=number Specify the level of gray color to be used in printing the highlight bars, from 0.0 (black) to 1.0 (white).
Adjust left, right, top, and bottom page margins; the measurements are in PostScript points, and, when specifying the values, any can be omitted. (Given on one line all as one long option.)
--rotate-even-pages Rotate each even-numbered page 180 degrees.
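These options can be combined with one another and with the options from the earlier recipes. As one possible combination, the sketch below writes a two-column landscape listing with borders around the columns; the function name `two_up' is made up for this example.

```shell
# Write a text file ($1) as a two-column, landscape PostScript file ($2)
# with borders around the columns: the -2, -r, and -j options.
two_up() {
    enscript -2 -r -j -p "$2" "$1"
}
```

For example, `two_up saved-mail saved-mail.ps' makes a file you can preview before printing.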

16.3 LyX Document Processing

Debian: `lyx'
WWW: http://www.lyx.org/

LyX is a relative newcomer to the typesetting and word-processing arena, and it is one of the most genuinely fresh ideas in the field: it's a kind of word processor for writing LaTeX input (see section Typesetting with TeX and Friends). It's a visual, graphic editor for X, but it doesn't emulate the output paper directly on the display screen. In contrast to specifying exactly how each character in the document will look ("make this word Helvetica Bold at 18 points"), you specify the structure of the text you write ("make this word a chapter heading"). And, in contrast to the WYSIWYG paradigm, its authors call the new approach WYSIWYM--"What you see is what you mean."

LyX comes with many document classes already defined--such as letter, article, report, and book--containing definitions for the elements these document types may contain. You can change the look of each element and the look of the document as a whole, and you can change the look of individual selections of text, but with these elements available, it's rarely necessary.

Since LyX uses LaTeX as a back-end to do the actual typesetting, and LyX is capable of exporting documents to LaTeX input format, you can think of it as a way to write LaTeX input files in a GUI without having to know the LaTeX language commands.

However, even those who do use LaTeX and related typesetting languages can get some use out of LyX: many people find it quick and easy to create some documents in LyX that are much harder to do in LaTeX, such as multi-column newsletter layouts with illustrations.

(One excellent example of this is http://www.bcgs.org/newsletters/bcgs_newsletter-2000-01.pdf)

You can also import your LaTeX files (and plain text) into LyX for further layout or manipulation.

The following recipes show how to get started using LyX, and where to go to learn more about it.

16.3.1 Features of LyX  
16.3.2 Writing Documents with LyX  Writing documents with LyX.
16.3.3 Learning More about LyX  Learning more about LyX.

16.3.1 Features of LyX

When editing in LyX, you'll see that it has all of the commands you'd expect from a word processor--for example, some of the commands found on the Edit menu include Cut, Copy, Paste, Find and Replace, and Spell Check.

Here are some of its major features:

  • Automatic generation of table of contents, nested lists, and numbering of section headings.

  • Easy insertion of PostScript figures and illustrations, which can be rotated, scaled, and captioned.

  • WYSIWYG construction of tables.

  • Undo and redo of any operation or sequence of operations.

  • All LyX functions available from both keyboard commands and pull-down menus.

  • All key-presses used for commands are configurable.

16.3.2 Writing Documents with LyX

LyX runs under X, and you start it in the usual way--either by choosing it from the applications menu provided by your window manager or by typing lyx in an xterm window. (For more about starting programs in X, see section Running a Program in X.)

To start a new document from scratch, choose New from the File menu. You can also make a document from one of the many templates included with LyX, which have the basic layout and settings for a particular kind of document all set up for you--just fill in the elements for your actual document. To make a new document from a template, choose New from template from the File menu, and then select the name of the template to use.

The following table lists the names of some of the included templates and the kind of documents they're usually used for:

aapaper.lyx Format suitable for papers submitted to Astronomy and Astrophysics.
dinbrief.lyx Format for letters typeset according to German conventions.
docbook_template.lyx Format for documents written in the SGML DocBook DTD.
hollywood.lyx Format for movie scripts as they are formatted in the U.S. film industry.
iletter.lyx Format for letters typeset according to Italian conventions.
latex8.lyx Format suitable for article submissions to IEEE conferences.
letter.lyx Basic format for letters and correspondence.
linuxdoctemplate.lyx Format for documents written in the SGML LinuxDoc DTD, as formerly used by the Linux Documentation Project.
revtex.lyx Article format suitable for submission to publications of the American Physical Society (APS), American Institute of Physics (AIP), and Optical Society of America (OSA).
slides.lyx Format for producing slides and transparencies.

To view how the document will look when you print it, choose View DVI from the File menu. This command starts the xdvi tool, which previews the output on the screen. (For more on using xdvi, see section Previewing a DVI File.)

To print the document, choose Print from the File menu. You can also export it to LaTeX, PostScript, DVI, or plain text formats; to do this, choose Export from the File menu and then select the format to export to.

NOTE: If you plan on editing the document again in LyX, be sure to save the actual `.lyx' document file.

16.3.3 Learning More about LyX

The LyX Documentation Project has overseen the creation of a great deal of free documentation for LyX, including hands-on tutorials, user manuals, and example documents.

The LyX Graphical Tour is a Web-based tutorial that shows you how to create and edit a simple LyX file.

LyX has a comprehensive set of built-in manuals, which you can read inside the LyX editor like any LyX document, or you can print them out. All of the manuals are available from the Help menu.

  • To run LyX's built-in tutorial, choose Tutorial from the Help menu.

This command opens the LyX tutorial, which you can then read on the screen or print out by selecting Print from the File menu.

The following table lists the names of the available manuals as they appear on the Help menu, and describes what each contains:

Introduction         An introduction to using the LyX manuals, describing their contents and how to view and print them.
Tutorial             A hands-on tutorial to writing documents with LyX.
User's Guide         The main LyX usage manual, describing all of the commonly used commands, options, and features.
Extended Features    This is "Part II" of the User's Guide, describing advanced features such as bibliographies, indices, documents with multiple files, and techniques used in special-case situations, such as fax support, SGML-Tools support, and using version control with LyX documents.
Customization        Shows which elements of LyX can be customized and how to go about doing that.
Reference Manual     Describes all of the menu entries and internal functions.
Known Bugs           LyX is in active development, and like any large application, bugs have been found. They are listed and described in this document.
LaTeX Configuration  This document is automatically generated by LyX when it is installed on your system. It is an inventory of your LaTeX configuration, including the version of LaTeX in use, available fonts, available document classes, and other related packages that may be installed on your system.

Finally, LyX includes example documents in the `/usr/X11R6/share/lyx/examples' directory. Here's a partial listing of these files with a description of what each contains:

Foils.lyx Describes how to make foils--slides or overhead transparencies--with the FoilTeX package.
ItemizeBullets.lyx Examples of the various bullet styles for itemized lists.
Literate.lyx An example of using LyX as a composition environment for "literate programming."
MathLabeling.lyx Techniques for numbering and labeling equations.
Math_macros.lyx Shows how to make macros in Math mode.
Minipage.lyx Shows how to write two-column bilingual documents.
TableExamples.lyx Examples of using tables in LyX.

Files discussing and showing the use of LyX in the field of astronomy.
Examples of documents written in the format used by the American Mathematical Society.
docbook_example.lyx Example of a DocBook document.
multicol.lyx Example of a multi-column format.
scriptone.lyx Example of a Hollywood script.

16.4 Typesetting with TeX and Friends

Debian: `tetex-base'
Debian: `tetex-bin'
Debian: `tetex-doc'
Debian: `tetex-extra'
Debian: `tetex-lib'
WWW: http://www.tug.org/teTeX/

The most capable typesetting tool for use on Linux-based systems is the TeX typesetting system and related software. It is the premier computer typesetting system--its output surpasses or rivals all other systems to date. The advanced line and paragraph breaking, hyphenation, kerning, and other font characteristic policies and algorithms it can perform, and the level of precision at which it can do them, have yet to be matched in word processors.

The TeX system itself--not a word processor or single program, but a large collection of files and data--is packaged in distributions; teTeX is the TeX distribution designed for Linux.

TeX input documents are plain text files written in the TeX formatting language, which the TeX tools can process and write to output files for printing or viewing. This approach has great benefits for the writer: the plain text input files can be written with and exchanged between many different computer systems regardless of operating system or editing software, and these input files do not become obsolete or unusable with new versions of the TeX software.

Donald Knuth, the world's foremost authority on algorithms, first released TeX in 1978 as a way to typeset his books, because he wasn't satisfied with the quality of available systems. Since its first release, many extensions to the TeX formatting language have been made--the most notable being Leslie Lamport's LaTeX, which is a collection of sophisticated macros written in the TeX formatting language, designed to facilitate the typesetting of structured documents. (LaTeX probably gets more day-to-day use than the plain TeX format, but in my experience, both systems are useful for different kinds of documents.)

The collective family of TeX and related programs is sometimes called "TeX and friends," and abbreviated as `texmf' in some TeX references(26): for example, the supplementary files included with the bare TeX system are kept in the `/usr/lib/texmf' directory tree.

The following recipes describe how to begin writing input for TeX and how to process these files for viewing and printing. While not everyone wants or even has a need to write documents with TeX and LaTeX, these formats are widely used--especially on Linux systems--so every Linux user has the potential to encounter one of these files, and ought to know how to process them.

NOTE: "TeX" doesn't sound like the name of a cowboy, nor "LaTeX" like a kind of paint: the letters `T', `E', and `X' represent the Greek characters tau, epsilon, and chi (from the Greek `techne', meaning art and science). So the last sound in "TeX" is like the `ch' in `Bach', and "LaTeX," depending on local dialect, is pronounced either `lay-teck' or `lah-teck'. Those who become highly adept at using the system, Knuth calls "TeXnicians."

16.4.1 Is It a TeX or LaTeX File?
16.4.2 Processing TeX Files
16.4.3 Processing LaTeX Files
16.4.4 Writing Documents with TeX and LaTeX
16.4.5 TeX and LaTeX Document Templates

16.4.1 Is It a TeX or LaTeX File?

There are separate commands for processing TeX and LaTeX files, and they're not interchangeable, so when you want to process a TeX or LaTeX input file, you should first determine its format.

By convention, TeX files always have a `.tex' file name extension. LaTeX input files sometimes have a `.latex' or `.ltx' file name extension instead, but not always--one way to tell if a `.tex' file is actually in the LaTeX format is to use grep to search the file for the text `\document', which every LaTeX (and not TeX) document will have. So if it outputs any lines that match, you have a LaTeX file. (The regular expression to use with grep is `\\document', since backslash characters must be specified with two backslashes.)

  • To determine whether the file `gentle.tex' is a TeX or LaTeX file, type:

    $ grep '\\document' gentle.tex RET

In this example, grep didn't return any matches, so it's safe to assume that `gentle.tex' is a TeX file and not a LaTeX file.
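
The same test can be wrapped in a small shell function for use in scripts. This is only a sketch: the function name `texkind' is hypothetical, and it relies on the `-q' option of grep, which prints nothing and reports a match only through its exit status.

```shell
# texkind FILE (hypothetical helper, not part of any TeX distribution):
# print "latex" if FILE contains the text `\document', otherwise "tex".
texkind() {
    # The pattern is the same one used in the recipe above; the
    # backslash is doubled so that grep matches a literal backslash.
    if grep -q '\\document' "$1"; then
        echo latex
    else
        echo tex
    fi
}
```

For the `gentle.tex' file used above, `texkind gentle.tex' would print `tex'.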

NOTE: For more on grep and searching for regular expressions, see Regular Expressions--Matching Text Patterns.

16.4.2 Processing TeX Files

Use tex to process TeX files. It takes as an argument the name of the TeX source file to process, and it writes an output file in DVI ("DeVice Independent") format, with the same base file name as the source file, but with a `.dvi' extension.

  • To process the file `gentle.tex', type:

    $ tex gentle.tex RET
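
In a script it can be handy to know the name of the DVI file in advance. The following is a minimal sketch, assuming that tex writes its output to the current directory under the base name of the source file; the function name `dvi_name' is hypothetical.

```shell
# dvi_name FILE (hypothetical helper): print the name of the DVI file
# that processing FILE is expected to produce.
dvi_name() {
    name=$(basename "$1")     # drop any leading directory names
    echo "${name%.*}.dvi"     # swap the last extension for `.dvi'
}
```

For example, `dvi_name gentle.tex' prints `gentle.dvi'.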

Once you have produced a DVI output file with this method, you can preview it on the screen, print it, or convert it to another format such as PostScript.

16.4.3 Processing LaTeX Files

The latex tool works just like tex, but is used to process LaTeX files.

  • To process the LaTeX file `lshort.tex', type:

    $ latex lshort.tex RET

This command writes a DVI output file called `lshort.dvi'.

You may need to run latex on a file several times consecutively. LaTeX documents sometimes have indices and cross references, which, because of the way that LaTeX works, take two (and in rare cases three or more) runs through latex to be fully processed. If a file needs another run in order to generate the proper references, the latex output from the first run includes a message instructing you to process it again.

  • To ensure that all of the cross references in `lshort.tex' have been generated properly, run the input file through latex once more:

    $ latex lshort.tex RET

The `lshort.dvi' file will be rewritten with an updated version containing the proper page numbers in the cross reference and index entries. You can then view, print, or convert this DVI file as described in the previous recipe for processing TeX files.
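
The repeated runs can also be automated. The sketch below is one possible approach, not a standard tool: the function name `latex_until_stable' and the limit of five runs are arbitrary choices, and it assumes that latex leaves a `.log' file in the current directory containing the text `Rerun to get' whenever another pass is needed.

```shell
# latex_until_stable FILE (hypothetical helper): run latex on FILE
# until the log no longer asks for another run, up to five times.
latex_until_stable() {
    file=$1
    base=$(basename "${file%.*}")   # `lshort.tex' gives `lshort'
    n=0
    while [ "$n" -lt 5 ]; do
        # nonstopmode keeps latex from pausing for input on errors
        latex -interaction=nonstopmode "$file" || return 1
        n=$((n + 1))
        # stop as soon as the log has no rerun request
        grep -q 'Rerun to get' "$base.log" || break
    done
}
```

To use it on the example file, you would type `latex_until_stable lshort.tex'.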

16.4.4 Writing Documents with TeX and LaTeX

WWW: ftp://ctan.tug.org/tex-archive/documentation/gentle.tex
WWW: ftp://ctan.tug.org/tex-archive/documentation/lshort/

To create a document with TeX or LaTeX, you generally use your favorite text editor to write an input file containing the text in TeX or LaTeX formatting. Then, you process this TeX or LaTeX input file to create an output file in the DVI format, which you can preview, convert, or print.

It's an old tradition among programmers introducing a programming language to give a simple program that just outputs the text `Hello, world' to the screen; such a program is usually just detailed enough to give those unfamiliar with the language a feel for its basic syntax.

We can do the same with document processing languages like TeX and LaTeX. Here's the "Hello, world" for a TeX document:

Hello, world
\bye

If you processed this input file with tex, it would output a DVI file that displayed the text `Hello, world' in the default TeX font, on a default page size, and with default margins.

Here's the same "Hello, world" for LaTeX:

\documentclass{article}
\begin{document}
Hello, world
\end{document}

Even though the TeX example is much simpler, LaTeX is generally easier to use fresh "out of the box" for writing certain kinds of structured documents--such as correspondence and articles--because it comes with predefined document classes which control the markup for the structural elements the document contains(27). Plain TeX, on the other hand, is better suited for more experimental layouts or specialized documents.

The TeX and LaTeX markup languages are worth a book each, and providing an introduction to their use is well outside the scope of this text. To learn how to write input for them, I suggest two excellent tutorials, Michael Doob's A Gentle Introduction to TeX, and Tobias Oetiker's The Not So Short Introduction to LaTeX--each available on the WWW at the URLs listed above. These files are each in the respective format they describe; in order to read them, you must process these files first, as described in the two previous recipes.

Good LaTeX documentation in HTML format can be found installed on many Linux systems in the `/usr/share/texmf/doc/latex/latex2e-html/' directory; use the lynx browser to view it (see section Browsing Files).

Some other typesetting systems, such as LyX, SGMLtools, and Texinfo (all described elsewhere in this chapter), write TeX or LaTeX output, too--so you can use those systems to produce said output without actually learning the TeX and LaTeX input formats. (This book was written in Emacs in Texinfo format, and the typeset output was later generated by TeX.)

NOTE: The Oetiker text consists of several separate LaTeX files in the `lshort' directory; download and save all of these files.

16.4.5 TeX and LaTeX Document Templates

WWW: http://dsl.org/comp/templates/

A collection of sample templates for typesetting certain kinds of documents in TeX and LaTeX can be found at the URL listed above. These templates include those for creating letters and correspondence, articles and term papers, envelopes and mailing labels,(28) and fax cover sheets. If you're interested in making typeset output with TeX and LaTeX, these templates are well worth exploring.

To write a document with a template, insert the contents of the template file into a new file that has a `.tex' or `.ltx' extension, and edit that. (Use your favorite text editor to do this.)

To make sure that you don't accidentally overwrite the actual template files, you can write-protect them (see section Write-Protecting a File):

$ chmod a-w template-file-names RET

In the templates themselves, the bracketed, uppercase text explains what kind of text belongs there; fill in these lines with your own text, and delete the lines you don't need. Then, process your new file with either latex or tex as appropriate, and you've got a typeset document!
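
The copy step can be scripted so that a template is never edited directly. This is a minimal sketch: the function name `new_from_template' is hypothetical. Because cp carries the permissions of a write-protected template over to the copy, the function turns write permission back on for the new file.

```shell
# new_from_template TEMPLATE NEWFILE (hypothetical helper): start a
# new document from a template without touching the template itself.
new_from_template() {
    # Refuse to clobber a document you have already started.
    if [ -e "$2" ]; then
        echo "$2 already exists; not overwriting" >&2
        return 1
    fi
    # The copy of a write-protected template is write-protected too,
    # so make it writable by its owner again.
    cp "$1" "$2" && chmod u+w "$2"
}
```

For example, `new_from_template letter.ltx reply.ltx' would create `reply.ltx' (a file name chosen here only for illustration), ready for editing.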

The following table lists the file names of the TeX templates, and describes their use. Use tex to process files you make with these templates (see section Processing TeX Files).

TEMPLATE FILE     DESCRIPTION
fax.tex           A cover sheet for sending fax messages.
envelope.tex      A No. 10 mailing envelope.
label.tex         A single mailing label for printing on standard
                  15-up sheets.

The following table lists the file names of the LaTeX templates, and describes their use.(29) Use latex to process files you make with these templates (see section Processing LaTeX Files).

TEMPLATE FILE     DESCRIPTION
letter.ltx        A letter or other correspondence.
article.ltx       An article or a research or term paper.
manuscript.ltx    A book manuscript.

There are also more complex template packages available on the net that you might want to look at.

16.5 Writing Documents with SGMLtools

Debian: `sgml-tools'
WWW: http://www.sgmltools.org/

With the SGMLtools package, you can write documents and generate output in many different kinds of formats--including HTML, plain text, PDF, and PostScript--all from the same plain text input file.

SGML ("Standard Generalized Markup Language") is not an actual format, but a specification for writing markup languages; the markup language "formats" themselves are called DTDs ("Document Type Definitions"). When you write a document in an SGML DTD, you write input as a plain text file with markup tags.

The various SGML packages on Linux are currently in a state of transition. The original SGML-Tools package (known as LinuxDoc-SGML in another life; now SGMLtools v1) is considered obsolete and is no longer being developed; however, the newer SGMLtools v2 (a.k.a. "SGMLtools Next Generation" and "SGMLtools '98") is still alpha software, as is SGMLtools-lite, a new subset of SGMLtools.

In the interim, if you want to dive in and get started making documents with the early SGMLtools and the LinuxDoc DTD, it's not hard to do. While the newer DocBook DTD has become very popular, it may be best suited for technical books and other very large projects--for smaller documents written by individual authors, such as a multi-part essay, FAQ, or white paper, the LinuxDoc DTD still works fine.

And since the Linux HOWTOs are still written in LinuxDoc, the Debian project has decided to maintain the SGMLtools 1.0 package independently.

The SGML-Tools User's Guide comes installed with the `sgml-tools' package, and is available in several formats in the `/usr/doc/sgml-tools' directory. These files are compressed; if you want to print or convert them, you have to uncompress them first (see section Compressed Files).

  • To peruse the compressed text version of the SGML-Tools guide, type:

    $ zless /usr/doc/sgml-tools/guide.txt.gz RET

  • To print a copy of the PostScript version of the SGML-Tools guide to the default printer, type:

    $ zcat /usr/doc/sgml-tools/guide.ps.gz | lpr RET

16.5.1 Elements of an SGML Document
16.5.2 Checking SGML Document Syntax
16.5.3 Generating Output from SGML

16.5.1 Elements of an SGML Document

A document written in an SGML DTD looks a lot like HTML--which is no coincidence, since HTML is itself a markup language defined by an SGML DTD. A very simple "Hello, world" example in the LinuxDoc DTD might look like this:

<!doctype linuxdoc system>
<article>
<title>An Example Document
<author>Ann Author
<date>4 May 2000
<abstract>
This is an example LinuxDoc document.
</abstract>

<sect>Introduction

<p>Hello, world.

</article>

A simple example document and the various output files it generates are on the SGMLtools site at http://www.sgmltools.org/old-site/example/index.html.

The SGMLtools package also comes with a simple example file, `example.sgml.gz', which is installed in the `/usr/doc/sgml-tools' directory.

16.5.2 Checking SGML Document Syntax

Use sgmlcheck to make sure the syntax of an SGML document is correct--it outputs any errors it finds in the document that is specified as an argument.

  • To check the SGML file `myfile.sgml', type:

    $ sgmlcheck myfile.sgml RET

16.5.3 Generating Output from SGML

The following table lists the SGML converter tools that come with SGMLtools, and describes the kind of output they generate. All take the name of the SGML file to work on as an argument, and they write a new file with the same base file name and the file name extension of their output format.

TOOL          DESCRIPTION
sgml2html     Generates HTML files.
sgml2info     Generates a GNU Info file.
sgml2lyx      Generates a LyX input file.
sgml2latex    Generates a LaTeX input file (useful for printing; first
              process as in Processing LaTeX Files, and then print the
              resultant DVI or PostScript output file).
sgml2rtf      Generates a file in Microsoft's "Rich Text Format."
sgml2txt      Generates plain text format.
sgml2xml      Generates XML format.

  • To make a plain text file from `myfile.sgml', type:

    $ sgml2txt myfile.sgml RET

This command writes a plain text file called `myfile.txt'.
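
Since every converter takes the same argument, several output formats can be made from one source file in a single loop. This is a sketch only: the function name `sgml_convert_all' is hypothetical, and the choice of sgml2txt and sgml2html is an arbitrary selection from the table above.

```shell
# sgml_convert_all FILE (hypothetical helper): run more than one of
# the SGMLtools converters on the same SGML source file.
sgml_convert_all() {
    src=$1
    for tool in sgml2txt sgml2html; do
        # Each converter writes its own output file next to the
        # source; report a failure but carry on with the next tool.
        "$tool" "$src" || echo "$tool failed on $src" >&2
    done
}
```

To make both plain text and HTML output from `myfile.sgml', you would type `sgml_convert_all myfile.sgml'.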

To make a PostScript or PDF file from an SGML file, first generate a LaTeX input file, run it through LaTeX to make a DVI output file, and then process that to make the final output.

  • To make a PostScript file from `myfile.sgml', type:

    $ sgml2latex myfile.sgml RET
    $ latex myfile.latex RET
    $ dvips -t letter -o myfile.ps myfile.dvi RET

In this example, sgml2latex writes a LaTeX input file from the SGML source file, and then the latex tool processes the LaTeX file to make DVI output, which is processed with dvips to get the final output: a PostScript file called `myfile.ps' with a paper size of US letter.

To make a PDF file from the PostScript file, you need to take one more step and use ps2pdf, part of the gs or Ghostscript package; this converts the PostScript to PDF.

  • To make a PDF file from the PostScript file `myfile.ps', type:

    $ ps2pdf myfile.ps myfile.pdf RET
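
The four commands in the last two recipes can be chained so that each step runs only if the one before it succeeded. The following is a sketch under the same assumptions as those recipes, including the `.latex' file name written by sgml2latex and the US letter paper size; the function name `sgml_to_pdf' is hypothetical.

```shell
# sgml_to_pdf FILE.sgml (hypothetical helper): go from SGML source
# to PDF by way of LaTeX, DVI, and PostScript.
sgml_to_pdf() {
    base=${1%.sgml}                  # `myfile.sgml' gives `myfile'
    # `&&' stops the chain at the first command that fails.
    sgml2latex "$base.sgml" &&
    latex "$base.latex" &&
    dvips -t letter -o "$base.ps" "$base.dvi" &&
    ps2pdf "$base.ps" "$base.pdf"
}
```

If your version of sgml2latex writes its output under a different file name, adjust the latex line to match.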

16.6 Other Word Processors and Typesetting Systems

The following table describes other popular word processors and typesetting tools available for Linux. Those systems not in general use have been silently omitted.

AbiWord
    A graphical, WYSIWYG-style word processor for Linux systems. It can read Microsoft Word files.
    WWW: http://www.abisource.com/

groff
    GNU roff (groff) is the latest in a line of phototypesetting systems that have been available on Unix-based systems for years; the original in this line was roff ("runoff," meaning that it was for files to be run off to the printer). groff is used in the typesetting of man pages, but it's possible to use it to create other kinds of documents, and it has a following of staunch adherents. To output the tutorial file included with the groff distribution to a DVI file called `intro.dvi', type:

    $ zcat /usr/doc/groff/me-intro.me.gz | groff -me -T dvi > intro.dvi RET

    Debian: `groff'

Maxwell
    A graphical word processor for use in X.
    WWW: http://www.eeyore-mule.demon.co.uk/

PostScript
    The PostScript language is generally considered to be a format generated by software, but some people write straight PostScript! The section Converting Plain Text for Output has recipes on creating PostScript output from text, including outputting text in a font. People have written PostScript template files for creating all kinds of documents--from desktop calendars to mandalas for meditation. The Debian `cdlabelgen' and `cd-circleprint' packages contain tools for writing labels for compact discs. Also of interest are Jamie Zawinski's templates for printing label inserts for video and audio tapes; edit the files in a text editor and then view or print them as you would any PostScript file.
    WWW: http://www.jwz.org/audio-tape.ps
    WWW: http://www.jwz.org/video-tape.ps

StarWriter
    A traditional word processor for Linux systems, part of the StarOffice application suite. It can also read Microsoft Word files.
    WWW: http://www.sun.com/staroffice/

Texinfo
    Texinfo is the GNU Project's documentation system and is an excellent system for writing FAQs or technical manuals. It allows for the inclusion of in-line EPS images and can produce TeX-based, HTML, and Info output--use it if this matches your needs.
    Debian: `tetex-base'
    WWW: http://www.texinfo.org/
