MH Format Strings

NOTE: for users of the online version of this book: This chapter has a lot of examples followed by long explanations. To avoid jumping between the example and its explanation, it's a good idea to open a new browser window to show an example. (Check your browser's menu for a command like New Web Browser or Open in New Window.) Then, use the original browser to read the explanation while you view the example in the second browser window.

The MH 6.8.3 mh-format(5) manual page says: "Format strings are designed to be efficiently parsed by MH which means they are not necessarily simple to write and understand. This means that... users of MH should not have to deal with them."

The MH 6.6 page said just the opposite: format strings "...represent an integral part of MH. This means that... users of MH should deal with them."

I tend to agree with the MH 6.6 wording. Unless you're doing something very complex, MH format strings really aren't that tough to figure out. And they're very useful. MH format strings let you:

Parse (analyze, get pieces of) message header fields, especially dates and addresses.
Build strings from other strings, including addresses.
Do if-then-elseif-else tests.
Do simple integer arithmetic.

You can use format strings to build message headers, or entire messages, from other messages. That's how the replcomps file works, by the way. The scan command can also use format strings to customize its output. And format strings are great for programming in Perl, the shells, C, etc. -- you can use them to parse message headers, a real time saver. For more information, see the Chapter Introduction to UNIX Programming with MH.

Until recently, the mh-format(5) manual page was fairly brief; it didn't document all of mh-format. The most recent version of the manual page, released with MH 6.7, has quite a bit of information. If your online version isn't up to date, the Section Online Manual Pages explains how to get a newer one.

One term you'll need to know is escape. An escape is a lot like a variable in programming or mathematics: it stands for (and is usually replaced with) something else. There are three kinds of escapes in MH format strings.

The easiest escapes to define are component escapes. These are replaced with the fields' values from your message header. (Remember, MH calls a header field a "component.") Here's an example. To get the subject of a message into your MH format string, you use the subject component escape. Write it this way:

%{subject}

There are two other kinds of escapes: function and control. You'll see examples of those below, and the mh-format(5) manual page defines them.

In fact, this is a good time to spend a few minutes with your online manual page. You don't need to read it word for word, but you should see what sections are there and what topics they cover.

The following sections will take you through MH format strings by example, like the mhl sections did. An easy way to get started with MH format strings is the scan command.

scan Format Strings

A scan format string is an mh-format string. It tells scan how to format the output for each message it scans.

It's time for a few examples. I have a folder with two messages in it. In the Example below, I'll use show to display the header of the first message for reference. Then I'll scan both messages with the normal scan command. Because there's no -form or -format string, scan uses its default format.

Example: Sample folder with two messages

% show 1
(Message scantest:1)
Forwarded: Fri, 13 Jan 1995 03:41:35 -0500
Forwarded: alicia
Replied: Mon, 09 Jan 1995 10:25:45 -0500
Replied: Joe Doe <joe@foobar.ph.com>
Date: Thu 14-Dec-89 17:31:21 est 
Received: by asun.phl.ph.com (5.54/PHL)
        id AA29237; Thu, 14 Dec 89 17:31:21 EST
Message-Id: <8912142231.AA29237@phl.ph.com>
From:  Al Bok <al@phl.ph.com>
Reply-to: Joe Doe <joe@foobar.ph.com>
To:  hquimby@asun.phl.ph.com
cc: ehuser@quack.phl.ph.com, aguru@mt.top.ph.com
Subject:  Query about "repl -query"

I have a question about repl -query...
% scan
   1+-12/14 Al Bok             Query about "repl -query"<<I have
   2  01/09 To:Joe Doe         Re: Query about "repl -query"<<Jo

Now let's give scan a format string. Either you can put format strings in a format file and use scan's -form switch or you can type them on the command line with the -format switch. I'll start with -format.

A simple format string that prints a hash mark followed by the message number and a colon, then the subject, works like this:

% scan -format "#%(msg): %{subject}"
#1: Query about "repl -query"
#2: Re: Query about "repl -query"

Here are some points about that last example:

That format string has double quotes (") around it. They tell the UNIX shell not to interpret most of the special characters in the string. So, # won't be a shell comment character, () won't start a subshell, and so on; those characters will be passed to scan for it to interpret.
In the format string, the %(msg) prints the value of the (msg) function escape. The %{subject} prints the component escape called {subject}.
The hash mark (#), colon (:), and space are printed literally.
The scan command uses the same format string on each message.
I didn't give scan a message number list, so it scanned each message in the folder.

This is a good place to compare component escapes with function escapes. A component escape gets the contents of a component. A function escape performs some sort of calculation, operation, or other function. For example, the component escape {to} gets the contents of the To: field from a message. The (size) function escape counts the number of characters in a message.

If you don't use the percent sign (%) characters, MH won't treat what comes next as an escape. Look what happens without the % characters:

% scan -format "#(msg): {subject}"
#(msg): {subject}
#(msg): {subject}

You've already seen examples of two of the three types of escapes: component and function escapes. The third type, a control escape, does an if-else_if-else-endif operation. The parts are:

%< = if      %? = else_if      %| = else      %> = endif

NOTE: MH 6.7.2 added the else-if operator, %?, to that list. To keep things simple at the start, I won't cover %? until the Section The Default scan Format File.

Let's add a control escape to this example. It will test to see who each message is from. If a message was sent by me, this control escape will display the words FROM ME. Otherwise, it'll display the sender's address by printing the %{from} component escape. The control escape looks like this:

%<(mymbox{from})FROM ME%|%{from}%>

That's not as hard as it might look -- we'll dissect it in a minute. Let's try it first, then explain.

% scan -format "#%(msg): %<(mymbox{from})FROM ME%|%{from}%> %{subject}"
#1: Al Bok <al@phl.ph.com> Query about "repl -query"
#2: FROM ME Re: Query about "repl -query"

The first message is from someone else, so scan prints his address. The second message is from me, so FROM ME is printed instead.

Let's dig into that control escape. Here's a diagram of the if-then-else parts:

%<   (mymbox{from})   FROM ME   %|     %{from}   %>
if                    then      else
     this is true     do this          do this

Actually, that's a nested set of all three kinds of escapes -- control, function, and component.

The %< is the start of the control escape. It tests the return value of (in other words, the "answer" from) (mymbox{from}). The (mymbox) function escape tests whether an address belongs to the person who's running the MH command. The {from} component escape is the address to test. Note the following:

If the return value of (mymbox{from}) is 1 ("true"), then the message is from me. The control escape evaluates the first part (in other words, the then part) of the rest of the escape. Here that's the words FROM ME. Because those words aren't an escape, they're just printed as is. Then evaluation continues at the %> symbol.
Otherwise, the return value of (mymbox{from}) must be 0 ("false"). Interpretation jumps to the %| symbol, which is the else part of the control escape. Just after this is a component escape, %{from}, that holds the address the message is from -- so, the address is printed. Then evaluation continues at the %>.

Look back at the result of running that command. When the first message was scanned, it was not from me, the test failed, and the From: address was printed. When the second message was scanned, it was from me, the test was true and FROM ME was printed.

An escape returns one of two kinds of values, either numeric (integer) or string. The return values of escapes are put into registers (holding places) named num and str, respectively.

For simple format files, you don't need to know about registers. That's because the return value of an escape is always printed, unless the escape is nested in another escape. The outermost escape should always start with a percent sign (%); inner (nested) escapes shouldn't.

For instance, in the previous format string, the %(msg) and %{subject} escapes are not nested in others -- so their values are just printed. But the nested set of escapes (mymbox{from}) is itself nested in a control escape. There the return value of {from} is passed to (mymbox), and the return value of (mymbox) is passed to the control escape. What's printed is the value of the control escape (which starts with a percent sign (%); that's a clue that it'll be printed).

It's a good idea to test yourself as you look at the other mh-format strings in this section. Experiment to be sure how they work, what will be printed, and so on. The mh-format(5) manual page has more precise information.

NOTE: Most address-parsing function escapes won't work if your MH is configured with [BERK]. scan -help lists your configuration.

The Table below summarizes the three kinds of escapes, along with the comment escape that was added in MH 6.8.

Table: MH Format Escapes

Component
SYNTAX: {component}
EXAMPLE: %{from}
RESULT: What's in the From: field.
Function
SYNTAX: (function)
EXAMPLE: %(mymbox{from})
RESULT: True if the result of {from} is my address.
Control
SYNTAX: %<if%?else_if%|else%>
EXAMPLE: %<(mymbox{from})FROM ME%|To: %{to}%>
RESULT: If (mymbox{from}) is true, value is the string FROM ME. Else, value is the string To: followed by the contents of the To: field.
Comment (MH 6.8 and above)
SYNTAX: %;
EXAMPLE: %; by Emma H. User
RESULT: None.

If you're still not exactly sure how this works, this is a good time to practice. To help you get started if you haven't done much programming before, you might want to lure a computer guru from down the hall somewhere. (Hint: all computer gurus like pizza.)

scan Format Files

Because the format strings in the example are getting pretty long to type, I'll start using format files in the examples.

A format file has the same syntax as the format strings we used above, but you type the format string into the file without quotes around it. (Use a text editor like vi or emacs.) You give the filename to scan with its -form switch -- if the file is in your MH directory, you don't need to type a pathname. For example, here's what the above format string would look like in a format file named scan.from in your MH directory. I've left in the backslash at the end of the short first line, so you can see how to continue lines if you need to. (You can also get this little file from the book's online archive. See download/split/mh/Mail/scan.from.)

% cat scan.from
#%(msg): \
%<(mymbox{from})FROM ME%|%{from}%> %{subject}
% scan -form scan.from
#1: Al Bok <al@phl.ph.com> Query about "repl -query"
#2: FROM ME Re: Query about "repl -query"

Another note about these example format files: if you don't want to type them in yourself, you can get them electronically. For instructions, see the Section Obtaining Example Files From This Book.

The scan.answer Format File

Let's turn the simple scan.from format file into one that's more useful:

It shows the message number. If you've replied to the message by using repl -annotate, an R is printed next.
It gives the address you'd use to answer each message. In most cases, the address you want is the From: address -- unless the message has a Reply-to: field. And, if the message is from you, you want FROM ME to remind you that it doesn't need a reply.
It gives the subject of each message.
It lines up text in columns -- that is, each part of the line is the same width as lines above and below.

Here's the output and the format file. (You can also get this file from the book's online archive. It's in download/split/mh/Mail/scan.answer.)

% scan -form scan.answer
   1R Al Bok <al@phl.ph.co Query about "repl -query"
   2  ****** FROM ME ***** Re: Query about "repl -query"
% cat scan.answer
%4(msg)%<{replied}R%| %> \
%<(mymbox{from})****** FROM ME *****%|\
%<{reply-to}%20{reply-to}%|%20{from}%>%> \
%{subject}

Okay; let's take this step by step again:

The %4(msg) prints the message number in a field that's four characters wide.
%<{replied}R%| %> tests the value of the {replied} component escape. If there is a Replied: field in this message header, the test is true and the R is printed. If the message header doesn't have a Replied: field, the test will fail and a space is printed instead (to keep the columns neat).
%<(mymbox{from})****** FROM ME ***** starts the same test as in the scan.from format file: if the message is from me, it prints FROM ME. There are enough asterisks to make the output exactly 20 characters wide.
%|%<{reply-to}%20{reply-to}%|%20{from}%> is the else part of the previous if (the (mymbox{from}) escape). This else is actually made up of another complete if-then-else, as shown below:
- %<{reply-to}%20{reply-to} says that if the message has a Reply-To: field, print that field with a width of 20.
- %|%20{from}%> is the else -- if the message didn't have a Reply-To:, then we use the From: field instead. Again, we use only the first 20 characters. %> is the end of this inner control escape.
If you use MH 6.7.2 or later, that test can be shortened by using the %? else-if operator. The Section The Default scan Format File introduces %?.
And finally, the rest of the file:
- %> \ is the end of the outer control escape. There's a space before the backslash; this makes the space between the address and the subject.
- %{subject} prints the subject with no width limit except the screen size (see the Section scan Widths for more information).
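
If you think better in an ordinary programming language, here's a rough Python sketch of the decisions scan.answer makes for one message. It's only a model of the format file's logic, not the way MH works inside. The names fixed and answer_line, and their parameters, are made up for this illustration.

```python
def fixed(text, width):
    """Truncate or pad text to exactly `width` columns, like %20{from}."""
    return text[:width].ljust(width)


def answer_line(msg, replied, from_me, reply_to, sender, subject):
    """Build one output line the way scan.answer does (hypothetical helper)."""
    flag = "R" if replied else " "            # %<{replied}R%| %>
    if from_me:                               # %<(mymbox{from})...
        addr = "****** FROM ME *****"         # exactly 20 characters
    elif reply_to:                            # %|%<{reply-to}%20{reply-to}
        addr = fixed(reply_to, 20)
    else:                                     # %|%20{from}%>%>
        addr = fixed(sender, 20)
    # %4(msg) is a four-column number; single spaces separate the parts.
    return f"{msg:4d}{flag} {addr} {subject}"


print(answer_line(2, False, True, "", "ehuser",
                  'Re: Query about "repl -query"'))
```

The nested if/elif/else mirrors the nested control escapes: the inner test on Reply-to: is only reached when the outer (mymbox{from}) test fails.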

The Default scan Format File

When you use scan without a format file or format string, you get the default format. Here's an example of the default format:

 436+-06/28 Al Bok             <<I have a very complicated ques
 441  06/29 Jerry Peek         That complicated message Al sent
 443  06/30*To:ehuser,emmab    More about lunch<<The meeting is

NOTE: In earlier versions of MH, message 441 showed a problem in the default format. It would scan this way:
441  06/29 To:                That complicated message Al sent
If a message was from you and its header didn't have a To: field, scan would show To: followed by an empty field. That happened when a particular message was a reply sent with repl -query, where the reply wasn't sent to the person who wrote the original message.

The default scan format is not read from a file each time scan runs. It's built into the scan command. The compiled-in definition is in the file h/scansbr.h in the MH source tree. There are two versions. If your MH is configured without the [UK] option (see the Section The -help Switches to find out), look at the first Example below (or the book's online archive, download/split/mh/Mail/scan.default). I've added line numbers (like 8>) for reference; those aren't part of the file. In the [UK] configuration, the day of the month is printed before the month. That file is shown in the second Example below; it's also in the book's online archive at download/split/mh/Mail/scan.default.uk.

nmh provides a copy of its default format file in the file scan.default. But, like MH, nmh doesn't use that actual file; it uses an internal version. There's an important difference in the nmh default format, though: it decodes MIME characters in the message header. For details, see the end of this section.

Example: Default scan format file

1> %; NOTE: This file is supplied for reference only; it shows the default
2> %;  format string (for non-UK sites) which was compiled into "scan".
3> %;  See the source file "h/scansbr.h" for details.
4> %4(msg)%<(cur)+%| %>%<{replied}-%?{encrypted}E%| %>\
5> %02(mon{date})/%02(mday{date})%<{date} %|*%>\
6> %<(mymbox{from})%<{to}To:%14(friendly{to})%>%>%<(zero)%17(friendly{from})%>  \
7> %{subject}%<{body}<<%{body}>>%>

Example: Default UK scan format file

1> %4(msg)%<(cur)+%| %>%<{replied}-%?{encrypted}E%| %>\
2> %02(mday{date})/%02(mon{date})%<{date} %|*%>\
3> %<(mymbox{from})%<{to}To:%14(friendly{to})%>%>%<(zero)%17(friendly{from})%>  \
4> %{subject}%<{body}<<%{body}>>%>

The non-UK scan format (in the Example Default scan format file) is also available, in MH 6.8 and above, as the file scan.default in the MH library directory. I made the UK version by swapping the day and month entries from scan.default.

Let's take a walk through the non-UK Example, Default scan format file. As we work through this example and the ones after it, keep the mh-format(5) manual page close by and refer to it as we go. To help with the explanation, here are two scan output lines with each character (column position) numbered:

 436+-06/28 Al Bok             <<I have a very complicated ques
 443  06/30*To:ehuser,emmab    More about lunch<<The meeting is
0        1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890

Lines 1-3 are comments. They start with the comment escape %;, which was added in MH 6.8.
Before MH 6.8, an ugly way to add a comment looked like this:
```
%<{-comment-}This is a comment%>
```
If the message header doesn't have a field named -comment-: (with a dash at the start and end of its name), the comment This is a comment won't be printed. I don't recommend making comments that way.
Line 4 prints the first six characters of each line, columns 1-6:
- The first four characters hold the message number. It comes from the (msg) function escape; the %4 prints the value in a field four characters wide.
- There are two control escape tests on line 4. The first test prints the fifth character. If the (cur) function returns a "true" value, the message we're scanning is the current message; the test prints a plus sign (+). Otherwise, to keep the columns lined up, a space is printed.
- The second test prints the sixth character. It also uses the new %? else-if operator introduced in MH 6.7.2. If the message has been annotated with a Replied: header field (Section Annotating the Original Message), a dash (-) is printed. Else if the message has an Encrypted: header field (from versions of MH configured with the [TMA] option), an E is printed. Else, a space is printed.
Line 5 prints the next six characters, columns 7-12: the date and possibly an asterisk (*). The Date: header field is parsed twice, by the (mon) and (mday) functions, to get the numeric month and day. Those two numbers are printed with the format specifier %02 and a slash (/) between. The leading 0 in %02 means that if the number has fewer digits than the field width (in this case, if it has just one digit), it's printed with a leading zero. So, days like March 5 would be printed as 03/05.
If the message doesn't have a Date: header field, {date} gives the date that the message file itself was last modified. In that case, column 12 will have an asterisk (*) instead of a blank. This is handy for scanning draft folders, where messages usually don't have Date: fields. The note after Table MH-format Special Component and Function Escapes has more information.
Line 6 is fairly complex in MH 6.8 and above. (Earlier versions were simpler, but they had the bug explained in the NOTE above, before the Examples.) Line 6 prints 19 characters: up to 17 characters of text with two spaces at the end. That's columns 13-31.
Line 6 starts with a nested control escape. It ends with a control escape that tests a register set by the first control escape. Let's take it in steps.
- Here's the first part of line 6:
```
%<(mymbox{from})%<{to}To:%14(friendly{to})%>%>
```
  It starts by testing to see whether the From: field contains my address. (As a side effect, the return value of the test, (mymbox{from}), is stored in the num register. The value is used below.) If the message is from me, the second control escape tests to see whether the message has a To: field. If it does, the format string prints To: followed by the first 14 characters of the first address in the To: field. That's the end of the first part of line 6.
- Now, the second part of the entry:
```
%<(zero)%17(friendly{from})%>  \
```
  The (zero) function is true when the value in the num register is zero. In other words, if the last escape that modified the value of num set a "false" (zero) value, the (zero) function will test true.
  What was the last escape that modified the value of num? There are two possibilities here:
  - If the message isn't from me, (mymbox{from}) will have failed. The control escape in the first part of line 6 will have branched to its end, skipping its nested part. In that case, (zero) will test true.
  - If the message is from me, the first control escape will branch to its nested part and evaluate {to}. If the message has a To: field (and most messages do!), {to} will set the num register to "true" (non-zero). So, in most cases, num will be true and the (zero) test will fail.
    But if the message is from me and it doesn't have a To: field, {to} will set the num register to "false". (Also, the To: field won't be printed.) In that case, (zero) will test true -- and the second control escape will print the first 17 characters of the From: field in a "user-friendly" format.
Finally, the two spaces before the backslash (\) at the end of line 6 print two blank columns of output: characters 30 and 31.
That wasn't so bad, was it? :-) Line 6 shows a good example of the num register: holding the result of a test to be used later.
Line 7 starts by printing the Subject: field. If there's any room left, the first part of the message body is printed. (The Section scan Widths explains how scan decides whether there's room left.) {body} is a special component escape set by scan. It holds the first part of the message body, "compressed": newlines and multiple space characters are replaced by a single space.
If the message doesn't have a body, {body} evaluates false and nothing is printed after the subject. If the message has a body, << is printed, followed by as much of the body as will fit:
- If the message body is short, then all of it will be printed. At the end, you'll see >>:
```
 452  07/01 "Emma H. User"     A short message from me<<Hi!>>
```
- In most cases, though, there's only room for the first part of the body. scan will print as much of the body as it can.
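
The num register trick on line 6 can be easier to follow as ordinary code. Here's a loose Python model of that line. It's a sketch of the logic as described above, not MH's real implementation; the names fit and address_column are hypothetical, and the plain variable num stands in for the num register.

```python
def fit(text, width):
    """Pad or truncate text to exactly `width` characters."""
    return text[:width].ljust(width)


def address_column(from_me, to_field, from_field):
    """Model columns 13-31 of a default scan line (line 6 of the format)."""
    out = ""
    # (mymbox{from}) leaves its true/false result in num.
    num = 1 if from_me else 0
    if num:
        # The nested test of {to} overwrites num.
        num = 1 if to_field else 0
        if num:
            out += "To:" + fit(to_field, 14)
    # (zero) is true when num is 0: either the message isn't from me,
    # or it's from me but has no To: field.
    if num == 0:
        out += fit(from_field, 17)
    return out + "  "  # the two literal spaces before the backslash


print(repr(address_column(True, "ehuser,emmab", "Emma H. User")))
print(repr(address_column(True, "", "Jerry Peek")))
```

Whichever branch runs, the result is 19 characters wide. That's why the Subject: always starts in column 32.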

Recent versions of nmh have some changes to the default scan format. Here are the last three lines of that file; the changes are the decode function escapes wrapped around the To:, From: and Subject: values:

%<(mymbox{from})%<{to}To:%14(decode(friendly{to}))%>%>\
%<(zero)%17(decode(friendly{from}))%>  \
%(decode{subject})%<{body}<<%{body}>>%>

The To:, From: and Subject: header fields use the decode function escape. This decodes any RFC 2047 encoding in those fields. For example, this changes a Subject: field encoded as Un =?iso-8859-1?Q?d=EDa_dif=EDcil?= into Un día difícil.

scan only decodes these fields if your terminal can natively display the character set used in the encoding. You should set the MM_CHARSET environment variable to your native character set if it is not US-ASCII.
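
If you'd like to see what RFC 2047 decoding involves, here's a small sketch in Python. It isn't part of MH or nmh, and it doesn't claim to match how nmh's decode escape is implemented. It uses the standard library's email.header.decode_header; the helper name decode_rfc2047 is made up for this example.

```python
from email.header import decode_header


def decode_rfc2047(value):
    """Return a header value with any RFC 2047 encoded-words decoded."""
    parts = []
    for chunk, charset in decode_header(value):
        # Encoded (and neighboring) chunks come back as bytes;
        # a header with no encoded-words comes back as a plain str.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)


print(decode_rfc2047("Un =?iso-8859-1?Q?d=EDa_dif=EDcil?="))
```

In the encoded-word, iso-8859-1 names the character set, Q means a quoted-printable style encoding, =ED is the byte for the accented i, and each underscore stands for a space.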

More Header Information: scan.hdr

Next, let's try a small change to the default (non-UK) format file. This new format file, scan.hdr, shows more information about the message header. The file, or one that you adapt from it, might be useful for you. Its output looks like this:

 435+          05/20 root               <<The job you submitted to a
 436 C       R 06/28 Al Bok             <<I have a very complicated
 441 C         06/29 Jerry Peek         That complicated message Al
 443  DF       06/30*To:ehuser,emmab    More about lunch<<The meetin

This new version has five "field letters" between the message number and date:

C: The message header has a cc: field in it. This is useful for figuring out messages like number 441, which doesn't have a To: address (but, as you can tell, does have a cc: field).
D: The message has been distributed (either distributed from someone else to you or sent by you with the dist -annotate command). So message 443 has at least one Resent-To:, Resent-cc:, or Resent: field.
F: The message has been forwarded to someone. Message 443 has been forwarded with forw -annotate, and it has a Forwarded: field.
M: The message has a MIME-Version: field (in other words, it's a MIME message).
R: The message has been replied to with repl -annotate. (The default format file uses a dash (-), instead of an R, for this.)

The next Example shows the format file. (You can also get this file from the book's online archive in download/split/mh/Mail/scan.hdr.)

Example: scan.hdr format file

1> %4(msg)%<(cur)+%| %>%<{cc}C%| %>\
2> %<{resent-to}D%?{resent-cc}D%?{resent}D%| %>\
3> %<{forwarded}F%| %>%<{mime-version}M%| %>%<{replied}R%| %>\
4>  %02(mon{date})/%02(mday{date})\
5> %<{date} %|*%>\
6> %<(mymbox{from})%<{to}To:%14(friendly{to})%>%>%<(zero)%17(friendly{from})%>  \
7> %{subject}%<{body}<<%{body}>>%>

The differences between scan.default and scan.hdr are in the first four lines of the Example above. Compare those to lines 4-7 of the Example Default scan format file.

Most of the changes are new control escapes to make the field letters. For example:

%<{cc}C%| %>

tests for a cc: header. If there is one, it prints a C; otherwise it prints a space.

The three-part control escape on the second line prints a D or a space. It uses the %? else-if operator. Here is the same line for versions of MH before 6.7.2 which don't have %?:

%<{resent-to}D%|%<{resent-cc}D%|%<{resent}D%| %>%>%>

You might try adding another column for, say, a Sender: field. To test your new format file, use a text editor to add a Sender: field to a couple of mail messages. In MH 6.7 and later, you can also use a command like the following to add a dummy Sender: field. (In MH 6.6 and before, anno doesn't have a -nodate switch.)

% anno -nodate -component Sender -text someone@somewhere

scan Widths

When scan writes to your screen, it tries to determine the screen width and fill it (if your format gives it that much text). For instance, the standard format string (stored internally in scan) will fill an 80-column screen to column 79. The same format string will fill a 40-column screen to column 39; the right-hand end will be cut off. Here's the output of the same standard format string at three different screen widths:

  18+ 02/13 To:omderose@mvus   Lunch<<Let's eat now. OK? >>
  18+ 02/13 To:omderose@mvus   Lunch<<Let's
  18+ 02/13 To:omderose@m

As another example, notice that adding the five status letters in the Section More Header Information: scan.hdr didn't make the scan.hdr output any wider than the scan.default output.

As an output line is printed, you can get the amount of space left by using the function escape (charleft). The (width) function escape gives the total output width.

The scan.dateparse Format File

Let's try another example: the format file scan.dateparse. It uses date parsing functions to show the dates of messages. The output changes to fit the width available.

scan.dateparse isn't a format file you'd want to use every day, but it's a good demonstration of some important things:

How date parsing works.
How to compare two numbers (in this case, a "greater-than" test).
The effect of line and data width, and how your format file can adjust automatically when text width or output width vary.
Word wrapping and embedded newlines (\n).
More examples of if-then-else branching (control escapes).

Let's see what the file does and then dig into a line-by-line explanation. First, here's a normal scan of a folder with four messages. The messages were sent from different systems in different time zones. Message 3 has an illegal Date: field.

% scan
   1+-12/14 Al Bok             Query about "repl -query"<<I hav
   2  01/09 To:Joe Doe         Re: Query about "repl -query"<<J
   3  01/00 randy@atlantic.or  Meeting is on!<<Be sure to get y
   4  08/16 randy@atlantic.or  Meeting is on!<<Be sure to get y

The scan.dateparse format file makes about 330 characters of output for each message. The amount depends on the length of the Date: field in the message. Here's an example of scanning the same folder with scan.dateparse:

% scan -form scan.dateparse -width 330

MESSAGE 1: Thu 14-Dec-89 17:31:21 est (STANDARD time)
 Official: Thu, 14 Dec 89 17:31:21 -0500
 "Pretty": Thu, 14 Dec 89 17:31:21 EST
629677881 seconds since UNIX, 160540976 seconds before now
DAY|WEEKDAY  |WDAY|SDAY|MONTH|LMONTH   |MON|YEAR|HOUR|MIN|SEC
Thu|Thursday |   4|yes |Dec  |December | 12|1989|  17| 31| 21

MESSAGE 2: Mon, 09 Jan 1995 10:25:45 -0500 (STANDARD time)
 Official: Mon, 09 Jan 1995 10:25:45 -0500
 "Pretty": Mon, 09 Jan 1995 10:25:45 EST

789665145 seconds since UNIX, 553712 seconds before now
DAY|WEEKDAY  |WDAY|SDAY|MONTH|LMONTH   |MON|YEAR|HOUR|MIN|SEC
Mon|Monday   |   1|yes |Jan  |January  |  1|1995|  10| 25| 45

MESSAGE 3: -0400 16 Aug 89 16:54:59 CAN'T PARSE DATE

MESSAGE 4: 16 Aug 89 16:54:59 -0400 (DAYLIGHT time)
 Official: 16 Aug 89 16:54:59 -0400
 "Pretty": 16 Aug 89 16:54:59 EDT
619304099 seconds since UNIX, 170914758 seconds before now
DAY|WEEKDAY  |WDAY|SDAY|MONTH|LMONTH   |MON|YEAR|HOUR|MIN|SEC
Wed|Wednesday|   3|no  |Aug  |August   |  8|  89|  16| 54| 59

Notice (on the first line of the listings) that each message has a different date format, but scan can parse all of them -- except the one in message 3.

Format files you've seen up to now just let scan truncate their output when the width limit is reached. But scan.dateparse checks the available width. It prints the last two lines that show the parsed date only if there is enough room for all of both lines. In this next example, the width isn't quite enough, so the last two lines for each message aren't displayed:

% scan -form scan.dateparse -width 300
	...These lines omitted...
629825145 seconds since UNIX, 8553736 seconds before now

MESSAGE 3: -0400 16 Aug 89 16:54:59 CAN'T PARSE DATE

MESSAGE 4: 16 Aug 89 16:54:59 -0400 (DAYLIGHT time)
 Official: 16 Aug 89 16:54:59 -0400
 "Pretty": 16 Aug 89 16:54:59 EDT
619304099 seconds since UNIX, 170914758 seconds before now

If you were going to use a format file like that a lot, you'd probably want to make a new version of scan called something like scandp. When you make the new version, you'd put this entry in your MH profile:

scandp: -form scan.dateparse -width 330

Then, you could just type scandp to use scan.dateparse without having to remember the width.

The next Example shows scan.dateparse. (It's also in the book's online archive at download/split/mh/Mail/scan.dateparse.)

Example: Date parsing demonstration: scan.dateparse

 1> MESSAGE %(msg): %{date} \
 2> %<(nodate{date})CAN'T PARSE DATE%|\
 3> (%<(dst{date})DAYLIGHT%|STANDARD%> time)\n\
 4>  Official: %(tws{date})\n\
 5>  "Pretty": %(pretty{date})\n\
 6> %(clock{date}) seconds since UNIX, %(rclock{date}) seconds before now\
 7> %(void(charleft))%<(gt 125)\n\
 8> DAY|WEEKDAY  |WDAY|SDAY|MONTH|LMONTH   |MON|YEAR|HOUR|MIN|SEC\n\
 9> %(day{date})|%9(weekday{date})|%4(wday{date})|%4(sday{date})|\
10> %5(month{date})|%9(lmonth{date})|%3(mon{date})|%4(year{date})|\
11> %4(hour{date})|%3(min{date})|%3(sec{date})%>%>\n

Next, here's a line-by-line explanation of how scan.dateparse works:

Lines 1-3 produce the first line of output for each message. Line 1 prints the message number and the actual unparsed date field from the message. The backslash at the end of the line is a continuation character.
Line 2 starts with a control escape that tests the (nodate) function escape. If the test (nodate{date}) is true, the Date: field is missing or can't be parsed. Then, the words CAN'T PARSE DATE are output -- and scan jumps ahead to the matching %>, which is at the end of line 11. For messages with unparseable dates (like message 3 here), only one line is output. On the other hand, if the test in line 2 fails (if (nodate{date}) returned zero), then the date is parseable -- and interpretation goes to line 3, where the multiline output starts.
Line 3 completes the first line of scan output with a string in parentheses that tells whether the message was sent during daylight savings time or standard time. Line 3 starts with a parenthesis that is output literally. Next is a control escape that tests the value of (dst{date}) -- if the value is nonzero, then the date is during daylight savings time and the control escape outputs DAYLIGHT. Otherwise, it outputs STANDARD. After the end of the control escape, a space and time) are output, followed by a newline (\n) which ends the first output line.
Lines 4 and 5 print the Date: in two different formats. Line 4 prints the official RFC 822 version, and line 5 prints a format with the time zone shown in letters instead of numerically.
A numeric time zone tells the difference, at the sending site, before or after Coordinated Universal (Greenwich Mean) Time. For instance, the U.S. West Coast is eight hours behind Greenwich Mean Time in the winter; that's written -0800. In the summer, during Daylight Savings Time, the difference is seven hours, or -0700. Sites east of Greenwich have times starting with a plus sign (+). For example, +0030 means 30 minutes after Greenwich Mean Time.
Line 6 prints two numbers that can be useful in UNIX programming. (clock) gives the number of seconds between January 1, 1970 and the time the message was sent. (rclock) gives the number of seconds since the message was sent. Of course, (rclock) output changes each time you scan the same message.
Line 7 starts the part of this format file which won't be output unless there's room (as mentioned above). It starts by putting the number of output characters remaining (the output of the (charleft) function escape) into the num register. The (void) escape keeps the (charleft) output from being displayed on the terminal.
Next, the (gt) escape compares the output of (charleft) (in the num register) to the constant 125, which is the number of characters that the next two lines of output require. If the test succeeds, then there's enough space, and we do lines 8-11 (starting with a newline from the end of line 7). Otherwise, we branch to the next-to-last %> on line 11, which is the end of this control escape.
Line 8 prints a title line with a newline at the end. Line 9 prints the DAY, WEEKDAY, and WDAY fields with vertical bars between them (the vertical bars aren't part of a control escape because there's no % before them). Next, line 9 prints the return value of (sday), which can be negative.
Lines 9-11 use quite a few function escapes; they're all in the mh-format summary (the Section Summary of MH Format Strings). Line 10 and line 11 fill in the rest of the fields. Newer versions of MH use four-digit years (like 1995) internally, so %4(year{date}) will fill its field with the year. Older MH versions use two-digit years (like 95); on those versions, the %4 format specification will print two spaces followed by the two-digit year. Line 11 ends with \n, which means each message will have a blank line after it. Because \n is after the last %>, it will always be used, even for messages with unparseable dates like number 3.
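
If you'd like to cross-check the (clock), (rclock), and numeric time zone values outside of MH, here's a minimal Python sketch. It isn't part of MH: the function names just mirror the MH escapes, zone_offset is a made-up helper, and the now argument is an assumption I added so the result is repeatable. It uses Python's standard email.utils date parser, which won't necessarily accept exactly the same dates that MH does.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime
import time


def clock(date_field):
    """Seconds from January 1, 1970 (UTC) to the time in a Date: field."""
    return int(parsedate_to_datetime(date_field).timestamp())


def rclock(date_field, now=None):
    """Seconds between the Date: field and 'now' (default: the current time)."""
    if now is None:
        now = int(time.time())
    return now - clock(date_field)


def zone_offset(date_field):
    """Numeric time zone as a timedelta; -0400 is four hours behind GMT."""
    return parsedate_to_datetime(date_field).utcoffset()


date = "16 Aug 89 16:54:59 -0400"                 # Date: field of message 4
print(clock(date))                                # 619304099, as scan showed
print(zone_offset(date) == timedelta(hours=-4))   # True
```

Like (rclock), the rclock function here gives a different answer each time you run it unless you pass a fixed now.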

If your version of MH uses four-digit years that you need to convert to two digits -- for example, to make an old MH format string work the same way in a newer version of MH -- here's how. Replace the old %(year{date}) with this:

%(void(year{date}))%02(modulo 100)

The mhl.prodsumry format file uses that technique. It starts by writing the year into the num register. Next, the (modulo) function computes the value of num modulo 100 -- in other words, it divides the year by 100 and gives the remainder.

To make format files that are portable to both the two- and four-digit versions of MH, try this string that I found in the MH packmbox script:

%(void(year{date}))%<(gt 100)%4(putnum)%|19%02(putnum)%>

If the output of (year) is over 100, the string outputs the four-digit year. Otherwise, it outputs 19 and the two-digit year.
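
The arithmetic behind those two year strings is simple enough to sketch in Python. This is only an illustration of the logic, not MH code; the function names two_digit_year and portable_year are made up for the example.

```python
def two_digit_year(year):
    """Like %(void(year{date}))%02(modulo 100): keep the last two digits."""
    return "%02d" % (year % 100)


def portable_year(year):
    """Like the packmbox string: a four-digit year from either kind of MH."""
    if year > 100:
        return "%4d" % year        # already a four-digit year
    return "19%02d" % year         # two-digit year: put 19 in front


print(two_digit_year(1995))    # 95
print(portable_year(95))       # 1995
print(portable_year(1995))     # 1995
```

Notice that the second function, like the packmbox string, assumes every two-digit year belongs to the 1900s.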

The scan.more Format File

The scan.more format file is a "do-it-all" format file that gives you a lot of information about messages in a short space. The output changes depending on which header fields the message has. For example, here are four messages scanned with scan.more (by the way, if you were going to use this file a lot, you'd probably store the -form and -width switches in your MH profile):

% scan 435-443 -form scan.more -width 230
 435  SENT: 20 May  CHARS: 383
      FROM: root (Super User)
    APP-TO: jdpeek
    <<BODY: The job you submitted to at, "/u3/acs/jdpeek/.l
 436  SENT: Thursday   CHARS: 29387  REPLIED: Friday
      FROM: Al Bok <al@phl.ph.com>
        TO: ehuser@asun.phl.ph.com
    <<BODY: I have a very complicated question about the ph
 441+ SENT: Friday   CHARS: 499
        CC: ehuser@quack.phl.ph.com, jdpeek
      SUBJ: That complicated message Al Bok sent us
 443 FILED: 16:44  CHARS: 52
        TO: ehuser, emmab
      SUBJ: More about lunch

If you compare the four messages, you'll see how the output changes:

Message 435 was sent more than seven days ago, so its SENT: field shows the date and month that the message was sent. (scan.more uses the same date-formatting as the standard scan.timely format file.) This message has 383 characters. It doesn't have a TO: field, but it does have an Apparently-to: field (shown as APP-TO:). There's no Subject: (SUBJ:), so the first part of the message body is shown.
Message 436 was sent within the last week, so the day name is shown. I replied the next day (Friday) with repl -annotate.
Message 441 is the current message -- the plus sign (+) shows that. I sent it on Friday. There's no TO: field (this can happen when you use the repl -query command and don't send your reply to the person who sent you the original message). Here, scan.more shows the CC: addresses instead. Finally, when a message is one that I sent (like this one), scan.more saves space by not showing a FROM: me line.
Message 443 is a draft that was refiled from the What now? prompt. It doesn't have a Date: field, so scan.more shows the time that the message file itself was last modified.

The scan.more format file is also used with the version of scan called cur. To save lines on the screen when you scan several messages, the format file hangs the message numbers into the left margin instead of putting blank lines between messages.

The next Example shows the format file. (You can also get it from the book's online archive in download/split/mh/Mail/scan.more.)

Example: Lots of information: The scan.more format file

 1> %; $Id: scan.more 1.3 1994/11/26 19:36:21 jerry book3 $
 2> %4(msg)%<(cur)+%| %>\
 3> %<{date} SENT%|FILED%>: \
 4> %(void(rclock{date}))\
 5> %<(gt 15768000)%03(month{date})%(void(year{date}))%02(modulo 100)\
 6> %?(gt 604800)%02(mday{date}) %03(month{date})\
 7> %?(gt 86400)%(weekday{date}) \
 8> %|%02(hour{date}):%02(min{date})%>  \
 9> CHARS: %(size) \
10> %<{forwarded} (FORWARDED)%>\
11> %<{resent} (RESENT)%>\
12> %<{mime-version} (MIME)%>\
13> %<{replied} REPLIED: \
14> %(void(rclock{replied}))\
15> %<(gt 15768000)%03(month{replied})%(void(year{replied}))%02(modulo 100)\
16> %?(gt 604800)%02(mday{replied}) %03(month{replied})\
17> %?(gt 86400)%(weekday{replied}) \
18> %|%02(hour{replied}):%02(min{replied})%>%>\n\
19> %<{apparently-from}  APP-FROM: %{apparently-from}\n%|\
20> %<(mymbox{from})%|      FROM: %{from}\n%>%>\
21> %<{to}        TO: %{to}%|\
22> %<{apparently-to}    APP-TO: %{apparently-to}%|\
23>         CC: %{cc}%>%>\
24> %<{subject}\n      SUBJ: %60{subject}%|\
25> %<{body}\n    <<BODY: %60{body}%>%>

Most of scan.more uses the same techniques and escapes as other format files in this chapter. The parts of scan.more that print the SENT:/FILED: and REPLIED: fields are new, though. They were adapted from the MH scan.timely format file. Here's a look at one of the "date" sections: lines 4-8. (The REPLIED: section, lines 14-18, is almost identical.)

Line 4 uses the (rclock) function escape to find how long ago (in seconds) the message was sent. The result from (rclock) goes into the num register. That number would also be shown on the screen, but the (void) function prevents that. (void) is useful where you want to store an intermediate result in the num or str registers without printing it.
Next, a long if-elseif-elseif-else test starting on line 5 uses the (gt) function to print a different date format, depending on how long ago the message was sent.
For example, if the number from (rclock) is 100000, that means the message was sent 100,000 seconds ago. That's 27.8 hours, which is yesterday (or before). The control escape in line 5 tests to see if the time is more than 15,768,000 seconds (about six months) ago -- it isn't. (Here, as with the pick command, "one day ago" means 24 hours ago instead of the previous midnight.)
So control goes to the first %? on line 6 -- and the control escape is evaluated to see if the message was sent more than 604,800 seconds (one week) ago. It wasn't. Notice that the same number from (rclock{date}) is still stored in the num register.
The number in num does match at the test in line 7 -- because 100,000 is greater than 86,400. The date is printed with the (weekday) function escape, which prints a time as a weekday name. Otherwise, the hour and minute would be printed.
Versions of MH before 6.7.2, without the %? elseif operator, need to use a series of nested if-else escapes. Here's the same test written for older versions. (The Section The scan.answer Format File walks through a simpler nested if-else escape.)
```
%<(gt 15768000)%03(month{date})%02(year{date})%|\
%<(gt 604800)%02(mday{date}) %03(month{date})%|\
%<(gt 86400)%(weekday{date}) %|\
%02(hour{date}):%02(min{date})%>%>%>  \
```
This looks more complicated than the same example with the new %? operator. If you compare the two, though, you'll see a pattern. The single %? "elseif" operator has been replaced with a %| "else" operator and a %< "if" operator (plus a matching %> operator at the very end of the string). If the first part of the string matches (if the number is greater than 15768000), the first part of the string is interpreted and control goes to the final %> operator. Else, the %| operator starts a new test for numbers greater than 604800; its matching (nested) %> operator is next-to-last in the string. Similarly, the third %< is matched by the innermost %> at the end of the string; the %| operator between them handles numbers that are less than or equal to 86400.
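
If the chain of tests is easier for you to follow in a conventional language, here's a rough Python equivalent of lines 4-8. It's a sketch of the logic only: the function name timely and its arguments are assumptions for this example, and it leaves out the trailing spaces that the format file prints.

```python
from datetime import datetime

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]    # datetime.weekday(): Monday is 0


def timely(sent, age):
    """Format 'sent' (a datetime) according to 'age', the (rclock) seconds."""
    month = MONTHS[sent.month - 1]
    if age > 15768000:                 # more than about six months ago
        return "%s%02d" % (month, sent.year % 100)
    elif age > 604800:                 # more than one week ago
        return "%02d %s" % (sent.day, month)
    elif age > 86400:                  # more than 24 hours ago
        return WEEKDAYS[sent.weekday()]
    else:                              # within the last 24 hours
        return "%02d:%02d" % (sent.hour, sent.minute)


sent = datetime(1994, 5, 20, 16, 44)
print(timely(sent, 100000))      # Friday
print(timely(sent, 700000))      # 20 May
print(timely(sent, 16000000))    # May94
print(timely(sent, 3600))        # 16:44
```

Each elif plays the part of one %? operator, and the final else is the %| operator. In the nested version for older MH, each elif would become an else with another if inside it.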

There are two other things worth mentioning on line 5:

%<(gt 15768000)%03(month{date})%(void(year{date}))%02(modulo 100)\

The line, which is the start of a control escape, has five function escapes. (One of those five function escapes, (year), is nested in another.) You might be wondering which of those function escapes are evaluated as the condition (the "true or false?" part) of the control escape. It's the first function escape, (gt 15768000), that's tested. The other function escapes are evaluated only if the (gt 15768000) is true.
The other function escapes simply print their output, if any, in sequence from left to right. First, the month is printed in a field of three characters. Next, the year is stored in the num register -- but not printed because it's inside a (void) escape. Third, the year (from num) is truncated to two digits and printed. So this control escape processed four other function escapes as the result of one condition, (gt 15768000), testing true. (That might not surprise you, but it confused me at first.)

This format file needs an output width of about 230 characters. The exact amount depends on how wide each field is. scan.more limits the width of the subject and body to 60 characters each. But if the text of the address field (like TO:) is long, it can "steal" width from the subject or body. That almost never happens to me -- if it's a problem for you, you should be able to fix it by now...

The replcomps.addrfix Format File

This section, and the rest of the sections in this chapter, show format files used by programs other than scan.

The Example below shows a replcomps-like format file for the repl command. (You can also get this file from the book's online archive. It's in download/split/mh/Mail/replcomps.addrfix.) This file handles an addressing problem I have with some of the email I get. I can't reply directly to the From: addresses on those messages; I have to edit the To: address in my reply before I send it. Like replcomps, the replcomps.addrfix format file gets the best reply address from the message header. Then it uses a series of (match) escapes to decide whether the address is one I can't reply to. If a bad address matches, the file outputs To: good-address.

To make the series of tests, I used the "else-if" operator %?. If you have MH 6.7.1a or before, use the nested tests shown in the Section The scan.answer Format File.

Example: The replcomps.addrfix format file

 1> %(lit)\
 2> %(formataddr %<{reply-to}%?{from}%?{sender}%?{return-path}%>)\
 3> %<(nonnull)\
 4> %<(match isla!tim)To: tim\
 5> %?(match djkortz@apl23r)To: djkortz@apl23r.zipcom.com\
 6> %?(match !sparc2gx!vanes@uunet)To: vanes@email.imelda.ac.uk\
 7> %|%(void(width))%(putaddr To: )%>\n%>\
 8> %(lit)%(formataddr{to})%(formataddr{cc})%(formataddr(me))\
 9> %<(nonnull)%(void(width))%(putaddr cc: )\n%>\
10> %<{fcc}Fcc: %{fcc}\n%>\
11> %<{subject}Subject: Re: %{subject}\n%>\
12> In-reply-to: Message from (%<{from}%{from}\
13> %?{sender}%{sender}%|%{apparently-from}%>)\n\
14>    of "%<(nodate{date})%{date}%|%(tws{date})%>."%<{message-id} %{message-id}%>\n\
15> --------

After lines 1-3 store an address in the str register and test for it, lines 4-6 see if the address is one of the three that needs to be rewritten.

For instance, if the original message was From: isla!tim, line 4 would match it. The string To: tim would be output. The else-if operator %?, at the start of line 5, would see that the previous test succeeded; control would go to the matching end-if which is the first %> on line 7.

Here's another example. If the message had a Return-Path: field with the address ...!frobozz!sparc2gx!vanes@uunet.uu.net, it wouldn't match at line 4 or line 5. The %? operator would keep executing tests until the matching test in line 6 was found. You could add many more of these else-if tests.

If none of the %? operators match, the final else (after the %| operator) is executed. Here, the address is printed with no changes.

There's one more %? operator used. It picks an address for the In-reply-to: field in lines 12-13.
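
The address-fixing logic in lines 2-7 can be sketched in Python like this. It's an illustration under some assumptions, not a replacement for the format file: the names REWRITES, reply_address and fix_address are invented here, the header is assumed to be a plain dictionary keyed by lowercase field names, and the sketch skips the address formatting that (formataddr) and (putaddr) do.

```python
# (pattern, good address) pairs, in the same order as lines 4-6
REWRITES = [
    ("isla!tim", "tim"),
    ("djkortz@apl23r", "djkortz@apl23r.zipcom.com"),
    ("!sparc2gx!vanes@uunet", "vanes@email.imelda.ac.uk"),
]


def reply_address(header):
    """First non-empty field among Reply-To, From, Sender, Return-Path."""
    for field in ("reply-to", "from", "sender", "return-path"):
        if header.get(field):
            return header[field]
    return ""


def fix_address(address):
    """Return the good address for the first pattern found in 'address'."""
    for pattern, good in REWRITES:
        if pattern in address:     # like (match): a substring test
            return good
    return address                 # like the final %| branch: no change


header = {"return-path": "...!frobozz!sparc2gx!vanes@uunet.uu.net"}
print(fix_address(reply_address(header)))    # vanes@email.imelda.ac.uk
```

Adding another bad address means adding one more pair to the list, just as you'd add one more %? line to the format file.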

The rcvtty.format File

rcvtty will read an MH format file, as explained in the Section Changing the Output Format. My rcvtty.format file is in the Example below -- and also in the book's online archive, at download/split/mh/Mail/rcvtty.format.

Example: The rcvtty.format file

1> ^[[7m\
2> * MAIL: %(size)ch @ %(hour{dtimenow}):%02(min{dtimenow}) *\n\r\
3> ^[[m\
4> %<(mymbox{from})To:%14(friendly{to})%|%17(friendly{from})%>\n\r\
5>   %{subject}%<{body}<<%{body}>>%>

That file uses a few tricks worth explaining:

The first line is written in standout mode. The escape sequence in line 1 starts standout mode on VT100 terminals; line 3 ends standout mode. (The ^[ is the character made by pressing ESC. The other characters are literal.)
"Hard-coding" an escape sequence this way isn't very portable. VT100-style sequences work on a lot of terminals and window systems, though. If those escape sequences don't work on your terminal, check your terminal or window system manual for its termcap or terminfo entry. (The Nutshell Handbook termcap & terminfo can help, too.)
rcvtty has a special component escape called {dtimenow}. It holds the current date and time. You can parse it with date function escapes like (hour) and (min).
Have you had a program running in the background that writes to your terminal -- and, at the same time, the foreground program (like a text editor) has put the terminal in raw mode? The lines from the background program
```
jump down
          the screen
                     like this
```
The same thing happens, by default, with multi-line rcvtty messages like this file creates. To fix it, I've added carriage-return characters (\r) at the end of lines 2 and 4. They move the cursor back to the left margin, something UNIX doesn't do by default in raw mode. When the terminal isn't in raw mode, these extra carriage returns don't hurt a thing.
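
To see exactly which characters lines 1-3 of the format file send to the terminal, here's a small Python sketch that builds the same string. The function name banner and its arguments are made up for the example; the escape sequences are the VT100-style ones from the format file.

```python
ESC = "\x1b"                   # the character shown as ^[ in the listing
STANDOUT_ON = ESC + "[7m"      # line 1: start standout mode
STANDOUT_OFF = ESC + "[m"      # line 3: end standout mode


def banner(size, hour, minute):
    """Build the highlighted first line that rcvtty.format produces."""
    text = "* MAIL: %dch @ %d:%02d *" % (size, hour, minute)
    # \n moves down one line; \r goes back to the left margin,
    # which matters when the terminal is in raw mode.
    return STANDOUT_ON + text + "\n\r" + STANDOUT_OFF


print(repr(banner(383, 9, 5)))
```

Printing the repr of the string, instead of the string itself, lets you read the control characters without having your terminal act on them.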

The rcvdistcomps File

When a message is redistributed with the rcvdist command, the formatting of the Resent-xxx: header fields is controlled by an rcvdistcomps file. The default file is shown in the Example below. (You can also get it from the book's online archive in download/split/mh/Mail/rcvdistcomps.)

Example: The rcvdistcomps file

%(lit)%(formataddr{addresses})\
%<(nonnull)%(void(width))%(putaddr Resent-To: )\n%>\
Resent-Fcc: outbox

Addresses you use on the rcvdist command line are available in the {addresses} component escape. By default, rcvdistcomps puts a copy of every message into your outbox folder. You can change all of this by making your own rcvdistcomps file in your MH directory.

Summary of MH Format Strings

This summary was adapted from the MH 6.8.3 mh-format(5) manual page. It gives a complete, detailed and fast-paced overview of MH format strings. Earlier versions of MH may not have all of these features.

A format string consists of ordinary text and special multi-character escape sequences which begin with % (percent sign). You can use C backslash characters in a format string: \b (backspace), \f (formfeed), \n (newline), \r (carriage return), and \t (tab). Continuation lines in format files end with a backslash (\) followed by the newline character. To put a literal % or \ in a format string, use two of them: %% and \\. There are three types of escape sequences: header fields (called components by MH format), built-in functions, and flow control.

A component escape is specified as %{component}, and exists for each header field in the message being processed. For example, %{date} refers to the Date: header field of the appropriate message. All component escapes have a string value. Normally, component values are compressed by converting any control characters (tab and newline included) to spaces, then eliding any leading or multiple spaces. However, commands may give different interpretations to some component escapes; the Table MH-format Special Component and Function Escapes gives a summary and each command's manual page has details.
A function escape is specified as %(function). All functions are built-in, and most have a string or numeric value.
A control escape is one of: %<, %?, %|, or %>. They form a general if-elseif-else-endif block.
Comments may be inserted in most places where a function argument is not expected. A comment begins with %; and ends with a (non-escaped) newline.

The following two subsections explain control and function escapes. Next, after an explanation of Return values, are three tables of function escapes. The following table lists special escapes that are defined only for certain commands. Then comes a subsection that shows how to nest escapes. The last subsection explains field width and output width.

Control-flow Escapes

A control escape is one of: %<, %?, %|, or %>. These are combined into the conditional execution construct:

    %<condition
        format text 1
    %?condition2
        format text 2
    %?condition3
        format text 3
    ...
    %|
        format text N
    %>

Extra white space is shown here only for clarity. These constructs may be nested without ambiguity. They form a general if-elseif-else-endif block where only one of the format text segments is interpreted.

The %< and %? control escapes cause a condition to be evaluated. This condition may be either a component or a function. The four constructs have the following syntax:

    %<{component}
    %<(function)
    %?{component}
    %?(function)

These control escapes test whether the function or component value is non-zero (for integer-valued escapes), or non-empty (for string-valued escapes).

If this test evaluates true, then the format text up to the next corresponding control escape (one of %|, %?, or %>) is interpreted normally. Next, all format text (if any) up to the corresponding %> control escape is skipped. The %> control escape is not interpreted; normal interpretation resumes after the %> escape.
If the test evaluates false, however, then the format text up to the next corresponding control escape (again, one of %|, %?, or %>) is skipped, instead of being interpreted.
- If the control escape encountered was %?, then the condition associated with that control escape is evaluated, and interpretation proceeds after that test as described in the previous paragraph.
- If the control escape encountered was %|, then the format text up to the corresponding %> escape is interpreted normally.
As above, the %> escape is not interpreted and normal interpretation resumes after the %> escape.

The %? control escape and its following format text is optional, and may be included zero or more times. The %| control escape and its following format text is also optional, and may be included zero or one times.
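
The rules above boil down to "interpret the text after the first test that's true, or the %| text if none is." Here's a minimal Python sketch of that; the function names is_true and conditional are assumptions for this example, and, unlike MH, Python works out every condition value before the function is called instead of evaluating each test only when it's reached.

```python
def is_true(value):
    """Non-zero for integer values, non-empty for string values."""
    if isinstance(value, int):
        return value != 0
    return value != ""


def conditional(branches, else_text=""):
    """Pick the format text that a %< ... %? ... %| ... %> block interprets.

    branches is a list of (condition value, format text) pairs: the first
    pair is the %< test and any later pairs are %? tests.  else_text is
    the optional %| part; without it, nothing is interpreted.
    """
    for value, text in branches:
        if is_true(value):
            return text        # interpret this text, then skip to the %>
    return else_text           # no test was true


# Something like %<{reply-to}text 1%?{from}text 2%|text N%>
print(conditional([("", "text 1"), ("al@phl.ph.com", "text 2")], "text N"))
```

That prints text 2, because the first value is empty and the second isn't.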

Function Escapes

Most functions expect an argument of a particular type, as shown in the Table below:

Table: Argument Types for MH-format Functions

literal
A literal number or string.
Examples:
```
%(func 1234)
%(func text string)
```
comp
Any header field.
Example:
```
%(func{in-reply-to})
```
date
A date field.
Example:
```
%(func{date})
```
addr
An address field.
Example:
```
%(func{from})
```
expr
An optional component, function or control, perhaps nested.
Examples:
```
%(func(func2))
%(func %<{reply-to}%|%{from}%>)
%(func(func2{comp}))
```

The types date and addr have the same syntax as comp, but require that the header field be a date string (such as Date:), or address string (such as From:), respectively.

All arguments except those of type expr are required. For the expr argument type, the leading % must be omitted for component and function escape arguments, and must be present (with a leading space) for control escape arguments.

The evaluation of format strings is based on a simple machine with an integer register num, and a text string register str. When a function escape is processed, if it accepts an optional expr argument which is not present, it reads the current value of either num or str as appropriate.

Return Values

Component escapes write the value of their message field in str. Function escapes write their return value in num for functions returning integer or boolean values, and in str for functions returning string values. (The boolean type is a subset of integers with usual values 0=false and 1=true.) Control escapes return a boolean value, and set num.

All component escapes, and those function escapes which return an integer or string value, pass this value back to their caller in addition to setting str or num. These escapes will print out this value unless called as part of an argument to another escape sequence. (To prevent printing, use the (void) function escape.) Escapes which return a boolean value do pass this value back to their caller in num, but will never print out the value.
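
Here's one way to picture the registers and the printing rule, as a toy Python model. It's a sketch with invented names (Machine, printed, the nested flag), and it models only three escapes: (year), (void) and (modulo).

```python
class Machine:
    """A toy model of the num and str registers."""

    def __init__(self):
        self.num = 0
        self.str = ""
        self.printed = []          # what would appear in the output

    def year(self, value, nested=False):
        """An integer escape: sets num; prints unless it's an argument."""
        self.num = value
        if not nested:
            self.printed.append(str(value))
        return value

    def void(self, _value):
        """(void): its argument has already set a register; print nothing."""
        return None

    def modulo(self, literal):
        """Reads num, writes the remainder to num, prints it like %02."""
        self.num = self.num % literal
        self.printed.append("%02d" % self.num)
        return self.num


m = Machine()
m.void(m.year(1995, nested=True))    # like %(void(year{date}))
m.modulo(100)                        # like %02(modulo 100)
print("".join(m.printed))            # 95
```

The year is stored but never printed, because it was evaluated as an argument to another escape; only the (modulo) result reaches the output.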

Tables of Function Escapes

The next three tables list MH-format function escapes.

Table: MH-format Function Escapes (1 of 3)

msg (argument: none) (return: integer): Message number
msg (argument: none) (return: integer): In forw -digest: issue number
cur (argument: none) (return: integer): Message is current
cur (argument: none) (return: integer): In forw -digest: volume number
size (argument: none) (return: integer): Size of message
strlen (argument: none) (return: integer): Length of str
width, more... (argument: none) (return: integer): Output buffer size in bytes
charleft (argument: none) (return: integer): Bytes left in output buffer
timenow (argument: none) (return: integer): Seconds since the UNIX epoch
me (argument: none) (return: string): The user's mailbox
eq (argument: literal) (return: boolean): True if argument equals value in num register
ne (argument: literal) (return: boolean): True if argument doesn't equal value in num register
gt, more... (argument: literal) (return: boolean): True if value in num register is greater than argument
match (argument: literal) (return: boolean): True if value in str register contains the argument
amatch (argument: literal) (return: boolean): True if value in str register starts with the argument
plus (argument: literal) (return: integer): Add value in num register to argument
minus (argument: literal) (return: integer): Subtract value in num register from argument
divide (argument: literal) (return: integer): Divide value in num register by argument
modulo (argument: literal) (return: integer): Value in num register modulo the argument (divide value in num by the argument, give the remainder)
num (argument: literal) (return: integer): Store argument in num register; if no argument, erase num
lit (argument: literal) (return: string): Store argument in str register; if no argument, erase str
getenv (argument: literal) (return: string): Store value of environment variable named by argument into str register
profile (argument: literal) (return: string): Set str register to value of MH profile or context entry named by argument
nonzero (argument: expr) (return: boolean): True if value in num register is non-zero
zero (argument: expr) (return: boolean): True if value in num register is zero
null (argument: expr) (return: boolean): True if str register is empty
nonnull (argument: expr) (return: boolean): True if str register is not empty
void (argument: expr) (return: none): Set str or num registers
comp (argument: comp) (return: string): Set str register to value of field comp
compval (argument: comp) (return: integer): Set num register to numeric value (from UNIX atoi() function) of field comp
decode (argument: expr) (return: string): Decode any RFC-2047 encoding in str register (nmh only)
trim (argument: expr) (return: none): Trim trailing whitespace from str register
putstr (argument: expr) (return: none): Print str
putstrf (argument: expr) (return: none): Print str in a fixed width
putnum (argument: expr) (return: none): Print num
putnumf (argument: expr) (return: none): Print num in a fixed width

The functions in the next Table require a date field as an argument:

Table: MH-format Function Escapes (2 of 3)

sec (argument: date) (return: integer): Seconds of the minute
min (argument: date) (return: integer): Minutes of the hour
hour (argument: date) (return: integer): Hours of the day (0-23)
wday (argument: date) (return: integer): Day of the week (Sun=0)
day (argument: date) (return: string): Day of the week (abbrev.)
weekday (argument: date) (return: string): Day of the week
sday (argument: date) (return: integer): Day of the week known? (0=implicit,-1=unknown)
mday (argument: date) (return: integer): Day of the month
yday (argument: date) (return: integer): Day of the year
mon (argument: date) (return: integer): Month of the year
month (argument: date) (return: string): Month of the year (abbrev.)
lmonth (argument: date) (return: string): Month of the year
year (argument: date) (return: integer): Year (may be greater than 100)
zone (argument: date) (return: integer): Timezone in hours
tzone (argument: date) (return: string): Timezone string
szone (argument: date) (return: integer): Timezone explicit? (0=implicit,-1=unknown)
date2local (argument: date) (return: none): Coerce date to local timezone
date2gmt (argument: date) (return: none): Coerce date to GMT
dst (argument: date) (return: integer): Daylight savings in effect?
clock (argument: date) (return: integer): Seconds since the UNIX epoch
rclock (argument: date) (return: integer): Seconds prior to current time
tws (argument: date) (return: string): Official 822 rendering
pretty (argument: date) (return: string): User-friendly rendering
nodate (argument: date) (return: integer): str not a date string

The functions listed in the next Table require an address field as an argument. The return value of functions noted with `*' pertains only to the first address present in the header field.

Table: MH-format Function Escapes (3 of 3)

proper (argument: addr) (return: string): Official RFC 822 rendering
friendly (argument: addr) (return: string): User-friendly rendering
addr (argument: addr) (return: string): mbox@host or host!mbox rendering*
pers (argument: addr) (return: string): The personal name*
note (argument: addr) (return: string): Commentary text*
mbox (argument: addr) (return: string): The local mailbox*
mymbox (argument: addr) (return: integer): The user's addresses? (0=no, 1=yes) (see note after table)
host (argument: addr) (return: string): The host domain*
nohost (argument: addr) (return: integer): No host was present*
type (argument: addr) (return: integer): Host type* (0=local, 1=network, -1=uucp, 2=unknown)
path (argument: addr) (return: string): Any leading host route*
ingrp (argument: addr) (return: integer): Address was inside a group*
gname (argument: addr) (return: string): Name of group*
formataddr (argument: expr) (return: none): Append arg to str as a (comma-separated) address list. Works with repl -query to select addresses.
putaddr (argument: literal) (return: none): Print str address list with arg as optional label; get line width from num

A note about the previous Table: In general, (mymbox{component}) checks each of the addresses in the header field component: against the user's mailbox name and any Alternate-Mailboxes:. It returns true if any address matches; however, it also returns true if the component: header field is not present in the message. If needed, the (null) function can be used to explicitly test for this condition.
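
That rule for (mymbox) is easy to get wrong, so here's a minimal Python sketch of it. The function name mirrors the MH escape, but the arguments are assumptions for this example: the header is a dictionary that maps a field name to a list of addresses, and my_addresses stands for the user's mailbox name plus any Alternate-Mailboxes:.

```python
def mymbox(header, component, my_addresses):
    """True if any address in the field is mine, or if the field is missing."""
    if component not in header:
        return True                # missing field: also counts as true
    return any(addr in my_addresses for addr in header[component])


mine = {"jdpeek", "jdpeek@asun.phl.ph.com"}
print(mymbox({"from": ["al@phl.ph.com"]}, "from", mine))    # False
print(mymbox({"from": ["jdpeek"]}, "from", mine))           # True
print(mymbox({}, "from", mine))                             # True
```

The last call shows why you might want a separate test for a missing field before you trust a true result.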

Special Escapes

Some MH commands give different interpretations to some escapes. The next Table gives a summary. For details, see the command's manual page.

Table: MH-format Special Component and Function Escapes

{error} in ap(8) (return: string): A diagnostic if the parse failed
(cur) in forw -digest (return: integer): Volume number
{digest} in forw -digest (return: string): Digest name
(msg) in forw -digest (return: integer): Issue number
{addresses} in rcvdist (return: string): Addresses from command line
{body} in rcvtty (return: string): First part of the body, compressed
{dtimenow} in rcvtty (return: date): Current date. Example: Thu, 01 Dec 1994 18:02:42 -0800
{fcc} in repl (return: string): Any folders specified with -fcc folder
{subject} in repl (return: string): Subject: field without any leading Re: and spaces
{body} in scan (return: string): First part of the body, compressed
{date} in scan (return: string): Returns file modification date if Date: field is missing.
{dtimenow} in scan (return: date): Current date (as in rcvtty).

A note about the previous Table: If no Date: field is present in the message header, the function escapes which operate on {date} will return values for the date of last modification of the message file itself. Therefore, if scan encounters a message without a Date: field, the column that usually holds the date gets the last write date of the message instead. This is particularly handy for scanning a draft folder, as message drafts usually aren't allowed to have dates in them. Because control escapes evaluate false when they test for a field that doesn't exist, the default scan format prints a * when the Date: field is missing.

Nesting Escapes

When escapes are nested, evaluation is done from inner-most to outer-most. The outer-most escape must begin with %; the inner escapes must not. For example,

%<(mymbox{from}) To: %{to}%>

writes the value of the header field From: to str; then (mymbox) reads str and writes its result to num; then the control escape evaluates num. If num is non-zero, the string "To: " (with a trailing space) is printed followed by the value of the header field To:.
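To make the order of evaluation concrete, here is a small Python sketch that walks through the same three steps with explicit str and num registers. It is a hand-written trace of this one format string, not a general interpreter. The function name evaluate_example and the headers dictionary are assumptions for illustration, and the address test is a crude substring match that stands in for the real (mymbox) logic.

```python
def evaluate_example(headers, my_address):
    """Trace %<(mymbox{from}) To: %{to}%> step by step (illustrative only)."""
    regs = {"str": "", "num": 0}

    # Step 1, inner-most escape {from}: copy the From: field into str.
    regs["str"] = headers.get("from", "")

    # Step 2, (mymbox): read str, write a true/false result into num.
    # An empty str models a missing field, which also counts as true.
    if not regs["str"] or my_address.lower() in regs["str"].lower():
        regs["num"] = 1
    else:
        regs["num"] = 0

    # Step 3, outer-most control escape %<...%>: test num.
    if regs["num"] != 0:
        return "To: " + headers.get("to", "")
    return ""


headers = {"from": "Jerry Peek <jpeek@jpeek.com>", "to": "bob@example.com"}
print(repr(evaluate_example(headers, "jpeek@jpeek.com")))
print(repr(evaluate_example(headers, "alice@example.com")))
```

The first call prints 'To: bob@example.com' because the From: address matches; the second prints an empty string because the control escape's test fails.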

Field Width and Output Width

When a function or component escape is interpreted and the result will be immediately printed, an optional field width can be specified to print the field in exactly a given number of characters. For example, a numeric escape like %4(size) will print at most 4 digits of the message size; overflow will be indicated by a ? in the first position (like ?234). A string escape like %4(me) will print the first 4 characters and truncate at the end. Short fields are padded at the right with the fill character (normally, a blank). If the field width argument begins with a leading zero, then the fill character is set to a zero.

As above, the functions (putnumf) and (putstrf) print their result in exactly the number of characters specified by their leading field width argument. For example, %06(putnumf(size)) will print the message size in a field six characters wide filled with leading zeros; %14(putstrf{from}) will print the From: header field in fourteen characters with trailing spaces added as needed. For (putstrf), using a negative value for the field width causes right-justification of the string within the field, with padding on the left up to the field width. The functions (putnum) and (putstr) print their result in the minimum number of characters required, and ignore any leading field width argument.
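The width rules in the last two paragraphs can be modeled in Python as a rough sketch. The helper names parse_width, put_num and put_str are hypothetical, and two details are assumptions rather than statements from the text: numbers are padded on the left (inferred from the "leading zeros" example), and a right-justified string that is too long is truncated at its end. Negative numbers are not handled.

```python
def parse_width(spec):
    """Split a width argument like "4", "06" or "-14" into (width, fill)."""
    negative = spec.startswith("-")
    digits = spec.lstrip("-")
    # A leading zero switches the fill character from blank to zero.
    fill = "0" if digits.startswith("0") else " "
    width = int(digits)
    return (-width if negative else width), fill


def put_num(value, width, fill=" "):
    """Print a non-negative integer in exactly width characters."""
    digits = str(value)
    if len(digits) > width:
        # Overflow: keep the low-order digits and put ? in the first position.
        return "?" + digits[len(digits) - width + 1:]
    return digits.rjust(width, fill)


def put_str(text, width, fill=" "):
    """Print a string in exactly abs(width) characters."""
    if width < 0:
        # Negative width: right-justify, padding on the left.
        return text[:-width].rjust(-width, fill)
    # Positive width: truncate at the end, pad on the right.
    return text[:width].ljust(width, fill)


print(repr(put_num(51234, 4)))                  # '?234'
print(repr(put_num(42, *parse_width("06"))))    # '000042'
print(repr(put_str("jpeek", 4)))                # 'jpee'
print(repr(put_str("jpeek", *parse_width("-8"))))  # '   jpeek'
```

The first result reproduces the ?234 overflow form mentioned above; the second corresponds to a six-character, zero-filled field like the one requested by %06(putnumf(size)).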

The available output width is kept in an internal register; any output past this width will be truncated. The functions (width) and (charleft) are useful here; there are examples in the Sections scan Widths and The scan.dateparse Format File.

Revised by Jerry Peek. Last change $Date: 1999/10/10 05:14:05 $

This file is from the third edition of the book MH & xmh: Email for Users & Programmers, ISBN 1-56592-093-7, by Jerry Peek. Copyright © 1991, 1992, 1995 by O'Reilly & Associates, Inc. This file is freely available; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation. For more information, see the file copying.htm.

Suggestions are welcome: Jerry Peek <jpeek@jpeek.com>