TROUBLESHOOTING HYLAFAX PROBLEMS
This section contains tips for dealing with the most common problems encountered when setting up and running the software. If you did not follow the instructions in the chapter on ``Server Setup and Basic Configuration'' then do not bother reading this chapter; read the setup information first!

There are several components to the complete HylaFAX software package:

If you are having trouble, first try to identify which part of the system is failing. Work forward from the client application to the server machine. On the server machine, work from the hfaxd process to the scheduler to the delivery programs. Usually it is obvious which piece of the system has a problem, but if you are unfamiliar with the software you can easily be fooled by error messages that are passed back to client programs from a process deep within a server machine. The following sections cover specific areas. You can also consult the HylaFAQ for answers to common questions.


TROUBLESHOOTING: CLIENT BASICS

All client applications support a -v option to enable various levels of debugging. It is possible with one or more -v options to trace the protocol between the application and the hfaxd process on the server machine. hfaxd has a ServerTracing configuration parameter that controls various tracing support, including the protocol messages it receives. If you are in doubt whether a problem is on the client machine or the server, try the following:

Run faxstat to request server status. You should see something like:

or possibly,

If you do not see something like this, then you are having problems communicating with the hfaxd program on the server machine or there is a configuration problem on the server machine. If you cannot establish a connection to the hfaxd process on the server machine, then verify that your FAXSERVER environment variable is set correctly (if the server is not on the same machine where the faxstat program is run) and that both client and server programs are communicating on the same TCP port.

NOTE: Beware of settings that might be present in personal or system-wide configuration files. Some applications such as sendfax search an additional file as well. Consult manual pages for complete information about configuration file handling.

On the server machine make sure that the hfaxd program is set up to run standalone or is properly configured to be invoked by the inetd program. If run standalone then hfaxd should be running and should have been started with a -i option (and possibly other options). hfaxd should also send messages to the system logging facility whenever it is started up and these messages should identify the client-server protocols it is servicing; e.g.

Otherwise check the contents of /etc/inetd.conf, or similar, for a line of the form:

There may also be other entries if support for the old client-server protocol and/or SNPP is enabled.

Be certain that inetd has ``reread'' its configuration file; either send it a SIGHUP or restart it. This should automatically happen when the faxsetup program is run on the server machine.
Note also that the fax service must be defined on the server machine in order for inetd to start up the hfaxd program; check for this entry in the /etc/services file and/or the YP/NIS database.

Note that hfaxd uses the chroot system call to confine clients to the HylaFAX spooling area on the server machine. On most systems only the super-user is permitted to do a chroot call, so if hfaxd is not started by the super-user or the executable program is not set up to be setuid-root then it will not function properly. If this happens clients will usually be denied access with a message of the form ``Cannot set privileges.''

You can also use an existing network program such as telnet or ftp to communicate with the hfaxd process directly.
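The same check can be scripted. The following is a minimal sketch, not part of HylaFAX: it opens a TCP connection and reads the first line the server sends, much as a manual telnet session would. The port number 4559 and the ``220 ... ready.'' greeting are taken from the telnet transcript shown later in this chapter; the function names are hypothetical, and your site may use a different port.

```python
import socket

# Port used in the telnet transcript in this chapter; an assumption for your site.
HFAXD_PORT = 4559


def is_ready_banner(line: str) -> bool:
    """Return True if a greeting line carries the 220 'ready' reply code."""
    return line.startswith("220 ")


def probe_hfaxd(host: str, port: int = HFAXD_PORT, timeout: float = 5.0) -> str:
    """Open a TCP connection to host:port and return the first line received."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        reader = conn.makefile("r", encoding="ascii", errors="replace")
        return reader.readline().rstrip("\r\n")


# Example use (host name is a placeholder):
#   banner = probe_hfaxd("faxserver.example.com")
#   print("ready" if is_ready_banner(banner) else "unexpected reply: " + banner)
```

If the connection is refused or times out, the problem lies in the network or inetd/standalone setup rather than in the client configuration files.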

If the network-related configuration is set up properly but faxstat still does not return the expected answer then use the -v option to trace the client-server protocol:

A protocol or configuration problem should be evident from the trace information. You can also configure hfaxd to log its operation through syslog by setting the ServerTracing configuration parameter in the hfaxd.conf file:

hfaxd will reread this file for each new network connection so there is no need to restart the server if running standalone.

Once again, remember that inetd will not see a change to the inetd.conf file until it is restarted or sent a SIGHUP, and that hfaxd logs its debugging information through syslog (using the syslog facility set in the hfaxd.conf file, or the setting compiled into the program when the software was built).


TROUBLESHOOTING: CLIENT ACCESS CONTROL PROBLEMS

If you are able to establish a network connection to the hfaxd server process then access control problems are either due to incorrect installation of the server software or misconfigured permissions on the server machine. Client access is defined by the contents of the etc/hosts file located in the HylaFAX spooling area on the server machine. This file must exist and must not be publicly readable or access will be denied to all clients.
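The two conditions on the etc/hosts file can be checked with a short script. This is a sketch under stated assumptions: it interprets ``publicly readable'' as the read bit for ``other'' being set, which may be less strict than the test hfaxd itself applies, and the function name is hypothetical.

```python
import os
import stat


def hosts_file_problem(path):
    """Return a description of the problem, or None if the file looks acceptable.

    Assumption: "publicly readable" means the world-read permission bit is set.
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "file does not exist"
    if mode & stat.S_IROTH:
        return "file is publicly readable"
    return None


# Example use (the spooling area location is an assumption for your site):
#   print(hosts_file_problem("/var/spool/fax/etc/hosts"))
```

A result of None only means the file passed these two checks; the entries inside it still determine which clients are admitted.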

Beware of the protection on the etc/hosts file when upgrading from HylaFAX 3.0. Previous versions of HylaFAX did not care if the file was publicly readable so it probably is.

The hfaxd program confines clients to the spooling area on the server machine using the chroot(2) system call. This means that clients should not have access to any information on a HylaFAX server machine outside the spooling area (and hfaxd restricts access to only part of this area).

HylaFAX clients are assigned a ``fax user ID'' when they log in to a HylaFAX server. This identifier is similar to the normal UNIX user ID but is maintained separately by HylaFAX and is independent of the normal system operation. Beware that hfaxd stores this ID as the group ID of files that are created on a HylaFAX server on behalf of a client. The fax user ID assigned to a client is defined by the information in the etc/hosts file; consult the manual page for more details. You can also find the user ID for a particular user by logging in with telnet and issuing a STAT request:

melange% telnet flake 4559
Trying 192.111.25.39...
Connected to flake.esd.sgi.com.
Escape character is '^]'.
220 flake.esd.sgi.com server (HylaFAX (tm) Version 4.0beta020) ready.
user foo
230 User foo logged in.
stat
211-flake.esd.sgi.com HylaFAX server status:
    HylaFAX (tm) Version 4.0beta020
    Connected to melange.esd.sgi.com (192.111.25.40)
--> Logged in as user foo (uid 2)
    "/" is the current directory
    Current job: (default)
    Time values are handled in GMT
    Idle timeout set to 900 seconds
    Using long replies
    No server down time currently scheduled
    HylaFAX scheduler reached at /FIFO (not connected)
    Server FIFO is /client/25133 (open)
    File cache: 15 lookups, 0 hits (0.0%), 1.1 avg probes
        15 entries (2.3 KB), 0 entries displaced, 0 entries flushed
    TYPE: ASCII; STRU: File; MODE: Stream; FORM: PS
    No client data connection
211 End of status
quit
221 Goodbye.
Connection closed by foreign host.
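If you need to pull the fax user ID out of a STAT reply in a script, a pattern match on the ``Logged in as user'' line is enough. The sketch below assumes the reply wording matches the transcript above; other HylaFAX versions may phrase the line differently, and the function name is hypothetical.

```python
import re

# Matches the status line shown in the transcript: "Logged in as user foo (uid 2)"
UID_PATTERN = re.compile(r"Logged in as user (\S+) \(uid (\d+)\)")


def fax_uid_from_stat(reply):
    """Return (user name, fax user ID) from a STAT reply, or None if not found."""
    match = UID_PATTERN.search(reply)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# Example use:
#   fax_uid_from_stat("--> Logged in as user foo (uid 2)")  returns ("foo", 2)
```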

Client access may be controlled with passwords that are transmitted in the clear by a client across the network connection. This is obviously insecure and plans exist for improved security measures. The existing password facilities use the same crypt(3) function the system login facilities use and encrypted passwords are stored in the etc/hosts file in a format that is compatible with most systems' password files. This means that client passwords can be copied from a password file if desired (though this is obviously discouraged). The HylaFAX client-server protocol ADDUSER request includes a variety of checks for easy-to-guess passwords.

NOTE: HylaFAX uses the standard system crypt function to implement client passwords. Systems without this routine may substitute the publicly available GNU version but then it may not be possible to copy encrypted passwords from the system password file.

If a client is prompted for a password when contacting the server then it means the client is set up with a password in the etc/hosts file. Read the hosts(4F) manual page carefully to understand the format for this file.

Logging of client login and network connections can be controlled separately by bits in the hfaxd ServerTracing configuration parameter; consult the hfaxd manual page for details.


TROUBLESHOOTING: SERVER BASICS

One faxq process does the job scheduling and initiation of outbound calls; it handles many tasks:

One faxgetty server process is usually run for each fax modem on a machine. These processes are responsible for handling inbound calls, carrying out the following tasks:

Proper setup of HylaFAX involves setting up the configuration files and the ancillary programs and command scripts that are invoked by faxq and by faxgetty. The setup of the ancillary programs should automatically be done when the software is configured and installed. The modem-related setup and much of the system-related configuration work should be done by the faxsetup and faxaddmodem commands.

Problems that arise in the normal operation of a server typically fall into two categories:

In either case HylaFAX provides extensive tracing or logging facilities that should supply the information needed to locate and correct a problem.

Tracing information is broken up into two separate areas: server tracing and session tracing. Server tracing is the logging done by server processes when not in active conversation with another device such as a facsimile machine or paging service provider. This tracing covers modem initialization, scheduler operation, and general bookkeeping and maintenance work. Session tracing is the logging done while a server process is actively communicating with a remote device; e.g. the communication work done to send or receive a facsimile. This split permits better control over the amount of information logged in a production environment. Typically, once a system is set up and working, the amount of server tracing information captured can be quite small while the session tracing logged needs to be more extensive to help debug any communication problems that might arise.

HylaFAX server processes are controlled by a collection of configuration files. There is a configuration file for the faxq scheduler process, etc/config, and one file for each process that uses a modem, etc/config.device. (The hfaxd process that implements the client-server protocols also has its own configuration file.)

faxq tracing information is controlled by the ServerTracing and LogFacility configuration parameters. ServerTracing controls which work should be traced while LogFacility specifies the syslog(3) facility where the tracing messages should be directed. By default server tracing information is directed to the ``daemon'' facility.

Modem-related tracing information is controlled by ServerTracing, SessionTracing, and LogFacility parameters that are put in the per-modem configuration files. ServerTracing controls the logging done when one of these programs is not in active conversation with another device while SessionTracing controls the logging during other times. As before, LogFacility defines where server tracing messages get sent. Session tracing messages are not sent using syslog; they are written to session log files that are described separately below.

To capture server tracing messages you must enable the appropriate bits in a ServerTracing parameter and configure the syslogd process to capture messages sent to the daemon.debug facility (if LogFacility has been changed, substitute its value for daemon).

NOTE: By setting LogFacility to something like local5 it is easy to capture HylaFAX syslog messages in a separate file; just set up the syslog.conf file appropriately, e.g.
local5.debug /var/spool/fax/etc/syslog
A sample server trace log is shown below. The lines marked ``FaxQueuer'' are generated by the faxq process. The lines marked ``FaxGetty'' are generated by a faxgetty process. The process ID of each process is shown enclosed in ``[]''. Each line is marked with the date and time that it was generated.

The tracing facilities supported by the HylaFAX server processes should provide enough information to debug any problem in the software. Consult the config(4F) manual page for complete information on the various parameters.


TROUBLESHOOTING: SCHEDULER OPERATION

HylaFAX requires a faxq process to be running for outbound transmissions to be done. The faxstat program should show this process running in its output:

If the scheduler is marked ``Not running'' but faxq is running on the server machine then the first thing to check is the server tracing log for faxq to see if there is a problem. faxstat gets its information from hfaxd on the server machine and hfaxd decides if faxq is running based on whether it can open the FIFO special file FIFO in the spooling area on the server machine. Check that this file is the correct type and that it has the appropriate permission. faxq will create this file when it starts up if it does not exist; if there is a problem try stopping faxq, remove the file, possibly adjust faxq's server tracing (to get more information logged), and then restart faxq:

Note that if the faxquit command does not cause faxq to terminate then you may need to forcibly kill the process (but beware of doing this when outbound jobs are in progress as this can leave the system in an inconsistent state). When sending a signal to faxq use SIGTERM or SIGINT; faxq catches these signals and tries to clean up its state as much as possible.
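The check on the FIFO special file mentioned above (correct type, appropriate permission) can be sketched as follows. The function name is hypothetical and the script only reports what it finds; which mode is appropriate depends on your installation.

```python
import os
import stat


def describe_fifo(path):
    """Report whether path exists, is a FIFO special file, and its permission bits."""
    try:
        info = os.stat(path)
    except FileNotFoundError:
        return "missing"
    if not stat.S_ISFIFO(info.st_mode):
        return "not a FIFO"
    return "FIFO, mode %o" % stat.S_IMODE(info.st_mode)


# Example use (the spooling area location is an assumption for your site):
#   print(describe_fifo("/var/spool/fax/FIFO"))
```

A result of ``missing'' while faxq is running is consistent with the file having been removed without first stopping the server processes.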

FIFO-related problems usually happen when the FIFO special file is removed without first stopping the HylaFAX server processes. This can cause one or more processes to be left with an open file descriptor to a file that is no longer present in the filesystem, and hence unreachable. When doing maintenance work on a HylaFAX server it is a good idea to first shut down the server processes. This is simple to do with the hylafax shell script used on System V-style systems; simply do

If you believe that you are having problems with the messages exchanged by the various HylaFAX server processes, ServerTracing bit 0x04000 can be set to force logging of all the messages sent and received through FIFO special files; usually you need to do this only for the faxq process.

Other than FIFO communication problems, the most common scheduler-related problem encountered is that no outbound jobs get scheduled for processing. faxq will only schedule an outbound job when it believes it has a modem available that is capable of handling a job's needs. A modem's existence and capabilities are signalled to faxq through messages received on its FIFO file. The two programs that send modem information are faxgetty and faxmodem. Processes that want to send faxq information about new modems must be able to access the file; this means the protection on the file must be such that clients can open it for writing (the default setup should make this happen). The order in which faxq and faxgetty are started does not matter; a handshaking protocol between faxq and faxgetty ensures that modem status information will be exchanged no matter what order the processes start up in.

When in doubt about what is happening, or not happening, to jobs enable the job queue management tracing bit in the ServerTracing parameter for the faxq process; e.g.

and check the log to understand what is going on. There is also a separate bit for tracing low-level job queue operations; this should only be needed on rare occasions.

Otherwise there is very little that can go wrong with the scheduler with respect to managing the queue of outbound jobs. Setting the system time backwards on the server machine can cause problems as timers managed by faxq are calculated relative to the current time-of-day. Jobs may be rejected without a phone call if a rejectNotice entry is present for a destination phone number in the destination controls file destctrls(4F). Beware that multiple jobs to the same destination are usually serialized to reduce phone calls. Jobs that are blocked in this way have a "Blocked by concurrent..." status. You can change the maximum number of concurrent jobs that will be scheduled to a destination with the MaxConcurrentJobs configuration parameter.


TROUBLESHOOTING: SESSION TRACING

The SessionTracing parameter controls tracing information during the time HylaFAX is engaged in conversation with another device (fax machine, pager service provider, etc.). Tracing of this sort is done by faxgetty processes (when receiving facsimile) and by processes started up by faxq to process outbound jobs (faxsend, pagesend, etc.). Session tracing is controlled by configuration parameters specified in the per-modem configuration files. It is also possible to enable session tracing on a per-destination basis for outbound jobs through the per-destination DestControls facility provided by faxq.

Communication-related problems will be found in the information logged under session tracing. Session tracing information is stored in files in the log subdirectory in the spooling tree and is returned to users via electronic mail when notification is requested, or when an unrecoverable error is encountered. The log(4F) manual page has information on the meaning of many messages that might appear in these files.

A sample snippet from a session log is shown below. The trace was collected from a transmission through a Class 1 modem. The SessionTracing parameter was set to 0x4f. Notice that the format is very similar to the server tracing information collected through syslog (shown above). Beware that, unlike previous versions of HylaFAX, session logs are collected in individual files, each named according to a unique communication identifier. This can make locating the session log file a bit tricky; to find the appropriate log file you can look for a record in the etc/xferlog file or a server tracing message. Either should contain a communication identifier which can then be used to select the appropriate file from the log directory in the spooling area. (If only one call is being handled on a server you can also just look for the most recently changed file in the log directory.)
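Finding the most recently changed file in the log directory is easy to script. The sketch below makes no assumption about how the session log files are named; it simply picks the regular file with the latest modification time. The function name is hypothetical.

```python
import os


def newest_log(log_dir):
    """Return the path of the most recently modified regular file, or None if empty."""
    with os.scandir(log_dir) as entries:
        candidates = [(e.stat().st_mtime, e.path) for e in entries if e.is_file()]
    if not candidates:
        return None
    # max() compares modification times first, so the newest file wins.
    return max(candidates)[1]


# Example use (the spooling area location is an assumption for your site):
#   print(newest_log("/var/spool/fax/log"))
```

On a busy server with several modems this shortcut is unreliable; use the communication identifier from etc/xferlog instead.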

Session logs are straightforward to understand. Messages sent to the modem are identified by lines with a ``<--'' mark while data received from the modem are identified by lines with a ``-->'' mark. Timestamps show the date and time, with time to the right of the decimal point displayed to 10 millisecond precision (the typical granularity of the realtime clock on a UNIX system). The number of characters in each message prefixes the message itself. Unimportant binary data, usually the facsimile page data, is sometimes shown generically as ``data''.

Normal installation of HylaFAX will enable enough session tracing to debug most communication problems. The default configuration files come with SessionTracing set to 11 which is a good setting for Class 2 modems (i.e. a lot of information is provided, but the load on the server should not keep it from operating properly). For Class 1 modems a setting of 0x4f will also cause HDLC frames to be collected. Beware of tracing timer operations and modem I/O; these trace flags are only useful if you are trying to debug a problem specifically related to a timer not going off, or a problem where data appears to be corrupted.
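Because SessionTracing and ServerTracing are bit masks, it helps to see which individual bits a given value enables. The following sketch only does the arithmetic for the two values mentioned above; it deliberately does not name the bits, since their meanings are defined in the config(4F) manual page. The function name is hypothetical.

```python
def set_bits(mask):
    """Return the individual bit values that are set in mask, lowest first."""
    return [1 << i for i in range(mask.bit_length()) if mask & (1 << i)]


default_bits = set_bits(11)       # [1, 2, 8]
class1_bits = set_bits(0x4f)      # [1, 2, 4, 8, 64]

# Bits enabled by 0x4f that the default value of 11 leaves off:
extra_bits = set_bits(0x4f & ~11)  # [4, 64], i.e. 0x04 and 0x40
```

Looking up each listed bit in config(4F) shows exactly what additional tracing a candidate setting would turn on before you commit it to a per-modem configuration file.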

NOTE: When debugging modem-related problems, only enable the tracing that you really need. Enabling all tracing can affect the operation of the server processes by altering the timing of operations.

Note that when capturing a trace for the purpose of submitting a bug report, the less extraneous information you include, the easier it is for people to help understand the problem. Most of the time HylaFAX will return the relevant session log for a communication failure in the notification message sent to a user when an outbound job fails. Note however that the contents of this log are controlled by the value of the SessionTracing parameter specified in the per-modem configuration files. If this parameter is set too low then session logs may be returned that do not show sufficient information to diagnose a problem.


TROUBLESHOOTING: POSTSCRIPT DOCUMENT PREPARATION

PostScript documents submitted for transmission as facsimile are converted to a binary format by the ps2fax(1M) script that is invoked by the scheduler. If this preparation fails it is either due to the submission of invalid PostScript or a problem in the setup of the PostScript RIP that does the conversion from PostScript to TIFF/F. faxq tries to return any error messages returned by the PostScript RIP to the user that submits a job but some programs make this difficult. In this case it may be easiest to see what is happening by invoking the ps2fax script directly using the same command arguments used by faxq; this information can be found in the faxq trace log. Beware, however, that if faxq is started up by the init(1M) program it may inherit a different shell environment. In particular, beware of problems with search paths when the PostScript RIP is linked with Dynamic Shared Objects (DSOs); e.g. when Ghostscript is linked with the X11 driver and a DSO version of the X library.

NOTE: On machines with dynamic shared libraries (e.g. SunOS), if you link Ghostscript with the X11 device driver and use shared X11 libraries that are not in a standard location, then you may need to augment the HylaFAX util/ps2fax.gs.sh script with something of the form:
LD_LIBRARY_PATH=/usr/local/R5/lib:/usr/openwin/lib
export LD_LIBRARY_PATH
The faxsetup script should verify that the PostScript RIP is installed in the correct location and properly configured for use with HylaFAX. It may not however be able to verify DSO-related problems of the nature described above.

PostScript imaging problems may also result in the faxq process not being able to reopen the imaged document after the ps2fax script is run. After faxq invokes ps2fax to image a document it validates the resulting TIFF/F file to make sure the work was done correctly. This is necessary because some programs terminate with exit status 0 indicating a successful run even if an error was encountered.

There are some other potential problems to be aware of. If Ghostscript is configured with a tiffg32d device driver to generate 2D-encoded data, beware that versions of Ghostscript up to and including 3.12 had a bug in this driver that caused invalid data to be generated. If you are in doubt you can disable the use of 2D-encoded facsimile data with the Use2D configuration parameter to faxq:

(remember this goes in faxq's etc/config file, not the per-modem configuration file).

Versions of Ghostscript prior to about 3.63 permitted PostScript documents to set the output page dimensions to arbitrary values. Some applications such as Frame 4.0 and various PostScript drivers found in Microsoft Windows used this facility to force the output page to be 1734 pixels wide (8.5 inches at 204 pixels/inch). This causes problems when HylaFAX requests that documents be imaged with pages that are 1728 pixels wide. The result is that documents will be submitted, imaged, and then rejected during transmission because they have an incorrect page width. The solution is to get a current version of Ghostscript or to edit the PostScript documents to remove the setpagedevice requests that (incorrectly) force the output page dimensions.
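The mismatch comes straight from the arithmetic. The short sketch below reproduces the numbers quoted above; the constant and function names are hypothetical.

```python
FAX_RESOLUTION = 204   # pixels per inch, as quoted in the text
REQUESTED_WIDTH = 1728  # page width in pixels that HylaFAX asks for


def width_in_pixels(inches, resolution=FAX_RESOLUTION):
    """Convert a page width in inches to pixels at the given resolution."""
    return round(inches * resolution)


forced_width = width_in_pixels(8.5)       # 1734
excess = forced_width - REQUESTED_WIDTH   # 6 pixels too wide
```

A document that forces a full 8.5 inch page is therefore 6 pixels wider than the page HylaFAX requested, which is why it is imaged successfully but rejected at transmission time.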


TROUBLESHOOTING: TIFF DOCUMENT PREPARATION

Like PostScript documents, TIFF documents may need to be prepared before they can be transmitted as facsimile. This work is done by the tiff2fax(1M) script that is invoked by faxq. The tiff2fax script depends on programs that are part of the separate TIFF software distribution and, in some instances, the ps2fax script. If you encounter a problem preparing a TIFF document for transmission, capture the command arguments to the tiff2fax script from the faxq log and try running it by hand.


TROUBLESHOOTING: COMMUNICATION PROBLEMS

Communication problems refer to errors that occur during an outbound call handled by faxsend or pagesend, or an inbound call handled by faxgetty. Almost all communication problems can be diagnosed from the information in a session log. The only server tracing information that might be needed is for modem setup work done by faxgetty which does not appear in a session log.

If a problem occurs during modem setup by faxgetty, set ServerTracing to 11, or similar, in the modem configuration file and check the server trace log. Modem initialization for an outbound job is included in the session log and controlled by the SessionTracing parameter. Problems during modem setup are typically caused by:

Once a call is placed, facsimile communication happens in several phases. First the sender and receiver negotiate a set of session parameters to use during communication. These parameters define the format of data that is to be exchanged, the physical dimensions of the pages, and certain other parameters related to the communication work (speed at which to transfer data, modulation scheme, time to delay between raster scanlines to permit a receiver's printer to run properly, etc.). An example of this negotiation for a Class 2.0 modem is:

The receiver announces its capabilities and the sender then selects which of these capabilities it wants to use in doing the transmission.

NOTE: Beware that a common problem in Class 2 modems is for the modem to reject or ignore an AT+FDIS command to set the session parameters between the time a call is placed and a page is transmitted. When this problem occurs pages may appear ``squished'' or the session may be aborted abnormally. If you suspect this problem use the Class2DDISCmd configuration parameter to enable workaround support; e.g.
Class2DDISCmd: AT+FDIS
Following the negotiation of the session parameters, pages of facsimile data are transmitted, each followed by a post-page exchange of messages. For example:

In this case after the page an ``MPS'' message was sent to indicate that this page was to be followed by more pages of the same document. The receiver responded with an ``MCF'' message indicating the received page of facsimile data was OK (had an acceptable number of errors, if any) and that the sender should commence sending the next page of data. This procedure continues until the sender is done in which case it signs off by sending an ``EOP'' message at the end of the last page to be transmitted.

The basic work described above occurs for all facsimile communication no matter whether it is done with a Class 1, 2, or 2.0 modem. Two things to understand from this abbreviated description:

A more detailed description of the underlying facsimile protocol can be found at http://www.grayfax.com/ and, of course, in the CCITT/ITU T.30 recommendation.

When debugging a communication problem try to identify if the problem is transient or repeatable and if the problem always happens when communicating with the same sender/receiver. Also check the negotiated session parameters to see if there is any correlation, for example, to the negotiated data format (i.e. 1D-encoded data versus 2D-encoded data). Communication problems can be caused by many things including:

If you can communicate with some facsimile devices but not others then you can usually rule out the first cause. Transient or unrepeatable problems, jobs that require one or more retransmissions before working, or other similar problems are frequently due to line noise or poor phone connections, but are sometimes caused by incorrect protocol implementations or host-modem flow control problems that depend on host load or page content. If you believe you have a noise problem it is a good idea to isolate the modem on the phone line (i.e. remove any other equipment such as an answering machine or handset) and to try an alternate line if possible, though this may not matter if the problem is close to the receiver. Last but not least, when using a Class 2 or 2.0 modem check with the vendor to make sure the firmware revision for the modem is up to date. If you can reliably reproduce a problem then knowing the make and model of the device at the other end can help a modem vendor understand and/or fix a protocol implementation problem.

NOTE: If communication problems persist to a specific receiver when using 2D-encoded data, you can disable its use through the info(4F) database. Note also that sendfax has a command line option that lets you select 1D-encoded data in any single transmission.

The HylaFAQ contains information on some common communication problems that might be encountered. It is also important to monitor the operation of a server to detect trends that might indicate modem or telecommunication problems; the faxcron(1M) script is useful for doing this since it extracts the transcripts of failed calls.


TROUBLESHOOTING: GETTY PROBLEMS

HylaFAX will invoke the getty program when a data connection is established and the GettyArgs parameter is set to a non-null string. When HylaFAX starts up a getty program it sets the standard input, output, and error descriptors to the modem device (closing all other descriptors), creates a new process group, and turns off the CLOCAL bit on the tty device so that if carrier is dropped the process group will receive a SIGHUP signal.

The section on ``System-specific Guidance'' in the chapter on setting up a server and the HylaFAQ have information about other problems that you may encounter.


TROUBLESHOOTING: PROBLEMS WITH UUCP, CU, TIP, ETC.

If you have a problem running the fax software together with other communication programs such as uucp, cu, tip, slip, ppp, etc., first verify that your data communication software is configured to use the correct modem initialization strings and that both HylaFAX and the program(s) in question are using the same device name. Many of the prototype modem configuration files for Class 2 and Class 2.0 modems will leave the modem idling in Class 2 or 2.0. This means that in order to place a data call the modem must first be reset to Class 0; e.g.

Other common problems involve the ownership and protection of the modem device file. When a HylaFAX server process is running it forces the tty device to be owned by the ``fax'' user (typically the same UID as the ``uucp'' user) and to have the mode specified by the DeviceMode configuration parameter. Finally, beware that there are several different styles of UUCP lock files; verify that your UUCP and related programs use the same style that HylaFAX is configured to use. The UUCP lock file scheme used by HylaFAX may be specified with the UUCPLockType configuration parameter. If you specify this parameter be certain to put it in both the faxq configuration file and each modem configuration file; otherwise one HylaFAX server process may do the right thing while another may not.



Sam Leffler / sam@engr.sgi.com. Last updated $Date: 1996/08/16 21:03:37 $.