Configuration and Tuning

1. Synopsis

One of the important aspects of FreeBSD is proper system configuration. This chapter explains much of the FreeBSD configuration process, including some of the parameters which can be set to tune a FreeBSD system.

After reading this chapter, you will know:

  • The basics of rc.conf configuration and /usr/local/etc/rc.d startup scripts.

  • How to configure and test a network card.

  • How to configure virtual hosts on network devices.

  • How to use the various configuration files in /etc .

  • How to tune FreeBSD using sysctl(8) variables.

  • How to tune disk performance and modify kernel limitations.

Before reading this chapter, you should:

  • Understand UNIX® and FreeBSD basics ([_basics]).

  • Be familiar with the basics of kernel configuration and compilation ([_kernelconfig]).

2. Starting Services

Starting Services

Tom Rhodes

Many users install third party software on FreeBSD from the Ports Collection and require the installed services to be started upon system initialization. Services, such as mail/postfix or www/apache22 are just two of the many software packages which may be started during system initialization. This section explains the procedures available for starting third party software.

In FreeBSD, most included services, such as cron(8) , are started through the system startup scripts.

.1. Extended Application Configuration

Now that FreeBSD includes rc.d , configuration of application startup is easier and provides more features. Using the key words discussed in Managing Services in FreeBSD, applications can be set to start after certain other services and extra flags can be passed through /etc/rc.conf in place of hard coded flags in the startup script. A basic script may look similar to the following:

#!/bin/sh
#
# PROVIDE: utility
# REQUIRE: DAEMON
# KEYWORD: shutdown

. /etc/rc.subr

name=utility
rcvar=utility_enable

command="/usr/local/sbin/utility"

load_rc_config $name

#
# DO NOT CHANGE THESE DEFAULT VALUES HERE
# SET THEM IN THE /etc/rc.conf FILE
#
utility_enable=${utility_enable-"NO"}
pidfile=${utility_pidfile-"/var/run/utility.pid"}

run_rc_command "$1"

This script will ensure that the provided utility will be started after the DAEMON pseudo-service. It also provides a method for setting and tracking the process ID (PID).

This application could then have the following line placed in /etc/rc.conf :

utility_enable="YES"

This method allows for easier manipulation of command line arguments, inclusion of the default functions provided in /etc/rc.subr , compatibility with rcorder(8) , and provides for easier configuration via rc.conf .

.2. Using Services to Start Services

Other services can be started using inetd(8) . Working with inetd(8) and its configuration is described in depth in [_network_inetd].

In some cases, it may make more sense to use cron(8) to start system services. This approach has a number of advantages as cron(8) runs these processes as the owner of the crontab(5) . This allows regular users to start and maintain their own applications.

The @reboot feature of cron(8) , may be used in place of the time specification. This causes the job to run when cron(8) is started, normally during system initialization.

3. Configuring cron8

Configuring cron8

Tom Rhodes

One of the most useful utilities in FreeBSD is cron. This utility runs in the background and regularly checks /etc/crontab for tasks to execute and searches /var/cron/tabs for custom crontab files. These files are used to schedule tasks which cron runs at the specified times. Each entry in a crontab defines a task to run and is known as a cron job .

Two different types of configuration files are used: the system crontab, which should not be modified, and user crontabs, which can be created and edited as needed. The format used by these files is documented in crontab(5) . The format of the system crontab, /etc/crontab includes a who column which does not exist in user crontabs. In the system crontab, cron runs the command as the user specified in this column. In a user crontab, all commands run as the user who created the crontab.

User crontabs allow individual users to schedule their own tasks. The root user can also have a user crontab which can be used to schedule tasks that do not exist in the system crontab .

Here is a sample entry from the system crontab, /etc/crontab :

# /etc/crontab - root's crontab for FreeBSD
#
# $FreeBSD$
#
SHELL=/bin/sh
PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin
#
#minute	hour	mday	month	wday	who	command
#
*/5	*	*	*	*	root	/usr/libexec/atrun

Lines that begin with the ` \#` character are comments. A comment can be placed in the file as a reminder of what and why a desired action is performed. Comments cannot be on the same line as a command or else they will be interpreted as part of the command; they must be on a new line. Blank lines are ignored. The equals (= ) character is used to define any environment settings. In this example, it is used to define the SHELL and PATH . If the SHELL is omitted, cron will use the default Bourne shell. If the PATH is omitted, the full path must be given to the command or script to run. This line defines the seven fields used in a system crontab: minute , hour , mday , month , wday , who , and command . The minute

  field is the time in minutes when the specified command will
  be run, the `hour`
is the hour when the
  specified command will be run, the `mday`
	  is the day of the month, `month`
 is the
	  month, and `wday`
 is the day of the week.
	  These fields must be numeric values, representing the
	  twenty-four hour clock, or a ``\*``
,
	  representing all values for that field.  The
	  `who`
 field only exists in the system
	  crontab and specifies which user the command should be run
	  as.  The last field is the command to be executed.
This entry defines the values for this cron job.  The
	  ``\*/5``
, followed by several more
	  `\*`
 characters, specifies that
	  [command]``/usr/libexec/atrun``
 is invoked by
	  [username]``root``
 every five
	  minutes of every hour, of every day and day of the week, of
	  every month.
Commands can include any number of switches.  However,
	  commands which extend to multiple lines need to be broken
	  with the backslash "`
\`"
 continuation
	  character.

.1. Creating a User Crontab

To create a user crontab, invoke crontab in editor mode:

% crontab -e

This will open the user’s crontab using the default text editor. The first time a user runs this command, it will open an empty file. Once a user creates a crontab, this command will open that file for editing.

It is useful to add these lines to the top of the crontab file in order to set the environment variables and to remember the meanings of the fields in the crontab:

SHELL=/bin/sh
PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin
# Order of crontab fields
# minute	hour	mday	month	wday	command

Then add a line for each command or script to run, specifying the time to run the command. This example runs the specified custom Bourne shell script every day at two in the afternoon. Since the path to the script is not specified in PATH, the full path to the script is given:

0	14	*	*	*	/usr/home/dru/bin/mycustomscript.sh

Before using a custom script, make sure it is executable and test it with the limited set of environment variables set by cron. To replicate the environment that would be used to run the above cron entry, use:

env -i SHELL=/bin/sh PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin HOME=/home/dru LOGNAME=dru /usr/home/dru/bin/mycustomscript.sh

The environment set by cron is discussed in crontab(5) . Checking that scripts operate correctly in a cron environment is especially important if they include any commands that delete files using wildcards.

When finished editing the crontab, save the file. It will automatically be installed and cron will read the crontab and run its cron jobs at their specified times. To list the cron jobs in a crontab, use this command:

% crontab -l0	14	*	*	*	/usr/home/dru/bin/mycustomscript.sh

To remove all of the cron jobs in a user crontab:

% crontab -rremove crontab for dru?y

4. Managing Services in FreeBSD

Managing Services in FreeBSD

Tom Rhodes

FreeBSD uses the rc(8) system of startup scripts during system initialization and for managing services. The scripts listed in /etc/rc.d provide basic services which can be controlled with the start, stop, and restart options to service(8) . For instance, sshd(8) can be restarted with the following command:

# service sshd restart

This procedure can be used to start services on a running system. Services will be started automatically at boot time as specified in rc.conf(5) . For example, to enable natd(8) at system startup, add the following line to /etc/rc.conf :

natd_enable="YES"

If a natd_enable="NO" line is already present, change the NO to YES. The rc(8) scripts will automatically load any dependent services during the next boot, as described below.

Since the rc(8) system is primarily intended to start and stop services at system startup and shutdown time, the start, stop and restart options will only perform their action if the appropriate /etc/rc.conf variable is set. For instance, sshd restart will only work if sshd_enable is set to YES in /etc/rc.conf . To start, stop or restart a service regardless of the settings in /etc/rc.conf , these commands should be prefixed with “one” . For instance, to restart sshd(8) regardless of the current /etc/rc.conf setting, execute the following command:

# service sshd onerestart

To check if a service is enabled in /etc/rc.conf , run the appropriate rc(8) script with rcvar. This example checks to see if sshd(8) is enabled in /etc/rc.conf :

# service sshd rcvar# sshd
#
sshd_enable="YES"
#   (default: "")

The \# sshd line is output from the above command, not a root console.

To determine whether or not a service is running, use status. For instance, to verify that sshd(8) is running:

# service sshd statussshd is running as pid 433.

In some cases, it is also possible to reload a service. This attempts to send a signal to an individual service, forcing the service to reload its configuration files. In most cases, this means sending the service a SIGHUP signal. Support for this feature is not included for every service.

The rc(8) system is used for network services and it also contributes to most of the system initialization. For instance, when the /etc/rc.d/bgfsck script is executed, it prints out the following message:

Starting background file system checks in 60 seconds.

This script is used for background file system checks, which occur only during system initialization.

Many system services depend on other services to function properly. For example, yp(8) and other RPC-based services may fail to start until after the rpcbind(8) service has started. To resolve this issue, information about dependencies and other meta-data is included in the comments at the top of each startup script. The rcorder(8) program is used to parse these comments during system initialization to determine the order in which system services should be invoked to satisfy the dependencies.

The following key word must be included in all startup scripts as it is required by rc.subr(8) to “enable” the startup script:

  • PROVIDE: Specifies the services this file provides.

The following key words may be included at the top of each startup script. They are not strictly necessary, but are useful as hints to rcorder(8) :

  • REQUIRE: Lists services which are required for this service. The script containing this key word will run after the specified services.

  • BEFORE: Lists services which depend on this service. The script containing this key word will run before the specified services.

By carefully setting these keywords for each startup script, an administrator has a fine-grained level of control of the startup order of the scripts, without the need for “runlevels” used by some UNIX® operating systems.

Additional information can be found in rc(8) and rc.subr(8) . Refer to this article for instructions on how to create custom rc(8) scripts.

.1. Managing System-Specific Configuration

The principal location for system configuration information is /etc/rc.conf . This file contains a wide range of configuration information and it is read at system startup to configure the system. It provides the configuration information for the rc* files.

The entries in /etc/rc.conf override the default settings in /etc/defaults/rc.conf . The file containing the default settings should not be edited. Instead, all system-specific changes should be made to /etc/rc.conf .

A number of strategies may be applied in clustered applications to separate site-wide configuration from system-specific configuration in order to reduce administration overhead. The recommended approach is to place system-specific configuration into /etc/rc.conf.local . For example, these entries in /etc/rc.conf apply to all systems:

sshd_enable="YES"
keyrate="fast"
defaultrouter="10.1.1.254"

Whereas these entries in /etc/rc.conf.local apply to this system only:

hostname="node1.example.org"
ifconfig_fxp0="inet 10.1.1.1/8"

Distribute /etc/rc.conf to every system using an application such as rsync or puppet, while /etc/rc.conf.local remains unique.

Upgrading the system will not overwrite /etc/rc.conf , so system configuration information will not be lost.

Both /etc/rc.conf and /etc/rc.conf.local are parsed by sh(1) . This allows system operators to create complex configuration scenarios. Refer to rc.conf(5) for further information on this topic.

5. Setting Up Network Interface Cards

Setting Up Network Interface Cards

Marc Fonvieille

Adding and configuring a network interface card (NIC) is a common task for any FreeBSD administrator.

.1. Locating the Correct Driver

First, determine the model of the NIC and the chip it uses. FreeBSD supports a wide variety of NICs. Check the Hardware Compatibility List for the FreeBSD release to see if the NIC is supported.

If the NIC is supported, determine the name of the FreeBSD driver for the NIC. Refer to /usr/src/sys/conf/NOTES and /usr/src/sys/arch/conf/NOTES for the list of NIC drivers with some information about the supported chipsets. When in doubt, read the manual page of the driver as it will provide more information about the supported hardware and any known limitations of the driver.

The drivers for common NICs are already present in the GENERIC kernel, meaning the NIC should be probed during boot. The system’s boot messages can be viewed by typing more /var/run/dmesg.boot and using the spacebar to scroll through the text. In this example, two Ethernet NICs using the dc(4) driver are present on the system:

dc0: <82c169 PNIC 10/100BaseTX> port 0xa000-0xa0ff mem 0xd3800000-0xd38
000ff irq 15 at device 11.0 on pci0
miibus0: <MII bus> on dc0
bmtphy0: <BCM5201 10/100baseTX PHY> PHY 1 on miibus0
bmtphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
dc0: Ethernet address: 00:a0:cc:da:da:da
dc0: [ITHREAD]
dc1: <82c169 PNIC 10/100BaseTX> port 0x9800-0x98ff mem 0xd3000000-0xd30
000ff irq 11 at device 12.0 on pci0
miibus1: <MII bus> on dc1
bmtphy1: <BCM5201 10/100baseTX PHY> PHY 1 on miibus1
bmtphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
dc1: Ethernet address: 00:a0:cc:da:da:db
dc1: [ITHREAD]

If the driver for the NIC is not present in GENERIC , but a driver is available, the driver will need to be loaded before the NIC can be configured and used. This may be accomplished in one of two ways:

  • The easiest way is to load a kernel module for the NIC using kldload(8) . To also automatically load the driver at boot time, add the appropriate line to /boot/loader.conf . Not all NIC drivers are available as modules.

  • Alternatively, statically compile support for the NIC into a custom kernel. Refer to /usr/src/sys/conf/NOTES , /usr/src/sys/arch/conf/NOTES and the manual page of the driver to determine which line to add to the custom kernel configuration file. For more information about recompiling the kernel, refer to [_kernelconfig]. If the NIC was detected at boot, the kernel does not need to be recompiled.

.1.1. Using Windows NDIS Drivers

Unfortunately, there are still many vendors that do not provide schematics for their drivers to the open source community because they regard such information as trade secrets. Consequently, the developers of FreeBSD and other operating systems are left with two choices: develop the drivers by a long and pain-staking process of reverse engineering or using the existing driver binaries available for Microsoft™  Windows™ platforms.

FreeBSD provides “native” support for the Network Driver Interface Specification (NDIS). It includes ndisgen(8) which can be used to convert a Windows™  XP driver into a format that can be used on FreeBSD. Because the ndis(4) driver uses a Windows™  XP binary, it only runs on i386™ and amd64 systems. PCI, CardBus, PCMCIA, and USB devices are supported.

To use ndisgen(8) , three things are needed:

  1. FreeBSD kernel sources.

  2. A Windows™  XP driver binary with a .SYS extension.

  3. A Windows™  XP driver configuration file with a .INF extension.

Download the .SYS and .INF files for the specific NIC. Generally, these can be found on the driver CD or at the vendor’s website. The following examples use W32DRIVER.SYS and W32DRIVER.INF .

The driver bit width must match the version of FreeBSD. For FreeBSD/i386, use a Windows™ 32-bit driver. For FreeBSD/amd64, a Windows™ 64-bit driver is needed.

The next step is to compile the driver binary into a loadable kernel module. As root , use ndisgen(8) :

# ndisgen /path/to/W32DRIVER.INF /path/to/W32DRIVER.SYS

This command is interactive and prompts for any extra information it requires. A new kernel module will be generated in the current directory. Use kldload(8) to load the new module:

# kldload ./W32DRIVER_SYS.ko

In addition to the generated kernel module, the ndis.ko and if_ndis.ko modules must be loaded. This should happen automatically when any module that depends on ndis(4) is loaded. If not, load them manually, using the following commands:

# kldload ndis
# kldload if_ndis

The first command loads the ndis(4) miniport driver wrapper and the second loads the generated NIC driver.

Check dmesg(8) to see if there were any load errors. If all went well, the output should be similar to the following:

ndis0: <Wireless-G PCI Adapter> mem 0xf4100000-0xf4101fff irq 3 at device 8.0 on pci1
ndis0: NDIS API version: 5.0
ndis0: Ethernet address: 0a:b1:2c:d3:4e:f5
ndis0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
ndis0: 11g rates: 6Mbps 9Mbps 12Mbps 18Mbps 36Mbps 48Mbps 54Mbps

From here, ndis0 can be configured like any other NIC.

To configure the system to load the ndis(4) modules at boot time, copy the generated module, W32DRIVER_SYS.ko , to /boot/modules . Then, add the following line to /boot/loader.conf :

W32DRIVER_SYS_load="YES"

.2. Configuring the Network Card

Once the right driver is loaded for the NIC, the card needs to be configured. It may have been configured at installation time by bsdinstall(8) .

To display the NIC configuration, enter the following command:

% ifconfigdc0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80008<VLAN_MTU,LINKSTATE>
        ether 00:a0:cc:da:da:da
        inet 192.168.1.3 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
dc1: flags=8802<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80008<VLAN_MTU,LINKSTATE>
        ether 00:a0:cc:da:da:db
        inet 10.0.0.1 netmask 0xffffff00 broadcast 10.0.0.255
        media: Ethernet 10baseT/UTP
        status: no carrier
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=3<RXCSUM,TXCSUM>
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>

In this example, the following devices were displayed:

  • dc0 : The first Ethernet interface.

  • dc1 : The second Ethernet interface.

  • lo0 : The loopback device.

FreeBSD uses the driver name followed by the order in which the card is detected at boot to name the NIC. For example, sis2 is the third NIC on the system using the sis(4) driver.

In this example, dc0 is up and running. The key indicators are:

  1. UP means that the card is configured and ready.

  2. The card has an Internet (inet) address, 192.168.1.3 .

  3. It has a valid subnet mask (netmask), where 0xffffff00 is the same as 255.255.255.0 .

  4. It has a valid broadcast address, 192.168.1.255 .

  5. The MAC address of the card (ether) is 00:a0:cc:da:da:da .

  6. The physical media selection is on autoselection mode (media: Ethernet autoselect (100baseTX <full-duplex>)). In this example, dc1 is configured to run with 10baseT/UTP media. For more information on available media types for a driver, refer to its manual page.

  7. The status of the link (status) is active, indicating that the carrier signal is detected. For dc1 , the status: no carrier status is normal when an Ethernet cable is not plugged into the card.

If the ifconfig(8) output had shown something similar to:

dc0: flags=8843<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=80008<VLAN_MTU,LINKSTATE>
	ether 00:a0:cc:da:da:da
	media: Ethernet autoselect (100baseTX <full-duplex>)
	status: active

it would indicate the card has not been configured.

The card must be configured as root . The NIC configuration can be performed from the command line with ifconfig(8) but will not persist after a reboot unless the configuration is also added to /etc/rc.conf . If a DHCP server is present on the LAN, just add this line:

ifconfig_dc0="DHCP"

Replace dc0 with the correct value for the system.

The line added, then, follow the instructions given in Testing and Troubleshooting.

If the network was configured during installation, some entries for the NIC(s) may be already present. Double check /etc/rc.conf before adding any lines.

In the case, there is no DHCP server, the NIC(s) have to be configured manually. Add a line for each NIC present on the system, as seen in this example:

ifconfig_dc0="inet 192.168.1.3 netmask 255.255.255.0"
ifconfig_dc1="inet 10.0.0.1 netmask 255.255.255.0 media 10baseT/UTP"

Replace dc0 and dc1 and the IP address information with the correct values for the system. Refer to the man page for the driver, ifconfig(8) , and rc.conf(5) for more details about the allowed options and the syntax of /etc/rc.conf .

If the network is not using DNS, edit /etc/hosts to add the names and IP addresses of the hosts on the LAN, if they are not already there. For more information, refer to hosts(5) and to /usr/share/examples/etc/hosts .

If there is no DHCP server and access to the Internet is needed, manually configure the default gateway and the nameserver:

# echo 'defaultrouter="your_default_router"' >> /etc/rc.conf
# echo 'nameserver your_DNS_server' >> /etc/resolv.conf

.3. Testing and Troubleshooting

Once the necessary changes to /etc/rc.conf are saved, a reboot can be used to test the network configuration and to verify that the system restarts without any configuration errors. Alternatively, apply the settings to the networking system with this command:

# service netif restart

If a default gateway has been set in /etc/rc.conf , also issue this command:

# service routing restart

Once the networking system has been relaunched, test the NICs.

.3.1. Testing the Ethernet Card

To verify that an Ethernet card is configured correctly, ping(8) the interface itself, and then ping(8) another machine on the LAN:

% ping -c5 192.168.1.3PING 192.168.1.3 (192.168.1.3): 56 data bytes
64 bytes from 192.168.1.3: icmp_seq=0 ttl=64 time=0.082 ms
64 bytes from 192.168.1.3: icmp_seq=1 ttl=64 time=0.074 ms
64 bytes from 192.168.1.3: icmp_seq=2 ttl=64 time=0.076 ms
64 bytes from 192.168.1.3: icmp_seq=3 ttl=64 time=0.108 ms
64 bytes from 192.168.1.3: icmp_seq=4 ttl=64 time=0.076 ms

--- 192.168.1.3 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.074/0.083/0.108/0.013 ms
% ping -c5 192.168.1.2PING 192.168.1.2 (192.168.1.2): 56 data bytes
64 bytes from 192.168.1.2: icmp_seq=0 ttl=64 time=0.726 ms
64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=0.766 ms
64 bytes from 192.168.1.2: icmp_seq=2 ttl=64 time=0.700 ms
64 bytes from 192.168.1.2: icmp_seq=3 ttl=64 time=0.747 ms
64 bytes from 192.168.1.2: icmp_seq=4 ttl=64 time=0.704 ms

--- 192.168.1.2 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.700/0.729/0.766/0.025 ms

To test network resolution, use the host name instead of the IP address. If there is no DNS server on the network, /etc/hosts must first be configured. To this purpose, edit /etc/hosts to add the names and IP addresses of the hosts on the LAN, if they are not already there. For more information, refer to hosts(5) and to /usr/share/examples/etc/hosts .

.3.2. Troubleshooting

When troubleshooting hardware and software configurations, check the simple things first. Is the network cable plugged in? Are the network services properly configured? Is the firewall configured correctly? Is the NIC supported by FreeBSD? Before sending a bug report, always check the Hardware Notes, update the version of FreeBSD to the latest STABLE version, check the mailing list archives, and search the Internet.

If the card works, yet performance is poor, read through tuning(7) . Also, check the network configuration as incorrect network settings can cause slow connections.

Some users experience one or two device timeout messages, which is normal for some cards. If they continue, or are bothersome, determine if the device is conflicting with another device. Double check the cable connections. Consider trying another card.

To resolve watchdog timeout errors, first check the network cable. Many cards require a PCI slot which supports bus mastering. On some old motherboards, only one PCI slot allows it, usually slot 0. Check the NIC and the motherboard documentation to determine if that may be the problem.

No route to host messages occur if the system is unable to route a packet to the destination host. This can happen if no default route is specified or if a cable is unplugged. Check the output of netstat -rn and make sure there is a valid route to the host. If there is not, read [_network_routing].

ping: sendto: Permission denied error messages are often caused by a misconfigured firewall. If a firewall is enabled on FreeBSD but no rules have been defined, the default policy is to deny all traffic, even ping(8) . Refer to [_firewalls] for more information.

Sometimes performance of the card is poor or below average. In these cases, try setting the media selection mode from autoselect to the correct media selection. While this works for most hardware, it may or may not resolve the issue. Again, check all the network settings, and refer to tuning(7) .

6. Virtual Hosts

A common use of FreeBSD is virtual site hosting, where one server appears to the network as many servers. This is achieved by assigning multiple network addresses to a single interface.

A given network interface has one “real” address, and may have any number of “alias” addresses. These aliases are normally added by placing alias entries in /etc/rc.conf , as seen in this example:

ifconfig_fxp0_alias0="inet xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx"

Alias entries must start with alias0 using a sequential number such as alias0, alias1, and so on. The configuration process will stop at the first missing number.

The calculation of alias netmasks is important. For a given interface, there must be one address which correctly represents the network’s netmask. Any other addresses which fall within this network must have a netmask of all 1s, expressed as either 255.255.255.255 or 0xffffffff .

For example, consider the case where the fxp0 interface is connected to two networks: 10.1.1.0 with a netmask of 255.255.255.0 and 202.0.75.16 with a netmask of 255.255.255.240 . The system is to be configured to appear in the ranges 10.1.1.1 through 10.1.1.5 and 202.0.75.17 through 202.0.75.20 . Only the first address in a given network range should have a real netmask. All the rest (10.1.1.2 through 10.1.1.5 and 202.0.75.18 through 202.0.75.20 ) must be configured with a netmask of 255.255.255.255 .

The following /etc/rc.conf entries configure the adapter correctly for this scenario:

ifconfig_fxp0="inet 10.1.1.1 netmask 255.255.255.0"
ifconfig_fxp0_alias0="inet 10.1.1.2 netmask 255.255.255.255"
ifconfig_fxp0_alias1="inet 10.1.1.3 netmask 255.255.255.255"
ifconfig_fxp0_alias2="inet 10.1.1.4 netmask 255.255.255.255"
ifconfig_fxp0_alias3="inet 10.1.1.5 netmask 255.255.255.255"
ifconfig_fxp0_alias4="inet 202.0.75.17 netmask 255.255.255.240"
ifconfig_fxp0_alias5="inet 202.0.75.18 netmask 255.255.255.255"
ifconfig_fxp0_alias6="inet 202.0.75.19 netmask 255.255.255.255"
ifconfig_fxp0_alias7="inet 202.0.75.20 netmask 255.255.255.255"

A simpler way to express this is with a space-separated list of IP address ranges. The first address will be given the indicated subnet mask and the additional addresses will have a subnet mask of 255.255.255.255.

ifconfig_fxp0_aliases="inet 10.1.1.1-5/24 inet 202.0.75.17-20/28"

7. Configuring System Logging

Configuring System Logging

Niclas Zeising

Generating and reading system logs is an important aspect of system administration. The information in system logs can be used to detect hardware and software issues as well as application and system configuration errors. This information also plays an important role in security auditing and incident response. Most system daemons and applications will generate log entries.

FreeBSD provides a system logger, syslogd, to manage logging. By default, syslogd is started when the system boots. This is controlled by the variable syslogd_enable in /etc/rc.conf . There are numerous application arguments that can be set using syslogd_flags in /etc/rc.conf . Refer to syslogd(8) for more information on the available arguments.

This section describes how to configure the FreeBSD system logger for both local and remote logging and how to perform log rotation and log management.

.1. Configuring Local Logging

The configuration file, /etc/syslog.conf , controls what syslogd does with log entries as they are received. There are several parameters to control the handling of incoming events. The facility describes which subsystem generated the message, such as the kernel or a daemon, and the level describes the severity of the event that occurred. This makes it possible to configure if and where a log message is logged, depending on the facility and level. It is also possible to take action depending on the application that sent the message, and in the case of remote logging, the hostname of the machine generating the logging event.

This configuration file contains one line per action, where the syntax for each line is a selector field followed by an action field. The syntax of the selector field is facility.level which will match log messages from facility at level level or higher. It is also possible to add an optional comparison flag before the level to specify more precisely what is logged. Multiple selector fields can be used for the same action, and are separated with a semicolon (;). Using \* will match everything. The action field denotes where to send the log message, such as to a file or remote log host. As an example, here is the default syslog.conf from FreeBSD:

# $FreeBSD$
#
#       Spaces ARE valid field separators in this file. However,
#       other *nix-like systems still insist on using tabs as field
#       separators. If you are sharing this file between systems, you
#       may want to use only tabs as field separators here.
#       Consult the syslog.conf(5) manpage.
*.err;kern.warning;auth.notice;mail.crit                /dev/console
*.notice;authpriv.none;kern.debug;lpr.info;mail.crit;news.err   /var/log/messages
security.*                                      /var/log/security
auth.info;authpriv.info                         /var/log/auth.log
mail.info                                       /var/log/maillog
lpr.info                                        /var/log/lpd-errs
ftp.info                                        /var/log/xferlog
cron.*                                          /var/log/cron
!-devd
*.=debug                                        /var/log/debug.log
*.emerg                                         *
# uncomment this to log all writes to /dev/console to /var/log/console.log
#console.info                                   /var/log/console.log
# uncomment this to enable logging of all log messages to /var/log/all.log
# touch /var/log/all.log and chmod it to mode 600 before it will work
#*.*                                            /var/log/all.log
# uncomment this to enable logging to a remote loghost named loghost
#*.*                                            @loghost
# uncomment these if you're running inn
# news.crit                                     /var/log/news/news.crit
# news.err                                      /var/log/news/news.err
# news.notice                                   /var/log/news/news.notice
# Uncomment this if you wish to see messages produced by devd
# !devd
# *.>=info
!ppp
*.*                                             /var/log/ppp.log
!*

In this example:

  • Line 8 matches all messages with a level of err or higher, as well as kern.warning, auth.notice and mail.crit, and sends these log messages to the console (/dev/console ).

  • Line 12 matches all messages from the mail facility at level info or above and logs the messages to /var/log/maillog .

  • Line 17 uses a comparison flag (=) to only match messages at level debug and logs them to /var/log/debug.log .

  • Line 33 is an example usage of a program specification. This makes the rules following it only valid for the specified program. In this case, only the messages generated by ppp are logged to /var/log/ppp.log .

The available levels, in order from most to least critical are emerg, alert, crit, err, warning, notice, info, and debug.

The facilities, in no particular order, are auth, authpriv, console, cron, daemon, ftp, kern, lpr, mail, mark, news, security, syslog, user, uucp, and local0 through local7. Be aware that other operating systems might have different facilities.

To log everything of level notice and higher to /var/log/daemon.log , add the following entry:

daemon.notice                                        /var/log/daemon.log

For more information about the different levels and facilities, refer to syslog(3) and syslogd(8) . For more information about /etc/syslog.conf , its syntax, and more advanced usage examples, see syslog.conf(5) .

.2. Log Management and Rotation

Log files can grow quickly, taking up disk space and making it more difficult to locate useful information. Log management attempts to mitigate this. In FreeBSD, newsyslog is used to manage log files. This built-in program periodically rotates and compresses log files, and optionally creates missing log files and signals programs when log files are moved. The log files may be generated by syslogd or by any other program which generates log files. While newsyslog is normally run from cron(8) , it is not a system daemon. In the default configuration, it runs every hour.

To know which actions to take, newsyslog reads its configuration file, /etc/newsyslog.conf . This file contains one line for each log file that newsyslog manages. Each line states the file owner, permissions, when to rotate that file, optional flags that affect log rotation, such as compression, and programs to signal when the log is rotated. Here is the default configuration in FreeBSD:

# configuration file for newsyslog
# $FreeBSD$
#
# Entries which do not specify the '/pid_file' field will cause the
# syslogd process to be signalled when that log file is rotated.  This
# action is only appropriate for log files which are written to by the
# syslogd process (ie, files listed in /etc/syslog.conf).  If there
# is no process which needs to be signalled when a given log file is
# rotated, then the entry for that file should include the 'N' flag.
#
# The 'flags' field is one or more of the letters: BCDGJNUXZ or a '-'.
#
# Note: some sites will want to select more restrictive protections than the
# defaults.  In particular, it may be desirable to switch many of the 644
# entries to 640 or 600.  For example, some sites will consider the
# contents of maillog, messages, and lpd-errs to be confidential.  In the
# future, these defaults may change to more conservative ones.
#
# logfilename          [owner:group]    mode count size when  flags [/pid_file] [sig_num]
/var/log/all.log                        600  7     *    @T00  J
/var/log/amd.log                        644  7     100  *     J
/var/log/auth.log                       600  7     100  @0101T JC
/var/log/console.log                    600  5     100  *     J
/var/log/cron                           600  3     100  *     JC
/var/log/daily.log                      640  7     *    @T00  JN
/var/log/debug.log                      600  7     100  *     JC
/var/log/kerberos.log                   600  7     100  *     J
/var/log/lpd-errs                       644  7     100  *     JC
/var/log/maillog                        640  7     *    @T00  JC
/var/log/messages                       644  5     100  @0101T JC
/var/log/monthly.log                    640  12    *    $M1D0 JN
/var/log/pflog                          600  3     100  *     JB    /var/run/pflogd.pid
/var/log/ppp.log        root:network    640  3     100  *     JC
/var/log/devd.log                       644  3     100  *     JC
/var/log/security                       600  10    100  *     JC
/var/log/sendmail.st                    640  10    *    168   B
/var/log/utx.log                        644  3     *    @01T05 B
/var/log/weekly.log                     640  5     1    $W6D0 JN
/var/log/xferlog                        600  7     100  *     JC

Each line starts with the name of the log to be rotated, optionally followed by an owner and group for both rotated and newly created files. The mode field sets the permissions on the log file and count denotes how many rotated log files should be kept. The size and when fields tell newsyslog when to rotate the file. A log file is rotated when either its size is larger than the size field or when the time in the when field has passed. An asterisk (\*) means that this field is ignored. The flags field gives further instructions, such as how to compress the rotated file or to create the log file if it is missing. The last two fields are optional and specify the name of the Process ID (PID) file of a process and a signal number to send to that process when the file is rotated.

For more information on all fields, valid flags, and how to specify the rotation time, refer to newsyslog.conf(5) . Since newsyslog is run from cron(8) , it cannot rotate files more often than it is scheduled to run from cron(8) .

.3. Configuring Remote Logging

Configuring Remote Logging

Tom Rhodes

Monitoring the log files of multiple hosts can become unwieldy as the number of systems increases. Configuring centralized logging can reduce some of the administrative burden of log file administration.

In FreeBSD, centralized log file aggregation, merging, and rotation can be configured using syslogd and newsyslog. This section demonstrates an example configuration, where host A , named logserv.example.com , will collect logging information for the local network. Host B , named logclient.example.com , will be configured to pass logging information to the logging server.

.1. Log Server Configuration

A log server is a system that has been configured to accept logging information from other hosts. Before configuring a log server, check the following:

  • If there is a firewall between the logging server and any logging clients, ensure that the firewall ruleset allows UDP port 514 for both the clients and the server.

  • The logging server and all client machines must have forward and reverse entries in the local DNS. If the network does not have a DNS server, create entries in each system’s /etc/hosts . Proper name resolution is required so that log entries are not rejected by the logging server.

On the log server, edit /etc/syslog.conf to specify the name of the client to receive log entries from, the logging facility to be used, and the name of the log to store the host’s log entries. This example adds the hostname of B , logs all facilities, and stores the log entries in /var/log/logclient.log .

Example 1. Sample Log Server Configuration
+logclient.example.com
*.*     /var/log/logclient.log

When adding multiple log clients, add a similar two-line entry for each client. More information about the available facilities may be found in syslog.conf(5) .

Next, configure /etc/rc.conf :

syslogd_enable="YES"
syslogd_flags="-a logclient.example.com -v -v"

The first entry starts syslogd at system boot. The second entry allows log entries from the specified client. The -v -v increases the verbosity of logged messages. This is useful for tweaking facilities as administrators are able to see what type of messages are being logged under each facility.

Multiple -a options may be specified to allow logging from multiple clients. IP addresses and whole netblocks may also be specified. Refer to syslogd(8) for a full list of possible options.

Finally, create the log file:

# touch /var/log/logclient.log

At this point, syslogd should be restarted and verified:

# service syslogd restart
# pgrep syslog

If a PID is returned, the server restarted successfully, and client configuration can begin. If the server did not restart, consult /var/log/messages for the error.

.2. Log Client Configuration

A logging client sends log entries to a logging server on the network. The client also keeps a local copy of its own logs.

Once a logging server has been configured, edit /etc/rc.conf on the logging client:

syslogd_enable="YES"
syslogd_flags="-s -v -v"

The first entry enables syslogd on boot up. The second entry prevents logs from being accepted by this client from other hosts (-s) and increases the verbosity of logged messages.

Next, define the logging server in the client’s /etc/syslog.conf . In this example, all logged facilities are sent to a remote system, denoted by the @ symbol, with the specified hostname:

*.*		@logserv.example.com

After saving the edit, restart syslogd for the changes to take effect:

# service syslogd restart

To test that log messages are being sent across the network, use logger(1) on the client to send a message to syslogd:

# logger "Test message from logclient"

This message should now exist both in /var/log/messages on the client and /var/log/logclient.log on the log server.

.3. Debugging Log Servers

If no messages are being received on the log server, the cause is most likely a network connectivity issue, a hostname resolution issue, or a typo in a configuration file. To isolate the cause, ensure that both the logging server and the logging client are able to ping each other using the hostname specified in their /etc/rc.conf . If this fails, check the network cabling, the firewall ruleset, and the hostname entries in the DNS server or /etc/hosts on both the logging server and clients. Repeat until the ping is successful from both hosts.

If the ping succeeds on both hosts but log messages are still not being received, temporarily increase logging verbosity to narrow down the configuration issue. In the following example, /var/log/logclient.log on the logging server is empty and /var/log/messages on the logging client does not indicate a reason for the failure. To increase debugging output, edit the syslogd_flags entry on the logging server and issue a restart:

syslogd_flags="-d -a logclient.example.com -v -v"
# service syslogd restart

Debugging data similar to the following will flash on the console immediately after the restart:

logmsg: pri 56, flags 4, from logserv.example.com, msg syslogd: restart
syslogd: restarted
logmsg: pri 6, flags 4, from logserv.example.com, msg syslogd: kernel boot file is /boot/kernel/kernel
Logging to FILE /var/log/messages
syslogd: kernel boot file is /boot/kernel/kernel
cvthname(192.168.1.10)
validate: dgram from IP 192.168.1.10, port 514, name logclient.example.com;
rejected in rule 0 due to name mismatch.

In this example, the log messages are being rejected due to a typo which results in a hostname mismatch. The client’s hostname should be logclient, not logclien. Fix the typo, issue a restart, and verify the results:

# service syslogd restartlogmsg: pri 56, flags 4, from logserv.example.com, msg syslogd: restart
syslogd: restarted
logmsg: pri 6, flags 4, from logserv.example.com, msg syslogd: kernel boot file is /boot/kernel/kernel
syslogd: kernel boot file is /boot/kernel/kernel
logmsg: pri 166, flags 17, from logserv.example.com,
msg Dec 10 20:55:02 <syslog.err> logserv.example.com syslogd: exiting on signal 2
cvthname(192.168.1.10)
validate: dgram from IP 192.168.1.10, port 514, name logclient.example.com;
accepted in rule 0.
logmsg: pri 15, flags 0, from logclient.example.com, msg Dec 11 02:01:28 trhodes: Test message 2
Logging to FILE /var/log/logclient.log
Logging to FILE /var/log/messages

At this point, the messages are being properly received and placed in the correct file.

.4. Security Considerations

As with any network service, security requirements should be considered before implementing a logging server. Log files may contain sensitive data about services enabled on the local host, user accounts, and configuration data. Network data sent from the client to the server will not be encrypted or password protected. If a need for encryption exists, consider using security/stunnel , which will transmit the logging data over an encrypted tunnel.

Local security is also an issue. Log files are not encrypted during use or after log rotation. Local users may access log files to gain additional insight into system configuration. Setting proper permissions on log files is critical. The built-in log rotator, newsyslog, supports setting permissions on newly created and rotated log files. Setting log files to mode 600 should prevent unwanted access by local users. Refer to newsyslog.conf(5) for additional information.

8. Configuration Files

8.1. /etc Layout

There are a number of directories in which configuration information is kept. These include:

/etc

Generic system-specific configuration information.

/etc/defaults

Default versions of system configuration files.

/etc/mail

Extra sendmail(8) configuration and otherMTA configuration files.

/etc/ppp

Configuration for both user- and kernel-ppp programs.

/usr/local/etc

Configuration files for installed applications. May contain per-application subdirectories.

/usr/local/etc/rc.d

rc(8) scripts for installed applications.

/var/db

Automatically generated system-specific database files, such as the package database and the locate(1) database.

8.2. Hostnames

8.2.1. /etc/resolv.conf

How a FreeBSD system accesses the Internet Domain Name System (DNS) is controlled by resolv.conf(5) .

The most common entries to /etc/resolv.conf are:

nameserver

The IP address of a name server the resolver should query. The servers are queried in the order listed with a maximum of three.

search

Search list for hostname lookup. This is normally determined by the domain of the local hostname.

domain

The local domain name.

A typical /etc/resolv.conf looks like this:

search example.com
nameserver 147.11.1.11
nameserver 147.11.100.30

Only one of the search and domain options should be used.

When using DHCP, dhclient(8) usually rewrites /etc/resolv.conf with information received from the DHCP server.

8.2.2. /etc/hosts

/etc/hosts is a simple text database which works in conjunction with DNS and NIS to provide host name to IP address mappings. Entries for local computers connected via a LAN can be added to this file for simplistic naming purposes instead of setting up a named(8) server. Additionally, /etc/hosts can be used to provide a local record of Internet names, reducing the need to query external DNS servers for commonly accessed names.

# $FreeBSD$
#
#
# Host Database
#
# This file should contain the addresses and aliases for local hosts that
# share this file.  Replace 'my.domain' below with the domainname of your
# machine.
#
# In the presence of the domain name service or NIS, this file may
# not be consulted at all; see /etc/nsswitch.conf for the resolution order.
#
#
::1			localhost localhost.my.domain
127.0.0.1		localhost localhost.my.domain
#
# Imaginary network.
#10.0.0.2		myname.my.domain myname
#10.0.0.3		myfriend.my.domain myfriend
#
# According to RFC 1918, you can use the following IP networks for
# private nets which will never be connected to the Internet:
#
#	10.0.0.0	-   10.255.255.255
#	172.16.0.0	-   172.31.255.255
#	192.168.0.0	-   192.168.255.255
#
# In case you want to be able to connect to the Internet, you need
# real official assigned numbers.  Do not try to invent your own network
# numbers but instead get one from your network provider (if any) or
# from your regional registry (ARIN, APNIC, LACNIC, RIPE NCC, or AfriNIC.)
#

The format of /etc/hosts is as follows:

[Internet address] [official hostname] [alias1] [alias2] ...

For example:

10.0.0.1 myRealHostname.example.com myRealHostname foobar1 foobar2

Consult hosts(5) for more information.

9. Tuning with sysctl8

 sysctl(8)
 is used to make changes to a running FreeBSD system.
This includes many advanced options of the [acronym]``TCP/IP`` stack and virtual memory system that can dramatically improve performance for an experienced system administrator.
Over five hundred system variables can be read and set using  sysctl(8)
.

At its core, sysctl(8) serves two functions: to read and to modify system settings.

To view all readable variables:

% sysctl -a

To read a particular variable, specify its name:

% sysctl kern.maxprockern.maxproc: 1044

To set a particular variable, use the variable=value syntax:

# sysctl kern.maxfiles=5000kern.maxfiles: 2088 -> 5000

Settings of sysctl variables are usually either strings, numbers, or booleans, where a boolean is 1 for yes or 0 for no.

To automatically set some variables each time the machine boots, add them to /etc/sysctl.conf . For more information, refer to sysctl.conf(5) and sysctl.conf.

9.1. sysctl.conf

The configuration file for sysctl(8) , /etc/sysctl.conf , looks much like /etc/rc.conf . Values are set in a variable=value form. The specified values are set after the system goes into multi-user mode. Not all variables are settable in this mode.

For example, to turn off logging of fatal signal exits and prevent users from seeing processes started by other users, the following tunables can be set in /etc/sysctl.conf :

# Do not log fatal signal exits (e.g., sig 11)
kern.logsigexit=0

# Prevent users from seeing information about processes that
# are being run under another UID.
security.bsd.see_other_uids=0

9.2. sysctl8 Read-only

sysctl8 Read-only

Tom Rhodes

In some cases it may be desirable to modify read-only sysctl(8) values, which will require a reboot of the system.

For instance, on some laptop models the cardbus(4) device will not probe memory ranges and will fail with errors similar to:

cbb0: Could not map register memory
device_probe_and_attach: cbb0 attach returned 12

The fix requires the modification of a read-only sysctl(8) setting. Add hw.pci.allow_unsupported_io_range=1 to /boot/loader.conf and reboot. Now cardbus(4) should work properly.

10. Tuning Disks

The following section will discuss various tuning mechanisms and options which may be applied to disk devices. In many cases, disks with mechanical parts, such as SCSI drives, will be the bottleneck driving down the overall system performance. While a solution is to install a drive without mechanical parts, such as a solid state drive, mechanical drives are not going away anytime in the near future. When tuning disks, it is advisable to utilize the features of the iostat(8) command to test various changes to the system. This command will allow the user to obtain valuable information on system IO.

10.1. Sysctl Variables

10.1.1. vfs.vmiodirenable

The vfs.vmiodirenable sysctl(8) variable may be set to either 0 (off) or 1 (on). It is set to 1 by default. This variable controls how directories are cached by the system. Most directories are small, using just a single fragment (typically 1 K) in the file system and typically 512 bytes in the buffer cache. With this variable turned off, the buffer cache will only cache a fixed number of directories, even if the system has a huge amount of memory. When turned on, this sysctl(8) allows the buffer cache to use the VM page cache to cache the directories, making all the memory available for caching directories. However, the minimum in-core memory used to cache a directory is the physical page size (typically 4 K) rather than 512  bytes. Keeping this option enabled is recommended if the system is running any services which manipulate large numbers of files. Such services can include web caches, large mail systems, and news systems. Keeping this option on will generally not reduce performance, even with the wasted memory, but one should experiment to find out.

10.1.2. vfs.write_behind

The vfs.write_behind sysctl(8) variable defaults to 1 (on). This tells the file system to issue media writes as full clusters are collected, which typically occurs when writing large sequential files. This avoids saturating the buffer cache with dirty buffers when it would not benefit I/O performance. However, this may stall processes and under certain circumstances should be turned off.

10.1.3. vfs.hirunningspace

The vfs.hirunningspace sysctl(8) variable determines how much outstanding write I/O may be queued to disk controllers system-wide at any given instance. The default is usually sufficient, but on machines with many disks, try bumping it up to four or five megabytes. Setting too high a value which exceeds the buffer cache’s write threshold can lead to bad clustering performance. Do not set this value arbitrarily high as higher write values may add latency to reads occurring at the same time.

There are various other buffer cache and VM page cache related sysctl(8) values. Modifying these values is not recommended as the VM system does a good job of automatically tuning itself.

10.1.4. vm.swap_idle_enabled

The vm.swap_idle_enabled sysctl(8) variable is useful in large multi-user systems with many active login users and lots of idle processes. Such systems tend to generate continuous pressure on free memory reserves. Turning this feature on and tweaking the swapout hysteresis (in idle seconds) via vm.swap_idle_threshold1 and vm.swap_idle_threshold2 depresses the priority of memory pages associated with idle processes more quickly then the normal pageout algorithm. This gives a helping hand to the pageout daemon. Only turn this option on if needed, because the tradeoff is essentially pre-page memory sooner rather than later which eats more swap and disk bandwidth. In a small system this option will have a determinable effect, but in a large system that is already doing moderate paging, this option allows the VM system to stage whole processes into and out of memory easily.

10.1.5. hw.ata.wc

Turning off IDE write caching reduces write bandwidth to IDE disks, but may sometimes be necessary due to data consistency issues introduced by hard drive vendors. The problem is that some IDE drives lie about when a write completes. With IDE write caching turned on, IDE hard drives write data to disk out of order and will sometimes delay writing some blocks indefinitely when under heavy disk load. A crash or power failure may cause serious file system corruption. Check the default on the system by observing the hw.ata.wc sysctl(8) variable. If IDE write caching is turned off, one can set this read-only variable to 1 in /boot/loader.conf in order to enable it at boot time.

For more information, refer to ata(4) .

10.1.6. SCSI_DELAY (kern.cam.scsi_delay)

The SCSI_DELAY kernel configuration option may be used to reduce system boot times. The defaults are fairly high and can be responsible for 15 seconds of delay in the boot process. Reducing it to 5 seconds usually works with modern drives. The kern.cam.scsi_delay boot time tunable should be used. The tunable and kernel configuration option accept values in terms of milliseconds and _not_seconds.

10.2. Soft Updates

To fine-tune a file system, use tunefs(8) . This program has many different options. To toggle Soft Updates on and off, use:

# tunefs -n enable /filesystem
# tunefs -n disable /filesystem

A file system cannot be modified with tunefs(8) while it is mounted. A good time to enable Soft Updates is before any partitions have been mounted, in single-user mode.

Soft Updates is recommended for UFS file systems as it drastically improves meta-data performance, mainly file creation and deletion, through the use of a memory cache. There are two downsides to Soft Updates to be aware of. First, Soft Updates guarantee file system consistency in the case of a crash, but could easily be several seconds or even a minute behind updating the physical disk. If the system crashes, unwritten data may be lost. Secondly, Soft Updates delay the freeing of file system blocks. If the root file system is almost full, performing a major update, such as make installworld, can cause the file system to run out of space and the update to fail.

10.2.1. More Details About Soft Updates

Meta-data updates are updates to non-content data like inodes or directories. There are two traditional approaches to writing a file system’s meta-data back to disk.

Historically, the default behavior was to write out meta-data updates synchronously. If a directory changed, the system waited until the change was actually written to disk. The file data buffers (file contents) were passed through the buffer cache and backed up to disk later on asynchronously. The advantage of this implementation is that it operates safely. If there is a failure during an update, meta-data is always in a consistent state. A file is either created completely or not at all. If the data blocks of a file did not find their way out of the buffer cache onto the disk by the time of the crash, fsck(8) recognizes this and repairs the file system by setting the file length to 0. Additionally, the implementation is clear and simple. The disadvantage is that meta-data changes are slow. For example, rm -r touches all the files in a directory sequentially, but each directory change will be written synchronously to the disk. This includes updates to the directory itself, to the inode table, and possibly to indirect blocks allocated by the file. Similar considerations apply for unrolling large hierarchies using tar -x.

The second approach is to use asynchronous meta-data updates. This is the default for a UFS file system mounted with mount -o async. Since all meta-data updates are also passed through the buffer cache, they will be intermixed with the updates of the file content data. The advantage of this implementation is there is no need to wait until each meta-data update has been written to disk, so all operations which cause huge amounts of meta-data updates work much faster than in the synchronous case. This implementation is still clear and simple, so there is a low risk for bugs creeping into the code. The disadvantage is that there is no guarantee for a consistent state of the file system. If there is a failure during an operation that updated large amounts of meta-data, like a power failure or someone pressing the reset button, the file system will be left in an unpredictable state. There is no opportunity to examine the state of the file system when the system comes up again as the data blocks of a file could already have been written to the disk while the updates of the inode table or the associated directory were not. It is impossible to implement a fsck(8) which is able to clean up the resulting chaos because the necessary information is not available on the disk. If the file system has been damaged beyond repair, the only choice is to reformat it and restore from backup.

The usual solution for this problem is to implement dirty region logging, which is also referred to as journaling. Meta-data updates are still written synchronously, but only into a small region of the disk. Later on, they are moved to their proper location. Because the logging area is a small, contiguous region on the disk, there are no long distances for the disk heads to move, even during heavy operations, so these operations are quicker than synchronous updates. Additionally, the complexity of the implementation is limited, so the risk of bugs being present is low. A disadvantage is that all meta-data is written twice, once into the logging region and once to the proper location, so performance “pessimization” might result. On the other hand, in case of a crash, all pending meta-data operations can be either quickly rolled back or completed from the logging area after the system comes up again, resulting in a fast file system startup.

Kirk McKusick, the developer of Berkeley FFS, solved this problem with Soft Updates. All pending meta-data updates are kept in memory and written out to disk in a sorted sequence (“ordered meta-data updates” ). This has the effect that, in case of heavy meta-data operations, later updates to an item “catch” the earlier ones which are still in memory and have not already been written to disk. All operations are generally performed in memory before the update is written to disk and the data blocks are sorted according to their position so that they will not be on the disk ahead of their meta-data. If the system crashes, an implicit “log rewind” causes all operations which were not written to the disk appear as if they never happened. A consistent file system state is maintained that appears to be the one of 30 to 60 seconds earlier. The algorithm used guarantees that all resources in use are marked as such in their blocks and inodes. After a crash, the only resource allocation error that occurs is that resources are marked as “used” which are actually “free” . fsck(8) recognizes this situation, and frees the resources that are no longer used. It is safe to ignore the dirty state of the file system after a crash by forcibly mounting it with mount -f. In order to free resources that may be unused, fsck(8) needs to be run at a later time. This is the idea behind the background fsck(8): at system startup time, only a snapshot of the file system is recorded and fsck(8) is run afterwards. All file systems can then be mounted “dirty” , so the system startup proceeds in multi-user mode. Then, background fsck(8) is scheduled for all file systems where this is required, to free resources that may be unused. File systems that do not use Soft Updates still need the usual foreground fsck(8) .

The advantage is that meta-data operations are nearly as fast as asynchronous updates and are faster than logging, which has to write the meta-data twice. The disadvantages are the complexity of the code, a higher memory consumption, and some idiosyncrasies. After a crash, the state of the file system appears to be somewhat “older” . In situations where the standard synchronous approach would have caused some zero-length files to remain after the fsck(8) , these files do not exist at all with Soft Updates because neither the meta-data nor the file contents have been written to disk. Disk space is not released until the updates have been written to disk, which may take place some time after running rm(1) . This may cause problems when installing large amounts of data on a file system that does not have enough free space to hold all the files twice.

11. Tuning Kernel Limits

11.1. File/Process Limits

11.1.1. kern.maxfiles

The kern.maxfiles sysctl(8) variable can be raised or lowered based upon system requirements. This variable indicates the maximum number of file descriptors on the system. When the file descriptor table is full, file: table is full will show up repeatedly in the system message buffer, which can be viewed using dmesg(8) .

Each open file, socket, or fifo uses one file descriptor. A large-scale production server may easily require many thousands of file descriptors, depending on the kind and number of services running concurrently.

In older FreeBSD releases, the default value of kern.maxfiles is derived from maxusers in the kernel configuration file. kern.maxfiles grows proportionally to the value of maxusers. When compiling a custom kernel, consider setting this kernel configuration option according to the use of the system. From this number, the kernel is given most of its pre-defined limits. Even though a production machine may not have 256 concurrent users, the resources needed may be similar to a high-scale web server.

The read-only sysctl(8) variable kern.maxusers is automatically sized at boot based on the amount of memory available in the system, and may be determined at run-time by inspecting the value of kern.maxusers. Some systems require larger or smaller values of kern.maxusers and values of 64, 128, and 256 are not uncommon. Going above 256 is not recommended unless a huge number of file descriptors is needed. Many of the tunable values set to their defaults by kern.maxusers may be individually overridden at boot-time or run-time in /boot/loader.conf . Refer to loader.conf(5) and /boot/defaults/loader.conf for more details and some hints.

In older releases, the system will auto-tune maxusers if it is set to 0. [1] . When setting this option, set maxusers to at least 4, especially if the system runs Xorg or is used to compile software. The most important table set by maxusers is the maximum number of processes, which is set to 20 + 16 * maxusers. If maxusers is set to 1, there can only be 36 simultaneous processes, including the 18 or so that the system starts up at boot time and the 15 or so used by Xorg. Even a simple task like reading a manual page will start up nine processes to filter, decompress, and view it. Setting maxusers to 64 allows up to 1044 simultaneous processes, which should be enough for nearly all uses. If, however, the error is displayed when trying to start another program, or a server is running with a large number of simultaneous users, increase the number and rebuild.

maxusers does not limit the number of users which can log into the machine. It instead sets various table sizes to reasonable values considering the maximum number of users on the system and how many processes each user will be running.

11.1.2. kern.ipc.soacceptqueue

The kern.ipc.soacceptqueue sysctl(8) variable limits the size of the listen queue for accepting new TCP connections. The default value of 128 is typically too low for robust handling of new connections on a heavily loaded web server. For such environments, it is recommended to increase this value to 1024 or higher. A service such as sendmail(8) , or Apache may itself limit the listen queue size, but will often have a directive in its configuration file to adjust the queue size. Large listen queues do a better job of avoiding Denial of Service (DoS) attacks.

11.2. Network Limits

The NMBCLUSTERS kernel configuration option dictates the amount of network Mbufs available to the system. A heavily-trafficked server with a low number of Mbufs will hinder performance. Each cluster represents approximately 2 K of memory, so a value of 1024 represents 2 megabytes of kernel memory reserved for network buffers. A simple calculation can be done to figure out how many are needed. A web server which maxes out at 1000 simultaneous connections where each connection uses a 6 K receive and 16 K send buffer, requires approximately 32 MB worth of network buffers to cover the web server. A good rule of thumb is to multiply by 2, so 2x32 MB / 2 KB = 64 MB / 2 kB = 32768. Values between 4096 and 32768 are recommended for machines with greater amounts of memory. Never specify an arbitrarily high value for this parameter as it could lead to a boot time crash. To observe network cluster usage, use -m with netstat(1) .

The kern.ipc.nmbclusters loader tunable should be used to tune this at boot time. Only older versions of FreeBSD will require the use of the NMBCLUSTERS kernel config(8) option.

For busy servers that make extensive use of the sendfile(2) system call, it may be necessary to increase the number of sendfile(2) buffers via the NSFBUFS kernel configuration option or by setting its value in /boot/loader.conf (see loader(8) for details). A common indicator that this parameter needs to be adjusted is when processes are seen in the sfbufa state. The sysctl(8) variable kern.ipc.nsfbufs is read-only. This parameter nominally scales with kern.maxusers, however it may be necessary to tune accordingly.

Even though a socket has been marked as non-blocking, calling sendfile(2) on the non-blocking socket may result in the sendfile(2) call blocking until enough struct sf_buf's are made available.

11.2.1. net.inet.ip.portrange.*

The net.inet.ip.portrange.* sysctl(8) variables control the port number ranges automatically bound to TCP and UDP sockets. There are three ranges: a low range, a default range, and a high range. Most network programs use the default range which is controlled by net.inet.ip.portrange.first and net.inet.ip.portrange.last, which default to 1024 and 5000, respectively. Bound port ranges are used for outgoing connections and it is possible to run the system out of ports under certain circumstances. This most commonly occurs when running a heavily loaded web proxy. The port range is not an issue when running a server which handles mainly incoming connections, such as a web server, or has a limited number of outgoing connections, such as a mail relay. For situations where there is a shortage of ports, it is recommended to increase net.inet.ip.portrange.last modestly. A value of 10000, 20000 or 30000 may be reasonable. Consider firewall effects when changing the port range. Some firewalls may block large ranges of ports, usually low-numbered ports, and expect systems to use higher ranges of ports for outgoing connections. For this reason, it is not recommended that the value of net.inet.ip.portrange.first be lowered.

11.2.2. TCP Bandwidth Delay Product

` TCP` bandwidth delay product limiting can be enabled by setting the net.inet.tcp.inflight.enable sysctl(8) variable to 1. This instructs the system to attempt to calculate the bandwidth delay product for each connection and limit the amount of data queued to the network to just the amount required to maintain optimum throughput.

This feature is useful when serving data over modems, Gigabit Ethernet, high speed WAN links, or any other link with a high bandwidth delay product, especially when also using window scaling or when a large send window has been configured. When enabling this option, also set net.inet.tcp.inflight.debug to 0 to disable debugging. For production use, setting net.inet.tcp.inflight.min to at least 6144 may be beneficial. Setting high minimums may effectively disable bandwidth limiting, depending on the link. The limiting feature reduces the amount of data built up in intermediate route and switch packet queues and reduces the amount of data built up in the local host’s interface queue. With fewer queued packets, interactive connections, especially over slow modems, will operate with lower Round Trip Times. This feature only effects server side data transmission such as uploading. It has no effect on data reception or downloading.

Adjusting net.inet.tcp.inflight.stab is not recommended. This parameter defaults to 20, representing 2 maximal packets added to the bandwidth delay product window calculation. The additional window is required to stabilize the algorithm and improve responsiveness to changing conditions, but it can also result in higher ping(8) times over slow links, though still much lower than without the inflight algorithm. In such cases, try reducing this parameter to 15, 10, or 5 and reducing net.inet.tcp.inflight.min to a value such as 3500 to get the desired effect. Reducing these parameters should be done as a last resort only.

11.3. Virtual Memory

11.3.1. kern.maxvnodes

A vnode is the internal representation of a file or directory. Increasing the number of vnodes available to the operating system reduces disk I/O. Normally, this is handled by the operating system and does not need to be changed. In some cases where disk I/O is a bottleneck and the system is running out of vnodes, this setting needs to be increased. The amount of inactive and free RAM will need to be taken into account.

To see the current number of vnodes in use:

# sysctl vfs.numvnodesvfs.numvnodes: 91349

To see the maximum vnodes:

# sysctl kern.maxvnodeskern.maxvnodes: 100000

If the current vnode usage is near the maximum, try increasing kern.maxvnodes by a value of 1000. Keep an eye on the number of vfs.numvnodes. If it climbs up to the maximum again, kern.maxvnodes will need to be increased further. Otherwise, a shift in memory usage as reported by top(1) should be visible and more memory should be active.

12. Adding Swap Space

Sometimes a system requires more swap space. This section describes two methods to increase swap space: adding swap to an existing partition or new hard drive, and creating a swap file on an existing partition.

For information on how to encrypt swap space, which options exist, and why it should be done, refer to [_swap_encrypting].

12.1. Swap on a New Hard Drive or Existing Partition

Adding a new hard drive for swap gives better performance than using a partition on an existing drive. Setting up partitions and hard drives is explained in [_disks_adding] while [_configtuning_initial] discusses partition layouts and swap partition size considerations.

Use swapon to add a swap partition to the system. For example:

# swapon /dev/ada1s1b

It is possible to use any partition not currently mounted, even if it already contains data. Using swapon on a partition that contains data will overwrite and destroy that data. Make sure that the partition to be added as swap is really the intended partition before running swapon.

To automatically add this swap partition on boot, add an entry to /etc/fstab :

/dev/ada1s1b	none	swap	sw	0	0

See fstab(5) for an explanation of the entries in /etc/fstab . More information about swapon can be found in swapon(8) .

12.2. Creating a Swap File

These examples create a 64M swap file called /usr/swap0 instead of using a partition.

Using swap files requires that the module needed by md(4) has either been built into the kernel or has been loaded before swap is enabled. See [_kernelconfig] for information about building a custom kernel.

Example 2. Creating a Swap File onFreeBSD 10.X and Later
  1. Create the swap file:

    # dd if=/dev/zero of=/usr/swap0 bs=1m count=64
  2. Set the proper permissions on the new file:

    # chmod 0600 /usr/swap0
  3. Inform the system about the swap file by adding a line to /etc/fstab :

    md99	none	swap	sw,file=/usr/swap0,late	0	0

    The md(4) device md99 is used, leaving lower device numbers available for interactive use.

  4. Swap space will be added on system startup. To add swap space immediately, use swapon(8) :

    # swapon -aL
Example 3. Creating a Swap File onFreeBSD 9.X and Earlier
  1. Create the swap file, /usr/swap0 :

    # dd if=/dev/zero of=/usr/swap0 bs=1m count=64
  2. Set the proper permissions on /usr/swap0 :

    # chmod 0600 /usr/swap0
  3. Enable the swap file in /etc/rc.conf :

    swapfile="/usr/swap0"   # Set to name of swap file
  4. Swap space will be added on system startup. To enable the swap file immediately, specify a free memory device. Refer to [_disks_virtual] for more information about memory devices.

    # mdconfig -a -t vnode -f /usr/swap0 -u 0 && swapon /dev/md0

13. Power and Resource Management

Power and Resource Management

Hiten Pandya; Tom Rhodes

It is important to utilize hardware resources in an efficient manner. Power and resource management allows the operating system to monitor system limits and to possibly provide an alert if the system temperature increases unexpectedly. An early specification for providing power management was the Advanced Power Management (APM) facility. APM controls the power usage of a system based on its activity. However, it was difficult and inflexible for operating systems to manage the power usage and thermal properties of a system. The hardware was managed by the BIOS and the user had limited configurability and visibility into the power management settings. The APMBIOS is supplied by the vendor and is specific to the hardware platform. An APM driver in the operating system mediates access to the APM Software Interface, which allows management of power levels.

There are four major problems in APM. First, power management is done by the vendor-specific BIOS, separate from the operating system. For example, the user can set idle-time values for a hard drive in the APMBIOS so that, when exceeded, the BIOS spins down the hard drive without the consent of the operating system. Second, the APM logic is embedded in the BIOS, and it operates outside the scope of the operating system. This means that users can only fix problems in the APMBIOS by flashing a new one into the ROM, which is a dangerous procedure with the potential to leave the system in an unrecoverable state if it fails. Third, APM is a vendor-specific technology, meaning that there is a lot of duplication of efforts and bugs found in one vendor’s BIOS may not be solved in others. Lastly, the APMBIOS did not have enough room to implement a sophisticated power policy or one that can adapt well to the purpose of the machine.

The Plug and Play BIOS (PNPBIOS) was unreliable in many situations. PNPBIOS is 16-bit technology, so the operating system has to use 16-bit emulation in order to interface with PNPBIOS methods. FreeBSD provides an APM driver as APM should still be used for systems manufactured at or before the year 2000. The driver is documented in apm(4) .

The successor to APM is the Advanced Configuration and Power Interface (ACPI). ACPI is a standard written by an alliance of vendors to provide an interface for hardware resources and power management. It is a key element in Operating System-directed configuration and Power Management as it provides more control and flexibility to the operating system.

This chapter demonstrates how to configure ACPI on FreeBSD. It then offers some tips on how to debug ACPI and how to submit a problem report containing debugging information so that developers can diagnosis and fix ACPI issues.

.1. Configuring ACPI

In FreeBSD the acpi(4) driver is loaded by default at system boot and should not be compiled into the kernel. This driver cannot be unloaded after boot because the system bus uses it for various hardware interactions. However, if the system is experiencing problems, ACPI can be disabled altogether by rebooting after setting hint.acpi.0.disabled="1" in /boot/loader.conf or by setting this variable at the loader prompt, as described in [_boot_loader].

ACPI and APM cannot coexist and should be used separately. The last one to load will terminate if the driver notices the other is running.

ACPI can be used to put the system into a sleep mode with acpiconf, the -s flag, and a number from 1 to 5. Most users only need 1 (quick suspend to RAM) or 3 (suspend to RAM). Option 5 performs a soft-off which is the same as running halt -p.

Other options are available using sysctl. Refer to acpi(4) and acpiconf(8) for more information.

.2. Common Problems

ACPI is present in all modern computers that conform to the ia32 (x86), ia64 (Itanium), and amd64 (AMD) architectures. The full standard has many features including CPU performance management, power planes control, thermal zones, various battery systems, embedded controllers, and bus enumeration. Most systems implement less than the full standard. For instance, a desktop system usually only implements bus enumeration while a laptop might have cooling and battery management support as well. Laptops also have suspend and resume, with their own associated complexity.

An ACPI-compliant system has various components. The BIOS and chipset vendors provide various fixed tables, such as FADT, in memory that specify things like the APIC map (used for SMP), config registers, and simple configuration values. Additionally, a bytecode table, the Differentiated System Description Table DSDT, specifies a tree-like name space of devices and methods.

The ACPI driver must parse the fixed tables, implement an interpreter for the bytecode, and modify device drivers and the kernel to accept information from the ACPI subsystem. For FreeBSD, Intel™ has provided an interpreter (ACPI-CA) that is shared with Linux™ and NetBSD. The path to the ACPI-CA source code is src/sys/contrib/dev/acpica . The glue code that allows ACPI-CA to work on FreeBSD is in src/sys/dev/acpica/Osd . Finally, drivers that implement various ACPI devices are found in src/sys/dev/acpica .

For ACPI to work correctly, all the parts have to work correctly. Here are some common problems, in order of frequency of appearance, and some possible workarounds or fixes. If a fix does not resolve the issue, refer to Getting and Submitting Debugging Info for instructions on how to submit a bug report.

.2.1. Mouse Issues

In some cases, resuming from a suspend operation will cause the mouse to fail. A known work around is to add hint.psm.0.flags="0x3000" to /boot/loader.conf .

.2.2. Suspend/Resume

ACPI has three suspend to RAM (STR) states, S1-S3, and one suspend to disk state (STD), called S4. STD can be implemented in two separate ways. The S4BIOS is a BIOS-assisted suspend to disk and S4OS is implemented entirely by the operating system. The normal state the system is in when plugged in but not powered up is “soft off” (S5).

Use sysctl hw.acpi to check for the suspend-related items. These example results are from a Thinkpad:

hw.acpi.supported_sleep_state: S3 S4 S5
hw.acpi.s4bios: 0

Use acpiconf -s to test S3, S4, and S5. An s4bios of one (1) indicates S4BIOS support instead of S4 operating system support.

When testing suspend/resume, start with S1, if supported. This state is most likely to work since it does not require much driver support. No one has implemented S2, which is similar to S1. Next, try S3. This is the deepest STR state and requires a lot of driver support to properly reinitialize the hardware.

A common problem with suspend/resume is that many device drivers do not save, restore, or reinitialize their firmware, registers, or device memory properly. As a first attempt at debugging the problem, try:

# sysctl debug.bootverbose=1
# sysctl debug.acpi.suspend_bounce=1
# acpiconf -s 3

This test emulates the suspend/resume cycle of all device drivers without actually going into S3 state. In some cases, problems such as losing firmware state, device watchdog time out, and retrying forever, can be captured with this method. Note that the system will not really enter S3 state, which means devices may not lose power, and many will work fine even if suspend/resume methods are totally missing, unlike real S3 state.

Harder cases require additional hardware, such as a serial port and cable for debugging through a serial console, a Firewire port and cable for using dcons(4) , and kernel debugging skills.

To help isolate the problem, unload as many drivers as possible. If it works, narrow down which driver is the problem by loading drivers until it fails again. Typically, binary drivers like nvidia.ko , display drivers, and USB will have the most problems while Ethernet interfaces usually work fine. If drivers can be properly loaded and unloaded, automate this by putting the appropriate commands in /etc/rc.suspend and /etc/rc.resume . Try setting hw.acpi.reset_video to 1 if the display is messed up after resume. Try setting longer or shorter values for hw.acpi.sleep_delay to see if that helps.

Try loading a recent Linux™ distribution to see if suspend/resume works on the same hardware. If it works on Linux™ , it is likely a FreeBSD driver problem. Narrowing down which driver causes the problem will assist developers in fixing the problem. Since the ACPI maintainers rarely maintain other drivers, such as sound or ATA, any driver problems should also be posted to the link:freebsd-current list and mailed to the driver maintainer. Advanced users can include debugging printf(3) s in a problematic driver to track down where in its resume function it hangs.

Finally, try disabling ACPI and enabling APM instead. If suspend/resume works with APM, stick with APM, especially on older hardware (pre-2000). It took vendors a while to get ACPI support correct and older hardware is more likely to have BIOS problems with ACPI.

.2.3. System Hangs

Most system hangs are a result of lost interrupts or an interrupt storm. Chipsets may have problems based on boot, how the BIOS configures interrupts before correctness of the APIC (MADT) table, and routing of the System Control Interrupt (SCI).

Interrupt storms can be distinguished from lost interrupts by checking the output of vmstat -i and looking at the line that has acpi0. If the counter is increasing at more than a couple per second, there is an interrupt storm. If the system appears hung, try breaking to DDB (CTRL+ALT+ESC on console) and type show interrupts.

When dealing with interrupt problems, try disabling APIC support with hint.apic.0.disabled="1" in /boot/loader.conf .

.2.4. Panics

Panics are relatively rare for ACPI and are the top priority to be fixed. The first step is to isolate the steps to reproduce the panic, if possible, and get a backtrace. Follow the advice for enabling options DDB and setting up a serial console in [_serialconsole_ddb] or setting up a dump partition. To get a backtrace in DDB, use tr. When handwriting the backtrace, get at least the last five and the top five lines in the trace.

Then, try to isolate the problem by booting with ACPI disabled. If that works, isolate the ACPI subsystem by using various values of debug.acpi.disable. See acpi(4) for some examples.

.2.5. System Powers Up After Suspend or Shutdown

First, try setting hw.acpi.disable_on_poweroff="0" in /boot/loader.conf . This keeps ACPI from disabling various events during the shutdown process. Some systems need this value set to 1 (the default) for the same reason. This usually fixes the problem of a system powering up spontaneously after a suspend or poweroff.

.2.6. BIOS Contains Buggy Bytecode

Some BIOS vendors provide incorrect or buggy bytecode. This is usually manifested by kernel console messages like this:

ACPI-1287: *** Error: Method execution failed [\\_SB_.PCI0.LPC0.FIGD._STA] \\
(Node 0xc3f6d160), AE_NOT_FOUND

Often, these problems may be resolved by updating the BIOS to the latest revision. Most console messages are harmless, but if there are other problems, like the battery status is not working, these messages are a good place to start looking for problems.

.3. Overriding the Default AML

The BIOS bytecode, known as ACPI Machine Language (AML), is compiled from a source language called ACPI Source Language (ASL). The AML is found in the table known as the Differentiated System Description Table (DSDT).

The goal of FreeBSD is for everyone to have working ACPI without any user intervention. Workarounds are still being developed for common mistakes made by BIOS vendors. The Microsoft™ interpreter (acpi.sys and acpiec.sys ) does not strictly check for adherence to the standard, and thus many BIOS vendors who only test ACPI under Windows™ never fix their ASL. FreeBSD developers continue to identify and document which non-standard behavior is allowed by Microsoft™ 's interpreter and replicate it so that FreeBSD can work without forcing users to fix the ASL.

To help identify buggy behavior and possibly fix it manually, a copy can be made of the system’s ASL. To copy the system’s ASL to a specified file name, use acpidump with -t, to show the contents of the fixed tables, and -d, to disassemble the AML:

# acpidump -td > my.asl

Some AML versions assume the user is running Windows™ . To override this, set hw.acpi.osname="Windows 2009" in /boot/loader.conf , using the most recent Windows™ version listed in the ASL.

Other workarounds may require my.asl to be customized. If this file is edited, compile the new ASL using the following command. Warnings can usually be ignored, but errors are bugs that will usually prevent ACPI from working correctly.

# iasl -f my.asl

Including -f forces creation of the AML, even if there are errors during compilation. Some errors, such as missing return statements, are automatically worked around by the FreeBSD interpreter.

The default output filename for iasl is DSDT.aml . Load this file instead of the BIOS's buggy copy, which is still present in flash memory, by editing /boot/loader.conf as follows:

acpi_dsdt_load="YES"
acpi_dsdt_name="/boot/DSDT.aml"

Be sure to copy DSDT.aml to /boot , then reboot the system. If this fixes the problem, send a diff(1) of the old and new ASL to link:freebsd-acpi so that developers can work around the buggy behavior in acpica .

.4. Getting and Submitting Debugging Info

Getting and Submitting Debugging Info

Nate Lawson; Peter Schultz; Tom Rhodes

The ACPI driver has a flexible debugging facility. A set of subsystems and the level of verbosity can be specified. The subsystems to debug are specified as layers and are broken down into components (ACPI_ALL_COMPONENTS) and ACPI hardware support (ACPI_ALL_DRIVERS). The verbosity of debugging output is specified as the level and ranges from just report errors (ACPI_LV_ERROR) to everything (ACPI_LV_VERBOSE). The level is a bitmask so multiple options can be set at once, separated by spaces. In practice, a serial console should be used to log the output so it is not lost as the console message buffer flushes. A full list of the individual layers and levels is found in acpi(4) .

Debugging output is not enabled by default. To enable it, add options ACPI_DEBUG to the custom kernel configuration file if ACPI is compiled into the kernel. Add ACPI_DEBUG=1 to /etc/make.conf to enable it globally. If a module is used instead of a custom kernel, recompile just the acpi.ko module as follows:

# cd /sys/modules/acpi/acpi && make clean && make ACPI_DEBUG=1

Copy the compiled acpi.ko to /boot/kernel and add the desired level and layer to /boot/loader.conf . The entries in this example enable debug messages for all ACPI components and hardware drivers and output error messages at the least verbose level:

debug.acpi.layer="ACPI_ALL_COMPONENTS ACPI_ALL_DRIVERS"
debug.acpi.level="ACPI_LV_ERROR"

If the required information is triggered by a specific event, such as a suspend and then resume, do not modify /boot/loader.conf . Instead, use sysctl to specify the layer and level after booting and preparing the system for the specific event. The variables which can be set using sysctl are named the same as the tunables in /boot/loader.conf .

Once the debugging information is gathered, it can be sent to link:freebsd-acpi so that it can be used by the FreeBSD ACPI maintainers to identify the root cause of the problem and to develop a solution.

Before submitting debugging information to this mailing list, ensure the latest BIOS version is installed and, if available, the embedded controller firmware version.

When submitting a problem report, include the following information:

  • Description of the buggy behavior, including system type, model, and anything that causes the bug to appear. Note as accurately as possible when the bug began occurring if it is new.

  • The output of dmesg after running boot -v, including any error messages generated by the bug.

  • The dmesg output from boot -v with ACPI disabled, if disabling ACPI helps to fix the problem.

  • Output from sysctl hw.acpi. This lists which features the system offers.

  • The URL to a pasted version of the system’s ASL. Do not send the ASL directly to the list as it can be very large. Generate a copy of the ASL by running this command:

    # acpidump -dt > name-system.asl

    Substitute the login name for name and manufacturer/model for system. For example, use njl-FooCo6000.asl .

Most FreeBSD developers watch the link:FreeBSD-CURRENT mailing list, but one should submit problems to link:freebsd-acpi to be sure it is seen. Be patient when waiting for a response. If the bug is not immediately apparent, submit a bug report. When entering a PR, include the same information as requested above. This helps developers to track the problem and resolve it. Do not send a PR without emailing link:freebsd-acpi first as it is likely that the problem has been reported before.

.1. References

More information about ACPI may be found in the following locations:


1. The auto-tuning algorithm sets maxusers equal to the amount of memory in the system, with a minimum of 32, and a maximum of 384.