InterNet News How-To

Abstract This document is an attempt to plot out a deployment road map for INN 2.1. The document needs to be revised upon an upgrade.

New documentation for INN 2.3 and higher can be found here. It is not 100% complete yet but there's more there than here.

Author Elena Samsonova - World Access / Planet Internet
Date December 18, 1998
Version 1.1


Instead of a foreword

Since I only run one particular configuration and often follow plain vanilla choices, I cannot know all the intricacies of other sites; contributions are therefore welcome! All author rights will be preserved, in the sense that the author's name will appear next to his/her contribution.

Parts of the document that are not yet completed ask for a contribution!

This document is not a transcript of the manuals, and by no means a replacement for them! Although this may be a great disappointment to those who don't like to RTFM :), I'm still not going to copy and paste manuals into here. Instead, every section that talks about programs or scripts lists the relevant manual pages alongside it. As one of the greatest difficulties in installing INN is finding the right man page, this is supposed to help.

References

One of the best sources I have found so far, and the one I used most, is presented by Forrest J. Cavalier, Mib Software (http://www.mibsoftware.com/userkt).

Other nice pages???

Acknowledgements

Many thanks to the following people who sent me very useful comments:

Forrest J. Cavalier
Ragnar Lonn
Hans Lambermont
Steve Tremblett
you??

Request for input

Some of the functionality is not covered in this How-To simply because I have insufficient knowledge of it, so please contribute!! At least the following parts are not covered:


Table of Contents

  1. SYSTEM ARCHITECTURE
  2. IMPLEMENTATION GUIDE

List of Figures

Figure 1. Overall architecture of a centralized system (Xfig source: single_node_inn.fig; PostScript: single_node_inn.ps)
Figure 2. Overall architecture of a distributed system: server level (Xfig source: servers_inn_arch.fig; PostScript: servers_inn_arch.ps)
Figure 3. Overall architecture of a distributed system: process level (Xfig source: procs_inn_arch.fig; PostScript: procs_inn_arch.ps)
Figure 4. Distributed system: INN reader architecture (Xfig source: nnrpd_arch.fig; PostScript: nnrpd_arch.ps)
Figure 5. Distributed system: INN feeder architecture (Xfig source: inn_arch.fig; PostScript: inn_arch.ps)


System Architecture


Overview

INN is a package of various programs and scripts meant for different purposes, and different system and server configurations may require different programs. This section describes several of the most commonly used configurations, including those I don't run myself as far as I have received input on them.

For now, the section contains two configurations: a centralized one, and a distributed one with multiple separate readers and a single feeder. Input about other configurations is welcome.

In a centralized system, one machine runs a set of programs that handle incoming feed, outgoing feed and user connections for reading and posting. A stand-alone configuration is also possible when the server is used for internal purposes only and no incoming or outgoing feed is needed.

In a distributed system not every machine runs the same programs. INN, being a package of multiple programs working together, offers many mix-and-match possibilities.

Centralized System Architecture

Figure 1 depicts the overall architecture of a centralized system. This section is meant to shed some light on the interaction of the processes on the system; it does not explain how to get those processes to behave this way. See section Implementation Guide for further details.

The system runs an innd daemon which handles incoming feeds, manages the active and history files, as well as the article spool, and listens on port 119 and accepts user connections. For each accepted connection it spawns a child nnrpd process which handles further interaction with the user.
Manual pages: innd, nnrpd, inn.conf, nnrp.access
Each nnrpd process reads the active and history files to find article information, fetches requested articles from the spool and sends them to the user. It also accepts user postings.
Manual pages: active, history
User postings are first pulled through a filter, filter_nnrpd, which is a Perl script. It is loaded when nnrpd starts up for subsequent use. The filter may reject certain postings, in which case the user gets an error back (see note). If a posting passes through the filter, nnrpd passes it on to innd. nnrpd does not attempt to store user postings in the spool.

    Note: I am currently evaluating a patch to nnrpd which will make it possible to reject certain postings without returning an error message to the users.

innd is configured to accept incoming feed from several external peers. All the incoming articles are first pulled through a filter which is loaded at startup. One of the popular filter scripts is cleanfeed. Contributions about other filters are welcome! The filter drops rejected articles silently. It does however log relevant information.
Manual pages: incoming.conf, news, news.log, cleanfeed, control.ctl
When an article makes it through cleanfeed, innd registers it in the active and history files and stores it in the article spool. If configured, innd also sends the article to the corresponding external peer, either via a channel or via a batch.
Manual pages: newsfeeds, moderators
For peers that receive low volume feed, a news administrator can choose to use the batch method. innd then spools relevant articles to batch files (one per peer) for further processing. nntpsend is called on a regular basis from cron; it examines the batch files and spawns one innxmit process per peer, according to peer configuration. innxmit establishes a connection with the peer, transfers the articles and closes the connection when done. Note that when a peer goes down ungracefully (without closing the connection), innxmit hangs. It is possible to install a script on the feeder which checks for peers and kills hanging innxmit processes if necessary.

It is also possible to use innfeed (see below) for low volume feed. Contributions about performance comparisons are wanted!!

Manual pages: nntpsend, nntpsend.ctl, passwd.nntp, innxmit, cron
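The script for killing hanging innxmit processes is not something INN ships, so here is only a minimal sketch of how one might be written. The two-hour age limit, the function names and the reliance on the etimes keyword of the procps ps (Linux) are all assumptions of this sketch; a real script would probably also check whether the peer is reachable before killing anything.

```python
import os
import signal
import subprocess

MAX_AGE_SECONDS = 2 * 60 * 60  # assumed limit: two hours per innxmit run


def stale_innxmit_pids(ps_output, max_age=MAX_AGE_SECONDS):
    """Return PIDs of innxmit processes that are older than max_age seconds.

    ps_output holds one process per line: PID, elapsed seconds, and the
    command line (the format requested from ps in kill_stale_innxmit).
    """
    stale = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not fields[0].isdigit() or not fields[1].isdigit():
            continue  # skip empty or malformed lines
        pid, age, command = int(fields[0]), int(fields[1]), fields[2]
        program = os.path.basename(command.split()[0])
        if program == "innxmit" and age > max_age:
            stale.append(pid)
    return stale


def kill_stale_innxmit():
    """List running processes and send SIGTERM to hanging innxmit ones."""
    # "etimes" (elapsed time in seconds) is specific to procps; other
    # systems may need a different ps format.
    listing = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "etimes=", "-o", "args="],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid in stale_innxmit_pids(listing):
        os.kill(pid, signal.SIGTERM)
```

Calling kill_stale_innxmit() from cron, as the news user, a little more often than nntpsend runs would be one way to use it.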
For peers that receive high volume feed, as well as for peers that receive identical feed, a news administrator can choose to use the channel method. innd then spawns innfeed at startup and opens a channel to it. Every time innd finds an article to be fed to the peers, it sends it to the innfeed channel. innfeed is configured to feed multiple peers with the same articles from the channel. It manages connections to the peers and writes backlogs in case a peer is unavailable or too slow. innfeed writes one backlog file per peer. The backlog is truncated to a specified length in order to prevent disk space overflow. When this happens, the peer is said to miss articles. innfeed does not process backlogs; a separate program (e.g. innxmit) should be called to do that afterwards.
Manual pages: innfeed, innfeed.conf
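To make the effect of a truncated backlog concrete, here is a small model in Python. It is not innfeed code: the class and function names are invented for illustration, and the choice to drop the oldest entries when the limit is reached is an assumption of the sketch.

```python
from collections import deque


class PeerBacklog:
    """Backlog of article references for one peer, capped at max_entries."""

    def __init__(self, max_entries):
        # A deque with maxlen silently discards its oldest entry when full.
        self.entries = deque(maxlen=max_entries)
        self.missed = 0  # articles pushed out of the backlog by truncation

    def defer(self, article):
        """Remember an article that could not be sent right now."""
        if len(self.entries) == self.entries.maxlen:
            self.missed += 1
        self.entries.append(article)


def feed(article, peers_up, backlogs):
    """Offer one article to every peer and return the peers that got it.

    peers_up maps a peer name to True if that peer is currently reachable;
    unreachable peers get the article written to their own backlog.
    """
    delivered = []
    for peer, up in peers_up.items():
        if up:
            delivered.append(peer)
        else:
            backlogs[peer].defer(article)
    return delivered
```

With a limit of two entries and a peer that stays down for four articles, the backlog ends up holding the last two articles and the missed counter shows two, which is what "the peer misses articles" means in practice.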
news.daily is run daily for article expiration, log file rotation and reporting purposes. For article expiration news.daily spawns expire which processes the history database purging entries for articles to be expired. It produces a list of articles to be removed from the spool, and renumbers the active file to reflect changes. expire calls fastrm to actually remove the articles on the expire list from the spool.
Manual pages: news.daily, expire, expire.ctl, fastrm
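The division of labour between expire and fastrm can be pictured with the following sketch. It is a simplified model, not the real programs: the tuple layout of the history entries, the policy format and the rule that the last matching pattern wins are assumptions made for the example.

```python
import fnmatch

DAY = 24 * 60 * 60  # seconds


def days_to_keep(group, policy):
    """Return the retention in days for a group, or None if nothing matches.

    policy is a list of (group_pattern, days); the last match wins.
    """
    keep = None
    for pattern, days in policy:
        if fnmatch.fnmatchcase(group, pattern):
            keep = days
    return keep


def expire(history, policy, now):
    """Split history into the entries to keep and the spool paths to remove.

    history is a list of (message_id, group, arrival_time, spool_path)
    tuples; now and arrival_time are seconds since the epoch.
    """
    kept, remove_list = [], []
    for entry in history:
        message_id, group, arrived, path = entry
        days = days_to_keep(group, policy)
        if days is not None and now - arrived > days * DAY:
            remove_list.append(path)  # handed to the removal step afterwards
        else:
            kept.append(entry)
    return kept, remove_list
```

The first return value plays the role of the purged history, the second that of the list which is passed on for removal from the spool.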
For log rotation and reporting purposes, news.daily calls scanlogs which, like the one on the readers in a distributed system, rotates the log files and calls innreport to process them, create a report and mail it to the news administrator.
Manual pages: scanlogs, innreport
There is a separate program that maintains innd, called ctlinnd, and another special program that watches over innd, called innwatch. News group maintenance is also done with ctlinnd. See Implementation Guide for further details.
Manual pages: control.ctl, ctlinnd, innwatch, innwatch.ctl

Distributed System Architecture

Server Level

Figure 2 depicts the overall architecture on server level. The functions of the readers are all identical, so there may be as many of them as necessary to cope with the load, which provides for horizontal scaling.

The readers accept user connections, read articles from the spool and deliver them to the users, and accept user postings and forward them to the feeder. The readers do not write either to the spool or to the database files (in ~news/db).

The feeder accepts incoming feeds from external peers and user postings from the readers, writes them to the spool, and sends them out over the Internet to the external peers. Note that the feeder replicates external newsfeed.

The newsstore is merely a filer which hosts shared data.

Because of this functional split, the readers and the feeder are called the frontend, and the newsstore is called the backend.

Process level

Figure 3 depicts the overall architecture on process level. The figure shows only one reader because all the readers have identical architecture.

The readers run nnrpd which handles user connections and spawns one process per user. It reads article information from the active and history files and the articles from the spool, and delivers them to the users. It accepts user postings and stores them in a batch. rnews is run periodically; it reads user postings from the batch and sends them to the feeder for propagation. The readers do not store user postings in the spool, as they don't register them in the database.

The feeder runs innd which handles newsfeeds. It accepts incoming newsfeeds from external peers and user postings from the readers, stores them in the spool and updates the active and history files accordingly. It also propagates newsfeed to external peers and sends out user postings. The feeder runs expire daily to purge old articles from the spool.

All frontend machines also run innreport daily which scans the log files and creates a daily report which is then mailed to the news administrator.

Architectural Details

This section is meant to shed some light on the interaction of the processes on the reader and feeder systems; it does not explain how to get those processes to behave this way. See section Implementation Guide for further details.

Figure 4 depicts INN architecture on a reader.
The system runs an nnrpd daemon (started up with the -D switch), which listens on port 119 and accepts user connections. For each accepted connection it spawns a child nnrpd process which handles further interaction with the user.
Manual pages: nnrpd, inn.conf, nnrp.access, moderators
Alternatively, nnrpd could be started by inetd from /etc/inetd.conf and /etc/services by specifying it for port 119. This ensures that the mother daemon will never die since there's no mother daemon in this case. However, if inetd dies, you're still in trouble.

This approach is equivalent to running a mother daemon nnrpd -D because the program simply forks a new process for each incoming user.

Manual pages: inetd, /etc/inetd.conf, /etc/services
Each nnrpd process reads the active and history files to find article information, fetches requested articles from the spool and sends them to the user. It also accepts user postings.
Manual pages: active, history
User postings are first pulled through a filter, filter_nnrpd, which is a Perl script. It is loaded when nnrpd starts up for subsequent use. The filter may reject certain postings, in which case the user gets an error back (see note).

    Note: I am currently evaluating a patch to nnrpd which will make it possible to reject certain postings without returning an error message to the users.

If a posting passes through the filter, there are two configurations possible: either nnrpd immediately connects to the feeder and forwards the posting, or nnrpd stores it in a batch to be sent to the feeder. In either case however nnrpd does not attempt to store user postings in the spool.

The first option has the following properties:

  • postings get sent out immediately without any delay
  • users get notified if the postings get rejected by the feeder for some reason (see note)
    Note: you may not always want to notify your users that their spam has been dropped, as it would present a perfect way to find a work-around for your anti-spam filter.
  • nnrpd does not return until the article is transferred to the feeder, or an error returned, which means that it will generally take longer for the user compared with the second option
  • users cannot post when the feeder is down or busy expiring or renumbering, or if it is throttled or overloaded

The second option has the following properties:

  • postings get spooled into a queue and get sent to the feeder as frequently as rnews is configured to do it (see below)
  • users get notified if the postings get rejected by nnrpd but do not get notified if they get rejected by the feeder (see note)
  • nnrpd returns as soon as the article gets spooled, which is very quickly
  • unavailability of the feeder does not impair the users' ability to post
When the second option is used, rnews is run on a regular basis from cron to send user postings to the feeder. It processes the batch created by nnrpd and attempts to make a connection to the feeder. If the feeder is temporarily down or does not accept connections for some other reason, rnews leaves the articles in the batch. Next time it is started, it will try again.
Manual pages: rnews, cron
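The trade-off between the two options can be illustrated with a toy model. Nothing below is INN code: Feeder, post_direct, post_spooled and flush_queue are names invented for this sketch, and the feeder is reduced to an object that is either up or down.

```python
class FeederDown(Exception):
    """Raised when the feeder cannot accept a connection."""


class Feeder:
    """Stand-in for the feeder: accepts articles only while it is up."""

    def __init__(self):
        self.up = True
        self.accepted = []

    def offer(self, article):
        if not self.up:
            raise FeederDown("feeder unavailable")
        self.accepted.append(article)


def post_direct(article, feeder):
    """First option: forward at once; the user sees any feeder failure."""
    try:
        feeder.offer(article)
    except FeederDown as err:
        return "error: %s" % err
    return "ok"


def post_spooled(article, pending):
    """Second option: only queue the article; the reply is always quick."""
    pending.append(article)
    return "ok"


def flush_queue(pending, feeder):
    """Periodic run in the style of rnews: send what it can, keep the rest."""
    while pending:
        try:
            feeder.offer(pending[0])
        except FeederDown:
            break  # leave the remaining articles for the next run
        pending.pop(0)
```

While the feeder is down, post_direct reports an error to the user, whereas post_spooled still answers "ok" and the article simply waits in the queue until a later flush_queue run succeeds.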
For log file rotation and reporting purposes, news.daily is run daily. news.daily on the readers does not run expire. It spawns scanlogs which rotates the logs and calls innreport which analyses them, creates a report and mails it to the news administrator.
Manual pages: news.daily, scanlogs, innreport, innreport.conf
Figure 5 depicts INN architecture on the feeder.
The system runs an innd daemon which handles incoming feeds and manages the active and history files, as well as the article spool.
Manual pages: innd, active, history
innd is configured to accept incoming feed from several external peers and from the readers. Note that the feeder does not see any difference between external feed and user postings from the readers. All the incoming articles are first pulled through a filter which is loaded at startup. One of the popular filter scripts is cleanfeed. Contributions about other filters are welcome! The filter drops rejected articles silently, as there is no user to issue the error to. It does however log relevant information.
Manual pages: incoming.conf, news, news.log, cleanfeed, control.ctl
When an article makes it through cleanfeed, innd registers it in the active and history files and stores it in the article spool. If configured, innd also sends the article to the corresponding external peer, either via a channel or via a batch.
Manual pages: newsfeeds, moderators
For peers that receive low volume feed, innd uses the batch method. It therefore spools relevant articles to batch files (one per peer) for further processing. nntpsend is called on a regular basis from cron; it examines the batch files and spawns one innxmit process per peer, according to peer configuration. innxmit establishes a connection with the peer, transfers the articles and closes the connection when done. Note that when a peer goes down ungracefully (without closing the connection), innxmit hangs. It is possible to install a script on the feeder which checks for peers and kills hanging innxmit processes if necessary.
Manual pages: nntpsend, nntpsend.ctl, passwd.nntp, innxmit, cron
For peers that receive high volume feed, as well as for peers that receive identical feed, innd uses the channel method. It spawns innfeed at startup and opens a channel to it. Every time innd finds an article to be fed to the peers, it sends it to the innfeed channel. innfeed is configured to feed multiple peers with the same articles from the channel. It manages connections to the peers and writes backlogs in case a peer is unavailable or too slow. innfeed writes one backlog file per peer. The backlog is truncated to a specified length in order to prevent disk space overflow. When this happens, the peer is said to miss articles. innfeed does not process backlogs; a separate program (e.g. innxmit) should be called to do that afterwards.
Manual pages: innfeed, innfeed.conf
news.daily is run daily for article expiration, log file rotation and reporting purposes. For article expiration news.daily spawns expire which processes the history database purging entries for articles to be expired. It produces a list of articles to be removed from the spool, and renumbers the active file to reflect changes. expire calls fastrm to actually remove the articles on the expire list from the spool.
Manual pages: news.daily, expire, expire.ctl, fastrm
For log rotation and reporting purposes, news.daily calls scanlogs which, like the one on the readers, rotates the log files and calls innreport to process them, create a report and mail it to the news administrator.
Manual pages: scanlogs, innreport
There is a separate program that maintains innd, called ctlinnd, and another special program that watches over innd, called innwatch. News group maintenance is also done with ctlinnd. See Implementation Guide for further details.
Manual pages: control.ctl, ctlinnd, innwatch, innwatch.ctl


Implementation Guide


Centralized System

Input welcome!!

Distributed System

There are a couple of things to pay attention to when configuring machines in a distributed system. This section describes those specific things. Note that I only describe deviations from the norm because explanations for standard values can be found in the manuals. However, the meaning and use of the various configuration files is outlined here.

Readers

nnrpd configuration:
nnrpd is configured in inn.conf. This file also contains configuration parameters for innd, which are ignored on the readers.
 
Having nnrpd send postings to the feeder:
To make nnrpd connect to the feeder for every posting, set spoolfirst to false and nnrpdposthost to the feeder.
 
Having nnrpd write postings to a queue:
To make sure that nnrpd stores user postings in a batch in the queue/incoming directory, set spoolfirst to true and nnrpdposthost to the feeder.
 
Running rnews:
rnews should be run frequently enough to make sure your users don't complain. The postings will only appear on the news server (and will only be sent out to the Internet) when they are transmitted to the feeder. A sensible value is every 5 minutes.
The following parameters are usually used: rnews -v -U.
Manual pages: nnrpd, inn.conf, rnews, cron, crontab
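The reader settings described above can be written down as a short Python helper that prints the relevant lines. This is only a sketch: the host name feeder.example.com and the path /usr/local/news/bin/rnews are placeholders, the function names are invented, and the "name: value" form of the inn.conf lines should be checked against the inn.conf manual page of your INN version.

```python
def reader_posting_settings(feeder_host, spool_first):
    """Return the inn.conf lines that select how nnrpd hands over postings.

    spool_first=False makes nnrpd connect to the feeder for every posting;
    spool_first=True makes it write postings to a queue for rnews.
    """
    return [
        "nnrpdposthost: %s" % feeder_host,
        "spoolfirst: %s" % ("true" if spool_first else "false"),
    ]


def rnews_cron_line(interval_minutes, rnews_path):
    """Return a crontab line that runs 'rnews -v -U' every interval_minutes."""
    if not 1 <= interval_minutes <= 60 or 60 % interval_minutes:
        raise ValueError("interval must divide 60 evenly")
    # Spell the minutes out instead of using */N, which not every cron accepts.
    minutes = ",".join(str(m) for m in range(0, 60, interval_minutes))
    return "%s * * * * %s -v -U" % (minutes, rnews_path)


# Queue mode on a reader, with rnews flushing the queue every 5 minutes.
for line in reader_posting_settings("feeder.example.com", spool_first=True):
    print(line)
print(rnews_cron_line(5, "/usr/local/news/bin/rnews"))
```

The crontab line belongs in the crontab of the news user; it is only needed when spoolfirst is set to true.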
User access rights and closed groups:
User access rights are configured in nnrp.access. Closed groups are also set up here. All standard stuff.

Although it is all standard stuff, a good description of how to set up a closed group without turning on authentication for all the public groups as well is wanted. Please contribute!

Manual page: nnrp.access
Moderated groups:
Moderated groups are marked as m in the active file. Moderator addresses are listed in the moderators file. nnrpd consults this file when an article is posted to a moderated group.
See feeder section on how to set up a moderated group in the active file.
Manual pages: moderators, active
Rotating logs and creating reports:
Use news.daily to do it. Note that you need to explicitly turn off expiration and renumbering of the active file. Use the following parameters:
    news.daily noexpire norenumber
news.daily calls scanlogs which in turn calls innreport. The latter is configured in innreport.conf.
Manual pages: news.daily, scanlogs, innreport, innreport.conf

Feeder

innd configuration:
innd is configured in inn.conf. Configuration here is quite standard.
Manual pages: innd, inn.conf
Allowing readers to post in batch mode:
I found it necessary to give the readers access to the feeder in the feeder's nnrp.access file. Otherwise rnews from the readers cannot connect.
 
Allowing readers to post directly:
In this case it is necessary to add the readers to incoming.conf rather than to the nnrp.access file.
Manual page: nnrp.access
Setting up incoming feed:
Incoming feeds are handled by innd and are configured in incoming.conf. Pay attention to the peer ME which is your own news server. If this peer is not configured properly, the articles will not appear in your spool.
The peer ME must also be configured in newsfeeds for the same reasons as above.
Manual pages: innd, incoming.conf, newsfeeds
Setting up outgoing feed:
Outgoing feeds are configured in newsfeeds which determines the method to be used to transfer the articles to the peer.
For low-volume feeds the batch method may be used, with the transfer handled by nntpsend (which calls innxmit for each peer). nntpsend is configured in nntpsend.ctl and passwd.nntp.
For high-volume feeds the channel method may be used, with the transfer handled by innfeed, one instance of the program serving all the peers. It is configured in innfeed.conf.
Manual pages: newsfeeds, nntpsend.ctl, passwd.nntp, innfeed, innfeed.conf
Setting up anti-spam filters:
The feeder may run an anti-spam filter that checks every incoming article and drops "bad" ones. If it is a Perl script, it should be copied to bin/filter/filter_innd.pl; if it is a Tcl/Tk script, it goes to bin/filter/filter_innd.tcl.
One of the popular filters is cleanfeed which is configured in cleanfeed.conf.
Input about other filters is welcome!
Manual page: cleanfeed
News groups maintenance:
News groups are maintained (added, removed or given changed attributes) with ctlinnd, as well as automatically by innd.
Automatic news group maintenance is controlled by control.ctl which defines how specific types of control articles are processed.
Manual pages: ctlinnd, control.ctl
Timely updates of active and history:
For the readers to be able to access all the articles in the spool for all actions (including cancelling, when they need to retrieve the article by message ID), the active and history* files need to be updated frequently enough. If MMAP is used, then MMAP_SYNC must also be used with a sufficiently short interval.
Manual pages: active, history
Watching over innd:
innwatch watches over innd and can be configured to do a wide range of things in the configuration file innwatch.ctl.
Manual pages: innwatch, innwatch.ctl
Running expire and rotating logs:
Both these actions are performed by news.daily, which calls expire and fastrm for expiration, ctlinnd renumber for active file renumbering, and scanlogs, which in turn calls innreport, for log file rotation and reporting.
news.daily is run from cron on a daily basis. To make sure that expire does not take forever, the following parameters are given:
    news.daily delayrm
Expiration policies depend on the storage method(s) used. For the traditional spool they are defined in expire.ctl. Please contribute for the other storage methods!
Reporting is defined in innreport.conf that is used by innreport.
Manual pages: news.daily, expire, fastrm, expire.ctl, scanlogs, innreport, innreport.conf
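The two news.daily invocations given in this guide, one for the readers and one for the feeder, differ only in their keywords. The following sketch puts them side by side and turns them into crontab lines; the role names, the function names, the path and the 03:30 start time are placeholders chosen for the example.

```python
def news_daily_command(role, news_daily_path):
    """Return the news.daily command line for a 'reader' or the 'feeder'."""
    keywords = {
        "reader": ["noexpire", "norenumber"],  # readers only rotate logs and report
        "feeder": ["delayrm"],                 # keyword this guide gives for the feeder
    }
    if role not in keywords:
        raise ValueError("unknown role: %r" % role)
    return " ".join([news_daily_path] + keywords[role])


def daily_cron_line(hour, minute, command):
    """Return a crontab line that runs command once a day at hour:minute."""
    return "%d %d * * * %s" % (minute, hour, command)


for role in ("reader", "feeder"):
    command = news_daily_command(role, "/usr/local/news/bin/news.daily")
    print(role + ":", daily_cron_line(3, 30, command))
```

Each line goes into the crontab of the news user on the machine with the matching role.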