| mailwulf |
| A Mail Server Clustering Solution |
As the Internet has evolved over the past decade, many standard protocols have come and gone.
Only a few have withstood the exponential growth in user population and the increasing variety of
content transported between those users. The survivors include the underlying TCP/IP architecture
of the 'Net, which has coped with every change since the Internet was simply a few university and
government research computers, as well as FTP and HTTP. Others, such as gopher and UUCP, have
dwindled in comparison.
Another of the survivors is SMTP, the email transport protocol. It has coped not only with the
increasing number of users on the net, but also with an amazing growth in message size, with
hundreds of thousands of emails containing large binary attachments. This has been possible partly
because of the distributed architecture of the net and of SMTP itself, and partly because of
increased bandwidth between major servers and the increased storage and processing capabilities of
those servers.
Despite these changes, it is still quite common for internet service providers to run their email
in a fairly anachronistic fashion, with a single machine serving tens of thousands of users. When
demand on that system increases, the response is to schedule down-time and upgrade the server. If
unscheduled down-time occurs, all users dependent on that server are inconvenienced. This is not
acceptable for those of us who depend on continual connectivity, or for industries that depend on
the otherwise rapid and reliable transport of communications and data.
This paper proposes a hardware and software architecture that improves service availability and
speed, whilst reducing the workload of system administrators by providing a simple, unified
management interface.
mailwulf allows a cluster of individual mail servers (both SMTP and POP) to appear to the outside world as
a single server. The architecture is designed so that if one member of the cluster fails, the impact on
the user is minimal. The more servers in the cluster, the smaller the impact of a failed server, and the
overall speed of the cluster should increase approximately linearly with the number of servers.
This is achieved by running both a standalone SMTP server and a standalone POP server on each machine.
When an incoming SMTP connection arrives, it is routed (in round-robin fashion) to one of the servers.
Through a centralized directory server, each server is aware of the entire set of users who can accept
mail on the system. Once mail is accepted, it is delivered locally, by a POP deliver client, to the
user's mailbox. When a user wishes to check their mail, the incoming POP connection arrives at a single
server, which then collects the mail from each of the servers in the cluster.
If an individual server goes down, the SMTP proxy will no longer send mail to that system. If a user checks
their mail whilst a machine is down, they can still read the bulk of their mail from the other servers.
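The round-robin routing and fail-over behaviour described above might be sketched in C as follows. This is an illustration only: the node struct and the pick_next_node function are hypothetical names invented here, not part of balance or of any existing mailwulf code, and a real proxy would presumably learn node status from the directory server rather than from a field set by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of one cluster member as seen by the SMTP proxy. */
struct node {
    const char *host;
    int up;            /* non-zero while the node accepts SMTP connections */
};

/*
 * Round-robin selection that skips nodes marked down.
 * *cursor remembers where the next search should start.
 * Returns the index of the chosen node, or -1 if every node is down.
 */
int pick_next_node(const struct node *nodes, size_t count, size_t *cursor)
{
    assert(nodes != NULL && cursor != NULL);
    for (size_t tried = 0; tried < count; tried++) {
        size_t i = (*cursor + tried) % count;
        if (nodes[i].up) {
            *cursor = (i + 1) % count;   /* start after this node next time */
            return (int)i;
        }
    }
    return -1;
}
```

Each incoming connection would call pick_next_node once; a node whose up flag is cleared is simply passed over until it is marked up again, so the remaining nodes share the load.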
Figure 1 outlines an idealised cluster; in reality there may be no need for an individual incoming mail server,
or just a single directory server (ldap.i2pi.com).
(NB: none of these machines actually exist at the moment.)
               +---------------+        +----------------+              +--------------+
incoming ----> | mail.i2pi.com |--+     | mail1.i2pi.com | >----+-----> | pop.i2pi.com | ----> outgoing
  mail         +---------------+  |     +----------------+      |       +--------------+         mail
 (SMTP)               ||          |             ||              |              ||               (POP)
                      ||          |     +----------------+      |              ||
                      ||          +---> | mail2.i2pi.com | >----+              ||
                      ||                +----------------+      |              ||
                      ||                        ||              |              ||
                      ||                +----------------+      |              ||
                      ||                | mail3.i2pi.com | >----+              ||
                      ||                +----------------+      |              ||
                      ||                        ||              |              ||
                      ||                +----------------+      |              ||
                      ||                | mail4.i2pi.com | >----+              ||
                      ||                +----------------+                     ||
                      ||                        ||                             ||
                      ||                        ||                             ||
                      ||                        ||                             ||
                      \\                +---------------+                      //
                        \===============| ldap.i2pi.com |=====================/
                                        +---------------+
                                                ||
                                        +----------------+
                                        |  web.i2pi.com  | <----------> Administration Interface
                                        +----------------+
mailwulf is intended to be distributed as a single package of free software components. This is
to minimise the administrative effort required to set up a new node for a cluster. Ideally, setting
up a node should be a simple matter of installing a base operating system, installing the mailwulf
package, and pointing it to the directory server, which contains all configuration data.
Given that each node will be installed with the same package, there should be no single point of failure.
If the node handling incoming mail goes down, a DNS change should be able to point it to a live node,
which acts somewhat like a hot spare in RAID.
| Function | Proposed Package | Notes |
|---|---|---|
| SMTP Server | postfix | Already with full LDAP support. |
| POP Server | Cyrus IMAPD | Open source, but with restrictions on commercial use |
| SASL Library | Cyrus SASL | SASL implementation for authorisation and authentication |
| PAM LDAP Module | PADL pam_ldap | Needed for SASL to support LDAP auth |
| POP Proxy | smunge | Perfect for the job. :) |
| SMTP Relay Control | drac | Provides POP before SMTP relay authorisation control |
| SMTP Proxy | balance | A small, simple TCP/IP round-robin proxy server. Will require LDAP modifications. |
| LDAP Server | OpenLDAP | Solid, and well supported. Supports cloning, etc. |
| HTTP Server | Apache | No alternative has a comparable proven track record. |
| Administration Interface | mailwulf tools | C API, with a CGI interface for cluster administration (Currently Under Construction) |
| Object Class | Required | Element | Type | Usage |
|---|---|---|---|---|
| mwNode | Yes | cn | string | Unique Common Name for Node |
| | Yes | nid | integer | Node ID |
| | Yes | server | string | Node hostname |
| | Yes | status | string | Node status. A string of upper- or lower-case letters, each indicating whether a particular service on this node is up or down, e.g. S (smtp), P (pop), L (ldap), M (mailwulf daemon), B (balance), A (apache), G (smunge) |
| mwUser | Yes | cn | string | Common Name for User |
| | Yes | uid | integer | Unique User ID |
| | Yes | userPassword | crypt string | Encrypted user password |
| | No | mailAlias | string | Email address at which this user accepts mail. A user may have multiple such addresses (aliases) |
| | No | mailForward | string | Email address to which this user forwards accepted mail. A user may have multiple such addresses; however, none should resolve back to the same user |
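As an illustration of how a client might read the status element of an mwNode entry, here is a minimal C sketch. The schema does not say which letter case means "up", so this sketch ASSUMES that upper case means the service is up and lower case means it is down. The function name mw_status_lookup and the enum are hypothetical, invented for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Result of looking up one service letter in a node's status string. */
enum mw_service_state { MW_SVC_ABSENT = -1, MW_SVC_DOWN = 0, MW_SVC_UP = 1 };

/*
 * Look up a service letter (e.g. 'S' for smtp, 'P' for pop) in a status
 * string such as "SPlMbAG". The lookup itself ignores case.
 * ASSUMPTION: an upper-case letter means "up", lower case means "down";
 * the schema does not state which way round the convention goes.
 */
enum mw_service_state mw_status_lookup(const char *status, char service)
{
    assert(isalpha((unsigned char)service));
    if (status == NULL)
        return MW_SVC_ABSENT;

    int wanted = toupper((unsigned char)service);
    for (const char *p = status; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (toupper(c) == wanted)
            return isupper(c) ? MW_SVC_UP : MW_SVC_DOWN;
    }
    return MW_SVC_ABSENT;   /* service letter not listed for this node */
}
```

Under that assumption, a node with status "SPlMbAG" would be running smtp and pop, while its ldap and balance services would be reported as down.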
| Function | Returns | Parameters | Description |
|---|---|---|---|
| Session Functions | | | |
| | Session token containing session details | server - The hostname:port of the master server to connect to; username, password - Authentication details | Makes a network connection to the mailwulf administration server and initialises contact with the LDAP server, etc. Only allows access to administration functionality as determined by server-side ACLs. These are to be stored in a private subtree in LDAP. |
| | 0 if ok, otherwise !0, and char *MW_error is set | session - Session token | Closes the session and invalidates the key. |
| Interrupt Functions | | | |
| | 0 if ok, otherwise !0, and char *MW_error is set | session - Session token; interrupt - Which interrupt type this handler is for; handler - Pointer to handler function; usr_data - User-defined data to be passed to the handler when it is called | Registers mailwulf interrupt handlers. These deal with events that occur outside the control of the local code and would otherwise corrupt its execution. |
| Node Functions | | | |
| | Linked list of all nodes known to the server | session - Valid session token | - |
| | Head node in node list | session - Session token; list - Node list | Returns the first node in a node list and resets the internal node list pointer, so that subsequent calls to mw_next_node() will return the second node, and so forth. |
| | Next node in list | session - Session token; list - Node list | Returns the next node in the list. |
| | Previous node in list | session - Session token; list - Node list | Returns the previous node in the list. |
| | Updated contents of node | session - Session token; node - Node to update | Whenever a node is fetched by mw_next_node() or mw_prev_node() it is locked with the authentication level of the current session token holder. If that lock is over-ruled by a session holder with a higher authentication level, others holding that node will have their node interrupt handlers called (as set with mw_set_handler()), and they should subsequently call this function to update their local copies of the node data and so maintain synchronisation. |
| | Pointer to node added, if good, else NULL, with char *MW_Error set with a valid message | session - Session token; node - Node to insert | Inserts a node into a node list and integrates it asynchronously. As this is non-blocking, the controlling program can only check whether the node is up by calling mw_get_node to check the node status. If the node already exists, it is modified instead (given correct authorization). |
| User Functions | | | |
| | List of users, or NULL if no matches | session - Session token; name_filter - Regex filter to match username with | Gets a list of users by regex match on username. |
| | List of users, or NULL if no matches | session - Session token; field - Field to search against; filter - Filter to apply to field | Obtains a list of users by performing a regex match on a generic field, e.g. specify field as "mailacceptingid" and filter as "^.*@domain\.com$" to return all users who accept mail at domain.com. |
| | Head user in list | session - Session token; list - User list | Returns the first user in a user list and resets the internal user list pointer, so that subsequent calls to mw_next_user() will return the second user, and so forth. |
| | Next user in list | session - Session token; list - User list | Returns the next user in the list. |
| | Previous user in list | session - Session token; list - User list | Returns the previous user in the list. |
| | Updated contents of a user | session - Session token; user - User to update | Whenever a user is fetched, it is locked with the authentication level of the current session token holder. If that lock is over-ruled, others holding that user will have their user interrupt handlers called, and should subsequently update their local copies of the user using this function. |
| | user, or NULL if error, with char *MW_Error set | session - Session token; user - User to add | Asynchronously adds a user. The calling program should check the status of the user with mw_get_user to find out when it has actually been added. |
| Message Functions | | | |
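The handler registration described under Interrupt Functions could look something like the following C sketch. Almost everything here is an assumption made for illustration: the interrupt types, the handler signature, the contents of MW_session, and the mw_raise dispatch helper are hypothetical, and the registration function is spelled mw_set_handler on the assumption that this is the name the table intends. MW_error is declared const char * here so that it can point at string literals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interrupt types; the table does not enumerate them. */
enum mw_interrupt { MW_INT_NODE = 0, MW_INT_USER = 1, MW_INT_COUNT };

/* Hypothetical handler signature: interrupt type plus caller-supplied data. */
typedef void (*mw_handler_fn)(enum mw_interrupt which, void *usr_data);

/* Stand-in for the session token; here it only holds the handler table. */
typedef struct MW_session {
    mw_handler_fn handler[MW_INT_COUNT];
    void         *usr_data[MW_INT_COUNT];
} MW_session;

const char *MW_error = NULL;

/* Register a handler. Returns 0 if ok, otherwise non-zero with MW_error set. */
int mw_set_handler(MW_session *session, enum mw_interrupt which,
                   mw_handler_fn handler, void *usr_data)
{
    if (session == NULL || (int)which < 0 || (int)which >= MW_INT_COUNT) {
        MW_error = "invalid session or interrupt type";
        return 1;
    }
    session->handler[which]  = handler;
    session->usr_data[which] = usr_data;
    return 0;
}

/* Hypothetical dispatch: run the registered handler, if any. Returns 1 if one ran. */
int mw_raise(MW_session *session, enum mw_interrupt which)
{
    assert(session != NULL && (int)which >= 0 && (int)which < MW_INT_COUNT);
    if (session->handler[which] == NULL)
        return 0;
    session->handler[which](which, session->usr_data[which]);
    return 1;
}

/* Example handler: counts how often a held object was invalidated. */
void count_invalidations(enum mw_interrupt which, void *usr_data)
{
    (void)which;
    ++*(int *)usr_data;
}
```

A client holding a node would register a handler such as count_invalidations for the node interrupt, and refresh its local copy of the node whenever the handler fires.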
typedef struct MW_attribute {
    char  *field;    // Name of the field
    char **value;    // Array of strings of field values
    int    values;   // Number of values
    // ...
} MW_attribute;
typedef struct MW_node_list {
    struct MW_node *head;     // Head of list
    struct MW_node *current;  // Current node pointer
    int             nodes;    // Total number of nodes
    // ...
} MW_node_list;
typedef struct MW_node {
    struct MW_node      *next;        // Next node
    struct MW_node      *prev;        // Prev node
    struct MW_node_list *list;        // Pointer to list
    long                 node_id;     // Unique node id. LDAP to be indexed on this
    char                *name;        // User-friendly machine name
    char                *server;      // Domain name of server
    MW_attribute        *attribute;   // An array of attributes, sorted by field name
    int                  attributes;  // Elements in the array
    // ...
} MW_node;
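The head/next/previous behaviour described for the node list functions can be sketched against cut-down versions of these structures. This is only a sketch under assumptions: the real functions take a session token, which is omitted here; mw_first_node is a hypothetical name for the "head node" function; and the structures are reduced to the fields the traversal needs.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced versions of the structures above: only the linkage fields. */
typedef struct MW_node {
    struct MW_node *next;
    struct MW_node *prev;
    long            node_id;
} MW_node;

typedef struct MW_node_list {
    MW_node *head;
    MW_node *current;   /* the internal node list pointer */
    int      nodes;
} MW_node_list;

/* Hypothetical name: return the head node and reset the internal pointer. */
MW_node *mw_first_node(MW_node_list *list)
{
    assert(list != NULL);
    list->current = list->head;
    return list->current;
}

/* Advance the internal pointer; returns NULL once past the last node. */
MW_node *mw_next_node(MW_node_list *list)
{
    assert(list != NULL);
    if (list->current != NULL)
        list->current = list->current->next;
    return list->current;
}

/* Step the internal pointer back; returns NULL when stepping before the head. */
MW_node *mw_prev_node(MW_node_list *list)
{
    assert(list != NULL);
    if (list->current != NULL)
        list->current = list->current->prev;
    return list->current;
}
```

With these semantics a full traversal is a loop that starts with mw_first_node and calls mw_next_node until it returns NULL.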
typedef struct MW_user_list {
    struct MW_user *head;           // Head of list
    struct MW_user *current;        // Current user pointer
    int             users;          // Total number of users
    char           *search_field;   // The search field that defined this list
    char           *search_filter;  // The filter that defined this list
    // ...
} MW_user_list;
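The search_filter stored with a user list is a regular expression applied to the values of search_field. The document does not say which regex flavour is intended, so the sketch below ASSUMES POSIX extended regular expressions from regex.h; the count_matches helper is a hypothetical name and stands in for whatever the server would do when it builds the list.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Count how many of the given field values match a POSIX extended regex,
 * mimicking a search_field / search_filter pair applied to one field.
 * Returns the match count, or -1 if the filter does not compile.
 */
int count_matches(const char *filter, const char *const *values, size_t count)
{
    regex_t re;
    int matched = 0;

    assert(filter != NULL && (values != NULL || count == 0));
    if (regcomp(&re, filter, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    for (size_t i = 0; i < count; i++) {
        if (regexec(&re, values[i], 0, NULL, 0) == 0)
            matched++;
    }
    regfree(&re);
    return matched;
}
```

In C source the filter for "everyone who accepts mail at domain.com" would be written "^.*@domain\\.com$"; the dot is escaped so that it matches only a literal dot.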
typedef struct MW_user {
    struct MW_user      *next;        // Next user in this list
    struct MW_user      *prev;        // Previous user in this list
    struct MW_user_list *list;        // Pointer to list that contains this user
    long                 user_id;     // Unique user id
    char                *name;        // Username
    MW_attribute        *attribute;   // Array of attributes
    int                  attributes;  // Elements in the array
    // ...
} MW_user;
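Because the attribute array of a node is kept sorted by field name, a lookup by name can use a binary search. The sketch below uses the standard bsearch function and assumes the array is sorted in strcmp order; whether user attribute arrays are sorted the same way is not stated, so treat that as an assumption too. The mw_find_attribute name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same fields as MW_attribute above, without the elided members. */
typedef struct MW_attribute {
    char  *field;    /* Name of the field */
    char **value;    /* Array of strings of field values */
    int    values;   /* Number of values */
} MW_attribute;

/* bsearch comparator: key is a field name, element is an MW_attribute. */
static int cmp_field(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const MW_attribute *)elem)->field);
}

/*
 * Find an attribute by field name in an array sorted by field name
 * (strcmp order). Returns NULL when the field is not present.
 */
const MW_attribute *mw_find_attribute(const MW_attribute *attrs, int count,
                                      const char *field)
{
    assert(field != NULL && count >= 0);
    if (attrs == NULL || count == 0)
        return NULL;
    return bsearch(field, attrs, (size_t)count, sizeof *attrs, cmp_field);
}
```

A caller would pass the attribute and attributes members of a node (or user) together with a field name such as "mailAlias", then read the value array of the result.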
Attributes
| Attribute Name | Meaning |
|---|---|
| Node Attributes | |
| User Attributes | |
I'm trying to make this interface as generic as possible, so that the ideas taken from this well-defined project can be extended to a more generic project: Network Object Abstraction Layer (NODAL). Maybe it's just a cool acronym, maybe it's a plausible idea. Regardless, I need to flesh it out more, and consult the gurus.
To Be Continued...