Boundary Meter 2.0 – Design

By Brent Cook on 09 27 2013

C is a very simple and powerful language. Because it does not enforce a particular style or organization of code, it can also be very dangerous and easy to write spaghetti code. As they say, it gives you just enough rope to shoot yourself with. However, with careful design, a C project can be grown from a small tech demo to a large application without too many growing pains.

Object Oriented Design in C

For the Boundary Meter 2.0 project, I chose to use an object-oriented style of C programming. Though C is not necessarily an OO language, aspects of OO design are still useful within a C project of any size. The main principals that the meter uses are information hiding, object composition and code reusability.

C does not have a fixed, language-level way of conveying object patterns like other specifically object-oriented languages do. So, a C coder must define the semantics and syntax by convention. Alex Schreiner’s ‘Object Oriented Programming in ANSI-C’ is a thorough manual for ways of implementing OO techniques in C. For the 2.0 meter, I chose to use the style of libdnet, Dug Song’s network programming library, which is very simple and effective. Consider rand.h, the interface for libdnet’s random number generator class.

typedef struct rand_handle rand_t;

__BEGIN_DECLS
rand_t	*rand_open(void);

int	 rand_get(rand_t *r, void *buf, size_t len);
int	 rand_set(rand_t *r, const void *seed, size_t len);
int	 rand_add(rand_t *r, const void *buf, size_t len);

uint8_t	 rand_uint8(rand_t *r);
uint16_t rand_uint16(rand_t *r);
uint32_t rand_uint32(rand_t *r);

int	 rand_shuffle(rand_t *r, void *base, size_t nmemb, size_t size);

rand_t	*rand_close(rand_t *r);
__END_DECLS

To begin, class ‘rand_t’ is an opaque handle – a simple pointer. The contents are not exposed to the user in the header file, or in other words, are effectively private. This makes it easy to change the underlying implementation of the random number generator without changing any user-visible APIs or ABIs. The size of the pointer remains constant and can only use the exposed API for interacting with the object. The ‘open’ and ‘close’ methods are the OOP constructor and destructor. The object relies on no global state, and thus allows one to use the library in multiple contexts without unknown side-effects. This handle+methods design allows one to get thread-safe and deterministic performance from the generator even if multiple instances are using different rand_t objects.

Contrast this to the standard C library’s random number generator, which is not an object-oriented nor extensible design:

RAND(3)                  BSD Library Functions Manual                  RAND(3)

NAME
     rand, rand_r, srand, sranddev -- bad random number generator

LIBRARY
     Standard C Library (libc, -lc)

SYNOPSIS
     int rand(void);

     int rand_r(unsigned *seed);

     void srand(unsigned seed);

     void sranddev(void);

There are two versions of this functionality exposed by the C library. The first, srand() and rand() operate on global state that lives in the C library itself. Thus, if a library or program were to use srand() to obtain a deterministic sequence based on a given seed, it is impossible to guarantee that another caller has not called srand() again in the mean-time. To solve this problem, rand_r allows one to provide a local seed context. However, by defining the seed context as ‘unsigned’, the implementation details are exposed, making it impossible to implement a future version that generates more random numbers in sequence than what can be represented by an unsigned, usually only 2**31 possible numbers. Hence ‘bad number generator’.

The Mediator Pattern

Central to Meter 2.0 is a minimal container object. This object is implemented as an opaque struct with a set of methods that operate on it, like the ‘rand’ object above. This object provides system time and event services, abstracts a global meter configuration object, and starts all of the subordinate objects. As an object container, it is the glue that holds the rest of the system together. Other object types such as ‘ipfix_writer’ and ‘flow_table’ encapsulate other portions of the meter’s functionality, but they are unaware of each other, only interacting with the container object handle. Inter-object communication is mediated via methods defined on the container object itself.

The benefits of this design is that each of the subordinate objects is testable independent of the other subordinates. That makes mocking and testing a matter of supplying a container object handle that only implements the methods that the object uses. Each subordinate object is not only testable in isolation, but can be replaced or extended without having to refactor any of the other objects or APIs. It also makes it easy to add new objects without modifying the interface of any subordinate object. New functionality is exposed simply by adding new mediator methods to the container.

Meter containers

As an example, consider intf_manager, which collects packets from a network interface and passes them to flow_table for processing. The flow_table object exports a method ‘iterate_active_flows’, which, given a callback, iterates over all flows that have been observed sending data in the last second. The intf_manager object instantiates a flow_table for each network interface, then exports a similar ‘iterate_active_flows’ method that walks every internal flow_table. Likewise, the container exposes an ‘iterate_active_flows’ method that currently calls the method on intf_manager. Finally, the ipfix_writer object (which sends stats to Boundary periodically) calls the container’s iterate_active_flows method to get the list of active flows.

meter ‘container’ interface:

struct meter;
struct meter_opts;

struct meter * meter_open(struct meter_opts *opts);

void meter_start(struct meter *m);

void meter_stop(struct meter *m);

int meter_iterate_active_flows(struct meter *m,
    active_flow_cb_t func, void *arg);

void meter_close(struct meter *m);

flow_table interface:

struct flow_table;

struct flow_table * flow_table_open(struct meter *m);

typedef void(*active_flow_cb_t)(const struct flow *, const struct flow_stats *delta, void *arg);

int flow_table_iterate_active_flows(struct flow_table *ft,
    active_flow_cb_t func, void *arg);

void flow_table_close(struct flow_table *ft);

In a future release of the meter, it may be desirable to expose active flows directly from the flow table maintained by the operating system, rather than processing packets. With this design, we can remove intf_manager and replace it with any other method that implements iterate_active_flows, while leaving ipfix_writer untouched, since it is unaware of intf_manager, it having been abstracted by the container object. In fact, as long as the function signature to iterate_active_flows stays the same, ipfix_writer does not even have to be recompiled.

meter_2

Since the container module mediates inter-object communication, one can also swap implementations at runtime based on observed characteristics of the operating system. Consider a meter with a number of specialized active flow exporters based on different versions of Linux.

meter_3

One version of iterate_active_flows may use a method that works more efficiently in newer versions of Linux, while the other uses a slower but more compatible method. This abstraction allows us to also implement functionality in the form of loadable modules, rather than linking all object code at build time.  A future implementation of iterate_active_flows could be loaded from a module with support for a newer kernel version, or replaced on the fly with a version that exports test data.

Events versus Processes versus Threads

If you want your program to do more than one thing at a time, there are many popular choices: events (where your program schedules concurrent tasks itself), multi-process (where multiple tasks are scheduled by the OS with independent memory space), threading (where multiple tasks are scheduled by the OS in the same memory space), or some combination of these.

When there are few interactions between multiple tasks, threads and processes can be a simple way to implement concurrency. However, when there is mutable or changeable state between tasks, such as a shared configuration object that multiple tasks access, multiple threads and processes are tricky to implement safely. Without careful design, your code can become a mess of locks and race conditions.  Events have some limitations not present in threads or processes, such as inability to scale past a single CPU and blocking issues when callbacks are not written granularly enough.  However, because they enforce a single flow of execution, events allows for concurrency without requiring locking or atomic operations to safely access shared state.

The 2.0 meter uses a hybrid of events and threads. Events are used for ‘slow path’ tasks that interact with the common container object. The container object does not implement any locking itself. This is because invariants provided by the event library (libevent) ensure that common objects are not accessed concurrently. Threads are used for the code paths where the performance either needs to scale to multiple CPUs, or needs to be deterministic and non-blockable by other tasks.

Interaction between threads and events can be tricky, especially in high-performance designs. In the 2.0 meter, ‘interface’ objects run in their own thread, capturing packets and analyzing network flows, one thread per interface. The ‘ipfix_writer’ object runs in the main event loop instead, exporting data once per second back to boundary. There are two kinds of thread/event interaction patterns used within the meter, callbacks and RCU updates.

RCU and You

Periodically, an event looks for changes in the configuration of an interface, such as a new IP address. When these occur, the configuration of an interface thread is updated. However, the interface thread may be using the old configuration still. The simplest solution might be to wrap all accesses to the interface configuration object in a lock so it can be updated safely. However, locks are costly to acquire millions of times per second and the data is updated infrequently. To solve this, the meter implements a simple form of RCU.

struct intf {
    const struct intf_info *info, *new_info;
    pthread_mutex_t new_info_lock;
};

/*
 * Only called by the fast-path thread.
 */
const struct intf_info *get_intf_info_fast(struct intf *intf)
{
    if (intf->new_info) {
        intf_lock_intf_info(intf);
        if (intf->new_info) {
            if (intf->info) {
                free_intf_info((struct intf_info *)intf->info);
            }
            intf->info = intf->new_info;
            intf->new_info = NULL;
        }
        intf_unlock_intf_info(intf);
    }
    return intf->info;
}

/*
 * Only called by the slow-path event
 */
void intf_set_intf_info(struct intf *intf, struct intf_info *intf_info)
{
    intf_lock_intf_info(intf);
    if (intf->new_info) {
        free_intf_info((struct intf_info *)intf->new_info);
        intf->new_info = NULL;
    }
    intf->new_info = intf_info;
    intf_unlock_intf_info(intf);
}

In the common case, the fast-path thread checks if the new_info pointer is set. If it is not, then it is safe to access the interface information directly. When the slow-path event wants to update the interface info, it first acquires the lock, then only updates new_info. Then, the next time the fast-path thread runs, it acquires the lock as well, moves the new interface info into place, the releases the lock. This means that for each update, the lock only needs to be acquired twice.

Note that this only solves the access problem in one direction. If the slow-path event needs to actually access the current interface info, it additionally needs to acquire the locks for all accesses to ensure the info is not updated while it is being accessed.

Threads and Callbacks

Periodically, information about active flows is forwarded back to Boundary. A simple way to do this might be to acquire a lock on the flow table to temporarily pause updates, and then scan for flows who have been updated in the last second. This is like a mark-and-sweep garbage collector in reverse, where it  looks for newly active data structures rather than old.

For the same reasons as above, acquiring the lock for every table access is expensive when data is exported relatively infrequently. One second resolution is slow when you’re processing millions of packets per second. To avoid the overhead of locks on the flow table, active flow scanning instead done by the interface thread itself while it processing packets. The interface thread loop looks like this:

void *interface_thread(void * arg)
{
    struct intf *intf = arg;
    while (intf->running) {
         process_packets(intf);
         if (intf_milliseconds(intf) % 1000 == 0) {
             run_timers(intf);
         }
    }
}

When code from the slow-path requests a scan for active flows, it registers a callback function with the interface thread. Eventually, the run_timers() function is called, when all registered callbacks are then processed. Then, the interface thread removes the callback from its queue, and continues processing packets.

One nice feature of the libevent library is the ability to instantiate multiple event loops, and then to run them in blocking or non-blocking mode. In the case of the interface thread, it also has its own event loop instance as well, and uses this to manage timer callbacks internally. The implementation of run_timers() above would then include a call to event_base_loop() with the EVLOOP_NONBLOCK flag set.

How should you design your next project?

The techniques used in the 2.0 meter are not unique to C. It could have been implemented in almost any language with a similar design. However, C is definitely expressive enough to build object-oriented, loosely coupled code while maintaining high performance. Consider it for your next project!