How to Collect Switch Statistics (and Compute Bandwidth Utilization)

If you are just interested in collecting the statistics for use by a northbound application, you can do so directly using the existing Floodlight REST API.

It is assumed you are familiar with how to write a module and the OpenFlowJ-Loxi API. Also, please be familiar with these tutorials if you are interested in adding a REST API or exposing a service in your own module (they are not required for comprehension of this tutorial though).

Introduction

I have been asked many times, both personally and on the mailing list, how to collect switch statistics in OpenFlow. How to determine port bandwidth consumption is an especially frequent question. As such, here's a tutorial on how to determine the bandwidth used by switch ports. This has been completed for you as a Floodlight module in the net.floodlightcontroller.statistics package and is available as of 12/14/15 in the Floodlight master branch. As such, there is no code you will write in this guide. The goal of this tutorial is to walk you through how the module works, such that you can either use it as-is or modify it to suit your use case.

OpenFlow has many statistics messages to allow the controller to query the switch for information about its running state. Examples include flow stats, meter stats, queue stats, aggregate stats, table stats, and port stats. You can even define custom experimenter statistics to retrieve some user-defined information (which requires modifying the switch and controller to support). Being able to collect this information is great; however, it should be noted that these statistics are "snapshot" statistics. What I mean by this is that any data queried was valid at the time it was ascertained for inclusion in the stats reply message. However, by the time the controller receives the message, the values within the message are most likely out of date and no longer reflect the real-time state of the switch. Fortunately, for many applications, this slight inaccuracy is tolerable or negligible.

Statistics collection for use in real-time algorithms is one of the disadvantages of having a disjoint control and data plane. This disadvantage is more prominent for a controller that is geographically distant with a large link latency to its switches. As such, any reactive algorithms in place in the controller that rely on statistics collection need to take into account the inherent inaccuracy and possible delay present in the process. On top of that, there is messaging overhead that must occur in order to collect the statistics, which might be undesirable and introduce even more varied delay for a heavily-loaded controller.

In many cases though, these disadvantages present in OpenFlow statistics collection can be ignored, especially if your algorithm passively observes statistics, if your controller is located in close proximity to your switches, or if you just need a "ballpark" figure for the statistics. Let's say for example that you wish to allow users to use the fast path of the network until they surpass a certain threshold of transmitted or received data. You could periodically observe the byte counters of user flows to determine (1) when they have surpassed your defined threshold and (2) when to move their traffic to a slower path. It probably wouldn't hurt to query the flow byte counters every 30s, and the difference between the real time and sample time byte counters would probably be negligible. Worst-case scenario, a user gets an extra ~29 seconds of the fast path. (Of course, you could tune this to be more accurate, but it's just a simple example.)
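The threshold check described above can be sketched as follows. This is a minimal illustration and not part of the statistics module: the class name FastPathPolicy, the method names, and the user identifiers are all hypothetical, and the byte counter is assumed to be the cumulative value taken from a periodic flow stats reply.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical policy: demote a user to the slow path once a byte threshold is passed. */
final class FastPathPolicy {
    private final long thresholdBytes;
    private final Set<String> slowPathUsers = new HashSet<>();

    FastPathPolicy(long thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    /**
     * Called once per polling interval with the user's cumulative byte counter.
     * Returns true only for the sample that first moves the user to the slow path.
     */
    boolean onSample(String user, long cumulativeBytes) {
        // Set.add returns false if the user was already demoted earlier.
        return cumulativeBytes > thresholdBytes && slowPathUsers.add(user);
    }

    boolean isOnSlowPath(String user) {
        return slowPathUsers.contains(user);
    }
}
```

Because the decision only happens when a sample arrives, the user keeps the fast path for up to one polling interval after actually crossing the threshold, which is the "extra ~29 seconds" mentioned above.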

In this tutorial, we will discuss how to compute the bandwidth consumed on a switch port. A module has been created for you to perform this operation. The pros and cons of the approach are discussed, along with how the module works, how you can use it, and how you could modify it to suit your needs.

Computing bandwidth consumption

Notice that the throttling example above is just looking at raw byte counters and has no notion of time. What if we want to determine some statistic that is dependent on time, such as bandwidth? OpenFlow does not provide a way to collect such values from the switch. The controller needs to make sense of the raw statistics values (i.e. counters in most cases) returned by the "snapshot" statistics outlined above. To determine bandwidth consumption, we can use byte counters returned at two points in time. The difference between these two counters divided by the time elapsed between the "snapshot" point of each counter value yields the bandwidth. There is some inaccuracy in this approach, however, namely:

  • Timestamping of statistics collection
  • Control plane latency
  • Statistics collection interval

The timestamp of when the statistics were collected is not included in the statistics reply sent by the switch (although if it were, it would be much easier on the controller and more accurate; are you reading this, OF spec writers?). As such, the controller must note either when the stats request was sent or when the stats reply arrived, or perhaps estimate some point in between these two times as to when the statistics were actually collected. The control plane latency will of course impact the timestamp recorded as well. If there is variation in the latency (and there typically is in real networks), it can impact the bandwidth values computed.
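To make the arithmetic concrete, here is a minimal sketch of the two-sample computation. It assumes the controller records its own clock when the request is sent and when the reply arrives, and it estimates the snapshot time as the midpoint of the two, which is only one of the options listed above. The class and field names are hypothetical, and counter wraparound is ignored here for brevity.

```java
/** One counter snapshot plus the controller-side times around the request. */
final class CounterSample {
    final long bytes;           // cumulative byte counter from the stats reply
    final long requestSentMs;   // controller clock when the request was sent
    final long replyReceivedMs; // controller clock when the reply arrived

    CounterSample(long bytes, long requestSentMs, long replyReceivedMs) {
        this.bytes = bytes;
        this.requestSentMs = requestSentMs;
        this.replyReceivedMs = replyReceivedMs;
    }

    /** Assumption: the switch read its counters halfway through the round trip. */
    long estimatedSnapshotMs() {
        return requestSentMs + (replyReceivedMs - requestSentMs) / 2;
    }
}

final class BandwidthEstimator {
    /** (later bytes - earlier bytes) * 8 bits, divided by the elapsed seconds. */
    static long bitsPerSecond(CounterSample earlier, CounterSample later) {
        long elapsedMs = later.estimatedSnapshotMs() - earlier.estimatedSnapshotMs();
        if (elapsedMs <= 0) {
            throw new IllegalArgumentException("samples must be ordered in time");
        }
        long deltaBytes = later.bytes - earlier.bytes;
        return deltaBytes * 8L * 1000L / elapsedMs;
    }
}
```

For example, two samples 10 seconds apart whose counters differ by 1,250,000 bytes work out to 1,000,000 bits per second.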

When computing bandwidth, there are two approaches:

  1. Issue lots of stats requests and compute the bandwidth frequently to attempt to keep up with the real time bandwidth consumption.
  2. Issue less frequent stats requests and compute and update the bandwidth less frequently.

The first approach would in theory yield more up-to-date bandwidth values; however, the error present in computing timestamps could account for a significant portion of the elapsed time used in the bandwidth calculation. In a control plane with latencies that vary slightly and are small, the bandwidth consumption as computed using frequent stats requests could appear to jump sporadically despite constant traffic flows. The second approach allows the slight millisecond-scale inaccuracies present in timestamping to be smoothed into the much larger time elapsed between computation intervals, making the error less significant. The downside to this approach though is not being able to get bandwidth values updated in near real-time.

This tutorial and the statistics module discussed use a blend of these two approaches, giving the user the ability to decide the interval at which statistics requests are sent and bandwidth is computed. The default interval is set to 10s, meaning the bandwidth will be reassessed every 10s. If the control plane latency fluctuates by a few milliseconds, it'll be insignificant relative to the overall wait between stats collection intervals. If we lower this to, let's say, 1s, we run the risk of a higher error percentage.
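A rough back-of-the-envelope check shows why the interval matters. The sketch below assumes the measured elapsed time is off by a fixed number of milliseconds (the 5 ms used in the note after the code is purely hypothetical) and reports how far the computed bandwidth drifts from the true value. The class and method names are illustrative only.

```java
final class TimestampError {
    /**
     * Percent error in the computed bandwidth when the measured elapsed time
     * is the true interval plus errorMs of timestamping error.
     */
    static double percentError(double errorMs, double intervalSeconds) {
        double trueMs = intervalSeconds * 1000.0;
        // bandwidth is proportional to 1 / elapsed time, so compare the two ratios
        double measuredOverTrue = trueMs / (trueMs + errorMs);
        return Math.abs(measuredOverTrue - 1.0) * 100.0;
    }
}
```

With a 5 ms timestamp error, a 10s interval gives an error of roughly 0.05%, while a 1s interval gives roughly 0.5%: the same jitter costs about ten times more at the shorter interval.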

Using the statistics module

The statistics module can be configured from startup using the floodlightdefault.properties file. It can also be configured at runtime using the Floodlight REST API.

To leverage the statistics module at runtime, the REST API can be used from a northbound application. Alternatively, other Floodlight modules can use the IStatisticsService exposed by the module. The following sections discuss each of these in detail.

Startup configuration

Floodlight stores its startup configuration in src/main/resources/floodlightdefault.properties. Add or set the following to modify the startup behavior of the statistics module.

Configuration Variable: net.floodlightcontroller.statistics.enable=<boolean>
  • Description: Enable or disable the statistics collection module
  • Values: <boolean> is 'TRUE' to enable the module, 'FALSE' to disable the module
  • Default: FALSE
  • Notes: Stats requests introduce additional overhead, which unless being used are wasteful to send.

Configuration Variable: net.floodlightcontroller.statistics.collectionIntervalPortStatsSeconds=<positive-integer>
  • Description: Set the interval at which port stats request messages are sent and bandwidth is recomputed
  • Values: <positive-integer> is '1' to issue port stats request each second, '2' to issue port stats request every 2 seconds, ...
  • Default: 10
  • Notes: Smaller values provide faster updates at the expense of potentially higher error. Larger values will result in less frequent bandwidth update, are less likely to contain significant error, but will not show as many peaks and valleys present in real bandwidth consumption.
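As an illustration of how these two settings and their defaults fit together, the sketch below parses them from properties-style text using only java.util.Properties. It is not the module's actual configuration code: the StatisticsConfig class is hypothetical, and the keys are simply the variable names listed above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical holder for the two startup settings described above. */
final class StatisticsConfig {
    static final String ENABLE_KEY =
            "net.floodlightcontroller.statistics.enable";
    static final String INTERVAL_KEY =
            "net.floodlightcontroller.statistics.collectionIntervalPortStatsSeconds";

    final boolean enabled;
    final int portStatsIntervalSeconds;

    private StatisticsConfig(boolean enabled, int portStatsIntervalSeconds) {
        this.enabled = enabled;
        this.portStatsIntervalSeconds = portStatsIntervalSeconds;
    }

    static StatisticsConfig parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Default FALSE: collection stays off unless explicitly enabled.
        boolean enabled = Boolean.parseBoolean(props.getProperty(ENABLE_KEY, "FALSE").trim());
        // Default 10 seconds; ignore values that are missing, malformed, or not positive.
        int interval = 10;
        String raw = props.getProperty(INTERVAL_KEY);
        if (raw != null) {
            try {
                int parsed = Integer.parseInt(raw.trim());
                if (parsed > 0) {
                    interval = parsed;
                }
            } catch (NumberFormatException e) {
                // keep the default
            }
        }
        return new StatisticsConfig(enabled, interval);
    }
}
```

Falling back to the defaults on bad input is a design choice made for this sketch; a real module might prefer to log a warning as well.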

REST API

The statistics module exposes a basic REST API. Note that the REST API is designed such that you can plug in your own statistics types, e.g. if you wanted to return packets per second instead of bandwidth, you could create another API called /wm/statistics/packet/<switch>/<port>/json and perform the necessary computations when it is called. More information on how to create a REST API can be found in this tutorial.

URI: /wm/statistics/config/enable/json
  • HTTP Command: POST or PUT
  • Parameters: n/a
  • Input: '' (two single quotes; empty string)
  • Output:

    {
        "statistics-collection": "enabled"
    }

  • Notes: Enable statistics collection

URI: /wm/statistics/config/disable/json
  • HTTP Command: POST or PUT
  • Parameters: n/a
  • Input: '' (two single quotes; empty string)
  • Output:

    {
        "statistics-collection": "disabled"
    }

  • Notes: Disable statistics collection

URI: /wm/statistics/bandwidth/<switch>/<port>/json
  • HTTP Command: GET
  • Parameters: <switch> = valid colon-delimited switch DPID hex string or number; <port> = valid switch port number
  • Input: n/a
  • Output:

    [
        {
            "bits-per-second-rx": "<number>",
            "bits-per-second-tx": "<number>",
            "dpid": "<hex-string>",
            "port": "<number>",
            "updated": "<local-date-time-string>"
        },
        ...
    ]

  • Notes: Retrieve bandwidth consumption on a per switch, per port, or combination thereof basis. The RX and TX bandwidth is included, along with the timestamp the controller saved the information for the given switch and port.
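A northbound application could build and call these URIs along the following lines. This is a sketch under stated assumptions: the controller base address (for example http://localhost:8080) depends on your deployment, the StatisticsRestUris class is hypothetical, and get() performs a plain blocking GET that returns the raw JSON body without parsing it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Scanner;

/** Hypothetical helper that assembles the statistics REST URIs listed above. */
final class StatisticsRestUris {
    private final String base; // e.g. "http://localhost:8080" (assumed, deployment-specific)

    StatisticsRestUris(String controllerBase) {
        this.base = controllerBase;
    }

    URI enable() {
        return URI.create(base + "/wm/statistics/config/enable/json");
    }

    URI disable() {
        return URI.create(base + "/wm/statistics/config/disable/json");
    }

    URI bandwidth(String dpid, String port) {
        return URI.create(base + "/wm/statistics/bandwidth/" + dpid + "/" + port + "/json");
    }

    /** Blocking GET that returns the response body as a string. */
    static String get(URI uri) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestMethod("GET");
        try (InputStream in = conn.getInputStream();
             Scanner s = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return s.hasNext() ? s.next() : "";
        } finally {
            conn.disconnect();
        }
    }
}
```

Usage would look like StatisticsRestUris.get(new StatisticsRestUris("http://localhost:8080").bandwidth("00:00:00:00:00:00:00:01", "1")), keeping in mind that collection must be enabled first or there will be nothing to report.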

IStatisticsService

The statistics module implements IStatisticsService, which extends IFloodlightService. All Floodlight services can be leveraged by other Floodlight modules. More information on how to extend and implement an IFloodlightService of your own can be found in this tutorial. The following API is exposed by the IStatisticsService:

Function: getBandwidthConsumption(DatapathId dpid, OFPort p)
  • Description: Get bandwidth stats for a specific switch and switch port
  • Parameters: dpid = switch DPID; p = switch port
  • Return Value: SwitchPortBandwidth; null if not found

Function: getBandwidthConsumption()
  • Description: Get all bandwidth stats per switch and switch port
  • Parameters: n/a
  • Return Value: Map<NodePortTuple, SwitchPortBandwidth>; empty map if no stats present

Function: void collectStatistics(boolean collect)
  • Description: Enable or disable statistics collection
  • Parameters: collect = true to enable; collect = false to disable
  • Return Value: n/a
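A module that consumes the service would typically walk the map returned by the no-argument getBandwidthConsumption() and cope with the empty-map case. The sketch below shows that pattern with plain stand-in types so that it runs without Floodlight on the classpath: the String keys and long[] values are assumptions standing in for NodePortTuple and SwitchPortBandwidth, and busiest() is a hypothetical helper, not part of IStatisticsService.

```java
import java.util.Map;

final class BusiestPort {
    /**
     * Stand-in for the service's map result: each key is "dpid/port" and each
     * value is {rx bits per second, tx bits per second}.
     * Returns the key with the highest combined rate, or null for an empty map.
     */
    static String busiest(Map<String, long[]> bandwidthByPort) {
        String best = null;
        long bestTotal = -1L;
        for (Map.Entry<String, long[]> e : bandwidthByPort.entrySet()) {
            long total = e.getValue()[0] + e.getValue()[1];
            if (total > bestTotal) {
                bestTotal = total;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

In a real module, the same loop would iterate over Map<NodePortTuple, SwitchPortBandwidth> entries obtained from the service instead of the stand-in map.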

Collecting statistics in a Floodlight module

In this section, I'll walk through how the statistics collection works in Floodlight. The purpose of this is not only to show you how it works, but also to give you an idea about how you can expand it to collect other statistics types if bandwidth is not what you're after. Hopefully you can also find some of these techniques useful in your own modules.

To begin, it should be noted that you should never issue synchronous statistics requests in any I/O or listener thread. This will result in blocking other tasks or listeners while you wait on the response. Instead, the recommended way is to use another thread to handle the statistics request and response. This thread is initialized as follows from the IThreadPoolService Floodlight service (provided by another module):

private void startStatisticsCollection() {
	portStatsCollector = threadPoolService
		.getScheduledExecutor()
			.scheduleAtFixedRate(new PortStatsCollector(), portStatsInterval, portStatsInterval, TimeUnit.SECONDS);
	tentativePortStats.clear();
	log.warn("Statistics collection thread(s) started");
}

After portStatsInterval seconds and every portStatsInterval seconds thereafter, the run() function in the PortStatsCollector object created will be executed. Using the portStatsInterval default value of 10 seconds, this means run() will first execute 10s later and then every 10s after that. Note that the ScheduledExecutorService will not allow two executions of the same scheduled task to overlap. If PortStatsCollector's run() takes more than portStatsInterval seconds to complete, the next execution is not started concurrently in another thread; it simply starts late, once the current one has finished. This ensures at most one execution of the scheduled task is running at once. We will use this fact to ensure we have at most one outstanding port stats request for each switch at a given time (from within this module, from this ScheduledFuture).
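The no-overlap guarantee of scheduleAtFixedRate can be observed with a small standalone experiment. The sketch below is not Floodlight code: it schedules a deliberately slow task (120 ms of work on a 50 ms period, both arbitrary values) on a multi-threaded scheduler and records the highest number of simultaneous executions seen. Per the Java documentation, late executions are delayed rather than run concurrently.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class FixedRateOverlapDemo {
    /** Returns {number of finished runs, maximum concurrent runs observed}. */
    static int[] runDemo() {
        // Several threads are available, so any overlap would be possible if allowed.
        ScheduledExecutorService executor = Executors.newScheduledThreadPool(4);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxActive = new AtomicInteger();
        AtomicInteger finished = new AtomicInteger();

        Runnable slowTask = () -> {
            int now = active.incrementAndGet();
            maxActive.accumulateAndGet(now, Math::max);
            try {
                Thread.sleep(120); // longer than the 50 ms period on purpose
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                active.decrementAndGet();
                finished.incrementAndGet();
            }
        };

        executor.scheduleAtFixedRate(slowTask, 0, 50, TimeUnit.MILLISECONDS);
        try {
            Thread.sleep(500);
            executor.shutdownNow();
            executor.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { finished.get(), maxActive.get() };
    }
}
```

The second element of the returned array stays at 1 no matter how slow the task is, which is the property the module relies on to avoid piling up stats requests.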

PortStatsCollector implements Runnable and contains the run() function, which will be invoked by the executor at the times and interval set as described above. The bulk of the run() function is for managing the prior port stats replies in order to take the difference in byte counts (required to compute bandwidth). Ignoring that bulk for now, let's consider the call to getSwitchStatistics() within PortStatsCollector. This returns the stats replies for each switch. The source of getSwitchStatistics() is reproduced below.

private Map<DatapathId, List<OFStatsReply>> getSwitchStatistics(Set<DatapathId> dpids, OFStatsType statsType) {
	HashMap<DatapathId, List<OFStatsReply>> model = new HashMap<DatapathId, List<OFStatsReply>>();
	List<GetStatisticsThread> activeThreads = new ArrayList<GetStatisticsThread>(dpids.size());
	List<GetStatisticsThread> pendingRemovalThreads = new ArrayList<GetStatisticsThread>();
	GetStatisticsThread t;
	for (DatapathId d : dpids) {
		t = new GetStatisticsThread(d, statsType);
		activeThreads.add(t);
		t.start();
	}

	/* Join all the threads after the timeout. Set a hard timeout
	 * of portStatsInterval seconds (one 1s sleep per loop iteration)
	 * for the threads to finish. If the thread has not
	 * finished the switch has not replied yet and therefore we won't
	 * add the switch's stats to the reply.
	 */
	for (int iSleepCycles = 0; iSleepCycles < portStatsInterval; iSleepCycles++) {
		for (GetStatisticsThread curThread : activeThreads) {
			if (curThread.getState() == State.TERMINATED) {
				model.put(curThread.getSwitchId(), curThread.getStatisticsReply());
				pendingRemovalThreads.add(curThread);
			}
		}
		/* remove the threads that have completed the queries to the switches */
		for (GetStatisticsThread curThread : pendingRemovalThreads) {
			activeThreads.remove(curThread);
		}

		/* clear the list so we don't try to double remove them */
		pendingRemovalThreads.clear();

		/* if we are done finish early */
		if (activeThreads.isEmpty()) {
			break;
		}
		try {
			Thread.sleep(1000);
		} catch (InterruptedException e) {
			log.error("Interrupted while waiting for statistics", e);
		}
	}

	return model;
}

Because each individual statistics request must wait for an arbitrary amount of time for the response to return, we also use threads here, specifically GetStatisticsThread, which extends Thread. The algorithm is to (1) spawn a new GetStatisticsThread, one for each switch, (2) poll for the thread(s) to complete, and (3) retrieve the replies from each thread after it completes. Each thread is started as shown above, which chains into GetStatisticsThread's run() function, which finally chains into an overloaded, per-switch getSwitchStatistics(). That function writes the specific statistics request to the IOFSwitch object. Note the ListenableFuture; this is what will be notified when the response comes back using the Floodlight core OpenFlow connection handler.
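For comparison, the same spawn, wait, and collect pattern can be expressed with a standard ExecutorService instead of hand-managed Thread objects. To be clear, this is a swapped-in technique and not how the module does it: invokeAll() with a timeout replaces the polling loop, and the SwitchQuery interface and String-based switch IDs and replies are hypothetical stand-ins for the real per-switch query.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class StatsFanOut {
    /** Hypothetical stand-in for a blocking per-switch stats query. */
    interface SwitchQuery {
        String query(String switchId) throws Exception;
    }

    /** Queries every switch in parallel; switches that miss the timeout are left out. */
    static Map<String, String> collect(List<String> switchIds, SwitchQuery q, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, switchIds.size()));
        Map<String, String> replies = new HashMap<>();
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String id : switchIds) {
                tasks.add(() -> q.query(id));
            }
            // invokeAll returns futures in task order and cancels any still running at the timeout.
            List<Future<String>> futures = pool.invokeAll(tasks, timeoutMs, TimeUnit.MILLISECONDS);
            for (int i = 0; i < futures.size(); i++) {
                try {
                    replies.put(switchIds.get(i), futures.get(i).get());
                } catch (CancellationException | ExecutionException e) {
                    // timed out or failed: omit this switch from the result
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return replies;
    }
}
```

As in the module, a switch that does not answer in time is simply missing from the returned map rather than causing the whole collection to fail.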

The overloaded getSwitchStatistics() is reproduced in part below. Note how the specific switch statistics type is abstracted away up until this point. Here, the specific statistics message (in our case PORT stats) is sent to the switch. The code is designed this way to increase reuse and support many different statistics types from this module.

protected List<OFStatsReply> getSwitchStatistics(DatapathId switchId, OFStatsType statsType) {
	IOFSwitch sw = switchService.getSwitch(switchId);
	ListenableFuture<?> future;
	List<OFStatsReply> values = null;
	Match match;

	if (sw != null) {
		OFStatsRequest<?> req = null;
		switch (statsType) {
		case FLOW:
			match = sw.getOFFactory().buildMatch().build();
			req = sw.getOFFactory().buildFlowStatsRequest()
				.setMatch(match)
				.setOutPort(OFPort.ANY)
				.setTableId(TableId.ALL)
				.build();
			break;
		...
		...
		case PORT:
			req = sw.getOFFactory().buildPortStatsRequest()
				.setPortNo(OFPort.ANY)
				.build();
			break;
		...
		...
		default:
			log.error("Stats Request Type {} not implemented yet", statsType.name());
			break;
		}

		try {
			if (req != null) {
				future = sw.writeStatsRequest(req); 
				values = (List<OFStatsReply>) future.get(portStatsInterval / 2, TimeUnit.SECONDS);
			}
		} catch (Exception e) {
			log.error("Failure retrieving statistics from switch {}. {}", sw, e);
		}
	}
	return values;
}

Side note: For those who are interested, what happens under the hood is the XID of the request will be used to match its response to the corresponding ListenableFuture. The response is required to share the same XID as the request as stated in the OpenFlow specification. This means that there should not be more than one request with the same XID outstanding at the same time. The OpenFlowJ-Loxi library backing Floodlight ensures this, unless a user module manually sets an XID, breaking this convention.
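The idea of matching replies to futures by XID can be illustrated with a toy model. This is not Floodlight's connection handler: XidMatcher and its methods are hypothetical, replies are plain strings, and each request is assumed to receive exactly one reply.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Toy model: pair each outgoing request's XID with a future for its reply. */
final class XidMatcher {
    /** The XID assigned to a request together with the future for its reply. */
    static final class Pending {
        final long xid;
        final CompletableFuture<String> reply;

        Pending(long xid, CompletableFuture<String> reply) {
            this.xid = xid;
            this.reply = reply;
        }
    }

    private final AtomicLong nextXid = new AtomicLong(1);
    private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    /** Assigns a fresh XID, so no two outstanding requests share one. */
    Pending sendRequest() {
        long xid = nextXid.getAndIncrement();
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(xid, future);
        return new Pending(xid, future);
    }

    /** Completes the matching future; returns false if no request is waiting on this XID. */
    boolean onReply(long xid, String payload) {
        CompletableFuture<String> future = pending.remove(xid);
        if (future == null) {
            return false;
        }
        future.complete(payload);
        return true;
    }
}
```

If two outstanding requests were given the same XID by hand, the second put() would overwrite the first future and its caller would never be notified, which is why manually setting XIDs is discouraged.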

The replies to each individual statistics request return from the above per-switch getSwitchStatistics() function to the multithreaded getSwitchStatistics() function, which then returns to where it was called in PortStatsCollector, reproduced below. As shown, the majority of the code in this outer thread is for making sense of the replies in the context of bandwidth. The timestamp of the previous reply and the current reply for each switch and port are subtracted to get the approximate time elapsed. The difference in RX and TX byte counters for the prior and current replies is taken to get the bytes received and transmitted during this time. Note that we have to account for counter overflows.

public void run() {
	Map<DatapathId, List<OFStatsReply>> replies = getSwitchStatistics(switchService.getAllSwitchDpids(), OFStatsType.PORT);
	for (Entry<DatapathId, List<OFStatsReply>> e : replies.entrySet()) {
		for (OFStatsReply r : e.getValue()) {
			OFPortStatsReply psr = (OFPortStatsReply) r;
			for (OFPortStatsEntry pse : psr.getEntries()) {
				NodePortTuple npt = new NodePortTuple(e.getKey(), pse.getPortNo());
				SwitchPortBandwidth spb;
				if (portStats.containsKey(npt) || tentativePortStats.containsKey(npt)) {
					if (portStats.containsKey(npt)) { /* update */
						spb = portStats.get(npt);
					} else if (tentativePortStats.containsKey(npt)) { /* finish */
						spb = tentativePortStats.get(npt);
						tentativePortStats.remove(npt);
					} else {
						log.error("Inconsistent state between tentative and official port stats lists.");
						return;
					}
					/* Get counted bytes over the elapsed period. Check for counter overflow. */
					U64 rxBytesCounted;
					U64 txBytesCounted;
					if (spb.getPriorByteValueRx().compareTo(pse.getRxBytes()) > 0) { /* overflow */
						U64 upper = U64.NO_MASK.subtract(spb.getPriorByteValueRx());
						U64 lower = pse.getRxBytes();
						rxBytesCounted = upper.add(lower);
					} else {
						rxBytesCounted = pse.getRxBytes().subtract(spb.getPriorByteValueRx());
					}
					if (spb.getPriorByteValueTx().compareTo(pse.getTxBytes()) > 0) { /* overflow */
						U64 upper = U64.NO_MASK.subtract(spb.getPriorByteValueTx());
						U64 lower = pse.getTxBytes();
						txBytesCounted = upper.add(lower);
					} else {
						txBytesCounted = pse.getTxBytes().subtract(spb.getPriorByteValueTx());
					}
					long timeDifSec = (System.currentTimeMillis() - spb.getUpdateTime()) / MILLIS_PER_SEC;
					portStats.put(npt, SwitchPortBandwidth.of(npt.getNodeId(), npt.getPortId(), 
							U64.ofRaw((rxBytesCounted.getValue() * BITS_PER_BYTE) / timeDifSec), 
							U64.ofRaw((txBytesCounted.getValue() * BITS_PER_BYTE) / timeDifSec), 
							pse.getRxBytes(), pse.getTxBytes())
							);
				} else { /* initialize */
					tentativePortStats.put(npt, SwitchPortBandwidth.of(npt.getNodeId(), npt.getPortId(), U64.ZERO, U64.ZERO, pse.getRxBytes(), pse.getTxBytes()));
				}
			}
		}
	}
}
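As a side illustration of the overflow handling, here is an alternative way to take the counter difference using plain long arithmetic. This is a sketch and not the module's code: it relies on Java's two's-complement subtraction being modulo 2^64, so a single counter wrap is absorbed without an explicit branch, and it works in milliseconds with a guard for a zero elapsed time. The class and method names are hypothetical.

```java
final class CounterDelta {
    /** Bytes counted between two reads of a 64-bit counter, both treated as unsigned. */
    static long unsignedDelta(long prior, long current) {
        // Subtraction wraps modulo 2^64, so one counter rollover is handled implicitly.
        return current - prior;
    }

    /** Bits per second over elapsedMs; returns 0 if no time has elapsed. */
    static long bitsPerSecond(long prior, long current, long elapsedMs) {
        if (elapsedMs <= 0) {
            return 0L; // avoid dividing by zero when samples land too close together
        }
        // Note: the multiplication can overflow for deltas above roughly 2^64 / 8000 bytes,
        // which is far more than a port could move in one polling interval.
        long scaledBits = unsignedDelta(prior, current) * 8L * 1000L;
        return Long.divideUnsigned(scaledBits, elapsedMs);
    }
}
```

Here a prior value of 0xFFFFFFFFFFFFFFFE (written as -2L in Java) and a current value of 3 give a delta of 5, since the counter passed through its maximum and zero on the way.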

Concluding remarks

Hopefully this has been informative and has answered some questions, by means of example, that many have had about how to collect statistics from a module and also how to determine port bandwidth. I'll add that this module can be readily expanded to poll and compute whatever statistic you want. If you want, you can change from port stats to flow stats to narrow down the RX and TX counters even more. For flows, you'll have the byte counters to show how many bytes have matched the flows in question, not including other traffic flows present. A possible application would be to use flow byte counters to compute the percentage of consumed bandwidth a particular flow uses with regard to the total bandwidth being consumed.

Still have questions? Write to our email list.