How to Add Fault Tolerance to the Control Plane
This tutorial and the FT module were designed by Tulio Alberton Ribeiro of the LaSIGE - Large-Scale Informatics Systems Laboratory. Thanks Tulio!
Using the Included FT Module
If you only want to use the FT module as written, and are not interested in developing or understanding the code, you only need to complete this section. That said, going through the entire tutorial will probably be beneficial.
Creating the Keystore
The first thing you need to do is generate the key used in challenge-response authentication, as follows:
# keytool -genkey -alias AliasChallengeResponse -keystore myKey.jceks -keypass "YourPassWord" -storepass "YourPassWord" -storetype JCEKS
Currently, the alias passed to keytool is hard-coded in the CryptoUtil class, located at floodlight/src/main/java/org/sdnplatform/sync/internal/util/CryptoUtil.java:
public static final String CHALLENGE_RESPONSE_SECRET = "AliasChallengeResponse";
This means the keytool -alias option must be given the value shown above, because the value of CHALLENGE_RESPONSE_SECRET is used to retrieve the key from the keystore.
To use a different alias, change it in both places: the keytool command and the CHALLENGE_RESPONSE_SECRET constant.
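To illustrate why the alias has to match in both places, here is a minimal sketch of alias-based lookup using the standard java.security.KeyStore API. It is not Floodlight's CryptoUtil code. The class name KeystoreAliasSketch and its methods are hypothetical, and to stay self-contained it builds an in-memory JCEKS keystore holding a generated HMAC secret key, rather than reading the key pair that the keytool command above creates.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyStore;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical helper for illustration only; not part of Floodlight.
final class KeystoreAliasSketch {
    // Same alias the tutorial passes to keytool.
    static final String ALIAS = "AliasChallengeResponse";

    /** Builds an in-memory JCEKS keystore holding one secret key under the given alias. */
    static byte[] buildKeystore(String alias, char[] password)
            throws GeneralSecurityException, IOException {
        // Assumption: an HMAC key stands in for whatever key the real keystore holds.
        SecretKey secret = KeyGenerator.getInstance("HmacSHA1").generateKey();
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(null, password); // null stream: start from an empty keystore
        ks.setEntry(alias, new KeyStore.SecretKeyEntry(secret),
                new KeyStore.PasswordProtection(password));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ks.store(out, password);
        return out.toByteArray();
    }

    /** Returns the key stored under the alias, or null if the alias is absent. */
    static Key readKey(byte[] keystoreBytes, String alias, char[] password)
            throws GeneralSecurityException, IOException {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(new ByteArrayInputStream(keystoreBytes), password);
        return ks.getKey(alias, password);
    }

    /** True if a key written under storedAlias can be read back under lookupAlias. */
    static boolean aliasResolves(String storedAlias, String lookupAlias, String password) {
        try {
            byte[] bytes = buildKeystore(storedAlias, password.toCharArray());
            return readKey(bytes, lookupAlias, password.toCharArray()) != null;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("Keystore round trip failed", e);
        }
    }
}
```

Calling aliasResolves(ALIAS, ALIAS, "YourPassWord") succeeds, while looking up an alias that was never stored yields no key. The same thing would happen if the alias given to keytool did not match the constant in the code.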
Testing the Keystore
After key generation, you can test the accessibility of your keystore:
# keytool -list -alias AliasChallengeResponse -keystore myKey.jceks -storetype JCEKS
Enter keystore password: YourPassWord /* this should be your password as defined above */
AliasChallengeResponse, 24/Mar/2016, PrivateKeyEntry,
Certificate fingerprint (SHA1): A2:1B:49:1B:18:D8:DC:95:CC:9F:C3:33:94:04:39:EE:44:DD:CF:BE
Defining Controllers
The Primary Controller
First, note that net.floodlightcontroller.core.internal.FloodlightProvider.controllerId and org.sdnplatform.sync.internal.SyncManager.thisNodeId should be set to the same value. Note also that every switch defined in net.floodlightcontroller.core.internal.OFSwitchManager.switchesInitialState should be set to ROLE_MASTER on the primary controller and ROLE_SLAVE on the backup controller.
org.sdnplatform.sync.internal.SyncManager.authScheme=CHALLENGE_RESPONSE
org.sdnplatform.sync.internal.SyncManager.keyStorePath=/etc/floodlight/myKey.jceks
org.sdnplatform.sync.internal.SyncManager.dbPath=/var/lib/floodlight/
org.sdnplatform.sync.internal.SyncManager.keyStorePassword=YourPassWord
org.sdnplatform.sync.internal.SyncManager.port=6642
org.sdnplatform.sync.internal.SyncManager.thisNodeId=1
org.sdnplatform.sync.internal.SyncManager.persistenceEnabled=FALSE
org.sdnplatform.sync.internal.SyncManager.nodes=[\
{"nodeId": 1, "domainId": 1, "hostname": "192.168.1.100", "port": 6642},\
{"nodeId": 2, "domainId": 1, "hostname": "192.168.1.100", "port": 6643}\
]
net.floodlightcontroller.core.internal.FloodlightProvider.controllerId=1
net.floodlightcontroller.core.internal.OFSwitchManager.switchesInitialState={"00:00:00:00:00:00:00:01":"ROLE_MASTER","00:00:00:00:00:00:00:02":"ROLE_MASTER", "00:00:00:00:00:00:00:03":"ROLE_MASTER", "00:00:00:00:00:00:00:04":"ROLE_MASTER","00:00:00:00:00:00:00:05":"ROLE_MASTER"}
The Backup Controller
Note that thisNodeId and controllerId are set to 2 in this case (and must be different from 1, as defined above for the master). Also note that the switch roles defined in switchesInitialState are set to ROLE_SLAVE, as this is the backup controller.
org.sdnplatform.sync.internal.SyncManager.authScheme=CHALLENGE_RESPONSE
org.sdnplatform.sync.internal.SyncManager.keyStorePath=/etc/floodlight/key2.jceks
org.sdnplatform.sync.internal.SyncManager.dbPath=/var/lib/floodlight2/
org.sdnplatform.sync.internal.SyncManager.keyStorePassword=PassWord
org.sdnplatform.sync.internal.SyncManager.port=6643
org.sdnplatform.sync.internal.SyncManager.thisNodeId=2
org.sdnplatform.sync.internal.SyncManager.persistenceEnabled=FALSE
org.sdnplatform.sync.internal.SyncManager.nodes=[\
{"nodeId": 1, "domainId": 1, "hostname": "192.168.1.100", "port": 6642},\
{"nodeId": 2, "domainId": 1, "hostname": "192.168.1.100", "port": 6643}\
]
net.floodlightcontroller.core.internal.FloodlightProvider.controllerId=2
net.floodlightcontroller.core.internal.OFSwitchManager.switchesInitialState={"00:00:00:00:00:00:00:01":"ROLE_SLAVE","00:00:00:00:00:00:00:02":"ROLE_SLAVE", "00:00:00:00:00:00:00:03":"ROLE_SLAVE", "00:00:00:00:00:00:00:04":"ROLE_SLAVE","00:00:00:00:00:00:00:05":"ROLE_SLAVE"}
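The two rules stated for these files (thisNodeId equals controllerId, and every switch role is ROLE_MASTER on the primary or ROLE_SLAVE on the backup) are easy to get wrong when copying one file to make the other. The following is a minimal sketch of a pre-flight check using java.util.Properties. The class FtConfigCheck and its methods are hypothetical and not part of Floodlight, and the role check assumes role names of the form ROLE_XXX inside double quotes, as in the examples above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-flight check for illustration; not part of Floodlight.
final class FtConfigCheck {
    static final String NODE_ID_KEY =
            "org.sdnplatform.sync.internal.SyncManager.thisNodeId";
    static final String CONTROLLER_ID_KEY =
            "net.floodlightcontroller.core.internal.FloodlightProvider.controllerId";
    static final String ROLES_KEY =
            "net.floodlightcontroller.core.internal.OFSwitchManager.switchesInitialState";

    /** Parses properties text; java.util.Properties joins lines ending in a backslash. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("Could not parse properties text", e);
        }
        return props;
    }

    /** True when thisNodeId and controllerId are both present and equal. */
    static boolean idsMatch(Properties props) {
        String nodeId = props.getProperty(NODE_ID_KEY);
        String controllerId = props.getProperty(CONTROLLER_ID_KEY);
        return nodeId != null && controllerId != null
                && nodeId.trim().equals(controllerId.trim());
    }

    /** True when at least one role is listed and every listed role equals expectedRole. */
    static boolean allRolesAre(Properties props, String expectedRole) {
        // Assumption: roles appear as quoted ROLE_XXX tokens, as in the tutorial's examples.
        Matcher m = Pattern.compile("\"(ROLE_[A-Z]+)\"")
                .matcher(props.getProperty(ROLES_KEY, ""));
        boolean sawRole = false;
        while (m.find()) {
            if (!m.group(1).equals(expectedRole)) {
                return false;
            }
            sawRole = true;
        }
        return sawRole;
    }
}
```

For the primary's file one would expect idsMatch(props) and allRolesAre(props, "ROLE_MASTER") to both hold; for the backup's file, idsMatch(props) and allRolesAre(props, "ROLE_SLAVE").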
Running the Module
To run the FT module, make sure it appears in the list of modules to load in floodlightdefault.properties, save the file, and run the controller. It's as simple as that.
Using the Sync Service – A Developer's Guide
This is not a step-by-step guide. The reader is expected to be comfortable with Java and with writing Floodlight modules (tutorial here). You can follow along using the SimpleFT code here.
Initialize
To use the sync service, we need to create two variables: one for ISyncService (the IFloodlightService we'll leverage) and one for IStoreClient (our module, as a user of the sync service):
// Fields of the module class
private ISyncService syncService;
private IStoreClient<String, String> storeFT;

// In init(FloodlightModuleContext context)
this.syncService = context.getServiceImpl(ISyncService.class);
Next, we need to start our store with global scope, which allows us to sync our store with other controllers and receive remote updates:
try {
    this.syncService.registerStore("NameOfMyStore", Scope.GLOBAL);
    this.storeFT = this.syncService.getStoreClient("NameOfMyStore", String.class, String.class);
    this.storeFT.addStoreListener(this);
} catch (SyncException e) {
    throw new FloodlightModuleException("Error while setting up sync service", e);
}
Read/Write Data Operations
To add data to our store:
try {
    this.storeFT.put("Key Y", "Data X");
} catch (SyncException e) {
    e.printStackTrace();
}
To retrieve data from our store:
try {
    // Keep the result; the store holds String values, so no toString() is needed
    String value = this.storeFT.get("Key Y").getValue();
} catch (SyncException e) {
    e.printStackTrace();
}
And finally, if we wish to monitor our store, our module or monitoring class must implement the IStoreListener<String> interface. (In this case the store's keys are of type String, but this can vary if you wish to store other types.)
Receiving Updates to the Store (from a Remote Controller)
In the example below, we show how our module can receive and process store updates from remote controllers using the keysModified() callback function. If you wish, you can uncomment debug logging code and see local and remote updates from your sync store.
@Override
public void keysModified(Iterator<String> keys, org.sdnplatform.sync.IStoreListener.UpdateType type) {
    while (keys.hasNext()) {
        String k = keys.next();
        try {
            /*
            logger.debug("keysModified: Key:{}, Value:{}, Type: {}",
                    new Object[] { k, storeFT.get(k).getValue().toString(), type.name() });
            */
            if (type.name().equals("REMOTE")) {
                // Only react to updates that originated on another controller
                String info = storeFT.get(k).getValue();
                logger.debug("REMOTE: Key:{}, Value:{}", k, info);
            }
        } catch (SyncException e) {
            e.printStackTrace();
        }
    }
}
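To experiment with the callback flow without a running cluster, the following toy model may help. It is a sketch only: ToyStore, Listener, RemoteRecorder, and this UpdateType enum are hypothetical in-memory stand-ins written for this example, not the sync service's classes, and nothing here touches the network. It shows a listener that ignores local writes and records only remote ones, comparing the enum value directly rather than its name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Toy stand-ins for illustration only; these are not the sync service's types.
final class ToyStoreSketch {
    /** Hypothetical mirror of the local/remote distinction described above. */
    enum UpdateType { LOCAL, REMOTE }

    /** Hypothetical listener shape, modeled on the keysModified() callback. */
    interface Listener {
        void keysModified(Iterator<String> keys, UpdateType type);
    }

    /** In-memory stand-in for a synced store. */
    static final class ToyStore {
        private final Map<String, String> data = new HashMap<>();
        private final List<Listener> listeners = new ArrayList<>();

        void addListener(Listener listener) { listeners.add(listener); }

        String get(String key) { return data.get(key); }

        /** A write made by this controller. */
        void put(String key, String value) { apply(key, value, UpdateType.LOCAL); }

        /** Simulates an update arriving from another controller. */
        void receiveRemote(String key, String value) { apply(key, value, UpdateType.REMOTE); }

        private void apply(String key, String value, UpdateType type) {
            data.put(key, value);
            for (Listener listener : listeners) {
                listener.keysModified(Collections.singletonList(key).iterator(), type);
            }
        }
    }

    /** Records only remote updates as "key=value" strings. */
    static final class RemoteRecorder implements Listener {
        private final ToyStore store;
        final List<String> seen = new ArrayList<>();

        RemoteRecorder(ToyStore store) { this.store = store; }

        @Override
        public void keysModified(Iterator<String> keys, UpdateType type) {
            if (type != UpdateType.REMOTE) {
                return; // ignore our own writes
            }
            while (keys.hasNext()) {
                String key = keys.next();
                seen.add(key + "=" + store.get(key));
            }
        }
    }
}
```

In this model, a put() on the store leaves the recorder's list empty, while receiveRemote() adds one entry. That is the same filtering the keysModified() example above performs with its REMOTE check.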
FT Implementation Details
The FT module defines an RPCListener, which it uses to monitor the RPC connections between cluster nodes and to inform all synced nodes of connect and disconnect events.
When a controller boots up and connects, the SimpleFT module inserts a list of its switches and their roles into the store. When a controller disconnects, the module retrieves the disconnected node's switch list from the store and sets all of that controller's switches to MASTER.
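The connect/disconnect behavior described above can be sketched as a small state machine. This is an illustrative model under stated assumptions, not the SimpleFT source: the class FailoverSketch, its method names, and the use of plain maps in place of the synced store are all hypothetical, and role changes are recorded locally rather than sent to switches.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the takeover logic; not the SimpleFT implementation.
final class FailoverSketch {
    /** Stand-in for the synced store: node ID -> DPIDs that node published on connect. */
    private final Map<Integer, List<String>> switchesByNode = new HashMap<>();
    /** Roles this controller currently holds, keyed by DPID. */
    private final Map<String, String> localRoles = new HashMap<>();

    FailoverSketch(Map<String, String> initialRoles) {
        localRoles.putAll(initialRoles);
    }

    /** Called when a peer connects and publishes its switch list. */
    void onNodeConnected(int nodeId, List<String> dpids) {
        switchesByNode.put(nodeId, new ArrayList<>(dpids));
    }

    /** Called when a peer disconnects; takes over its switches and returns them. */
    List<String> onNodeDisconnected(int nodeId) {
        List<String> orphaned = switchesByNode.get(nodeId);
        if (orphaned == null) {
            return Collections.emptyList(); // nothing known about this node
        }
        for (String dpid : orphaned) {
            localRoles.put(dpid, "ROLE_MASTER");
        }
        return orphaned;
    }

    /** Returns the role held for a switch, or null if the switch is unknown. */
    String roleOf(String dpid) {
        return localRoles.get(dpid);
    }
}
```

With the backup controller starting every switch as ROLE_SLAVE, a disconnect of node 1 in this model flips each of node 1's published switches to ROLE_MASTER, while a disconnect of an unknown node changes nothing.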
More Information
The source code is located on GitHub here.