Distributed Mode¶
In this page, we describe how to set up Jubatus in distributed mode across multiple nodes.
We recommend trying this tutorial after completing the Basic Tutorial in standalone mode.
You can run Jubatus in a distributed environment using ZooKeeper and Jubatus proxies.
Setup ZooKeeper¶
ZooKeeper is a centralized service for maintaining configuration information, naming, and providing distributed synchronization and group services. Jubatus in distributed mode uses ZooKeeper to manage its servers and proxies.
Run ZooKeeper server like this:
$ /path/to/zookeeper/bin/zkServer.sh start
JMX enabled by default
Using config: /path/to/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ...
STARTED
Here we assume that ZooKeeper is running on localhost:2181. You can change this in the zoo.cfg file.
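For reference, a minimal single-node zoo.cfg might look like the following sketch; the dataDir path is illustrative:

```
# Minimal single-node ZooKeeper configuration (illustrative paths)
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
```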
Register configuration file to ZooKeeper¶
In distributed mode, register the configuration file on the local file system to ZooKeeper using jubaconfig.
$ jubaconfig --cmd write --zookeeper=localhost:2181 --file config.json --name tutorial --type classifier
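jubaconfig expects a JSON configuration for the target engine. As a reference point, a minimal classifier config.json might look like the sketch below; the method and converter rules are illustrative, not prescriptive:

```json
{
  "method": "AROW",
  "parameter": {
    "regularization_weight": 1.0
  },
  "converter": {
    "num_filter_types": {},
    "num_filter_rules": [],
    "string_filter_types": {},
    "string_filter_rules": [],
    "num_types": {},
    "num_rules": [
      {"key": "*", "type": "num"}
    ],
    "string_types": {},
    "string_rules": [
      {"key": "*", "type": "space", "sample_weight": "bin", "global_weight": "bin"}
    ]
  }
}
```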
Jubatus Proxy¶
Jubatus proxies relay RPC requests from clients to servers. In distributed mode, clients make RPC requests to proxies, not directly to servers.
A proxy is provided for each Jubatus server type; for the classifier, jubaclassifier_proxy is the corresponding proxy.
$ jubaclassifier_proxy --zookeeper=localhost:2181 --rpc-port=9198
jubaclassifier_proxy is now listening on TCP port 9198 for RPC requests.
Join Jubatus Servers to Cluster¶
To start Jubatus servers in cluster mode, pass the --name and --zookeeper options when executing the servers.
Server processes started with the same name belong to the same cluster and collaborate with one another.
If you want to start multiple server processes on the same machine, note that you must use a different port for each process.
$ jubaclassifier --rpc-port=9180 --name=tutorial --zookeeper=localhost:2181 &
$ jubaclassifier --rpc-port=9181 --name=tutorial --zookeeper=localhost:2181 &
$ jubaclassifier --rpc-port=9182 --name=tutorial --zookeeper=localhost:2181 &
When Jubatus servers are started in cluster mode, they register nodes in ZooKeeper. You can verify that the three server processes are registered by using the ZooKeeper client.
$ /path/to/zookeeper/bin/zkCli.sh -server localhost:2181
[zk: localhost:2181(CONNECTED) 0] ls /jubatus/actors/classifier/tutorial/nodes
[XXX.XXX.XXX.XXX_9180, XXX.XXX.XXX.XXX_9181, XXX.XXX.XXX.XXX_9182]
Run Tutorial¶
Run the tutorial program again, but this time specify the proxy's port instead of a server's. In cluster mode, you also need to specify the cluster name when making RPC requests to proxies.
$ python tutorial.py --server_port=9198 --name=tutorial
Note that the same client code works in both standalone mode and distributed mode.
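The note above can be sketched in code: the client only needs its connection target and cluster name parameterized to cover both modes. The flag names below mirror the tutorial.py invocations shown on this page, and the convention that standalone mode uses an empty RPC name is an assumption of this sketch:

```python
# Sketch of the option handling that lets one tutorial.py serve both
# modes. Flag names mirror the invocations on this page; the empty-name
# convention for standalone mode is an assumption of this sketch.
import argparse

def parse_target(argv):
    """Return (host, port, name) to pass to the Jubatus client."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--server_ip", default="127.0.0.1")
    parser.add_argument("--server_port", type=int, default=9199)
    parser.add_argument("--name", default="")  # empty = standalone server
    args = parser.parse_args(argv)
    return args.server_ip, args.server_port, args.name

# Standalone: point at the server itself, no cluster name needed.
assert parse_target(["--server_port=9199"]) == ("127.0.0.1", 9199, "")
# Distributed: point at a proxy and pass the cluster name.
assert parse_target(["--server_port=9198", "--name=tutorial"]) == \
    ("127.0.0.1", 9198, "tutorial")
```

The rest of the client (creating the RPC client, training, classifying) stays identical; only the tuple returned here changes between modes.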
Cluster Management in Jubatus¶
Jubatus has a mechanism to centrally manage its processes. In this tutorial, processes are deployed across hosts as shown in the following table.
| IP Address | Processes |
|---|---|
| 192.168.0.1 | Terminal |
| 192.168.0.11 | jubaclassifier - 1 |
| 192.168.0.12 | jubaclassifier - 2 |
| 192.168.0.13 | jubaclassifier - 3 |
| 192.168.0.101 | jubaclassifier_proxy / client - 1 |
| 192.168.0.102 | jubaclassifier_proxy / client - 2 |
| 192.168.0.103 | jubaclassifier_proxy / client - 3 |
| 192.168.0.211 | ZooKeeper - 1 |
| 192.168.0.212 | ZooKeeper - 2 |
| 192.168.0.213 | ZooKeeper - 3 |
For best practices, see the Cluster Administration Guide.
ZooKeepers & Jubatus Proxies¶
Start ZooKeeper servers (make sure you configure an ensemble between them).
[192.168.0.211]$ bin/zkServer.sh start
[192.168.0.212]$ bin/zkServer.sh start
[192.168.0.213]$ bin/zkServer.sh start
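An ensemble is configured by listing every member in each node's zoo.cfg. A sketch, with illustrative ports and paths; note that each node also needs a myid file in dataDir containing its own server number:

```
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=10
syncLimit=5
# One line per ensemble member: server.<id>=<host>:<peer-port>:<election-port>
server.1=192.168.0.211:2888:3888
server.2=192.168.0.212:2888:3888
server.3=192.168.0.213:2888:3888
```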
Start jubaclassifier_proxy processes. jubaclassifier_proxy uses TCP port 9199 by default.
[192.168.0.101]$ jubaclassifier_proxy --zookeeper 192.168.0.211:2181,192.168.0.212:2181,192.168.0.213:2181
[192.168.0.102]$ jubaclassifier_proxy --zookeeper 192.168.0.211:2181,192.168.0.212:2181,192.168.0.213:2181
[192.168.0.103]$ jubaclassifier_proxy --zookeeper 192.168.0.211:2181,192.168.0.212:2181,192.168.0.213:2181
Jubavisor: Process Management Agent¶
jubavisor is an agent process that manages server processes.
It manages each Jubatus server process by receiving RPC requests from jubactl, a controller command.
jubavisor uses TCP port 9198 by default.
[192.168.0.11]$ jubavisor --zookeeper 192.168.0.211:2181,192.168.0.212:2181,192.168.0.213:2181 --daemon
[192.168.0.12]$ jubavisor --zookeeper 192.168.0.211:2181,192.168.0.212:2181,192.168.0.213:2181 --daemon
[192.168.0.13]$ jubavisor --zookeeper 192.168.0.211:2181,192.168.0.212:2181,192.168.0.213:2181 --daemon
Now send commands from jubactl to jubavisor.
[192.168.0.1]$ jubactl -c start --server=jubaclassifier --type=classifier --name=tutorial --zookeeper=192.168.0.211:2181,192.168.0.212:2181,192.168.0.213:2181
[192.168.0.1]$ jubactl -c status --server=jubaclassifier --type=classifier --name=tutorial --zookeeper=192.168.0.211:2181,192.168.0.212:2181,192.168.0.213:2181
active jubaclassifier_proxy members:
192.168.0.101_9199
192.168.0.102_9199
192.168.0.103_9199
active jubavisor members:
192.168.0.11_9198
192.168.0.12_9198
192.168.0.13_9198
active tutorial members:
192.168.0.11_9199
192.168.0.12_9199
192.168.0.13_9199
From the members list, you can see that the servers are running. Now run clients simultaneously from multiple hosts.
[192.168.0.101]$ python tutorial.py --name=tutorial --server_ip=127.0.0.1 --server_port=9199
[192.168.0.102]$ python tutorial.py --name=tutorial --server_ip=127.0.0.1 --server_port=9199
[192.168.0.103]$ python tutorial.py --name=tutorial --server_ip=127.0.0.1 --server_port=9199
You can also stop Jubatus server instances with jubactl.
[192.168.0.1]$ jubactl -c stop --server=jubaclassifier --type=classifier --name=tutorial --zookeeper=192.168.0.211:2181,192.168.0.212:2181,192.168.0.213:2181