Common Data Structures and Methods

These data structures and methods are available in each server.

Data Structures

message datum

Represents a set of data used for machine learning in Jubatus. See Data Conversion for details.

You can change internal values of a datum with these methods.

datum add_string(0: string key, 1: string value)
Parameters:
  • key – The key of the value to add. Cannot contain “$”.
  • value – The value to add.
Returns:

Returns a pointer to itself.

Add a string value.

datum add_number(0: string key, 1: double value)
Parameters:
  • key – The key of the value to add.
  • value – The value to add.
Returns:

Returns a pointer to itself.

Add a numeric value.

datum add_binary(0: string key, 1: raw value)
Parameters:
  • key – The key of the value to add.
  • value – The value to add.
Returns:

Returns a pointer to itself.

Add a binary value.

The internal representation of a datum is as follows:

0: list<tuple<string, string>> string_values

Input data represented as strings, stored as key-value pairs. Key names cannot contain the “$” sign.

1: list<tuple<string, double>> num_values

Input data represented as numeric values, stored as key-value pairs.

2: list<tuple<string, raw>> binary_values

Input data represented as binary values, stored as key-value pairs.

message datum {
  0: list<tuple<string, string> > string_values
  1: list<tuple<string, double> > num_values
  2: list<tuple<string, raw> > binary_values
}
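The layout above can be modelled in a few lines of plain Python. This is only a sketch: DatumSketch is a hypothetical stand-in, not the datum class shipped with the Jubatus client, and raising ValueError for a key containing “$” is an assumption about how the restriction might be enforced.

```python
from typing import List, Tuple


class DatumSketch:
    """Illustrative stand-in for a datum; not the Jubatus client class."""

    def __init__(self) -> None:
        # The three fields mirror entries 0, 1 and 2 of the message definition.
        self.string_values: List[Tuple[str, str]] = []
        self.num_values: List[Tuple[str, float]] = []
        self.binary_values: List[Tuple[str, bytes]] = []

    def add_string(self, key: str, value: str) -> "DatumSketch":
        # Assumption: the "$" restriction is checked eagerly on the client side.
        if "$" in key:
            raise ValueError("string key cannot contain '$': %r" % key)
        self.string_values.append((key, value))
        return self  # returning self allows calls to be chained

    def add_number(self, key: str, value: float) -> "DatumSketch":
        self.num_values.append((key, float(value)))
        return self

    def add_binary(self, key: str, value: bytes) -> "DatumSketch":
        self.binary_values.append((key, bytes(value)))
        return self


# Keys and values below are made up for illustration.
d = DatumSketch().add_string("title", "hello").add_number("length", 5)
d.add_binary("thumbnail", b"\x00\x01")
```

Because each add method returns the datum itself, several values can be added in one chained expression, as in the last lines of the sketch.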

Constructor

constructor(string host, int port, string name, int timeout_sec)

Creates a new RPC client instance. name is a string that uniquely identifies a task in the ZooKeeper cluster. In standalone mode, it must be left blank (""). timeout_sec is the timeout, in seconds, between an RPC method invocation and its response.

Example usages of the constructor are as follows:

// C++
#include <jubatus/client.hpp>
using jubatus::classifier::client::classifier;
// ...
classifier client("localhost", 9199, "cluster", 10);

# Python
from jubatus.classifier.client import Classifier
# ...
client = Classifier("localhost", 9199, "cluster", 10)

# Ruby
require 'jubatus/classifier/client'
include Jubatus::Classifier::Client
# ...
client = Classifier.new("localhost", 9199, "cluster", 10)

// Java
import us.jubat.classifier.ClassifierClient;
// ...
ClassifierClient client = new ClassifierClient("localhost", 9199, "cluster", 10);

Methods

map<string, string> save(0: string id)
Parameters:
  • id – file name to save
Returns:

Path to the saved model for each server. Each key of the map has the form ipaddr_portnumber.

Stores the learning model on the local disk at ALL servers.
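The ipaddr_portnumber keys can be split back into an address and a port when the caller needs them separately. The following is a minimal sketch; split_server_key and paths_by_server are hypothetical helper names, and the addresses and paths in the example map are made up.

```python
from typing import Dict, Tuple


def split_server_key(key: str) -> Tuple[str, int]:
    """Split an 'ipaddr_portnumber' key into (ip address, port)."""
    # Split on the last underscore so the address part is kept whole.
    host, _, port = key.rpartition("_")
    if not host or not port.isdigit():
        raise ValueError("not in ipaddr_portnumber form: %r" % key)
    return host, int(port)


def paths_by_server(saved: Dict[str, str]) -> Dict[Tuple[str, int], str]:
    """Re-key a save() style result by (ip address, port)."""
    return {split_server_key(key): path for key, path in saved.items()}


# Hypothetical map shaped like the return value of save().
example_saved = {
    "192.0.2.10_9199": "/tmp/example_model_a",
    "192.0.2.11_9199": "/tmp/example_model_b",
}
example_paths = paths_by_server(example_saved)
```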

bool load(0: string id)
Parameters:
  • id – file name to load
Returns:

True if the files are loaded successfully at all servers.

Restore the saved model from local disk at ALL servers.

bool clear()
Returns: True when the model was cleared successfully

Completely clears the model at ALL servers.

string get_config()
Returns: Server configuration set on initialization

Returns the server configuration from a server. For the format of the configuration, see the API reference of each service.

map<string, map<string, string>> get_status()
Returns: Internal state of each server. The key of the outermost map has the form ipaddr_portnumber.

Returns server status from ALL servers. Each server is represented by a pair of IP address and port.
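Since the result is a map of maps, comparing one status entry across servers takes a single pass over the outer map. The sketch below assumes nothing about which entries a server actually reports: field_by_server and servers_missing are hypothetical helpers, and the field names and values in the example are placeholders.

```python
from typing import Dict, List, Optional

Status = Dict[str, Dict[str, str]]


def field_by_server(status: Status, field: str) -> Dict[str, Optional[str]]:
    """Map each 'ipaddr_portnumber' key to the value of one status field."""
    return {server: fields.get(field) for server, fields in status.items()}


def servers_missing(status: Status, field: str) -> List[str]:
    """List the servers whose inner map does not contain the field."""
    return sorted(s for s, fields in status.items() if field not in fields)


# Hypothetical map shaped like the return value of get_status().
example_status = {
    "192.0.2.10_9199": {"example_field": "1", "other_field": "a"},
    "192.0.2.11_9199": {"other_field": "b"},
}
per_server = field_by_server(example_status, "example_field")
missing = servers_missing(example_status, "example_field")
```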

bool do_mix()
Returns: True when the model was mixed successfully

Forces the cluster to run mix. Send this RPC directly to a Jubatus server; calling it on a proxy raises an RPC error.

map<string, map<string, string>> get_proxy_status()
Returns: Internal state of the proxy. The key of the outermost map has the form ipaddr_portnumber.

Returns proxy status.

This is an RPC method for proxies. Calling it on a server raises an RPC error.

string get_name()
Returns: Name of the target cluster

Gets the name of the target cluster of this client object. name is a string that uniquely identifies a task in the ZooKeeper cluster. This is not an RPC method.

void set_name(0: string new_name)
Parameters:
  • new_name – Name of new target cluster

Sets the name of the target cluster of this client object. name is a string that uniquely identifies a task in the ZooKeeper cluster. You can switch the target Jubatus cluster among multiple tasks with one client object. This is not an RPC method.
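Switching targets with one client object can be sketched as a loop that sets the name, runs an action and finally restores the original name. StubClient below is a hypothetical stand-in that only records which name each call would target; it is not a Jubatus client, and the assumption that the stored name is simply attached to later calls is made for illustration only.

```python
class StubClient:
    """Stand-in exposing only the name accessors; not a Jubatus client."""

    def __init__(self, host, port, name, timeout_sec):
        self.host, self.port, self.timeout_sec = host, port, timeout_sec
        self._name = name
        self.calls = []

    def get_name(self):
        return self._name

    def set_name(self, new_name):
        self._name = new_name

    def get_status(self):
        # A real client would issue an RPC here; the stub records the target.
        self.calls.append(("get_status", self._name))
        return {}


def for_each_cluster(client, names, action):
    """Run action(client) once per cluster name, then restore the old name."""
    original = client.get_name()
    results = {}
    try:
        for name in names:
            client.set_name(name)
            results[name] = action(client)
    finally:
        client.set_name(original)
    return results


# Cluster names are made up for illustration.
stub = StubClient("localhost", 9199, "task_a", 10)
results = for_each_cluster(stub, ["task_b", "task_c"], lambda c: c.get_status())
```

Restoring the name in a finally clause keeps the client pointed at its original cluster even if one of the actions raises.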

mprpc_client get_client()
Returns: MessagePack-RPC client instance

Returns a reference to the raw MessagePack-RPC client instance used by the Jubatus client libraries. This is not an RPC method.

Common use cases for this method are closing the TCP connection explicitly and changing the timeout.

mprpc_client is the MessagePack-RPC client type, which differs between languages (C++ / Python / Ruby / Java).

Language-Specific Features

The Python and Ruby clients have language-specific features.

Python

By using the embedded_jubatus module (embedded-jubatus-python), you can call the machine learning algorithms provided by the Jubatus Core library directly from Python code. To install the module, run pip install embedded_jubatus. Jubatus and Jubatus Core must already be installed before you install the module.

As illustrated in the example below, embedded_jubatus provides the same interface as RPC clients.

from jubatus.anomaly.types import *

# Use RPC:
from jubatus.anomaly.client import Anomaly
client = Anomaly('127.0.0.1', 9199, '', 0)

# Use Embedded
from jubatus.embedded import Anomaly
client = Anomaly({
    'method': 'lof',
    'parameter': { ... },
    'converter': { ... },
})

# Use Embedded (using JSON config file instead of dict)
from jubatus.embedded import Anomaly
client = Anomaly('/path/to/config.json')

# Both Embedded and RPC clients support the same API:
client.add( ... )

In addition, the following auxiliary method is available.

jubatus.common.connect(cls, host, port, name, timeout=10)

Creates a client instance of the specified class cls, then connects to the server specified by host, port and name. Because this method creates a context manager, use it in a with block. The target of the with statement is the client object. On leaving the with block, the client object disconnects from the server.

import jubatus.common
import jubatus.classifier.client

with jubatus.common.connect(jubatus.classifier.client.Classifier, 'localhost', 9199, 'cluster_name', 10) as client:
    client.get_status()
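The disconnect-on-exit behaviour described above is the usual context manager pattern. The sketch below shows one way such a helper could be written with contextlib; it is not the Jubatus implementation. connect_sketch and RecordingClient are hypothetical names, and a close() method on the client is assumed only for the purpose of the example.

```python
from contextlib import contextmanager


class RecordingClient:
    """Stand-in client that only remembers whether it was closed."""

    def __init__(self, host, port, name, timeout):
        self.args = (host, port, name, timeout)
        self.closed = False

    def close(self):
        self.closed = True


@contextmanager
def connect_sketch(cls, host, port, name, timeout=10):
    """Create a client, hand it to the with block, and always close it."""
    client = cls(host, port, name, timeout)
    try:
        yield client
    finally:
        # Runs on normal exit and when the with block raises.
        client.close()


with connect_sketch(RecordingClient, "localhost", 9199, "cluster_name") as c:
    open_inside_block = not c.closed
closed_after_block = c.closed
```

The try/finally around the yield is what guarantees the disconnect even when the body of the with block raises an exception.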

Ruby

The following auxiliary method is available.

class Jubatus::Common::ClientBase

All client objects are defined as subclasses of the ClientBase class.

classmethod ClientBase.connect(host, port, name, timeout_sec, &block)

Using the connect method of each algorithm's client class ensures that client connections are closed safely. The connect method takes the host name, port number, cluster name, timeout period and a block as arguments. It automatically creates a client object and connects to the specified server. The block can then use the client object. On leaving the block, the client object disconnects from the server.

require 'jubatus/classifier/client'

Jubatus::Classifier::Client::Classifier.connect('localhost', 9199, 'cluster_name', 10) { |client|
  client.get_status()
}