Graph

Configuration

Configuration is given as a JSON file. We show each field below:

method

Specify algorithm for graph analysis. You can use these algorithms.

Value Method
"graph_wo_index" Use graph without index.
parameter

Specify parameters for the algorithm. Its format differs for each method.

graph_wo_index
damping_factor:

Damping factor for PageRank. It adjusts scores for nodes with differenct degrees. The bigger it is, the more sensitive to graph structure PageRank score is, but the larger biases it causes. In the original paper, 0.85 is good. (Float)

  • Range: 0.0 < damping_factor < 1.0
landmark_num:

The number of landmarks for shortest path calculation. The bigger it is, more accurate value you can get, but the more memory is required. (Integer)

  • Range: 0 <= landmark_num
Example:
{
  "method" : "graph_wo_index",
  "parameter" : {
    "damping_factor" : 0.9,
    "landmark_num" : 5
  }
}

Data Structures

message node

Represents the node information.

0: map<string, string> property

Property information for the node.

1: list<ulong> in_edges

List of ID of incoming edges.

2: list<ulong> out_edges

List of ID of outgoing edges.

message node {
  0: map<string, string>  property
  1: list<ulong>  in_edges
  2: list<ulong>  out_edges
}
message query

Represents a query.

0: string from_id
1: string to_id
message query {
  0: string from_id
  1: string to_id
}
message preset_query

Represents a preset query. See the description below for details.

0: list<query> edge_query
1: list<query> node_query
message preset_query {
  0: list<query> edge_query
  1: list<query> node_query
}
message edge

Represents the edge information.

0: map<string, string> property

Property information for the edge.

1: string source

ID of the source node that the edge connects.

2: string target

ID of the target node that the edge connects.

message edge {
  0: map<string, string> property
  1: string source
  2: string target
}
message shortest_path_query

Represents a shortest path query information. See the description of get_shortest_path method for details.

0: string source
1: string target
2: uint max_hop
3: preset_query query
message shortest_path_query {
  0: string source
  1: string target
  2: uint max_hop
  3: preset_query query
}

Usage of Properties and Queries

Properties and Queries are both represented as key-value pair like { 'key' : 'value', 'key2' : 'value2', ... }. The condition when a query matches to a property is: all keys in a query MUST exist in a property and the corresponding value which belongs to the query and the property MUST match exactly. The ordering of each key-value in property/query does not matter.

For example, this case matches:

query:    { 'key' : 'value' }
property: { 'key' : 'value', 'foo' : 'bar' }

This case does not match – same key but different value:

query:    { 'key' : 'wrong' }
property: { 'key' : 'value', 'foo' : 'bar' }

This case does not match – key spam does not exist in property:

query:    { 'key' : 'value', 'spam': 'ham' }
property: { 'key' : 'value', 'foo' : 'bar' }

Methods

service graph
string create_node()

Creates a node on the graph. Returns a node ID as string.

bool remove_node(0: string node_id)

Removes a node node_id from the graph.

bool update_node(0: string node_id, 1: map<string, string> property)

Updates the property of the node node_id to property.

ulong create_edge(0: string node_id, 1: edge e)

Creates a link from e.source to e.target. Returns a edge ID as an unsigned long integer.

The link has a direction. For any two nodes, multiple links with the same direction can be created. In this case, property e.property can be associated to each link (see edge).

node_id must be the same value as e.source.

bool update_edge(0: string node_id, 1: ulong edge_id, 2: edge e)

Updates an existing edge edge_id with information e. Property will be replaced.

node_id must be the same value as e.source.

bool remove_edge(0: string node_id, 1: ulong edge_id)

Removes an edge edge_id. node_id must be an ID for the source node of the edge edge_id.

double get_centrality(0: string node_id, 1: int centrality_type, 2: preset_query query)

Calculates (gets the computed value) the centrality over the edges that match the preset query query. The query must be registered beforehand by using add_centrality_query.

centrality_type is a type of centrality. Currently, only 0 (PageRank centrality) can be specified.

Centrality is computed when mix runs, thus there may be a gap between the exact value of centrality and the computed value if there’re updates not mixed. See also the description of update_index.

bool add_centrality_query(0: preset_query query)

Adds a preset query query to the graph for centrality calculation.

bool add_shortest_path_query(0: preset_query query)

Adds a preset query query to the graph for shortest path calculation.

bool remove_centrality_query(0: preset_query query)

Removes a preset query query from the graph.

bool remove_shortest_path_query(0: preset_query query)

Removes a preset query query from the graph.

list<string> get_shortest_path(0: shortest_path_query query)

Calculates (from the precomputed data) a shortest path from query.source to query.target that matches the preset query. The query must be registered beforehand by using add_shortest_path_query. Returns a list of node IDs that represents a path from query.source to query.target.

If the shortest path from query.source to query.target cannot be found within query.max_hop hops, the result will be truncated.

Path-index tree may have a gap between the exact path and the computed path when in a distributed setup. See also the description of update_index.

bool update_index()

Runs mix locally. Do not use in distributed mode.

Some functions like get_centrality and get_shortest_path uses an index that is updated in the mix operation. In a standalone mode, mix is not automatically called thus users must call this API by themselves.

node get_node(0: string node_id)

Gets the node for a node node_id.

edge get_edge(0: string node_id, 1: ulong edge_id)

Gets the edge of an edge edge_id. node_id is an ID for the source node of the edge edge_id.