Anomaly
- See IDL definition for detailed specification.
Configuration
Configuration is given as a JSON file. We show each field below:
- method
  Specifies the algorithm for anomaly detection. The following algorithms are available:

  Value          Method
  "lof"          Use Local Outlier Factor based on recommender. [Breunig2000]
  "light_lof"    Use a variant of LOF based on nearest neighbor.
- parameter
  Specifies the parameters for the algorithm. The format differs for each method.

  - common
    - unlearner: Specifies the unlearner strategy. Omit this parameter if you do not use an unlearner. You can specify any unlearner strategy described in Unlearner. Data is deleted by ID according to the strategy specified here.
    - unlearner_parameter: Specifies the unlearner parameter. You can specify the unlearner_parameter described in Unlearner. This parameter cannot be omitted when unlearner is specified. Data in excess of the number given here is deleted automatically.

    note: unlearner and unlearner_parameter can both be omitted.

  - lof
    - nearest_neighbor_num: Number of neighbors. The larger the value, the fewer false positives but the more false negatives are found. (Integer)
      - Range: 2 <= nearest_neighbor_num
    - reverse_nearest_neighbor_num: Number of reverse neighbors to update when anomaly measure values are updated. The larger the value, the more accurately the measures are updated, but the longer each update takes. (Integer)
      - Range: nearest_neighbor_num <= reverse_nearest_neighbor_num
    - ignore_kth_same_point: Prevents scores from becoming inf by limiting the number of duplicate records to nearest_neighbor_num - 1. This parameter is optional and is false (disabled) by default. (Boolean)
    - method: Algorithm name of the recommender used for nearest neighbor search. Refer to method in Recommender.
    - parameter: Parameters of the recommender used for nearest neighbor search. Refer to parameter in Recommender.
  - light_lof
    - nearest_neighbor_num: Number of neighbors. The larger the value, the fewer false positives but the more false negatives are found. (Integer)
      - Range: 2 <= nearest_neighbor_num
    - reverse_nearest_neighbor_num: Number of reverse neighbors to update when anomaly measure values are updated. The larger the value, the more accurately the measures are updated, but the longer each update takes. (Integer)
      - Range: nearest_neighbor_num <= reverse_nearest_neighbor_num
    - ignore_kth_same_point: Prevents scores from becoming inf by limiting the number of duplicate records to nearest_neighbor_num - 1. This parameter is optional and is false (disabled) by default. (Boolean)
    - method: Algorithm name of the nearest neighbor method used for nearest neighbor search. Refer to method in Nearest Neighbor.
    - parameter: Parameters of the nearest neighbor method used for nearest neighbor search. Refer to parameter in Nearest Neighbor.
- converter
  Specifies the configuration for data conversion. The format is described in Data Conversion.
- Example:

  {
    "method" : "lof",
    "parameter" : {
      "nearest_neighbor_num" : 10,
      "reverse_nearest_neighbor_num" : 30,
      "method" : "euclid_lsh",
      "parameter" : {
        "hash_num" : 64,
        "table_num" : 4,
        "seed" : 1091,
        "probe_num" : 64,
        "bin_width" : 100
      }
    },
    "converter" : {
      "string_filter_types" : {},
      "string_filter_rules" : [],
      "num_filter_types" : {},
      "num_filter_rules" : [],
      "string_types" : {},
      "string_rules" : [
        { "key" : "*", "type" : "str", "sample_weight" : "bin", "global_weight" : "bin" }
      ],
      "num_types" : {},
      "num_rules" : [
        { "key" : "*", "type" : "num" }
      ]
    }
  }
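The range constraints listed for lof and light_lof are easy to violate when a configuration is edited by hand. The following is a minimal sketch, not part of Jubatus, of how they could be checked before the JSON file is written. The helper name validate_anomaly_config is hypothetical, and it only covers the rules stated in this section (the allowed method values, the two ranges, and the requirement that unlearner_parameter accompany unlearner).

```python
import json


def validate_anomaly_config(config):
    """Hypothetical helper: check only the constraints documented above."""
    if config.get("method") not in ("lof", "light_lof"):
        raise ValueError('method must be "lof" or "light_lof"')

    param = config.get("parameter", {})
    k = param.get("nearest_neighbor_num")
    rk = param.get("reverse_nearest_neighbor_num")

    # Range: 2 <= nearest_neighbor_num
    if not isinstance(k, int) or k < 2:
        raise ValueError("nearest_neighbor_num must be an integer >= 2")
    # Range: nearest_neighbor_num <= reverse_nearest_neighbor_num
    if not isinstance(rk, int) or rk < k:
        raise ValueError(
            "reverse_nearest_neighbor_num must be an integer >= nearest_neighbor_num"
        )
    # unlearner_parameter cannot be omitted when unlearner is specified
    if "unlearner" in param and "unlearner_parameter" not in param:
        raise ValueError("unlearner requires unlearner_parameter")
    return config


# Same values as the example configuration in this section.
config = {
    "method": "lof",
    "parameter": {
        "nearest_neighbor_num": 10,
        "reverse_nearest_neighbor_num": 30,
        "method": "euclid_lsh",
        "parameter": {
            "hash_num": 64,
            "table_num": 4,
            "seed": 1091,
            "probe_num": 64,
            "bin_width": 100,
        },
    },
    "converter": {
        "string_filter_types": {},
        "string_filter_rules": [],
        "num_filter_types": {},
        "num_filter_rules": [],
        "string_types": {},
        "string_rules": [
            {"key": "*", "type": "str", "sample_weight": "bin", "global_weight": "bin"}
        ],
        "num_types": {},
        "num_rules": [{"key": "*", "type": "num"}],
    },
}

# Serialize only after the documented constraints hold.
config_json = json.dumps(validate_anomaly_config(config), indent=2)
print(config_json)
```

The nested method and parameter fields are passed through unchecked here, since their format is defined by the Recommender and Nearest Neighbor sections rather than by this one.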
Data Structures
Methods
- service anomaly

- bool clear_row(0: string id)
  Parameters:
  - id – point ID to be removed
  Returns: True when the point was cleared successfully

  Clears the point data with ID id.
- id_with_score add(0: datum row)
  Parameters:
  - row – datum for the point
  Returns: Tuple of the point ID and the anomaly measure value

  Adds the point data row.
- list<string> add_bulk(0: list<datum> data)
  Parameters:
  - data – List of datum for the points
  Returns: The list of successfully added IDs.

  Adds a bulk of points. In contrast to add, this API does not return anomaly measure values.
- double update(0: string id, 1: datum row)
  Parameters:
  - id – point ID to update
  - row – new datum for the point
  Returns: Anomaly measure value

  Updates the point id with the data row.
- double overwrite(0: string id, 1: datum row)
  Parameters:
  - id – point ID to overwrite
  - row – new datum for the point
  Returns: Anomaly measure value

  Overwrites the point id with the data row.
- double calc_score(0: datum row)
  Parameters:
  - row – datum
  Returns: Anomaly measure value for the given row

  Calculates an anomaly measure value for the point data row without adding a point. This call can return extremely large values. For details, refer to FAQs: anomaly detection.
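The inf scores that ignore_kth_same_point guards against can be reproduced with a toy Local Outlier Factor computation. The sketch below is a plain textbook LOF written in pure Python, not the Jubatus implementation; the function names and sample data are made up for illustration. It shows that once a stored point has nearest_neighbor_num exact duplicates, its local reachability density becomes infinite, and so does the score of a query whose neighbors are those duplicates.

```python
import math


def neighbors(data, point, k, skip=None):
    """Return the k nearest (distance, index) pairs of `point` within `data`."""
    pairs = sorted(
        (math.dist(point, q), i) for i, q in enumerate(data) if i != skip
    )
    return pairs[:k]


def k_distance(data, i, k):
    """Distance from stored point i to its k-th nearest neighbor."""
    return neighbors(data, data[i], k, skip=i)[-1][0]


def lrd(data, point, k, skip=None):
    """Local reachability density: inverse of the mean reachability distance."""
    nbrs = neighbors(data, point, k, skip)
    reach = [max(k_distance(data, j, k), d) for d, j in nbrs]
    mean_reach = sum(reach) / len(reach)
    # All k neighbors are exact duplicates: the density is unbounded.
    return math.inf if mean_reach == 0 else 1.0 / mean_reach


def lof_score(data, point, k):
    """Textbook LOF of a query point that is not itself stored in `data`."""
    nbrs = neighbors(data, point, k)
    own = lrd(data, point, k)
    return sum(lrd(data, data[j], k, skip=j) for _, j in nbrs) / (len(nbrs) * own)


k = 2
query = (1.0, 0.0)

# Three identical copies: every copy has k duplicates, so its density is inf.
dup_data = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
# Two identical copies: every copy has only k - 1 duplicates.
ok_data = [(0.0, 0.0), (0.0, 0.0), (5.0, 5.0), (6.0, 5.0)]

print(lof_score(dup_data, query, k))
print(lof_score(ok_data, query, k))
```

With k = 2, the first data set yields an infinite score and the second a finite one, which matches the documented limit of nearest_neighbor_num - 1 duplicate records.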
- list<string> get_all_rows()
  Returns: List of all point IDs

  Returns the list of all point IDs.
-
bool