This page lists the references for each algorithm.

Classifier & Regression


PA(PA, PA1, PA2): Passive Aggressive
[Crammer03a] Koby Crammer, Ofer Dekel, Shai Shalev-Shwartz and Yoram Singer, Online Passive-Aggressive Algorithms, Proceedings of the Sixteenth Annual Conference on Neural Information Processing Systems (NIPS), 2003.
[Crammer03b] Koby Crammer and Yoram Singer, Ultraconservative Online Algorithms for Multiclass Problems, Journal of Machine Learning Research, 2003.
[Crammer06] Koby Crammer, Ofer Dekel, Joseph Keshet, Shai Shalev-Shwartz and Yoram Singer, Online Passive-Aggressive Algorithms, Journal of Machine Learning Research, 2006.
CW: Confidence Weighted Learning
[Dredze08] Mark Dredze, Koby Crammer and Fernando Pereira, Confidence-Weighted Linear Classification, Proceedings of the 25th International Conference on Machine Learning (ICML), 2008.
[Crammer08] Koby Crammer, Mark Dredze and Fernando Pereira, Exact Convex Confidence-Weighted Learning, Proceedings of the Twenty Second Annual Conference on Neural Information Processing Systems (NIPS), 2008.
[Crammer09a] Koby Crammer, Mark Dredze and Alex Kulesza, Multi-Class Confidence Weighted Algorithms, Empirical Methods in Natural Language Processing (EMNLP), 2009.
AROW: Adaptive Regularization of Weight vectors
[Crammer09b] Koby Crammer, Alex Kulesza and Mark Dredze, Adaptive Regularization of Weight Vectors, Advances in Neural Information Processing Systems, 2009.
NHERD: Normal Herd
[Crammer10] Koby Crammer and Daniel D. Lee, Learning via Gaussian Herding, Neural Information Processing Systems (NIPS), 2010.
Iterative Parameter Mixture
[McDonald10] Ryan McDonald, K. Hall and G. Mann, Distributed Training Strategies for the Structured Perceptron, North American Chapter of the Association for Computational Linguistics (NAACL), 2010.
[Mann09] Gideon Mann, R. McDonald, M. Mohri, N. Silberman and D. Walker, Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models, Neural Information Processing Systems (NIPS), 2009.



minhash: b-Bit Minwise Hash
[Ping2010] Ping Li, Arnd Christian König, b-Bit Minwise Hashing, WWW, 2010.
euclid_lsh: Euclidean LSH
[Datar2004] Mayur Datar, Nicole Immorlica, Piotr Indyk, Vahab S. Mirrokni, Locality-Sensitive Hashing Scheme Based on p-Stable Distributions, SCG, 2004.
[Andoni2005] Alex Andoni, LSH Algorithm and Implementation (E2LSH).
[Lv2007] Qin Lv, William Josephson, Zhe Wang, Moses Charikar, Kai Li, Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search, VLDB, 2007.



Local Outlier Factor
[Breunig2000] Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, Jörg Sander, LOF: Identifying Density-Based Local Outliers, SIGMOD, 2000.



D. Feldman, M. Langberg, “A Unified Framework for Approximating and Clustering Data.” STOC '11: Proceedings of the 43rd Annual ACM Symposium on Theory of Computing, pp. 569-578.
D. Feldman, M. Faulkner, A. Krause, “Scalable Training of Mixture Models via Coresets.” Advances in Neural Information Processing Systems 24, 2011.



Epsilon Greedy, Softmax
R. S. Sutton, A. G. Barto, “Introduction to Reinforcement Learning.” MIT Press, 1998.
Epsilon decreasing (Greedy Mix)
N. Cesa-Bianchi, P. Fischer, “Finite-time Regret Bounds for the Multiarmed Bandit Problem.” ICML, 1998.
P. Auer, N. Cesa-Bianchi, P. Fischer, “Finite-time Analysis of the Multiarmed Bandit Problem.” Machine Learning, Vol. 47, pp. 235-256, 2002.
P. Auer, N. Cesa-Bianchi, Y. Freund, R. E. Schapire, “Gambling in a rigged casino: The adversarial multi-armed bandit problem.” FOCS '95, pp. 322-331, 1995.
Thompson Sampling
[Thompson1933] William R. Thompson, “On the likelihood that one unknown probability exceeds another in view of the evidence of two samples.” Biometrika 25.3/4 (1933): 285-294.