Year of Award
Master of Science (MS)
Other Degree Name/Area of Focus
Department or School/College
Brian Steele, Emily Stone, Javier Perez Alvaro
Outlier, Computational complexity, High dimensional dataset
University of Montana
Applied Statistics | Probability | Theory and Algorithms
In statistics and data science, outliers are data points that differ greatly from the other observations in a data set. They are important features of the data because they can dramatically influence the patterns and relationships exhibited by the non-outliers. It is therefore important to detect outliers and to deal with them adequately. Recently, a novel algorithm, the ROMA algorithm, has been proposed. In this paper, we propose a modification of the ROMA algorithm that reduces its computational complexity from $O(n^2 m)$ to $O((n/(2^m-o(1)))^2 m)$, where $n$ is the number of data points and $m$ is the dimension of the space. Consequently, if $\log(n) < 2^m$, the improved complexity is $O((n/\log(n))^2 m)$.
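The final step of the abstract, from the bound in terms of $2^m$ to the bound in terms of $\log(n)$, can be unpacked as follows. This is a sketch of the reasoning rather than a derivation taken from the thesis; it assumes that the $o(1)$ term tends to zero as $n \to \infty$ and that $n$ is large enough for both denominators to be positive.

$$
\log(n) < 2^m \;\Longrightarrow\; \frac{n}{2^m - o(1)} < \frac{n}{\log(n) - o(1)} = \frac{n}{\log(n)}\,\bigl(1 + o(1)\bigr)
$$

$$
\left(\frac{n}{2^m - o(1)}\right)^{2} m \;\le\; \left(\frac{n}{\log(n)}\right)^{2} m\,\bigl(1 + o(1)\bigr) \;=\; O\!\left(\left(\frac{n}{\log(n)}\right)^{2} m\right)
$$

Under these assumptions the factor $(1 + o(1))$ is absorbed into the big-$O$ constant, which gives the stated complexity.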
Khormali, Omid, "High Dimensional Outlier Detection" (2019). Graduate Student Theses, Dissertations, & Professional Papers. 11377.
© Copyright 2019 Omid Khormali