Wednesday, December 14, 2016

Distance based Formulations of Data Depth

In an earlier post, I had described a taxonomy of the various types of depth formulations. In this post, the focus will be on distance based formulations of data depth. As the name suggests, these are a class of depth formulations that have something (or a lot) to do with the choice of the distance metric in the space of data. For example, in many of these formulations, the depth of a data object within an ensemble turns out to be related to the inverse of the sum of its distances from all the other objects in the ensemble. Since we can choose from several valid distance metrics, each choice leads to different properties in the corresponding depth formulation. Let us look at some examples:


$L_2$ depth: The $L_2$ depth of a point $x$ in an ensemble $X$ is defined as the inverse of one plus the expected distance of $x$ from a random point of the ensemble.

$$D^{L_2} (x|X) = (1+ E||x-X||)^{-1}$$
While being easy to grasp, and also to compute, the $L_2$ depth is unable to properly capture anisotropy in the distribution. For example, in a high-dimensional ellipsoid-shaped point cloud, a point at a given distance from the center along the minor axis would be assigned roughly the same depth as a point at the same distance along a major axis, even though the former is far more outlying relative to the shape of the cloud.
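As a minimal sketch of the empirical version (function names are mine, not standard), the expectation is replaced by the mean distance over the sample:

```python
import numpy as np

def l2_depth(x, X):
    """Empirical L2 depth of point x w.r.t. sample X (rows = observations):
    inverse of one plus the mean Euclidean distance from x to the sample."""
    dists = np.linalg.norm(X - x, axis=1)
    return 1.0 / (1.0 + dists.mean())

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
# A central point is deeper than an outlying one.
assert l2_depth(np.zeros(2), X) > l2_depth(np.array([4.0, 4.0]), X)
```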


Mahalanobis depth: The Mahalanobis depth is able to capture the anisotropy in the structure of a distribution up to its second moment. We can compute it by simply replacing the distance metric in the definition of the $L_2$ depth with the Mahalanobis distance.



$$D^{\textrm{Mah}}(x|X) = \big( 1 + (x-E[X])^\top \, \Sigma_X^{-1} \, (x-E[X])\big)^{-1}$$

where $\Sigma_X$ is the covariance matrix of $X$.
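A quick empirical sketch (sample mean and covariance in place of the population quantities) shows how this fixes the $L_2$ depth's blindness to anisotropy:

```python
import numpy as np

def mahalanobis_depth(x, X):
    """Empirical Mahalanobis depth using the sample mean and covariance."""
    mu = X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    d = x - mu
    return 1.0 / (1.0 + d @ cov_inv @ d)

rng = np.random.default_rng(0)
# Anisotropic cloud: standard deviation 3 along the first axis, 1 along the second.
X = rng.normal(size=(2000, 2)) * np.array([3.0, 1.0])
# (2, 0) and (0, 2) are equally far from the center in Euclidean terms,
# but (2, 0) lies along the major axis and is therefore deeper.
assert mahalanobis_depth(np.array([2.0, 0.0]), X) > \
       mahalanobis_depth(np.array([0.0, 2.0]), X)
```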

Projection Depth: In projection depth, the idea is that if we project the data onto a unit direction, central points tend to land close to the median of the projected values in every direction, while outlying points land far from it in at least one direction. Taking the worst-case (supremum) outlyingness over all directions and inverting it gives depth values that decrease monotonically from the center.


$$ D^{\textrm{proj}}(x|X) = \bigg(1+\sup_{p \in S^{d-1}} \frac{| \langle p,x \rangle - \textrm{med} ( \langle p,X \rangle ) |}{ \textrm{Dmed}(\langle p,X \rangle)} \bigg)^{-1}$$


where $ S^{d-1} $ denotes the unit sphere in $ \mathbb{R}^d $, $ \langle \cdot , \cdot \rangle $ is the inner product, $\textrm{med}(U)$ denotes the univariate median of a random variable $U$, and $ \textrm{Dmed}(U)=\textrm{med}(|U-\textrm{med}(U)|) $.
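The supremum over the unit sphere is not directly computable, but a Monte Carlo sketch that maximizes over a large set of random unit directions (an approximation, and my own naming) conveys the idea:

```python
import numpy as np

def projection_depth(x, X, n_dirs=1000, seed=0):
    """Approximate projection depth: the sup over the unit sphere is
    replaced by a max over randomly sampled unit directions."""
    rng = np.random.default_rng(seed)
    P = rng.normal(size=(n_dirs, X.shape[1]))
    P /= np.linalg.norm(P, axis=1, keepdims=True)   # unit directions p
    proj_X = X @ P.T                                # <p, X_i> for every p
    med = np.median(proj_X, axis=0)                 # med(<p, X>)
    dmed = np.median(np.abs(proj_X - med), axis=0)  # Dmed(<p, X>)
    outlyingness = np.max(np.abs(P @ x - med) / dmed)
    return 1.0 / (1.0 + outlyingness)

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
assert projection_depth(np.zeros(2), X) > projection_depth(np.array([4.0, 4.0]), X)
```

Since the max is taken over a finite sample of directions, this underestimates the true outlyingness (and hence overestimates the depth), but the ordering of points is usually preserved.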


Oja Depth: The Oja depth is based on the average volume of simplices (convex hulls of $d+1$ points) whose vertices are drawn from the data. The idea is that the average volume of the simplices having a central point as a fixed vertex is smaller than the average volume of the simplices having an outlying point as a fixed vertex. Again, we can invert the average volume of simplices associated with a particular point (as the fixed vertex) to get depth values that decrease monotonically from the center.



$$ D^{\textrm{Oja}}(x|X) = \bigg(1+\frac{E \big(\textrm{vol}_{d}(\textrm{co}\{x, X_1,\ldots,X_d\})\big)}{\sqrt{\textrm{det}\,\Sigma_{X}}} \bigg)^{-1}$$


where $\textrm{co}$ denotes the convex hull, $\textrm{vol}_d$ is the $d$-dimensional volume, and $\Sigma_X$ is the covariance matrix.
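The expectation runs over all $d$-tuples of sample points, which is expensive, so a Monte Carlo sketch over random tuples (my own approximation, using the determinant formula for simplex volume) is the practical route:

```python
import numpy as np
from math import factorial

def oja_depth(x, X, n_simplices=2000, seed=0):
    """Monte Carlo Oja depth: average volume of simplices co{x, X_1, ..., X_d}
    over randomly drawn d-tuples, scaled by sqrt(det of the sample covariance)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    vols = np.empty(n_simplices)
    for i in range(n_simplices):
        idx = rng.choice(n, size=d, replace=False)
        # Simplex volume in d dimensions: |det of edge matrix| / d!
        vols[i] = abs(np.linalg.det(X[idx] - x)) / factorial(d)
    scale = np.sqrt(np.linalg.det(np.cov(X, rowvar=False)))
    return 1.0 / (1.0 + vols.mean() / scale)

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2))
# Simplices anchored at the center are smaller on average, so it is deeper.
assert oja_depth(np.zeros(2), X) > oja_depth(np.array([4.0, 4.0]), X)
```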

While we don't explicitly see a distance metric in the formulations for projection and Oja depth, they are still considered "distance based" because of the close relationship of the distance metric to the inner product and volume, respectively. Here is a reference for more information on the ability of the above formulations to discriminate between distribution structures:


Reference:
Mosler, Karl. "Depth statistics." Robustness and Complex Data Structures. Springer Berlin Heidelberg, 2013. 17-34.
