Distance measures quantify how similar (or dissimilar) two observations are. When an observation has multiple attributes, compare two observations with a weighted sum of the per-attribute distances.

- Dense, continuous: Minkowski distance
- Asymmetric: ignore the null/null cases
- Sparse: cosine similarity, Jaccard similarity
- Sequence: dynamic matching

## Dissimilarity matrix

## Hamming distance

Proportion of positions at which two binary vectors of length $n$ differ.

$$d(i,j) = \frac{\left|\{k : i_k \neq j_k\}\right|}{n}$$

For asymmetric attributes, the false/false matches are ignored, so the denominator becomes $n - \left|\{k : i_k = j_k = F\}\right|$.

## Jaccard coefficient

## Minkowski distance

$L_p$ norm.

## Euclidean distance

## Manhattan distance

## Cosine similarity

## Dynamic Time Warping Matching

Used to align the most similar points of two sequences (e.g., time series, text).
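
As a rough illustration of the sequence matching described above, here is a minimal sketch of dynamic time warping using the textbook dynamic-programming recurrence. The function name `dtw_distance`, the absolute difference as the local cost, and the unconstrained warping window are all assumptions made for this example; they are not taken from the notes.

```python
def dtw_distance(a, b):
    """Return (total_cost, path) aligning numeric sequences a and b.

    Assumption: local cost is |a[i] - b[j]| and no warping window is applied.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")

    n, m = len(a), len(b)
    inf = float("inf")

    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # advance in both sequences
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
            )

    # Backtrack from the end to recover which points were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    total, pairs = dtw_distance([1, 2, 3], [1, 2, 2, 3])
    print(total)  # 0.0
    print(pairs)  # [(0, 0), (1, 1), (1, 2), (2, 3)]
```

In the usage example the second sequence repeats the value 2, and the recovered path maps index 1 of the first sequence to both index 1 and index 2 of the second, which is why the total cost stays at zero. The sketch runs in O(nm) time and memory; longer sequences would usually call for a windowed variant.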