class: center, middle, inverse, title-slide

.title[
# Dimension Reduction: MDS
]
.author[
### Anastasios Panagiotelis
]
.institute[
### University of Sydney
]

---

# Outline

- Classical Multidimensional Scaling

--

  + Euclidean distance
  + Non-Euclidean distance

--

- Some applications

--

- Sammon Mapping

---

class: inverse, middle, center

# Multidimensional Scaling (MDS)

---

# The idea

- Input points: `\(\bx_i\in\bbR^p\)` for `\(i=1,\dots,n\)`
- Output points: `\(\by_i\in\bbR^m\)` for `\(i=1,\dots,n\)`
  + `\(m<<p\)`

--

- Denote the distance between `\(\bx_i\)` and `\(\bx_j\)` by `\(\delta_{ij}\)`.

--

- Denote the distance between `\(\by_i\)` and `\(\by_j\)` by `\(d_{ij}\)`.

--

- We want `\(\delta_{ij}\)` to be similar to `\(d_{ij}\)`.

---

# Strain

- Assume Euclidean distances

--

for now!

--

- The objective is to minimise the strain, defined as

`$$\sum\limits_{i=1}^{n-1}\sum\limits_{j=i+1}^n (\delta_{ij}^2-d_{ij}^2)$$`

--

- This is known as classical MDS.

--

- How can we solve this?

---

# Solution

1. Construct the `\(n\times n\)` matrix of squared interpoint distances `\(\bDelta^{(2)}=\left\{\delta^2_{ij}\right\}\)`

--

2. Double centre `\(\bDelta^{(2)}\)` by computing `\(\bB=-\frac{1}{2}\bH\bDelta^{(2)}\bH\)` where `\(\bH=\bI-(1/n)\biota\biota'\)`

--

3. Find the eigenvalue decomposition `\(\bB=\bU\bLambda\bU'\)`

--

4. The output coordinates are given by the first `\(m\)` columns of `\(\bU\bLambda^{1/2}\)`

--

These steps are also sketched by hand in R after the plot.

---

# Implementation

```r
library(tidyverse)
wb <- read_csv('../data/WorldBankClean.csv')
wb %>%
  select_if(., is.numeric) %>%              # Keep the numeric variables
  scale() %>%                               # Standardise
  dist() %>%                                # Compute the distance matrix
  cmdscale() %>%                            # Classical MDS
  as_tibble(.name_repair = 'universal') %>%
  cbind(wb) -> wb_mds
```

---

# Plot
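
---

# The solution by hand

A minimal sketch of steps 1 to 4, assuming `wb` has been loaded as in the implementation slide. Up to a reflection of each axis, the coordinates should agree with those returned by `cmdscale()`.

```r
library(tidyverse)

X <- wb %>% select_if(is.numeric) %>% scale()   # standardised numeric data
n <- nrow(X)

Delta2 <- as.matrix(dist(X))^2                  # 1. squared interpoint distances
H <- diag(n) - matrix(1 / n, n, n)              #    centring matrix H = I - (1/n) ii'
B <- -0.5 * H %*% Delta2 %*% H                  # 2. double centring
eig <- eigen(B, symmetric = TRUE)               # 3. eigenvalue decomposition
m <- 2
Y <- eig$vectors[, 1:m] %*% diag(sqrt(eig$values[1:m]))  # 4. first m output coordinates
```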
---

# Look familiar??

- This is almost identical to Principal Components Analysis

--

- The axes have been flipped!

--

- PCA is invariant to reflections.

--

Why?

--

- MDS is also invariant to rotations.

--

Why?

---

# Why are they the same?

- The proof is a bit involved.

--

- The key idea is to show that `\(\bB\)` is related to `\(\bX\bX'\)` while `\(\bS\)` is related to `\(\bX'\bX\)`

--

- `\(\bX'\bX\)` and `\(\bX\bX'\)` have the same non-zero eigenvalues.

--

- Geometrically, the result makes sense by thinking about the extreme case where the data lie on an `\(m\)`-dimensional plane.

---

class: inverse, middle, center

# Beyond Euclidean Distance

---

# Non-Euclidean Distance

- What if a non-Euclidean distance is used?

--

- In this case classical MDS does not minimise the strain defined previously, but instead minimises `\(tr(\bB-\bB^{*})\)`.

--

- Here `\(\bB^{*}\)` is the doubly centred squared (Euclidean) distance matrix in the output space.

--

- Distances between output points still faithfully represent distances between input points.

--

- Only use eigenvectors corresponding to non-negative eigenvalues.

---

# Implementation (L1)

```r
library(tidyverse)
wb <- read_csv('../data/WorldBankClean.csv')
wb %>%
  select_if(., is.numeric) %>%              # Keep the numeric variables
  scale() %>%                               # Standardise
  dist(method = 'manhattan') %>%            # Compute the Manhattan (L1) distance matrix
  cmdscale() %>%                            # Classical MDS
  as_tibble(.name_repair = 'universal') %>%
  cbind(wb) -> wb_mds_L1
```

---

# Plot (L1)

<img src="03MDS_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" />

---

# Plot (L2)

<img src="03MDS_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />

---

# Why is this useful?

- We can have distances/dissimilarities between all sorts of objects

--

  + Time series
  + Functions
  + Probability distributions
  + Strings/texts

---

# A toy example

- Consider the word for "mother" in different languages

--

- The Levenshtein distance can be computed between strings
  + It counts the number of insertions, deletions and substitutions needed to convert one string into another

--

- Pairwise Levenshtein distances are computed and then classical multidimensional scaling is applied (a short R sketch follows the Languages slide).

---

# Languages
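
---

# Levenshtein distances in R

A small sketch of the idea on the previous slides; the word list here is illustrative (accents dropped) rather than the exact list behind the plot. Base R's `adist()` returns pairwise Levenshtein distances, which are then passed to `cmdscale()`.

```r
# Words for "mother" in a few languages (illustrative list)
mother <- c(English = "mother", German  = "mutter", Dutch      = "moeder",
            French  = "mere",   Spanish = "madre",  Portuguese = "mae",
            Swedish = "moder",  Polish  = "matka")

lev <- adist(mother)                        # pairwise Levenshtein distances
rownames(lev) <- colnames(lev) <- names(mother)

coords <- cmdscale(as.dist(lev), k = 2)     # classical MDS on a non-Euclidean distance
plot(coords, type = "n", xlab = "", ylab = "")
text(coords, labels = rownames(coords))
```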
---

# Distance between pdfs

- Consider the electricity smart meter data.

--

- The distance between pdfs can be measured using a Jensen-Shannon distance.

--

- This is the square root of the average of the Kullback-Leibler divergence from `\(P\)` to `\(Q\)` and from `\(Q\)` to `\(P\)`.

--

- For log-normal distributions this is available in closed form (a short R sketch follows on a later slide).

--

- Consider one household, so that each observation corresponds to a time of week.

---

# Distance between pdfs
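
---

# Closed form in R

A sketch of the closed-form distance, following the definition given above (the square root of the average of the two Kullback-Leibler divergences). The parameter values at the end are made up purely for illustration.

```r
kl_lognormal <- function(mu1, sigma1, mu2, sigma2) {
  # KL divergence from LN(mu1, sigma1^2) to LN(mu2, sigma2^2);
  # identical to the KL divergence between the corresponding normals.
  log(sigma2 / sigma1) + (sigma1^2 + (mu1 - mu2)^2) / (2 * sigma2^2) - 0.5
}

js_lognormal <- function(mu1, sigma1, mu2, sigma2) {
  # Square root of the average of the two KL divergences
  sqrt(0.5 * (kl_lognormal(mu1, sigma1, mu2, sigma2) +
              kl_lognormal(mu2, sigma2, mu1, sigma1)))
}

js_lognormal(mu1 = 0.1, sigma1 = 0.5, mu2 = 0.4, sigma2 = 0.8)
```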
---

class: inverse, middle, center

# MDS: Beyond Linearity

---

# Beyond Classical MDS

- Classical MDS is designed to minimise strain.

--

- An alternative objective function, called stress, can be minimised instead

--

`$$\mbox{Stress}=\sum\limits_{i=1}^{n-1}\sum\limits_{j>i}\frac{(\delta_{ij}-d_{ij})^2}{\delta_{ij}}$$`

--

- The difference between `\(\delta_{ij}\)` and `\(d_{ij}\)` acts like an error.

--

- The `\(\delta_{ij}\)` in the denominator acts as a weight.

---

# Weighting

- For large `\(\delta_{ij}\)` the observations are far apart in the input space.

--

- For these pairs errors are more easily tolerated.

--

- For small `\(\delta_{ij}\)` the observations are close together in the input space.

--

- For these pairs errors are not tolerated.

--

- The most accuracy is achieved for nearby points,

--

- so local structure is preserved.

---

# Sammon mapping

- The Sammon mapping is found by numerical optimisation (gradient descent); a small R sketch is given in an appendix slide at the end.

--

- It is different from the classical solution

--

- It is not based on an eigenvalue decomposition

--

- It is not based on a rotation

--

- It is a non-linear mapping.

---

# Example

- The following is a simulated toy example to motivate non-linear dimension reduction.

--

- Consider the case where the input points are 2D and the output points are 1D.

--

- The specific problem of multidimensional scaling where the lower dimension is 1 is called *seriation*.

---

# Original Data

<img src="03MDS_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" />

---

# Original Data

<img src="03MDS_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" />

---

# Rotate (Classical Solution)

<img src="03MDS_files/figure-html/unnamed-chunk-10-1.png" style="display: block; margin: auto;" />

---

# Keep 1 Dimension

<img src="03MDS_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" />

---

# Sammon Mapping

<img src="03MDS_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" />

---

# Discussion

- Classical MDS cannot account for non-linearity.
  + The dark blue and yellow points are represented as close to one another.

--

- The Sammon mapping does account for non-linearity.
  + The blue and yellow points are represented as far apart.

--

- The Sammon mapping preserves local structure.

---

# Summary

- Classical MDS with Euclidean input distances is identical (up to rotation and reflection) to PCA

--

- This no longer holds with non-Euclidean input distances

--

  + Classical MDS then gives output points that approximate the information in the double-centred squared distance matrix.

--

- The Sammon mapping highlights the importance of preserving local structure.

---

class: inverse, center, middle

# Questions?
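
---

# Appendix: Sammon mapping in R

A minimal sketch using `MASS::sammon()` on the standardised World Bank data, as referenced on the Sammon mapping slide. Note that `MASS` normalises the Stress by the sum of the input distances, so the reported value differs from the formula given earlier by a constant factor.

```r
library(MASS)
library(tidyverse)

wb <- read_csv('../data/WorldBankClean.csv')
d  <- wb %>% select_if(is.numeric) %>% scale() %>% dist()

sam <- sammon(d, k = 2)   # gradient-based minimisation of Stress,
                          # initialised at the classical MDS solution
head(sam$points)          # output coordinates
sam$stress                # final (normalised) Stress
```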