Each title was embedded with a small sentence model (MiniLM, 384 dimensions), then projected down to 2D using classical MDS. Colors come from k-means clustering on the full embeddings, not on the 2D projection. Click any title to see its closest semantic neighbors.
Embeddings: each title's name + year + format + tags is run through Xenova/all-MiniLM-L6-v2, producing a 384-dimensional vector. Vectors are length-normalized so cosine similarity equals dot product.
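To see why normalization makes cosine similarity and dot product interchangeable, here is a minimal sketch in plain Python. The 3-dimensional vectors are toy stand-ins for the real 384-dimensional embeddings, and the helper names are illustrative, not taken from the actual page code.

```python
import math

def normalize(v):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity computed the long way, with explicit norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy vectors (assumed for illustration only).
a = [3.0, 4.0, 0.0]
b = [1.0, 2.0, 2.0]

a_hat, b_hat = normalize(a), normalize(b)

# After normalization the plain dot product matches the cosine similarity.
print(dot(a_hat, b_hat), cosine(a, b))
```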
Layout: classical multidimensional scaling (MDS) on the 30×30 distance matrix. The two coordinates you see come from the top two eigenvectors (the ones with the largest eigenvalues) of the double-centered squared-distance matrix. No t-SNE or UMAP, just linear algebra.
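The layout step can be sketched roughly as follows. This is a textbook version of classical MDS, not necessarily the page's own implementation: it assumes a symmetric, Euclidean-like distance matrix, finds the eigenvectors by power iteration with deflation (a stand-in for whatever eigen-solver is actually used), and scales each eigenvector by the square root of its eigenvalue. The function name and the `iters` parameter are illustrative.

```python
import math

def classical_mds(dist, dims=2, iters=500):
    """Classical MDS on a symmetric distance matrix (list of lists)."""
    n = len(dist)
    d2 = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in d2]
    total_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J. Because D is symmetric,
    # column means equal row means.
    b = [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]

    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Deterministic, non-constant start vector for power iteration.
        v = [math.sin(i + 1 + k) for i in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][k] = v[i] * scale
        # Deflate so the next pass finds the next-largest eigenvalue.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords

# Toy check: four corners of a 3 x 4 rectangle (assumed data).
points = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
dist = [[math.dist(p, q) for q in points] for p in points]
layout = classical_mds(dist)
```

The recovered layout reproduces the pairwise distances, but its rotation and reflection are arbitrary, which is why an MDS map has no meaningful "up" or "left".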
Clusters: k-means with k=4, run on the full 384-dim vectors (not the 2D projection), then color-coded on the map. Restarted a few times to dodge bad initializations.
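A minimal sketch of k-means with restarts, in plain Python. Two details are assumptions rather than facts about the page's code: initial centers are drawn at random from the data points, and the restart with the lowest total within-cluster squared distance is the one kept. The function names and the `restarts` and `seed` parameters are illustrative.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_once(points, k, rng, iters=100):
    """One k-means run from a random initialization."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    inertia = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return inertia, labels

def kmeans(points, k, restarts=5, seed=0):
    """Run k-means several times and keep the lowest-inertia result."""
    rng = random.Random(seed)
    runs = [kmeans_once(points, k, rng) for _ in range(restarts)]
    return min(runs, key=lambda run: run[0])

# Toy data: two well-separated groups (assumed for illustration).
toy = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
inertia, labels = kmeans(toy, k=2)
```

For the map described here, the call would be along the lines of `kmeans(vectors, k=4)` on the full 384-dim vectors, with the returned labels used only to pick colors.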
Neighbors: cosine similarity in the original 384-dim space. The 2D layout is only an approximation, so map distance and panel order can disagree: the panel always ranks neighbors by similarity in the full space, even when they look distant on the map.
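The neighbor lookup can be sketched as a ranking by dot product, which equals cosine similarity because the vectors are length-normalized. The function name, the `top_n` parameter, and the toy 2-dimensional vectors below are assumptions for illustration.

```python
def nearest_neighbors(index, vectors, top_n=5):
    """Rank all other items by dot product with vectors[index].

    Assumes every vector already has unit length, so the dot product
    is the cosine similarity.
    """
    query = vectors[index]
    scored = [
        (sum(q * x for q, x in zip(query, v)), j)
        for j, v in enumerate(vectors)
        if j != index  # an item is not its own neighbor
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(j, score) for score, j in scored[:top_n]]

# Toy unit vectors standing in for the 384-dim embeddings.
toy_vectors = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0]]
neighbors = nearest_neighbors(0, toy_vectors, top_n=2)
```

Because the ranking never touches the 2D coordinates, it is unaffected by whatever distortion the projection introduces.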