Multidimensional Scaling (MDS) and Principal Component Analysis (PCA)

Abstract

Multidimensional scaling (MDS) is a dimensionality reduction technique that preserves the pairwise distances between samples. It works by converting the distance matrix of the original high-dimensional space into the inner product matrix of the low-dimensional embedding. MDS first computes the Euclidean distances between the original samples, then constructs the centered inner product matrix B, and finally applies eigenvalue decomposition to B to obtain the low-dimensional coordinates. Principal component analysis (PCA) is another widely used dimensionality reduction method. It seeks the projection that maximizes the variance of the projected samples, and finds the projection directions from the eigenvectors of the covariance matrix. Both methods aim to reduce the dimensionality of the data while retaining as much of the original information as possible.

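1. Multidimensional Scaling (MDS)

Goal: the distances between samples in the original space should be preserved in the low-dimensional space.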

Suppose the distance matrix of the $m$ samples in the original space is $\mathbf{D}\in\mathbb{R}^{m\times m}$, whose entry $dist_{ij}$ is the distance between samples $i$ and $j$. Let $\mathbf{B}=\mathbf{Z}^{\mathrm{T}}\mathbf{Z}\in\mathbb{R}^{m\times m}$ be the inner product matrix of the samples after dimensionality reduction, where $\boldsymbol{z}_i$ is the $i$-th column of $\mathbf{Z}$ and $b_{ij}=\boldsymbol{z}_i^{\mathrm{T}}\boldsymbol{z}_j$. Requiring the distances to be preserved gives

$$dist_{ij}^2=\|\boldsymbol{z}_i\|^2+\|\boldsymbol{z}_j\|^2-2\boldsymbol{z}_i^{\mathrm{T}}\boldsymbol{z}_j=b_{ii}+b_{jj}-2b_{ij}.$$

Let the reduced samples $\mathbf{Z}$ be centered, i.e. $\sum_{i=1}^{m}\boldsymbol{z}_i=\mathbf{0}$. Then every row and every column of $\mathbf{B}$ sums to zero, because $\sum_{i=1}^{m}b_{ij}=\left(\sum_{i=1}^{m}\boldsymbol{z}_i\right)^{\mathrm{T}}\boldsymbol{z}_j=0$, and likewise for the sum over $j$:

$$\sum_{i=1}^{m}b_{ij}=\sum_{j=1}^{m}b_{ij}=0.$$

Using $\sum_{i=1}^{m}b_{ii}=\mathrm{tr}(\mathbf{B})$ and summing the distance identity over $i$, over $j$, and over both indices, we obtain

$$\begin{aligned}
\sum_{i=1}^{m}dist_{ij}^{2}&=\mathrm{tr}(\mathbf{B})+mb_{jj},\\
\sum_{j=1}^{m}dist_{ij}^{2}&=\mathrm{tr}(\mathbf{B})+mb_{ii},\\
\sum_{i=1}^{m}\sum_{j=1}^{m}dist_{ij}^{2}&=2m\,\mathrm{tr}(\mathbf{B}).
\end{aligned}$$
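The three sums above can be checked numerically. The following is a minimal NumPy sketch, not the author's code: the function name `classical_mds`, the parameter `d_prime`, and the random test data are illustrative assumptions. The text stops before solving for $\mathbf{B}$, so the sketch assumes the standard double-centering formula $\mathbf{B}=-\tfrac{1}{2}\mathbf{J}\mathbf{D}^{(2)}\mathbf{J}$ with $\mathbf{J}=\mathbf{I}-\tfrac{1}{m}\mathbf{1}\mathbf{1}^{\mathrm{T}}$, where $\mathbf{D}^{(2)}$ holds the squared distances.

```python
import numpy as np

def classical_mds(D, d_prime):
    """Sketch of classical MDS: embed m samples into d_prime dimensions
    from an m x m Euclidean distance matrix D (hypothetical helper)."""
    m = D.shape[0]
    D2 = D ** 2
    # Assumed step (not shown in the text): double centering recovers B.
    J = np.eye(m) - np.ones((m, m)) / m
    B = -0.5 * J @ D2 @ J
    # B is symmetric, so eigh applies; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:d_prime]
    # Clip tiny negative eigenvalues caused by floating-point round-off.
    lam = np.clip(eigvals[idx], 0.0, None)
    # Z has shape (d_prime, m); column i is z_i, so B is approximately Z^T Z.
    Z = (eigvecs[:, idx] * np.sqrt(lam)).T
    return B, Z

# Illustrative data: 6 random samples in 3 dimensions (rows are samples).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

B, Z = classical_mds(D, d_prime=3)
D2 = D ** 2
m = D.shape[0]

# sum_i dist_ij^2 = tr(B) + m * b_jj, one value per column j
col_sum_ok = np.allclose(D2.sum(axis=0), np.trace(B) + m * np.diag(B))
# sum_i sum_j dist_ij^2 = 2 m tr(B)
total_ok = np.allclose(D2.sum(), 2 * m * np.trace(B))
# Keeping all 3 dimensions, the embedding reproduces the original distances.
D_rec = np.linalg.norm(Z.T[:, None, :] - Z.T[None, :, :], axis=-1)
dist_ok = np.allclose(D, D_rec)

print(col_sum_ok, total_ok, dist_ok)
```

Choosing `d_prime` smaller than the original dimension keeps only the largest eigenvalues, so the distances are then preserved only approximately.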
