bevfusion transformation analysis

Last modified 2023-05-10 23:19:31

First published 2023-05-10 13:53:03

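This article analyzes in detail the 3D data augmentation that bevfusion, a secondary development on top of the mmdetection3D framework, applies to point clouds and images. Point cloud augmentation consists of global scaling, rotation, and translation; the 3D augmentation of images consists of resize, crop, flip, and rotate. In the LSS stage, the point cloud is first restored to its original state with the inverse transform and then projected from lidar to image, and the result is combined with the image's depth information to estimate depth. Together these steps show the coordinate transformation logic used in 3D object detection with deep learning.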

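bevfusion is a secondary development on top of the mmdetection3D codebase, but it is based on a fairly early version, so its coordinate system conventions can be confusing. The sections below analyze the coordinate transforms used during training: point cloud data augmentation, image 3D data augmentation, and how these two transforms are used in the LSS stage.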

1. Point cloud data augmentation

from typing import Any, Dict

import numpy as np
from numpy import random


class GlobalRotScaleTrans:
    def __init__(self, resize_lim, rot_lim, trans_lim, is_train):
        self.resize_lim = resize_lim
        self.rot_lim = rot_lim
        self.trans_lim = trans_lim
        self.is_train = is_train

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        transform = np.eye(4).astype(np.float32)

        if self.is_train:
            scale = random.uniform(*self.resize_lim)
            theta = random.uniform(*self.rot_lim)
            translation = np.array([random.normal(0, self.trans_lim) for _ in range(3)])
            rotation = np.eye(3)

            # Transform the point cloud with the rotate, translate, and scale
            # methods of the base_points class.
            # Note the -theta: counterclockwise by -theta, i.e. clockwise by theta.
            if "points" in data:
                data["points"].rotate(-theta)
                data["points"].translate(translation)
                data["points"].scale(scale)

            # Transform the boxes with the rotate, translate, and scale
            # methods of the lidar_boxes class.
            # Note that theta is used here: clockwise by theta.
            gt_boxes = data["gt_bboxes_3d"]
            rotation = rotation @ gt_boxes.rotate(theta).numpy()
            gt_boxes.translate(translation)
            gt_boxes.scale(scale)
            data["gt_bboxes_3d"] = gt_boxes

            # Record the transform matrix.
            # Note the transpose on rotation: the matrix returned by rotate is
            # the counterclockwise-theta matrix, and transposing it gives the
            # clockwise matrix, consistent with the transforms above.
            transform[:3, :3] = rotation.T * scale
            transform[:3, 3] = translation * scale

        data["lidar_aug_matrix"] = transform
        return data
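To check that the stored matrix really matches the rotate, translate, scale sequence, the following is a minimal NumPy sketch, not the bevfusion code itself. The helpers `rot_z_ccw`, `augment_points`, and `build_lidar_aug_matrix` are hypothetical names introduced for illustration, and the sketch assumes, as the comments above state, that `gt_boxes.rotate(theta)` returns the untransposed counterclockwise-theta matrix.

```python
import numpy as np


def rot_z_ccw(angle):
    """Matrix that rotates a column vector counterclockwise by `angle` about z."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def augment_points(points, theta, translation, scale):
    """Mimic points.rotate(-theta), translate, scale on an (N, 3) array."""
    # Rotating row vectors counterclockwise by a: p @ R(a).T (here a = -theta)
    rotated = points @ rot_z_ccw(-theta).T
    return (rotated + translation) * scale


def build_lidar_aug_matrix(theta, translation, scale):
    """Assemble the 4x4 matrix the same way GlobalRotScaleTrans does."""
    # Assumption: gt_boxes.rotate(theta) returns the untransposed matrix R(theta)
    rotation = np.eye(3) @ rot_z_ccw(theta)
    transform = np.eye(4)
    transform[:3, :3] = rotation.T * scale
    transform[:3, 3] = translation * scale
    return transform


def apply_homogeneous(matrix, points):
    """Apply a 4x4 matrix to an (N, 3) array of points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return (homogeneous @ matrix.T)[:, :3]


theta, scale = 0.3, 1.05
translation = np.array([0.5, -0.2, 0.1])
points = np.array([[1.0, 2.0, 3.0], [-4.0, 0.5, 1.0]])

augmented = augment_points(points, theta, translation, scale)
matrix = build_lidar_aug_matrix(theta, translation, scale)

# The matrix reproduces the step-by-step augmentation
via_matrix = apply_homogeneous(matrix, points)

# Its inverse restores the original, un-augmented points
restored = apply_homogeneous(np.linalg.inv(matrix), augmented)

print(np.allclose(augmented, via_matrix), np.allclose(restored, points))
```

Because the scale is applied last, both the rotation and the translation are multiplied by `scale` in the matrix, which is why `transform[:3, 3]` stores `translation * scale` rather than `translation`.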

rotate in base_points

Rotates counterclockwise by theta.

# Counterclockwise rotation by theta
elif axis == 2 or axis == -1:
    rot_mat_T = rotation.new_tensor(
        [[rot_cos, -rot_sin, 0], [rot_sin, rot_cos, 0], [0, 0, 1]]
    )
# Transpose: clockwise rotation by theta
rot_mat_T = rot_mat_T.T

# Right-multiplication: counterclockwise rotation by theta
self.tensor[:, :3] = self.tensor[:, :3] @ rot_mat_T
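The effect of the transpose is easy to get backwards when the points are stored as row vectors. As a small sanity check, here is a NumPy sketch (not the mmdetection3D code) that rotates the point (1, 0, 0) by 90 degrees about z. The function name `rotate_rows` is a hypothetical helper used only for this illustration.

```python
import numpy as np


def rotate_rows(points, theta, transpose):
    """Right-multiply (N, 3) row vectors by the z-rotation matrix or its transpose."""
    c, s = np.cos(theta), np.sin(theta)
    rot_mat = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    if transpose:
        rot_mat = rot_mat.T
    return points @ rot_mat


point = np.array([[1.0, 0.0, 0.0]])  # a point on the +x axis
theta = np.pi / 2

# Transposed matrix, right-multiplied: the point moves to +y (counterclockwise)
ccw = rotate_rows(point, theta, transpose=True)

# Untransposed matrix, right-multiplied: the point moves to -y (clockwise)
cw = rotate_rows(point, theta, transpose=False)

print(np.round(ccw, 6), np.round(cw, 6))
```

Seen from the +z axis, moving from +x to +y is a counterclockwise turn, so the transposed-then-right-multiplied form shown above is a counterclockwise rotation, while right-multiplying by the untransposed matrix rotates clockwise.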

rotate in lidar_boxes

Rotates clockwise by theta.