(self-supervised learning)Event Camera Data Pre-training

Publisher: ICCV 2023 

MOTIVATION OF READING: 自监督学习、稀疏事件 = NILM

link: https://arxiv.org/pdf/2301.01928.pdf

Code: GitHub - Yan98/Event-Camera-Data-Pre-training


1. Overview

Contributions are summarized as follows:

1. A self-supervised framework for event camera data pre-training. The pre-trained model can be transferred to diverse downstream tasks;

2. A family of event data augmentations, generating meaningful event images;

3. A conditional masking strategy, sampling informative event patches for network training;

4. An embedding projection loss, using paired RGB embeddings to regularize event embeddings to avoid model collapse;

5. A probability distribution alignment loss for aligning embeddings from the paired event and RGB images.

6. We achieve state-of-the-art performance in standard event benchmark datasets.

2. Related work

The SSL frameworks can be generally divided into two categories: contrastive learning and masked modeling.

2.1 Contrastive learning

This approach generally assumes augmentation invariance of images. one notable drawback
of contrastive learning is suffering from model collapse and training instability.

2.2 Masked modeling

Reconstructing masked inputs from the (i. e., unmasked) visible ones is a popular selfsupervised
learning objective motivated by the idea of autoencoding. (Bert, GPT)

3. Methodology

For pre-training, our method takes event data E and its paired natural RGB image I as inputs, and outputs a pre-trained network fe.

Firstly, consecutively perform data augmentations, event image generation, and conditional masking to obtain two patch sets (xq, xk).

Secondly, fe extracts features from event patch set xq, and he_img and he_evt separately project features from fe to latent embeddings q_img and q_evt.

fm and hm_evt are the momentum of fe and he_evt, and are updated by the exponential moving average (EMA). (momentum的含义可以参考MOCO论文)

The momentum network takes patch set xk as input and generates an embedding k_evt.

At the same time, the natural RGB image I is embeded into y = f1(h1(I)).

Finally, we perform event discrimination, and event and natural RGB image discrimination to train our model. 这里不用INFONCE直接对q_evt和k_evt进行相似度计算是因为这么做会导致embedding collapse使得embedding过于相似。原因是事件图像是稀疏离散。因此使用RGB图像的映射。

L_evt is an event embedding projection loss aiming to pull together paired event embeddings qevt and kevt, for event discrimination.

L_RGB aims to pull together paired event and RGB embeddings q_evt and y, for event and natural RGB image discrimination.

L_k1 aims to drive fe learning discriminative event embeddings, towards well-structured embedding space of natural RGB images.


InfoNCE loss Contrastive learning aims to pull together embeddings q and k+, and pushes away embeddings q and {k−}.

Event embedding projection loss

ζ(v1, v2) is the projection function.

Event and RGB image discrimination

Considering the sparsity of the event image, a single event image is less informative than an RGB image, possessing difficulty for self-supervised event network training.

We pull together embeddings of paired event and RGB images, xq and I.

we first compute the pairwise embedding similarity and then fit an exponential kernel to the similarities to compute probability scores. The probability score of the (i, j)-th pair is given by,

Our probability distribution alignment loss is given by,

Total Loss

where λ1 is a hyper-parameter for balancing the losses.

4. Experiment

We evaluate our method on three downstream tasks: object recognition, optical flow estimation, and semantic segmentation.

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.hqwc.cn/news/308784.html

如若内容造成侵权/违法违规/事实不符,请联系编程知识网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

SONiC和ONL所依赖的Debian版本说明

Debian 的最新几个版本 下一代 Debian 正式发行版的代号为 trixie — 测试(testing)版 Debian 12 (bookworm) — 当前的稳定(stable)版 Debian 11 (bullseye) — 当前的旧的稳定(oldstable)版 Debian 10&a…

OR-NeRF论文笔记

OR-NeRF论文笔记 文章目录 OR-NeRF论文笔记论文概述Abstract1 Introduction2 Related Work3 Background4 Method4.1 Multiview Segmentation4.2 Scene Object Removal 5 ExperimentsDatasetsMetricsMultiview SegmentationScene Object Removal 6 Conclusion 论文概述 目的&am…

1panel使用指南(一)面板安装

一、1panel简介 1Panel是杭州飞致云信息科技有限公司推出的产品 [1],帮助用户实现快速建站。 [2]是一款现代化、开源的Linux服务器运维管理面板,于2023年3月推出,深度集成WordPress和Halo,一键完成域名绑定、SSL证书配置等操作&a…

懒加载的el-tree中没有了子节点之后还是有前面icon箭头的展示,如何取消没有子节点之后的箭头显示

没有特别多的数据 <template><el-tree:props"props":load"loadNode"lazyshow-checkbox></el-tree></template><script>export default {data() {return {props: {label: name,children: zones,isLeaf:"leaf",//关…

django基础学习

django基础学习 文章目录 django基础学习django框架urls.py将请求发送到正确的视图views.py处理请求models.py定义数据模型根据models查询数据HTML模板呈现数据 Django项目结构创建虚拟环境下载django创建站点创建应用settings.py项目设置 通用类别视图会话框架身份验证视图使用…

4. AOP

1 AOP能解决什么问题 1.1 提出问题 1.1.1 情景&#xff1a;数学计算器 1.1.1.1 要求 ①执行加减乘除运算 ②日志&#xff1a;在程序执行期间追踪正在发生的活动 ArithmeticCalculator.java public interface ArithmeticCalculator {double add(double a, double b);doub…

AI面板识别 - 华为OD统一考试

OD统一考试 (B卷) 分值: 100分 题解: Java / Python / C++ 题目描述 AI识别到面板上有N(1 ≤ N ≤ 100)个指示灯,灯大小一样,任意两个之间无重叠。 由于AI识别误差,每次别到的指示灯位置可能有差异,以4个坐标值描述AI识别的指示灯的大小和位置(左上角x1,y1,右下角x2…

Centos安装Kafka(KRaft模式)

1. KRaft引入 Kafka是一种高吞吐量的分布式发布订阅消息系统&#xff0c;它可以处理消费者在网站中的所有动作流数据。其核心组件包含Producer、Broker、Consumer&#xff0c;以及依赖的Zookeeper集群。其中Zookeeper集群是Kafka用来负责集群元数据的管理、控制器的选举等。 由…

EasyNTS端口穿透服务新版本发布 0.8.7 增加隧道流量总数记录,可以知晓设备哪个端口耗费流量了

EasyNTS上云平台可通过远程访问内网应用&#xff0c;包含网络桥接、云端运维、视频直播等功能&#xff0c;极大地解决了现场无固定IP、端口不开放、系统权限不开放等问题。平台可提供一站式上云服务&#xff0c;提供直播上云、设备上云、业务上云、运维上云服务&#xff0c;承上…

爬虫工作量由小到大的思维转变---<第三十五章 Scrapy 的scrapyd+Gerapy 部署爬虫项目>

前言: 项目框架没有问题大家布好了的话,接着我们就开始部署scrapy项目(没搭好架子的话,看我上文爬虫工作量由小到大的思维转变---&#xff1c;第三十四章 Scrapy 的部署scrapydGerapy&#xff1e;-CSDN博客) 正文: 1.创建主机: 首先gerapy的架子,就相当于部署服务器上的;所以…

Collector收集器的高级用法

Collectors收集器的高级用法 场景1&#xff1a;获取关联的班级名称 原先如果需要通过关联字段拿到其他表的某个字段&#xff0c;只能遍历List匹配获取 for (Student student : studentList) {Long clazzId student.getClazzId();// 遍历班级列表&#xff0c;获取学生对应班级…

超维空间S2无人机使用说明书——51、基础版——使用yolov8进行目标跟踪

引言&#xff1a;为了提高yolo识别的质量&#xff0c;提高了yolo的版本&#xff0c;改用yolov8进行物体识别&#xff0c;同时系统兼容了低版本的yolo&#xff0c;包括基于C的yolov3和yolov4&#xff0c;以及yolov7。 简介&#xff0c;为了提高识别速度&#xff0c;系统采用了G…