Resource: Seminar

From MobiNetS
Revision as of 09:36, 25 October 2024 by Congrong (talk | contribs)
Jump to: navigation, search

Time: 2024-10-25 10:30-12:00
Address: 4th Research Building A533
Useful links: 📚 Readling list; 📆 Schedules; 🧐 Previous seminars.

Latest

  1. [ICDM‘23] Feature Aggregating Network with Inter-Frame Interaction for Efficient Video Super-Resolution, Shuhong
    Abstract: Video super-resolution (VSR) on mobile devices aims to restore high-resolution frames from their low-resolution counterparts, satisfying the requirements of performance, FLOPs and latency. On one hand, partial feature processing, as a classic and acknowledged strategy, is developed in current studies to reach an appropriate trade-off between FLOPs and accuracy. However, the splitting of partial feature processing strategy are usually performed in a blind manner, thereby reducing the computational efficiency and performance gains. On the other hand, current methods for mobile platforms primarily treat VSR as an extension of single-image super-resolution to reduce model calculation and inference latency. However, lacking inter-frame information interaction in current methods results in a suboptimal latency and accuracy trade-off. To this end, we propose a novel architecture, termed Feature Aggregating Network with Inter-frame Interaction (FANI), a lightweight yet considering frame-wise correlation VSR network, which could achieve real-time inference while maintaining superior performance. Our FANI accepts adjacent multi-frame low-resolution images as input and generally consists of several fully-connection-embedded modules, i.e., Multi-stage Partial Feature Distillation (MPFD) for capturing multi-level feature representations. Moreover, considering the importance of inter-frame alignment, we further employ a tiny Attention-based Frame Alignment (AFA) module to promote inter-frame information flow and aggregation efficiently. Extensive experiments on the well-known dataset and real-world mobile device demonstrate the superiority of our proposed FANI, which means that our FANI could be well adapted to mobile devices and produce visually pleasing results.
  2. [INFOCOM'23] Cross-Camera Inference on the Constrained Edge, Xinyan
    Abstract: The proliferation of edge devices has pushed computing from the cloud to the data sources, and video analytics is among the most promising applications of edge computing. Running video analytics is compute- and latency-sensitive, as video frames are analyzed by complex deep neural networks (DNNs) which put severe pressure on resource-constrained edge devices. To resolve the tension between inference latency and resource cost, we present Polly, a cross-camera inference system that enables co-located cameras with different but overlapping fields of views (FoVs) to share inference results between one another, thus eliminating the redundant inference work for objects in the same physical area. Polly’s design solves two basic challenges of cross-camera inference: how to identify overlapping FoVs automatically, and how to share inference results accurately across cameras. Evaluation on NVIDIA Jetson Nano with a real-world traffic surveillance dataset shows that Polly reduces the inference latency by up to 71.4% while achieving almost the same detection accuracy with state-of-the-art systems.
  3. [TMC'24] CrossVision: Real-Time On-Camera Video Analysis via Common RoI Load Balancing, Xinyan
    Abstract: Smart cameras with on-device deep learning inference capabilities are enabling distributed video analytics at the data source without sending raw video data over the often unreliable and congested wireless network. However, how to unleash the full potential of the computing power of the camera network requires careful coordination among the distributed cameras, catering to the uneven workload distribution and the heterogeneous computing capabilities. This paper presents CrossVision, a distributed framework for real-time video analytics, that retains all video data on cameras while achieving low inference delay and high inference accuracy. The key idea behind CrossVision is that there is a significant information redundancy in the video content captured by cameras with overlapped Field-of-Views (FoVs), which can be exploited to reduce inference workload as well as improve inference accuracy between correlated cameras. CrossVision consists of three main components to realize its function: a Region-of-Interest (RoI) Matcher that discovers video content correlation based on a segmented FoV transformation scheme; a Workload Balancer that implements a randomized workload balancing strategy based on a bulk-queuing analysis, taking into account the cameras’ predicted future workload arrivals; an Accuracy Guard that ensures that the inference accuracy is not sacrificed as redundant information is discarded. We evaluate CrossVision in a hardware-augmented simulator and on real-world cross-camera datasets, and the results show that CrossVision is able to significantly reduce inference delay while improving the inference accuracy compared to a variety of baseline approaches.

History

2024

2023

2022

2021

2020

  • [Topic] [ The path planning algorithm for multiple mobile edge servers in EdgeGO], Rong Cong, 2020-11-18

2019

2018

2017

Instructions

请使用Latest_seminar和Hist_seminar模板更新本页信息.

    • 修改时间和地点信息
    • 将当前latest seminar部分的code复制到这个页面
    • 将{{Latest_seminar... 修改为 {{Hist_seminar...,并增加对应的日期信息|date=
    • 填入latest seminar各字段信息
    • link请务必不要留空,如果没有link则填本页地址 https://mobinets.org/index.php?title=Resource:Seminar
  • 格式说明
    • Latest_seminar:

{{Latest_seminar
|confname=
|link=
|title=
|speaker=
}}

    • Hist_seminar

{{Hist_seminar
|confname=
|link=
|title=
|speaker=
|date=
}}