Difference between revisions of "Resource:Previous Seminars"

Latest revision as of 21:56, 9 April 2026

[Topic] The path planning algorithm for multiple mobile edge servers in EdgeGO, Rong Cong, 2020-11-18

[Mobisys20] Combating packet collisions using non-stationary signal scaling in LPWANs, Wenliang Mao, 2020-11-18
[Topic] Dependency-Aware and Latency-Optimal Service Cache in Edge networks, Jiwei Mo, 2020-11-18
[talk] Paper Carnival 2020, ALL, 2020-09-24,25,26

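Please use the Latest_seminar and Hist_seminar templates to update the information on this page.

- Update the time and location information
- Copy the code of the current latest seminar section into this page
- Change {{Latest_seminar... to {{Hist_seminar..., and add the corresponding date field |date=
- Fill in each field of the latest seminar
- Never leave link empty; if there is no link, use this page's address https://mobinets.org/index.php?title=Resource:Seminar

Format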
- Latest_seminar:

{{Latest_seminar
|confname=
|link=
|title=
|speaker=
}}

- Hist_seminar

{{Hist_seminar
|confname=
|link=
|title=
|speaker=
|date=
}}
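The conversion step described above (renaming the template and appending a date field) can be sketched in Python. This is only a minimal illustration, not part of the wiki tooling: the function name `latest_to_hist` and the sample field values are hypothetical, and it assumes the input holds exactly one Latest_seminar block.

```python
def latest_to_hist(wikitext: str, date: str) -> str:
    """Turn a {{Latest_seminar ...}} block into a {{Hist_seminar ...}} block.

    Hypothetical helper for illustration; assumes `wikitext` contains a single
    template call that ends with '}}'.
    """
    # Rename the template (first occurrence only).
    converted = wikitext.replace("{{Latest_seminar", "{{Hist_seminar", 1)
    # Split at the last closing braces so the date lands inside the template.
    head, sep, tail = converted.rpartition("}}")
    if not sep:
        raise ValueError("no closing '}}' found")
    return f"{head}|date={date}\n{sep}{tail}"


# Sample values below are placeholders, not a real seminar entry.
latest = (
    "{{Latest_seminar\n"
    "|confname=Example'25\n"
    "|link=https://mobinets.org/index.php?title=Resource:Seminar\n"
    "|title=Example title\n"
    "|speaker=Someone\n"
    "}}"
)
print(latest_to_hist(latest, "2025-11-28"))
```

The sketch uses plain string operations rather than a wikitext parser, so it would not cope with nested templates inside a field.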

@@ Line 1: / Line 1: @@
 === History ===
+{{Hist_seminar
+|confname =IEEE Network'25
+|link = https://ieeexplore.ieee.org/document/10526298
+|title= Optimal Entanglement Distribution Problem in Satellite-Based Quantum Networks
+|speaker=Yaliang
+|date=2026-3-20
+}}
+{{Hist_seminar
+|confname =INFOCOM'24
+|link = https://ieeexplore.ieee.org/document/10621270
+|title= SECO: Multi-Satellite Edge Computing Enabled Wide-Area and Real-Time Earth Observation Missions
+|speaker=LinQi
+|date=2026-3-20
+}}
+{{Hist_seminar
+|confname =Mobicom'25
+|link = https://dl.acm.org/doi/10.1145/3680207.3765249
+|title= SpaceSched: A Constellation-Wide Scheduling System for Resolving Ground Track Congestion in Remote Sensing
+|speaker=Yifei
+|date=2026-3-13
+}}
+{{Hist_seminar
+|confname =Mobicom'25
+|link = https://dl.acm.org/doi/10.1145/3680207.3723473
+|title= NeVo: Advancing Volumetric Video Streaming with Neural Content Representation
+|speaker=Mengfan
+|date=2026-3-13
+}}{{Hist_seminar
+|confname =SenSys'25
+|link = https://dl.acm.org/doi/10.1145/3715014.3722075
+|title= MoLoRa: Intelligent Mobile Antenna System for Enhanced LoRa Reception in Urban Environments
+|speaker=Kai Chen
+|date=2026-1-30
+}}
+{{Hist_seminar
+|confname =WWW'25
+|link = https://dl.acm.org/doi/abs/10.1145/3696410.3714571
+|title= Bridging the Gap: Aligning Language Model Generation with Structured Information Extraction via Controllable State Transition
+|speaker=Daobin
+|date=2026-1-30
+}}
+{{Hist_seminar
+|confname =TMC'25
+|link = https://ieeexplore.ieee.org/document/10705683
+|title= Edge-Cloud Collaborated Object Detection via Bandwidth Adaptive Difficult-Case Discriminator
+|speaker=Menghao Liu
+|date=2026-1-23
+}}
+{{Hist_seminar
+|confname =NSDI'24
+|link = https://www.usenix.org/conference/nsdi24/presentation/sivaraman
+|title= Gemino: Practical and Robust Neural Compression for Video Conferencing
+|speaker=Xinyan
+|date=2026-1-23
+}}
+{{Hist_seminar
+|confname =OSDI'24
+|link = https://www.usenix.org/conference/osdi24/presentation/zhong-yinmin
+|title= DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving
+|speaker=Ruizheng
+|date=2026-1-09
+}}
+{{Hist_seminar
+|confname =OSDI'25
+|link = https://www.usenix.org/conference/osdi25/presentation/domingo
+|title= Kamino: Efficient VM Allocation at Scale with Latency-Driven Cache-Aware Scheduling
+|speaker=Chenli
+|date=2026-1-09
+}}
+{{Hist_seminar
+|confname =OSDI'25
+|link = https://www.usenix.org/conference/osdi25/presentation/ren
+|title= Enabling Efficient GPU Communication over Multiple NICs with FuseLink
+|speaker=Jiahao
+|date=2025-12-26
+}}
+{{Hist_seminar
+|confname =ToN'25
+|link = https://ieeexplore.ieee.org/document/11153500
+|title= Cost-Aware High-Fidelity Entanglement Distribution and Purification in the Quantum Internet
+|speaker=Bangguo
+|date=2025-12-26
+}}{{Hist_seminar
+|confname =TWC'24
+|link = https://ieeexplore.ieee.org/abstract/document/10623400
+|title= SpaceEdge: Optimizing Service Latency and Sustainability for Space-Centric Task Offloading in LEO Satellite Networks
+|speaker=Haifeng
+|date=2025-12-19
+}}
+{{Hist_seminar
+|confname =Mobicom'25
+|link = https://dl.acm.org/doi/10.1145/3680207.3765267
+|title= Vega: Fully Immersive Mobile Volumetric Video Streaming with 3D Gaussian Splatting
+|speaker=Jiyi
+|date=2025-12-19
+}}{{Hist_seminar
+|confname =EMNLP'25
+|link = https://arxiv.org/abs/2501.18460
+|title= ExeCoder: Empowering Large Language Models with Executability Representation for Code Translation
+|speaker=Youwei Ran
+|date=2025-12-12
+}}
+{{Hist_seminar
+|confname =CoRL'24
+|link = https://openreview.net/forum?id=FO6tePGRZj
+|title= Mobile ALOHA: Learning Bimanual Mobile Manipulation using Low-Cost Whole-Body Teleoperation
+|speaker=Yi Zhou
+|date=2025-12-12
+}}{{Hist_seminar
+|confname =ACL'24
+|link = https://arxiv.org/abs/2406.16441
+|title= UniCoder: Scaling Code Large Language Model via Universal Code
+|speaker=Bairong Liu
+|date=2025-12-05
+}}
+{{Hist_seminar
+|confname =TMC'25
+|link = https://ieeexplore.ieee.org/abstract/document/11160677
+|title= Resolving Inter-Logical Channel Interference for Large-scale LoRa Deployments
+|speaker=Mengyu
+|date=2025-12-05
+}}
+{{Hist_seminar
+|confname =ToN'25
+|link = https://ieeexplore.ieee.org/abstract/document/10843977
+|title= Spliceosome: On-Camera Video Thinning and Tuning for Timely and Accurate Analytics
+|speaker=Zhongwei Sun
+|date=2025-11-28
+}}
+{{Hist_seminar
+|confname =NSDI'25
+|link = https://ieeexplore.ieee.org/abstract/document/10843977
+|title= Accelerating Design Space Exploration for LLM Training Systems with Multi-experiment Parallel Simulation
+|speaker=Qinyong
+|date=2025-11-28
+}}
+{{Hist_seminar
+|confname =ASAP'25
+|link = https://ieeexplore.ieee.org/abstract/document/11113621
+|title= ReaLLM: A Trace-Driven Framework for Rapid Simulation of Large-Scale LLM Inference
+|speaker=JunZhe
+|date=2025-11-21
+}}{{Hist_seminar
+|abstract =With the proliferation of mobile devices, spatial crowdsourcing has emerged as a promising paradigm for facilitating location-based services, encompassing various applications across academia and industries. Recently, pioneering works have attempted to infer workers' mobility patterns from historical data to improve the quality of task assignment. However, these studies have overlooked or under-examined issues such as the dynamic mobility patterns of crowd workers, especially in the context of newcomers, the misalignment between the objectives of mobility prediction and task assignment, and the effective utilization of predicted mobility patterns. In this paper, we investigate a problem we term Task Assignment in Mobility Prediction-aware Spatial Crowdsourcing (TAMP). To address the TAMP problem, we first propose a task-adaptive meta-learning algorithm, which trains a set of specific meta-knowledge for workers' mobility prediction models through game theory-based learning task clustering and meta-training within each cluster. Then, we design a task assignment-oriented loss function and develop a task assignment algorithm that incorporates prediction performance, prioritizing assignments with higher confidence of completion. Extensive experiments on real-world datasets validate that our proposed methods can effectively improve the quality of task assignment.
+|confname =ICDE'25
+|link = https://ieeexplore.ieee.org/document/11113007
+|title= Effective Task Assignment in Mobility Prediction-Aware Spatial Crowdsourcing
+|speaker= Zhenguo
+|date=2025-11-21
+}}{{Hist_seminar
+|abstract = Entanglement distribution across remote distances is critical for many quantum applications. Currently, the de facto approach for remote entanglement distribution relies on optical fiber for on-the-ground entanglement distribution. However, the fiber-based approach is incapable of global-scale entanglement distribution due to intrinsic limitations. This paper investigates a new hybrid ground-satellite quantum network architecture (QuESat) for global-scale entanglement distribution, integrating an on-the-ground fiber network with a global-scale passive optical network built with low-Earth-orbit satellites. The satellite network provides dynamic construction of photon lightpaths based on near-vacuum beam guides constructed via adjustable arrays of lenses, forwarding photons from one ground station to another with very high efficiency over long distances compared to using fiber. To assess the feasibility and effectiveness of QuESat for global communication, we formulate lightpath provisioning and entanglement distribution problems, considering the orbital dynamics of satellites and the time-varying entanglement demands from ground users. A two-stage algorithm is developed to dynamically configure the beam guides and distribute entanglements, respectively. The algorithm combines randomized and deterministic rounding for lightpath provisioning to enable global connectivity, with optimal entanglement swapping for distributing entanglements to meet users' demands. By developing a ground-satellite quantum network simulator, QuESat achieves multi-fold improvements compared to repeater networks.
+|confname = INFOCOM'25
+|link = https://ieeexplore.ieee.org/document/11044649
+|title= QuESat: Satellite-Assisted Quantum Internet for Global-Scale Entanglement Distribution
+|speaker= Yaliang
+|date=2025-11-07
+}}{{Hist_seminar
+|abstract =The global business of transnational enterprises demands geo-distributed databases, where the leader-follower-based consensus protocols are the key to guaranteeing consistency of replicas spread across regions. Compared with traditional databases running in a single data center, determining which node is the leader in consensus protocol has a greater performance impact in geo-distributed databases running across multiple data centers. However, the performance of legacy leader management is far from satisfactory due to the network and application dynamics (e.g., network delay, node popularity, operation read-write ratio). This paper proposes GeoLM toward performance-oriented leader management for geo-distributed consensus protocols. GeoLM captures the network and application dynamics and proactively conducts seamless leader handovers with bounded switching costs. Our geo-distributed experimental results show that GeoLM improves performance up to 49.75% over the baselines (e.g., Raft and Geo-Raft) and achieves considerably good performance compared to state-of-the-art consensus protocols (e.g., SwiftPaxos, CURP, and EPaxos).
+|confname = INFOCOM'25
+|link = https://ieeexplore.ieee.org/document/11044598
+|title= GeoLM: Performance-oriented Leader Management for Geo-Distributed Consensus Protocol
+|speaker= Linqi Liu
+|date=2025-11-07
+}}{{Hist_seminar
+|abstract = Immersive telepresence has the potential to revolutionize remote communication by offering a highly interactive and engaging user experience. However, state-of-the-art exchanges large volumes of 3D content to achieve satisfactory visual quality, resulting in substantial Internet bandwidth consumption. To tackle this challenge, we introduce MagicStream, a first-of-its-kind semantic-driven immersive telepresence system that effectively extracts and delivers compact semantic details of captured 3D representation of users, instead of traditional bit-by-bit communication of raw content. To minimize bandwidth consumption while maintaining low end-to-end latency and high visual quality, MagicStream incorporates the following key innovations: (1) efficient extraction of user's skin/cloth color and motion semantics based on lighting characteristics and body keypoints, respectively; (2) novel, real-time human body reconstruction from motion semantics; and (3) on-the-fly neural rendering of users' immersive representation with color semantics. We implement a prototype of MagicStream and extensively evaluate its performance through both controlled experiments and user trials. Our results show that, compared to existing schemes, MagicStream can drastically reduce Internet bandwidth usage by up to 1195X while maintaining good visual quality.
+|confname = Sensys'24
+|link = https://dl.acm.org/doi/10.1145/3666025.3699344
+|title= MagicStream: Bandwidth-conserving Immersive Telepresence via Semantic Communication
+|speaker= Mengfan Wang
+|date=2025-10-31
+}}{{Hist_seminar
+|abstract =To fulfill computing demands of numerous Internet of Things (IoT) devices in infrastructure-free regions, low earth orbit (LEO) satellite edge computing has been proposed in recent years, to circumvent the latency arising from long backhaul and link congestion in traditional cloud computing mode. This article proposes a novel time-varying graph-based collaborative task offloading strategy for LEO satellite IoT to reduce task computing latency. To this end, a computing coordinate graph (CCG) is designed to characterize the time-varying topology and resource distribution of LEO satellite networks. When a task is offloaded to LEO satellite networks because local computing capability is unable to meet latency constraint, the position of the task access satellite in the CCG is determined first. Then, the expanded hop counts from all satellite nodes to the access satellite are calculated, which informs the partitioning of different node sets. Afterwards, considering both link and on-board computing resources, with the access satellite as the reference node, the minimum total task computing latency for each node set is obtained in an ascending order of the expanded hop counts. Finally, the minimum one among obtained latency values is the anticipated total task computing latency. Simulation results demonstrate the effectiveness of the proposed task offloading strategy in reducing task computing latency.
+|confname = Systems Journal
+|link = https://ieeexplore.ieee.org/document/11024019
+|title= Collaborative Task Offloading for LEO Satellite Internet of Things: A Novel Computing Coordinate Graph-Based Approach
+|speaker= Yifei Zhou
+|date=2025-10-31
+}}
+{{Hist_seminar
+|abstract = Unlike traditional data collection applications (e.g., environment monitoring) that are dominated by uplink transmissions, the newly emerging applications (e.g., device actuation, firmware update, packet reception acknowledgement) also pose ever-increasing demands on downlink transmission capabilities. However, current LoRaWAN falls short in supporting such applications primarily due to downlink-uplink asymmetry. While the uplink can concurrently receive multiple packets, downlink transmission is limited to a single logical channel at a time, which fundamentally hinders the deployment of downlink-hungry applications. To tackle this practical challenge, FDLoRa develops the first-of-its-kind in-band full-duplex LoRa gateway design with novel solutions to mitigate the impact of self-interference (i.e., strong downlink interference to ultra-weak uplink reception), which unleashes the full spectrum for in-band downlink transmissions without compromising the reception of weak uplink packets. Built upon the full-duplex gateways, FDLoRa introduces a new downlink framework to support concurrent downlink transmissions over multiple logical channels of available gateways. Evaluation results demonstrate that FDLoRa boosts downlink capacity by 5.7x compared to LoRaWAN on a three-gateway testbed and achieves 2.58x higher downlink concurrency per gateway than the state-of-the-art.
+|confname = Sensys'24
+|link = https://dl.acm.org/doi/10.1145/3666025.3699338
+|title= FDLoRa: Tackling Downlink-Uplink Asymmetry with Full-duplex LoRa Gateways
+|speaker= Kai Chen
+|date=2025-10-23
+}}{{Hist_seminar
+|abstract =Recent years have witnessed a widespread adoption of containers. While containers simplify and accelerate application development, existing container network technologies either incur significant overhead, which hurts performance for distributed applications, or lose flexibility or compatibility, which hinders the widespread deployment in production. We carefully analyze the kernel data path of an overlay network, quantifying the time consumed by each segment of the data path and identifying the extra overhead in an overlay network compared to bare metal. We observe that this extra overhead generates repetitive results among packets, which inspires us to introduce caches within an overlay network. We design and implement ONCache (Overlay Network Cache), a cache-based container overlay network, to eliminate the extra overhead while maintaining flexibility and compatibility. We implement ONCache using the extended Berkeley Packet Filter (eBPF) with only 524 lines of code, and integrate it as a plugin of Antrea. With ONCache, containers attain networking performance akin to that of bare metal. Compared to the standard overlay networks, ONCache improves throughput and request-response transaction rate by 12% and 36% for TCP (20% and 34% for UDP), respectively, while significantly reducing per-packet CPU overhead. Popular distributed applications also benefit from ONCache.
+|confname = NSDI'25
+|link = https://www.usenix.org/conference/nsdi25/presentation/lin-shengkai
+|title= ONCache: A Cache-Based Low-Overhead Container Overlay Network
+|speaker= Daobing Zeng
+|date=2025-10-24
+}}
+{{Hist_seminar
+|abstract = We present HyperCam, an energy-efficient image classification pipeline that enables computer vision tasks onboard low-power IoT camera systems. HyperCam leverages hyperdimensional computing to perform training and inference efficiently on low-power microcontrollers. We implement a low-power wireless camera platform using off-the-shelf hardware and demonstrate that HyperCam can achieve an accuracy of 93.60%, 84.06%, 92.98%, and 72.79% for MNIST, Fashion-MNIST, Face Detection, and Face Identification tasks, respectively, while significantly outperforming other classifiers in resource efficiency. Specifically, it delivers inference latency of 0.08-0.27s while using 42.91-63.00KB flash memory and 22.25KB RAM at peak. Among other machine learning classifiers such as SVM, xgBoost, MicroNets, MobileNetV3, and MCUNetV3, HyperCam is the only classifier that achieves competitive accuracy while maintaining competitive memory footprint and inference latency that meets the resource requirements of low-power camera systems.
+|confname = Arxiv
+|link = https://arxiv.org/html/2501.10547v1
+|title= HyperCam: Low-Power Onboard Computer Vision for IoT Cameras
+|speaker= Menghao Liu
+|date=2025-10-17
+}}{{Hist_seminar
+|abstract = We present NIER, a video conferencing system that can adaptively maintain a low bitrate (e.g., 10–100 Kbps) with reasonable visual quality while being robust to packet losses. We use key-point-based deep image animation (DIA) as a key building block and address a series of networking and system challenges to make NIER practical. Our evaluations show that NIER significantly outperforms the baseline solutions.
+|confname =SIGCOMM'25 (short paper)
+|link = https://dl.acm.org/doi/pdf/10.1145/3718958.3750518
+|title= NIER: Practical Neural-enhanced Low-bitrate Video Conferencing
+|speaker=Xinyan Wang
+|date=2025-9-26
+}}{{Hist_seminar
+|abstract = Distributed Edge Computing (DEC) has emerged as a novel paradigm, owing to its superior performance in communication latency, parallel computing efficiency, and energy consumption. With the surge of tasks in generative artificial intelligence, DEC faces higher demands for parallel computing efficiency. Scheduling multiple tasks for simultaneous processing, rather than one-by-one handling, could enhance parallel efficiency. Multiple tasks have multi-dependencies, i.e., sequence dependency, attribute similarity, and attribute correlation. Utilizing the bidirectional edges of traditional graphs to represent multi-dependencies can lead to an explosion in quantity. A hypergraph, with its hyperedges capable of connecting any number of vertices, can significantly solve the above problem. However, multi-dependencies are rarely studied in current research, posing challenges that include the inability to represent and to capture multi-dependency hypergraphs. In this work, we introduce a Joint communication and computation scheduling for hypErgraph Tasks in DEC, namely HyperJet. To effectively represent multi-dependencies, we employ hypergraph construction to represent task attributes and utilize hypergraph partitioning to clarify and refine task attribute correlations, enhancing parallel efficiency. In response to the challenge of capturing multi-dependencies, we employ a scheduling mechanism with the hypergraph neural network that efficiently acquires higher-order attribute correlated information among convolution matrices, providing enriched contextual information on multi-dependencies that supports decision-making in scheduling tasks. The evaluations using real-world traces demonstrate an 18.07% improvement in parallel efficiency of task scheduling.
+|confname =INFOCOM'25
+|link = https://ieeexplore.ieee.org/abstract/document/11044587
+|title= HyperJet: Joint Communication and Computation Scheduling for Hypergraph Tasks in Distributed Edge Computing
+|speaker= Yi Zhou
+|date=2025-9-26
+}}{{Hist_seminar
+|abstract = Localization of networked nodes is an essential problem in emerging applications, including first-responder navigation, automated manufacturing lines, vehicular and drone navigation, asset tracking, Internet of Things, and 5G communication networks. In this paper, we present Locate3D, a novel system for peer-to-peer node localization and orientation estimation in large networks. Unlike traditional range-only methods, Locate3D introduces angle-of-arrival (AoA) data as an added network topology constraint. The system solves three key challenges: it uses angles to reduce the number of measurements required by 4× and jointly uses range and angle data for location estimation. We develop a spanning-tree approach for fast location updates, and to ensure the output graphs are rigid and uniquely realizable, even in occluded or weakly connected areas. Locate3D cuts down latency by up to 75% without compromising accuracy, surpassing standard range-only solutions. It has a 0.86 meter median localization error for building-scale multi-floor networks (32 nodes, 0 anchors) and 12.09 meters for large-scale networks (100,000 nodes, 15 anchors).
+|confname =NSDI'25
+|link = https://www.usenix.org/conference/nsdi25/presentation/garg
+|title= Large Network UWB Localization: Algorithms and Implementation
+|speaker=Bangguo
+|date=2025-9-26
+}}
+{{Hist_seminar
+|abstract = With cloud-side computing and rendering, mobile cloud gaming (MCG) is expected to deliver high-quality gaming experiences to budget mobile devices. However, our measurement on representative MCG platforms reveals that even under good network conditions, all platforms exhibit high interactive latency of 112–403 ms, from a user-input action to its display response, that critically affects users’ quality of experience. Moreover, jitters in network latency often lead to significant fluctuations in interactive latency. In this work, we collaborate with a commercial MCG platform to conduct the first in-depth analysis on the interactive latency of cloud gaming. We identify VSync, the synchronization primitive of Android graphics pipeline, to be a key contributor to the excessive interactive latency; as many as five VSync events are intricately invoked, which serialize the complex graphics processing logic on both the client and cloud sides. To address this, we design an end-to-end VSync regulator, dubbed LoopTailor, which minimizes VSync events by decoupling game rendering from the lengthy cloud-side graphics pipeline and coordinating cloud game rendering directly with the client. We implement LoopTailor on the collaborated platform and commodity Android devices, reducing the interactive latency (by ∼34%) to stably below 100 ms.
+|confname =NSDI'25
+|link = https://www.usenix.org/conference/nsdi25/presentation/li-yang
+|title= Dissecting and Streamlining the Interactive Loop of Mobile Cloud Gaming
+|speaker= Li Chen
+|date=2025-9-9
+}}{{Hist_seminar
+|abstract = The local deployment of large language models (LLMs) on mobile devices has garnered increasing attention due to its advantages in enhancing user privacy and enabling offline operation. However, given the limited computational resources of a single mobile device, only small language models (SLMs) with restricted capabilities can currently be supported. In this paper, we explore the potential of leveraging the collective computing power of multiple mobile devices to collaboratively support more efficient local LLM inference. We evaluate the feasibility and efficiency of existing parallelism techniques under the constraints of mobile devices and wireless network, identifying that chunked pipeline parallelism holds promise for realizing this vision. Building on this insight, we propose FlexSpark, a novel solution designed to achieve efficient and robust multi-device collaborative inference. FlexSpark incorporates priority scheduling, ordered communication, and elastic compression to maximize wireless bandwidth utilization, and thus accelerates distributed inference. Preliminary experimental results demonstrate that FlexSpark achieves up to a 2 × speedup compared to state-of-the-art frameworks, significantly enhancing the practicality and scalability of LLM deployment on mobile devices.
+|confname =APNet'25
+|link = https://dl.acm.org/doi/10.1145/3735358.3735368
+|title= FlexSpark: Robust and Efficient Multi-Device Collaborative Inference over Wireless Network
+|speaker=Ruizhen
+|date=2025-9-19
+}}
+{{Hist_seminar
+|abstract = Reconfigurable Intelligent Surfaces (RIS) are a promising technology for creating smart radio environments by controlling wireless propagation. However, several factors hinder the integration of RIS technology into existing cellular networks, including the incompatibility of RIS control interfaces with 5G PHY/MAC procedures for synchronizing radio scheduling decisions and RIS operation, and the cost and energy limitations of passive RIS technology. This paper presents RISENSE, a system for practical RIS integration in cellular networks. First, we propose a novel, low-cost, and low-power RIS design capable of decoding control messages without complex baseband operations or additional RF chains, utilizing a power sensor and a network of microstrip lines and couplers. Second, we design an effective in-band wireless RIS control interface, compatible with 5G PHY/MAC procedures, that embeds amplitude-modulated (AM) RIS control commands directly into standard OFDM-modulated 5G data channels. Finally, we propose a low-overhead protocol that supports swift on-demand RIS re-configurability, making it adaptable to varying channel conditions and user mobility, while minimizing the wastage of 5G OFDM symbols. Our experiments validate the design of RISENSE and our evaluation shows that our system can reconfigure a RIS at the same pace as users move, boosting 5G coverage where static or slow RIS controllers cannot.
+|confname = Mobisys'25
+|link = https://dspace.networks.imdea.org/handle/20.500.12761/1925
+|title= RISENSE: Long-Range In-Band Wireless Control of Passive Reconfigurable Intelligent Surfaces
+|speaker= Haifeng
+|date=2025-9-12
+}}
+{{Hist_seminar
+|abstract = Traditional 3D content representations include dense point clouds that consume large amounts of data and hence network bandwidth, while newer representations such as neural radiance fields suffer from poor frame rates due to their non-standard volumetric rendering pipeline. 3D Gaussian splats (3DGS) can be seen as a generalization of point clouds that meet the best of both worlds, with high visual quality and efficient rendering for real-time frame rates. However, delivering 3DGS scenes from a hosting server to client devices is still challenging due to high network data consumption (e.g., 1.5 GB for a single scene). The goal of this work is to create an efficient 3D content delivery framework that allows users to view high quality 3D scenes with 3DGS as the underlying data representation. The main contributions of the paper are: (1) Creating new layered 3DGS scenes for efficient delivery, (2) Scheduling algorithms to choose what splats to download at what time, and (3) Trace-driven experiments from users wearing virtual reality headsets to evaluate the visual quality and latency. Our system for Layered 3D Gaussian Splats delivery (L3GS) demonstrates high visual quality, achieving 16.9% higher average SSIM compared to baselines, and also works with other compressed 3DGS representations. The code is available at https://github.com/mavens-lab/layered_3d_gaussian_splats.
+|confname =Mobicom'25
+|link = https://arxiv.org/html/2504.05517v1
+|title= L3GS: Layered 3D Gaussian Splats for Efficient 3D Scene Delivery
+|speaker=Jiyi
+|date=2025-9-12
+}}
+{{Hist_seminar
+|abstract = This year, we are embracing the exciting new trends in AIoT including MLsys, LLMs, embodied perception, volumetric videos, etc. Papers collected from top venues in 2025 will be discussed in-depth, and research problems and new ideas are to be discovered!
+|confname = Beginning of new semester
+|link = https://mobinets.cn/site/Resource:Paper_Carnival_2025
+|title= Paper Carnival 2025
+|speaker=All
+|date=2025-08-27
+}}
+{{Hist_seminar
+|abstract = In the metaverse era, point cloud video (PCV) streaming on mobile XR devices is pivotal. While most current methods focus on PCV compression from traditional 3-DoF video services, emerging AI techniques extract vital semantic information, producing content resembling the original. However, these are early-stage and computationally intensive. To enhance the inference efficacy of AI-based approaches, accommodate dynamic environments, and facilitate applicability to metaverse XR devices, we present ISCom, an interest-aware semantic communication scheme for lightweight PCV streaming. ISCom is featured with a region-of-interest (ROI) selection module, a lightweight encoder-decoder training module, and a learning-based scheduler to achieve real-time PCV decoding and rendering on resource-constrained devices. ISCom's dual-stage ROI selection significantly reduces data volume according to real-time interest. The lightweight PCV encoder-decoder training is tailored to resource-constrained devices and adapts to the heterogeneous computing capabilities of devices. Furthermore, we provide a deep reinforcement learning (DRL)-based scheduler to select the optimal encoder-decoder model for various devices adaptively, considering the dynamic network environments and device computing capabilities. Our extensive experiments demonstrate that ISCom outperforms baselines on mobile devices, achieving a minimum rendering frame rate improvement of 10 FPS and up to 22 FPS. Furthermore, our method significantly reduces memory usage by 41.7% compared to the state-of-the-art AITransfer method. These results highlight the effectiveness of ISCom in enabling lightweight PCV streaming and its potential to improve immersive experiences for emerging metaverse applications.
+|confname =JSAC'24
+|link = https://dl.acm.org/doi/10.1109/JSAC.2023.3345430
+|title= ISCom: Interest-Aware Semantic Communication Scheme for Point Cloud Video Streaming on Metaverse XR Devices
+|speaker=Jiyi
+|date=2025-06-13
+}}
+{{Hist_seminar
+|abstract = Scientific Illustration Tutorial
+|confname = TUTORIAL
+|link = https://mobinets.cn/Resource:Seminar
+|title= Idea share
+|speaker=OldBee
+|date=2025-06-13
+}}
+{{Hist_seminar
+|abstract = Deploying deep convolutional neural networks (CNNs) for edge-based video analytics poses significant challenges due to the intensive computing demands. Model partitioning has emerged as a promising solution by offloading segments of CNNs to multiple proximal edge devices for collaborative inference. However, this approach often incurs substantial cross-device transmission overhead, particularly in handling intermediate feature maps. To address these limitations, we propose ReDream (REsidual feature-DRivEn mixed spArse coding for Model partitioning), a novel edge-centric video analytics framework that jointly optimizes transmission efficiency and inference accuracy. ReDream introduces two key innovations: 1) It enhances the sparsity of intermediate features by replacing activation functions with ReLU in selected CNN layers and retraining, thereby increasing the proportion of zero-valued elements. 2) It leverages the heterogeneous distribution of feature data across layers by applying a mixed sparse coding scheme, i.e., selecting different compression methods adaptively to optimize model partitioning. These optimizations enable ReDream to support more efficient cross-device inference while maintaining high model accuracy, making it well-suited for real-time deployment in collaborative edge environments.
+|confname = IDEA
+|link = https://mns.uestc.cn/wiki/Research:InProgress/MixedSparseCoding
+|title= ReDream: Residual Feature-Driven Mixed Sparse Coding for Model Partitioning
+|speaker=Xianyang
+|date=2025-05-23
+}}
+{{Hist_seminar
+|abstract = While existing strategies to execute deep learning-based classification on low-power platforms assume the models are trained on all classes of interest, this paper posits that adopting context-awareness i.e. narrowing down a classification task to the current deployment context consisting of only recent inference queries can substantially enhance performance in resource-constrained environments. We propose a new paradigm, CACTUS, for scalable and efficient context-aware classification where a micro-classifier recognizes a small set of classes relevant to the current context and, when context change happens (e.g., a new class comes into the scene), rapidly switches to another suitable micro-classifier. CACTUS features several innovations, including optimizing the training cost of context-aware classifiers, enabling on-the-fly context-aware switching between classifiers, and balancing context switching costs and performance gains via simple yet effective switching policies. We show that CACTUS achieves significant benefits in accuracy, latency, and compute budget across a range of datasets and IoT platforms.
+|confname = Mobisys'24
+|link = https://dl.acm.org/doi/abs/10.1145/3643832.3661888
+|title= CACTUS: Dynamically Switchable Context-aware micro-Classifiers for Efficient IoT Inference
+|speaker= Zhenhua
+|date=2025-04-18
+}}
+{{Hist_seminar
+|abstract = Nowadays, volumetric videos have emerged as an attractive multimedia application providing highly immersive watching experiences, since viewers can adjust their viewports with 6 degrees of freedom. However, the point cloud frames composing the video are prohibitively large, and effective compression techniques should be developed. There are two classes of compression methods. One suggests exploiting the conventional video codecs (2D-based methods) and the other proposes to compress the points in 3D space directly (3D-based methods). Though the 3D-based methods feature fast coding speeds, their compression ratios are low because they fail to leverage inter-frame redundancy. To resolve this problem, we design a patch-wise compression framework working in the 3D space. Specifically, we search rigid moves of patches via the iterative closest point algorithm and construct a common geometric structure, which is followed by color compensation. We implement our decoder on a GPU platform so that real-time decoding and rendering are realized. We compare our method with GROOT, the state-of-the-art 3D-based compression method, and it reduces the bitrate by up to 5.98×. Moreover, by trimming invisible content, our scheme achieves a bandwidth demand comparable to that of V-PCC, the representative 2D-based method, in FoV-adaptive streaming.
+|confname = TC'24
+|link = https://ieeexplore.ieee.org/document/10360355
+|title= A GPU-Enabled Real-Time Framework for Compressing and Rendering Volumetric Videos
+|speaker=Mengfan
+|date=2025-04-18
+}}
+{{Hist_seminar
+|abstract = Cross-silo federated learning (FL) enables multiple institutions (clients) to collaboratively build a global model without sharing their private data. To prevent privacy leakage during aggregation, homomorphic encryption (HE) is widely used to encrypt model updates, yet incurs high computation and communication overheads. To reduce these overheads, packed HE (PHE) has been proposed to encrypt multiple plaintexts into a single ciphertext. However, the original design of PHE does not consider the heterogeneity among different clients, an intrinsic problem in cross-silo FL, often resulting in undermined training efficiency with slow convergence and stragglers. In this work, we propose FedPHE, an efficiently packed homomorphically encrypted FL framework with secure weighted aggregation and client selection to tackle the heterogeneity problem. Specifically, using CKKS with sparsification, FedPHE can achieve efficient encrypted weighted aggregation by accounting for contributions of local updates to the global model. To mitigate the straggler effect, we devise a sketching-based client selection scheme to cherry-pick representative clients with heterogeneous models and computing capabilities. We show, through rigorous security analysis and extensive experiments, that FedPHE can efficiently safeguard clients’ privacy, achieve a training speedup of 1.85 − 4.44×, cut the communication overhead by 1.24 − 22.62× , and reduce the straggler effect by up to 1.71 − 2.39×.
+|confname =INFOCOM'24
+|link = https://ieeexplore.ieee.org/abstract/document/10621440
+|title= Efficient and Straggler-Resistant Homomorphic Encryption for Heterogeneous Federated Learning
+|speaker=Dongting
+|date=2025-03-28
+}}{{Hist_seminar
+|abstract = Entanglement routing (ER) in quantum networks must guarantee entanglement fidelity, a property that is crucial for applications such as quantum key distribution, quantum computation, and quantum sensing. Conventional ER approaches assume that network links can only generate entanglements with a fixed fidelity, and then they rely on purification to improve end-to-end fidelities. However, recent advances in entanglement generation technologies show that quantum links can be configured by choosing among different fidelity/entanglement-rate combinations (defined in this paper as link configurations), hence enabling a more flexible assignment of quantum-network resources for meeting specific application requirements. To exploit this opportunity, we introduce the problem of link configuration for fidelity-constrained routing and purification (LC-FCRP) in Quantum Networks. We first formulate a simplified FCRP version as a Mixed Integer Linear Programming (MILP) model, where the link fidelity can be adjusted within a finite set. Then, to explore the full space of possible link configurations, we propose a link configuration algorithm based on a novel shortest-path-based fidelity determination (SPFD) algorithm w/o Bayesian Optimization, which can be applied on top of any existing ER algorithm. Numerical results demonstrate that link configuration improves the acceptance ratio of existing ER algorithms by 87%.
+|confname =INFOCOM'25
+|link = https://re.public.polimi.it/bitstream/11311/1281986/1/final_infocom25_link_configuration_for_entanglement_routing.pdf
+|title= Link Configuration for Fidelity-Constrained Entanglement Routing in Quantum Networks
+|speaker=Yaliang
+|date=2025-03-27
+}}
+{{Hist_seminar
+|abstract = Large language models (LLMs) have demonstrated remarkable reasoning capabilities across diverse domains. Recent studies have shown that increasing test-time computation enhances LLMs' reasoning capabilities. This typically involves extensive sampling at inference time guided by an external LLM verifier, resulting in a two-player system. Despite external guidance, the effectiveness of this system demonstrates the potential of a single LLM to tackle complex tasks. Thus, we pose a new research problem: Can we internalize the searching capabilities to fundamentally enhance the reasoning abilities of a single LLM? This work explores an orthogonal direction focusing on post-training LLMs for autoregressive searching (i.e., an extended reasoning process with self-reflection and self-exploration of new strategies). To achieve this, we propose the Chain-of-Action-Thought (COAT) reasoning and a two-stage training paradigm: 1) a small-scale format tuning stage to internalize the COAT reasoning format and 2) a large-scale self-improvement stage leveraging reinforcement learning. Our approach results in Satori, a 7B LLM trained on open-source models and data. Extensive empirical evaluations demonstrate that Satori achieves state-of-the-art performance on mathematical reasoning benchmarks while exhibiting strong generalization to out-of-domain tasks. Code, data, and models will be fully open-sourced.
+|confname = Arxiv
+|link = https://arxiv.org/abs/2502.02508
+|title= Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
+|speaker=Qinyong
+|date=2025-03-14
+}}{{Hist_seminar
+|abstract = Light bulbs have been recently explored to design Light Fidelity (LiFi) communication to battery-free tags, thus complementing Radiofrequency (RF) backscatter in the uplink. In this paper, we show that LiFi and RF backscatter are complementary and have unexplored interactions. We introduce PassiveLiFi, a battery-free system that uses LiFi to transmit RF backscatter at a meagre power budget. We address several challenges on the system design in the LiFi transmitter, the tag and the RF receiver. We design the first LiFi transmitter that implements a chirp spread spectrum (CSS) using the visible light spectrum. We use a small bank of solar cells for both communication and harvesting, and reconfigure them based on the amount of harvested energy and desired data rate. We further alleviate the low responsiveness of solar cells with a new low-power receiver design in the tag. We design and implement a novel technique for embedding multiple symbols in the RF backscatter based on delayed chirps. Experimental results with an RF carrier of 17 dBm show that we can generate RF backscatter with a range of 92.1 meters/μW consumed in the tag, which is almost double with respect to prior work.
+|confname =ToN'23
+|link = https://ieeexplore.ieee.org/document/10371205/
+|title= LiFi for Low-Power and Long-Range RF Backscatter
+|speaker=Mengyu
+|date=2025-03-14
+}}
+{{Hist_seminar
+|abstract = Video analytics is widespread in various applications serving our society. Recent advances of content enhancement in video analytics offer significant benefits for bandwidth saving and accuracy improvement. However, existing content-enhanced video analytics systems are excessively computationally expensive and provide extremely low throughput. In this paper, we present region-based content enhancement, which enhances only the important regions in videos, to improve analytical accuracy. Our system, RegenHance, enables high-accuracy and high-throughput video analytics at the edge by 1) a macroblock-based region importance predictor that identifies the important regions fast and precisely, 2) a region-aware enhancer that stitches sparsely distributed regions into dense tensors and enhances them efficiently, and 3) a profile-based execution planner that allocates appropriate resources for enhancement and analytics components. We prototype RegenHance on five heterogeneous edge devices. Experiments on two analytical tasks reveal that region-based enhancement improves the overall accuracy by 10-19% and achieves 2-3x throughput compared to the state-of-the-art frame-based enhancement methods.
+|confname =NSDI'25
+|link = https://arxiv.org/pdf/2407.16990
+|title= Region-based Content Enhancement for Efficient Video Analytics at the Edge
+|speaker=Xinyan
+|date=2025-03-07
+}}{{Hist_seminar
+|abstract = Occluded person re-identification is a challenging task as human body parts could be occluded by some obstacles (e.g. trees, cars, and pedestrians) in certain scenes. Some existing pose-guided methods solve this problem by aligning body parts according to graph matching, but these graph-based methods are unintuitive and complicated. Therefore, we propose a transformer-based Pose-guided Feature Disentangling (PFD) method by utilizing pose information to clearly disentangle semantic components (e.g. human body or joint parts) and selectively match non-occluded parts correspondingly. First, Vision Transformer (ViT) is used to extract the patch features with its strong capability. Second, to preliminarily disentangle the pose information from patch information, the matching and distributing mechanism is leveraged in the Pose-guided Feature Aggregation (PFA) module. Third, a set of learnable semantic views are introduced in the transformer decoder to implicitly enhance the disentangled body part features. However, those semantic views are not guaranteed to be related to the body without additional supervision. Therefore, a Pose-View Matching (PVM) module is proposed to explicitly match visible body parts and automatically separate occlusion features. Fourth, to better prevent the interference of occlusions, we design a Pose-guided Push Loss to emphasize the features of visible body parts. Extensive experiments over five challenging datasets for two tasks (occluded and holistic Re-ID) demonstrate that our proposed PFD is highly promising and performs favorably against state-of-the-art methods.
+|confname =AAAI'22
+|link = https://arxiv.org/abs/2112.02466
+|title= Pose-guided Feature Disentangling for Occluded Person Re-identification Based on Transformer
+|speaker=Bairong
+|date=2025-03-07
+}}
+{{Hist_seminar
+|abstract = The emerging programmable networks sparked significant research on Intelligent Network Data Plane (INDP), which achieves learning-based traffic analysis at line-speed. Prior art in INDP focuses on deploying tree/forest models on the data plane. We observe a fundamental limitation in tree-based INDP approaches: although it is possible to represent even larger tree/forest tables on the data plane, the flow features that are computable on the data plane are fundamentally limited by hardware constraints. In this paper, we present BoS to push the boundaries of INDP by enabling Neural Network (NN) driven traffic analysis at line-speed. Many types of NNs (such as Recurrent Neural Network (RNN), and transformers) that are designed to work with sequential data have advantages over tree-based models, because they can take raw network data as input without complex feature computations on the fly. However, the challenge is significant: the recurrent computation scheme used in RNN inference is fundamentally different from the match-action paradigm used on the network data plane. BoS addresses this challenge by (i) designing a novel data plane friendly RNN architecture that can execute unlimited RNN time steps with limited data plane stages, effectively achieving line-speed RNN inference; and (ii) complementing the on-switch RNN model with an off-switch transformer-based traffic analysis module to further boost the overall performance. We implement a prototype of BoS using a P4 programmable switch as our data plane, and extensively evaluate it over multiple traffic analysis tasks. The results show that BoS outperforms the state of the art in both analysis accuracy and scalability.
+|confname =NSDI'24
+|link = https://www.usenix.org/conference/nsdi24/presentation/yan
+|title= Brain-on-Switch: Towards Advanced Intelligent Network Data Plane via NN-Driven Traffic Analysis at Line-Speed
+|speaker=Youwei
+|date=2025-02-28
+}}
+{{Hist_seminar
+|abstract = Recent advances in quantum information science enabled the development of quantum communication network prototypes and created an opportunity to study full-stack quantum network architectures. This work develops SeQUeNCe, a comprehensive, customizable quantum network simulator. Our simulator consists of five modules: hardware models, entanglement management protocols, resource management, network management, and application. This framework is suitable for simulation of quantum network prototypes that capture the breadth of current and future hardware technologies and protocols. We implement a comprehensive suite of network protocols and demonstrate the use of SeQUeNCe by simulating a photonic quantum network with nine routers equipped with quantum memories. The simulation capabilities are illustrated in three use cases. We show the dependence of quantum network throughput on several key hardware parameters and study the impact of classical control message latency. We also investigate quantum memory usage efficiency in routers and demonstrate that redistributing memory according to anticipated load increases network capacity by 69.1% and throughput by 6.8%. We design SeQUeNCe to enable comparisons of alternative quantum network technologies, experiment planning, and validation and to aid with new protocol design. We are releasing SeQUeNCe as an open source tool and aim to generate community interest in extending it.
+|confname =IOPSCIENCE'21
+|link = https://iopscience.iop.org/article/10.1088/2058-9565/ac22f6/meta
+|title= SeQUeNCe: a customizable discrete-event simulator of quantum networks
+|speaker=Junzhe
+|date=2025-02-21
+}}{{Hist_seminar
+|abstract = This article proposes a remote environmental monitoring system based on low-power Internet of Things, which is applied in smart agriculture to achieve remote and real-time measurement of temperature, humidity, and light intensity parameters in the crop growth environment within the coverage range of the device. The system adopts low-power Internet of Things technology, which has the characteristics of wide coverage, multiple connections, fast speed, low cost, low power consumption, and excellent architecture. The overall design of the system includes multiple environmental monitoring nodes, a LoRa gateway, and corresponding environmental monitoring upper computer software. In terms of system software, it involves programming of the node MCU and client upper computer software. The key technology implementation includes the hardware design and implementation of low-power sensor nodes and the development of the LoRa protocol. System testing and performance analysis show that the optimized LoRa protocol performs well in communication distance, power consumption, stability, and other aspects, laying the foundation for the efficient operation of the system. This study provides a powerful tool for sustainable resource management, which helps to promote agricultural modernization and rural revitalization.
+|confname =CISCE'24
+|link = https://ieeexplore.ieee.org/abstract/document/10653076
+|title= A Long Distance Environmental Monitoring System Based on Low Power IoT
+|speaker= Ayesha Rasool
+|date=2025-02-21
+}}
+{{Hist_seminar
+|abstract = Recently, smart roadside infrastructure (SRI) has demonstrated the potential of achieving fully autonomous driving systems. To explore the potential of infrastructure-assisted autonomous driving, this paper presents the design and deployment of Soar, the first end-to-end SRI system specifically designed to support autonomous driving systems. Soar consists of both software and hardware components carefully designed to overcome various system and physical challenges. Soar can leverage the existing operational infrastructure like street lampposts for a lower barrier of adoption. Soar adopts a new communication architecture that comprises a bi-directional multi-hop I2I network and a downlink I2V broadcast service, which are designed based on off-the-shelf 802.11ac interfaces in an integrated manner. Soar also features a hierarchical DL task management framework to achieve desirable load balancing among nodes and enable them to collaborate efficiently to run multiple data-intensive autonomous driving applications. We deployed a total of 18 Soar nodes on existing lampposts on campus, which have been operational for over two years. Our real-world evaluation shows that Soar can support a diverse set of autonomous driving applications and achieve desirable real-time performance and high communication reliability. Our findings and experiences in this work offer key insights into the development and deployment of next-generation smart roadside infrastructure and autonomous driving systems.
+|confname =MobiCom'24
+|link = https://dl.acm.org/doi/abs/10.1145/3636534.3649352
+|title= Soar: Design and Deployment of A Smart Roadside Infrastructure System for Autonomous Driving
+|speaker=Jiahao
+|date=2025-01-10
+}}{{Hist_seminar
+|abstract = GPUs are increasingly utilized for running DNN tasks on emerging mobile edge devices. Beyond accelerating single task inference, their value is also particularly apparent in efficiently executing multiple DNN tasks, which often have strict latency requirements in applications. Preemption is the main technology to ensure multitasking timeliness, but mobile edges primarily offer two priorities for task queues, and existing methods thus achieve only coarse-grained preemption by categorizing DNNs into real-time and best-effort, permitting a real-time task to preempt best-effort ones. However, the efficacy diminishes significantly when other real-time tasks run concurrently, but this is already common in mobile edge applications. Due to different hardware characteristics, solutions from other platforms are unsuitable. For instance, GPUs on traditional mobile devices primarily assist CPU processing and lack special preemption support, mainly following FIFO in GPU scheduling. Clouds handle concurrent task execution, but focus on allocating one or more GPUs per complex model, whereas on mobile edges, DNNs mainly vie for one GPU. This paper introduces Pantheon, designed to offer fine-grained preemption, enabling real-time tasks to preempt each other and best-effort tasks. Our key observation is that the two-tier GPU stream priorities, while underexplored, are sufficient. Efficient preemption can be realized through software design by innovative scheduling and novel exploitation of the nested redundancy principle for DNN models. Evaluation on a diverse set of DNNs shows substantial improvements in deadline miss rate and accuracy of Pantheon over state-of-the-art methods.
+|confname =MobiSys'24
+|link = https://dl.acm.org/doi/abs/10.1145/3643832.3661878
+|title= Pantheon: Preemptible Multi-DNN Inference on Mobile Edge GPUs
+|speaker=Jiale
+|date=2025-01-10
+}}
+{{Hist_seminar
+|abstract = Volumetric videos offer a unique interactive experience and have the potential to enhance social virtual reality and telepresence. Streaming volumetric videos to multiple users remains a challenge due to its tremendous requirements of network and computation resources. In this paper, we develop MuV2, an edge-assisted multi-user mobile volumetric video streaming system to support important use cases such as tens of students simultaneously consuming volumetric content in a classroom. MuV2 achieves high scalability and good streaming quality through three orthogonal designs: hybridizing direct streaming of 3D volumetric content with remote rendering, dynamically sharing edge-transcoded views across users, and multiplexing encoding tasks of multiple transcoding sessions into a limited number of hardware encoders on the edge. MuV2 then integrates the three designs into a holistic optimization framework. We fully implement MuV2 and experimentally demonstrate that MuV2 can deliver high-quality volumetric videos to over 30 concurrent untethered mobile devices with a single WiFi access point and a commodity edge server.
+|confname =MobiCom'24
+|link = https://dl.acm.org/doi/abs/10.1145/3636534.3649364
+|title= MuV2: Scaling up Multi-user Mobile Volumetric Video Streaming via Content Hybridization and Sharing
+|speaker=Jiyi
+|date=2025-01-03
+}}{{Hist_seminar
+|abstract = The advent of 5G promises high bandwidth with the recent introduction of mmWave technology, paving the way for throughput-sensitive applications. However, our measurements in commercial 5G networks show that frequent handovers in 5G, due to physical limitations of mmWave cells, introduce significant under-utilization of the available bandwidth. By analyzing 5G link-layer and TCP traces, we uncover that improper interactions between these two layers cause multiple inefficiencies during handovers. To mitigate these, we propose M2HO, a novel device-centric solution that can predict and recognize different stages of a handover and perform state-dependent mitigation to markedly improve throughput. M2HO is transparent to the firmware, base stations, servers, and applications. We implement M2HO and our extensive evaluations validate that it yields significant improvements in TCP throughput with frequent handovers.
+|confname =MobiCom'24
+|link = https://dl.acm.org/doi/abs/10.1145/3636534.3690680
+|title= M2HO: Mitigating the Adverse Effects of 5G Handovers on TCP
+|speaker=Jiacheng
+|date=2025-01-03
+}}
 ====2024====
+{{Hist_seminar
+|abstract = Packet routing in virtual networks requires virtual-to-physical address translation. The address mappings are updated by a single party, i.e., the network administrator, but they are read by multiple devices across the network when routing tenant packets. Existing approaches face an inherent read-write performance tradeoff: they either store these mappings in dedicated gateways for fast updates at the cost of slower forwarding or replicate them at end-hosts and suffer from slow updates. SwitchV2P aims to escape this tradeoff by leveraging the network switches to transparently cache the address mappings while learning them from the traffic. SwitchV2P brings the mappings closer to the sender, thus reducing the first packet latency and translation overheads, while simultaneously enabling fast mapping updates, all without changing existing routing policies and deployed gateways. The topology-aware data-plane caching protocol allows the switches to transparently adapt to changing network conditions and varying in-switch memory capacity. Our evaluation shows the benefits of in-network address mapping, including an up to 7.8× and 4.3× reduction in FCT and first packet latency respectively, and a substantial reduction in translation gateway load. Additionally, SwitchV2P achieves up to a 1.9× reduction in bandwidth overheads and requires an order of magnitude fewer gateways for equivalent performance.
+|confname =SIGCOMM'24
+|link = https://dl.acm.org/doi/abs/10.1145/3651890.3672213
+|title= In-Network Address Caching for Virtual Networks
+|speaker=Dongting
+|date=2024-12-06
+}}{{Hist_seminar
+|abstract = Visible light communication (VLC) has become an important complementary means to electromagnetic communications due to its freedom from interference. However, existing Internet-of-Things (IoT) VLC links can reach less than 10 meters, which has significantly limited the application of VLC to vast and diverse scenarios. In this paper, we propose ChirpVLC, a novel modulation method to prolong VLC distance from ≤10 meters to over 100 meters. The basic idea of ChirpVLC is to trade throughput for prolonged distance by exploiting Chirp Spread Spectrum (CSS) modulation. Specifically, 1) we modulate the luminous intensity as a sinusoidal waveform with a linearly varying frequency and design different spreading factors (SF) for different environmental conditions. 2) We design a range adaptation scheme for the luminance sensing range to help receivers achieve a better signal-to-noise ratio (SNR). 3) ChirpVLC supports many-to-one and non-line-of-sight communications, breaking through the limitations of visible light communication. We implement ChirpVLC and conduct extensive real-world experiments. The results show that ChirpVLC can extend the transmission distance of 5W COTS LEDs to over 100 meters, and the distance/energy utility is increased by 532% compared to the existing work.
+|confname = IDEA
+|link = https://uestc.feishu.cn/file/Pbq3bWgKJoTQObx79f3cf6gungb
+|title= ChirpVLC: Extending The Distance of Low-cost Visible Light Communication with CSS Modulation
+|speaker=Mengyu
+|date=2024-12-06
+}}
+{{Hist_seminar
+|abstract = On-device Deep Neural Network (DNN) training has been recognized as crucial for privacy-preserving machine learning at the edge. However, the intensive training workload and limited onboard computing resources pose significant challenges to the availability and efficiency of model training. While existing works address these challenges through native resource management optimization, we instead leverage our observation that edge environments usually comprise a rich set of accompanying trusted edge devices with idle resources beyond a single terminal. We propose Asteroid, a distributed edge training system that breaks the resource walls across heterogeneous edge devices for efficient model training acceleration. Asteroid adopts a hybrid pipeline parallelism to orchestrate distributed training, along with a judicious parallelism planning for maximizing throughput under certain resource constraints. Furthermore, a fault-tolerant yet lightweight pipeline replay mechanism is developed to tame the device-level dynamics for training robustness and performance stability. We implement Asteroid on heterogeneous edge devices with both vision and language models, demonstrating up to 12.2× faster training than conventional parallelism methods and 2.1× faster than state-of-the-art hybrid parallelism methods through evaluations. Furthermore, Asteroid can recover training pipeline 14× faster than baseline methods while preserving comparable throughput despite unexpected device exiting and failure.
+|confname = MobiCom'24
+|link = https://dl.acm.org/doi/abs/10.1145/3636534.3649363
+|title= Asteroid: Resource-Efficient Hybrid Pipeline Parallelism for Collaborative DNN Training on Heterogeneous Edge Devices
+|speaker=Congrong
+|date=2024-11-29
+}}
+{{Hist_seminar
+|abstract = The need for cooperation among intelligent edge devices has popularized cooperative multi-agent reinforcement learning (MARL) in multi-target coverage. However, many research efforts rely heavily on parameter sharing among homogeneous agents, which hampers coverage performance. The heterogeneity of computing and sensing capabilities, along with the time-varying dynamics of computing resources, pose significant challenges. To address these challenges, we propose a resource-sensitive multi-agent reinforcement learning framework based on heterogeneous edge devices (SmartHE). SmartHE decomposes the target coverage task into two hierarchical levels: 1) Executor-level task: A central coordinator assigns a subset of executors (i.e., cameras or agents) to execute action policies, aiming to minimize overall policy inference time and energy consumption by leveraging resource heterogeneity. 2) Target-level task: Each executor ignores irrelevant targets that fall outside the coverage radius of the executor based on the estimated target states and ignores redundant targets that could be more effectively covered by other executors based on the utility estimation. This enables each executor to focus on extracting features that optimize coverage. Through this dual-task framework, SmartHE efficiently improves the system performance.
+|confname = IDEA
+|link = https://mobinets.cn/site/Resource:Seminar
+|title= SmartHE: Resource-sensitive MARL framework based on heterogeneous edge devices
+|speaker=Xianyang
+|date=2024-11-29
+}}
+{{Hist_seminar
+|abstract = Collaborative inference is the current state-of-the-art solution for mobile-server neural network inference offloading. However, we find that existing collaborative inference solutions only focus on partitioning the DNN computation, which is only a small part of achieving an efficient DNN offloading system. What ultimately determines the performance of DNN offloading is how the execution system utilizes the characteristics of the given DNN offloading task on the mobile, network, and server resources of the offloading environment. To this end, we design CoActo, a DNN execution system built from the ground up for mobile-server inference offloading. Our key design philosophy is Coactive Inference Offloading, which is a new, improved concept of DNN offloading that adds two properties, 1) fine-grained expression of DNNs and 2) concurrency of runtime resources, to existing collaborative inference. In CoActo, system components go beyond simple model splitting of existing approaches and operate more proactively to achieve the coactive execution of inference workloads. CoActo dynamically schedules concurrent interleaving of the mobile, server, and network operations to actively increase resource utilization, enabling lower end-to-end latency. We implement CoActo for various mobile devices and server environments and evaluate our system with distinct environment settings and DNN models. The experimental results show that our system achieves up to 2.1 times speed-up compared to the state-of-the-art collaborative inference solutions.
+|confname = Mobisys'24
+|link = https://dl.acm.org/doi/10.1145/3643832.3661885
+|title= CoActo: CoActive Neural Network Inference Offloading with Fine-grained and Concurrent Execution
+|speaker=Zhenhua
+|date=2024-11-22
+}}
+{{Hist_seminar
+|abstract = Caching is an indispensable technique for low-cost and fast data serving. The eviction algorithm, at the heart of a cache, has been primarily designed to maximize efficiency (reducing the cache miss ratio). Many eviction algorithms have been designed in the past decades. However, they all trade off throughput, simplicity, or both for higher efficiency. Such a compromise often hinders adoption in production systems. This work presents SIEVE, an algorithm that is simpler than LRU and provides better than state-of-the-art efficiency and scalability for web cache workloads. We implemented SIEVE in five production cache libraries, requiring fewer than 20 lines of code changes on average. Our evaluation on 1559 cache traces from 7 sources shows that SIEVE achieves up to 63.2% lower miss ratio than ARC. Moreover, SIEVE has a lower miss ratio than 9 state-of-the-art algorithms on more than 45% of the 1559 traces, while the next best algorithm only has a lower miss ratio on 15%. SIEVE's simplicity comes with superior scalability as cache hits require no locking. Our prototype achieves twice the throughput of an optimized 16-thread LRU implementation. SIEVE is more than an eviction algorithm; it can be used as a cache primitive to build advanced eviction algorithms just like FIFO and LRU.
+|confname =NSDI'24
+|link = https://www.usenix.org/conference/nsdi24/presentation/zhang-yazhuo
+|title= SIEVE is Simpler than LRU: an Efficient Turn-Key Eviction Algorithm for Web Caches
+|speaker=Haotian
+|date=2024-11-22
+}}
+{{Hist_seminar
+|abstract = In this paper, we revisit the problem of the current routing system in terms of prediction scalability and routing result optimality. Specifically, the current traffic prediction models are not suitable for large urban networks due to the incomplete information of traffic conditions. Besides, existing routing systems can only plan the routes based on the past traffic conditions and struggle to update the optimal route for vehicles in real-time. As a result, the actual route taken by vehicles is different from the ground-truth optimal path. Therefore, we propose a Just-In-Time Predictive Route Planning framework to tackle these two problems. Firstly, we propose a Travel Time Constrained Top-kn Shortest Path algorithm which pre-computes a set of candidate paths with several switch points. This empowers vehicles to continuously have the opportunity to switch to better paths taking into account real-time traffic condition changes. Moreover, we present a query-driven prediction paradigm with ellipse-based searching space estimation, along with an efficient multi-queries handling mechanism. This not only allows for targeted traffic prediction by prioritizing regions with valuable yet outdated traffic information, but also provides optimal results for multiple queries based on real-time traffic evolution. Evaluations on two real-life road networks demonstrate the effectiveness and efficiency of our framework and methods.
+|confname =ICDE'24
+|link = https://ieeexplore.ieee.org/document/10598147/authors#authors
+|title= A Just-In-Time Framework for Continuous Routing
+|speaker=Zhenguo
+|date=2024-11-8
+}}
+{{Hist_seminar
+|abstract = Many networking tasks now employ deep learning (DL) to solve complex prediction and optimization problems. However, current design philosophy of DL-based algorithms entails intensive engineering overhead due to the manual design of deep neural networks (DNNs) for different networking tasks. Besides, DNNs tend to achieve poor generalization performance on unseen data distributions/environments. Motivated by the recent success of large language models (LLMs), this work studies the LLM adaptation for networking to explore a more sustainable design philosophy. With the powerful pre-trained knowledge, the LLM is promising to serve as the foundation model to achieve "one model for all tasks" with even better performance and stronger generalization. In pursuit of this vision, we present NetLLM, the first framework that provides a coherent design to harness the powerful capabilities of LLMs with low efforts to solve networking problems. Specifically, NetLLM empowers the LLM to effectively process multimodal data in networking and efficiently generate task-specific answers. Besides, NetLLM drastically reduces the costs of fine-tuning the LLM to acquire domain knowledge for networking. Across three networking-related use cases - viewport prediction, adaptive bitrate streaming and cluster job scheduling, we showcase that the NetLLM-adapted LLM significantly outperforms state-of-the-art algorithms.
+|confname =SIGCOMM'24
+|link = https://dl.acm.org/doi/abs/10.1145/3651890.3672268
+|title= NetLLM: Adapting Large Language Models for Networking
+|speaker=Yinghao
+|date=2024-11-8
+}}
+{{Hist_seminar
+|abstract = Sparsely-activated Mixture-of-Expert (MoE) layers have found practical applications in enlarging the model size of large-scale foundation models, with only a sub-linear increase in computation demands. Despite the wide adoption of hybrid parallel paradigms like model parallelism, expert parallelism, and expert-sharding parallelism (i.e., MP+EP+ESP) to support MoE model training on GPU clusters, the training efficiency is hindered by communication costs introduced by these parallel paradigms. To address this limitation, we propose Parm, a system that accelerates MP+EP+ESP training by designing two dedicated schedules for placing communication tasks. The proposed schedules eliminate redundant computations and communications and enable overlaps between intra-node and inter-node communications, ultimately reducing the overall training time. As the two schedules are not mutually exclusive, we provide comprehensive theoretical analyses and derive an automatic and accurate solution to determine which schedule should be applied in different scenarios. Experimental results on an 8-GPU server and a 32-GPU cluster demonstrate that Parm outperforms the state-of-the-art MoE training system, DeepSpeed-MoE, achieving 1.13× to 5.77× speedup on 1296 manually configured MoE layers and approximately 3× improvement on two real-world MoE models based on BERT and GPT-2.
+|confname =INFOCOM'24
+|link = https://ieeexplore.ieee.org/abstract/document/10621327
+|title= Parm: Efficient Training of Large Sparsely-Activated Models with Dedicated Schedules
+|speaker=Mengqi
+|date=2024-11-1
+}}
+{{Hist_seminar
+|abstract = HD map is a key enabling technology towards fully autonomous driving. We propose VI-Map, the first system that leverages roadside infrastructure to enhance real-time HD mapping for autonomous driving. The core concept of VI-Map is to exploit the unique cumulative observations made by roadside infrastructure to build and maintain an accurate and current HD map. This HD map is then fused with on-vehicle HD maps in real time, resulting in a more comprehensive and up-to-date HD map. By extracting concise bird-eye-view features from infrastructure observations and utilizing vectorized map representations, VI-Map incurs low compute and communication overhead. We conducted end-to-end evaluations of VI-Map on a real-world testbed and a simulator. Experiment results show that VI-Map can construct decimeter-level (up to 0.3 m) HD maps and achieve real-time (up to a delay of 42 ms) map fusion between driving vehicles and roadside infrastructure. This represents a significant improvement of 2.8× and 3× in map accuracy and coverage compared to the state-of-the-art online HD mapping approaches. A video demo of VI-Map on our real-world testbed is available at https://youtu.be/p2RO65R5Ezg.
+|confname=Mobicom'23
+|link = https://dl.acm.org/doi/abs/10.1145/3570361.3613280
+|title= VI-Map: Infrastructure-Assisted Real-Time HD Mapping for Autonomous Driving
+|speaker=Wangyang
+|date=2024-11-1
+}}
+{{Hist_seminar
+|abstract = Video super-resolution (VSR) on mobile devices aims to restore high-resolution frames from their low-resolution counterparts, satisfying the requirements of performance, FLOPs and latency. On one hand, partial feature processing, as a classic and acknowledged strategy, is developed in current studies to reach an appropriate trade-off between FLOPs and accuracy. However, the splitting in partial feature processing strategies is usually performed in a blind manner, thereby reducing the computational efficiency and performance gains. On the other hand, current methods for mobile platforms primarily treat VSR as an extension of single-image super-resolution to reduce model calculation and inference latency. However, the lack of inter-frame information interaction in current methods results in a suboptimal latency and accuracy trade-off. To this end, we propose a novel architecture, termed Feature Aggregating Network with Inter-frame Interaction (FANI), a lightweight VSR network that nonetheless accounts for frame-wise correlation, which could achieve real-time inference while maintaining superior performance. Our FANI accepts adjacent multi-frame low-resolution images as input and generally consists of several fully-connection-embedded modules, i.e., Multi-stage Partial Feature Distillation (MPFD) for capturing multi-level feature representations. Moreover, considering the importance of inter-frame alignment, we further employ a tiny Attention-based Frame Alignment (AFA) module to promote inter-frame information flow and aggregation efficiently. Extensive experiments on the well-known dataset and real-world mobile device demonstrate the superiority of our proposed FANI, which means that our FANI could be well adapted to mobile devices and produce visually pleasing results.
+|confname = ICDM'23
+|link = https://ieeexplore.ieee.org/abstract/document/10415812
+|title= Feature Aggregating Network with Inter-Frame Interaction for Efficient Video Super-Resolution
+|speaker=Shuhong
+|date=2024-10-25
+}}
+{{Hist_seminar
+|abstract = The proliferation of edge devices has pushed computing from the cloud to the data sources, and video analytics is among the most promising applications of edge computing. Running video analytics is compute- and latency-sensitive, as video frames are analyzed by complex deep neural networks (DNNs) which put severe pressure on resource-constrained edge devices. To resolve the tension between inference latency and resource cost, we present Polly, a cross-camera inference system that enables co-located cameras with different but overlapping fields of views (FoVs) to share inference results between one another, thus eliminating the redundant inference work for objects in the same physical area. Polly’s design solves two basic challenges of cross-camera inference: how to identify overlapping FoVs automatically, and how to share inference results accurately across cameras. Evaluation on NVIDIA Jetson Nano with a real-world traffic surveillance dataset shows that Polly reduces the inference latency by up to 71.4% while achieving almost the same detection accuracy as state-of-the-art systems.
+|confname= INFOCOM'23
+|link = https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10229045
+|title= Cross-Camera Inference on the Constrained Edge
+|speaker=Xinyan
+|date=2024-10-25
+}}
+{{Hist_seminar
+|abstract = Smart cameras with on-device deep learning inference capabilities are enabling distributed video analytics at the data source without sending raw video data over the often unreliable and congested wireless network. However, how to unleash the full potential of the computing power of the camera network requires careful coordination among the distributed cameras, catering to the uneven workload distribution and the heterogeneous computing capabilities. This paper presents CrossVision, a distributed framework for real-time video analytics, that retains all video data on cameras while achieving low inference delay and high inference accuracy. The key idea behind CrossVision is that there is a significant information redundancy in the video content captured by cameras with overlapped Field-of-Views (FoVs), which can be exploited to reduce inference workload as well as improve inference accuracy between correlated cameras. CrossVision consists of three main components to realize its function: a Region-of-Interest (RoI) Matcher that discovers video content correlation based on a segmented FoV transformation scheme; a Workload Balancer that implements a randomized workload balancing strategy based on a bulk-queuing analysis, taking into account the cameras’ predicted future workload arrivals; an Accuracy Guard that ensures that the inference accuracy is not sacrificed as redundant information is discarded. We evaluate CrossVision in a hardware-augmented simulator and on real-world cross-camera datasets, and the results show that CrossVision is able to significantly reduce inference delay while improving the inference accuracy compared to a variety of baseline approaches.
+|confname= TMC'24
+|link = https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10202594
+|title= CrossVision: Real-Time On-Camera Video Analysis via Common RoI Load Balancing
+|speaker=Xinyan
+|date=2024-10-25
+}}
+{{Hist_seminar
+|abstract = LoRa is a promising technology that offers ubiquitous low-power IoT connectivity. With the features of multi-channel communication, orthogonal transmission, and spectrum sharing, LoRaWAN is poised to connect millions of IoT devices across thousands of logical channels. However, current LoRa gateways utilize hardwired Rx chains that cover only a small fraction (<1%) of the logical channels, limiting the potential for massive LoRa communications. This paper presents XGate, a novel gateway design that uses a single Rx chain to concurrently receive packets from all logical channels, fundamentally enabling scalable LoRa transmission and flexible network access. Unlike hardwired Rx chains in the current gateway design, XGate allocates resources including software-controlled Rx chains and demodulators based on the extracted meta information of incoming packets. XGate addresses a series of challenges to efficiently detect incoming packets without prior knowledge of their parameter configurations. Evaluations show that XGate boosts LoRa concurrent transmissions by 8.4× over the state of the art.
+|confname=Mobicom'24
+|link = https://dl.acm.org/doi/pdf/10.1145/3636534.3649375
+|title= Revolutionizing LoRa Gateway with XGate: Scalable Concurrent Transmission across Massive Logical Channels
+|speaker=Chenkai
+|date=2024-10-18
+}}
+{{Hist_seminar
+|abstract = Deep learning training (DLT), e.g., large language model (LLM) training, has become one of the most important services in multitenant cloud computing. By deeply studying in-production DLT jobs, we observed that communication contention among different DLT jobs seriously influences the overall GPU computation utilization, resulting in the low efficiency of the training cluster. In this paper, we present Crux, a communication scheduler that aims to maximize GPU computation utilization by mitigating the communication contention among DLT jobs. Maximizing GPU computation utilization for DLT, nevertheless, is NP-Complete; thus, we formulate and prove a novel theorem to approach this goal by GPU intensity-aware communication scheduling. Then, we propose an approach that prioritizes the DLT flows with high GPU computation intensity, reducing potential communication contention. Our 96-GPU testbed experiments show that Crux improves GPU computation utilization by 8.3% to 14.8%. The large-scale production trace-based simulation further shows that Crux increases GPU computation utilization by up to 23% compared with alternatives including Sincronia, TACCL, and CASSINI.
+|confname=SIGCOMM'24
+|link = https://dl.acm.org/doi/pdf/10.1145/3651890.3672239
+|title= Crux: GPU-Efficient Communication Scheduling for Deep Learning Training
+|speaker=Youwei
+|date=2024-10-18
+}}
+{{Hist_seminar
+|abstract = Zero-shot object navigation is a challenging task for home-assistance robots. This task emphasizes visual grounding, commonsense inference and locomotion abilities, where the first two are inherent in foundation models. But for the locomotion part, most works still depend on map-based planning approaches. The gap between RGB space and map space makes it difficult to directly transfer the knowledge from foundation models to navigation tasks. In this work, we propose a Pixel-guided Navigation skill (PixNav), which bridges the gap between the foundation models and the embodied navigation task. It is straightforward for recent foundation models to indicate an object by pixels, and with pixels as the goal specification, our method becomes a versatile navigation policy towards all different kinds of objects. Besides, our PixNav is a pure RGB-based policy that can reduce the cost of home-assistance robots. Experiments demonstrate the robustness of PixNav, which achieves an 80+% success rate in the local path-planning task. To perform long-horizon object navigation, we design an LLM-based planner to utilize the commonsense knowledge between objects and rooms to select the best waypoint. Evaluations across both photorealistic indoor simulators and real-world environments validate the effectiveness of our proposed navigation strategy.
+|confname=ICRA'24
+|link = https://ieeexplore.ieee.org/document/10610499
+|title= Bridging Zero-shot Object Navigation and Foundation Models through Pixel-Guided Navigation Skill
+|speaker=Qinyong
+|date=2024-10-11
+}}
+{{Hist_seminar
+|abstract = Datacenter networks today provide best-effort delivery—messages may observe unpredictable queueing, delays, and drops due to switch buffer overflows within the network. Such weak guarantees reduce the set of assumptions that system designers can rely upon from the network, thus introducing inefficiency and complexity in host hardware and software. We present Harmony, a datacenter network architecture that provides powerful "congestion-free" message delivery guarantees—each message, once transmitted by the sender, observes bounded queueing at each switch in the network. Thus, network delays are bounded in failure-free scenarios, and congestion-related drops are completely eliminated. We establish, both theoretically and empirically, that Harmony provides such powerful guarantees with near-zero overheads compared to best-effort delivery networks: it incurs a tiny additive latency overhead that diminishes with message sizes, while achieving near-optimal network utilization.
+|confname=NSDI'24
+|link = https://www.usenix.org/conference/nsdi24/presentation/agarwal-saksham
+|title= Harmony: A Congestion-free Datacenter Architecture
+|speaker=Junzhe
+|date=2024-10-11
+}}
 {{Hist_seminar
 |abstract = Overlapping cameras offer exciting opportunities to view a scene from different angles, allowing for more advanced, comprehensive and robust analysis. However, existing video analytics systems for multi-camera streams are mostly limited to (i) per-camera processing and aggregation and (ii) workload-agnostic centralized processing architectures. In this paper, we present Argus, a distributed video analytics system with cross-camera collaboration on smart cameras. We identify multi-camera, multi-target tracking as the primary task of multi-camera video analytics and develop a novel technique that avoids redundant, processing-heavy identification tasks by leveraging object-wise spatio-temporal association in the overlapping fields of view across multiple cameras. We further develop a set of techniques to perform these operations across distributed cameras without cloud support at low latency by (i) dynamically ordering the camera and object inspection sequence and (ii) flexibly distributing the workload across smart cameras, taking into account network transmission and heterogeneous computational capacities. Evaluation of three real-world overlapping camera datasets with two Nvidia Jetson devices shows that Argus reduces the number of object identifications and end-to-end latency by up to 7.13× and 2.19× (4.86× and 1.60× compared to the state-of-the-art), while achieving comparable tracking quality.
@@ Line 10: / Line 543: @@
 |date=2024-9-29
 }}
 {{Hist_seminar
 |abstract = We present FarfetchFusion, a fully mobile live 3D telepresence system. Enabling mobile live telepresence is a challenging problem as it requires i) realistic reconstruction of the user and ii) high responsiveness for immersive experience. We first thoroughly analyze the live 3D telepresence pipeline and identify three critical challenges: i) 3D data streaming latency and compression complexity, ii) computational complexity of volumetric fusion-based 3D reconstruction, and iii) inconsistent reconstruction quality due to sparsity of mobile 3D sensors. To tackle the challenges, we propose a disentangled fusion approach, which separates invariant regions and dynamically changing regions with our low-complexity spatio-temporal alignment technique, topology anchoring. We then design and implement an end-to-end system, which achieves realistic reconstruction quality comparable to existing server-based solutions while meeting the real-time performance requirements (<100 ms end-to-end latency, 30 fps throughput, <16 ms motion-to-photon latency) solely relying on mobile computation capability.
@@ Line 19: / Line 551: @@
 |date=2024-9-29
 }}
 {{Hist_seminar
 |abstract = Increasing bandwidth demands of mobile video streaming pose a challenge in optimizing the Quality of Experience (QoE) for better user engagement. Multipath transmission promises to extend network capacity by utilizing multiple wireless links simultaneously. Previous studies mainly tune the packet scheduler in multipath transmission, expecting higher QoE by accelerating transmission. However, since Adaptive BitRate (ABR) algorithms overlook the impact of multipath scheduling on throughput prediction, multipath adaptive streaming can even experience lower QoE than single-path. This paper proposes Chorus, a cross-layer framework that coordinates multipath scheduling with adaptive streaming to optimize QoE jointly. Chorus establishes two-way feedback control loops between the server and the client. Furthermore, Chorus introduces Coarse-grained Decisions, which assist appropriate bitrate selection by considering the scheduling decision in throughput prediction, and Fine-grained Corrections, which meet the predicted throughput by QoE-oriented multipath scheduling. Extensive emulation and real-world mobile Internet evaluations show that Chorus outperforms the state-of-the-art MPQUIC scheduler, improving average QoE by 23.5% and 65.7%, respectively.
@@ Line 238: / Line 767: @@
 |speaker=Zhenghua
 |date=2024-01-04}}
 ====2023====
 {{Hist_seminar
