Difference between revisions of "Resource:Seminar"

From MobiNetS
{{SemNote
|time='''2024-11-29 10:30-12:00'''
|addr=4th Research Building A518
|note=Useful links: [[Resource:Reading_List|📚 Reading list]]; [[Resource:Seminar_schedules|📆 Schedules]]; [[Resource:Previous_Seminars|🧐 Previous seminars]].
}}


{{Latest_seminar
|abstract = On-device Deep Neural Network (DNN) training has been recognized as crucial for privacy-preserving machine learning at the edge. However, the intensive training workload and limited onboard computing resources pose significant challenges to the availability and efficiency of model training. While existing works address these challenges through native resource management optimization, we instead leverage our observation that edge environments usually comprise a rich set of accompanying trusted edge devices with idle resources beyond a single terminal. We propose Asteroid, a distributed edge training system that breaks the resource walls across heterogeneous edge devices for efficient model training acceleration. Asteroid adopts a hybrid pipeline parallelism to orchestrate distributed training, along with a judicious parallelism planning for maximizing throughput under certain resource constraints. Furthermore, a fault-tolerant yet lightweight pipeline replay mechanism is developed to tame the device-level dynamics for training robustness and performance stability. We implement Asteroid on heterogeneous edge devices with both vision and language models, demonstrating up to 12.2× faster training than conventional parallelism methods and 2.1× faster than state-of-the-art hybrid parallelism methods through evaluations. Furthermore, Asteroid can recover training pipeline 14× faster than baseline methods while preserving comparable throughput despite unexpected device exiting and failure.
|confname = MobiCom'24
|link = https://dl.acm.org/doi/abs/10.1145/3636534.3649363
|title= Asteroid: Resource-Efficient Hybrid Pipeline Parallelism for Collaborative DNN Training on Heterogeneous Edge Devices
|speaker=Congrong
|date=2024-11-29
}}
{{Latest_seminar
|abstract = The need for cooperation among intelligent edge devices has popularized cooperative multi-agent reinforcement learning (MARL) in multi-target coverage. However, many research efforts rely heavily on parameter sharing among homogeneous agents, which hampers coverage performance. The heterogeneity of computing and sensing capabilities, along with the time-varying dynamics of computing resources, pose significant challenges. To address these challenges, we propose a resource-sensitive multi-agent reinforcement learning framework based on heterogeneous edge devices (SmartHE). SmartHE decomposes the target coverage task into two hierarchical levels: 1) Executor-level task: A central coordinator assigns a subset of executors (i.e., cameras or agents) to execute action policies, aiming to minimize overall policy inference time and energy consumption by leveraging resource heterogeneity. 2) Target-level task: Each executor ignores irrelevant targets that fall outside the coverage radius of the executor based on the estimated target states and ignores redundant targets that could be more effectively covered by other executors based on the utility estimation. This enables each executor to focus on extracting features that optimize coverage. Through this dual-task framework, SmartHE efficiently improves the system performance.
|confname = IDEA
|link = https://mobinets.org/index.php?title=Resource:Seminar
|title= SmartHE: Resource-sensitive MARL framework based on heterogeneous edge devices
|speaker=Xianyang
|date=2024-11-29
}}


{{Resource:Previous_Seminars}}

Revision as of 21:50, 28 November 2024


Instructions

Please use the Latest_seminar and Hist_seminar templates to update the information on this page.

    • Update the time and address information.
    • Copy the code of the current latest seminar section to this page.
    • Change {{Latest_seminar... to {{Hist_seminar..., and add the corresponding date field |date=.
    • Fill in each field of the latest seminar.
    • Do not leave the link field empty; if there is no link, use this page's address: https://mobinets.org/index.php?title=Resource:Seminar
  • Format reference
    • Latest_seminar:

{{Latest_seminar
|confname=
|link=
|title=
|speaker=
}}

    • Hist_seminar

{{Hist_seminar
|confname=
|link=
|title=
|speaker=
|date=
}}
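
For instance, when this week's session is archived, the Asteroid entry above would move to the Previous_Seminars page as a Hist_seminar call, with all fields copied over and the seminar date added:

{{Hist_seminar
|confname = MobiCom'24
|link = https://dl.acm.org/doi/abs/10.1145/3636534.3649363
|title= Asteroid: Resource-Efficient Hybrid Pipeline Parallelism for Collaborative DNN Training on Heterogeneous Edge Devices
|speaker=Congrong
|date=2024-11-29
}}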