Difference between revisions of "Resource:Seminar"

Latest revision as of 10:37, 10 April 2026

Time: 2026-04-10 10:30
Address: 4th Research Building A518
Useful links: 📚 Reading list; 📆 Schedules; 🧐 Previous seminars.

[OSDI'25] PipeThreader: Software-defined pipelining for efficient DNN execution, Junzhe
Abstract: To effectively utilize heterogeneous specialized hardware units in modern GPUs, such as TensorCores and Tensor Memory Accelerators, this paper introduces PipeThreader, a new DNN compiler. PipeThreader proposes shifting scheduling functionality from hardware to software so as to enable more efficient and sophisticated computation pipelining with minimal manual effort. This is achieved through sTask-graph, a new DNN computation abstraction, a hierarchical hardware abstraction that captures the capabilities of specialized units, and new scheduling primitives. As a result, PipeThreader can discover efficient pipeline scheduling for well-studied DNN architectures like FlashAttention, achieving comparable or even superior performance. Additionally, it can uncover novel pipeline schemes for emerging models like Mamba2, delivering significantly better performance compared to state-of-the-art hand-crafted implementations. The code is open-sourced at https://github.com/tile-ai/tilelang.

[Topic] The path planning algorithm for multiple mobile edge servers in EdgeGO, Rong Cong, 2020-11-18

[Mobisys20] Combating packet collisions using non-stationary signal scaling in LPWANs, Wenliang Mao, 2020-11-18
[Topic] Dependency-Aware and Latency-Optimal Service Cache in Edge networks, Jiwei Mo, 2020-11-18
[talk] Paper Carnival 2020, ALL, 2020-09-24,25,26

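Please use the Latest_seminar and Hist_seminar templates to update this page.

- Update the time and location information.
- Copy the code of the current latest seminar section into this page.
- Change {{Latest_seminar... to {{Hist_seminar..., and add the corresponding date field |date=
- Fill in each field of the latest seminar.
- Never leave link empty; if there is no link, use this page's address https://mobinets.org/index.php?title=Resource:Seminar

Format reference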
- Latest_seminar:

{{Latest_seminar
|confname=
|link=
|title=
|speaker=
}}

- Hist_seminar

{{Hist_seminar
|confname=
|link=
|title=
|speaker=
|date=
}}
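
The update steps listed above (rename the template, add the date, and make sure link is not empty) can be sketched as a small script. This is only an illustration and not part of the wiki: the function name `latest_to_hist` and the constant `DEFAULT_LINK` are hypothetical, and the sketch assumes each field sits on its own line and the closing `}}` is on the last line, as in the two templates shown here.

```python
import re

# Fallback address taken from the instructions on this page.
DEFAULT_LINK = "https://mobinets.org/index.php?title=Resource:Seminar"


def latest_to_hist(wikitext: str, date: str) -> str:
    """Convert one {{Latest_seminar ...}} block into a {{Hist_seminar ...}} block.

    Hypothetical helper: assumes one field per line and a trailing '}}' line.
    """
    # Step 1: rename the template (first occurrence only).
    text = wikitext.replace("{{Latest_seminar", "{{Hist_seminar", 1)

    # Step 2: an empty |link= field is filled with the page address.
    text = re.sub(
        r"^(\|\s*link\s*=)[ \t]*$",
        lambda m: m.group(1) + DEFAULT_LINK,
        text,
        flags=re.M,
    )

    # Step 3: set |date=, overwriting an existing field or adding a new one.
    if re.search(r"^\|\s*date\s*=", text, flags=re.M):
        text = re.sub(
            r"^(\|\s*date\s*=).*$",
            lambda m: m.group(1) + date,
            text,
            flags=re.M,
        )
    else:
        # Insert the new field just before the final closing braces.
        idx = text.rindex("}}")
        text = text[:idx] + f"|date={date}\n" + text[idx:]
    return text


if __name__ == "__main__":
    latest = (
        "{{Latest_seminar\n"
        "|confname=IDEA\n"
        "|link=\n"
        "|title=Example title\n"
        "|speaker=Example speaker\n"
        "}}"
    )
    print(latest_to_hist(latest, "2024-11-29"))
```

The printed block can then be pasted into the history page by hand; the script does not talk to the wiki itself.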

@@ Line 1: / Line 1: @@
 {{SemNote
-|time='''2024-11-29 10:30-12:00'''
+|time='''2026-04-10 10:30'''
 |addr=4th Research Building A518
 |note=Useful links: [[Resource:Reading_List|📚 Readling list]]; [[Resource:Seminar_schedules|📆 Schedules]]; [[Resource:Previous_Seminars|🧐 Previous seminars]].
@@ Line 8: / Line 8: @@
 {{Latest_seminar
-|abstract = On-device Deep Neural Network (DNN) training has been recognized as crucial for privacy-preserving machine learning at the edge. However, the intensive training workload and limited onboard computing resources pose significant challenges to the availability and efficiency of model training. While existing works address these challenges through native resource management optimization, we instead leverage our observation that edge environments usually comprise a rich set of accompanying trusted edge devices with idle resources beyond a single terminal. We propose Asteroid, a distributed edge training system that breaks the resource walls across heterogeneous edge devices for efficient model training acceleration. Asteroid adopts a hybrid pipeline parallelism to orchestrate distributed training, along with a judicious parallelism planning for maximizing throughput under certain resource constraints. Furthermore, a fault-tolerant yet lightweight pipeline replay mechanism is developed to tame the device-level dynamics for training robustness and performance stability. We implement Asteroid on heterogeneous edge devices with both vision and language models, demonstrating up to 12.2× faster training than conventional parallelism methods and 2.1× faster than state-of-the-art hybrid parallelism methods through evaluations. Furthermore, Asteroid can recover training pipeline 14× faster than baseline methods while preserving comparable throughput despite unexpected device exiting and failure.
+|abstract = To effectively utilize heterogeneous specialized hardware units in modern GPUs, such as TensorCores and Tensor Memory Accelerators, this paper introduces PipeThreader, a new DNN compiler. PipeThreader proposes shifting scheduling functionality from hardware to software so as to enable more efficient and sophisticated computation pipelining with minimal manual effort. This is achieved through sTask-graph, a new DNN computation abstraction, a hierarchical hardware abstraction that captures the capabilities of specialized units, and new scheduling primitives. As a result, PipeThreader can discover efficient pipeline scheduling for well-studied DNN architectures like FlashAttention, achieving comparable or even superior performance. Additionally, it can uncover novel pipeline schemes for emerging models like Mamba2, delivering significantly better performance compared to state-of-the-art hand-crafted implementations. The code is open-sourced at https://github.com/tile-ai/tilelang.
-|confname = MobiCom'24
+|confname =OSDI'25
-|link = https://dl.acm.org/doi/abs/10.1145/3636534.3649363
+|link = https://www.usenix.org/conference/osdi25/presentation/cheng
-|title= Asteroid: Resource-Efficient Hybrid Pipeline Parallelism for Collaborative DNN Training on Heterogeneous Edge Devices
+|title= PipeThreader: Software-defined pipelining for efficient DNN execution
-|speaker=Congrong
+|speaker=Junzhe
-|date=2024-11-29
+|date=2026-4-9
-}}
-{{Latest_seminar
-|abstract = The need for cooperation among intelligent edge devices has popularized cooperative multi-agent reinforcement learning (MARL) in multi-target coverage. However, many research efforts rely heavily on parameter sharing among homogeneous agents, which hampers coverage performance. The heterogeneity of computing and sensing capabilities, along with the time-varying dynamics of computing resources, pose significant challenges. To address these challenges, we propose a resource-sensitive multi-agent reinforcement learning framework based on heterogeneous edge devices (SmartHE). SmartHE decomposes the target coverage task into two hierarchical levels: 1) Executor-level task: A central coordinator assigns a subset of executors (i.e., cameras or agents) to execute action policies, aiming to minimize overall policy inference time and energy consumption by leveraging resource heterogeneity. 2) Target-level task: Each executor ignores irrelevant targets that fall outside the coverage radius of the executor based on the estimated target states and ignores redundant targets that could be more effectively covered by other executors based on the utility estimation. This enables each executor to focus on extracting features that optimize coverage. Through this dual-task framework, SmartHE efficiently improves the system performance.
-|confname = IDEA
-|link =
-|title= SmartHE: Resource-sensitive MARL framework based on heterogeneous edge devices
-|speaker=Xianyang
-|date=2024-11-29
 }}
 {{Resource:Previous_Seminars}}
