Difference between revisions of "Resource:Seminar"

Latest revision as of 10:37, 10 April 2026

Time: 2026-04-10 10:30
Address: 4th Research Building A518
Useful links: 📚 Readling list; 📆 Schedules; 🧐 Previous seminars.

[OSDI'25] PipeThreader: Software-defined pipelining for efficient DNN execution, Junzhe
Abstract: To effectively utilize heterogeneous specialized hardware units in modern GPUs, such as TensorCores and Tensor Memory Accelerators, this paper introduces PipeThreader, a new DNN compiler. PipeThreader proposes shifting scheduling functionality from hardware to software so as to enable more efficient and sophisticated computation pipelining with minimal manual effort. This is achieved through sTask-graph, a new DNN computation abstraction, a hierarchical hardware abstraction that captures the capabilities of specialized units, and new scheduling primitives. As a result, PipeThreader can discover efficient pipeline scheduling for well-studied DNN architectures like FlashAttention, achieving comparable or even superior performance. Additionally, it can uncover novel pipeline schemes for emerging models like Mamba2, delivering significantly better performance compared to state-of-the-art hand-crafted implementations. The code is open-sourced at https://github.com/tile-ai/tilelang.

[Topic] [ The path planning algorithm for multiple mobile edge servers in EdgeGO], Rong Cong, 2020-11-18

[Mobisys20] Combating packet collisions using non-stationary signal scaling in LPWANs, Wenliang Mao, 2020-11-18
[Topic] [ Dependency-Aware and Latency-Optimal Service Cache in Edge networks], Jiwei Mo, 2020-11-18
[talk] Paper Carnival 2020, ALL, 2020-09-24,25,26

请使用Latest_seminar和Hist_seminar模板更新本页信息.

- 修改时间和地点信息
- 将当前latest seminar部分的code复制到这个页面中
- 将{{Latest_seminar... 修改为 {{Hist_seminar...，并增加对应的日期信息|date=
- 填入latest seminar各字段信息
- link请务必不要留空，如果没有link则填本页地址 https://mobinets.org/index.php?title=Resource:Seminar

格式说明
- Latest_seminar:

{{Latest_seminar
|confname=
|link=
|title=
|speaker=
}}

- Hist_seminar

{{Hist_seminar
|confname=
|link=
|title=
|speaker=
|date=
}}

@@ Line 1: / Line 1: @@
 {{SemNote
-|time='''2023-10-08 16:20'''
+|time='''2026-04-10 10:30'''
 |addr=4th Research Building A518
-|note=Useful links: [[Resource:Reading_List|Readling list]]; [[Resource:Seminar_schedules|Schedules]]; [[Resource:Previous_Seminars|Previous seminars]].
+|note=Useful links: [[Resource:Reading_List|📚 Readling list]]; [[Resource:Seminar_schedules|📆 Schedules]]; [[Resource:Previous_Seminars|🧐 Previous seminars]].
 }}
 ===Latest===
 {{Latest_seminar
-|abstract=This paper presents CellFusion, a system designed for high-quality, real-time video streaming from vehicles to the cloud. It leverages an innovative blend of multipath QUIC transport and network coding. Surpassing the limitations of individual cellular carriers, CellFusion uses a unique last-mile overlay that integrates multiple cellular networks into a single, unified cloud connection. This integration is made possible through the use of in-vehicle Customer Premises Equipment (CPEs) and edge-cloud proxy servers. In order to effectively handle unstable cellular connections prone to intense burst losses and unexpected latency spikes as a vehicle moves, CellFusion introduces XNC. This innovative network coding-based transport solution enables efficient and resilient multipath transport. XNC aims to accomplish low latency, minimal traffic redundancy, and reduced computational complexity all at once. CellFusion is secure and transparent by nature and does not require modifications for vehicular apps connecting to it. We tested CellFusion on 100 self-driving vehicles for over six months with our cloud-native back-end running on 50 CDN PoPs. Through extensive road tests, we show that XNC reduced video packet delay by 71.53% at the 99th percentile versus 5G. At 30Mbps, CellFusion achieved 66.11% ~ 80.62% reduction in video stall ratio versus state-of-the-art multipath transport solutions with less than 10% traffic redundancy.
+|abstract = To effectively utilize heterogeneous specialized hardware units in modern GPUs, such as TensorCores and Tensor Memory Accelerators, this paper introduces PipeThreader, a new DNN compiler. PipeThreader proposes shifting scheduling functionality from hardware to software so as to enable more efficient and sophisticated computation pipelining with minimal manual effort. This is achieved through sTask-graph, a new DNN computation abstraction, a hierarchical hardware abstraction that captures the capabilities of specialized units, and new scheduling primitives. As a result, PipeThreader can discover efficient pipeline scheduling for well-studied DNN architectures like FlashAttention, achieving comparable or even superior performance. Additionally, it can uncover novel pipeline schemes for emerging models like Mamba2, delivering significantly better performance compared to state-of-the-art hand-crafted implementations. The code is open-sourced at https://github.com/tile-ai/tilelang.
-|confname=SIGCOMM '23
+|confname =OSDI'25
-|link=https://dl.acm.org/doi/10.1145/3603269.3604832
+|link = https://www.usenix.org/conference/osdi25/presentation/cheng
-|title=CellFusion: Multipath Vehicle-to-Cloud Video Streaming with Network Coding in the Wild
+|title= PipeThreader: Software-defined pipelining for efficient DNN execution
-|speaker=Rong Cong
+|speaker=Junzhe
-|date=2023-10-08}}
+|date=2026-4-9
-{{Latest_seminar
+}}
-|abstract=Resource disaggregation offers a cost effective solution to resource scaling, utilization, and failure-handling in data centers by physically separating hardware devices in a server. Servers are architected as pools of processor, memory, and storage devices, organized as independent failure-isolated components interconnected by a high-bandwidth network. A critical challenge, however, is the high performance penalty of accessing data from a remote memory module over the network. Addressing this challenge is difficult as disaggregated systems have high runtime variability in network latencies/bandwidth, and page migration can significantly delay critical path cache line accesses in other pages. This paper conducts a characterization analysis on different data movement strategies in fully disaggregated systems, evaluates their performance overheads in a variety of workloads, and introduces DaeMon, the first software-transparent mechanism to significantly alleviate data movement overheads in fully disaggregated systems. First, to enable scalability to multiple hardware components in the system, we enhance each compute and memory unit with specialized engines that transparently handle data migrations. Second, to achieve high performance and provide robustness across various network, architecture and application characteristics, we implement a synergistic approach of bandwidth partitioning, link compression, decoupled data movement of multiple granularities, and adaptive granularity selection in data movements. We evaluate DaeMon in a wide variety of workloads at different network and architecture configurations using a state-of-the-art simulator. DaeMon improves system performance and data access costs by 2.39× and 3.06×, respectively, over the widely-adopted approach of moving data at page granularity.
-|confname=SigMetrics '23
-|link=https://dl.acm.org/doi/abs/10.1145/3579445
-|title=DaeMon: Architectural Support for Efficient Data Movement in Fully Disaggregated Systems
-|speaker=Jiyi
-|date=2023-10-08}}
-{{Latest_seminar
-|abstract=Realizing Digital Twins for Vehicular Networks: Towards Future Network Evolution
-|confname=Tech. Talk
-|link=#
-|title=Rechargeable network
-|speaker=Prof. Tang Liu
-|date=2023-10-08}}
-=== History ===
 {{Resource:Previous_Seminars}}

Navigation menu

Difference between revisions of "Resource:Seminar"

Latest revision as of 10:37, 10 April 2026

Contents

Difference between revisions of "Resource:Seminar"

Latest revision as of 10:37, 10 April 2026

Latest

History

2024

2023

2022

2021

2020

2019

2018

2017

Instructions