Difference between revisions of "Resource:Previous Seminars"

=== History ===
{{Hist_seminar
|abstract = As Large Language Models (LLMs) continue to scale, optimizing their deployment requires efficient hardware and system co-design. However, current LLM performance evaluation frameworks fail to capture both chip-level execution details and system-wide behavior, making it difficult to assess realistic performance bottlenecks. In this work, we introduce ReaLLM, a trace-driven simulation framework designed to bridge the gap between detailed accelerator design and large-scale inference evaluation. Unlike prior simulators, ReaLLM integrates kernel profiling derived from detailed microarchitectural simulations with a new trace-driven end-to-end system simulator, enabling precise evaluation of parallelism strategies, batching techniques, and scheduling policies. To address the high computational cost of exhaustive simulations, ReaLLM constructs a precomputed kernel library based on hypothesized scenarios, interpolating results to efficiently explore a vast design space of LLM inference systems. Our validation against real hardware demonstrates the framework's accuracy, achieving an average end-to-end latency prediction error of only 9.1% when simulating inference tasks running on 4 NVIDIA H100 GPUs. We further use ReaLLM to evaluate popular LLMs' end-to-end performance across traces from different applications and identify key system bottlenecks, showing that modern GPU-based LLM inference is increasingly compute-bound rather than memory-bandwidth bound at large scale. Additionally, our precomputed kernel library significantly reduces simulation time, by a factor of 6× for full simulations and 164× for workload SLO exploration. ReaLLM is open-source and available at https://github.com/bespoke-silicon-group/reallm.
|confname =ACL'24
|link = https://arxiv.org/abs/2406.16441
|title= UniCoder: Scaling Code Large Language Model via Universal Code
|speaker=Bairong Liu
|date=2025-12-05
}}
{{Hist_seminar
|confname =TMC'25
|link = https://ieeexplore.ieee.org/abstract/document/11160677
|title= Resolving Inter-Logical Channel Interference for Large-scale LoRa Deployments
|speaker=Mengyu
|date=2025-12-05
}}
{{Hist_seminar
|confname =ToN'25
|link = https://ieeexplore.ieee.org/abstract/document/10843977
|title= Spliceosome: On-Camera Video Thinning and Tuning for Timely and Accurate Analytics
|speaker=Zhongwei Sun
|date=2025-11-28
}}
{{Hist_seminar
|confname =NSDI'25
|link = https://ieeexplore.ieee.org/abstract/document/10843977
|title= Accelerating Design Space Exploration for LLM Training Systems with Multi-experiment Parallel Simulation
|speaker=Qinyong
|date=2025-11-28
}}
{{Hist_seminar
|confname =ASAP'25
|link = https://ieeexplore.ieee.org/abstract/document/11113621

2024

2023

2022

2021

2020

  • [Topic] The path planning algorithm for multiple mobile edge servers in EdgeGO, Rong Cong, 2020-11-18

2019

2018

2017

Instructions

Please use the Latest_seminar and Hist_seminar templates to update the information on this page.

    • Update the time and location information
    • Copy the code from the current latest seminar section to this page
    • Change {{Latest_seminar... to {{Hist_seminar..., and add the corresponding date field |date= (a worked example follows the format reference below)
    • Fill in each field of the latest seminar
    • The link field must not be left empty; if there is no link, use this page's address: https://mobinets.org/index.php?title=Resource:Seminar
  • Format reference
    • Latest_seminar:

{{Latest_seminar
|confname=
|link=
|title=
|speaker=
}}

    • Hist_seminar:

{{Hist_seminar
|confname=
|link=
|title=
|speaker=
|date=
}}
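
A minimal worked example of the conversion described in the steps above. The confname, title, speaker, and date values here are placeholders, not a real seminar entry, and the link uses this page's address as the fallback mentioned above. An entry that currently reads

{{Latest_seminar
|confname=CONF'25
|link=https://mobinets.org/index.php?title=Resource:Seminar
|title=Example Seminar Title
|speaker=Example Speaker
}}

becomes, once the seminar has taken place and the entry is moved to this page,

{{Hist_seminar
|confname=CONF'25
|link=https://mobinets.org/index.php?title=Resource:Seminar
|title=Example Seminar Title
|speaker=Example Speaker
|date=2025-12-05
}}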