Difference between revisions of "Resource:Previous Seminars"

=== History ===
{{Hist_seminar
|abstract = As Large Language Models (LLMs) continue to scale, optimizing their deployment requires efficient hardware and system co-design. However, current LLM performance evaluation frameworks fail to capture both chip-level execution details and system-wide behavior, making it difficult to assess realistic performance bottlenecks. In this work, we introduce ReaLLM, a trace-driven simulation framework designed to bridge the gap between detailed accelerator design and large-scale inference evaluation. Unlike prior simulators, ReaLLM integrates kernel profiling derived from detailed microarchitectural simulations with a new trace-driven end-to-end system simulator, enabling precise evaluation of parallelism strategies, batching techniques, and scheduling policies. To address the high computational cost of exhaustive simulations, ReaLLM constructs a precomputed kernel library based on hypothesized scenarios, interpolating results to efficiently explore a vast design space of LLM inference systems. Our validation against real hardware demonstrates the framework's accuracy, achieving an average end-to-end latency prediction error of only 9.1% when simulating inference tasks running on 4 NVIDIA H100 GPUs. We further use ReaLLM to evaluate popular LLMs' end-to-end performance across traces from different applications and identify key system bottlenecks, showing that modern GPU-based LLM inference is increasingly compute-bound rather than memory-bandwidth bound at large scale. Additionally, our precomputed kernel library significantly reduces simulation time, by a factor of 6× for full simulations and 164× for workload SLO exploration. ReaLLM is open-source and available at https://github.com/bespoke-silicon-group/reallm.
|confname =ACL'24
|link = https://arxiv.org/abs/2406.16441
|title= UniCoder: Scaling Code Large Language Model via Universal Code
|speaker=Bairong Liu
|date=2025-12-05
}}
{{Hist_seminar
|confname =TMC'25
|link = https://ieeexplore.ieee.org/abstract/document/11160677
|title= Resolving Inter-Logical Channel Interference for Large-scale LoRa Deployments
|speaker=Mengyu
|date=2025-12-05
}}
{{Hist_seminar
|confname =ToN'25
|link = https://ieeexplore.ieee.org/abstract/document/10843977
|title= Spliceosome: On-Camera Video Thinning and Tuning for Timely and Accurate Analytics
|speaker=Zhongwei Sun
|date=2025-11-28
}}
{{Hist_seminar
|confname =NSDI'25
|link = https://ieeexplore.ieee.org/abstract/document/10843977
|title= Accelerating Design Space Exploration for LLM Training Systems with Multi-experiment Parallel Simulation
|speaker=Qinyong
|date=2025-11-28
}}
{{Hist_seminar
|confname =ASAP'25
|link = https://ieeexplore.ieee.org/abstract/document/11113621

2024

2023

2022

2021

2020

  • [Topic] The path planning algorithm for multiple mobile edge servers in EdgeGO, Rong Cong, 2020-11-18

2019

2018

2017

Instructions

Please use the Latest_seminar and Hist_seminar templates to update the information on this page.

    • Update the time and location information
    • Copy the code from the current latest seminar section to this page
    • Change {{Latest_seminar... to {{Hist_seminar..., and add the corresponding date field |date= (a worked example follows the format reference below)
    • Fill in each field of the latest seminar
    • The link field must not be left empty; if there is no link, use this page's address: https://mobinets.org/index.php?title=Resource:Seminar
  • Format reference
    • Latest_seminar:

{{Latest_seminar
|confname=
|link=
|title=
|speaker=
}}

    • Hist_seminar:

{{Hist_seminar
|confname=
|link=
|title=
|speaker=
|date=
}}
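
A minimal worked example of the conversion described in the steps above. The confname, title, speaker, and date values here are placeholders, not a real seminar entry, and the link uses this page's address as the fallback mentioned above. An entry that currently reads

{{Latest_seminar
|confname=CONF'25
|link=https://mobinets.org/index.php?title=Resource:Seminar
|title=Example Seminar Title
|speaker=Example Speaker
}}

becomes, once the seminar has taken place and the entry is moved to this page,

{{Hist_seminar
|confname=CONF'25
|link=https://mobinets.org/index.php?title=Resource:Seminar
|title=Example Seminar Title
|speaker=Example Speaker
|date=2025-12-05
}}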