Seminar Topic WS20/21: Assorted Themes in Parallel Computing

Seminar Style

Attendance at all seminar presentations is obligatory for every participant.

Successful participation consists of

  1. choosing, reading and understanding 1-2 papers from the list of suggested readings,
  2. presenting the papers to the other participants (slides, 30 minutes),
  3. and writing a summary of the papers (10-15 pages).

The number of meetings depends on the number of participants. There will usually be two talks per meeting.

ECTS points: 3.0
Seminar ECTS points will be assigned to the area that the presented topic fits best (see the ECTS column in the topic list below).

Key Dates


Register on TISS by November 10, 2020! (The procedure will be explained during the first meeting.)


Topic | Advisor | Paper | ECTS | Comment
1 | JLT | Norman P. Jouppi, Cliff Young, Nishant Patil, David A. Patterson: Motivation for and Evaluation of the First Tensor Processing Unit. IEEE Micro 38(3): 10-19 (2018); Norman P. Jouppi, Cliff Young, Nishant Patil, David A. Patterson: A domain-specific architecture for deep neural networks. Commun. ACM 61(9): 50-59 (2018); Norman P. Jouppi, Doe Hyun Yoon, George Kurian, Sheng Li, Nishant Patil, James Laudon, Cliff Young, David A. Patterson: A domain-specific supercomputer for training deep neural networks. Commun. ACM 63(7): 67-78 (2020) | SE, CE | Taken
2 | JLT | David Cardwell, Fengguang Song: An Extended Roofline Model with Communication-Awareness for Distributed-Memory HPC Systems. HPC Asia 2019: 26-35 | SE, CE | Taken
3 | JLT | Aleksandar Ilic, Frederico Pratas, Leonel Sousa: Beyond the Roofline: Cache-Aware Power and Energy-Efficiency Modeling for Multi-Cores. IEEE Trans. Computers 66(1): 52-58 (2017); Nicolas Denoyelle, Brice Goglin, Aleksandar Ilic, Emmanuel Jeannot, Leonel Sousa: Modeling Non-Uniform Memory Access on Large Compute Nodes with the Cache-Aware Roofline Model. IEEE Trans. Parallel Distributed Syst. 30(6): 1374-1389 (2019) | SE, CE | Taken
4 | JLT | Hermann Schweizer, Maciej Besta, Torsten Hoefler: Evaluating the Cost of Atomic Operations on Modern Architectures. PACT 2015: 445-456 | SE, CE | Taken
5 | JLT | Alexandr Andoni, Zhao Song, Clifford Stein, Zhengyu Wang, Peilin Zhong: Parallel Graph Connectivity in Log Diameter Rounds. FOCS 2018: 674-685 | AL, TH |
6 | JLT | Soheil Behnezhad, Laxman Dhulipala, Hossein Esfandiari, Jakub Lacki, Vahab S. Mirrokni: Near-Optimal Massively Parallel Graph Connectivity. FOCS 2019: 1615-1636 | AL, TH | Taken
7 | JLT | Umut A. Acar, Arthur Charguéraud, Mike Rainey: A work-efficient algorithm for parallel unordered depth-first search. SC 2015: 67:1-67:12 | SE, CE, AL, TH | Taken
8 | JLT | Jeremy T. Fineman: Nearly work-efficient parallel algorithm for digraph reachability. STOC 2018: 457-470 | AL, TH | Taken
9 | JLT | Laksono Adhianto, S. Banerjee, Michael W. Fagan, Mark Krentel, Gabriel Marin, John M. Mellor-Crummey, Nathan R. Tallent: HPCTOOLKIT: tools for performance analysis of optimized parallel programs. Concurr. Comput. Pract. Exp. 22(6): 685-701 (2010); David Böhme, Todd Gamblin, David Beckingsale, Peer-Timo Bremer, Alfredo Giménez, Matthew P. LeGendre, Olga Pearce, Martin Schulz: Caliper: performance introspection for HPC software stacks. SC 2016: 550-560 | SE, CE | Taken
10 | JLT | Sixue Cliff Liu, Robert E. Tarjan, Peilin Zhong: Connected Components on a PRAM in Log Diameter Time. SPAA 2020: 359-369 | AL, TH | Taken



If you have further questions about the seminar, please contact Prof. Jesper Larsson Träff.