Abstract
Algorithmic surgical workflow recognition is an ongoing research field and can be divided into laparoscopic (Internal) and operating room (External) analysis. So far, many works on Internal analysis have been proposed that combine a frame-level model with an additional temporal model to address the temporal ambiguities between different workflow phases. For the External recognition task, researchers have focused on clip-level methods that target the local ambiguities present in the operating room (OR) scene. In this work, we evaluate combinations of different model architectures for the task of surgical workflow recognition to provide a fair comparison of the methods for both Internal and External analysis. We show that methods designed for Internal analysis can be transferred to the External task with comparable performance gains for different architectures.