My ICIP 2021 Schedule

Paper Detail

Paper ID: MLR-APPL-IVASR-2.1
Paper Title: TWO-PATHWAY TRANSFORMER NETWORK FOR VIDEO ACTION RECOGNITION
Authors: Bo Jiang, Jiahong Yu, Lei Zhou, Kailin Wu, Yang Yang; Netease Inc, China
Session: MLR-APPL-IVASR-2: Machine learning for image and video analysis, synthesis, and retrieval 2
Location: Area D
Session Time: Monday, 20 September, 15:30 - 17:00
Presentation Time: Monday, 20 September, 15:30 - 17:00
Presentation: Poster
Topic: Applications of Machine Learning: Machine learning for image & video analysis, synthesis, and retrieval
Abstract: Traditional two-stream neural networks have shown that both appearance and motion information are important for video action recognition. However, naively averaging the two streams' scores at the end of the framework neglects the underlying relationship between these two kinds of information. In this paper, we propose a two-pathway transformer network that uses memory-based attention to explore this relationship, which further improves classification performance. Specifically, a transformer-based decoder takes one pathway's features as the query and the other's as the key and value. Then, based on the similarity matrix estimated from the query and key, relevant information from the value is selected to enhance the query for the final classification task. Experiments demonstrate that the proposed method outperforms existing fusion strategies applied at the end of two-stream methods.
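
The query/key/value fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes single-head scaled dot-product attention, no learned projections, and a residual addition as the way the query is "enhanced". The abstract specifies none of these details. The function name `cross_attention` and the toy `appearance` and `motion` features are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, key, value):
    """Enhance `query` with information selected from `value`.

    query: n_q x d, key: n_k x d, value: n_k x d (lists of lists).
    Assumption: scaled dot-product similarity and a residual sum;
    the abstract only says a similarity matrix selects from the value.
    """
    d = len(query[0])
    scale = 1.0 / math.sqrt(d)
    # Similarity matrix between every query row and every key row.
    sim = [[scale * sum(a * b for a, b in zip(q, k)) for k in key]
           for q in query]
    # Normalize each row so the weights over the keys sum to 1.
    weights = [softmax(row) for row in sim]
    # Weighted sum of value rows for each query row.
    attended = [[sum(w * v[j] for w, v in zip(w_row, value))
                 for j in range(len(value[0]))]
                for w_row in weights]
    # Residual connection (assumed): add the selected information to the query.
    enhanced = [[qj + aj for qj, aj in zip(q, a)]
                for q, a in zip(query, attended)]
    return enhanced, weights


# Toy features: one pathway supplies the query, the other the key and value.
appearance = [[1.0, 0.0], [0.0, 1.0]]
motion = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
enhanced, weights = cross_attention(appearance, motion, motion)
```

Swapping the arguments, so that `motion` is the query and `appearance` is the key and value, gives the other direction of fusion. Whether the paper uses one direction or both is not stated in the abstract.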