Fearless Steps APOLLO:
A Community Resource

Introduction

The field of speech communications has seen extensive advancements in speech technology as well as in the communication sciences for human interaction. While many speech and language resources have been established, few focus on team-based problem solving in naturalistic settings. The main challenge in establishing such a resource is capturing audio in a consistent manner while providing the associated metadata and content knowledge needed to support speech/language processing research, speech communications investigation, and speech technology innovation based on machine learning advancements. Privacy constraints, together with the need for large teams to consent to being recorded and to the subsequent sharing of audio and metadata, represent the primary obstacles to overcome.

To address this, the NSF-supported Fearless Steps APOLLO Community Resource is an ongoing massive naturalistic communications resource for researchers, scientists, historians, and technology innovators to advance their fields. Historically, the NASA Apollo program represents one of mankind’s most significant technological achievements: placing a human on the Moon. This series of manned space missions was made possible by the dedication of an extensive group of committed scientists, engineers, and specialists who collaborated harmoniously, showcasing an extraordinary level of teamwork. Voice communications played a key role in ensuring a coordinated team effort, which secured the success of each mission and leveraged team members’ strengths and scientific knowledge/experience. The speech and language community has a substantial need for extensive Big Data speech corpora, which are crucial for advancing next-generation technologies in the field. The primary objective of the proposed workshop is to bring SPS members together to explore and address urgent needs within the speech/language community that can advance our field through the massive naturalistic Fearless Steps APOLLO corpus.
This workshop aims to feature keynote speakers, host panel discussions, present state-of-the-art research findings, and share resources and best practices in exploring what will be the largest publicly available naturalistic multi-speaker team-based historical audio and metadata resource.

Topics Covered

The main directions to be covered in this workshop are:

• Big Data Recovery and Deployment; Current progress in the Fearless Steps APOLLO initiative.
• Applications to Education, History, & Archival efforts.
• Applications to Communication Science, Psychology (Group Dynamics/Team Cohesion).
• Applications to SLT development, including but not limited to automatic speech recognition (ASR), speech activity detection (SAD), speaker recognition, and conversational topic detection.

Planned Composition

The objective of the Fearless Steps APOLLO workshop is to provide a forum for advancing speech technology and research on massive naturalistic data. This workshop is also intended to provide a mechanism for discussing the latest challenges in speech/language research and innovative solutions to these tasks. The following points provide an overview of the planned workshop and its role in collaborative problem-solving studies.

• Advancements in digitizing and recovering Apollo audio from tapes, and refining audio diarization machine learning solutions for community resource sharing.
• Understanding team-based communication dynamics through speech processing: current challenges and future directions.
• Utility of Fearless Steps APOLLO to communication science, historical archives, and education communities.
• The FEARLESS STEPS Challenge: a tool for community engagement and data generation.
• Effective worldwide deployment of the Fearless Steps APOLLO corpus and corresponding metadata.

The session will consist of 15-minute oral talks with:
(i) overview talks on the Fearless Steps APOLLO community resource, current state-of-the-art systems developed for the labeled datasets, and the established diarization pipeline; followed by
(ii) oral presentations on the best-performing systems evaluated on the Fearless Steps Challenge dataset. In addition, poster sessions will be available to researchers applying Fearless Steps APOLLO data to novel tasks. Finally, there will be a 30-minute panel discussion, with panelists invited from the aforementioned communities.
The discussion will be moderated by the organizers.


Instruction for Authors

IF YOU ARE A PROSPECTIVE AUTHOR, PLEASE READ THIS SECTION!

We invite authors to submit their original research and workshop contributions following the guidelines outlined below.

How to Submit Your Abstract

Authors are invited to submit their abstracts through our dedicated submission portal, available here.

Presentation Format

  • Based on the abstract submissions received, presentations will be delivered as oral presentations.
  • We encourage all forms of participation; therefore, virtual oral presentations will be facilitated for presenters who are unable to attend in person.
  • General attendance is expected to be in-person to foster a collaborative and engaging workshop environment.
  • Remote participation options are reserved exclusively for presenters.

Abstract Formatting and Requirements

Submission and Participation Policy

  • All submissions will be reviewed for consideration as oral presentations.
  • While we offer remote presentation options for presenters, we aim to have an interactive and in-person experience for attendees to maximize the workshop's impact.
  • Authors with accepted papers at ICASSP 2024 will be given an opportunity to showcase their work through an oral presentation. Additionally, authors whose original 4-page work was rejected from the main ICASSP conference may submit to the workshop, provided the paper follows the workshop abstract submission format.

This workshop aims to shed light on the Apollo data’s intricate nuances and potential implications, drawing the attention of researchers and engineers towards this rich field of study. Through this, the workshop hopes to inspire robust model development for team engagement and collaborative problem solving.

Workshop Agenda

Submission Deadlines

  • Submission Portal: Submit Workshop Abstract
  • Submission Portal Opens: November 29, 2023
  • Submission Deadline: April 1, 2024
  • Author Notification: March 15, 2024
  • Workshop Date: April 15, 2024

References

[1] Shekar, Meena M. Chandra, and John HL Hansen. “Historical Audio Search and Preservation: Finding Waldo Within the Fearless Steps Apollo 11 Naturalistic Audio Corpus [Applications Corner].” IEEE Signal Processing Magazine 40, no. 3: 30-38. (2023)

[2] Shekar, Meena M.C., and John HL Hansen. “Speaker Tracking using Graph Attention Networks with Varying Duration Utterances across Multi-Channel Naturalistic Data Fearless Steps Apollo-11 Audio Corpus.” ISCA INTERSPEECH-2023 (2023).

[3] Joglekar, Aditya, Ivan Lopez-Espejo, and John H. Hansen. “Fearless Steps APOLLO: Challenges in keyword spotting and topic detection for naturalistic audio streams.” The Journal of the Acoustical Society of America 153, no. 3: A173-A173. (2023)

[4] J. H. L. Hansen, A. Joglekar, M. M. Chandra Shekar, S.-J. Chen, and X. Liu, “Fearless Steps APOLLO: Team Communications Based Development for Science, Technology, Education, and Historical Preservation,” Accepted at IEEE ICASSP 2024.

[5] A. Joglekar and J. H. L. Hansen, “DeepComboSAD: Spectro-Temporal Correlation Based Speech Activity Detection for Naturalistic Audio Streams,” in IEEE Signal Processing Letters, vol. 30, pp. 1472-1476, 2023, doi: 10.1109/LSP.2023.3319229.

[6] S.-J. Chen, J. Xie, and J. H. Hansen, “FeaRLESS: Feature Refinement Loss for Ensembling Self-Supervised Learning Features in Robust End-to-end Speech Recognition,” in Proc. Interspeech 2022, pp. 3058–3062 (2022)

[7] Hansen, J.H., Joglekar, A., Chen, S.J., Shekar, M.C. and Belitz, C., “Fearless Steps APOLLO: Advanced Naturalistic Corpora Development.” In Proceedings of the 2nd Workshop on Novel Incentives in Data Collection from People: models, implementations, challenges, and results within LREC 2022 (pp. 14-19). (2022)

[8] S.-J. Chen, W. Xia, and J. H. Hansen, “Scenario aware speech recognition: Advancements for Apollo Fearless Steps & CHiME-4 corpora,” in Proc. IEEE ASRU 2021, pp. 289–295 (2021)

[9] Joglekar, Aditya, Seyed Omid Sadjadi, Meena Chandra-Shekar, Christopher Cieri, and John HL Hansen. “Fearless steps challenge phase-3 (FSC-P3): Advancing SLT for Unseen Channel and Mission Data across NASA Apollo Audio.” ISCA INTERSPEECH-2021. (2021)

[10] Joglekar, Aditya, John H.L. Hansen, M.C. Shekar, and Abhijeet Sangwan (2020) FEARLESS STEPS Challenge (FS-2): Supervised Learning with Massive Naturalistic Apollo Data. Proc. Interspeech 2020, 2617-2621 (2020)

[11] Hansen, John HL, Aditya Joglekar, Meena Chandra Shekhar, Vinay Kothapally, Chengzhu Yu, Lakshmish Kaushik, and Abhijeet Sangwan. “The 2019 inaugural fearless steps challenge: A giant leap for naturalistic audio.” ISCA INTERSPEECH-2019. (2019)

[12] Yu, Chengzhu, and John HL Hansen. “A study of voice production characteristics of astronaut speech during Apollo 11 for speaker modeling in space.” The Journal of the Acoustical Society of America 141, no. 3 (2017): 1605-1614

[13] Yu, Chengzhu, and John HL Hansen. “Active learning based constrained clustering for speaker diarization.” IEEE/ACM Transactions on Audio, Speech, and Language Processing 25, no. 11 (2017): 2188-2198

[14] Hansen, John HL, Abhijeet Sangwan, Aditya Joglekar, Ahmet Emin Bulut, Lakshmish Kaushik, and Chengzhu Yu. “Fearless Steps: Apollo-11 Corpus Advancements for Speech Technologies from Earth to the Moon.” In INTERSPEECH, pp. 2758-2762. (2018)

[15] Yu, Chengzhu, John HL Hansen, and Douglas W. Oard. “'Houston, we have a solution': A case study of the analysis of astronaut speech during NASA Apollo 11 for long-term speaker modeling.” In Fifteenth Annual Conference of the International Speech Communication Association. 2014

[16] Ziaei, Ali, Lakshmish Kaushik, Abhijeet Sangwan, John HL Hansen, and Douglas W. Oard. “Speech activity detection for NASA Apollo space missions: Challenges and solutions.” In Fifteenth Annual Conference of the International Speech Communication Association. 2014

[17] Sangwan, Abhijeet, Lakshmish Kaushik, Chengzhu Yu, John HL Hansen, and Douglas W. Oard. “'Houston, we have a solution': using NASA Apollo program to advance speech and language processing technology.” In INTERSPEECH, pp. 1135-1139. 2013
