The field of speech communications has seen extensive advancements in speech technology as well as in the communication sciences for human interaction. While many speech and language resources have been established, few focus on team-based problem solving in naturalistic settings. The main challenge in establishing such a resource is capturing the audio in a consistent manner and ensuring that proper associated metadata and content knowledge are provided to support speech/language processing research, speech communications investigation, and speech technology innovation based on machine learning advancements. Privacy constraints, together with the willingness of large teams to be recorded and to have the resulting audio and metadata shared, represent the primary obstacles to overcome.
To address this, the NSF-supported Fearless Steps APOLLO Community Resource is an ongoing, massive naturalistic communications resource for researchers, scientists, historians, and technology innovators to advance their fields. Historically, the NASA Apollo program represents one of mankind's most significant technological challenges: placing a human on the Moon. This series of manned space missions was made possible by the dedication of an extensive group of committed scientists, engineers, and specialists who collaborated harmoniously, showcasing an extraordinary level of teamwork. Voice communications played a key role in ensuring a coordinated team effort, which ensured the success of each mission and leveraged team members' strengths and scientific knowledge/experience. The speech and language community has a substantial need for extensive Big Data speech corpora, as such corpora are crucial for advancing next-generation technologies in the field. The primary objective of the proposed workshop is to bring SPS members together to explore and address urgent needs within the speech/language community that can advance our field through the massive naturalistic Fearless Steps APOLLO corpus.
This workshop aims to feature keynote speakers, host panel discussions, present state-of-the-art research findings, and share both resources and best practices in exploring what will be the largest publicly available naturalistic multi-speaker team-based historical audio and metadata resource.
The main directions to be covered in this workshop are:
• Big Data Recovery and Deployment: current progress in the Fearless Steps APOLLO initiative.
• Applications to Education, History, & Archival efforts.
• Applications to Communication Science, Psychology (Group Dynamics/Team Cohesion).
• Applications to SLT development, including but not limited to automatic speech recognition (ASR), speech activity detection (SAD), speaker recognition, and conversational topic detection (a minimal SAD sketch follows this list).
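As a concrete illustration of the simplest of these tasks, below is a minimal frame-level SAD sketch in Python based on short-time log energy. The frame sizes and decision threshold are illustrative assumptions only; they do not represent the Fearless Steps baselines or challenge systems, which must cope with far more adverse, degraded channel conditions.

    import numpy as np

    def energy_sad(signal, sample_rate, frame_ms=25, hop_ms=10, threshold_db=-35.0):
        # Frame the signal, then mark each frame as speech (True) when its
        # short-time RMS energy, in dB relative to the signal peak, exceeds
        # the threshold. Real systems replace this rule with learned models.
        frame_len = int(sample_rate * frame_ms / 1000)
        hop_len = int(sample_rate * hop_ms / 1000)
        peak = np.max(np.abs(signal)) + 1e-12
        labels = []
        for start in range(0, len(signal) - frame_len + 1, hop_len):
            frame = signal[start:start + frame_len].astype(float)
            rms = np.sqrt(np.mean(frame ** 2)) + 1e-12
            labels.append(20.0 * np.log10(rms / peak) > threshold_db)
        return np.array(labels)

On 8 kHz audio, for example, this produces one speech/non-speech decision every 10 ms, which can then be smoothed into speech segments for downstream processing.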
The objective of the Fearless Steps APOLLO workshop is to provide a forum for advancing speech technology and research on massive naturalistic data. In addition, this workshop is intended to provide a mechanism for discussing the latest challenges in speech/language research and innovative solutions to these tasks. The following points provide an overview of the planned workshop and its role in collaborative problem-solving studies.
• Advancements in digitizing and recovering Apollo audio from tapes, and refining machine learning solutions for audio diarization as a shared community resource (a generic pipeline sketch follows this list).
• Understanding team-based communication dynamics through speech processing: current
challenges and future directions.
• Utility of Fearless Steps APOLLO to communication science, historical archives, and education
communities.
• The FEARLESS STEPS Challenge: a tool for community engagement and data generation.
• Effective worldwide deployment of the Fearless Steps APOLLO corpus and corresponding
metadata.
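To clarify what such a diarization pipeline involves, the following is a generic Python sketch of its usual stages: segmentation, per-segment speaker embedding, and clustering into speaker labels. The band-energy embedding and naive k-means below are hypothetical stand-ins chosen for self-containedness; they are not the Fearless Steps APOLLO pipeline, which relies on machine-learned models suited to the degraded multi-channel Apollo audio.

    import numpy as np

    def diarize(signal, sample_rate, seg_ms=1000, n_speakers=2):
        # Split audio into fixed-length segments.
        seg_len = int(sample_rate * seg_ms / 1000)
        segments = [signal[i:i + seg_len]
                    for i in range(0, len(signal) - seg_len + 1, seg_len)]

        # Hypothetical embedding: log band energies of the magnitude spectrum
        # (real systems use learned speaker embeddings, e.g. x-vectors).
        def embed(seg):
            spec = np.abs(np.fft.rfft(seg.astype(float)))
            bands = np.array_split(spec, 16)
            return np.log1p(np.array([b.mean() for b in bands]))

        X = np.stack([embed(s) for s in segments])

        # Naive k-means clustering of segment embeddings into speakers.
        rng = np.random.default_rng(0)
        centers = X[rng.choice(len(X), n_speakers, replace=False)]
        for _ in range(20):
            labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
            for k in range(n_speakers):
                if np.any(labels == k):
                    centers[k] = X[labels == k].mean(axis=0)
        return labels  # one speaker index per segment

The per-segment speaker labels produced by such a pipeline are what enable downstream uses discussed above, such as speaker recognition and conversational topic detection over the mission audio.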
The session will consist of 15-minute oral talks comprising: (i) overview talks on the Fearless Steps APOLLO community resource, current state-of-the-art systems developed for the labeled datasets, and the established diarization pipeline; followed by (ii) oral presentations on the best-performing systems evaluated on the Fearless Steps Challenge dataset. In addition, poster sessions will be available to researchers applying Fearless Steps APOLLO data to novel tasks. Finally, there will be a 30-minute panel discussion, with panelists invited from the aforementioned communities. The discussion will be moderated by the organizers.
We invite authors to submit their original research and workshop contributions following the guidelines outlined below.
Authors are invited to submit their abstracts through our dedicated submission portal available here.
Your submission should be a maximum of two pages, including references and appendices. The first page should include a brief summary of your work; the optional second page is reserved for figures/tables and references.
Two sample PDFs are available on the website as references to guide you in formatting your submission.
This workshop aims to shed light on the Apollo data's intricate nuances and potential implications, drawing the attention of researchers and engineers to this rich field of study. Through this, the workshop hopes to inspire robust model development for team engagement and collaborative problem solving.