The Fearless Steps Initiative by UTDallas-CRSS led to the digitization, recovery, and diarization of 19,000 hours of original analog audio data, as well as the development of algorithms to extract meaningful information from this multichannel naturalistic data resource. As an initial step to motivate a streamlined and collaborative effort from the speech and language community, UTDallas-CRSS is hosting a series of progressively complex tasks to promote advanced research on naturalistic “Big Data” corpora. This began with ISCA INTERSPEECH-2019: "The FEARLESS STEPS Challenge: Massive Naturalistic Audio (FS-#1)". The first edition of the challenge encouraged the development of core unsupervised/semi-supervised speech and language systems for single-channel data with low resource availability, serving as the “First Step” towards extracting high-level information from such massive unlabeled corpora. It was followed by the Special Session for the FEARLESS STEPS Challenge (FS-2) at ISCA INTERSPEECH-2020, which focused on developing supervised learning strategies for the 100-hour Challenge Corpus.
As a natural progression following the successful inaugural FEARLESS STEPS Challenge Phase-#1 and FEARLESS STEPS Challenge Phase-#2, the FEARLESS STEPS Challenge Phase-#3 (FSC P3) focuses on the development of single-channel supervised learning strategies, with the aim of testing how well systems generalize to varying channel and mission data. FSC P3 also provides an additional challenge task of Conversational Analysis, motivating researchers to work on natural language understanding and group dynamics analysis. FSC P3 provides 80 hours of ground-truth data through Training and Development sets, with 20 hours of blind-set Apollo-11 evaluation data, 5 hours of unseen channel evaluation data, and an additional 5 hours of blind-set Apollo-13 mission evaluation data. Based on feedback from the Fearless Steps participants, additional Tracks for streamlined speech recognition, speaker diarization, and conversational analysis have been included in FSC P3. The results for this Challenge will be presented at the ISCA INTERSPEECH-2021 Special Session. We encourage participants to explore any and all research tasks of interest with the Fearless Steps Corpus, with suggested Task Domains listed below. Research participants can, however, also use the FSC P3 corpus to explore additional problems dealing with naturalistic data, which we welcome as part of the special session.
| Milestone | Date |
|---|---|
| | February 12, 2021 to March 14, 2021 |
| Training and Development Data Release | February 12, 2021 |
| Evaluation Plan Release | February 20, 2021 |
| Evaluation Data Release | March 5, 2021 |
| System Submission Opens | March 5, 2021 |
| Baseline Description and Results | March 15, 2021 |
| INTERSPEECH-2021 Paper Registration Deadline | March 26, 2021 (see IS-2021 dates) |
| Final System Submission Deadline | March 31, 2021 |
| Final Results Announced for all Tasks | April 1, 2021 |
| INTERSPEECH-2021 FEARLESS STEPS Paper Submission Deadline | April 2, 2021 (see IS-2021 dates) |
| INTERSPEECH-2021 FEARLESS STEPS Special Session | August 30 - September 3, 2021 (see IS-2021 dates) |
Check the Register tab for more information on how to register on the OpenSAT website for the Fearless Steps Challenge: Phase 3.
This challenge is hosted in co-ordination with NIST!
The entire Fearless Steps Corpus, consisting of over 19,000 hours of audio from the Apollo-11 Mission, is publicly available under the 'NASA Media Usage Guidelines'. For access to the complete 19,000-hour corpus, please fill out the form at the link: Fearless Steps Challenge Phase-01, or contact us directly at FearlessSteps@utdallas.edu.
For further questions or inquiries, please do not hesitate to contact us.
The Fearless Steps corpus is derived from a five-year NSF CISE funded project awarded to CRSS at the University of Texas at Dallas. UTDallas-CRSS established the hardware/software solutions to digitize and diarize 19,000 hours of NASA Apollo data. All core Apollo data released as part of this challenge has been approved for public release by NASA Export Control. The full audio corpus is also available through UTDallas-CRSS. Any reference to or listing of organizations other than UTD is for information only; it does not imply recommendation or endorsement by UTDallas-CRSS, nor does it imply that the products mentioned are necessarily the best available for that purpose.
All the conversations between Astronauts and Mission Control Personnel during the Apollo-11 Mission were recorded by NASA. The tireless efforts of CRSS-UTD transcribers and researchers shaped this enormous amount of data into a well-defined corpus that addresses various speech and language tasks for naturalistic audio. A portion of it is now publicly available to the speech community through this Challenge under a Creative Commons license.
Note: The Creative Commons License is restricted to the efforts made by CRSS-UTD, which involve 100 hours of Challenge Corpus (audio) data sampled at 8 kHz, along with its separately generated metadata. The license also covers all the scripts used in the preparation of the corpus, the systems built to support the tasks in this Challenge, and the webpages developed to host the Challenge.
NASA content - images, audio, video, and computer files used in the rendition of 3-dimensional models, such as texture maps and polygon data in any format - generally are not copyrighted. You may use this material for educational or informational purposes, including photo collections, textbooks, public exhibits, computer graphical simulations and Internet Web pages. This general permission extends to personal Web pages.
News outlets, schools, and text-book authors may use NASA content without needing explicit permission. NASA content used in a factual manner that does not imply endorsement may be used without needing explicit permission. NASA should be acknowledged as the source of the material. NASA occasionally uses copyrighted material by permission on its website. Those images will be marked copyright with the name of the copyright holder. NASA's use does not convey any rights to others to use the same material. Those wishing to use copyrighted material must contact the copyright holder directly.
FEARLESS STEPS CHALLENGE by Aditya Joglekar, John H.L. Hansen is licensed under a Creative Commons Attribution 4.0 International License
Based on a work at https://www.nasa.gov/mission_pages/apollo/apollo-11.html
Permissions beyond the scope of this license may be available at https://www.nasa.gov/multimedia/guidelines/index.html
For Additional Information regarding Commercial and Non-Commercial Use:
Please visit: https://www.nasa.gov/multimedia/guidelines/index.html
This project was supported in part by AFRL under contract FA8750-15-1-0205, NSF-CISE Project 1219130, and partially by the University of Texas at Dallas from the Distinguished University Chair in Telecommunications Engineering held by J.H.L. Hansen. We would also like to thank Tatiana Korelsky and the National Science Foundation (NSF) for their support on this scientific and historical project. Special thanks to Katelyn Foxworth for leading the ground-truth development efforts for the FS-02 Challenge Corpus.