Fearless Steps Challenge:
Phase III

Overview

All evaluation activities will be conducted using a NIST-maintained web platform shared with OpenSAT. Each participant will need to create an account on this platform to register. The account allows them to perform various activities such as registering for the evaluation, signing the data license agreement, and uploading submissions.

After registering and agreeing to the NIST FSC-P3 Terms and Conditions, participants will be able to participate in the FSC P3. This page contains step-by-step instructions for creating the evaluation account, joining a site and team, selecting tasks, and signing the relevant agreements.

Submitting an archive via the NIST dashboard

To submit an archive for scoring, navigate to the OpenSAT Series FSC P3 page on the OpenSAT site and follow the steps below:

  • To submit output of a system for scoring, log into your participant account and select Dashboard from the top right of the page
  • Navigate to the Submission Management panel and click on the task that you wish to submit to. This will open the Submissions page
  • Click Add new system
  • Select the system type
  • Select primary if you wish for the scoring results to be displayed on the leaderboard
  • Select contrastive otherwise
  • Enter a name for your system
  • Click Submit

This registers your submission with the scoring server. Next, you need to upload the archive containing your system output.

  • Locate your submission on the Submissions page. Entries are displayed in ascending order of submission date, so it will be at the very bottom.
  • Click Upload next to your submission.
  • Select the output you want to upload.
  • Click Submit.

At this point your archive will be uploaded to the NIST server and the following will occur:

  • A unique submission ID will be generated; this will be used to track your submission
  • Your submission will be validated
  • If the submission passes validation, it will be scored

  • When the server finishes scoring your submission, it will display the status DONE. To access the scoring results, click on this status.
  • If for any reason scoring failed, it will display a status beginning with FAIL. Clicking on this status will open the error log from the scoring script, which can be used to debug your submission.

Submission for each Task

Speech Activity Detection

System output for each track should be submitted as a .zip that expands into a single directory of text (.txt) files, one file per recording.
Systems should output their SAD results as text (.txt) files in a NIST-defined file format: the text files contain one speech/non-speech interval per line, each line containing nine tab-delimited fields:

Test: test definition file name (Value: X)
TestSet ID: contents of the id attribute of the TestSet tag (Value: X)
Test ID: contents of the id attribute of the TEST tag (Value: X)
Task: the literal string SAD, without quotation marks (Value: SAD)
File ID: contents of the id attribute of the File tag (Value: X)
Interval start: an offset, in seconds from the start of the audio file, for the start of the speech/non-speech interval (Value: floating-point number)
Interval end: an offset, in seconds from the start of the audio file, for the end of the speech/non-speech interval (Value: floating-point number)
Type: in system output, speech/non-speech without quotation marks (Value: speech/nonspeech); in the reference, S/NS for speech/non-speech
Confidence Score: (Optional) a value in the range 0 through 1.0, with higher values indicating greater confidence about the presence/absence of speech
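
For illustration, here is a minimal Python sketch of writing such nine-field lines. The test definition name, IDs, file names, and token spellings below are hypothetical placeholders inferred from the field descriptions above, not values from the actual FSC P3 test definition files:

# Appends one nine-field, tab-delimited SAD record per call.
# All identifiers below are hypothetical examples.
def write_sad_line(out, test, testset_id, test_id, file_id,
                   start, end, seg_type, confidence=""):
    fields = [test, testset_id, test_id, "SAD", file_id,
              f"{start:.2f}", f"{end:.2f}", seg_type, str(confidence)]
    out.write("\t".join(fields) + "\n")

# "speech"/"non-speech" token spelling is assumed from the Type description.
with open("example_recording.txt", "w") as out:
    write_sad_line(out, "fs03_sad_eval.xml", "ts01", "t01",
                   "example_recording", 0.00, 4.25, "speech", 0.93)
    write_sad_line(out, "fs03_sad_eval.xml", "ts01", "t01",
                   "example_recording", 4.25, 7.10, "non-speech", 0.88)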

Use the appropriate script to generate DCF scores for the FSC P3 Challenge SAD task

USAGE:

bash scoreFS03_SAD.sh <ref_path> <hyp_path> <out_path>

  • ref_path: Reference (Ground Truth) Directory Path
  • hyp_path: Hypothesis (System Output) Directory Path
  • out_path: File Path to write DCF Scores

An example submission packet can be found in the toolkit (link provided here)
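
Since each task's archive must expand into a single directory of output files, packaging can be done in a couple of lines of Python; the directory name below is a hypothetical placeholder for your local output directory:

import shutil

# Creates FS03_SAD_system1.zip, which expands into the single directory
# FS03_SAD_system1/ holding one output file per recording.
# "FS03_SAD_system1" is a hypothetical name, not a required convention.
shutil.make_archive("FS03_SAD_system1", "zip",
                    root_dir=".", base_dir="FS03_SAD_system1")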


Speaker Identification

System output for each track should be submitted as a .zip that expands into a single directory containing one txt file with all results, as shown in the example in the submission packet.
The SID output file should be a text file containing one test segment per line, each line containing six space-delimited fields: the test-segment name followed by the five most likely speaker-ID predictions

Test: test definition file name
Prediction 1: top system speaker-ID prediction
Prediction 2: 2nd most likely system speaker-ID prediction
Prediction 3: 3rd most likely system speaker-ID prediction
Prediction 4: 4th most likely system speaker-ID prediction
Prediction 5: 5th most likely system speaker-ID prediction
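
As a minimal Python sketch, each result line can be emitted as below; the segment name, speaker labels, and output file name are hypothetical placeholders (consult the example packet for the real naming conventions):

# Appends one result line per test segment: the segment name plus the five
# most likely speaker-ID predictions, space-delimited. Names are hypothetical.
segment = "fs03_sid_eval_0001"
top5 = ["SPEAKER_A", "SPEAKER_B", "SPEAKER_C", "SPEAKER_D", "SPEAKER_E"]

with open("FS03_SID_results.txt", "a") as out:  # single file with all results
    out.write(" ".join([segment] + top5) + "\n")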

Use the appropriate script to generate Top-3 accuracy scores for the FSC P3 Challenge SID task

USAGE:

bash scoreFS03_SID.sh <ref_path> <hyp_path> <out_path>

  • ref_path: Reference (Ground Truth) Directory Path
  • hyp_path: Hypothesis (System Output) Directory Path
  • out_path: File Path to write Top-3 Accuracy Score

An example submission packet can be found in the toolkit (link provided here)


Speaker Diarization

System output for each track should be submitted as a .zip that expands into a single directory of Rich Transcription Time Marked (RTTM) files, one RTTM file for each recording.
RTTM is a NIST-defined file format: the RTTM files are text files containing one turn per line, each line containing nine space-delimited fields:

Type: segment type; should always be “SPEAKER”
File ID: file name; basename of the recording minus extension (e.g., “FS_P01_eval_023”)
Channel ID: channel (1-indexed) that the turn is on; should always be “1”
Turn Onset: onset of the turn, in seconds from the beginning of the recording
Turn Duration: duration of the turn in seconds
Orthography Field: should always be “<NA>”
Speaker Type: should always be “<NA>”
Speaker Name: name of the speaker of the turn; should be unique within the scope of each file
Confidence Score: (Optional) system confidence (probability) that the information is correct; should always be “<NA>”
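
A minimal Python sketch of emitting RTTM turns follows; the recording and speaker names are hypothetical placeholders:

# Appends one SPEAKER record (nine space-delimited fields); the orthography,
# speaker-type, and confidence fields are fixed to <NA> per the format above.
def write_rttm_turn(out, file_id, onset, duration, speaker):
    out.write(f"SPEAKER {file_id} 1 {onset:.2f} {duration:.2f} "
              f"<NA> <NA> {speaker} <NA>\n")

with open("FS_P01_eval_023.rttm", "w") as out:  # hypothetical recording name
    write_rttm_turn(out, "FS_P01_eval_023", 14.30, 2.75, "speaker_01")
    write_rttm_turn(out, "FS_P01_eval_023", 17.05, 1.40, "speaker_02")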

Use the appropriate script to generate DER scores for the FSC P3 Challenge SD task

USAGE:

bash scoreFS03_SD.sh <ref_path> <hyp_path> <out_path>

  • ref_path: Reference (Ground Truth) Directory Path
  • hyp_path: Hypothesis (System Output) Directory Path
  • out_path: File Path to write DER Scores

An example submission packet can be found in the toolkit (link provided here)


Automatic Speech Recognition

System output for each track should be submitted as a .zip that expands into a single directory of JSON files, one JSON file for each recording.
The transcriptions are provided in JSON format, one .json file per recording. The JSON file includes the following pieces of information for each utterance:

Speaker ID Token: “speakerID”
Transcription Token: “words”
Conversational Label Token: “conv”
Start Time Token: “startTime”
End Time Token: “endTime”
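
As a rough sketch, one recording's output might be written as below. The top-level structure (a list of utterance objects) and all values are assumptions for illustration; consult the toolkit's example packet for the authoritative layout:

import json

# One utterance entry per spoken segment; every value is a hypothetical
# placeholder, and the list-of-objects layout is an assumption.
utterances = [
    {
        "speakerID": "speaker_01",
        "words": "go for landing",
        "conv": "X",
        "startTime": "14.30",
        "endTime": "16.05",
    },
]

with open("example_recording.json", "w") as out:
    json.dump(utterances, out, indent=2)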

Use the appropriate script to generate WER scores for the FSC P3 Challenge ASR task

USAGE:

bash scoreFS03_ASR.sh <ref_path> <hyp_path> <out_path>

  • ref_path: Reference (Ground Truth) Directory Path
  • hyp_path: Hypothesis (System Output) Directory Path
  • out_path: File Path to write Overall WER Score

An example submission packet can be found in the toolkit (link provided here)


Conversational Analysis

System output for each track should be submitted as a .zip that expands into a single directory of JSON files, one JSON file for each recording.
The transcriptions are provided in JSON format, one .json file per recording. The JSON file includes the following pieces of information for each utterance:

Speaker ID Token: “speakerID”
Transcription Token: “words”
Conversational Label Token: “conv”
Start Time Token: “startTime”
End Time Token: “endTime”
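
Before packaging, it can help to verify that every utterance entry carries all five tokens. A minimal sketch, assuming the same list-of-objects layout sketched in the ASR section and a hypothetical file name:

import json

REQUIRED = {"speakerID", "words", "conv", "startTime", "endTime"}

# Reports any utterance entry that is missing one of the expected tokens.
with open("example_recording.json") as f:
    for i, utt in enumerate(json.load(f)):
        missing = REQUIRED - set(utt)
        if missing:
            print(f"utterance {i} is missing tokens: {sorted(missing)}")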

Use the appropriate script to generate Top-3 accuracy for the FSC P3 Challenge Conversational Analysis task

USAGE:

bash scoreFS03_Conv.sh <ref_path> <hyp_path> <out_path>

  • ref_path: Reference (Ground Truth) Directory Path
  • hyp_path: Hypothesis (System Output) Directory Path
  • out_path: File Path to write Top-3 Accuracy

Evaluation Rules

  1. Site registration will be required in order to participate
  2. Researchers who register but do not submit a system to the Challenge are considered withdrawn from the Challenge
  3. Researchers may use any audio and transcriptions to build their systems with the exception of data mentioned in the Evaluation plan
  4. Only the audio for the blind eval set (20 hours) will be released. Researchers are expected to run their systems on the blind eval set.
  5. Investigation of the evaluation data prior to submission of all system outputs is not allowed. Human probing is prohibited.

All Challenge participants are required to submit a conference paper describing their systems (and reporting performance on the Dev and Eval sets) to the “FEARLESS STEPS CHALLENGE PHASE-3” special session at ISCA INTERSPEECH-2021.

Evaluation Protocol

  • The entire Fearless Steps Corpus (consisting of over 11,000 hours of audio from the Apollo-11 mission), including the 100 hours of Challenge data, is publicly available and requires no additional license to use.
  • There is no cost to participate in the Fearless Steps evaluation. Development data and evaluation data will be freely made available to registered participants.
  • At least one participant from each team must register for the Fearless Steps Challenge 2021.
  • System output submissions will be sent to the official Fearless Steps correspondence email-id.
  • Participants can submit at most 2 system submissions per day.
  • Results of submitted systems will be mailed to the registered email-id within a week of the submission.
  • Participants are required to agree to process the data in accordance with the rules above.