NIST Open Speech Analytic Technologies 2019 (OpenSAT19)


Leaderboard
The leaderboard is not available yet
Contact Us

For any information about the OpenSAT Evaluation (data, evaluation code, etc.) please email: opensat_poc@nist.gov
Welcome to the NIST Open Speech Analytic Technologies 2019 (OpenSAT19) Evaluation
Summary
OpenSAT19 is the first speech analytics evaluation in the OpenSAT Series, following the OpenSAT Pilot in 2017. OpenSAT19 gives developers an opportunity to apply their systems to multiple tasks and multiple data domains. Participants can choose from one to all tasks and one to all data domains. The evaluation also lets participants compare their system's performance against that of the pool of systems for each task and data domain. Tasks and data domains are shown below.

Download the .pdf OpenSAT 2019 Evaluation Plan for details about the evaluation.
This evaluation model has proven to be effective in advancing the state of the art in speech analytic systems where system performance can be tracked year-to-year. OpenSAT19 is also designed to encourage cross-learning among developers focused on different speech analytic tasks and/or by applying their systems to multiple domains.

The new dataset for the public safety communications domain is of special interest and is supported by the Department of Homeland Security (DHS) for advancing assistive technologies in the public safety community.

Tasks and Domains In OpenSAT19

System Tasks
  • Automatic Speech Recognition (ASR)
  • Speech Activity Detection (SAD)
  • Keyword Search (KWS)

Data Domains
  • Public safety communications
  • Speech from amateur online videos
  • IARPA Babel low resource language

Participation in the public safety communications domain is highly encouraged. This domain supports the public safety community such as first responders in emergency events. The public safety communications dataset includes simulated first responder communications with the Lombard effect in speech and moments in speech with expressions of a sense of urgency. It also includes loud background sounds.

Who Can Participate?
NIST invites all organizations, including universities, government institutions, and corporations, to submit their results using their technologies to the OpenSAT evaluation server. The evaluation is open worldwide. Participation is free. NIST does not provide funds to participants.

How Do I Participate?
Each participant must create an account on this web platform. After creating an account, each participant will either create a new Team or join an existing Team. After registering and having an LDC data license agreement approved, participants will be able to participate in the OpenSAT19 Evaluation. Most of the data will be accessed from LDC and some of the data from this site. Participants will submit tar.gz files of their system’s output to the NIST OpenSAT scoring server using this web site. Go to the Register tab for registration instructions and to register for OpenSAT19.

Schedule

2019    Milestone
28MAR   Beginning of the registration period
29MAR   Release of development data for development/training period
17JUN   Evaluation data released for evaluation period
2JUL    Last date to upload system output from evaluation data to NIST server
15JUL   System output results and evaluation reference data released
20AUG   NIST workshop (Date may change and may be one day instead of two)


Registration Instructions
If you already have an account, log in here or at the top of the page. To create an account and register, follow the steps below.
To Create an Account:
1- Click the "Sign up" button at the top of the page.
2- Provide a valid e-mail address and create a password in the “Sign up” form. (After you click “Sign up” on the “OpenSAT19 Sign up” online form, a confirmation e-mail will be sent to that address.)
3- Click “Confirm my account” in the e-mail sent to you. (A log-in page will display with your email address and created password already entered.)
4- Click “Log in”. (A dashboard for your account will display with Registration Steps.)
5- Complete the steps in the dashboard to finish creating your account.
6- Registration is completed when steps 1-5 are completed.
When you are notified by email from LDC that your License Agreement is approved, you can then access the data.
Creating a Team or Joining a Site and Team
When joining OpenSAT, a participant can create a Site, join an existing Site and create a Team, or join an existing Team. A participant can be a member of multiple teams.
Each participant, Site, and Team will have its own password. The creators of the Site and the Team create their respective passwords.
The NIST Agreement
Check the “I acknowledge that I have read and accepted the OpenSAT19 Terms and Conditions” box and then click the “Update the License Agreement” button.
The Data License Agreement
The Site creator is required to agree to the LDC terms in order to access data for that Site. Read the LDC license agreement and accept the terms by uploading the signed license agreement form. Participants cannot download data until LDC approves the uploaded signed license agreement.
The Dashboard
The dashboard is the personalized page for each participant. To access the dashboard at any time, click "Dashboard" at the top right of the screen. This is where you can make submissions and view results.
System Output Submission Instructions
Each submission must be associated with a Site, Team and Task.
Multiple systems may be created for each Task with a submission for each system.
Submit system output for validation checking or scoring following these steps:
1- Prepare for Submission
  • System output must be in the format described in the Evaluation Plan for the task that was performed (SAD, KWS, or ASR).
  • Have the .tgz or .zip file ready per Appendix IV in the OpenSAT19 Evaluation Plan and also shown below these steps.
2- Go to Dashboard. (Select "Dashboard" on the top right of the page.)
  • In "Submission Management", click the Task that represents the system output.
  • Click "Create New System" located at the upper right of the dashboard if you wish to create an additional contrastive or single-best System. A “New System” page will display.
    • Select contrastive or singlebest from the dropdown menu.
    • Enter a system identifier in “Name”.
    • Click “Submit”. The “Submission Management” page will display.
  • On the “Submission Management” page, click “Upload” in either development or evaluation slot for your respective system. The “New Submission” page displays.
    • Select the Data Domain from the drop-down menu.
    • Click the "Browse" button. Choose the .tgz or .zip file to upload.
    • Click "Submit".
    • A submission ID is automatically created for the submission.
    • The “Scoring” button on the “Submissions” page displays “submitted” until the scoring server completes scoring and then it changes to “Done”.
    • When “Done” is displayed, click the “Scoring” button for a Scoring Run Report.
    • Click “View Submission” to see Submission information.
3- View Submission Results
  • To see a Scoring Run Report, click the “Scoring” button after “submitted” changes to “Done” on the button.
  • To see information about the submission, click the “View Submission” button.
Below is Appendix IV from the OpenSAT19 Evaluation Plan: SAD, KWS, and ASR - System Output Submission Packaging
  • Each submission shall be an archive file named as "SysLabel".tgz or "SysLabel".zip.
  • Submit a separate .tgz or .zip file for each system output (e.g., a separate .tgz or .zip file for Primary, Contrastive1, and Contrastive2 systems).
  • "SysLabel" shall be an alphanumeric [a-zA-Z0-9], performer-assigned identifier for the submission.
  • There should be no parent directory when the submission file is untarred. The tar command should be: > tar MySystemSubmissionFile.tgz or > tar MySystemSubmissionFile.zip respectively.
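The packaging rules above can be sketched in Python. This is a minimal illustration, not a NIST-provided tool: the function name `package_submission` and its parameters are hypothetical, and it assumes the system output files already sit together in one local directory. It checks that the label is alphanumeric and writes every entry at the root of the archive, so nothing is wrapped in a parent directory.

```python
import re
import tarfile
from pathlib import Path

# SysLabel must be alphanumeric, per the packaging rules above.
SYS_LABEL_RE = re.compile(r"[a-zA-Z0-9]+")


def package_submission(output_dir, sys_label, dest_dir="."):
    """Pack the contents of output_dir into <sys_label>.tgz (hypothetical helper).

    Entries are stored under their bare names, so the archive has no
    parent directory when it is unpacked.
    """
    if not SYS_LABEL_RE.fullmatch(sys_label):
        raise ValueError("SysLabel must be alphanumeric [a-zA-Z0-9]")

    output_dir = Path(output_dir)
    archive_path = Path(dest_dir) / f"{sys_label}.tgz"

    with tarfile.open(archive_path, "w:gz") as tar:
        for entry in sorted(output_dir.iterdir()):
            # Do not pack the archive into itself if it is written
            # into the same directory as the system output.
            if entry.resolve() == archive_path.resolve():
                continue
            # arcname drops the leading directory from the stored path.
            tar.add(entry, arcname=entry.name)

    return archive_path
```

Listing the archive members afterwards (for example with `tarfile.open(path).getnames()`) is a quick way to confirm that no member name starts with a directory prefix before uploading.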
Prior to uploading the submission file to the NIST scoring server, performers will be asked for information about the submission. The scoring server will attach the following information to the submission filename to categorize and uniquely identify the submission:
Field               Information               Method
TeamID              [Team]                    obtained from login information
Task                {SAD | ASR | KWS}         select from drop-down menu
SubmissionType      {primary | contrastive}   select from drop-down menu
Training Condition  {unconstrained}           default - hard-coded
EvalPeriod          {2019}                    default - hard-coded
DatasetName         {PSC | VAST | Babel}      select from drop-down menu
Date                {YYYYMMDD}                obtained from NIST scoring server at submission date
TimeStamp           {HHMMSS}                  obtained from NIST scoring server at submission time

Below is an example of a resulting filename:
NIST_ASR_primary_unconstrained_2019_PSC_20190415_163026_MySystemSubmissionFile.tgz
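To make the naming scheme concrete, the sketch below joins the fields in the order they appear in the example filename. It is only an illustration of the pattern: the scoring server builds the name itself, its actual implementation is not described here, and the function name `server_filename` and its parameters are hypothetical.

```python
from datetime import datetime

# Allowed values taken from the field list above.
TASKS = {"SAD", "ASR", "KWS"}
SUBMISSION_TYPES = {"primary", "contrastive"}
DATASETS = {"PSC", "VAST", "Babel"}


def server_filename(team_id, task, submission_type, dataset, uploaded_name, when):
    """Join the submission fields with underscores (hypothetical helper)."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    if submission_type not in SUBMISSION_TYPES:
        raise ValueError(f"unknown submission type: {submission_type}")
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")

    fields = [
        team_id,
        task,
        submission_type,
        "unconstrained",          # Training Condition, hard-coded default
        "2019",                   # EvalPeriod, hard-coded default
        dataset,
        when.strftime("%Y%m%d"),  # Date
        when.strftime("%H%M%S"),  # TimeStamp
        uploaded_name,            # the performer's archive name
    ]
    return "_".join(fields)
```

For instance, `server_filename("NIST", "ASR", "primary", "PSC", "MySystemSubmissionFile.tgz", datetime(2019, 4, 15, 16, 30, 26))` yields a name with the same field order as the example above.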

The NIST scoring server will run a validation check on each system output submission to confirm that it conforms to the submission format required for the task.
Submission of a system description conforming to the system description guidelines in Appendix V is required before receiving the system’s score and ranking results in the Evaluation phase.
Scoring and Validation tools: F4DE and SCTK
Here are the GitHub links for SCTK and F4DE:
SCTK: https://github.com/usnistgov/SCTK for Automatic Speech Recognition (ASR)
F4DE: https://github.com/usnistgov/F4DE for Keyword Search (KWS)

For Speech Activity Detection (SAD), the validation and scoring tools are available in the Dashboard after registration and data license agreement approval.
FAQ


Questions? Email questions to opensat_poc@nist.gov