Bringing Authentic Spanish Videos into the Classroom

This weekend COERLL attended the Texas Foreign Language Association Fall 2013 Conference. Our presentation on SpinTX featured two excellent Spanish teachers/curriculum developers: Tina Dong (Instructional Coordinator, World Languages at Austin ISD), and Jared Abels (Secondary Spanish Instructor, Round Rock Christian Academy). Each of them shared some great ideas for incorporating SpinTX videos in the classroom. Check out the presentation, below:

SpinTX Project Featured in COERLL Summer Webinar Series

In June 2013, the SpinTX project was the subject of a professional development webinar offered by COERLL.

From the webinar description:

In the final installment of COERLL’s summer webinar series, we’ll unpack one of our most recent projects, SpinTX. SpinTX is a video archive that provides access to selected video clips and transcripts from the Spanish in Texas Corpus, a collection of video interviews with bilingual Spanish speakers in Texas. We will hear from project members who will show you how to use SpinTX to search and tag the videos for features that match your interests, and create and share your favorite playlists.

Preparing to conduct and film an interview

Post by Scott Zuniga, video production consultant for the Spanish in Texas Project

A great interview can be an excellent source of research, especially when the interview includes both audio and video. This article discusses the steps we took for planning, conducting and filming interviews for the Spanish in Texas video archives.

Step 1: Start with a Plan

This might seem obvious, but often, amateur interviewers fail to prepare for an interview beyond writing a list of questions that they think will draw interesting answers from their subject. It is your job as the director to think through all possible scenarios and to prepare to conduct your interview in a way that will help you achieve your objectives. Think about the Who, What, Where, When, Why and How of each interview that you will be conducting and write them down on an interview plan template.

Step 2: Know Your Objectives

Chances are, you will be conducting an interview to fulfill a specific purpose, or “main objective.” Identify this objective, write it down on your interview plan, and communicate it to your team so that everyone is clear about what to do when obstacles arise. Having a clear objective will help you make the right creative and technical decisions throughout the interview process. Think about your next interview or an interview you have done in the past and answer the following questions:

  1. What is the purpose of my interview? (e.g., to get audio samples for research, to produce authentic video for language learning, etc.)
  2. What audience is my interview intended for?
  3. Where will my interview be shown?
  4. What should I do to prepare for my interview?
  5. What should I do to prepare my subject for the interview?
  6. How should my interview sound? (Be as specific as possible)
  7. What should my interview look like? (Draw a storyboard of what you want your shot to look like – stick figures are ok).

The main objective for the Spanish in Texas video archive was to create a database of videos with samples of Spanish spoken in Texas for use by educators and linguists. Knowing this helped student interviewers work to get the best sound quality possible. This is just one example of how knowing your main objective can help.

Step 3: Prepare for the Interview

Now that you’ve made a plan, contacted interviewees and prepared your questions, it’s time to prepare to conduct the actual interview. It’s good to practice with a friend, or at least to ask the questions out loud to yourself and try to anticipate the answers your interviewees will give. This will help you prepare follow-up questions.

Good follow-up questions are essential to conducting a fluid interview that allows the interviewee to give thoughtful answers. Here are a few more tips that we used to prepare for our Spanish in Texas interviews. Hopefully they will help you too:

Prepare yourself for the interview

  • Memorize your questions to avoid looking at notes during the interview. Asking your questions in their written order might not always be best, depending on what your interviewee is saying.
  • Make copies of consent forms and questionnaires.

 Prepare your subject for the interview

  • Ask for their permission. It is good to have your subject fill out consent forms and questionnaires beforehand so you don’t risk forgetting this important step.
  • Explain to your subject the purpose and intent of your project.
  • Tell them what to wear. No stripes, necklaces or noisy earrings. Don’t worry, you’re the director. The more confident and professional you come across, the more your interviewee will respect you and give you a good interview.
  • Ask the interviewee to reserve a quiet room and tell them politely that the sound is very important.
  • Verify the appointment with the interviewee the night before.

Prepare your equipment for the interview

  • Charge batteries
  • Do a test run
  • Review your equipment checklist (batteries charged, memory cards cleared, spare batteries for microphones, headphones, tripod).

Following these steps will help you prepare for a successful interview. The more time you spend planning for the interview and anticipating what might happen, the more confidence you will have during the actual interview. Many unexpected things can happen during an interview, but if you are well prepared, you will be able to avoid mistakes, make the right creative choices and capture the best interview you can.

5 Ways to Open Up Corpora for Language Learning

Note: The following post was originally published on COERLL’s Open Up blog.

Corpora developed by linguists to study languages are a promising source of authentic materials to employ in the development of OER for language learning. Recently, COERLL’s SpinTX Corpus-to-Classroom project launched a new open resource that seeks to make it easy to search and adapt materials from a video corpus.

The SpinTX video archive provides a pedagogically-friendly web interface to search hundreds of videos from the Spanish in Texas Corpus. Each of the videos is accompanied by synchronized closed captions and a transcript that has been annotated with thematic, grammatical, functional and metalinguistic information. Educators using the site can also tag videos for features that match their interests, and share favorite videos in playlists.

A collaboration among educators, professional linguists, and technologists, the SpinTX project leverages different aspects of the “openness” movement, including open research, open data, open source software, and open education. It is our hope that by opening up this corpus, and by sharing the strategies and tools we used to develop it, others may be able to replicate and build on our work in other contexts.

So, how do we make a corpus open and beneficial across communities? Here are 5 ways:

1. Create an open and accessible search interface

Minimize barriers to your content. Searching the SpinTX video archive requires no registration, passwords or fees. To maximize accessibility, think about your audience’s context and needs. The SpinTX video archive offers a corpus interface specifically for educators, and the team plans to create a different interface for researchers.

2. Use open content licenses

Add a Creative Commons license to your corpus materials. The SpinTX video archive uses a CC BY-NC-SA license that requires attribution but allows others to reuse the materials in different contexts.

3. Make your data open and share content

Allow others to easily embed or download your content and data. The SpinTX video archive provides social sharing buttons for each video, as well as providing access to the source data (tagged transcripts) through Google Fusion Tables.

4. Embrace open source development

When possible, use and build upon open source tools. The SpinTX project was developed using a combination of open source software (e.g. TreeTagger, Drupal) and open APIs (e.g. YouTube Captioning API). Custom code developed for the project is openly shared through a GitHub repository.

5. Make project documentation open

Make it easy for others to replicate and build on your work. The SpinTX team is publishing its research protocols, development processes and methodologies, and other project documentation on the SpinTX Corpus-to-Classroom blog.

Openly sharing language corpora may have wide-ranging benefits for diverse communities of researchers, educators, language learners, and the public interest. The SpinTX team is interested in starting a conversation across these communities. Have you ever used a corpus before? What did you use it for? If you have never used a corpus, how do you find and use authentic videos in the classroom? How can we make video corpora more accessible and useful for teachers and learners?

Automated captioning of Spanish language videos

By the end of the summer, we expect the Spanish in Texas corpus will include 100 videos with a total running time of more than 50 hours. Fortunately, there is a range of services and tools to expedite the process of transcribing and captioning all those hours of video.

YouTube began offering automated captioning for videos a few years ago. Using Google’s voice recognition technology, a transcript is automatically generated for any video in one of the supported languages. As of today those languages include English, Japanese, Korean, Spanish, German, Italian, French, Portuguese, Russian and Dutch. The result of the automated transcription is still very much inferior to human transcription and is not usable for our purposes. However, YouTube also allows the option of uploading your own transcript as the basis for generating the synchronized captions. When a transcript is provided, the syncing process is very effective at creating accurate closed captions synchronized to a video. In addition, YouTube offers a Captioning API, which allows programmers to access the caption syncing service from within other applications.

Automatic Sync Technologies is a commercial provider of human transcription services as well as a technology for automatically syncing transcripts with media to produce closed captions in a variety of formats. Automatic Sync recently expanded their service to include Spanish as well as mixed Spanish/English content. An advantage of using their service is that they have the ability to create custom output formats (requires a one-time fee). For instance, we worked with them to create a custom output file that included the start and end time for each word in the transcript and was formatted as a tab-delimited text file.
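To give a sense of what can be done with a word-level timing file like this, here is a minimal Python sketch. The column order (word, start time, end time, with times in seconds) is an assumption made for illustration, as are the names `WordTiming`, `parse_word_timings` and `first_occurrence`; the actual custom format may differ.

```python
import csv
import io
from typing import List, NamedTuple, Optional


class WordTiming(NamedTuple):
    word: str
    start: float  # seconds from the beginning of the video (assumed unit)
    end: float


def parse_word_timings(text: str) -> List[WordTiming]:
    """Parse lines of the assumed form: word<TAB>start<TAB>end."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    timings = []
    for row in reader:
        if len(row) != 3:
            continue  # skip blank or malformed lines
        word, start, end = row
        timings.append(WordTiming(word, float(start), float(end)))
    return timings


def first_occurrence(timings: List[WordTiming], word: str) -> Optional[float]:
    """Return the start time of the first case-insensitive match, or None."""
    target = word.casefold()
    for timing in timings:
        if timing.word.casefold() == target:
            return timing.start
    return None


# Hypothetical sample data, not taken from the corpus.
SAMPLE_TIMINGS = "hola\t0.50\t0.82\nme\t0.90\t1.01\nllamo\t1.01\t1.40\n"

if __name__ == "__main__":
    parsed = parse_word_timings(SAMPLE_TIMINGS)
    print(first_occurrence(parsed, "Llamo"))
```

With timings at the word level, a search tool can point to the exact moment a word is spoken rather than only to the caption that contains it.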

There are also online platforms for manually transcribing and captioning videos in a user-friendly web interface. DotSub leverages a crowd-sourcing model for creating subtitles and then translating the subtitles into many different languages. Another option in this category is Universal Subtitles, which is the platform used to subtitle and translate the popular TED Video series. These can be a good option if resources aren’t available to hire transcribers and/or translators.

While developing the SPinTX corpus we have used all of the solutions mentioned above, but we have now settled on a standard process that works best for us. First, we pay a transcription service to transcribe the video files in mixed Spanish / English and provide us with a plain text file, at a cost of approximately $70 per hour of video. Then, we use the YouTube API to sync the transcripts with the videos and retrieve a caption file. This process works for us because our transcripts often need a lot of revisions, and we can sync as many times as we need at no cost. The caption file is then integrated into our annotation process, so when users get search results they can jump directly to the place in the video where the matching text occurs. In a later post, we will go into more detail about how we are implementing the free YouTube API and how you can adapt this process for your own video content!
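As a rough illustration of how a caption file lets a search result point to a moment in a video, here is a short Python sketch. It assumes the captions are in the common SubRip (SRT) layout; the post does not say which format we retrieve, and the function names and sample captions below are hypothetical. It is not the project's actual code.

```python
import re
from typing import List, Optional, Tuple

# Matches an SRT timing line such as "00:01:15,200 --> 00:01:18,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)

Cue = Tuple[float, float, str]  # (start seconds, end seconds, caption text)


def to_seconds(hours: str, minutes: str, seconds: str, millis: str) -> float:
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(text: str) -> List[Cue]:
    """Split SRT text into cues; blocks without a timing line are ignored."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for index, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                groups = match.groups()
                caption = " ".join(part.strip() for part in lines[index + 1:])
                cues.append((to_seconds(*groups[:4]), to_seconds(*groups[4:]), caption))
                break
    return cues


def find_start(cues: List[Cue], term: str) -> Optional[float]:
    """Return the start time of the first cue containing the term, or None."""
    needle = term.casefold()
    for start, _end, caption in cues:
        if needle in caption.casefold():
            return start
    return None


# Hypothetical captions, not taken from the corpus.
SAMPLE_SRT = """1
00:00:00,500 --> 00:00:03,000
Hola, me llamo Ana.

2
00:01:15,200 --> 00:01:18,000
Vivo en San Antonio, Texas.
"""

if __name__ == "__main__":
    print(find_start(parse_srt(SAMPLE_SRT), "san antonio"))
```

The number returned is an offset in seconds, which a video player can use as the starting point for playback.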

LIFT off!

This blog will chronicle the development of the SPinTX Corpus, and our work to bring a pedagogically useful corpus of authentic Spanish and bilingual Spanish-English speech samples into language classrooms across Texas. The Spanish in Texas (SPinTX) Project was selected to receive funding from the Longhorn Innovation Fund for Technology (LIFT) for the grant period September 1, 2012 – August 31, 2013. Development of the Corpus began in 2010 and is ongoing under the auspices of the Title VI Center for Open Educational Resources and Language Learning (COERLL).

The focus of the project over the next year will be to help educators exploit the SPinTX corpus to customize materials for the teaching of Spanish at all educational levels. The aims of the project are:

  • to develop a pedagogically friendly interface for the corpus;
  • to involve teachers and learners, via crowd-sourcing, social networking, and workshops, in the development of open educational resources (OER); and
  • to develop a model for using open source tools and a pedagogical interface that can be adapted for any language corpus.

In the spirit of openness, we will be sharing and discussing what we learn and create throughout the project. We invite you to join with us as we explore new tools and methods for integrating authentic content and open data into the language classroom!