SPinTX Corpus-to-Classroom:
A Teacher-Centered Pedagogical Interface for
the Spanish in Texas Corpus
Barbara E. Bullock, Almeida Jacqueline
Toribio, Rachael Gilg, Martí Quixal & Arthur
Wendorf
Who we are
• Barbara E. Bullock & Almeida Jacqueline Toribio
• Project Directors / Sociolinguistics Researchers
• Rachael Gilg
• Project Manager / Web Developer
• Arthur Wendorf
• Corpus Linguist / Developer
• Martí Quixal
• Computational Linguist / Developer
• Carl Blyth
• Director of COERLL
Agenda
• Part 1: Introduction to the Corpus-to-Classroom Project
• Part 2: Project Results
• The SpinTX Video Archive: a pedagogically-friendly interface to the
Spanish in Texas Corpus
• Involving teachers in the development of open educational
resources
• A model for open source corpus development
Corpus-to-Classroom
Corpora in the Classroom: the promise
• Corpus: a large, structured collection of language
• Benefits:
• Naturalistic language use
• Motivation
• "Real" language
• Discovery learning
• Examples:
Corpora in the Classroom: the reality
• Large linguistic corpora are of limited utility to untrained
end users.
• Designed for researchers, not educators.
• Collections such as YouTube are popular for language
classes, but can present problems
• Searching for appropriate content is time-consuming using
available search methods.
• Content is not necessarily openly licensed and can disappear
without warning.
Our two-pronged approach
Spanish in Texas Corpus Project
A project of COERLL, a National Foreign Language
Resource Center (2010-2014)
• Video interviews provide rich content
SpinTX: Corpus-to-Classroom Project
Grant from the University of Texas Longhorn
Innovation Fund for Technology (2012-2013)
• Collection of pre-selected, corrected, annotated
clips from the larger corpus
• Open-source, pedagogically-friendly search and
authoring tools
Spanish in Texas Corpus: Goals
• To make publicly available authentic data about
variation in Spanish as spoken in Texas
• for education
• for research
• Encourage teachers/students/public to view local
varieties as a resource
Corpus-to-Classroom: Goals
• develop a pedagogically friendly interface for using
the Spanish in Texas corpus
• involve teachers and learners, via crowd-sourcing,
social networking, and workshops, in the
development of open educational resources
• create a model for using open source tools and a
pedagogical interface that can be adapted for any
language corpus collection
Corpus Overview
Spanish in Texas corpus
• Approx. 92 videos of sociolinguistic interviews (avg.
30–45 min)
• Transcribed (approx. 600,000 words)
• Time-synced video caption files
• Tagged for linguistic features
SpinTX Video Archive corpus
• Approx. 327 video clips from 33 speakers (avg. 1-4
min)
• Transcribed (approx. 80,000 words)
• Time-synced video caption files
• Tagged for linguistic and pedagogical features
• Completely open (no registration required, open CC
license)
• Teacher-friendly interface
Corpus Tagging: Basic
• Time-synced captions
• Part-of-speech tags (dual language)
• POS
• POS, simplified
• Gender
• Tense
• Aspect
• Mood
• Speaker identification
• Age
• Gender
• Region
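One way to picture the basic tag set is as a single record per transcribed word. The sketch below is only an illustration under assumptions: the class names, field names, and sample values are hypothetical and are not taken from the project's data files.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Speaker:
    # Speaker identification fields listed on the slide.
    speaker_id: str
    age: Optional[int] = None
    gender: Optional[str] = None
    region: Optional[str] = None


@dataclass
class Token:
    # One transcribed word with its caption timing and basic tags.
    word: str
    start: float          # seconds from the start of the video
    end: float
    language: str         # which tagger language applied (dual-language tagging)
    pos: str              # full part-of-speech tag
    pos_simplified: str   # coarse tag
    gender: Optional[str] = None
    tense: Optional[str] = None
    aspect: Optional[str] = None
    mood: Optional[str] = None


def to_row(token: Token, speaker: Speaker) -> dict:
    """Flatten a token and its speaker into one dict, e.g. for a CSV row."""
    row = asdict(token)
    row["speaker_id"] = speaker.speaker_id
    row["speaker_age"] = speaker.age
    row["speaker_gender"] = speaker.gender
    row["speaker_region"] = speaker.region
    return row


# Made-up sample values, for illustration only.
speaker = Speaker("spk01", age=34, gender="F", region="Central Texas")
token = Token("hablaba", 12.4, 12.9, "es", "VLfin", "VERB",
              tense="past", aspect="imperfective", mood="indicative")
row = to_row(token, speaker)
```

Keeping speaker fields on every row makes it simple to filter words by age, gender, or region without a second lookup, at the cost of some redundancy.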
Corpus Tagging: Pedagogical
• Topics (manually added)
• Automatic tags using custom rulesets
• Grammatical
• aggregated from textbooks
• Pragmatics
• discourse markers, place holders (“este”), attenuators
• Vocabulary
• concept words
• Functional (planned)
• greetings, ask for help, express opinions
• Bilingual forms (planned)
• code-switching (CS), loans, loan translations
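The idea of automatic tags from custom rulesets can be sketched as a set of small predicates run over POS-tagged tokens. Everything in this sketch is an assumption made for illustration: the rule names, the heuristics (especially the one for placeholder "este"), and the tag labels are hypothetical and are not the project's actual rules.

```python
# Tags treated as "nominal" by the placeholder heuristic (illustrative labels).
NOMINAL = {"NC", "NP", "ADJ"}


def is_por_para(tokens, i):
    # Grammar rule: a preposition whose lemma is "por" or "para".
    t = tokens[i]
    return t["pos"] == "PREP" and t["lemma"] in {"por", "para"}


def is_ser_estar(tokens, i):
    # Grammar rule: any form of "ser" or "estar".
    return tokens[i]["lemma"] in {"ser", "estar"}


def is_placeholder_este(tokens, i):
    # Pragmatics rule (rough heuristic): "este" that is not followed by a
    # noun or adjective is counted as a filler rather than a demonstrative.
    if tokens[i]["word"].lower() != "este":
        return False
    next_pos = tokens[i + 1]["pos"] if i + 1 < len(tokens) else None
    return next_pos not in NOMINAL


RULESET = {
    "grammar:por_vs_para": is_por_para,
    "grammar:ser_vs_estar": is_ser_estar,
    "pragmatics:placeholder": is_placeholder_este,
}


def tag_clip(tokens):
    """Return {pedagogical tag: [token indexes where the rule fired]}."""
    found = {}
    for i in range(len(tokens)):
        for tag, rule in RULESET.items():
            if rule(tokens, i):
                found.setdefault(tag, []).append(i)
    return found


sample = [
    {"word": "Este", "pos": "DM", "lemma": "este"},
    {"word": "yo", "pos": "PPX", "lemma": "yo"},
    {"word": "trabajo", "pos": "VLfin", "lemma": "trabajar"},
    {"word": "para", "pos": "PREP", "lemma": "para"},
    {"word": "la", "pos": "ART", "lemma": "el"},
    {"word": "escuela", "pos": "NC", "lemma": "escuela"},
]
tags = tag_clip(sample)
```

Storing the token indexes, not just the tag names, would let an interface highlight the matching words in a transcript.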
Interview Metadata
Original Transcript (from Automatic Sync)
Upload Video and Transcript to YouTube
Review Transcript in Google Docs
Download SRT file
Prepare Transcript for TreeTagger
Run through TreeTagger
Combine Data from SRT File and
TreeTagger File, and add additional Tags
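The combine step can be sketched as follows, assuming an SRT caption file and tab-separated tagger output of the form word, tag, lemma, one token per line. The function names are hypothetical, and the sketch assumes the tagger's tokens line up one-to-one with whitespace-separated caption words, which real tokenization will not always guarantee.

```python
import re

TIME = re.compile(r"(\d+):(\d+):(\d+)[,.](\d+)")


def to_seconds(stamp):
    """Convert an SRT timestamp such as 00:00:03,500 to seconds."""
    h, m, s, ms = (int(x) for x in TIME.match(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(text):
    """Return a list of cues: number, start, end (seconds), and text."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # skip malformed blocks
        start, end = lines[1].split("-->")
        cues.append({"n": int(lines[0]), "start": to_seconds(start),
                     "end": to_seconds(end), "text": " ".join(lines[2:])})
    return cues


def parse_tagged(text):
    """Return (word, tag, lemma) tuples from tab-separated tagger output."""
    rows = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) == 3:
            rows.append(tuple(parts))
    return rows


def combine(cues, tagged):
    """Attach each tagged token to the caption cue it came from."""
    out, i = [], 0
    for cue in cues:
        for word in cue["text"].split():
            token, pos, lemma = tagged[i]
            if token != word:
                raise ValueError(f"misaligned at token {i}: {token!r} vs {word!r}")
            out.append({"cue": cue["n"], "start": cue["start"],
                        "end": cue["end"], "word": word,
                        "pos": pos, "lemma": lemma})
            i += 1
    return out


# Made-up sample input; the tag labels are illustrative.
SAMPLE_SRT = ("1\n00:00:01,000 --> 00:00:03,500\nhola buenos días\n\n"
              "2\n00:00:03,500 --> 00:00:05,000\nvivo en Austin\n")
SAMPLE_TAGGED = ("hola\tITJN\thola\nbuenos\tADJ\tbueno\ndías\tNC\tdía\n"
                 "vivo\tVLfin\tvivir\nen\tPREP\ten\nAustin\tNP\tAustin\n")
rows = combine(parse_srt(SAMPLE_SRT), parse_tagged(SAMPLE_TAGGED))
```

Raising an error on the first misaligned token is a deliberate choice in this sketch: silent drift between captions and tags would put every later word in the wrong cue.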
Divide CSV Files and Videos into Clips and
adjust Timings and Numberings
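Dividing the data into clips means shifting every timestamp so the clip starts at zero and renumbering the caption cues from 1. The sketch below is a minimal illustration under assumptions: the row layout and function names are hypothetical, and only cues that fall entirely inside the clip are kept.

```python
def cut_clip(rows, clip_start, clip_end):
    """Keep rows inside [clip_start, clip_end], re-timed and renumbered."""
    clip = []
    renumber = {}  # original cue number -> new cue number within the clip
    for row in rows:
        if row["start"] >= clip_start and row["end"] <= clip_end:
            new = dict(row)
            new["start"] = round(row["start"] - clip_start, 3)
            new["end"] = round(row["end"] - clip_start, 3)
            new["cue"] = renumber.setdefault(row["cue"], len(renumber) + 1)
            clip.append(new)
    return clip


def to_stamp(seconds):
    """Format seconds as an SRT timestamp, e.g. 1.5 -> 00:00:01,500."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


# Made-up sample rows covering three caption cues of a longer interview.
interview = [
    {"cue": 7, "start": 60.0, "end": 62.0, "word": "bueno"},
    {"cue": 8, "start": 62.0, "end": 65.0, "word": "mi"},
    {"cue": 8, "start": 62.0, "end": 65.0, "word": "familia"},
    {"cue": 9, "start": 65.0, "end": 70.0, "word": "vive"},
]
clip = cut_clip(interview, 62.0, 70.0)
```

The same offset would need to be applied when trimming the video itself so that captions and audio stay in sync.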
The SpinTX Video Archive: a
pedagogically-friendly interface
to the Spanish in Texas Corpus
Needs assessment: teacher interviews
• How do you use authentic video in your teaching?
• Describe searches you have done in the past for video
content. What were you looking for and were you able to
find it?
• How can you imagine using clips from the Spanish in
Texas video corpus in your classes?
Needs assessment results: primary goals
• Enable teachers to easily find videos that suit the
curriculum/work plan
• Search by grammar, theme, vocabulary, etc.
• Provide open, non-ephemeral content
• Downloadable from open site with a license enabling remixing
• Curating sets of videos for comparison and study
• Favoriting and tagging videos
• Provide access to supporting materials.
• Creating a “community of practice” around the videos so materials
can be shared among educators.
Needs assessment results: secondary goals
• Materials for teacher trainers
• Teachers of heritage learners can learn about local variation
• Video recording as a cross-competence task
• Interviews collected by students can be contributed to the corpus
Ideas for future development
• Advanced search capability
• support for wildcards
• improved phrase searching
• improved “keyword in context” result view
• Data visualizations
• word and/or tag clouds
• language maps
• Enhanced word-level annotations
• hover over a word in a transcript and see all annotations
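A wildcard search with a "keyword in context" result view could look roughly like the sketch below. It is an illustration only, not the interface's implementation: the function name and the shell-style wildcard syntax (via Python's standard fnmatch module) are assumptions.

```python
from fnmatch import fnmatchcase


def kwic(words, pattern, width=3):
    """Return (left context, match, right context) for each matching word.

    `pattern` uses shell-style wildcards: * matches any run of characters,
    ? matches a single character. Matching is case-insensitive.
    """
    hits = []
    pat = pattern.lower()
    for i, word in enumerate(words):
        if fnmatchcase(word.lower(), pat):
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            hits.append((left, word, right))
    return hits


# Made-up sample transcript line.
transcript = "yo hablo español y ella habla inglés".split()
results = kwic(transcript, "habl*", width=2)
```

With this sample, the pattern "habl*" matches both "hablo" and "habla", and each hit carries up to two words of context on either side.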
Formative evaluation of Beta version
Data collection methods:
• Online user survey
• Web analytics (navigation patterns, popular content)
• Search analytics
• User observation and feedback through ongoing
workshops and focus groups
Results will drive future development of the interface.
Involving Teachers in the
Development of OER
Workshops with Educators
• Summer 2012 Workshop
• ~100 secondary and college Spanish teachers
• Fall 2012 Working Group
• ~10 Univ. of Texas Spanish teachers
• Spring 2013 Workshops
• Multiple conferences & Univ. of Texas Spanish teachers
• Summer 2013 Working Group
• ~10 secondary and college Spanish teachers
Sample materials from the community (1)
Sample materials from the community (2)
• Idea from teacher workshop: Use videos for grammar
lessons to develop the student's metalinguistic and critical
thinking skills as they pertain to language.
• Searched and selected clips for lesson on “por vs. para”.
• Lesson tested in heritage learners class.
• Anecdotal evidence that video lessons were effective and
motivating to students.
Template development ideas
• Using video clips from the SpinTX video archive, create
an activity for classroom use (at any level).
• Focus on Topics: Familia, Idioma, Identidad
• Focus on Grammar: Por vs. Para, Gustar, Ser vs. Estar
• Four steps
• Predict: Before watching
• Observe: While watching
• Discuss: After watching
• Produce: Follow-up activity
Publication of OER
• Community-developed lesson plans will be available on
the SpinTX website by August 2013
• We encourage the publication of videos on third-party
platforms for remixing educational content, such as TedEd
(http://www.ed.ted.com)
A Model for Open Source
Corpus Development
Open source development
• Open Source Software
• TreeTagger (part-of-speech tagger)
• Drupal
• Open APIs
• YouTube Captioning API
• Google Fusion Tables API
• Custom code developed for the project
• Freely available in our GitHub repository: http://github.com/coerll
Enable sharing of content and data
• With educators:
• SpinTX interface allows embedding, downloading, & social sharing
of videos and transcripts.
• With researchers:
• Source tagged data in our GitHub repository
https://github.com/coerll/SpinTXCorpusData
• Documentation of data in our GitHub wiki
https://github.com/coerll/SpinTXCorpusData/wiki
Open content licenses
• Creative Commons provides licenses for Open
Educational Resources
• We use CC BY-NC-SA (Attribution, Non-Commercial, Share-Alike)
Open Project Documentation
• Research protocols, development processes and
methodologies, and other project documentation
publicly available:
• Corpus-to-Classroom Blog: http://sites.la.utexas.edu/corpus-to-classroom/
• “For Researchers” page on spanishintexas.org:
http://spanishintexas.org/for-researchers/
Questions
Links
• SpinTX Video Archive:
http://www.spintx.org
• Spanish in Texas Corpus:
http://www.spanishintexas.org

Editor's notes

  1. Will introduce corpora in general, our source corpus, and the pedagogical corpus
  2. Discuss examples briefly one at a time. How frequently do teachers use them? How easy are they to use? Emphasis on YouTube as probably the most popular in language classes, but hard to use.
  3. Discuss examples briefly one at a time. How frequently do teachers use them? How easy are they to use? Emphasis on YouTube as probably the most popular in language classes, but hard to use.
  4. Describe original corpus. This is similar to the other corpora we looked at earlier. Introduce SpinTX corpus and highlight differences.
  5. Will introduce corpora in general, our source corpus, and the pedagogical corpus
  6. We asked teachers how they use videos and how they would like to use videos (interviews and focus groups).
  7. We asked teachers how they use videos and how they would like to use videos. Here is how we have met their needs.
  8. We asked teachers how they use videos and how they would like to use videos. Here is how we have met their needs.
  9. (1) Anonymous user: watch intro video; show search criteria (topics, grammar, pragmatics, keywords, etc.); show video page (related items, transcripts with highlighting, sharing & downloading tabs). (2) Registered user: how to favorite and tag a video; tagged video lists.
  10. We asked teachers how they use videos and how they would like to use videos. Here is how we have met their needs.
  11. We asked teachers how they use videos and how they would like to use videos. Here is how we have met their needs.
  12. But that’s not all!
  13. This will be an ongoing process that will hopefully eventually be taken over by the users.
  14. This will be an ongoing process that will hopefully eventually be taken over by the users.
  15. This will be an ongoing process that will hopefully eventually be taken over by the users.
  16. This will be an ongoing process that will hopefully eventually be taken over by the users.
  17. This will be an ongoing process that will hopefully eventually be taken over by the users.
  18. This will be an ongoing process that will hopefully eventually be taken over by the users.
  19. 5 guidelines for developing open corpora. Will also illustrate how we have implemented each guideline.