DISCUS Project Overview
1. the DISCUS project & SEASR
Xavier Llorà (1,2), David E. Goldberg (1) & Michael Welge (2)
(1) Illinois Genetic Algorithms Lab, Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign
(2) Data-Intensive Technologies and Applications, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
2. The Vision
• Computers have become mediators of collaborations
– Email, chat rooms, blogs, wikis…
– A flood of available information
– Different modes of communication
• Let’s take advantage of such information
– Logs of conversations
– Archive of documents (email attachments, blogs, personal web pages…)
– Human-computer interactions
– Social aspect of the communication and collaboration
– Needs to work for multiple languages
3. The Project
• DISCUS started in 2003 as an IlliGAL & NCSA collaboration
• Supports innovation and creativity:
DISCUS: Distributed Innovation and Scalable Collaboration in Uncertain Settings
• Basic research components
– Competent genetic algorithms (HBGA, iGA)
– Advance chance discovery components
– Adapt and expand the analysis of social interaction
– Efficient data mining techniques for conversations
– Develop a social network analysis for creativity and innovation processes
the DISCUS project (May 2007) Xavier Llorà 3
4. The Project
• Technology development
– Infrastructure to support creativity and innovation processes
– Reusable repositories of analytic components
– Standardize heterogeneous data storage to boost interoperability
– Create hooks for non-intrusive usage and deployment
– Rapid adaptation cycle to new technologies
5. Research and Commercial Partners
• Some research partners along the way
– University of Illinois (IlliGAL, NCSA & CEE)
– University of Osaka
– University of Tokyo (School of Management, School of Engineering)
– University of Kyushu
• Commercial partners
– Hakuhodo Inc and HOW
– Mazda
– Toyota
6. The Research Picture
Analysis
Data mining
Social networks
Content
Knowledge management
Social aspects
14. CSPAN
• CSPAN digital library
– Videos
– Transcripts
– Annotations
• Example of real-time analysis
• Crawling and results
15. Some Facts
• Number of documents: 110,234
• Number of persons: 78,915
• Total number of sentences: 252,132
• Total number of words: 2,034,209
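The counts above can be turned into rough per-unit averages. The following is only an illustrative sketch: the function name `corpus_averages` and the choice of ratios are assumptions made here, not part of the DISCUS tooling, and the averages say nothing about how the counts were originally produced.

```python
# Corpus counts as reported on the "Some Facts" slide.
DOCUMENTS = 110_234
SENTENCES = 252_132
WORDS = 2_034_209


def corpus_averages(documents, sentences, words):
    """Return simple per-unit averages derived from the raw counts.

    Hypothetical helper for illustration; not taken from the project.
    """
    return {
        "sentences_per_document": sentences / documents,
        "words_per_sentence": words / sentences,
        "words_per_document": words / documents,
    }


averages = corpus_averages(DOCUMENTS, SENTENCES, WORDS)
for name, value in averages.items():
    # Two decimal places are enough for a rough sense of scale.
    print(f"{name}: {value:.2f}")
```

By these figures a document holds roughly two sentences and a sentence roughly eight words.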
16. Documents per Year
[Figure: number of documents per year, 1940 to 2000; the "Number of documents" axis is logarithmic, with ticks from 1 to 5000]
17. Number of words
[Figure: words per year, 1940 to 2000; the word-count axis is logarithmic, with ticks from 1e+01 to 1e+05]