Marsyas
1. An open source platform for
multimedia processing
http://marsyas.sourceforge.net
Nov 2010
Saturday, 20 November, 2010
2. Notice
This work is licensed under the Creative Commons Attribution-Share Alike
2.5 Portugal License. To view a copy of this license, visit
http://creativecommons.org/licenses/by-sa/2.5/pt/
or send a letter to
Creative Commons, 171 Second Street, Suite 300, San Francisco,
California, 94105, USA.
4. Marsyas Overview
Software framework for media analysis, synthesis and retrieval
Open source (GPL license) http://www.fsf.org
Efficient and extensible framework design
Original emphasis on Music Information Retrieval (MIR); now multimedia!
C++, OOP
Multiplatform (Linux, MacOSX®, MS Windows®, …)
Provides a variety of building blocks for performing common audio tasks:
sound/video file IO, audio/video IO, signal processing and machine learning modules
blocks can be combined into data flow networks that can be modified and controlled
dynamically while they process data in soft real-time.
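As a rough illustration of the building-block idea, the following self-contained C++ sketch chains processing blocks into a series network and changes a parameter between two processing ticks. It does not use the Marsyas API: `Block`, `Gain`, `Offset`, `Series` and `demoNetwork` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for framework building blocks; NOT the Marsyas API.
using Buffer = std::vector<double>;

struct Block {
    virtual ~Block() = default;
    // Transform one input buffer into one output buffer.
    virtual Buffer process(const Buffer& in) = 0;
};

// Scales every sample; the gain may be changed while the network is in use.
struct Gain : Block {
    double gain = 1.0;
    Buffer process(const Buffer& in) override {
        Buffer out(in);
        for (double& x : out) x *= gain;
        return out;
    }
};

// Adds a constant to every sample.
struct Offset : Block {
    double offset = 0.0;
    Buffer process(const Buffer& in) override {
        Buffer out(in);
        for (double& x : out) x += offset;
        return out;
    }
};

// Composite block: the output of each child feeds the next child.
struct Series : Block {
    std::vector<std::shared_ptr<Block>> children;
    Buffer process(const Buffer& in) override {
        Buffer data(in);
        for (auto& child : children) {
            assert(child != nullptr);
            data = child->process(data);
        }
        return data;
    }
};

// Build a two-block network, run one tick, retune it, run another tick.
Buffer demoNetwork() {
    auto gain = std::make_shared<Gain>();
    auto offset = std::make_shared<Offset>();
    gain->gain = 2.0;
    offset->offset = 1.0;

    Series net;
    net.children.push_back(gain);
    net.children.push_back(offset);

    Buffer first = net.process({1.0, 2.0});   // (x * 2) + 1
    gain->gain = 0.5;                         // modify the network between ticks
    Buffer second = net.process({1.0, 2.0});  // (x * 0.5) + 1

    first.insert(first.end(), second.begin(), second.end());
    return first;
}
```

Because `Series` is itself a `Block`, networks can be nested inside other networks, which is the property that makes this kind of composition scale.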
[Figure: example data flow networks built from blocks such as WAV source, Hanning, FFT, LPC, KNN and GMM]
5. Marsyas Overview
Marsyas Brief History
1998 ~2000
Created by George Tzanetakis during his PhD activities at Princeton
2000 ~2002
Marsyas 0.1
First stable revisions of the toolkit
Distributions hosted at SourceForge
Creation of a developer community
User and Developer Mailing lists
2002 ~ …
Marsyas 0.2
Major framework revision
SourceForge SubVersion
http://marsyas.sourceforge.net/
http://www.cs.princeton.edu/~gtzan
6. Marsyas Overview
Marsyas Core Developers (Present and past...) http://marsyas.info/community/people
George Tzanetakis
Main culprit (also main designer and developer)
Luís Gustavo Martins
Refactoring freak
Graham Percival
Documentation overlord
Luís Filipe Teixeira
Marsyas X motor-head
Mathieu Lagrange
Biggest network award
Steven Ness
Ruby and Web guy
Check updated list at the website!
7. Marsyas Community
https://sourceforge.net/project/stats/graph/detail-graph.php?group_id=84982&ugn=marsyas&type=prdownload&mode=alltime&file_id=0&graph=1
13. Users & Applications
Musicream (Masataka Goto) http://staff.aist.go.jp/m.goto/Musicream/
Music playback system with similarity capabilities
Uses Marsyas as its music similarity engine
15. Users & Applications
Marsyas @ INESC Porto http://www.inescporto.pt
Audio Analysis Software prototypes:
Feature Extraction
Audio segmentation/classification
Audio fingerprinting
Speaker Segmentation
Music and Auditory Scene Analysis
Video Analysis Software prototypes:
Background modeling and subtraction
Local feature extraction for object matching
http://www.inescporto.pt/~lfpt
http://www.inescporto.pt/~lmartins
16. Users & Applications
Desert Island
Undergraduate work at the Univ. Missouri Kansas
Jared Hoberock, Dan Kelly, Ben Tietgen
17. Related Work
Open Source frameworks
CLAM (http://clam.iua.upf.edu/)
STK (http://ccrma.stanford.edu/software/stk/)
Chuck (http://chuck.cs.princeton.edu/)
PureData (Pd) (http://crca.ucsd.edu/~msp/software.html)
Open Sound Control (OSC) (http://cnmat.berkeley.edu/OpenSoundControl/)
FAUST (http://faudiostream.sourceforge.net/)
EyesWeb (http://www.infomus.dist.unige.it/EywMain.html)
...
Commercial toolkits
MAX/MSP® (http://www.cycling74.com/)
MATLAB® Simulink® (http://www.mathworks.com/products/simulink/)
LabView® (http://www.ni.com/labview/)
DirectShow® GraphEdit (http://www.microsoft.com)
...
18. Usage Scenarios
Marsyas command line tools
Demonstrate key capabilities of the framework
Some are actually research tools
Efficient and can execute in real-time
ANSI C++ only core
several optional libraries
MP3 reading/writing (libMad)
Tools and examples:
sfplay
bextract
phasevocoder
sfplugin
…
19. Usage Scenarios
Playing audio files
sfplay
> sfplay foo.wav
> sfplay -s 10.0 -l 3.2 -r 2.5 -g 0.5 foo1.wav foo2.au -f output.wav
> sfplay -l 3.0 foo.wav
> sfplay foo.wav -p playback.mpl
-s : start time for playback
-l : length of playback
-r : repeat times
-g : volume (gain) value
-p : playback.mpl (save playback network as a .mpl plugin file)
20. Usage Scenarios
Machine Learning: feature extraction and training of classifiers
bextract
> bextract -e STFTMFCC music.mf speech.mf -p ms.mpl -w myweka.arff
-e STFTMFCC
extracts spectral and MFCC features
music.mf, speech.mf
lists of sound files (collections)
-w myweka.arff
WEKA file with extracted features
-p ms.mpl
“trained” Marsyas plug-in for realtime music/speech classification
21. Usage Scenarios
Marsyas plugins (.mpl files)
Allow a processing network to be recreated dynamically at run time
sfplugin
e.g. Audio playback
> sfplugin -p playback.mpl foo.wav
e.g. Realtime audio classification
> sfplugin -p ms.mpl unknownAudioSignal.wav
22. Usage Scenarios
Digital Signal Processing
e.g. phasevocoder
> phasevocoder -p 1.4 -s 100
http://eceserv0.ece.wisc.edu/~sethares/vocoders/phasevocoder.html
e.g. onset detector
> mudbox -t onsets myMusic.wav
23. Architecture
Marsyas 0.2
New dataflow model of audio computation
hierarchical messaging system used to control the dataflow network
inspired by Open Sound Control (OSC)
general matrices instead of 1-D arrays as data
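To make the idea of hierarchical, OSC-like control messages more concrete, here is a minimal C++ sketch in which a slash-separated path is routed down a tree of nested blocks until it reaches a named control. This is only an illustration of the addressing principle: `Node`, `update`, `demoControls` and the path `mixer/gain/level` are hypothetical and are not taken from the Marsyas API.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical node in a processing hierarchy; NOT the Marsyas API.
struct Node {
    std::map<std::string, std::unique_ptr<Node>> children;  // nested blocks
    std::map<std::string, double> controls;                 // local parameters

    // Return the named child, creating it on first use.
    Node& child(const std::string& name) {
        auto& slot = children[name];
        if (!slot) slot = std::make_unique<Node>();
        return *slot;
    }

    // Route a message such as "mixer/gain/level" down the tree.
    // Returns false when the path does not name an existing control.
    bool update(const std::string& path, double value) {
        const auto slash = path.find('/');
        if (slash == std::string::npos) {
            auto it = controls.find(path);
            if (it == controls.end()) return false;
            it->second = value;
            return true;
        }
        auto it = children.find(path.substr(0, slash));
        if (it == children.end()) return false;
        return it->second->update(path.substr(slash + 1), value);
    }
};

// Set up a small hierarchy and change one control through its path.
double demoControls() {
    Node root;
    root.child("mixer").child("gain").controls["level"] = 1.0;

    const bool unknown = root.update("mixer/missing/level", 0.0);
    assert(!unknown);  // unknown paths are rejected, nothing is created

    root.update("mixer/gain/level", 0.25);
    return root.child("mixer").child("gain").controls["level"];
}
```

The caller only needs a reference to the top of the network and a path string, so a control deep inside a nested network can be changed without holding a pointer to the block that owns it.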
24. Architecture
MarSystem Slices
Separating things that happen at the same time from things that happen at different times
Correct semantics for spectral processing
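One way to picture a slice is as a small matrix. The sketch below assumes that rows hold values that coexist at one instant (for example the frequency bins of one spectrum) and that columns hold successive instants; this layout and the names `Slice` and `sumPerInstant` are assumptions made for illustration, not the Marsyas data type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D slice; the layout is an assumption made for illustration.
struct Slice {
    std::size_t observations;  // rows: values that coexist at one instant
    std::size_t samples;       // columns: successive instants
    std::vector<double> data;  // row-major storage, zero-initialised

    Slice(std::size_t obs, std::size_t smp)
        : observations(obs), samples(smp), data(obs * smp, 0.0) {}

    double& at(std::size_t o, std::size_t t) {
        assert(o < observations && t < samples);
        return data[o * samples + t];
    }
    double at(std::size_t o, std::size_t t) const {
        assert(o < observations && t < samples);
        return data[o * samples + t];
    }
};

// Collapse the "same time" axis: one output value per instant.
// For a slice of spectra this would be the summed magnitude of each frame.
Slice sumPerInstant(const Slice& in) {
    Slice out(1, in.samples);
    for (std::size_t t = 0; t < in.samples; ++t)
        for (std::size_t o = 0; o < in.observations; ++o)
            out.at(0, t) += in.at(o, t);
    return out;
}
```

Keeping the two axes separate means a block can state unambiguously whether it operates across simultaneous values, across time, or both.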
30. Interoperability
Marsyas Audio and MIDI I/O http://www.music.mcgill.ca/~gary/rtaudio/
http://www.music.mcgill.ca/~gary/rtmidi/
RtAudio
Multiplatform C++ API for realtime audio input/output
Linux (native ALSA, JACK, and OSS)
MacOSX®
Windows® (DirectSound® and ASIO®)
SGI®
RtMIDI
Multiplatform C++ API for realtime MIDI input/output
Linux (ALSA)
MacOSX®
Windows® (Multimedia Library)
SGI®
31. Interoperability
Marsyas & WEKA http://www.cs.waikato.ac.nz/ml/weka/
WEKA: Data Mining Software in Java
Marsyas already includes some machine learning blocks
Marsyas outputs extracted features as .arff files (WEKA)
features can be opened in WEKA for further evaluation and data modeling
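For readers unfamiliar with the format, the sketch below writes a tiny feature table in the general ARFF layout that WEKA reads (a relation name, one `@attribute` line per feature, a nominal class attribute, then comma-separated rows under `@data`). The relation name, feature names and values are made up for illustration, and the exact file that bextract produces may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One labelled feature vector (hypothetical helper type for this sketch).
struct Instance {
    std::vector<double> features;
    std::string label;
};

// Serialise instances as ARFF text: header, attribute list, then data rows.
std::string toArff(const std::string& relation,
                   const std::vector<std::string>& featureNames,
                   const std::vector<std::string>& classes,
                   const std::vector<Instance>& instances) {
    std::ostringstream out;
    out << "@relation " << relation << "\n\n";
    for (const std::string& name : featureNames)
        out << "@attribute " << name << " numeric\n";

    // The class is a nominal attribute listing every allowed label.
    out << "@attribute class {";
    for (std::size_t i = 0; i < classes.size(); ++i) {
        if (i > 0) out << ",";
        out << classes[i];
    }
    out << "}\n\n@data\n";

    for (const Instance& inst : instances) {
        assert(inst.features.size() == featureNames.size());
        for (double v : inst.features) out << v << ",";
        out << inst.label << "\n";
    }
    return out.str();
}
```

Writing the returned string to a file with an `.arff` extension gives something WEKA can open directly for further evaluation.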
32. Interoperability
Calling MATLAB® from C++ Marsyas code:
MATLAB® engine API http://www.mathworks.com
exchange data (i.e. matrices) in run-time between C++ and MATLAB®
remotely execute commands in MATLAB® from a C++ routine
Access to all MATLAB® toolboxes, algorithms and available routines
Algorithmic validation of C++ routines
Quick and easy evaluation of proof of concepts
May not allow real-time operation…
Not such a big problem when evaluating or developing algorithms
33. Interoperability
Calling MATLAB® from C++ Marsyas code:
Marsyas::MATLABengine class
Utility class
Wraps MATLAB® engine calls for most POD types and Marsyas data types
Easy to send/receive data to/from MATLAB® from anywhere in the code
// create a std::vector of real numbers
std::vector<double> vector_real(4);
vector_real[0] = 1.123456789;
vector_real[1] = 2.123456789;
vector_real[2] = 3.123456789;
vector_real[3] = 4.123456789;
// send a std::vector<double> to MATLAB
PUTVAR(vector_real, "vector_real");
// do some dummy math in MATLAB
EVALUATE("mu = mean(vector_real);");
EVALUATE("sigma = std(vector_real);");
EVALUATE("vector_real = vector_real/max(vector_real);");
// get values from MATLAB
double m, s;
GETVAR(m, "mu");
GETVAR(s, "sigma");
GETVAR(vector_real, "vector_real");
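In the spirit of validating C++ routines against MATLAB®, the following plain C++ sketch computes the same three quantities as the snippet above without any MATLAB® call, so the values received through `GETVAR` can be compared against a local reference. `Stats` and `referenceStats` are hypothetical names; the only MATLAB® behaviour relied on is that `std` normalises by N-1 by default.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Local reference results for mean(v), std(v) and v/max(v).
struct Stats {
    double mean;
    double stddev;
    std::vector<double> normalized;
};

Stats referenceStats(const std::vector<double>& v) {
    assert(v.size() > 1);  // the N-1 normalisation needs at least two values
    const double n = static_cast<double>(v.size());
    const double mean = std::accumulate(v.begin(), v.end(), 0.0) / n;

    double squares = 0.0;
    for (double x : v) squares += (x - mean) * (x - mean);
    const double stddev = std::sqrt(squares / (n - 1.0));  // sample std (N-1)

    const double peak = *std::max_element(v.begin(), v.end());
    std::vector<double> normalized(v);
    for (double& x : normalized) x /= peak;

    return {mean, stddev, normalized};
}
```

For the four values used above, the deviations from the mean are -1.5, -0.5, 0.5 and 1.5, so the mean is 2.623456789 and the standard deviation is the square root of 5/3, roughly 1.291.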
34. Interoperability
Python™ Bindings http://www.python.org
easily create scripts for rapid testing and prototyping of data flow networks
the same work would require much more development effort in C++
bonus: no compilation overhead
can also be embedded in C++ code, similarly to MATLAB® (TBD)
fewer tools for signal processing in general, but can be used for many other purposes (“batteries included”)
fewer licensing headaches
35. Interoperability
Marsyas and Trolltech Qt4® http://www.trolltech.com
Qt® Core features optionally used by Marsyas
Multi-platform signal/slot architecture
Multi-platform threads (multithreaded processing)
Multi-platform database access
Multi-platform XML I/O
Qt® GUI features optionally used by Marsyas
Multi-platform Widgets
Multi-platform OpenGL
Qt4® is available as GPL open source code for all platforms
36. MarsyasX
What is MarsyasX?
Next step for Marsyas
Evolution rather than an alternative implementation
Generalization of the framework to support processing different media
Key points:
crossmodal processing
expandable support
interoperability
[Figure: example MarsyasX network mixing audio and video blocks: AV source, MPEG source, WAV source, Hanning, FFT, Motion, Contour, GMM, Sink]
37. MarsyasX
MarsyasX brief history:
MarsyasX started as a patch to Marsyas...
...add visual processing support
However, the previous architecture has some restrictions:
Some design options are audio-oriented
No in-place processing - for visual processing this can be too much of a burden
Representing complex data in a single buffer requires inefficient hacks
The slice-based processing paradigm is an elegant solution, but can be very
difficult to adapt to a more generic framework
38. MarsyasX - architecture
The first step was the creation of a generic data handling mechanism between modules:
Data carried in payloads
Flows define a coherent stream of data and are identified by type and name
Modules are explicitly associated with flows and can handle multiple flows with different behaviours (in, inout, out)
[Figure: source, filter and sink modules connected by flows]
39. MarsyasX - architecture
Payload Mechanism:
Payloads are created in factories, associated to a source;
Payloads are carried between modules through channels
(created automatically by implicit patching);
When payloads are no longer necessary, they are sent back to their source to be reused, so memory is allocated only once.
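The recycling idea can be sketched as a simple object pool: all buffers are allocated up front, handed out on request and returned after use. The code below is a minimal illustration under that assumption and is not the MarsyasX implementation; `Payload`, `PayloadFactory`, `acquire` and `release` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical payload: a fixed-size block of bytes (e.g. one video frame).
struct Payload {
    std::vector<unsigned char> bytes;
};

// Hypothetical factory that owns a fixed set of payloads and recycles them.
class PayloadFactory {
public:
    // Allocate every payload once, up front.
    PayloadFactory(std::size_t count, std::size_t size) {
        for (std::size_t i = 0; i < count; ++i) {
            free_.push_back(std::make_unique<Payload>());
            free_.back()->bytes.resize(size);
        }
    }

    // Hand out a free payload, or nullptr when all of them are in flight.
    std::unique_ptr<Payload> acquire() {
        if (free_.empty()) return nullptr;
        std::unique_ptr<Payload> p = std::move(free_.back());
        free_.pop_back();
        return p;
    }

    // Return a payload that is no longer needed so it can be reused.
    void release(std::unique_ptr<Payload> p) {
        assert(p != nullptr);
        free_.push_back(std::move(p));
    }

    std::size_t available() const { return free_.size(); }

private:
    std::vector<std::unique_ptr<Payload>> free_;
};
```

After construction no further allocation takes place in `acquire` or `release`: the same buffers circulate between the factory and the modules, which matters most for large payloads such as video frames.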
40. MarsyasX - architecture
Controls are formally related to the flows and
changes are propagated automatically
e.g. a visual flow is characterized by width, height and colour space
41. MarsyasX - architecture
Timing?
each “tick” corresponds to an elapsed time interval
synchronization is ensured by the timing information associated with each payload
How is a module created?
define how many and what type of flows it supports
factories for outputs and flow-related controls are added implicitly
add specific controls, if needed
create a process function
input and output data structures are automatically passed as arguments
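The module-creation steps listed above can be sketched roughly as follows. This is a hypothetical illustration, not the MarsyasX API: `Module`, `FlowSpec`, `Direction`, `FlowData`, `Scale` and the control name `factor` are invented for the example, and only `in` and `out` flows are exercised.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration; NOT the MarsyasX API.
enum class Direction { In, InOut, Out };

struct FlowSpec {
    std::string type;  // e.g. "audio" or "visual"
    std::string name;  // identifies the flow within the module
    Direction direction;
};

struct Payload {
    double timestamp = 0.0;  // timing information travels with the data
    std::vector<double> samples;
};

using FlowData = std::map<std::string, Payload>;  // keyed by flow name

class Module {
public:
    virtual ~Module() = default;

    // One tick: output slots are created implicitly, then process() runs.
    FlowData tick(const FlowData& in) {
        FlowData out;
        for (const FlowSpec& f : flows_) {
            if (f.direction == Direction::In) assert(in.count(f.name) == 1);
            if (f.direction == Direction::Out) out[f.name] = Payload{};
        }
        process(in, out);
        return out;
    }

protected:
    void addFlow(const FlowSpec& spec) { flows_.push_back(spec); }
    std::map<std::string, double> controls_;  // module-specific controls
    virtual void process(const FlowData& in, FlowData& out) = 0;

private:
    std::vector<FlowSpec> flows_;
};

// A concrete module: declare flows, add a control, write a process function.
class Scale : public Module {
public:
    explicit Scale(double factor) {
        addFlow({"audio", "in", Direction::In});
        addFlow({"audio", "out", Direction::Out});
        controls_["factor"] = factor;
    }

protected:
    void process(const FlowData& in, FlowData& out) override {
        const Payload& src = in.at("in");
        Payload& dst = out.at("out");
        dst.timestamp = src.timestamp;  // keep the payload's timing intact
        for (double x : src.samples)
            dst.samples.push_back(x * controls_.at("factor"));
    }
};
```

The module author only writes the constructor and `process`; everything that can be derived from the flow declarations is handled by the base class, which mirrors the "added implicitly" and "automatically passed" points in the list.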
42. MarsyasX - architecture
Modules are now loaded from extensions
implemented separately
dynamically loaded
[Figure: architecture layers: application, marsyas, extensions (visual, audio, learning, av, other specific), core]
43. MarsyasX - interoperability
Python
Scripting language plays a more central role
Complete bindings to create networks and use core
functionalities
Marsyas 0.2
A legacy layer was created to support modules from Marsyas 0.2 transparently
A network can be a hybrid, containing MarsyasX and Marsyas 0.2 modules
MATLAB
44. MarsyasX - status
Status of development:
pre-alpha
release 0.1 in mid-2011 (tentative)
feature status:
payload mechanism [done]
flow management [done]
definition of API [partly done]
python bindings [partly done]
legacy layer [partly done]
processing modules [needs a lot more]
event management (expressions, GUI) [not fully defined]
45. MarsyasX - future
Main development trunk of the Marsyas project
Larger community:
associating the current users and developers of Marsyas with
many others coming from different areas
Distributed MarsyasX
Data and events exchanged transparently