Hkpark apan030828

An Analysis of Fault Isolation
in Multi-Source Multicast Session

Network Research Workshop

2003. 8. 28
Heonkyu Park
hkpark@cosmos.kaist.ac.kr
Korea Advanced Institute of Science and Technology

System Architecture Lab

Table of Contents
1. Motivations / Problem Definition
2. Background
3. Analysis
4. Issues
5. Candidate Model
6. Simulation Results
7. Conclusion
References


Before we start…
• Terminology
– Unicast : to a single receiver
– Multicast : to a specific subset of receivers
• single-source : only one source in a session (one-to-many multicast)
• multi-source : many sources in a session (many-to-many multicast)
– Fault Detection : perceiving that a fault exists somewhere in the network
– Fault Isolation : locating the on-tree router or link that is the
origin of the fault

[Figure: Fault Detection ("Hmm… a fault is somewhere…") versus Fault Isolation ("OK! I found the fault!")]

Motivation / Problem Definition
1. Network monitoring is necessary to detect and discover
network problems.
2. Some participants in a multicast session experience
severe packet loss.
[Figure: packet loss among session participants, obtained using the Rqm tool [Rqm]]
3. Fault detection / isolation approaches for multicast are
focused on single-source networks.
4. For multi-source multicast, little work has been done on
fault isolation.
5. Straightforward reuse of a single-source solution is not
sufficient for multicast with a large number of sources.

→ A new model for fault isolation in multi-source
multicast is needed.

Background
1. IP Multicast
2. Multi-Source Multicast Applications
3. Challenges of Multicast Monitoring
4. Needs for Multicast Fault Isolation

1. IP Multicast
[Figure: a source sends multicast packets to a multicast session with three receivers; the routing path is changed when a fault occurs]

Multi-Source Multicast Applications
1. Networked virtual environments
2. Synchronized resource like database updates
3. Distributed or parallel concurrent processing
4. Large-scale distributed military simulation
5. Peer-to-peer multicast file transfer model
6. Large-scale multimedia conference
7. Large-scale replicated database
8. Cooperative web cache protocols
9. Shared editing and collaboration
10. Interactive distance learning
11. Network games or chatting
12. and more…
[Figure: Group Size [LN01]; Number of Senders (1 to 1,000,000) versus Number of Receivers (10 to 1,000,000), with regions for Peer-to-Peer Applications, Distributed Information Systems, Games, Collaboration Tools, Streaming, and Content Distribution]


Multicast Monitoring Tools [SA01]
Management, Debugging and Modeling via Active / Passive Monitoring

[Figure: timeline of monitoring tools in three columns: Management, Debugging, Modeling]
~1992 : mrmap, mrinfo, mrdebug, rtpmon, mtrace, mwatch, mstat, mlisten, mview, mrtree, Dr. Watson, Mah’s Study, Yajnik’s Study
~1997 : GDT, MultiMon, NetIQ’s Chariot *, mhealth, RouteMonitor, MantaRay, NIMI *, mantra, sdr-mon, MINC *, mmon, Otter, MRM *, mwalk, Handley’s Study
~2000 : HPMM, SNMP_NG
* : can be used for active monitoring
(the figure also marks recent research work)

Needs for Multicast Fault Isolation
1. Monitoring of multicast networks has become crucial for
maintaining multicast operations
– since the delivery service in multicast is more complex than in
traditional unicast networks
– Supervising multicast traffic is a more difficult problem, as each
multicast tree involves multiple hosts with correlated,
simultaneous faults.
2. Multicast faults have various causes.
– session announcement problems, reception problems, multicast
router problems, congestion and rate-limiting problems, multicast
routing problems, etc. [TA00]
3. Fault isolation is not easy even in single-source multicast, to say
nothing of multi-source multicast.


Analysis on Single-Source Approach
➢ Only for fault detection
1. MRM (Multicast Reachability Monitoring) [SA01]
• active probing from a test sender (TS) to a test receiver (TR),
coordinated by the MRM manager
2. SMRM (SNMP-Based MRM) [AT02]
• SNMP-based approach that defines several MIBs for multicast
monitoring

➢ Both detection and isolation
3. HPMM (Hierarchical Passive Multicast Monitoring) [WL00]
• passive monitoring scheme in which agents are organized in a hierarchy
and communicate with each other using unicast
4. MTR (Fault Isolation in Multicast Tree) [RGE00]
• receiver-driven method using IGMP multicast traceroute

➢ Most approaches up to now have focused on single-source
multicast.

MRM (Multicast Reachability Monitor) [SA01] - Description
Step 1: Mgr Configures TS(s) and TR(s)
Step 2: TS Transmits
Step 3: TR(s) Monitor Group Transmission
Step 4: Mgr Collects and Displays TR Reports
TS: Test Sender
TR: Test Receiver
[Figure: topology with routers R1 to R6, test sender TS, test receivers TR1 to TR3, and the MRM Manager; legend: Router, End-Host, Manager ↔ Agent Communication]


SMRM (SNMP-Based MRM) [AT02] - Description

[Figure: smrmMIB Group in Extended MIB II]

HPMM (Hierarchical Passive Multicast Monitor) [WL00] - Description
[Figure: agents A, B, C, D, and E in the local domain form a hierarchy; group 1 arrives from foreign source 1 in domain 1 and group 2 from foreign source 2 in domain 2]

• Each node knows exactly which upstream agent to notify when a fault occurs.
• Node D has only one parent for both multicast groups 1 and 2, which is node B.
• Node E defines a parent agent in B for group 1 and a parent agent in C for group 2.
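
The per-group parent relationship described above can be pictured as a small lookup table. The following is a minimal sketch, not HPMM's actual data structure: the dictionary layout and the function name `agent_to_notify` are assumptions made for illustration, and only the D and E entries stated on the slide are filled in.

```python
from typing import Dict, Tuple

# Parent agent per (node, multicast group), taken from the slide text.
# The table layout itself is an illustrative assumption, not part of HPMM.
PARENT_AGENT: Dict[Tuple[str, int], str] = {
    ("D", 1): "B",
    ("D", 2): "B",
    ("E", 1): "B",
    ("E", 2): "C",
}


def agent_to_notify(node: str, group: int) -> str:
    """Return the upstream agent that `node` reports a fault to for `group`."""
    return PARENT_AGENT[(node, group)]


if __name__ == "__main__":
    for node, group in sorted(PARENT_AGENT):
        print(f"node {node}, group {group} -> notify {agent_to_notify(node, group)}")
```

Keying the table on both node and group is what lets node E report to different parents for different groups.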


MTR (Fault Isolation in Multicast Tree) [RGE00] - Description
[Figure: multicast tree from the Source to Ra, Rb, and Rc, shown before and after isolation; labels: Fault, Isolated Fault Region, common ancestor router of Ra & Rc]
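
The "common ancestor router" idea in the figure can be illustrated with a parent-pointer tree. This is a minimal sketch under stated assumptions, not the algorithm of [RGE00]: the topology, the router names N1 and N2, and both function names are hypothetical, and a single fault is assumed.

```python
from typing import Dict, List, Optional

# Hypothetical multicast tree as child -> parent; "S" is the source.
PARENT: Dict[str, Optional[str]] = {
    "S": None,
    "N1": "S",
    "N2": "N1",
    "Ra": "N2",
    "Rc": "N2",
}


def path_to_source(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """List the hops from `node` up to the source, starting with `node`."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def common_ancestor(a: str, b: str, parent: Dict[str, Optional[str]]) -> str:
    """Return the closest router that lies above both `a` and `b`."""
    above_a = set(path_to_source(a, parent)[1:])
    for hop in path_to_source(b, parent)[1:]:
        if hop in above_a:
            return hop
    raise ValueError("nodes do not share an ancestor")


if __name__ == "__main__":
    print(common_ancestor("Ra", "Rc", PARENT))
```

Under the single-fault assumption, a fault seen by both receivers must lie on the part of the tree they share, so the search region shrinks to the path between the common ancestor and the source.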

Comparison on Related Works
        Active/    Single-source      Multi-source
        Passive    Detect  Isolate    Detect  Isolate    Remarks
MRM     Active     ○       Χ          △       Χ          test session
SMRM    Passive    ○       Χ          △       Χ          SNMP-based
HPMM    Passive    ○       ○          Χ       Χ          child-parent relationship
MTR     Active     ○       ○          Χ       Χ          IGMP mtrace

※ None of the suggested approaches is sufficient for fault
isolation in a multi-source multicast network.

Message Complexity of Current Approaches
1. Network overload increases exponentially as the number of
members grows.
• As the member count grows, the mtrace request and reply
packets become excessive.
2. Simulation result by ns-2
• tree topology : 100 nodes, out-degree : 3
• number of members : 5 ~ 60 (increased by 5)
• average over 5 runs
3. Thus, a different strategy is needed to handle multi-source
multicast fault detection and isolation.
[Figure: overload (× 1,000, 0 to 300) versus number of members (5 to 60)]
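
A back-of-the-envelope count shows why per-member tracing becomes expensive. The sketch below assumes a deliberately simplified model that is not taken from the slides: every member is also a source, every member traces toward every other member, and each trace costs one request plus one reply regardless of hop count. It does not reproduce the ns-2 numbers in the figure.

```python
def mtrace_messages(members: int) -> int:
    """Messages under the assumed model: 2 per (receiver, source) pair."""
    ordered_pairs = members * (members - 1)
    return 2 * ordered_pairs


if __name__ == "__main__":
    # Same member counts as the simulation: 5 to 60 in steps of 5.
    for n in range(5, 61, 5):
        print(f"{n:2d} members -> {mtrace_messages(n):5d} messages")
```

In this simplified model the count grows with the square of the member count, which is already steep before per-hop processing is taken into account.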

Issues
1. Application Characteristics
2. Message Complexity
3. Fault Isolation Error
4. Scalability
5. Deployment

1. Application Characteristics
                               Conferencing Application                  Broadcasting Application
Performance requirements       require low latency and high bandwidth    interested in bandwidth, latency is not a concern
Loss                           tolerate loss                             require reliable data delivery
Session length                 long-lived, over 10 min                   short-lived
Group characteristics          dynamic and small groups                  relatively static
Source transmission patterns   multiple sources                          a single static source
Comparison of the two applications [CRSZ01]

Issues for Multi-Source Multicast Fault Isolation
1. Message Complexity
– Message complexity will be the main concern.
– It should grow logarithmically, not linearly, with the number of members.
• not O(N), but O(log N) or O(1)
[Figure: message complexity versus size of members; the steeper curve is marked "not good" and the flatter curve "Acceptable"]

2. Fault Isolation Error
– Should be the same as or lower than in previous approaches.
– No sudden computation overload when isolating faults
– Near-real-time fault detection and isolation
3. Scalability
– not affected by the number of members
– handles dynamic member actions such as join / leave
4. Deployment
– should be easily deployable and not depend on specific protocols and techniques.

Candidate Model
• Goal : Isolate the fault promptly and accurately, using an
efficient and scalable approach, when a fault occurs in the
multicast network.

• Basic Idea : member grouping
1. do not let every member send probes
2. shared paths exist from a local member to other members
3. make maximum use of shared information
4. only group leaders send probes for fault isolation, and only to other
group leaders
• Benefits
1. reduces message complexity
2. scalable, since it does not depend on the number of members

Draft Model
1. Each group selects a group leader.
2. The group leader manages its members and sends
probes for fault isolation.
3. A leader does not send probes to all other group leaders;
it sends them only as far as the common ancestor
router it shares with other group leaders.
[Figure: four groups with members A1, A2, A3 (Group A), B1, B2 (Group B), C1 (Group C), and D1, D2 (Group D)]
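
The saving from letting only leaders probe can be sketched by enumerating who probes whom. This is an illustration under stated assumptions, not the simulated protocol: the group membership is read off the labels in the figure, the rule "the first listed member is the leader" is hypothetical, and the leader-to-leader count is only an upper bound because item 3 says probes stop at the common ancestor router.

```python
from itertools import permutations
from typing import Dict, List, Tuple

# Membership as labelled in the figure; treating these labels as the
# complete membership of each group is an assumption.
GROUPS: Dict[str, List[str]] = {
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2"],
    "C": ["C1"],
    "D": ["D1", "D2"],
}


def all_member_probes(groups: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Every member probes every other member (all-member-based approach)."""
    members = [m for ms in groups.values() for m in ms]
    return list(permutations(members, 2))


def leader_probes(groups: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Only leaders probe each other; the first listed member leads (assumed)."""
    leaders = [ms[0] for ms in groups.values()]
    return list(permutations(leaders, 2))


if __name__ == "__main__":
    print(len(all_member_probes(GROUPS)), "probe pairs with all members")
    print(len(leader_probes(GROUPS)), "probe pairs with leaders only")
```

With 8 members in 4 groups, the ordered probe pairs drop from 56 to 12, and the leader count depends on the number of groups rather than the number of members.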


Member Grouping
1. How the members are grouped
– simply, by the boundary of a border router
– a way to make groups bigger is still needed,
since the number of groups can still be large
2. How the members in a group know their group leader
– the group leader periodically sends an “i-am-leader” packet
to the group members
3. How a member knows whether a group leader exists
– a newly joined member sends an “i-am-leader” packet within its group using
multicast scoping
– if there is no response, it becomes the leader.
– if another member sends an “i-am-leader” packet, it assumes a leader already exists.
[Figure: one group of members behind a border router, with its group leader]
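
The leader discovery rule in item 3 can be sketched as a tiny state machine. This is a minimal sketch that assumes a loss-free scope and no simultaneous joins; the class name `ScopedGroup` and its methods are hypothetical, and the packet exchange is reduced to a check of whether a leader already exists.

```python
from typing import List, Optional


class ScopedGroup:
    """Members inside one multicast scope, e.g. behind one border router."""

    def __init__(self) -> None:
        self.members: List[str] = []
        self.leader: Optional[str] = None

    def join(self, member: str) -> str:
        """A newcomer announces "i-am-leader"; return the resulting leader."""
        # An existing leader would answer with its own "i-am-leader" packet.
        responding_leader = self.leader
        self.members.append(member)
        if responding_leader is None:
            # No response within the scope: the newcomer becomes the leader.
            self.leader = member
            return member
        return responding_leader


if __name__ == "__main__":
    group = ScopedGroup()
    for name in ["A1", "A2", "A3"]:
        print(f"{name} joins, leader is {group.join(name)}")
```

A real implementation would also need timeouts and a tie-break for members that join at the same moment, which this sketch leaves out.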

Group Leader Action Lists
1. Managing members in the group
– uses “i-am-leader” packets to control group members
– sends a “you-are-leader” packet when it leaves

2. Fault Isolation
– the primary function of a group leader
– probes are exchanged among group leaders
3. Group leader announcement
– Announcing and finding the group leaders
is not easy.


Simulation Results
• Overview
– Simulated a simplified protocol using the ns-2 simulator
– Random graphs generated by GT-ITM
– Values averaged over five simulation runs
– Compared with the best approach among related works
• Results
– All-member-based (best-performance) : y ≈ 176 ⋅ x
– Group-leader-based : y ≈ 58 ⋅ x
– reduced the message complexity by 68%
[Figure: message complexity (0 to 18,000) versus number of sources (10 to 100) for the All-member-based Approach and the Group-leader Approach]
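
The reported reduction can be cross-checked against the two fitted slopes. The sketch below uses only the slopes 176 and 58 from the slide; the function name `reduction` is illustrative.

```python
ALL_MEMBER_SLOPE = 176.0    # y ≈ 176 · x (all-member-based)
GROUP_LEADER_SLOPE = 58.0   # y ≈ 58 · x (group-leader-based)


def reduction(baseline: float, improved: float) -> float:
    """Fraction of messages saved relative to the baseline."""
    return 1.0 - improved / baseline


if __name__ == "__main__":
    saved = reduction(ALL_MEMBER_SLOPE, GROUP_LEADER_SLOPE)
    print(f"{saved:.1%}")  # 67.0%
```

The rounded slopes give about 67%, close to the 68% on the slide; the small gap is presumably because the slide's figure was computed from the raw simulation data rather than from the rounded fits.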


Conclusion
1. It is important to locate the fault in a network.
2. Little work has been done on fault isolation, or even on
fault detection, in multi-source multicast.
3. In multi-source multicast fault isolation, message
complexity is the main concern.
4. One candidate approach is a group-based architecture to
locate the fault in a multi-source multicast session.
5. Simulation results show that the group-based approach reduces
message complexity by 68% compared with the
best-performing approach among the others.
6. However, the group-based approach is still not sufficient,
for scalability and other reasons.

Future Works
• A more efficient approach to message complexity is needed.
• A possible model is a suppressed one-way probing mechanism.
– The source sends a special packet to the multicast group.
– Every internal router records its routing information in the special
packet.
– Without packet suppression, an implosion problem will occur.
– Each receiver compares the recorded path to check whether the routing path has changed.
• Simulation results show that this
suppressed one-way probing is
well suited to multi-source
multicast networks.
[Figure: message complexity (× 10^4, r = 10) versus number of sources (0 to 1200) for No Suppression, Max Suppression, and Min Suppression]
• Several things to elaborate…
• Any comments will be
appreciated.
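
The one-way probing idea listed under Future Works can be sketched as a route-recording probe plus a path comparison at the receiver. This is a minimal sketch under stated assumptions, not the simulated mechanism: both tree topologies, the router names N1 to N3, and the function names are hypothetical, and packet suppression is not modeled.

```python
from typing import Dict, List, Optional

Tree = Dict[str, Optional[str]]  # child -> parent; "S" is the source

# Hypothetical trees before and after a reroute around a failed router.
BEFORE: Tree = {"S": None, "N1": "S", "N2": "N1", "N3": "N1", "Ra": "N2"}
AFTER: Tree = {"S": None, "N1": "S", "N2": "N1", "N3": "N1", "Ra": "N3"}


def record_route(receiver: str, tree: Tree) -> List[str]:
    """Hops recorded in the probe, in source-to-receiver order."""
    hops: List[str] = []
    node = tree[receiver]
    while node is not None:
        hops.append(node)
        node = tree[node]
    hops.reverse()
    return hops


def path_changed(previous: List[str], current: List[str]) -> bool:
    """Receiver-side check: did the recorded routing path change?"""
    return previous != current


if __name__ == "__main__":
    old_path = record_route("Ra", BEFORE)
    new_path = record_route("Ra", AFTER)
    print(old_path, new_path, path_changed(old_path, new_path))
```

Because the comparison happens at the receiver, a probe costs one packet per source; whatever receivers report back is where suppression would be needed to avoid the implosion problem.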

References
[AT02] E. Al-Shaer and Y. Tang, “SMRM: SNMP-based multicast reachability monitoring,”
in IEEE/IFIP Network Operations and Management Symposium (NOMS) 2002,
Florence, Italy, April 2002.
[CRSZ01] Y. Chu, S. G. Rao, S. Seshan and H. Zhang, “Enabling
conferencing applications on the Internet using an overlay multicast architecture,” in
ACM SIGCOMM 2001, San Diego, California, August 2001.
[LN01] J. Liebeherr and M. Nahas, “Application-layer multicast with Delaunay
Triangulations,” in Global Internet Symposium, IEEE GlobeCom 2001, San Antonio,
Texas, November 2001.
[RGE00] A. Reddy, R. Govindan and D. Estrin, “Fault isolation in multicast trees,” in
Proceedings of ACM SIGCOMM 2000, Stockholm, Sweden, August 2000.
[Rqm] C. Perkins, “RTP Quality Matrix,” [online],
http://www-mice.cs.ucl.ac.uk/multimedia/software/rqm/ (Accessed: 7 March 2003).
[SA01] K. Sarac and K. C. Almeroth, “Supporting multicast deployment efforts: A survey of
tools of multicast monitoring,” Journal of High Speed Networking--Special Issue on
Management of Multimedia Networking, vol. 9, num. 3/4, pp. 191-211, March 2001.
[TA00] D. Thaler and B. Aboba, “Multicast Debugging Handbook,” Internet draft,
draft-ietf-mboned-mdh-*.txt, Internet Engineering Task Force (IETF), November 2000.
[WL00] J. Walz and B. N. Levine, “A hierarchical multicast monitoring scheme,” in 2nd
International Workshop on Networked Group Communication, November 2000.


Thank you.
Questions?
Comments?

