SlideShare ist ein Scribd-Unternehmen logo
1 von 31
Высокий пакетрейт на x86-64:
берем планку в 14,88 Mpps
<la@highloadlab.com>
01.2012-05.2013
7594 incidents total
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
2012-01 2012-02 2012-03 2012-04 2012-05 2012-06 2012-07 2012-08 2012-09 2012-10 2012-11 2012-12 2013-01 2013-02 2013-03 2013-04 2013-05
SPOOF FULL-CONNECT
Что модно?
• UDP Flood and amplification.
• TCP ( SYN ( open|closed|firewalled) | ACK )
• ICMP Flood ( smurf )
L7 – is out of style
Кто виноват?
Долбанные инопланетяне
static unsigned int tcp_timeouts[TCP_CONNTRACK_TIMEOUT_MAX] __read_mostly = {
[TCP_CONNTRACK_SYN_SENT] = 2 MINS,
[TCP_CONNTRACK_SYN_RECV] = 60 SECS,
[TCP_CONNTRACK_ESTABLISHED] = 5 DAYS,
[TCP_CONNTRACK_FIN_WAIT] = 2 MINS,
[TCP_CONNTRACK_CLOSE_WAIT] = 60 SECS,
[TCP_CONNTRACK_LAST_ACK] = 30 SECS,
[TCP_CONNTRACK_TIME_WAIT] = 2 MINS,
[TCP_CONNTRACK_CLOSE] = 10 SECS,
[TCP_CONNTRACK_SYN_SENT2] = 2 MINS,
/* RFC1122 says the R2 limit should be at least 100 seconds.
Linux uses 15 packets as limit, which corresponds
to ~13-30min depending on RTO. */
[TCP_CONNTRACK_RETRANS] = 5 MINS,
[TCP_CONNTRACK_UNACK] = 5 MINS,
};
Кто еще виноват?
top - 08:16:23 up 39 min, 1 user, load average: 0.44, 0.16, 0.79
Tasks: 158 total, 2 running, 156 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni, 89.3%id, 0.0%wa, 0.0%hi, 10.7%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st
Cpu9 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 : 0.0%us, 1.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 32921100k total, 4598792k used, 28322308k free, 15496k buffers
Swap: 0k total, 0k used, 0k free, 83252k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
39 root 20 0 0 0 0 R 100 0.0 0:27.91 [ksoftirqd/8]
1401 root 20 0 0 0 0 S 8 0.0 0:03.05 [kpktgend_8]
5346 root 20 0 0 0 0 S 2 0.0 0:00.34 [kworker/8:0]
5740 root 20 0 19356 1472 1076 R 1 0.0 0:00.12 top
Кто еще виноват?
ВЫ
(забыли настроить сетевой стэк)
Cферический сервер в
вакууме
Intel(R) Xeon(R) CPU E5-2670 x2
X520-DA2 (Intel® 82599ES)
Vanilla Linux 3.7.9
modprobe ixgbe(3.13.10) RSS=8
Как быть ?
AFFINITY > BALANCER
%/etc/init.d/irqbalancer stop
%grep eth8 /proc/interrupts
123: 19 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-0
124: 0 15 0 0 0 0 0 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-1
125: 0 0 15 0 0 0 0 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-2
126: 0 0 0 15 0 0 0 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-3
127: 0 0 0 0 15 0 0 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-4
128: 0 0 0 0 0 15 0 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-5
129: 0 0 0 0 0 0 17 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-6
130: 0 0 0 0 0 0 0 15 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-7
Лучше?
top - 07:40:25 up 3 min, 1 user, load average: 4.61, 1.29, 0.44
Tasks: 164 total, 9 running, 155 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st
Cpu8 : 0.0%us, 0.0%sy, 0.0%ni, 90.2%id, 0.0%wa, 0.0%hi, 9.8%si, 0.0%st
Cpu9 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 : 0.0%us, 1.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 32921100k total, 4597288k used, 28323812k free, 15340k buffers
Swap: 0k total, 0k used, 0k free, 83240k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15 root 20 0 0 0 0 R 96 0.0 0:46.06 [ksoftirqd/2]
23 root 20 0 0 0 0 R 96 0.0 0:46.04 [ksoftirqd/4]
11 root 20 0 0 0 0 R 95 0.0 0:46.04 [ksoftirqd/1]
19 root 20 0 0 0 0 R 95 0.0 0:46.03 [ksoftirqd/3]
27 root 20 0 0 0 0 R 95 0.0 0:46.02 [ksoftirqd/5]
31 root 20 0 0 0 0 R 95 0.0 0:46.08 [ksoftirqd/6]
35 root 20 0 0 0 0 R 95 0.0 0:46.04 [ksoftirqd/7]
3 root 20 0 0 0 0 R 93 0.0 0:45.23 [ksoftirqd/0]
Более лучше?
# ethtool -K eth8 ntuple on
# ethtool -U eth8 flow-type udp4 action -1
Added rule with ID 8189
# ethtool -u eth8
8 RX rings available
Total 1 rules
Filter: 8189
Rule Type: UDP over IPv4
Src IP addr: 0.0.0.0 mask: 255.255.255.255Dest IP addr: 0.0.0.0 mask: 255.255.255.255
TOS: 0x0 mask: 0xff
Src port: 0 mask: 0xffff
Dest port: 0 mask: 0xffff
VLAN EtherType: 0x0 mask: 0xffff
VLAN: 0x0 mask: 0xffff
User-defined: 0x0 mask: 0xffffffffffffffff
Action: Drop
Более лучше!
(здесь ~14.88Mpps UDP)
Tasks: 163 total, 1 running, 162 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 : 1.0%us, 0.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 32921100k total, 4374344k used, 28546756k free, 7700k buffers
Swap: 0k total, 0k used, 0k free, 24036k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4348 root 20 0 19356 1476 1076 R 1 0.0 0:00.03 top
1 root 20 0 4120 688 588 S 0 0.0 0:01.22 init [3]
2 root 20 0 0 0 0 S 0 0.0 0:00.00 [kthreadd]
Поприветствуем Flow Director
The flow director filters identify specific flows or
sets of flows and routes them to specific queues.
The flow director filters are programmed by
FDIRCTRL and all other FDIR registers. The 82599
shares the Rx packet buffer for the storage of
these filters.
Flow Director умеет
• Perfect match filters — The hardware checks a
match between the masked fields of the received
packets and the programmed filters. Masked
fields should be programmed as zeros in the filter
context. The 82599 support up to 8 K - 2 perfect
match filters.
• Signature filters — The hardware checks a match
between a hash-based signature of the masked
fields of the received packet. The 82599 supports
up to 32 K - 2 signature filters.
Perfect Filter умеют
(instanteneously)
• VLAN
• proto
• src_ip/mask
• src_port
• dst_ip/mask
• dst_port
• Flexible 2-byte tuple anywhere in the first 64
bytes of the packet (FRAME!)
Not so perfect
(Выкидыш FlowDirector)
• Потребляют память RX buffer (256/512)
• Не умеют ЕСЛИ-ТО
• Masks are GLOBAL for signature filters
• 64b это до обидного мало
• Поддерживается ethtool (perfect, buggy) и
PF_RING(signature only)
Но и на том Intel SPASIBO!
Flex Filters
(Выкидыши реализации RSS)
• 128b of the packet (FRAME!)
• 6 filters
• Кратковременно отключаются при
доступе(R|W)
• Нет публично доступного userland
конфигуратора.
Как быть с TCP SYN?
• SYN без Seq Number
• SYN без MSS
• … и прочие ляпы где можно вывести
сигнатуру до первых 128b
Как быть с Perfect TCP SYN ?
Больно умереть на 400kPPS…
Post mortem
# ========
#
# Samples: 19K of event 'cycles'
# Event count (approx.): 12923232073
#
# Overhead Command Shared Object Symbo l
# ........ ........... ................. .................................... .
#
78.74% ksoftirqd/0 [kernel.kallsyms] [k] _raw_spin_lock
|
--- _raw_spin_lock
|
|--98.84%-- tcp_v4_rcv
| ip_local_deliver_finish
| ip_local_deliver
| ip_rcv_finish
| ip_rcv
| __netif_receive_skb
| netif_receive_skb
| napi_skb_finish
| napi_gro_receive
| 0xffffffffa005c134
| net_rx_action
| __do_softirq
| run_ksoftirqd
| smpboot_thread_fn
| kthread
| ret_from_fork
net/ipv4/tcp_ipv4.c
process:
if (sk->sk_state == TCP_TIME_WAIT)
goto do_time_wait;
if (unlikely(iph->ttl < inet_sk(sk)->min_ttl)) {
NET_INC_STATS_BH(net, LINUX_MIB_TCPMINTTLDROP);
goto discard_and_relse;
}
if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
goto discard_and_relse;
nf_reset(skb);
if (sk_filter(sk, skb))
goto discard_and_relse;
skb->dev = NULL;
bh_lock_sock_nested(sk);
ret = 0;
if (!sock_owned_by_user(sk)) {
[dd]
} bh_unlock_sock(sk);
sock_put(sk);
return ret;
Опять кто-то виноват..
• Обнаружитель SYNFLOOD
• TCP Cookie Transactions
• MD5SUM
Запилим Инновационный Костыль!
*rawpost
:POSTROUTING ACCEPT [15:1548]
-A POSTROUTING -s 10.1.0.0/24 -o eth8 -j RAWSNAT --to-source 10.10.40.3/32
COMMIT
# Completed on Mon May 20 04:47:30 2013
# Generated by iptables-save v1.4.16.3 on Mon May 20 04:47:30 2013
*raw
:PREROUTING ACCEPT [28:2128]
:OUTPUT ACCEPT [18:2056]
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 0 -j RAWDNAT --to-destination 10.1.0.1/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 1 -j RAWDNAT --to-destination 10.1.0.2/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 2 -j RAWDNAT --to-destination 10.1.0.3/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 3 -j RAWDNAT --to-destination 10.1.0.4/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 4 -j RAWDNAT --to-destination 10.1.0.5/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 5 -j RAWDNAT --to-destination 10.1.0.6/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 6 -j RAWDNAT --to-destination 10.1.0.7/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 7 -j RAWDNAT --to-destination 10.1.0.8/32
COMMIT
WIN!
0
500
1000
1500
2000
2500
3000
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
kPPS
RSS
Вопросы?
На сладкое
А что будет если послать пакет на не
слушаемый порт?
А что если послать много-много пакетов?
Linux 3.5.7
0
100
200
300
400
500
600
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
KPPS
RSS
net/ipv4/ip_output.c
bh_lock_sock(sk);
inet->tos = arg->tos;
sk->sk_priority = skb->priority;
sk->sk_protocol = ip_hdr(skb)->protocol;
sk->sk_bound_dev_if = arg->bound_dev_if;
ip_append_data(sk, &fl4, ip_reply_glue_bits, arg->iov->iov_base, len, 0,
&ipc, &rt, MSG_DONTWAIT);
if ((skb = skb_peek(&sk->sk_write_queue)) != NULL) {
}
if (arg->csumoffset >= 0)
*((__sum16 *)skb_transport_header(skb) +
arg->csumoffset) = csum_fold(csum_add(skb->csum,
arg->csum));
skb->ip_summed = CHECKSUM_NONE;
ip_push_pending_frames(sk, &fl4);
bh_unlock_sock(sk);
Спасибо Эрик!
commit be9f4a44e7d41cee50ddb5f038fc2391cbbb4046
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Jul 19 07:34:03 2012 +0000
ipv4: tcp: remove per net tcp_sock
tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket
per network namespace.
This leads to bad behavior on multiqueue NICS, because many cpus
contend for the socket lock and once socket lock is acquired, extra
false sharing on various socket fields slow down the operations.
To better resist to attacks, we use a percpu socket. Each cpu can
run without contention, using appropriate memory (local node)
Спасибо Эрик!
0
500
1000
1500
2000
2500
3000
3500
4000
4500
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
KPPS
RSS
К черту мушкетеров,
ПЯТНИЦА!

Weitere ähnliche Inhalte

Ähnlich wie Александр Лямин. HOWTO. Высокий пакетрейт на x86-64: берем планку в 14,88 Mpps

How deep is your buffer – Demystifying buffers and application performance
How deep is your buffer – Demystifying buffers and application performanceHow deep is your buffer – Demystifying buffers and application performance
How deep is your buffer – Demystifying buffers and application performanceCumulus Networks
 
realestate and MySQL devops melbourne
realestate and MySQL devops melbournerealestate and MySQL devops melbourne
realestate and MySQL devops melbournemysqldbahelp
 
Where'd all my memory go? SCALE 12x SCALE12x
Where'd all my memory go? SCALE 12x SCALE12xWhere'd all my memory go? SCALE 12x SCALE12x
Where'd all my memory go? SCALE 12x SCALE12xJoshua Miller
 
Cassandra Performance Tuning Like You've Been Doing It for Ten Years
Cassandra Performance Tuning Like You've Been Doing It for Ten YearsCassandra Performance Tuning Like You've Been Doing It for Ten Years
Cassandra Performance Tuning Like You've Been Doing It for Ten YearsJon Haddad
 
Barriers infront the scalable systems
Barriers infront the scalable systemsBarriers infront the scalable systems
Barriers infront the scalable systemsMarian Marinov
 
Introduction to Java Profiling
Introduction to Java ProfilingIntroduction to Java Profiling
Introduction to Java ProfilingJerry Yoakum
 
Broken Performance Tools
Broken Performance ToolsBroken Performance Tools
Broken Performance ToolsC4Media
 
OSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner FischerNETWAYS
 
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner FischerNETWAYS
 
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoringOSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoringNETWAYS
 
Performance tuning
Performance tuningPerformance tuning
Performance tuningJon Haddad
 
Creating "Secure" PHP applications, Part 2, Server Hardening
Creating "Secure" PHP applications, Part 2, Server HardeningCreating "Secure" PHP applications, Part 2, Server Hardening
Creating "Secure" PHP applications, Part 2, Server Hardeningarchwisp
 
Es werde Licht! Monitoring jenseits von tail und grep
Es werde Licht! Monitoring jenseits von tail und grepEs werde Licht! Monitoring jenseits von tail und grep
Es werde Licht! Monitoring jenseits von tail und grepOliver Fischer
 
OSDC 2015: Georg Schönberger | Linux Performance Profiling and Monitoring
OSDC 2015: Georg Schönberger | Linux Performance Profiling and MonitoringOSDC 2015: Georg Schönberger | Linux Performance Profiling and Monitoring
OSDC 2015: Georg Schönberger | Linux Performance Profiling and MonitoringNETWAYS
 
Linux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringLinux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringGeorg Schönberger
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceBrendan Gregg
 
Debugging Ruby
Debugging RubyDebugging Ruby
Debugging RubyAman Gupta
 
Managing PostgreSQL with PgCenter
Managing PostgreSQL with PgCenterManaging PostgreSQL with PgCenter
Managing PostgreSQL with PgCenterAlexey Lesovsky
 

Ähnlich wie Александр Лямин. HOWTO. Высокий пакетрейт на x86-64: берем планку в 14,88 Mpps (20)

How deep is your buffer – Demystifying buffers and application performance
How deep is your buffer – Demystifying buffers and application performanceHow deep is your buffer – Demystifying buffers and application performance
How deep is your buffer – Demystifying buffers and application performance
 
realestate and MySQL devops melbourne
realestate and MySQL devops melbournerealestate and MySQL devops melbourne
realestate and MySQL devops melbourne
 
Where'd all my memory go? SCALE 12x SCALE12x
Where'd all my memory go? SCALE 12x SCALE12xWhere'd all my memory go? SCALE 12x SCALE12x
Where'd all my memory go? SCALE 12x SCALE12x
 
Cassandra Performance Tuning Like You've Been Doing It for Ten Years
Cassandra Performance Tuning Like You've Been Doing It for Ten YearsCassandra Performance Tuning Like You've Been Doing It for Ten Years
Cassandra Performance Tuning Like You've Been Doing It for Ten Years
 
Barriers infront the scalable systems
Barriers infront the scalable systemsBarriers infront the scalable systems
Barriers infront the scalable systems
 
Introduction to Java Profiling
Introduction to Java ProfilingIntroduction to Java Profiling
Introduction to Java Profiling
 
Broken Performance Tools
Broken Performance ToolsBroken Performance Tools
Broken Performance Tools
 
Aerospike & GCE (LSPE Talk)
Aerospike & GCE (LSPE Talk)Aerospike & GCE (LSPE Talk)
Aerospike & GCE (LSPE Talk)
 
OSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
 
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
 
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoringOSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
 
Performance tuning
Performance tuningPerformance tuning
Performance tuning
 
Unix Monitoring Tools
Unix Monitoring ToolsUnix Monitoring Tools
Unix Monitoring Tools
 
Creating "Secure" PHP applications, Part 2, Server Hardening
Creating "Secure" PHP applications, Part 2, Server HardeningCreating "Secure" PHP applications, Part 2, Server Hardening
Creating "Secure" PHP applications, Part 2, Server Hardening
 
Es werde Licht! Monitoring jenseits von tail und grep
Es werde Licht! Monitoring jenseits von tail und grepEs werde Licht! Monitoring jenseits von tail und grep
Es werde Licht! Monitoring jenseits von tail und grep
 
OSDC 2015: Georg Schönberger | Linux Performance Profiling and Monitoring
OSDC 2015: Georg Schönberger | Linux Performance Profiling and MonitoringOSDC 2015: Georg Schönberger | Linux Performance Profiling and Monitoring
OSDC 2015: Georg Schönberger | Linux Performance Profiling and Monitoring
 
Linux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringLinux Performance Profiling and Monitoring
Linux Performance Profiling and Monitoring
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems Performance
 
Debugging Ruby
Debugging RubyDebugging Ruby
Debugging Ruby
 
Managing PostgreSQL with PgCenter
Managing PostgreSQL with PgCenterManaging PostgreSQL with PgCenter
Managing PostgreSQL with PgCenter
 

Mehr von Positive Hack Days

Инструмент ChangelogBuilder для автоматической подготовки Release Notes
Инструмент ChangelogBuilder для автоматической подготовки Release NotesИнструмент ChangelogBuilder для автоматической подготовки Release Notes
Инструмент ChangelogBuilder для автоматической подготовки Release NotesPositive Hack Days
 
Как мы собираем проекты в выделенном окружении в Windows Docker
Как мы собираем проекты в выделенном окружении в Windows DockerКак мы собираем проекты в выделенном окружении в Windows Docker
Как мы собираем проекты в выделенном окружении в Windows DockerPositive Hack Days
 
Типовая сборка и деплой продуктов в Positive Technologies
Типовая сборка и деплой продуктов в Positive TechnologiesТиповая сборка и деплой продуктов в Positive Technologies
Типовая сборка и деплой продуктов в Positive TechnologiesPositive Hack Days
 
Аналитика в проектах: TFS + Qlik
Аналитика в проектах: TFS + QlikАналитика в проектах: TFS + Qlik
Аналитика в проектах: TFS + QlikPositive Hack Days
 
Использование анализатора кода SonarQube
Использование анализатора кода SonarQubeИспользование анализатора кода SonarQube
Использование анализатора кода SonarQubePositive Hack Days
 
Развитие сообщества Open DevOps Community
Развитие сообщества Open DevOps CommunityРазвитие сообщества Open DevOps Community
Развитие сообщества Open DevOps CommunityPositive Hack Days
 
Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...
Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...
Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...Positive Hack Days
 
Автоматизация построения правил для Approof
Автоматизация построения правил для ApproofАвтоматизация построения правил для Approof
Автоматизация построения правил для ApproofPositive Hack Days
 
Мастер-класс «Трущобы Application Security»
Мастер-класс «Трущобы Application Security»Мастер-класс «Трущобы Application Security»
Мастер-класс «Трущобы Application Security»Positive Hack Days
 
Формальные методы защиты приложений
Формальные методы защиты приложенийФормальные методы защиты приложений
Формальные методы защиты приложенийPositive Hack Days
 
Эвристические методы защиты приложений
Эвристические методы защиты приложенийЭвристические методы защиты приложений
Эвристические методы защиты приложенийPositive Hack Days
 
Теоретические основы Application Security
Теоретические основы Application SecurityТеоретические основы Application Security
Теоретические основы Application SecurityPositive Hack Days
 
От экспериментального программирования к промышленному: путь длиной в 10 лет
От экспериментального программирования к промышленному: путь длиной в 10 летОт экспериментального программирования к промышленному: путь длиной в 10 лет
От экспериментального программирования к промышленному: путь длиной в 10 летPositive Hack Days
 
Уязвимое Android-приложение: N проверенных способов наступить на грабли
Уязвимое Android-приложение: N проверенных способов наступить на граблиУязвимое Android-приложение: N проверенных способов наступить на грабли
Уязвимое Android-приложение: N проверенных способов наступить на граблиPositive Hack Days
 
Требования по безопасности в архитектуре ПО
Требования по безопасности в архитектуре ПОТребования по безопасности в архитектуре ПО
Требования по безопасности в архитектуре ПОPositive Hack Days
 
Формальная верификация кода на языке Си
Формальная верификация кода на языке СиФормальная верификация кода на языке Си
Формальная верификация кода на языке СиPositive Hack Days
 
Механизмы предотвращения атак в ASP.NET Core
Механизмы предотвращения атак в ASP.NET CoreМеханизмы предотвращения атак в ASP.NET Core
Механизмы предотвращения атак в ASP.NET CorePositive Hack Days
 
SOC для КИИ: израильский опыт
SOC для КИИ: израильский опытSOC для КИИ: израильский опыт
SOC для КИИ: израильский опытPositive Hack Days
 
Honeywell Industrial Cyber Security Lab & Services Center
Honeywell Industrial Cyber Security Lab & Services CenterHoneywell Industrial Cyber Security Lab & Services Center
Honeywell Industrial Cyber Security Lab & Services CenterPositive Hack Days
 
Credential stuffing и брутфорс-атаки
Credential stuffing и брутфорс-атакиCredential stuffing и брутфорс-атаки
Credential stuffing и брутфорс-атакиPositive Hack Days
 

Mehr von Positive Hack Days (20)

Инструмент ChangelogBuilder для автоматической подготовки Release Notes
Инструмент ChangelogBuilder для автоматической подготовки Release NotesИнструмент ChangelogBuilder для автоматической подготовки Release Notes
Инструмент ChangelogBuilder для автоматической подготовки Release Notes
 
Как мы собираем проекты в выделенном окружении в Windows Docker
Как мы собираем проекты в выделенном окружении в Windows DockerКак мы собираем проекты в выделенном окружении в Windows Docker
Как мы собираем проекты в выделенном окружении в Windows Docker
 
Типовая сборка и деплой продуктов в Positive Technologies
Типовая сборка и деплой продуктов в Positive TechnologiesТиповая сборка и деплой продуктов в Positive Technologies
Типовая сборка и деплой продуктов в Positive Technologies
 
Аналитика в проектах: TFS + Qlik
Аналитика в проектах: TFS + QlikАналитика в проектах: TFS + Qlik
Аналитика в проектах: TFS + Qlik
 
Использование анализатора кода SonarQube
Использование анализатора кода SonarQubeИспользование анализатора кода SonarQube
Использование анализатора кода SonarQube
 
Развитие сообщества Open DevOps Community
Развитие сообщества Open DevOps CommunityРазвитие сообщества Open DevOps Community
Развитие сообщества Open DevOps Community
 
Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...
Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...
Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...
 
Автоматизация построения правил для Approof
Автоматизация построения правил для ApproofАвтоматизация построения правил для Approof
Автоматизация построения правил для Approof
 
Мастер-класс «Трущобы Application Security»
Мастер-класс «Трущобы Application Security»Мастер-класс «Трущобы Application Security»
Мастер-класс «Трущобы Application Security»
 
Формальные методы защиты приложений
Формальные методы защиты приложенийФормальные методы защиты приложений
Формальные методы защиты приложений
 
Эвристические методы защиты приложений
Эвристические методы защиты приложенийЭвристические методы защиты приложений
Эвристические методы защиты приложений
 
Теоретические основы Application Security
Теоретические основы Application SecurityТеоретические основы Application Security
Теоретические основы Application Security
 
От экспериментального программирования к промышленному: путь длиной в 10 лет
От экспериментального программирования к промышленному: путь длиной в 10 летОт экспериментального программирования к промышленному: путь длиной в 10 лет
От экспериментального программирования к промышленному: путь длиной в 10 лет
 
Уязвимое Android-приложение: N проверенных способов наступить на грабли
Уязвимое Android-приложение: N проверенных способов наступить на граблиУязвимое Android-приложение: N проверенных способов наступить на грабли
Уязвимое Android-приложение: N проверенных способов наступить на грабли
 
Требования по безопасности в архитектуре ПО
Требования по безопасности в архитектуре ПОТребования по безопасности в архитектуре ПО
Требования по безопасности в архитектуре ПО
 
Формальная верификация кода на языке Си
Формальная верификация кода на языке СиФормальная верификация кода на языке Си
Формальная верификация кода на языке Си
 
Механизмы предотвращения атак в ASP.NET Core
Механизмы предотвращения атак в ASP.NET CoreМеханизмы предотвращения атак в ASP.NET Core
Механизмы предотвращения атак в ASP.NET Core
 
SOC для КИИ: израильский опыт
SOC для КИИ: израильский опытSOC для КИИ: израильский опыт
SOC для КИИ: израильский опыт
 
Honeywell Industrial Cyber Security Lab & Services Center
Honeywell Industrial Cyber Security Lab & Services CenterHoneywell Industrial Cyber Security Lab & Services Center
Honeywell Industrial Cyber Security Lab & Services Center
 
Credential stuffing и брутфорс-атаки
Credential stuffing и брутфорс-атакиCredential stuffing и брутфорс-атаки
Credential stuffing и брутфорс-атаки
 

Kürzlich hochgeladen

What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 

Kürzlich hochgeladen (20)

What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 

Александр Лямин. HOWTO. Высокий пакетрейт на x86-64: берем планку в 14,88 Mpps

  • 1. Высокий пакетрейт на x86-64: берем планку в 14,88 Mpps <la@highloadlab.com>
  • 2. 01.2012-05.2013 7594 incidents total 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 2012-01 2012-02 2012-03 2012-04 2012-05 2012-06 2012-07 2012-08 2012-09 2012-10 2012-11 2012-12 2013-01 2013-02 2013-03 2013-04 2013-05 SPOOF FULL-CONNECT
  • 3. Что модно? • UDP Flood and amplification. • TCP ( SYN ( open|closed|firewalled) | ACK ) • ICMP Flood ( smurf ) L7 – is out of style
  • 5. Долбанные инопланетяне static unsigned int tcp_timeouts[TCP_CONNTRACK_TIMEOUT_MAX] __read_mostly = { [TCP_CONNTRACK_SYN_SENT] = 2 MINS, [TCP_CONNTRACK_SYN_RECV] = 60 SECS, [TCP_CONNTRACK_ESTABLISHED] = 5 DAYS, [TCP_CONNTRACK_FIN_WAIT] = 2 MINS, [TCP_CONNTRACK_CLOSE_WAIT] = 60 SECS, [TCP_CONNTRACK_LAST_ACK] = 30 SECS, [TCP_CONNTRACK_TIME_WAIT] = 2 MINS, [TCP_CONNTRACK_CLOSE] = 10 SECS, [TCP_CONNTRACK_SYN_SENT2] = 2 MINS, /* RFC1122 says the R2 limit should be at least 100 seconds. Linux uses 15 packets as limit, which corresponds to ~13-30min depending on RTO. */ [TCP_CONNTRACK_RETRANS] = 5 MINS, [TCP_CONNTRACK_UNACK] = 5 MINS, };
  • 6. Кто еще виноват? top - 08:16:23 up 39 min, 1 user, load average: 0.44, 0.16, 0.79 Tasks: 158 total, 2 running, 156 sleeping, 0 stopped, 0 zombie Cpu0 : 0.0%us, 0.0%sy, 0.0%ni, 89.3%id, 0.0%wa, 0.0%hi, 10.7%si, 0.0%st Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu8 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st Cpu9 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu10 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu11 : 0.0%us, 1.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu13 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu14 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 32921100k total, 4598792k used, 28322308k free, 15496k buffers Swap: 0k total, 0k used, 0k free, 83252k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 39 root 20 0 0 0 0 R 100 0.0 0:27.91 [ksoftirqd/8] 1401 root 20 0 0 0 0 S 8 0.0 0:03.05 [kpktgend_8] 5346 root 20 0 0 0 0 S 2 0.0 0:00.34 [kworker/8:0] 5740 root 20 0 19356 1472 1076 R 1 0.0 0:00.12 top
  • 7. Кто еще виноват? ВЫ (забыли настроить сетевой стэк)
  • 8. Cферический сервер в вакууме Intel(R) Xeon(R) CPU E5-2670 x2 X520-DA2 (Intel® 82599ES) Vanilla Linux 3.7.9 modprobe ixgbe(3.13.10) RSS=8
  • 9. Как быть ? AFFINITY > BALANCER %/etc/init.d/irqbalancer stop %grep eth8 /proc/interrupts 123: 19 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-0 124: 0 15 0 0 0 0 0 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-1 125: 0 0 15 0 0 0 0 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-2 126: 0 0 0 15 0 0 0 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-3 127: 0 0 0 0 15 0 0 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-4 128: 0 0 0 0 0 15 0 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-5 129: 0 0 0 0 0 0 17 0 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-6 130: 0 0 0 0 0 0 0 15 1 0 0 0 0 0 0 0 PCI-MSI-edge eth8-TxRx-7
  • 10. Лучше? top - 07:40:25 up 3 min, 1 user, load average: 4.61, 1.29, 0.44 Tasks: 164 total, 9 running, 155 sleeping, 0 stopped, 0 zombie Cpu0 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st Cpu1 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st Cpu2 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st Cpu3 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st Cpu4 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st Cpu5 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st Cpu6 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st Cpu7 : 0.0%us, 0.0%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 50.2%si, 0.0%st Cpu8 : 0.0%us, 0.0%sy, 0.0%ni, 90.2%id, 0.0%wa, 0.0%hi, 9.8%si, 0.0%st Cpu9 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu10 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu13 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu14 : 0.0%us, 1.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 32921100k total, 4597288k used, 28323812k free, 15340k buffers Swap: 0k total, 0k used, 0k free, 83240k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 15 root 20 0 0 0 0 R 96 0.0 0:46.06 [ksoftirqd/2] 23 root 20 0 0 0 0 R 96 0.0 0:46.04 [ksoftirqd/4] 11 root 20 0 0 0 0 R 95 0.0 0:46.04 [ksoftirqd/1] 19 root 20 0 0 0 0 R 95 0.0 0:46.03 [ksoftirqd/3] 27 root 20 0 0 0 0 R 95 0.0 0:46.02 [ksoftirqd/5] 31 root 20 0 0 0 0 R 95 0.0 0:46.08 [ksoftirqd/6] 35 root 20 0 0 0 0 R 95 0.0 0:46.04 [ksoftirqd/7] 3 root 20 0 0 0 0 R 93 0.0 0:45.23 [ksoftirqd/0]
  • 11. Более лучше? # ethtool -K eth8 ntuple on # ethtool -U eth8 flow-type udp4 action -1 Added rule with ID 8189 # ethtool -u eth8 8 RX rings available Total 1 rules Filter: 8189 Rule Type: UDP over IPv4 Src IP addr: 0.0.0.0 mask: 255.255.255.255Dest IP addr: 0.0.0.0 mask: 255.255.255.255 TOS: 0x0 mask: 0xff Src port: 0 mask: 0xffff Dest port: 0 mask: 0xffff VLAN EtherType: 0x0 mask: 0xffff VLAN: 0x0 mask: 0xffff User-defined: 0x0 mask: 0xffffffffffffffff Action: Drop
  • 12. Более лучше! (здесь ~14.88Mpps UDP) Tasks: 163 total, 1 running, 162 sleeping, 0 stopped, 0 zombie Cpu0 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu8 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu9 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu10 : 1.0%us, 0.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu13 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu14 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 32921100k total, 4374344k used, 28546756k free, 7700k buffers Swap: 0k total, 0k used, 0k free, 24036k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 4348 root 20 0 19356 1476 1076 R 1 0.0 0:00.03 top 1 root 20 0 4120 688 588 S 0 0.0 0:01.22 init [3] 2 root 20 0 0 0 0 S 0 0.0 0:00.00 [kthreadd]
  • 13. Поприветствуем Flow Director The flow director filters identify specific flows or sets of flows and routes them to specific queues. The flow director filters are programmed by FDIRCTRL and all other FDIR registers. The 82599 shares the Rx packet buffer for the storage of these filters.
  • 14. Flow Director умеет • Perfect match filters — The hardware checks a match between the masked fields of the received packets and the programmed filters. Masked fields should be programmed as zeros in the filter context. The 82599 support up to 8 K - 2 perfect match filters. • Signature filters — The hardware checks a match between a hash-based signature of the masked fields of the received packet. The 82599 supports up to 32 K - 2 signature filters.
  • 15. Perfect Filter умеют (instanteneously) • VLAN • proto • src_ip/mask • src_port • dst_ip/mask • dst_port • Flexible 2-byte tuple anywhere in the first 64 bytes of the packet (FRAME!)
  • 16. Not so perfect (Выкидыш FlowDirector) • Потребляют память RX buffer (256/512) • Не умеют ЕСЛИ-ТО • Masks are GLOBAL for signature filters • 64b это до обидного мало • Поддерживается ethtool (perfect, buggy) и PF_RING(signature only) Но и на том Intel SPASIBO!
  • 17. Flex Filters (Выкидыши реализации RSS) • 128b of the packet (FRAME!) • 6 filters • Кратковременно отключаются при доступе(R|W) • Нет публично доступного userland конфигуратора.
  • 18. Как быть с TCP SYN? • SYN без Seq Number • SYN без MSS • … и прочие ляпы где можно вывести сигнатуру до первых 128b
  • 19. Как быть с Perfect TCP SYN ? Больно умереть на 400kPPS…
  • 20. Post mortem # ======== # # Samples: 19K of event 'cycles' # Event count (approx.): 12923232073 # # Overhead Command Shared Object Symbo l # ........ ........... ................. .................................... . # 78.74% ksoftirqd/0 [kernel.kallsyms] [k] _raw_spin_lock | --- _raw_spin_lock | |--98.84%-- tcp_v4_rcv | ip_local_deliver_finish | ip_local_deliver | ip_rcv_finish | ip_rcv | __netif_receive_skb | netif_receive_skb | napi_skb_finish | napi_gro_receive | 0xffffffffa005c134 | net_rx_action | __do_softirq | run_ksoftirqd | smpboot_thread_fn | kthread | ret_from_fork
  • 21. net/ipv4/tcp_ipv4.c process: if (sk->sk_state == TCP_TIME_WAIT) goto do_time_wait; if (unlikely(iph->ttl < inet_sk(sk)->min_ttl)) { NET_INC_STATS_BH(net, LINUX_MIB_TCPMINTTLDROP); goto discard_and_relse; } if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb)) goto discard_and_relse; nf_reset(skb); if (sk_filter(sk, skb)) goto discard_and_relse; skb->dev = NULL; bh_lock_sock_nested(sk); ret = 0; if (!sock_owned_by_user(sk)) { [dd] } bh_unlock_sock(sk); sock_put(sk); return ret;
  • 22. Опять кто-то виноват.. • Обнаружитель SYNFLOOD • TCP Cookie Transactions • MD5SUM
  • 23. Запилим Инновационный Костыль! *rawpost :POSTROUTING ACCEPT [15:1548] -A POSTROUTING -s 10.1.0.0/24 -o eth8 -j RAWSNAT --to-source 10.10.40.3/32 COMMIT # Completed on Mon May 20 04:47:30 2013 # Generated by iptables-save v1.4.16.3 on Mon May 20 04:47:30 2013 *raw :PREROUTING ACCEPT [28:2128] :OUTPUT ACCEPT [18:2056] -A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 0 -j RAWDNAT --to-destination 10.1.0.1/32 -A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 1 -j RAWDNAT --to-destination 10.1.0.2/32 -A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 2 -j RAWDNAT --to-destination 10.1.0.3/32 -A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 3 -j RAWDNAT --to-destination 10.1.0.4/32 -A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 4 -j RAWDNAT --to-destination 10.1.0.5/32 -A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 5 -j RAWDNAT --to-destination 10.1.0.6/32 -A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 6 -j RAWDNAT --to-destination 10.1.0.7/32 -A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 7 -j RAWDNAT --to-destination 10.1.0.8/32 COMMIT
  • 24. WIN! 0 500 1000 1500 2000 2500 3000 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 kPPS RSS
  • 26. На сладкое А что будет если послать пакет на не слушаемый порт? А что если послать много-много пакетов?
  • 27. Linux 3.5.7 0 100 200 300 400 500 600 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 KPPS RSS
  • 28. net/ipv4/ip_output.c bh_lock_sock(sk); inet->tos = arg->tos; sk->sk_priority = skb->priority; sk->sk_protocol = ip_hdr(skb)->protocol; sk->sk_bound_dev_if = arg->bound_dev_if; ip_append_data(sk, &fl4, ip_reply_glue_bits, arg->iov->iov_base, len, 0, &ipc, &rt, MSG_DONTWAIT); if ((skb = skb_peek(&sk->sk_write_queue)) != NULL) { } if (arg->csumoffset >= 0) *((__sum16 *)skb_transport_header(skb) + arg->csumoffset) = csum_fold(csum_add(skb->csum, arg->csum)); skb->ip_summed = CHECKSUM_NONE; ip_push_pending_frames(sk, &fl4); bh_unlock_sock(sk);
  • 29. Спасибо Эрик! commit be9f4a44e7d41cee50ddb5f038fc2391cbbb4046 Author: Eric Dumazet <edumazet@google.com> Date: Thu Jul 19 07:34:03 2012 +0000 ipv4: tcp: remove per net tcp_sock tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket per network namespace. This leads to bad behavior on multiqueue NICS, because many cpus contend for the socket lock and once socket lock is acquired, extra false sharing on various socket fields slow down the operations. To better resist to attacks, we use a percpu socket. Each cpu can run without contention, using appropriate memory (local node)