Networking (Containers) in Ultra-Low-Latency Environments
Avi Deitcher
avi@atomicinc.com

Akh-san-ya /aksnaja/ n. (ancient Aramaic, from Ancient Greek xénos) 1: Hospitality, lodging; 2: Host.
	
Ancient Jewish custom to begin public speaking by honouring or thanking the hosts.
	
Who Am I?
Avi Deitcher (not 24601)
• Life in tech business:
  - 10 yrs financial services IT
  - 10+ yrs consulting & training
  - Some startups on the way
• Avid (if not very good) ice hockey player
• Long-time lover of great engineering, when used to make a real difference
• Atomic Inc:
  - Consulting
  - Training

A Little History
Summer 2015
• Fintech X: Help us containerize!
  - Hint: It is harder than you think and worth it
  - Culture/process > technology
• Question: Networking?
• Answer: Scientific method
Summer 2016
• Good practice demands:
  1. Redo tests with new options and versions
  2. Make tests available
  3. Explain it all well

What Is Ultra-Low Latency?
every 100ms of delay costs 1% of sales[1]
extra 0.5s in search page generation time dropped traffic by 20%[2]
Not. Even. Close.
1. http://home.blarg.net/%7Eglinden/StanfordDataMining.2006-11-29.ppt
2. http://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html

Ultra-Low Latency
38 messages in 7 milliseconds
1 message (avg) every 184 µsec!
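
The per-message figure follows from dividing the time window by the message count. A minimal Python sketch of that arithmetic, using only the two numbers quoted on the slide (the variable names are illustrative, not from the talk):

```python
# Figures quoted on the slide: 38 messages arriving within 7 milliseconds.
messages = 38
window_ms = 7

# Average spacing between messages, converted from milliseconds to microseconds.
interval_us = window_ms * 1000 / messages

print(f"{interval_us:.0f} µsec per message on average")
```

At that rate, a few extra microseconds of network overhead per message is a visible fraction of the whole budget, which is why the tests below focus on microsecond-level differences.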

Networking Workloads
• Networked Workloads: things that do work and must talk
• Same principles for all workloads:
  - VMs
  - Cloud
  - Serverless
  - Containers

Two Types of Networking
Direct, Fabric+Overlay
... maybe four
Workload Awareness, Fabric Awareness

Networking Options
Direct: Metal, macvlan, Bridge/vSwitch (no NAT), net=host, SR-IOV
Overlay: Flannel, Weave, Docker Overlay, Calico (IPIP)
Workload Awareness: Docker bridge (NAT)
Fabric Awareness: Calico (Native)

Our Tests
What We Tested
• netperf and netserver
• UDP & TCP request/response (netperf RR tests)
• Sizes: 300, 500, 1024, 2048 bytes
• No orchestration = complete control
• 50000 iterations
  - Law of large numbers
• Latency (Avg, %iles), CPU
• Differentials, not absolutes
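
The test matrix above can be sketched as a small command generator. This is an illustration under stated assumptions, not the author's actual harness (that lives in the linked repository): the host address is a placeholder, the helper names are hypothetical, and the netperf flags used (`-H`, `-t`, `-l`, `-c`, `-C`, and the test-specific `-r`) should be checked against the netperf version you run.

```python
import shlex

# Values taken from the slide.
SIZES = [300, 500, 1024, 2048]   # request/response payload sizes
ITERATIONS = 50000               # transactions per run
TESTS = ["TCP_RR", "UDP_RR"]     # netperf request/response test names


def netperf_command(host: str, test: str, size: int) -> list[str]:
    """Build one netperf invocation (hypothetical helper, assumed flag set)."""
    return [
        "netperf",
        "-H", host,                # address of the machine running netserver
        "-t", test,                # which test to run
        "-l", str(-ITERATIONS),    # negative length: count transactions, not seconds
        "-c", "-C",                # report local and remote CPU utilisation
        "--",                      # test-specific options follow
        "-r", f"{size},{size}",    # request size, response size
    ]


def all_commands(host: str) -> list[list[str]]:
    """One command per (test, size) combination."""
    return [netperf_command(host, t, s) for t in TESTS for s in SIZES]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address; substitute the netserver host.
    for cmd in all_commands("192.0.2.10"):
        print(shlex.join(cmd))
```

Generating the commands rather than typing them keeps every networking option on exactly the same parameters, which is what makes the differentials comparable.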

How We Tested
• packet.net
  - Because it had to be metal
  - Wicked smart team
• Complete test run
  - Network changes
  - Hardware variations, errors
https://github.com/deitch/network-tests

Local vs. Remote

Local Networking Summary
• SR-IOV: horrible latency but great CPU
  - Hold that thought
• net=host on par with metal
• macvlan closest virtualized option to metal
• Rest in same range:
  - Latency: 5-10 µsec overhead
  - CPU: negligible difference
• Calico (IPIP & native) & Docker overlay slightly more performant
• Watch out for very large TCP packets
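
Overhead figures like those in the summary above are differentials against the metal baseline. A minimal sketch of how such a differential could be computed from raw latency samples follows; the function names are hypothetical and the sample values are invented for illustration, not measurements from the talk.

```python
import statistics


def summarize(latencies_us: list[float]) -> dict[str, float]:
    """Mean plus 50th/90th/99th percentiles of round-trip latencies in µsec."""
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = statistics.quantiles(latencies_us, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(latencies_us),
        "p50": cuts[49],
        "p90": cuts[89],
        "p99": cuts[98],
    }


def overhead(option: dict[str, float], baseline: dict[str, float]) -> dict[str, float]:
    """Differential: how much slower an option is than the baseline, per statistic."""
    return {key: option[key] - baseline[key] for key in baseline}


if __name__ == "__main__":
    # Hypothetical samples, NOT data from the talk.
    metal = [40.0, 41.0, 42.0, 43.0, 44.0]
    bridge = [47.0, 48.0, 49.0, 50.0, 51.0]
    print(overhead(summarize(bridge), summarize(metal)))
```

Comparing percentiles as well as the mean matters for latency-sensitive workloads, since two options with the same average can differ widely in their tail.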

Remote Networking Summary
• Weave (sleeve) adds latency and CPU
  - Reason for fast datapath
• Again, macvlan best virtualized
• All the rest:
  - Latency: within 50 µsec of each other, except SR-IOV with very large TCP packets
  - CPU: similar, but keep an eye on Flannel (UDP)

About that SR-IOV
Type 1: Intel I350 1Gbps
Type 3: Mellanox MT27500 ConnectX-3 10Gbps

SR-IOV
SR-IOV does not automatically mean better
• Switch in network card
• Trades host CPU for card processor
• Quality varies dramatically
  - Even Mellanox far worse locally
• My 2¢: SR-IOV falls further behind due to:
  - Speed of iteration
  - Open-source
  - Software + CPU

Headaches (and Thanks)
• Headaches
  - Weave SYN-(nothing)
  - etcd is touchy
  - Packet L3 network is powerful but unique
    • Macvlan, weave, flannel: all required pings for MAC
    • Setting up bridge w/o NAT, Calico, macvlan was different
  - SR-IOV is complicated and flaky, especially Mellanox
  - netperf with UDP packets can get stuck (Calico-ipip)
  - And a whole lot more (ask me offline)
• And thanks:
  - Bryan Boreham, Adam Harrison at weave.works
  - Zac Smith, Adam, Aaron, Andy, Lucas, everyone at Packet

What else could we do?
Other hardware types
Other network fabrics
Docker macvlan network driver (experimental)
Ipvlan
Other packet sizes
Kernel and network stack tuning
Distant (and VPN) networks
Other traffic patterns
Other host-to-host encryption
A whole lot more

Conclusions
• SR-IOV: most of the time, just not worth it
• Performance:
  - Metal (+ net=host): always performs best
  - Direct network++: macvlan is your friend
  - Others: Roughly similar, careful of Weave (sleeve)
• What's your use case?
  - ULL: Metal/net=host > macvlan > calico > overlay
  - Everything else: Focus on your architecture and skills
Pick intelligently: easier, not simple
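
The ULL preference order can be expressed as a simple tiered lookup. The sketch below encodes only the ranking stated on the slide; the option labels and the `best_for_ull` helper are hypothetical names chosen for illustration, and a real choice would also weigh architecture and team skills, as the slide notes.

```python
# Preference tiers for ultra-low-latency (ULL) workloads, best first,
# following the ordering given on the slide. Metal and net=host share a tier.
ULL_TIERS = [
    {"metal", "net=host"},
    {"macvlan"},
    {"calico"},
    {"overlay"},
]


def best_for_ull(available: set[str]) -> set[str]:
    """Return the members of the highest-ranked tier that are available."""
    for tier in ULL_TIERS:
        usable = tier & available
        if usable:
            return usable
    # None of the ranked options is available.
    return set()


if __name__ == "__main__":
    print(best_for_ull({"overlay", "macvlan"}))
```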

Questions and help:
@avideitcher avi@atomicinc.com

LinuxCon/ContainerCon Japan 2016 "Networking Containers in Ultra-Low Latency Environments"