際際滷

際際滷Share a Scribd company logo
Artem Nikitin
DevDays Vilnius| May, 2018
Go implementation
for Flatdata
A long time ago in a galaxy far, far away....
Routing team at HERE Technologies was in the situation
when routing backend was in danger of failing SLA.
Its triggered a massive work to improve its performance.
As a side effect of this work, Flatdata was born: zero-copy
memory-mapped data storage
息 2018 HERE | PublicDevDays Vilnius | May, 2018
01
What is routing?
Routing
 Creating a path from point A to point B
 Hmm It looks familiar
 Can we represent a map as a graph and then use
Dijkstra?
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Routing
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Routing
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Routing
 It looks easy, but reality is different
 What about negative turn costs?
 What to do with time awareness?
 How to handle dynamic info?
 Usually, different algorithms are used for different
cases
息 2018 HERE | PublicDevDays Vilnius | May, 2018
02
Go? Never heard 看韓
Go
 Initially developed by:
Ken Thompson (C, Unix)
Rob Pike (Plan9, UTF-8)
Robert Griesemer (V8)
 Statically typed, compiled
 With GC
 Created with support for concurrency in mind
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Go
Why do we need yet another language?
 Fast compilation
 Simple syntax and quick start
 One code style!
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Go
What is written in Go?
 Docker
 Kubernetes
 HashiCorp tools
 Prometheus and many more
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Go
What I like in Go:
 Simplicity and even primitivity
No more AbstractSingletonProxyFactoryBean ever J
 Code which is easy to understand
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Go
What I DONT like about Go:
 Dependency management
 Ecosystem isnt that good comparing to more mature
languages, like Java
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Go
If you are interested, then:
 https://tour.golang.org
 The Go Programming Language
(Alan A. A. Donovan, Brian W. Kernighan)
息 2018 HERE | PublicDevDays Vilnius | May, 2018
03
Flatdata
What is Flatdata?
https://github.com/heremaps/flatdata
Flatdata is a library providing data structures for convenient
creation, storage and access of packed memory-mappable
immutable data structures with minimal overhead
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Why Flatdata?
Existed data format was optimized for embedded clients:
 Reduced space consumption (Disk)
 Incremental processing tile-by-tile (CPU)
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Why Flatdata?
Drawbacks for backend:
 Decoding tiles and reconstructing world scene (CPU
Time)
 Custom tile cache not shared between processes (RAM)
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Solution
 Store the data in directly-accessible memory-mapped
files
 Store it sorted to preserve locality
 Use constant-time lookups wherever possible
 Avoid unnecessary copying of data
息 2018 HERE | PublicDevDays Vilnius | May, 2018
When it's useful?
 Your data updates infrequently and accessed much
more often than updated
 You can afford to recreate the full data archive on
every update
 Your data fits in the RAM on target instance
 You want to optimize your data to be cache-friendly
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Alternatives
We looked at several alternatives with Flatbuffers as
most well-known.
We prototyped a routing graph with 10M nodes and
compared our solution with Flatbuffers
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Prototype
Graph creation
息 2018 HERE | PublicDevDays Vilnius | May, 2018
cpu_time (ms) peak memory consumption (kb)
flatbuffers 1 936.2096 1 486 116
reference 1 446.0785 743 640
Prototype
Dijkstra
息 2018 HERE | PublicDevDays Vilnius | May, 2018
cpu_time (ms) peak memory consumption (kb)
flatbuffers 62 526.3584 1 735 836
reference 57 970.6155 1 718 032
Prototype
BFS
息 2018 HERE | PublicDevDays Vilnius | May, 2018
cpu_time (ms) peak memory consumption (kb)
flatbuffers 10 997.4264 1 163 352
reference 9 300.9597 1 144 048
Prototype
息 2018 HERE | PublicDevDays Vilnius | May, 2018
 Double memory footprint during graph compilation
due to immutability of Flatbuffers
 Performance is a little bit worse then our solution
 Possibilities for fine-grained optimizations are limited
Flatdata
Library consists of:
 Schema language
 Code generator for C++, Python and Go
 Target language libraries.
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Schema
息 2018 HERE | PublicDevDays Vilnius | May, 2018
How it works
息 2018 HERE | PublicDevDays Vilnius | May, 2018
How it works
息 2018 HERE | PublicDevDays Vilnius | May, 2018
How it works
息 2018 HERE | PublicDevDays Vilnius | May, 2018
How it works
息 2018 HERE | PublicDevDays Vilnius | May, 2018
Implementations comparison
息 2018 HERE | PublicDevDays Vilnius | May, 2018
C++
feature
complete
very efficient used in production has writer
Python
feature
complete
slow debugging, tooling etc. no writer
Go
feature
complete
performance
unclear
beta no writer
Current status of Go implementation
 It's feature complete comparing to C++/Python
 It's passes tests for C++/Python implementations
 Probably, a lots of bugs missed :)
 Performance is unclear
 No guarantees of backward compatible changes for
public API
 No writer :(
息 2018 HERE | PublicDevDays Vilnius | May, 2018
04
Its demo time J
https://github.com/artemnikitin/flatdata-go-coappearances-example
Thank you
Contact
Artem Nikitin hi@artemnikitin.com artemnikitin artemnikitin
Go implementation for Flatdata

More Related Content

Go implementation for Flatdata

  • 1. Artem Nikitin DevDays Vilnius| May, 2018 Go implementation for Flatdata
  • 2. A long time ago in a galaxy far, far away.... Routing team at HERE Technologies was in the situation when routing backend was in danger of failing SLA. Its triggered a massive work to improve its performance. As a side effect of this work, Flatdata was born: zero-copy memory-mapped data storage 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 4. Routing Creating a path from point A to point B Hmm It looks familiar Can we represent a map as a graph and then use Dijkstra? 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 5. Routing 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 6. Routing 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 7. Routing It looks easy, but reality is different What about negative turn costs? What to do with time awareness? How to handle dynamic info? Usually, different algorithms are used for different cases 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 9. Go Initially developed by: Ken Thompson (C, Unix) Rob Pike (Plan9, UTF-8) Robert Griesemer (V8) Statically typed, compiled With GC Created with support for concurrency in mind 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 10. Go Why do we need yet another language? Fast compilation Simple syntax and quick start One code style! 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 11. Go What is written in Go? Docker Kubernetes HashiCorp tools Prometheus and many more 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 12. Go What I like in Go: Simplicity and even primitivity No more AbstractSingletonProxyFactoryBean ever J Code which is easy to understand 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 13. Go What I DONT like about Go: Dependency management Ecosystem isnt that good comparing to more mature languages, like Java 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 14. Go If you are interested, then: https://tour.golang.org The Go Programming Language (Alan A. A. Donovan, Brian W. Kernighan) 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 16. What is Flatdata? https://github.com/heremaps/flatdata Flatdata is a library providing data structures for convenient creation, storage and access of packed memory-mappable immutable data structures with minimal overhead 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 17. Why Flatdata? Existed data format was optimized for embedded clients: Reduced space consumption (Disk) Incremental processing tile-by-tile (CPU) 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 18. Why Flatdata? Drawbacks for backend: Decoding tiles and reconstructing world scene (CPU Time) Custom tile cache not shared between processes (RAM) 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 19. Solution Store the data in directly-accessible memory-mapped files Store it sorted to preserve locality Use constant-time lookups wherever possible Avoid unnecessary copying of data 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 20. When it's useful? Your data updates infrequently and accessed much more often than updated You can afford to recreate the full data archive on every update Your data fits in the RAM on target instance You want to optimize your data to be cache-friendly 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 21. Alternatives We looked at several alternatives with Flatbuffers as most well-known. We prototyped a routing graph with 10M nodes and compared our solution with Flatbuffers 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 22. Prototype Graph creation 息 2018 HERE | PublicDevDays Vilnius | May, 2018 cpu_time (ms) peak memory consumption (kb) flatbuffers 1 936.2096 1 486 116 reference 1 446.0785 743 640
  • 23. Prototype Dijkstra 息 2018 HERE | PublicDevDays Vilnius | May, 2018 cpu_time (ms) peak memory consumption (kb) flatbuffers 62 526.3584 1 735 836 reference 57 970.6155 1 718 032
  • 24. Prototype BFS 息 2018 HERE | PublicDevDays Vilnius | May, 2018 cpu_time (ms) peak memory consumption (kb) flatbuffers 10 997.4264 1 163 352 reference 9 300.9597 1 144 048
  • 25. Prototype 息 2018 HERE | PublicDevDays Vilnius | May, 2018 Double memory footprint during graph compilation due to immutability of Flatbuffers Performance is a little bit worse then our solution Possibilities for fine-grained optimizations are limited
  • 26. Flatdata Library consists of: Schema language Code generator for C++, Python and Go Target language libraries. 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 27. Schema 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 28. How it works 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 29. How it works 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 30. How it works 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 31. How it works 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 32. Implementations comparison 息 2018 HERE | PublicDevDays Vilnius | May, 2018 C++ feature complete very efficient used in production has writer Python feature complete slow debugging, tooling etc. no writer Go feature complete performance unclear beta no writer
  • 33. Current status of Go implementation It's feature complete comparing to C++/Python It's passes tests for C++/Python implementations Probably, a lots of bugs missed :) Performance is unclear No guarantees of backward compatible changes for public API No writer :( 息 2018 HERE | PublicDevDays Vilnius | May, 2018
  • 34. 04 Its demo time J https://github.com/artemnikitin/flatdata-go-coappearances-example
  • 35. Thank you Contact Artem Nikitin hi@artemnikitin.com artemnikitin artemnikitin