You have complex mathematical models (millions of cells, hundreds thousand of formulas) in Excel. And you need to run it browser and mobile without excel. I will talk how we created own spreadsheet engine compatible with MS Excel which allows us to run any Excel model without Excel. I will talk about:
* Architecture
* Algorithms
* JavaScript performance optimization.
1 of 62
Download to read offline
More Related Content
How to create a high performance excel engine in java script
1. Create own Excel with
JavaScript
Viktor Turskyi
CEO at WebbyLab
2019
2. Viktor Turskyi
CEO and principal architect at
WebbyLab
Open source developer
More than 15 years of experience
Delivered more than 60 projects of
different scale
Did projects for 5 companies from
Fortune 500 list
3. Business task
Business logic is in Excel and you need to code
it in your app and run in browser and mobile.
This business logic is complex mathematical
models.
8. Requirements
High performance (<2s full recompute)
Small file size (suitable for work in
browser)
Offline work in browser
Work on server
Offline work on tablets (iOS, Android)
28. Mistake 1: Trying to write own parser
from scratch
Own parser
1. Complex
2. A lot of time
3. Expensive
29. 90% of the work is the same as
writing a parser for programming
language
30. Good solution - ANTLR
1. Parser generator based on Grammars
(including JS)
2. Lexer and Parser
3. Emits AST (Abstract Syntax Tree)
4. The fastest and the most powerful
http://www.antlr.org/
34. Components
LocalRunner(Engine) - works with model,
processes all cells dependencies
Formula Evaluator - computes one
formula
Address Parser - parses address in
runtime
Functions - Excel functions
implementation
37. Implementation of EXCEL functions
One function - one module.
No side effects
Use dependency injection
Test test test test (excel functions often
does not work as documented)
Call example:
SQRT([ 9 ]) returns 3
SUM([2, [5, 6, 7, 9], 1 ]) returns 30
47. We can represent dependencies in
form of directed acyclic graph (DAG)
A1=1;
A2=A1+2;
A3=A1+A2;
Now we can recompute
dependent cells on
changes
48. Mistake 3: reay on synthetic models
too much
Test model with million cells -
2 seconds for recompute
Real model with million cells
- 1 hour for recompute
Reason: we recompute the
same cells several times
49. You can sort your dependency graph
with topological sort
Each cell will be calculated only one time
50. We did it: it worked for test files but did not
work on real models. Why?
51. Reason: Our graph is more than 10k nodes
deep. We got stack overflow (JS limits call
stack to 10k frames).
52. What to do
Do not use recursion, traverse graph manually
with own stack.
Real model results:
No toposort - 1 hour
With toposort - 6 seconds
56. OFFSET breaks everything
=OFFSET(D3, 3, -2) - displays the value in cell
B6
A2 = A1+OFFSET(D3, 3, -2).
Does A2 depend on B6? Do we have any
problem with it?
60. Conclusion
Dependency injection (and SOLID) everywhere
Make everything modular.
You will need a lot of tests. There are tons of
edge cases in Excel behavior.
Measure performance on real models.
You need to have some sort of automatic model
tester.
Create convenient debug tools (you will spent a
lot of time debugging)
Understand how V8 works