In this talk I describe my experience as CTO of the Big Data consulting firm Datasalt from 2011 to 2016, the main use cases we delivered for clients, and the lessons learned from that experience.
Running a small, high tech consulting firm - lessons learned
1. Running a small, high-tech consulting firm: lessons learned
Pere Ferrera Bertran
Hispanic Startups Meetup, Berlin 27/11/17
2. About me
Pere Ferrera Bertran
12+ years as a backend developer: Java, Python
Barcelona (2005-2012), Berlin (2012+)
CTO Datasalt (2011-2016)
Amateur jazz piano player
4. Datasalt
Datasalt: Big Data (tech.) consulting company.
Developing proofs of concept, teaching, etc.
From 3-6 month projects to 1-day consulting.
5. Why found Datasalt?
2011. After having worked in several start-ups, we wanted our own.
We decided to exploit our competitive advantage: (mostly) Hadoop.
Early adopters (2008).
Years of Big Data hype to come.
Iván de Prado (CEO) and me: good work mates and friends.
8. Online marketing
Top use case in our history.
Probably the most challenging industry in terms of scalability.
Aggregate billions of impressions / clicks and produce meaningful reports.
Aggregate activity from billions of devices + external datasets and make sense of it.
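The aggregation described above can be sketched as a group-by over raw events. This is only a minimal in-memory illustration, not the Hadoop or Spark jobs used in the actual projects; the event fields (`campaign`, `country`, `clicked`, `cost`) and the choice of grouping key are assumptions made for the example.

```python
from collections import defaultdict


def aggregate(events):
    """Sum impressions, clicks and spend per (campaign, country).

    Each event is assumed to be one impression; field names are hypothetical.
    """
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0, "spend": 0.0})
    for event in events:
        row = totals[(event["campaign"], event["country"])]
        row["impressions"] += 1
        row["clicks"] += int(event["clicked"])  # True counts as one click
        row["spend"] += event["cost"]
    return dict(totals)


# Tiny hypothetical sample; real inputs would be billions of rows.
events = [
    {"campaign": "c1", "country": "ES", "clicked": True, "cost": 0.5},
    {"campaign": "c1", "country": "ES", "clicked": False, "cost": 0.25},
    {"campaign": "c1", "country": "DE", "clicked": False, "cost": 0.25},
]
report = aggregate(events)
```

At scale the same logic becomes a map step that emits the grouping key and a reduce step that sums the counters, which is why the problem fits batch frameworks well.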
9. Online marketing
Exads (2013-2017): Reporting over 5 billion daily impressions.
Our own technology (Splout SQL) at the core of their reporting solution!
Exact reporting: how much has been spent on campaign / country / ?
Adex (2013-2016): 2000+ segments exported daily.
Hadoop first, Spark later.
Multi-stage pipeline, data aggregation, analysis, inference (age, gender, interests)
Bidmotion (2015): Machine learning over billions of impressions
What kind of traffic converts for what campaign? (predict clicks!)
10. Online reputation
Aggregate activity from Twitter and other networks.
Show a reputation score to the user, plus other insights.
Many challenges involved:
Crawling
Graph analysis
Machine learning: topic modeling (interests / topics)
Use case: PeerIndex (2011-2012)
Complete re-architecting >> 2x improvement in throughput.
Ability to easily scale horizontally (more Twitter profiles - more machines).
Hadoop at the core.
11. Classifieds
Prepare & index many data sources so they can be searched quickly.
Analyze the history of user queries (internal usage).
E.g. system akin to Google Trends.
Use case: Trovit (2011-2013)
Helped in re-architecting the full pipeline, with Hadoop + SOLR at the core.
Helped in other mission-critical processes and complementary internal systems.
12. Other use cases
Financial transactions, BBVA (2012-2013): http://highscalability.com/blog/2013/1/7/analyzing-billions-of-credit-card-transactions-and-serving-l.html
Aggregate transaction data to help merchants understand their clients.
Enable loyalty programs.
15. Business model
Conversation with a mentor after our first deal:
He - How is it going? Did you think of a product yet?
Us - We're quite happy. We got our first long-term deal with XXXX and they pay $$$ :-)
He - Ok, now you're never going to do anything outside consulting then.
16. Business model
We were 2 people, quickly became 3.5, and thought about starting to scale, but we actually scaled down to try to focus on product ideas.
The minimum viable international company!
1 x Spain
1 x Germany
17. Business model
Some product attempts: 100% technological, niche usefulness.
Pangool, Splout SQL.
Pangool: a better Java API to Hadoop
Splout SQL: distributed read-only SQL, easy to use with Hadoop
Open-sourced them.
End result: talks at conferences, more clients.
19. Founding team
Two techs, with some personality differences (extroverted / introverted, more / less risk averse).
Both a bit stubborn :-)
In the end, two techs.
22. Founding team: lessons learned
We were too similar and lacked more of a business co-founder.
Heterogeneity in a founding team is important; otherwise deadlocks might happen.
23. Pricing
We learned slowly
First deal: A full retreat week in Slovenia, the two of us, for:
But we were quite happy after that: it got us a 1+ year deal with a cool startup!
24. Pricing
We found fixed price budgeting useful.
Price = expected hours worked * price per hour
How much value will this solution bring to my client?
High-tech consulting
Client can't hire a similar profile easily (often impossible to find)
HR costs, interviewing process, test period, contract costs,
Early-stage startups: reduce TTM for an MVP from 1 year to 3 months
How much value does that bring?
25. Pricing: case
Client had a problem: a slow batch process (it took many days to complete)
We proposed the following billing scheme:
We create a new solution to this process and compare its running time.
1X$ if our improved process runs in less than 1 day.
2X$ if it runs in less than 12 hours.
4X$ if it runs in less than 6 hours.
8X$ if it runs in less than 3 hours.
In the end, we billed 8X!
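The tiered scheme above maps a measured running time to a multiple of the base fee X. A minimal sketch follows; the function name is hypothetical, and billing nothing when the 1-day target is missed is an assumption, since the slide does not say what happens in that case.

```python
# (upper bound in hours, multiple of the base fee X), fastest tier first
TIERS = [(3, 8), (6, 4), (12, 2), (24, 1)]


def fee_multiplier(runtime_hours: float) -> int:
    """Return how many times X is billed for a given running time."""
    for upper_bound, multiplier in TIERS:
        if runtime_hours < upper_bound:
            return multiplier
    # Assumption: nothing is billed if the process still takes a day or more.
    return 0
```

Each halving of the running time doubles the fee, so the price tracks the value delivered to the client rather than the hours worked.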
26. Pricing: lesson learned
Know your client (to whom you bring the maximum value).
Price per hour as a function of the context:
High-tech? Europe / US? Kind of project? Kind of client? Remote / On-site?
Higher value for the client = higher rate.
Do a fair estimate, but take uncertainty into account.
Take anchoring bias into account!
27. Getting clients: lesson learned
Networking.
Going to events, giving talks.
Writing good blog posts, papers.
Open-sourcing things.
In the end: building (and maintaining) a reputation.
Never needed cold calls.
28. Tech. stuff: lessons learned
Keep your standards very high (comments, documentation, unit tests).
More chances for the project to remain on a high standard afterwards.
Deliver actionable documentation.
Don't be afraid to deliver a small code base. In programming, reducing complexity is harder than adding unnecessary complexity.
29. Humans: lessons learned
Contracts: be clear and consistent (conditions).
Write comprehensive contracts / proposals!
Not everybody likes the same explanation style.
Have strong opinions, but in the end your client decides!
31. Why close Datasalt?
We didn't find a product-based business model together.
We got tired of the small high-tech consulting model.
We could have tried to stay like this ad infinitum, but preferred to explore other ways before we got too used to this.
Success or failure?
Failure: We didn't achieve our original idea.
Success: We built something nice, enjoyed a nice lifestyle and learned a lot.
Failure: We didn't become millionaires.
Success: We closed with a positive cash flow and split dividends over the last few years.
33. Sum up & future
Currently freelancing, and looking to co-found a new venture.
Trends I see:
In Big Data, big switch to real-time architectures.
Real-time Big Data frameworks are nowadays good enough (e.g. Kafka, Flink, Spark Streaming).
Fewer and fewer infrastructure problems, everything as a service.
More machine learning / AI. Deep learning.
Not only image classification / tagging, but also text labeling, sentiment analysis...
Potentially even click prediction!
Blockchain?