© 2016 Ness SES. All Rights Reserved
BIG DATA
Introduction
MOLDOVAN Radu Adrian
Timisoara - HackTM 2016
Who am I? :)
● passionate about technology
● 20 years of programming using open source
● last 4 years in Big Data
● Big Data Architect @
How / Who / What ... Big Data
How big is big data?
● Petabytes (10^15 bytes)
● Exabytes (10^18 bytes)
● Zettabytes (10^21 bytes)
● Yottabytes (10^24 bytes)
Who benefits from it?
● 3 Bn Internet users
● 4.6 Bn mobile users
What are the drivers for Cloud/Big Data?
● information age
● flooded by data
● data grows exponentially
● data complexity
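To get a feel for these magnitudes, here is a rough back-of-the-envelope sketch of how long it would take to read each volume sequentially. The 150 MB/s disk throughput and the 1000-disk cluster are assumed figures chosen only for illustration; they do not come from the slides.

```python
# Decimal (SI) byte units as listed on the slide.
UNITS = {
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
}

# Assumption for illustration only: one disk streaming at 150 MB/s.
DISK_BYTES_PER_SECOND = 150 * 10**6
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_scan(num_bytes, disks=1):
    """Years needed to read num_bytes sequentially, split evenly over `disks`."""
    seconds = num_bytes / (DISK_BYTES_PER_SECOND * disks)
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    for name, size in UNITS.items():
        print(
            f"1 {name}: {years_to_scan(size):,.1f} years on one disk, "
            f"{years_to_scan(size, disks=1000):,.3f} years on 1000 disks"
        )
```

Under these assumptions a single disk needs a few months for one petabyte, which is one way to see why the work has to be spread across many machines.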
The 3 Vs are no longer enough
VOLUME: records, transactions, logs, tracking data
VARIABILITY: semantics/meaning is changing
VISUALIZATION: complex graphs, interactive trees
VALUE: value for organisations, societies and consumers
VELOCITY: real time, near real time, streams, batch
VARIETY: structured, unstructured (80%), semi-structured
Cloud Computing
PRIVATE CLOUD: high level of security, enterprise applications, private network, single customer
PUBLIC CLOUD: non-sensitive data, internet connection, shared infrastructure, multiple customers
HYBRID CLOUD
● on-demand provisioning of resources
● scalability & elasticity
● cost per unit of computation / memory usage / drive usage
● operational and maintenance services
● 24/7 support + disaster assistance
● secure storage
● tools for monitoring and upgrades
… where Enterprise ends and Big Data starts
[Diagram: www.XYZ.com -> Load Balancers 1..n -> Web Servers 1.1..n.x -> read/write -> Database, search index, Cache]
← Single Point of Failure
← Limited Scalability
… where Enterprise ends and Big Data starts
[Diagram: www.XYZ.com -> Load Balancers 1..n -> Web Servers 1.1..n.x -> read/write -> distributed back end]
noSQL Ring: nodes 1..5
search: nodes 1..n
DFS + Resource Manager
Cluster nodes 1..n: HDDs, CPU, RAM
Cluster services: DFS, MPP, Resource Manager
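The slide does not say how the noSQL ring assigns data to its nodes. A common technique in ring-based stores is consistent hashing, sketched below as a toy example. The `HashRing` class, the node names and the choice of 64 virtual nodes per machine are all hypothetical and serve only to illustrate the idea.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring: each key is owned by the next node clockwise."""

    def __init__(self, nodes, vnodes=64):
        # Each node is placed at `vnodes` pseudo-random positions on the ring
        # so that keys spread more evenly across the nodes.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in self._ring]

    @staticmethod
    def _hash(value):
        # SHA-256 is used only as a stable, well-mixed hash, not for security.
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key):
        """Return the node responsible for `key`."""
        idx = bisect.bisect_right(self._positions, self._hash(key))
        # Wrap around to the first position when the key hashes past the end.
        return self._ring[idx % len(self._ring)][1]


if __name__ == "__main__":
    ring = HashRing(["node1", "node2", "node3", "node4", "node5"])
    for key in ("user:42", "order:1001", "session:abc"):
        print(key, "->", ring.node_for(key))
```

The property that matters for scalability is that removing or adding a node only moves the keys that node owned; every other key stays where it was, unlike a plain `hash(key) % n` scheme.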
Architecture - High Level
MULTI CHANNEL DELIVERY: Dashboard, Laptop, Mobile/Tablet, Email, SMS, Print
ANALYTICS LAYER: Realtime, Near Realtime, Reports + Statistics, Custom Tools
INFORMATION LAYER
Data Ingestion/Extraction
- external data
- reference internal data
- discovery data
Data Processing
- system generated data
- dimensional data
- de/normalize data
Data Loading
- operational data
- business information data
INFRASTRUCTURE LAYER: Database, Analytics, Bigdata
Big Data - ETL + BI
[Diagram: data sources -> ETL -> distributed storage and processing -> BI]
Data sources: ERP, Flat Files, CRM, Live Stream, RDBMS, Web Services
ETL: Extract, Transform, Load
Massively Parallel Processing (MPP) on a Distributed System
Data stores: noSQL DB, warehouse DB (OLAP), search engines
BI: Business Intelligence, Web Services, Data Science, Data Monetization, Data Exploration, Data Visualisation
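The ETL-then-BI flow on this slide can be sketched on a very small scale as follows. This is only a minimal illustration using Python's standard library: the CSV content, the `sales` table and all function names are made up for the example, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file export; in practice this could come from an ERP or CRM.
RAW_CSV = """customer,country,amount
alice,RO,120.50
bob,ro,80.00
carol,DE,42.25
"""


def extract(text):
    """Extract: parse CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalise country codes and convert amounts to numbers."""
    return [
        (row["customer"], row["country"].upper(), float(row["amount"]))
        for row in rows
    ]


def load(records, conn):
    """Load: write the cleaned records into a warehouse-like table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


def revenue_by_country(conn):
    """BI-style query: total revenue per country."""
    query = "SELECT country, SUM(amount) FROM sales GROUP BY country ORDER BY country"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load(transform(extract(RAW_CSV)), connection)
    print(revenue_by_country(connection))
```

In a real Big Data setup each stage would run in parallel over many machines, but the division of responsibilities between extract, transform, load and the downstream query stays the same.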
Big Data – Statistics and Dashboards
Paradigm: accurate numbers don’t matter anymore
http://whatsthebigdataidea.com/2015/02/27/big-data-friday-funny/
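One possible reading of this paradigm is that, at scale, a quick estimate is often good enough for a dashboard. The sketch below estimates a share from a random sample instead of scanning everything. The synthetic page-view data, the function name and the sample size are assumptions made purely for illustration.

```python
import random


def estimate_fraction(records, predicate, sample_size, seed=0):
    """Estimate the share of records matching `predicate` from a random sample."""
    rng = random.Random(seed)  # fixed seed keeps the example repeatable
    sample = rng.sample(records, sample_size)
    return sum(1 for record in sample if predicate(record)) / sample_size


if __name__ == "__main__":
    # Synthetic data: 1,000,000 page views, 30% of them from mobile devices.
    views = ["mobile"] * 300_000 + ["desktop"] * 700_000
    exact = views.count("mobile") / len(views)
    approx = estimate_fraction(views, lambda v: v == "mobile", sample_size=10_000)
    print(f"exact={exact:.4f} approx={approx:.4f}")
```

The estimate touches only 1% of the records here, which is the kind of trade-off between precision and cost that dashboards over very large datasets tend to make.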
Big Data – Predictions & Recommendations
Data transaction/history -> Interaction -> Observation -> Trends -> Decisions
http://www.kdnuggets.com/2014/06/cartoon-big-data-world-cup-football.html
Big Data – How to migrate to cloud?
http://cloudtweaks.com/2012/01/the-lighter-side-of-the-cloud-bullseye/
Cloud provider landscape
http://blog.gravitant.com/2012/07/27/cloud-technology-spectrum/
Big Data – Team Roles
information architect
● actionable plan for the company’s data
● define and document key data elements
● master IN/OUT data
data engineers
● ETL developers
● normalized and denormalized data
● build adapters for data providers
data analysts
● interpret data
● keep a high level of data quality
● take care of the client’s data
● offer statistics about data
● thresholds for validation
performance engineers
● resolve performance issues
● JVM profiling
● serialization
● compression
● CPU vs RAM vs NIO
QA engineers
● Unix/Linux
● SQL
● pull/work with big datasets
DevOps
● automate everything
● faster provisioning
● continuous integration
● improve releases and deployments
data scientists
● create sophisticated analytics models to predict and optimise outputs
● ensure each model is updated frequently so it remains relevant for longer
● advanced statistics
visualization engineers
● Tableau, Logi, Spotfire, QlikView
Thank you!
Skype: r.moldovan
