???
Cloud Consultant
Helion OpenStack Product Management of Korea
Hewlett Packard Enterprise
Monasca ? ???
Cloud Monitoring
Agenda
- Monitoring as a Service (Monasca)
- Monasca Architecture
- Helion OpenStack ? Monasca
- Helion Monitoring Console ??
Monitoring Challenge
Monitoring ? ???
Region
Region A, Region B, Region C, Region X
Zone
Machine
Instance
Container? Scaling ? ??
: ???? ??, Multi-Region ?
Public/Private Cloud ? Scaling
? Cloud ?? ??? ??
: ??? VM ?? Container ?
? ?? Metric data ??
: Monitoring Data ? ??
? Dynamic ? ??
: ?? ??? Infra
MONASCA
MOnitoring At SCAle (Monitoring as a Service)
Monasca ? OpenStack ? Project ? ???, Cloud ???? ??? VM ?
Cloud ? ??? Monitoring ?? ?? Solution
Monasca ? ?? ?? ? ??
? Cloud ? Infra ? Monitoring ?? Metric ? ??? Alarm ??
? E-mail ??? Alarm ?? ? notification
? Java ? Python ?? ?? (???? Java ??)
? Apache Kafka ? ?? component
? REST API ? ???? Data ? ??
? HPE, TWC, Rackspace, Cisco, IBM ? ?? ??
? Kafka : Apache Kafka ? ?? Messaging System
? Storm : Apache Storm ? ?? real-time Computing System
MONASCA
Monasca ? ??
? Monasca API ? REST API ? ???? Data ? ????, ??
? ?? Component ? HA ? Scale Up/Out ? ???? Design ?
? Dedicated DB ? ????, Metric ? ??
o InfluxDB
o Vertica
? Infrastructure ? ??? ?? ???? ??, ??? ?? metric ? ???
?? ??? ??
? ??? Alarm ??? ??, real-time ?? metric ? alarm ? ??? Trouble
Shooting
? Notification ? ?? ?? (??, ?? ?)
? Apache License ?? Open Source ? ???? ??
MONASCA
Monasca ? Architecture
? Monasca ???? ????,
???? ?? ??? ???
????, Micro service
message bus ???
architecture
? REST API ? ????, ??
metric ??
? ?? Major component ??
Kafka ? ??
? Large Scale System ?
????? ?? HA ? Scale
? ??? Architecture
? Kafka : Apache Kafka ? ?? Messaging System
? Storm : Apache Storm ? ?? real-time Computing System
MONASCA
Monasca ? Architecture ? ?? ??
Micro-services Message Bus Based Architecture
? Load-Balancing ? ????, ???(Scalability)? ??? ??? ?? ??
? High-Availability ? ????, ???? ???? Data Loss ? ?? ??? ??
? ??? (Extensibility) ? ????, Component ? Service ? ?? ?? ? ? ??
- Helion ? ??, HP Operation Manager ? ????, Monasca ? ?? ??? ???
- Multi-Site ? Data Replication ? ???
Threshold Engine
? Real-Time ?? Memory ??? Streaming ??? ??
? Apache Storm Base
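The message-bus flow that the editor's notes describe (API publishes metrics to Kafka, the Persister and Threshold Engine consume them, and alarm state transitions go back onto the bus for the Persister and Notification Engine) can be sketched as a toy in-memory simulation. This is only an illustration under stated assumptions: the `Bus` class, the topic names, and the fixed threshold of 85 are hypothetical, delivery is synchronous, and no real Kafka or Storm is involved.

```python
from collections import defaultdict


class Bus:
    """Tiny in-memory stand-in for Kafka: named topics with subscribers."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver synchronously to every subscriber of the topic.
        for handler in self.subscribers[topic]:
            handler(message)


bus = Bus()
metrics_db = []        # what the persister stores for metrics
alarm_history = []     # what the persister stores for transitions
notifications = []     # what the notification engine would send
alarm_state = {"state": "UNDETERMINED"}
THRESHOLD = 85         # hypothetical: alarm when value > 85


def api_post_metric(metric):
    """The API only publishes; it does not evaluate or store."""
    bus.publish("metrics", metric)


def persist_metric(metric):
    metrics_db.append(metric)


def threshold_engine(metric):
    new_state = "ALARM" if metric["value"] > THRESHOLD else "OK"
    if new_state != alarm_state["state"]:
        old_state = alarm_state["state"]
        alarm_state["state"] = new_state
        # State transitions go back onto the bus as their own topic.
        bus.publish("alarm-state-transitions",
                    {"old": old_state, "new": new_state})


def persist_transition(transition):
    alarm_history.append(transition)


def notify(transition):
    notifications.append(f"{transition['old']} -> {transition['new']}")


bus.subscribe("metrics", persist_metric)
bus.subscribe("metrics", threshold_engine)
bus.subscribe("alarm-state-transitions", persist_transition)
bus.subscribe("alarm-state-transitions", notify)

for value in (50.0, 95.0, 96.0):
    api_post_metric({"name": "cpu.user_perc", "value": value})
```

Because each consumer only knows the topic it reads from, a new consumer can be added by subscribing to a topic, without touching the API or the other components.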
MONASCA
API Resources
1. /v2.0/versions
2. /v2.0/metrics
3. /v2.0/metrics/measurements
4. /v2.0/metrics/statistics
5. /v2.0/metrics/names
6. /v2.0/alarm-definitions
7. /v2.0/alarms
8. /v2.0/alarms/state-history
9. /v2.0/notification-methods
https://github.com/openstack/monasca-api/blob/master/docs/monasca-api-spec.md
MONASCA
Metrics
Monasca ? ?? ?? ?? ??
- GET, POST /v2.0/metrics
- GET /v2.0/metrics/measurements
- GET /v2.0/metrics/statistics (avg, min, max, sum, count)
- GET /v2.0/metrics/names
Metric ? ???? ?? ??? ??? ???(Dimension) ?? ?? ???.
???(Dimensions) ? ??? Dictionary ?, metric ? ??? ??, Key ? Value ?
??? ????.
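To make the statistics functions listed above (avg, min, max, sum, count) concrete, here is a minimal sketch of what they compute over measurements grouped into fixed periods. It runs locally on sample data and does not call Monasca; the function name `summarize` and the sample values are illustrative assumptions.

```python
from collections import defaultdict


def summarize(measurements, period):
    """Group (timestamp_seconds, value) pairs into fixed-size periods and
    compute avg, min, max, sum and count for each period."""
    buckets = defaultdict(list)
    for ts, value in measurements:
        # Align each timestamp to the start of its period.
        buckets[ts - ts % period].append(value)
    return {
        start: {
            "avg": sum(vals) / len(vals),
            "min": min(vals),
            "max": max(vals),
            "sum": sum(vals),
            "count": len(vals),
        }
        for start, vals in sorted(buckets.items())
    }


# Hypothetical cpu.user_perc samples taken every 30 seconds.
samples = [(0, 10.0), (30, 20.0), (60, 40.0), (90, 60.0)]
stats = summarize(samples, period=60)
```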
MONASCA
Metrics Example
POST /v2.0/metrics
{
  "name": "http_status",
  "dimensions": {
    "hostname": "hlm001-cp1-c1-m2-mgmt",
    "cluster": "c1",
    "control_plane": "ccp",
    "service": "compute"
  },
  "timestamp": 0,
  "value": 0.0,
  "value_meta": {
    "status_code": 500,
    "msg": "Internal server error"
  }
}
The timestamp is in milliseconds.
- Simple, concise, multi-dimensional flexible description
- Name (string)
- Dimensions: dictionary of user-defined (key, value) pairs that are used to uniquely identify a metric
- Value meta: optional dictionary of user-defined (key, value) pairs that can be used to describe a measurement; normally used for errors and messages
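The metric body above can be assembled programmatically. The following is a minimal sketch that only builds and serializes the body, without sending it; the helper names `build_metric` and `metric_identity` are hypothetical. It also shows the point made in the list: a metric is identified by its name together with its dimensions, regardless of key order, value or timestamp.

```python
import json
import time


def build_metric(name, value, dimensions, value_meta=None, timestamp_ms=None):
    """Return a dict shaped like the POST /v2.0/metrics body shown above."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)  # milliseconds
    metric = {
        "name": name,
        "dimensions": dict(dimensions),
        "timestamp": timestamp_ms,
        "value": float(value),
    }
    if value_meta:
        # Optional description of this one measurement (errors, messages).
        metric["value_meta"] = dict(value_meta)
    return metric


def metric_identity(metric):
    """Name plus the sorted dimension pairs identify a metric."""
    return metric["name"], tuple(sorted(metric["dimensions"].items()))


metric = build_metric(
    "http_status",
    0.0,
    {"hostname": "hlm001-cp1-c1-m2-mgmt", "cluster": "c1",
     "control_plane": "ccp", "service": "compute"},
    value_meta={"status_code": 500, "msg": "Internal server error"},
    timestamp_ms=0,
)
body = json.dumps(metric)  # request body for POST /v2.0/metrics
```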
MONASCA
Alarm Definition
- ??? Template ??, metric name ? dimension ?
match ?? Alarm ? ??
- ??? Alarm ??? ??? Alarm ??? ???
GET, POST /v2.0/alarm-definitions
GET, PUT, PATCH, DELETE /v2.0/alarm-definitions/{alarm-definition-id}
? ?? Alarm ?? ??? ?? ??? ??? ??:
avg(cpu.user_perc{}) > 85
or avg(memory.system_perc{}) > 45
or avg(disk.read_ops{device=vda}, 120) > 100
? Alarm ?? ?? (OK, ALARM and UNDETERMINED)
? Actions associated with alarms for state transitions
? Severity (LOW, MEDIUM, HIGH, CRITICAL) ??
? Thresholds ? ???? ?? ??
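As a rough illustration of how an expression such as `avg(cpu.user_perc{}) > 85` maps onto the three alarm states, the sketch below evaluates one function over a window of values. It is not the Threshold Engine's implementation; the function name `evaluate` is hypothetical, and treating an empty window as UNDETERMINED is an assumption made for this example.

```python
import operator

# Statistic functions named in the expressions above.
FUNCTIONS = {
    "avg": lambda values: sum(values) / len(values),
    "min": min,
    "max": max,
    "sum": sum,
    "count": len,
}

OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def evaluate(function, values, op, threshold):
    """Return OK, ALARM or UNDETERMINED for one sub-expression."""
    if not values:
        # Assumption: no measurements in the window means the state
        # cannot be decided.
        return "UNDETERMINED"
    result = FUNCTIONS[function](values)
    return "ALARM" if OPERATORS[op](result, threshold) else "OK"


busy = evaluate("avg", [90.0, 95.0], ">", 85)
idle = evaluate("avg", [10.0, 20.0], ">", 85)
silent = evaluate("avg", [], ">", 85)
```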
Example:
POST /v2.0/alarm-definitions
{
  "name": "CPU percent greater than 85",
  "description": "The average CPU percent is greater than 85",
  "expression": "(avg(cpu.user_perc{region=uswest}) > 85)",
  "match_by": [
    "hostname"
  ],
  "severity": "LOW",
  "ok_actions": [
    "c60ec47e-5038-4bf1-9f95-4046c6e9a759"
  ],
  "alarm_actions": [
    "c60ec47e-5038-4bf1-9f95-4046c6e9a759"
  ],
  "undetermined_actions": [
    "c60ec47e-5038-4bf1-9f95-4046c6e9a759"
  ]
}
MONASCA
Alarms
- Alarm ? Threshold Engine ? ??, ?? ???
???? metric ? ?? ? ? ????
GET /v2.0/alarms
GET, PUT, PATCH, DELETE /v2.0/alarms/{alarm-id}
Query Parameters:
- alarm_definition_id (string, optional) - Alarm definition ID to filter by.
- metric_name (string(255), optional) - Name of metric to filter by.
- metric_dimensions ({string(255): string(255)}, optional) - Dimensions of metrics to filter by, specified as a comma-separated array of (key, value) pairs as `key1:value1,key2:value2, ...`
- state (string, optional) - State of alarm to filter by, either `OK`, `ALARM` or `UNDETERMINED`.
- state_updated_start_time (string, optional) - The start time in ISO 8601 combined date and time format in UTC.
Example:
List alarms
GET /v2.0/alarms?metric_name=cpu.user_perc&metric_dimensions=hostname:devstack&state=ALARM
List alarm
GET /v2.0/alarms/{alarm-id}
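The list query above can be built from its parts with the standard library. This sketch only constructs the URL and does not send a request; the helper name `alarms_url` and the base URL `http://localhost:8070` are assumptions for illustration.

```python
from urllib.parse import urlencode

VALID_STATES = ("OK", "ALARM", "UNDETERMINED")


def alarms_url(base_url, metric_name=None, metric_dimensions=None, state=None):
    """Build a GET /v2.0/alarms URL from the optional filters."""
    params = {}
    if metric_name:
        params["metric_name"] = metric_name
    if metric_dimensions:
        # Dimensions are passed as key1:value1,key2:value2
        params["metric_dimensions"] = ",".join(
            f"{key}:{value}" for key, value in metric_dimensions.items()
        )
    if state:
        if state not in VALID_STATES:
            raise ValueError(f"state must be one of {VALID_STATES}")
        params["state"] = state
    # Keep ':' and ',' readable instead of percent-encoding them.
    query = urlencode(params, safe=":,")
    return f"{base_url}/v2.0/alarms" + (f"?{query}" if query else "")


url = alarms_url(
    "http://localhost:8070",  # hypothetical API endpoint
    metric_name="cpu.user_perc",
    metric_dimensions={"hostname": "devstack"},
    state="ALARM",
)
```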
MONASCA
Alarm History
- OK, ALARM, UNDETERMINED ? ?? ?? ??? ???? ???
GET /v2.0/alarms/state-history
GET /v2.0/alarms/{alarm-id}/state-history
MONASCA
Notification
- Notification ?? (Email, Pager Duty, WebHook) ? ?? List
- ??? Alarm ??? ??, ?? Notification ?? ?? ??
Examples:
POST /v2.0/notification-methods
{
  "name": "Name of notification method",
  "type": "EMAIL",
  "address": "sang-wook.byun@hpe.com"
}
POST /v2.0/notification-methods
{
  "name": "Name of notification method",
  "type": "WEBHOOK",
  "address": "http://example.com/XXX"
}
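A notification method and an alarm definition are tied together by ID: the ID returned when the method is created is what goes into `ok_actions`, `alarm_actions` and `undetermined_actions`. The sketch below builds both bodies without sending them. The helper names and the e-mail address are hypothetical, and the `PAGERDUTY` type string is an assumption based on the Pager Duty method mentioned on this slide.

```python
import json

# EMAIL and WEBHOOK appear in the examples above; PAGERDUTY is assumed.
VALID_TYPES = {"EMAIL", "WEBHOOK", "PAGERDUTY"}


def notification_method(name, type_, address):
    """Return a body for POST /v2.0/notification-methods."""
    type_ = type_.upper()
    if type_ not in VALID_TYPES:
        raise ValueError(f"unsupported notification type: {type_}")
    return {"name": name, "type": type_, "address": address}


def attach_actions(alarm_definition, method_id):
    """Return a copy of an alarm definition that notifies the given
    method on every state transition."""
    updated = dict(alarm_definition)
    for key in ("ok_actions", "alarm_actions", "undetermined_actions"):
        updated[key] = [method_id]
    return updated


email = notification_method("ops mail", "email", "ops@example.com")
definition = attach_actions(
    {"name": "CPU percent greater than 85",
     "expression": "(avg(cpu.user_perc{region=uswest}) > 85)"},
    "c60ec47e-5038-4bf1-9f95-4046c6e9a759",  # ID taken from the slide example
)
payload = json.dumps(definition)
```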
MONASCA
Agent
? Python ?? ??
? Monitor ? ?? System ? ??
??, ??
? System metric ? ? Service
metric ?? ??
? System up/down, http status ??
?? Active ?? ??
? Default 30sec ??? ??
? Plug-in Architecture ? ?? ??,
??? ??? ?? ??? ??
? Monasca ???? component ? node ??? ?? ?? ??
? Monasca API ? Keystone ??? ?? ??? ??? ??, in-memory ??? ?? token ?
??
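The in-memory token caching mentioned above (the API keeps validated Keystone tokens in memory to avoid a round trip on every request) can be sketched as follows. This is an illustration only: the `TokenCache` class, the five-minute lifetime, and the `fake_keystone` validator are hypothetical stand-ins, not Monasca or Keystone code.

```python
import time


class TokenCache:
    """Cache validated tokens in memory for a limited time."""

    def __init__(self, validate, ttl_seconds=300, clock=time.monotonic):
        self.validate = validate   # callable(token) -> tenant or None
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}          # token -> (tenant, expires_at)

    def lookup(self, token):
        now = self.clock()
        cached = self.entries.get(token)
        if cached and cached[1] > now:
            return cached[0]       # cache hit: no validation round trip
        tenant = self.validate(token)
        if tenant is not None:
            self.entries[token] = (tenant, now + self.ttl)
        return tenant


keystone_calls = []


def fake_keystone(token):
    """Stand-in for a Keystone validation request."""
    keystone_calls.append(token)
    return "tenant-a" if token == "good-token" else None


cache = TokenCache(fake_keystone)
first = cache.lookup("good-token")    # goes to the validator
second = cache.lookup("good-token")   # served from memory
```

Invalid tokens are deliberately not cached in this sketch, so a rejected token is re-checked on every request.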
Helion OpenStack 2.1 Architecture
[Architecture diagram. Legend: Open Source (OpenStack Kilo); HPE Value-add (Open Source); HPE Value-add (HPE Assets); Plug-ins.]
- Areas shown: UI, Execution Environment, Operations Environment, Infrastructure Services, Operational Services, Network Services, Sub Systems, Physical Infrastructure (servers, networking, storage)
- Dashboards: OpenStack Dashboard (Horizon), Operations Dashboard (OpsConsole), Logging Search Dashboard (Kibana), Storage Dashboard (StoreVirtual CMC), Sherpa
- OpenStack services: Identity Service (Keystone), Compute Service (Nova), Network Service (Neutron), Block Storage Service (Cinder), Object Storage Service (Swift), Image Library Service (Glance), Orchestration Service (Heat), Metering Service (Ceilometer), DNS Service (DNSaaS), LB Service (LBaaS), VPN Service (VPNaaS), FW Service (FWaaS)
- Operational services: Deployment Service (Ansible) with deployment artifacts, boot images, service playbooks and deployment templates; Bare Metal Provisioning Service (Cobbler); Configuration Processor; Nova ESX Configuration (EON); Day Zero Installer; Recovery Management (Freezer, backup/restore scripts); Service Fail-over Management (HAProxy, Keepalived); Centralized Logging (Logstash, ElasticSearch); Monitoring Service (Monasca); HTTPS Termination (Stunnel); Federation
- Data stores and messaging: MySQL, RabbitMQ, Vertica, InfluxDB
- Plug-ins and technologies: KVM, ESX, OVSvApp, FC, iSCSI, LHN, 3PAR, VMDK, VSA, Ceph, Swift, ML2, DVR, VXLAN, VLAN, IPMI, PXE, UEFI, Local LDAP/AD
- Operating system: Linux for HPE Helion (Debian)
Helion OpenStack Cloud Monitoring Benefit
?? ?? ?? (Benefit)
Operations
Excellence
???? Deploy ??, Configuration ? ?? ??, ??? ?? ?? ?? ???,
Turnkey system level monitoring
?? ?? ?? ??
Organic
OpenStack
Monitoring
Helion ? ?? OpenStack ????? ?? ???. ??? Plugin ? ???
agent ?? ??? network ???? ???? ??.
?? ?? ??
Simplifies start up experience
Hybrid
Interoperability
API/CLI/Msg Bus ? HP ?? SW ??? ???? ????, 3rd Party tool ??
HP ?? SW ??? ???
?? ?? ?? ??
?? ?? ?? ??
Prescribed
Resolutions
???? ??? ?? ??? ?? community ? ??? Know-How ??? ??
?? ?? ?? ??
Configurable
alarms
?? Host ? ?? Alarm Level ?? ?? ??? ??? Cluster ?? ??, ?
?? ??? ?? Alarm Level ? ??
Easy tuning
High Definition
Metrics
?? Alarm ?? ??? metric ? ????, ??? ??? ??? ??? ??? ?? ?? ??
Less down time
Performance
Tuning
?? ??? ????, ????? data ? system ? load level ? ????,
???? ??? ????, ??? level ? ??? ??
Scalability and system performance
Faster problem resolution Optimized resources Higher staff productivity
Helion OpenStack Cloud Monitoring
Helion ??? Monasca Coverage
[Coverage matrix. Legend: Fully supported, Partially supported, Not Applicable.]
- Checks: Service up?, API up?, Host up?, Perf, Resource Utilization
- Services listed (Helion OpenStack Core Services and Helion OpenStack Shared Services): Nova, Neutron, Cinder, Neutron L3 agent, Glance, Swift, Ceilometer, Horizon, Heat, Keystone, Ops Console, Logging, Monasca, BURA, OVS, hLinux, MySQL, Rabbit, Apache, LogStash, Beaver, Elastic, Kafka, HAProxy, Storm
- Scope labels: Control Plane; Cloud IaaS (Compute, Network, Storage); Cloud PaaS; Application
Monasca
Helion ??? Monitoring Factor
- System (cpu, memory, network, file system, ...)
- Service (MySQL, Kafka, nova, cinder, ...)
- Application
  - Built-in Statsd daemon
  - Python monasca-statsd library: adds support for dimensions
- VM system metrics
- Active checks
  - HTTP status checks and response times
  - System up/down check (ping and ssh)
- Runs any Nagios plugin or check_mk
- Extensible/Pluggable: additional services can be easily added
- Host alive check on all systems using ping check
- HTTP Status and response time on all OpenStack
service endpoints
- Process checks on all relevant processes
- System Metrics: CPU, disk, IO, load, memory, process,
network, NTP
- Services:
Elasticsearch, HAProxy, JVM, Kafka, MySQL, RabbitMQ,
Zookeeper
- OpenStack Services
Swift and Monasca specific metrics
- VM Metrics
CPU, IO, Memory, Network and Host Alive
See http://monasca-agent.readthedocs.org/en/latest/Plugins/
Helion OpenStack Operation Console ?? ??
Administrator ? ?? ??? Portal
? Cloud ??? ?? Center
Dashboard
? Cloud ? ???? ??? ??
? Severity, service, status ? ??
?? alarm ??
? Log ?? ? ??
? Key metric ? ????, Graph
??
? Real-time ?? alarm ?
????, ??
? Notification ?? ?? : e-mail,
pager duty, web hooks
Helion OpenStack Operation Console ?? ??
Dashboard
? Dashboard ?? ??? ??
Alarm ? ??? Monitoring
? ? ??
? Monitoring ?? ????
Click ??, ??? Alarm ?
?? ?? ?? ???, ??
??? Alarm ? ???
Dimension ? ?? ??
? ? ??, Alarm ? Click ??
Detail ? ??? Alarm ?
History ? ?? ??,
Comment ? Update ? ??
Helion OpenStack Operation Console ?? ??
Alarm Creation
1 Menu ? Alarm Creation ??
2 Create Alarm Definition ? ??2
3 Parameter ??
Ex.
Name : Test
Description : Test
Severity : Low
Function : AVG
Metric : CPU.SYSTEM_PERC
Dimension(s): hostname=hellion-c1
Relational Operator: > (Greater Than)
Value : 75
Helion OpenStack Operation Console ?? ??
Alarm Creation ?? ??
? ??? ??? ???
???, Click ?, Test ??
??? ?????
?????? ????
??? ? ??
? Alarm ? ?????,
??? Test ??? ??
??
Helion OpenStack Operation Console ?? ??
Alarm Explorer
? ?? ???? Application ? ??
alarm display
? State, Alarm, Condition ???
sorting ? ??
? Search Bar ? ???, key word
???? filtering ??, ???
alarm ? ?? ??
? 1? ?? alarm ? check box ? ??
??? ??, 2?? SET CONDITION
??? ???, ?? ???
??,Open, Resolved, Acknowledged
? condition ??? ??
? ? ??? condition ? ?? sorting
??
Helion OpenStack Operation Console ?? ??
Time Series Graph
? ??? Data ? Chart ??? ??
? 1? ???? Time Series Graph
?? ? Create Chart Click
? 2? Chart ?? ???, Chart ??
(Bar, Line,..) ? Chart ? Size ?
Update Rate ? ??
? ??? metric ? ??
??, ????
dimension ? ?? ??
? Data ? Chart ? Add
??, Create Chart ?
????, ?? Chart ?
3? ?? ???
Helion OpenStack Operation Console ?? ??
Logging
? Helion OpenStack ? Central
Log ??? ?? ELK (Elastic
Search, LogStash, Kibana) ?
integrate ?? ??, Operation
Console ? ??, Log ??,
Visual ? Graph ??? ??
??? ???? ??
Monasca
???
- Monasca ? Cloud ? ?? project ? ?? ?? ??? ?? ???.
- VM instance ?? Service ? ?? Monitoring ? need ? ??, ?? ???
??? ??? ?????.
? Monasca ? Devstack ? plugin ?? Monasca ? ? ?? ????.
? Monasca ? Docker container ? ????, Monasca Demo ? ?? ? ????.
https://github.com/openstack/monasca-api/tree/master/devstack
https://hub.docker.com/r/monasca/demo/
Q & A
?????


Monasca ? ??? cloud ???? final


Editor's Notes

  2. Monasca Session ??? ??? ?? ??? ?? ?????. Monasca ? ???, ? Architecture, ??? Monasca ? ???? ?? Helion OpenStack ? Architecture ?, Helion OpenStack ??? Monitoring ?? ??? monitoring console ? ?? ?? ??? ?????.
  3. Monitoring solutions have been around for decades, but in many respects they fail to address the requirements of monitoring large-scale public and private clouds. Traditionally, performance, scalability and data retention have been limited to hundreds of systems. In a large-scale cloud service, thousands of physical servers and hundreds of thousands of virtual machines (VMs) and containers need to be monitored, resulting in hundreds of terabytes of monitoring data. The original monitoring source data needs to be stored in an online, queryable, lossless form with data retention periods greater than thirteen months. Such long retention periods are necessary for SLAs, business continuity, and analytics. Inventory elasticity is important because cloud infrastructure is constantly evolving, with VMs and services continually being created and destroyed. Monitoring systems must be dynamic enough to understand the difference between a VM being purposely destroyed and a VM that is in a failed state. Self-service models that let teams easily add new resources and monitor them independently of the monitoring team's involvement are necessary. Most solutions assume a static infrastructure that requires new services to be registered with the server before they are monitored, which makes the monitoring team/server the bottleneck. Extensibility is critical, but is often limited. Run-time configurability is necessary to tune the system over time by allowing alarms to be dynamically adjusted, which many systems do not support. Generalization of alarm definitions/templates is necessary to describe and manage alarms in a one-to-many relationship, to avoid manually declaring each alarm even though they may share many common attributes and differ in only one, such as hostname. Spammy alerts and alert fatigue are a common shortcoming of every thresholding system. Many operations teams receive thousands of alerts on a weekly basis. Improvements in run-time configurability and generalized alarm definitions can help to address spammy alerts. Anomaly detection based on non-parametric statistics and machine learning is required as a more fundamental change.
  4. Monasca is a highly performant, scalable, fault-tolerant and extensible micro-services, message-bus-based architecture. It uses a REST API for high-speed metrics processing and querying and has a streaming alarm engine and notification engine. All of the major components are linked using Kafka. Every component in the system is built with High Availability (HA) in mind and can be scaled either horizontally or vertically to allow for monitoring of very large systems. The Monasca API is the gateway for all interaction with Monasca. In a typical scenario, metrics are collected by the Monasca Agent running on a system and sent to the Monasca API. The API then publishes the metrics to the Kafka queue. From here the Monasca Persister (metric ? Alarm ??? Kafka ?? Read ??, Metric DB ? ????? ??) consumes metrics and writes them to our Metrics database. The Monasca Threshold Engine also consumes the metrics and uses them to evaluate alarms. At this point the metrics are in our system and can be queried using the Monasca API, either directly or through one of our other components, such as the Horizon plugin or the Monasca CLI. When the Threshold Engine evaluates the metrics against the alarms it can create alarm state transition events. These are published back to Kafka and are read by both the Persister and the Notification Engine. The Persister writes the alarm transitions to the DB for future retrieval. The Notification Engine sends a notification of the configured type for appropriate state transitions. In addition to the components discussed above, we also have a configuration database used for storing information such as alarm definitions and notification methods. This database can be either MySQL or PostgreSQL.
  5. Advantages of message bus architecture:
     - Enables a micro-services foundation
     - Load-balancing, scalability, system maintenance (new deploys)
     - Handles different loads
     - Extensibility: easily add new components/services, for example the HP Operations Manager i (OMi) BSM Connector for HP Helion Monasca, which consumes alarm state transition messages from Kafka
     - Multi-site replication of data
     And there is more:
     - Pagination (? ?? ?? ??? ??) is supported via offset and limit query parameters.
     - The Agent Forwarder buffers metrics for a short time to increase the size of the HTTP request body (number of metrics) sent to the Monasca API.
     - The Monasca API caches auth tokens in memory to reduce round-trip authorization requests to Keystone.
     - If network connectivity between the Agent and the API is lost, the Agent buffers metrics and sends them when connectivity is restored.
     - Metrics are submitted using an "agent" role, which only allows metrics to be POSTed to the metrics endpoint.
     - Multi-site replication for metrics can be done by running two persisters simultaneously, sending to different metrics databases.
     - The system can handle failure of any component or node.
  6. Monasca-statsd daemon : statsd engine capable of handling dimensions associated with metrics submitted by a client that supports them. Also supports metrics from the standard statsd client. (udp/8125)
  7. The Helion platform provides a turnkey monitoring system that is ready to use immediately after cloud installation. This saves operators time and money by eliminating the need for a separate monitoring infrastructure and for managing complex network configurations. All aspects of the monitoring system are certified with HP Linux for Helion and Helion OpenStack, and they are supported by HP. This saves operators setup time and lowers costs because operators do not need to stand up separate infrastructure or certify additional plug-ins with the Helion environment. HP Helion OpenStack monitoring ships with many integration points and can easily snap into existing data center management tooling and infrastructure. It ships with supported connectors for HPSW OMi and technical preview connectors for Ops A, Splunk, and ArcSight. The Helion OpenStack 2.0 documentation contains documented triage and resolution steps for common issues. This knowledge is based on years of OpenStack software operations experience from operating HP's public cloud services. Reduces time to production. Simplifies start-up experience.
  8. We are monitoring all of the OpenStack core service availability and performance metrics. We are collecting log events from all OpenStack core services and most of the shared services. For a complete listing of alarms and monitored services please see the Helion documentation Monasca/alarms.
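The agent-side buffering described in the notes above (the forwarder batches metrics to enlarge each request, and holds them while the API is unreachable) can be sketched as follows. This is a simplified illustration, not the Monasca Agent's code: the `Forwarder` class, the batch size, and the `fake_send` function that simulates the API are all hypothetical.

```python
class Forwarder:
    """Buffer metrics and send them in batches; keep them on failure."""

    def __init__(self, send, batch_size=3):
        self.send = send              # callable(list_of_metrics) -> bool
        self.batch_size = batch_size
        self.buffer = []

    def submit(self, metric):
        self.buffer.append(metric)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Only drop buffered metrics once the API accepted them.
        if self.buffer and self.send(list(self.buffer)):
            self.buffer.clear()


sent_batches = []
api = {"online": False}


def fake_send(batch):
    """Stand-in for POSTing a batch to the metrics endpoint."""
    if not api["online"]:
        return False
    sent_batches.append(batch)
    return True


forwarder = Forwarder(fake_send, batch_size=2)
for i in range(3):
    # The API is unreachable, so these stay in the buffer.
    forwarder.submit({"name": "cpu.user_perc", "value": float(i)})

buffered_while_offline = len(forwarder.buffer)
api["online"] = True
forwarder.flush()  # connectivity restored: everything goes out at once
```

A real forwarder would also cap the buffer size and flush on a timer; both are left out here to keep the sketch short.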