Keeping the campus community informed
Status Updates
Shawn Plummer
Laurie Fox
State University of New York
Geneseo
Located in the historic village of Geneseo in
the upstate Finger Lakes region, the State
University of New York at Geneseo is a
premier public liberal arts college with a rich
tradition of academic excellence. We are
dedicated to developing socially responsible
citizens with skills and values for a productive
life.
System
Monitoring
Look at all the pretty lights
Status updates
Posting Status
Messages
Status updates
Be Brief
But give enough detail
But not too much detail
Avoid being too alarming
Try to keep it simple
Let people know when
Communication Timeline
• Establish when you will
communicate again
• Stick to the schedule, even if
you have nothing to report
• If you don't, your customers will
wonder.
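The schedule rule above can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' tooling: the 30-minute interval, the function names, and the wording of the holding message are all assumptions.

```python
from datetime import datetime, timedelta


def next_update_due(last_posted: datetime, interval_minutes: int = 30) -> datetime:
    """Return the time by which the next status update was promised."""
    return last_posted + timedelta(minutes=interval_minutes)


def update_is_overdue(last_posted: datetime, now: datetime,
                      interval_minutes: int = 30) -> bool:
    """True once the promised update time has arrived or passed."""
    return now >= next_update_due(last_posted, interval_minutes)


def holding_message(now: datetime, interval_minutes: int = 30) -> str:
    """Message to post on schedule even when there is nothing new to report."""
    upcoming = next_update_due(now, interval_minutes)
    return ("No new information yet. We are still working on the issue. "
            f"Next update by {upcoming:%H:%M}.")
```

Posting the holding message on schedule keeps the promise made in the previous update, so nobody has to wonder whether the problem is still being worked on.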
Rules for posting a status
message
• Be specific about what is impacted
• Be specific about when things are impacted
• Put it in simple language
• Put the most important information first
• Respect your users and front line staff
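One way to apply these rules consistently is to build every post from the same few fields. The sketch below is hypothetical: the `StatusMessage` class, its field names, and the example wording are assumptions for illustration, not part of the status system described in the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StatusMessage:
    """Hypothetical container for one status post; field names are illustrative."""
    service: str                        # what is impacted
    impact: str                         # what users will notice, in simple language
    started: str                        # when the impact began
    next_update: str                    # when you will communicate again
    expected_end: Optional[str] = None  # leave unset if the duration is unknown

    def render(self) -> str:
        # Most important information first: what is affected and how.
        lines = [f"{self.service}: {self.impact}"]
        # Then when: the start time, plus an estimate only if one exists.
        when = f"Since {self.started}."
        if self.expected_end:
            when += f" Expected to be resolved by {self.expected_end}."
        lines.append(when)
        # Close with the next communication time.
        lines.append(f"Next update by {self.next_update}.")
        return "\n".join(lines)


msg = StatusMessage(
    service="Campus email",
    impact="Sending and receiving mail is delayed.",
    started="9:15 AM",
    next_update="10:00 AM",
)
print(msg.render())
```

Leaving `expected_end` unset avoids guessing at a duration, while the required `next_update` field makes it impossible to post without saying when you will communicate again.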
Informing the
Campus
Tips to Better Communicate
Tips to better communicate
• Automatically subscribe select users to the status system,
including all technical staff and student employees of the
department.
• Encourage all department chairs and secretaries to subscribe to
the status system to receive updates during outages.
• Search for mentions of the services that are impacted during the
emergency to respond to random mentions on social media.
• Engage customers via Twitter and email when there are problems
and link to status posts. This increases the awareness of your
status system and Twitter feed.
Status updates
This is not a drill
What if your electronic communication
systems are not working?
Communication
During an Outage
What do you do when the
unexpected happens?
Not all outages are equal
Initial Stages of
Outage
Initial communication
Scope
Impact
Duration
http://gunshowcomic.com/648
Roles
• People fixing the problem
• Incident Commander
• Incident Communication Liaison
http://www.fema.gov/national-incident-management-system
https://blog.heroku.com/archives/2014/5/9/incident-response-at-heroku
http://en.wikipedia.org/wiki/Incident_Command_System
Timeline
• Move to a shared chat room
• Establish the IC/ICL
• Post an Initial Status about the issue
• Determine Scope, Impact, & Duration if Possible
• Coordinate the Response
• Mitigate the Problem
• Manage On-Going Responses
• Post-incident Cleanup
• Post-incident Follow-up
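The timeline above reads naturally as an ordered checklist. As a rough sketch, a chat bot or script could track which steps are done and prompt for the next one; the `INCIDENT_STEPS` list and `next_step` helper are hypothetical names introduced here, and the step wording is lightly normalized from the slide.

```python
from typing import Iterable, Optional

# Steps from the timeline slide, in the order they are listed.
INCIDENT_STEPS = [
    "Move to a shared chat room",
    "Establish the IC/ICL",
    "Post an initial status about the issue",
    "Determine scope, impact, and duration if possible",
    "Coordinate the response",
    "Mitigate the problem",
    "Manage ongoing responses",
    "Post-incident cleanup",
    "Post-incident follow-up",
]


def next_step(completed: Iterable[str]) -> Optional[str]:
    """Return the first step not yet completed, or None when all are done."""
    done = set(completed)
    for step in INCIDENT_STEPS:
        if step not in done:
            return step
    return None
```

For example, once the chat room is open and the IC/ICL is named, `next_step` returns the reminder to post an initial status.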
Information Flow
• Get the burden of communication off the people fixing it
• Ticket in ticket system with all parties subscribed
• Importance of internal communication channel
(HipChat/Slack)
• Ideally your communication medium can also serve as
documentation medium
How does this all work for us?
Questions
Shawn Plummer
@splum6
Laurie Fox
@RubyVixen

Editor's Notes

  1. Talk about how we still have all those detailed service checks but they are of limited use to most customers.
  2. Custom Field Template, Get Custom Field Values, HipChat, iFrame, Subscribe2, WP to Twitter
  3. A communication timeline is extremely important so that your users know the problem is still being worked on. No one needs to wonder.
  4. Engage customers via Twitter and email when there are problems and link to status posts. This increases the awareness of your status system and Twitter feed. Encourage all department chairs and secretaries to subscribe to the status system to receive updates during outages. Search for mentions of the services that are impacted during the emergency to respond to random mentions on social media. Automatically subscribe select users to the status system, including all technical staff and student employees of the department. Have a plan for when your electronic communication systems are not working. Options could include phone trees, external email addresses, and sneakernet.
  5. This probably does not apply to a short outage that requires minimal fixing, but it can be useful for all outages.
  6. The IC and ICL could be the same person or separate people. Their job is to: track what is being done and stay on top of it; note things that may need to be undone or revisited in the post-mortem; handle getting more resources for a problem; communicate in simple, plain language the scope of the outage and when communication will occur next (it can also help to communicate some specific steps that are being taken so customers see progress is being made); monitor the next communication time and be ready to post an update; answer questions about the outage from the community; get feedback from the community about new developments and share it with the people working on the outage; and brief newcomers. By default the IC is the ICL and is also the person who first starts working on the outage. For small or short-lived outages this may not change. For outages that are not a quick fix, designate an IC. If you want someone to handle communication on behalf of the IC, then the IC can have an ICL. The key is that, for long outages, the IC/ICL is not the person fixing the problem.
  7. Coordinate response. In coordinating the response, the IC focuses on bringing in the right people to solve the problem and making sure that they have the information they need. The IC can use a HipChat bot to page in additional teams as needed (the page will route to the on-call person for that team), or page individuals directly. The IC may also create a shared Google Doc for the team to collect notes together in real time, or start a high-bandwidth video call for more quickly working through issues than is possible with text chat. Mitigate problem. Once the response team has some sense of the problem, it will try to mitigate customer-facing effects if possible. For example, we may put the Platform API in maintenance mode to reduce load on infrastructure systems, or boot additional instances in our fleet to temporarily compensate for capacity issues. A successful mitigation will reduce the impact of the incident on customer apps and actions, or at least prevent the customer-facing issues from getting worse.
  8. The method for the team to communicate can be as detailed and