Decision Support Systems
College of Computing and Informatics
Week 8
Chapter 6: Deep Learning and Cognitive Computing
Analytics, Data Science, & Artificial Intelligence:
Systems for Decision Support
This presentation is based mainly on this textbook.
o 6.2 - Introduction to Deep Learning.
o 6.3 - Basics of “Shallow” Neural Networks.
o 6.4 - Process of Developing Neural Network–Based Systems.
o 6.5 - Illuminating the Black Box of ANN.
o 6.6 - Deep Neural Networks.
o 6.7 - Computer Frameworks for Implementation of Deep Learning.
o 6.8 - Cognitive Computing.
Weekly Learning Outcomes
1. Learn what deep learning is and how it is changing the world of computing.
2. Know the placement of deep learning within the broad family of artificial intelligence (AI) learning methods.
3. Understand how traditional “shallow” artificial neural networks (ANN) work.
4. Become familiar with the development and learning processes of ANN.
5. Develop an understanding of the methods to shed light into the ANN black box.
6. Know the underlying concept and methods for deep neural networks.
7. Become familiar with different types of deep learning methods.
8. Understand how convolutional neural networks (CNN) work.
9. Learn how recurrent neural networks (RNN) and long short-term memory (LSTM) networks work.
10. Become familiar with the computer frameworks for implementing deep learning.
11. Know the foundational details about cognitive computing and learn how IBM Watson works and what types of applications it can be
used for.
Required Reading
• Chapter 6: “Deep Learning and Cognitive Computing” from “Analytics, Data
Science, & Artificial Intelligence: Systems for Decision Support”.
Recommended Reading
• Black, D. (2018, January 23). AI Definitions: Machine Learning vs. Deep
Learning vs. Cognitive Computing vs. Robotics vs. Strong AI. Datanami.
Recommended Video
• What is Cognitive AI? Cognitive Computing vs Artificial Intelligence (2020, Jan 15).
[Video]. YouTube. https://www.youtube.com/watch?v=Zsl7ttA9Kcg
6.2 Introduction to Deep Learning
• Introduction to Deep Learning
• Classic Machine-Learning vs Deep Learning
Introduction to Deep Learning
• Deep learning is among the latest trends in AI, and it comes with great expectations.
• The initial idea of deep learning goes back to the late 1980s.
• Goal: mimic the thought process of humans, using mathematical algorithms to learn from data
in much the same way that humans learn (a goal similar to that of the other machine-learning methods).
• It adds to the classic machine-learning methods the ability to automatically acquire the
features required to accomplish highly complex and unstructured tasks (e.g., image recognition),
which contributes to superior system performance.
• The recent emergence and popularity of deep learning can largely be attributed to very large
data sets and rapidly advancing computing infrastructures.
• Many deep learning applications promise to make our lives easier
(e.g., Google Home, Amazon’s Alexa, Google Translate).
Introduction to Deep Learning
• Deep learning is an extension of neural networks, built on the idea that it can
deal with more complicated tasks with a higher level of sophistication.
• Neural networks are extended by employing many layers of connected
neurons, along with much larger data sets, to automatically characterize
variables and solve problems.
• The initial idea of deep learning had to wait more than two decades until
advanced computational and technological infrastructure emerged,
because of:
1. Very high computational requirements.
2. The need for very large data sets.
Introduction to Deep Learning
Placement of Deep Learning within the Overarching AI-Based Learning Methods
• Deep learning is categorized as part of representation learning within the AI
family of learning methods.
• Representation learning focuses on having the system learn and discover features, in
addition to discovering the mapping from those features to the output/target.
Classic Machine-Learning vs Deep Learning
• In knowledge-based systems and classic machine-learning methods,
features (i.e., the representation) are created manually by data
scientists to achieve the desired output.
• Deep learning enables the computer to derive complex features from
simple concepts, features that would be very effort-intensive for
humans to discover manually.
6.3 Basics of “Shallow” Neural Networks
• Artificial Neural Networks (ANN)
• Elements of an Artificial Neural Network
• Common Transfer Functions in Neural Networks
Artificial Neural Networks
• The human brain has a set of billions of interconnected neurons that facilitate our thinking,
learning, and understanding of the world around us.
• Artificial neural networks emulate the way the human brain works.
• The basic processing unit is a neuron. Multiple neurons are grouped into layers and linked together.
ANN with a single neuron, a single input, and a single output
A Biological Neural Network: Two Interconnected Neurons
Processing Information in ANN
• The basic processing unit is a neuron (processing element, PE).
• A PE performs a set of predefined mathematical operations
on the numerical values coming from the input or from
other neurons’ outputs to create and push out its own output.
• A neuron can have more than one input p; each
individual input value has its own adjustable weight w.
• In a neural network, knowledge is stored in the weights
associated with the connections between neurons.
• Multiple neurons are grouped into layers and linked together.
Typical Neural Network with Three Layers and Eight Neurons
Elements of an Artificial Neural Network
• Processing element (PE)
• Network architecture
• Hidden layers
• Parallel processing
• Network information processing
• Inputs
• Outputs
• Connection weights
• Summation function
• Transfer function
Neural Network with One Hidden Layer
Elements of an Artificial Neural Network
Summation Function for a Single Neuron/PE (a), and Several Neurons/PEs (b)
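The summation function named in the caption above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's own formulation: the function names `weighted_sum` and `layer_sums` are made up for this example, and the optional `bias` term is an assumption that does not appear on the slides.

```python
def weighted_sum(inputs, weights, bias=0.0):
    """Summation for a single PE: sum of input * weight, plus an assumed bias."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights)) + bias


def layer_sums(inputs, weight_rows, biases=None):
    """Summation for several PEs: one row of weights per neuron."""
    if biases is None:
        biases = [0.0] * len(weight_rows)
    return [weighted_sum(inputs, row, b) for row, b in zip(weight_rows, biases)]


# Two inputs feeding one neuron, then the same inputs feeding two neurons.
single = weighted_sum([1.0, 2.0], [0.5, 0.25])
several = layer_sums([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])
```

Each row of `weight_rows` holds the adjustable weights of one neuron, which is where the network stores what it has learned.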
Elements of an Artificial Neural Network
• Various types of transfer functions are commonly used in the design of neural networks.
• Common transfer function types: linear function, sigmoid (log) function with output range [0, 1],
and hyperbolic tangent function with output range [-1, 1].
• Example of an ANN transfer function (sigmoid-type activation function)
Common Transfer Functions in Neural Networks
• The selection of proper transfer functions for a
network requires a broad knowledge of neural
networks (e.g., the characteristics of the data as well as
the specific purpose for which the network is created).
• There are some guidelines for choosing the
appropriate transfer function, especially for the
neurons located at the output layer of the network.
• E.g., if the output of a model is binary,
a sigmoid transfer function is advised at the
output layer so that it produces an output between 0
and 1.
Some of the most common transfer functions and their
corresponding operations
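The three transfer functions listed above (linear, sigmoid, hyperbolic tangent) can be written out as a short Python sketch. The function names are chosen for this example only, and the branch inside `sigmoid` is an implementation detail added here to avoid numeric overflow for large negative inputs.

```python
import math


def linear(y):
    """Linear transfer function: passes the summed value through unchanged."""
    return y


def sigmoid(y):
    """Sigmoid (logistic) transfer function: squashes y into the range 0 to 1."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    # Equivalent form that avoids overflow when y is a large negative number.
    e = math.exp(y)
    return e / (1.0 + e)


def tanh(y):
    """Hyperbolic tangent transfer function: squashes y into the range -1 to 1."""
    return math.tanh(y)
```

For a binary output, applying `sigmoid` at the output layer keeps the result between 0 and 1, which matches the guideline on this slide.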
6.4 Process of Developing Neural Network–Based Systems
• Development Process of an ANN Model
• Learning Process in ANN
• Backpropagation Learning for ANN
• Overfitting in ANN
Development Process of an ANN Model
• Developing neural network–based systems requires a step-by-step process.
Learning Process in ANN
• A supervised learning process.
• The learning process is inductive; that is,
connection weights are derived from existing cases.
• The usual process of learning involves three tasks:
1. Compute temporary outputs.
2. Compare outputs with desired targets.
3. Adjust the weights and repeat the process.
Supervised Learning Process of an ANN
Backpropagation Learning for ANN
• Backpropagation is the most popular supervised learning paradigm for ANN.
Backpropagation of Error for a Single Neuron
Backpropagation Learning for ANN
The learning algorithm procedure:
1. Initialize weights with random values and set other parameters.
2. Read in the input vector and the desired output.
3. Compute the actual output via the calculations, working
forward through the layers.
4. Compute the error.
5. Change the weights by working backward from the output
layer through the hidden layers.
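The five steps above can be traced in a small Python sketch for the simplest case, a single sigmoid neuron, so the backward pass is one weight update rather than a sweep through hidden layers. Everything specific here is an assumption made for illustration: the names `train_neuron` and `predict`, the squared-error measure, the learning rate, the number of epochs, the bias term, and the logical-OR training data.

```python
import math
import random


def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))


def predict(inputs, weights, bias):
    """Forward pass for one neuron: weighted sum followed by sigmoid."""
    return sigmoid(sum(x * w for x, w in zip(inputs, weights)) + bias)


def train_neuron(samples, epochs=2000, learning_rate=0.5, seed=0):
    rng = random.Random(seed)
    # Step 1: initialize weights with random values and set other parameters.
    weights = [rng.uniform(-0.5, 0.5) for _ in samples[0][0]]
    bias = rng.uniform(-0.5, 0.5)
    history = []
    for _ in range(epochs):
        total_error = 0.0
        # Step 2: read in the input vector and the desired output.
        for inputs, target in samples:
            # Step 3: compute the actual output, working forward.
            output = predict(inputs, weights, bias)
            # Step 4: compute the error (assumed here: half squared error).
            error = target - output
            total_error += 0.5 * error ** 2
            # Step 5: change the weights by propagating the error backward
            # through the sigmoid (its derivative is output * (1 - output)).
            delta = error * output * (1.0 - output)
            weights = [w + learning_rate * delta * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * delta
        history.append(total_error)
    return weights, bias, history


# Hypothetical training data: the logical OR of two binary inputs.
OR_SAMPLES = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
              ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
weights, bias, history = train_neuron(OR_SAMPLES)
```

In a network with hidden layers, step 5 would repeat the same idea layer by layer, passing each layer's error term back to the layer before it.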
Overfitting in ANN
• Occurs when neural networks
are trained for a large number
of iterations with relatively
small data sets.
• To prevent overfitting, the
training process is controlled by
an assessment process using a
separate validation data set.
Overfitting in ANN: Gradually Changing Error Rates in the
Training and Validation Data Sets As the Number of Iterations Increases
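One common way to use a separate validation data set to control training is early stopping: keep training while the validation error improves, and stop once it has failed to improve for a few iterations. The sketch below is an illustration under assumptions, not a method taken from the slides; the function name `best_stopping_point`, the `patience` parameter, and the sample error values are all invented for this example.

```python
def best_stopping_point(validation_errors, patience=3):
    """Return (iteration, error) of the best validation result seen before
    the error fails to improve for `patience` consecutive iterations."""
    if not validation_errors:
        raise ValueError("validation_errors must not be empty")
    best_iteration = 0
    best_error = float("inf")
    for iteration, error in enumerate(validation_errors):
        if error < best_error:
            best_error = error
            best_iteration = iteration
        elif iteration - best_iteration >= patience:
            # Validation error has stopped improving: likely overfitting.
            break
    return best_iteration, best_error


# Made-up validation errors: they fall at first, then rise as the network
# starts to memorize the training set.
errors = [0.90, 0.60, 0.40, 0.35, 0.37, 0.41, 0.48, 0.55]
stop_at, lowest = best_stopping_point(errors)  # stops at iteration 3
```

The weights saved at the returned iteration are the ones to keep, since later iterations improve only the training error.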
6.5 Illuminating the Black Box of ANN
• Sensitivity Analysis on ANN Models
Sensitivity Analysis on ANN Models
• ANNs are known as black-box models.
• But how does the model do what it does?
• ANNs lack explanation/transparency: the black-box syndrome!
• To shed light on the black box, sensitivity analysis is applied.
• Sensitivity analysis:
1. It is performed on a trained ANN.
2. The inputs to the network are perturbed systematically within their allowable
value ranges.
3. The corresponding change in the output is recorded for each and every
input variable.
4. The relative importance of the input variables is illustrated in the result.
Sensitivity Analysis on ANN Models
• Sensitivity analysis extracts the cause-and-effect relationships
among the inputs and the outputs of a trained neural network.
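The perturb-and-record procedure described in this section can be sketched as follows. This is one simple variant (changing one input at a time while holding the others at a baseline), and it makes several assumptions for illustration: the names `sensitivity_analysis` and `stand_in_model`, the `steps` parameter, and the use of a plain function in place of a trained ANN.

```python
def sensitivity_analysis(model, baseline, ranges, steps=11):
    """Perturb each input across its allowable range, one at a time, and
    return each input's share of the total observed output change."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    spreads = []
    for i, (low, high) in enumerate(ranges):
        outputs = []
        for k in range(steps):
            probe = list(baseline)
            probe[i] = low + (high - low) * k / (steps - 1)
            outputs.append(model(probe))
        # Record how much the output moved while only input i changed.
        spreads.append(max(outputs) - min(outputs))
    total = sum(spreads)
    if total == 0:
        return [0.0] * len(spreads)
    return [s / total for s in spreads]


def stand_in_model(x):
    """Stand-in for a trained network: the third input has no effect."""
    return 3.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]


importance = sensitivity_analysis(
    stand_in_model,
    baseline=[0.5, 0.5, 0.5],
    ranges=[(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)],
)
# importance is [0.75, 0.25, 0.0]: the first input matters most.
```

With a real trained network, `model` would be its prediction function, and the resulting shares could be plotted as a bar chart of relative input importance.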
6.6 Deep Neural Networks
• Deep Neural Networks
• Feedforward Multilayer Perceptron (MLP)
Deep Neural Networks
• Most neural network applications have traditionally involved network architectures with only a few hidden
layers and a limited number of neurons in each layer.
• Deep neural networks broke the generally accepted notion that “no more than two hidden
layers are needed to formulate complex prediction problems.”
• They promote increasing the number of hidden layers to arbitrarily large numbers to better represent
the complexity in the data set.
• Different types of deep networks involve various modifications to the architecture of
standard neural networks.
• These networks are typically equipped with distinct capabilities for dealing with particular data types for
advanced purposes (e.g., image or text processing).
Feedforward Multilayer Perceptron (MLP)
• MLP deep networks (a.k.a. deep feedforward networks) are the most general type of deep networks.
• An MLP consists of an input layer, an output layer, and a number of hidden layers.
• The nodes in one layer are connected to the nodes in the next layer.
• Each node at the input layer typically represents a single attribute that may affect the prediction.
• Information always flows forward and there are no feedback connections; hence it is called a
“feedforward network”.
More Hidden Layers versus More Neurons?
• It is still an open research question, but in practice using more layers in a network seems to be more effective and
computationally more efficient than using many neurons in a few layers.
Feedforward Multilayer Perceptron (MLP)
The First Three Layers in a Typical MLP Network
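The forward-only flow of information in an MLP can be illustrated with a short Python sketch in which each layer's outputs become the next layer's inputs. The names `dense_layer` and `mlp_forward`, the use of sigmoid in every layer, the bias terms, and the example weights are all assumptions made for this illustration.

```python
import math


def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))


def dense_layer(inputs, weight_rows, biases):
    """One fully connected layer: every node sees every input from the
    previous layer, sums them with its own weights, and applies sigmoid."""
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weight_rows, biases)
    ]


def mlp_forward(inputs, layers):
    """Feed the inputs forward through each (weight_rows, biases) layer in
    order. There are no feedback connections: data only moves forward."""
    activations = list(inputs)
    for weight_rows, biases in layers:
        activations = dense_layer(activations, weight_rows, biases)
    return activations


# Hypothetical network: 2 input attributes, 3 hidden nodes, 1 output node.
layers = [
    ([[0.2, -0.4], [0.7, 0.1], [-0.5, 0.3]], [0.0, 0.1, -0.1]),  # hidden layer
    ([[0.6, -0.3, 0.8]], [0.05]),                                # output layer
]
prediction = mlp_forward([1.0, 2.0], layers)
```

Adding more hidden layers only means appending more `(weight_rows, biases)` pairs to `layers`, as long as each layer's row length matches the number of nodes in the layer before it.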
6.7 Computer Frameworks for Implementation of Deep Learning
• Frameworks
• Example DL Applications
Deep learning implementation frameworks (open-source) include:
• Torch: a scientific computing framework for implementing machine-learning algorithms
using GPUs.
• Caffe: the deep learning libraries are written in the C++ programming language, and everything
is done using text files instead of code.
• TensorFlow: a popular deep learning framework, originally developed by the Google
Brain Group.
• Theano: one of the first deep learning frameworks.
• Keras: functions as a high-level application programming interface (API) and can run on
top of various deep learning frameworks, including Theano and TensorFlow.
Example DL Applications
Source: https://www.mygreatlearning.com/blog/what-is-deep-learning/
6.8 Cognitive Computing
• Conceptual Framework for Cognitive Computing
• How Does Cognitive Computing Work?
• Cognitive Computing and AI
• Typical use cases for cognitive computing
• Cognitive analytics and Search
• IBM Watson
Cognitive Computing
• Cognitive computing makes a new class of problems computable.
• It addresses highly complex situations that are characterized by ambiguity
and uncertainty.
• It handles the kinds of problems that are thought to be solvable by human
ingenuity and creativity.
• A cognitive computing system offers a synthesis not just of information sources but
also of influences, contexts, and insights that help users understand their problems.
Conceptual Framework for Cognitive Computing
To provide the best possible answers to a given question or
problem, cognitive computing:
• finds and synthesizes data from
various information sources,
• weighs the context and
conflicting evidence inherent in
the data,
• and suggests an answer that is
“best” rather than “right.”
A general framework for cognitive computing, in which data
and AI technologies are used to solve complex real-world problems
How Does Cognitive Computing Work?
• Cognitive computing works much like a human thought process, reasoning
mechanism, and cognitive system.
• It includes self-learning technologies that use data mining, pattern recognition, deep
learning, and NLP to mimic the way the human brain works.
• Cognitive systems may draw on multiple sources of vast amounts of information,
including structured and unstructured data and visual, auditory, or sensor data, to solve
the types of problems that humans are typically tasked with.
• Over time, cognitive systems are able to refine the way in which they learn and
recognize patterns and the way they process data, becoming capable of anticipating
new problems and of modeling and proposing possible solutions.
How Does Cognitive Computing Work?
The key attributes of cognitive computing capabilities:
• Adaptability: systems must be flexible enough to learn as information changes and goals evolve.
• Interactivity: users must be able to interact with cognitive machines and define
their needs as those needs change.
• Iterative and stateful: systems must be able to maintain information about similar situations
that have previously occurred.
• Contextual: systems must understand, identify, and mine contextual data, such as syntax,
time, location, domain, requirements, and a specific user’s profile, tasks, or goals.
Cognitive Computing and AI
Typical use cases for cognitive computing
• Development of smart and adaptive search engines.
• Effective use of natural language processing.
• Speech recognition.
• Language translation.
• Context-based sentiment analysis.
• Face recognition and facial emotion detection.
• Risk assessment and mitigation.
• Fraud detection and mitigation.
• Behavioral assessment and recommendations.
Cognitive analytics
• Cognitive analytics is a term that refers to cognitive computing–branded
technology platforms.
• E.g., IBM Watson specializes in the processing and analysis of large unstructured data sets.
• The benefit of cognitive analytics over traditional Big Data
analytics tools is that the data sets do not need to
be pretagged.
• Cognitive analytics systems can use machine learning to adapt to different
contexts with minimal human supervision.
• These systems can be equipped with a chatbot or search assistant that understands
queries, explains data insights, and interacts with humans in human languages.
Cognitive Search
• Searching for information is a tedious task.
• Cognitive search is a new generation of search methods that uses AI (e.g.,
advanced indexing, NLP, and machine learning) to return results that are
much more relevant to the user than those of traditional search methods.
• It creates searchable information out of non-searchable content by
leveraging cognitive computing algorithms to create an indexing platform.
• Cognitive search proposes the next generation of search tailored for use in enterprises.
Cognitive Search
According to Gualtieri (2017), cognitive search differs from
traditional search because it:
• Can handle a variety of data types.
• Can contextualize the search space.
• Employs advanced AI technologies.
• Enables developers to build enterprise-specific search applications.
The progressive evolution of search methods.
IBM Watson
• IBM Watson is perhaps the smartest computer system built to date. It coined and
popularized the term cognitive computing.
• It is an extraordinary computer system, a novel combination of advanced hardware
and software, designed to answer questions posed in natural human language.
• IBM Watson beat the best human players (the two most winning competitors) at the quiz game
Jeopardy!, showcasing the ability of computers to do tasks that are designed for human intelligence.
• Watson and systems like it are now in use in many application areas, including
healthcare, finance, security, retail, education, government, and research.
How Does Watson Do It?
• DeepQA, the system behind Watson, is a massively parallel, text mining–
focused, probabilistic evidence–based computational architecture.
• Goal: to let the techniques it combines bring their strengths to bear and contribute to
improvements in accuracy, confidence, and speed.
Principles in DeepQA
• Massive parallelism.
• Many experts.
• Pervasive confidence estimation.
• Integration of shallow and deep knowledge.
How Does Watson Do It?
A High-Level Depiction of DeepQA Architecture
Main Reference
• Chapter 6: “Deep Learning and Cognitive Computing” from “Analytics, Data
Science, & Artificial Intelligence: Systems for Decision Support”.
• Application Cases 6.1 to 6.8 from “Analytics, Data Science, & Artificial Intelligence:
Systems for Decision Support”.
Week self-review exercises
Thank You
