Kaya Ota
1/26/2015-3/14/2015
Computer Science Department
San Jose State University
*
*
*This is made as preparation for the midterm
for CS156 Spring 2015 at SJSU
*May contain misunderstandings!
*
*What is AI?
*PEAS
*Problem-Solving Agent
*Informed and Uninformed Search
* Informed Search
*Bidirectional
*A* search
*Uninformed Search
*DFS with Problem-Solving Agent
*BFS with Problem-Solving Agent
*Uniform Cost Search
*
*Beyond Advanced Search
*Hill-Climbing
*(deterministic) Hill Climbing
*Stochastic Hill Climbing
*Random Hill Climbing
* Local Search
*Simulated Annealing
*Local Beam Search
*Genetic Algorithm
*Game and Adversarial Search
*Minimax Algorithm
*Constraint Satisfaction
*CSP
*
*AI is about solving existing problems without human hands!
*Acting like a human
*Natural language processing : to communicate in human
languages
*Knowledge representation : to store what it hears
*Automated reasoning : to draw conclusions based on stored info.
*Machine learning : to adapt to new circumstances and uncertainty
*Thinking like a human
*Brings together cognitive science, which includes linguistics,
neuroscience, AI, philosophy, anthropology and psychology
*
*Acting Rationally
*Algorithms that allow an agent to act so as to achieve the best
outcome, or best expected outcome, possible for the agent. That is a
rational agent.
*Thinking Rationally
*Might involve algorithms for reasoning about things expressed in
formal logic
Definition of a Rational Agent
For each possible percept sequence, a rational agent selects an action
that is expected to maximize its performance measure, given the
evidence provided by the percept sequence and whatever built-in
knowledge it has.
*PEAS
*PEAS stands for Performance, Environment, Actuator, Sensors
*Performance : how well, and based on which perspective.
*How do we measure the quality of an agent? How well does the agent find a
solution?
*Environment : the conditions and limitations of the situation the agent is
in; based on the description of that situation, we write the code.
*Actuator : something that makes an agent act
*Sensors : something the agent needs to make a decision
*More general (abstract) ideas than the task environment
Defining these 4 factors is the first
step to making AI!
*
[Diagram: the agent architecture, with sensors to detect something and actuators to execute an action.]
*
*Defining a task environment means defining the PEAS of the environment
*Task Environment Properties
*Fully Observable vs Partially Observable
* Fully = {chess, 8-puzzle, …} Partial = {taxi, soccer, rescue, …}
*Single Agent vs Multi-Agent
* Single = {rescue, 8-puzzle, …} Multi = {tic-tac-toe, soccer, chess}
*Deterministic vs Stochastic
* Deterministic = { } Stochastic = { }
*Episodic vs Sequential
*Static vs Dynamic
* Static = {rescue, chess, 8-puzzle, …} Dynamic = {soccer, taxi, …}
*Discrete vs Continuous
* Discrete = {8-puzzle} Continuous = {soccer}
*Known vs Unknown
* Known = { } Unknown = { }
*
*Ways of measuring a search strategy
*Completeness = guaranteed to find a solution if there is one.
*Optimality = whether the found solution is the best one, and how good it is.
*Time Complexity = the speed of finding a solution
*Space Complexity = how much memory is required to execute the
strategy (usually expressed in terms of how deep the shallowest
goal state sits in the search tree.)
*Depth First Search
*Breadth First Search
*Uniform Cost Search
*Bidirectional Search
*A* search
*
*
*A Heuristic Function is a function that ranks alternatives in search
algorithms at each branching step, based on available info, to
decide which branch to follow.
Greedy Best-First Search : chooses as its next node to expand
the node with the smallest cost according to this heuristic.
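Not from the slides, but a minimal Python sketch of that idea may help; the `neighbors` and `h` callables are hypothetical placeholders supplied by whatever problem you plug in:

```python
import heapq
from itertools import count

def greedy_best_first(start, goal, neighbors, h):
    """Greedy best-first search: always expand the frontier node with the
    smallest heuristic value h(n). Fast, but not complete in general."""
    tie = count()                               # tie-breaker for equal h values
    frontier = [(h(start), next(tie), start)]   # priority queue ordered by h only
    came_from = {start: None}
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if node == goal:
            path = []                           # reconstruct the path to the start
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for nxt in neighbors(node):
            if nxt not in came_from:            # first visit wins
                came_from[nxt] = node
                heapq.heappush(frontier, (h(nxt), next(tie), nxt))
    return None                                 # no solution found
```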
*
[Diagram: classifying searches. Uniform-Cost Search is non-heuristic; the greedy algorithm / Greedy Best-First is heuristic but not complete, i.e. it sometimes cannot find a solution even if there is one; A* search combines both.]
*
*A* search uses a function that is a combination of a heuristic
function and a cost function (= returns the cost from the root to the
current node)
[Diagram: A* Search = the greedy part, estimating the cost to a solution, plus the cost to get to the current node.]
*
*Define A* search: f(n) = c(n) + h(n)
*where c(n) = the cost to get to the current node
*and h(n) = the heuristic function, which here returns the number of
tiles that do not match the goal state (the misplaced tiles)
[Figure: A* on the 8-puzzle. From the start state 7 2 5 / 3 _ 4 / 1 8 6, moves such as down and left each produce a successor with c(n) = 1 and its own h(n); the highlighted successors, those with the best f(n) = c(n) + h(n), are expanded next. The goal state is 1 2 3 / 4 5 6 / 7 8 _.]
*
[Figure: the expansion continues; each new successor again has c(n) = 1 plus its own h(n).]
And so on… until we get the final state 1 2 3 / 4 5 6 / 7 8 _.
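As an illustration (my own sketch, not the slides'), here is the f(n) = c(n) + h(n) idea on the 8-puzzle with the misplaced-tiles heuristic; the tuple encoding with 0 for the blank is an assumption:

```python
import heapq

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the blank tile

def h(state):
    """Heuristic: number of tiles not matching the goal (misplaced tiles)."""
    return sum(1 for tile, want in zip(state, GOAL) if tile != 0 and tile != want)

def neighbors(state):
    """Yield the states reachable by sliding one tile into the blank."""
    i = state.index(0)
    r, c = divmod(i, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            j = nr * 3 + nc
            s = list(state)
            s[i], s[j] = s[j], s[i]
            yield tuple(s)

def a_star(start):
    """Always expand the frontier node with the smallest f(n) = c(n) + h(n)."""
    frontier = [(h(start), 0, start, [start])]    # (f, c, state, path)
    best_cost = {start: 0}
    while frontier:
        f, c, state, path = heapq.heappop(frontier)
        if state == GOAL:
            return path
        for nxt in neighbors(state):
            nc = c + 1                            # every move costs 1
            if nc < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = nc
                heapq.heappush(frontier, (nc + h(nxt), nc, nxt, path + [nxt]))
    return None
```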
*
*Because Breadth-First Search is an uninformed search, its
processing does not refer to the goal state, i.e. it does not use additional
info.
[Figure: BFS on the same 8-puzzle. The tree is expanded level by level: every successor of the start state is generated first, then every successor of those states, and so on, until the goal state 1 2 3 / 4 5 6 / 7 8 _ finally appears.]
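A minimal BFS sketch for comparison (mine, not the slides'); it can reuse the `neighbors` helper from the A* sketch above, and note that the goal is only used to stop, never to guide expansion:

```python
from collections import deque

def bfs(start, goal, neighbors):
    """Breadth-first search: expand states level by level, ignoring any
    heuristic information about how close a state is to the goal."""
    frontier = deque([start])
    came_from = {start: None}
    while frontier:
        node = frontier.popleft()
        if node == goal:
            path = []                    # reconstruct the path to the start
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for nxt in neighbors(node):
            if nxt not in came_from:
                came_from[nxt] = node
                frontier.append(nxt)
    return None
```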
*
*Bidirectional search also refers to the goal state, as A* does: it
searches forward from the start and backward from the goal until the two
frontiers meet (though with no heuristic it is usually still classed as uninformed).
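A rough sketch of the bidirectional idea (my own; to keep it short it only reports whether the two frontiers meet, not the full path):

```python
def bidirectional_search(start, goal, neighbors):
    """Grow one frontier from the start and one from the goal;
    succeed as soon as the two frontiers share a state."""
    if start == goal:
        return True
    front, back = {start}, {goal}
    seen_f, seen_b = {start}, {goal}
    while front and back:
        if len(front) > len(back):       # expand the smaller frontier first
            front, back = back, front
            seen_f, seen_b = seen_b, seen_f
        nxt_front = set()
        for node in front:
            for n in neighbors(node):
                if n in seen_b:          # the frontiers met in the middle
                    return True
                if n not in seen_f:
                    seen_f.add(n)
                    nxt_front.add(n)
        front = nxt_front
    return False
```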
*
*
*The Hill Climbing algorithm, to choose a successor, looks at the
evaluation function's value for each neighbor state and picks the best
neighbor. If the evaluation function returns the same value for two or
more neighbors, it picks one of those neighbors randomly.
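A minimal sketch of that procedure, with hypothetical `neighbors` and `value` callables:

```python
import random

def hill_climb(state, neighbors, value):
    """Deterministic hill climbing: repeatedly move to the best neighbor,
    breaking ties randomly; stop when no neighbor improves on the state."""
    while True:
        nbrs = list(neighbors(state))
        if not nbrs:
            return state
        best = max(value(n) for n in nbrs)
        if best <= value(state):
            return state                 # local maximum (or plateau) reached
        # ties among equally good neighbors are broken randomly
        state = random.choice([n for n in nbrs if value(n) == best])
```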
*
*Stochastic Hill Climbing checks the evaluation
function's value for each neighbor, assigns probabilities depending on the
evaluation function, then rolls a die to decide which branch to
choose.
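A sketch of the stochastic variant, assuming (my assumption) that the probability of picking an uphill neighbor is proportional to how much it improves the value:

```python
import random

def stochastic_hill_climb(state, neighbors, value, steps=1000):
    """Stochastic hill climbing: choose among the uphill neighbors at
    random, weighted by how much each one improves the current value."""
    for _ in range(steps):
        uphill = [n for n in neighbors(state) if value(n) > value(state)]
        if not uphill:
            return state                              # local maximum
        weights = [value(n) - value(state) for n in uphill]
        state = random.choices(uphill, weights=weights, k=1)[0]  # roll a die
    return state
```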
*
*Random Hill Climbing with restart (see the sketch below):
1. Pick an initial state randomly
2. Run the hill-climbing algorithm
i. If it finds a solution, then done
ii. If it cannot find a solution, then pick another initial state randomly and
go back to 2.
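A sketch of the restart wrapper, reusing the `hill_climb` function from the earlier sketch; `random_state` and `is_solution` are hypothetical callables:

```python
def random_restart_hill_climb(random_state, neighbors, value,
                              is_solution, tries=100):
    """Random-restart wrapper: keep re-running hill climbing from fresh
    random initial states until one run reaches a solution."""
    for _ in range(tries):
        result = hill_climb(random_state(), neighbors, value)
        if is_solution(result):
            return result
    return None          # no run succeeded within the budget
```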
*
*
1. Begin with k randomly generated states.
2. Generate the successors of all k states.
3. Pick the k best successors.
4. Treating the k best successors as the new k states, repeat from step 2
(see the sketch below).
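A sketch of those steps (my own; `random_state`, `neighbors`, and `value` are hypothetical callables):

```python
import heapq

def local_beam_search(random_state, neighbors, value, k=8, steps=100):
    """Local beam search: keep the k best states, expand all of them,
    then keep the k best of the pooled successors."""
    states = [random_state() for _ in range(k)]                 # step 1
    for _ in range(steps):
        successors = [n for s in states for n in neighbors(s)]  # step 2
        if not successors:
            break
        states = heapq.nlargest(k, successors, key=value)       # step 3
    return max(states, key=value)
```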
*
*
*Adversarial Search problems: often called Games.
*Adversarial = opposing, competing against each other.
*Used in multi-agent AND competitive environments
*Consider turn-taking games with two players,
*zero-sum and with perfect information
*Perfect information means the game is deterministic and the
environment is visible to all players.
*Zero-sum means if player 1 gets payoff +a, then player 2 gets
payoff -a. It is possible for both players to get payoff 0.
*
*Let S0 be the initial state of the game, i.e. how the game is set up.
*Player(S) returns the player who has the right to take an action in
state S
*Action(S) returns the set of legal moves in a state.
*Result(S,A) is the transition model, which defines the result of a move
*Terminal_Test(S) determines when the game is over
*Utility(S,P) defines the final numeric value for player P
*Optimal Strategy: choose the best option for ourselves as far as
we can, even when the opponent always picks their best case
*
*Procedure Minimax takes a state S as an argument.
*Define it like below:
*This function returns a numerical value reflecting the goodness
of the choice that seems best
*Player Max likes to choose the bigger number
*Player Min likes to choose the smaller number
*Example of Tic-Tac-Toe
*Action(S0) = {Up, Right, Down, Left}
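The slide's own definition did not survive the export, so here is a minimal sketch, assuming the Player/Action/Result/Terminal_Test/Utility interface from the previous slide and strictly alternating players (the Python names are my own):

```python
def minimax(state, player, actions, result, terminal, utility):
    """Minimax value of a state: Max picks the largest child value,
    Min picks the smallest, and a terminal state returns Utility(S, Max)."""
    if terminal(state):
        return utility(state)
    nxt = 'MIN' if player == 'MAX' else 'MAX'   # players strictly alternate
    values = [minimax(result(state, a), nxt, actions, result, terminal, utility)
              for a in actions(state)]
    return max(values) if player == 'MAX' else min(values)
```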
*S0
[Figure: a small game tree. The root S0 is a Max node whose actions A0, A1 lead to Min nodes T0, T1; their actions B0-B3 lead to Max nodes U0-U3; actions C0-C3 lead to the leaves V0-V3. We want to know: Minimax(S0).]
*First, evaluate the leaves:
Min_val(V0) = Utility(V0,Max) = 1
Min_val(V1) = Utility(V1,Max) = 2
Min_val(V2) = Utility(V2,Max) = 0
Min_val(V3) = Utility(V3,Max) = -3
*Back up to the Max level:
Max_val(U0) = Max{Min_val(Result(U0,C0))} = Max{Min_val(V0)} = 1
and likewise Max_val(U1) = 2, Max_val(U2) = 0, Max_val(U3) = -3
*Back up to the Min level:
Min_val(T0) = min{Max_val(Result(T0,B0)), Max_val(Result(T0,B1))}
= min{Max_val(U0), Max_val(U1)} = min{1,2} = 1
Min_val(T1) = min{Max_val(U2), Max_val(U3)} = min{0,-3} = -3
*Finally:
Minimax(S0) = max{Min_val(Result(S0,A0)), Min_val(Result(S0,A1))}
= max{Min_val(T0), Min_val(T1)} = …recursive… = max{1,-3} = 1
*So, player Max should pick the path where he can get 1,
i.e. Action = A0 seems the reasonable choice.
*
*Problems with the Minimax algorithm:
*it takes too much space and time
*We want to minimize the search for the best choice.
*So, Alpha-Beta Pruning comes in!
*S0
[Figure: the same game tree, now evaluated with Alpha-Beta Pruning.]
*Assume we have already calculated the left subtree,
so we know we will get 1 if we take action A0.
The right subtree is not calculated yet; call its value ?.
*We evaluate a little more: the first child of the right subtree
evaluates to Max_val(U2) = 0.
*Because we got 0, and we take the min at this level, ? = Min_val(T1) ≤ 0
no matter what the remaining children are worth.
*Also, since we take the max at state S0 and the left subtree already
guarantees 1, any ? < 1 is never selected as the decision of state S0.
*So, we can skip evaluating the rest of the right subtree (U3 and V3).
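A sketch of the same pruning logic in code (mine, using the same hypothetical game interface as the minimax sketch above):

```python
def alphabeta(state, player, actions, result, terminal, utility,
              alpha=float('-inf'), beta=float('inf')):
    """Minimax with alpha-beta pruning: stop expanding a node's remaining
    children as soon as its value can no longer affect the decision above."""
    if terminal(state):
        return utility(state)
    if player == 'MAX':
        v = float('-inf')
        for a in actions(state):
            v = max(v, alphabeta(result(state, a), 'MIN', actions, result,
                                 terminal, utility, alpha, beta))
            alpha = max(alpha, v)
            if alpha >= beta:
                break      # prune: the Min node above will never allow this
        return v
    else:
        v = float('inf')
        for a in actions(state):
            v = min(v, alphabeta(result(state, a), 'MAX', actions, result,
                                 terminal, utility, alpha, beta))
            beta = min(beta, v)
            if beta <= alpha:
                break      # prune: the Max node above will never allow this
        return v
```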
*
*
*A Constraint Satisfaction Problem, CSP, is defined as a set of
objects whose state must satisfy a number of constraints or
limitations.
*
*A variable is said to be node-consistent if the values in its
current domain satisfy all the variable's unary constraints.
*Example:
Given the constraint {SA != Green} (this is node consistency)
and the domain {Red, Green, Blue},
the possible values for SA become {Red, Blue}
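A tiny sketch of node consistency as domain filtering (the dict-of-sets representation is my assumption):

```python
def make_node_consistent(domains, unary_constraints):
    """Drop from each variable's domain any value that violates
    one of that variable's unary constraints."""
    return {var: {v for v in dom
                  if all(ok(v) for ok in unary_constraints.get(var, []))}
            for var, dom in domains.items()}

# e.g. make_node_consistent({'SA': {'Red', 'Green', 'Blue'}},
#                           {'SA': [lambda v: v != 'Green']})
# returns {'SA': {'Red', 'Blue'}}
```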
*
*A variable is arc-consistent if the values in its current domain
satisfy all the variable's binary constraints
*Example: Variable A != Variable B
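A sketch of the single-arc revision step that AC-style algorithms repeat (same assumed representation: domains are sets):

```python
def revise(domains, x, y, allowed):
    """Make x arc-consistent with y: remove any value of x that has no
    supporting value of y under the binary constraint allowed(vx, vy)."""
    revised = False
    for vx in set(domains[x]):
        if not any(allowed(vx, vy) for vy in domains[y]):
            domains[x].discard(vx)
            revised = True
    return revised

# e.g. for the constraint A != B with domains {'A': {1}, 'B': {1}},
# revise(domains, 'B', 'A', lambda b, a: b != a) empties B's domain,
# signalling that the problem is inconsistent.
```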
*
*A pair of variables is said to be path-consistent with respect
to a third variable if, for every assignment of the first two
variables consistent with their constraints, there is an assignment
to the third variable that satisfies the constraints between the third
variable and the first variable, and between the third variable and the
second variable
*
*Minimum-Remaining-Values (MRV) heuristic
*Choose the variable from amongst those that have the fewest
"legal" values.
*It is also known as the most-constrained-variable heuristic and the
fail-first heuristic.
*Degree heuristic
*Choose the variable that is involved in the largest number of
constraints on other unassigned variables
*Less possibility of needing backtracking (see the sketch below)
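A sketch combining both heuristics for variable selection (my own; `neighbors` is assumed to map each variable to the variables it shares constraints with):

```python
def select_unassigned_variable(variables, assignment, domains, neighbors):
    """MRV: pick the variable with the fewest remaining legal values;
    break ties with the degree heuristic (most constraints on other
    unassigned variables)."""
    unassigned = [v for v in variables if v not in assignment]
    def degree(v):
        return sum(1 for n in neighbors[v] if n not in assignment)
    # smallest domain first; among equals, largest degree first
    return min(unassigned, key=lambda v: (len(domains[v]), -degree(v)))
```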
*
*Simple map coloring has arc consistency:
*adjacent nodes cannot be the same color.
*E.g. SA != WA
*Domain = {Green, Blue, Light Blue (LB)}
[Figure: map of Australia with regions WA, NT, SA, Q, NSW, V.]
*
[Figure: the Australia constraint graph at several assignment steps, with a count shown next to each region (its remaining legal values); the MRV heuristic picks a region with the smallest count.]
*
*In the Australia example, MRV might end up choosing a vertex with
minimum degree, i.e. the least constrained one,
*because all variables = {WA, SA, NT, Q, V, NSW} start with the same domain
= {Green, Blue, Light Blue} of size 3.
*In the case where it does not choose WA first, we need backtracking.
*This heuristic seems powerful when we have node
consistency (unary constraints).
*
[Figure: the Australia constraint graph annotated with each region's degree: SA 5, Q 3, NT 3, NSW 3, WA 2, V 2. Pick SA first because it has the most constraints on other variables.]
*
*The Degree Heuristic's choice depends on each vertex's degree in
the graph representation.
*So, it chooses the vertex that has the most arc constraints.
