| Current Path : /var/www/html/rkala/c3p5/ |
| Current File : /var/www/html/rkala/c3p5/IIT2019237.txt |
Note: You have to upload every small piece of rough work done.

question no. 1: For the following planning problem, use the planning graph. Show all the variables generated at level 2, assuming the source is level 0. Also calculate the sum level heuristic and the max level heuristic for the given goal. Also calculate the mutex relationships at the first action level only. The first line of the answer is all the state variables generated till level 2 in sorted order. Use ~ to denote negation. The second line of the answer is the sum level heuristic and the max level heuristic, separated by a space. The third line of the answer is the mutex relationships in pairs: the first two words are the first mutex relationship, the next two words are the next mutex relationship, and so on. The name of a persistence action is the name of the associated variable itself. Use ~ for not.

Source: N, P, ~X
Goal: N, ~P, T
Action A1: Precondition N, P, Effect ~P
Action A2: Precondition P, ~X, Effect X
Action A3: Precondition ~P, N, Effect ~N
Action A4: Precondition P, X, Effect T

question no. 2: For the undirected graph represented by the following weight matrix and the given heuristic function, give the vertices, in sorted order, at which the heuristic function is **not** admissible. Source is 0 and goal is 5. The vertices are numbered 0, 1, 2, 3, 4 and 5. -1 means no edge. The answer is a list of sorted vertices, or -1 if the heuristic function is admissible at all the vertices. E.g. 0 2 3 or -1

Weight matrix:
-1 -1  6  8  1 -1
-1 -1 18  6 14 13
 6 18 -1 -1 17 18
 8  6 -1 -1  2 -1
 1 14 17  2 -1 -1
-1 13 18 -1 -1 -1

Heuristic Function:
0: 19
1: 9
2: 15
3: 22
4: 27
5: 0

question no. 3: For the undirected graph represented by the following weight matrix and the given heuristic function, give the order in which the nodes are dequeued, along with all costs and parents, when using the A* algorithm as a tree search. In case of a tie, a numerically smaller vertex is preferred. Source is 0 and goal is 5.
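As rough work for question no. 2: a heuristic is admissible at a vertex when it does not overestimate the true cost of reaching the goal from that vertex, so one way to check each vertex is to run Dijkstra from the goal (the graph is undirected, so distance from the goal equals distance to it) and compare. A minimal Python sketch using the matrix above; the variable names are mine:

```python
import heapq

# Weight matrix from question no. 2 (-1 means no edge)
W = [
    [-1, -1,  6,  8,  1, -1],
    [-1, -1, 18,  6, 14, 13],
    [ 6, 18, -1, -1, 17, 18],
    [ 8,  6, -1, -1,  2, -1],
    [ 1, 14, 17,  2, -1, -1],
    [-1, 13, 18, -1, -1, -1],
]
h = {0: 19, 1: 9, 2: 15, 3: 22, 4: 27, 5: 0}
goal = 5

def dijkstra(w, src):
    """Shortest distances from src in an undirected weighted graph."""
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v in range(len(w)):
            if w[u][v] != -1 and d + w[u][v] < dist.get(v, float("inf")):
                dist[v] = d + w[u][v]
                heapq.heappush(pq, (dist[v], v))
    return dist

true_cost = dijkstra(W, goal)
# h is inadmissible at v exactly when h(v) exceeds the true cost to the goal
violations = sorted(v for v in h if h[v] > true_cost[v])
print(violations)  # → [3, 4]
```

Here the true costs to the goal come out as 22, 13, 18, 19, 21, 0 for vertices 0 through 5, so the heuristic overestimates only at vertices 3 and 4.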
The vertices are numbered 0, 1, 2, 3, 4 and 5. -1 means no edge.

Answer format: each vertex on a new line, in the format vertex name (in the order of dequeue) <space> distance from source <space> total cost <space> parent. Use NIL if no parent. E.g.
0 10 14 NIL
3 8 10 0
2 5 11 0
5 21 23 3

Weight matrix:
-1 -1  3 -1  6 17
-1 -1  0 13 -1 -1
 3  0 -1 19  0 15
-1 13 19 -1  7  3
 6 -1  0  7 -1 -1
17 -1 15  3 -1 -1

Heuristic Function:
0: 15
1: 0
2: 11
3: 2
4: 9
5: 0

question no. 4: Using resolution any number of times, find all new facts that contain a single symbol only, that are known to be true and can be added to the knowledge base. Only answer the new facts, in any (not necessarily sorted) order, in a space-separated format, e.g. a b c d. The initial knowledge base is as follows:
t→f
t∨(e→f)
f∨t∨e
f∨t∨¬e
¬t∨¬e
f∨(t→e)
t→e

question no. 5: Considering all the heuristics, solve the graph coloring problem represented in the following adjacency list with colors R, G and B, given the constraint that no two neighboring vertices may have the same color. The vertices are numbered 0, 1, 2, 3, 4 and 5. Show the order in which the assignments are made. In case of a tie, a numerically smaller vertex is preferred. The preference of colours is R (highest), followed by G (intermediate) and B (least preferred).

Answer format: each vertex on a new line, in the format vertex name (in the order of coloring) <space> color. If you backtrack, add <space>B in the answer format. E.g.
0 R
2 G
5 B
1 R
3 G
4 B

Vertex 0: 1 2
Vertex 1: 0 2
Vertex 2: 0 1 3
Vertex 3: 2 5
Vertex 4: 2 5
Vertex 5: 3 4

question no. 6: Consider the following Q-table.
(1,1) Up 1.4   (1,1) Down 9.9   (1,1) Right 2.5   (1,1) Left 5.6
(1,2) Up 9.1   (1,2) Down 5.4   (1,2) Right 5.6   (1,2) Left 5.3
(2,1) Up 4.3   (2,1) Down 4.5   (2,1) Right 5.7   (2,1) Left 8.9
(2,2) Up 1     (2,2) Down 5.5   (2,2) Right 0.8   (2,2) Left 6.5

An agent at state (2,1) takes the action down, reaches state (1,1), and observes a reward of 7, with a learning rate of 0.1 and a discount factor of 0.9.
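As rough work for this first step of question no. 6: it is the standard tabular Q-learning update, Q(s,a) ← Q(s,a) + α(r + γ·max_a' Q(s',a') − Q(s,a)). A small Python sketch of that single step, with the table transcribed from the question (names are mine):

```python
# Q-table from question no. 6: Q[state][action]
Q = {
    (1, 1): {"up": 1.4, "down": 9.9, "right": 2.5, "left": 5.6},
    (1, 2): {"up": 9.1, "down": 5.4, "right": 5.6, "left": 5.3},
    (2, 1): {"up": 4.3, "down": 4.5, "right": 5.7, "left": 8.9},
    (2, 2): {"up": 1.0, "down": 5.5, "right": 0.8, "left": 6.5},
}
alpha, gamma = 0.1, 0.9

def q_update(s, a, r, s_next):
    """Tabular Q-learning: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = r + gamma * max(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]

# First step: (2,1) --down--> (1,1), reward 7
print(q_update((2, 1), "down", 7, (1, 1)))  # → 5.641
```

The later steps of the question reapply the same rule to the updated entry, and the greedy policy at a state is just the argmax over that state's row of the table.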
An agent is reset at state (2,1), again takes the action down, reaches state (2,2), and observes a reward of 4, with the same learning rate and discount factor. Give the updated Q-value for the specific state/action after the 1st step and the 2nd step. Give the policy for the state (2,1) after all updates. The answer is two real numbers, followed by the action as a string, all space separated. E.g. 10.3 20.9 right

question no. 7: Solve the following problem using a backward search. The answer is the action sequence that leads from source to goal, with each action separated by a space, e.g. A3 A1 A6.
Source: N, I
Goal: Y
Action A1: Precondition N, I, Effect ~N
Action A2: Precondition ~N, I, Effect ~Y
Action A3: Precondition N, ~I, Effect ~N
Action A4: Precondition ~Y, ~N, Effect N
Action A5: Precondition ~N, ~I, Effect Y
Action A6: Precondition N, I, Effect ~I

question no. 8: Using forward chaining, find all **new** facts that contain a single symbol only, that are known to be true and can be added to the knowledge base. Only answer the new facts, in sorted order, in a space-separated format, e.g. a b c d. Use ~ for negation. The initial knowledge base is as follows:
¬c∨d∨¬z
¬t∨¬c∨o
u∨¬o∨¬d
¬c∨(x→o)
t∧c∧z
(¬o∧true)∨¬x∨(b∧true)
¬t∨(x→b)
¬u∨¬x∨a∨false
(¬o∨¬b∨u) ∧ (a∨¬b∨¬d)

question no. 9: Assume the following grid-map with walls on all sides.
a b c
d e f
g h i

The current utility estimates of the different states, when using a value iteration algorithm, are as follows:
a=9   b=7.8 c=9.3
d=7   e=8.8 f=1.6
g=8.1 h=8.8 i=6.4

The agent has actions up, down, right and left. Each action results in the desired motion with a probability of 0.7. The agent makes no motion with a probability of 0.1. The agent slips into either of the directions perpendicular to the applied action, with no motion in the intended direction, with a probability of 0.1 for each of the perpendicular directions. Hitting a wall has a negative reward of -10 with no motion.
Motion without hitting, or no motion, has a reward of -0.1. The discount factor is kept as 1. There is no goal/terminal state. For the following fixed policies, calculate the updated value of the utility of a. [1], [2], and so on denote the blanks to be answered.
policy(a)=up, U(a)=[1]
policy(a)=down, U(a)=[2]
policy(a)=right, U(a)=[3]
policy(a)=left, U(a)=[4]
For value iteration with no fixed policy, calculate the updated value of the utility estimate of a, U(a)=[5]. Also calculate the policy for the updated utility estimate of a, policy(a)=[6]. In case of a tie, the preference of actions is up (highest), down, right, left (lowest). The answer will be 1 line with all answers space separated, as [1] [2] [3] [4] [5] [6], without square brackets.
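As rough work for question no. 9, under one reading of the transition model above (intended move with probability 0.7, no motion with 0.1, and a slip attempt into each perpendicular direction with 0.1, where any attempted move into a wall yields -10 and no motion, any other outcome -0.1, discount 1), the one-step backup for state a can be sketched in Python; the grid layout and all names are my transcription:

```python
# Grid and utility estimates from question no. 9; 'a' is the top-left cell.
GRID = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
U = {"a": 9, "b": 7.8, "c": 9.3, "d": 7, "e": 8.8,
     "f": 1.6, "g": 8.1, "h": 8.8, "i": 6.4}
POS = {GRID[r][c]: (r, c) for r in range(3) for c in range(3)}
MOVES = {"up": (-1, 0), "down": (1, 0), "right": (0, 1), "left": (0, -1)}
PERP = {"up": ("left", "right"), "down": ("left", "right"),
        "right": ("up", "down"), "left": ("up", "down")}

def outcome_value(state, direction):
    """Value of one attempted physical move: wall hit = -10 and no motion, else -0.1 and move."""
    r, c = POS[state]
    dr, dc = MOVES[direction]
    nr, nc = r + dr, c + dc
    if 0 <= nr < 3 and 0 <= nc < 3:
        return -0.1 + U[GRID[nr][nc]]  # discount factor is 1
    return -10 + U[state]              # wall: penalty, agent stays put

def backup(state, action):
    """Expected utility of `action`: 0.7 intended, 0.1 stay, 0.1 per perpendicular slip."""
    v = 0.7 * outcome_value(state, action)
    v += 0.1 * (-0.1 + U[state])       # no motion, ordinary step reward
    for slip in PERP[action]:
        v += 0.1 * outcome_value(state, slip)
    return v

values = {act: round(backup("a", act), 4) for act in MOVES}
# dict order up/down/right/left matches the stated tie-break preference
best = max(MOVES, key=lambda act: backup("a", act))
print(values, best)  # → {'up': 0.86, 'down': 6.39, 'right': 6.87, 'left': 0.78} right
```

For the value-iteration blank, U(a) is the maximum of the four backups (6.87 under this reading), and the corresponding policy is the maximizing action.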