モーション計画

2013-05-08

AI for Robotics - Lesson 4: Motion Planning (pdf) (動画)

ゴールまでの最短経路探索。A*とかDynamic Programmingのざっくりとした説明。まあこのあたりは一応知っていたからわかったけど、動画だけで理解するのは大変そう。

グリッドベースの最短経路探索ならわかるけど、クルマの自動運転のようにグリッドベースじゃない場合はどうしたらいいんだろう?

問題:確率によって直進しないことがある場合のコストの計算

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
# --------------
# USER INSTRUCTIONS
#
# Write a function called stochastic_value that
# takes no input and RETURNS two grids. The
# first grid, value, should contain the computed
# value of each cell as shown in the video. The
# second grid, policy, should contain the optimum
# policy for each cell.
#
# Stay tuned for a homework help video! This should
# be available by Thursday and will be visible
# in the course content tab.
#
# Good luck! Keep learning!
#
# --------------
# GRADING NOTES
#
# We will be calling your stochastic_value function
# with several different grids and different values
# of success_prob, collision_cost, and cost_step.
# In order to be marked correct, your function must
# RETURN (it does not have to print) two grids,
# value and policy.
#
# When grading your value grid, we will compare the
# value of each cell with the true value according
# to this model. If your answer for each cell
# is sufficiently close to the correct answer
# (within 0.001), you will be marked as correct.
#
# NOTE: Please do not modify the values of grid,
# success_prob, collision_cost, or cost_step inside
# your function. Doing so could result in your
# submission being inappropriately marked as incorrect.

# -------------
# GLOBAL VARIABLES
#
# You may modify these variables for testing
# purposes, but you should only modify them here.
# Do NOT modify them inside your stochastic_value
# function.

grid = [[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 1, 1, 0]]

goal = [0, len(grid[0])-1] # Goal is in top right corner


delta = [[-1, 0 ], # go up
[ 0, -1], # go left
[ 1, 0 ], # go down
[ 0, 1 ]] # go right

delta_name = ['^', '<', 'v', '>'] # Use these when creating your policy grid.

success_prob = 0.5
failure_prob = (1.0 - success_prob)/2.0 # Probability(stepping left) = prob(stepping right) = failure_prob
collision_cost = 100
cost_step = 1


############## INSERT/MODIFY YOUR CODE BELOW ##################
#
# You may modify the code below if you want, but remember that
# your function must...
#
# 1) ...be called stochastic_value().
# 2) ...NOT take any arguments.
# 3) ...return two grids: FIRST value and THEN policy.

def stochastic_value():
value = [[1000 for row in range(len(grid[0]))] for col in range(len(grid))]
policy = [[' ' for row in range(len(grid[0]))] for col in range(len(grid))]

policy[goal[0]][goal[1]] = '*'

for i in range(100): # TODO: Check convergence.
process(value, policy)

return value, policy

def process(value, policy):
processed = [[False for row in range(len(grid[0]))] for col in range(len(grid))]
x = goal[0]
y = goal[1]
g = 0
value[x][y] = 0
processed[x][y] = True

open = [[g, x, y]]
while len(open) > 0:
open.sort()
open.reverse()
next = open.pop()
x = next[1]
y = next[2]
g = next[0]

if x != goal[0] or y != goal[1]:
value[x][y], dir = calc_min_cost(grid, value, x, y)
policy[x][y] = delta_name[dir]

for i in range(len(delta)):
x2 = x + delta[i][0]
y2 = y + delta[i][1]
if x2 >= 0 and x2 < len(grid) and y2 >=0 and y2 < len(grid[0]):
if not processed[x2][y2] and grid[x2][y2] == 0:
g2 = value[x][y] + cost_step
open.append([g2, x2, y2])
processed[x2][y2] = True

def calc_min_cost(grid, value, x, y):
mindir = -1
mincost = 0
for dir in range(4):
cost = calc_cost(grid, value, x, y, dir)
if mindir < 0 or cost < mincost:
mindir = dir
mincost = cost
return mincost, mindir

def calc_cost(grid, value, x, y, dir):
return (get_dir_cost(grid, value, x, y, dir) * success_prob +
get_dir_cost(grid, value, x, y, dir - 1) * failure_prob +
get_dir_cost(grid, value, x, y, dir + 1) * failure_prob) + cost_step

def get_dir_cost(grid, value, x, y, dir):
dir &= 3
x += delta[dir][0]
y += delta[dir][1]
if x < 0 or y < 0 or x >= len(value) or y >= len(value[x]) or grid[x][y] != 0:
return collision_cost
else:
return value[x][y]


value, policy = stochastic_value()
for line in value:
print line
for line in policy:
print line

故障を恐れるあまり、左上のゴールに直接向かわないようになってしまっている

1
2
3
4
5
6
7
8
[57.902, 40.278, 26.066, 0]
[47.054, 36.572, 29.993, 27.269]
[53.171, 42.022, 37.775, 45.091]
[77.585, 1000, 1000, 73.54582111158368]
['>', 'v', 'v', '*']
['>', '>', '^', '<']
['>', '^', '^', '<']
['^', ' ', ' ', '^']