jackyliang42 committed · commit fc126c5
Parent(s): c81e0c0
updated readme
README.md
CHANGED
@@ -11,37 +11,31 @@ license: apache-2.0
---

# Code as Policies Tabletop Manipulation Interactive Demo

This notebook is part of the open-source code release associated with the paper:

[Code as Policies: Language Model Programs for Embodied Control](https://code-as-policies.github.io/)

This notebook gives an interactive demo for the simulated tabletop manipulation domain described in Section IV.D of the paper.

+## Preparations
+1. Obtain an [OpenAI API Key](https://openai.com/blog/openai-api/)
+2. Gain Codex access by [joining the waitlist](https://openai.com/blog/openai-codex/)

Once you have Codex access you can use `code-davinci-002`. Using the GPT-3 model (`text-davinci-002`) also works, but performance won't be as good (there will be more code logic errors).
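
As a rough illustration of what the demo does with these credentials: each instruction ends up as a few-shot prompt sent to a single completion endpoint. A minimal sketch, assuming the pre-1.0 `openai` Python package (the prompt below is a stand-in; the real app prepends many worked examples):

```python
import openai

openai.api_key = 'sk-...'  # your OpenAI API key from step 1

# Stand-in prompt; the actual demo prepends many few-shot examples of
# instructions paired with robot policy code.
prompt = (
    '# Python tabletop robot control script.\n'
    '# put the red block on the blue bowl.\n'
)

# code-davinci-002 requires Codex access; text-davinci-002 works with a
# plain API key but makes more code logic errors.
response = openai.Completion.create(
    engine='code-davinci-002',
    prompt=prompt,
    temperature=0,
    max_tokens=256,
    stop=['# '],
)
print(response['choices'][0]['text'])
```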

+## Usage
1. Fill in the API Key, the model name, and how many blocks and bowls to spawn in the environment.
2. Click Setup/Reset Env.
3. Based on the newly sampled object names, input an instruction and click Run Instruction. If successful, this will render a video and update the simulation environment visualization.

+You can run instructions in sequence and refer back to previous instructions (e.g. do the same with other blocks, move the same block to the other bowl, etc.). Click Setup/Reset Env to reset; this will clear the current instruction history.
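
Referring back to earlier instructions works because each exchange stays in the prompt context. A minimal sketch of that idea, assuming history is kept as an append-only list (an illustration of the mechanism, not the demo's actual LMP class):

```python
class InstructionHistory:
    """Accumulates instruction/code exchanges so later prompts can refer
    back to earlier ones (e.g. 'do the same with other blocks')."""

    def __init__(self, base_prompt):
        self.base_prompt = base_prompt  # the few-shot examples
        self.exchanges = []

    def build_prompt(self, instruction):
        # Past exchanges precede the new instruction in the prompt.
        return self.base_prompt + ''.join(self.exchanges) + f'\n# {instruction}\n'

    def record(self, instruction, generated_code):
        self.exchanges.append(f'\n# {instruction}\n{generated_code}\n')

    def reset(self):
        # Mirrors what Setup/Reset Env does to the instruction history.
        self.exchanges.clear()
```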

+## Supported Instructions
* Spatial reasoning (e.g. to the left of the red block, the closest corner, the farthest bowl, the second block from the right)
* Sequential actions (e.g. put blocks in matching bowls, stack blocks on the bottom right corner)
+* Contextual instructions (e.g. do the same with the blue block, undo that)
* Language-based reasoning (e.g. put the forest-colored block on the ocean-colored bowl).
* Simple Q&A (e.g. how many blocks are to the left of the blue bowl?)
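
To make these categories concrete, here is the kind of policy code the model might emit for a spatial-reasoning instruction. The helpers (`get_obj_names`, `get_obj_pos`, `put_first_on_second`) follow the tabletop API used in the paper's prompts and are assumed to be provided in the generated code's execution scope; treat the exact names and signatures as assumptions:

```python
import numpy as np

# Instruction: put the red block on the bowl closest to it.
bowl_names = [name for name in get_obj_names() if 'bowl' in name]
block_pos = get_obj_pos('red block')
dists = [np.linalg.norm(get_obj_pos(name) - block_pos) for name in bowl_names]
closest_bowl = bowl_names[int(np.argmin(dists))]
put_first_on_second('red block', closest_bowl)
```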

+## Example Instructions
+Note: object names may need to be changed depending on the sampled object names.
* put the sun-colored block on the bowl closest to it
* stack the blocks on the bottom most bowl
* arrange the blocks as a square in the middle
@@ -49,9 +43,8 @@ Example commands (note object names may need to be changed depending the sampled
* how many blocks are to the right of the orange bowl?
* pick up the block closest to the top left corner and place it on the bottom right corner

+## Known Limitations
+* In simulation we're using ground truth object poses instead of vision models. This means that instructions that require knowledge of visual appearances (e.g. darkest bowl, largest object) are not supported.
* Currently, the low-level pick-and-place primitive does not do collision checking, so if there are many objects on the table, placing actions may incur collisions.
+* Prompt saturation - if too many instructions (10+) are executed in a row, the LLM may start to ignore examples in the early parts of the prompt (a possible mitigation is sketched below).
* Ambiguous instructions - if a given instruction doesn't lead to the desired actions, try rephrasing it to remove ambiguities (e.g. place the block on the closest bowl -> place the block on its closest bowl)
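
A possible mitigation for the prompt-saturation limitation, assuming instruction history is kept as a list of exchanges as in the earlier sketch: cap how much history goes into each prompt.

```python
MAX_EXCHANGES = 8  # assumed cutoff; tune so early few-shot examples stay effective

def build_capped_prompt(base_prompt, exchanges, instruction):
    # Keep only the most recent exchanges so the few-shot examples in
    # base_prompt are not pushed out of the model's effective attention.
    recent = exchanges[-MAX_EXCHANGES:]
    return base_prompt + ''.join(recent) + f'\n# {instruction}\n'
```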
app.py
CHANGED
@@ -80,7 +80,7 @@ class DemoRunner:
        self._lmp_tabletop_ui = self.make_LMP(self._env)

+        info = '### Available Objects: \n- ' + '\n- '.join(obj_list)
        img = self._env.get_camera_image()

        return info, img
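
The added `info` string is markdown that the Gradio UI renders. For example, with `obj_list = ['blue block', 'red bowl']` it produces:

```
### Available Objects: 
- blue block
- red bowl
```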
@@ -118,6 +118,7 @@ if __name__ == '__main__':
    with demo:
        gr.Markdown(readme_text)
+        gr.Markdown('# Interactive Demo')
        with gr.Row():
            with gr.Column():
                with gr.Row():
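
For context, the `with demo:` block is a standard Gradio Blocks layout. A minimal self-contained sketch of the same pattern (the widget names and the handler below are illustrative assumptions, not the app's actual wiring):

```python
import gradio as gr

def setup_env(num_blocks, num_bowls):
    # Stand-in for the app's DemoRunner setup: returns info markdown and an image.
    info = f'### Available Objects: \n- block x{int(num_blocks)}\n- bowl x{int(num_bowls)}'
    return info, None

with gr.Blocks() as demo:
    gr.Markdown('# Interactive Demo')
    with gr.Row():
        with gr.Column():
            num_blocks = gr.Slider(0, 4, value=3, step=1, label='Number of Blocks')
            num_bowls = gr.Slider(0, 4, value=3, step=1, label='Number of Bowls')
            setup_btn = gr.Button('Setup/Reset Env')
        with gr.Column():
            info_md = gr.Markdown()
            img = gr.Image(label='Environment')
    setup_btn.click(setup_env, inputs=[num_blocks, num_bowls], outputs=[info_md, img])

if __name__ == '__main__':
    demo.launch()
```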