jackyliang42 committed
Commit fc126c5 · 1 Parent(s): c81e0c0

updated readme

Files changed (2):
  1. README.md +12 -19
  2. app.py +2 -1
README.md CHANGED
@@ -11,37 +11,31 @@ license: apache-2.0
 ---
 
 # Code as Policies Tabletop Manipulation Interactive Demo
-
 This notebook is a part of the open-source code release associated with the paper:
-
 [Code as Policies: Language Model Programs for Embodied Control](https://code-as-policies.github.io/)
-
 This notebook gives an interactive demo for the simulated tabletop manipulation domain, seen in the paper section IV.D
 
-## Preparations:
-
-1) Obtain an [OpenAI API Key](https://openai.com/blog/openai-api/)
-
-2) Gain Codex access by [joining the waitlist](https://openai.com/blog/openai-codex/)
-
+## Preparations
+1. Obtain an [OpenAI API Key](https://openai.com/blog/openai-api/)
+2. Gain Codex access by [joining the waitlist](https://openai.com/blog/openai-codex/)
 Once you have Codex access you can use `code-davinci-002`. Using the GPT-3 model (`text-davinci-002`) is also ok, but performance won't be as good (there will be more code logic errors).
 
-## Instructions:
-
+## Usage
 1. Fill in the API Key, model name, and how many blocks and bowls to be spawned in the environment.
 2. Click Setup/Reset Env
 3. Based on the new randomly sampled object names, input an instruction and click Run Instruction. If successful, this will render a video and update the simulation environment visualization.
 
-You can run instructions in sequence and refer back to previous commands (e.g. do the same with other blocks, move the same block to the other bowl, etc). Click Setup/Reset Env to reset, and this will clear the current instruction history.
+You can run instructions in sequence and refer back to previous instructions (e.g. do the same with other blocks, move the same block to the other bowl, etc). Click Setup/Reset Env to reset, and this will clear the current instruction history.
 
-Supported commands:
+## Supported Instructions
 * Spatial reasoning (e.g. to the left of the red block, the closest corner, the farthest bowl, the second block from the right)
 * Sequential actions (e.g. put blocks in matching bowls, stack blocks on the bottom right corner)
-* Contextual commands (e.g. do the same with the blue block, undo that)
+* Contextual instructions (e.g. do the same with the blue block, undo that)
 * Language-based reasoning (e.g. put the forest-colored block on the ocean-colored bowl)
 * Simple Q&A (e.g. how many blocks are to the left of the blue bowl?)
 
-Example commands (note object names may need to be changed depending the sampled object names):
+## Example Instructions
+Note: object names may need to be changed depending on the sampled object names.
 * put the sun-colored block on the bowl closest to it
 * stack the blocks on the bottom most bowl
 * arrange the blocks as a square in the middle
@@ -49,9 +43,8 @@ Example commands (note object names may need to be changed depending the sampled
 * how many blocks are to the right of the orange bowl?
 * pick up the block closest to the top left corner and place it on the bottom right corner
 
-Known limitations:
-* In simulation we're using ground truth object poses instead of using vision models. This means that commands the require knowledge of visual apperances (e.g. darkest bowl, largest object) are not supported.
+## Known Limitations
+* In simulation we're using ground truth object poses instead of using vision models. This means that instructions that require knowledge of visual appearances (e.g. darkest bowl, largest object) are not supported.
 * Currently, the low-level pick place primitive does not do collision checking, so if there are many objects on the table, placing actions may incur collisions.
-* Prompt saturation - if too many commands (10+) are executed in a row, then the LLM may start to ignore examples in the early parts of the prompt.
+* Prompt saturation - if too many instructions (10+) are executed in a row, then the LLM may start to ignore examples in the early parts of the prompt.
 * Ambiguous instructions - if a given instruction doesn't lead to the desired actions, try rephrasing it to remove ambiguities (e.g. place the block on the closest bowl -> place the block on its closest bowl)
-
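
For context, the README's model note maps onto the OpenAI completion API roughly as sketched below. This is a minimal sketch using the legacy openai-python `Completion` interface that was current when Codex was offered; the prompt, stop sequence, and parameter values are illustrative placeholders, not the demo's actual prompting setup.

```python
import openai

# Minimal sketch (legacy openai-python < 1.0 Completion API).
# The prompt and parameters below are placeholders, not the demo's real prompt.
openai.api_key = 'YOUR_OPENAI_API_KEY'

response = openai.Completion.create(
    engine='code-davinci-002',  # or 'text-davinci-002' (more code logic errors)
    prompt='# Python code to put the blue block in the red bowl.\n',
    temperature=0,
    max_tokens=256,
    stop=['#'],
)
print(response['choices'][0]['text'])
```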
 
app.py CHANGED
@@ -80,7 +80,7 @@ class DemoRunner:
 
         self._lmp_tabletop_ui = self.make_LMP(self._env)
 
-        info = '## Available Objects: \n- ' + '\n- '.join(obj_list)
+        info = '### Available Objects: \n- ' + '\n- '.join(obj_list)
         img = self._env.get_camera_image()
 
         return info, img
@@ -118,6 +118,7 @@ if __name__ == '__main__':
 
     with demo:
         gr.Markdown(readme_text)
+        gr.Markdown('# Interactive Demo')
         with gr.Row():
            with gr.Column():
                with gr.Row():
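
For reference, here is a minimal, self-contained sketch of the Gradio layout these two hunks touch, assuming app.py builds its UI with `gr.Blocks`; the object names and widgets are illustrative placeholders, not the actual app.

```python
import gradio as gr

# Illustrative placeholders (not the real app state).
readme_text = '# Code as Policies Tabletop Manipulation Interactive Demo\n...'
obj_list = ['blue block', 'red bowl']

# This commit changes the heading level from '##' to '###', so the object list
# renders as a sub-heading under the page-level '# Interactive Demo' heading.
info = '### Available Objects: \n- ' + '\n- '.join(obj_list)

with gr.Blocks() as demo:
    gr.Markdown(readme_text)           # README rendered at the top of the page
    gr.Markdown('# Interactive Demo')  # heading added by this commit
    with gr.Row():
        with gr.Column():
            gr.Markdown(info)          # available-object list
            gr.Textbox(label='Instruction')
            gr.Button('Run Instruction')

if __name__ == '__main__':
    demo.launch()
```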