Space: wesslen/prodigy-rlhf-respond
wesslen committed on May 23, 2023

Commit f40eb98 (1 parent: d0750ff)

copy repo from rlhf-ranking; update for respond

Files changed (5)

Dockerfile ADDED

+FROM python:3.9
+#COPY requirements.txt /app/
+WORKDIR /app
+# # Set up a new user named "user" with user ID 1000
+# RUN useradd -m -u 1000 user
+# # Switch to the "user" user
+# USER user
+# # Set home to the user's home directory
+# ENV HOME=/home/user \
+# 	PATH=/home/user/.local/bin:$PATH
+# # Set the working directory to the user's home directory
+# WORKDIR $HOME/app
+# # Copy the current directory contents into the container at $HOME/app setting the owner to the user
+# COPY --chown=user . $HOME/app
+RUN --mount=type=secret,id=LICENSE_KEY,mode=0444,required=true \
+    pip install --upgrade pip \
+    && pip install typing_extensions==4.5.0 \
+    && pip install --quiet prodigy -f https://$(cat /run/secrets/LICENSE_KEY)@download.prodi.gy
+RUN chmod 777 .
+COPY prodigy.json .
+COPY data ./data/
+COPY rlhf_respond.py .
+COPY prodigy.sh .
+ENV PRODIGY_HOME /app
+ENV PRODIGY_LOGGING "verbose"
+ENV PRODIGY_ALLOWED_SESSIONS "user1,user2"
+EXPOSE 7860
+CMD ["bash","prodigy.sh"]

data/dataset.jsonl ADDED

The diff for this file is too large to render.

prodigy.json ADDED

+{
+    "port": 7860,
+    "host": "0.0.0.0",
+    "db": "sqlite",
+    "db_settings": {
+      "sqlite": {
+        "name": "prodigy.db",
+        "path": "/app"
+      }
+    },
+    "feed_overlap": true,
+    "show_stats": true
+}

prodigy.sh ADDED

+python -m prodigy rlhf.respond rlhf_data data/dataset.jsonl -F rlhf_respond.py

rlhf_respond.py ADDED

+import prodigy
+from prodigy import get_stream
+@prodigy.recipe(
+    "rlhf.respond",
+    dataset=("Dataset to save answers to", "positional", None, str),
+    source=("Datafile to load", "positional", None, str),
+)
+def respond(dataset, source):
+    # Load the source file as a stream of task dicts
+    stream = get_stream(source)
+    def prep_stream(stream):
+        # The "text" block renders the "text" key, so move "instruction" there
+        for ex in stream:
+            ex['text'] = ex.pop('instruction')
+            yield ex
+    return {
+        "dataset": dataset,
+        "view_id": "blocks",
+        "stream": prep_stream(stream),
+        "config":{
+            "global_css": ".prodigy-option{font-size: 15px;}",
+            "blocks":[
+                {"view_id": "text"},
+                {"view_id": "text_input", "field_autofocus": True, "field_rows": 4, "field_placeholder": "Try to use 2-3 sentences to answer the question."},
+            ],
+        }
+    }
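
The stream-preparation step in the recipe above can be exercised without Prodigy installed. The following is a minimal sketch, not the author's code: it assumes each record in data/dataset.jsonl is a dict with an "instruction" key (the file is not shown in this diff), the sample records are hypothetical, and copying each dict before renaming is a choice made here rather than something the recipe does.

```python
from typing import Dict, Iterable, Iterator


def prep_stream(stream: Iterable[Dict]) -> Iterator[Dict]:
    """Rename each example's "instruction" key to "text"."""
    for ex in stream:
        # Copy so the caller's dict is left untouched (a choice of this sketch;
        # the recipe in the diff mutates the example in place).
        ex = dict(ex)
        ex["text"] = ex.pop("instruction")
        yield ex


# Hypothetical records standing in for lines of data/dataset.jsonl.
sample = [
    {"instruction": "Name three primary colors."},
    {"instruction": "What is the capital of France?", "meta": {"id": 2}},
]

prepared = list(prep_stream(sample))
```

Other keys such as "meta" pass through unchanged, and a record without an "instruction" key raises a KeyError, so malformed input fails early instead of producing an empty task.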