arnoldland committed on
Commit 5a9dd28 · 1 Parent(s): 2d28e96
Files changed (1)
  1. app.py +4 -4
app.py CHANGED
@@ -695,7 +695,7 @@ def create_gradio_interface():
 
   <div style="line-height: 1.8;">
   <br>
-  <p style="font-size: 18px;">Upload a <strong style="color: #7C4DFF;">landscape</strong>, <strong style="color: #7C4DFF;">egocentric (first-person)</strong> image containing hand(s) and provide instructions to predict future 3D hand trajectories.</p>
+  <p style="font-size: 16px;">Upload a <strong style="color: #7C4DFF;">landscape</strong>, <strong style="color: #7C4DFF;">egocentric (first-person)</strong> image containing hand(s) and provide instructions to predict future 3D hand trajectories.</p>
 
   <h3>🌟 Steps:</h3>
   <ol>
@@ -707,9 +707,9 @@ def create_gradio_interface():
   <h3>💡 Tips:</h3>
   <ul>
   <li><strong>Use Left/Right Hand</strong>: Select which hand to predict based on what's detected and what you want to predict.</li>
-  <li><strong>Instruction</strong>: Provide clear and specific imperative instructions separately for the left and right hands, and enter them in the corresponding fields. If the results are unsatisfactory, try providing more detailed instructions (e.g., color, orientation, etc.).</li>
-  <li><strong>For best inference quality, it is recommended to capture landscape view images from a camera height close to that of a human head. Highly unusual or distorted hand poses/positions may cause inference failures.</strong></li>
-  <li><strong>It is worth noting that each generation produces only a single action chunking starting from the current state. A single action chunking does not necessarily complete the entire task, and executing an entire chunking in one step may lead to reduced precision. The demonstrations provided here are intended solely for visualization purposes.</strong></li>
+  <li><strong>Instruction</strong>: Provide clear and specific imperative instructions separately for the left and right hands, and enter them in the corresponding fields. If the results are unsatisfactory, <strong style="color: #7C4DFF;">try providing more detailed instructions</strong> (e.g., color, orientation, etc.).</li>
+  <li>For best inference quality, it is recommended to <strong style="color: #7C4DFF;">capture landscape view images from a camera height close to that of a human head</strong>. Highly unusual or distorted hand poses/positions may cause inference failures.</li>
+  <li>It is worth noting that each generation produces only a single action chunking starting from the current state, which <strong style="color: #7C4DFF;">does not necessarily complete the entire task</strong>. Executing an entire chunking in one step may lead to reduced precision.</li>
   </ul>
 
   </div>
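For context, the text being edited is an HTML description string rendered by `create_gradio_interface()` in `app.py`. Below is a minimal sketch of how such a description block is typically wired into a Gradio demo; the names used (`description_html`, the specific image and textbox inputs) are illustrative assumptions, not the actual contents of `app.py` in this commit.

```python
import gradio as gr

# Minimal sketch, assuming the description is kept as one standalone HTML string.
# description_html and the input components below are illustrative assumptions,
# not the actual code of app.py in commit 5a9dd28.
description_html = """
<div style="line-height: 1.8;">
  <br>
  <p style="font-size: 16px;">Upload a <strong style="color: #7C4DFF;">landscape</strong>,
  <strong style="color: #7C4DFF;">egocentric (first-person)</strong> image containing hand(s)
  and provide instructions to predict future 3D hand trajectories.</p>
</div>
"""

def create_gradio_interface():
    with gr.Blocks() as demo:
        gr.HTML(description_html)  # renders the description block edited in this commit
        gr.Image(type="pil", label="Egocentric (first-person) image")
        gr.Textbox(label="Left-hand instruction")
        gr.Textbox(label="Right-hand instruction")
    return demo

if __name__ == "__main__":
    create_gradio_interface().launch()
```

Keeping the description in a single HTML string is what lets a copy tweak like this one stay a small diff (+4 −4 lines in `app.py`).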