About the output of the sample code

by Tomie0506 - opened 3 days ago

The sample code outputs reward = [-6.5]. What does this value signify numerically, and what is the range of outputs for this model?
