---
title: Average Precision
tags:
- evaluate
- metric
description: Average precision score.
sdk: gradio
sdk_version: 3.19.1
app_file: app.py
pinned: false
---
Metric Card for Average Precision
How to Use
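import evaluate
metric = evaluate.load("chanelcolgate/average_precision")
# Illustrative inputs (assumed format): binary labels and one score per sample
references = [0, 0, 1, 1]
prediction_scores = [0.1, 0.4, 0.35, 0.8]
results = metric.compute(references=references, prediction_scores=prediction_scores)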
Inputs
- y_true (ndarray of shape (n_samples,) or (n_samples, n_classes)): True binary labels or binary label indicators.
- y_score (ndarray of shape (n_samples,) or (n_samples, n_classes)): Target scores. These can be probability estimates of the positive class, confidence values, or non-thresholded decision values (as returned by decision_function on some classifiers).
- average ({'micro', 'samples', 'weighted', 'macro'} or None, default='macro'): If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data. Ignored when y_true is binary.
  - 'micro': Calculate metrics globally by considering each element of the label indicator matrix as a label.
  - 'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
  - 'weighted': Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label).
  - 'samples': Calculate metrics for each instance, and find their average.
- pos_label (int or str, default=1): The label of the positive class. Only applied to binary y_true. For multilabel-indicator y_true, pos_label is fixed to 1.
- sample_weight (array-like of shape (n_samples,), default=None): Sample weights.
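To make the roles of these parameters concrete, here is a minimal pure-Python sketch of average precision and of the averaging modes described above. It is not this module's implementation. The function names binary_average_precision and multilabel_average_precision are hypothetical, the sketch assumes strictly positive sample weights, and it follows the step-wise definition AP = sum_n (R_n - R_{n-1}) * P_n used by scikit-learn's average_precision_score, whose parameter names the list above mirrors.

```python
from typing import Hashable, List, Optional, Sequence, Union


def binary_average_precision(
    y_true: Sequence[Hashable],
    y_score: Sequence[float],
    pos_label: Hashable = 1,
    sample_weight: Optional[Sequence[float]] = None,
) -> float:
    """AP = sum over thresholds of (R_n - R_{n-1}) * P_n, highest score first."""
    n = len(y_true)
    # Assumption of this sketch: all weights are strictly positive.
    weights = [1.0] * n if sample_weight is None else [float(w) for w in sample_weight]
    total_pos = sum(w for t, w in zip(y_true, weights) if t == pos_label)
    if total_pos == 0:
        # Sketch-only choice; real libraries may warn or return a value instead.
        raise ValueError("no positive samples in y_true")

    order = sorted(range(n), key=lambda i: y_score[i], reverse=True)
    tp = fp = prev_recall = ap = 0.0
    i = 0
    while i < n:
        threshold = y_score[order[i]]
        # Samples tied at one score are counted together as a single threshold.
        while i < n and y_score[order[i]] == threshold:
            idx = order[i]
            if y_true[idx] == pos_label:
                tp += weights[idx]
            else:
                fp += weights[idx]
            i += 1
        recall = tp / total_pos
        ap += (recall - prev_recall) * (tp / (tp + fp))
        prev_recall = recall
    return ap


def multilabel_average_precision(
    y_true: Sequence[Sequence[int]],
    y_score: Sequence[Sequence[float]],
    average: Optional[str] = "macro",
) -> Union[float, List[float]]:
    """Combine per-label AP values; rows are samples, columns are labels."""
    n_labels = len(y_true[0])
    if average == "micro":
        # Every (sample, label) cell of the indicator matrix becomes one item.
        flat_true = [t for row in y_true for t in row]
        flat_score = [s for row in y_score for s in row]
        return binary_average_precision(flat_true, flat_score)
    if average == "samples":
        # One AP per instance (row), then an unweighted mean.
        per_row = [binary_average_precision(t, s) for t, s in zip(y_true, y_score)]
        return sum(per_row) / len(per_row)

    per_label = [
        binary_average_precision([row[k] for row in y_true], [row[k] for row in y_score])
        for k in range(n_labels)
    ]
    if average is None:
        return per_label
    if average == "macro":
        return sum(per_label) / n_labels
    if average == "weighted":
        # Support = number of true instances of each label.
        support = [sum(row[k] for row in y_true) for k in range(n_labels)]
        return sum(a * s for a, s in zip(per_label, support)) / sum(support)
    raise ValueError(f"unknown average: {average!r}")


# Binary case: pos_label selects which label counts as positive.
print(round(binary_average_precision(["neg", "neg", "pos", "pos"],
                                     [0.1, 0.4, 0.35, 0.8], pos_label="pos"), 4))  # 0.8333

# Multilabel case: 4 samples, 3 labels with supports 3, 2 and 1.
y_true = [[1, 0, 0], [1, 1, 0], [0, 1, 1], [1, 0, 0]]
y_score = [[0.9, 0.2, 0.1], [0.8, 0.8, 0.4], [0.3, 0.3, 0.3], [0.6, 0.5, 0.2]]
for mode in ("macro", "weighted", "micro", "samples"):
    print(mode, round(multilabel_average_precision(y_true, y_score, average=mode), 4))
```

On this toy data the modes disagree ('macro' is about 0.78, 'weighted' about 0.86), because the rarest label is also the worst-ranked one and 'weighted' gives it the least influence.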
Output Values
Explain what this metric outputs and provide an example of what the metric output looks like. Modules should return a dictionary with one or multiple key-value pairs, e.g. {"bleu" : 6.02}
State the range of possible values that the metric's output can take, as well as what in that range is considered good. For example: "This metric can take on any value between 0 and 100, inclusive. Higher scores are better."
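As a point of reference for the value range: this card does not state the formula, so the following assumes the module uses the step-wise definition from scikit-learn's average_precision_score. Under that definition, with P_n and R_n the precision and recall at the n-th threshold (scores taken in descending order), the score lies between 0 and 1 and higher is better. The worked line uses the illustrative inputs y_true = [0, 0, 1, 1] and y_score = [0.1, 0.4, 0.35, 0.8].

```latex
\mathrm{AP} = \sum_{n} (R_n - R_{n-1})\, P_n, \qquad 0 \le \mathrm{AP} \le 1

% Thresholds in descending score order: 0.8, 0.4, 0.35, 0.1
\mathrm{AP} = (0.5 - 0)\cdot 1 + (0.5 - 0.5)\cdot \tfrac{1}{2}
            + (1 - 0.5)\cdot \tfrac{2}{3} + (1 - 1)\cdot \tfrac{1}{2}
            = \tfrac{5}{6} \approx 0.83
```

Only thresholds where recall increases contribute, so the two negative samples lower the score only through the precision at the next positive sample.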
Values from Popular Papers
Give examples, preferably with links to leaderboards or publications, of papers that have reported this metric, along with the values they have reported.
Examples
Give code examples of the metric being used. Try to include examples that clear up any potential ambiguity left from the metric description above. If possible, provide a range of examples that show both typical and atypical results, as well as examples where a variety of input parameters are passed.
Limitations and Bias
Note any known limitations or biases that the metric has, with links and references if possible.
Citation
Cite the source where this metric was introduced.
Further References
Add any useful further references.