Evaluate documentation
Loading methods
Methods for listing and loading evaluation modules:
List
evaluate.list_evaluation_modules
( module_type = None, include_community = True, with_details = False )
Parameters
- module_type (str, optional, defaults to None) — Type of evaluation modules to list. Has to be one of 'metric', 'comparison', or 'measurement'. If None, all types are listed.
- include_community (bool, optional, defaults to True) — Include community modules in the list.
- with_details (bool, optional, defaults to False) — Return the full details on the metrics instead of only the ID.
List all evaluation modules available on the Hugging Face Hub.
Load
evaluate.load
( path: str, config_name: typing.Optional[str] = None, module_type: typing.Optional[str] = None, process_id: int = 0, num_process: int = 1, cache_dir: typing.Optional[str] = None, experiment_id: typing.Optional[str] = None, keep_in_memory: bool = False, download_config: typing.Optional[datasets.download.download_config.DownloadConfig] = None, download_mode: typing.Optional[datasets.download.download_manager.DownloadMode] = None, revision: typing.Union[str, datasets.utils.version.Version, NoneType] = None, **init_kwargs )
Parameters
- path (str) — Path to the evaluation processing script with the evaluation builder. Can be either:
  - a local path to the processing script or the directory containing the script (if the script has the same name as the directory), e.g. './metrics/rouge' or './metrics/rouge/rouge.py'
  - an evaluation module identifier on the HuggingFace evaluate repo, e.g. 'rouge' or 'bleu', located in 'metrics/', 'comparisons/', or 'measurements/' depending on the provided module_type
- config_name (str, optional) — Selects a configuration for the metric (e.g. the GLUE metric has a configuration for each subset).
- module_type (str, defaults to 'metric') — Type of evaluation module; can be one of 'metric', 'comparison', or 'measurement'.
- process_id (int, optional) — For distributed evaluation: id of the process.
- num_process (int, optional) — For distributed evaluation: total number of processes.
- cache_dir (str, optional) — Path to store the temporary predictions and references (defaults to ~/.cache/huggingface/evaluate/).
- experiment_id (str) — A specific experiment id. Used if several distributed evaluations share the same file system. This is useful to compute metrics in distributed setups (in particular non-additive metrics like F1).
- keep_in_memory (bool) — Whether to store the temporary results in memory (defaults to False).
- download_config (~evaluate.DownloadConfig, optional) — Specific download configuration parameters.
- download_mode (DownloadMode, defaults to REUSE_DATASET_IF_EXISTS) — Download/generate mode.
- revision (Union[str, evaluate.Version], optional) — If specified, the module is loaded from the datasets repository at this version. By default it is set to the local version of the library. Specifying a version different from your local version of the library might cause compatibility issues.
Load an EvaluationModule.