---
library_name: transformers
tags: 
- biology
- electrocardiogram
language:
- en
pipeline_tag: feature-extraction
---
# 📚 What is RPeaks2HRV?
A pipeline that derives heart rate variability (HRV) features from R-peaks extracted from an electrocardiogram (ECG) signal.

*Note*: Need to process raw ECG signals? Consider using [ECG2HRV](https://huggingface.co/hubii-world/ecg-to-hrv-pipeline) instead!

# Usage Instructions

## 0 (Optional but recommended) Set up Virtual Environment
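
One possible way to do this from Python is the standard-library `venv` module. The following is a minimal sketch, not a required setup: the helper name `create_env` and the directory name `.venv` are illustrative choices and not part of the pipeline.

```python
import venv
from pathlib import Path


def create_env(env_dir: str = ".venv", with_pip: bool = True) -> Path:
    """Create a virtual environment in env_dir and return its absolute path."""
    # venv.create builds the environment; with_pip=True bootstraps pip into it
    venv.create(env_dir, with_pip=with_pip)
    return Path(env_dir).resolve()
```

Calling `create_env()` creates `.venv` in the current directory. Activate it with your shell's usual activation script before installing the dependencies in the next step.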

## 1 Install Pipeline Dependencies
To use the pipeline, you need to install the dependencies it relies on. Run the following command to install the dependencies defined in `requirements_rpeak2hrv_pipeline.txt`. You can get the file from this repository. The `%pip` magic works in a Jupyter/IPython session; in a plain shell, drop the leading `%`.
```python
%pip install -r requirements_rpeak2hrv_pipeline.txt
```

## 2 Instantiate Pipeline
```python
from transformers import pipeline

rpeak2hrv_pipeline = pipeline(model = "hubii-world/rpeaks-to-hrv-pipeline", trust_remote_code=True)
```

## 3 Pipeline Parameters & Supported File Formats

### Overview: Parameters
The pipeline provides a variety of different parameters that can be set to adjust the preprocessing behavior. The following sections explain the individual parameters in detail and provide illustrative examples.


#### Mandatory Parameters
In general, the pipeline relies on the following mandatory parameters, which the user should set for every pipeline execution:

| Parameter name | Type | Default value | Description |
|----------------|------|---------------|-------------|
|    `inputs`     | _str_ or _DataFrame_  | No default value | The input that should be processed by the pipeline. This can either be a path to a file containing the data to process or the data itself. |
| `feature_domains`| _list[str]_ | ['time', 'freq', 'non_lin'] | The domains the pipeline should calculate features for. |
| `sampling_rate` | _int_  | 1000          | The sampling rate of the continuous cardiac signal in which peaks occur |


#### Optional Parameters
Besides the mandatory parameters, the pipeline offers multiple optional parameters that may be necessary to set in order to compute correct HRV-features:
| Parameter name | Type | Default value | Description |
|----------------|------|---------------|-------------|
| `time_header`  | _str_| 'SystemTime'  | The name of the data column that contains the timestamp at which the values in the same row were recorded |
| `rri_header`   | _str_| 'interbeat_interval' | The name of the data column that contains the RR-Intervals in milliseconds |
| `windowing_method` | _str_| None |  The method that should be applied to divide the raw data into windows. Default setting is None, so no windowing is applied |
| `window_size`  | _str_| '60s' | The size of a window in terms of a time frame. Only relevant if windowing should be applied to the data |


### 3.1 `inputs`

The `inputs` parameter represents the data from which the pipeline derives HRV-Features. The pipeline supports values of type _str_ and _DataFrame_ as input.

When providing `inputs` as a string, it has to be the path to a file containing the data to process. Supported file formats are .csv and .txt.

Alternatively, you can provide the data directly to the pipeline in the form of a _DataFrame_.

#### Example: Provide input as file path
```python
file_path = "./Example_data/RRIntervalExample.csv"
result = rpeak2hrv_pipeline(inputs=file_path, sampling_rate=1000)
result.head()
```

### 3.2 `feature_domains`
The `feature_domains` parameter controls which feature domains the pipeline calculates. The domains are provided to the pipeline as a list of keys. Supported keys are:
| Key | Description |
|-----|-------------|
| 'time' | pipeline calculates time-domain HRV metrics |
| 'freq' | pipeline calculates frequency-domain HRV metrics |
| 'non_lin' | pipeline calculates non-linear HRV indices |

For additional information regarding the calculated features, consult the [NeuroKit2 documentation](https://neuropsychology.github.io/NeuroKit/functions/hrv.html#).
By default, the pipeline calculates features for all three domains.

#### Example: Feature domains
In the following code, the pipeline only calculates time-domain and non-linear HRV indices for the provided data.
```python
file_path = "./Example_data/RRIntervalExample.csv"
result = rpeak2hrv_pipeline(inputs=file_path, feature_domains=['time', 'non_lin'], sampling_rate=1000)
result.head()
```

### 3.3 `sampling_rate`
The `sampling_rate` (in Hz) is the rate at which the sensor sampled data from the patient. It has to be provided as an integer. The example above shows a configuration where `sampling_rate` is set to 1000.

The default rate is 1000 Hz, meaning that the sensor sampled 1000 values per second.
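
To illustrate why the correct rate matters, the sketch below converts R-peak positions (given as sample indices) into RR-intervals in milliseconds. It is plain arithmetic, not the pipeline's internal code; the helper name `peaks_to_rri_ms` and the peak positions are made up for the example.

```python
def peaks_to_rri_ms(peak_samples, sampling_rate=1000):
    """Convert R-peak sample indices to RR-intervals in milliseconds."""
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be positive")
    # Distance between consecutive peaks in samples, scaled to milliseconds
    return [
        (later - earlier) / sampling_rate * 1000.0
        for earlier, later in zip(peak_samples, peak_samples[1:])
    ]


# Hypothetical peak positions: the same indices give different intervals
# depending on the sampling rate that is assumed
peaks = [0, 250, 500]
rri_at_250_hz = peaks_to_rri_ms(peaks, sampling_rate=250)    # [1000.0, 1000.0]
rri_at_1000_hz = peaks_to_rri_ms(peaks, sampling_rate=1000)  # [250.0, 250.0]
```

With a wrong `sampling_rate`, every interval is scaled by the same factor, so all HRV-Features derived from the intervals are distorted.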

### 3.4 `time_header` & `rri_header`
`time_header` and `rri_header` are important settings to define the structure of the data the pipeline has to process. In general, the pipeline supports two possible data formats:
- R Peak Flags
- RR-Intervals with timestamps

#### 3.4.1 R Peak Flags
The first format option is defined by a _DataFrame_ with one column named `'ECG_R_Peaks'`. The column values are simple binary flags indicating whether an R peak occurred or not.

This is the standard data format used by neurokit2 to represent R peaks. If you use this data format, you do not need to specify `time_header` and `rri_header`.

__Important__: Make sure that the column has the correct name and that you specify the correct `sampling_rate`, as this is indispensable information to compute the correct HRV-Features.

##### Example: R Peak Flags
The following code loads an example _DataFrame_ containing R Peak Flags.
```python
import pandas as pd
df = pd.read_csv("./Example_data/RPeaksDataExample.csv")
df.head()
```
You can process this data without setting `time_header` and `rri_header`.
```python
result = rpeak2hrv_pipeline(inputs=df, sampling_rate=1000)
result.head()
```
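
If your peak detector returns sample indices instead of flags, a flag column of the kind described above can be built by hand. This is a minimal sketch; the helper name `peaks_to_flags` and the example indices are illustrative and not part of the pipeline.

```python
def peaks_to_flags(peak_samples, n_samples):
    """Return a list of 0/1 flags with a 1 at every R-peak sample index."""
    flags = [0] * n_samples
    for idx in peak_samples:
        if not 0 <= idx < n_samples:
            raise IndexError(f"peak index {idx} outside signal of length {n_samples}")
        flags[idx] = 1
    return flags


# Hypothetical example: three peaks in a signal of ten samples
flags = peaks_to_flags([1, 4, 8], n_samples=10)
```

Wrapping the result as `pd.DataFrame({"ECG_R_Peaks": flags})` yields a _DataFrame_ with the column name the pipeline expects for this format.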

#### 3.4.2 RR-Intervals with timestamps
The second format option is defined by a _DataFrame_ with two columns: the RR-intervals in milliseconds and the timestamps at which the sensor recorded them. Here, `time_header` specifies the name of the column containing the timestamps and `rri_header` specifies the name of the column containing the RR-intervals.
The default column names are `'SystemTime'` and `'interbeat_intervals'`.

##### Example: RR-Intervals with timestamps
The following code loads an example _DataFrame_ containing RR-intervals and their timestamps.
```python
import pandas as pd
df = pd.read_csv("./Example_data/RRIntervalExample.csv")
df.head()
```
Because the column names in this example match the default values of `time_header` and `rri_header`, you do not need to specify them to process the data.
```python
result = rpeak2hrv_pipeline(inputs=df, sampling_rate=1000)
result.head()
```
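
If your columns are named differently, pass the names explicitly. The wrapper below is a small sketch of such a call; `run_with_custom_headers` is a hypothetical helper, and it only forwards the parameters documented in this section.

```python
def run_with_custom_headers(hrv_pipeline, data, time_col, rri_col, sampling_rate=1000):
    """Call the pipeline on data whose column names differ from the defaults."""
    return hrv_pipeline(
        inputs=data,
        time_header=time_col,  # name of the column holding the timestamps
        rri_header=rri_col,    # name of the column holding the RR-intervals in ms
        sampling_rate=sampling_rate,
    )
```

For example, `run_with_custom_headers(rpeak2hrv_pipeline, df, "timestamp", "rr_ms")` would process a _DataFrame_ whose columns are called `timestamp` and `rr_ms` (both names are made up for illustration).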

### 3.5 `windowing_method`
The `windowing_method` defines the method to be used to divide the raw data into windows. The supported settings are:
| Parameter value | Description |
|-----------------|-------------|
|'rolling'        | Creates a window rolling over the data. For more information see [pandas.DataFrame.rolling()](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rolling.html) |
|'first_interval' | Keeps the data values that are recorded within the __first__ timeframe defined by _window_size_ and omits the rest |
|'last_interval'  | Keeps the data values that are recorded within the __last__ timeframe defined by _window_size_ and omits the rest |


#### Example: 'first_interval'-windowing
The following code snippet shows an example of 'first_interval' windowing. In this example, only the values recorded within the first 5 minutes of the data collection are used to compute HRV-Features.
```python
file_path = "./Example_data/RRIntervalExample.csv"
result = rpeak2hrv_pipeline(inputs=file_path, windowing_method="first_interval", window_size="5m", sampling_rate=1000)
result.head()
```
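
The selection performed by 'first_interval' and 'last_interval' can be pictured with the following sketch. It is an illustration of the described behavior, not the pipeline's implementation: the helper `keep_interval`, the tuple layout, and the inclusive handling of the window boundary are all assumptions.

```python
from datetime import datetime, timedelta


def keep_interval(rows, window, which="first"):
    """Keep rows whose timestamp lies in the first or last `window` of a recording.

    rows is a list of (timestamp, rri_ms) tuples sorted by timestamp.
    """
    if not rows:
        return []
    if which == "first":
        cutoff = rows[0][0] + window
        return [row for row in rows if row[0] <= cutoff]
    if which == "last":
        cutoff = rows[-1][0] - window
        return [row for row in rows if row[0] >= cutoff]
    raise ValueError("which must be 'first' or 'last'")


# Hypothetical recording: one value every 30 seconds over two minutes
start = datetime(2024, 1, 1, 12, 0, 0)
rows = [(start + timedelta(seconds=30 * i), 800 + i) for i in range(5)]
first_minute = keep_interval(rows, timedelta(minutes=1), which="first")
last_minute = keep_interval(rows, timedelta(minutes=1), which="last")
```

Both calls keep three of the five rows here; everything outside the selected time frame is dropped before any features are computed.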

### 3.6 `window_size`
The `window_size` defines the size of the windows the data should be divided into. In general, the definition follows this pattern: '{any positive integer}{t}', where t is an element of {'d', 'h', 'm', 's'}.

For example: the setting '20m' represents a window size of 20 minutes.

The default setting is '60s' corresponding to a window size of a minute.

Setting this parameter is only necessary if you want to apply windowing.
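
The pattern can be checked before calling the pipeline. The following sketch parses a window size string into a `timedelta`; it mirrors the format described above but is not the parser the pipeline uses, and the name `parse_window_size` is hypothetical.

```python
import re
from datetime import timedelta

# Maps the unit letters of the pattern to timedelta keyword arguments
_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def parse_window_size(window_size: str) -> timedelta:
    """Parse strings such as '60s' or '20m' into a timedelta."""
    match = re.fullmatch(r"(\d+)([dhms])", window_size)
    if match is None:
        raise ValueError(f"invalid window size: {window_size!r}")
    value, unit = int(match.group(1)), match.group(2)
    if value <= 0:
        raise ValueError("window size must be a positive integer")
    return timedelta(**{_UNITS[unit]: value})
```

For instance, `parse_window_size("20m")` returns a `timedelta` of 20 minutes, while `"5x"` or `"0s"` raise a `ValueError`.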

#### Example: Window size
In the following code, a rolling window of 5 minutes is applied to the data. For each window, the pipeline then calculates the HRV-Features and creates a new row in the result _DataFrame_. The pipeline returns a _DataFrame_ in which each row represents a specific window.
For each window, the corresponding starting and ending timestamps are included in the result.
```python
file_path = "./Example_data/RRIntervalExample.csv"
result = rpeak2hrv_pipeline(inputs=file_path, windowing_method="rolling", window_size="5m", sampling_rate=1000)
result.head()
```

### 3.7 Supported file formats
As mentioned in Section 3.1, the pipeline can process two file formats when given a file path: .csv and .txt.
When using a .csv file, the pipeline supports two column separators: ',' and ';'.

The pipeline recognizes the column separator in the .csv file automatically.

When using a .txt file, the pipeline only supports the column separator '\t'. Make sure your data file matches this requirement before providing it to the pipeline.
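
To check a file's separator before passing it to the pipeline, you can inspect it with Python's standard library. The sketch below uses `csv.Sniffer` as a stand-in; how the pipeline itself detects the separator is not documented here, and `detect_separator` is a hypothetical helper.

```python
import csv


def detect_separator(file_path, candidates=",;\t"):
    """Guess the column separator from the beginning of a text file."""
    with open(file_path, newline="", encoding="utf-8") as handle:
        sample = handle.read(4096)
    # csv.Sniffer raises csv.Error if it cannot decide on a delimiter
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
```

A .csv file should yield `','` or `';'` and a .txt file should yield `'\t'`; any other outcome suggests the file needs to be converted first.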

#### Example: Provide .csv file to pipeline
The following example provides a .csv file to the pipeline and lets it calculate the HRV-Features on the first 10 minutes of the data.
```python
file_path = "./Example_data/RRIntervalExample.csv"
result = rpeak2hrv_pipeline(inputs=file_path, windowing_method="first_interval", window_size="10m", sampling_rate=1000)
result.head()
```

#### Example: Provide .txt file to pipeline
The same can be done using a .txt file.
```python
file_path = "./Example_data/RRIntervalExample.txt"
result = rpeak2hrv_pipeline(inputs=file_path, windowing_method="first_interval", window_size="10m", sampling_rate=1000)
result.head()
```