---
library_name: transformers
tags:
- biology
- eyetracking
language:
- en
pipeline_tag: feature-extraction
---

# 📚 What is EyeTrack2Saccade?

A pipeline to derive saccade features from eye-tracking records. A saccade is a fast movement of the eye that abruptly changes the point of fixation.

# Usage Instructions

## 0 (Optional but recommended) Set up Virtual Environment

## 1 Install Pipeline Dependencies

The pipeline relies on a few dependencies, which are defined in the requirements.txt file of this repository. Download the file and run the following command to install them.

```python
%pip install -r requirements.txt
```

## 2 Instantiate Pipeline

```python
from transformers import pipeline

eye2sacc_pipeline = pipeline(model="hubii-world/eyetrack-to-sacc-pipeline", trust_remote_code=True)
```

## 3 Pipeline Parameters & Supported File Formats

### Overview: Parameters

The pipeline provides several parameters that adjust its preprocessing behavior:

| Parameter name | Type | Default value | Description |
|----------------|------|---------------|-------------|
| `inputs` | _str_ or _DataFrame_ | No default value | The input to be processed by the pipeline. This can either be a path to a file containing the data or the data itself |
| `time_header` | _str_ | 'Time' | The column of `inputs` that contains the recording timestamps of the data points |
| `x_headers` | _str_ or _list_ | 'X' | The column(s) of `inputs` that contain x coordinates of the eye gaze |
| `y_headers` | _str_ or _list_ | 'Y' | The column(s) of `inputs` that contain y coordinates of the eye gaze |
| `missing` | _double_ | 0.0 | Value used to mark missing data |
| `minlen` | _int_ | 5 | Minimum length of a saccade in milliseconds. All detected saccades with len(sac) < minlen are ignored |
| `maxvel` | _int_ | 40 | Velocity threshold in pixels/second |
| `maxacc` | _int_ | 30 | Acceleration threshold in pixels/second² |

The following sections explain some parameters in detail and provide illustrative examples.

### 3.1 `inputs`

The `inputs` parameter represents the data from which the pipeline derives saccades. The pipeline supports values of type _str_ and _DataFrame_. When `inputs` is a _str_, it has to be the path to a file containing the data to process. Supported file formats are .csv and .txt. Alternatively, you can provide the data directly to the pipeline as a _DataFrame_.

#### Example: Provide input as file path

```python
file_path = "./Example_data/eyetracker_freeviewing.txt"
result = eye2sacc_pipeline(
    inputs=file_path,
    time_header="Time",
    x_headers=['L POR X [px]', 'R POR X [px]'],
    y_headers=['L POR Y [px]', 'R POR Y [px]'],
)
result.head()
```

### 3.2 `time_header`, `x_headers` & `y_headers`

The parameters `time_header`, `x_headers` and `y_headers` describe the structure of the input the pipeline should process. Section 3.4 provides illustrative examples of possible input structures.

`time_header` expects a _str_ and specifies the column that contains the timestamps of the data. The default value is 'Time'. If the specified value does not match any column in `inputs`, the pipeline also checks for alternative popular column headers, such as 'time' and 'timestamp'.

`x_headers` specifies the column(s) that contain the x coordinates of the eye gaze. The default setting expects a single column named 'X'. If the specified value does not match any column in `inputs`, the pipeline also checks for alternative popular column headers, such as 'x' and 'POR X [px]'.
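
The fallback lookup described above can be pictured with a small sketch. This is not the pipeline's actual code: `resolve_header` is a hypothetical helper, and the list of alternatives only contains the examples named in this section. The sketch tries the requested name first and then compares the alternatives against the available columns without regard to case.

```python
def resolve_header(columns, requested, alternatives):
    """Hypothetical helper: pick the column to use for one header parameter."""
    # 1) An exact match for the name the user asked for always wins.
    if requested in columns:
        return requested
    # 2) Otherwise try the alternative names, ignoring upper/lower case.
    lowered = {str(col).lower(): col for col in columns}
    for candidate in alternatives:
        match = lowered.get(candidate.lower())
        if match is not None:
            return match
    raise KeyError(f"No column matching {requested!r} or {alternatives!r}")


# Illustrative column names, not taken from a real recording.
columns = ["timestamp", "POR X [px]", "POR Y [px]"]
time_col = resolve_header(columns, "Time", ["time", "timestamp"])
x_col = resolve_header(columns, "X", ["x", "POR X [px]"])
print(time_col, "|", x_col)  # timestamp | POR X [px]
```

Checking your own column names this way before calling the pipeline can help you decide whether the header parameters need to be set explicitly.
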
Instead of a single column, the data may provide x coordinates for each eye separately. In this case, `x_headers` can be set to a list of column names. Make sure that the first list entry corresponds to the left eye and the second list entry corresponds to the right eye. If the specified list does not match any columns in `inputs`, the pipeline also checks for alternative popular column headers, such as 'L POR X [px]' and 'R POR X [px]' respectively.

`y_headers` specifies the column(s) that contain the y coordinates of the eye gaze and works like `x_headers`. The default value expects a single column named 'Y'. If no matching column is found, the pipeline also checks for 'y' and 'POR Y [px]'. As with `x_headers`, `y_headers` can be set to a list of two column names if the coordinates in `inputs` are given for each eye separately. In this case, the pipeline also checks for 'L POR Y [px]' and 'R POR Y [px]' respectively if the specified names cannot be found.

Whenever the pipeline automatically checks alternative column names, the comparison is case-insensitive (e.g. the pipeline checks for 'POR X [px]' and 'por x [px]').

### 3.3 Supported file formats

As mentioned in Section 3.1, the pipeline can process two file formats when given a file path: .csv and .txt.

When using a .csv file, the pipeline supports two column separators: ',' and ';'. The pipeline recognizes the column separator in the .csv file automatically.

When using a .txt file, the pipeline only supports the column separator '\t'. Make sure your data file meets this requirement before providing it to the pipeline.

### 3.4 Required data features

| Required Features (#2 or #3)    | Description                           |
|:--------------------------------|:--------------------------------------|
| * Time                          | Timestamp of the eye-tracking record. |
| * X and Y eye coordinates       | Coordinates of the eye gaze.          |
| **or**                          |                                       |
| * Left X and Y eye coordinates  | Coordinates of the left eye gaze.     |
| * Right X and Y eye coordinates | Coordinates of the right eye gaze.    |

#### Example: Input with separate coordinates per eye

```python
        Time  Left_X  Left_Y  Right_X  Right_Y
0  554262330   92.60  648.54    79.27   648.36
```

💡 "At timestamp `554262330` the left eye is at coordinates `(92.60, 648.54)` (in px) and the right eye is at coordinates `(79.27, 648.36)` (in px)."

To process this structure, the parameters would be specified as follows:

```python
result = eye2sacc_pipeline(
    inputs,
    time_header="Time",
    x_headers=['Left_X', 'Right_X'],
    y_headers=['Left_Y', 'Right_Y'],
)
```

Strictly speaking, `time_header` does not have to be specified in this case, as 'Time' is the default value of this parameter.

**Important**: The pipeline only checks for alternative column headers for separate gaze coordinates per eye if you specified `x_headers` and `y_headers` as lists containing two entries. Otherwise, the pipeline assumes that one column provides combined information for both eyes and checks for an alternative single column in the data.

#### Example: Input with joint x and y coordinates

```python
        Time      X       Y
0  554262330  92.60  648.54
```

💡 "At timestamp `554262330` the eyes are at coordinates `(92.60, 648.54)` (in px)."

Similar to the example above, the parameters would be specified as follows:

```python
result = eye2sacc_pipeline(inputs, time_header="Time", x_headers='X', y_headers='Y')
```

In this case, none of the parameters have to be specified, as the data structure matches their default values. A shorter call could look like this:

```python
result = eye2sacc_pipeline(inputs)
```

## 4 Output

The pipeline provides a _DataFrame_ as output containing detected saccades.
To be specific, the output contains the following information:

| Features (#7) | Description                              |
|:--------------|:-----------------------------------------|
| * Start Time  | Timestamp of the start of the saccade.   |
| * End Time    | Timestamp of the end of the saccade.     |
| * Duration    | Duration of the saccade.                 |
| * Start X     | X coordinate of the first saccade point. |
| * Start Y     | Y coordinate of the first saccade point. |
| * End X       | X coordinate of the last saccade point.  |
| * End Y       | Y coordinate of the last saccade point.  |

#### Example: Pipeline output

```python
   starttime    endtime  duration      startx      starty        endx        endy
0  560999822  561123849    124027  519.355000  635.295000  717.440000  892.385000
```

💡 "The saccade starts at timestamp `560999822` and ends at timestamp `561123849`. The duration of the saccade is `124027` nanoseconds. The saccade starts at coordinates `(519.355000, 635.295000)` (in px) and ends at coordinates `(717.440000, 892.385000)` (in px)."
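
As a possible next step, the output columns can be combined into further features. The sketch below is not part of the pipeline: `saccade_amplitude` is a hypothetical helper that computes the straight-line distance between the start and end point of a saccade. It assumes the column names shown in the example output (`startx`, `starty`, `endx`, `endy`) and uses the example row as a plain dict so that it runs without any extra dependencies.

```python
import math


def saccade_amplitude(row):
    """Hypothetical helper: start-to-end distance of one saccade in px."""
    return math.hypot(row["endx"] - row["startx"], row["endy"] - row["starty"])


# The example output row from above, written as a plain dict.
example_saccade = {
    "starttime": 560999822,
    "endtime": 561123849,
    "duration": 124027,
    "startx": 519.355,
    "starty": 635.295,
    "endx": 717.44,
    "endy": 892.385,
}

amplitude_px = saccade_amplitude(example_saccade)
print(f"{amplitude_px:.2f} px")  # 324.55 px
```

Because the helper only indexes the row by column name, it should also work on the rows of the output _DataFrame_, for example via `result.apply(saccade_amplitude, axis=1)`.
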