📚 What is EyeTrack2Saccade?

A pipeline that derives saccade features from eye-tracking records. A saccade is a fast movement of the eye that abruptly changes the point of fixation.

Usage Instructions

0 (Optional but recommended) Set up Virtual Environment

1 Install Pipeline Dependencies

The pipeline relies on several dependencies. Run the following command to install the dependencies listed in requirements.txt, which is available in this repository.

%pip install -r requirements.txt

2 Instantiate Pipeline

from transformers import pipeline
eye2sacc_pipeline = pipeline(model = "hubii-world/eyetrack-to-sacc-pipeline", trust_remote_code=True)

3 Pipeline Parameters & Supported File Formats

Overview: Parameters

The pipeline provides several parameters that can be set to adjust the preprocessing behavior:

| Parameter name | Type | Default value | Description |
|---|---|---|---|
| `inputs` | str or DataFrame | No default value | The input to be processed by the pipeline. This can be either a path to a file containing the data or the data itself. |
| `time_header` | str | 'Time' | The column of `inputs` that contains the recording timestamps of the data points. |
| `x_headers` | str or list | 'X' | The column(s) of `inputs` that contain the x coordinates of the eye gaze. |
| `y_headers` | str or list | 'Y' | The column(s) of `inputs` that contain the y coordinates of the eye gaze. |
| `missing` | float | 0.0 | Value used to mark missing data. |
| `minlen` | int | 5 | Minimum length of a saccade in milliseconds. All detected saccades with len(sac) < minlen are ignored. |
| `maxvel` | int | 40 | Velocity threshold in pixels/second. |
| `maxacc` | int | 30 | Acceleration threshold in pixels/second**2. |

The following sections explain some parameters in detail and provide illustrative examples.
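To make the roles of `minlen`, `maxvel` and `maxacc` more concrete, here is a simplified sketch of a velocity/acceleration-threshold saccade detector. It is not the pipeline's actual implementation: the function name `detect_saccades_sketch`, the way velocity and acceleration are computed, and the assumption that all thresholds use the same units as the time column (no unit conversion) are illustrative assumptions. The sample data is synthetic.

```python
import numpy as np


def detect_saccades_sketch(x, y, time, minlen=5, maxvel=40, maxacc=30):
    """Simplified threshold-based saccade detector (illustration only)."""
    x, y, time = (np.asarray(a, dtype=float) for a in (x, y, time))

    dt = np.diff(time)                       # time between consecutive samples
    dist = np.hypot(np.diff(x), np.diff(y))  # distance between consecutive samples
    vel = dist / dt                          # speed in each sample interval
    acc = np.zeros_like(vel)
    acc[1:] = np.diff(vel) / dt[1:]          # change of speed between intervals

    # An interval counts as "fast" if either threshold is exceeded.
    fast = (vel > maxvel) | (np.abs(acc) > maxacc)

    # Collect runs of consecutive fast intervals as (start, end) sample indices.
    runs, start = [], None
    for i, is_fast in enumerate(fast):
        if is_fast and start is None:
            start = i
        elif not is_fast and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(fast)))

    # Keep only runs that last at least `minlen` time units.
    saccades = []
    for s, e in runs:
        duration = float(time[e] - time[s])
        if duration >= minlen:
            saccades.append({
                "starttime": float(time[s]), "endtime": float(time[e]),
                "duration": duration,
                "startx": float(x[s]), "starty": float(y[s]),
                "endx": float(x[e]), "endy": float(y[e]),
            })
    return saccades


# Synthetic gaze trace: still, then a quick horizontal jump, then still again.
time = [0, 2, 4, 6, 8, 10, 12, 14]
x = [0, 0, 0, 100, 200, 300, 300, 300]
y = [0, 0, 0, 0, 0, 0, 0, 0]
found = detect_saccades_sketch(x, y, time)
```

In this sketch, raising `maxvel` or `maxacc` makes the detector less sensitive, while raising `minlen` discards short movements.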

3.1 `inputs`

The inputs parameter holds the data from which the pipeline extracts saccades. The pipeline accepts values of type str and DataFrame.

When inputs is a str, it must be the path to a file containing the data to process. Supported file formats are .csv and .txt.

Alternatively, you can also provide the data directly to the pipeline in form of a DataFrame.

Example: Provide input as file path

file_path = "./Example_data/eyetracker_freeviewing.txt"
result = eye2sacc_pipeline(inputs=file_path, time_header="Time", x_headers=['L POR X [px]', 'R POR X [px]'], y_headers=['L POR Y [px]', 'R POR Y [px]'])
result.head()
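Example: Provide input as DataFrame

The DataFrame variant can be sketched in the same way. The sample values below are made up for illustration (only the first row mirrors the example in Section 3.4), and the pipeline call is left as a comment because it needs the `eye2sacc_pipeline` object created in step 2.

```python
import pandas as pd

# Synthetic gaze samples using the default column names 'Time', 'X' and 'Y'.
gaze = pd.DataFrame({
    "Time": [554262330, 554282330, 554302330],
    "X": [92.60, 93.10, 250.40],
    "Y": [648.54, 648.90, 500.20],
})

# With the pipeline from step 2 available, the DataFrame is passed directly:
# result = eye2sacc_pipeline(inputs=gaze)
```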

3.2 `time_header`, `x_headers` & `y_headers`

The parameters time_header, x_headers and y_headers specify the structure of the input the pipeline should process. Section 3.4 provides illustrative examples regarding possible input structures.

time_header expects a str and specifies the column that contains the timestamps of the data. Its default value is 'Time'. If the specified name does not match any column in inputs, the pipeline also checks for common alternative column headers, such as 'time' and 'timestamp'.

x_headers specifies the column(s) that contain the x coordinates of the eye gaze. By default, the pipeline expects a single column named 'X'. If the specified name does not match any column in inputs, the pipeline also checks for common alternative column headers, such as 'x' and 'POR X [px]'. Instead of a single column, the data may provide x coordinates for each eye separately. In this case, x_headers can be set to a list of two column names. Make sure that the first list entry corresponds to the left eye and the second to the right eye. If the specified list does not match any columns in inputs, the pipeline also checks for common alternative column headers, such as 'L POR X [px]' and 'R POR X [px]', respectively.

y_headers specifies the column(s) that contain the y coordinates of the eye gaze and works the same way as x_headers. By default, the pipeline expects a single column named 'Y'; if no matching column is found, it also checks for 'y' and 'POR Y [px]'. As with x_headers, y_headers can be set to a list of two column names if the coordinates in inputs are given for each eye separately. In this case, the pipeline also checks for 'L POR Y [px]' and 'R POR Y [px]', respectively, if the specified names cannot be found.

Whenever the pipeline automatically checks alternative column names, the comparison is case-insensitive (e.g. the pipeline checks for both 'POR X [px]' and 'por x [px]').
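The fallback lookup described above can be pictured roughly as follows. This is a minimal sketch, not the pipeline's code: the helper name `resolve_header` and the exact matching order (exact match first, then case-insensitive alternatives) are assumptions made for illustration.

```python
def resolve_header(columns, requested, alternatives):
    """Return the column to use for `requested`, or raise KeyError (hypothetical helper)."""
    # 1) Prefer the name the caller asked for.
    if requested in columns:
        return requested
    # 2) Otherwise try the alternative names, ignoring upper/lower case.
    lowered = {str(col).lower(): col for col in columns}
    for candidate in alternatives:
        match = lowered.get(candidate.lower())
        if match is not None:
            return match
    raise KeyError(f"No column found for {requested!r}")


columns = ["TIMESTAMP", "por x [px]", "por y [px]"]
time_col = resolve_header(columns, "Time", ["time", "timestamp"])
x_col = resolve_header(columns, "X", ["x", "POR X [px]"])
```

Here `time_col` resolves to the existing 'TIMESTAMP' column and `x_col` to 'por x [px]', even though neither matches the default names exactly.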

3.3 Supported file formats

As mentioned in Section 3.1, the pipeline can process two file formats when given a file path: .csv and .txt. For .csv files, the pipeline supports two column separators: ',' and ';'.

The pipeline recognizes the column separator in the .csv file automatically.

When using a .txt file, the pipeline only supports the column separator '\t'. Make sure your data file matches this requirement before providing it to the pipeline.
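To check a file against these rules before handing it to the pipeline, a small helper along the following lines may be useful. It is only a sketch: the function name `guess_separator` and the approach of counting separators in the first line are assumptions, not the pipeline's own detection logic.

```python
from pathlib import Path


def guess_separator(path):
    """Guess the column separator of a .csv or .txt file (hypothetical helper)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".txt":
        # .txt files are expected to be tab-separated.
        return "\t"
    if suffix == ".csv":
        # Pick whichever of ';' and ',' occurs more often in the header line.
        with path.open(encoding="utf-8") as handle:
            header = handle.readline()
        return ";" if header.count(";") > header.count(",") else ","
    raise ValueError(f"Unsupported file format: {suffix!r}")
```

The returned separator could then be passed to, for example, `pandas.read_csv(path, sep=...)` to inspect the column headers before running the pipeline.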

3.4 Required data features

| Required Features (#2 or #3) | Description |
|---|---|
| * Time | Timestamp of the eye-tracking record. |
| * X and Y eye coordinates | Coordinates of the eye gaze. |
| or | |
| * Left X and Y eye coordinates | Coordinates of the left eye gaze. |
| * Right X and Y eye coordinates | Coordinates of the right eye gaze. |

Example: Input with separate coordinates per eye

    Time       Left_X  Left_Y  Right_X  Right_Y  
0   554262330  92.60   648.54  79.27    648.36

💡 "At timestamp 554262330 the left eye is at coordinates (92.60, 648.54) (in px) and the right eye is at coordinates (79.27, 648.36) (in px)."

To process this structure, the parameters would be specified as follows:

result = eye2sacc_pipeline(inputs, time_header="Time", x_headers=['Left_X', 'Right_X'], y_headers=['Left_Y', 'Right_Y'])

Strictly speaking, time_header does not have to be specified in this case, as 'Time' is the default value of this parameter.

Important: The pipeline only checks for alternative per-eye column headers if you specified x_headers and y_headers as lists containing two entries. Otherwise, the pipeline assumes that a single column provides combined information for both eyes and checks for an alternative single column in the data.

Example: Input with joint x and y coordinates

    Time       X      Y
0   554262330  92.60  648.54

💡 "At timestamp 554262330 the eyes are at coordinates (92.60, 648.54) (in px)."

As in the example above, the parameters would be specified as follows:

result = eye2sacc_pipeline(inputs, time_header="Time", x_headers='X', y_headers='Y')

Strictly speaking, none of the parameters have to be specified in this case, as the data structure matches their default values. A shorter call could look like this:

result = eye2sacc_pipeline(inputs)

4 Output

The pipeline returns a DataFrame containing the detected saccades. Specifically, the output contains the following information:

| Features (#7) | Description |
|---|---|
| * Start Time | Timestamp of the start of the saccade. |
| * End Time | Timestamp of the end of the saccade. |
| * Duration | Duration of the saccade. |
| * Start X | X coordinate of the first saccade point. |
| * Start Y | Y coordinate of the first saccade point. |
| * End X | X coordinate of the last saccade point. |
| * End Y | Y coordinate of the last saccade point. |

Example: Pipeline output

    starttime  endtime    duration  startx      starty      endx        endy
0   560999822  561123849  124027    519.355000  635.295000  717.440000  892.385000

💡 "The saccade starts at timestamp 560999822 and ends at timestamp 561123849. The duration of the saccade is 124027 nanoseconds. The saccade starts at coordinates (519.355000, 635.295000) (in px) and ends at coordinates (717.440000, 892.385000) (in px)."
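Because the output is a regular DataFrame, further features can be derived with ordinary pandas operations. The sketch below rebuilds the example row shown above by hand (so it runs without the pipeline) and adds a saccade amplitude column; the column name `amplitude_px` is a hypothetical addition and not part of the pipeline output.

```python
import numpy as np
import pandas as pd

# The example output row from above, recreated manually for illustration.
saccades = pd.DataFrame({
    "starttime": [560999822],
    "endtime": [561123849],
    "duration": [124027],
    "startx": [519.355],
    "starty": [635.295],
    "endx": [717.440],
    "endy": [892.385],
})

# Straight-line distance between the first and last saccade point, in px.
saccades["amplitude_px"] = np.hypot(
    saccades["endx"] - saccades["startx"],
    saccades["endy"] - saccades["starty"],
)

# Sanity check: duration should equal endtime - starttime for every row.
durations_consistent = bool(
    (saccades["endtime"] - saccades["starttime"] == saccades["duration"]).all()
)
```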