sunga25 committed
Commit 12c5911 · verified · 1 Parent(s): 5adca0a

Update README.md

  1. README.md +100 -3
README.md CHANGED
@@ -1,3 +1,100 @@
- ---
- license: mit
- ---
+ ## Detecting Anomalies in Professional Men's Tennis Tournament Draws
+ This project investigates potential manipulation in professional men's tennis tournament draws using statistical analysis and machine learning. It applies a Variational Autoencoder (VAE) to ATP match data from 2000 to 2017 to detect anomalies that may indicate non-randomness or bias in the draw process.
+
+ ## Table of Contents
+
+ - Overview
+ - Features
+ - Data
+ - Installation
+ - Usage
+ - Results
+ - Project Structure
+ - Contributing
+ - License
+ - Acknowledgements
+
+ ## Overview
+ The integrity of tournament draws is crucial to fairness in professional tennis. This project uses machine learning, specifically a Variational Autoencoder, to analyze historical match data and flag patterns or anomalies that could suggest manipulation of the draw process.
+
+ ## Features
+
+ - Load and preprocess ATP match data from 2000 to 2017.
+ - Engineer features relevant to player rankings, age differences, and match statistics.
+ - Train a Variational Autoencoder (VAE) model to detect anomalies in match outcomes.
+ - Analyze anomalies by year, player, and tournament to assess potential biases.
+ - Save analysis results to CSV files for further review.
+
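The feature-engineering step listed above could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the column names (`winner_rank`, `loser_rank`, `winner_age`, `loser_age`) and the helper names `to_float` and `engineer_features` are assumptions about the CSV layout and are not confirmed by this README.

```python
from typing import Optional


def to_float(value) -> Optional[float]:
    """Parse a CSV cell as a float; return None for blank or malformed cells."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def engineer_features(row: dict) -> Optional[dict]:
    """Build ranking and age features for one match row.

    The column names used here are assumed, not taken from the project.
    Rows with missing rankings or ages are skipped by returning None.
    """
    w_rank = to_float(row.get("winner_rank"))
    l_rank = to_float(row.get("loser_rank"))
    w_age = to_float(row.get("winner_age"))
    l_age = to_float(row.get("loser_age"))
    if None in (w_rank, l_rank, w_age, l_age):
        return None
    return {
        # Positive when the better-ranked (lower-numbered) player won.
        "rank_diff": l_rank - w_rank,
        # Positive when the winner is the older player.
        "age_diff": w_age - l_age,
        # 1.0 when the worse-ranked player won, else 0.0.
        "upset": 1.0 if w_rank > l_rank else 0.0,
    }
```

Dropping incomplete rows is only one possible design choice; imputing missing rankings would be another.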
+ ## Data
+ The match data comes from publicly available ATP records and includes player rankings, match outcomes, tournament dates, and player statistics.
+
+ Data Files:
+
+ - atp_matches_2000.csv to atp_matches_2017.csv
+
+ Place these data files in the root directory of the project before running the scripts.
+
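As an illustration of how the yearly files named above might be read into memory, here is a small standard-library sketch. The function name `load_matches` and the added `year` field are hypothetical; the project itself may load the data differently (for example with pandas).

```python
import csv
from pathlib import Path


def load_matches(data_dir=".", first_year=2000, last_year=2017):
    """Read atp_matches_<year>.csv files into one list of row dicts.

    Missing files are reported and skipped rather than treated as fatal.
    """
    rows = []
    for year in range(first_year, last_year + 1):
        path = Path(data_dir) / f"atp_matches_{year}.csv"
        if not path.exists():
            print(f"warning: {path} not found, skipping")
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["year"] = year  # keep the season for later grouping
                rows.append(row)
    return rows
```

Calling `load_matches()` from the project root would then return every match row found for 2000 through 2017.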
+ ## Installation
+ Clone the repository and install the required dependencies.
+
+     git clone https://github.com/your-username/tennis-draw-anomalies.git
+     cd tennis-draw-anomalies
+     pip install -r requirements.txt
+
+ ## Usage
+
+ ### Prepare Data
+
+ Ensure all ATP match data CSV files are in the root directory.
+
+ ### Run the Main Script
+
+ Execute the main script to load the data, preprocess it, train the model, and analyze anomalies.
+
+     python main.py
+
+ ### Review Results
+
+ Anomaly detection results are saved as CSV files in the output directory.
+
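The README does not say how the main script turns VAE output into anomaly flags, so the following is only a sketch under stated assumptions. It shows the standard per-sample VAE loss (squared reconstruction error plus the closed-form KL divergence between a diagonal Gaussian and a standard normal prior) and one common, assumed way to flag anomalies: marking samples whose error exceeds the mean by `z` standard deviations. The function names and the `z=3.0` default are hypothetical.

```python
import math
import statistics


def vae_loss(x, x_recon, mu, log_var):
    """Per-sample VAE loss: squared reconstruction error plus KL divergence.

    The KL term is the closed form for a diagonal Gaussian posterior
    N(mu, exp(log_var)) measured against a standard normal prior.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    kl = -0.5 * sum(1 + lv - m ** 2 - math.exp(lv) for m, lv in zip(mu, log_var))
    return recon + kl


def flag_anomalies(errors, z=3.0):
    """Return indices of samples whose error exceeds mean + z * stdev.

    The threshold rule is an assumption; a fixed percentile cut-off is
    another common choice. Raises StatisticsError on an empty list.
    """
    mean = statistics.fmean(errors)
    stdev = statistics.pstdev(errors)
    threshold = mean + z * stdev
    return [i for i, err in enumerate(errors) if err > threshold]
```

In a real pipeline the `errors` list would hold one score per match, computed by a trained model, and the returned indices would select the rows written to the anomaly CSV.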
+ ## Results
+ The analysis produces the following outputs:
+
+ - Detected Anomalies: a CSV file listing all detected anomalies in matchups.
+ - Anomalies per Year: a summary of the anomalies detected in each year.
+ - Anomalies by Player and Tournament: an analysis of which players and tournaments had the most anomalies.
+
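The per-year and per-tournament summaries described above amount to grouped counts. A minimal standard-library sketch is shown below; the helper names `count_by` and `write_counts`, the row keys, and the output column name `anomalies` are illustrative assumptions rather than the project's actual schema.

```python
import csv
from collections import Counter


def count_by(anomalies, key):
    """Count anomalous matches per value of `key`, most frequent first."""
    return Counter(row[key] for row in anomalies).most_common()


def write_counts(path, header, counts):
    """Write (value, count) pairs to a two-column CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow([header, "anomalies"])
        writer.writerows(counts)
```

For example, `write_counts("anomalies_per_year.csv", "year", count_by(anomalies, "year"))` would produce a per-year summary, where `anomalies` is a list of row dicts for the flagged matches and the file name is hypothetical.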
+ ## Contributing
+ Contributions are welcome! Please fork the repository and submit a pull request with your changes. For major changes, please open an issue first to discuss what you would like to change.
+
+ ## License
+ This project is licensed under the MIT License - see the LICENSE file for details.
+
+ ## Acknowledgements
+ Thanks to the ATP Tour for making match data publicly available.
+ Special thanks to all contributors and the open-source community for their tools and resources.