Kévin Yauy
committed on
Commit
·
9d42d90
1
Parent(s):
59e5ae2
feat(app): first commit PhenoGenius web app standalone repository
- README.md +75 -0
- data/img/logo-chuga.png +0 -0
- data/img/logo-seqone.png +0 -0
- data/img/logo-uga.png +0 -0
- data/img/logoMIAI-rvb.png +0 -0
- data/img/phenogenius.png +0 -0
- data/resources/Homo_sapiens.gene_info.gz +3 -0
- data/resources/hpo_obo_2024.json +3 -0
- data/resources/main_topics_hpo_390_42_filtered_norm_004_2024.tsv +0 -0
- data/resources/ohe_all_thesaurus_weighted_2024.tsv.gz +3 -0
- data/resources/pheno_NMF_390_matrix_42_2024.pkl +3 -0
- data/resources/pheno_NMF_390_model_42_2024.pkl +3 -0
- data/resources/similarity_dict_threshold_80.json +3 -0
- phenogenius_app.py +643 -0
- poetry.lock +0 -0
- pyproject.toml +21 -0
- requirements.txt +57 -0
README.md
ADDED
@@ -0,0 +1,75 @@
1 |
+
---
|
2 |
+
title: PhenoGenius
|
3 |
+
emoji: genie
|
4 |
+
sdk: streamlit
|
5 |
+
sdk_version: 1.25.0
|
6 |
+
app_file: phenogenius_app.py
|
7 |
+
python_version: 3.11
|
8 |
+
pinned: true
|
9 |
+
---
|
10 |
+
|
11 |
+
# PhenoGenius web app
|
12 |
+
|
13 |
+
Symptom interaction modeling for precision medicine
|
14 |
+
|
15 |
+
## Overview
|
16 |
+
|
17 |
+
Symptom interaction models provide a method to standardize clinical descriptions and fully exploit phenotypic data in precision medicine.
|
18 |
+
|
19 |
+
This repository contains the scripts and files needed to run the PhenoGenius web app, a phenotype matching system for genetic diseases based on this model. **Please try PhenoGenius in the cloud at [https://huggingface.co/spaces/kyauy/PhenoGenius](https://huggingface.co/spaces/kyauy/PhenoGenius).**
|
20 |
+
|
21 |
+
If you use PhenoGenius, please cite:
|
22 |
+
> Yauy et al., Learning phenotypic patterns in genetic disease by symptom interaction modeling. medRxiv (2023). [https://doi.org/10.1101/2022.07.29.22278181](https://doi.org/10.1101/2022.07.29.22278181)
|
23 |
+
|
24 |
+
## Install
|
25 |
+
|
26 |
+
- Requirements
|
27 |
+
|
28 |
+
```bash
|
29 |
+
python == 3.11 #(pyenv install 3.11)
|
30 |
+
poetry #(https://python-poetry.org/docs/#installation)
|
31 |
+
git-lfs
|
32 |
+
```
|
33 |
+
|
34 |
+
- Install dependencies
|
35 |
+
|
36 |
+
```bash
|
37 |
+
poetry install
|
38 |
+
```
|
39 |
+
|
40 |
+
If you need to generate a `requirements.txt` file, use the following command:
|
41 |
+
```
|
42 |
+
poetry export --without-hashes --format=requirements.txt > requirements.txt
|
43 |
+
```
|
44 |
+
|
45 |
+
NB: if git-lfs is not installed, you won't be able to download PhenoGenius Web app resources.
|
46 |
+
|
47 |
+
## Use the Streamlit web app on your desktop
|
48 |
+
|
49 |
+
### Run
|
50 |
+
|
51 |
+
```bash
|
52 |
+
poetry shell
|
53 |
+
streamlit run phenogenius_app.py
|
54 |
+
```
|
55 |
+
|
56 |
+
|
57 |
+
Enjoy!
|
58 |
+
|
59 |
+
## Command line interface
|
60 |
+
|
61 |
+
The command line interface is available in the PhenoGenius client repository: [https://github.com/kyauy/PhenoGenius/](https://github.com/kyauy/PhenoGenius/).
|
62 |
+
|
63 |
+
## License
|
64 |
+
|
65 |
+
*PhenoGenius* is licensed under the Apache License, Version 2.0. See [LICENSE](LICENSE) for the full license text.
|
66 |
+
|
67 |
+
## Misc
|
68 |
+
|
69 |
+
*PhenoGenius* is a collaboration of:
|
70 |
+
|
71 |
+
[](https://seqone.com/)
|
72 |
+
|
73 |
+
[](https://iab.univ-grenoble-alpes.fr/)
|
74 |
+
|
75 |
+
|
data/img/logo-chuga.png
ADDED
data/img/logo-seqone.png
ADDED
data/img/logo-uga.png
ADDED
data/img/logoMIAI-rvb.png
ADDED
data/img/phenogenius.png
ADDED
data/resources/Homo_sapiens.gene_info.gz
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:0d068aeddb48594d70dc3d91409059bc2b5992ac280f6971d2ae41583d895707
|
3 |
+
size 3234358
|
data/resources/hpo_obo_2024.json
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:655dd665ba80c547844e8a7095398fafe155bc149566b656a3a62019f9aed813
|
3 |
+
size 11488780
|
data/resources/main_topics_hpo_390_42_filtered_norm_004_2024.tsv
ADDED
The diff for this file is too large to render.
See raw diff
|
|
data/resources/ohe_all_thesaurus_weighted_2024.tsv.gz
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:8f2a882aa1c25da99468af56829fc35299631b3f9e30aca821bfe9cfa9f220e2
|
3 |
+
size 14121583
|
data/resources/pheno_NMF_390_matrix_42_2024.pkl
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:6a2740bb601d0776dbccccf17da25e8c09342475a510f37f875232160ac47b41
|
3 |
+
size 17447203
|
data/resources/pheno_NMF_390_model_42_2024.pkl
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:4c8cde8d11d69f6ffa5c405a41187f5ef42b5375e7aa8fe1f8f6d986967ba13b
|
3 |
+
size 58193070
|
data/resources/similarity_dict_threshold_80.json
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:73187c97bb1bf2898d1a85b27ed260e708d195e74f98338b44517bf8537bf076
|
3 |
+
size 38378638
|
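The resource files above are stored as Git LFS pointers: three-line text stubs giving the spec version, a SHA-256 object ID, and the size in bytes. If the repository is cloned without git-lfs, these stubs are what end up on disk, and loading them as gzip, JSON, or pickle data will fail. Below is a minimal sketch of a pre-flight check, assuming the pointer layout shown above. `is_lfs_pointer` and `unresolved_resources` are hypothetical helpers and are not part of this commit.

```python
from pathlib import Path

# First line of every Git LFS pointer file, as shown in the diffs above.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` looks like an unresolved Git LFS pointer."""
    with open(path, "rb") as handle:
        # Pointer files are small text stubs that begin with the spec line.
        return handle.read(len(LFS_HEADER)) == LFS_HEADER


def unresolved_resources(resource_dir):
    """List files under `resource_dir` that are still LFS pointers (hypothetical helper)."""
    return sorted(
        str(p)
        for p in Path(resource_dir).rglob("*")
        if p.is_file() and is_lfs_pointer(p)
    )
```

Calling `unresolved_resources("data/resources")` before starting the app and getting a non-empty list would suggest that `git lfs pull` still needs to be run.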
phenogenius_app.py
ADDED
@@ -0,0 +1,643 @@
1 |
+
import streamlit as st
|
2 |
+
import numpy as np
|
3 |
+
import pandas as pd
|
4 |
+
from PIL import Image
|
5 |
+
import ujson as json
|
6 |
+
import pickle as pk
|
7 |
+
from plotnine import *
|
8 |
+
|
9 |
+
# -- Set page config
|
10 |
+
apptitle = "PhenoGenius"
|
11 |
+
|
12 |
+
st.set_page_config(
|
13 |
+
page_title=apptitle,
|
14 |
+
page_icon=":genie:",
|
15 |
+
layout="wide",
|
16 |
+
initial_sidebar_state="auto",
|
17 |
+
)
|
18 |
+
|
19 |
+
# -- Set Sidebar
|
20 |
+
image_pg = Image.open("data/img/phenogenius.png")
|
21 |
+
st.sidebar.image(image_pg, caption=None, width=100)
|
22 |
+
st.sidebar.title("PhenoGenius")
|
23 |
+
|
24 |
+
st.sidebar.header(
|
25 |
+
"Learning phenotypic patterns in genetic diseases by symptom interaction modeling"
|
26 |
+
)
|
27 |
+
|
28 |
+
st.sidebar.markdown(
|
29 |
+
"""
|
30 |
+
This webapp presents symptom interaction models in genetic diseases to provide:
|
31 |
+
- Standardized clinical descriptions
|
32 |
+
- Interpretable matches between symptoms and genes
|
33 |
+
|
34 |
+
Source code is available on GitHub:
|
35 |
+
[https://github.com/kyauy/PhenoGenius](https://github.com/kyauy/PhenoGenius)
|
36 |
+
|
37 |
+
Last update: 2024-07-15
|
38 |
+
|
39 |
+
PhenoGenius is a collaborative project from:
|
40 |
+
"""
|
41 |
+
)
|
42 |
+
|
43 |
+
image_uga = Image.open("data/img/logo-uga.png")
|
44 |
+
st.sidebar.image(image_uga, caption=None, width=95)
|
45 |
+
|
46 |
+
image_seqone = Image.open("data/img/logo-seqone.png")
|
47 |
+
st.sidebar.image(image_seqone, caption=None, width=95)
|
48 |
+
|
49 |
+
image_miai = Image.open("data/img/logoMIAI-rvb.png")
|
50 |
+
st.sidebar.image(image_miai, caption=None, width=95)
|
51 |
+
|
52 |
+
image_chuga = Image.open("data/img/logo-chuga.png")
|
53 |
+
st.sidebar.image(image_chuga, caption=None, width=60)
|
54 |
+
|
55 |
+
|
56 |
+
@st.cache_data(max_entries=50)
|
57 |
+
def convert_df(df):
|
58 |
+
return df.to_csv(sep="\t").encode("utf-8")
|
59 |
+
|
60 |
+
|
61 |
+
@st.cache_data(max_entries=50)
|
62 |
+
def load_data():
|
63 |
+
matrix = pd.read_csv(
|
64 |
+
"data/resources/ohe_all_thesaurus_weighted_2024.tsv.gz",
|
65 |
+
sep="\t",
|
66 |
+
compression="gzip",
|
67 |
+
index_col=0,
|
68 |
+
)
|
69 |
+
return matrix
|
70 |
+
|
71 |
+
|
72 |
+
@st.cache_data(hash_funcs={"Pickle": lambda _: None}, max_entries=50)
|
73 |
+
def load_nmf_model():
|
74 |
+
with open("data/resources/pheno_NMF_390_model_42_2024.pkl", "rb") as pickle_file:
|
75 |
+
pheno_NMF = pk.load(pickle_file)
|
76 |
+
with open("data/resources/pheno_NMF_390_matrix_42_2024.pkl", "rb") as pickle_file:
|
77 |
+
reduced = pk.load(pickle_file)
|
78 |
+
return pheno_NMF, reduced
|
79 |
+
|
80 |
+
|
81 |
+
@st.cache_data(max_entries=50)
|
82 |
+
def symbol_to_id_to_dict():
|
83 |
+
# from NCBI
|
84 |
+
ncbi_df = pd.read_csv("data/resources/Homo_sapiens.gene_info.gz", sep="\t")
|
85 |
+
ncbi_df = ncbi_df[ncbi_df["#tax_id"] == 9606]
|
86 |
+
ncbi_df_ncbi = ncbi_df.set_index("Symbol")
|
87 |
+
ncbi_to_dict_ncbi = ncbi_df_ncbi["GeneID"].to_dict()
|
88 |
+
ncbi_df = ncbi_df.set_index("GeneID")
|
89 |
+
ncbi_to_dict = ncbi_df["Symbol"].to_dict()
|
90 |
+
return ncbi_to_dict_ncbi, ncbi_to_dict
|
91 |
+
|
92 |
+
|
93 |
+
@st.cache_data(hash_funcs={"_json.Scanner": hash}, max_entries=50)
|
94 |
+
def load_hp_ontology():
|
95 |
+
with open("data/resources/hpo_obo_2024.json") as json_data:
|
96 |
+
data_dict = json.load(json_data)
|
97 |
+
return data_dict
|
98 |
+
|
99 |
+
|
100 |
+
@st.cache_data(max_entries=50)
|
101 |
+
def hpo_description_to_id():
|
102 |
+
data_dict = {}
|
103 |
+
for key, value in hp_onto.items():
|
104 |
+
data_dict[value["name"]] = key
|
105 |
+
return data_dict
|
106 |
+
|
107 |
+
|
108 |
+
@st.cache_data(max_entries=50)
|
109 |
+
def load_topic_data():
|
110 |
+
topic = pd.read_csv(
|
111 |
+
"data/resources/main_topics_hpo_390_42_filtered_norm_004_2024.tsv",
|
112 |
+
sep="\t",
|
113 |
+
index_col=0,
|
114 |
+
)
|
115 |
+
return topic
|
116 |
+
|
117 |
+
|
118 |
+
@st.cache_data(hash_funcs={"_json.Scanner": hash}, max_entries=50)
|
119 |
+
def load_similarity_dict():
|
120 |
+
with open("data/resources/similarity_dict_threshold_80.json") as json_data:
|
121 |
+
data_dict = json.load(json_data)
|
122 |
+
return data_dict
|
123 |
+
|
124 |
+
|
125 |
+
def get_symbol(gene):
|
126 |
+
if gene in symbol.keys():
|
127 |
+
return symbol[gene]
|
128 |
+
|
129 |
+
|
130 |
+
def get_hpo_name(hpo):
|
131 |
+
names = {}
|
132 |
+
if hpo in hp_onto.keys():
|
133 |
+
names[hpo] = hp_onto[hpo]["name"]
|
134 |
+
return names
|
135 |
+
|
136 |
+
|
137 |
+
def get_hpo_name_only(hpo):
|
138 |
+
if hpo in hp_onto.keys():
|
139 |
+
return hp_onto[hpo]["name"]
|
140 |
+
else:
|
141 |
+
return None
|
142 |
+
|
143 |
+
|
144 |
+
def get_hpo_name_list(hpo_list, hp_onto):
|
145 |
+
names = {}
|
146 |
+
for hpo in hpo_list:
|
147 |
+
if hpo in hp_onto.keys():
|
148 |
+
names[hpo] = hp_onto[hpo]["name"]
|
149 |
+
return names
|
150 |
+
|
151 |
+
|
152 |
+
def get_similar_terms(hpo_list, similarity_terms_dict):
|
153 |
+
hpo_list_w_simi = {}
|
154 |
+
for term in hpo_list:
|
155 |
+
hpo_list_w_simi[term] = 1
|
156 |
+
if term in similarity_terms_dict.keys():
|
157 |
+
for key, value in similarity_terms_dict[term].items():
|
158 |
+
if value > 0.8:
|
159 |
+
score = value / len(similarity_terms_dict[term].keys())
|
160 |
+
if key in hpo_list_w_simi.keys():
|
161 |
+
if score > hpo_list_w_simi[key]:
|
162 |
+
hpo_list_w_simi[key] = score
|
163 |
+
else:
|
164 |
+
pass
|
165 |
+
else:
|
166 |
+
hpo_list_w_simi[key] = score
|
167 |
+
hpo_list_all = hpo_list_w_simi.keys()
|
168 |
+
return hpo_list_w_simi, list(hpo_list_all)
|
169 |
+
|
170 |
+
|
171 |
+
def score(hpo_list, matrix):
|
172 |
+
# Create a copy of the filtered matrix to avoid SettingWithCopyWarning
|
173 |
+
matrix_filter = matrix[hpo_list].copy()
|
174 |
+
|
175 |
+
# Use .loc to safely add or modify columns in the copy of the DataFrame
|
176 |
+
matrix_filter.loc[:, "sum"] = matrix_filter.sum(axis=1)
|
177 |
+
matrix_filter.loc[:, "gene_symbol"] = matrix_filter.index.to_series().apply(
|
178 |
+
get_symbol
|
179 |
+
)
|
180 |
+
|
181 |
+
# Return the modified DataFrame sorted by 'sum'
|
182 |
+
return matrix_filter.sort_values("sum", ascending=False)
|
183 |
+
|
184 |
+
|
185 |
+
def score_sim_add(hpo_list_add, matrix, sim_dict):
|
186 |
+
# Ensure matrix_filter is a copy to avoid modifying the original DataFrame
|
187 |
+
matrix_filter = matrix[hpo_list_add].copy()
|
188 |
+
|
189 |
+
# Iterate through sim_dict to update matrix_filter values
|
190 |
+
for key, value in sim_dict.items():
|
191 |
+
if key in matrix_filter.columns:
|
192 |
+
matrix_filter[key] = (
|
193 |
+
matrix_filter[key] * value
|
194 |
+
) # Direct column assignment is fine here
|
195 |
+
|
196 |
+
# Calculate the sum and assign gene_symbol, using direct assignment for these operations
|
197 |
+
matrix_filter["sum"] = matrix_filter.sum(axis=1)
|
198 |
+
matrix_filter["gene_symbol"] = matrix_filter.index.to_series().apply(get_symbol)
|
199 |
+
|
200 |
+
# Return the DataFrame sorted by 'sum'
|
201 |
+
return matrix_filter.sort_values("sum", ascending=False)
|
202 |
+
|
203 |
+
|
204 |
+
def get_phenotype_specificity(gene_diag, data_patient):
|
205 |
+
rank = data_patient.loc[int(ncbi[gene_diag]), "rank"]
|
206 |
+
max_rank = data_patient["rank"].max()
|
207 |
+
if rank == max_rank:
|
208 |
+
return "D - the reported phenotype is NOT consistent with what is expected for the gene/genomic region or not consistent in general."
|
209 |
+
elif rank < 41:
|
210 |
+
return "A - the reported phenotype is highly specific and relatively unique to the gene (top 40, 50% of diagnoses in the PhenoGenius cohort)."
|
211 |
+
elif rank < 251:
|
212 |
+
return "B - the reported phenotype is consistent with the gene, is highly specific, but not necessarily unique to the gene (top 250, 75% of diagnoses in the PhenoGenius cohort)."
|
213 |
+
else:
|
214 |
+
return "C - the phenotype is reported with limited association with the gene, not highly specific and/or with high genetic heterogeneity."
|
215 |
+
|
216 |
+
|
217 |
+
def get_relatives_list(hpo_list, hp_onto):
|
218 |
+
all_list = []
|
219 |
+
for hpo in hpo_list:
|
220 |
+
all_list.append(hpo)
|
221 |
+
if hpo in hp_onto.keys():
|
222 |
+
for parent in hp_onto[hpo]["parents"]:
|
223 |
+
all_list.append(parent)
|
224 |
+
for children in hp_onto[hpo]["childrens"]:
|
225 |
+
all_list.append(children)
|
226 |
+
return list(set(all_list))
|
227 |
+
|
228 |
+
|
229 |
+
def get_hpo_id(hpo_list):
|
230 |
+
hpo_id = []
|
231 |
+
for description in hpo_list:
|
232 |
+
hpo_id.append(hp_desc_id[description])
|
233 |
+
return ",".join(hpo_id)
|
234 |
+
|
235 |
+
|
236 |
+
hp_onto = load_hp_ontology()
|
237 |
+
hp_desc_id = hpo_description_to_id()
|
238 |
+
ncbi, symbol = symbol_to_id_to_dict()
|
239 |
+
|
240 |
+
|
241 |
+
with st.form("my_form"):
|
242 |
+
c1, c2 = st.columns(2)
|
243 |
+
with c1:
|
244 |
+
hpo_raw = st.multiselect(
|
245 |
+
"Select your HPO terms interactively or...",
|
246 |
+
list(hp_desc_id.keys()),
|
247 |
+
["Renal cyst", "Hepatic cysts"],
|
248 |
+
)
|
249 |
+
with c2:
|
250 |
+
hpo = st.text_input(
|
251 |
+
"copy/paste your HPO IDs, separated by commas",
|
252 |
+
"HP:0000107,HP:0001407",
|
253 |
+
)
|
254 |
+
gene_diag_input = st.multiselect(
|
255 |
+
"Optional: provide HGNC gene symbol to be tested",
|
256 |
+
options=list(ncbi.keys()),
|
257 |
+
default=["PKD1"],
|
258 |
+
max_selections=1,
|
259 |
+
)
|
260 |
+
submit_button = st.form_submit_button(
|
261 |
+
label="Submit",
|
262 |
+
)
|
263 |
+
|
264 |
+
|
265 |
+
if submit_button:
|
266 |
+
if hpo_raw != ["Renal cyst", "Hepatic cysts"] and len(hpo_raw) > 0:
|
267 |
+
hpo = get_hpo_id(hpo_raw)
|
268 |
+
data = load_data()
|
269 |
+
pheno_NMF, reduced = load_nmf_model()
|
270 |
+
topic = load_topic_data()
|
271 |
+
similarity_terms_dict = load_similarity_dict()
|
272 |
+
|
273 |
+
hpo_list_ini = [term.strip() for term in hpo.split(",") if term.strip()]
|
274 |
+
|
275 |
+
if gene_diag_input:
|
276 |
+
if gene_diag_input[0] in ncbi.keys():
|
277 |
+
gene_diag = gene_diag_input[0]
|
278 |
+
else:
|
279 |
+
st.write(
|
280 |
+
gene_diag_input[0]
|
281 |
+
+ " gene is not in our database. Please check the gene name (it must be in CAPITAL letters)."
|
282 |
+
)
|
283 |
+
gene_diag = None
|
284 |
+
else:
|
285 |
+
gene_diag = None
|
286 |
+
|
287 |
+
hpo_list_up = []
|
288 |
+
for hpo in hpo_list_ini:
|
289 |
+
if hpo in ["HP:0000001"]:
|
290 |
+
pass
|
291 |
+
elif len(hpo) != 10:
|
292 |
+
st.write(
|
293 |
+
"Incorrect HPO format: "
|
294 |
+
+ hpo
|
295 |
+
+ ". Please check (7-digit terms with the prefix HP:, separated by commas)."
|
296 |
+
)
|
297 |
+
pass
|
298 |
+
elif hpo not in data.columns:
|
299 |
+
pass
|
300 |
+
st.write(hpo + " not available in current database. Please modify.")
|
301 |
+
else:
|
302 |
+
if data[hpo].astype(bool).sum(axis=0) != 0:
|
303 |
+
hpo_list_up.append(hpo)
|
304 |
+
else:
|
305 |
+
hpo_to_test = hp_onto[hpo]["direct_parent"][0]
|
306 |
+
while data[hpo_to_test].astype(bool).sum(
|
307 |
+
axis=0
|
308 |
+
) == 0 and hpo_to_test not in ["HP:0000001"]:
|
309 |
+
hpo_to_test = hp_onto[hpo_to_test]["direct_parent"][0]
|
310 |
+
if hpo_to_test in ["HP:0000001"]:
|
311 |
+
st.write(
|
312 |
+
"No gene-HPO association was found for "
|
313 |
+
+ hpo
|
314 |
+
+ " and parents."
|
315 |
+
)
|
316 |
+
else:
|
317 |
+
hpo_list_up.append(hpo_to_test)
|
318 |
+
st.write(
|
319 |
+
"We replaced: ",
|
320 |
+
hpo,
|
321 |
+
" by ",
|
322 |
+
hpo_to_test,
|
323 |
+
"-",
|
324 |
+
get_hpo_name(hpo_to_test),
|
325 |
+
)
|
326 |
+
hpo_list = list(set(hpo_list_up))
|
327 |
+
del hpo_list_up
|
328 |
+
|
329 |
+
if hpo_list:
|
330 |
+
with st.expander("See HPO inputs"):
|
331 |
+
st.write(get_hpo_name_list(hpo_list_ini, hp_onto))
|
332 |
+
del hpo_list_ini
|
333 |
+
|
334 |
+
hpo_list_name = get_relatives_list(hpo_list, hp_onto)
|
335 |
+
|
336 |
+
st.header("Clinical description with symptom interaction modeling")
|
337 |
+
|
338 |
+
witness = np.zeros(len(data.columns))
|
339 |
+
witness_nmf = np.matmul(pheno_NMF.components_, witness)
|
340 |
+
|
341 |
+
patient = np.zeros(len(data.columns))
|
342 |
+
for hpo in hpo_list:
|
343 |
+
hpo_index = list(data.columns).index(hpo)
|
344 |
+
patient[hpo_index] = 1
|
345 |
+
|
346 |
+
patient_nmf = np.matmul(pheno_NMF.components_, patient)
|
347 |
+
|
348 |
+
witness_sugg_df = (
|
349 |
+
pd.DataFrame(reduced)
|
350 |
+
.set_index(data.index)
|
351 |
+
.apply(lambda x: (x - witness_nmf) ** 2, axis=1)
|
352 |
+
)
|
353 |
+
patient_sugg_df = (
|
354 |
+
pd.DataFrame(reduced)
|
355 |
+
.set_index(data.index)
|
356 |
+
.apply(lambda x: (x - patient_nmf) ** 2, axis=1)
|
357 |
+
)
|
358 |
+
|
359 |
+
case_sugg_df = (patient_sugg_df - witness_sugg_df).sum()
|
360 |
+
|
361 |
+
patient_df_info = pd.DataFrame(case_sugg_df).merge(
|
362 |
+
topic, left_index=True, right_index=True
|
363 |
+
)
|
364 |
+
|
365 |
+
patient_df_info["mean_score"] = round(
|
366 |
+
patient_df_info[0] / (patient_df_info["total_weight"] ** 2), 4
|
367 |
+
)
|
368 |
+
|
369 |
+
patient_df_info_write = patient_df_info[
|
370 |
+
["mean_score", "main_term", "n_hpo", "hpo_name", "hpo_list", "weight"]
|
371 |
+
].sort_values("mean_score", ascending=False)
|
372 |
+
|
373 |
+
del case_sugg_df
|
374 |
+
del patient_sugg_df
|
375 |
+
del witness_sugg_df
|
376 |
+
del patient
|
377 |
+
|
378 |
+
with st.expander("See projection in groups of symptoms dimension*"):
|
379 |
+
st.dataframe(patient_df_info_write)
|
380 |
+
st.write(
|
381 |
+
"\\* For interpretability, we report only the top 10% of the 390 groups of interacting symptom associations."
|
382 |
+
)
|
383 |
+
match_proj_csv = convert_df(patient_df_info_write)
|
384 |
+
|
385 |
+
st.download_button(
|
386 |
+
"Download description projection",
|
387 |
+
match_proj_csv,
|
388 |
+
"clin_desc_projected.tsv",
|
389 |
+
"text/csv",
|
390 |
+
key="download-csv-proj",
|
391 |
+
)
|
392 |
+
|
393 |
+
sim_dict, hpo_list_add = get_similar_terms(hpo_list, similarity_terms_dict)
|
394 |
+
similar_list = list(set(hpo_list_add) - set(hpo_list))
|
395 |
+
similar_list_desc = get_hpo_name_list(similar_list, hp_onto)
|
396 |
+
|
397 |
+
if similar_list_desc:
|
398 |
+
with st.expander("See symptoms with similarity > 80%"):
|
399 |
+
similar_list_desc_df = pd.DataFrame.from_dict(
|
400 |
+
similar_list_desc, orient="index"
|
401 |
+
)
|
402 |
+
similar_list_desc_df.columns = ["description"]
|
403 |
+
st.write(similar_list_desc_df)
|
404 |
+
del similar_list_desc_df
|
405 |
+
del similar_list
|
406 |
+
del similar_list_desc
|
407 |
+
|
408 |
+
st.header("Phenotype matching")
|
409 |
+
results_sum = score(hpo_list, data)
|
410 |
+
results_sum["matchs"] = results_sum[hpo_list].astype(bool).sum(axis=1)
|
411 |
+
results_sum["score"] = results_sum["matchs"] + results_sum["sum"]
|
412 |
+
results_sum["rank"] = (
|
413 |
+
results_sum["score"].rank(ascending=False, method="max").astype(int)
|
414 |
+
)
|
415 |
+
cols = results_sum.columns.tolist()
|
416 |
+
cols = cols[-4:] + cols[:-4]
|
417 |
+
match = results_sum[cols].sort_values(by=["score"], ascending=False)
|
418 |
+
st.dataframe(match[match["score"] > 1.01].drop(columns=["sum"]))
|
419 |
+
match_csv = convert_df(match)
|
420 |
+
|
421 |
+
st.download_button(
|
422 |
+
"Download matching results",
|
423 |
+
match_csv,
|
424 |
+
"match.tsv",
|
425 |
+
"text/csv",
|
426 |
+
key="download-csv-match",
|
427 |
+
)
|
428 |
+
|
429 |
+
if gene_diag:
|
430 |
+
if int(ncbi[gene_diag]) in results_sum.index:
|
431 |
+
p = (
|
432 |
+
ggplot(match, aes("score"))
|
433 |
+
+ geom_density()
|
434 |
+
+ geom_vline(
|
435 |
+
xintercept=results_sum.loc[int(ncbi[gene_diag]), "score"],
|
436 |
+
linetype="dashed",
|
437 |
+
color="red",
|
438 |
+
size=1.5,
|
439 |
+
)
|
440 |
+
+ ggtitle("Matching score distribution")
|
441 |
+
+ xlab("Gene matching score")
|
442 |
+
+ ylab("% of genes")
|
443 |
+
+ theme_bw()
|
444 |
+
+ theme(
|
445 |
+
text=element_text(size=12),
|
446 |
+
figure_size=(5, 5),
|
447 |
+
axis_ticks=element_line(colour="black", size=4),
|
448 |
+
axis_line=element_line(colour="black", size=2),
|
449 |
+
axis_text_x=element_text(angle=45, hjust=1),
|
450 |
+
axis_text_y=element_text(angle=60, hjust=1),
|
451 |
+
subplots_adjust={"wspace": 0.1},
|
452 |
+
legend_position=(0.7, 0.35),
|
453 |
+
)
|
454 |
+
)
|
455 |
+
col1, col2, col3 = st.columns(3)
|
456 |
+
|
457 |
+
with col1:
|
458 |
+
st.pyplot(ggplot.draw(p))
|
459 |
+
|
460 |
+
st.write(
|
461 |
+
"Gene ID rank:",
|
462 |
+
results_sum.loc[int(ncbi[gene_diag]), "rank"],
|
463 |
+
" | ",
|
464 |
+
"Gene ID count:",
|
465 |
+
round(results_sum.loc[int(ncbi[gene_diag]), "sum"], 4),
|
466 |
+
)
|
467 |
+
st.write(results_sum.loc[[int(ncbi[gene_diag])]])
|
468 |
+
st.write(
|
469 |
+
"Gene ID phenotype specificity:",
|
470 |
+
get_phenotype_specificity(gene_diag, results_sum),
|
471 |
+
)
|
472 |
+
del p
|
473 |
+
|
474 |
+
else:
|
475 |
+
st.write("Gene ID rank:", " Gene not available in PhenoGenius database")
|
476 |
+
del results_sum
|
477 |
+
del match
|
478 |
+
|
479 |
+
st.header("Phenotype matching by similarity of symptoms")
|
480 |
+
results_sum_add = score_sim_add(hpo_list_add, data, sim_dict)
|
481 |
+
results_sum_add["rank"] = (
|
482 |
+
results_sum_add["sum"].rank(ascending=False, method="max").astype(int)
|
483 |
+
)
|
484 |
+
cols = results_sum_add.columns.tolist()
|
485 |
+
cols = cols[-2:] + cols[:-2]
|
486 |
+
match_sim = results_sum_add[cols].sort_values(by=["sum"], ascending=False)
|
487 |
+
st.dataframe(match_sim[match_sim["sum"] > 0.01])
|
488 |
+
|
489 |
+
match_sim_csv = convert_df(match_sim)
|
490 |
+
|
491 |
+
st.download_button(
|
492 |
+
"Download matching results",
|
493 |
+
match_sim_csv,
|
494 |
+
"match_sim.tsv",
|
495 |
+
"text/csv",
|
496 |
+
key="download-csv-match-sim",
|
497 |
+
)
|
498 |
+
|
499 |
+
if gene_diag:
|
500 |
+
if int(ncbi[gene_diag]) in results_sum_add.index:
|
501 |
+
p2 = (
|
502 |
+
ggplot(match_sim, aes("sum"))
|
503 |
+
+ geom_density()
|
504 |
+
+ geom_vline(
|
505 |
+
xintercept=results_sum_add.loc[int(ncbi[gene_diag]), "sum"],
|
506 |
+
linetype="dashed",
|
507 |
+
color="red",
|
508 |
+
size=1.5,
|
509 |
+
)
|
510 |
+
+ ggtitle("Matching score distribution")
|
511 |
+
+ xlab("Gene matching score")
|
512 |
+
+ ylab("% of genes")
|
513 |
+
+ theme_bw()
|
514 |
+
+ theme(
|
515 |
+
text=element_text(size=12),
|
516 |
+
figure_size=(5, 5),
|
517 |
+
axis_ticks=element_line(colour="black", size=4),
|
518 |
+
axis_line=element_line(colour="black", size=2),
|
519 |
+
axis_text_x=element_text(angle=45, hjust=1),
|
520 |
+
axis_text_y=element_text(angle=60, hjust=1),
|
521 |
+
subplots_adjust={"wspace": 0.1},
|
522 |
+
legend_position=(0.7, 0.35),
|
523 |
+
)
|
524 |
+
)
|
525 |
+
col1, col2, col3 = st.columns(3)
|
526 |
+
|
527 |
+
with col1:
|
528 |
+
st.pyplot(ggplot.draw(p2))
|
529 |
+
|
530 |
+
st.write(
|
531 |
+
"Gene ID rank:",
|
532 |
+
results_sum_add.loc[int(ncbi[gene_diag]), "rank"],
|
533 |
+
" | ",
|
534 |
+
"Gene ID count:",
|
535 |
+
round(results_sum_add.loc[int(ncbi[gene_diag]), "sum"], 4),
|
536 |
+
)
|
537 |
+
st.write(
|
538 |
+
"Gene ID phenotype specificity:",
|
539 |
+
get_phenotype_specificity(gene_diag, results_sum_add),
|
540 |
+
)
|
541 |
+
del p2
|
542 |
+
|
543 |
+
else:
|
544 |
+
st.write("Gene ID rank:", " Gene not available in PhenoGenius database")
|
545 |
+
|
546 |
+
del sim_dict
|
547 |
+
del hpo_list_add
|
548 |
+
del results_sum_add
|
549 |
+
del match_sim
|
550 |
+
|
551 |
+
st.header("Phenotype matching by groups of symptoms")
|
552 |
+
|
553 |
+
patient_df = (
|
554 |
+
pd.DataFrame(reduced)
|
555 |
+
.set_index(data.index)
|
556 |
+
.apply(lambda x: sum((x - patient_nmf) ** 2), axis=1)
|
557 |
+
)
|
558 |
+
|
559 |
+
witness_df = (
|
560 |
+
pd.DataFrame(reduced)
|
561 |
+
.set_index(data.index)
|
562 |
+
.apply(lambda x: sum((x - witness_nmf) ** 2), axis=1)
|
563 |
+
)
|
564 |
+
del patient_nmf
|
565 |
+
del witness
|
566 |
+
del witness_nmf
|
567 |
+
|
568 |
+
case_df = pd.DataFrame(patient_df - witness_df)
|
569 |
+
case_df.columns = ["score"]
|
570 |
+
case_df["score_norm"] = abs(case_df["score"] - case_df["score"].max())
|
571 |
+
# case_df["frequency"] = matrix_frequency["variant_number"]
|
572 |
+
case_df["sum"] = case_df["score_norm"] # + case_df["frequency"]
|
573 |
+
case_df_sort = case_df.sort_values(by="sum", ascending=False)
|
574 |
+
case_df_sort["rank"] = (
|
575 |
+
case_df_sort["sum"].rank(ascending=False, method="max").astype(int)
|
576 |
+
)
|
577 |
+
case_df_sort["gene_symbol"] = case_df_sort.index.to_series().apply(get_symbol)
|
578 |
+
match_nmf = case_df_sort[["gene_symbol", "rank", "sum"]]
|
579 |
+
st.dataframe(match_nmf[match_nmf["sum"] > 0.01])
|
580 |
+
|
581 |
+
match_nmf_csv = convert_df(match_nmf)
|
582 |
+
|
583 |
+
st.download_button(
|
584 |
+
"Download matching results",
|
585 |
+
match_nmf_csv,
|
586 |
+
"match_groups.tsv",
|
587 |
+
"text/csv",
|
588 |
+
key="download-csv-match-groups",
|
589 |
+
)
|
590 |
+
|
591 |
+
if gene_diag:
|
592 |
+
if int(ncbi[gene_diag]) in case_df_sort.index:
|
593 |
+
p3 = (
|
594 |
+
ggplot(match_nmf, aes("sum"))
|
595 |
+
+ geom_density()
|
596 |
+
+ geom_vline(
|
597 |
+
xintercept=case_df_sort.loc[int(ncbi[gene_diag]), "sum"],
|
598 |
+
linetype="dashed",
|
599 |
+
color="red",
|
600 |
+
size=1.5,
|
601 |
+
)
|
602 |
+
+ ggtitle("Matching score distribution")
|
603 |
+
+ xlab("Gene matching score")
|
604 |
+
+ ylab("% of genes")
|
605 |
+
+ theme_bw()
|
606 |
+
+ theme(
|
607 |
+
text=element_text(size=12),
|
608 |
+
figure_size=(5, 5),
|
609 |
+
axis_ticks=element_line(colour="black", size=4),
|
610 |
+
axis_line=element_line(colour="black", size=2),
|
611 |
+
axis_text_x=element_text(angle=45, hjust=1),
|
612 |
+
axis_text_y=element_text(angle=60, hjust=1),
|
613 |
+
subplots_adjust={"wspace": 0.1},
|
614 |
+
legend_position=(0.7, 0.35),
|
615 |
+
)
|
616 |
+
)
|
617 |
+
col1, col2, col3 = st.columns(3)
|
618 |
+
|
619 |
+
with col1:
|
620 |
+
st.pyplot(ggplot.draw(p3))
|
621 |
+
|
622 |
+
st.write(
|
623 |
+
"Gene ID rank:",
|
624 |
+
case_df_sort.loc[int(ncbi[gene_diag]), "rank"],
|
625 |
+
" | ",
|
626 |
+
"Gene ID count:",
|
627 |
+
round(case_df_sort.loc[int(ncbi[gene_diag]), "sum"], 4),
|
628 |
+
)
|
629 |
+
st.write(
|
630 |
+
"Gene ID phenotype specificity:",
|
631 |
+
get_phenotype_specificity(gene_diag, case_df_sort),
|
632 |
+
)
|
633 |
+
del p3
|
634 |
+
else:
|
635 |
+
st.write("Gene ID rank:", " Gene not available in PhenoGenius database")
|
636 |
+
del case_df_sort
|
637 |
+
del match_nmf
|
638 |
+
del case_df
|
639 |
+
|
640 |
+
else:
|
641 |
+
st.write(
|
642 |
+
"No HPO terms provided in correct format.",
|
643 |
+
)
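The input handling in this file accepts a comma-separated string, skips the root term `HP:0000001`, and rejects any term whose length is not 10 characters. A length check alone would let through strings such as `HPX0000107`, and a stray space after a comma can make an otherwise valid term fail if individual terms are not stripped. A stricter parser could look like the sketch below. `parse_hpo_terms` and its return shape are hypothetical, and the pattern assumes that IDs are always `HP:` followed by seven digits, as the app's own error message states.

```python
import re

# Assumed ID shape: the literal prefix "HP:" followed by exactly seven digits.
HPO_PATTERN = re.compile(r"HP:\d{7}")
HPO_ROOT = "HP:0000001"


def parse_hpo_terms(raw):
    """Split a comma-separated string into (valid, rejected) lists of HPO terms."""
    valid, rejected = [], []
    for chunk in raw.split(","):
        term = chunk.strip()
        if not term or term == HPO_ROOT:
            continue  # empty chunks and the ontology root carry no information
        if HPO_PATTERN.fullmatch(term):
            if term not in valid:  # drop duplicates, keep first-seen order
                valid.append(term)
        else:
            rejected.append(term)
    return valid, rejected
```

Returning the rejected terms separately lets the caller report them all in one message instead of writing one warning per term.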
|
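The weighting done by `get_similar_terms` can be easier to follow on a toy input. Each observed term gets weight 1. Each neighbour whose similarity exceeds 0.8 gets its similarity divided by the number of neighbours listed for that observed term, and the largest weight seen for a term is kept. The sketch below restates that logic in a compact form. `expand_with_similar_terms` is a hypothetical name, and the `HP:9…` IDs in the usage note are made up for illustration.

```python
def expand_with_similar_terms(hpo_terms, similarity, threshold=0.8):
    """Return {term: weight} for observed terms plus their close neighbours."""
    weights = {}
    for term in hpo_terms:
        weights[term] = 1.0  # observed terms always carry full weight
        neighbours = similarity.get(term, {})
        for other, sim in neighbours.items():
            if sim <= threshold:
                continue  # neighbour is not similar enough to contribute
            # Dilute by the total number of neighbours listed for this term.
            candidate = sim / len(neighbours)
            weights[other] = max(weights.get(other, 0.0), candidate)
    return weights
```

For example, with `similarity = {"HP:0000107": {"HP:9000001": 0.9, "HP:9000002": 0.5}}` and the single observed term `HP:0000107`, the first neighbour receives weight 0.9 / 2 and the second is ignored because it falls below the threshold.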
poetry.lock
ADDED
The diff for this file is too large to render.
See raw diff
|
|
pyproject.toml
ADDED
@@ -0,0 +1,21 @@
1 |
+
[tool.poetry]
|
2 |
+
name = "PhenoGenius_app"
|
3 |
+
version = "1.1.0"
|
4 |
+
description = ""
|
5 |
+
authors = ["kevin.yauy <[email protected]>"]
|
6 |
+
|
7 |
+
[tool.poetry.dependencies]
|
8 |
+
python = ">=3.11"
|
9 |
+
pandas = ">=1.3.0"
|
10 |
+
ujson = "^5.4.0"
|
11 |
+
streamlit = "^1.11.1"
|
12 |
+
plotnine = "^0.13.0"
|
13 |
+
numpy = ">=1.24,<2.1"
|
14 |
+
scikit-learn = "^1.5.1"
|
15 |
+
|
16 |
+
[tool.poetry.dev-dependencies]
|
17 |
+
pytest = "^5.2"
|
18 |
+
|
19 |
+
[build-system]
|
20 |
+
requires = ["poetry-core>=1.0.0"]
|
21 |
+
build-backend = "poetry.core.masonry.api"
|
requirements.txt
ADDED
@@ -0,0 +1,57 @@
1 |
+
altair==5.4.1 ; python_version >= "3.11"
|
2 |
+
attrs==24.2.0 ; python_version >= "3.11"
|
3 |
+
blinker==1.8.2 ; python_version >= "3.11"
|
4 |
+
cachetools==5.5.0 ; python_version >= "3.11"
|
5 |
+
certifi==2024.8.30 ; python_version >= "3.11"
|
6 |
+
charset-normalizer==3.3.2 ; python_version >= "3.11"
|
7 |
+
click==8.1.7 ; python_version >= "3.11"
|
8 |
+
colorama==0.4.6 ; python_version >= "3.11" and platform_system == "Windows"
|
9 |
+
contourpy==1.3.0 ; python_version >= "3.11"
|
10 |
+
cycler==0.12.1 ; python_version >= "3.11"
|
11 |
+
fonttools==4.53.1 ; python_version >= "3.11"
|
12 |
+
gitdb==4.0.11 ; python_version >= "3.11"
|
13 |
+
gitpython==3.1.43 ; python_version >= "3.11"
|
14 |
+
idna==3.8 ; python_version >= "3.11"
|
15 |
+
jinja2==3.1.4 ; python_version >= "3.11"
|
16 |
+
joblib==1.4.2 ; python_version >= "3.11"
|
17 |
+
jsonschema-specifications==2023.12.1 ; python_version >= "3.11"
|
18 |
+
jsonschema==4.23.0 ; python_version >= "3.11"
|
19 |
+
kiwisolver==1.4.7 ; python_version >= "3.11"
|
20 |
+
markdown-it-py==3.0.0 ; python_version >= "3.11"
|
21 |
+
markupsafe==2.1.5 ; python_version >= "3.11"
|
22 |
+
matplotlib==3.9.2 ; python_version >= "3.11"
|
23 |
+
mdurl==0.1.2 ; python_version >= "3.11"
|
24 |
+
mizani==0.11.4 ; python_version >= "3.11"
|
25 |
+
narwhals==1.6.2 ; python_version >= "3.11"
|
26 |
+
numpy==2.0.2 ; python_version >= "3.11"
|
27 |
+
packaging==24.1 ; python_version >= "3.11"
|
28 |
+
pandas==2.2.2 ; python_version >= "3.11"
|
29 |
+
patsy==0.5.6 ; python_version >= "3.11"
|
30 |
+
pillow==10.4.0 ; python_version >= "3.11"
|
31 |
+
plotnine==0.13.6 ; python_version >= "3.11"
|
32 |
+
protobuf==5.28.0 ; python_version >= "3.11"
|
33 |
+
pyarrow==17.0.0 ; python_version >= "3.11"
|
34 |
+
pydeck==0.9.1 ; python_version >= "3.11"
|
35 |
+
pygments==2.18.0 ; python_version >= "3.11"
|
36 |
+
pyparsing==3.1.4 ; python_version >= "3.11"
|
37 |
+
python-dateutil==2.9.0.post0 ; python_version >= "3.11"
|
38 |
+
pytz==2024.1 ; python_version >= "3.11"
|
39 |
+
referencing==0.35.1 ; python_version >= "3.11"
|
40 |
+
requests==2.32.3 ; python_version >= "3.11"
|
41 |
+
rich==13.8.0 ; python_version >= "3.11"
|
42 |
+
rpds-py==0.20.0 ; python_version >= "3.11"
|
43 |
+
scikit-learn==1.5.1 ; python_version >= "3.11"
|
44 |
+
scipy==1.14.1 ; python_version >= "3.11"
|
45 |
+
six==1.16.0 ; python_version >= "3.11"
|
46 |
+
smmap==5.0.1 ; python_version >= "3.11"
|
47 |
+
statsmodels==0.14.2 ; python_version >= "3.11"
|
48 |
+
streamlit==1.38.0 ; python_version >= "3.11"
|
49 |
+
tenacity==8.5.0 ; python_version >= "3.11"
|
50 |
+
threadpoolctl==3.5.0 ; python_version >= "3.11"
|
51 |
+
toml==0.10.2 ; python_version >= "3.11"
|
52 |
+
tornado==6.4.1 ; python_version >= "3.11"
|
53 |
+
typing-extensions==4.12.2 ; python_version >= "3.11"
|
54 |
+
tzdata==2024.1 ; python_version >= "3.11"
|
55 |
+
ujson==5.10.0 ; python_version >= "3.11"
|
56 |
+
urllib3==2.2.2 ; python_version >= "3.11"
|
57 |
+
watchdog==4.0.2 ; platform_system != "Darwin" and python_version >= "3.11"
|
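Each line of the exported `requirements.txt` is an exact pin followed by an environment marker after the semicolon, for example restricting `colorama` to Windows. The sketch below splits such a line into its parts using plain string operations. `parse_requirement_line` is a hypothetical helper that only handles the `name==version ; marker` shape seen in this file, and a full parser such as `packaging.requirements.Requirement` would be the safer choice for arbitrary requirement strings.

```python
def parse_requirement_line(line):
    """Split 'name==version ; marker' into (name, version, marker or None)."""
    # Everything after the first ';' is the environment marker.
    spec, _, marker = line.partition(";")
    # The pin itself is 'name==version'.
    name, _, version = spec.strip().partition("==")
    return name.strip(), version.strip(), marker.strip() or None
```

This makes it straightforward to list only the packages that are installed conditionally, by filtering on markers that mention `platform_system`.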