opendatalab/AICC • 4.84B • 34.8k • 70
OpenDataLab provides high-quality open datasets and tools for large models. It is the designated open-source data service platform of the China Large Model Corpus Data Alliance.
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights
AICC: Parse HTML Finer, Make Models Better -- A 7.3T AI-Ready Corpus Built by a Model-Based HTML Parser