BigLAM: Machine Learning for Libraries, Archives, and Museums
BigLAM is a community-driven initiative to build an open ecosystem of machine learning models, datasets, and tools for Libraries, Archives, and Museums (LAMs).
We aim to:
- Share machine-learning-ready datasets from LAMs via the Hugging Face Hub
- Train and release open-source models for LAM-relevant tasks
- Develop tools and approaches tailored to LAM use cases
Background
BigLAM began as a datasets hackathon within the BigScience 🌸 project, a large-scale, open NLP collaboration.
Our goal: make LAM datasets more discoverable and usable to support researchers, institutions, and ML practitioners working with cultural heritage data.
What You'll Find
The BigLAM organization hosts:
- Datasets: image, text, and tabular data from and about libraries, archives, and museums
- Models: fine-tuned for tasks such as
  - Art/historical image classification
  - Document layout analysis and OCR
  - Metadata quality assessment
  - Named entity recognition in heritage texts
- Spaces: tools for interactive exploration and demonstration
Get Involved
We welcome contributions! You can:
- Use our datasets and models
- Join the discussion on GitHub
- Contribute your own tools or data
- Share your work using BigLAM resources
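For the first item in the list above, here is a minimal sketch of finding BigLAM datasets programmatically. It assumes the organization's Hub namespace is `biglam` and that the `huggingface_hub` Python package is installed. The helper names `repo_url` and `list_org_datasets` are hypothetical and not part of any BigLAM tooling.

```python
from typing import List

# Assumption: the BigLAM organization's namespace on the Hugging Face Hub.
ORG = "biglam"

# Hub web URLs prefix datasets and Spaces with their type; models have no prefix.
_PREFIXES = {"dataset": "datasets/", "model": "", "space": "spaces/"}


def repo_url(repo_id: str, kind: str = "dataset") -> str:
    """Build the Hub web URL for a repository ID such as 'org/name'."""
    if kind not in _PREFIXES:
        raise ValueError(f"unknown repository kind: {kind!r}")
    return f"https://huggingface.co/{_PREFIXES[kind]}{repo_id}"


def list_org_datasets(org: str = ORG, limit: int = 10) -> List[str]:
    """Return up to `limit` dataset IDs published under `org` (needs network)."""
    # Imported lazily so that repo_url works without the package installed.
    from huggingface_hub import list_datasets

    return [info.id for info in list_datasets(author=org, limit=limit)]
```

Calling `list_org_datasets()` queries the Hub over the network. Each returned ID can be passed to `repo_url` to get a browsable link, or to a loader such as `datasets.load_dataset`, depending on the dataset's format.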
Why It Matters
Cultural heritage data is often underrepresented in machine learning. BigLAM helps address this by:
- Supporting inclusive and responsible AI
- Helping institutions experiment with ML for access, discovery, and preservation
- Ensuring that ML systems reflect diverse human knowledge and expression
- Developing tools and methods that work well with the unique formats, values, and needs of LAMs
Empowering AI with the richness of human culture.