Training Data
Fundamentals
The examples a machine learning model learns from during training.
Training data largely determine a model’s capabilities and limitations.
- Quality criteria: Representativeness, low noise, accurate labels, sufficient coverage.
- Risks: Bias, duplicates/leakage, inappropriate content.
- Practice: Curate, deduplicate, balance, and document datasets (data cards).
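
The deduplication and leakage points above can be illustrated with a minimal sketch. It assumes text examples held in plain Python lists and treats two examples as duplicates when they match after lowercasing and whitespace normalization. The helper names (`normalize`, `fingerprint`, `deduplicate`, `find_leakage`) and the sample sentences are hypothetical, chosen only for illustration. Real pipelines often also need near-duplicate detection, which this sketch does not cover.

```python
import hashlib


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivial variants compare equal.
    return " ".join(text.lower().split())


def fingerprint(text: str) -> str:
    # Hash the normalized text so only a short digest is stored per example.
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def deduplicate(examples: list[str]) -> list[str]:
    # Keep the first occurrence of each fingerprint, preserving order.
    seen = set()
    unique = []
    for example in examples:
        fp = fingerprint(example)
        if fp not in seen:
            seen.add(fp)
            unique.append(example)
    return unique


def find_leakage(train: list[str], test: list[str]) -> list[str]:
    # Return test examples whose fingerprint also appears in the training set.
    train_fps = {fingerprint(example) for example in train}
    return [example for example in test if fingerprint(example) in train_fps]


# Hypothetical toy data: one duplicate inside train, one example shared with test.
train = ["The cat sat.", "the  cat sat.", "Dogs bark."]
test = ["Dogs  bark.", "Birds sing."]

clean_train = deduplicate(train)
leaked = find_leakage(clean_train, test)

print(clean_train)
print(leaked)
```

Hashing keeps memory use roughly constant per example, at the cost of catching only exact matches after normalization. Any test example reported by `find_leakage` would typically be dropped from one of the two splits before evaluation.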