Training Data

Fundamentals

The examples a machine learning model learns from during training.


Training data determine a model’s capabilities and limitations.

  • Quality criteria: Representativeness, low noise, accurate labels, sufficient coverage.
  • Risks: Bias, duplicates/leakage, inappropriate content.
  • Practice: Curate, deduplicate, balance, and document datasets (data cards).
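As a rough illustration of the deduplication, leakage, and balance checks listed above, here is a minimal Python sketch. It assumes a dataset is a list of (text, label) pairs and treats two texts as duplicates when they match after lowercasing and whitespace collapsing. The function names (normalize, fingerprint, deduplicate, find_leakage, label_counts) are hypothetical and chosen for this example; real pipelines often also need near-duplicate detection, which this sketch does not attempt.

```python
import hashlib
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def fingerprint(text: str) -> str:
    """Hash the normalized text; equal fingerprints mean exact duplicates
    under the normalization above (an assumption of this sketch)."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def deduplicate(examples):
    """Keep the first occurrence of each normalized text.

    examples: list of (text, label) pairs.
    """
    seen = set()
    unique = []
    for text, label in examples:
        key = fingerprint(text)
        if key not in seen:
            seen.add(key)
            unique.append((text, label))
    return unique


def find_leakage(train, test):
    """Return test examples whose normalized text also appears in train."""
    train_keys = {fingerprint(text) for text, _ in train}
    return [(text, label) for text, label in test
            if fingerprint(text) in train_keys]


def label_counts(examples):
    """Count examples per label as a quick balance check."""
    return Counter(label for _, label in examples)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    train = [("Good movie", "pos"), ("good   movie", "pos"), ("Bad plot", "neg")]
    test = [("BAD PLOT", "neg"), ("Fine acting", "pos")]

    clean_train = deduplicate(train)
    print(clean_train)
    print(find_leakage(clean_train, test))
    print(label_counts(clean_train))
```

Hashing normalized text keeps the check cheap and order-preserving, but it only catches exact matches after normalization. Paraphrases or lightly edited copies would pass through, so the result is best read as a lower bound on duplication and leakage.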