(2) a listing of all publicly available training data and where to obtain it; and (3) a listing of all training data obtainable from third parties and where to obtain it, including for fee.
imo this also includes “the data must be open”. The data used for training must be obtainable.
The description mentioned open source ai definition doesnt require data to be open. But https://opensource.org/ai/open-source-ai-definition explicitly require the data to be open?
no it doesn’t. it only says the weights and information about the training data must be open, not the training data itself. which is honestly useless.
Data information == metadata?
imo this also includes “the data must be open”. The data used for training must be obtainable.