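Split the 1000 sample points created in the article "TinyML 4.3.2 Generating Data (Create Sin)" into a training set, a validation set, and a test set: 60% for training, 20% for validation, and 20% for testing.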
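As a quick check of the arithmetic behind these percentages, the sketch below computes the two split boundaries and the resulting set sizes. It assumes `SAMPLES = 1000`, as stated above; the names `train_end`, `test_end`, and `sizes` are hypothetical and used only for illustration.

```python
SAMPLES = 1000  # assumed: the number of points generated in the earlier article

# Index where the training portion ends (60% of the data)
train_end = int(0.6 * SAMPLES)
# Index where the next 20% portion ends
test_end = int(0.2 * SAMPLES + train_end)

# Sizes of the three consecutive chunks
sizes = (train_end, test_end - train_end, SAMPLES - test_end)
print(sizes)  # (600, 200, 200)
```

The three sizes sum to `SAMPLES`, which is the same property the `assert` statements in the notebook code verify.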
Create a new cell and enter the following code:
# We'll use 60% of our data for training and 20% for testing. The remaining 20%
# will be used for validation. Calculate the indices of each section.
TRAIN_SPLIT = int(0.6 * SAMPLES)
TEST_SPLIT = int(0.2 * SAMPLES + TRAIN_SPLIT)

# Use np.split to chop our data into three parts.
# The second argument to np.split is an array of indices where the data will be
# split. We provide two indices, so the data will be divided into three chunks.
x_train, x_test, x_validate = np.split(x_values, [TRAIN_SPLIT, TEST_SPLIT])
y_train, y_test, y_validate = np.split(y_values, [TRAIN_SPLIT, TEST_SPLIT])

# Double check that our splits add up correctly
assert (x_train.size + x_validate.size + x_test.size) == SAMPLES
assert (y_train.size + y_validate.size + y_test.size) == SAMPLES

# Plot the data in each partition in different colors:
plt.plot(x_train, y_train, 'b.', label="Train")
plt.plot(x_validate, y_validate, 'y.', label="Validate")
plt.plot(x_test, y_test, 'r.', label="Test")
plt.legend()
plt.show()
Output (colors used in the plot):
Blue: training set;
Yellow: validation set;
Red: test set.
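To see how `np.split` behaves when given two indices, here is a minimal standalone sketch on a ten-element array. The array `data` and the names `train_end` and `test_end` are hypothetical stand-ins chosen for illustration; they are not part of the original notebook.

```python
import numpy as np

# Hypothetical stand-in for x_values: ten values, 0 through 9
data = np.arange(10)

# Same boundary arithmetic as in the notebook, applied to 10 samples
train_end = int(0.6 * data.size)             # 6
test_end = int(0.2 * data.size + train_end)  # 8

# Two split indices produce three chunks:
# data[:6], data[6:8], and data[8:]
train, test, validate = np.split(data, [train_end, test_end])

print(train)     # [0 1 2 3 4 5]
print(test)      # [6 7]
print(validate)  # [8 9]
```

Because `np.split` cuts purely by position, each set is representative of the whole curve only if the samples are in random order. This sketch assumes that the x values were shuffled when the data was generated; if they were not, shuffling before splitting would be needed.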