# Evaluating Models for Entity Linking with Datasets
This leaderboard competition evaluates models based on a Top5uptoD relevance ranking, assuming that each publication contains D <= 5 datasets. This approach prioritizes precision while taking into account the variable number of datasets per publication. In the case of D = 3 datasets, if the Top4 contains all 3 datasets, then nothing beyond the 3rd ranked relevant item will be considered. In other words, this approach does not penalize the ranking past the point at which all D corpora have been discovered in the rank-ordered results. If all D datasets do not appear in the Top5, the ranking reverts to a plain Top5 error.
To illustrate the relevance ranking, consider the sketch below.
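Here is a minimal sketch of one plausible implementation of the metric, inferred from the description above and from the call signature used in the cross-validation code below; the exact scoring used by the official leaderboard may differ:

```python
def top5UptoD_err(true_labels, ranked_preds):
    """One plausible reading of the Top5uptoD error (hypothetical sketch).

    true_labels:  the D (<= 5) datasets actually cited by the publication
    ranked_preds: predicted dataset labels, ordered by decreasing confidence
    """
    relevant = set(true_labels)
    found = set()
    for rank, pred in enumerate(ranked_preds[:5], start=1):
        if pred in relevant:
            found.add(pred)
        if found == relevant:
            # all D datasets discovered: score precision over this prefix
            # only, so items past this rank are not penalized
            return 1.0 - len(found) / rank
    # not all D datasets appear in the Top5: revert to a Top5 error
    return 1.0 - len(found) / 5

# worked example with D = 3: all 3 datasets appear by rank 4, so only
# the Top4 prefix is scored, giving an error of 1 - 3/4 = 0.25
print(top5UptoD_err(['ds1', 'ds2', 'ds3'], ['ds1', 'ds2', 'x', 'ds3', 'y']))
```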
If the modeling approach permits, each predicted dataset should be accompanied by an estimate of the uncertainty of that prediction.
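For example, the `my_predict()` helper assumed in the sample code below returns `(dataset label, confidence)` pairs ranked by decreasing confidence; the labels and scores here are purely illustrative:

```python
# purely illustrative: a hypothetical Top5 prediction for one publication,
# as (dataset label, estimated confidence) pairs in decreasing confidence
preds = [
    ('dataset-042', 0.91),
    ('dataset-173', 0.64),
    ('dataset-007', 0.38),
    ('dataset-519', 0.22),
    ('dataset-311', 0.09),
]

labels_only = [p[0] for p in preds]  # strip the confidences for scoring
```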
To calculate the aggregate precision for correct datasets among the Top5uptoD entries across all publications in the corpus, use a 5-fold cross-validation average. In the following sample code, assume that `my_train()` trains a model, `my_predict()` uses that model to predict dataset labels for a publication, and `top5UptoD_err()` calculates the Top5uptoD error described above:
```python
from statistics import mean

from sklearn.model_selection import KFold

# pub_contexts: one text context per publication
# pub_labels: the list of true dataset labels for each publication
kf = KFold(n_splits=5, shuffle=True, random_state=2019)
cv_errs = []

for fold, (train_index, test_index) in enumerate(kf.split(pub_contexts)):
    print(f'fold {fold}')
    X_train = [pub_contexts[i] for i in train_index]
    X_test = [pub_contexts[i] for i in test_index]
    y_train = [pub_labels[i] for i in train_index]
    y_test = [pub_labels[i] for i in test_index]

    model = my_train(X_train, y_train)

    # score each held-out publication with the Top5uptoD error
    errs = []
    for i, context in enumerate(X_test):
        preds = my_predict(context, model, 5)  # (label, confidence) pairs
        errs.append(top5UptoD_err(y_test[i], [p[0] for p in preds]))

    cv_errs.append(mean(errs))
    print(f'top5UptoD error rate: {mean(errs)}')

print(f'aggregate precision: {1.0 - mean(cv_errs)}')
```
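Note that fixing `random_state` in the `KFold` constructor makes the fold assignment reproducible, so per-fold error rates and the aggregate precision remain comparable across runs and across submissions.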
kudos: @philipskokoh