Hi!
I have worked on some "institution string" -> country/state/city/institution_id models before, so I looked a bit into this code and noticed that you use two dense layers with only 2048 and 1024 neurons. The final softmax layer has 102,392 outputs. In my experience this can lead to suboptimal results: the 1024 neurons act as a bottleneck. They can of course encode far more than 100,000 classes in principle, but I suspect the results would benefit from larger layers.
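To make the bottleneck point a bit more concrete, here is a rough Keras-style sketch. This is not the repo's actual code; the 768-dimensional input, the variable names, and the widened variant are my own assumptions to illustrate the idea:

```python
# Rough sketch of the bottleneck concern; NOT the repo's actual code.
# The 768-dim input (e.g. a pooled DistilBERT embedding), layer names
# and the widened variant are illustrative assumptions.
import tensorflow as tf

NUM_CLASSES = 102392   # size of the final softmax, as in the current model
EMBEDDING_DIM = 768    # assumed encoder output size

inputs = tf.keras.Input(shape=(EMBEDDING_DIM,))

# Current setup as I understand it: 2048 -> 1024 -> 102392.
# Everything the 102392-way softmax needs has to pass through
# the comparatively narrow 1024-wide layer.
x = tf.keras.layers.Dense(2048, activation="relu")(inputs)
x = tf.keras.layers.Dense(1024, activation="relu")(x)
narrow_out = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
narrow_model = tf.keras.Model(inputs, narrow_out)

# Variant worth trying: widen the hidden layers so the layer feeding the
# softmax is less of a bottleneck (the exact sizes would need tuning).
y = tf.keras.layers.Dense(4096, activation="relu")(inputs)
y = tf.keras.layers.Dense(4096, activation="relu")(y)
wide_out = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(y)
wide_model = tf.keras.Model(inputs, wide_out)
```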
Some more comments:
- Separate models for country/city/institution_id perform better than a single model that only predicts institution_ids, mainly because:
  - some institutions are spread out over different cities and countries
  - some raw_affiliation_strings contain information about a city or country, but not about the specific institution
- Since the raw_affiliation_strings do not have the kind of complicated structure natural language has, DistilBERT might be overkill. I had very good results with word and character n-grams (a rough sketch follows this list).
- To disambiguate very information-poor strings like "department of Bology", other sources of information have to be connected to this data, e.g. the trajectories of the authors. We haven't done this yet.
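Here is a rough scikit-learn sketch of what I mean by separate per-target models on word and character n-grams. The affiliation strings, labels and hyperparameters below are made up for illustration, and the institution IDs are just placeholders:

```python
# Hedged sketch of a word/char n-gram baseline with one model per target.
# All data, labels and hyperparameters below are made up for illustration.
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def make_affiliation_classifier():
    # Word and character n-grams side by side; no deep model needed.
    features = FeatureUnion([
        ("word_ngrams", TfidfVectorizer(analyzer="word", ngram_range=(1, 2))),
        ("char_ngrams", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))),
    ])
    return Pipeline([
        ("features", features),
        ("clf", LogisticRegression(max_iter=1000)),
    ])

strings = [
    "Dept. of Physics, University of Vienna, Austria",
    "Massachusetts Institute of Technology, Cambridge, MA, USA",
    "Department of Biology, Springfield",          # no country/institution info
]
# Separate label sets per target; None marks labels we simply do not have.
targets = {
    "country":        ["AT", "US", None],
    "city":           ["Vienna", "Cambridge", "Springfield"],
    "institution_id": ["I_VIENNA", "I_MIT", None],   # placeholder IDs
}

models = {}
for name, labels in targets.items():
    # Train each target only on rows where that label is known, so a string
    # with only city information still contributes to the city model.
    rows = [(s, lab) for s, lab in zip(strings, labels) if lab is not None]
    X, y = zip(*rows)
    models[name] = make_affiliation_classifier().fit(list(X), list(y))

print(models["city"].predict(["Institute of Something, Vienna"]))
```

In my experience this kind of linear n-gram setup is a strong and much cheaper baseline for these short, formulaic strings.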
Thank you for providing all this data and code openly! This is great!