Metrics: Connection Score vs Internal Score High Level Discussion #29

Open
lomky opened this issue Jun 27, 2018 · 2 comments


@lomky
Collaborator

lomky commented Jun 27, 2018

For 7/11.

At a high level, what do we want to consider for the internal score of an object? For the connection score? What exceptions do we want to make for unusual objects (e.g., contributor, publication)?

@lomky
Collaborator Author

lomky commented Jul 11, 2018

The main wrench in the works here is that not every distinct table should be considered a connection. Here is a list running from the clearest "internal, separate table" cases to the "maybe?" cases:

Seem likely to be more internal than connection:

  • Country
  • Org Alt Names
  • Org Type
  • role_type
  • publication_type
  • report_type

Less Certain Internal vs Connection:

  • File
  • GCMD Keywords
  • region
  • Array
  • Contributor
  • Org Relationships
  • Publication object

Maybe we should consider Connections only between certain high-level types, i.e. Publication to Publication, Contributor to Contributor, and Contributor to Publication?
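
To make the proposal above concrete, here is a minimal sketch of what "connections only between certain high-level types" could look like. The type names and the `counts_as_connection` helper are hypothetical and not taken from the project's code; the sketch simply treats any unordered pair drawn from the two high-level types as a valid connection.

```python
from itertools import combinations_with_replacement

# Hypothetical labels for the two high-level types named in the proposal.
HIGH_LEVEL_TYPES = ("publication", "contributor")

# Unordered pairs: publication-publication, publication-contributor,
# contributor-contributor. frozenset makes the pair order-independent.
ALLOWED_PAIRS = {
    frozenset(pair)
    for pair in combinations_with_replacement(HIGH_LEVEL_TYPES, 2)
}


def counts_as_connection(type_a: str, type_b: str) -> bool:
    """Return True if a link between the two types would be scored as a connection."""
    return frozenset((type_a, type_b)) in ALLOWED_PAIRS
```

Under this sketch, a link from a Publication to a Country would fall through to internal scoring rather than being counted as a connection.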

@lomky
Collaborator Author

lomky commented Jul 11, 2018

Discussion notes

We agree that the type tables (Org Type, role_type, publication_type, report_type) and Country are internal. The same goes for Org Alt Names.

Files
Since these are relatively flat objects that either exist or don't exist, we think they should go in the internal score.
For example, a URL and a file are about equivalent provenance-wise, so we want to compare them apples to apples.

Array
Arrays should be scored as a connection rather than as internal to the Table, since Arrays have semantic connections available to them.

GCMD Keywords & Region
These help "connect" things, but not in a provenance sense.
We think these should be scored both internally and as connections.

Contributor Object
This shouldn't have an internal score, as it is the embodiment of the connection score for a person-org-publication link.

Publication Object
This shouldn't have an internal score, as it is the embodiment of the connection score between the publication entities and the things that can be connected to them.

Organization Relationship Object
This shouldn't have an internal score, as it is the embodiment of the connection score between two organizations.
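
As a rough summary of the decisions above, the classification could be captured in a lookup table like the one below. This is only a sketch: the key names, the `Scoring` flag, and the `has_internal_score` helper are hypothetical and chosen for illustration, not taken from the project's schema.

```python
from enum import Flag, auto


class Scoring(Flag):
    """Which score(s) an object type contributes to."""
    INTERNAL = auto()
    CONNECTION = auto()


# Hypothetical keys; the assignments follow the discussion notes above.
SCORING = {
    "country": Scoring.INTERNAL,
    "org_alt_names": Scoring.INTERNAL,
    "org_type": Scoring.INTERNAL,
    "role_type": Scoring.INTERNAL,
    "publication_type": Scoring.INTERNAL,
    "report_type": Scoring.INTERNAL,
    "file": Scoring.INTERNAL,
    "array": Scoring.CONNECTION,
    "gcmd_keywords": Scoring.INTERNAL | Scoring.CONNECTION,
    "region": Scoring.INTERNAL | Scoring.CONNECTION,
    # These objects embody a connection, so they get no internal score.
    "contributor": Scoring.CONNECTION,
    "publication_object": Scoring.CONNECTION,
    "organization_relationship": Scoring.CONNECTION,
}


def has_internal_score(object_type: str) -> bool:
    """Return True if the object type is scored internally."""
    return Scoring.INTERNAL in SCORING[object_type]
```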

What do we mean by connection?

  • whether a specific connection exists
  • the internal score of the connected object, and how that should affect our perception of the current object
  • connections: we should have one score per type of connection (Images, References, Contributors)
  • optional connections: if an optional connection doesn't exist, that should not negatively affect the connection score. The field would be left out (e.g., if a chapter has no tables).
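
The points above can be sketched roughly as follows. Everything here is an assumption made for illustration: the function names, the use of the mean internal score as the per-type score, and the equal weighting in the overall score are placeholders, since the actual weights have not been decided.

```python
from typing import Dict, List, Optional, Set


def per_type_scores(
    connections: Dict[str, List[float]],
    optional_types: Set[str],
) -> Dict[str, float]:
    """Compute one score per connection type.

    `connections` maps a connection type to the internal scores of the
    connected objects. A missing optional connection is left out entirely;
    a missing required connection scores 0.0.
    """
    scores: Dict[str, float] = {}
    for conn_type, internal_scores in connections.items():
        if not internal_scores:
            if conn_type in optional_types:
                continue  # optional and absent: leave the field out
            scores[conn_type] = 0.0  # required and absent
        else:
            # Assumption: the per-type score is the mean internal score.
            scores[conn_type] = sum(internal_scores) / len(internal_scores)
    return scores


def overall_connection_score(scores: Dict[str, float]) -> Optional[float]:
    """Combine per-type scores with equal weights (placeholder weighting)."""
    if not scores:
        return None
    return sum(scores.values()) / len(scores)


# A chapter with two images, no references, and no tables (tables optional).
chapter = {"images": [0.5, 1.0], "references": [], "tables": []}
scores = per_type_scores(chapter, optional_types={"tables"})
```

In this example `scores` contains entries for `images` and `references` only; the absent `tables` connection is omitted rather than counted as zero, so it does not drag the overall score down.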

NB: Should we have the Figure store its number of panels?

Still to do: come up with the weights for the various connections.

lomky removed their assignment Jun 25, 2019