Target name mismatching in output #1

Open
shammimore opened this issue Jun 5, 2024 · 2 comments
Labels: bug (Something isn't working)

Comments

shammimore commented Jun 5, 2024

In row 8 of the target table below, the 'var' value is 4220292. After text mapping (with both GPT4Adapter and MPNetAdapter), however, the output table shows 4220292Â for the same target variable.

Source:

    var          desc
0   edmmtyp      Multiples Myelom (symptomatisch)
1   edmmtyp      Smouldering Myeloma (asymptomatisch)
2   edmmtyp      MGUS - monoklonale Gammopathie unklarer Signifikanz
3   edmmtyp      Solitäres Plasmozytom
4   edmmtyp      Plasmazell-Leukämie
5   *sympmws     Schmerzenim Bereich der mittleren WS
6   *sympuws     Schmerzen im Bereich der unteren WS
7   *sympknoch   Knochenschmerzen
8   *sympleist   Leistungsverlust
9   *sympmued    Müdigkeit
10  *sympschwae  Schwäche

Target:

    var      desc
0   437233   Multiple myeloma
1   4184985  Smoldering myeloma
2   4082463  Monoclonal gammopathy of uncertain significance
3   4216139  Plasmacytoma
4   133154   Plasma cell leukemia
5   4169580  Pain in spine
6   4169580  Pain in spine
7   4129418  Bone pain
8   4220292  Impaired psychomotor performance
9   4223659  Fatigue
10  437113   Asthenia

Output:

    Source Variable  Target Variable  Similarity
0   edmmtyp1         437233           0.898915
1   edmmtyp2         4184985          0.910589
2   edmmtyp3         4082463          0.903709
3   edmmtyp4         4216139          0.847672
4   edmmtyp5         133154           0.897126
5   *sympmws1        4169580          0.813263
6   *sympuws2        4169580          0.81713
7   *sympknoch       4169580          0.844623
8   *sympleist       4220292Â         0.790407
9   *sympmued        4223659          0.899128
10  *sympschwae      437113           0.828928

@shammimore (Author)

The change from 4220292 to 4220292Â only appears when the output is written to a CSV file. It does not occur when the output is written to an Excel file.
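
One possible explanation, not confirmed anywhere in this thread: the target value may carry a trailing non-breaking space (U+00A0). Written to CSV as UTF-8, that character becomes the bytes 0xC2 0xA0, and a viewer that decodes the file as Windows-1252 would show them as "Â" followed by an invisible space. An Excel file stores text as Unicode, so the same value would look clean there. The sketch below reproduces that effect with the standard library only and shows one way to work around it. The helper names (show_mojibake, clean_identifier, rows_to_csv_bytes) are hypothetical and not part of this project.

```python
import csv
import io

NBSP = "\u00a0"  # non-breaking space; invisible in most table viewers


def show_mojibake(value: str) -> str:
    """Encode as UTF-8, then decode as cp1252, as a mis-configured CSV viewer might."""
    return value.encode("utf-8").decode("cp1252")


def clean_identifier(value: str) -> str:
    """Replace non-breaking spaces with plain spaces and trim the ends."""
    return value.replace(NBSP, " ").strip()


def rows_to_csv_bytes(rows, encoding="utf-8-sig"):
    """Serialize rows to CSV bytes; utf-8-sig prepends a BOM so that
    spreadsheet tools are more likely to detect UTF-8."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode(encoding)


# Assumption: the target 'var' value ends with a hidden non-breaking space.
suspect = "4220292" + NBSP

garbled = show_mojibake(suspect)     # what a cp1252 viewer would display
cleaned = clean_identifier(suspect)  # identifier without hidden whitespace
csv_bytes = rows_to_csv_bytes([["Target Variable"], [cleaned]])

print(repr(suspect), repr(garbled), repr(cleaned))
```

If this is the cause, cleaning the 'var' column of the target table before mapping, or writing the CSV with a BOM, should make the stray character disappear. Checking the raw target value with repr() would confirm or rule out the assumption.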

tiadams self-assigned this on Jun 10, 2024
tiadams added the bug (Something isn't working) label on Jun 10, 2024
tiadams transferred this issue from SCAI-BIO/index on Jul 9, 2024
tiadams modified the milestones: v1.0.0, v0.5.0 on Oct 21, 2024
@mehmetcanay (Member)

Could you give me steps to reproduce this bug?

tiadams removed this from the v0.4.1 milestone on Nov 12, 2024