
Evaluation not correct #1

Closed · sohaibumar62 opened this issue Apr 1, 2019 · 7 comments

@sohaibumar62 commented Apr 1, 2019

Hi,
I am working on improving NUSCCF, and your code was a great help. However, the evaluation values (recall, precision) come out very low on ML_100k. I have read the NUSCCF research paper (http://dx.doi.org/10.1016/j.eswa.2017.04.027), but my values do not match the ML_100k results reported in its table. Help with the code would be appreciated.

Evaluation reported in the paper for the ML_100k dataset:
Accuracy 81.56
Precision 61.24
Recall 13.43

With your code:
recall 0.012
precision 0.01829407729247656
f1_measure 0.0072465956097708265

@soran-ghaderi (Owner) commented Apr 17, 2019

(Quoting @sohaibumar62's original post.)

Hi,
Since this algorithm is unsupervised, the evaluation criteria should be calculated in a different way. I therefore strongly recommend having a look at https://arxiv.org/pdf/1010.0725.pdf (on calculating AUC and precision) - hopefully you can get the idea behind the concept from there!

@soran-ghaderi pinned and self-assigned this issue Apr 17, 2019
@soran-ghaderi reopened this Apr 18, 2019
@sohaibumar62 (Author)

What if we divide the test set according to the ratings (uninterested, NeiNorInterested, Interested)? Can we apply precision and recall that way?

@soran-ghaderi (Owner)

(Quoting the previous comment.)

It's already divided, and since the results are averaged in the end, nothing will change.

@zhuhaif commented Mar 10, 2024

(Quoting @sohaibumar62's original post.)

I have the same problem. Is there a specific way to solve it?

@soran-ghaderi (Owner)

(Quoting @zhuhaif's comment.)

Hi, from this paper:

"AUC.— Provided the rank of all non-observed links, the AUC value can be interpreted as the probability that a randomly chosen missing link (i.e., a link in $E^P$) is given a higher score than a randomly chosen nonexistent link (i.e., a link in $U - E$). In the algorithmic implementation, we usually calculate the score of each non-observed link instead of giving the ordered list, since the latter task is more time-consuming. Then, at each time we randomly pick a missing link and a nonexistent link to compare their scores; if among $n$ independent comparisons, there are $n'$ times the missing link having a higher score and $n''$ times they have the same score, the AUC value is $(n' + 0.5\,n'')/n$."
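A minimal sketch of that sampling procedure, just to illustrate the idea (the function name and the NumPy-array inputs are my own assumptions, not code from this repo):

```python
import numpy as np

def auc_link_prediction(scores_missing, scores_nonexistent, n_comparisons=10000, seed=None):
    """Sampling-based AUC from arXiv:1010.0725 (illustrative sketch).

    scores_missing: predicted scores for the probe-set links (E^P).
    scores_nonexistent: predicted scores for nonexistent links (U - E).
    """
    rng = np.random.default_rng(seed)
    # n independent comparisons: pair a random missing link with a random nonexistent one.
    m = rng.choice(scores_missing, size=n_comparisons)
    u = rng.choice(scores_nonexistent, size=n_comparisons)
    n_higher = np.count_nonzero(m > u)   # n': missing link scored higher
    n_equal = np.count_nonzero(m == u)   # n'': scores tied
    return (n_higher + 0.5 * n_equal) / n_comparisons
```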

precision (from the paper):

"Precision.— Given the ranking of the non-observed links, the Precision is defined as the ratio of relevant items selected to the number of items selected. That is to say, if we take the top-$L$ links as the predicted ones, among which $L_r$ links are right (i.e., there are $L_r$ links in the probe set $E^P$), then the Precision equals $L_r/L$."

Please modify the evaluation accordingly and let me know.

@zhuhaif commented Mar 12, 2024

(Quoting @soran-ghaderi's previous reply.)

I took the top-L predictions (L from 1 to 100) for evaluation, and the best precision (at L = 5) is only 0.02241953385127636.
I compared each user's predictions to the test set; for most users all predictions are wrong (Lr = 0). Are there any specific ways to improve Lr?
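For reference, this is roughly the procedure I used, as a sketch (the dictionary layout is my assumption about the data, not code from this repository):

```python
def mean_precision_at_L(ranked_items_by_user, test_items_by_user, L=5):
    """Average per-user precision@L (illustrative sketch; data layout assumed).

    ranked_items_by_user: user -> items sorted by predicted score, descending.
    test_items_by_user: user -> set of held-out (test-set) items.
    """
    precisions = []
    for user, ranked in ranked_items_by_user.items():
        held_out = test_items_by_user.get(user, set())
        if not held_out:
            continue                                         # no test interactions for this user
        hits = sum(item in held_out for item in ranked[:L])  # Lr for this user
        precisions.append(hits / L)
    return sum(precisions) / len(precisions) if precisions else 0.0
```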

@soran-ghaderi (Owner)

(Quoting @zhuhaif's previous comment.)

Thanks for the update.

I would suggest printing the outputs step by step to see at which step the code returns wrong values.

At this point, unfortunately, I'm tight on time, so I might not be able to fix this in the foreseeable future.

Apparently this is not the latest (correct) version of the code, and I noticed that helper.py has some typos in its docstring quotation marks, which I presume you've already corrected.
