This repository has been archived by the owner on Nov 30, 2023. It is now read-only.

ran Anonymizer.py file but no output #1

Open
aryansoni1108 opened this issue Feb 11, 2019 · 9 comments

Comments

@aryansoni1108

I just ran the Anonymizer.py file, but it seems to get stuck during processing. I am pretty new to this type of project, so please help me.

Adult data
['c:/Users/Aryan Soni/Downloads/Clustering_based_K_Anon-master/anonymizer.py']
K=10
Begin to K-Member Cluster based on NCP

This is the output I get, but nothing more is shown after it and cmd appears to hang.
Please help me.

@qiyuangong
Owner

Hi @aryansoni1108
It is not stuck! You didn't get any further output because this clustering-based algorithm is very slow (it runs on a single core in a single thread). It takes nearly 3 hours on my laptop (a 2017 15-inch MacBook Pro). You can get better performance with an optimized clustering algorithm. Alternatively, you can get a result sooner by using less data (for example, 1000 records of the adult data) or a larger k (20 or 50).

Adult data
['anonymizer.py']
K=10
Begin to K-Member Cluster based on NCP
NCP 11.20%
Running time 10744.34seconds
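
For anyone who wants to try the "less data" suggestion, here is a minimal sketch of drawing a 1000-record sample from a line-oriented data file. The function names, the `data/adult_1000.data` output path, and the idea of writing the subset to a separate file are assumptions for illustration, not part of this repository. You would still need to point the anonymizer at the smaller file yourself.

```python
import random


def sample_records(lines, n=1000, seed=0):
    """Return up to n non-empty lines chosen at random, kept in original order."""
    records = [line for line in lines if line.strip()]
    if len(records) <= n:
        return records
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    keep = sorted(rng.sample(range(len(records)), n))
    return [records[i] for i in keep]


def write_sample(src_path, dst_path, n=1000, seed=0):
    """Read src_path, write a sample of n records to dst_path (hypothetical helper)."""
    with open(src_path) as src:
        subset = sample_records(src.readlines(), n=n, seed=seed)
    with open(dst_path, "w") as dst:
        dst.writelines(subset)
    return len(subset)
```

A call such as `write_sample("data/adult.data", "data/adult_1000.data")` would leave the original file untouched. Keeping the sampled lines in their original order is only a convenience for comparing the subset against the full file.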

@dataExperimenter2019

Hi, a question: which adult and INFORMS datasets does anonymizer.py actually use? I want to try to cut down the processing time.

I've been running the algorithm (on INFORMS) for the past 4 hours with the data as is (from the GitHub download), and it still hasn't finished :(

Also, where can I find the optimized clustering algorithm on GitHub?

Thank you!

@qiyuangong
Owner

Hi, the datasets are in the data directory. adult.data is the adult dataset, while conditions.csv and demographics.csv make up the INFORMS dataset.

As for an optimized clustering algorithm, I think you can start from optimized k-means clustering. Try searching for those keywords in a search engine such as Google.

@FarihaHossain

Hi @qiyuangong,
This is a great initiative. Much appreciated.
When I run anonymizer.py, this problem occurs. Can you please help with this?

@qiyuangong
Owner

Hi @FarihaHossain
I think this error is caused by a mismatch between IS_CAT and ATT_TREES. It seems that an attribute that should be categorical is actually a NumRange.

Can you give me the exact command you ran?
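
A quick way to look for this kind of mismatch is to compare the two lists index by index. The sketch below assumes that IS_CAT is a list of booleans, that ATT_TREES holds one entry per attribute, and that numeric attributes are represented by a NumRange-like object. The `NumRange` class here is a hypothetical stand-in, not the repository's class, and `find_mismatches` is an illustrative name.

```python
class NumRange(object):
    """Hypothetical stand-in for a numeric range object."""

    def __init__(self, low, high):
        self.low = low
        self.high = high


def find_mismatches(is_cat, att_trees, num_range_type=NumRange):
    """Return indexes where the categorical flag disagrees with the tree type.

    An attribute is consistent when it is flagged categorical and its entry
    is NOT a numeric range, or flagged numeric and its entry IS one.
    """
    if len(is_cat) != len(att_trees):
        raise ValueError("IS_CAT and ATT_TREES have different lengths")
    mismatches = []
    for index, (flag, tree) in enumerate(zip(is_cat, att_trees)):
        looks_numeric = isinstance(tree, num_range_type)
        if bool(flag) == looks_numeric:
            mismatches.append(index)
    return mismatches
```

An empty result means every flag agrees with its entry under these assumptions. Any index returned is a candidate for the attribute causing the error.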

@shivjais13

Hi, the algorithm works fine with the adult data and produced a result in 3 hours, but it has been running for a day without producing any output for the INFORMS dataset with k = 20.
[python2 anonymizer.py i kmember 20]
That is the command I ran in the terminal, and it has been stuck on "Begin to K-Member Cluster based on NCP" for the past 20 hours.
Can you suggest an update or anything else I can do to get a result?

@FarihaHossain

> Hi @FarihaHossain
> I think this error is caused by a mismatch between IS_CAT and ATT_TREES. It seems that an attribute that should be categorical is actually a NumRange.
>
> Can you give me the exact command you ran?

Hi,
Thanks for the reply. I just cloned this repository and ran it. I changed nothing, and this problem appeared.

@qiyuangong
Owner

> > Hi @FarihaHossain
> > I think this error is caused by a mismatch between IS_CAT and ATT_TREES. It seems that an attribute that should be categorical is actually a NumRange.
> > Can you give me the exact command you ran?
>
> Hi,
> Thanks for the reply. I just cloned this repository and ran it. I changed nothing, and this problem appeared.

Hi,
I ran it in my environment. Everything works except for a saving error related to the INFORMS dataset.

Can you give me more details about your environment and the command you ran?

@qiyuangong
Owner

> Hi, the algorithm works fine with the adult data and produced a result in 3 hours, but it has been running for a day without producing any output for the INFORMS dataset with k = 20.
> [python2 anonymizer.py i kmember 20]
> That is the command I ran in the terminal, and it has been stuck on "Begin to K-Member Cluster based on NCP" for the past 20 hours.
> Can you suggest an update or anything else I can do to get a result?

Well, I will add some progress output for that step (maybe a progress bar).
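
Until that lands, here is a minimal sketch of what periodic progress output could look like if you patch the clustering loop yourself. It is not the change planned for this repository. The function names, the `every` interval, and the "records clustered" wording are assumptions for illustration.

```python
import sys
import time


def format_progress(done, total, started_at, now=None):
    """Build a one-line progress message from counts and timestamps."""
    now = time.time() if now is None else now
    percent = 100.0 * done / total if total else 100.0
    return "%d/%d records clustered (%.1f%%), %.1fs elapsed" % (
        done, total, percent, now - started_at)


def report_progress(done, total, started_at, every=100, stream=sys.stderr):
    """Write a progress line every `every` records and at the very end."""
    if done % every == 0 or done == total:
        stream.write(format_progress(done, total, started_at) + "\n")
        stream.flush()  # make the line visible immediately in cmd or a terminal
```

Calling `report_progress(done, total, started_at)` once per processed record inside the loop would print a line every 100 records. Writing to stderr keeps the messages separate from any results printed to stdout.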
