How JSON is Being Written + Manually Stopping Collection #8
Comments
Re 1: I've redone the script and it now outputs to a CSV file. I found this approach much better for dumping the data into a SQLite db (https://pypi.org/project/csv-to-sqlite/).
Re 2: This can be fixed with a signal handler, such as the one I used here: https://github.com/KonradIT/parler-py-api/blob/master/experiments/00_suggested_hashtags.py#L26
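A minimal sketch of the signal-handler idea follows. It is not the code from the linked experiment: the handler only sets a flag, and the loop checks that flag between items, so each post and its links are written together before the script stops. The names `collect`, `items`, `write_post`, and `write_links` are hypothetical stand-ins for whatever the real script iterates over and writes with.

```python
import signal

stop_requested = False


def request_stop(signum, frame):
    """Record the stop request instead of exiting in the middle of a write."""
    global stop_requested
    stop_requested = True


def install_handler():
    """Route Ctrl+C (SIGINT) to request_stop; call this from the main thread."""
    signal.signal(signal.SIGINT, request_stop)


def collect(items, write_post, write_links):
    """Write each item's post and links as a pair; stop only between items.

    `items`, `write_post` and `write_links` are assumptions for illustration,
    not part of the parler-py-api library.
    """
    written = 0
    for item in items:
        if stop_requested:
            break
        write_post(item["post"])
        write_links(item["links"])
        written += 1
    return written
```

Calling `install_handler()` once at startup means that pressing Ctrl+C lets the current item finish, so the Posts and Links files end at the same count.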
The script has now been adjusted to write to a CSV file. Please pull from origin and let me know if it works.
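For illustration, appending rows to a CSV file with the standard library can look roughly like this. The column names in `FIELDS` and the `append_rows` helper are assumptions made for the example, not the columns or functions the adjusted script actually uses.

```python
import csv
import os

# Assumed column names, for illustration only.
FIELDS = ["id", "body", "link"]


def append_rows(path, rows, fields=FIELDS):
    """Append dict rows to a CSV file, writing the header only once."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        # extrasaction="ignore" drops keys that are not listed in `fields`.
        writer = csv.DictWriter(f, fieldnames=fields, extrasaction="ignore")
        if write_header:
            writer.writeheader()
        writer.writerows(rows)
```

Because every row is a complete line, a file written this way stays readable even if the script is interrupted between batches.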
Hi @taylorbreannaray, can you confirm the fixes work for your use case?
Hi @KonradIT, so sorry for taking a while to get back to you on that! Yes, the modifications you made seemed to work just fine back then. However, I am revisiting the API and want to collect more now that Parler is back up. I essentially didn't change anything besides the hashtags being collected (to reflect the latest data), and files aren't being written like they once were. I also noticed that the JST cookie seems to expire within minutes. Is that going to be a problem for collecting data?
Hi, my JST+MST keypair hasn't expired yet. In fact, I've been collecting QAnon-related messages for a few days with no modifications to the library and haven't gotten any unauthorized error responses.
Hmm... Interesting. I never had the issue when using your API prior either. However, now when I look at my JST and MST values for Parler in Chrome, it has my JST set to expire 5 minutes from when it was created. The MST isn't a problem, as it has it set to expire 2 months from the creation time. Is this a Chrome thing?
Ah, yes, my API will refresh the JST using the MST value if it has expired.
Hi @KonradIT,
First off, thanks for your help before with the prior issue I opened. Second, thank you for developing this unofficial API in the first place—it has been of great use to me!
Now, to get to my point, I am still utilizing your experiment 02_multiple_hashtags.py. I have run into a couple of issues along the way and was wondering if you had any advice/clarification that you could provide:
1. When the posts and links are being written to JSON files, a closing square bracket (i.e., "]") is always written at arbitrary places in each of the files. This, in turn, produces a JSON decoding error claiming that the decoder expected the end of the file (even though it clearly isn't the end yet). For now, I have been going into the files myself and removing these brackets so that the files decode as proper JSON.
2. When I manually stop the data collection before it has retrieved all possible posts, it will often write more objects to the Links file than it did to the corresponding Posts file. This results in an uneven ratio of links to posts. Is there a way I can ensure that it stops writing at the same number for each?
This might just be something I have to fix myself if I am going to stop the script at random times. Not sure though.
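The manual cleanup described in the first point could be automated along these lines. This is only a sketch, and it rests on an unconfirmed assumption: that the stray "]" appears because several complete JSON arrays were written back to back in one file. `load_concatenated_arrays` is a hypothetical helper, not something the library provides.

```python
import json


def load_concatenated_arrays(text):
    """Parse JSON values written back to back and return one flat list.

    Assumes the input is a sequence of complete JSON values (typically
    arrays) with nothing but optional whitespace between them.
    """
    decoder = json.JSONDecoder()
    items = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        if isinstance(value, list):
            items.extend(value)
        else:
            items.append(value)
    return items
```

If the files turn out to be damaged in some other way, `raw_decode` raises `json.JSONDecodeError` at the offending position, which at least pinpoints where the stray bracket sits.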
Thanks!