How JSON is Being Written + Manually Stopping Collection #8

Open

taylorbreannaray opened this issue Dec 11, 2020 · 7 comments

taylorbreannaray commented Dec 11, 2020

Hi @KonradIT,

First off, thanks for your help before with the prior issue I opened. Second, thank you for developing this unofficial API in the first place—it has been of great use to me!

Now, to get to my point, I am still utilizing your experiment 02_multiple_hashtags.py. I have run into a couple of issues along the way and was wondering if you had any advice/clarification that you could provide:

  1. When the posts and links are being written to JSON files, a closing square bracket (i.e., "]") is always written at arbitrary places in each of the files. This, in turn, produces a JSON decoding error claiming that the end of the file was expected (even though it clearly isn't the end yet). For now, I have been going into the files myself and removing these brackets so that the contents decode as proper JSON (a rough sketch of that cleanup is below).

  2. When I manually stop the data collection before it has retrieved all possible posts, it often writes more objects to the Links file than to the corresponding Posts file, leaving an uneven ratio of links to posts. Is there a way I can ensure that it stops writing at the same count for each?

This might just be something I have to fix myself if I am going to stop the script at random times. Not sure though.
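
The cleanup I mentioned under point 1 looks roughly like this (just a sketch; it assumes each stray "]" sits on its own line, which is how they show up in my files):

```python
import json

def clean_json_file(path):
    """Drop stray ']' lines written before the real end of the file."""
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()

    # Keep the final closing bracket; drop any ']' that appears early.
    body = [line for line in lines[:-1] if line.strip() != "]"]
    body.append(lines[-1])

    data = json.loads("".join(body))  # still raises if anything else is malformed
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
    return data
```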

Thanks!

@KonradIT (Owner)

Re: 1

I've redone the script so that it now outputs to a CSV file; I found this approach much better for dumping the data into a SQLite db (https://pypi.org/project/csv-to-sqlite/).
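
Roughly the idea (a sketch, not the script's actual code; the records and column names here are made up):

```python
import csv

# Stand-in records and columns -- the real script derives these
# from the Parler API responses.
posts = [
    {"id": "1", "username": "alice", "body": "hello", "created_at": "2020-12-11"},
]
fields = ["id", "username", "body", "created_at"]

with open("posts.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fields)
    writer.writeheader()
    for post in posts:
        writer.writerow({k: post.get(k, "") for k in fields})
```

The resulting CSV can then be loaded with something like `csv-to-sqlite -f posts.csv -o posts.sqlite` (check the package's docs for the exact flags).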

Re: 2

This can be fixed with a signal handler, such as the one I used here: https://github.com/KonradIT/parler-py-api/blob/master/experiments/00_suggested_hashtags.py#L26
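
The pattern there is roughly this (a minimal sketch, not the exact code at that line):

```python
import csv
import signal
import sys

out = open("posts.csv", "w", newline="", encoding="utf-8")
writer = csv.writer(out)

def stop(signum, frame):
    # Flush and close before exiting, so the last written row is
    # complete and the posts/links files stay in step.
    out.flush()
    out.close()
    sys.exit(0)

signal.signal(signal.SIGINT, stop)  # Ctrl+C now runs stop() instead of raising

while True:
    writer.writerow(["..."])  # the fetch-and-write loop goes here
```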

@KonradIT (Owner)

The script has now been adjusted to write to a CSV file; please pull from origin and let me know if it works.

KonradIT commented Jan 1, 2021

Hi @taylorbreannaray, can you confirm the fixes work for your use case?

@taylorbreannaray (Author)

Hi @KonradIT,

So sorry for taking a while to get back to you on that! Yes, the modifications you made seemed to work just fine back then. However, I am revisiting the API and want to collect more now that Parler is back up. I essentially didn't change anything besides the hashtags being collected from (to reflect the latest data), yet the files aren't being written like they once were.

I also noticed that the JST cookie seems to expire within minutes - is that going to be a problem for collecting data?

KonradIT commented Mar 5, 2021

Hi, my JST+MST keypair hasn't expired yet. In fact, I've been collecting QAnon-related messages for the past few days with no modifications to the library, and I haven't gotten any unauthorized error responses.

@taylorbreannaray (Author)

Hmm... interesting. I never had this issue when using your API before, either. However, now when I look at my JST and MST values for Parler in Chrome, the JST is set to expire 5 minutes after it was created. The MST isn't a problem, as it is set to expire 2 months after creation. Is this a Chrome thing?

KonradIT commented Mar 5, 2021

Ah, yes, my API will refresh the JST using the MST value if it has expired.
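
The general pattern is something like this (a rough sketch, not the library's actual internals; the refresh endpoint and cookie names here are hypothetical):

```python
import requests

def get_with_refresh(session, url, tokens):
    """Retry a request once after refreshing the short-lived JST via the MST."""
    cookies = {"jst": tokens["jst"], "mst": tokens["mst"]}
    resp = session.get(url, cookies=cookies)
    if resp.status_code == 401:  # JST expired
        # Hypothetical refresh call: present the long-lived MST and
        # receive a fresh JST cookie back, then retry the request.
        refreshed = session.get("https://api.example.com/refresh",
                                cookies={"mst": tokens["mst"]})
        tokens["jst"] = refreshed.cookies.get("jst", tokens["jst"])
        resp = session.get(url, cookies={"jst": tokens["jst"],
                                         "mst": tokens["mst"]})
    return resp
```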
