Preprocessing? As in... more programmatic fetching? (For paginated endpoints) #38
-
Problem: Inefficient fetching of data from multiple sources. Particularly with paginated endpoints. Inefficient by:
Context: I figured out how to run multiple steps by following the COVID example. And... while I do have it working right now, it's a little less than ideal, because 2 of them are fetching data from paginated endpoints, meaning I have a bunch of this happening:

# 0000-0099
- name: Fetch data 0000-0049
  uses: githubocto/flat@v3
  with:
    http_url: https://api.opensea.io/api/v1/assets?asset_contract_address=0x585a2c37858d3b03824bc683829e4dbbf58969ee&order_direction=asc&offset=0&limit=50
    downloaded_filename: flatdata/opensea_junks_00a.json
- name: Fetch data 0050-0099
  uses: githubocto/flat@v3
  with:
    http_url: https://api.opensea.io/api/v1/assets?asset_contract_address=0x585a2c37858d3b03824bc683829e4dbbf58969ee&order_direction=asc&offset=50&limit=50
    downloaded_filename: flatdata/opensea_junks_00b.json
# this endpoint only returns 50, and I need to fetch 2000... XD

The second one is more tedious because I have to manually specify the pagination using block numbers:

# 12153412-12355753
- name: Fetch data etherscan logs
  uses: githubocto/flat@v3
  with:
    http_url: https://api.etherscan.io/api?module=logs&action=getLogs&fromBlock=12153412&toBlock=12355753&address=0x585A2C37858D3B03824BC683829e4DBbF58969ee&apikey=APIKEY
    downloaded_filename: flatdata/etherscan_logs_00.json
# this returns 1000, so I only have a couple of them

Then I have several postprocessing steps to parse, extract, filter, and merge the data for how I need it. I understand that I am... slightly pushing the limits of what Flat Data was meant to do. But what I would really like to be able to do is write up a JS file, kind of like the postprocessing one, that

(Then run the postprocessing normally, to merge it all back together)

Any input or advice is very welcome!

PS. I was also hoping to specify different schedules for each of the endpoints... so I tried making several yml files, but it created some issues with git commit/merging, so I moved them back into a single file with one schedule.

PPS. @Wattenberger big fan of your work! I absolutely love this Flat Data project & I already have 7! different use-cases for it across multiple projects.
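For illustration, here is a minimal sketch of how the repeated steps above could be generated in code instead of written out by hand. The helper names (`buildOffsetUrls`, `splitBlockRange`, `buildLogUrls`) and the 100000-block window size are hypothetical, not part of Flat or of either API. Only the URL parameters are taken from the workflow above.

```typescript
// Hypothetical helper: one URL per page for an offset/limit style endpoint.
// `baseUrl` is assumed to already contain a query string (as the OpenSea URL does).
function buildOffsetUrls(baseUrl: string, total: number, limit: number): string[] {
  const urls: string[] = [];
  for (let offset = 0; offset < total; offset += limit) {
    urls.push(baseUrl + "&offset=" + offset + "&limit=" + limit);
  }
  return urls;
}

// Hypothetical helper: split an inclusive block range into fixed-size windows.
// The window size is a guess. If a window comes back with the full 1000 results,
// it probably needs to be split further to avoid missing logs.
function splitBlockRange(from: number, to: number, size: number): Array<[number, number]> {
  const ranges: Array<[number, number]> = [];
  for (let start = from; start <= to; start += size) {
    ranges.push([start, Math.min(start + size - 1, to)]);
  }
  return ranges;
}

// Hypothetical helper: one getLogs URL per block window.
function buildLogUrls(
  ranges: Array<[number, number]>,
  address: string,
  apiKey: string,
): string[] {
  return ranges.map(function (range) {
    return "https://api.etherscan.io/api?module=logs&action=getLogs" +
      "&fromBlock=" + range[0] + "&toBlock=" + range[1] +
      "&address=" + address + "&apikey=" + apiKey;
  });
}

const openseaBase =
  "https://api.opensea.io/api/v1/assets" +
  "?asset_contract_address=0x585a2c37858d3b03824bc683829e4dbbf58969ee" +
  "&order_direction=asc";

// 2000 items at 50 per page gives 40 URLs.
const openseaUrls = buildOffsetUrls(openseaBase, 2000, 50);

// Block numbers taken from the workflow; APIKEY is a placeholder, as in the original.
const logUrls = buildLogUrls(
  splitBlockRange(12153412, 12355753, 100000),
  "0x585A2C37858D3B03824BC683829e4DBbF58969ee",
  "APIKEY",
);
```

This only builds the URL lists. Fetching and writing the results would still have to happen in a script, since as far as I can tell each Flat step takes a single `http_url`.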
Replies: 2 comments 1 reply
-
I think I'm going to try writing my own action that runs Deno, like this: https://github.com/githubocto/flat-demo-covid-dashboard/blob/main/.github/workflows/flat.yml#L50-L51, which links together several programmed fetches.
-
You can just have Flat fetch a small file (like a 1-byte file) and discard it. The real fetching can then be done from the code itself. It works for me. I know I am quite late, but thought I should put it out here.
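A rough sketch of what "fetching from the code itself" might look like, assuming the code in question is the postprocessing script and that each page is shaped like `{ assets: [...] }` (an assumption based on the OpenSea endpoint in the question). `fetchAndMerge`, `AssetPage`, and `FetchJson` are hypothetical names. The fetch function is passed in, so the same logic works with Deno's built-in `fetch` or with a stub.

```typescript
// Assumed shape of one page of results; `assets` is a guess at the response key.
interface AssetPage {
  assets: unknown[];
}

// Any function that turns a URL into parsed JSON.
// In Deno this could be: (url) => fetch(url).then((res) => res.json())
type FetchJson = (url: string) => Promise<AssetPage>;

// Hypothetical core of a postprocessing script: it never reads the placeholder
// file that Flat downloaded, and instead fetches every page itself, in order.
async function fetchAndMerge(urls: string[], fetchJson: FetchJson): Promise<unknown[]> {
  const merged: unknown[] = [];
  for (const url of urls) {
    // One request at a time, to stay gentle on rate limits.
    const page = await fetchJson(url);
    for (const asset of page.assets) {
      merged.push(asset);
    }
  }
  return merged;
}
```

In a real script, the merged array would then be written out, for example with `Deno.writeTextFile("flatdata/merged.json", JSON.stringify(merged))`, where the file name is only a placeholder. The small file Flat fetched can simply be ignored or deleted.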