Small improvements to import performance #131

Merged
merged 8 commits into eyeonus:release/v1 from kfsone/import-perf
Apr 26, 2024


kfsone
Contributor

@kfsone kfsone commented Apr 26, 2024

There's a bunch of linting and fixing first, which helped me spot some actual problems. After that there's some tidy-up of eddbplugin that made it easier for me to reason about, and let me find a balance between building big lists in memory to attack the db with and making better use of the db's own inevitable cache.

This reduces import time from 40 minutes to 23 minutes on my Windows PC.

Please run this branch on other hardware yourself before merging.
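
The trade-off described above, between building large lists in memory and leaning on the database's own cache, can be illustrated with a minimal sketch. This is not the code from this PR: the `listing` table, its columns, the `batched` and `import_rows` helpers, and the batch size are all hypothetical, chosen only to show rows being fed to SQLite in moderately sized chunks inside a single transaction.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk


def import_rows(db, rows, batch_size=10_000):
    """Insert rows chunk by chunk inside one transaction.

    Only one chunk is held in memory at a time; SQLite's page cache
    absorbs the writes until the transaction commits.
    """
    total = 0
    with db:  # commits on success, rolls back on an exception
        for chunk in batched(rows, batch_size):
            db.executemany(
                "INSERT OR REPLACE INTO listing (station_id, item_id, price) "
                "VALUES (?, ?, ?)",
                chunk,
            )
            total += len(chunk)
    return total


# Hypothetical schema and synthetic data, for illustration only.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE listing ("
    "station_id INTEGER, item_id INTEGER, price INTEGER, "
    "PRIMARY KEY (station_id, item_id))"
)
rows = ((s, i, s * 10 + i) for s in range(100) for i in range(25))
inserted = import_rows(db, rows, batch_size=500)
print(inserted)
```

A smaller `batch_size` keeps less in Python memory per call; a larger one means fewer `executemany` calls. Where the balance lies depends on the hardware, which is presumably why testing on other machines is requested.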

kfsone added 3 commits April 26, 2024 01:33
Small tweaks, but they reduce the time to import the listings on my machine from 40 minutes to 23
@eyeonus
Owner

eyeonus commented Apr 26, 2024

Just got home from work. Running a comparison of `trade import -P eddblink -O clean` on this branch versus TD v11.0.3.

@eyeonus
Owner

eyeonus commented Apr 26, 2024

NOTE: Downloading file 'listings.csv'.
NOTE: Requesting https://elite.tromador.com/files/listings.csv
NOTE: Downloaded   2.5GB of gziped data  22.6MB/s
NOTE: Processing market data from listings.csv: Start time = 2024-04-26 03:50:03.790430. Live = True

Live should not be true on this one.

@eyeonus
Owner

eyeonus commented Apr 26, 2024

36 minutes to completion with this branch; 11.0.3 is still at 88% on the main listings file.

@eyeonus
Owner

eyeonus commented Apr 26, 2024

This branch:

NOTE: Processing market data from listings.csv: Start time = 2024-04-26 03:50:03.790430. Live = True
[================================================= ] 98% 43355318 / 44240120
NOTE: Optimizing database...
NOTE: Finished processing market data. End time = 2024-04-26 04:24:45.873538  
NOTE: Checking for update to 'listings-live.csv'.
NOTE: Downloading file 'listings-live.csv'.
NOTE: Requesting https://elite.tromador.com/files/listings-live.csv
NOTE: Downloaded  10.2MB of gziped data   7.4MB/s
NOTE: Processing market data from listings-live.csv: Start time = 2024-04-26 04:24:48.857846. Live = True
[================================================= ] 98% 184164 / 187922
NOTE: Optimizing database...
NOTE: Finished processing market data. End time = 2024-04-26 04:26:32.942536
NOTE: Import completed.

listings: ~35 minutes
listings-live: ~2 minutes

Total: ~37 minutes

TD v11.0.3:

NOTE: Processing market data from listings.csv: Start time = 2024-04-26 03:48:12.684702
NOTE: Finished processing market data. End time = 2024-04-26 04:34:42.974157 
NOTE: Checking for update to 'listings-live.csv'.
NOTE: Downloading file 'listings-live.csv'.
NOTE: Requesting https://elite.tromador.com/files/listings-live.csv
NOTE: Downloaded  10.3MB of gziped data   7.3MB/s
NOTE: Processing market data from listings-live.csv: Start time = 2024-04-26 04:34:45.877749
NOTE: Finished processing market data. End time = 2024-04-26 04:37:06.013385
NOTE: Import completed.

listings: ~46 minutes
listings-live: ~3 minutes

Total: ~49 minutes

Total time savings: 12 minutes
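
The totals can be cross-checked directly from the timestamps in the two logs above. The sketch below is only a convenience calculation (the `elapsed_minutes` helper is not part of TD); it measures wall-clock time from the start of `listings.csv` processing to the end of `listings-live.csv` processing, so it includes the few seconds spent downloading the live file.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_minutes(start: str, end: str) -> float:
    """Minutes between two timestamps in the format used by the import log."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


# Start of listings.csv processing to end of listings-live.csv processing.
branch_total = elapsed_minutes("2024-04-26 03:50:03.790430", "2024-04-26 04:26:32.942536")
v11_total = elapsed_minutes("2024-04-26 03:48:12.684702", "2024-04-26 04:37:06.013385")
saved = v11_total - branch_total

print(f"this branch: {branch_total:.1f} min")  # about 36.5
print(f"TD v11.0.3:  {v11_total:.1f} min")     # about 48.9
print(f"saved:       {saved:.1f} min")         # about 12.4
```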

@eyeonus
Owner

eyeonus commented Apr 26, 2024

I'm going to push a couple of changes, check I didn't break anything, and then do the merge.

eyeonus added 2 commits April 26, 2024 05:05
Have all the methods external to the class have matching naming
convention
Don't use self.execute() or self.commit() in some cases and db.execute()
or db.commit() in others; use db.* in all cases.
eyeonus added 2 commits April 26, 2024 05:23
Need to include `self.dataPath` or it will always return False, which is
not what we want.
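
The commit message above does not show the code it fixes. One plausible reading, offered purely as an assumption, is a file-existence check made on a bare filename: the name is resolved against the current working directory rather than the data directory, so the check fails unless the two happen to coincide. In the sketch below, `data_path` stands in for `self.dataPath`, and `file_exists` and the filename are hypothetical.

```python
import tempfile
from pathlib import Path


def file_exists(data_path: Path, filename: str) -> bool:
    """Check for `filename` inside the data directory, not the working directory."""
    return (data_path / filename).exists()


with tempfile.TemporaryDirectory() as tmp:
    data_path = Path(tmp)  # stand-in for self.dataPath (assumption)
    (data_path / "td_example_listings.csv").write_text("id\n")

    # A bare filename is looked up relative to the current working
    # directory, so it misses the file that lives under data_path.
    bare = Path("td_example_listings.csv").exists()

    # Joining with the data directory finds it.
    joined = file_exists(data_path, "td_example_listings.csv")

print(bare, joined)
```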
@eyeonus eyeonus merged commit 631758c into eyeonus:release/v1 Apr 26, 2024
7 of 10 checks passed
@kfsone
Contributor Author

kfsone commented Apr 26, 2024

Nice!

> remove unused methods

mmmmm - I love it when you talk dirty

@kfsone kfsone deleted the kfsone/import-perf branch April 26, 2024 17:12
2 participants