Optimising/adapting IDBBatchAtomicVFS for performance on large tables #109
Replies: 2 comments 5 replies
-
I'm not a SQL expert so questions about how best to structure a query or index, or how to optimize a schema are best addressed to someone not me, like the SQLite forum. All I/O is going to be slower with IDBBatchAtomicVFS than AccessHandlePoolVFS, but the overhead for each read and write call is also a factor. I would see if increasing the database page size (with PRAGMA page_size) helps. My guess is that it will but I don't know by how much. You can also try to reduce the number of VFS calls by increasing the cache size (with PRAGMA cache_size). I'm not sure if that will help at all since you're having to read your entire large table, but perhaps it will if it keeps indexes and/or the smaller joined tables in memory. |
Beta Was this translation helpful? Give feedback.
-
A small note for self and others who might also be starting out with sqlite - it sometimes doesn't pick that right index, and making sure it does by checking the query planner and potentially forcing it to use a specific index can make a massive difference. By increasing But I guess this is just normal query optimisation advice! |
Beta Was this translation helpful? Give feedback.
-
I have a large table (400k+ rows) with an
INTEGER PRIMARY KEY
and need to join this table on various other (much smaller) tables. An example joining against a table of ~5k rows on the PK (to the other PK) takes about 150ms withAccessHandlePoolVFS
. The exact same query on a db created with exactly the same SQL in exactly the same browser takes around 6000ms withIDBBatchAtomicVFS
. Even acount(0)
on this table takes around 5000ms. The 400k table has 10 columns, 3 of which contain quite a lot of text data, though the value I am getting for this query is around 2-6 bytes of UTF8 (this particular column has a max of about 6 bytes). Because of the (great) performance withAccessHandlePoolVFS
, it took me a bit to think of creating a unique index with just the PK and the single small text column. This reduces the query time to around 650ms and the count to around 400ms. I'm guessing the PK is stored with the rest of the data, so the entire table has to be read without the index?Are there any rough guidelines on how we should be preferring to create and manage reasonably large tables to better manage the particulars of
IDBBatchAtomicVFS
? Like would it be definitely preferable to split this table into several? Or is it just a matter of testing everywhere?Beta Was this translation helpful? Give feedback.
All reactions