The following seems like it would be a pretty common use case:
    import tlz

    c = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}, {'a': 5, 'b': 6}]
    d = tlz.merge_with(list, c)
    print(d)  # {'a': [1, 3, 5], 'b': [2, 4, 6]}

    # Now chunk to a certain size
    e = chunk_dict(2, d)
    print(e)  # [{'a':[1,3], 'b':[2,4]}, {'a':[5], 'b':[6]}]
In other words, I first combine a bunch of dictionaries into a single dict with a concatenated list of values, then I want to chunk that dict into several dicts whose lists are at most some specified length.
Perhaps even more useful is what I actually want it for: inverting merge_with. That is, I have a list of batched data to put through a model, and I want to change the batch size:
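As a minimal sketch of that rebatching, assuming every batch has the same keys and that the values within a batch have equal lengths: `rebatch` is a hypothetical name, not a toolz function, and the sketch uses only the standard library so it runs without toolz installed.

```python
from itertools import chain, islice


def rebatch(n, batches):
    """Regroup dict-of-list batches so each value holds at most n items.

    Hypothetical helper: assumes all batches share the same keys.
    """
    batches = list(batches)
    if not batches:
        return
    keys = list(batches[0])
    # One flat stream of values per key, spanning all input batches.
    # The inner list is built eagerly so each stream is bound to its own key.
    streams = {k: chain.from_iterable([b[k] for b in batches]) for k in keys}
    while True:
        out = {k: list(islice(streams[k], n)) for k in keys}
        if not any(out.values()):  # every stream is exhausted
            return
        yield out


# Batches of size 2 (the last one is short) regrouped into batches of size 3.
batches = [
    {'x': [1, 2], 'y': [10, 20]},
    {'x': [3, 4], 'y': [30, 40]},
    {'x': [5], 'y': [50]},
]
print(list(rebatch(3, batches)))
# [{'x': [1, 2, 3], 'y': [10, 20, 30]}, {'x': [4, 5], 'y': [40, 50]}]
```

With toolz, the first half of this (flattening per key) is roughly what `merge_with` does; the second half (cutting the merged dict back into pieces) is the part I can't find.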
Unless I'm missing something, there doesn't seem to be a straightforward way to do this with toolz (though it seems like functionality it would have). Here's the best solution I've come up with so far:
    from itertools import starmap, zip_longest
    from typing import Iterator
    from math import ceil

    from tlz import merge, partition_all


    def chunk_dict(n: int, d: dict) -> Iterator[dict]:
        """Chunk a dict of lists into separate dicts with lists of max length `n`.

        Parameters
        ----------
        n : int
            Chunk size, i.e. max length of the new dictionary values.
        d : dict
            Dictionary of iterables to chunk.

        Returns
        -------
        Iterator[dict]
            Dictionaries whose values are now of length at most `n`.
        """
        def chunk(k, v):
            """Allows slicing numpy arrays so they remain array objects."""
            if hasattr(v, '__len__') and hasattr(v, '__getitem__'):
                try:
                    return ({k: v[i*n:(i+1)*n]} for i in range(ceil(len(v)/n)))
                except:
                    pass  # objects may still not support slicing
            return ({k: part} for part in partition_all(n, v))

        return map(merge, zip_longest(*starmap(chunk, d.items()), fillvalue={}))
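One caveat with the version above: the slice `v[i*n:(i+1)*n]` sits inside a generator expression, so it is only evaluated when the result is iterated, after the `try` block has already been left. An object such as `collections.deque`, which has `__len__` and `__getitem__` but rejects slices, would therefore raise during iteration instead of falling back. Below is a sketch of one possible workaround that probes slicing eagerly. It is an assumption-laden variant, not a proposed toolz API: `chunk_values` is a hypothetical helper, and the fallback uses `itertools.islice` (yielding lists) in place of `partition_all` (which yields tuples) so the block runs with the standard library alone.

```python
from collections import deque
from itertools import islice, zip_longest
from math import ceil


def chunk_values(n, v):
    """Yield pieces of v with at most n items, slicing when v supports it."""
    try:
        v[:0]            # probe eagerly: raises here if v cannot be sliced
        size = len(v)
    except Exception:
        it = iter(v)     # fallback: consume any iterable n items at a time
        return iter(lambda: list(islice(it, n)), [])
    # Slicing keeps the original container type (e.g. numpy arrays stay arrays).
    return (v[i * n:(i + 1) * n] for i in range(ceil(size / n)))


def chunk_dict(n, d):
    """Chunk a dict of iterables into dicts whose values have at most n items."""
    columns = [chunk_values(n, v) for v in d.values()]
    for pieces in zip_longest(*columns):
        # zip_longest pads exhausted columns with None; those keys are omitted.
        yield {k: p for k, p in zip(d, pieces) if p is not None}


d = {'a': [1, 3, 5], 'b': deque([2, 4, 6])}
print(list(chunk_dict(2, d)))
# [{'a': [1, 3], 'b': [2, 4]}, {'a': [5], 'b': [6]}]
```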
Is there a better way of doing any of this, and would it be useful to add this type of function to toolz?