-
Notifications
You must be signed in to change notification settings - Fork 249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NestIO compatible with Nest 3.2 #1097
base: master
Are you sure you want to change the base?
NestIO compatible with Nest 3.2 #1097
Conversation
Looking into the capability to deal with lists it only seems to complicate things and it is not capable of dealing with multiple files of the same type, which is something I would expect. I would suggest getting rid of the capability to deal with lists altogether and stick to single files as most other IOs do. This then allows to remove code further and if someone really wants to have the analogsignals and spikes together it is just a matter of running this two times. |
Hi all, Since this was tested with NEST 3.2, maybe add this info to the docstring(?): Line 4 in 50abfbc
|
Hi @morales-gregorio Do you have a small test file for this format that could be added to gin? |
Just rebased the branch and some chaos has followed. My local history looks as one would expect, with this branch now splitting from the current master. I am not sure why Github interprets that all the previous commits are part of this PR :/ |
@JuliaSprenger I created a small NEST (3.3) simulation where I record both spikes and analogsignals. The results and simulation script are here: test_NEST_simulation_and_results.zip. The current intended use is therefore something like this:
|
Co-Authored-By: Simon Essink <[email protected]>
Co-Authored-By: Robin Gutzen <[email protected]>
Co-Authored-By: Robin Gutzen <[email protected]> Co-Authored-By: Simon Essink <[email protected]> Co-Authored-By: Jasper Albers <[email protected]>
… longer dependent on the file format. Previous behaviour lead to the IO trying to always read spikes from .gdf files and analogsignals from .dat files, which is not necessarily the case with Nest 3.2 (possibly earlier)
Co-Authored-By: Simon Essink <[email protected]>
or @apdavison |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really great, and I was able to successfully read in a couple of gdf from NEST 3.3. I marked a few points in this code that caught my eye when going through the changes.
@@ -482,8 +500,8 @@ def read_segment(self, gid_list=None, time_unit=pq.ms, t_start=None, | |||
gid_list : list, default: None | |||
A list of GDF IDs of which to return SpikeTrain(s). gid_list must | |||
be specified if the GDF file contains neuron IDs, the default None | |||
then raises an error. Specify an empty list [] to retrieve the spike | |||
trains of all neurons. | |||
then raises an error. Specify an empty list [] to retrieve the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This help text is not complete: The code also accepts tuples of the form (a,b), which are expanded to range(a,b)
. I am not sure this is particularly helpful, I would suggest to drop this code below (line 538), rather than adapt this help text to include this case.
neo/io/nestio.py
Outdated
gid_ids = np.array([np.searchsorted(data[:, 0], gid, side='left'), | ||
np.searchsorted(data[:, 0], gid, side='right')]) | ||
gid_data = data[gid_ids[0]:gid_ids[1], :] | ||
gids = np.array([np.searchsorted(data[:, 0], gid, side='left'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should not this be the searched in the ID column, i.e.,
gids = np.array([np.searchsorted(data[:, 0], gid, side='left'), | |
gids = np.array([np.searchsorted(data[:, id_column], gid, side='left'), |
A similar change is probably needed below for id_shifts
, i.e.,
id_shifts[0] = np.searchsorted(gid_data[:, time_column],
spiketrain_list = [SpikeTrain(train, units=time_unit, | ||
t_start=t_start, t_stop=t_stop, | ||
id=None, **args)] | ||
spiketrain_list = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spiketrains are read here as SpikeTrain
object, however, meanwhile, a SpikeTrainList
object is probably more appropriate. Perhaps this would be a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then again, maybe one could just
spiketrain_list = [] | |
spiketrain_list = SpikeTrainList |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The NestIO was written before SpikeTrainList even existed, so yes, we should use SpikeTrainList, since this is also the superior way to load many spiketrains (which simulations will create)
# extracting complete gid list for anasig generation | ||
if (gid_list == []) and id_column is not None: | ||
gid_list = np.unique(data[:, id_column]) | ||
analogsignal_list = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand the NEST analogsignal outputs correctly, one could consider creating one AnalogSignal
per file (i.e., for each entry in self.IOs
), and then add all channels (gids) into columns of that AnalogSignal
. Currently, separate AnalogSignal
s are created for each gid in each file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably needs discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that a single analogsignal object should be created per source file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for multimeter files it should be one AnalogSignal per recorded variable (i.e. per column in the source file, not including the time and id columns), but each signal object should contain channels for all neuron ids.
then raises an error. Specify an empty list [] to retrieve the spike | ||
trains of all neurons. | ||
then raises an error. Specify an empty list [] to retrieve the | ||
spike trains of all neurons. | ||
time_unit : Quantity (time), optional, default: quantities.ms | ||
The time unit of recorded time stamps in DAT as well as GDF files. | ||
t_start : Quantity (time), optional, default: 0 * pq.ms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not quite true, the default is None. The identical discrepancy is found in a few doc strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dear @morales-gregorio!
Thank you for preparing this PR, the lack of support for NEST3 in NEO has been holding back use of many useful tools for some time. I must admit that I do not know NEO in detail, so some of my comments below are probably just due to my ignorance.
In addition to the comments in the code, I have some points that concern importing from NEST overall. I focus here on NEST 3. For backward compatibility with NEST2, I would suggest to check if the first line of file begins with a numeric value. If so, it must be NEST2 and can be read with a NEST2 reader.
File header
If I understand your code correctly, it simply ignores the three header lines of NEST ASCII files. This is okay to just get things to work again, but I think it would make much more sense to make use of this information. The header looks like this:
# NEST version: 3.6.0-post0.dev0
# RecordingBackendASCII version: 2
sender time_ms
1 9.900
1 21.800
It provides several bits of useful information which could be extracted with regexes:
- The first line can be used to check that it really is NEST output and to record the NEST version in metadata (if supported by NEO)
- The second line can be used to check NEST output format version, in case we later introduce different versions; this would allow explicit selection of the compatible reader
- One could actually parse those first two lines and if they do not match conclude that it must be a NEST 2 file.
- The third line identifies columns. This becomes important in particular when recording with the multimeter or if time is not recorded in milliseconds (see examples below). I think it would be advantageous if the column headers were read and used to label the signals recorded.
Data formats
The following script produces output in all formats NEST currently supports. Note that the voltmeter
is just a multimeter
pre-configured to record V_m
, so the formats are redundant. I still provide both examples, since voltmeter
is widely used.
Notes
- In the examples below, all offsets are zero. If one simulates with precise-spiking neurons, they can differ from zero.
- Where time is given in steps, the actual spike time is calculated as
steps * resolution - offset
. - The current NEST data format using steps/offset is actually incomplete, because the resolution, i.e., the duration of the time step is not given in the file. We need to fix that in the next version of the recording file format. Therefore, I think it is okay for NEO not to support that format a present.
- Recording time in steps is technically possible for voltmeter/multimeter, but makes little sense, so it is okay not to support that format.
- In multimeter recordings, the quantities recorded depend on the neuron model and the choices made by the experimenter. I am not sure if any order is imposed on the columns (e.g., alphabetical).
- File rows are not guaranteed to be sorted either by time or GID.
Sample script
nest.ResetKernel()
nest.overwrite_files = True
n = nest.Create('iaf_psc_alpha', params={'I_e': 600})
s1 = nest.Create('spike_recorder', params={'record_to': 'ascii'})
s2 = nest.Create('spike_recorder', params={'record_to': 'ascii', 'time_in_steps': True})
v1 = nest.Create('voltmeter', params={'record_to': 'ascii', 'interval': nest.resolution})
v2 = nest.Create('voltmeter', params={'record_to': 'ascii', 'interval': nest.resolution, 'time_in_steps': True})
m1 = nest.Create('multimeter', params={'record_to': 'ascii', 'interval': nest.resolution,
'record_from': n.recordables})
m2 = nest.Create('multimeter', params={'record_to': 'ascii', 'interval': nest.resolution,
'record_from': n.recordables, 'time_in_steps': True})
nest.Connect(n, s1 + s2)
nest.Connect(v1 + v2 + m1 + m2, n)
nest.Simulate(100)
Plain spike recording
# NEST version: 3.6.0-post0.dev0
# RecordingBackendASCII version: 2
sender time_ms
1 9.900
1 21.800
Spike recording with time in steps
# NEST version: 3.6.0-post0.dev0
# RecordingBackendASCII version: 2
sender time_step time_offset
1 99 0.000
1 218 0.000
Plain voltmeter
# NEST version: 3.6.0-post0.dev0
# RecordingBackendASCII version: 2
sender time_ms V_m
1 0.100 -69.761
1 0.200 -69.525
Voltmeter with time in steps
time_offset
will always be zero.
# NEST version: 3.6.0-post0.dev0
# RecordingBackendASCII version: 2
sender time_step time_offset V_m
1 1 0.000 -69.761
1 2 0.000 -69.525
Multimeter
- Also possible with time in steps
- Columns beyond
time_ms
depend on model and user choice
# NEST version: 3.6.0-post0.dev0
# RecordingBackendASCII version: 2
sender time_ms I_syn_ex I_syn_in V_m
1 0.100 0.000 0.000 -69.761
1 0.200 0.000 0.000 -69.525
Data from parallel simulations
- When recording from parallel simulations, file names will have the form
<label>-<gid>-<vp>.dat
, with each virtual process recording to a single file - Data needs to be collected for all files matching
<label>-<gid>-[0-9]+.dat
- This can be done with the current interface by providing a list of all file names matching the pattern.
- For a better user interface, it might make sense to allow the user to specify a pattern and then do the globbing internally
Support for pathlib
Currently, file names can only be provided as strings. In the longer term, one could provide a more powerful interface by supporting also Python's pathlib, maybe even including glob
s.
neo/io/nestio.py
Outdated
mode = 'file' | ||
|
||
def __init__(self, filenames=None): | ||
def __init__(self, filenames=None, target_object='SpikeTrain', | ||
additional_parameters={}): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never use a mutable as default argument value. Use None
instead and then create the empty dictionary inside the constructor if None
is passed. Otherwise, the dict is created once when the class is defined, not every time the constructor is called.
The additional_paramters
parameter is also not described in the docstring.
neo/io/nestio.py
Outdated
self.IOs = [] | ||
for filename in filenames: | ||
path, ext = os.path.splitext(filename) | ||
ext = ext.strip('.') | ||
if ext in self.extensions: | ||
if ext in self.avail_IOs: | ||
raise ValueError('Received multiple files with "%s" ' | ||
'extension. Can only load single file of ' | ||
'this type.' % ext) | ||
self.avail_IOs[ext] = ColumnIO(filename) | ||
self.avail_formats[ext] = path | ||
self.IOs.append(ColumnIO(filename, additional_parameters)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could turn this into a list comprehension
self.IOs = [ColumnIO(filename, additional_parameters) for filename in filenames]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
column_ids = [id_column, time_column] + value_columns | ||
column_ids = [id_column, time_column] | ||
if value_columns is not None: | ||
column_ids += value_columns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work? value_columns
is an int from above and here you add it to a list? Should it be .append(value_columns)
?
And how could value_columns
be None
?
column_list = [id_column, time_column] | ||
if value_columns is not None: | ||
column_list += value_columns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this differ from the column_ids
above?
column_ids = [id_column, time_column] + value_columns | ||
column_ids = [id_column, time_column] | ||
if value_columns is not None: | ||
column_ids += value_columns | ||
for i, cid in enumerate(column_ids): | ||
if cid is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How could cid
be None
?
neo/io/nestio.py
Outdated
if '.' not in line: | ||
additional_parameters['dtype'] = np.int32 | ||
else: | ||
additional_parameters['dtype'] = np.float32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you use 32-bit types here? In the "full K" simulation we had 1.86e9 neurons, so the largest GIDs got very close to the upper end of signed 32 bit ints. For coming brainscale models, GIDs will need the 64 bit range. For long-running simulations, there is also a risk of loss of time precision if we only have 7-8 digits in the mantissa of floats.
Hi @heplesser , thanks for the comments and the detailed description of NEST outputs. I think this information is super helpful in driving this IO forwards. Also, thanks for your insights in problems that may arise at the "full scale" level. |
Co-authored-by: Michael Denker <[email protected]>
Co-authored-by: Michael Denker <[email protected]>
Hello everyone! After discussion with @mdenker and @apdavison yesterday, I provide here a little script that creates a network consisting of
Running the script will generate a number of spike recorder, voltmeter, multimeter and weight recorder output files with different settings. Note in particular the effect of "time_in_steps": If false (default), time is recorded in ms under heading "time_ms", if true time will be recorded in columns "time_step" and "time_offset", where the actual spike time is given by time_step * resolution - time_offset. Unfortunately, the resolution is not stored in the file, so I suggest to not support these files for now. Do not rely on ordering of columns or rows. In parallel simulations, there will be one file per virtual process, enumerated by the last part of the file name. NEST 3 files can be safely identified by the
header. NEST 2 files do not have these headers. I would suggest to use the header for identification, also to be prepared for future versions of the ASCII backend. nest.ResetKernel()
nest.overwrite_files = True
g = nest.Create('poisson_generator', params={'rate': 100000})
n = nest.Create('aeif_cond_alpha', 10)
p = nest.Create('iaf_psc_alpha_ps', 5)
rnt = nest.Create('spike_recorder', params={'record_to': 'ascii',
'label': 'aeif_spikes_times'})
rns = nest.Create('spike_recorder', params={'record_to': 'ascii',
'time_in_steps': True,
'label': 'aeif_spikes_steps'})
rpt = nest.Create('spike_recorder', params={'record_to': 'ascii',
'label': 'precise_spikes_times'})
rps = nest.Create('spike_recorder', params={'record_to': 'ascii',
'time_in_steps': True,
'label': 'precise_spikes_steps'})
vt = nest.Create('voltmeter', params={'record_to': 'ascii',
'label': 'voltmeter_times'})
vs = nest.Create('voltmeter', params={'time_in_steps': True,
'label': 'voltmeter_steps'})
mm1t = nest.Create('multimeter', params={'record_from': ('g_ex', 'g_in', 'V_m', 'w'),
'record_to': 'ascii',
'label': 'multimeter_1ms'})
mmrt = nest.Create('multimeter', params={'record_from': ('g_ex', 'g_in', 'V_m', 'w'),
'record_to': 'ascii',
'interval': nest.resolution,
'label': 'multimeter_res'})
w = nest.Create('weight_recorder', params={'record_to': 'ascii',
'label': 'weight_rec'})
nest.SetDefaults('stdp_synapse', {'weight_recorder': w})
nest.Connect(g, n)
nest.Connect(n, n, syn_spec={'synapse_model': 'stdp_synapse', 'weight': 100.})
nest.Connect(n, p, syn_spec={'weight': 1000.})
nest.Connect(n, rnt + rns)
nest.Connect(p, rpt + rps)
nest.Connect(vt + vs + mm1t + mmrt, n)
nest.Simulate(100) |
Hi @heplesser ! Thanks for pushing this forward and coming up with good test examples. I think we should not introduce a NEST dependency to the neo tests, I would instead suggest to create several (small) files for multiple NEST versions. The files should have descriptive names like "nest3.7_voltmeter_ascii.dat". What do you think? |
@morales-gregorio I agree, otherwise we could easily end up with circular dependencies and all-to-heavy test setups. Frozen example files for testing should work nicely. It should suffice to base the versioning on the ASCII Recording Backend version, which changes more slowly than the NEST version. |
@heplesser could you create those files? 😇 |
Hey @morales-gregorio , the current test files for testing NEST-IO are located on NeuralEnsemble / The contents are: 0gid-1time-1256-0.gdf
0gid-1time-2gex-1262-0.dat
0gid-1time-2gex-3Vm-1261-0.dat
0gid-1time-2Vm-1259-0.dat
0gid-1time-2Vm-3gex-4gin-1260-0.dat
0gid-1time-2Vm-3Iex-4Iin-1264-0.dat
0gid-1time_in_steps-1258-0.gdf
0gid-1time_in_steps-2Vm-1263-0.dat
0time-1255-0.gdf
0time_in_steps-1257-0.gdf
brunel-delta-nest_mod.py # <--- script that generated the files
N1-0gid-1time-2Vm-1265-0.dat
N1-0time-1Vm-1266-0.dat
N1-0Vm-1267-0.dat I have copied and formatted the above script into """
Script to generate test data for the nestIO.
This script creates a network simulation using the NEST simulator (https://www.nest-simulator.org/).
It generates output files for spike recorders, voltmeters, multimeters and weight recorders with
different settings. Note the effect of "time_in_steps": If false (default), time is recorded in ms
under heading "time_ms"; if true, time will be recorded in columns "time_step" and "time_offset",
where the actual spike time is given by time_step * resolution - time_offset. The resolution is not
stored in the file. In parallel simulations, there will be one file per virtual process, enumerated
by the last part of the file name.
Usage:
python generate_data.py
"""
import nest
def main():
"""
Executes the main logic of the script.
This function sets up a simulation using the NEST simulator (https://www.nest-simulator.org/)
for simulating neural networks. It creates a network consisting of aeif_cond_alpha neurons
(to show multimeter recording), iaf_psc_alpha_ps neurons (to show precise time recording),
and weight_recorder (to show weight recording). The simulation is run for a specified duration.
Note: Do not rely on ordering of columns or rows. In parallel simulations, there will be
one file per virtual process, enumerated by the last part of the file name.
Parameters:
None
Returns:
None
"""
nest.ResetKernel()
nest.overwrite_files = True
g = nest.Create("poisson_generator", params={"rate": 100000})
n = nest.Create("aeif_cond_alpha", 10)
p = nest.Create("iaf_psc_alpha_ps", 5)
rnt = nest.Create(
"spike_recorder",
params={"record_to": "ascii", "label": "aeif_spikes_times"},
)
rns = nest.Create(
"spike_recorder",
params={
"record_to": "ascii",
"time_in_steps": True,
"label": "aeif_spikes_steps",
},
)
rpt = nest.Create(
"spike_recorder",
params={"record_to": "ascii", "label": "precise_spikes_times"},
)
rps = nest.Create(
"spike_recorder",
params={
"record_to": "ascii",
"time_in_steps": True,
"label": "precise_spikes_steps",
},
)
vt = nest.Create("voltmeter", params={"record_to": "ascii", "label": "voltmeter_times"})
vs = nest.Create("voltmeter", params={"time_in_steps": True, "label": "voltmeter_steps"})
mm1t = nest.Create(
"multimeter",
params={
"record_from": ("g_ex", "g_in", "V_m", "w"),
"record_to": "ascii",
"label": "multimeter_1ms",
},
)
mmrt = nest.Create(
"multimeter",
params={
"record_from": ("g_ex", "g_in", "V_m", "w"),
"record_to": "ascii",
"interval": nest.resolution,
"label": "multimeter_res",
},
)
w = nest.Create("weight_recorder", params={"record_to": "ascii", "label": "weight_rec"})
nest.SetDefaults("stdp_synapse", {"weight_recorder": w})
nest.Connect(g, n)
nest.Connect(n, n, syn_spec={"synapse_model": "stdp_synapse", "weight": 100.0})
nest.Connect(n, p, syn_spec={"weight": 1000.0})
nest.Connect(n, rnt + rns)
nest.Connect(p, rpt + rps)
nest.Connect(vt + vs + mm1t + mmrt, n)
nest.Simulate(100)
if __name__ == "__main__":
main() which generates: (.zip attached) aeif_spikes_steps-18-0.dat
aeif_spikes_times-17-0.dat
generate_data.py
multimeter_1ms-23-0.dat
multimeter_res-24-0.dat
precise_spikes_steps-20-0.dat
precise_spikes_times-19-0.dat
voltmeter_times-21-0.dat
weight_rec-25-0.dat Refer to this readme here: https://gin.g-node.org/NeuralEnsemble/ephy_testing_data to add the files. |
Co-authored-by: Hans Ekkehard Plesser <[email protected]>
Co-authored-by: Hans Ekkehard Plesser <[email protected]>
There are 8 failing tests. I've looked into the first of these, and it seems to be that the case of |
Hi all,
@rgutzen @jasperalbers and I were looking into the NestIO today and we came up with this edits that make the current NestIO compatible with both Nest 2.x and 3.x versions.
We incorporated all the changes suggested by @essink into this PR.
Essentially we force the IO to ignore all the headers (which raises a warning that it did so) as suggested by @rgutzen, which I also wrote into PR #739
Additionally, we found that the current NestIO decides whether to read
neo.AnalogSignals
orneo.SpikeTrain
based on the file name alone. This is problematic since Nest 3.2 (maybe others too) saves spikes to the.dat
format by default, which the IO expected to be analogsignals. I fixed this issue by adding a new keywordtarget_object
which then determines what kind of object will be read. This does not allow to read analogsignals and spikes into the same segment (two different calls are needed), but this seems a rather good practice in my opinion, since spiketrains and analogsignales are rarely combined into the same segment.I tested the code to load a
.dat
file generated with Nest 3.2 and it works, I can add a proper test somewhere if you want.Best,
Aitor