File Size Difference #3

Open
cdiggins opened this issue Dec 2, 2024 · 3 comments

cdiggins commented Dec 2, 2024

I was looking at the IFC file:
https://github.com/buildingSMART/IFC5-development/blob/main/Linear%20placement%20of%20signals/linear-placement-of-signal.ifc
which is 223 KB,

The new IFCX file at:
https://github.com/buildingSMART/IFC5-development/blob/main/Linear%20placement%20of%20signals/linear-placement-of-signal.ifcx
is 835KB.

Is there any attempt being made to consider how to reduce file size in IFC5?

@cdiggins cdiggins changed the title Size Difference File Size Difference Dec 3, 2024
Contributor

aothms commented Dec 3, 2024

The major difference is that the ifcx file contains the explicit tessellated representation of the alignment curves. This means that (a) the model can be visualized by software that does not natively understand infra alignment concepts, and (b) it serves as a means of validation for tools that do support the alignment subschema, letting them double-check whether their interpretation is correct.

I think file size is not the most important metric at this point in time. I'd be more interested in optimizing time to render, which includes parse time and I/O (so a small file is still better, but if we end up with a binary serialization that is a little bit bigger due to alignment, but can therefore be immediately memory-mapped, I wouldn't mind).
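To make the alignment trade-off concrete, here is a minimal Python sketch. It is not the IFC5 serialization; the layout (an 8-byte prefix, a header, zero padding, then raw float64 values) and the names `write_aligned` and `read_values` are assumptions made up for illustration. The padding costs a few bytes, but the payload can then be read straight from the mapped file at a computed offset, without parsing anything in front of it.

```python
import mmap
import os
import struct
import tempfile

ALIGN = 8  # assumed alignment for the float64 payload


def write_aligned(path, header, values):
    """Write prefix + header + zero padding + raw little-endian float64 values.

    Returns the byte offset at which the payload starts.
    """
    prefix = struct.pack("<II", len(header), len(values))  # 8 bytes
    offset = len(prefix) + len(header)
    padding = (-offset) % ALIGN  # bytes needed to reach the next multiple of ALIGN
    with open(path, "wb") as f:
        f.write(prefix)
        f.write(header)
        f.write(b"\0" * padding)
        f.write(struct.pack(f"<{len(values)}d", *values))
    return offset + padding


def read_values(path):
    """Memory-map the file and read the payload at its aligned offset."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            header_len, count = struct.unpack_from("<II", mm, 0)
            offset = 8 + header_len
            offset += (-offset) % ALIGN  # skip the padding
            return list(struct.unpack_from(f"<{count}d", mm, offset))


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
    start = write_aligned(demo_path, b"hdr", [1.0, 2.5, -3.0])
    print("payload starts at byte", start)
    print(read_values(demo_path))
```

A real format would more likely hand the mapped region to the renderer as a typed array instead of unpacking it into a Python list; the sketch only shows where the extra bytes come from and what they buy.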

I think we're all aware that JSON is not optimal, and we will likely investigate other formats, e.g. BSON, CBOR, USDC. At this point, though, the focus is on the data model (i.e. ECS-inspired components), composition (inheritance, multiple file layers) and schema (i.e. USD-inspired geometry, mapping of IFC concepts).


The current serialization and model are also optimized for legibility:

A lot of whitespace

$ grep -c ' ' linear-placement-of-signal.ifcx
25845

A lot of metadata on the correspondence with the IFC4.3 model

$ grep originalStepInstance linear-placement-of-signal.ifcx | wc -c
15217

It also seems the referent geometry is not instanced (but maybe also not in the original model; didn't check)

$ grep -c faceVertexIndices linear-placement-of-signal.ifcx
26
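The grep counts above can be turned into byte figures with a short Python sketch. It assumes the .ifcx file is plain JSON and that `originalStepInstance` appears as an object key; the function name `size_breakdown` and the keys of the returned dict are made up for illustration. Re-serializing through `json` can change number formatting slightly, so treat the results as approximate.

```python
import json


def size_breakdown(text, metadata_key="originalStepInstance"):
    """Return byte sizes: as given, without whitespace, and without one metadata key."""
    doc = json.loads(text)

    def strip(node):
        # Recursively drop every entry stored under metadata_key.
        if isinstance(node, dict):
            return {k: strip(v) for k, v in node.items() if k != metadata_key}
        if isinstance(node, list):
            return [strip(v) for v in node]
        return node

    def compact_size(node):
        # Compact separators remove all indentation and spacing.
        encoded = json.dumps(node, separators=(",", ":"), ensure_ascii=False)
        return len(encoded.encode("utf-8"))

    return {
        "original": len(text.encode("utf-8")),
        "compact": compact_size(doc),
        "without_metadata": compact_size(strip(doc)),
    }


if __name__ == "__main__":
    sample = json.dumps({"a": [1, 2], "originalStepInstance": "x"}, indent=2)
    print(size_breakdown(sample))
```

To run it on the real file, read `linear-placement-of-signal.ifcx` as UTF-8 text and pass the contents to `size_breakdown`.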


atomczak commented Dec 3, 2024

I agree that time-to-render makes more sense than disk-size.

@berlotti had a good point: while the size of an individual file has increased, the size of the project has not necessarily. Before, each version of a model had to be exported as a full file; in the new approach, only the changes are saved and exchanged.

Geometry aside, it looks like a single property definition now takes about half as many bytes:

IFC4x3 (304B):

#2713 = IFCPROPERTYSET('3xcBgZuJX43Q00yUCRC9aq', #1, 'Pset_Stationing', $, (#2714));
#2714 = IFCPROPERTYSINGLEVALUE('Station', $, IFCLENGTHMEASURE(876.2720713), $);
#2715 = IFCRELDEFINESBYPROPERTIES('3jDLE3VjXErBi29wD5I895', #1, 'Object to Properties', 'Object to Properties Relation', (#2712), #2713);

versus IFC5 (146B):

{
 "def": "over",
 "name": "N1bed3e86ecf84553b89d33a596f1cad9",
 "attributes": {
  "ifc5:properties": {
   "Station": 876.2721
  }
 }
},

This is partly because the IFC5 version omits the property set (PSET) and the measure type. However, with more properties in a single over, the difference would be even bigger.
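A rough way to see the amortization is to serialize an over with a growing number of properties and look at the bytes per property. The Python sketch below reuses the structure of the IFC5 snippet above (`def`, `name`, `attributes`, `ifc5:properties`); the property names `Prop0`, `Prop1`, ... and the helper `over_bytes` are hypothetical and exist only for illustration.

```python
import json


def over_bytes(properties, name="N1bed3e86ecf84553b89d33a596f1cad9"):
    """Byte size of one 'over' node, serialized with one-space indentation."""
    node = {
        "def": "over",
        "name": name,
        "attributes": {"ifc5:properties": properties},
    }
    return len(json.dumps(node, indent=1).encode("utf-8"))


if __name__ == "__main__":
    for n in (1, 5, 20):
        # Hypothetical properties; real models would carry different names and values.
        props = {f"Prop{i}": 876.2721 for i in range(n)}
        total = over_bytes(props)
        print(f"{n:>2} properties: {total} bytes, {total / n:.1f} bytes per property")
```

The fixed cost of the wrapper (`def`, `name`, `attributes`) is paid once per over, so the per-property figure drops as properties are added.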


tomi-p commented Dec 3, 2024

While BIM data passing through interfaces will certainly become more common in the future, files will continue to play an essential role in information exchange for a long time to come. From this perspective, the example given by @atomczak looks promising. As the "I" in BIM becomes more important, IFC files are growing primarily because of increased data. If the IFC5 data structure for alphanumeric information meets this challenge, the problem may not exist or may not be very significant.
