Problem Statement
The two major OOP frameworks in R, S3 and S4, each have their own limitations, with neither one being sufficiently applicable to gain dominance. This has led to social fracturing in the community and technical impediments to compatibility and interoperability. We summarize those limitations in the table below.
S3 limitations | S4 limitations | S4 implementation issues |
---|---|---|
Classes are only implicit | Multiple inheritance and dispatch hard to understand | Poor performance |
No systematic object validation | Syntax is unusual (side effects) | Difficult to maintain |
Single dispatch only | Lack of transparency of object structure and methods | |
S3 defines classes implicitly at the instance level, so there is no explicit class hierarchy. While the S3 system supports tracking the class of every object, there is no systematic means of constructing and validating objects to ensure correctness. S3 only supports single dispatch, so it is difficult to write polymorphic code for arithmetic, merging objects, converting objects, etc.
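A minimal sketch of these S3 limitations, using only base R. The class name `temperature` and the generic `combine` are hypothetical names chosen for illustration; they do not come from any package.

```r
# An S3 "class" is just an attribute set on an instance; nothing defines it
# up front and nothing checks that the contents make sense.
x <- structure(list(value = "not a number"), class = "temperature")
inherits(x, "temperature")  # TRUE, even though the object is nonsensical

# A hypothetical generic: UseMethod() dispatches on the first argument only.
combine <- function(x, y) UseMethod("combine")
combine.temperature <- function(x, y) "temperature method"
combine.default <- function(x, y) "default method"

combine(x, 1)  # "temperature method"
combine(1, x)  # "default method": the class of y plays no role
```

The second call shows why symmetric operations are awkward: the method for `temperature` is never considered when the object appears as the second argument.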
S4 has solutions to all of those problems, but it is quite ambitious, introducing significant complexity, unusual syntax and loss of transparency. Multiple inheritance, while expressive and powerful, allows for multiple overlapping taxonomies, which is difficult to reason about, and the difficulty increases quadratically when combined with multiple dispatch, where method selection uses a distance calculation in *n* dimensions, where *n* is the number of arguments. The syntax for defining classes and methods is non-idiomatic and relies on side effects. Finally, the S4 convention (although not a requirement) is to hide slots behind an API, which improves encapsulation but prevents the basic introspection capabilities that are desirable when analyzing data and that R users have come to expect.
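For comparison, here is a small sketch of the S4 style described above, using the methods package. The class `Temperature` and the generic `combine` are again hypothetical, and the validity rule is an assumption made up for the example.

```r
library(methods)

# setClass() registers the class definition as a side effect and returns
# a generator function. The validity function runs when an object is built.
Temperature <- setClass("Temperature",
  slots = c(value = "numeric"),
  validity = function(object) {
    if (length(object@value) != 1) "value must have length 1" else TRUE
  }
)

# setGeneric() and setMethod() also act by side effect, updating method tables.
setGeneric("combine", function(x, y) standardGeneric("combine"))
setMethod("combine", signature("Temperature", "numeric"),
  function(x, y) "Temperature, numeric")
setMethod("combine", signature("numeric", "Temperature"),
  function(x, y) "numeric, Temperature")

t1 <- Temperature(value = 20)
combine(t1, 1)  # "Temperature, numeric"
combine(1, t1)  # "numeric, Temperature"
```

Here both arguments take part in method selection, and constructing `Temperature(value = c(1, 2))` fails validation, but the definitions are a sequence of registration calls rather than ordinary assignments.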
Somewhat tangentially, but still motivating, there are also technical issues with the methods package, the only implementation of the S4 system. Its incremental growth over the decades has led to excessive complexity, as well as performance issues. In the absence of a new system, we would need to reimplement S4, so there will be implementation effort regardless.
Documentation limitations afflict both S3 and S4. It is difficult to describe a programming interface when it consists of generic functions not coupled to each other or any class. Any package can define a method on a generic or extend a class, so the documentation needs to adapt according to which packages are loaded.