Merge pull request #37 from hit9/dev

Add performance.rst and fix some typos
hit9 · Feb 1, 2021 · 77a7b91
2 parents 8f751d6 + 5f974de
commit 77a7b91

Showing 8 changed files with 182 additions and 45 deletions.
diff --git a/README.rst b/README.rst
@@ -1,10 +1,10 @@
 bitproto
 ========
 
-Bitproto is a lightweight, easy-to-use and production-proven bit level data
+Bitproto is a lightweight, easy-to-use and fast bit level data
 interchange format for serializing data structures.
 
-Website: TODO
+Website: https://bitproto.readthedocs.io
 
 Features
 ---------
@@ -21,3 +21,5 @@ Features
   - C - No dynamic memory allocation.
   - Go - No reflection or type assertions.
   - Python - No magic :)
+
+- Blazing fast encoding/decoding.
diff --git a/docs/c-guide.rst b/docs/c-guide.rst
@@ -14,14 +14,14 @@ Firstly, run the bitproto compiler to generate code for C:
 
    $ bitproto c pen.bitproto
 
-Where the `pen.bitproto` is introduced in earlier section :ref:`quickstart-example-bitproto`.
+Where the ``pen.bitproto`` is introduced in earlier section :ref:`quickstart-example-bitproto`.
 
 We will find that bitproto generates two files for us in the current directory:
 
-- `pen_bp.h`: Contains the declarations of structs, macros and api functions etc.
-- `pen_bp.c`: Contains the function implementations.
+- ``pen_bp.h``: Contains the declarations of structs, macros and api functions etc.
+- ``pen_bp.c``: Contains the function implementations.
 
-It's recommended to open this two generated files to have a look. In the generated file `pen_bp.h`:
+It's recommended to open these two generated files to have a look. In the generated file ``pen_bp.h``:
 
 * The ``enum Color`` in bitproto is mapped to a ``typedef`` statement in C, and the enum
   values are mapped to macros:
@@ -71,7 +71,7 @@ encoder and decoder depends on the bitproto C library underlying.
 
 Download the bitproto library for C language from
 `this github link <https://github.com/hit9/bitproto/tree/master/lib/c>`_,
-and put them (the `bitproto.c` and `bitproto.h`) to current working directory.
+and put them (the ``bitproto.c`` and ``bitproto.h``) in the current working directory.
 
 Run the code
 ^^^^^^^^^^^^
@@ -102,17 +102,17 @@ Now, we create a file named ``main.c`` and put the following code in it:
      return 0;
    }
 
-In the code above, we firstly creates a ``p`` of type ``struct Pen`` with data initilization,
+In the code above, we first create a ``p`` of type ``struct Pen`` with data initialization,
 then call a function ``EncodePen`` to encode ``p`` into buffer ``s``. The length of buffer ``s``
 is generated by compiler as a macro defined as ``BYTES_LENGTH_PEN``.
 
-In the decoding part, we constructs another ``p1`` instance of type ``struct Pen`` with zero
+In the decoding part, we construct another ``p1`` instance of type ``struct Pen`` with zero
 initialization, then call a function ``DecodePen`` to decode bytes from buffer ``s`` into ``p1``.
 
-Finally, uses a function ``JsonPen`` generated by the compiler to format the structure ``p1``
+Finally, we use a function ``JsonPen`` generated by the compiler to format the structure ``p1``
 as a JSON string to check whether the decoding works.
 
-Let's compile it with the C library `bitproto.c` and generated `pen_bp.c`, and run:
+Let's compile it with the C library ``bitproto.c`` and generated ``pen_bp.c``, and run:
 
 .. sourcecode:: bash
 
@@ -129,15 +129,15 @@ Naming Prefix
 ^^^^^^^^^^^^^
 
 As we know, there's no namespace mechanism to scope definition names across including header files in C.
-Bitproto provide an option to add a name prefix to all generated types. To use it, define an ``option``
+Bitproto provides an option to add a name prefix to all generated types. To use it, define an ``option``
 at the global scope of the bitproto file:
 
 .. sourcecode:: bitproto
 
    option c.name_prefix = "my_prefix_"
 
-Run the bitproto compiler again, we will that names in `pen_bp.h` are changed:
+Run the bitproto compiler again, and we will find that the names in ``pen_bp.h`` have changed:
 
-* The ``enum Color`` is mapped to ``MyPrefixColor``.
-* The ``Timestamp`` is mapped to ``MyPrefixTimestamp``.
-* The ``message Pen`` is mapped to ``struct MyPrefixPen``.
+* The ``enum Color`` is now mapped to ``MyPrefixColor``.
+* The ``Timestamp`` is now mapped to ``MyPrefixTimestamp``.
+* The ``message Pen`` is now mapped to ``struct MyPrefixPen``.
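
The mapping listed above (``my_prefix_`` plus ``Color`` giving ``MyPrefixColor``) can be reproduced with a small Python sketch. The helper name ``prefixed_name`` and the PascalCase rule are assumptions made for illustration only; they are inferred from the three examples in this hunk and are not taken from the bitproto compiler.

```python
def prefixed_name(prefix: str, name: str) -> str:
    """Join a snake_case prefix and a type name in PascalCase.

    Hypothetical helper: the conversion rule is an assumption inferred
    from the documented examples, not the compiler's actual code.
    """
    # Split the prefix on underscores and drop empty pieces, so a
    # trailing underscore (as in "my_prefix_") is ignored.
    parts = [part for part in prefix.split("_") if part]
    # Capitalize the first letter of each piece and append the type name.
    return "".join(part[:1].upper() + part[1:] for part in parts) + name


color_name = prefixed_name("my_prefix_", "Color")
timestamp_name = prefixed_name("my_prefix_", "Timestamp")
pen_name = prefixed_name("my_prefix_", "Pen")
```

Under that assumed rule, the three names come out as ``MyPrefixColor``, ``MyPrefixTimestamp`` and ``MyPrefixPen``, matching the list in the guide.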
diff --git a/docs/go-guide.rst b/docs/go-guide.rst
@@ -20,12 +20,12 @@ Then run the bitproto compiler to generate code for Go:
 
    $ bitproto go pen.bitproto bp/
 
-Where the `pen.bitproto` is introduced in earlier section :ref:`quickstart-example-bitproto`.
+Where the ``pen.bitproto`` is introduced in earlier section :ref:`quickstart-example-bitproto`.
 
-We will find that bitproto generates us a file named `pen_bp.go` in the output directory,
+We will find that bitproto generates a file named ``pen_bp.go`` for us in the output directory,
 which contains the mapped structs, constants and api methods etc.
 
-In the generated `pen_bp.go`:
+In the generated ``pen_bp.go``:
 
 * The ``enum Color`` in bitproto is mapped to a ``type`` definition on unsigned integer
   statement in Go, and the enum values are mapped to constants:
@@ -85,7 +85,7 @@ If you wish to install bitproto go library to local vendor directory via ``go mo
 Run the code
 ^^^^^^^^^^^^
 
-Now, we create a file named  `main.go` and put the following code in it:
+Now, we create a file named ``main.go`` and put the following code in it:
 
 .. sourcecode:: go
 
@@ -109,13 +109,13 @@ Now, we create a file named  `main.go` and put the following code in it:
    	fmt.Printf("%v", p1)
    }
 
-Notes to replace the import path of the generated `pen_bp.go` to yours.
+Remember to replace the import path of the generated ``pen_bp.go`` with yours.
 
-In the code above, we firstly creates a ``p`` of type ``Pen`` with data initilization,
+In the code above, we first create a ``p`` of type ``Pen`` with data initialization,
 then call a method ``p.Encode()`` to encode ``p`` and return the encoded buffer ``s``, which
 is a slice of bytes.
 
-In the decoding part, we constructs another ``p1`` instance of type ``Pen`` with zero initilization,
+In the decoding part, we construct another ``p1`` instance of type ``Pen`` with zero initialization,
 then call a method ``p1.Decode()`` to decode bytes from buffer ``s`` into ``p1``.
 
 The compiler also generates json tags on the generated struct's fields. And generates a method ``String()``

diff --git a/docs/index.rst b/docs/index.rst
@@ -10,7 +10,7 @@ The bit level data interchange format
 Introduction
 ------------
 
-Bitproto is a lightweight, easy-to-use and production-proven bit level data
+Bitproto is a fast, lightweight and easy-to-use bit level data
 interchange format for serializing data structures.
 
 The protocol describing syntax looks like the great
@@ -50,6 +50,7 @@ Features
    - :ref:`C (ANSI C)<quickstart-c-guide>` - No dynamic memory allocation.
    - :ref:`Go <quickstart-go-guide>` - No reflection or type assertions.
    - :ref:`Python <quickstart-python-guide>` - No magic :)
+- Blazing fast encoding/decoding (:ref:`benchmark <performance-benchmark>`).
 
 Code Example
 ------------
@@ -113,7 +114,7 @@ The differences between bitproto and protobuf are:
 
 * bitproto doesn't use any dynamic memory allocations. Few of
   `protobuf C implementations <https://github.com/protocolbuffers/protobuf/blob/master/docs/third_party.md>`_
-  support this except `nanopb <https://jpa.kapsi.fi/nanopb>`_.
+  support this, except `nanopb <https://jpa.kapsi.fi/nanopb>`_.
 
 * bitproto doesn't support varying sized data, all types are fixed sized.
 
@@ -149,6 +150,14 @@ Known shortcomes of bitproto:
   tight and compact. Consider to wrap a compression mechanism like `zlib <https://zlib.net/>`_
   on the encoded buffer if you really care.
 
+* bitproto can't provide :ref:`best encoding performance <performance-optimization-mode>`
+  with :ref:`extensibility <language-guide-extensibility>`.
+
+  There's an :ref:`optimization mode <performance-optimization-mode>` in bitproto that
+  generates plain encoding/decoding statements directly at code-generation time. Because all
+  types in bitproto are fixed-sized, how to encode each field can be determined that early.
+  This mode gives a huge performance improvement, but I still haven't found a way to
+  make it work together with bitproto's extensibility mechanism.
 
 Content list
 ------------
@@ -162,5 +171,6 @@ Content list
     python-guide
     compiler
     language
+    performance
     changelog
     license
diff --git a/docs/language.rst b/docs/language.rst
@@ -314,8 +314,8 @@ Nested types can also be referenced across message scopes:
        Outer.Color color = 1;
    }
 
-A bitproto message opens a scope, bitproto will lookup a type from local scope first
-and then the outer scope. In the following example, the type of field ``color`` is
+A bitproto message opens a scope; bitproto looks up a type in local scopes first
+and then in the outer scopes. In the following example, the type of field ``color`` is
 enum ``Color`` in local ``B``:
 
 .. sourcecode:: bitproto
@@ -326,10 +326,10 @@ enum ``Color`` in local ``B``:
 
    message A {
        message B {
-           enum Color : uint3 {}  // Local first
+           enum Color : uint3 {}
        }
 
-       B.Color color = 1
+       B.Color color = 1   // Local `B.Color` wins
    }
 
 In bitproto, only messages and enums can be nested declared.
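
The "local scopes first, then outer scopes" rule described in this hunk can be sketched as a walk up a chain of scopes. The ``Scope`` class, its methods and the root name ``proto`` are hypothetical names chosen for this illustration; the sketch resolves simple names only (not dotted references such as ``B.Color``) and does not claim to mirror the compiler's implementation.

```python
from typing import Dict, Optional


class Scope:
    """A hypothetical lexical scope: a type table plus a link to its parent."""

    def __init__(self, name: str, parent: Optional["Scope"] = None) -> None:
        self.name = name
        self.parent = parent
        self.types: Dict[str, str] = {}

    def path(self) -> str:
        """Return the dotted path of this scope, e.g. "proto.A"."""
        if self.parent is None:
            return self.name
        return f"{self.parent.path()}.{self.name}"

    def declare(self, type_name: str) -> None:
        """Register a type under its fully qualified name."""
        self.types[type_name] = f"{self.path()}.{type_name}"

    def lookup(self, type_name: str) -> Optional[str]:
        """Search this scope first, then each enclosing scope in turn."""
        scope: Optional[Scope] = self
        while scope is not None:
            if type_name in scope.types:
                return scope.types[type_name]
            scope = scope.parent
        return None


root = Scope("proto")          # file-level scope
root.declare("Color")
a = Scope("A", parent=root)    # message A declares its own Color
a.declare("Color")
b = Scope("B", parent=a)       # message B declares nothing

resolved_in_b = b.lookup("Color")        # nearest declaration: the one in A
resolved_in_root = root.lookup("Color")  # only the file-level one is visible
```

Because the search stops at the first match, a ``Color`` declared in a nearer scope shadows one declared further out.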
@@ -416,7 +416,7 @@ However it is sometimes desirable to bind to a different name, to avoid name cla
 
    import lib "path/to/shared.bitproto"
 
-The statement above import `shared.bitproto` as a name ``lib`` in current bitproto, the reference
+The statement above imports ``shared.bitproto`` under the name ``lib`` in the current bitproto file, so references
 now start with ``lib.``:
 
 .. sourcecode:: bitproto
@@ -432,12 +432,12 @@ now starts with ``lib.``:
 Extensibility
 ^^^^^^^^^^^^^
 
-Bitproto knows exactly how many bits a message occupy at compile time, because all types
+Bitproto knows exactly how many bits a message will occupy at compile time, because all types
 are fixed-size. This may make backward compatibility hard.
 
 It seems ok to add new fields to the end of a message in use, because the structures of
 existing fields are unchanged, the decoding end won't scan the encoded bytes of new fields,
-then the backward-compatibility achieved:
+so backward compatibility is achieved:
 
 .. sourcecode:: bitproto
 
@@ -449,10 +449,11 @@ then the backward-compatibility achieved:
 
 But this mechanism works only if there's no data after this message, that's to say, to make
 this mechanism work, this message should be a top-level message, none of other messages can
-refer it, for instance, it can be a communication packet itself.
+refer to it; for instance, it can only be a communication packet itself.
 
 This mechanism fails with in-middle messages, for instance, we can't add new fields to the
-following message ``Middle``, it affects the decoding of other old fields like ``following_field``:
+following message ``Middle``, because doing so affects the decoding of other old fields, like
+``following_field``:
 
 .. sourcecode:: bitproto
 
@@ -489,7 +490,7 @@ Bitproto introduces a symbol ``'`` to mark a message to be extensible:
 In the code above, ``ExtensibleMessage`` occupies ``1+16`` bits, and ``TraditionalMessage`` still
 occupies ``1`` bit.
 
-By marking a message to be extensible via a single quote, we increases buffer size by two bytes
+By marking a message to be extensible via a single quote, we increase the buffer size by two bytes
 in exchange for the possibility of adding new fields in the future. You should balance buffer size
 and extensibility when declaring a message: mark the messages that will be extended in the future.
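
The size trade-off can be checked with a little arithmetic. The sketch below assumes only what this guide states: an extensible message costs 16 extra bits (two bytes) on top of its fields. The function names ``message_bits`` and ``bytes_needed`` are hypothetical and are not part of any bitproto library.

```python
from typing import Iterable

# Stated in the guide: marking a message extensible adds two bytes.
EXTENSIBLE_OVERHEAD_BITS = 16


def message_bits(field_bits: Iterable[int], extensible: bool = False) -> int:
    """Total bits of a message given the bit width of each of its fields."""
    total = sum(field_bits)
    if extensible:
        total += EXTENSIBLE_OVERHEAD_BITS
    return total


def bytes_needed(nbits: int) -> int:
    """Round a bit count up to whole bytes."""
    return (nbits + 7) // 8


# A message with a single 1-bit field, traditional versus extensible.
traditional = message_bits([1])                  # 1 bit
extensible = message_bits([1], extensible=True)  # 1 + 16 bits

# An empty extensible message nested in another extensible message:
# each level pays its own two bytes.
inner = message_bits([], extensible=True)
outer = message_bits([inner], extensible=True)
outer_bytes = bytes_needed(outer)                # 2 + 2 bytes
```

For small messages the two-byte overhead dominates the payload, which is why it pays to mark only the messages that are expected to grow.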
 
@@ -524,7 +525,7 @@ Back to the example of message ``Middle``, if this message in use is marked to b
        uint7 following_field = 2
    }
 
-Decoding will goes wrong if you exchange data between two ends, of which one marks this message as extensible,
+But decoding will go wrong if you exchange data between two ends, of which one marks this message as extensible,
 and the other marks it as traditional.
 
 Extensible messages can also be nested declared, in the example below, message ``Outer`` occupies ``2+2`` bytes:
@@ -533,6 +534,7 @@ Extensible messages can also be nested declared, in the example below, message `
 
    message Outer' {
        message Inner' {}
+       // Ha, empty extensible messages still cost bytes ~
    }
 
 In addition, arrays can also be marked as extensible:
@@ -551,7 +553,7 @@ It is the same with extensible messages, an extensible array gains ``2`` bytes o
    For enums, extensibility is not supported, because enum values are atomic in targeting languages,
    the decoding end holding an older version protocol will get a wrong enum value if the encoder end
    increases the enum's number of bits, the unsigned integer types mapped in languages may cast large
-   values to smaller values unexpected.
+   values to unexpected smaller values.
 
 .. _language-guide-option: