-
Notifications
You must be signed in to change notification settings - Fork 95
/
notes.txt
445 lines (326 loc) · 12.7 KB
/
notes.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
Code Quality
============
* errchk ./...
* go vet ./...
IDEA replace all .(cast) with CheckExactString or whatever python calls it...
Then can use CheckString later...
Instead of using TypeCall etc, just implement all the __methods__ for
Type. Then there is one and only one way of calling the __methods__.
How subclass a builtin type? Methods etc should work fine, but want
CheckString to return the "string" out of the superclass.
Things to do before release
===========================
* Line numbers
* frame.Lasti is pointing past the instruction which puts tracebacks out
* Subclass builtins
* pygen
* consider whether to re-use the grumpy runtime
FIXME recursive types for __repr__ in list, dict, tuple
>>> L=[0]
>>> L[0]=L
>>> L
[[...]]
Features
========
* lexes, parses and compiles all the py3 source in the python distribution
find /usr/lib/python3.4 /usr/lib/python3/ -type f -name \*.py | xargs testparser
Failes on /usr/lib/python3/dist-packages/mpmath/usertools.py
- DOS mode problem
Limitations & Missing parts
===========================
* string keys only in dictionaries
* \N{...} escapes not implemented
* lots of builtins still to implement
* FIXME eq && ne should throw an error for a type which doesn' have eq implemented
* repr/str
* subclassing built in types - how? Need to make sure we have called the appropriate method everywhere rather than just .(String)
* FIXME how do mapping types work?
* PyMapping_Check
* is it just an interface?
* name mangling for private attributes in classes
Type ideas
==========
* Embed a zero sized struct with default implementations of all the methods to avoid interface bloat
* would have a pointer to the start of the object so could do comparisons
* however don't know anything about the object - can't call Type() for instance
* could possibly use non-empty struct { *Type } and get that to provide Type() method
* except for String/Int/Bool/Tuple/Float/Bytes and possibly some others...
* can embed this default interface whether we do fat interfaces or not...
* can have heirachies of embedding
* Make Object a fat interface so it defines all the M__method__ or at least all the common ones
* would make the M__call__ quicker than needing a type assertion
* Or Make the methods be pointers in *Type more like python
* Closer to the way python does it and would be faster than type assertion
* Function pointers would have to be func(a Object) (Object,
error) which would mean a type assertion which we are trying to
avoid :-(
* An interface is the go way of doing this
What methods are a default implementation useful for?
__eq__
__ne__
__repr__
__str__
__lt__ __add__ etc - could return NotImplemented
__hash__ - based on pointer?
Minimum set of methods / attributes (eg on None)
Attributes
__class__
__doc__
Methods
__bool__
__delattr__
__dir__
__format__
__getattribute__
__hash__
__init__
__new__
__reduce__ - pickle protocol - missing!
__reduce_ex__ - pickle protocol - missing!
__repr__
__setattr__
__sizeof__ - missing from py.go
__str__
__subclasshook__ - missing from py.go
Comparison
__eq__
__ge__
__gt__
__le__
__lt__
__ne__
dict
====
Implementation notes
Implement __hash__. This should return an Int which in turn will
convert to an int64 (either a py.Int or a py.BigInt.Int64()). Make
py.BigInt.M__hash__ return a Py.Int and can use that to squash
the return when it is an BigInt.
dict should have a .StringDict() method so can convert easily for
argument passing etc.
Maybe dict should contain a StringDict() so if py.CheckString(Exact?)
then can use the embedded StringDict which is what the majority of
dict access is. Might complicate other things though.
Basic implementation is map[int64][]struct{key, value Object} where
int64 is the hash. Types must implement __hash__ and __eq__ to be a
key in a dict.
Would be much faster if __eq__ and __hash__ were on all types!
Idea: Each slice of objects in dictionary or set can be a SimpleDict
or SimpleSet which is an O(n**2) Dict or Set. This could be the
implementation for a few elements also if we can figure out how to
switch implementation on the fly. Maybe with an interface...
Would like to be able to switch implementation on the fly from
StringDict to SimpleDict to full Dict. Can we define an interface to
make that possible? A dict type then becomes a pointer to an
interface, or probably better a pointer to a struct with a mutex and a
pointer to an interface.
Lookup what go uses as hash key eg int and use that for gpython.
Perhaps use a little bit of assembler to export the go hash routine.
genpy
=====
Code automation to help with the boilerplate of making types and modules
* For modules
* instrospect code
* make module table
* fill in arguments
* fill in docstr (from comments)
* Module methods should be exportable with capital letter
* For a type
* create getters/setters from struct comments
* create default implementations
* independent of use 0 sized embedded struct idea
Todo
====
FIXME move the whole of Vm into Frame! Perhaps decode the extended args inline.
Speedup
* Getting rid of panic/defer error handling
Speed 35% improvement
Before 2430 pystone/s
After 3278 pystone/s
After compile debug out with a constant 9293 pystones/s + 180%
Keep locals 9789 local 5% improvement
Still to do
* think about exception handling - do nested tracebacks work?
* write a test for it!
* Make exception catching only catch py objects?
* perhaps make it convert go io errors into correct py error etc
* stop panics escaping symtable package
CAN now inline the do_XXX functions into the EvalFrame which will probably speed things up!
* also make vm.STACK into frame.STACK and keep a pointer to frame
* probably makes vm struct redundant? move its contents as locals.
* move frame into vm module? Then can't use *Frame in
* tried this - was 20% slower
* Write go module
* "go" command which is go(fn, args, kwargs) - implement with a builtin special in the interpreter probably
* "chan" to make a channel with methods get and put
* "select" - no idea how to implement!
* "mutex"
FIXME need to be able to tell classes an instances apart!
Put C modules in sub directory
Make an all submodule so can insert all of them easily with
import _ "github.com/go-python/gpython/stdlib/all"
Factor main code into submodule so gpython is minimal.
Make a gpython-minimal with no built in stdlib
Bytecode compile the standard library and embed into go files - can then make a gpython with no external files.
NB Would probably be a lot quicker to define
Arg struct {
Name string
Value Object
}
And pass args using []Arg instead of StringDict
Compiler
========
Complete but without optimisation.
Easy wins
* Constant folding, eg -1 is LOAD_CONST 1; NEGATE
* Jump optimisation - compiler emits lots of jumps to jumps
Testing
=======
python3 -m dis hello.py
python3 -mcompileall hello.py
mv __pycache__/hello.cpython-33.pyc hello.pyc
python3 -c 'import hello; import dis; dis.dis(hello.fn)'
go build ./... && go build
./gpython hello.pyc
When converting C code, run it through astyle first as a first pass
astyle --style=java --add-brackets < x.c > x.go
Note: clang-tidy can force all if/while/for statement bodies to be
enclosed in braces and clang-format is supposed to be better so try
those next time.
Roadmap
=======
* make pystone.py work - DONE
* make enough of internals work so can run python tests - DONE
* import python tests and write go test runner to run them
* make lots of them pass
Exceptions
==========
panic/recover only within single modules (like compiler/parser)
Polymorphism
============
Want to try to retain I__XXX__ and M__XXX__ for speed
so first try
ok, I := obj.(.I__XXX__)
if ok {
res = I.M__XXX__(obj)
}
However want to be able to look up "__XXX__" in dict
ok, res := obj.Type().Call0("__XXX__")
ok, res := obj.Type().Call1("__XXX__", a)
ok, res := obj.Type().Call2("__XXX__", a, b)
ok is method found
res is valid if found (or maybe NotImplemented)
Call() looks up name in methods and if found calls it either C wise or py-wise
Calling C-wise is easy enough, but how to call py-wise?
all together
var ok bool
var res Object
if ok, I := obj.(.I__XXX__); ok {
res = I.M__XXX__(b)
} else ok, res = obj.Type().Call1("__XXX__", b); !ok {
res = NotImplemented
}
ObjectType in *Type is probably redundant
- except that Base can be nil
Difference between gpython and cpython
======================================
Uses native types for some of the objects, eg String, Int, Tuple, StringDict
The immutable types String, Int are passed by value. Tuple is an
[]Object which is a reference type as is StringDict which is a
map[string]Object. Note that Tuples can't appended to.
Note that strings are kept as Go native utf-8 encoded strings. This
makes for a small amount of awkwardness when indexing etc, but makes
life much easier interfacing with Go.
List is a struct with a []Object in it so append can mutate the list.
All other types are pointers to structures.
Does not support threading, so no thread state or thread local
variables as the intention is to support go routines and channels via
the threading module.
Attributes of built in types
============================
Use introspection to find methods of built in types and to find attributes
When the built in type is created with TypeNew fill up the type Dict
with the methods and data descriptors (which implement __get__
and __set__ as appropriate)
Introspect attributes by looking for `py:"args"` in the structure
type. If it is writable it has a ",w" flag `py:"args,w"` - we'll use
the type of the structure member to convert the incoming arg, so if it
is a Tuple() we'll call MakeTuple() on it, etc. The attribute can
also have help on it, eg `py:"__cause__,r,exception cause"`
Introspect methods by looking at all public methods
Transform the names, so that initial "M__" becomes "__" then lowercase
the method. Only if the method matches one of the defined function types will it be exported.
FIXME meta data for help would be nice? How attache metadata to go function?
Introspection idea
==================
var method string
if strings.HasPrefix(key, "_") {
method = "M" + key
} else {
method = strings.Title(key)
}
fmt.Printf("*** looking for method %q (key %q)\n", method, key)
r := reflect.TypeOf(self)
if m, ok := r.MethodByName(method); ok {
fmt.Printf("*** m = %#v\n", m.Func)
fmt.Printf("*** type = %#v\n", m.Func.Type())
var fptr func(Object) Object
// fptr is a pointer to a function.
// Obtain the function value itself (likely nil) as a reflect.Value
// so that we can query its type and then set the value.
fn := reflect.ValueOf(fptr).Elem()
// Make a function of the right type.
v := reflect.MakeFunc(fn.Type(), m.Func)
// Assign it to the value fn represents.
fn.Set(v)
fmt.Printf("fptr = %v\n", fptr)
} else {
fmt.Printf("*** method not found\n")
}
Parser
======
Used Python's Grammar file - converted into parser/grammar.y
Used go tool yacc and a hand built lexer in parser/lexer.go
DONE
Go Module
=========
Implement go(), chan()
Implement select() with
http://golang.org/pkg/reflect/#Select
Eg from the mailing list
// timedReceive receives on channel (which can be a chan of any type), waiting
// up to timeout.
//
// timeout==0 means do a non-blocking receive attempt. timeout < 0 means block
// forever. Other values mean block up to the timeout.
//
// Returns (value, nil) on success, (nil, Timeout) on timeout, (nil, closeErr()) on close.
//
func timedReceive(channel interface{}, timeout time.Duration, closeErr func() error) (interface{}, error) {
cases := []reflect.SelectCase{
reflect.SelectCase{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(channel)},
}
switch {
case timeout == 0: // Non-blocking receive
cases = append(cases, reflect.SelectCase{Dir: reflect.SelectDefault})
case timeout < 0: // Block forever, nothing to add
default: // Block up to timeout
cases = append(cases,
reflect.SelectCase{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(time.After(timeout))})
}
chosen, recv, recvOk := reflect.Select(cases)
switch {
case chosen == 0 && recvOk:
return recv.Interface(), nil
case chosen == 0 && !recvOk:
return nil, closeErr()
default:
return nil, Timeout
}
}
My "Incoming()" function, and other places that use this pattern are now quite trivial, just a teeny bit of casting:
func (c *Connection) Incoming(timeout time.Duration) (Endpoint, error) {
value, err := timedReceive(c.incoming, timeout, c.Error)
ep, _ := value.(Endpoint)
return ep, err
}