< 2021-05-15 >

2,056,924 events, 1,154,077 push events, 1,674,494 commit messages, 100,307,891 characters

Saturday 2021-05-15 00:04:34 by Bloopsdoob

I can't remember what I was trying to do

music shit uhhh fuck


Saturday 2021-05-15 00:24:23 by SuperMYL

fuck you

looks like you're at risk of losing your collaborator now (jk)


Saturday 2021-05-15 00:31:30 by Maksim Popov

Fuck you, protobuf! Need to make interfaces better. Need to add performance tests.


Saturday 2021-05-15 06:20:11 by Wilmer Adalid

Updates for: Hat check girl: "Goodness! What lovely diamonds!" Mae West: "Goodness had nothin' to do with it, dearie." -- "Night After Night", 1932


Saturday 2021-05-15 10:12:50 by Iris Morelle

lua/gui2: Restore layout invalidation hack when setting widget labels

Fixes #5755.

This is one of several layout invalidation hacks used in the old Lua GUI2 API that were lost in the implementation of the new API (which the old API now wraps).

It turns out that in the previous API (see 1.14 for reference if you need easy access to the source) there were several calls to invalidate_layout() that weren't there just to look pretty.

For wesnoth.set_dialog_value:

  • One prior to setting the value of a text_box
  • One prior to setting the value of a unit_preview_pane
  • One prior to setting the value of a styled_widget
  • One after changing a widget's visibility to hidden

All of these were removed by the new API implementation, with the last one ending up commented out, restored now in commit 8989eb430311092e66177bc09e1b1c344000d5aa. All the other ones are entirely gone, however.

It turns out that the dialog layout invalidation done for the styled_widget case was essential to get listboxes working correctly after clearing and adding rows to them (which was automatically done right before in the implementation of wesnoth.set_dialog_value by the find_widget() helper function). It was done for every single listbox item, but without it the listbox would become completely useless under certain conditions:

20210501 03:24:37 error gui/layout: grid [_content_grid] place: Failed to place a grid, we have 39,30 space but we need 39,93 space. This happened at a grid with the id '_content_grid' in a 'N4gui27listboxE' with the id 'list' in a 'N4gui24gridE' with the id '_window_content_grid' in a 'N4gui24gridE' with the id '_content_grid' in a 'N4gui215scrollbar_panelE' with the id '' in a 'N4gui24gridE' with the id '' in a 'N4gui26windowE' with the id ''.

Now, it seems horribly inefficient to do this for every single new row added to the listbox, but this solution works.

As for the text_box and unit_preview_pane cases, their presence or absence doesn't seem to affect anything in my testing with a live dialog, so I'm not adding them back until someone runs into a bug with them.


Saturday 2021-05-15 12:15:13 by Joel Robert Justiawan

Okay, let's be serious

Quick milestones! we have

  • Make week list procedural, sort of. not really, but what I mean is, is that now you have weekList.json in the assets/data folder. it mirrors like one from coding. so the game will load the data, parse, and overwrite current weeklist here from json yay!!! finally! easy make mod!!!!!! I hope..
  • more intro textoid
  • install 2 more weekStrong song first. we added just the music right now, not the step. we have Rule The World and uh. wait I forgot the We'll meet again. ah damn! let's just...
  • augment schoolIntro and DialogueBox! let's make compatibility for non-pixel level as well!
  • bruh! forgot the load setting data of Perkedel and Odysee watermark. ah there we go KadeEngineData.hx. whew, we got it!
  • psst, question. why for web uses mp3? cannot ogg?? I have trouble. my songs aren't converted to mp3 yet. we took the wav file and convert to ogg only.. so, that's our problem.
  • NEW SCREEN!!! There are Green Screen, Blue chroma screen, and STUART SEMPLE'S PINKEST PINK SCREEN https://culturehustle.com/products/pink-50g-powdered-paint-by-stuart-semple yay!!! ssshhh! make sure Anish doesn't play this game. he's a hoe! he got Vantablack while we can't!! and he got it for drawing! oof! that's cheating!
  • thx Verwex for the inspiration with your Mic'd up FNF. I think this mod had something that inspires us. hopefully.

Saturday 2021-05-15 16:50:45 by Peter Zijlstra

sched/core: Fix ttwu() race

Paul reported rcutorture occasionally hitting a NULL deref:

sched_ttwu_pending()
  ttwu_do_wakeup()
    check_preempt_curr() := check_preempt_wakeup()
      find_matching_se()
        is_same_group()
          if (se->cfs_rq == pse->cfs_rq) <-- BOOM

Debugging showed that this only appears to happen when we take the new code-path from commit:

2ebb17717550 ("sched/core: Offload wakee task activation if it the wakee is descheduling")

and only when @cpu == smp_processor_id(). Something which should not be possible, because p->on_cpu can only be true for remote tasks. Similarly, without the new code-path from commit:

c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")

this would've unconditionally hit:

smp_cond_load_acquire(&p->on_cpu, !VAL);

and if: 'cpu == smp_processor_id() && p->on_cpu' is possible, this would result in an instant live-lock (with IRQs disabled), something that hasn't been reported.

The NULL deref can be explained however if the task_cpu(p) load at the beginning of try_to_wake_up() returns an old value, and this old value happens to be smp_processor_id(). Further assume that the p->on_cpu load accurately returns 1, it really is still running, just not here.

Then, when we enqueue the task locally, we can crash in exactly the observed manner because p->se.cfs_rq != rq->cfs_rq, because p's cfs_rq is from the wrong CPU, therefore we'll iterate into the non-existent parents and NULL deref.

The closest semi-plausible scenario I've managed to contrive is somewhat elaborate (then again, actual reproduction takes many CPU hours of rcutorture, so it can't be anything obvious):

				X->cpu = 1
				rq(1)->curr = X

CPU0				CPU1				CPU2

				// switch away from X
				LOCK rq(1)->lock
				smp_mb__after_spinlock
				dequeue_task(X)
				  X->on_rq = 9
				switch_to(Z)
				  X->on_cpu = 0
				UNLOCK rq(1)->lock

								// migrate X to cpu 0
								LOCK rq(1)->lock
								dequeue_task(X)
								set_task_cpu(X, 0)
								  X->cpu = 0
								UNLOCK rq(1)->lock

								LOCK rq(0)->lock
								enqueue_task(X)
								  X->on_rq = 1
								UNLOCK rq(0)->lock

// switch to X
LOCK rq(0)->lock
smp_mb__after_spinlock
switch_to(X)
  X->on_cpu = 1
UNLOCK rq(0)->lock

// X goes sleep
X->state = TASK_UNINTERRUPTIBLE
smp_mb();			// wake X
				ttwu()
				  LOCK X->pi_lock
				  smp_mb__after_spinlock

				  if (p->state)

				  cpu = X->cpu; // =? 1

				  smp_rmb()

// X calls schedule()
LOCK rq(0)->lock
smp_mb__after_spinlock
dequeue_task(X)
  X->on_rq = 0

				  if (p->on_rq)

				  smp_rmb();

				  if (p->on_cpu && ttwu_queue_wakelist(..)) [*]

				  smp_cond_load_acquire(&p->on_cpu, !VAL)

				  cpu = select_task_rq(X, X->wake_cpu, ...)
				  if (X->cpu != cpu)
switch_to(Y)
  X->on_cpu = 0
UNLOCK rq(0)->lock

However I'm having trouble convincing myself that's actually possible on x86_64 -- after all, every LOCK implies an smp_mb() there, so if ttwu observes ->state != RUNNING, it must also observe ->cpu != 1.

(Most of the previous ttwu() races were found on very large PowerPC)

Nevertheless, this fully explains the observed failure case.

Fix it by ordering the task_cpu(p) load after the p->on_cpu load, which is easy since nothing actually uses @cpu before this.

Fixes: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")
Reported-by: Paul E. McKenney [email protected]
Tested-by: Paul E. McKenney [email protected]
Signed-off-by: Peter Zijlstra (Intel) [email protected]
Signed-off-by: Ingo Molnar [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: CloudedQuartz [email protected]


Saturday 2021-05-15 17:43:32 by Billy Einkamerer

Created Text For URL [www.sowetanlive.co.za/news/south-africa/2021-05-15-i-love-you-my-child-emotions-run-high-at-memorial-for-murder-victim-yolandi-botes/]


Saturday 2021-05-15 18:03:32 by Ken Collins

Add AWS Lambda Supported "Web Server"

Hey y'all, hope everyone is doing well. My name is Ken Collins and I work at Custom Ink (the t-shirt place), and I have been fortunate enough to also be recognized by AWS as a Serverless Hero. I have authored a Rack adapter, primarily targeted at Rails, to work with AWS Lambda. More info in the proposed link's destination and description.

https://lamby.custominktech.com

Lamby is a Rack adapter that converts AWS Lambda integration events into native Rack Environment objects which are sent directly to your application. Lamby can do this when using either API Gateway REST API, API Gateway HTTP API v1/v2 payloads, or even Application Load Balancer (ALB) integrations.
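(Not Lamby's actual code, which is Ruby and lives at the link above, but to make that conversion concrete, here is a rough sketch of the same idea in Python/WSGI terms, since WSGI plays the role for Python apps that Rack plays for Ruby ones. The event fields are from API Gateway's HTTP API v2 payload; the function name and defaults are made up for illustration, and a real adapter also handles base64 bodies, multi-value headers, and the other payload shapes mentioned above.)

```python
import io
from urllib.parse import unquote

def apigw_v2_to_environ(event):
    """Map an API Gateway HTTP API (v2) event onto a WSGI environ dict (illustrative only)."""
    http = event["requestContext"]["http"]
    headers = event.get("headers") or {}
    body = (event.get("body") or "").encode("utf-8")

    environ = {
        "REQUEST_METHOD": http["method"],
        "PATH_INFO": unquote(event.get("rawPath", "/")),
        "QUERY_STRING": event.get("rawQueryString", ""),
        "SERVER_NAME": headers.get("host", "lambda"),
        "SERVER_PORT": headers.get("x-forwarded-port", "443"),
        "SERVER_PROTOCOL": "HTTP/1.1",
        "CONTENT_TYPE": headers.get("content-type", ""),
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": headers.get("x-forwarded-proto", "https"),
        "wsgi.input": io.BytesIO(body),
        "wsgi.errors": io.StringIO(),
    }
    # Remaining headers become HTTP_* keys, as WSGI (and Rack) expect.
    for name, value in headers.items():
        if name.lower() in ("content-type", "content-length"):
            continue
        environ["HTTP_" + name.upper().replace("-", "_")] = value
    return environ
```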

The fact our Ruby community has this HTTP standard to allow this to happen is just... well... amazing. I really appreciate the work here and the abstractions we have obtained. It has made the work I have done for Lambda really enjoyable. I used to only contribute to Rails via ActiveRecord adapter's like the SQL Server adapter. It has been fun learning both Rack and HTTP for a change.

Possible Feedback?

Considering that y'all are open to helping promote this Rack project for AWS Lambda, is there anything else I can do to help or change the pull request in any way to meet other's needs? For example:

  • Is this for Rails only? No... Lamby supports any Rack application and we have even decoupled our integration from Rails & ActiveSupport to allow that to happen easily.
  • Would you rather the link go to our basic GitHub (https://github.com/customink/lamby) page? I'd prefer the little product site because it has a bit of copy explaining how we leverage Rack.
  • Do you have issues with this fitting into the web server section? It is a little mind bending but in these cases with Lambda & Rack integration the web server is best described as one of API Gateway (REST or HTTP) or an ALB. But if you tilt your head just right... the web server is basically Lambda since each instance is analogous to a server process. Given the Lamby gem is a Rack adapter for ALB Lambda Integration, it did not feel right saying Lamby is a Rack adapter for only API Gateway. Thoughts?

Saturday 2021-05-15 20:08:53 by Ben Dornis

Updating: 5/15/2021 8:00:00 PM

  1. Added: Dear Diary-Cats are Magical #FriendlyFuture #dogsontiktok #deardiary @zefrank #fyp #petsontiktok #sayhello #animalparty #dogsrule #moreyouknow (https://www.tiktok.com/@boxer_winston/video/6960295061552663814?_d=secCgYIASAHKAESMgowrI%2Fg2b8J236Z%2BzTDofKkzPkyRLQr0y3sLjg1%2B1rcoLpEWenOej45%2BxkSzGKsbwq7GgA%3D&language=en&preview_pb=0&sec_user_id=MS4wLjABAAAA8a_OQRNHc5xiQvN9d8ZyAyksjvy_xGLEair-Z5E2uXejG1BQMMixXkSZQ84XEudL&share_app_id=1233&share_item_id=6960295061552663814&share_link_id=45B37205-0DCA-49BA-8A2B-707294F11428&timestamp=1621105478&u_code=dd7gf8aedf1a8k&user_id=6844722995261604869&_r=1)

Generation took: 00:08:44.4196738
Maintenance update - cleaning up homepage and feed


Saturday 2021-05-15 20:23:02 by FaultyTwo

Delete fine-night-funkout - Shortcut.lnk

stupid ass shortcut makes my repo look bad!


Saturday 2021-05-15 20:27:48 by Marko Grdinić

"9:35am. I am up. In one way things are great - for the first time, I have a scheme for an actual learning system. The supervised learning that was so successful in the past decade is nothing more than fine tuning to random initializations. To have real learning what is needed is a system that is selective towards the data that it takes in and absorbs its meaning in a spongy fashion.

For the first time, I have an answer to how a real learning system could be constructed using backprop as a crutch to do local learning. It really was not easy, getting to the duality gap method was killer.

On the bad side, such a system really blows a hole through the current hardware setup. I really need something better than the GPU to take advantage of it.

For the first time in a while I feel how tragic it is that memristors did not come to fruition, and neither are they on the horizon. Will the Singularity 2029 date have to be moved to 2045 because of this? It could be.

https://www.researchgate.net/publication/351221765_Brain-inspired_computing_We_need_a_master_plan https://www.researchgate.net/publication/351575466_2021_Roadmap_on_Neuromorphic_Computing_and_Engineering https://www.researchgate.net/publication/351247779_Adjoint_Equations_of_Spiking_Neural_Networks https://deepai.org/publication/eventprop-backpropagation-for-exact-gradients-in-spiking-neural-networks

I found a citation in that third paper that EventProp (exact backprop for SNNs) should be implementable on Loihi. Besides doing some more reading on neuro computing, I want to investigate PILCO again.

Has anybody managed to get the same sample efficiency as it by replacing GPs with ensembles?

9:50am. Oh that third link is a PhD thesis by the author of the EventProp paper.

https://arxiv.org/abs/1807.01675 Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion

https://keawang.github.io/assets/documents/stanton2019model.pdf Model-based Policy Gradients with Entropy Exploration through Sampling

Training a deterministic policy with a model based reinforcement learning algorithm is a delicate procedure. While deterministic policies are very data-efficient, they are not nearly as forgiving of model error and gradient noise as stochastic policies. In practice this makes algorithms like PILCO very time-consuming to implement, with brittle dependence on design choices like task specific initialization schemes and policy architecture choice. In this work we present two practical improvements that make PILCO significantly more robust and easier to implement. We relax the moment-matching approximation and instead rely only on Monte Carlo reward estimates, and propose a model entropy regularization term that improves the reliability and data-efficiency of the algorithm while replicating the exceptional data-efficiency of PILCO.

I had no idea PILCO was brittle. Let me see if somebody got it to work through ensembles.

PILCO was extended to use probabilistic ensembles of neural networks in Gal et al. (2016).

Chua et al. (2018) proposed an algorithm (PETS) for model predictive control (MPC) with probabilistic neural network ensembles.

Gal is the guy who worked on dropout, I think.

The exceptional reliability and data efficiency of the original PILCO implementation appears to be a consequence, at least in part, of a great deal of handtuning.

https://eide.ai/rl/2019/01/17/pilco+improve.html

He used dropout to get the same sample efficiency. Maybe I do not really understand what exactly PILCO does.

10:20am. Right now I am reading the STEVE paper.

Gal et al. [11] use a deep neural network in combination with the PILCO algorithm [4] to do sample-efficient reinforcement learning. They demonstrate good performance on the continuous-control task of cartpole swing-up. They model uncertainty in the learned neural dynamics function using dropout as a Bayesian approximation, and provide evidence that maintaining these uncertainty estimates is very important for model-based reinforcement learning.

All of these variants show the same trend: fast initial gains, which quickly taper off and are overtaken by the baseline. STEVE is the only variant to converge faster and higher than the baseline; this provides strong evidence that the gains come specifically from the uncertainty-aware reweighting of targets. Additionally, we find that increasing the rollout horizon increases the sample efficiency of STEVE, even though the dynamics model for Humanoid-v1 has high error.

10:30am. Let me read the Deep PILCO paper by Yarin Gal next.

Here we attempt to answer these shortcomings by replacing PILCO’s Gaussian process with a Bayesian deep dynamics model, while maintaining the framework’s probabilistic nature and its data-efficiency benefits.

10:35am. Ok, enough of this. I do not get these papers. What I'll assume is that GPs can be replaced with ensembles, even if I do not exactly get how it works in the case of Deep PILCO.
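To make the "GPs replaced by ensembles" idea concrete, here is a minimal sketch of a probabilistic neural-network ensemble dynamics model in the spirit of the Deep PILCO / PETS work mentioned above. All names, sizes, and hyperparameters are illustrative assumptions, and the rollout/planning machinery those methods build on top is left out; the point is just that each member predicts a Gaussian over the next-state delta, and disagreement between members stands in for the GP's epistemic uncertainty.

```python
import torch
import torch.nn as nn

class ProbDynamicsModel(nn.Module):
    """One ensemble member: predicts a Gaussian over the next-state delta."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * state_dim),  # mean and log-variance
        )

    def forward(self, state, action):
        mean, log_var = self.net(torch.cat([state, action], dim=-1)).chunk(2, dim=-1)
        return mean, log_var.clamp(-8.0, 2.0)

def gaussian_nll(mean, log_var, target):
    # Negative log-likelihood of the target under the predicted Gaussian.
    return (((target - mean) ** 2) * torch.exp(-log_var) + log_var).mean()

def train_ensemble(models, states, actions, deltas, epochs=100, lr=1e-3):
    for model in models:
        # Each member gets its own optimizer and its own bootstrap resample.
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        idx = torch.randint(0, len(states), (len(states),))
        for _ in range(epochs):
            mean, log_var = model(states[idx], actions[idx])
            loss = gaussian_nll(mean, log_var, deltas[idx])
            opt.zero_grad(); loss.backward(); opt.step()

def predict_with_uncertainty(models, state, action):
    """Ensemble mean prediction plus a disagreement-based uncertainty estimate."""
    means = torch.stack([m(state, action)[0] for m in models])
    return means.mean(0), means.std(0)  # spread across members ~ epistemic uncertainty
```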

https://www.researchgate.net/publication/351221765_Brain-inspired_computing_We_need_a_master_plan https://www.researchgate.net/publication/351575466_2021_Roadmap_on_Neuromorphic_Computing_and_Engineering https://www.researchgate.net/publication/351247779_Adjoint_Equations_of_Spiking_Neural_Networks https://deepai.org/publication/eventprop-backpropagation-for-exact-gradients-in-spiking-neural-networks

Let me move on to these. I want to consume neurochip papers.

https://www.researchgate.net/publication/351221765_Brain-inspired_computing_We_need_a_master_plan

New computing technologies inspired by the brain promise fundamentally different ways to process information with extreme energy efficiency and the ability to handle the avalanche of unstructured and noisy data that we are generating at an ever-increasing rate. To realise this promise requires a brave and coordinated plan to bring together disparate research communities and to provide them with the funding, focus and support needed. We have done this in the past with digital technologies; we are in the process of doing it with quantum technologies; can we now do it for brain-inspired computing?

This one is short so let me start with it.

Let me just get this out of my system. I have level 1 set with the current scheme. I have level 2 set with the spongy temporal hierarchy plan. The neuromorphic stuff will be level 3 that will allow me to actually use level 2 and get me all the way to the Singularity.

Even though I cannot use level 2, I should be able to predict it.

1,000x or three orders of magnitude efficiency means I could train something that would cost me 10k on current GPUs for only 10$. I absolutely need this. I am not made of gold. I do not want to turn my room into a data server room.

For level 3, I'll need the meta learning of arbitrary depth of the kind Schmidhuber talks about. Right now I feel like I have many of the basic principles down, but I do not have everything.

I am a really robust 3/5 in ML at the moment. More than just curve fitting, it is only at this stage that I could make a system capable of actual learning.

10:45am. The paper is good; it actually hammers home the economic case for ML. AlphaGo literally cost between $1m and $10m to train.

4/9. Oh, there are some embryonic systems that use memristors.

11:10am. https://www.researchgate.net/publication/351575466_2021_Roadmap_on_Neuromorphic_Computing_and_Engineering

This one is next. I'll skip the EventProp paper, but I'll take a look at the PhD thesis by the author.

This roadmap is 140 pages long.

Modern computation based on the von Neumann architecture is today a mature cutting-edge science. In this architecture, processing and memory units are implemented as separate blocks interchanging data intensively and continuously. This data transfer is responsible for a large part of the power consumption. The next generation computer technology is expected to solve problems at the exascale — that is a quadrillion (10^18) calculations each second. Even though these future computers will be incredibly powerful, if they are based on von Neumann type architectures, they will consume between 20 and 30 megawatts of power and will not have intrinsic physically built-in capabilities to learn or deal with complex and unstructured data as our brain does. Neuromorphic computing systems are aimed at addressing these needs. The human brain performs about 10^15 calculations per second using 20W and a 1.2L volume. By taking inspiration from biology, new generation computers could have much lower power consumption than conventional processors, could exploit integrated non-volatile memory and logic, and could be explicitly designed to support dynamic learning in the context of complex and unstructured data. Among their potential future applications, business, health care, social security, disease and viruses spreading control might be the most impactful at societal level.

Exascale is 10^18.

So based on the above, the human brain is 10^6 times more energy efficient than current hardware. I am not sure if this is meant to be CPU or GPU. But since we are talking about supercomputing, I guess it will be a hybrid of both.

11:15am. This roadmap has a section on materials. It will definitely tell me the memristor status.

12:25pm. 94/153. Let me have breakfast here.

12:55pm. Done with breakfast. Ch 46 of Hellrage and then I'll do the chores. The MTL is so bad it is bad, but it is good enough to understand the story and even enjoy it.

Today I am going to get my neuromorphic interests out of the way. Tomorrow, I will definitely do programming. I will stake all of my dreams on getting improved hardware. There is no point in messing with algorithms, when at best all I am able to do is supervised learning. For real AI, I need more.

I am going to get poker out of the way, and if making money using poker agents turns out to be too tough, I'll get a real job. At that point I'll try emailing Mike Davies, asking him whether he wants to sponsor Spiral and give me some toys to play with. But while I wait for the replies, I'll have picked the first offer and will work on that.

1:15pm. Let me do the chores.

1:35pm. Let me resume. I'll be done with the roadmap soon.

1:45pm. 110/153. This chapter that contrasts rate coding and rank coding is interesting. I had no idea how much efficiency could be gained by just moving to spikes. About 40x for rank coding and almost 200x for n of m coding.

2:30pm. Ugh, these last few chapters were the pits. Let me move to the PhD.

2:40pm. > I demonstrate the algorithm’s effectiveness by applying it to several standard machine learning problems and find that it performs either better or equivalently to surrogate gradient based training. Since the algorithm admits a purely event-based implementation, it has memory advantages over surrogate gradients as well: It is only necessary to store data at events local to each computational unit (neuron). Moreover, the events during the backward pass can reuse routing information used during the forward pass. This implies it is possible to implement the EventProp algorithm on (digital) neuromorphic hardware, such as Loihi [44], Spinnaker [60], or Tianjic [125], efficiently. It also means it can be efficiently implemented on a hybrid digital-analog system like BrainScaleS-2.

3:10pm. I am reading some SNN papers.

https://arxiv.org/pdf/2103.12593.pdf

This one, for example, uses surrogate gradients + backprop to make a spiking RNN that is SOTA on some benchmarks. The paper concludes that plenty of surrogate gradient functions work well, but what the network is really sensitive to is their overall scale.

3:30pm. 68/140. I am mostly skimming the paper. I definitely won't study the equations. But based on the paper I linked above, which cited the EventProp paper, it does seem like SNNs have a bright future ahead of them. This kind of math is not my forte.

Figuring out how to stack independent GANs and then using discriminator uncertainty to drive their learning behavior is the kind of broad minded planning that I excel at. Guys like this who are doing the micro stuff are better than I could ever be at it.

3:50pm. Ok, enough of the PhD. I did not get much out of anything I've looked at today. At most I've broadened my horizons a bit.

But with this level of brainpower going into SNNs, they will eventually have their rise.

https://www.youtube.com/results?search_query=memristor

There are some new memristor vids, so let me go for them next.

https://www.youtube.com/watch?v=ieMdXnjQTXc Leon Chua, UC Berkeley - 10 Things You Didn't Know About Memristors

This is from 3 years ago. I really liked the Memristor Minds talk by Chua.

https://www.youtube.com/watch?v=yChwSOXO538 Neuromorphic computing with memristors: from device to system - Professor Huaqiang Wu

I'll go for this next.

https://youtu.be/ieMdXnjQTXc?t=102

The first thing you did not know is that all non-volatile solid state memories are memristors.

I was wondering about this, but it is good that it is finally out in the open.

The second thing you did not know is that not all memristors are non-volatile.

4:10pm. Ok, enough of this talk. It is just rambling over a bunch of equations. Let me go for that second talk and I will call it a day early.

4:15pm. Actually, I need to do an extra round of chores. Let me stop here for a bit.

4:35pm. Let me resume.

4:50pm. https://youtu.be/yChwSOXO538?t=899

I'll admit, I am not really watching this and instead am lurking /a/ on the side. The speaker's English is so poor here. Instead of giving a history review, when will he get to memristors?

5:10pm. https://youtu.be/yChwSOXO538?t=1436

Here he is finally getting to the point. I should focus here.

He mentioned a lot of startups in China working on this.

https://youtu.be/yChwSOXO538?t=2223

Here he is talking about how memristors give good results despite their variability. He said it surprised him that they got 95% on Mnist despite the issues at the single device level.

https://youtu.be/yChwSOXO538?t=2325

It seems that full chip systems are quite recent. This work is from 2020.

https://youtu.be/yChwSOXO538?t=2564

This seems like 100x the efficiency of a GPU. Good start. I want 1,000x and even more.

This is 130nm technology. That could be shrunk down below 10nm.

https://youtu.be/yChwSOXO538?t=2929

Here is the future perspective. The talk is almost done and it does give me some relief. If the west screws up, China can become the new west.

5:50pm. So 10+ years before I have the kinds of chips I can use to make AGI. Seems reasonable.

6pm. https://youtu.be/yChwSOXO538?t=3516

I was going to skip the discussion, but it is interesting enough for me to listen.

https://youtu.be/yChwSOXO538?t=3649

Wait, this question made me realize that those chips he demoed might not be spiking, but analogue values. That could be a factor of 100x that was left on the table.

https://youtu.be/yChwSOXO538?t=3687

Here he says that even though memristors are very good for SNNs, he wants to merge his work with existing tech and that SNNs algos are not good enough yet. His dream is to come up with a product in the next few years.

https://youtu.be/yChwSOXO538?t=3755

The chip they are working on right now is targeting ImageNet. They'll have something out in a year.

6:20pm. I guess I have some extra material for the PL review. And the half yearly review.

https://www.bing.com/videos/search?q=Simon+Thorpe&docid=608032764935409399&mid=F1369F2184A55A56CF0FF1369F2184A55A56CF0F&view=detail&FORM=VIRE "Finding repeating patterns: A key to intelligence in man and machine?" by Simon Thorpe

I'd like something better to link for the rate coding chapter than the 140 page roadmap.

6:35pm. I guess I'll go through this video as well.

7pm. https://youtu.be/602d76Yri84?t=1421

These memory games are interesting in their implication. Yeah, the scheme I came up with allows for temporal hierarchies, but it would not allow for this phenomenon...

...I think. Hmmm...

Well, it would be tough to keep it all in short-term memory.

7:25pm. Done with lunch.

https://youtu.be/602d76Yri84?t=1719

This is interesting. They have an algorithm to find these repeating patterns. I should look into it.

It can tell the difference between noise and when something happens.

7:30pm. I can't find what it is...Ah, Google Scholar has his papers. I see that some of them are on repeating patterns. Let me finish the talk and I'll look them up.

8:05pm. I can't find the algorithm. I skimmed the patent, but it is not clearly described there. Still...

https://www.frontiersin.org/articles/10.3389/fncom.2018.00024/full Unsupervised Feature Learning With Winner-Takes-All Based STDP

All of this guy's ideas are based around Hebbian learning, so JAST is probably some variant of that. I am decently familiar with Hebbian learning. I did my best mad scientist (bad version) impersonation when I pestered Miconi back in 2018 so I have vivid memories of that which I'd rather forget.

The trouble with Hebbian learning is that it is difficult to combine with backprop in an efficient manner. I could not figure out how to lower its memory expenditure. It would require copying the weight matrix at every step. It might be possible to reverse the updates, and I played with those ideas, but I gave up on that as it would be too much of a hassle to implement and the benefits of Hebbian learning aren't apparent. Backpropping through the weight matrix also resulted in very large gradients when the recurrent connections were included.
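For reference, a minimal sketch of the kind of plastic layer being discussed, loosely in the style of Miconi's differentiable plasticity; every name, size, and constant here is illustrative. It also shows where the memory cost mentioned above comes from: to backprop through time, autograd has to keep each step's Hebbian trace (an extra hidden-by-hidden matrix per step) alive in the graph.

```python
import torch

def plastic_step(h, w, alpha, hebb, eta=0.1):
    """One recurrent step with a Hebbian (plastic) weight component.
    `w` and `alpha` are trained by backprop; `hebb` is the fast Hebbian trace."""
    h_new = torch.tanh(h @ (w + alpha * hebb))
    # Hebbian update: strengthen connections between co-active units.
    hebb_new = (1.0 - eta) * hebb + eta * torch.outer(h, h_new)
    return h_new, hebb_new

n = 256
w = (0.01 * torch.randn(n, n)).requires_grad_()
alpha = torch.zeros(n, n, requires_grad=True)

h, hebb = 0.1 * torch.randn(n), torch.zeros(n, n)
loss = 0.0
for t in range(100):
    h, hebb = plastic_step(h, w, alpha, hebb)
    loss = loss + h.pow(2).mean()  # stand-in objective
# Backward needs every intermediate `hebb` (one n-by-n matrix per step),
# which is the memory blow-up described above.
loss.backward()
```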

I forgot about this until watching the video.

8:15pm. I do not know what to think of it. Maybe the right thing would have been to not pass the gradients going through plastic layers at all. How would I use and take advantage of such a scheme within the bounds of a bigger system?

8:25pm. Forget this. Though I am familiar with Hebbian learning, this video reminds me of how difficult long term memory is.

But I am sure that the solution is not to keep a trace. Rather the solution is in the direction of factoring out the time. This idea I had is in one direction, but I do not think it will do what I want which is keeping around long term memories in the way I want. The filtering it does should be a qualitative jump in learning quality, but I am not there yet.

The kind of Hebbian learning being done, which resembles a histogram would be one way of factoring out time.

8:50pm. I've gone into thought, but I can't figure it out. I can't believe I still haven't closed. I expected that I would stop today before 6pm and yet I am still trucking. Tomorrow I'll get some programming done.

Let me put in a few final words. The problem is that if you are doing Hebbian learning, then you are automatically doing unsupervised learning. You can't really modulate that because the unsupervised learning would overwrite it. If you modulate it, then you just end up with the backprop learning rule for multiplying weight matrices.

8:55pm. Sigh. There are still many secrets left to uncover, even in the linear layer. The way the brain uses reconstruction, I can actually vividly see it now. Different areas collate their inputs, push the information to the upper layers, and then the weight matrices are run backwards to reconstruct the inputs even in modules that would otherwise be unrelated.

Even today's NNs should be able to do that.
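That "run the weight matrices backwards" picture is, at its most basic, what a tied-weight autoencoder does; a tiny sketch with made-up sizes, just to pin down the mechanics:

```python
import torch

n_in, n_hidden = 784, 128
W = 0.01 * torch.randn(n_hidden, n_in)  # one weight matrix, used in both directions

x = torch.rand(n_in)
h = torch.relu(W @ x)   # forward: push the input up to the next layer
x_hat = W.t() @ h       # backward: run the same weights in reverse to reconstruct
```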

9pm. But to do it on the scale the brain does, just how much memory would really be needed? Quite a lot.

No, I cannot compare. I am 3/5 and not 4/5 or 5/5 in ML for a reason. I do not understand the brain's masterful use of memory.

9:05pm. Just like with those moving average ideas and the training scheme ideas, I am missing profundity when it comes to Hebbian learning.

9:10pm. I do not think it is great mathematical insight that I am missing here. The great algorithms of the future won't necessarily be complex. Rather the understanding of them will illuminate the mysteries of the present."


Saturday 2021-05-15 20:27:48 by Marko Grdinić

"9:30pm. I am still thinking about it. I just had an idea. It should have been really obvious, but I missed it.

If I wanted to use backprop for doing long term credit assignment, there is an option besides predicting the gradients or keeping impossibly long traces around. That would be to reconstruct the inputs! This would enable the input * grad rule to work.

9:45pm. This would not be possible with today's dense nets, those need the traces to be kept around, but it might be the truth that with certain kinds of architectures, the weights themselves are enough to have an implicit trace.

I admit, I was too focused on predicting the gradient. 'Predicting' the inputs never even occurred to me.

Now that I am thinking about it, I can sort of sense that the sparser the net, the more accurate the implicit trace would be.

9:55pm. This scheme working would require large sparse vectors to be kept around as memories. If it is something like 2 ^ 8 possible inputs to a layer, it would not say much, but 2 ^ 1024 might be such a large amount of information that the trace could be reconstructed exactly. It could literally be only a single possible point of experience. Those keys could be concatenated RNN outputs which would make them even more unique.

10pm. Keeping large keys around would be a memory expenditure, but it would be exponentially less than keeping around traces. The traces would get stale very soon anyway, and from a credit assignment perspective it might not even be what you want.

10:10pm. I am certain - this is in fact the secret behind long term memory credit assignment in the brain. The GAN idea is good, but it is something else.

10:15pm. So that is another piece of the puzzle that falls into place. In the future, I'll have to bring these principles together to form a coherent learning system.

10:20pm. Agh, I never feel more like an NPC than at times like this. Am I the one having these ideas, or are they being fed into my brain by some higher power? How otherwise could I explain not seeing something so obvious?

Annoyingly, though I haven't done much programming in the last 3 weeks, my research effort seems to be completely justified in hindsight. I completely expected that the last two days of reading papers instead of programming would be a complete waste of time, but I keep having great ideas one after another. I was sure I would get stuck and achieve nothing, like in 2018."


Saturday 2021-05-15 21:35:06 by Andrew Hamblin

implement indexing for LinkedList

In order to write a sane automatic test for "can insert many items", it would be really nice to have an indexer. So I removed the test and implemented that. In the process I did some simple refactors to keep the tests easy to read.

Being disciplined about doing TDD the Uncle Bob way is quite slow for me. I feel like my impl of at may not be correct. But I couldn't see how I would implement either inserts or indexing (which are really the same logic) in little Uncle Bob TPP-sized chunks.

But this is why I practice. And take the time to reflect after the session. I'm now wondering if I may not want to create an internal method that takes us to the right node to actually do our operations on it.
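The actual code is presumably JavaScript/TypeScript given the Jest mention, but the shape of that internal-walker idea, sketched quickly in Python with purely hypothetical names, would be something like a single `_node_at` helper that both `at` and `insert` lean on:

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value, self.next = value, next_node

class LinkedList:
    def __init__(self):
        self.head = None
        self.length = 0

    def _node_at(self, index):
        """Walk to the node at `index`; the shared traversal both operations need."""
        if index < 0 or index >= self.length:
            raise IndexError(index)
        node = self.head
        for _ in range(index):
            node = node.next
        return node

    def at(self, index):
        return self._node_at(index).value

    def insert(self, index, value):
        if index == 0:
            self.head = Node(value, self.head)
        else:
            prev = self._node_at(index - 1)
            prev.next = Node(value, prev.next)
        self.length += 1
```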

Relating to testing pain though: I would actually love a tool that let me 'pin' a test as expected to fail while I work on something I want to use to fix it. Because my aesthetic sense screams when I leave a test failing to get some dependency of it working. Like when I removed the test of "can insert many" in a false green state.

I'm just not used to making changes that small, or thinking of my code as something it's worth changing in that tight of a loop.

The Jest autorunning and tooling in vscode is really great. I'm actually struggling to imagine this experience being as comfortable for coding in either webstorm or emacs. Much less in the kind of cozy 'suckless' console environment I recall really liking in college.


Saturday 2021-05-15 22:38:08 by Billy Einkamerer

Created Text For URL [www.dispatchlive.co.za/news/2021-05-15-i-love-you-my-child-emotions-run-high-at-memorial-for-murder-victim-yolandi-botes/]


Saturday 2021-05-15 23:16:44 by Michael

It's been a wild ride, but I think parens are finally done. I'm sure bugs exist, but I can't find any right now after some testing, so life is good, man. Life is good. I could potentially add errors and result = errors for malformed expressions. But I think I'm really proud of this one, and a little burnt out. I really want to move on and try NODE stuff. I have spent a massive amount of time working on this. Nearly an embarrassing amount of time. If I had left this project without parens, I probably could have called it a wrap so, so much sooner. holy fuck, fuck parens lol


< 2021-05-15 >