The screencast below shows a minimal demo of a distributed matrix factorization on a blockchain. It runs on a custom blockchain implemented by Robert Bamler based on the Substrate framework. See code on GitHub.
The above screencast shows a very simplistic movie recommendation system running in a decentralized way on a blockchain.
Recommendation systems are of course not new. Training a machine learning model on many different nodes that are connected through a network is also a well-studied problem in the "federated learning" literature. However, federated learning usually requires a central authority to orchestrate the learning process, and all other nodes of the network have to trust that authority, for better or worse. By contrast, this demo shows truly decentralized learning without any central authority. All nodes that are involved in training the model are equal peers in a permissionless and trustless network. They can also leave the network and return to it at any time without coordination.
This permissionlessness implies a second novelty: in this demo, the model is trained in an extremely incremental way. This requires a new kind of "lazy" machine learning in which certain commitments to trained model parameters are postponed until a later time. It turns out that this can be done by keeping track of posterior correlations between inferred user and item feature vectors in a Bayesian interpretation of the model.
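To make "extremely incremental" concrete, here is a minimal sketch of a matrix factorization model that is updated one rating at a time, as ratings arrive. It is not the algorithm used in the demo: it takes plain stochastic gradient steps on point estimates and does not track any posterior correlations. The class name `IncrementalMF`, the hyperparameters, and the movie identifier are all hypothetical and chosen only for illustration.

```python
import random

DIM = 2                # number of latent features per user/item (assumed)
LEARNING_RATE = 0.05   # step size (assumed)
REGULARIZATION = 0.01  # weight decay strength (assumed)


def dot(a, b):
    """Inner product of two equally long feature vectors."""
    return sum(x * y for x, y in zip(a, b))


class IncrementalMF:
    """Toy matrix factorization trained on one rating at a time (hypothetical)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.users = {}  # user name -> feature vector
        self.items = {}  # item name -> feature vector

    def _new_vector(self):
        # Small random initialization so that gradients are nonzero.
        return [self.rng.gauss(0.0, 0.1) for _ in range(DIM)]

    def submit_rating(self, user, item, rating):
        """Take a single gradient step on the squared error of one rating."""
        if user not in self.users:
            self.users[user] = self._new_vector()
        if item not in self.items:
            self.items[item] = self._new_vector()
        u, v = self.users[user], self.items[item]
        error = rating - dot(u, v)
        for k in range(DIM):
            u_k, v_k = u[k], v[k]  # use the old values for both updates
            u[k] += LEARNING_RATE * (error * v_k - REGULARIZATION * u_k)
            v[k] += LEARNING_RATE * (error * u_k - REGULARIZATION * v_k)

    def query(self, user, item):
        """Predicted rating, or None if the user or item has never been seen."""
        if user not in self.users or item not in self.items:
            return None
        return dot(self.users[user], self.items[item])


model = IncrementalMF(seed=0)
model.submit_rating("Alice", "movie_a", 5.0)
first_prediction = model.query("Alice", "movie_a")
model.submit_rating("Alice", "movie_a", 5.0)
second_prediction = model.query("Alice", "movie_a")
```

Each call to `submit_rating` touches only the two feature vectors involved, which is what lets new ratings be folded in whenever they arrive. The price is that every step commits to a point estimate immediately, which is the kind of early commitment the "lazy" Bayesian approach described above tries to avoid.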
This video is a screencast of a very simplistic recommendation system. Its main purpose is to demonstrate that modern blockchain frameworks are powerful and ergonomic enough to be useful to machine learning researchers.
Apart from matrix factorization being a very simplistic model for movie recommendations, the training algorithm shown in this video also lacks important capabilities. For example, had the last rating by user Rob been submitted before the first rating by user Alice, the implementation shown here would not have been able to make a useful prediction for the query at the end. This has to do with the form of posterior approximation, which currently takes only a limited number of posterior correlations into account and breaks symmetry too early. More research about efficient approximate Bayesian inference methods that keep track of posterior correlations will be necessary for fully useful decentralized machine learning on blockchains.
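The link between discarded posterior correlations and sensitivity to the order of submissions can be illustrated with a toy model. The sketch below is not the demo's matrix factorization or its posterior approximation. It assumes a two-dimensional linear-Gaussian model (prior N(0, I), observations r = x·w plus unit-variance noise), where the exact posterior is Gaussian and can be updated one observation at a time. All function names and data values are hypothetical. With the full precision matrix kept, the result does not depend on the order of the observations. If the off-diagonal terms are dropped after every update, it does.

```python
def solve_2x2(precision, h):
    """Return the mean mu that solves precision @ mu = h for a 2x2 matrix."""
    (a, b), (c, d) = precision
    det = a * d - b * c
    return [(d * h[0] - b * h[1]) / det, (a * h[1] - c * h[0]) / det]


def posterior_mean(observations, keep_correlations):
    """Sequential Bayesian update of w, one observation (x, r) at a time."""
    # Prior N(0, I), stored as precision matrix and precision-times-mean.
    precision = [[1.0, 0.0], [0.0, 1.0]]
    h = [0.0, 0.0]
    for x, r in observations:
        # Exact update for a linear-Gaussian observation with unit noise.
        for i in range(2):
            h[i] += x[i] * r
            for j in range(2):
                precision[i][j] += x[i] * x[j]
        if not keep_correlations:
            # Crude approximation: keep the mean and the diagonal of the
            # precision matrix, but forget how the two components co-vary.
            mean = solve_2x2(precision, h)
            precision[0][1] = precision[1][0] = 0.0
            h = [precision[0][0] * mean[0], precision[1][1] * mean[1]]
    return solve_2x2(precision, h)


# Two made-up observations, processed in both possible orders.
observations = [([1.0, 1.0], 2.0), ([1.0, 0.0], 1.0)]
reordered = list(reversed(observations))

full_a = posterior_mean(observations, keep_correlations=True)
full_b = posterior_mean(reordered, keep_correlations=True)
diag_a = posterior_mean(observations, keep_correlations=False)
diag_b = posterior_mean(reordered, keep_correlations=False)
```

In this toy, `full_a` and `full_b` agree while `diag_a` and `diag_b` do not. In matrix factorization the exact posterior is not Gaussian, so the situation there is harder, but the same mechanism applies: information that is thrown away early cannot be recovered when a later rating would have needed it.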
This website, the screencast video, and the implementation of the blockchain and user interface were created by Robert Bamler.