This article is intended as a resource with links to provide an overview of the cutting-edge techniques that the author believes will shape AI and our world during the 2020s. It also seeks to explain the reasons why we need different approaches to those that we have used over the past decade to create the next wave of exciting technological advances and unicorns. Furthermore, it is written in the aftermath of the events of the Conference of the Parties (COP) 26 of the UN summit on Climate Change and will consider the relevance of Machine Learning to the underlying issues as well as the threats to humanity in terms of AI or Business as Usual (BAU) with life today and governance matters.
Summary of this article:
- Availability and access to data remains a barrier to scaling Machine Learning across a number of key sectors of the economy;
- Organizational culture is also a barrier;
- The rise of the convergence between AI and the IoT known as the AIoT, and also the anticipated arrival of the Metaverse with 5G enabled glass technology will necessitate working with decentralized data but also need greater emphasis on data security and privacy with standards and ethics guidelines;
- The potential and challenges with synthetic data;
- Federated Learning with Differential Privacy;
- What is the real challenge for humanity with AI? Is it super-intelligent AI agents and advanced AI-enabled robots leading to a Terminator scenario, or is this a distraction from the real challenge in the short to medium term, whereby mass surveillance and cyber warfare are more pressing and genuine threats?
- What is the genuine threat to humanity in the medium to long term? And what role can AI play in helping alleviate this challenge?
Venture Capitalists hoping to find the next superstar tech unicorn, AI startup founders dreaming of creating one, and corporates adopting AI all need to consider their data growth strategy in order to be able to scale their AI-enabled services or products.
The past decade has been one of explosive growth in digital data and AI capabilities across the digital media and e-commerce space.
And it is no accident that the strongest AI capabilities reside in the Tech majors. The author argues that there will be no AI winter in the 2020s as there was in 1974 and 1987, because the internet (social media and e-commerce) is so dependent upon AI capabilities, and so too will be the Metaverse and the era of 5G enabled Edge Computing with the Internet of Things (IoT).
Furthermore, the following infographics illustrate how many people globally use social media and hence how central these channels have become to the everyday lives of people.
Likewise, the size of the e-commerce market is vast.
The era of standalone 5G networks may open a window of opportunity for a new wave of consumer-facing business to consumer (B2C) applications in e-commerce, and perhaps even for new digital media platforms that may challenge the current incumbents. After all, the arrival of 4G provided a window for the likes of Airbnb, Uber, and leading social media platforms such as Facebook and Instagram to scale. However, while the author believes that there will be some successful new entrants within the digital media space enabled by standalone 5G networks, the position of some of the tech majors will remain firmly entrenched, with large data and state of the art tech resources to defend their territories.
Hence it is more likely that the next wave of super unicorns and Tech giants will emerge in other sectors of the economy, such as healthcare and finance, and in the business to business (B2B) enterprise sectors that deal with manufacturing, energy and smart city applications, rather than by taking on the existing Tech giants.
The author was a speaker at the Wonderland AI summit 2021 and happened to attend a session by Venture Capitalist Serhat Aydogdu of D4Ventures, who explained that access to data and data strategy are key criteria for D4 Ventures in the screening process for AI startups.
Indeed, the importance of and need for AI on the device itself, rather than residing on a remote cloud server, was demonstrated by Massimiliano Versace (CEO of Neurala), a speaker at the AI Wonderland summit 2021. Neurala collaborated with NASA on enabling AI on the edge for the Mars Rover.
At a logical level, this makes complete sense as the Mars Rover needs to operate autonomously and make decisions in real-time rather than waiting for a server signal to come from planet Earth to planet Mars. It would take a very long time or not arrive at all! The Mars Rover is representative of the emerging space exploration sector requiring AI capabilities on the device itself for the same reason of latency and real-time, dynamic responses required by the machine.
The Rise of the Edge
Furthermore, we also face challenges when we look to scale AI across the economy into areas such as transportation with level 5 (fully) autonomous vehicles, healthcare, and manufacturing.
Let’s start with the author’s analogy between the Ancient Mariner and the Modern Data Scientist that highlights the challenge of scaling AI in certain sectors of the economy.
The Ancient Mariner was stuck on a ship that was surrounded by water. The Modern Data Scientist sits in the era of ever-growing big data.
The Rhyme of the Ancient Mariner
Day after day, day after day,
We stuck, nor breath nor motion;
As idle as a painted ship
Upon a painted ocean.
Water, water, everywhere,
And all the boards did shrink;
Water, water, everywhere,
Nor any drop to drink.
The meaning here of the Rime (Rhyme in modern English) is that although the Ancient Mariner is surrounded by plenty of water as the ship sails the ocean, the water has high salinity and the Mariner lacks the tools to desalinate it, so he is unable to drink it.
A similar parallel can be made with the era of big data as one moves away from the sectors of Digital Media and e-commerce (one may add the energy sector, trading markets, drug discovery, and astronomical physics). We often simply lack the data due to the manner in which the data is captured and stored.
The Rhyme of the Modern Data Scientist outside of digital media and e-commerce, plus a few exceptions such as the energy and astrophysics sectors:
Data, data, everywhere,
And expanding digital footprints did social media shriek,
Data data everywhere,
And yet not a clean stored Gigabyte that may be used.
The above has been one of the key challenges for scaling Machine Learning and Data Science across the sectors outside of digital media. For example, in healthcare much of the data is siloed and regulatory rules demand data privacy, making it harder to collect and access data for Machine Learning models.
The following is a tweet by Andrew Trask, Researcher at the University of Oxford and founder of OpenMined, who is driving the PySyft Federated Learning initiative.
Overcoming the challenge at the Organizational level
Even when we do have sufficient quality data, Gartner reported that many AI and Data Science projects fail due to a lack of strategic alignment and the lack of a clear problem definition, that is, the goal or objective of the project.
There is often a communication gap between the understanding of the business teams and the Data Science teams in relation to the objectives of the business (how are we seeking to generate a return on investment – increase in customer conversions via marketing campaigns, or cost reductions in operations or supply chain?) and the resources and capabilities of the organization to deliver such goals.
Going forward, we need more hybrids who can bridge the gap between Data Science and business strategy. Organizations need to think more like a Tech major whereby AI technology can be a revenue generator (or cost-saver) and not just a cost center. This has to start with the C-Team in the legacy companies outside of the tech world. The C-Team led by the CEO needs to take the initiative and ownership of the AI capabilities of the firm.
AI Project Has 5 Stages
The author argues that both Problem Definition and Operational Fitness need to be considered in the context of a Firm’s business strategy and organizational capabilities.
In order to scale AI across the economy and beyond the confines of digital media and e-commerce we are going to have to further develop our AI techniques so as to enable AI to scale and transform the world around us.
Our problem definitions and operational fitness for AI going forwards in the 2020s will be in relation to increasingly working with decentralized data on the Edge of the Network and being able to respond on the fly to dynamic environments.
This article will consider how we will meet the challenges of scaling AI across the 2020s and how our journey towards and into the 2030s will coincide with the arrival of the metaverse and Internet of Everything (IoE) where physical and digital convergence will occur.
The types of AI we will need given the challenges we need to overcome
This decade will be about AI scaling beyond the digital media and e-commerce sectors and into the “real-world” economy via the IoT with AI on the Edge of the network, enabled by standalone 5G networks.
I believe that the following will dominate the 2020s:
- The rise of 5G enabled Mixed Reality glasses with AI for object detection, visual search, recommendations, and neural translation. This will lead to the development of the Metaverse and ultimately the Internet of Everything (IoE).
- The need for standards: the physical and digital convergence resulting from 5G enabled glasses will lead to exciting innovations and new opportunities that we can only dream of today, albeit the need for prevention of intrusive data harvesting without transparency and consent that is easily understood and managed will become all the more crucial – see section on Federated Learning and Differential Privacy below;
- The rise of the AI with IoT whereby the AIoT will be the enabler for intelligent homes, offices, factories, education, hospitals and cities. What do we mean by intelligence here? I am not referencing Artificial General Intelligence (AGI) whereby machines can match the capability of a human brain but rather to personalise the interaction, use predictive analytics to prevent or mitigate risks and respond to the end user in a meaningful manner.
By the late 2020s and the early 2030s, the rise of 6G may take us beyond the AIoT to the world of AI meets the Internet of everything known as AIoE. The IoE is an area of research at the University of Cambridge and other leading universities. Cisco predicts that $14.4 Trillion is at stake with the IoE connecting the unconnected across this decade.
The AIoE in the 2030s will be a complete merging between the physical and digital world and experiences. It will also be a time when human computer-brain interfaces really take off and we find ways to augment ourselves with AI. With this in mind, I personally do not believe that AGI and eventually Super Intelligence will happen in an isolated vacuum whereby AI leaps forward to match and then outperform humans. Rather this will be an era whereby humans enabled by technology also advance and more rapidly than ever before (to be considered in more detail in the fourth article to this series).
Before any of the AIoE revolutions with 6G, let alone the AIoT with standalone 5G networks, take place, we need to move to a world where AI can do the following:
- Work with smaller data sets for training – a human child can learn from a few examples. For example see Wang et al. (2020) Generalizing from a few examples: A survey of Few Shot Learning;
- Develop and operate more resource-efficient Machine Learning. Deep Learning in particular demands substantial computational resources, and models will need the ability to work on the Edge in power-constrained (limited battery life) environments, as well as to be more energy efficient in the era of climate change mitigation;
- Ensure that we build in sensible ethics around data and also diversity in the Data Science and Machine Learning engineering teams so as to prevent the arrival of a “big brother” 1984 world with monitoring of everything we do, or the creation of AI that causes economic or social harm through gender or racial discrimination;
- Part of the challenge in reducing bias and consequently potential discrimination in the results of the model is to also broaden our datasets and sources to cover wider society and yet this would also require finding ways to access data from wider, often decentralised sources in sectors such as healthcare and hence the need to consider emerging techniques such as Federated Learning with Differential privacy;
- Data and AI security to prevent harmful effects in an era where cybercrime and warfare have been expanding rapidly.
Moreover, Gartner believes that 70% of organizations will refocus away from big data towards small and wide data by 2025! If firms are going to scale AI across the wider areas of the economy and across the Edge of the Network then we need to consider how we are going to enable Machine Learning and in particular data-hungry Deep Neural Network models to learn from smaller, decentralized datasets and also to become more resource-efficient.
Computational Efficiency Neural Compression
A good starting point is that in order to run inference (and in the future hopefully learn on the fly) on the Edge (on device) we will need smaller, more efficient models. This will be key from a sustainability perspective (power supply, carbon footprint) as well as simple resource capabilities, as small low-powered devices would be unable to run a giant GPT-3 or AlphaGo model locally upon the device.
Key research highlighting the way forward was provided by Frankle and Carbin (2019) in the award-winning paper The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, whereby they identified sparse subnetworks via pruning. Pruning may reduce the parameter counts of trained networks by over 90%, decreasing storage requirements without compromising accuracy.
“We consistently find winning tickets that are less than 10-20% of the size of several fully-connected and convolutional feed-forward architectures for MNIST and CIFAR10. Above this size, the winning tickets that we find learn faster than the original network and reach higher test accuracy.”
Karen Hao in MIT Technology Review summarised the paper in explaining how a vast amount of processing resources are being used unnecessarily to train networks that are ten times too large for the given needs. Karen Hao also illustrates how this technique may in fact enable powerful AI technology capabilities within the mobile phone of the end user.
In a further paper, Stabilizing the Lottery Ticket Hypothesis, Frankle et al. (2019) explain that a further improvement may result from resetting the weights of the pruned network to their state within a few steps of training, rather than to their initial values.
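To make the pruning idea concrete, here is a minimal sketch in plain Python of one-shot magnitude pruning followed by a lottery-ticket style reset. It is an illustration under stated assumptions and not the procedure from either paper: the flat weight list, the simulated "training", and the function names `magnitude_mask` and `apply_mask` are all hypothetical.

```python
import random


def magnitude_mask(weights, keep_fraction):
    """Return a 0/1 mask that keeps the largest-magnitude weights."""
    n_keep = max(1, round(len(weights) * keep_fraction))
    # Rank weight indices by absolute value, largest first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(ranked[:n_keep])
    return [1 if i in keep else 0 for i in range(len(weights))]


def apply_mask(weights, mask):
    """Zero out every weight whose mask entry is 0."""
    return [w * m for w, m in zip(weights, mask)]


rng = random.Random(0)
initial_weights = [rng.gauss(0.0, 0.1) for _ in range(20)]
# Stand-in for real training (an assumption): perturb the initial weights.
trained_weights = [w + rng.gauss(0.0, 0.5) for w in initial_weights]

# Decide which connections survive by looking at the trained weights.
mask = magnitude_mask(trained_weights, keep_fraction=0.2)
# Lottery-ticket style reset: survivors return to their initial values.
winning_ticket = apply_mask(initial_weights, mask)

print(sum(mask), "of", len(mask), "weights kept")
```

In a real workflow the masked network would then be retrained, and the prune and reset cycle is usually repeated over several rounds.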
In essence, the aim for compression of neural network models is to result in a model that may entail simplified architecture and yet yield similar performance (for example accuracy).
A compressed model is a smaller model with a reduced number of parameters, and hence it is more efficient with RAM during runtime, thereby increasing the availability of memory for other aspects of the application. A reduction in latency means less time is needed to make a prediction (inference), and hence energy consumption is reduced when the model is run. There is typically a direct relationship between the size of a model and its latency, whereby the greater the model size, the greater the memory resources and time required to run the model.
The results of neural compression are in line with the sustainability objectives of a reduced carbon footprint, making a model more user-friendly in a low-power IoT environment and also enabling faster responses to the user. In terms of product development, this makes sense as we move to a world of near real-time analytics in the era of standalone 5G networks and the AIoT at the edge of the network, and a world where consumers and policymakers are expected to seek more efficient devices.
Examples of Neural Compression include:
- Pruning: as set out above in the work of Frankle and Carbin (2019). It may be structured or unstructured;
- Quantization: entails reducing the numerical precision, and hence the storage size, of the weights within the model. For more information see Gholami et al. (2021) A Survey of Quantization Methods for Efficient Neural Network Inference;
- Low-rank approximation: this approach applies a linear combination of a reduced set of filters with the aim of approximating multiple redundant filters;
- Knowledge distillation: inspired by the concept that the tasks related to training and inference are different and hence the same model should not be used for the two. A smaller student model is trained to reproduce the outputs of a larger teacher model. See Wang and Yoon (2021) Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks.
Neural Architecture Search (NAS): explained by Hannah Peterson “in the most general sense is a search over a set of decisions that define the different components of a neural network—it is a systematic, automized way of learning optimal model architectures.” The concept entails designing novel architectures with enhanced performance by removing human bias.
An excellent overview of the above techniques is provided by Hannah Peterson “An Overview of Model Compression Techniques for Deep Learning in Space”
For an additional overview of Compression of Deep Learning see Gupta and Agrawal (2021) Microsoft Compression of Deep Learning Models for Text: A Survey
TensorFlow provides model optimization documentation and resources for neural compression.
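As a rough illustration of quantization from the list above, the following plain-Python sketch maps floating-point weights to 8-bit integer codes and back. It is a simplified affine scheme written for this article and not the TensorFlow implementation; the function names and the example weights are assumptions.

```python
def quantize(weights, num_bits=8):
    """Affine quantization: map floats onto integers in [0, 2**num_bits - 1]."""
    qmax = 2 ** num_bits - 1
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / qmax
    if scale == 0:
        scale = 1.0  # all weights identical; any scale works
    codes = [min(qmax, max(0, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min


def dequantize(codes, scale, w_min):
    """Recover approximate float weights from the integer codes."""
    return [c * scale + w_min for c in codes]


# Hypothetical weights for illustration only.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, scale, w_min = quantize(weights)
restored = dequantize(codes, scale, w_min)

# Each weight now needs 8 bits instead of 32, at the cost of a small rounding error.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, max_error)
```

The rounding error per weight is bounded by half the quantization step, which is why accuracy often degrades only slightly.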
It is worth noting that Supervised Learning remains a dominant approach in the Machine Learning production environment.
Supervised Learning requires large amounts of annotated (labeled) data to provide the ground truth from which the Machine Learning model will learn. Outside of Digital Media and E-commerce, we hit the issue that the labeled data sets often don’t exist in sufficient quantity.
We’ll examine some of the techniques that are emerging that might help us get around this challenge.
Synthetic Data: Plugging the Data Gap
Gerard Andrews in NVIDIA Blogs explains how synthetic data comprises annotated data that has been generated algorithmically or via simulation. It may then be utilized as a proxy for actual real-world data from a statistical or mathematical perspective, and Tremblay et al. (2018) Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects argued that it may be as good or even better than real-world data for training Machine Learning models.
Synthetic data may help alleviate the challenges that restrict access to data due to privacy (for example in healthcare, under GDPR in Europe or HIPAA in the US) or because data is simply scarce, and it may also ease the data labeling process.
Kajal Singh in Synthetic Data — key benefits, types, generation methods, and challenges explains how we may generate synthetic data by drawing from distribution or applying agent-based modeling. Machine Learning approaches to generating synthetic data include Variational Autoencoders and Generative Adversarial Networks (GANs).
Kajal Singh also explains how synthetic data approaches may entail:
- Fully synthetic Data whereby the entire dataset is synthetically generated;
- Partially Synthetic Data replaces real values with synthetic values only for features that are highly sensitive. The actual real-world data are only replaced in the event that the value entails a large risk of disclosure;
- Hybrid Synthetic Data entails using a combination of real-world and synthetic data. Kajal Singh notes that “for each random record of real data, a close record in the synthetic data is chosen and then both are combined to form hybrid data.” The approach possesses the advantages of the two methods above; however, it entails a greater usage of processing time and memory resources.
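The difference between fully and partially synthetic data can be sketched in a few lines of plain Python. This is a toy illustration that draws from fitted Gaussian distributions; the record layout (age, income), the choice of income as the sensitive feature, and all names are assumptions made for the example.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical "real" records: (age, income). Income is treated as sensitive.
real = [(rng.randint(20, 70), rng.gauss(40000, 8000)) for _ in range(200)]


def fit_gaussian(values):
    """Estimate the mean and standard deviation of one feature."""
    return statistics.mean(values), statistics.stdev(values)


age_mu, age_sigma = fit_gaussian([r[0] for r in real])
inc_mu, inc_sigma = fit_gaussian([r[1] for r in real])

# Fully synthetic: every value is drawn from the fitted distributions.
fully_synthetic = [
    (rng.gauss(age_mu, age_sigma), rng.gauss(inc_mu, inc_sigma))
    for _ in range(len(real))
]

# Partially synthetic: keep the real age, replace only the sensitive income.
partially_synthetic = [(age, rng.gauss(inc_mu, inc_sigma)) for age, _ in real]
```

Because each feature is sampled independently here, any relationship between age and income in the source data is lost, which is one simple way that synthetic data can fail to reflect real-world behaviour.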
Synthetic data is forecast to grow rapidly in usage within the AI sector. Indeed, Gartner Maverick Research predicts that by 2030 Synthetic data will significantly out-scale real-world data.
However, there remain challenges with the usage of synthetic data as noted by Cem Dilmegani whereby:
- The quality of the resulting synthetic data remains dependent on the quality of the source data, and this will be reflected in the Deep Learning model. Biases or other quality issues will be reflected in the model performance. Hence we may end up with a large volume of synthetic data but a model that entails Garbage In, Garbage Out (GIGO);
- Generating synthetic data of appropriately high quality for use in a Machine Learning model may still be costly and difficult even if cheaper than labelling real-world data;
- The resulting synthetic dataset will represent the statistical properties inherent within the underlying source data, with the risk that certain behaviours of real-world data such as randomness may be missing from the data.
Kajal Singh further notes that synthetic data may prove difficult to generate and may contain inconsistencies when attempting to replicate real-world data. Moreover, there is a risk that key features are omitted in the synthetically generated data and hence real-world performance will be adversely affected, meaning that some of the problems within the synthetically generated dataset may only become apparent when the model is being applied to real-world data in the production environment (by which time there is no hiding from potential mishaps).
Synthetic Data: Sim-to-real and Autonomous Robotics
The robotics sector is an example of the application of synthetic data to address the data challenge. An approach known as sim-to-real is applied, whereby synthetically generated data or virtual simulation is used to train the machines, in particular where sufficient real-world data is not available.
The synthetic data approach has shown promising results for autonomous robotics including training autonomous vehicles for scenarios that may be harder to capture in sufficient volume from real-world examples.
Feng et al. (2021) Bridging the Last Mile in Sim-to-Real Robot Perception via Bayesian Active Learning explain that there is a problem with reliance on synthetic data. On the one hand, utilising synthetic data to train models is gaining in popularity in robotic vision tasks, for example object detection, due to the substantial volume of data that may be generated without human annotation. On the other hand, sole reliance on synthetic data runs into the simulation to reality gap, referred to as the Sim-to-Real gap. The Sim-to-Real gap is challenging to solve in practical production environments, and we need to resort to real human-annotated data in order to resolve this problem.
A good example is provided by Google AI (2021) Toward Generalized Sim-to-Real Transfer for Robot Learning.
The previous research illustrated the capability of robots to learn end-to-end via Deep Learning and effectively interact with the unstructured world that surrounds us by understanding camera observations and then undertaking corresponding tasks and actions. However, training real-world robots in this way needs hundreds of thousands of training episodes, a volume that may simply not be attainable.
The team observed that a key source of the Sim-to-Real gap arose from discrepancies in relation to actual camera observations relative to images rendered via simulation, thereby impacting the real-world performance levels of the robot.
The Google Research team used Generative Adversarial Networks (GANs) with the aim of narrowing the gap between the real world and the virtual world.
Source for image above: Google AI Blog
The research team argued that additional constraints applied to GANs may result in improved outcomes whilst at the same time allowing for reduced data collection and hence addressing the sim-to-real gap.
Synthetic images were translated into images that looked realistic by an RL-CycleGAN that in turn applied an RL-consistency loss to preserve features that were relevant for the given task. An object-aware RetinaGAN that was not trained on specific tasks was applied to transfer across the tasks and environments, and the authors claimed that it may be reused for a novel object pushing task.
Further recent research in sim-to-real includes:
Feng et al. (2021) Bridging the Last Mile in Sim-to-Real Robot Perception via Bayesian Active Learning propose a Sim-to-Real pipeline that relies on deep Bayesian active learning and aims to minimize manual annotation efforts. They show that the labeling effort required to bridge the reality gap can be reduced to a small amount. Furthermore, they demonstrate the practical effectiveness of this idea in a grasping task on an assistive robot.
Source for image above: Feng et al. (2021) Bridging the Last Mile in Sim-to-Real Robot Perception via Bayesian Active Learning.
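The general idea behind uncertainty-driven active learning can be sketched as follows: run several stochastic forward passes per unlabeled real image, then send only the images the model is least sure about to a human annotator. This is a generic predictive-entropy sketch and not the specific acquisition function used by Feng et al.; the image ids, probabilities, and function names are all hypothetical.

```python
import math


def predictive_entropy(prob_samples):
    """Entropy of the class distribution averaged over several stochastic passes."""
    n_classes = len(prob_samples[0])
    mean = [
        sum(p[c] for p in prob_samples) / len(prob_samples)
        for c in range(n_classes)
    ]
    return -sum(p * math.log(p) for p in mean if p > 0)


def select_for_labeling(pool_predictions, budget):
    """Return the ids of the `budget` most uncertain unlabeled images."""
    ranked = sorted(
        pool_predictions,
        key=lambda image_id: predictive_entropy(pool_predictions[image_id]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical class probabilities from three stochastic passes per real image.
pool_predictions = {
    "img_a": [[0.98, 0.02], [0.97, 0.03], [0.99, 0.01]],
    "img_b": [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]],
    "img_c": [[0.7, 0.3], [0.6, 0.4], [0.8, 0.2]],
}

to_label = select_for_labeling(pool_predictions, budget=1)
print(to_label)
```

Here `img_b` is selected because its predictions disagree most across passes, so a human label for it is likely to be the most informative.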
Decentralized data and Data Security
Moreover, the era of the AIoT will require working with decentralized data whilst protecting data privacy, including dealing with data security and ethics. As standalone 5G networks and the IoT scale, there will be an increasing amount of Edge Computing, that is, data being processed closer to the edge where it is created, to reduce latency and the need for back-and-forth traffic on the internet.
One of the big challenges in ethics and AI in the era of Big Data is data collection at a vast scale and invasion of privacy into a user’s personal and sensitive data.
It will also be essential to gain access to greater amounts of data to scale AI in key areas such as healthcare and other sectors where data privacy is key and (or) data is siloed – recall our Rhyme of the Modern Data Scientist earlier in this article! Furthermore, we want to provide low latency (rapid response) to the user and a highly personalized user experience via the application of Machine Learning.
Differential privacy allows those analyzing data to access the information that may prove valuable for the purposes of the research, product or service, without revealing the identity of the individuals to whom the data relates.
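One common way to achieve this is to add calibrated random noise to the answer of a query. The sketch below applies the Laplace mechanism to a simple count in plain Python. The dataset, the predicate, the choice of epsilon, and the function name `private_count` are assumptions for illustration only.

```python
import random


def private_count(records, predicate, epsilon, rng):
    """Answer "how many records satisfy predicate?" with Laplace noise added."""
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one person changes a count by at most 1.
    sensitivity = 1.0
    rate = epsilon / sensitivity
    # The difference of two exponential draws follows a Laplace distribution
    # with scale sensitivity / epsilon.
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


rng = random.Random(7)
# Hypothetical patient ages held by a hospital.
ages = [23, 35, 41, 29, 62, 57, 38, 44]

noisy_answer = private_count(ages, lambda age: age >= 40, epsilon=1.0, rng=rng)
print(noisy_answer)  # close to the true count of 4, but not exact
```

A smaller epsilon adds more noise and gives stronger privacy, while a larger epsilon gives more accurate answers with weaker privacy.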
Federated Learning with Differential Privacy will play a key role in enabling AI to scale across sectors such as healthcare, financial services, transportation, and indeed the IoT itself. An exciting approach to scale this technology has been proposed by researchers at the University of Cambridge in Lane et al. (2020) “A Friendly Federated Learning Research Framework (Flower)” and is also explored in further detail below.
We commence with an initial global model trained on the cloud server that is then delivered onto multiple devices (local servers). The AI is then
Federated Learning entails collaborative learning across multiple local servers (for example IoT or mobile devices) and hence private data is not exposed to the central server as it is today with the Tech giants. User-generated history on the given device is utilized as the source for the training data for our Machine Learning model making it more personalized and faster to respond.
Over time the Machine Learning model will constantly improve as it continues to learn from the local training data that is being generated by the user. Thereafter the local devices will transmit the results of the training of the Machine Learning model to a centralized server. Crucially, however, this will be the parameters of the model and not the data.
Whilst it has been reported that there is a risk of hacking and adversarial attacks into this communication that may allow for a highly sophisticated actor to reverse engineer sensitive
The process is repeated across multiple devices that contain local versions of the model and finally an aggregation of the results will take place within the centralized server albeit excluding the user’s data.
Thereafter, the Machine Learning model in the central server that resides in the cloud is updated with the aggregated training results, which should improve upon the last version of the model that was deployed.
At the next stage, the Data Science team may proceed to update the Machine Learning model to the latest model and deploy it across the local devices allowing for collaborative learning across decentralized local servers without removing the data!
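The training loop described above can be sketched in plain Python with a very small model. This is a toy version of federated averaging on a one-variable linear model, assuming three hypothetical devices with made-up data; it is not the API of any Federated Learning library.

```python
import random


def local_update(weights, data, lr=0.05, epochs=5):
    """Train a copy of the global model y = w*x + b on one device's private data."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Server step: average the parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def make_client_data(n, rng):
    """Hypothetical private data following y = 2x + 1 plus a little noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 0.05)) for x in xs]


rng = random.Random(0)
clients = [make_client_data(n, rng) for n in (20, 30, 50)]

global_model = (0.0, 0.0)
for _ in range(30):
    # Each device trains locally; only parameters are sent back, never raw data.
    updates = [local_update(global_model, data) for data in clients]
    global_model = federated_average(updates, [len(d) for d in clients])

print(global_model)  # should approach w = 2, b = 1
```

Note that the server only ever sees the `(w, b)` pairs returned by each device, which mirrors the point that parameters rather than user data are transmitted.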
For a more detailed overview of the different techniques in Federated Learning see the following:
For an overview of Federated Stochastic Gradient Descent (FedSGD) and Federated Averaging (FedAvg) see the following postdoc paper from Min Du at UC Berkeley
Wang et al. (2020) Federated Learning with Matched Averaging
An example of recent interesting research in Federated Learning:
Wen et al. (2021) in Federated Dropout – A Simple Approach for Enabling Federated Learning on Resource-Constrained Devices
Wen et al. (2021) observe that a major challenge facing Federated Learning in a practical setting is the lack of resources in small devices that results in challenges with computationally intensive tasks such as updating a Deep Neural Network model. One potential solution that the paper sets out is Federated Dropout (FedDrop). This approach applies the classical dropout approach as a foundation for random model pruning.
At each iteration of the algorithm, multiple subnets are generated independently from the global model that resides within the server by applying dropout with heterogeneous dropout rates (probabilities of parameter pruning), with each subnet assigned to a device for updating. This reduces the size of the model that each device must receive and update.
Furthermore, Wen et al. state that “Thereby, FedDrop reduces both the communication overhead and devices’ computation loads compared with the conventional Federated Learning while outperforming the latter in the case of overfitting and also the Federated Learning scheme with uniform dropout (i.e., identical subsets).”
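The idea of sending differently sized subnets to different devices can be illustrated with a short plain-Python sketch. This is a simplification that drops individual entries of a flat parameter list, and it is not the FedDrop algorithm as specified by Wen et al.; the device names, dropout rates, and function name are hypothetical.

```python
import random


def make_subnet_mask(n_params, dropout_rate, rng):
    """Randomly drop each parameter with probability dropout_rate."""
    return [0 if rng.random() < dropout_rate else 1 for _ in range(n_params)]


rng = random.Random(1)
global_params = [rng.gauss(0, 1) for _ in range(1000)]

# Hypothetical per-device dropout rates: weaker devices get a higher rate.
device_dropout = {"phone": 0.2, "watch": 0.5, "sensor": 0.8}

subnets = {}
for name, rate in device_dropout.items():
    mask = make_subnet_mask(len(global_params), rate, rng)
    # Only the surviving parameters (index -> value) are sent to the device.
    subnets[name] = {i: global_params[i] for i, m in enumerate(mask) if m}

for name, subnet in subnets.items():
    print(name, "receives", len(subnet), "of", len(global_params), "parameters")
```

A device with a higher dropout rate downloads and updates fewer parameters, which is the source of the reduced communication and computation load.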
Key resources and libraries for Federated Learning include:
TensorFlow Federated: TensorFlow Federated (TFF) is an open-source framework for machine learning and other computations on decentralized data
Who provides a helpful overview of the types of Federated Learning.
Model centric and Cross-Device as demonstrated by Google Federated Learning
Horizontal Federated Learning
Source for image above: OpenMined, who explain that Horizontal Federated Learning, or sample-based Federated Learning, applies in scenarios where data sets share the same feature space but differ in their samples.
Research on Horizontal Federated Learning from Zhao et al. (2021) Efficient Client Contribution Evaluation for Horizontal Federated Learning proposed a method that they argued consistently outperforms the conventional leave-one-out method in terms of valuation authenticity as well as time complexity. The authors claim that their approach is useful because accurately evaluating contribution levels in Horizontal Federated Learning may help identify malicious participants that try to poison the Federated Learning framework.
Cross-silo Federated Learning seeks to unlock the value of data that is further distributed, such as between banks or hospitals, and may even include data that has been aggregated across various fitness apps from consumer wearables. It poses greater challenges on the security side.
An example of research into this field is from Durrant et al. (2021) The Role of Cross-Silo Federated Learning in Facilitating Data Sharing in the Agri-Food Sector
“We focus our data sharing proposition on improving production optimization through soybean yield prediction and provide potential use-cases that such methods can assist in other problem settings. Our results demonstrate that our approach not only performs better than each of the models trained on an individual data source, but also that data sharing in the agri-food sector can be enabled via alternatives to data exchange, whilst also helping to adopt emerging machine learning technologies to boost productivity.”
The authors concluded that the technique offered a solution to the agri-food sector’s hostility towards data sharing thereby providing a means to overcome the strong social barriers relating to commercial sensitivity and unwillingness to share raw data.
Another interesting research paper is by Chu et al. (2021) FedFair: Training Fair Models In Cross-Silo Federated Learning who argue that FedFair is “a well-designed federated learning framework, which can successfully train a fair model with high performance without any data privacy infringement. Our extensive experiments on three real-world data sets demonstrate the excellent fair model training performance of our method.”
Vertical Federated Learning
Vertical Federated Learning or feature-based Federated Learning is applicable to cases where two data sets share the identical sample ID space but differ in feature space.
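A minimal sketch of the vertical setting is shown below: two hypothetical organisations hold different features for overlapping customers, first align on the shared sample IDs, and then each scores only its own features so that only partial scores are combined. The organisations, feature names, and weights are invented for illustration, and the plain set intersection is a simplification (real deployments typically rely on privacy-preserving entity alignment rather than exchanging IDs in the clear).

```python
# Hypothetical records held by two organisations; keys are shared customer IDs.
bank = {
    "u1": {"income": 52.0, "balance": 3.1},
    "u2": {"income": 48.0, "balance": 0.4},
    "u3": {"income": 75.0, "balance": 9.8},
}
insurer = {
    "u2": {"claims": 1.0},
    "u3": {"claims": 0.0},
    "u4": {"claims": 3.0},
}

def align(party_a, party_b):
    """Return the shared sample IDs and each party's rows in the same order."""
    shared_ids = sorted(party_a.keys() & party_b.keys())
    rows_a = [party_a[i] for i in shared_ids]
    rows_b = [party_b[i] for i in shared_ids]
    return shared_ids, rows_a, rows_b

def partial_scores(rows, weights):
    """Each party scores only the features it holds."""
    return [sum(weights[name] * row[name] for name in weights) for row in rows]

shared_ids, bank_rows, insurer_rows = align(bank, insurer)

# Illustrative weights; in practice these would be learned jointly.
bank_part = partial_scores(bank_rows, {"income": 0.01, "balance": 0.1})
insurer_part = partial_scores(insurer_rows, {"claims": -0.5})

# Only the partial scores are combined; raw features stay with their owners.
joint_scores = [a + b for a, b in zip(bank_part, insurer_part)]
```

Only customers known to both parties ("u2" and "u3" here) can be used, which is why the shared sample ID space matters in the vertical setting.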
For recent research see Zhang et al. (2021) AsySQN: Faster Vertical Federated Learning Algorithms with Better Computation Resource Utilization, who argue that Stochastic Gradient Descent based Vertical Federated Learning approaches are costly in terms of communication due to the sheer number of communication rounds, and that synchronous computation impairs resource utilization in real-world applications. They propose three algorithms under an asynchronous stochastic quasi-Newton (AsySQN) framework and claim to reduce the costs of communication whilst improving computation resource utilization relative to current state-of-the-art Vertical Federated Learning approaches.
Data-Centric Federated Learning
A newer type and emerging area of Federated Learning.
“An owner, or in future owners, of private data can provide access for the external organization to build models on their data without sharing that data. But the concept goes further than this because ultimately a cloud of cross-siloed private data could be made available to multiple organizations enabling the building of Machine Learning models by diverse organizations to meet diverse use cases.”
Federated Transfer Learning is also mentioned, albeit OpenMined are not aware of actual implementations. For more on the intuition see Liu et al. (2020) A Secure Federated Transfer Learning Framework.
NVIDIA’s partnership with King’s College University Hospital (London) illustrated the potential for Federated Learning in healthcare.
Flower Power & Federated Learning!
Flower power is a slogan that was used by the young generation in the late 1960s and early 1970s to symbolize peaceful resistance, and it came to represent a period of rapid (one may say radical) change in social attitudes.
An architecture for Federated Learning, called Flower, has been published by a team of international researchers which in itself is symbolic of the intent of Federated Learning to enable collaborative learning. Perhaps the Flower architecture may usher in a period of rapid (one may say radical) change in how we approach Machine Learning and enable scaling of Federated Learning.
Flower: “A unified approach to Federated Learning. Federate any workload, any ML framework, and any programming language”. The team of researchers comprised Daniel Beutel, Taner Topal, Akhil Mathur, Xinchi Qiu, Titouan Parcollet, Pedro Porto Buarque de Gusmão, and Nicholas D. Lane, and entailed a collaboration across the University of Cambridge, UCL and Avignon University.
The paper is Beutel, Lane, et al. (2021) Flower: A Friendly Federated Learning Framework.
The authors explain that whilst Federated Learning is developing into a technique that shows potential for Edge Computing to learn on a collaborative basis with data privacy, deploying and implementing it across mobile devices remains difficult in practice because of the variety of languages, hardware accelerators, etc.
Moreover, the authors explain that existing frameworks such as TF Federated are really designed for simulating Federated Learning within a server environment rather than for exploring distributed mobile settings with vast numbers of clients. Whilst such frameworks allow for simulation, they tend not to enable the implementation of Federated Learning workloads on mobile devices.
The Flower approach enables developers to work with the framework, with limited overhead, irrespective of the programming language and Machine Learning framework that they use.
Flower represents an exciting potential to scale Federated Learning across mobile devices irrespective of whether they are Android or iPhone devices and is agnostic to the Machine Learning framework that is used.
The authors of the Flower framework have also pointed to the potential for FL to assist in mitigating climate change.
For more details on Flower see:
Overview including Documentation: https://flower.dev/
Also, see Lane et al. (2021) On Device Federated Learning with Flower
The power of the “Flower” approach may solve a number of the barriers facing the scaling of Federated Learning. This is going to be crucial as the AIoT scales across smart cities, smart homes, education, and smart industry across this decade.
Moreover, the anticipated arrival of 5G-enabled glasses that will enable Augmented Reality and Mixed Reality interactions, thereby enabling physical to digital convergence, will also require greater emphasis on personal privacy and ethics. The same applies to the Metaverse concept and Virtual Reality.
Both the OpenMined (PySyft) and Flower approaches merit further consideration and engagement from the wider Data Science community if we are to deal with the issues of scaling Machine Learning into the wider economy and into critical sectors such as healthcare, financial services (including personalized insurance), and across the IoT.
Federated Learning with Differential Privacy has enormous potential to unlock Machine Learning across sectors such as healthcare and financial services where data privacy and siloed data are key barriers today. It would in effect end the curse of the modern data scientist of not being able to effectively access and utilize siloed, decentralized data.
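One common way to combine the two ideas is for the server to clip each client's model update to a maximum norm and then add Gaussian noise to the aggregate, so that no single client's data dominates what is revealed. The sketch below illustrates only that clip-and-noise step; the parameter names (clip_norm, noise_multiplier) and values are assumptions, and it does not compute the resulting privacy budget (epsilon), which a real deployment would need a privacy accountant for.

```python
import numpy as np

def clip_update(update, clip_norm):
    """Scale an update down so that its L2 norm is at most clip_norm."""
    norm = np.linalg.norm(update)
    return update * min(1.0, clip_norm / (norm + 1e-12))

def private_average(updates, clip_norm, noise_multiplier, rng):
    """Average clipped client updates after adding Gaussian noise to their sum."""
    clipped = np.stack([clip_update(u, clip_norm) for u in updates])
    total = clipped.sum(axis=0)
    # Noise is calibrated to the clipping norm, i.e. to the most any one client can contribute.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(updates)

rng = np.random.default_rng(42)

# Hypothetical client updates; the second is unusually large and gets clipped.
updates = [np.array([0.1, -0.2]), np.array([3.0, 4.0]), np.array([-0.3, 0.1])]
noisy_mean = private_average(updates, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

The trade-off is between privacy and accuracy: a larger noise multiplier hides individual contributions better but makes the averaged update less precise, which is why this approach tends to work best with many participating clients.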
It also has a key role in protecting us and our societies in the era of the AIoT from surveillance and big brother society.
Governance issues in AI and Big Data, including the Facebook Cambridge Analytica scandal and the whistleblower accusations from a Facebook employee, show why privacy matters. At the same time, we need solutions to alleviate the crisis facing healthcare systems around the world: the scars of Covid, aging populations in the OECD, and growing populations in a number of non-OECD countries, with both groups facing the growing costs of dealing with complex diseases. We therefore need to find ways to scale AI technology so that it learns effectively from decentralized data whilst preserving the privacy of individuals, so as to prevent a Big Brother surveillance society.
The challenge of an aging population is that more complex and costly diseases result in greater healthcare costs whilst, at the same time, a smaller proportion of younger people pay income tax, reducing the availability of government funding or, in the case of the US, skewing the insurance sector demographics towards the higher-risk segment.
Does aging lead to an increase in health care costs?
Hence, it is increasingly important to scale AI across healthcare so as to assist exhausted front-line healthcare workers and to alleviate the costs whilst improving outcomes for the patient.
The roles AI may play in the Healthcare sector
However, scaling Machine Learning across clinical settings in healthcare requires access to data, and this means working with siloed, decentralized data and dealing with patient privacy regulations.
One of the hopes for AI to advance healthcare is the development of personalized medicine whereby new therapies can be developed that maximize the beneficial therapeutic effect of the medicine whilst minimizing the harmful side effects of medical drugs. This can be described as applying “a medical model using characterization of individuals’ phenotypes and genotypes (e.g. molecular profiling, medical imaging, lifestyle data) for tailoring the right therapeutic strategy for the right person at the right time.” Quotation from ICPerMed.
Furthermore, we may recall the potential for AI to enable smart wearables to transform medicine.
Federated Learning with Differential privacy alongside governance standards for AI may play a key role in enabling us to scale AI across the IoT, healthcare and the era of 5G enabled mixed reality glasses without excessive data intrusion and breaches of data privacy.
Healthcare provides particular challenges for the scaling of Machine Learning. Since the passage of the US Patient Protection and Affordable Care Act required the adoption of electronic health records (EHRs) there has been a rapid growth in digital data around the healthcare sector in the US. EHRs have also been growing in usage in other regions around the world. For example, IDC in The Data Dilemma and Its Impact on AI in Healthcare and Life Sciences estimate an average of circa 270 Gigabytes of data will be generated for every person in the world in 2020 in relation to life science healthcare data.
Gartner further notes that the extreme levels of heterogeneity across healthcare tech stacks result in greater challenges of data fragmentation with the gap in standards causing interoperability challenges, for example across EHRs, in turn requiring highly trained specialist human resources and technology capabilities to integrate data accurately.
Furthermore, Gartner states “(life science and healthcare) industries need to adopt Federated Learning models to convert AI into true intelligence.”
For examples of research on Federated Learning in the healthcare sector see:
Sadilek et al. (2021) Privacy-first health research with Federated Learning
Liu et al. (2021) Learning From Others Without Sacrificing Privacy: Simulation Comparing Centralized and Federated Machine Learning on Mobile Health Data
There are challenges with healthcare such as standards, ethics, biases in underlying data, and privacy. Cross-industry collaboration, with policymakers and regulators facilitating (or perhaps mandating) it, will be key to Federated Learning with Differential Privacy becoming a tool that enables Machine Learning to scale across healthcare. The work on standards may potentially be replicated in areas such as the IoT (including autonomous vehicles), the Metaverse, and Financial Services.
Sometimes we need a radical solution to solve a major barrier. It may be that “Flower Power” with the Federated Learning approach of Flower will help us scale Machine Learning whilst preserving data privacy and working across decentralized data. This approach may also be beneficial for the environment and may result in a reduced carbon footprint relative to traditional Machine Learning approaches.
Sectors such as healthcare, which have been ravaged by the Covid crisis and were already under strain around the world before the advent of Covid-19, together with financial services, the rise of the Edge across the IoT, and the soon-to-arrive Metaverse, necessitate Federated Learning with Differential Privacy.
The next article in this series will consider advanced techniques in AI that may also enable AI models to become more efficient, such as Neuro Symbolic AI and Neural Circuit Policies, and the state-of-the-art research in AI (including Deep Reinforcement Learning research from Stanford University and the cutting edge from DeepMind) that may lead us into the era of Broad AI.
What is the real risk with AI and the advanced technologies in the era of 5G networks?
The author has received comments and invitations to read articles on the impending arrival of superintelligent AI that is on the verge of displacing humans and removing us from planet Earth. Images of an army of advanced super robots controlled by a giant neural network continue to generate fear and anxiety among many in the wider public.
This hyperbole lacks factual substance and unfortunately distracts policymakers from the real dangers of the misuse of AI technology. Let’s examine what the real dangers to humanity are.
Standalone 5G networks will arrive at scale across the US, Canada, Europe, and other areas in the world in the period 2022-2025. The speed of change will be dramatic as the AIoT rises across the Edge of the Network. Policymakers, regulators, and industry participants need to come together and collaborate on the initiatives of Federated Learning with Differential Privacy, as the author believes the lack of digital understanding across political circles means that many will be caught off guard by the tidal wave of technological change.
The risks of not doing so entail a potential “Big Brother” (Nineteen Eighty-Four) mass surveillance society in the era of the AIoT, with data gathering across smart (our) homes, smart (our) workplaces, smart (our or your children’s) educational facilities, smart (our) hospitals, smart (our) transport networks, and smart (our) cities. The Metaverse and 5G-enabled smart glasses may allow for data harvesting and monitoring of everything we see too.
The era of 5G will also risk cybercrime and cyberwarfare rising to a new level altogether, as device security becomes a huge risk and intrusion and hacking reach a highly personal level.
Acting now will help alleviate such risks, including attacks on future autonomous cars and mass attacks on healthcare and energy facilities.
The 5G Security Risk Landscape
Federated Learning with Differential Privacy may help create data security and reduce the risks of hacking as traffic goes back and forth between clients and servers and also enable further developments in applications for Cyber Security.
For example, Zhang et al. (2021) in Federated Learning for Internet of Things: a Federated Learning Framework for on-Device Anomaly Data Detection demonstrate the potential for Federated Learning to strengthen device-level security. This will be a major risk exposure during the era of 5G and the AIoT.
Zhang et al. (2021) argue that their results illustrate the efficacy of Federated Learning in the detection of a substantial amount of various types of attacks. Hence the analysis demonstrates the efficiency and affordable memory cost for IoT devices that operate in a resource-constrained environment as well as efficiency for end-to-end training time.
Which pathway will our policymakers and tech leaders choose?
The dangers of the dark side and a dystopian world, if policymakers continue to fail to understand the rapidly evolving tech sector?
Or the power of the Flower: a pathway whereby we seek to use AI and other advanced technologies to advance human civilization whilst also seeking to mitigate the challenges of climate change and complex diseases that increasingly beset an aging population in OECD countries such as the US, UK, Germany, Japan, etc., or the challenges of healthcare systems that are chronically under-resourced to match the demand from growing populations in the developing world?
So which of AGI, the Singularity, or human-made actions is the real risk to human civilization within our lifetimes?
The author remains of the opinion that the genuine risk of AI technology is less to do with Sci-fi movie themes about Terminator and Skynet trying to wipe out human civilization and more to do with human greed abusing AI technology. Moreover, the author is surprised by the number of people (mostly from outside of AI research) who are deeply fearful that we are on the verge of an AI-led apocalypse whereby advanced machines are about to displace human beings on planet earth.
A key point is that we are dealing with the uncertainty of if and when. Will AGI and ASI ever occur? If they do, then what is the probable time of arrival? We don’t know exactly when AGI or Superintelligence may arrive. A survey of leading AI researchers noted a 50% probability that AGI would occur before 2099! Dylan Azulay reported on another survey of AI researchers that found that 61% believed that the Singularity would occur by 2100.
The next two articles in this series will explore the cutting-edge techniques that may steer us away from the era of Narrow AI (ANI) towards Broad AI and ultimately AGI. It may be that leaps in technology accelerate these forecasts. However, they remain uncertain, and it is highly unlikely that the singularity will occur in the next decade, in particular in the era of the von Neumann computing architecture.
We could cease all AI research and development today and yet a genuine threat to human civilization from Business as Usual (BAU) within our lifetime remains. In fact, the author argues that without further advancing technological capabilities afforded by advancing AI technology, the threats to humanity will be greater if we continue with the BAU of the era of heavy industry.
The real threat to human civilization, and one that the data shows will probably occur within our lifetime, is the adverse impacts of Climate Change. The chair of the COP 26 summit stated the ambition to limit the Climate Change impact to 1.5C warming and apologized for the failure to truly commit to the targets, with coal-fired generation to be phased down rather than out. We should note that we are no longer seeking to prevent Climate Change; rather, we are seeking to mitigate it and limit warming to 1.5C. And yet 1.5C warming is forecast by the data and resulting analytics to occur around 2040!
Fires across parts of Southern Europe and the Balkan region including Turkey, Greece, and Italy 2021
National Integrated Drought Information System (NIDIS)
The National Integrated Drought Information System (NIDIS) states that “As of November 2, 2021, 40% of the U.S. and 47.8% of the lower 48 states are in drought”.
We seem to be sleepwalking and oblivious to the probabilistic reality that climate change will increasingly affect each and every one of us across this decade and the next. Equally, we seem to be turning a blind eye to the risks of mass surveillance and the Big Brother society as well as potentially catastrophic cyber-attacks whilst talk of a robot takeover of human society generates anxiety and concern. And yet it is the former that is upon us and the latter that remains a more distant threat.
The tech sector and applications of Data Science have huge potential to help alleviate and mitigate this disaster scenario as noted by Rolnick et al. (2019) Tackling Climate Change with Machine Learning. It is submitted that more efficient decentralized Machine Learning approaches such as Federated Learning will have a key role to play in enabling a more resource-efficient approach that also enables AI to scale across all sectors of the economy.
For more information see:
- University of Cambridge Can Federated Learning Save the World?
- Lane et al. (2020) Can Federated Learning Save the Planet?
Machine Learning and Deep Learning have shown remarkable advances over the past decade in areas such as Computer Vision and Natural Language Processing. However, to scale to the next level it will be less about the giant models (GPT3, probably GPT4, Switch, Wu Dao, etc) and more about the usability and successful deployment of models.
Whilst there is no silver bullet for solving the Climate Crisis and also transitioning to an Industry 4.0 high technology world, we do need to consider a radical change in the way that we approach Machine Learning, and in particular Deep Learning, in order to successfully scale across the economy whilst also reducing the carbon footprint of Machine Learning models and enabling Machine Learning to work in the low resource and low power environments of the IoT across the Edge of the Network.
Scaling Federated Learning with Differential Privacy would also go a long way to alleviating the curse of the modern Data Scientist!
The author believes that the forward-thinking and enlightened members of the Data Science and technology community will step forward and find solutions. The demands of the problem will yield responses from the AI research community. After all if AI is going to change our world in the way that many AI enthusiasts believe it will, then what better way than to help alleviate the challenges of climate change and healthcare.
The expectation from many climate scientists and modelers (for example, see Milinski et al. (2021)) is that, unfortunately, with the current BAU we will hit 1.5C warming and may go beyond it in the coming decades, irrespective of whatever has been said and promised by policymakers at COP 26. Furthermore, scientific experts have noted that the Glasgow Climate Pact stresses that, in order to limit global warming to 1.5C above pre-industrial levels, greenhouse gas emissions need to fall by 2030 by an amount equivalent to 45% of the emissions from 2010. However, the pledges made during the COP 26 summit are projected to result in emissions in 2030 that are 14% higher than the levels in 2010.
Hence, as it appears that we may well be on an inevitable warming trajectory, it will be down to the technology sector including the Data Science community, perhaps some of those reading this article, to deliver the foundations of the Fourth Industrial Revolution and take us out of the current era of heavy industry at scale (manufacturing, chemicals, fossil fuels, plastics, etc) and into the era of cleaner, more efficient technologies with innovations in new materials too.
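The size of the gap between the target and the pledges can be made explicit with a short calculation. The sketch below normalises 2010 emissions to 1.0 (an assumption made purely so that the two percentages quoted above can be compared) and uses no other data.

```python
# Normalise 2010 emissions to 1.0 so the quoted percentages can be compared directly.
baseline_2010 = 1.0

target_2030 = baseline_2010 * (1 - 0.45)   # a 45% fall from 2010 levels
pledged_2030 = baseline_2010 * (1 + 0.14)  # a 14% rise from 2010 levels

gap = pledged_2030 - target_2030           # shortfall as a share of 2010 emissions
ratio = pledged_2030 / target_2030         # pledged emissions relative to the target

print(f"Target: {target_2030:.2f}, pledged: {pledged_2030:.2f}")
print(f"Gap: {gap:.2f} of 2010 emissions; pledges are {ratio:.2f}x the target")
```

On these figures the pledges leave 2030 emissions at roughly twice the level consistent with the 1.5C goal, a shortfall equal to 59% of 2010 emissions.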
- Degradable Plastics: Machine Learning is being employed to help design polymers that enable easily degradable plastics;
- Recycling Plastics: Robots enabled by Machine Learning may assist with the identification of waste that can be recycled;
- Smart Grids: with AI technology applied to enable the grid to transform into an intelligent system that optimises outcomes and will be better able to handle the steep increase in renewable energy generation that is forecast to enter national power stacks around the world.
The reality is that the Green Economy in the US already employs 10 times the number of people relative to the fossil fuel sector.
It may simply be left to technological product innovation and progress to solve the issues of Climate Change (mitigation), healthcare, and cyber security warfare.
The author has long argued (including whilst speaking at previous COPs) that economic growth and environmental concerns don’t need to be mutually exclusive. Broad-based economic growth has been the main tool for poverty alleviation around the world, with 1.1 Billion people being taken out of poverty. Rather, the author argues that the likes of the PWC report commissioned by Microsoft set out that jobs and economic growth may be aligned with reductions in Greenhouse Gas (GHG) Emissions and enabled by AI technology across four sectors of the economy: agriculture, energy, transportation, and water.
The numbers from the PWC report illustrate the vast potential for AI technology in this battle:
- US$5 Trillion in GDP growth;
- 38 million net jobs gain;
- A 4% reduction in global emissions of GHGs in 2030, or 2.4 Gt CO2 equivalent emissions equal to the yearly emissions of Canada, Japan and Australia combined together!
Shifting to an AI and standalone 5G network-led economy will require substantial investment in infrastructure, research, and development into more advanced hardware (beyond von Neumann architectures and dealt with in Article 4 of this series), and crucially investment into people.
The other fear expressed in the mass media is the anxiety that AI will result in mass unemployment. The author has dealt with this in more detail in an article entitled “Does AI Create or Destroy jobs”.
A good starting point is to consider the headline from the WEF to not fear AI as it will create jobs in the long term. In fact, the WEF found that whilst the wide-scale adoption of AI technology will lead to 85 million job losses by 2025, it would also result in the creation of 97 million new jobs, a net gain of 12 million with many of the new jobs being better paid and more interesting jobs than a number of the repetitive tasks that will be automated away.
However, this will require substantial investment in skills training for both the experienced workforce as well as a fundamental revamp in the education system to increase digital skills across the workforce.
Solving the curse of the modern Data Scientist with Federated Learning with Differential Privacy and scaling Machine Learning across the sectors of the economy may be beneficial for economic growth, job creation, healthcare, and our battle to mitigate the impacts of Climate Change. It may also prevent us from entering the Big Brother surveillance society when combined with standards and ethics.
The pathway to the destination may not be straightforward, and it will require material investment. However, that investment will yield material returns across the economy and a more sustainable and less polluting existence. In an era where national economies are still bearing the scars of the Covid crisis, engaging in investment for our future will align economic recovery with technological advancement and a pathway to mitigating the worst impacts of climate change.
The author has sought to explain that before we are expected to get to a world with Super Intelligent AI agents we are probabilistically facing an environmental and resulting human catastrophe. Hence it is submitted that AI and other advanced technologies enabled by AI are the best way to avoid this scenario. Article four in this series will also consider how humans may augment themselves with AI technology and hence also enhance our own capabilities to match advancing AI technology of the future.
Aligning Technology with Economy, Environment, Social, and Governance matters
It should be a no-brainer to align sustainable economic growth with technological development and environmental, societal and governance (ESG) aims. Economic growth and environmental goals do not have to be mutually exclusive. Rather, the new wave of technological advances may be applied to break with the trajectory of Business as Usual (BAU).
For an overview of how privacy-preserving Machine Learning may enable scaling of the AIoT see Kontar et al. (2021) The Internet of Federated Things (IoFT): A Vision for the Future and In-depth Survey of Data-driven Approaches for Federated Learning.
In terms of resource efficiency see Tonellotto et al. (2021) Neural Network Quantization in Federated Learning at the Edge.
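To illustrate why quantization helps on constrained Edge devices, the sketch below compresses a hypothetical model update from 32-bit floats to 8-bit codes using simple uniform (min-max) quantization before it would be sent to the server. This is a generic textbook scheme chosen for illustration and is not claimed to be the method studied by Tonellotto et al.; the update itself is random data.

```python
import numpy as np

def quantize_8bit(update):
    """Map float values to 256 evenly spaced levels between the min and max."""
    lo, hi = float(update.min()), float(update.max())
    scale = (hi - lo) / 255 if hi > lo else 1.0
    codes = np.round((update - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize_8bit(codes, lo, scale):
    """Recover approximate float values from the 8-bit codes."""
    return codes.astype(np.float32) * scale + lo

rng = np.random.default_rng(7)
update = rng.normal(size=1000).astype(np.float32)  # hypothetical client update

codes, lo, scale = quantize_8bit(update)
restored = dequantize_8bit(codes, lo, scale)

compression = update.nbytes / codes.nbytes          # payload shrinks fourfold
max_error = float(np.abs(restored - update).max())  # bounded by about half a level
```

The client sends the codes plus two floats (lo and scale) instead of the full-precision update, trading a small, bounded rounding error for a fourfold reduction in communication.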
It is important to note that across this decade many (maybe most) of our devices in smart homes may be connected to the internet and hence be IoT devices with AI capabilities on the Edge. This may even include one’s fridge!
See Javier Fernandez-Marques, Running Federated Learning applications on Embedded Devices.
A tidal wave of technological change is about to flood the world once stand-alone 5G networks truly scale around 2024-2025. The wave of innovation and new product and service development, with a refocus around the Edge of the Network with the AIoT and the emerging Metaverse, will allow for a break from the BAU of the era of heavy industry and resulting pollution, and a move into the fourth industrial revolution with a cleaner, less polluting existence whilst still creating jobs and economic growth. It will require vision and investment into both infrastructure and people in terms of skills and educational training. Good governance standards and the application of Federated Learning with Differential Privacy are key to preventing the Metaverse and the AIoT in general from becoming an era of mass surveillance.
The author’s personal vision for our future with advanced AI technology is more optimistic! The future remains in our hands albeit our actions today will shape our world of tomorrow.
An era of decentralized data dispersed around the Edge of the Network will require Federated Learning with Differential Privacy as will scaling Machine Learning across the Healthcare sector. It will also align with greater resource and computational efficiencies and also enable a digital transformation across all sectors of the economy and enable us to emerge away from the era of heavy pollution into a cleaner era. It may also relieve the curse of the modern Data Scientist and combined with Neural Compression techniques and other advances in Machine Learning (to be explored in the next article) allow for an era of more compact and efficient models that will be needed in the era of the AIoT.
A final word: a great paper to read is Rolnick et al. (2019) Tackling Climate Change with Machine Learning. The co-authors include leading AI researchers such as Andrew Ng, Yoshua Bengio, and Demis Hassabis.
Author: Imtiaz Adam MSc Computer Science with AI research, MBA, Sloan Fellow in Strategy, Director Morgan Stanley, with extensive experience in financing infrastructure projects around the world.
Note: This is a guest post written by a guest author who is a specialist in the given field. The views, opinions, and all other content mentioned in the guest article are of the guest author and they do not represent the views of Marktechpost.
Editor’s Note: Feel free to contact the editor at Asif@marktechpost.com if you have any questions or suggestions related to the above article.
The post Future Vision & Direction of AI Part II: Scaling AI Whilst Preventing a Big Brother World & Solving The Curse of the Modern Data Scientist appeared first on MarkTechPost.