Inspired by the natural flocking behavior of animals, scientists try to achieve similar advantages by using swarm robots to fulfill more complex tasks in an easier and more cost-efficient way. In order to understand the behavior of such a dynamic system and make predictions for future states, a sufficiently accurate model is necessary.
However, the dynamics of the system such as the communication structure of the semi-autonomous multi-agent system in this thesis are sometimes unknown to the human. Thus, machine learning methods such as Gaussian Process Regression are applied in order to learn the system model. The influence of the collected training data on the accuracy of the learned model is analyzed for a linear and a nonlinear case of the system. Afterwards, the model is used to predict the future average positions in order to support the human operator to control the agents to a desired destination.
In the setup regarded in this work, the human can control the velocity of a subset of robots, while all agents are supposed to move gathered by exchanging information such as their position within the neighbors. Finally, the system behavior is learned through training data collected in an experiment and analyzed in a simulation.
Contents
Abbreviations and Formula Signs
1 Introduction
1.1 Human-Swarm Interactions
1.2 Gaussian Process Regression
1.3 State of the Art
2 Swarm Dynamics
2.1 Graph Theory
2.2 Swarm Dynamics and Consensus
2.3 System Analysis
3 Learning of Swarm Dynamics using Gaussian Process Regression
3.1 Basics of Gaussian Process Regression
3.2 Gaussian Process Regression for a Linear Multi-Agent System
3.3 Gaussian Process Regression for a Nonlinear Multi-Agent System
3.4 Prediction of the Future Position and Application
3.5 Experimental Verification
4 Summary and Outlook
4.1 Summary
4.2 Outlook
List of Figures
List of Tables
Bibliography
Abstract
Inspired by the natural flocking behavior of animals, scientists try to achieve similar advantages by using swarm robots to fulfill more complex tasks in an easier and more cost-efficient way. In order to understand the behavior of such a dynamic system and make predictions for future states, a sufficiently accurate model is necessary. However, the dynamics of the system such as the communication structure of the semi-autonomous multi-agent system in this thesis are sometimes unknown to the human. Thus, machine learning methods such as Gaussian Process Regression are applied in order to learn the system model. The influence of the collected training data on the accuracy of the learned model is analyzed for a linear and a nonlinear case of the system. Afterwards, the model is used to predict the future average positions in order to support the human operator to control the agents to a desired destination. In the setup regarded in this work, the human can control the velocity of a subset of robots, while all agents are supposed to move gathered by exchanging information such as their position within the neighbors. Finally, the system behavior is learned through training data collected in an experiment and analyzed in a simulation.
Kurzfassung
Inspiriert von dem Schwarmverhalten vieler Tiere, versucht man auch in der Wis- senschaft sich diese Starken zu Nutze zu machen, um komplexere Aufgaben ein- facher und kostengunstiger durchzufuhren. Um das Verhalten eines solchen Systems zu verstehen und daraus Schlusse uber zukunftige Zustande zu ziehen, ist ein ausreichend genaues Modell notwendig. Obwohl die Dynamik des Systems ge- legentlich nicht bekannt ist, kann sie durch maschinelle Verfahren erlernt werden. Fur das betrachtete System in dieser Arbeit, welches einem teilautonomen Robo- terschwarm mit einer unbekannten Kommunikationsstruktur entspricht, wird das Regressionsverfahren des GauBprozesses angewendet. Der Einfluss der gesammel- ten Trainingsdaten auf die Genauigkeit des erlernten Models wird fur lineare und fur nichtlineare Systeme analysiert. Um den Menschen beim Steuern des Schwarms auf ein gewunschtes Ziel zu unterstutzen, wird das erlernte Model fur Vorhersa- gen der zukunftigen gemittelten Positionen verwendet. Wahrend der Mensch nur die Geschwindigkeit einzelner Schwarmmitglieder beeinflussen kann, mussen sich alle Agenten gemeinsam als Schwarm in die vorgegebene Richtung fortbewegen, indem die einzelnen Agenten Informationen wie ihre eigenen Positionen mit ih- ren festgelegten Nachbarn austauschen. AbschlieBend wird das Schwarmverhalten mittels Trainingsdaten aus einem Experiment gelernt und in einer nachfolgenden Simulation gepruft.
Abbreviations and Formula Signs
Abbreviations
Abbildung in dieser Leseprobe nicht enthalten
Formula Signs
Abbildung in dieser Leseprobe nicht enthalten
Chapter 1
Introduction
The first chapter aims to give a motivation and an insight into Human-Swarm Interactions. The setup of the system and the definition of task are briefly worded. Afterwards, the general idea of learning the system dynamics by means of a machine learning method called Gaussian Process Regression is given. Finally, the structure of this thesis is shortly explained.
1.1 Human-Swarm Interactions
Swarm formation is a naturally occurring phenomenon in the animal world. Animals like birds, fishes and insects communicate with each other in varying ways and implement swarm strategies to protect themselves against predators, amplify their success in the search of food and succeed in the battle for survival. Inspired by this arrangement, the research on swarm agents has increased dramatically during the last decade since a large number of cheap and simple agents can perform difficult tasks in a more efficient and cheaper way than a single more complex agent, giving robustness and flexibility to the group [NM13].
Since the sensing range of a swarm is wider than the reach of an individual, a network of agents called a multi-agent system (MAS) and actuate in various places at the same time. An MAS such as in Fig. 1.1 consists of several agents which are able to interact with each other and the environment in order to conduct a given task. The achievement of this task is luckily not affected by the failure of a single agent due to the redundancy of the system since there are still many other agents working [NM13]. This function can be of critical importance for the operation of an MAS in an unknown hazardous environment.
Abbildung in dieser Leseprobe nicht enthalten
Figure 1.1: Multi-Agent System of five agents, distinguished by different colors.
Although it is often expected that the swarm works autonomously, it is beneficial or even necessary to have a human operator controlling the MAS. In case of danger, critical situations, uncertainties or wrong actions of the MAS, the human can decide on a better action due to his advanced information and thus achieve a better performance [Kol+16]. The domains of application are equally diverse and they range from exploration and mapping to environmental monitoring, tracking, search and rescue. The interaction between the operator and the swarm is accomplished through an interface to control all swarm members towards a common location. The operator receives state feedback from the swarm's agents and needs to react properly by giving adequate inputs. The feedback can be provided to the human operator in many different forms, such as through audio, visual, and/or tactile information [Hat+15]. It would be extravagant and inefficient in the case of a large number of agents to control each of the agents separately. Therefore, controlling the whole swarm by only operating on a smaller, arbitrary number of agents with only one operator command is an opportunity for simplification and more efficiency, as long as interactions between agents, such as preventing collisions, are handled autonomously. The human gives a control input to a subset of agents considered to be accessible to the operator and visually obtains the average position of all accessible agents in order to decide on a new command. Moreover, it is assumed to be possible to get all agents' positions from the network. Since cooperating agents are able to achieve a greater efficiency and operational capability, it requires not only a communication system but also a control algorithm. One possibility is a distributed control strategy called consensus control which allows each agent to get information about the neighbored agents' positions. Based on these information and its own position, the agent can decide autonomously on a strategy to achieve the common goal [RB08]. This requires a connection graph to enable communication between the agents. However, it often happens that the system is delivered from the manufacturer as a blackbox system, such that inputs can be given to the system and the corresponding outputs can be obtained. The connection graph and the inner dynamics of the system are unknown, but necessary in order to simulate the system and make predictions for future states. Hence, the main task in this research is to learn the model by means of a machine learning method called Gaussian Process Regression. It is aimed to support the human to control the MAS more easily by providing the future average positions of the accessible agents. The connection between the agents is assumed to be fixed, although it would be also possible to implement a varying communication.
1.2 Gaussian Process Regression
Modeling and simulation play an important role when the cost of an experiment might be too high or its execution too dangerous. System models are often used to test hypotheses and to predict desired information. One possible way to approximate the real system is to use methods of machine learning which allow inference of a function from a set of training data. Depending on the characteristics of the output, this problem is divided into classification for discrete outputs and regression, where outputs are continuous. In general, a training set has input and output data and is collected from a real system. In order to predict the output value for new inputs, an appropriate function is necessary, which is based on the training data and the GPR algorithm. While the specialty of the GPR is the possibility to learn any dynamic without having much knowledge about the system, it is even applicable for nonlinear systems. Moreover, it can handle uncertainties caused by system noise, model uncertainty or measurement noise [TDR09]. However, the disadvantages lie in computational limitations, since this method demands a memory complexity of O(n 2) and a computational complexity of O(n 3).
1.3 State of the Art
Although the most famous literature about the Gaussian Processes (GP) in machine learning is given from Williams and Rasmussen [RW04], the basic theory was provided by the work of Wiener [Wie64] and Kolmogorov [Kol41] in the 1940's [RW04]. Hence, prediction for especially time series analysis is not a new topic[RW04]. Over the last decade, the use of GP has grown significantly. Nowadays, the Gaussian Process is often applied for object recognition which is a classification problem. In contrast, Sacks et al. described the Gaussian Process Regression for computer experiments such that computer simulations could be modeled [RW04]. Although the GP is a powerful tool for static nonlinear functions, the use of the GPs in many real world-application with nonlinear dynamical behavior is an emerging topic [Sve+16]. It is often used in order to predict future states, such as forecasting the energy consumption in Sweden four days in advance [Sve+16]. Hence, this thesis aims to introduce the GP in order to learn the system dynamics of an MAS whose communication system among the agents is unknown. Afterwards, future states of the swarm such as the future positions are predicted by means of the GPR.
While Chapter 2 provides basic information for the MAS such as graph theory, consensus control and analysis with respect to its system properties, Chapter 3 introduces the theoretical background of Gaussian Process Regression. Afterwards, the GPR is not only applied to learn the system dynamics of a linear system, but also of a nonlinear system where the agents have limitations in speed. The GPR's way of working is explained step by step, starting with a few agents and then increasing the number of agents in the swarm. The influence of different types of consensus control and communication structures is compared. In order to support the human controlling the swarm to a desired position, the future average positions of the accessible agents are predicted. Afterwards, the utility of the prediction is discussed by asking some test persons to steer the average position of the accessible agents on a given trajectory. Finally, the modeling by means of the GPR is applied with training data collected in an experiment. At the end, the research results are summarized in Chapter 4 and an outlook to possible research topics is given.
Chapter 2
Swarm Dynamics
In this chapter, general concepts of graph theory are presented as they are necessary to express communication systems between agents. Then, the swarm's dynamics and its decision making algorithm - consensus control- are introduced. Lastly, the given system is analyzed with respect to stability, observability and relative degree.
2.1 Graph Theory
Since the behavior of the swarm is based on communication and interaction between the agents, some basics of graph theory are introduced here. The communication structure of the network can be depicted as a graph which is defined as a structure representing a group of objects and their relation. In this research, the graph is set as finite, undirected, unweighted, and fixed. As it is finite, the node set V — { v 1 ,v 2 ,...,v n } contains a finite number of elements n. For the sake of simplicity, this set is also described as V — {1 , 2 ,...,n } [ME10]. Each element of V is then a node of the graph. An edge of the graph connecting two nodes i and j with i — j is described by (i, j) eE. The graph is undirected, which means that the edges have no orientations. Therefore, (j, i) eEalso holds. The finite graph G is then formally defined as G —(V , E). The nodes and edges of G are represented as V(G)andE(G), respectively [ME10]. Two nodes, i and j, are called adjacent if there exists an edge between them [ME10]. As the graph is unweighted, edges do not carry specific, varying weights. Finally, as the graph is fixed, nodes and edges are considered to be consistent and not to change. Fig. 2.1 gives an example for such a graph G —(V , E).
Abbildung in dieser Leseprobe nicht enthalten
Figure 2.1: A graph G —(V , E) with five nodes in the set V — {1 , 2 ..., 5} and edges E — {(1 , 2) , (1 , 5) , (2 , 4) , (2 , 5) , (3 , 4) , (3 , 5) , (4 , 5)}.
The neighborhood N i C V of node i is described by the set N — {j e V|(i, j) e E} which contains all nodes that are adjacent to z [ME10]. Since the graph is assumed to be undirected, i eN j also holds for j eN i. Moreover, relevant standard classes of graphs are presented in the succeeding subfigures in Fig. 2.2. Figure 2.2a reveals a graph where all nodes are connected with each other such that every node receives information about all other nodes. In this case, the graph is called complete [ME10]. Furthermore, there is an option to put all nodes of V in a line, as in Fig. 2.2b. Then, there exists a path that connects all nodes, with the edges (i, j) eEif and only if j — i +1 for i —1 ,...,n - 1 [ME10]. If the endpoints of this line are additionally connected with each other, the structure is called a circle, see Fig. 2.2c. That means (i, j) eEif and only if i - j — ±1mod n [ME10]. Finally, the star structure, as shown in Fig. 2.2d, is a graph where (i, j) eEif and only if i —1or j —1 so that every node is exclusively connected to the first node which depicts the center of this graph structure [ME10].
Abbildung in dieser Leseprobe nicht enthalten
Figure 2.2: Four examples of relevant standard classes of graphs describing different data exchange networks.
The number of nodes that are connected to node i in G is described by the degree d (i)of the given node. Then, the degree matrix A(G) of G containing the node degree of G on the diagonal is described by
Abbildung in dieser Leseprobe nicht enthalten
with n being the number of nodes in G. Comparatively, the adjacency relationship in the graph G is defined by the symmetric n x n adjacency matrix A (G) with the entries
Abbildung in dieser Leseprobe nicht enthalten
As a result, the Graph Laplacian L(G) — A(G) — A (G) describes the relations between the nodes of graph G. The number of adjacent nodes for every node is given on the diagonal of L(G) and the element l i j for i — j describes the connection of the agents. In total, the elements l i j are defined as
Abbildung in dieser Leseprobe nicht enthalten
Since the communication is undirected, the Graph Laplacian is symmetric and positive semidefinite with the eigenvalues A 1(G) < A 2(G) < ... < A n (G) and A 1(G) — 0 which means that the smallest eigenvalue is always zero [Kel09]. The algebraic multiplicity of the zero eigenvalue depends on the graph's number of connected components. Since the considered graph is always simply connected, the Graph Laplacian L contains only one zero eigenvalue while the others are strict positive. For all graphs, the rows and columns of the Laplacian sum up to zero. Therefore, LIn — 0 anH TL — 0 always hold with the column vector 1 n only containing ones. This implies that l n is the eigenvector associated with the zero eigenvalue A 1 — 0. Hence, the Graph Laplacian gives all information about the connection structure between the agents.
2.2 Swarm Dynamics and Consensus
Shared information represents an important condition for cooperation within a dynamic system. In this thesis, a group of n interconnected agents are considered. An agent is a computer system which can act autonomously to a certain degree. By interacting with its environment and the other agents in the network, it is possible to fulfill a desired task. The collection of several agents is called a multi-agent system (MAS). The communication structure is described by an undirected, unweighted and fixed graph G — (V , E) with the node set V — {1 ,...,n } and the edge set E C V x V. The swarm is located on a 2D-plane. Since the simplest and usual way to model a swarm is an integrator, each agent i ’s dynamic is modeled by
Abbildung in dieser Leseprobe nicht enthalten
with the position q i G R 1x2 and the velocity input u i G R 1x2. The MAS is controlled by an operator who interacts through a device such as a computer or tablet as interface. The goal is to support the operator to lead the agents to a desired reference position [HCF15]. If all agents perform independent activities, the human operator is then asked to give an input to all agents each which results in a complexity of O(n) for n agents, which has a linear relation to the number of agents [Kol+16]. However, this effort can be decreased by only giving one command to a subset V h of the swarm with an arbitrary size of m agents which are considered to be accessible. This method can be only applied if the agents are able to handle interactions among each other autonomously, such as avoiding collision. Accessibility means that the operator can send a command signal u h to only these agents and receives feedback information from this subset of the swarm [HCF15]. Based on the average position of all accessible agents
Abbildung in dieser Leseprobe nicht enthalten
as visual feedback, the human brain generates the next appropriate velocity input uh e R 1x 2 for the MAS and conveys it to all accessible agents in the set V h. In this case, the necessary number of actions required by the operator is O(m) and is independent of the size of the MAS, allowing to control a swarm of any size by only one human operator input. Then, the operator only needs to focus on the goal of the swarm and treats the swarm as single entity [Kol+16]. On top of that, the cognitive burden of the operator is not affected by adding or removing multiple agents [Kol+16]. Although the operator only interacts with the accessible agents, it is assumed that all states can be measured at any time, such that the system output of the MAS is
Abbildung in dieser Leseprobe nicht enthalten
where I n e R n x n is the identity matrix. This method of controlling a subset of the MAS is only possible, if all agents are able to exchange information among each other. All members are supposed to collaborate and agree on a variable of interest called information state,which is supposed to be accurate and consistent. If the agreement on a value is successful, they have reached consensus. While it is possible to exchange information through centralized schemes by sharing the information with only one central agent or sharing information via a fully connected network, the focus of this research will be on distributing dynamics where the agents control the dynamics on their own by only assuming neighbor-to-neighbor interactions. Thus, the information state update is done by only considering the information state of their neighbors. Not only does this approach reduce communication and sensing requirements but also improves scalability, flexibility, reliability and robustness. Note that in a fixed graph G, consensus is achieved if and only if the graph G has one or more spanning trees. These trees are subgraphs including all nodes of G thus ensuring that all nodes are - directly or indirectly through their edges shared with other nodes - connected to each other.
In order to drive the information state such as position q of all agents toward a common value through state synchronization, the update law
Abbildung in dieser Leseprobe nicht enthalten
is designed, where q i (t) and q j (t) describe the positions of agent i and agent j at time t.If a group of agents is supposed to gather together at a yet unknown location via consensus, each agent creates an information state q i that represents the i -th agent's local position. The initialized information state q i (0) of each agent is set to the chosen position where the agent can meet the others. By means of the consensus method (2.11), all agents communicate with their neighbors and negotiate a place for gathering. All agents are then driven to this arranged meeting place through on-board controllers. In every change of the environment, the information state of each agent is reset and the negotiation process is repeated [RB08]. For the sake of simplicity, the possibility of a collision between the agents is neglected such that all agents can reach the same position at the same time. The goal is to reach a desired reference position q c with the swarm, such that
Abbildung in dieser Leseprobe nicht enthalten
is achieved. Note that the reference position q c e R 1x 2 is only known by the operator and the system designer cannot directly access this information. Thus, it is assumed that the next action u h is determined according to the distance q c — y h between the actual average position y h and the reference value q c which is computed in the brain of the human operator. The block diagram is illustrated in Fig. 2.3.
Abbildung in dieser Leseprobe nicht enthalten
Figure 2.3: Block diagram of the overall system. The human operator decides on a velocity input u h for the swarm based on the error between the average position of the accessible agents y h and the reference position q c which is desired to be the destination of the swarm. While the positions q of all agents are the state and system output y of the system, the average value of all accessible agents' positions y h is led back to the human operator.
According to the input signal of the human operator, each agent i GVresponds in a velocity input u i (2.4) which is defined as
Abbildung in dieser Leseprobe nicht enthalten
where u c,i describes the contraction speed based on the coupling of the i -th agent with its neighbors. 6 i gives information about the accessibility of agent i, defined as
2.2.1 P-Consensus
In case of only exchanging position information among the agents, the consensus is called P- Consensus. The velocity of each agent is proportional to the distance to its neighbors which can be expressed by the adjacency matrix in (2.2). Then, it holds that the contraction speed of agent i is computed as
Abbildung in dieser Leseprobe nicht enthalten
where a ij is the (i, j)-th entry of the adjacency matrix A (G) G R nxn and q i, q j are the information states of agent i and its neighbors, respectively. If a ij — 0, there is no information exchange between agents i and j. Equation (2.11) for every agent i can be also rewritten in matrix form as
Abbildung in dieser Leseprobe nicht enthalten
which represents the swarm dynamics, where q —[ q 1 ,...,q n ] T contains the position of every agent as information state and L G R nxn is the Laplacian matrix associated with G.
Summarizing the equations (2.4), (2.9), (2.12) and (2.6), the state-space model of the MAS is described as
Abbildung in dieser Leseprobe nicht enthalten
where q e R n xf describes the positions of all agents and equals to the system output, L e R n x n is the Graph Laplacian representing the system matrix, U h e R 1xf the human input and D e R n the input matrix whose z -th element is equal to 5 i, as defined in (2.10).
To illustrate the way of working of P-Consensus, Fig. 2.4 shows an example for a fixed communication structure in which three agents are connected in line structure.
Abbildung in dieser Leseprobe nicht enthalten
Figure 2.4: An MAS with three agents in line structure. The graph G —(V , E) has three nodes in the node set V — {1 , 2 , 3} and the edge set E — {(1 , 2) , (2 , 3)}.
Assume that only the first agent is accessible. Then, the input matrix is
Abbildung in dieser Leseprobe nicht enthalten
Applying the equations (2.4), (2.9) and (2.11) yield
Abbildung in dieser Leseprobe nicht enthalten
These equations can be also summarized in matrix form
Abbildung in dieser Leseprobe nicht enthalten
As a consequence of (2.11), the information state q i (t) of agent z is driven towards the information states ofits neighbors. Although (2.11) ensures the agreement ofthe information states of a group, it does not permit specifying a desired information state [HCF15].
2.2.2 PI-Consensus
If the agents are not only informing their neighboring agents about their position, but additionally exchange an auxiliary variable as an the integrated value of the contraction speed u c,i, the consensus is then called PI-Consensus and is given as
Abbildung in dieser Leseprobe nicht enthalten
This enables a better assembly and avoids bias between the agents in a steady state, due to the integrator part. However, there are oscillations in the movement of the MAS which makes it harder to control.
The system from above with the equations (2.4), (2.6), (2.9) and (2.18) can then be rewritten in matrix form as
Abbildung in dieser Leseprobe nicht enthalten
with $ — [ $ T ...$ T ] T e R n x2, q — [q T ...q T ] T e R n x2, L e R n x n and D e R n. The states q and $ can be summarized to the state x —[ q, $ ] T which is the new information state to exchange. The new system matrix A PI e R 2 n x2 n, the input matrix B PI e R 2 n and the main output matrix C PI e R 2 n x2 n of the system are then given as block matrices
Abbildung in dieser Leseprobe nicht enthalten
Since it is assumed that the Graph Laplacian L and thus the system matrix are unknown, a method to estimate the system dynamics is introduced in Chapter 3.
2.3 System Analysis
Before starting with the learning, some basic properties of the system are discussed. This includes stability, observability and the relative degree of the system.
2.3.1 Stability
At first, the stability of the system is analyzed. A dynamic linear system with n agents is stable if the real parts of the system matrix's eigenvalues are less than or equal to zero. In case of the model with P-Consensus (2.13), the system matrix is A P = — L, with the Graph Laplacian L defined in (2.3) describing the connection between the agents. Since L is a symmetric, positive semi-definite matrix with positive eigenvalues and one eigenvalue equal to zero, the system matrix A P is negative semi-definite with negative eigenvalues and one zero eigenvalue. For this reason, the linear system with P-Consensus (2.13) is stable.
In order to compute the eigenvalues of
Abbildung in dieser Leseprobe nicht enthalten
for Pl-Consensus, the system matrix may be rewritten as a Kronecker product A pi = U ® L, with
Abbildung in dieser Leseprobe nicht enthalten
The eigenvalues of A pI are then the products of all eigenvalues of U and L [Sch13]. Since the Graph Laplacian L is symmetric and positive semidefinite, it has only real eigenvalues bigger or equal to zero. The multiplication with the two eigenvalues A 1 / 2 = —0 . 5 ± 0 . 866 i of U lead to 2 n complex conjugated eigenvalues of A pI, where the real parts are all smaller or equal to zero. This explains the oscillating behavior of the system with PI-Consensus. The smaller the real parts of the eigenvalues, the more damped is the oscillation of the agents. Moreover, the complex part of each pair of eigenvalues describes the corresponding natural frequency. There are especially two eigenvalues of A pI equal to zero. Because of the double zero eigenvalue, a more precise definition for stability is required. According to the stability property for linear systems in [Cla06], the system is stable if additionally to the upper condition the algebraic m \ and geometric multiplicities d \ are equal for all eigenvalues with a zero real part. Since the system matrix A pI has two eigenvalues in the origin for all connection graphs of any size, the dynamic system is stable if and only if the geometric multiplicity is two as well. This can be easily computed with the formula d \ = 2n — rank(A PI — AI 2 n) where I 2 n G R 2 n x2 n is the identity matrix [Bur+07]. Since the case for a zero eigenvalue A = 0 has to be analyzed, d \ = 2 n — rank(A PI) holds. Because of A PI = U L, the rank of A PI is the product of the ranks of U and L [Sch13]. The rank of U is two, due to the linear independence of its rows and columns. Since the Graph Laplacian is a symmetric matrix of size n x n, there is an orthonormal basis of its eigenvectors such that L is diagonalizable. An orthogonal matrix S with full rank n exists such that
Abbildung in dieser Leseprobe nicht enthalten
holds. Since the diagonal matrix D L only contains eigenvalues of L on the diagonal, its rank is rank(D L)— n - 1 due to the zero eigenvalue of L. By means of Sylvester's rank inequality and general properties of the rank,
rank(A) + rank(B) - n < rank(AB) < min(rank(A) , rank(B)) (2.26)
Abbildung in dieser Leseprobe nicht enthalten
can be used in order to determine the rank of L. For the substitution E := SDL, it is
Abbildung in dieser Leseprobe nicht enthalten
So, the Graph Laplacian always has the rank n − 1. The geometric multiplicity can be then computed as
Abbildung in dieser Leseprobe nicht enthalten
which equals to algebraic multiplicity m \ — 2. The system is stable.
2.3.2 Observability
Observability means that the states q of the system can be reconstructed given the history of the input u h and the output y. This is especially important if the states of the system are not measurable. However, because of the assumption that the state q is measurable at any given time t and is also the system output of the MAS, the considered output matrix for observability will be the identity matrix of size n x n for P-Consensus and size 2 n x 2 n for PI- Consensus. Although it is already known that the system has to be observable, this property can be also verified by the condition for observability where the observability matrix
Abbildung in dieser Leseprobe nicht enthalten
must have full rank which is always given due to the identity matrix as output matrix (2.21).
2.3.3 Relative degree
The relative degree r is defined as the difference between the highest output degree and the input in differential equations and is a measure for the delay between the system input and output. Hence, the bigger the relative degree of the system, the more inert is the reaction on a change in the input. Only in case of r — 0, the input can influence the output directly [Lun08].
Differentiating the system output
Abbildung in dieser Leseprobe nicht enthalten
In order to get the system input in the equation, only one differentiation of the system output is required. Therefore, r — 1 always holds.
Chapter 3
Learning of Swarm Dynamics using Gaussian Process Regression
In this section, some fundamental knowledge about regression including a popular method called Gaussian Process Regression (GPR) will be acquired. Afterwards, the method is applied for the MAS of the previous chapter in order to learn its system dynamics. Later on, The learned model is used to support the human operator through predictions of the future states. Finally, the learning process is applied for training data of an experiment.
3.1 Basics of Gaussian Process Regression
Supervised learning is a useful method to learn the behavior of a system by using a known dataset of inputs and the corresponding outputs in order to make predictions. It is divided into classification for discrete outputs and regression in case of continuous outputs and is an important tool for machine learning and statistics [RW04]. There are several approaches to solve the problem. One possibility is to allow only linear functions of the input. However, this has the disadvantage that the predictions are poor due to a not well modeled target function. On the other hand, it is also possible to give a prior probability to all possible functions while more likely functions are given higher probabilities [RW04]. The prior represents assumptions regarding what kind of functions are expected before seeing the data. The main problem of the latter approach is the infinite set of possible functions and, in contrast, the finite time available for the computation of this set. This challenge can be solved by using the Gaussian Process (GP) which can be conveniently used to specify flexible nonlinear regressions. The training set is used in order to solve a convex optimization problem by specifying the 'best fit' model for the data and use this estimated model to make 'best guess' predictions for future test input points.
Consider the training set D — {(x i,y i)| i —1 ,...,N } with N observations, where x i e R d is an input vector and y e R denotes the corresponding scalar output. The column vector inputs for all N observations are accumulated in the N x d design matrix X, and the outputs are collected in the vector y such that the data set can be written as D —(X, y). Given this training set, predictions1 for new, not included inputs x * can be made. For this purpose, it is necessary to find an appropriate function f that is able to make estimations for any input value. All formulas in the following are referred to [RW04], [Gpr] and [Exa].
3.1.1 Weight-space View
Getting started with simpler basic fundamentals, a linear regression model of the form
Abbildung in dieser Leseprobe nicht enthalten
is introduced, where x denotes the input vector, 3 - N (0 , S p) the vector of weights with a zero mean and the covariance matrix £ p e R p x p on the weights, y is the observed target value and f a stochastic function which gives the associated output value for every input x.The target y differs from f (x) by an additive noise, which is assumed to follow an independent, identically distributed (i.i.d.) Gaussian distribution with zero mean and variance a f, denoted by e N(0 ,a f) e R [RW04]. Independent, identically distributed means here that the random variables are independent from each other and can achieve a value with the same probability. According to [RW04], the following probability density of the observations given the parameters, also known as the likelihood, is increased by the noise assumption together with the model and can be factored over observations in the data set
Abbildung in dieser Leseprobe nicht enthalten
For Bayesian linear regression, a prior belief over the parameters has to be determined before looking at the observations. By means of the Bayes' rule, the posterior distribution over weights can be estimated based on the training data
Abbildung in dieser Leseprobe nicht enthalten
The normalizing constant p (y | x) called marginal likelihood is independent of the weights and computed by the integration [RW04]
Abbildung in dieser Leseprobe nicht enthalten
The parameter posterior is then used to compute the predictive distribution p (y *| x * , D) over the testing output y * for a new, unknown test point x *. In the Bayesian linear regression, the linear combination of the inputs delivers the output. Although this method is quite simple to implement and interpret, it allows only a limited flexibility if the connection between input and output can not be reasonably approximated by a linear function. Thus, the model will give poor predictions.
[...]
1 Although prediction is usually used to estimate future values, various literatures such as [RW04] use this term in order to estimate the corresponding value for any input. Hence, prediction in this thesis means estimation for an input by means of the GPR.
- Quote paper
- Gamze Alkan (Author), 2017, Human-Swarm Interactions. Learning of Swarm Dynamics using Gaussian Processes, Munich, GRIN Verlag, https://www.grin.com/document/511613
-
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X. -
Upload your own papers! Earn money and win an iPhone X.