Particle Swarm Optimisation

Jayanti prasad Ph.D
Aug 11, 2020


Introduction

Particle Swarm Optimisation or PSO is one of the most interesting research projects I have worked on. It opened a new and interesting dimension to my professional work. I plan to post here a comprehensive account of the work I have done and of other work I am familiar with, including important references. For the most part, I will give a popular account of PSO that, I hope, will be understandable without any technical background.

There are many reasons for my special interest in PSO. It is about computational agents which possess some sort of 'intelligence', and that has been one of my fundamental interests. Apart from that, optimisation is a universal problem: nature and society must find solutions to the optimisation problems they come across all the time. Another factor which makes PSO (or any other agent-based optimisation method) more interesting than other methods is that it employs 'intelligent agents' to solve the problem, and I think that is the best approach we can have.

Optimisation is closely related to what we do in our day-to-day life. It is nothing but maximising or minimising a certain outcome by changing the parameters which affect it. For example, if we are selling something then our aim is to maximise the profit, which may depend on many factors: the number of items we sell, the cost at which we buy them, the tax we pay and the expenses we bear for holding the items in our inventory. In most cases the quantity being optimised, called the cost function, is a non-linear function of the variables it depends on. In some cases the solution must also satisfy certain constraints. Optimisation also sits at the heart of modern machine learning, in particular machine learning based on artificial neural networks.

There are two main challenges which need to be addressed in a typical optimisation problem. The first is that in many practical situations the dimensionality of the space over which we optimise (maximise/minimise) the cost function may be very large, perhaps thousands or even millions of dimensions. The second challenge is that the cost function may not be globally convex or even well behaved (continuous, differentiable, etc.), and in many cases it has multiple local minima. The brute-force approach of constructing a grid in the parameter space and computing the value of the cost function at every grid point quickly becomes computationally infeasible. For example, if we have a 10-dimensional search space and draw 10 grid points along each dimension then we will have 10^10 points to search for the global optimum.
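To make that counting concrete, here is a tiny Python sketch; the cost function and the search box are made up purely for illustration, and the point is only how quickly the number of grid points explodes:

```python
import numpy as np

# a made-up cost function, just for illustration: a simple sum of squares
def cost(x):
    return np.sum(np.asarray(x) ** 2)

n_dim, n_grid = 10, 10                    # 10 dimensions, 10 grid points per axis
grid_1d = np.linspace(-5.0, 5.0, n_grid)  # the same 1-d grid along every axis

# the full grid has n_grid ** n_dim points; we only count them here, because
# actually looping over all of them (itertools.product(grid_1d, repeat=n_dim))
# would mean 10^10 evaluations of the cost function
n_points = n_grid ** n_dim
print(f"grid points to evaluate: {n_points:,}")   # 10,000,000,000
```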

There are a large number of interesting optimisation problems, such as the travelling salesman problem, in which a salesman has to find the optimal order in which to visit a set of cities, without visiting any city twice or missing any, given the locations of the cities. Nature does optimisation all the time. For example, we come across circular and spherical shapes in nature more often than other shapes because, for a given enclosed area or volume, they have the shortest boundary and the smallest surface area respectively. Everything we see around us is the result of some optimisation. In natural systems certain configurations are preferred over others because they have lower 'potential energy'. Almost all of physics is based on the 'variational principle', which is directly related to optimisation. The bottom line is that the goal of optimisation is to find the configuration, or set of variables, which leads to the minimum value of the cost function.

Now let me come to the main topic of discussion here, namely Particle Swarm Optimisation or PSO. As mentioned above, nature does optimisation all the time, and optimisation is an important component of many computing problems. It is therefore not a surprise that many optimisation schemes, including PSO, are inspired by nature. There is a field of computer science called 'nature-inspired computing', or Natural Computing (as Wikipedia titles it), under which many such techniques, such as ant colony optimisation, neural networks, genetic algorithms and simulated annealing, are studied. You can find an outstanding review of some of these efforts in the review by Fister et al. (2013) listed in the references below.

We have all seen flocks of birds flying in very interesting, coordinated patterns that change very fast. It looks puzzling how the birds coordinate with each other without any central command. Systems which show coordinated behaviour without any central authority are common in nature and society. For example, in a society anyone who buys or sells plays an important role in the market dynamics without actually understanding economics or following any central command. Essentially, everyone is making decisions locally, and that gives rise to very large-scale patterns.

The Algorithm

Let us consider a set of agents which are searching for the place at which the concentration of a certain quantity, let us say gold, is maximum. Given that the search space is very large, it is not practical to visit every place and compare it with all the others. PSO proposes the following:

  • Every agent knows the place at which it has found the best value so far, X_p_best or the personal best, and also knows the position which is the best out of all the X_p_best; let us call that X_g_best or the global best. Note that if the search space is n-dimensional then X will be an n-dimensional vector.
  • Every PSO agent (particle) has the following attributes: position X_i, velocity V_i and personal best X_p_best.
  • At every time step (iteration) the position and velocity of each PSO particle are updated in the following way (a minimal code sketch is given after this list):

    V_i(t+1) = w V_i(t) + c1 ξ1 [X_p_best - X_i(t)] + c2 ξ2 [X_g_best - X_i(t)]
    X_i(t+1) = X_i(t) + V_i(t+1)
  • Here w, c1 and c2 are design parameters (hyper-parameters) and ξ1 and ξ2 are two uniform random numbers in the range [0, 1], drawn afresh at every update. The parameter w quantifies how strongly a particle persists with its current velocity (it is sometimes called the 'inertia' weight), c1 quantifies the weight given to a particle's own experience (self learning) and c2 quantifies the learning from the experience of others. There are prescriptions for setting the values of these parameters. Apart from these, the number of particles or agents we use is also a hyper-parameter. There are some considerations about the choice of w, c1 and c2, and Kennedy (2003) gives the values of these parameters as 0.729, 2.05 and 2.05 respectively. We will use these for most of our experiments.
  • The positions and velocities of the particles are initialised randomly at the beginning, and the updates are iterated until some convergence criterion is satisfied.
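Below is a minimal sketch of the 'global best' version of the algorithm described above, written in Python with NumPy. The function name, its interface and the stopping rule (a fixed number of iterations) are my own choices, not taken from my GitHub code; it uses the hyper-parameter values quoted above, and velocity limits and boundary handling (discussed in the next section) are omitted to keep it short.

```python
import numpy as np

def pso(cost, bounds, n_particles=20, n_iter=200,
        w=0.729, c1=2.05, c2=2.05, seed=0):
    """Minimal global-best PSO: minimise `cost` inside the box `bounds`.

    bounds is array-like of shape (n_dim, 2), holding the lower and upper
    limit of each dimension of the search space.
    """
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    n_dim = bounds.shape[0]
    span = bounds[:, 1] - bounds[:, 0]

    # random initial positions inside the box and small random velocities
    x = bounds[:, 0] + span * rng.random((n_particles, n_dim))
    v = 0.1 * span * (rng.random((n_particles, n_dim)) - 0.5)

    p_best = x.copy()                                  # personal best positions
    p_best_val = np.array([cost(xi) for xi in x])      # personal best values
    g_idx = np.argmin(p_best_val)
    g_best, g_best_val = p_best[g_idx].copy(), p_best_val[g_idx]

    for _ in range(n_iter):
        xi1 = rng.random((n_particles, n_dim))         # the random numbers ξ1
        xi2 = rng.random((n_particles, n_dim))         # and ξ2, redrawn each step
        # velocity and position updates of the standard (inertia-weight) PSO
        v = w * v + c1 * xi1 * (p_best - x) + c2 * xi2 * (g_best - x)
        x = x + v

        vals = np.array([cost(xi) for xi in x])
        improved = vals < p_best_val                   # update personal bests
        p_best[improved] = x[improved]
        p_best_val[improved] = vals[improved]

        g_idx = np.argmin(p_best_val)                  # update the global best
        if p_best_val[g_idx] < g_best_val:
            g_best, g_best_val = p_best[g_idx].copy(), p_best_val[g_idx]

    return g_best, g_best_val

# usage: minimise a 5-dimensional quadratic bowl
best_x, best_val = pso(lambda x: np.sum(x ** 2), bounds=[[-10.0, 10.0]] * 5)
```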

This is the basic algorithm, but many other variations are available which differ from each other in different ways. PSO is very intuitive and close to the real world, in which the knowledge we use to carry out a task comes either from our own experience or from the experience of others.

The basic algorithm given above is not enough on its own, and we have to take the following considerations into account:

  1. Initial conditions — how to set up initial positions and velocities of the particles
  2. Velocity constraints — it has been found that the velocities of the particles can grow very large very quickly, so we must put some upper limit (v_max) on the velocity of each particle (a sketch of one common way to handle this is given after this list).
  3. Boundary conditions — velocity constraints alone are not enough; we also need to impose some boundary conditions to keep the particles within the search range.
  4. Topology — we may allow particles to share information with all other particles or follow some particular topology for this purpose.
  5. Convergence criteria — we must have some convergence criterion, since a stochastic search only converges asymptotically.
  6. Hyper-parameters — number of particles and various weights must be decided carefully.
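The sketch below illustrates one common way of handling points 2, 3 and 5: the velocity is clamped to a fraction of the search range, particles that leave the box are reflected back, and the iteration stops once the global best has not improved for a given number of steps. The specific choices here (the 0.2 fraction, reflecting walls, a patience counter) are illustrative conventions of mine, not prescriptions from the original papers.

```python
import numpy as np

def clamp_velocity(v, span, frac=0.2):
    # limit |v| along each dimension to a fraction of the search range (v_max)
    v_max = frac * span
    return np.clip(v, -v_max, v_max)

def reflect(x, v, bounds):
    # reflecting boundary: fold positions back into the box and reverse
    # the velocity component that carried the particle outside
    lo, hi = bounds[:, 0], bounds[:, 1]
    below, above = x < lo, x > hi
    x = np.where(below, 2.0 * lo - x, x)
    x = np.where(above, 2.0 * hi - x, x)
    v = np.where(below | above, -v, v)
    return x, v

class ConvergenceCheck:
    """Stop when the global best has not improved by `tol` for `patience` iterations."""
    def __init__(self, tol=1e-8, patience=50):
        self.tol, self.patience = tol, patience
        self.best, self.stale = np.inf, 0

    def __call__(self, g_best_val):
        if g_best_val < self.best - self.tol:
            self.best, self.stale = g_best_val, 0
        else:
            self.stale += 1
        return self.stale >= self.patience
```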

History

The PSO algorithm was introduced by Kennedy and Eberhart in 1995, and what motivated them to put it forward was a hypothesis of Edward O. Wilson (one of the finest scientists of the 20th century, whose influential book 'Sociobiology' has inspired generations of thinkers):

In theory at least, individual members of the school can profit from the discoveries and previous experience of all other members of the school during the search for food. This advantage can become decisive, outweighing the disadvantages of competition for food items, whenever the resource is unpredictably distributed in patches.

The original paper of Kennedy and Eberhart provides ample context for the algorithm, such as its relation to artificial life and artificial neural networks. The implementation of the algorithm is discussed for a set of non-linear continuous functions, and the merits of the algorithm are compared with those of other similar algorithms such as genetic algorithms and evolutionary computing.

Optimisation of a set of standard benchmark functions, which I will discuss below in detail, is the main use case considered. I think the main reason behind the popularity of PSO is its elegance, simplicity and scope for variation. In fact there have been so many variations of the algorithm that it becomes difficult to compare them. The two main reasons behind so much diversity between implementations are that there is no single best algorithm for all optimisation problems, and that there is no convincing theoretical proof which shows why the algorithm works (this is the case with many other stochastic optimisation algorithms as well).
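As an illustration, two benchmark functions that appear in almost every PSO study are the sphere function and the Rastrigin function; the sketch below defines them and feeds them to the `pso` routine sketched earlier. Both have their global minimum of 0 at the origin; the search box and dimensionality are my own choices for illustration.

```python
import numpy as np

def sphere(x):
    # convex and smooth: f(x) = sum_i x_i^2, minimum 0 at the origin
    return np.sum(x ** 2)

def rastrigin(x):
    # highly multi-modal: many regularly spaced local minima, global minimum 0 at the origin
    return 10.0 * x.size + np.sum(x ** 2 - 10.0 * np.cos(2.0 * np.pi * x))

bounds = [[-5.12, 5.12]] * 10          # a 10-dimensional search box
for f in (sphere, rastrigin):
    best_x, best_val = pso(f, bounds)  # pso() from the earlier sketch
    print(f.__name__, best_val)
```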

So many modifications of PSO have been proposed that it is hard to review even a small fraction of them. One variation which has been around since the early 2000s is the use of a constriction factor, which does not alter anything fundamentally. The motivation behind the constriction factor is to avoid the use of v_max in the standard PSO, which is otherwise needed to stop particles from flying outside the search region.
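For reference, the constriction factor of Clerc and Kennedy (reference 2) is usually written as χ = 2 / |2 - φ - sqrt(φ² - 4φ)| with φ = c1 + c2 > 4; with c1 = c2 = 2.05 this gives χ ≈ 0.729, which is where the inertia value quoted earlier comes from. A small sketch:

```python
import numpy as np

def constriction_factor(c1=2.05, c2=2.05):
    # Clerc-Kennedy constriction: phi = c1 + c2 must exceed 4
    phi = c1 + c2
    return 2.0 / abs(2.0 - phi - np.sqrt(phi ** 2 - 4.0 * phi))

chi = constriction_factor()   # ~0.7298
# constricted velocity update (replaces the inertia-weight form):
# v = chi * (v + c1 * xi1 * (p_best - x) + c2 * xi2 * (g_best - x))
print(round(chi, 4))
```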

There are two projects related to this post on my GitHub: the first is about general PSO and the second is about a particular application of PSO which I have worked on.

References

  1. Kennedy, J. and Eberhart, R. (1995), "Particle Swarm Optimization", Proceedings of the IEEE International Conference on Neural Networks, Vol. IV, pp. 1942–1948.
  2. Clerc, M. and Kennedy, J. (2002), "The Particle Swarm: Explosion, Stability, and Convergence in a Multidimensional Complex Space", IEEE Transactions on Evolutionary Computation, Vol. 6, No. 1.
  3. Eberhart, R. and Shi, Y. (2000), "Comparing Inertia Weights and Constriction Factors in Particle Swarm Optimization".
  4. Kennedy, J. (2007), "Some Issues and Practices for Particle Swarms", Proceedings of the 2007 IEEE Swarm Intelligence Symposium (SIS 2007).
  5. Prasad, J. and Souradeep, T. (2012), "Cosmological parameter estimation using particle swarm optimization", Phys. Rev. D 85, 123008 [arXiv:1108.5600 [astro-ph.CO]].
  6. Andras, P. (2012), "A Bayesian Interpretation of the Particle Swarm Optimization and Its Kernel Extension", PLoS ONE, Vol. 7, Issue 11, e48710.
  7. Fister, I. Jr., Yang, X.-S., Fister, I., Brest, J. and Fister, D. (2013), "A Brief Review of Nature-Inspired Algorithms for Optimization".
