School of Chemical and Minerals Engineering
CEMI479
Comparison between the Plitt model
and an artificial neural network in
predicting hydrocyclone separation
performance
Neil Zietsman
23379936
Supervisor: Mr. A.F. van der Merwe
North-West University
Potchefstroom Campus
Date of submission:
26 October 2015
Declaration
I, L.N. Zietsman, 23379936, hereby declare that:
 the text and references of this study reflect the sources I have consulted and
 sections with no source references are my own ideas, arguments and/or conclusions.
This declaration is for the report entitled CEMI479: Comparison between the Plitt model
and an artificial neural network in predicting hydrocyclone separation performance
submitted for the partial fulfilment of the requirements for the B.Eng. Chemical Engineering
degree at the North-West University, Potchefstroom Campus.
Signed at Potchefstroom on the ______ day of October 2015.
_______________________
L.N. Zietsman 23379936
Acknowledgements
I would like to thank the following people for their help during the year with my project:
 My God for giving me strength during the year to complete this project.
 Mr. A.F. van der Merwe, my study leader, for his help and guidance.
 The workshop personnel for their help with the technical problems that occurred during the course of the year.
 Mrs. Sanet Botes for her help with ordering the items needed for the project.
 Miss Sarita van Loggenberg, my colleague, who helped me perform the hydrocyclone experiments.
 Mr. Nico Lemmer for his help with the Malvern Mastersizer 2000.
Abstract
The hydrocyclone is an invaluable process unit that is popular in the mineral processing industry. Like all classifiers, the hydrocyclone is not capable of perfect separation. The ability of the hydrocyclone to separate particles into the correct streams can be represented by a curve known as a partition curve.
Two important variables can be obtained from the partition curve: the cut size, d50c, and the sharpness of separation. Together, these two variables fully describe the separation efficiency of the hydrocyclone. Optimal control of the hydrocyclone could be achieved if accurate values of the d50c and the sharpness of separation were available.
Unfortunately, this is easier said than done. On-line instrumentation for direct analysis of these variables is not available. Additionally, the complex flow inside the hydrocyclone makes it impossible to determine these variables indirectly through first-principle calculations. The solution is inference sensors, which make use of easily measured variables, such as the flow rate and the solids percentage, to estimate the d50c and the sharpness of separation.
Two methods of inference sensing were covered in this study, namely an empirical method (the Plitt model) and an artificial neural network.
The modified Plitt model was used, with its calibration ("fudge") factors adjusted to fit the experimental data. The Plitt model was only capable of predicting the d50c to a certain extent, and it failed to predict the sharpness of separation.
The artificial neural network was trained with the backpropagation algorithm. The more input variables the artificial neural network had, the better its predictive capability became. The addition of regularization and momentum terms further increased the prediction power of the neural network.
Keywords: hydrocyclone; d50c; sharpness of separation; artificial neural network; Plitt model; fine cut point; variable size spigot
Attached documents
Folder name | File name | Description
Experimental error | Experimental Error | Excel® spreadsheet containing the data and calculations that were done to determine the experimental error
Plitt model | Plitt model | Excel® spreadsheet containing the calculations performed on the experimental data with the Plitt model
Artificial neural networks | Neil'sANN.rev3-d50c - Du; Neil'sANN.rev3-d50c - Du+phi+Q; Neil'sANN.rev3-d50c - Du+phi+Q+P+S; Neil'sANN.rev3-m - Du; Neil'sANN.rev3-m - Du+phi+Q; Neil'sANN.rev3-m - Du+phi+Q+P+S | Macro-enabled Excel® spreadsheets containing the program with which the artificial neural networks were trained and validated
Meetings | Various files | Folder containing all the minutes and agendas of each meeting in Microsoft Word® format
Data processing | Data processing | Excel® spreadsheet with which the data processing was done
MSDS | MSDS – Silica flour | PDF document containing the MSDS of silica flour
Gantt chart | Gantt chart | Folder containing a Gantt chart in both PDF format and MS Project format
Table of contents
Declaration
Acknowledgements
Abstract
Attached documents
Table of contents
List of figures
List of tables
List of acronyms
List of symbols
Chapter 1 - Introduction
1.1 Background
1.2 Problem statement
1.3 Aim and objectives
1.3.1 Aim
1.3.2 Objective
1.3.3 Methodology
Chapter 2 - Literature study
2.1 The hydrocyclone
2.2 Hydrocyclone control
2.2.1 Sensors used in hydrocyclone performance determination
2.3 Soft sensors
2.3.1 Empirical models
2.3.2 Artificial neural networks
Chapter 3 - Experimental procedure
3.1 Overview
3.2 Raw materials
3.3 Equipment
3.4 Experimental setup
3.5 Experimental procedure
3.5.1 Preparation
3.5.2 Sampling
3.5.3 Analysing
3.5.4 Experimental error
Chapter 4 - Model development
4.1 Overview
4.2 The Plitt model
4.2.1 Split flow
4.2.2 Cut size – d50c
4.2.3 Sharpness of separation
4.3 The artificial neural network
4.3.1 Artificial neural network architecture
Chapter 5 - Results and discussion
5.1 Deviations in the feed PSD
5.2 Plitt model
5.2.1 Cut size – d50c
5.2.2 Sharpness of separation
5.3 Artificial neural networks
5.3.1 Cut size – d50c
5.3.2 Sharpness of separation
Chapter 6 - Conclusion and recommendations
6.1 Conclusion
6.2 Recommendations
6.3 Further study
Bibliography
Appendix A Data processing
Appendix B Data processing source code
Appendix C Processed data
Appendix D Experimental error data
Appendix E ANN source code
Appendix F ECSA exit level outcomes
Appendix G Hazard identification and risk assessment
List of figures
Figure 2.1: Hypothetical flow inside the hydrocyclone viewed from the top of the hydrocyclone. Adapted from Plitt (1976)
Figure 2.2: Corrected and non-corrected partition curve adapted from Schneider (2001)
Figure 2.3: Diagram of the computational nodes and weights of an artificial neural network adapted from Jain (1996)
Figure 2.4: Supervised learning with reference to Hagan et al. (2002)
Figure 2.5: Polynomial of first order produces a bad fit for the data. Reproduced from Bishop (2008:11)
Figure 2.6: Polynomial of high order producing something that looks like a good fit for all the data points, but the predictive power of the polynomial is sacrificed. Reproduced from Bishop (2008:12)
Figure 2.7: A lower order polynomial that has the capability to generalize well
Figure 3.1: Diagram of the hydrocyclone setup
Figure 3.2: The experimental hydrocyclone setup
Figure 3.3: The Marcy scale
Figure 3.4: Malvern Mastersizer 2000
Figure 3.5: Experimental error of the d50c with a 95% confidence interval
Figure 3.6: Experimental error of the sharpness of separation with a 95% confidence interval
Figure 4.1: Experimental split flow values plotted with the predicted Plitt model split flow values
Figure 4.2: Learning capability of one of the 6 developed artificial neural networks
Figure 5.1: Particle size distribution of 25 different feed samples
Figure 5.2: Example partition curve before justifications
Figure 5.3: Experimental vs. adjusted values of Rf
Figure 5.4: Experimental cut point plotted with the cut point predicted by the Plitt model
Figure 5.5: Plitt model predicted cut size vs. experimental cut size plotted over the y=x curve
Figure 5.6: Experimental sharpness of separation plotted with the sharpness of separation predicted by the Plitt model
Figure 5.7: Plitt model predicted m vs. experimental m plotted over the y=x curve
Figure 5.8: Results of neural network 2 trained with a training speed of 0.2 and a maximum amount of epochs of 8000
Figure 5.9: Results of neural network 2 trained with a training speed of 0.2 and a maximum amount of epochs of 8000
Figure 5.10: Results of neural network 3 trained with a training speed of 0.2 and a maximum amount of epochs of 8000
Figure 5.11: Calculated d50c plotted with the experimental d50c values of neural network 3 trained with a maximum of 60000 epochs and a training speed of 0.02
Figure 5.12: Predicted d50c vs. experimental d50c plotted over the y=x curve for the neural network trained with a maximum of 60000 epochs and a training speed of 0.02
Figure 5.13: Calculated vs. experimental values of neural network 3 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the momentum term
Figure 5.14: Predicted d50c vs. experimental d50c plotted over the y=x curve for a neural network trained with a maximum of 60000 epochs and a training speed of 0.02
Figure 5.15: Calculated vs. experimental values of neural network 3 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the regularization term
Figure 5.16: Predicted d50c vs. experimental d50c plotted on the y=x curve for the neural network trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the regularization term
Figure 5.17: Results of neural network 4 trained with a training speed of 0.5 and a maximum amount of epochs of 20000
Figure 5.18: Results of neural network 5 trained with a training speed of 0.5 and a maximum amount of epochs of 25000
Figure 5.19: Results of neural network 6 trained with a training speed of 0.2 and a maximum amount of epochs of 8000
Figure 5.20: Experimental and predicted values of neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02
Figure 5.21: Predicted vs. experimental m plotted over the y=x graph for neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02
Figure 5.22: Predicted and experimental values of neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the momentum term
Figure 5.23: Predicted m vs. experimental values m plotted over the y=x line for neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the momentum term
Figure 5.24: Predicted vs. experimental values of neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the regularization term
Figure 5.25: Predicted m vs. experimental m plotted over the y=x curve for neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the regularization term
List of tables
Table 2.1: Sensors used in the on-line monitoring of hydrocyclone performance
Table 3.1: Processed data used for the experimental error determination
Table 3.2: Values for substitution into the student's t equation
Table 4.1: Different artificial neural networks that were programmed
List of acronyms
Acronym Description
ANN Artificial neural network
HIRA Hazard identification and risk assessment
MS Microsoft
MSDS Material safety data sheet
PPE Personal protective equipment
PSD Particle size distribution
List of symbols
Symbol - Description
Al2O3 - Aluminium oxide
K2O - Potassium oxide
Fe2O3 - Iron(III) oxide
CaO - Calcium oxide
Na2O - Sodium oxide
d - Size of a particle in ÎŒm
d50c - Hydrocyclone corrected cut point: the particle size that has an equal chance of either leaving through the underflow or the overflow, in ÎŒm
Dc - Hydrocyclone diameter in cm
Di - Inlet diameter in cm
Do - Vortex finder diameter in cm
Du - Underflow/apex/spigot diameter in cm
ÎŽjO - Error of the output node j
ÎŽjH - Error of the hidden node j
E - Error of a neuron's output
Ẽ - Conditioned error of the neuron's output
ηv - Viscosity of the carrier fluid in cP
Fi - Fudging factor of the modified Plitt model, where i = 1, 2, 3, ...
h - Free vortex height in cm
k - Constant that takes into account the effect of the solids density on the corrected cut size
m - Sharpness of separation: the slope of the partition curve that indicates how well the classification is taking place inside the hydrocyclone; the higher the value of m, the closer the hydrocyclone is to an ideal classifier
M - Momentum factor defined by the user
ms - Mass of silica sand that has to be added to the storage tank in kg
n - Number of weights attached to node j
n_u - Number of data points available in the set
Ω - Penalty term
P - Pressure over the hydrocyclone in kPa
φ - Percentage solids in the feed
Q - Volumetric feed flow rate in litres per minute
R - Regularization factor defined by the user
Rf - Recovery of the carrier liquid to the underflow
ρp - Density of the hydrocyclone feed slurry in g/cm³
ρs - Density of the solid phase in g/cm³
S - Split flow: the volumetric flow of the underflow divided by the volumetric flow of the overflow
St - Standard deviation in the data
σjH - Output value of the transfer function of node j in the hidden layer
σjO - Output of the neuron j in the output layer
t_{n-1}(α/2) - Critical t value that can be obtained from the back cover of Devore and Farnum (2005)
T(95%) - Critical t value for a 95% confidence
υ - Parameter for controlling the importance of the bias term
Vw - Volume of water in the storage tank in m³
wi - Value of weight i, where i = 1, 2, 3, ...
wij - Value of the weight that goes from node i to node j
wijH - Value of the hidden layer weight that goes from node i to node j
ΔwijH - Value by which weight wijH has to be updated
wijH(t-1) - Value of the previous weight wijH
wijO - Value of the output weight that goes from node i to node j
ΔwijO - Value by which the output weight wijO has to be updated
wijO(t-1) - Value of the previous weight wijO
X̄ - Average of a data set
xi - Input value from weight i
xjH - Output of the hidden node j
xjO - Output of the output node j
Οj - Signal sent to node j, where j = 1, 2, 3, ...
y - Partition number: the value displayed on the partition curve's y-axis for a certain particle size d
y' - Corrected partition number
yj - Desired output of output node j
z - Number of weights connected to the cell
Chapter 1 - Introduction
1.1 Background
Hydrocyclones are widely used process units for classifying particles according to size or density in the mineral processing industry. Unfortunately, it is very difficult for the operator to monitor the performance of the hydrocyclone while it is on-line (Coelho & Medronho, 2000). Empirical models that act as inference sensors had to be developed in order to predict how the hydrocyclone will perform under certain conditions (Kraipech et al., 2005).
One popular empirical model used to predict the performance of a hydrocyclone is the Plitt model (Flinthoff et al., 1987). L.R. Plitt developed this model to be robust by gathering a large amount of experimental data, obtained by operating a wide range of hydrocyclone geometries at different operating conditions (Plitt, 1976). This model is in general not very accurate in predicting the performance, i.e. the separation efficiency, of hydrocyclones (Silva et al., 2009).
Artificial Neural Networks can be used to predict the performance of complex systems like the
hydrocyclone (Kutz, 2003). What makes this method so special is its ability to learn through
parallel processing (McMillan, 1999). Given a certain amount of experimental data, the ANN
can identify underlying patterns in the data which gives it the ability to predict the outcome,
given certain input parameters (Jain, 1996).
South Africa has a very large mining industry (Anglo American Platinum, 2013; Anglo Gold
Ashanti, 2013). Improving the performance of the hydrocyclone could possibly contribute to growth in the South African economy, as mineral processing becomes more efficient. The use of hydrocyclones is not limited to the mining industry. These process units are also globally
used in the petrochemical, environmental and food processing industries (Sripriya et al.,
2007). The improved use of the hydrocyclone could thus have a large impact globally in
various industries.
1.2 Problem statement
Inadequate control of the hydrocyclones on a mineral processing plant may lead to
inefficiencies in the downstream process units, ultimately leading to a loss in profit for the
company. Monitoring the on-line performance of a hydrocyclone is not a simple task. Inference
sensors (the terms inference sensor and soft sensor are used interchangeably in this study) that make use of empirical models or artificial neural networks are possible solutions to this problem. A study is needed to determine which of these methods is more appropriate for predicting hydrocyclone performance.
1.3 Aim and objectives
1.3.1 Aim
Improve hydrocyclone efficiency by producing a soft sensor that has the ability to accurately
predict hydrocyclone performance.
1.3.2 Objective
Compare the predictive power of an empirical model, namely the Plitt model, with the
predictive power of an artificial neural network trained with the backpropagation algorithm by
making use of experimental hydrocyclone data.
1.3.3 Methodology
 Do a literature study on the operation of the hydrocyclone, empirical models for
predicting hydrocyclone performance and artificial neural networks;
 Do a HIRA study before sampling on the hydrocyclone commences;
 Devise a procedure for obtaining representative samples from the hydrocyclone;
 Obtain more than 100 samples from the hydrocyclone;
 Gather data on the samples’ PSD by analysing the samples with the Malvern
Mastersizer 2000;
 Process the data for it to be in a suitable form for inserting into an empirical model and
an artificial neural network;
 Develop the artificial neural network from the literature study that was previously
conducted;
 Find optimal architectures and parameters for the artificial neural network through trial-
and-error;
 Substitute the processed data into the empirical model and compare its output (d50c
and sharpness of separation) with that of the experimental data;
 Substitute the processed data into the trained artificial neural networks and compare
the output to the experimental data;
 Compare the two soft sensors, the empirical model and the artificial neural network,
with each other and come to a conclusion on which is better for predicting hydrocyclone performance.
Chapter 2 - Literature study
2.1 The hydrocyclone
Hydrocyclones are commonly used in the mineral industry for the classification of particles
after grinding (Flinthoff et al., 1987). It is usually installed in a closed-circuit grinding unit, where it is used to separate the undersize particles from the coarse particles (Kelly & Spottiswood, 1982:201). The coarse particles are returned to the grinder for further comminution while the undersize particles leave the circuit (Wills, 2006:224-225). Advantages of hydrocyclones include simple design, low operational costs and the capability of handling large volumes of pulp (Sripriya et al., 2007). Complex mechanical devices like spirals and rake classifiers have been replaced by cyclones (the terms "hydrocyclone" and "cyclone" are used interchangeably in this study), due to their simple structure that contains no moving parts (Napier-Munn et al., 2005:309).
Its applications are however not limited to the mineral industry as it is also used in the chemical
industry, power generation industry, textile industry and more. By customising its structure,
the hydrocyclone can be used for specific applications like (Svarovsky, 1984:1):
 Liquid clarification
 Slurry thickening
 Cleansing solid particles
 Elimination of gases from liquids
Classification of particles takes place due to the difference in settling velocities of the particles
being classified. The settling velocities can be a function of either particle size and/or particle
density, depending on whether a homogeneous or heterogeneous ore is classified (Kelly &
Spottiswood, 1982:199). A homogeneous ore contains particles of similar densities. Particles
of homogeneous ores will be classified according to their size (Flinthoff et al., 1987).
The feed enters the cylindrical section of the hydrocyclone tangentially where it forms a vortex
inside the cyclone’s cone shaped body. The fluid follows a helical path until it reaches the
spigot, also known as the apex, where a portion of the downward flow leaves through the
spigot as the underflow. The remaining flow reverses direction and follows an upward spiral, located on the inside of the outer vortex, and leaves via the vortex finder (Svarovsky, 1984:30-31). The
reason for the formation of the upward spiral is not fully understood (Svarovsky, 1984:41).
Particles of similar density or size gather together due to the competition between the drag
forces and centrifugal forces acting on these particles (Napier-Munn et al., 2005:309-310). If
the density of the carrier liquid is lower than that of the solids being separated, the centripetal
force on the solid particles will be larger than the centripetal force of the liquid. On the other
hand, the centripetal force acting on the particle will increase as the particle size increases for
homogeneous ores (Hibbeler, 2010:131). The centripetal force acting on the particles
dominates the drag force also acting on the particles in a radial direction. Larger particles thus
reach the boundary layer, formed between the liquid and the wall of the cyclone, with more
ease than the smaller particles. The particles in the boundary layer leave the cyclone via the
apex under ideal conditions. The finer particles that could not reach the boundary layer by the time the apex is reached are transported to the inner spiral, where they leave through the vortex finder (Svarovsky, 1984:41).
Random turbulence, hindered settling and the interaction between the carrier liquid and the
solid particles makes describing the flow inside the hydrocyclone very difficult. Determining
the separation performance of the hydrocyclone is thus not an easy task (Sripriya et al., 2007).
The performance of the hydrocyclone is defined as the ability to separate particles into the desired size ranges (Kelly & Spottiswood, 1982:204). According to Svarovsky (1984), the separation performance of the hydrocyclone can be determined if the corrected cut size, d50c, and the sharpness of separation, m, can be calculated. This is done with the use of a corrected partition curve. The grade efficiency curve, also called the partition curve or the Tromp curve, is a plot of the particles in a certain size range, on the x-axis, vs. the fraction of these particles in the feed leaving the hydrocyclone through the underflow (Frachon & Cilliers, 1999), as can be seen in Figure 2.2. The grade efficiency curve cannot be approximated from first principles and has to be determined by using experimental data (Svarovsky, 1984:17).

Figure 2.1: Hypothetical flow inside the hydrocyclone viewed from the top of the hydrocyclone. Adapted from Plitt (1976)

Figure 2.2: Corrected and non-corrected partition curve adapted from Schneider (2001)
The d50c, also known as the cut size, is the particle size that has an equal chance to exit the hydrocyclone through the vortex finder or through the underflow. The corrected cut size is used instead of the real cut size, as this gives a better indication of the separation forces present in the hydrocyclone. More information on the corrected partition curve follows later. The sharpness of separation, m, indicates how well the classification is taking place in the cyclone. The higher the value of m, the closer the hydrocyclone is to an ideal classifier (Napier-Munn et al., 2005:311).
In practice, some of the particles in the hydrocyclone, irrespective of their size, bypass classification. By controlling the operating conditions of the cyclone, these deviations from ideal separation can be lowered, but never eliminated (Napier-Munn et al., 2005:310-311). Two paths by which classification can be bypassed are described below.
Small particles tend to stay suspended in the liquid which leaves the hydrocyclone through the underflow. According to Frachon and Cilliers (1999), Plitt (1976) and Svarovsky (1984:20), the fraction of small particles bypassing to the underflow is directly proportional to the liquid recovery to the underflow, Rf. A corrected partition curve is constructed to remove the effect of the bypass to the underflow, as can be seen in Figure 2.2. Another phenomenon that
causes undersize particles to leave through the underflow is when the undersize particles are trapped in the boundary layer by the larger particles. The corrected partition curve constructed with the use of equation 2.1 might thus not be capable of taking into account all of the undersize particles leaving via the underflow.
Another way in which classification can be bypassed is if particles near the vortex finder leave via the overflow (Svarovsky, 1984:40). No corrections are made to the partition curve to take this effect into account, but the effect will be borne in mind when the results are interpreted.
y' = \frac{y - R_f}{1 - R_f}    (2.1)
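To illustrate equation 2.1, the minimal Python sketch below converts a raw partition number into a corrected partition number. It only assumes that the raw partition number and the water recovery Rf are already known; the actual data processing in this study was done in the attached Excel® spreadsheets, so this is purely an illustrative example with made-up numbers.

```python
def corrected_partition(y_raw, r_f):
    """Apply equation 2.1: remove the bypass fraction r_f from a raw
    partition number y_raw to obtain the corrected partition number y'."""
    return (y_raw - r_f) / (1.0 - r_f)

# Hypothetical example: Rf = 0.35 (35% of the water reports to the underflow)
# and 60% of a given size class reports to the underflow.
print(corrected_partition(0.60, 0.35))  # approximately 0.385
```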
2.2 Hydrocyclone control
If a hydrocyclone is not operated to produce the desired overflow and underflow, it could lead
to poor performance in downstream processes (Eren & Gupta, 1988). Fines in the underflow
lead to overgrinding, while coarse material in the overflow can cause downstream separation
problems (Aldrich et al., 2014). Slight changes in the operating conditions of the hydrocyclone
could markedly affect the performance of the hydrocyclone (Neesse et al., 2004). The operator
of a hydrocyclone might not always be aware of the cyclone’s underperformance and is in
addition frequently incapable of returning the cyclone to its optimal operation. There is thus a
need for methods to efficiently determine the performance of the cyclone while in operation
(Napier-Munn et al., 2005:309). Optimising the hydrocyclone is not an easy task, as the variables are often interlinked. Models that are reasonably accurate are capable of finding the optimum operating conditions for the hydrocyclone even when variables such as the split flow and pressure are dependent on each other (Napier-Munn et al., 2005:320).
2.2.1 Sensors used in hydrocyclone performance determination
Variables like the pressure drop over the cyclone, the flow rates in and out of the cyclone and the feed are commonly monitored while the cyclone is on-line. Even with all this information, the operator might still not be able to control the performance of the cyclone effectively (Napier-Munn et al., 2005; Aldrich et al., 2014). Numerous studies have been conducted to find a suitable method to control the performance of the hydrocyclone, many of which have not been widely used in industry (Aldrich et al., 2014). Table 2.1 contains a list of current sensors that have been developed for determining hydrocyclone performance.
Table 2.1: Sensors used in the on-line monitoring of hydrocyclone performance

Acoustic sensors: An acoustic sensor was mounted externally on the hydrocyclone and, after a suitable model was found, could accurately predict various parameters like the solids concentration and the flow rate. Variables like the d50c and sharpness of separation are, however, not determined with the use of this method (Hou et al., 1998).

Videographic measurement: A video camera was used to monitor the discharge angle of the hydrocyclone. The discharge angle of the cyclone is said to be linked to the performance of the cyclone (Concha et al., 1996; Neesse et al., 2004). Although this method has some challenges, it is a cost-effective way to determine the discharge angle with good accuracy (Janse van Vuuren et al., 2011).

Photographic measurement: Aldrich et al. (2014) used images and other experimental data from the underflow of an experimental hydrocyclone setup to develop a model that had the ability to identify the mean particle size in the underflow. Instead of using the discharge angle of the underflow like Janse van Vuuren et al. (2011), the textural information that the images provided of the underflow was utilised.

Measurement using a laser beam: A laser beam is pointed at the underflow of the cyclone and the reflection of the laser beam is measured with a camera to determine whether the cyclone is in the spray or roping state (Neesse et al., 2004).

2.3 Soft sensors
Another way in which the performance can be monitored on-line is by developing a soft sensor, such as an artificial neural network (Napier-Munn et al., 2005). Soft sensors use operational data from the plant to predict variables that are usually difficult and/or costly to measure on-line (Kadlec et al., 2009). Two possible soft sensors for the control of the hydrocyclone, empirical models and artificial neural networks, will be discussed in this study.
2.3.1 Empirical models
Empirical models had to be developed in order to predict how the hydrocyclone will perform under certain conditions (Kraipech et al., 2005). Although Flinthoff et al. (1987) state that these models have been widely accepted, Chen et al. (2000) state in their study that these models are not reliable. Coelho and Medronho (2000) reason that these models will only work well if the cyclone is operated in the range that was used to obtain the data to which the models were fitted.
One popular empirical model that is used to predict the performance of a hydrocyclone is the Plitt model (Flinthoff et al., 1987). L.R. Plitt developed this model to be robust by gathering a large amount of experimental data. The Plitt model was designed to also take into account the
theories around the complex flow of the hydrocyclone (Plitt, 1976). These theories include the
residence time theory and the equilibrium orbit theory (Chen et al., 2000). The theories alone
are incapable of describing the hydrocyclone performance (Napier-Munn et al., 2005:312).
The data was gathered by operating a wide range of hydrocyclone geometries at different
operating conditions (Plitt, 1976). This model is in general not very accurate in predicting the
performance of hydrocyclones (Silva et al., 2009).
The Plitt model consists of four empirical equations. These equations are used to calculate
the corrected cut size, the flow split between the underflow and overflow, the sharpness of
separation and the pressure drop over the hydrocyclone (Plitt, 1976). Although the Plitt model
is designed to work without calibration, Flinthoff et al. (1987) recommend inserting empirical constants, F1 to F4, that take into account the unique conditions under which the cyclone
operates. Only one experimental data point is needed to tune these empirical constants. By
default, the values of these constants are all equal to 1.
𝑑50𝑐 = đč1
39.7đ·đ‘
0.46
đ·đ‘–
0.6
đ·0
1.21
𝜂 𝑣
0.5
exp(0.063𝜑)
đ· 𝑱
0.71
ℎ0.38 𝑄0.45 (
𝜌𝑠 − 1
1.6 )
𝑘
2.2
𝑚 = đč21.94 exp (−
1.58𝑆
1 + 𝑆
) (
đ·đ‘
2
ℎ
𝑄
)
0.15 2.3
School of Chemical and Minerals Engineering
Literature study| 9
𝑃 = đč3
1.88𝑄1.78
exp(0.0055𝜑)
đ·đ‘
0.37
đ·đ‘–
0.94
ℎ0.28(đ· 𝑱
2
+ đ· 𝑜
2)0.87
2.4
𝑆 =
đč4 (3.29𝜌 𝑝
0.24
(
đ· 𝑱
đ· 𝑜
)
3.31
ℎ0.54(đ· 𝑱
2
+ đ· 𝑜
2)0.36
𝑒0.0054𝜑
)
đ·đ‘
1.11
𝑃0.24
2.5
Where:
đ·đ‘= Cyclone diameter in 𝑐𝑚
đ·đ‘–= Inlet diameter in 𝑐𝑚
đ· 𝑜= Vortex finder diameter in 𝑐𝑚
đ· 𝑱= Underflow/apex diameter in 𝑐𝑚
ℎ= Free vortex height in 𝑐𝑚
𝜌 𝑝= Density of the cyclone feed slurry in
𝑔
𝑐𝑚3
𝜌𝑠= Density of the solid phase in
𝑔
𝑐𝑚3
𝜂 𝑣= Viscosity of the carrier fluid in 𝑐𝑝
𝜑 = Percentage solids in the feed
𝑄 = Feed flow rate in
𝑙𝑖𝑡𝑒𝑟𝑠
𝑚𝑖𝑛𝑱𝑡𝑒
𝑑50𝑐 = Corrected cut size in 𝑚𝑖𝑐𝑟𝑜𝑛𝑠
𝑚 = Sharpness of separation which is dimensionless
𝑃 = Gauge pressure in 𝑘𝑃𝑎
𝑆 = Split flow. This is the volume of the underflow divided by the volume of the overflow and it
is a dimensionless quantity
According to Plitt (1976), the PSD of the feed slurry has a negligible effect on the outcome of
the d50c of the underflow.
After determining the d50c and m with the Plitt model, these values can be inserted into the Rosin-Rammler equation, equation 2.6, to obtain the corrected partition curve.

y' = 1 - \exp\left(-0.693 \left(\frac{d}{d_{50c}}\right)^{m}\right)    (2.6)

Where d is the particle size in microns and y' is the corrected fraction (by volume) of particles of size d that is recovered in the underflow.
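The Python sketch below shows how equations 2.2 to 2.6 could be evaluated for a given set of cyclone dimensions and operating conditions. It is an illustrative implementation only, not the Excel® spreadsheet used in this study; the function and variable names are assumptions chosen for demonstration, and the exponent k is taken here as 0.5, a value commonly used with the Plitt model, whereas the report treats k as a constant to be determined.

```python
import math

def plitt_model(Dc, Di, Do, Du, h, Q, phi, rho_s, rho_p, eta_v=1.0,
                F1=1.0, F2=1.0, F3=1.0, F4=1.0, k=0.5):
    """Modified Plitt model, equations 2.2 to 2.5.
    Dimensions in cm, Q in litres/minute, phi in % solids, densities in g/cm3,
    eta_v in cP. F1-F4 are the calibration ("fudge") factors (default 1).
    k = 0.5 is assumed here for illustration."""
    # Equation 2.2: corrected cut size in microns
    d50c = F1 * (39.7 * Dc**0.46 * Di**0.6 * Do**1.21 * eta_v**0.5
                 * math.exp(0.063 * phi)) / (
           Du**0.71 * h**0.38 * Q**0.45 * ((rho_s - 1.0) / 1.6)**k)
    # Equation 2.4: pressure drop in kPa
    P = F3 * (1.88 * Q**1.78 * math.exp(0.0055 * phi)) / (
        Dc**0.37 * Di**0.94 * h**0.28 * (Du**2 + Do**2)**0.87)
    # Equation 2.5: flow split (underflow volume / overflow volume)
    S = F4 * (3.29 * rho_p**0.24 * (Du / Do)**3.31 * h**0.54
              * (Du**2 + Do**2)**0.36 * math.exp(0.0054 * phi)) / (
        Dc**1.11 * P**0.24)
    # Equation 2.3: sharpness of separation
    m = F2 * 1.94 * math.exp(-1.58 * S / (1.0 + S)) * (Dc**2 * h / Q)**0.15
    return d50c, m, P, S

def rosin_rammler(d, d50c, m):
    """Equation 2.6: corrected partition number for a particle size d (microns)."""
    return 1.0 - math.exp(-0.693 * (d / d50c)**m)
```

As a hypothetical usage example, plitt_model(Dc=10, Di=3, Do=3.5, Du=1.5, h=40, Q=60, phi=10, rho_s=2.65, rho_p=1.1) returns the predicted d50c, m, P and S for those assumed conditions, after which rosin_rammler can be evaluated over a range of particle sizes to draw the corrected partition curve.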
2.3.2 Artificial neural networks
Various studies have been done on the use of artificial neural networks for the prediction of hydrocyclone performance, and they have proven to be successful (Eren et al., 1997a; Eren et al., 1997b; Karimi et al., 2010).
The human brain has powerful learning, generalization and parallel computing abilities. It is desired to give computers the same abilities by copying the principle of operation of brain cells and developing artificial neural networks (ANNs) (Jain, 1996). ANNs are not limited to soft sensors. Awodele and Jegede (2009) reason that ANNs promise a wide range of new applications in areas such as education and medicine. This is why research in this field has been booming in the past few decades (Gallant, 1994:1).
Figure 2.3: Diagram of the computational nodes and weights of an artificial neural network
adapted from Jain (1996)
An artificial neural network consists of computational units called nodes (the words "nodes" and "neurons" are used interchangeably in this study). These nodes are located in sets called layers. Connections, called weights, connect the nodes of one layer to the following layer. Information transported through the weights can only travel in one direction. Figure 2.3 illustrates the computational nodes in a three-layer neural network (the terms "neural network" and "artificial neural network" are used interchangeably in this study). The arrows represent the weights and their direction.
Values, either positive or negative, are assigned to each of the weights. The magnitude of the value assigned to a weight determines how large the effect of the data transported through that weight will be on the neural network: the larger the magnitude of the weight, the larger the effect. The input data travel through the weights to which they are connected and are multiplied by the value of each weight. When the data reach hidden layer 1 through the weights, an input value to each node is calculated (more detail on these calculations follows in section 2.3.2.6). The input value is substituted into a function called an activation function, which calculates an output called an activation. The activation travels through the weights to the next nodes and the same operation is performed. This is repeated until the ANN produces its final output (Gallant, 1994:1).
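To make the node computation concrete, the minimal Python sketch below shows one node's output: a weighted sum of its inputs passed through a sigmoid activation function. The numbers are arbitrary illustrative values, not weights from the networks developed in this study.

```python
import math

def node_output(inputs, weights):
    """Weighted sum of the incoming signals followed by a sigmoid activation."""
    signal = sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-signal))

print(node_output([0.3, 0.8], [0.05, -0.02]))  # one hidden-node activation
```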
Parameters that influence the output of the network include:
 The number of layers
 The number of nodes in the hidden layers
 The activation function used in the nodes
 The values of the weights
 Number of input variables
2.3.2.1 Network topology
The network topology involves the arrangement of nodes and connections in the network. These arrangements can be classified into two main categories: feed-forward networks and feedback networks.
In feed-forward networks, information can only be carried in one direction, from the input to the output. This type of network is mainly used for pattern recognition purposes. Figure 2.3 illustrates a feed-forward neural network.
In feedback or recurrent networks, the information can either travel in the forward direction to the output or return in the input direction, i.e. make a loop (Awodele & Jegede, 2009).
For the purposes of this study, a feed-forward structure is used.
2.3.2.2 Other artificial neural network parameters
2.3.2.2.1 Initial weights
Initial weight values between -0.1 and 0.1 are randomly chosen. Assigning non-random weights could lead to weights that perform the same action, which does not lead to sufficient convergence. The weights need to be unique when training is initialised to increase the chances of identifying the pattern in the data (Gallant, 1994:213). Another, more complex approach proposed by Gallant (1994:220) is to initialise the weights connected to a certain cell to random values between -2/z and 2/z, where z is the number of weights connected to the cell.
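A minimal Python sketch of the two initialisation schemes mentioned above, assuming only the standard random module; the sizes used are arbitrary and the function names are hypothetical.

```python
import random

def init_uniform(n_weights, limit=0.1):
    """Simple scheme: random values between -0.1 and 0.1."""
    return [random.uniform(-limit, limit) for _ in range(n_weights)]

def init_gallant(z):
    """Gallant (1994:220): weights of a cell drawn between -2/z and 2/z,
    where z is the number of weights connected to the cell."""
    return [random.uniform(-2.0 / z, 2.0 / z) for _ in range(z)]
```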
2.3.2.2.2 Training speed
A large value for the training speed, ÎŒ, gives faster convergence. This convergence can, however, only be maintained up to a certain point, after which the network becomes unstable and diverges. This is called overtraining. It is advised to choose a training speed with a positive value no larger than 0.1. Although this results in slow training, the neural network has a better chance of finding the local minimum (Gallant, 1994:220).
2.3.2.2.3 Momentum
Momentum is used to increase the training speed. The momentum term consists of the
change in weight at the previous iteration, multiplied by the momentum parameter. An
additional benefit of adding momentum is the removal of noise that might occur during weight
updating. The weight thus converges smoothly (Gallant, 1994:221).
2.3.2.2.4 Number of hidden neurons
It is very common for backpropagation networks used in industry to contain only one hidden layer, the main reason being that networks with more hidden layers learn very slowly. Neural networks with one hidden layer are known to be universal approximators. The only way to determine whether a network with multiple hidden layers or a single hidden layer should be used is by trial-and-error (Gallant, 1994:221).
2.3.2.3 Machine learning
In order for an ANN to produce better results, an algorithm has to be written that gives the
ANN the ability to adjust itself. This is called machine learning (Nag, 2010). The two main
types of machine learning are supervised and unsupervised learning.
In supervised learning, the ANN is given input data to produce an output. The output the ANN produces for the given input data is compared with the desired output. If the output from the ANN does not match the desired output, the necessary adjustments are made with the use of the learning algorithm (Gallant, 1994:6). The diagram in Figure 2.4 attempts to illustrate what is meant by supervised learning.

Figure 2.4: Supervised learning adapted from Hagan et al. (2002)

In contrast, unsupervised learning is not provided with the desired output. Instead, it is used to adjust the ANN so that it can group data that show similar patterns (Gallant, 1994:7). Applications of unsupervised learning include finding the probability distribution of data and identifying groups of data that show similar properties and occur close together, i.e. cluster identification (Bishop, 2008:10). In this study, supervised learning is used, as the experimental data from the hydrocyclone provide the ANN with both the input and the desired output.
2.3.2.4 Learning algorithms
There are a number of learning algorithms in existence that are used to adjust the neural
network in order to achieve the desired output. Some algorithms include the perceptron
learning algorithm, radial basis function algorithm and the Boltzmann learning algorithm (Jain,
1996). The question arises: “Which algorithms would be fit for a certain application?”
According to Jain (1996), the backpropagation algorithm, among others, is fit for use in control systems. Gallant (1994:225), on the other hand, reasons that trial-and-error has to be used to
find the appropriate algorithm. From his experience, he found that one should first try to use a
single-cell model before using a complex algorithm like the backpropagation algorithm.
A large problem that occurs in real-world applications is the presence of noise, i.e. the introduction of erroneous data into the data set: the data could either be false or absent (Gallant, 1994:9). Artificial neural networks, on the other hand, are capable of handling noise (Gallant, 1994:10).
2.3.2.5 Problems with artificial neural networks
2.3.2.5.1 Failure to generalise
The purpose of training an ANN is not so much to reproduce the exact values of the training data, but rather to develop a network that is capable of producing a general answer that would be expected in the training data range (Zhang et al., 2003; Bishop, 2008:332).
To explain the difference between good and bad generalization of ANNs, Bishop (2008:9-12) uses an analogy between the complexity of an ANN and the order of a polynomial. Consider a data set generated by adding random values to the output of a known function, say y = sin(x). Two polynomials are used to fit the data: one of high order and one of low order. The fit of the first-order polynomial to the data can be seen in Figure 2.5. This corresponds to a neural network with only one hidden node, which produces a bad fit to the data. A possible solution is to increase the number of free parameters; in the case of a neural network, the number of hidden nodes is increased. As can be seen in Figure 2.6, the higher-order polynomial produces a good fit for all the data points. It is, however, a bad representation of the sine wave, as there are plenty of oscillations (Bishop, 2008:9-12).

Figure 2.5: Polynomial of first order produces a bad fit for the data. Reproduced from Bishop (2008:11)

Figure 2.6: Polynomial of high order producing something that looks like a good fit for all the data points, but the predictive power of the polynomial is sacrificed. Reproduced from Bishop (2008:12).
To address this problem of finding a suitable complexity for the ANN, two concepts, the variance and the bias (not to be confused with bias weights), are used. The bias is a measure of the amount by which the overall average of the ANN output differs from that of the given data. Figure 2.5 has a high bias value while Figure 2.6 has a low bias value. The variance is a measure of how well the ANN output will fit another data set that does not include the ANN training data. A low variance value can be expected in Figure 2.5, while a high variance value can be expected in Figure 2.6. The variance and bias go hand in hand: an increase in the variance leads to a decrease in the bias and vice versa. The goal is to decrease the value of both the variance and the bias (Bishop, 2008:334-335).
2.3.2.5.2 Regularization
Over-fitting is the result of weights with high values. In order to suppress the weights from obtaining large values, regularization is applied. In regularization, the error of the output is conditioned in order to produce a smoother output. This is done by adding a penalty term, Ω, to the error, E. The conditioned error, Ẽ, can be calculated with equation 2.7.

\tilde{E} = E + \upsilon\,\Omega    (2.7)

Bishop (2008:338) provides two ways in which the penalty term can be calculated. One of the two methods is the Tikhonov regularizer, which will not be discussed in this study. The other is the weight decay method, in which the penalty term is equal to the sum of squares of all the weights and biases, as given in equation 2.8.

\Omega = \frac{1}{2} \sum_i w_i^2    (2.8)

The weight decay regularizer suppresses the weights from obtaining large values, which would cause over-fitting (Bishop, 2008:338-339).
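As a small illustration of equations 2.7 and 2.8, the Python sketch below conditions an error value with the weight decay penalty. The error, weight values and the importance parameter υ are made-up numbers used only for demonstration.

```python
def weight_decay_penalty(weights):
    """Equation 2.8: half the sum of the squared weights (and biases)."""
    return 0.5 * sum(w * w for w in weights)

def conditioned_error(error, weights, upsilon):
    """Equation 2.7: add the scaled penalty term to the unconditioned error."""
    return error + upsilon * weight_decay_penalty(weights)

print(conditioned_error(0.42, [0.8, -1.3, 0.1], upsilon=0.01))
```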
2.3.2.5.3 Structural stabilization
Trial-and-error could be used to find a more suitable structure that has little complexity, but
produces good results. One way of doing this, is by varying the number of hidden nodes or by
adding bias weights to the network (Gallant, 1994:221; Bishop, 2008:332).
2.3.2.5.4 More data points
The number of training data points and the possible curves that can fit through these training
data points are inversely proportional to each other (Zhang et al., 2003). If one desires to train
a complex network for reasons such as more accurate results, one simply has to add more
training data to the network.
A neural network that has the ability to generalize should give output as displayed by the lower
order polynomial in Figure 2.7.
Figure 2.7: A lower order polynomial that has the capability to generalize well
2.3.2.6 The backpropagation algorithm
As mentioned above, the backpropagation algorithm is one of the popular neural network
training algorithms that is suitable for use in process control environments. It was decided that
this training algorithm would be used for the neural network in this study. It should be noted that the algorithm presented here is developed for a neural network with a single hidden layer. The sources used in the development of this artificial neural network include Jain (1996) and Basheer and Hajmeer (2000). The steps are as follows:
1. Choose the number of input, hidden and output nodes. This also determines how many weights there will be in the neural network architecture.
2. Assign random values to the weights.
3. Propagate the signal forward by multiplying the inputs to the neural network with the weights that connect the inputs to the hidden neurons, then sum the results of the weights that go to each of the hidden nodes to produce the signal that is sent to the specified node, as can be seen in equation 2.9.

\xi_j = \sum_{i=0}^{n} x_i w_{ij}    (2.9)

Where:
Οj = Signal sent to node j.
n = The number of weights attached to node j.
xi = Input value from weight i.
wij = Weight attached to the input node i and the hidden node j.

4. Substitute the input signal into the activation function. The sigmoid activation function was chosen and can be seen in equation 2.10.

\sigma_j^H = \frac{1}{1 + e^{-\xi_j}}    (2.10)

Where σjH is the output value of the transfer function of node j in the hidden layer.
5. The output of node $j$, $\sigma_j^H$, is then fed forward to the next layer of nodes, the output
nodes, where equation 2.11 is applied.

$\xi_j = \sum_{i=0}^{n} \sigma_i^H w_{ij}$    (2.11)
6. The signal to the output nodes, $\xi_j$, is again substituted into the sigmoid function in
equation 2.12 to produce the output of the output layer nodes.

$\sigma_j^O = \dfrac{1}{1 + e^{-\xi_j}}$    (2.12)

Where $\sigma_j^O$ is the output of the output layer nodes. Note that $\sigma_j^O$ is equal to the $x_j^O$ that will be
mentioned soon.
7. The error of the output neurons could then be calculated by comparing the output of
the output neurons with the desired output of the training data with the use of equation
2.13 (Gupta & Lam, 1998).

$\delta_j^O = (x_j^O - y_j)\,x_j^O(1 - x_j^O)$    (2.13)

Where:
$\delta_j^O$ = error of the output node $j$
$x_j^O$ = output of the output node $j$ (again note that $x_j^O$ is equal to $\sigma_j^O$)
$y_j$ = desired output of the output node $j$
8. The values with which the weights between the output layer nodes and the hidden
layer nodes are changed could now be calculated with equation 2.14 (Gupta & Lam,
1998).

$\Delta w_{ij}^O = \eta\,\delta_j^O x_j^O - M\,w_{ij}^O(t-1) - \eta R\left(\dfrac{w_{ij}^O(t-1)^2}{\left(\left(1 + w_{ij}^O(t-1)\right)^2\right)^2}\right)$    (2.14)

Where:
$\Delta w_{ij}^O$ = the value with which weight $w_{ij}^O$ has to be updated
$\eta$ = training speed defined by the user
$\delta_j^O$ = error of the output node $j$
$x_j^O$ = output of the output node $j$
$M$ = momentum factor defined by the user
$R$ = regularization factor defined by the user
$w_{ij}^O(t-1)$ = the value of the previous weight $w_{ij}^O$
9. The new weight values can then be calculated with equation 2.15.

$w_{ij}^O = w_{ij}^O(t-1) - \Delta w_{ij}^O$    (2.15)

Where $w_{ij}^O$ is the new value of the output weight that extends from node $i$ in the hidden
layer to node $j$ in the output layer.
10. The next step is to calculate the error of the hidden nodes with the help of equation
2.16 (Gupta & Lam, 1998).

$\delta_j^H = x_j^H\,(1 - x_j^H)\,w_{ij}^O(t-1)\,\delta_j^O$    (2.16)

Where:
$\delta_j^H$ = error of the hidden node $j$
$x_j^H$ = output of the hidden node $j$
11. Now that the error of the hidden layer nodes is known, the increment with which
the weights that extend from the input layer to the hidden layer have to change could
now be calculated with equation 2.17 (Gupta & Lam, 1998).

$\Delta w_{ij}^H = \eta\,\delta_j^H x_j^H - M\,w_{ij}^H(t-1) - \eta R\left(\dfrac{w_{ij}^H(t-1)^2}{\left(\left(1 + w_{ij}^H(t-1)\right)^2\right)^2}\right)$    (2.17)

Where:
$\Delta w_{ij}^H$ = the value with which weight $w_{ij}^H$ has to be updated
$\delta_j^H$ = error of the hidden node $j$
$x_j^H$ = output of the hidden node $j$
$w_{ij}^H(t-1)$ = the value of the previous weight $w_{ij}^H$
12. The new weights can then be calculated with the help of equation 2.18.

$w_{ij}^H = w_{ij}^H(t-1) - \Delta w_{ij}^H$    (2.18)
The above steps could be repeated with the data from a new sample. An epoch is completed
if the artificial neural network has gone through the entire set of training data. A new epoch is
started by again going through the training data set.
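To make the twelve steps above concrete, the sketch below shows one possible Python implementation of a single training pass for a network with one hidden layer and a single output node. It is an illustrative sketch only: the gradient terms are written in the standard backpropagation form (multiplying each error by the activation feeding the weight), while the momentum factor M and regularization factor R enter in the form of equations 2.14 and 2.17. The variable names, initialisation and example data are assumptions and do not reproduce the implementation attached in Appendix E.

```python
import numpy as np

def sigmoid(xi):
    # Equations 2.10 and 2.12: sigma = 1 / (1 + exp(-xi))
    return 1.0 / (1.0 + np.exp(-xi))

def train_one_sample(x, y, w_h, w_o, eta, M, R):
    """One forward and backward pass for a single training sample (steps 3 to 12).

    x   : input vector, length n_in
    y   : desired scalar output
    w_h : input-to-hidden weights, shape (n_in, n_hidden)
    w_o : hidden-to-output weights, shape (n_hidden,)
    """
    # Steps 3-4: signal to each hidden node (eq. 2.9) and hidden outputs (eq. 2.10)
    x_h = sigmoid(x @ w_h)

    # Steps 5-6: signal to the single output node (eq. 2.11) and its output (eq. 2.12)
    x_o = sigmoid(x_h @ w_o)

    # Step 7: error of the output node (eq. 2.13)
    delta_o = (x_o - y) * x_o * (1.0 - x_o)

    # Step 8: increments for the hidden-to-output weights; the momentum (M) and
    # regularization (R) terms follow the form of eq. 2.14
    dw_o = eta * delta_o * x_h - M * w_o - eta * R * (w_o**2 / (1.0 + w_o)**4)

    # Step 10: errors of the hidden nodes (eq. 2.16, single output node)
    delta_h = x_h * (1.0 - x_h) * w_o * delta_o

    # Step 11: increments for the input-to-hidden weights (eq. 2.17 form)
    dw_h = eta * np.outer(x, delta_h) - M * w_h - eta * R * (w_h**2 / (1.0 + w_h)**4)

    # Steps 9 and 12: update the weights (eqs. 2.15 and 2.18)
    return w_h - dw_h, w_o - dw_o

# Steps 1-2 and the epoch loop, with made-up data: 3 inputs, 5 hidden nodes, one output
rng = np.random.default_rng(0)
w_h = rng.uniform(-0.5, 0.5, size=(3, 5))   # random initial input-to-hidden weights
w_o = rng.uniform(-0.5, 0.5, size=5)        # random initial hidden-to-output weights
X = rng.uniform(0.0, 1.0, size=(10, 3))     # placeholder (scaled) training inputs
Y = rng.uniform(0.0, 1.0, size=10)          # placeholder (scaled) targets

for epoch in range(100):                    # one epoch = one pass through the training set
    for x, y in zip(X, Y):
        w_h, w_o = train_one_sample(x, y, w_h, w_o, eta=0.2, M=1e-6, R=1e-4)
```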
Chapter 3 - Experimental procedure
3.1 Overview
Various slurries were prepared to be fed to the hydrocyclone. Different operating conditions
were imposed on the hydrocyclone. All necessary operating conditions, samples and other
data were recorded for each run. A PSD analysis was carried out on the
samples. The gathered information could then be used to determine the d50c and the
sharpness of separation.
3.2 Raw materials
The solid particles that had to be separated were micron-sized silica quartz particles, MQ15,
supplied by Micronized SA Limited. According to Tew (2012) the particles contain 98.50%
silica, with small amounts of Al2O3, K2O, Fe2O3, CaO and Na2O. The particles have a density
of 2650 kg/mÂł and d50c and m values of 20 microns and 1.9 respectively.
The carrier fluid used in this case was municipal water from the Tlokwe municipality.
3.3 Equipment
 3X 5 litre buckets
 2X 20 litre buckets
 1X water gun
 1X Marcy scale
 1X Doppler flow meter
 50X poly tops
 1X large syringe
 1X spoon
3.4 Experimental setup
A diagram of the experimental setup is shown in Figure 3.1. The geometry of
the hydrocyclone that was used in this study is displayed in Table 3.1. The alphabetical
labels in Figure 3.1 are explained below:
 A: Slurry storage tank
 B: Circulation pump
 C: Main feed bypass valve
 D: Feed fine tune bypass valve
 E: Feed shutdown valve
 F: Pressure gauge
 G: Doppler flow meter
 H: Hydrocyclone
 I: Hydrocyclone overflow
 J: Hydrocyclone underflow
 K: Sample taken from the hydrocyclone overflow
 L: Sample taken from the hydrocyclone underflow
 M: Mixer
Figure 3.1: Diagram of the hydrocyclone setup
The mixer, mentioned above, consists of square tubing that transports the fluid from the
fine tuning bypass valve to the bottom of the storage tank. Holes were made at the end of
the square tubing in order for the slurry to be sprayed towards the sides of the storage
tank so as to promote better mixing.
Two sampling containers are located above the storage tank and below the overflow and
underflow outlets. As soon as the container of the underflow is pushed in under the
underflow outlet, a mechanism pushes the overflow pipe into the overflow container, meaning that
the overflow and the underflow are sampled simultaneously. The experimental
hydrocyclone setup can be seen in Figure 3.2. The two containers that store the underflow
and the overflow are also indicated on this figure.
Figure 3.2: The experimental hydrocyclone setup
Where:
 A: Hydrocyclone
 B: Hydrocyclone overflow
 C: Hydrocyclone underflow
 D: Sampling container for the underflow
 E: Sampling container for the overflow
 F: Slurry storage tank
Table 3.1: Hydrocyclone geometry
Part | Size
Dc   | 10 cm
Di   | 3.03 cm
Do   | 3.4 cm
h    | 53 cm
3.5 Experimental procedure
3.5.1 Preparation
3.5.1.1 Doppler flow meter calibration
The Doppler flow meter was installed at a suitable place where minimum noise would occur due to
turbulence in the piping. The storage tank was initially loaded with water only. After the pump
was turned on, one person read the value from the Doppler flow meter display, while the other
person filled the underflow and overflow containers with the water coming from the
hydrocyclone. The combined underflow and overflow of the hydrocyclone is equal to the feed to the
hydrocyclone. The person filling the underflow and overflow containers also had to keep track of
the time in which the containers were filled. From the volume of the water collected and the time
in which the water was collected, the real feed flowrate to the hydrocyclone could be calculated.
The Doppler flow meter was calibrated accordingly. This procedure was repeated until the error
between the Doppler flow meter reading and the flowrate measured with the container-and-stopwatch
method was small enough.
3.5.1.2 Marcy scale calibration
The Marcy scale is a handy tool that can be used to determine the density of a slurry mixture.
A picture of the Marcy scale can be seen in Figure 3.3. Before its use, it has to be
calibrated with the above-mentioned municipal water, with the density reading set to 1000 kg/mÂł
during calibration.
3.5.1.3 Slurry preparation
One of the variables that also has to be monitored is the volumetric percentage of solids in the
feed. The slurry tank (storage tank) was first filled with 200 litres of municipal water. The
mass of silica sand that has to be added to the tank to obtain a certain volumetric solids
percentage is calculated with equation 3.1.

$m_s = \dfrac{\varphi\,V_w}{\dfrac{1}{\rho_s} - \varphi\,\dfrac{1}{\rho_s}}$    (3.1)
Where:
$m_s$ = mass of silica sand that has to be added to the storage tank
$\varphi$ = desired volume percentage of solids in the slurry (expressed as a fraction)
$\rho_s$ = density of silica sand = 2650 kg/mÂł
$V_w$ = volume of water in the tank = 200 litres = 0.2 mÂł

Figure 3.3: The Marcy scale
3.5.2 Sampling
Step-by-step instructions for obtaining samples from the rig5
are given in this section. These
steps could only be followed after the preparation mentioned in chapter 3.5.1 has been
completed:
1. Make sure that the valve between the pump opening and the storage tank exit is fully
opened;
2. Make sure that there are no objects in the storage tank that could cause pump failure.
3. Close the feed shutdown valve;
4. Close both of the valves from the overflow and underflow containers;
5. Fully open both the feed bypass valves;
6. Turn on the pump;
7. The slurry from the bypass valves will lead to plenty of turbulence in the storage tank.
It is however recommended that the storage tank also be mixed manually so as to
ensure that most of the silica particles are suspended in the slurry;
8. Fully open the feed shutdown valve;
9. Slowly close the feed bypass valves while keeping an eye on the pressure gauge. Stop
closing the bypass valves as soon as the required pressure is reached;
10. One person has to take note of the flow rate, while the other person has to push in the
underflow sampling container. This has to be done at the same time. The person
recording the flow rate from the Doppler flow meter also has to start and stop a
stopwatch when the containers are first inserted and when they are pulled out again;
11. As soon as the underflow and overflow containers have been pulled out, the pump
may be stopped;
12. Separate buckets have to be inserted under the hoses that are connected to the outlet
valves of the underflow container and the overflow container;
13. Slowly open the outlet valves of the overflow and underflow containers and collect the
underflow and overflow samples in the separate buckets. The content of the underflow
and overflow containers have to be stirred well while the outlet valves are opened so
as to avoid silica sand from settling and remaining in the underflow or overflow
containers;
5 The terms “rig” and “hydrocyclone experimental setup” are used interchangeably in this study.
14. The buckets have to be weighed separately on a scale. The scale should previously
have been tared with the mass of the buckets that are used; similar buckets thus have
to be used. The mass of the content inside the buckets is recorded;
15. Smaller samples of the overflow and the underflow are taken by mixing the slurry in the
buckets and filling a poly top with the content. The poly tops should be labelled
clearly.
16. The remaining content in the buckets are again stirred before the Marcy scale bucket
is filled with the slurry. The Marcy scale bucket is put on the Marcy scale to determine
the density of the slurry. The densities of both the slurries have to be determined this
way.
The buckets containing the remaining slurry are emptied into the slurry storage tank of the rig.
The above steps are then repeated for the next sample.
3.5.3 Analysing
The particle size distributions of the underflow samples were all determined with the
Malvern Mastersizer 2000. The particles are circulated through the Mastersizer where they
eventually pass through a laser beam. The particles passing through the laser beam scatter
some of the radiation from the beam, and the intensity of the backscattered laser light
is measured with special backscatter detectors. The angle at which the
light is scattered is inversely proportional to the size of the silica particles (Malvern
instruments, 2005). Figure 3.4 is a picture of the Malvern Mastersizer 2000.
Figure 3.4: Malvern Mastersizer 2000
3.5.4 Experimental error
For the experimental error determination, 4 random operating conditions were chosen from all
the experiments that were conducted. Six runs were completed on each of these operating
conditions. A total of 26 experiments were thus completed in order to determine the
experimental error. The conditions at which each of the sets were done, as well as the results
could be observed in Appendix D. All the calculations that were done in the determination of
the experimental error could be found in the electronically attached spreadsheet named
“Experimental Error”.
It is assumed that the data follows a normal distribution. Due to the small amount of available
data for each of the sets, the experimental error had to be determined using the student’s t
test (Devore & Farnum, 2005:313-318).
The experimental error could be determined with equation 3.2.

$t_{n-1}\!\left(\dfrac{\alpha}{2}\right) \times \dfrac{S_t}{\sqrt{n_u}}$    (3.2)
Where:
$t_{n-1}(\alpha/2)$ = critical t value that could be obtained from the back cover of Devore and Farnum (2005)
$S_t$ = standard deviation of the data
$n_u$ = number of data points available in the set
A 95% confidence interval was used to obtain the experimental error. The processed
experimental data that were used for the determination of the experimental error could be
observed in Table 3.2.
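The half-width of the confidence interval in equation 3.2 can be reproduced with any statistics package. The sketch below uses SciPy's Student's t distribution, with the set 1 d50c values from Table 3.3 (n = 3, S = 2.58 ÎŒm) as an example.

```python
from math import sqrt
from scipy import stats

def t_error(S_t, n_u, alpha=0.05):
    """Equation 3.2: half-width of the interval, t_{n-1}(alpha/2) * S_t / sqrt(n_u)."""
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, df=n_u - 1)
    return t_crit * S_t / sqrt(n_u)

# Set 1 d50c values from Table 3.3: n = 3, S = 2.58 microns, 95% confidence
print(t_error(S_t=2.58, n_u=3))   # t_2(0.025) = 4.303, giving an error of about 6.4 microns
```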
Table 3.2: Processed data used for the experimental error determination
Data point   Set 1           Set 2           Set 3           Set 4
             d50c    m       d50c    m       d50c    m       d50c    m
1            21.71   1.35    28.41   2.03    17.72   1.13    29.58   1.44
2            16.59   1.37    24.51   1.92    21.91   1.21    30.83   1.66
3            18.63   1.49    29.57   1.89    22.02   1.21    30.99   1.69
4            -       -       -       -       22.26   1.30    31.44   1.71
5            -       -       -       -       23.41   1.34    31.66   2.00
Validation   18.44   1.29    26.03   1.97    21.82   1.16    29.74   1.60
Two of the data points in each of sets 1 and 2 were discarded due to their large deviation
from the rest of the data in the set. One data point in each set was used as a validation
data point. The values that were substituted into equation 3.2 to calculate the experimental
error of each set can be observed in Table 3.3. The d50c experimental error
for each data set and the sharpness of separation error for each data set are
shown in Figure 3.5 and Figure 3.6 respectively.
Table 3.3: Values for substitution into the student's t equation
Data     Set 1           Set 2           Set 3           Set 4
         d50c    m       d50c    m       d50c    m       d50c    m
n        3       3       3       3       5       5       5       5
S        2.58    0.076   2.65    0.075   2.17    0.073   0.811   0.20
X6       18.98   1.40    27.50   1.95    21.46   1.24    30.90   1.70
T(95%)   4.303   4.303   4.303   4.303   2.776   2.776   2.776   2.776

6 X is the average of the data in the specific set
Figure 3.5: Experimental error of the d50c with a 95% confidence interval
Figure 3.6: Experimental error of the sharpness of separation with a 95% confidence interval
Large errors are observed for the d50c values in each of the sets in Figure 3.5. This could be
ascribed to the varying feed PSDs that will be dealt with later in this paper. The experimental
errors of the sharpness of separation as seen on Figure 3.6 are however acceptable.
Chapter 4 - Model development
4.1 Overview
121 samples were processed to be put through the artificial neural network and the Plitt
model. Unfortunately, the raw data needed to be processed before it was fit for use in the
artificial neural network and the Plitt model. For more information on how the data was
processed, please refer to Appendix A.
4.2 The Plitt model
As mentioned before, the modified Plitt model with the fudging factors will be used in an
attempt to predict the d50c and the sharpness of separation of the hydrocyclone operated under
certain conditions. Of the 121 data points, 69 samples were used to fit the fudging factors with
the help of the ExcelÂź add-in, Solver. The input parameters from the experimental data that
were not used in the tuning of the fudging factors were then substituted into the Plitt model.
The d50c and sharpness of separation results from the Plitt model were then compared to the
corresponding experimental results. The Plitt model calculations can be found in the
electronically attached spreadsheet named “Plitt model”.
4.2.1 Split flow
As mentioned before, this paper will only focus on predicting the d50c and the sharpness of
separation, m. The processed input data from Appendix C was inserted into the d50c and
sharpness of separation equations of the Plitt model.
For the split flow variable, 𝑆, of the d50c equation either the experimentally calculated 𝑆 or the
split flow calculated with one of the Plitt model equations given in equation 4.1 could be used.
4.1
The value of đč4 was determined by minimizing the error between 69 of the experimental and
calculated split flow values with the help of the ExcelÂź add-in, Solver. The resulting value of
đč4 was found to be 0.13. The remaining experimental values were then compared with
corresponding Plitt model values under the same operating conditions. The results of this
investigation are presented in Figure 4.1. Very small deviations from the experimental split
flow values are observed, meaning the split flow values from the Plitt model is suitable for
further use.
4.2.2 Cut size – d50c
The d50c value was calculated with equation 2.2. Just as with the split flow, 69 experimental
data points were used to adjust the value of F1. There is however another variable, k, that
could be adjusted in this equation. It was observed that Solver could either vary F1 or k to
obtain a minimum error. A value of 0.5 was arbitrarily chosen for k, while F1 was varied. The
resulting value for F1 is 64.9.
4.2.3 Sharpness of separation
The fudging factor of the sharpness of separation was determined the same way as the above
mentioned fudging factors.
4.3 The artificial neural network
The backpropagation algorithm was used to train the neural network. A few modifications
were made to the ANN, namely the addition of a regularization term and the addition of
a momentum term. Both these terms can be observed in equations 2.14 and 2.17. All artificial
neural networks that were constructed had various input variables and only one output
variable. The output variable was either the d50c or the sharpness of separation.
4.3.1 Artificial neural network architecture
Six different artificial neural networks were written. The number of hidden neurons in each of
these networks could be varied between 1 and 20, while the numbers of input and output neurons
cannot be changed. Table 4.1 displays a list of all the ANNs that have been programmed. All these
programs could be found under the attached folder named “Artificial neural networks”.
Figure 4.1: Experimental split flow values plotted with the predicted Plitt model split flow values
Table 4.1: Different artificial neural networks that were programmed
Neural network number | Input variables | Output variables
1 | Du | d50c
2 | Du, φ and Q | d50c
3 | Du, φ, Q, P, and S | d50c
4 | Du | m
5 | Du, φ and Q | m
6 | Du, φ, Q, P, and S | m
Separate neural networks for the d50c and sharpness of separation were constructed, as neural
networks that had both these variables as outputs lacked the ability to learn.
It was decided that the first ANN of both the d50c output and sharpness of separation output
should only have the spigot diameter as input variable as it is known that this variable has the
largest effect on the hydrocyclone performance. In this study, the spigot diameter was
changed by switching off the pump and manually inserting a new spigot with a different
diameter. In industry, this would however be impractical. In a study conducted by Eren and
Gupta (1988), the spigot size could be adjusted pneumatically while the cyclone was on-line.
This study will thus be applicable to hydrocyclones whose spigot size can be changed while
the cyclone is on-line.
The second set of neural networks contained the same inputs that are needed in the Plitt
model – the volumetric percentage solids in the feed 𝜑 and the feed volumetric flowrate 𝑄.
This neural network and the Plitt model are thus on equal grounds and could be compared
with one another.
For the third and last set of neural networks, the split flow and pressure drop over the cyclone
were added as inputs to test whether the predictive power of the neural network would improve.
Each neural network that was constructed had the ability to test 20 different architectures with
one click of a button. The networks could thus be run on multiple computers at the same time.
More neural networks could thus be tested in a shorter amount of time in comparison with
MATLAB¼’s Neural Network Toolboxℱ.
Each of the neural networks was trained with roughly 75% of the experimental data. The
remaining 25% of the data was used as validation data. The validation data was used for all
the results that are displayed in chapter 5. None of the training data were thus used for
validation purposes.
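A minimal sketch of such a split is shown below; the shuffling, the seed and the way samples are stored are assumptions made for illustration.

```python
import numpy as np

def split_data(samples, train_fraction=0.75, seed=0):
    """Shuffle the samples once and split them into training and validation sets."""
    idx = np.random.default_rng(seed).permutation(len(samples))
    n_train = int(train_fraction * len(samples))
    return [samples[i] for i in idx[:n_train]], [samples[i] for i in idx[n_train:]]

# e.g. split the processed samples into disjoint training and validation subsets
train, validation = split_data(list(range(121)))
```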
To display the learning capability of the developed neural networks, a neural network that had
the spigot diameter as input parameter and the sharpness of separation as output parameter
was trained with 80 epochs and a training speed of 0.02. The results are displayed on Figure
4.2. The reader is referred to Appendix E for the source code of one of the artificial neural
networks.
Figure 4.2: Learning capability of one of the 6 developed artificial neural networks
Chapter 5 - Results and discussion
After processing the data, it was found that the PSD of the feed varied considerably. An
alternative for calculating the feed PSD is dealt with in this section.
As mentioned before, the neural network was written in order to make it convenient for the
user to test multiple neural network architectures at once. This functionality was used to filter
out the more suitable neural network architectures for predicting the cut size and the
sharpness of separation. These filtered out neural networks were then further optimised. The
results as well as a discussion of these results are given in this section.
5.1 Deviations in the feed PSD
From the start of the sampling and analyses, it was assumed that the feed PSD remained
constant for all the slurry batches, as the same silica sand product from the same
manufacturer was used each time. It thus only seemed necessary to sample and determine
the PSD of the feed once and sample the underflow of each run, instead of sampling both the
underflow and the overflow of each run. This meant the total amount of PSD analyses could
be cut in half. The resulting partition curves that were produced had partition values that
exceeded 1 or was lower than 0. This means that the material balance did not solve. After
Figure 5.1: Particle size distribution of 25 different feed samples
taking samples of 25 different slurry mixtures7
, it was found that the PSDs differed significantly
from each other as can be seen on Figure 5.1.
This meant that the partition curve could no longer be calculated from one feed PSD sample.
A solution to this problem was to calculate, for each of the underflow samples that were analysed,
25 different partition curves, one from each of the feed PSDs shown in Figure 5.1.
One out of the 25 partition curves had to be chosen. The chosen partition curve had to fulfil
two criteria. Firstly, there may not be a value on the partition curve that exceeds 1. This would
mean that more of a certain size of particles exits the cyclone than have entered the cyclone.
According to the literature study, the correction made to the partition curve in order to obtain
the corrected partition curve is equal to the recovery of water to the underflow. The partition
curve thus also has to intersect the y-axis at a value that is close to the value of Rf. This is the
second constraint the partition curve has to meet.
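These two criteria can be applied programmatically when choosing among the 25 candidate partition curves. The sketch below is an assumed illustration: each candidate holds its partition values and its y-axis intercept, and the curve that never exceeds 1 and whose intercept lies closest to the measured Rf is selected; the function name and data layout are hypothetical.

```python
def choose_partition_curve(candidates, measured_rf):
    """Pick the candidate partition curve that (1) never exceeds 1 and
    (2) has a y-axis intercept closest to the measured water recovery Rf."""
    feasible = [c for c in candidates if max(c["values"]) <= 1.0]
    return min(feasible, key=lambda c: abs(c["intercept"] - measured_rf))

# Each candidate: partition values over the size classes and its y-axis intercept (illustrative)
curves = [{"values": [0.42, 0.55, 0.81, 0.97], "intercept": 0.44},
          {"values": [0.47, 0.61, 0.88, 1.04], "intercept": 0.40}]
print(choose_partition_curve(curves, measured_rf=0.45))  # the first curve satisfies both criteria
```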
Unfortunately, the Mastersizer was incapable of accurately measuring particle sizes
smaller than 8.4 ÎŒm. The curve in Figure 5.2 shows the large fluctuations that occur at
particle sizes smaller than 8.4 ÎŒm. This phenomenon occurred in all the partition curves.
According to the results from the Mastersizer, the particles under 8.4 ÎŒm amounted to 0.1%
of the total particles. These particles will thus be neglected. The partition curve
value at 8.4 ÎŒm will thus be taken as the recovery of liquid to the underflow.
7 The word batch and slurry mixture are used interchangeably
Figure 5.2: Example partition curve before justifications
A similar phenomenon was observed for particles larger than 95 𝜇𝑚. These particles
amounted to less than 0.05% of the total particles. It would thus also be a safe assumption to
ignore these particle sizes in further calculations.
After the partition curve that suited the description above was chosen, small changes were
made to the value of Rf so that it would be equal to the experimental Rf value. These small
changes could be observed on Figure 5.3.
5.2 Plitt model
5.2.1 Cut size – d50c
The d50c results of the modified Plitt model are displayed on Figure 5.4. The blue line connects
the experimental data points, while the orange line connects points that were predicted by the
Plitt model. The results are displayed in another form in Figure 5.5, where the predicted vs.
actual values are plotted over the y = x curve. To determine how well the data fits the y = x
curve, a value called the coefficient of determination was calculated. This resulted in an RÂČ value
of 0.664.
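The coefficient of determination against the y = x line can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below is a generic implementation of that calculation; the example arrays are placeholders, not the study's validation data.

```python
import numpy as np

def r_squared_about_identity(experimental, predicted):
    """R^2 of predicted vs. experimental values measured against the y = x line."""
    experimental = np.asarray(experimental, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((experimental - predicted) ** 2)
    ss_tot = np.sum((experimental - np.mean(experimental)) ** 2)
    return 1.0 - ss_res / ss_tot

# Placeholder values for illustration only
print(r_squared_about_identity([18.2, 21.5, 24.9], [17.6, 22.4, 24.1]))
```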
Figure 5.3: Experimental vs. adjusted values of Rf
Figure 5.5: Plitt model predicted cut size vs. experimental cut size plotted over the y=x
curve
Figure 5.4: Experimental cut point plotted with the cut point predicted by the Plitt model
From Figure 5.4, it is clear that the Plitt model was capable of predicting the cut point to a
certain extent. At the larger d50c values, the Plitt model tends to overpredict the d50c, while the
opposite is true for the smaller d50c values. The combined absolute error for the 44 validation data
points is 71.5 ÎŒm. The average error per predicted d50c value is thus 1.6 ÎŒm, which is
acceptable.
5.2.2 Sharpness of separation
The sharpness of separation results from the Plitt model could be observed on Figure 5.6.
Again, the experimental values for the sharpness of separation are connected by the blue line,
while the predicted values are connected by the orange line.
Figure 5.6: Experimental sharpness of separation plotted with the sharpness of separation
predicted by the Plitt model
Figure 5.7: Plitt model predicted m vs. experimental m plotted over the y=x curve
From Figure 5.6 and Figure 5.7, it is clear that the Plitt model is incapable of predicting the
sharpness of separation. Deviations with an absolute value of 2 could easily be observed on
these figures.
5.3 Artificial neural networks
Various experiments were conducted in order to determine which neural network architecture
and parameters will be more suited for predicting the d50c and the sharpness of separation.
The same architectures and parameters were tested on both the d50c and the sharpness of
separation.
In the first series of tests, the number of epochs and the training speed were held constant
while the number of neurons in the hidden layer was varied between 3 and 20. Below 3 hidden
neurons, the neural network lacked the complexity to adequately predict the d50c and the
sharpness of separation. All six neural networks mentioned in Table 4.1 were tested with these
architectures and parameters.
The architecture and parameters that were revealed to be the best out of those tested
underwent further testing by increasing the number of epochs by orders of magnitude and
decreasing the training speed so as to increase the chances of finding the global minimum.
The momentum and regularization terms were tested with the same architecture and
parameters as those used by the neural network mentioned in the previous paragraph.
5.3.1 Cut size – d50c
5.3.1.1 Neural network screening
The results of training the neural network with only the spigot diameter, Du, as input are given
in Figure 5.8. Overtraining8 occurred in all of the tests
except for the neural network that had 3 hidden neurons. It should be noted that the neural
networks stopped training as soon as overtraining started.
The results in Figure 5.8 show that a simple neural network with no more than 6 hidden
neurons had the best prediction capabilities. Adding more hidden neurons tends to
overcomplicate the network, leading to poorer results.
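Stopping a run as soon as overtraining begins, as noted above, can be implemented as a simple validation-error check after each epoch. The sketch below is an assumed illustration of such an early-stopping loop; train_epoch and validation_error are hypothetical placeholders standing in for the training and error routines described in chapter 2.

```python
def train_with_early_stopping(network, max_epochs, train_epoch, validation_error):
    """Train epoch by epoch and stop as soon as the validation error starts to rise.

    train_epoch(network)      -> runs one epoch of backpropagation (placeholder callable)
    validation_error(network) -> combined absolute error on the validation set (placeholder callable)
    """
    best_error = float("inf")
    for epoch in range(max_epochs):
        train_epoch(network)
        error = validation_error(network)
        if error > best_error:          # overtraining: the network starts to diverge
            break
        best_error = error
    return network, best_error
```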
Figure 5.8: Results of neural network 1 trained with a training speed of 0.2 and a
maximum amount of epochs of 8000
8 Overtraining takes place when the artificial neural network stops converging to an answer and starts
to diverge
By adding more input parameters to the neural network, even better results are achieved. The
results could be seen on Figure 5.9. All networks in this test were trained until overtraining
commenced. A maximum error just above 1.35 micron per validation data point was achieved
in this neural network.
Figure 5.9: Results of neural network 2 trained with a training speed of 0.2 and a
maximum amount of epochs of 8000
Two more input parameters, the split flow and the pressure drop over the cyclone were
inserted. The addition of these two parameters produced better results than the previous tests.
No trend could be observed in the absolute error as the number of neurons was increased.
The results are given in Figure 5.10.
Figure 5.10: Results of neural network 3 trained with a training speed of 0.2 and a
maximum amount of epochs of 8000
5.3.1.2 Enhancing the neural network
From the results in Figure 5.8, Figure 5.9 and Figure 5.10, it is clear that the predictive power
of the neural network increases with an increase in the number of inputs. It was thus decided
to further develop neural network 3 for predicting the d50c of the hydrocyclone.
The neural network was given 12 hidden neurons and first trained with a maximum of 60000
epochs and a training speed of 0.02. It was expected that the absolute error observed in Figure
5.10 would decrease; instead, the error increased by almost 1.6 microns to 47.19 ÎŒm. An
explanation for this phenomenon could be that this neural network just happened to step over
the local minimum that was found by the neural network in Figure 5.10. Another representation
of the results is given in Figure 5.12.
Although there are some neural network output values that differ by 2 ÎŒm from the
experimental d50c values, Figure 5.11 shows that the neural network has adequate prediction
power.
Figure 5.11: Calculated d50c plotted with the experimental d50c values of neural network 3
trained with a maximum of 60000 epochs and a training speed of 0.02
Figure 5.12: Predicted d50c vs. experimental d50c plotted over the y=x curve for the neural
network trained with a maximum of 60000 epochs and a training speed of 0.02
For the next enhancement, the momentum term was used. The momentum constant was
given a value of 1 × 10⁻⁶. The other parameters and architecture of the neural network
remained unchanged. A significant reduction of more than 3 ÎŒm was observed in the combined
error when compared to the previous test. The value of the combined error in this case is
44.15 ÎŒm. It can also be seen in Figure 5.14 that the RÂČ value decreased by 0.05 to 0.795.
When looking at the calculated and experimental graph in Figure 5.13, certain improvements
could be spotted. As an example, the last validation data point lies on the predicted d50c value.
This was not the case in Figure 5.11.
Figure 5.13: Calculated vs. experimental values of neural network 3 trained with a maximum of
60000 epochs and a training speed of 0.02 with the addition of the momentum term
Figure 5.14: Predicted d50c vs. experimental d50c plotted over the y=x curve for a neural
network trained with a maximum of 60000 epochs and a training speed of 0.02
For the final neural network enhancement, the momentum term was deactivated while the
regularization term was activated. The regularization constant was set to a value of
1 × 10⁻⁎. The regularization term only produced slight improvements when compared to the
initial neural network enhancement. The resulting combined absolute error was 46.45 ÎŒm.
The validation results are displayed in Figure 5.15. The predicted vs. experimental d50c
can be observed in Figure 5.16. Only slight differences are observed in the graphs of
Figure 5.13 and Figure 5.15.
Figure 5.15: Calculated vs. experimental values of neural network 3 trained with a maximum
of 60000 epochs and a training speed of 0.02 with the addition of the regularization term
Figure 5.16: Predicted d50c vs. experimental d50c plotted on the y=x curve for the neural
network trained with a maximum of 60000 epochs and a training speed of 0.02 with the
addition of the regularization term
5.3.2 Sharpness of separation
5.3.2.1 Screening of neural networks
Screening of the neural networks with the sharpness of separation as output was done the
same way as the screening of the d50c neural networks. The results of neural networks 4, 5
and 6 are given in Figure 5.17, Figure 5.18 and Figure 5.19 respectively. Most of the networks
were trained until overtraining commenced.
Figure 5.17: Results of neural network 4 trained with a training speed of 0.5 and a
maximum amount of epochs of 20000
Figure 5.18: Results of neural network 5 trained with a training speed of 0.5 and a
maximum amount of epochs of 25000
Figure 5.19: Results of neural network 6 trained with a training speed of 0.2 and a
maximum amount of epochs of 8000
The same phenomena that happened in neural networks 1, 2 and 3 were observed in neural
networks 4, 5 and 6. When the number of inputs to the neural network was less than or equal
to 3, the predictive capability of the neural networks reached its peak when the number of
hidden neurons was capped at 8. There again was no trend in the prediction power of the
neural network as the number of hidden neurons was increased for the neural network that
had 5 inputs. An increase in the number of inputs to the neural network also led to an
improved predicting capability. It was thus decided that neural network 6 should be further
developed.
5.3.2.2 Enhancing the neural network
It was decided that neural network 6 should be given 13 hidden nodes, as good results were
obtained with this number of hidden nodes, as can be seen in Figure 5.19. The network was
trained with a maximum of 60000 epochs and a training speed of 0.02. After the first test, the
momentum term was added with a momentum constant of 1 × 10⁔, and in the second test, the
momentum term was deactivated and the regularization term was inserted with a
regularization constant of 0.001. The results can be observed in Figure 5.20, Figure 5.22
and Figure 5.24. Predicted vs. calculated plots can be seen in Figure 5.21, Figure 5.23 and
Figure 5.25.
Figure 5.20: Experimental and predicted values of neural network 6 trained with a
maximum of 60000 epochs and a training speed of 0.02
Figure 5.21: Predicted vs. experimental m plotted over the y=x graph for neural network 6
trained with a maximum of 60000 epochs and a training speed of 0.02
Figure 5.22: Predicted and experimental values of neural network 6 trained with a
maximum of 60000 epochs and a training speed of 0.02 with the addition of the
momentum term
Figure 5.23: Predicted m vs. experimental values m plotted over the y=x line for neural
network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the
addition of the momentum term
Figure 5.24: Predicted vs. experimental values of neural network 6 trained with a
maximum of 60000 epochs and a training speed of 0.02 with the addition of the
regularization term
Figure 5.25: Predicted m vs. experimental m plotted over the y=x curve for neural network
6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition
of the regularization term
Except for the large outliers observed near validation sample number 20 and at validation
sample 13, the sharpness of separation was predicted with reasonable accuracy. When
comparing the graphs in Figure 5.20, Figure 5.22 and Figure 5.24, one observes that there are
no significant differences. The neural network that had neither a regularization nor a
momentum term had a combined validation error of 9.11 for the sharpness
of separation, meaning that the sharpness of separation was off by an average of
0.21 per validation sample. The momentum term further decreased the combined error to
9.04, while the addition of the regularization term significantly decreased the combined error
to 8.87, meaning that the average error per validation sample prediction was decreased to
0.2.
 

Comparison between the Plitt model and an artificial neural network in predicting hydrocyclone performance

neural network.

Keywords: hydrocyclone; d50c; sharpness of separation; artificial neural network; Plitt model; fine cut point; variable size spigot
  • 5. School of Chemical and Minerals Engineering Attached documents| iv Attached documents
Folder name | File name | Description
Experimental error | Experimental Error | Excel® spreadsheet containing the data and calculations that were done to determine the experimental error
Plitt model | Plitt model | Excel® spreadsheet containing the calculations performed on the experimental data with the Plitt model
Artificial neural networks | Neil'sANN.rev3-d50c - Du; Neil'sANN.rev3-d50c - Du+phi+Q; Neil'sANN.rev3-d50c - Du+phi+Q+P+S; Neil'sANN.rev3-m - Du; Neil'sANN.rev3-m - Du+phi+Q; Neil'sANN.rev3-m - Du+phi+Q+P+S | The macro-enabled Excel® spreadsheets contain the program with which the artificial neural networks were trained and validated
Meetings | Various files | This folder contains all the minutes and agendas of each meeting in Microsoft Word® format
Data processing | Data processing | Contains the Excel® spreadsheet with which the data processing was done
MSDS | MSDS – Silica flour | This is a PDF document containing the MSDS of silica flour
Gantt chart | Gantt chart | This folder contains a Gantt chart that is in both PDF format and MS Project format
  • 6. School of Chemical and Minerals Engineering Table of contents| v Table of contents Declaration............................................................................................................................. i Acknowledgements................................................................................................................ii Abstract.................................................................................................................................iii Attached documents .............................................................................................................iv Table of contents .................................................................................................................. v List of figures .......................................................................................................................vii List of tables..........................................................................................................................ix List of acronyms.................................................................................................................... x List of symbols ...................................................................................................................... x Chapter 1 - Introduction ........................................................................................................ 1 1.1 Background............................................................................................................. 1 1.2 Problem statement.................................................................................................. 1 1.3 Aim and objectives.................................................................................................. 2 1.3.1 Aim .................................................................................................................. 2 1.3.2 Objective.......................................................................................................... 2 1.3.3 Methodology .................................................................................................... 2 Chapter 2 - Literature study................................................................................................... 3 2.1 The hydrocyclone ................................................................................................... 3 2.2 Hydrocyclone control .............................................................................................. 6 2.2.1 Sensors used in hydrocyclone performance determination .............................. 6 2.3 Soft sensors............................................................................................................ 7 2.3.1 Empirical models ............................................................................................. 8 2.3.2 Artificial neural networks................................................................................ 10 Chapter 3 - Experimental procedure ................................................................................... 20 3.1 Overview............................................................................................................... 20
  • 7. School of Chemical and Minerals Engineering Table of contents| vi 3.2 Raw materials....................................................................................................... 20 3.3 Equipment ............................................................................................................ 20 3.4 Experimental setup ............................................................................................... 20 3.5 Experimental procedure........................................................................................ 23 3.5.1 Preparation.................................................................................................... 23 3.5.2 Sampling........................................................................................................ 25 3.5.3 Analysing....................................................................................................... 26 3.5.4 Experimental error ......................................................................................... 27 Chapter 4 - Model development.......................................................................................... 30 4.1 Overview............................................................................................................... 30 4.2 The Plitt model...................................................................................................... 30 4.2.1 Split flow ........................................................................................................ 30 4.2.2 Cut size – d50c ................................................................................................ 31 4.2.3 Sharpness of separation ................................................................................ 31 4.3 The artificial neural network .................................................................................. 31 4.3.1 Artificial neural network architecture .............................................................. 31 Chapter 5 - Results and discussion..................................................................................... 34 5.1 Deviations in the feed PSD ................................................................................... 34 5.2 Plitt model............................................................................................................. 36 5.2.1 Cut size – d50c ................................................................................................ 36 5.2.2 Sharpness of separation ................................................................................ 38 5.3 Artificial neural networks ....................................................................................... 39 5.3.1 Cut size – d50c ................................................................................................ 40 5.3.2 Sharpness of separation ................................................................................ 46 Chapter 6 - Conclusion and recommendations.................................................................... 53 6.1 Conclusion............................................................................................................ 53 6.2 Recommendations................................................................................................ 53 6.3 Further study......................................................................................................... 54
  • 8. School of Chemical and Minerals Engineering List of figures| vii Bibliography........................................................................................................................ 55 Appendix A Data processing ............................................................................................ I Appendix B Data processing source code ......................................................................IV Appendix C Processed data ...................................................................................... XVIII Appendix D Experimental error data............................................................................ XXI Appendix E ANN source code .................................................................................... XXII Appendix F ECSA exit level outcomes ..................................................................... XXXII Appendix G Hazard identification and risk assessment.............................................XXXV List of figures Figure 2.1: Hypothetical flow inside the hydrocyclone viewed from the top of the hydrocyclone. Adapted from Plitt (1976) ...................................................................................................... 4 Figure 2.2: Corrected and non-corrected partition curve adapted from Schneider (2001)...... 5 Figure 2.3: Diagram of the computational nodes and weights of an artificial neural network adapted from Jain (1996).................................................................................................... 10 Figure 2.4: Supervised learning with reference to Hagan et al. (2002) ................................ 13 Figure 2.5: Polynomial of first order produces a bad fit for the data. Reproduced from Bishop (2008:11) ............................................................................................................................ 14 Figure 2.6: Polynomial of high order producing something that looks like a good fit for all the data points, but the predictive power of the polynomial is sacrificed. Reproduced from Bishop (2008:12). ........................................................................................................................... 15 Figure 2.7: A lower order polynomial that has the capability to generalize well ................... 16 Figure 3.1: Diagram of the hydrocyclone setup ................................................................... 21 Figure 3.2: The experimental hydrocyclone setup............................................................... 22 Figure 3.3: The Marcy scale................................................................................................ 24 Figure 3.4: Malvern Mastersizer 2000................................................................................. 26 Figure 3.5: Experimental error of the d50c with a 95% confidence interval.......................... 29 Figure 3.6: Experimental error of the sharpness of separation with a 95% confidence interval ........................................................................................................................................... 29
  • 9. School of Chemical and Minerals Engineering List of figures| viii Figure 4.1: Experimental split flow values plotted with the predicted Plitt model split flow values ........................................................................................................................................... 31 Figure 4.2: Learning capability of one of the 6 developed artificial neural networks............. 33 Figure 5.1: Particle size distribution of 25 different feed samples ........................................ 34 Figure 5.2: Example partition curve before justifications...................................................... 35 Figure 5.3: Experimental vs. adjusted values of Rf .............................................................. 36 Figure 5.4: Experimental cut point plotted with the cut point predicted by the Plitt model .... 37 Figure 5.5: Plitt model predicted cut size vs. experimental cut size plotted over the y=x curve ........................................................................................................................................... 37 Figure 5.6: Experimental sharpness of separation plotted with the sharpness of separation predicted by the Plitt model................................................................................................. 38 Figure 5.7: Plitt model predicted m vs. experimental m plotted over the y=x curve.............. 39 Figure 5.8: Results of neural network 2 trained with a training speed of 0.2 and a maximum amount of epochs of 8000................................................................................................... 40 Figure 5.9: Results of neural network 2 trained with a training speed of 0.2 and a maximum amount of epochs of 8000................................................................................................... 41 Figure 5.10: Results of neural network 3 trained with a training speed of 0.2 and a maximum amount of epochs of 8000................................................................................................... 41 Figure 5.11: Calculated d50c plotted with the experimental d50c values of neural network 3 trained with a maximum of 60000 epochs and a training speed of 0.02 .............................. 42 Figure 5.12: Predicted d50c vs. experimental d50c plotted over the y=x curve for the neural network trained with a maximum of 60000 epochs and a training speed of 0.02 ................. 43 Figure 5.13: Calculated vs. experimental values of neural network 3 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the momentum term...... 44 Figure 5.14: Predicted d50c vs. experimental d50c plotted over the y=x curve for a neural network trained with a maximum of 60000 epochs and a training speed of 0.02 .............................. 44 Figure 5.15: Calculated vs. experimental values of neural network 3 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the regularization term .. 45 Figure 5.16: Predicted d50c vs. experimental d50c plotted on the y=x curve for the neural network trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the regularization term ....................................................................................... 46
  • 10. School of Chemical and Minerals Engineering List of tables| ix Figure 5.17: Results of neural network 4 trained with a training speed of 0.5 and a maximum amount of epochs of 20000................................................................................................. 47 Figure 5.18: Results of neural network 5 trained with a training speed of 0.5 and a maximum amount of epochs of 25000................................................................................................. 47 Figure 5.19: Results of neural network 6 trained with a training speed of 0.2 and a maximum amount of epochs of 8000................................................................................................... 48 Figure 5.20: Experimental and predicted values of neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02..................................................................... 49 Figure 5.21: Predicted vs. experimental m plotted over the y=x graph for neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 .............................. 49 Figure 5.22: Predicted and experimental values of neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the momentum term...... 50 Figure 5.23: Predicted m vs. experimental values m plotted over the y=x line for neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the momentum term............................................................................................................ 50 Figure 5.24: Predicted vs. experimental values of neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the regularization term ...... 51 Figure 5.25: Predicted m vs. experimental m plotted over the y=x curve for neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the regularization term .............................................................................................................. 51 List of tables Table 2.1: Sensors used in the on-line monitoring of hydrocyclone performance .................. 7 Table 3.1: Processed data used for the experimental error determination........................... 27 Table 3.2: Values for substitution into the student's t equation ............................................ 28 Table 4.1: Different artificial neural networks that were programmed .................................. 32
  • 11. School of Chemical and Minerals Engineering List of acronyms| x List of acronyms
Acronym | Description
ANN | Artificial neural network
HIRA | Hazard identification and risk assessment
MS | Microsoft
MSDS | Material safety data sheet
PPE | Personal protective equipment
PSD | Particle size distribution
List of symbols
Symbol | Description
Al2O3 | Aluminium oxide
K2O | Potassium oxide
Fe2O3 | Iron(III) oxide
CaO | Calcium oxide
Na2O | Sodium oxide
d | Size of a particle in ÎŒm
d50c | Hydrocyclone corrected cut point. This is the particle size that has an equal chance of either leaving through the underflow or the overflow. Its unit is ÎŒm.
D_c | Hydrocyclone diameter in cm
D_i | Inlet diameter in cm
D_o | Vortex finder diameter in cm
  • 12. School of Chemical and Minerals Engineering List of symbols| xi
D_u | Underflow/apex/spigot diameter in cm
ÎŽ_j^O | Error of the output node j
ÎŽ_j^H | Error of the hidden node j
E | Error of a neuron's output
E~ | Conditioned error of the neuron's output
η_v | Viscosity of the carrier fluid in cP
F_i | Fudging factor of the modified Plitt model, where i = 1, 2, 3, 

h | Free vortex height in cm
k | Constant that takes into account the effect of the solids density on the corrected cut size
m | Sharpness of separation. This is the slope of the partition curve that indicates how well the classification is taking place inside the hydrocyclone. The higher the value of m, the closer the hydrocyclone will be to an ideal classifier.
M | Momentum factor defined by the user
m_s | Mass of silica sand that has to be added to the storage tank in kg
n | The number of weights attached to node j
n_u | Number of data points available in the set
Ω | Penalty term
P | Pressure over the hydrocyclone in kPa
φ | Percentage solids in the feed
Q | Volumetric feed flow rate in liters per minute
R | Regularization factor defined by the user
  • 13. School of Chemical and Minerals Engineering List of symbols| xii
R_f | Recovery of the carrier liquid to the underflow
ρ_p | Density of the hydrocyclone feed slurry in g/cm3
ρ_s | Density of the solid phase in g/cm3
S | Split flow – the volumetric flow of the underflow divided by the volumetric flow of the overflow
S_t | Standard deviation in the data
σ_j^H | Output value of the transfer function of node j in the hidden layer
σ_j^O | Output of the neuron in the output layer j
t_{n-1}(α/2) | Critical t value that could be obtained from the back cover of Devore and Farnum (2005)
T(95%) | Critical t value for a 95% confidence
υ | Parameter for controlling the importance of the penalty term
V_w | Volume of water in the storage tank in m3
w_i | Value of weight i, where i = 1, 2, 3, 

w_ij | Value of the weight that goes from node i to node j
w_ij^H | Value of the hidden layer weight that goes from node i to node j
Δw_ij^H | Value with which weight w_ij^H has to be updated
w_ij^H(t−1) | The value of the previous weight w_ij^H
w_ij^O | Value of the output weight that goes from node i to node j
Δw_ij^O | Value with which the output weight w_ij^O has to be updated
w_ij^O(t−1) | Value of the previous weight w_ij^O
X | Average of a data set
  • 14. School of Chemical and Minerals Engineering List of symbols| xiii
x_i | Input value from weight i
x_j^H | Output of the hidden node j
x_j^O | Output of the output node j
Ο_j | Signal sent to node j, where j = 1, 2, 3, 

y | Partition number. This is the value displayed on the partition curve's y-axis for a certain particle size, d.
y' | Corrected partition number
y_j | Desired output of output node j
z | Number of weights connected to the cell
  • 15. School of Chemical and Minerals Engineering Introduction| 1 Chapter 1 - Introduction
1.1 Background
Hydrocyclones are very useful process units for classifying particles according to size or density in the mineral processing industry. Unfortunately, it is very difficult for the operator to monitor the performance of the hydrocyclone while it is on-line (Coelho & Medronho, 2000). Empirical models had to be developed to predict how the hydrocyclone will perform under certain conditions by acting as inference sensors (Kraipech et al., 2005). One popular empirical model used to predict the performance of a hydrocyclone is the Plitt model (Flinthoff et al., 1987). L.R. Plitt developed this model to be robust by gathering a large amount of experimental data. The data was gathered by operating a wide range of hydrocyclone geometries at different operating conditions (Plitt, 1976). This model is in general not very accurate in predicting the performance, i.e. the separation efficiency, of hydrocyclones (Silva et al., 2009).
Artificial neural networks can be used to predict the performance of complex systems like the hydrocyclone (Kutz, 2003). What makes this method attractive is its ability to learn through parallel processing (McMillan, 1999). Given sufficient experimental data, the ANN can identify underlying patterns in the data, which gives it the ability to predict the outcome for a given set of input parameters (Jain, 1996).
South Africa has a very large mining industry (Anglo American Platinum, 2013; Anglo Gold Ashanti, 2013). Improving the performance of the hydrocyclone could contribute to growth in the South African economy as mineral processing becomes more efficient. The use of hydrocyclones is not limited to the mining industry. These process units are also used globally in the petrochemical, environmental and food processing industries (Sripriya et al., 2007). Improved use of the hydrocyclone could thus have a large impact in various industries worldwide.
1.2 Problem statement
Inadequate control of the hydrocyclones on a mineral processing plant may lead to inefficiencies in the downstream process units, ultimately leading to a loss in profit for the company. Monitoring the on-line performance of a hydrocyclone is not a simple task. Inference sensors1 that make use of empirical models or artificial neural networks are possible solutions
1 The terms inference sensors and soft sensors are used interchangeably in this study
  • 16. School of Chemical and Minerals Engineering Introduction| 2 to this problem. A study is needed to determine which of these methods will be more appropriate for predicting hydrocyclone performance. 1.3 Aim and objectives 1.3.1 Aim Improve hydrocyclone efficiency by producing a soft sensor that has the ability to accurately predict hydrocyclone performance. 1.3.2 Objective Compare the predictive power of an empirical model, namely the Plitt model, with the predictive power of an artificial neural network trained with the backpropagation algorithm by making use of experimental hydrocyclone data. 1.3.3 Methodology  Do a literature study on the operation of the hydrocyclone, empirical models for predicting hydrocyclone performance and artificial neural networks;  Do a HIRA study before sampling on the hydrocyclone commences;  Devise a procedure for obtaining representative samples from the hydrocyclone;  Obtain more than 100 samples from the hydrocyclone;  Gather data on the samples’ PSD by analysing the samples with the Malvern Mastersizer 2000;  Process the data for it to be in a suitable form for inserting into an empirical model and an artificial neural network;  Develop the artificial neural network from the literature study that was previously conducted;  Find optimal architectures and parameters for the artificial neural network through trial- and-error;  Substitute the processed data into the empirical model and compare its output (d50c and sharpness of separation) with that of the experimental data;  Substitute the processed data into the trained artificial neural networks and compare the output to the experimental data;  Compare the two soft sensors, the empirical model and the artificial neural network, with each other and come to a conclusion over which is better for predicting hydrocyclone performance
  • 17. School of Chemical and Minerals Engineering Literature study| 3 Chapter 2 - Literature study
2.1 The hydrocyclone
Hydrocyclones are commonly used in the mineral industry for the classification of particles after grinding (Flinthoff et al., 1987). They are usually installed in a closed-circuit grinding unit where they are used to separate the undersize particles from the coarse particles (Kelly & Spottiswood, 1982:201). The coarse particles are returned to the grinder for further comminution while the undersize particles leave the circuit (Wills, 2006:224-225). Advantages of hydrocyclones include simple design, low operational costs and the capability of handling large volumes of pulp (Sripriya et al., 2007). Complex mechanical devices like spirals and rake classifiers have been replaced by cyclones2, owing to their simple structure that contains no moving parts (Napier-Munn et al., 2005:309). The applications of hydrocyclones are, however, not limited to the mineral industry, as they are also used in the chemical, power generation and textile industries, among others. By customising its structure, the hydrocyclone can be used for specific applications like (Svarovsky, 1984:1):
 Liquid clarification
 Slurry thickening
 Cleansing solid particles
 Elimination of gases from liquids
Classification of particles takes place due to the difference in settling velocities of the particles being classified. The settling velocities can be a function of either particle size and/or particle density, depending on whether a homogeneous or heterogeneous ore is classified (Kelly & Spottiswood, 1982:199). A homogeneous ore contains particles of similar densities. Particles of homogeneous ores will be classified according to their size (Flinthoff et al., 1987).
The feed enters the cylindrical section of the hydrocyclone tangentially, where it forms a vortex inside the cyclone's cone-shaped body. The fluid follows a helical path until it reaches the spigot, also known as the apex, where a portion of the downward flow leaves through the spigot as the underflow. The remaining downward flow follows an upward spiral, located on the inside of the outer vortex, and leaves via the vortex finder (Svarovsky, 1984:30-31). The reason for the formation of the upward spiral is not fully understood (Svarovsky, 1984:41).
Particles of similar density or size gather together due to the competition between the drag forces and centrifugal forces acting on these particles (Napier-Munn et al., 2005:309-310). If the density of the carrier liquid is lower than that of the solids being separated, the centripetal
2 The terms "hydrocyclone" and "cyclone" are used interchangeably in this study
  • 18. School of Chemical and Minerals Engineering Literature study| 4 force on the solid particles will be larger than the centripetal force of the liquid. On the other hand, the centripetal force acting on the particle will increase as the particle size increases for homogeneous ores (Hibbeler, 2010:131). The centripetal force acting on the particles dominates the drag force also acting on the particles in a radial direction. Larger particles thus reach the boundary layer, formed between the liquid and the wall of the cyclone, with more ease than the smaller particles. The particles in the boundary layer leave the cyclone via the apex under ideal conditions. The finer particles that could not reach the boundary layer by the time the apex is reached are transported to the inner spiral, where they leave through the vortex finder (Svarovsky, 1984:41).
Figure 2.1: Hypothetical flow inside the hydrocyclone viewed from the top of the hydrocyclone. Adapted from Plitt (1976)
Random turbulence, hindered settling and the interaction between the carrier liquid and the solid particles make describing the flow inside the hydrocyclone very difficult. Determining the separation performance of the hydrocyclone is thus not an easy task (Sripriya et al., 2007). The performance of the hydrocyclone is defined as the ability to separate particles into the desired size ranges (Kelly & Spottiswood, 1982:204). According to Svarovsky (1984) the separation performance of the hydrocyclone can be determined if the corrected cut size, d50c, and the sharpness of separation, m, can be calculated. This is done with the use of a corrected partition curve. The grade efficiency curve, also called the partition curve or the Tromp curve, is a plot of the particles in a certain size range, on the x-axis, vs. the fraction of
  • 19. School of Chemical and Minerals Engineering Literature study| 5 these particles in the feed leaving the hydrocyclone through the underflow (Frachon & Cilliers, 1999), as can be seen in Figure 2.2. The grade efficiency curve cannot be approximated from first principles and has to be determined by using experimental data (Svarovsky, 1984:17).
Figure 2.2: Corrected and non-corrected partition curve adapted from Schneider (2001)
The d50c, also known as the cut size, is the particle size that has an equal chance to exit the hydrocyclone through the vortex finder or through the underflow. The corrected cut size is used instead of the real cut size, as this gives a better indication of the separation forces that are present in the hydrocyclone. More information on the corrected partition curve follows later. The sharpness of separation, m, indicates how well the classification is taking place in the cyclone. The higher the value of m, the closer the hydrocyclone is to an ideal classifier (Napier-Munn et al., 2005:311).
In practice, some of the particles in the hydrocyclone, irrespective of their size, bypass classification. By controlling the operating conditions of the cyclone, these deviations from ideal separation can be lowered, but never eliminated (Napier-Munn et al., 2005:310-311). Two paths that could be followed for bypassing classification are mentioned below.
Small particles tend to stay suspended in the liquid which leaves the hydrocyclone through the underflow. According to Frachon and Cilliers (1999), Plitt (1976) and Svarovsky (1984:20) the fraction of small particles bypassing to the underflow is directly proportional to the liquid recovery to the underflow, Rf. A corrected partition curve is constructed to remove the effect of the bypass to the underflow, as can be seen in Figure 2.2. Another phenomenon that
  • 20. School of Chemical and Minerals Engineering Literature study| 6 causes the undersize particles to leave through the underflow is when the undersize particles are trapped in the boundary layer by the larger particles. The corrected partition curve constructed with the use of equation 2.1 might thus not be capable of taking into account all of the undersize particles leaving via the underflow. Another way in which classification can be bypassed is if particles near the vortex finder leave via the overflow (Svarovsky, 1984:40). No corrections are made on the partition curve to take this effect into account, but this effect will be borne in mind when the results are interpreted.

y' = \frac{y - R_f}{1 - R_f}    (2.1)

2.2 Hydrocyclone control
If a hydrocyclone is not operated to produce the desired overflow and underflow, it could lead to poor performance in downstream processes (Eren & Gupta, 1988). Fines in the underflow lead to overgrinding, while coarse material in the overflow can cause downstream separation problems (Aldrich et al., 2014). Slight changes in the operating conditions of the hydrocyclone could markedly affect its performance (Neesse et al., 2004). The operator of a hydrocyclone might not always be aware of the cyclone's underperformance and is, in addition, frequently incapable of returning the cyclone to its optimal operation. There is thus a need for methods to efficiently determine the performance of the cyclone while in operation (Napier-Munn et al., 2005:309).
Optimising the hydrocyclone is not an easy task, as the variables are often interlinked with each other. Models that are reasonably accurate are capable of finding the optimum operating conditions for the hydrocyclone even if variables like the split flow and pressure are, for example, dependent on each other (Napier-Munn et al., 2005:320).
2.2.1 Sensors used in hydrocyclone performance determination
Variables like the pressure drop over the cyclone, the flow rates in and out of the cyclone and the feed are commonly monitored while the cyclone is on-line. With all this information, the operator might still not be able to control the performance of the cyclone effectively (Napier-Munn et al., 2005; Aldrich et al., 2014). Numerous studies have been conducted to find a suitable method to control the performance of the hydrocyclone, many of which have not been widely used in the industry (Aldrich et al., 2014). Table 2.1 contains a list of current sensors that have been developed for determining hydrocyclone performance.
  • 21. School of Chemical and Minerals Engineering Literature study| 7 2.3 Soft sensors
One other way in which the performance can be monitored on-line is by developing a soft sensor, like an artificial neural network (Napier-Munn et al., 2005).
Table 2.1: Sensors used in the on-line monitoring of hydrocyclone performance
Sensor | Description
Acoustic sensors | An acoustic sensor was mounted externally on the hydrocyclone and, after a suitable model was found, could accurately predict various parameters like the solids concentration and the flow rate. Variables like the d50c and sharpness of separation are, however, not determined with this method (Hou et al., 1998).
Videographic measurement | A video camera was used to monitor the discharge angle of the hydrocyclone. The discharge angle of the cyclone is said to be linked to the performance of the cyclone (Concha et al., 1996; Neesse et al., 2004). Although this method has some challenges, it is a cost-effective way to determine the discharge angle with good accuracy (Janse van Vuuren et al., 2011).
Photographic measurement | Aldrich et al. (2014) used images and other experimental data from the underflow of an experimental hydrocyclone setup to develop a model that could identify the mean particle size in the underflow. Instead of using the discharge angle of the underflow like Janse van Vuuren et al. (2011), the textural information that the images provided of the underflow was utilised.
Measurement using a laser beam | A laser beam is pointed at the underflow of the cyclone and the reflection of the laser beam is measured with a camera to determine whether the cyclone is in the spray or roping state (Neesse et al., 2004).
Soft sensors use operational data from the plant to predict variables that are usually difficult and/or costly to measure on-line (Kadlec
  • 22. School of Chemical and Minerals Engineering Literature study| 8 et al., 2009). Two possible soft sensors for the control of the hydrocyclone, empirical models and artificial neural networks, will be discussed in this study.
2.3.1 Empirical models
Empirical models had to be developed in order to predict how the hydrocyclone will perform under certain conditions (Kraipech et al., 2005). Although Flinthoff et al. (1987) state that these models have been widely accepted, Chen et al. (2000) state that these models are not reliable. Coelho and Medronho (2000) reason that these models will only work well if the cyclone is operated in the range that was used to obtain the data to fit the models.
One popular empirical model that is used to predict the performance of a hydrocyclone is the Plitt model (Flinthoff et al., 1987). L.R. Plitt developed this model to be robust by gathering a large amount of experimental data. The Plitt model was also designed to take into account the theories around the complex flow of the hydrocyclone (Plitt, 1976). These theories include the residence time theory and the equilibrium orbit theory (Chen et al., 2000). The theories alone are incapable of describing the hydrocyclone performance (Napier-Munn et al., 2005:312). The data was gathered by operating a wide range of hydrocyclone geometries at different operating conditions (Plitt, 1976). This model is in general not very accurate in predicting the performance of hydrocyclones (Silva et al., 2009).
The Plitt model consists of four empirical equations. These equations are used to calculate the corrected cut size, the flow split between the underflow and overflow, the sharpness of separation and the pressure drop over the hydrocyclone (Plitt, 1976). Although the Plitt model is designed to work without calibration, Flinthoff et al. (1987) recommend inserting empirical constants, F1 – F4, that take into account the unique conditions under which the cyclone operates. Only one experimental data point is needed to tune these empirical constants. By default, the values of these constants are all equal to 1.

d_{50c} = F_1 \frac{39.7\, D_c^{0.46}\, D_i^{0.6}\, D_o^{1.21}\, \eta_v^{0.5}\, \exp(0.063\varphi)}{D_u^{0.71}\, h^{0.38}\, Q^{0.45} \left( \frac{\rho_s - 1}{1.6} \right)^{k}}    (2.2)

m = F_2\, 1.94 \exp\!\left( -\frac{1.58\,S}{1 + S} \right) \left( \frac{D_c^{2}\, h}{Q} \right)^{0.15}    (2.3)
  • 23. School of Chemical and Minerals Engineering Literature study| 9

P = F_3 \frac{1.88\, Q^{1.78} \exp(0.0055\varphi)}{D_c^{0.37}\, D_i^{0.94}\, h^{0.28} \left( D_u^{2} + D_o^{2} \right)^{0.87}}    (2.4)

S = F_4 \frac{3.29\, \rho_p^{0.24} \left( \frac{D_u}{D_o} \right)^{3.31} h^{0.54} \left( D_u^{2} + D_o^{2} \right)^{0.36} e^{0.0054\varphi}}{D_c^{1.11}\, P^{0.24}}    (2.5)

Where:
D_c = Cyclone diameter in cm
D_i = Inlet diameter in cm
D_o = Vortex finder diameter in cm
D_u = Underflow/apex diameter in cm
h = Free vortex height in cm
ρ_p = Density of the cyclone feed slurry in g/cm3
ρ_s = Density of the solid phase in g/cm3
η_v = Viscosity of the carrier fluid in cP
φ = Percentage solids in the feed
Q = Feed flow rate in liters per minute
d50c = Corrected cut size in microns
m = Sharpness of separation, which is dimensionless
P = Gauge pressure in kPa
S = Split flow. This is the volume of the underflow divided by the volume of the overflow and it is a dimensionless quantity
According to Plitt (1976), the PSD of the feed slurry has a negligible effect on the outcome of the d50c of the underflow. After determining the d50c and m with the Plitt model, these values can then be inserted into the Rosin-Rammler equation, equation 2.6, to obtain the corrected partition curve.
  • 24. School of Chemical and Minerals Engineering Literature study| 10

y' = 1 - \exp\!\left( -0.693 \left( \frac{d}{d_{50c}} \right)^{m} \right)    (2.6)

Where d is the particle size in microns and y' is the corrected fraction of a certain particle size that was recovered in the underflow.
2.3.2 Artificial neural networks
Various studies have been done on the use of artificial neural networks for the prediction of hydrocyclone performance and these have proven to be successful (Eren et al., 1997a; Eren et al., 1997b; Karimi et al., 2010).
The human brain has powerful learning, generalization and parallel computing abilities. It is desired to give computers the same abilities by copying the operating principles of brain cells and developing artificial neural networks (ANN) (Jain, 1996). ANNs are not limited to soft sensors. Awodele and Jegede (2009) reason that ANNs promise a wide range of new applications in areas such as education and medicine in the future. This is the reason why research in this field has been booming in the past few decades (Gallant, 1994:1).
Figure 2.3: Diagram of the computational nodes and weights of an artificial neural network adapted from Jain (1996)
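The Plitt and Rosin-Rammler equations above lend themselves to a compact implementation. The calculations in this study were performed in the attached Excel® spreadsheet; purely as an illustrative sketch (not that implementation), the same equations could be written in Python as below. Only the equations, the default fudging factors of 1 and the units follow the text above; the function and argument names, the structure of the code and the user-supplied exponent k are assumptions of the sketch.

```python
import numpy as np

def plitt_model(Dc, Di, Do, Du, h, Q, phi, rho_s, rho_p, eta_v, k,
                F1=1.0, F2=1.0, F3=1.0, F4=1.0):
    """Corrected cut size, sharpness, pressure drop and flow split from the
    (modified) Plitt equations 2.2-2.5. Units as in the text: diameters and h
    in cm, Q in liters per minute, densities in g/cm3, eta_v in cP."""
    # Equation 2.4: pressure drop over the cyclone (needed for equation 2.5)
    P = F3 * 1.88 * Q**1.78 * np.exp(0.0055 * phi) / (
        Dc**0.37 * Di**0.94 * h**0.28 * (Du**2 + Do**2)**0.87)
    # Equation 2.5: volumetric split between underflow and overflow
    S = F4 * 3.29 * rho_p**0.24 * (Du / Do)**3.31 * h**0.54 \
        * (Du**2 + Do**2)**0.36 * np.exp(0.0054 * phi) / (Dc**1.11 * P**0.24)
    # Equation 2.2: corrected cut size in microns
    d50c = F1 * 39.7 * Dc**0.46 * Di**0.6 * Do**1.21 * eta_v**0.5 \
        * np.exp(0.063 * phi) / (Du**0.71 * h**0.38 * Q**0.45
                                 * ((rho_s - 1.0) / 1.6)**k)
    # Equation 2.3: sharpness of separation
    m = F2 * 1.94 * np.exp(-1.58 * S / (1.0 + S)) * (Dc**2 * h / Q)**0.15
    return d50c, m, P, S

def corrected_partition(d, d50c, m):
    """Rosin-Rammler form of the corrected partition curve (equation 2.6)."""
    return 1.0 - np.exp(-0.693 * (d / d50c)**m)
```

Calling plitt_model with a measured operating point and comparing the returned d50c and m with the experimental values mirrors, in outline, what the attached Plitt-model spreadsheet does with the experimental data.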
  • 25. School of Chemical and Minerals Engineering Literature study| 11 An artificial neural network consists of computational units called nodes3. These nodes are located in sets called layers. Connections, called weights, connect the nodes of one layer to the following layer. Information transported through the weights can only travel in one direction. Figure 2.3 illustrates the computational nodes in a three-layer neural network4. The arrows represent the weights and their direction. Values, either positive or negative, are assigned to each of the weights. The magnitude of the value assigned to a weight determines how large the effect of the data transported through that weight will be on the neural network. The larger the magnitude of the weight, the larger the effect.
The input data travel through the weights to which they are connected. The data traveling through a weight are multiplied by the value of that weight. When the data reach hidden layer 1 through the weights, an input value to the node is calculated. More information on these calculations follows later. The input value is substituted into a function called an activation function, which calculates an output called an activation. The activation travels through the weights to the next nodes and the same operation is performed. This is done until the ANN produces its final output (Gallant, 1994:1). Parameters that influence the output of the network include:
 The number of layers
 The number of nodes in the hidden layers
 The activation function used in the nodes
 The values of the weights
 The number of input variables
2.3.2.1 Network topography
The network topology involves the arrangement of nodes and connections in the network. These arrangements can be classified into two main categories: feed-forward networks or feedback networks. In feed-forward networks, information can only be carried in one direction, from the input to the output. This type of network is mainly used for pattern recognition purposes. Figure 2.3 illustrates a feed-forward neural network. In feedback or recurrent networks, the information can either travel in the forward direction to the output or return in the input direction, i.e. make a loop (Awodele & Jegede, 2009). For the purposes of this study, a feed-forward structure will be used.
3 The words "nodes" and "neurons" are used interchangeably in this study.
4 The terms "neural networks" and "artificial neural networks" are used interchangeably in this study.
  • 26. School of Chemical and Minerals Engineering Literature study| 12 2.3.2.2 Other artificial neural network parameters
2.3.2.2.1 Initial weights
Initial weight values between -0.1 and 0.1 are randomly chosen. Assigning non-random weights could lead to weights that perform the same action, which does not lead to sufficient convergence. The weights need to be unique when initialising training to increase the chances of identifying the pattern in the data (Gallant, 1994:213). Another, more complex approach proposed by Gallant (1994:220) is to initialise the weights connected to a certain cell to a random value between -2/z and 2/z, where z is the number of weights connected to the cell.
2.3.2.2.2 Training speed
A large value for the training speed, ÎŒ, gives faster convergence. This convergence can, however, only be maintained up to a certain point, after which the network will become unstable and diverge. This is called overtraining. It is advised to choose a training speed that has a positive value no larger than 0.1. Although this results in slow training, the neural network has a better chance to find the local minimum (Gallant, 1994:220).
2.3.2.2.3 Momentum
Momentum is used to increase the training speed. The momentum term consists of the change in weight at the previous iteration, multiplied by the momentum parameter. An additional benefit of adding momentum is the removal of noise that might occur during weight updating. The weights thus converge smoothly (Gallant, 1994:221).
2.3.2.2.4 Number of hidden neurons
It is very common for the backpropagation algorithm used in industry to contain only one hidden layer, the main reason being that networks with more hidden layers learn very slowly. Neural networks with one hidden layer are known to be universal approximators. The only way to determine whether a network with multiple or a single hidden layer should be used is by trial-and-error (Gallant, 1994:221).
2.3.2.3 Machine learning
In order for an ANN to produce better results, an algorithm has to be written that gives the ANN the ability to adjust itself. This is called machine learning (Nag, 2010). The two main types of machine learning are supervised and unsupervised learning.
In supervised learning, the ANN is given input data to produce an output. The output the ANN produces for the given input data is evaluated against the desired output. If the output from the ANN does not match the desired output, the necessary adjustments are made with the use of
  • 27. School of Chemical and Minerals Engineering Literature study| 13 the learning algorithm (Gallant, 1994:6). The diagram in Figure 2.4 attempts to better describe what is meant by supervised learning.
Figure 2.4: Supervised learning adapted from Hagan et al. (2002)
In contrast, unsupervised learning is not provided with the desired output. Instead, unsupervised learning is used to adjust the ANN so that it can group data that show similar patterns (Gallant, 1994:7). Applications of unsupervised learning include finding the probability distribution of data and identifying groups of data that show similar properties and occur close together, i.e. cluster identification (Bishop, 2008:10). In this study, supervised learning will be used, as the experimental data from the hydrocyclone provide the ANN with input and the desired output.
2.3.2.4 Learning algorithms
There are a number of learning algorithms in existence that are used to adjust the neural network in order to achieve the desired output. Some algorithms include the perceptron learning algorithm, the radial basis function algorithm and the Boltzmann learning algorithm (Jain, 1996). The question arises: "Which algorithm would be fit for a certain application?" According to Jain (1996) the backpropagation algorithm, among others, is fit for use in control systems. Gallant (1994:225), on the other hand, reasons that trial-and-error has to be used to find the appropriate algorithm. From his experience, he found that one should first try to use a single-cell model before using a complex algorithm like the backpropagation algorithm.
A large problem that occurs in all systems is the presence of noise, which commonly occurs in real-world applications. Noise is the introduction of erroneous data into the data set. It could
  • 28. School of Chemical and Minerals Engineering Literature study| 14 either be that the data is false or absent (Gallant, 1994:9). Artificial neural networks, on the other hand, are capable of handling noise (Gallant, 1994:10).
2.3.2.5 Problems with artificial neural networks
2.3.2.5.1 Failure to generalise
The purpose of training an ANN is not so much to reproduce the exact values of the training data, but rather to develop a network that is capable of producing a general answer that would be expected in the training data range (Zhang et al., 2003; Bishop, 2008:332).
To explain the difference between good and bad generalization of ANNs, Bishop (2008:9-12) uses an analogy between the complexity of an ANN and the order of a polynomial (a polynomial of high or low order). Given a certain data set generated by adding random values to the output of a known function, say y = sin(x), two polynomials are used to fit the data. The one polynomial is of a high order and the other of a low order. The result of the first-order polynomial that was fit to the data can be seen in Figure 2.5. This corresponds to a neural network with only one hidden node that produces a bad fit to the data. A possible solution is to increase the number of free parameters. In the case of a neural network, the number of hidden nodes will be increased. As can be seen in Figure 2.6, the higher-order polynomial produces a good fit for all the data points. It is however a bad representation of the sine wave, as there are plenty of oscillations (Bishop, 2008:9-12).
Figure 2.5: Polynomial of first order produces a bad fit for the data. Reproduced from Bishop (2008:11)
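To make the analogy concrete, the behaviour described above can be reproduced numerically. The short sketch below is an illustration only; the noise level, random seed and polynomial orders are arbitrary choices, not values taken from Bishop or from this study. It fits polynomials of increasing order to noisy samples of y = sin(x) and reports how badly each one predicts unseen points.

```python
import numpy as np

rng = np.random.default_rng(0)
# Training data: y = sin(x) with random noise added, as in Bishop's analogy
x_train = np.linspace(0, 2 * np.pi, 15)
y_train = np.sin(x_train) + rng.normal(scale=0.2, size=x_train.size)

x_test = np.linspace(0, 2 * np.pi, 200)          # unseen points
for degree in (1, 3, 9):                         # too simple, reasonable, too flexible
    coeffs = np.polyfit(x_train, y_train, degree)
    test_error = np.mean((np.polyval(coeffs, x_test) - np.sin(x_test)) ** 2)
    print(f"degree {degree}: mean squared error on unseen points = {test_error:.3f}")
```

The first-order fit and the very high-order fit both generalize poorly, for the opposite reasons described in the text.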
  • 29. School of Chemical and Minerals Engineering Literature study| 15 To address this problem of finding a suitable complexity for the ANN, two concepts, the variance and the bias (not to be confused with bias weights), are used. The bias is a measure of the amount by which the overall average of the ANN output differs from that of the given data. Figure 2.5 has a high bias value while Figure 2.6 has a low bias value. The variance is used as a measure of how well the ANN output will fit another data set that does not include the ANN training data. A low variance value can be expected in Figure 2.5, while a high variance value can be expected in Figure 2.6. The variance and bias go hand in hand – an increase in the variance leads to a decrease in the bias and vice versa. The goal is to decrease the value of both the variance and the bias (Bishop, 2008:334-335).
Figure 2.6: Polynomial of high order producing something that looks like a good fit for all the data points, but the predictive power of the polynomial is sacrificed. Reproduced from Bishop (2008:12).
2.3.2.5.2 Regularization
Over-fitting is the result of weights with high values. In order to suppress the weights from obtaining large values, regularization is applied. In regularization, the error of the output is conditioned in order to produce a smoother output. This is done by adding a penalty term, Ω, to the error, E. The conditioned error, E~, can be calculated with the help of equation 2.7.

\tilde{E} = E + \upsilon\,\Omega    (2.7)

Bishop (2008:338) provides two ways in which the penalty term can be calculated. One of the two methods is the Tikhonov regularizer, which will not be discussed in this study. The other is
the weight decay method. In this method, the penalty term is equal to half the sum of the squares of all the weights and biases, as shown in equation 2.8.

\Omega = \frac{1}{2} \sum_i w_i^2 \quad (2.8)

The weight decay regularizer suppresses the weights from obtaining the large values which would cause over-fitting (Bishop, 2008:338-339).

2.3.2.5.3 Structural stabilization
Trial-and-error can be used to find a more suitable structure that has little complexity but still produces good results. One way of doing this is by varying the number of hidden nodes or by adding bias weights to the network (Gallant, 1994:221; Bishop, 2008:332).

2.3.2.5.4 More data points
The number of training data points and the number of possible curves that can fit through these training data points are inversely proportional to each other (Zhang et al., 2003). If one desires to train a complex network, for reasons such as more accurate results, one simply has to add more training data to the network. A neural network that has the ability to generalize should give output as displayed by the lower-order polynomial in Figure 2.7.

Figure 2.7: A lower order polynomial that has the capability to generalize well
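Returning to the regularization of section 2.3.2.5.2, the following is a minimal sketch of the conditioned error of equations 2.7 and 2.8; the weight values and the regularization coefficient Îœ below are purely illustrative assumptions.

```python
import numpy as np

def conditioned_error(error, weights, nu):
    """Equations 2.7 and 2.8: E~ = E + nu * Omega, with Omega = 0.5 * sum(w_i^2)."""
    omega = 0.5 * np.sum(np.square(weights))  # weight decay penalty (eq. 2.8)
    return error + nu * omega

# Illustrative values only
weights = np.array([0.8, -1.2, 2.5, 0.1])
print(conditioned_error(error=0.35, weights=weights, nu=1e-4))
```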
2.3.2.6 The backpropagation algorithm
As mentioned above, the backpropagation algorithm is one of the popular neural network training algorithms that is suitable for use in process control environments, and it was decided that this training algorithm will be used for the neural network in this study. It should be noted that the algorithm presented here is developed for a neural network with a single hidden layer. The sources used in the development of this artificial neural network include Jain (1996) and Basheer and Hajmeer (2000). The steps are as follows:
1. Choose the number of input, hidden and output nodes. This also determines how many weights there will be in the neural network architecture.
2. Assign random values to the weights.
3. Propagate the signal forward by multiplying the inputs to the neural network with the weights that connect the inputs to the hidden neurons, then sum the results of the weights that go to each of the hidden nodes to produce the signal that is sent to the specified node, as can be seen in equation 2.9.

\xi_j = \sum_{i=0}^{n} x_i w_{ij} \quad (2.9)

Where:
Ο_j = signal sent to node j
n = the number of weights attached to node j
x_i = input value from weight i
w_ij = weight connecting input node i to hidden node j
4. Substitute the input signal into the activation function. The sigmoid activation function was chosen and can be seen in equation 2.10.

\sigma_j^H = \frac{1}{1 + e^{-\xi_j}} \quad (2.10)

Where σ_j^H is the output value of the transfer function of node j in the hidden layer.
5. The output of node j, σ_j, is then fed forward to the next layer of nodes, the output nodes, where equation 2.11 is applied.
\xi_j = \sum_{i=0}^{n} \sigma_i^H w_{ij} \quad (2.11)

6. The signal to the output nodes, Ο_j, is again substituted into the sigmoid function, equation 2.12, to produce the output of the output layer nodes.

\sigma_j^O = \frac{1}{1 + e^{-\xi_j}} \quad (2.12)

Where σ_j^O is the output of the output layer nodes. Note that σ_j^O is equal to the x_j^O that will be mentioned shortly.
7. The error of the output neurons can then be calculated by comparing the output of the output neurons with the desired output of the training data, using equation 2.13 (Gupta & Lam, 1998).

\delta_j^O = (x_j^O - y_j)\, x_j^O (1 - x_j^O) \quad (2.13)

Where:
ÎŽ_j^O = error of the output node j
x_j^O = output of the output node j (again, note that x_j^O is equal to σ_j^O)
y_j = desired output of the output node j
8. The values by which the weights between the output layer nodes and the hidden layer nodes are changed can now be calculated with equation 2.14 (Gupta & Lam, 1998).

\Delta w_{ij}^O = \eta\,\delta_j^O\, x_j^O - M\, w_{ij}^O(t-1) - \eta R \left( \frac{w_{ij}^O(t-1)^2}{\left(1 + w_{ij}^O(t-1)^2\right)^2} \right) \quad (2.14)

Where:
Δw_ij^O = the value by which weight w_ij^O has to be updated
η = training speed defined by the user
ÎŽ_j^O = error of the output node j
x_j^O = output of the output node j
M = momentum factor defined by the user
R = regularization factor defined by the user
w_ij^O(t − 1) = the previous value of the weight w_ij^O
9. The new weight values can then be calculated with equation 2.15.

w_{ij}^O = w_{ij}^O(t-1) - \Delta w_{ij}^O \quad (2.15)

Where w_ij^O is the new value of the output weight that extends from node i in the hidden layer to node j in the output layer.
10. The next step is to calculate the error of the hidden nodes with the help of equation 2.16 (Gupta & Lam, 1998).

\delta_j^H = x_j^H (1 - x_j^H)\, w_{ij}^O(t-1)\, \delta_j^O \quad (2.16)

Where:
ÎŽ_j^H = error of the hidden node j
x_j^H = output of the hidden node j
11. Now that the error of the hidden layer nodes is known, the increment by which the weights that extend from the input layer to the hidden layer have to change can be calculated with equation 2.17 (Gupta & Lam, 1998).

\Delta w_{ij}^H = \eta\,\delta_j^H\, x_j^H - M\, w_{ij}^H(t-1) - \eta R \left( \frac{w_{ij}^H(t-1)^2}{\left(1 + w_{ij}^H(t-1)^2\right)^2} \right) \quad (2.17)

Where:
Δw_ij^H = the value by which weight w_ij^H has to be updated
ÎŽ_j^H = error of the hidden node j
x_j^H = output of the hidden node j
w_ij^H(t − 1) = the previous value of the weight w_ij^H
12. The new weights can then be calculated with equation 2.18.

w_{ij}^H = w_{ij}^H(t-1) - \Delta w_{ij}^H \quad (2.18)

The above steps are repeated with the data from a new sample. An epoch is completed once the artificial neural network has gone through the entire set of training data; a new epoch is started by going through the training data set again.
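The sketch below is a compact, illustrative Python rendering of steps 1 to 12 for a single hidden layer; it is not the study's own implementation (that is the code in Appendix E). Where the report's notation is ambiguous, conventional choices are assumed: explicit bias inputs are omitted, the gradient term uses the output of the upstream node, the momentum term acts on the previous weight increment rather than the previous weight value, and the regularization term uses the derivative form w/(1 + wÂČ)ÂČ of the weight penalty in equations 2.14 and 2.17.

```python
import numpy as np

def sigmoid(xi):
    """Equations 2.10 / 2.12: logistic activation."""
    return 1.0 / (1.0 + np.exp(-xi))

class OneHiddenLayerNet:
    """Minimal single-hidden-layer network trained with backpropagation,
    with optional momentum (M) and regularization (R) terms."""

    def __init__(self, n_in, n_hidden, n_out, eta=0.02, M=1e-6, R=0.0, seed=0):
        rng = np.random.default_rng(seed)
        # Steps 1-2: choose layer sizes and assign small random weights
        self.w_h = rng.normal(scale=0.5, size=(n_in, n_hidden))   # input -> hidden
        self.w_o = rng.normal(scale=0.5, size=(n_hidden, n_out))  # hidden -> output
        self.dw_h = np.zeros_like(self.w_h)
        self.dw_o = np.zeros_like(self.w_o)
        self.eta, self.M, self.R = eta, M, R

    def forward(self, x):
        # Steps 3-6: propagate the signal through hidden and output layers
        h = sigmoid(x @ self.w_h)          # eqs. 2.9 and 2.10
        o = sigmoid(h @ self.w_o)          # eqs. 2.11 and 2.12
        return h, o

    def train_sample(self, x, y):
        h, o = self.forward(x)
        # Step 7: output-layer error (eq. 2.13)
        delta_o = (o - y) * o * (1.0 - o)
        # Step 10: hidden-layer error (eq. 2.16), back-propagated through w_o
        delta_h = h * (1.0 - h) * (delta_o @ self.w_o.T)
        # Steps 8 and 11: weight increments; gradient term, momentum term
        # and a weight-elimination style regularization term
        grad_o = np.outer(h, delta_o)
        grad_h = np.outer(x, delta_h)
        self.dw_o = (self.eta * grad_o + self.M * self.dw_o
                     + self.eta * self.R * self.w_o / (1.0 + self.w_o ** 2) ** 2)
        self.dw_h = (self.eta * grad_h + self.M * self.dw_h
                     + self.eta * self.R * self.w_h / (1.0 + self.w_h ** 2) ** 2)
        # Steps 9 and 12: update the weights (eqs. 2.15 and 2.18)
        self.w_o -= self.dw_o
        self.w_h -= self.dw_h
        return 0.5 * np.sum((o - y) ** 2)

def train_epoch(net, X, Y):
    """One epoch = one pass through the full training set."""
    return sum(net.train_sample(x, y) for x, y in zip(X, Y))
```

Calling train_epoch repeatedly corresponds to running successive epochs through the training data set, as described above.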
Chapter 3 - Experimental procedure

3.1 Overview
Various slurries were prepared to be fed to the hydrocyclone, and different operating conditions were imposed on the hydrocyclone. All necessary operating conditions, samples and other data were recorded on each run. A particle size distribution (PSD) analysis was then carried out on the samples. The gathered information could then be used to determine the d50c and the sharpness of separation.

3.2 Raw materials
The solid particles that had to be separated were micron-sized silica quartz particles, MQ15, supplied by Micronized SA Limited. According to Tew (2012) the particles contain 98.50% silica, with small amounts of Al2O3, K2O, Fe2O3, CaO and Na2O. The particles have a density of 2650 kg/m3, a d50 of 20 microns and an m value of 1.9. The carrier fluid used in this case was municipal water from the Tlokwe municipality.

3.3 Equipment
‱ 3X 5 litre buckets
‱ 2X 20 litre buckets
‱ 1X water gun
‱ 1X Marcy scale
‱ 1X Doppler flow meter
‱ 50X poly tops
‱ 1X large syringe
‱ 1X spoon

3.4 Experimental setup
A diagram of the experimental setup is shown in Figure 3.1. The geometry of the hydrocyclone that was used in this study is given in Table 3.1. The alphabetical labels in Figure 3.1 are explained below:
‱ A: Slurry storage tank
‱ B: Circulation pump
‱ C: Main feed bypass valve
‱ D: Feed fine tune bypass valve
‱ E: Feed shutdown valve
‱ F: Pressure gauge
‱ G: Doppler flow meter
‱ H: Hydrocyclone
‱ I: Hydrocyclone overflow
‱ J: Hydrocyclone underflow
‱ K: Sample taken from the hydrocyclone overflow
‱ L: Sample taken from the hydrocyclone underflow
‱ M: Mixer

Figure 3.1: Diagram of the hydrocyclone setup
The mixer mentioned above consists of square tubing that transports the fluid from the fine tuning bypass valve to the bottom of the storage tank. Holes were made at the end of the square tubing so that the slurry is sprayed towards the sides of the storage tank to promote better mixing. Two sampling containers are located above the storage tank and below the overflow and underflow outlets. As soon as the underflow container is pushed in under the underflow, a mechanism pushes the overflow pipe into the overflow container, meaning that the overflow and the underflow are sampled simultaneously. The experimental hydrocyclone setup can be seen in Figure 3.2, on which the two containers that store the underflow and the overflow are also indicated.

Figure 3.2: The experimental hydrocyclone setup

Where:
‱ A: Hydrocyclone
‱ B: Hydrocyclone overflow
‱ C: Hydrocyclone underflow
‱ D: Sampling container for the underflow
‱ E: Sampling container for the overflow
‱ F: Slurry storage tank

Table 3.1: Hydrocyclone geometry
Part    Size
Dc      10 cm
Di      3.03 cm
Do      3.4 cm
h       53 cm

3.5 Experimental procedure
3.5.1 Preparation
3.5.1.1 Doppler flow meter calibration
The Doppler flow meter was installed at a suitable place where minimal noise due to turbulence in the piping would occur. The storage tank was initially loaded with water only. After the pump was turned on, one person read the value from the Doppler flow meter display, while the other person filled the underflow and overflow containers with the water coming from the hydrocyclone. The sum of the underflow and the overflow of the hydrocyclone is equal to the feed to the hydrocyclone. The person filling the underflow and overflow buckets also kept track of the time in which the containers were filled. From the volume of water collected and the time over which it was collected, the real feed flowrate to the hydrocyclone can be calculated, and the Doppler flow meter was calibrated accordingly. This procedure was repeated until the error between the Doppler flow meter reading and the flowrate measured with the container-and-stopwatch method was small enough.
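As a small illustration of the container-and-stopwatch check (the numbers below are made up and are not measurements from this study), the true feed flowrate and the meter's relative error can be computed as follows:

```python
# Illustrative numbers only - not measured values from the study
underflow_l = 14.2           # litres collected from the underflow
overflow_l = 23.6            # litres collected from the overflow
fill_time_s = 30.0           # stopwatch time, seconds
meter_reading_l_min = 74.0   # Doppler flow meter display, litres per minute

true_feed_l_min = (underflow_l + overflow_l) / fill_time_s * 60.0
relative_error = (meter_reading_l_min - true_feed_l_min) / true_feed_l_min
print(f"true feed = {true_feed_l_min:.1f} l/min, meter error = {relative_error:.1%}")
```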
3.5.1.2 Marcy scale calibration
The Marcy scale is a handy tool that can be used to determine the density of a slurry mixture. A picture of the Marcy scale is shown in Figure 3.3. Before use, it has to be calibrated with the above-mentioned municipal water; the density value is set to 1000 kg/m3 when calibrated.

Figure 3.3: The Marcy scale

3.5.1.3 Slurry preparation
One of the variables that has to be monitored is the volumetric percentage of solids in the feed. The slurry tank (storage tank) was first filled with 200 litres of municipal water. The mass of silica sand that has to be added to the tank to obtain a certain volumetric solids percentage is calculated with equation 3.1.

m_s = \frac{\varphi V_w}{\frac{1}{\rho_s} - \varphi \frac{1}{\rho_s}} \quad (3.1)

Where:
m_s = mass of silica sand that has to be added to the storage tank
φ = desired volumetric solids content of the slurry (as a fraction)
ρ_s = density of silica sand = 2650 kg/m3
V_w = volume of water in the tank = 200 litres = 0.2 m3

3.5.2 Sampling
Step-by-step instructions for obtaining samples from the rig5 are given in this section. These steps can only be followed after the preparation described in section 3.5.1 has been completed:
1. Make sure that the valve between the pump opening and the storage tank exit is fully opened;
2. Make sure that there are no objects in the storage tank that could cause pump failure;
3. Close the feed shutdown valve;
4. Close both of the valves of the overflow and underflow containers;
5. Fully open both the feed bypass valves;
6. Turn on the pump;
7. The slurry from the bypass valves will cause plenty of turbulence in the storage tank. It is, however, recommended that the storage tank also be mixed manually to ensure that most of the silica particles are suspended in the slurry;
8. Fully open the feed shutdown valve;
9. Slowly close the feed bypass valves while keeping an eye on the pressure gauge. Stop closing the bypass valves as soon as the required pressure is reached;
10. One person has to take note of the flow rate, while the other person has to push in the underflow sampling container; this has to be done at the same time. The person recording the flow rate from the Doppler flow meter also has to start and stop a stopwatch when the containers are inserted and when they are pulled out again;
11. As soon as the underflow and overflow containers have been pulled out, the pump may be stopped;
12. Separate buckets have to be placed under the hoses that are connected to the outlet valves of the underflow container and the overflow container;
13. Slowly open the outlet valves of the overflow and underflow containers and collect the underflow and overflow samples in the separate buckets. The content of the underflow and overflow containers has to be stirred well while the outlet valves are open, to prevent silica sand from settling and remaining in the underflow or overflow containers;

5 The terms "rig" and "hydrocyclone experimental setup" are used interchangeably in this study.
14. The buckets have to be weighed separately on a scale. The scale should previously have been tared with the mass of the buckets that are used; identical buckets thus have to be used. The mass of the content inside the buckets is recorded;
15. A smaller sample of the overflow and the underflow is taken by mixing the slurry in the buckets and filling a poly top with the content. The poly tops should be labelled thoroughly;
16. The remaining content in the buckets is again stirred before the Marcy scale bucket is filled with the slurry. The Marcy scale bucket is put on the Marcy scale to determine the density of the slurry. The densities of both slurries have to be determined this way. The buckets containing the remaining slurry are emptied into the slurry storage tank of the rig.
The above steps are then repeated for the next sample.

3.5.3 Analysing
The particle size distribution of the underflow samples was determined with the Malvern Mastersizer 2000. The particles are circulated through the Mastersizer, where they eventually pass through a laser beam and scatter some of its radiation. The intensity of the light scattered back from the particles is measured with special backscatter detectors. The angle at which the light is scattered is inversely proportional to the size of the silica particles (Malvern instruments, 2005). Figure 3.4 is a picture of the Malvern Mastersizer 2000.

Figure 3.4: Malvern Mastersizer 2000
3.5.4 Experimental error
For the experimental error determination, 4 random operating conditions were chosen from all the experiments that were conducted, and six runs were completed at each of these operating conditions. A total of 24 experiments were thus completed in order to determine the experimental error. The conditions at which each of the sets were run, as well as the results, can be found in Appendix D. All the calculations that were done in the determination of the experimental error can be found in the electronically attached spreadsheet named "Experimental Error".
It is assumed that the data follow a normal distribution. Due to the small number of available data points in each set, the experimental error had to be determined using the Student's t distribution (Devore & Farnum, 2005:313-318). The experimental error is given by equation 3.2.

t_{n-1}\!\left(\tfrac{\alpha}{2}\right) \times \frac{S_t}{\sqrt{n_u}} \quad (3.2)

Where:
t_{n−1}(α/2) = critical t value that can be obtained from the back cover of Devore and Farnum (2005)
S_t = standard deviation of the data
n_u = number of data points available in the set

A 95% confidence interval was used to obtain the experimental error. The processed experimental data that were used for the determination of the experimental error are given in Table 3.2.

Table 3.2: Processed data used for the experimental error determination
Data    Set 1           Set 2           Set 3           Set 4
        d50c    m       d50c    m       d50c    m       d50c    m
1       21.71   1.35    28.41   2.03    17.72   1.13    29.58   1.44
2       16.59   1.37    24.51   1.92    21.91   1.21    30.83   1.66
3       18.63   1.49    29.57   1.89    22.02   1.21    30.99   1.69
4       -       -       -       -       22.26   1.30    31.44   1.71
5       -       -       -       -       23.41   1.34    31.66   2.00
Valid   18.44   1.29    26.03   1.97    21.82   1.16    29.74   1.60

Two of the data points in each of sets 1 and 2 were discarded due to their large deviation from the rest of the data in the set. One data point in each set was used as a validation data point. The values that were substituted into equation 3.2 to calculate the experimental error of each set are given in Table 3.3. The d50c experimental error for each data set and the sharpness of separation error for each data set are shown in Figure 3.5 and Figure 3.6 respectively.

Table 3.3: Values for substitution into the Student's t equation
        Set 1           Set 2           Set 3           Set 4
        d50c    m       d50c    m       d50c    m       d50c    m
n       3       3       3       3       5       5       5       5
S       2.58    0.076   2.65    0.075   2.17    0.073   0.811   0.20
X6      18.98   1.40    27.50   1.95    21.46   1.24    30.90   1.70
t(95%)  4.303   4.303   4.303   4.303   2.776   2.776   2.776   2.776

6 X is the average of the data in the specific set
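As an illustration of equation 3.2 (using the Set 1 d50c values in Table 3.3), the error at the 95% confidence level is

t_{2}\!\left(\tfrac{\alpha}{2}\right) \times \frac{S_t}{\sqrt{n_u}} = 4.303 \times \frac{2.58}{\sqrt{3}} \approx 6.4\ \mu m,

which is consistent with the large d50c errors noted in Figure 3.5.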
Figure 3.5: Experimental error of the d50c with a 95% confidence interval

Figure 3.6: Experimental error of the sharpness of separation with a 95% confidence interval

Large errors are observed for the d50c values in each of the sets in Figure 3.5. This could be ascribed to the varying feed PSDs that will be dealt with later in this paper. The experimental errors of the sharpness of separation, as seen in Figure 3.6, are however acceptable.
Chapter 4 - Model development

4.1 Overview
121 samples were processed to be put through the artificial neural network and the Plitt model. The raw data needed to be processed before it was fit for use in the artificial neural network and the Plitt model; for more information on how the data was processed, please refer to Appendix A.

4.2 The Plitt model
As mentioned before, the modified Plitt model with the fudging factors will be used in an attempt to predict the d50c and the sharpness of separation of the hydrocyclone operated under certain conditions. Of the 121 data points, 69 samples were used to fit the fudging factors with the help of the ExcelÂź add-in, Solver. The input parameters from the experimental data that were not used in the tuning of the fudging factors were then substituted into the Plitt model, and the d50c and sharpness of separation results from the Plitt model were compared to the corresponding experimental results. The Plitt model calculations can be found in the electronically attached spreadsheet named "Plitt model".

4.2.1 Split flow
As mentioned before, this paper will only focus on predicting the d50c and the sharpness of separation, m. The processed input data from Appendix C were inserted into the d50c and sharpness of separation equations of the Plitt model. For the split flow variable, S, of the d50c equation, either the experimentally calculated S or the split flow calculated with the Plitt model equation given in equation 4.1 could be used.
The value of F4 was determined by minimizing the error between 69 of the experimental and calculated split flow values with the help of the ExcelÂź add-in, Solver. The resulting value of F4 was found to be 0.13. The remaining experimental values were then compared with the corresponding Plitt model values under the same operating conditions. The results of this investigation are presented in Figure 4.1. Very small deviations from the experimental split flow values are observed, meaning the split flow values from the Plitt model are suitable for further use.
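The study fitted the fudging factors with Excel's Solver; the sketch below shows an equivalent least-squares fit of a single fudging factor in Python with SciPy. The Plitt split-flow correlation itself is left as a placeholder function, because its full form is not reproduced in this section, and the data arrays are hypothetical stand-ins for the 69 training samples.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def plitt_split_flow(F4, operating_conditions):
    """Placeholder for the Plitt split-flow correlation (equation 4.1).
    The real expression would be evaluated here; only the multiplicative
    fudging factor F4 is shown explicitly."""
    return F4 * operating_conditions  # stand-in for the full correlation

# Hypothetical arrays: one entry per training sample
conditions = np.linspace(5.0, 15.0, 69)
S_experimental = 0.13 * conditions + np.random.default_rng(1).normal(0, 0.05, 69)

def sse(F4):
    residuals = S_experimental - plitt_split_flow(F4, conditions)
    return np.sum(residuals ** 2)

result = minimize_scalar(sse)   # same idea as Excel Solver's minimisation
print(f"fitted F4 = {result.x:.3f}")
```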
4.2.2 Cut size – d50c
The d50c value was calculated with equation 2.2. Just as with the split flow, 69 experimental data points were used to adjust the value of F1. There is, however, another variable, k, that could be adjusted in this equation. It was observed that Solver could vary either F1 or k to obtain a minimum error. A value of 0.5 was arbitrarily chosen for k, while F1 was varied. The resulting value for F1 is 64.9.

4.2.3 Sharpness of separation
The fudging factor of the sharpness of separation was determined in the same way as the above-mentioned fudging factors.

4.3 The artificial neural network
The backpropagation algorithm will be used to train the neural network. A few modifications were made to the ANN, namely the addition of a regularization term and a momentum term; both terms can be seen in equations 2.14 and 2.17. All artificial neural networks that were constructed had various input variables and only one output variable. The output variable was either the d50c or the sharpness of separation.

4.3.1 Artificial neural network architecture
Six different artificial neural networks have been written. The number of hidden neurons in each of these networks can be varied between 1 and 20, while the number of input and output neurons cannot be changed. Table 4.1 displays a list of all the ANNs that have been programmed. All these programs can be found in the attached folder named "Artificial neural networks".

Figure 4.1: Experimental split flow values plotted with the predicted Plitt model split flow values
Table 4.1: Different artificial neural networks that were programmed
Neural network number   Input variables         Output variable
1                       Du                      d50c
2                       Du, φ and Q             d50c
3                       Du, φ, Q, P and S       d50c
4                       Du                      m
5                       Du, φ and Q             m
6                       Du, φ, Q, P and S       m

Separate neural networks for the d50c and the sharpness of separation were constructed, as neural networks that had both these variables as output lacked the ability to learn. It was decided that the first ANN for both the d50c output and the sharpness of separation output should only have the spigot diameter as input variable, as it is known that this variable has the largest effect on the hydrocyclone performance. In this study, the spigot diameter was changed by switching off the pump and manually inserting a new spigot with a different diameter. In industry, this would however be impractical. In a study conducted by Eren and Gupta (1988), the spigot size could be adjusted pneumatically while the cyclone was on-line. This study will thus be applicable to hydrocyclones whose spigot size can be changed while the cyclone is on-line.
The second set of neural networks contained the same inputs that are needed in the Plitt model – the volumetric percentage solids in the feed, φ, and the feed volumetric flowrate, Q. These neural networks and the Plitt model are thus on equal grounds and can be compared with one another. For the third and last set of neural networks, the split flow and the pressure drop over the cyclone were added as inputs to test whether the predictive power of the neural network would improve.
Each neural network that was constructed had the ability to test 20 different architectures with one click of a button. The networks could thus be run on multiple computers at the same time, so that more neural networks could be tested in a shorter amount of time in comparison with MATLAB¼'s Neural Network Toolboxℱ.
Each of the neural networks was trained with roughly 75% of the experimental data. The remaining 25% of the data was used as validation data, and the validation data were used for all the results that are displayed in chapter 5. None of the training data were thus used for validation purposes.
To display the learning capability of the developed neural networks, a neural network that had the spigot diameter as input parameter and the sharpness of separation as output parameter was trained with 80 epochs and a training speed of 0.02. The results are displayed in Figure 4.2. The reader is referred to Appendix E for the source code of one of the artificial neural networks.

Figure 4.2: Learning capability of one of the 6 developed artificial neural networks
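A minimal sketch of the roughly 75/25 split described above is given below; the actual split in the study was done in the spreadsheet implementation, and the data matrix and column layout here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data matrix: one row per sample, columns = [Du, phi, Q, P, S, d50c]
data = rng.random((121, 6))

indices = rng.permutation(len(data))
n_train = int(round(0.75 * len(data)))            # roughly 75 % for training
train, validation = data[indices[:n_train]], data[indices[n_train:]]

X_train, y_train = train[:, :5], train[:, 5]      # inputs and d50c target
X_val, y_val = validation[:, :5], validation[:, 5]
print(len(train), "training samples,", len(validation), "validation samples")
```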
Chapter 5 - Results and discussion

After processing the data, it was found that the PSD of the feed varied considerably; an alternative way of handling the feed PSD is dealt with in this section. As mentioned before, the neural network program was written in order to make it convenient for the user to test multiple neural network architectures at once. This functionality was used to filter out the more suitable neural network architectures for predicting the cut size and the sharpness of separation, and these filtered-out neural networks were then further optimised. The results, as well as a discussion of these results, are given in this section.

5.1 Deviations in the feed PSD
From the start of the sampling and analyses, it was assumed that the feed PSD remained constant for all the slurry batches, as the same silica sand product from the same manufacturer was used each time. It thus only seemed necessary to sample and determine the PSD of the feed once and to sample the underflow of each run, instead of sampling both the underflow and the overflow of each run. This meant the total number of PSD analyses could be cut in half. The resulting partition curves, however, had partition values that exceeded 1 or were lower than 0, which means that the material balance did not close.

Figure 5.1: Particle size distribution of 25 different feed samples

After
taking samples of 25 different slurry mixtures7, it was found that the PSDs differed significantly from each other, as can be seen in Figure 5.1. This meant that the partition curve could no longer be calculated from one feed PSD sample. A solution to this problem was to calculate, for each of the underflow samples that were analysed, 25 different partition curves from the feed PSDs shown in Figure 5.1. One of the 25 partition curves then had to be chosen. The chosen partition curve had to fulfil two criteria. Firstly, there may not be a value on the partition curve that exceeds 1, as this would mean that more of a certain size of particles exits the cyclone than entered it. Secondly, according to the literature study, the correction made to the partition curve in order to obtain the corrected partition curve is equal to the recovery of water to the underflow; the partition curve thus also has to intersect the y-axis at a value that is close to the value of Rf.
Unfortunately, the Mastersizer was incapable of accurately measuring particle sizes smaller than 8.4 ÎŒm. The curve in Figure 5.2 shows the large fluctuations that occur at particle sizes smaller than 8.4 ÎŒm; this phenomenon occurred in all the partition curves. According to the results from the Mastersizer, the particles under 8.4 ÎŒm amounted to 0.1% of the total particles, so the values of these particles will be neglected. The partition curve value at 8.4 ÎŒm will thus be taken as the recovery of liquid to the underflow.

Figure 5.2: Example partition curve before justifications

7 The words "batch" and "slurry mixture" are used interchangeably.
A similar phenomenon was observed for particles larger than 95 ÎŒm. These particles amounted to less than 0.05% of the total particles, so it is also a safe assumption to ignore these particle sizes in further calculations.
After the partition curve that suited the description above was chosen, small changes were made to the value of Rf so that it would be equal to the experimental Rf value. These small changes can be seen in Figure 5.3.

Figure 5.3: Experimental vs. adjusted values of Rf

5.2 Plitt model
5.2.1 Cut size – d50c
The d50c results of the modified Plitt model are displayed in Figure 5.4. The blue line connects the experimental data points, while the orange line connects the points that were predicted by the Plitt model. The results are displayed in another form in Figure 5.5, where the predicted vs. actual values are plotted over the y = x line. To determine how well the data fit the y = x line, the coefficient of determination was calculated; this resulted in an RÂČ value of 0.664.
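For reference, the coefficient of determination used here can be computed as in the generic sketch below; the arrays are placeholders and not the study's actual validation values.

```python
import numpy as np

def r_squared(actual, predicted):
    """Coefficient of determination of the predictions against the y = x line."""
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - np.mean(actual)) ** 2)
    return 1.0 - ss_res / ss_tot

# Placeholder arrays - in the study these would be the 44 validation d50c values
actual = np.array([18.2, 20.5, 24.1, 16.8])
predicted = np.array([17.5, 21.3, 25.0, 15.9])
print(f"R^2 = {r_squared(actual, predicted):.3f}")
```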
Figure 5.4: Experimental cut point plotted with the cut point predicted by the Plitt model

Figure 5.5: Plitt model predicted cut size vs. experimental cut size plotted over the y=x curve
From Figure 5.4, it is clear that the Plitt model was capable of predicting the cut point to a certain extent. At the larger d50c values the Plitt model tends to overpredict the d50c, while the opposite is true for the smaller d50c values. The combined absolute error for the 44 validation data points is 71.5 ÎŒm; the average error per predicted d50c value is thus 1.6 ÎŒm, which is acceptable.

5.2.2 Sharpness of separation
The sharpness of separation results from the Plitt model can be seen in Figure 5.6. Again, the experimental values for the sharpness of separation are connected by the blue line, while the predicted values are connected by the orange line.

Figure 5.6: Experimental sharpness of separation plotted with the sharpness of separation predicted by the Plitt model
Figure 5.7: Plitt model predicted m vs. experimental m plotted over the y=x curve

From Figure 5.6 and Figure 5.7, it is clear that the Plitt model is incapable of predicting the sharpness of separation; deviations with an absolute value of 2 can easily be seen in these figures.

5.3 Artificial neural networks
Various experiments were conducted in order to determine which neural network architecture and parameters would be best suited for predicting the d50c and the sharpness of separation. The same architectures and parameters were tested on both the d50c and the sharpness of separation.
In the first series of tests, the number of epochs and the training speed were held constant while the number of neurons in the hidden layer was varied between 3 and 20. Below 3 hidden neurons, the neural network lacked the complexity to adequately predict the d50c and the
sharpness of separation. All six neural networks mentioned in Table 4.1 were tested with these architectures and parameters. The architecture and parameters that were revealed to be the best of those tested underwent further testing, in which the number of epochs was increased by orders of magnitude and the training speed was decreased so as to increase the chances of finding the global minimum. The momentum and regularization terms were tested with the same architecture and parameters as those used by the neural network mentioned in the previous paragraph.

5.3.1 Cut size – d50c
5.3.1.1 Neural network screening
The results of training the neural network with only the spigot diameter, Du, as input are given in Figure 5.8. Overtraining8 occurred in all of the tests except for the neural network that had 3 hidden neurons. It should be noted that the neural networks stopped training as soon as overtraining started. The results in Figure 5.8 show that a simple neural network with no more than 6 hidden neurons had the best prediction capabilities. Adding more hidden neurons tends to overcomplicate the network, leading to poorer results.

Figure 5.8: Results of neural network 1 trained with a training speed of 0.2 and a maximum amount of epochs of 8000

8 Overtraining takes place when the artificial neural network stops converging to an answer and starts to diverge.
By adding more input parameters to the neural network, even better results are achieved; these results can be seen in Figure 5.9. All networks in this test were trained until overtraining commenced. A maximum error of just above 1.35 ÎŒm per validation data point was achieved with this neural network.

Figure 5.9: Results of neural network 2 trained with a training speed of 0.2 and a maximum amount of epochs of 8000

Two more input parameters, the split flow and the pressure drop over the cyclone, were then inserted. The addition of these two parameters produced better results than the previous tests, although no trend could be observed in the absolute error as the number of hidden neurons was increased. The results are given in Figure 5.10.

Figure 5.10: Results of neural network 3 trained with a training speed of 0.2 and a maximum amount of epochs of 8000
5.3.1.2 Enhancing the neural network
From the results in Figure 5.8, Figure 5.9 and Figure 5.10, it is clear that the predictive power of the neural network increases with an increase in the number of inputs. It was thus decided to further develop neural network 3 for predicting the d50c of the hydrocyclone.
The neural network was given 12 hidden neurons and first trained with a maximum of 60000 epochs and a training speed of 0.02. It was expected that the absolute error observed in Figure 5.10 would decrease; instead, the error increased by almost 1.6 ÎŒm to 47.19 ÎŒm. An explanation for this phenomenon could be that this neural network happened to step over the local minimum that was found by the neural network in Figure 5.10. Another representation of the results is given in Figure 5.12. Although there are some neural network output values that differ by 2 ÎŒm from the experimental d50c values, Figure 5.11 shows that the neural network has adequate prediction power.

Figure 5.11: Calculated d50c plotted with the experimental d50c values of neural network 3 trained with a maximum of 60000 epochs and a training speed of 0.02
Figure 5.12: Predicted d50c vs. experimental d50c plotted over the y=x curve for the neural network trained with a maximum of 60000 epochs and a training speed of 0.02

For the next enhancement, the momentum term was used. The momentum constant was given a value of 1 × 10⁻⁶; the other parameters and the architecture of the neural network remain unchanged. A significant reduction of more than 3 ÎŒm in the combined error was observed when compared to the previous test; the value of the combined error in this case is 44.15 ÎŒm. It can also be seen in Figure 5.14 that the RÂČ value decreased by 0.05 to 0.795. When looking at the calculated and experimental graph in Figure 5.13, certain improvements can be spotted; as an example, the last validation data point now lies on the predicted d50c value, which was not the case in Figure 5.11.
Figure 5.13: Calculated vs. experimental values of neural network 3 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the momentum term

Figure 5.14: Predicted d50c vs. experimental d50c plotted over the y=x curve for a neural network trained with a maximum of 60000 epochs and a training speed of 0.02
For the final neural network enhancement, the momentum term was deactivated while the regularization term was activated. The regularization constant was set to a value of 1 × 10⁻⁎. The regularization term only produced slight improvements when compared to the initial neural network enhancement; the resulting combined absolute error was 46.45 ÎŒm. The validation results are displayed in Figure 5.15, and the predicted vs. experimental d50c can be seen in Figure 5.16. Only slight differences are observed between the graphs of Figure 5.13 and Figure 5.15.

Figure 5.15: Calculated vs. experimental values of neural network 3 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the regularization term
Figure 5.16: Predicted d50c vs. experimental d50c plotted on the y=x curve for the neural network trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the regularization term

5.3.2 Sharpness of separation
5.3.2.1 Screening of neural networks
Screening of the neural networks with the sharpness of separation as output was done in the same way as the screening of the d50c neural networks. The results of neural networks 4, 5 and 6 are given in Figure 5.17, Figure 5.18 and Figure 5.19 respectively. Most of the networks were trained until overtraining commenced.
Figure 5.17: Results of neural network 4 trained with a training speed of 0.5 and a maximum amount of epochs of 20000

Figure 5.18: Results of neural network 5 trained with a training speed of 0.5 and a maximum amount of epochs of 25000
Figure 5.19: Results of neural network 6 trained with a training speed of 0.2 and a maximum amount of epochs of 8000

The same phenomena that occurred in neural networks 1, 2 and 3 were observed in neural networks 4, 5 and 6. When the number of inputs to the neural network was less than or equal to 3, the predictive capability of the neural networks reached its peak when the number of hidden neurons was capped at 8. There was again no trend in the prediction power of the neural network as the number of hidden neurons was increased for the neural network that had 5 inputs. An increase in the number of inputs to the neural network also led to an improved predicting capability. It was thus decided that neural network 6 should be further developed.

5.3.2.2 Enhancing the neural network
It was decided that neural network 6 should be given 13 hidden nodes, as good results were obtained with this number of hidden nodes, as can be seen in Figure 5.19. The network was trained with a maximum of 60000 epochs and a training speed of 0.02. After the first test, the momentum term was added with a momentum constant of 1 × 10⁻⁔, and in the second test the momentum term was deactivated and the regularization term was inserted with a regularization constant of 0.001. The results can be seen in Figure 5.20, Figure 5.22 and Figure 5.24, and the predicted vs. experimental plots can be seen in Figure 5.21, Figure 5.23 and Figure 5.25.
Figure 5.20: Experimental and predicted values of neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02

Figure 5.21: Predicted vs. experimental m plotted over the y=x graph for neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02
Figure 5.22: Predicted and experimental values of neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the momentum term

Figure 5.23: Predicted m vs. experimental m plotted over the y=x line for neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the momentum term
Figure 5.24: Predicted vs. experimental values of neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the regularization term

Figure 5.25: Predicted m vs. experimental m plotted over the y=x curve for neural network 6 trained with a maximum of 60000 epochs and a training speed of 0.02 with the addition of the regularization term
Except for the large outliers observed near validation sample number 20 and at validation sample 13, the sharpness of separation was predicted with reasonable accuracy. When comparing the graphs in Figure 5.20, Figure 5.22 and Figure 5.24, one observes that there are no significant differences between them. The neural network that had neither a regularization nor a momentum term had a combined validation error of 9.11 for the sharpness of separation, meaning that the predicted sharpness of separation was out by an average of 0.21 per validation sample. The momentum term decreased the combined error to 9.04, while the addition of the regularization term decreased the combined error to 8.87, meaning that the average error per validation sample prediction was reduced to 0.2.