In order to determine the flow pulsation in terms of objective mathematical variables, one can start from
$$q_{in} - q_{out} = \frac{V}{\beta}\frac{\partial p}{\partial t} + \frac{\partial V}{\partial t} \tag{1}$$
Assuming the material body of the pump is rigid, one can set $\frac{\partial V}{\partial t} = 0$, so equation 1 can be rewritten as follows
$$\Delta q = \frac{V}{\beta}\frac{\partial p}{\partial t} \tag{2}$$
Using the chain rule, one can write $\frac{\partial p}{\partial t} = \frac{\partial p}{\partial \theta}\frac{\partial \theta}{\partial t}$. Per [?], the flow is a function of the angle of rotation $\theta$ of the pump driving motor rotor, because the suction or outlet volume changes with the pump geometry, so equation 2 can be rewritten as follows
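$$\Delta D(\theta)\,\frac{\omega}{2\pi} = \frac{V}{\beta}\frac{\partial p}{\partial \theta}\frac{\partial \theta}{\partial t} \tag{3}$$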
Since the angular speed $\omega$ is $\frac{\partial \theta}{\partial t}$, one can rewrite the equation as follows
$$\Delta D(\theta) = 2\pi\,\frac{V}{\beta}\frac{\partial p}{\partial \theta} \tag{4}$$
One notices from equation 4 that, under the assumptions made in this section, the flow pulsation becomes independent of the driving angular speed of the pump. This means that the prospective reinforcement learning algorithm should not focus on controlling the angular speed of the driving motor to minimize the flow pulsation.
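The independence from angular speed can be checked numerically. The sketch below is only an illustration: the pressure profile `pressure(theta)`, the chamber volume, and the bulk modulus are hypothetical values chosen for the example, not data from this paper. It evaluates the flow pulsation from equation 2 (with the chain rule applied) at two angular speeds and converts each to the displacement pulsation of equation 3.

```python
import math

def pressure(theta, p_mean=1.0e6, p_amp=5.0e4, lobes=7):
    """Hypothetical pressure profile p(theta) in Pa (assumed for illustration)."""
    return p_mean + p_amp * math.sin(lobes * theta)

def dp_dtheta(theta, h=1e-6):
    """Central-difference estimate of dp/dtheta."""
    return (pressure(theta + h) - pressure(theta - h)) / (2.0 * h)

def flow_pulsation(theta, omega, volume=1.0e-4, beta=1.5e9):
    """Equation 2 with the chain rule: dq = (V / beta) * dp/dtheta * omega."""
    return (volume / beta) * dp_dtheta(theta) * omega

def displacement_pulsation(theta, omega, volume=1.0e-4, beta=1.5e9):
    """Equation 3 solved for Delta D: dq * 2*pi / omega."""
    return flow_pulsation(theta, omega, volume, beta) * 2.0 * math.pi / omega

theta = 0.3  # rotor angle in rad, arbitrary sample point
slow = displacement_pulsation(theta, omega=50.0)
fast = displacement_pulsation(theta, omega=300.0)
print(slow, fast)  # the two values agree up to rounding
```

The flow pulsation itself scales linearly with the angular speed, while the displacement pulsation per revolution does not, which is the property the text relies on.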
Continuing with the approach of isolating the variables affecting the flow pulsation, one can use the fundamental pressure equation as a function of the area facing the displaced fluid and the torque arm radius. Per [?], the driving flow radius shown in 7 is also a function of the angle of rotation of the pump driving motor rotor, due to the change of geometry inside the pump. Per [?], the motor driving torque in a brushless DC motor is likewise a function of the angle of rotation, due to fluctuation of the magnetic field inside the motor, so 7 can be rewritten as follows
$$p = \frac{T(\theta)}{R(\theta)\,A} \tag{8}$$
Substituting 8 into 4 per rotation cycle, one obtains 9; applying the differentiation, one can rewrite it as 10.
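A sketch of this substitution, assuming the area $A$ does not vary with $\theta$ and writing $T'(\theta)$ and $R'(\theta)$ for the derivatives with respect to the rotation angle (the exact form used in 9 and 10 may differ):

$$\Delta D(\theta) = \frac{2\pi V}{\beta}\,\frac{\partial}{\partial \theta}\left[\frac{T(\theta)}{R(\theta)\,A}\right] = \frac{2\pi V}{\beta A}\,\frac{R(\theta)\,T'(\theta) - T(\theta)\,R'(\theta)}{R(\theta)^{2}}$$

Under this assumption the right-hand side vanishes when $T(\theta)$ and $R(\theta)$ are both constant over the cycle.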
The final form of 10 shows which variables the reinforcement learning algorithm should focus on. One notices that minimizing the flow pulsation $\Delta D(\theta)$ can only be accomplished by making both the fluid displacement and the driving torque constant per rotation angle within a cycle. Since the DC motor torque fluctuations are out of the scope of this paper, the focus will be on making the fluid displacement per rotation angle constant, by minimizing the flow amplitude at higher frequencies and moving it as close as possible to 0 Hz, “constant flow”.
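One way to quantify "flow amplitude at higher frequencies" is to take the discrete Fourier transform of the flow sampled over one revolution and sum the energy of every harmonic except the 0 Hz term. The sketch below is an assumption about how such a measure could be built, not the cost function of this paper; `dft_amplitudes`, `pulsation_cost`, and the sample signals are hypothetical.

```python
import cmath
import math

def dft_amplitudes(samples):
    """Single-sided amplitude spectrum of a real signal sampled over one full period."""
    n = len(samples)
    amps = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        # The 0 Hz and Nyquist bins appear once; every other bin has a mirror image.
        scale = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        amps.append(scale * abs(coeff) / n)
    return amps

def pulsation_cost(samples):
    """Hypothetical cost: sum of squared amplitudes of all non-zero harmonics."""
    return sum(a * a for a in dft_amplitudes(samples)[1:])

n = 64
# Mean flow of 10 units with a ripple of amplitude 0.5 at 3 cycles per revolution.
ripple = [10.0 + 0.5 * math.sin(2.0 * math.pi * 3 * i / n) for i in range(n)]
flat = [10.0] * n  # ideal "constant flow"

amps = dft_amplitudes(ripple)
print(amps[0], amps[3])
print(pulsation_cost(ripple), pulsation_cost(flat))
```

With this measure, a perfectly constant flow has all of its amplitude in the 0 Hz term and a cost of essentially zero, so a lower cost corresponds to the stated goal.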
2 Reinforcement Learning Algorithm Model Based Flow Pulsation
As described in Section 1, the target of the flow pulsation minimization is to minimize the flow amplitude at higher frequencies and move it as close as possible to 0 Hz, “constant flow”, which means minimizing the geometrical effects on the flow displacement. As a result, the model used for the reinforcement learning, $p(s_{t+1} \mid a_t, s_t)$, is based on a piston pump, owing to the simplicity of describing the kinematic flow pulsation in this type of pump.
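As a rough illustration of that simplicity, the sketch below uses a common idealization in which each piston delivers a flow proportional to the positive half of a sine wave, with the pistons evenly phased around one revolution. This idealized model and the names `piston_flow` and `ripple_ratio` are assumptions for the example and are not the formulation used in this paper.

```python
import math

def piston_flow(theta, pistons, area=1.0, crank_radius=1.0, omega=1.0):
    """Total delivery flow; each piston contributes only during its discharge stroke."""
    total = 0.0
    for k in range(pistons):
        phase = theta + 2.0 * math.pi * k / pistons
        total += max(0.0, area * crank_radius * omega * math.sin(phase))
    return total

def ripple_ratio(pistons, samples=3600):
    """Peak-to-peak flow variation over one revolution, divided by the mean flow."""
    flows = [piston_flow(2.0 * math.pi * i / samples, pistons)
             for i in range(samples)]
    mean = sum(flows) / samples
    return (max(flows) - min(flows)) / mean

for count in (7, 8, 9):
    print(count, ripple_ratio(count))
```

In this idealized model the ripple depends only on the geometry, here the number of pistons, which is the kind of geometrical effect the learning algorithm is meant to reduce.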
Per [], the kinematic flow pulsation can be formulated as follows
If the $t$ in 11 is taken as the time required to complete one cycle and both sides are divided by $n$, then 11 can be considered equivalent to $D(\theta)$ in 10. As determined in Section 1, the angular speed does not affect the flow pulsation “directly”, so it is a parameter that is adjusted according to the required flow, independently of the reinforcement learning used to minimize the flow pulsation. The reinforcement learning algorithm to be used is Q-learning, which uses neural networks to determine a policy $\pi_\theta(a_t \mid s_t)$, “a sequence of actions”, that optimizes the cost function of the controller given the current state $s_t$ and the resulting reward $r_t$ of that state given action $a_t$ [].
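The core update of Q-learning can be sketched in a few lines. The example below replaces the neural network with a plain lookup table to stay short, and the state names, action names, reward value, learning rate, and discount factor are all hypothetical; it shows the general technique rather than the controller developed in this paper.

```python
import random

def q_learning_step(q_table, state, action, reward, next_state,
                    alpha=0.1, gamma=0.9):
    """One tabular update: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))."""
    best_next = max(q_table[next_state].values())
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]

def epsilon_greedy(q_table, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.choice(sorted(q_table[state]))
    return max(q_table[state], key=q_table[state].get)

# Hypothetical discretization: two pulsation levels and two controller actions.
states = ["high_ripple", "low_ripple"]
actions = ["hold", "adjust"]
q_table = {s: {a: 0.0 for a in actions} for s in states}
rng = random.Random(0)

# Assumed reward of 1.0 for an action that moved the pump to the low-ripple state.
updated = q_learning_step(q_table, "high_ripple", "adjust",
                          reward=1.0, next_state="low_ripple")
greedy_action = epsilon_greedy(q_table, "high_ripple", epsilon=0.0, rng=rng)
print(updated, greedy_action)
```

In a neural-network variant, the table lookup would be replaced by a function approximator trained toward the same temporal-difference target.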