IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 37, NO. 2, FEBRUARY 2019 439
A Prediction-Based Charging Policy and
Interference Mitigation Approach in the
Wireless Powered Internet of Things
Lixin Li, Member, IEEE, Yang Xu, Zihe Zhang, Jiaying Yin, Wei Chen, Senior Member, IEEE, and Zhu Han, Fellow, IEEE
Abstract— The Internet of Things (IoT) technology has recently drawn more attention due to its ability to interconnect massive numbers of physical devices. However, how to provide a reliable power supply to energy-constrained devices and improve the energy efficiency in the wireless powered IoT (WP-IoT) is a twofold challenge. In this paper, we develop a novel wireless power transmission (WPT) system, where an unmanned aerial vehicle (UAV) equipped with a radio frequency energy transmitter charges the IoT devices. A machine learning framework of echo state networks together with an improved k-means clustering algorithm is used to predict the energy consumption and cluster all the sensor nodes at the next period, thus automatically determining the charging strategy. The energy obtained from the UAV by WPT enables the IoT devices to communicate with each other. In order to improve the energy efficiency of the WP-IoT system, the interference mitigation problem is modeled as a mean field game, where an optimal power control policy is presented to adapt to and analyze the large number of sensor nodes randomly deployed in the WP-IoT. The numerical results verify that our proposed dynamic charging policy effectively reduces the data packet loss rate, and that the optimal power control policy greatly mitigates the interference and improves the energy efficiency of the whole network.
Index Terms—Wireless power transmission (WPT), charging policy, energy prediction, Internet of Things (IoT), mean-field game (MFG).
Manuscript received March 19, 2018; revised July 6, 2018; accepted September 15, 2018. Date of publication September 27, 2018; date of current version January 15, 2019. This work was supported in part by the National Science Foundation of China under Grant 61671269, in part by the National 10000-Talent Program of China, in part by the Aerospace Science and Technology Innovation Fund of the China Aerospace Science and Technology Corporation, in part by the Shanghai Aerospace Science and Technology Innovation Fund under Grant SAST2016034 and Grant SAST2017049, in part by the China Fundamental Research Fund for the Central Universities under Grant 3102017ZY029, in part by the Seed Foundation of Innovation and Creation for Graduate Students in Northwestern Polytechnical University under Grant ZZ2018019 and Grant ZZ2018130, in part by US MURI, and in part by NSF under Grant CNS-1717454, Grant CNS-1731424, Grant CNS-1702850, Grant CNS-1646607, and Grant ECCS-1547201. (Corresponding author: Lixin Li.)
L. Li, Y. Xu, Z. Zhang, and J. Yin are with the School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710129, China (e-mail: lilixin@nwpu.edu.cn).
W. Chen is with the Department of Electronic Engineering and the Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China.
Z. Han is with the University of Houston, Houston, TX 77004 USA, and also with the Department of Computer Science and Engineering, Kyung Hee University, Seoul 130-701, South Korea (e-mail: zhan2@uh.edu).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSAC.2018.2872429
I. INTRODUCTION
As smart services such as metering, city or building lighting management, environment monitoring, and vehicle traffic control become pervasive in urban areas, the Internet of Things (IoT) technology emerges with the expectation of inter-networking a huge number of devices [1]. The wireless sensor network (WSN), as a promising technology of the IoT, has been widely applied in the military [2], intelligent transportation [3], smart homes [4], [5], environmental monitoring [6], and healthcare [7]. A WSN consists of autonomous sensor nodes flexibly distributed in a given area, so as to sense and collect the information of interest. However, in practice, providing a reliable power supply for energy-constrained IoT devices is extremely challenging.
Energy harvesting is regarded as a feasible solution to meet the energy demands of communication networks, and a large number of studies have been conducted in [8]–[12]. At present, two main technologies are foreseen to harvest energy. One is to opportunistically obtain energy from renewable sources in the environment, such as solar energy, biological energy, wind energy, and ambient radio frequency (RF) energy (e.g., from nearby radio and television stations). The other is wireless power transmission (WPT), where the device is charged by a dedicated energy source. The former is not intended for energy transfer, whereas the latter can provide energy when a predictable supply is expected, usually over license-free frequency bands of the radio spectrum. In practice, the time-varying and uncertain nature of renewable energy means that no reliable energy supply can be guaranteed for ultra-densely deployed sensor nodes. In contrast, a dedicated energy source is not subject to weather and seasonal constraints, providing a permanent and cost-effective energy supply for low-power sensor nodes. As the WPT process with a dedicated source is fully controllable, it is suitable for supporting applications with QoS constraints [8]. Traditional WPT systems, where the distributed sensor nodes are powered by a dedicated energy source fixed in a specific area, suffer from shortcomings such as severe path loss and poor network-edge coverage, so that sufficient energy cannot be guaranteed for some sensor nodes in edge regions. In [11], the method of energy beamforming was proposed to improve the efficiency of WPT by training the available channel status
0733-8716 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

information. Xu and Zhang [12] employed a multiple-input multiple-output technology to enhance the efficiency of WPT of radio frequency signals. Chu et al. [13] investigated a wireless powered sensor network with a multi-antenna power station providing the power to sensor nodes, and the harvested energy supported the sensor nodes to transmit monitoring information to the fusion center, with two different scenarios considered to maximize the system sum throughput of the sensor network.
Different from the previous research, the energy station can also be mobile, traveling and transferring power to rechargeable wireless sensor nodes. In particular, hardware based on WPT has been widely used in many projects. Sangare et al. [9] proposed to use a mobile robot vehicle with radio frequency (RF) energy transmitters to charge the WSN and verified the results through the P21XXCSR evaluation board. Compared with mobile charging vehicles in two-dimensional space, mobile charging by an unmanned aerial vehicle (UAV) in three-dimensional space can not only plan its path flexibly, but also suffers less shading and increases the possibility of line-of-sight links, greatly improving the energy transmission efficiency. In fact, the UAV has great potential to provide air-to-ground mobile charging services in various situations [14]. Additionally, in [15], UAV-enabled WPT was considered with trajectory design and energy region characterization. In the UAV-enabled WPT system, however, the energy received by the sensor nodes is uncertain. The energy consumption of sensor nodes at the next period is vital for the UAV to implement seamless and accurate wireless charging. Thus, the prediction of the energy consumption of each sensor node plays an important role in maximizing network efficiency. A machine learning framework of echo state networks (ESNs) was studied in [16] to predict each user's content request distribution and mobility pattern with low complexity for cloud access networks. Compared with traditional prediction algorithms, prediction with an ESN can obtain more accurate results with low complexity. Considering the regularity of nodes' contexts, it is worth studying the prediction of energy consumption using the ESN algorithm.
Another important aspect of the wireless powered IoT (WP-IoT) network is the severe interference, which greatly reduces the energy transmission efficiency [17]. Power control was used to perform interference management in [18]–[21]. Xie et al. [18] formulated the load-aware energy efficient user association and power optimization problem as a mixed-integer programming problem to cognitively limit the interference between the BSs and the users in ultra-dense networks. In addition, a distributed power control approach was employed to coordinate the intra-tier interference among different spectrum-sharing small cells in [21]. Recently, game theory has been widely used to characterize dynamic power control, as it lends itself to analyzing the actions of the transmitters and designing the corresponding distributed algorithms by modeling the competition and interference coordination between transmitters. Moreover, distributed resource allocation problems can be modeled as different types of games [22]–[26]. In [26], the power allocation problem in a heterogeneous small cell network was modeled as a non-cooperative game by introducing a time-varying cross-tier/co-tier interference pricing. Classic games need to take the interactions between each player and all the others into account to establish the equations. However, with a large number of players, a huge number of equations need to be solved, which makes the analysis with classic games difficult.
The mean field game (MFG) is a promising alternative for modeling and analyzing large-scale wireless powered sensor nodes, since the MFG considers the interactions between the behavior of the individual player and the average behavior of the collective players [27]–[30]. This collective behavior is represented by a mean field, which is the statistical distribution of the system state considered. For example, the work in [27] developed a theoretical framework for the MFG by treating the interference with a mean-field approximation (MFA). In [28], the power control problem of a small cell network was first formulated as a stochastic game under the assumption that the users were deployed in a co-frequency channel. Subsequently, the power control stochastic game was extended to the MFG in ultra-dense networks. In [29], the distributed power control strategy was obtained through the MFG for the small base stations of an ultra-dense deployment. In this case, the interactions among players are described as the interaction between the generic player and the mean field, which can be modeled by the Hamilton-Jacobi-Bellman (HJB) equation in the mean field game. The dynamics of the mean field can be described by the Fokker-Planck-Kolmogorov (FPK) equation based on the action effects of the generic player. These coupled FPK and HJB equations are also called the forward and backward equations, respectively. The mean field equilibrium is obtained by solving these two equations, but no general technique is available for solving them.
To solve the above-mentioned problems, we study a WP-IoT in which mobile charging is carried out by the UAV in a centralized way and a distributed power control method is presented for the communication of massive wireless powered sensor nodes. The contributions of this paper can be summarized as follows:
• We formulate a theoretical mobile wireless charging framework for the WP-IoT, where we assume that the UAV serves as an RF energy transmitter and that a large number of sensor nodes are randomly deployed in a limited area. In this framework, we jointly consider the mobile charging of the UAV and the communication performance of the WP-IoT.
• We propose a new ESN-based scheme to predict the energy consumption of each sensor node at the next period. Given the predictions, an improved k-means clustering algorithm is employed to group the large number of distributed sensor nodes into multiple clusters according to a specific rule, so as to minimize the packet loss rate. Meanwhile, the mobile charging strategy prolongs the lifetime of energy-constrained wireless networks.

LI et al.: PREDICTION-BASED CHARGING POLICY AND INTERFERENCE MITIGATION APPROACH 441
• In the proposed WP-IoT, we model the interference mitigation problem in communication as an MFG, taking the sensor nodes' energy and the mutual interference among the sensors into consideration. To capture the effects of the interference among the sensor nodes, we employ a mean-field approximation (MFA). The related HJB and FPK equations are derived in succession. Moreover, an optimal power control policy for the WP-IoT, using a finite difference algorithm based on the upwind scheme, is proposed to solve the mean-field equilibrium [31].
• The numerical results show that our proposed schemes significantly improve the efficiency of WPT. Moreover, because the algorithm can be executed off-line, the scheme is considerably more practical to apply.
The rest of this paper is organized as follows. In Section II, we describe the system model and problem formulation. The mobile charging strategy of the UAV and the corresponding theoretical analysis are given in Section III. Then, an optimal power control policy is obtained in Section IV. Numerical results are given in Section V, and finally we draw the conclusion in Section VI.
II. SYSTEM MODEL AND PROBLEM FORMULATION
A. System Model
As shown in Fig. 1, we consider a novel WPT-based IoT communication network, where a large number of limited-energy sensor nodes are randomly distributed in a two-dimensional area and a UAV acts as an energy transmitter to charge the sensor nodes. Our goal is to improve the efficiency of WPT. In this model, the sensor nodes can receive and store the energy transmitted by the UAV, generate data packets, and regularly send them to the fusion center in a multi-hop way. Moreover, severe interference exists among the sensor nodes when they interact with each other, and it cannot be ignored. A reasonable power control method can reduce the number of retransmissions of sensor nodes. Therefore, we mainly focus on the mobile charging of energy-constrained sensor nodes and the power control in the massive WP-IoT.
Fig. 1. The system model.
B. Problem Formulation
1) The Charging Schedule of the UAV: The large-scale sensor nodes in this scenario are randomly distributed. Limited by the maximum transmitting power of the UAV and the charging distances between the UAV and the sensor nodes, the UAV can charge many sensor nodes at each moment, but not all of them. Considering the different roles that sensor nodes play in different regions, we assume that the energy consumption varies from one sensor to another. Therefore, we divide the sensor nodes into different clusters according to their energy consumption. The k nodes with the highest energy consumption are selected for preferential charging, which enhances the charging efficiency for the energy-strapped nodes. When charging, each cluster is regarded as a basic charging unit and all the nodes in the cluster are charged simultaneously. The charging path follows the shortest path principle to traverse the k clusters. The hovering positions of the UAV and the charging duration are also critical. In this paper, we use an ESN to predict the energy consumption of each sensor node at the next period, thus determining the hovering positions and the charging duration for the k clusters. The prediction and clustering formulations are detailed in Section III.
2) The Power Control of Communication Network: In the WP-IoT, the communication scheduling among sensor nodes follows the general sensor network protocol, without regard to the specific MAC layer design. After clustering, the sensors therefore communicate in a simple peer-to-peer manner. The sensor nodes choose different channels according to the corresponding protocols. For the sensor nodes that share the same channel, interference mitigation is very important because the interference is severe. Interference dynamics caused by time-varying environments should be taken into account and characterized when an interference-aware power control policy is designed to mitigate interference. In this paper, since the sensor nodes have harvested enough energy, we investigate their optimal power control policy during the operation period. We denote a generic sensor node as $i \in \mathcal{N}_s$, where $\mathcal{N}_s = \{1_s, \ldots, N_s\}$ is the set of sensor nodes and all the nodes in $\mathcal{N}_s$ share the $s$-th channel. Although more than one node may receive the signal from $i$, we concentrate only on the quality of service of node $i$'s destination node. In other situations, for instance when personal area network coordinators broadcast beacon frames, the interference dynamics of the receiving nodes are not severe. The scheduling of signal exchange is operated by the central controllers according to the mission requirements and the protocol, which is not within the scope of this paper. All the nodes sharing the same channel should follow the optimal power control policy to maintain a sufficient signal to interference plus noise ratio (SINR) at the receiver and to save energy. We index the generic transmit-receive node pair by $i$. The SINR at receive node $i$ is
$$\gamma_i = \frac{p_i(t)\, g_{i,i}(t)}{\sum_{j \neq i}^{N_s} p_j(t)\, g_{j,i}(t) + \sigma^2(t)}, \tag{1}$$
where $p_i(t)$ denotes the transmit power of node $i$, $g_{j,i}(t)$ is the channel gain from transmit node $j$ to receive node $i$, and $\sigma^2(t)$ is the background noise. Here, $I_i(t) = \sum_{j \neq i}^{N_s} p_j(t)\, g_{j,i}(t) + \sigma^2(t)$ denotes the total interference power plus noise perceived by receiver node $i$ and caused by all other transmit nodes in $\mathcal{N}_s$. We introduce a SINR threshold $\gamma_{th}$ to represent the minimum SINR requirement of the nodes. Hence, $\gamma_i \ge \gamma_{th}$ should hold for all the nodes. Meanwhile, all the nodes should also avoid exorbitant energy consumption. As this stochastic dynamic game suffers from the 'curse of dimensionality', we propose an MFG framework to investigate the power control problem. The complete formulation is detailed in Section IV.
As shown in Fig. 2, the context information of sensor nodes can be attached to normal data frames and periodically aggregated at the data fusion center, thus determining the charging schedule of the UAV during the next period. In addition, the harvested energy allows the sensor nodes to communicate normally. Taking into consideration the inter-interference and intra-interference from other transmitters to the generic sensor node receiver, an optimal power control policy based on the MFG for the sensor nodes is crucial to the total system performance.
Fig. 2. The relationship between charging policy and power control policy.
III. SENSOR NODES PREDICTION AND CLUSTERING
In this section, the energy consumption of the wireless powered sensor nodes is first predicted in Section III-A. Then, the nodes are clustered based on the prediction results in Section III-B. Finally, the charging path is planned according to the clustering in Section III-C. The detailed implementations are as follows.
A. Energy Consumption Prediction
Due to the limited energy storage capacity of sensor nodes, it is impossible for them to operate for a long time without recharging. Therefore, in a mobile charging path planning scheme, the sensor nodes with higher energy consumption are selected for preferential charging during the next charging period, because these sensors urgently need to be charged to keep working normally. In addition, formulating charging strategies based on predictions can improve charging efficiency. The traditional prediction method applies statistical analysis to the collected historical data. Although it works well in some scenarios with fixed parameters, in time-varying scenarios such as sensor nodes the prediction based on this method makes the update of energy consumption lag behind the changes in the sensor, which ultimately leads to a lower charging efficiency. Therefore, in this paper, we adopt a novel prediction method, namely the ESN. Considering the regularity of energy consumption, the ESN can establish the relationship between different sensors and their energy consumption so as to achieve the purpose of prediction.
The ESN is a special type of recurrent neural network with a dynamic reservoir. Due to the time-varying characteristics of dynamic systems, the ESN is well suited to dynamic system modeling problems such as prediction. In general, the ESN system model consists of three layers: the input layer, the hidden layer, and the output layer. The three layers are connected by the input weight matrix and the output weight matrix, denoted as $W^{in}$ and $W^{out}$, respectively. In addition, the nodes of the hidden layer are connected by the hidden layer matrix, denoted as $W$. In the training process, only the output weight matrix $W^{out}$ needs to be changed, which makes the training process of the ESN simpler and more efficient.
Suppose that the numbers of nodes in the input layer, hidden layer, and output layer of the ESN are $K$, $M$, and $N$, respectively. The states of the ESN at time $t$ can be expressed as:
$$x(t) = [x_1(t), x_2(t), \cdots, x_K(t)], \tag{2}$$
$$u(t) = [u_1(t), u_2(t), \cdots, u_M(t)], \tag{3}$$
$$y(t) = [y_1(t), y_2(t), \cdots, y_N(t)]. \tag{4}$$
The update of the hidden layer state at time $t + 1$ can be expressed as:
$$u(t + 1) = f\left(W^{in} x(t + 1) + W u(t) + W^{back} y(t)\right), \tag{5}$$
where $W^{back}$ is the feedback weight matrix from the output layer at the previous moment to the hidden layer at the next moment, $x(t + 1)$ and $u(t + 1)$ are the input and the hidden state at this moment, respectively, and $f$ is the activation function of the internal neurons (e.g., tanh). The output layer state of the ESN at $t + 1$ is:
$$y(t + 1) = f^{out}\left(W^{out} [x(t + 1); u(t + 1)]\right), \tag{6}$$
where $f^{out}$ is the activation function of the output layer neurons, and $[\cdot\,;\cdot]$ denotes the concatenation of two vectors. The goal of our training process is to minimize the difference between $y(\text{target})$ and $y(t + 1)$ by adjusting $W^{out}$, and therefore we only need to train $W^{out}$. The calculation of $W^{out}$ can be implemented as [32]:
$$W^{out} = Y U^T \left(U U^T + l I\right)^{-1}, \tag{7}$$
where $U = \{u_1(i), u_2(i), \cdots, u_M(i)\}$, $(i = m, m+1, \cdots, P)$, and $Y = \{y(m), y(m + 1), \cdots, y(P)\}$ are the matrices of hidden layer states and output values at different moments, respectively. $U^T$ is the transpose of the matrix $U$, $l$ is a regularization coefficient, $I$ is the identity matrix, and $(\cdot)^{-1}$ denotes the inverse of a square matrix. As the number of iterations increases, the echo property of the ESN may diminish or even disappear. Therefore, in this paper we adopt an improved ESN algorithm with leaky-integrator reservoir neurons to calculate the hidden layer state $u$:
$$u(t + 1) = (1 - a)\, u(t) + a f\left(W^{in} x(t + 1) + W u(t)\right), \tag{8}$$
where $a \in (0, 1]$ denotes the leaking rate.
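The leaky-integrator update in (8) and the regularized readout in (7) can be illustrated with a small sketch. It rests on several assumptions that are not taken from the paper: a scalar output, a readout trained on the hidden state only (as (7) is written), no output feedback, and a toy sine series standing in for sensor context data. The function names `leaky_update`, `ridge_readout`, and `solve`, and all parameter values, are hypothetical.

```python
import math
import random

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def leaky_update(u, x, W_in, W, a):
    """Eq. (8): u(t+1) = (1 - a) u(t) + a tanh(W_in x(t+1) + W u(t))."""
    pre = [p + q for p, q in zip(matvec(W_in, x), matvec(W, u))]
    return [(1 - a) * ui + a * math.tanh(pi) for ui, pi in zip(u, pre)]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def ridge_readout(states, targets, l):
    """Eq. (7) for a scalar output: w = y U^T (U U^T + l I)^{-1}.
    `states` holds the hidden-state vectors, i.e. the columns of U."""
    m = len(states[0])
    G = [[sum(s[i] * s[j] for s in states) + (l if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    rhs = [sum(y * s[i] for s, y in zip(states, targets)) for i in range(m)]
    return solve(G, rhs)  # G is symmetric, so G w = U y^T yields the same w

# Toy demo (assumed setup): one-step-ahead prediction of a sine series.
rng = random.Random(0)
K, M = 1, 8
W_in = [[rng.uniform(-0.5, 0.5) for _ in range(K)] for _ in range(M)]
# Entries in [-0.1, 0.1] keep every row sum below 1, hence spectral radius < 1.
W = [[rng.uniform(-0.1, 0.1) for _ in range(M)] for _ in range(M)]
series = [math.sin(0.3 * t) for t in range(200)]
u, states = [0.0] * M, []
for t in range(199):
    u = leaky_update(u, [series[t]], W_in, W, a=0.5)
    states.append(u)
washout = 20  # discard the initial transient of the reservoir
w_out = ridge_readout(states[washout:], series[washout + 1:], l=1e-4)
pred = [sum(wi * si for wi, si in zip(w_out, s)) for s in states[washout:]]
mse = sum((p - y) ** 2 for p, y in zip(pred, series[washout + 1:])) / len(pred)
```

Only the readout weights are fitted; `W_in` and `W` stay fixed after initialization, which is what keeps ESN training cheap compared with training a full recurrent network.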
In [33], the authors used the users' context, such as time, location, and device type, to predict the users' content request distribution, and obtained better results than the traditional prediction method. Therefore, in this paper, we utilize the sensor's context information for energy consumption prediction, such as the region where the sensor is deployed, the sensor type, and the sensor working time. The proposed new ESN-based prediction scheme is as follows:
1) Set Initial Parameters: The hidden layer is the core of the entire ESN. The final performance of the ESN is determined by various parameters of the reservoir. Therefore, in addition to setting the number of neurons in each layer, such as the input layer neurons K and the output layer neurons N, we need to pay more attention to the initialization of each parameter in the reservoir. The main parameters affecting the ESN performance include the spectral radius (SR), the sparse degree (SD), the size of hidden layer neurons M, and the input extension IS.
2) Establish ESN Model: We construct the ESN model based on the initial parameters. First, we generate the weight matrices, including the input weight matrix $W^{in} \in \mathbb{R}^{M \times K}$, the hidden layer weight matrix $W \in \mathbb{R}^{M \times M}$, and the output weight matrix $W^{out} \in \mathbb{R}^{N \times (M+K)}$. $W^{in}$ and $W$ are generated randomly from a uniform distribution. Once these two matrices are initialized, they never change in the subsequent training. $W^{out}$ is also initialized randomly from the uniform distribution, but it is constantly updated in the subsequent training.
3) Collect Variable Information: We define $X_{t,j} = \{x_1(t), x_2(t), \cdots, x_K(t)\}$ to represent the context information of sensor node $j$ at time $t$, including region, month, hour, weather, and device type. The hidden state $U_{t,j} = \{u_1(t), u_2(t), \cdots, u_M(t)\}$ and the output variable $Y_{t,j} = \{p_1, p_2, \cdots, p_N\}$ are calculated by (8) and (6), respectively, based on the input information $X_{t,j}$, where $Y_{t,j}$ represents the energy consumption of sensor $j$ at time $t$. We need to collect $U_j = \{u_1(i), u_2(i), \cdots, u_M(i)\}$ and $Y_j = \{p_1(i), p_2(i), \cdots, p_N(i)\}$, $(i = m, m + 1, \cdots, P)$, to form matrices that are subsequently used to train $W^{out}$, where $U_j$ and $Y_j$ denote the hidden states and the output variables of sensor node $j$ at different moments, respectively.
4) Train the Network: We train $W^{out}$ on the collected variables $Y_j$ and $U_j$. According to (6), the goal of the training process is to minimize:
$$\min\left\{ \sum_{k=m}^{P} \sum_{i=1}^{N} \left( y(k) - f^{out}\left(W_j^{out} [x(t); u(t)]\right) \right) \right\}, \tag{9}$$
where $y(k)$ is the real energy consumption of sensor $j$ at time $t$. The formula in (7) can be further expressed as:
$$W_j^{out} = \left(U^{-1} \times Y\right)^T. \tag{10}$$
With the training process completed, we can utilize the ESN for the prediction of the sensors' energy consumption. With the ESN model predicting the sensors' energy consumption, we obtain each sensor's energy consumption at the next period.
B. Sensor Nodes Clustering
In the proposed model, the UAV charges the nodes in clusters, so that it can find the best hovering locations to complete the charging, thus improving charging efficiency. The clustering algorithm in this paper differs from the traditional k-means clustering algorithm. Specifically, the initial mean vectors $\{\mu_1, \mu_2, \cdots, \mu_k\}$ are selected based on the prediction results of the nodes' power consumption. The selection criterion is to rank the energy consumption of all the sensor nodes based on the prediction results, and then to select the $k$ nodes with the highest energy consumption as the mean vectors to cluster the remaining nodes, where $k$ must be chosen such that the remaining nodes have enough power to keep working until the next charging period. Then the distances between each remaining node and the mean vectors $\mu_i$ are calculated, denoted as $d_{ji}$, where $j$ indexes the $j$-th remaining sensor in the region and $i \in \{1, \ldots, k\}$.
The $j$-th sensor is clustered into $S_i$ based on the smallest $d_{ji}$. Note that the distance must satisfy $d_{ji} \le L$, where $L$ denotes the maximum charging distance of the UAV measured from the clustering center. In the simulation, $L$ is set to 5 meters. In order to reduce the number of clusters, when all the nodes have been clustered, we compute a new mean vector $\mu_i'$ in set $S_i$ based on $\mu_i' = \frac{1}{|S_i|} \sum_{x \in S_i} x$. If $\mu_i' \neq \mu_i$, $\mu_i'$ is treated as the new mean vector, and the mean vector serves as the hovering center for the UAV to charge the sensor nodes in the cluster. Finally, the clustering results $S = \{S_1, S_2, \cdots, S_k\}$ are output. The detailed algorithm is shown in Algorithm 1.
Algorithm 1 Sensor Nodes Clustering
Input: Sensor set $D = \{x_1, x_2, \cdots, x_m\}$.
Set the number of clusters $k$ and the charging distance $L$ based on the prediction results.
Select the initial mean vectors $\{\mu_1, \mu_2, \cdots, \mu_k\}$ based on the prediction results.
Repeat
    let $S_i = \emptyset$ $(1 \le i \le k)$,
    for $j = 1, 2, \cdots, m$ do
        for $i = 1, 2, \cdots, k$ do
            calculate $d_{ji} = \|x_j - \mu_i\|_2$,
        end for
        determine the cluster of $x_j$ by $\lambda_j = \arg\min_{i \in \{1, 2, \cdots, k\}} d_{ji}$.
        if $d_{j\lambda_j} \le L$ then
            $S_{\lambda_j} = S_{\lambda_j} \cup \{x_j\}$.
        end if
    end for
    for $i = 1, 2, \cdots, k$ do
        calculate $\mu_i' = \frac{1}{|S_i|} \sum_{x \in S_i} x$.
        if $\mu_i' \neq \mu_i$ then
            $\mu_i = \mu_i'$.
        end if
    end for
Until no mean vector is updated.
Output: $S = \{S_1, S_2, \cdots, S_k\}$ and the mean vectors $\{\mu_1, \mu_2, \cdots, \mu_k\}$.
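The distance-constrained clustering described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: node positions are 2-D tuples, the seed nodes are the k largest entries of a given list of predicted consumptions, all nodes (including the seeds) are assigned in each pass, and an empty cluster keeps its previous center. The function name `cluster_nodes` and the `max_iter` guard are hypothetical.

```python
import math

def cluster_nodes(positions, predicted_energy, k, L, max_iter=100):
    """Seed the centres with the k largest predicted consumers, then iterate:
    a node joins its nearest centre only if that centre is within distance L."""
    order = sorted(range(len(positions)),
                   key=lambda j: predicted_energy[j], reverse=True)
    centres = [positions[j] for j in order[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for j, x in enumerate(positions):
            d = [math.dist(x, mu) for mu in centres]
            nearest = min(range(k), key=d.__getitem__)
            if d[nearest] <= L:          # charging-distance constraint
                clusters[nearest].append(j)
        new_centres = []
        for mu, members in zip(centres, clusters):
            if members:                  # mean of the member positions
                new_centres.append(tuple(
                    sum(positions[j][c] for j in members) / len(members)
                    for c in range(len(mu))))
            else:                        # assumption: keep an empty cluster's centre
                new_centres.append(mu)
        if new_centres == centres:       # no mean vector changed: converged
            break
        centres = new_centres
    return centres, clusters

# Hypothetical layout: two dense groups and one distant low-consumption node.
positions = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (30, 30)]
predicted = [5.0, 1.0, 1.0, 4.0, 1.0, 0.5]
centres, clusters = cluster_nodes(positions, predicted, k=2, L=5.0)
```

In this toy layout the node at (30, 30) is farther than L from every center, so it stays unclustered, which mirrors the case of a node left uncovered because its predicted consumption is low.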
The clustering algorithm in this paper is an improved k-means algorithm. If the standard k-means clustering algorithm were used for sensor node clustering, there would be no overlapping areas in the clustering result; that is, each node would be clustered only once and belong to only one cluster. However, our purpose in using clustering is to improve the charging efficiency of the UAV and reduce the packet loss rate of the overall system. The standard k-means clustering method considers neither the effective charging distance of the UAV nor the energy consumption of the sensor nodes. Therefore, its clustering result is not suitable here.
For the clustering results of our proposed algorithm, the following situations can occur: (1) a sensor node may not belong to any cluster, because the clustering algorithm requires the condition $d_{ji} \le L$ to be satisfied; (2) some sensor nodes may be covered by different clusters and be charged repeatedly. A node that is not covered by any cluster has sufficient energy to maintain normal operation in the next period. It is inevitable that a few nodes will be covered repeatedly. However, the main goal of clustering is to reduce the long-term data packet loss rate of the whole system, namely to reduce the number of sensor nodes that cannot continue to work because they have powered down. Repeated coverage of a few nodes does not affect the clustering purpose or the subsequent processing.
C. Charging Strategy of the UAV
The data of interest collected by the sensors are sent to the data fusion center for subsequent processing. As mentioned previously, in order to minimize the packet loss rate of the overall wireless powered sensor nodes, we need to ensure that the sensors run out of power as rarely as possible, assuming that sensors performing different tasks consume different amounts of power. In addition, the energy consumption of the same sensor may vary across time periods or climate conditions. In this case, a fixed charging strategy may cause some important sensor nodes to run out of power without being charged in time, eventually leading to a higher packet loss rate. In contrast, some sensor nodes may be fully charged but have no data to transmit, so that the energy is wasted. To handle this problem, we propose a dynamic charging strategy adjustment based on the predictions of sensor energy consumption.
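Two ingredients of the charging strategy in this subsection lend themselves to a short sketch: the packet loss rate, taken here as the share of powered-down nodes among all nodes, and a shortest closed tour from the energy station through the hovering points and back. The sketch assumes that a node counts as powered down when its residual energy falls below the operating minimum, and it uses exhaustive search, which is only viable for a small number of clusters; the paper does not specify a tour solver. The names `packet_loss_rate` and `shortest_tour` are hypothetical.

```python
import math
from itertools import permutations

def packet_loss_rate(residual_energy, e_required):
    """Share of nodes whose residual energy is below the operating minimum."""
    down = sum(1 for e in residual_energy if e < e_required)
    return down / len(residual_energy)

def shortest_tour(station, hover_points):
    """Exhaustively search the visiting order that minimises the closed tour
    station -> hover points -> station (assumption: small k only)."""
    best_order, best_len = (), math.inf
    for order in permutations(range(len(hover_points))):
        stops = [station] + [hover_points[i] for i in order] + [station]
        length = sum(math.dist(a, b) for a, b in zip(stops, stops[1:]))
        if length < best_len:
            best_order, best_len = order, length
    return list(best_order), best_len

# Hypothetical example: station at the origin, three hovering points.
order, length = shortest_tour((0, 0), [(0, 2), (2, 2), (2, 0)])
plr = packet_loss_rate([0.5, 2.0, 0.0, 3.0], e_required=1.0)
```

For larger k a heuristic such as nearest-neighbour ordering would replace the exhaustive search, at the cost of optimality.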
By ranking the prediction results, the sensors with higher energy consumption are selected for priority charging, which increases the charging efficiency and reduces the packet loss rate. The packet loss rate can be expressed as

$$\mathrm{PLR} = \frac{N_p}{N_t},$$

where $N_p$ and $N_t$ are the number of powered-down nodes and the total number of sensor nodes, respectively.

The key to planning the UAV charging strategy is to adjust the charging path dynamically. Timely replenishment of energy, before the wireless powered sensor nodes are depleted, is crucial to the path planning problem. In this paper, we assume that the UAV periodically starts from the energy station, charges a series of sensors, and returns after completing the charging task. In order to adjust the charging path, we make some basic assumptions about the parameters. First, the charging period $T$ of the UAV must ensure that the energy stored by any sensor node is just enough to maintain its normal operation during this period. Second, there can be one or several energy stations. To maximize the network utility, we use a single UAV to complete the charging task by dynamically adjusting the charging strategy.

In order to minimize the overall packet loss rate of the WP-IoT, it is necessary to maximize the number of wireless powered sensor nodes that satisfy the condition $e_{\min} \ge E$, where $e_{\min}$ is the minimum residual energy of the sensor node and $E$ is the minimum energy required for its normal operation. Therefore, we propose a periodic charging strategy based on regional energy urgency; that is, we give priority to charging the nodes that the prediction identifies as consuming the most energy. In brief, $k$ nodes from the prediction results of Subsection A are selected as the clustering centers, and the charging path with the shortest charging distance is then determined according to the clustering results.

IV. DISTRIBUTED POWER CONTROL OF MASSIVE WIRELESS POWERED SENSOR NODES

In this section, we first introduce the optimal power control problem of wireless powered sensor nodes, which is formulated as a stochastic differential game. An MFG framework is then formulated and solved by a finite difference method.

A. Interference Mitigation Stochastic Differential Game

The interference mitigation problem of massive wireless powered sensor nodes can first be modeled as a stochastic differential game, represented by the 5-tuple

$$G = \left\{ N_s, \{p_i\}_{i \in N_s}, \{e_i\}_{i \in N_s}, \{Q_i\}_{i \in N_s}, \{c_i\}_{i \in N_s} \right\}, \tag{11}$$

where $N_s$ is the number of sensors sharing the $s$-th channel; these sensors choose rational power control policies in the game. $\{p_i\}_{i \in N_s}$ denotes the actions, i.e., the transmit powers during the considered period $[0, T]$. Each sensor node should determine its action $p_i(t)$ in a timely manner to minimize the cost function. $\{e_i\}_{i \in N_s}$ denotes the states of the sensor nodes, taken to be their current remaining energy. As all the nodes can harvest energy from RF signals, we introduce a disturbance into the state dynamics. Since nodes take actions at every time instant to minimize the cost function, the sequence of actions over the period constitutes the control policy. The control policy $\{Q_i\}_{i \in N_s}$ aims at minimizing the average of the cost function over the time interval $[0, T]$. $\{c_i\}_{i \in N_s}$ describes the objective performance of all the nodes. In this scenario, nodes seek better communication quality while keeping the energy loss as low as possible.

We choose node $i$ as a generic player. We suppose that all the nodes have the same constrained battery, so that the state space of every node is $[0, e_{\max}]$. The initial states of different nodes vary with the time-varying WPT scheduling period.
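To make the role of the disturbance in the state dynamics concrete, the following sketch simulates the remaining energy of one node with an Euler-Maruyama scheme. It is a minimal illustration under stated assumptions, not the paper's simulation code: the energy is assumed to be drained at the rate of a constant transmit power and perturbed by Gaussian noise scaled by the square root of the time step; the function name `simulate_energy`, the constant diffusion coefficient, and the clipping of the state to $[0, e_{\max}]$ are all hypothetical choices.

```python
import math
import random
from typing import List


def simulate_energy(e0: float, power: float, sigma: float, horizon: float,
                    steps: int, e_max: float, rng: random.Random) -> List[float]:
    """Euler-Maruyama path of de = -p dt + sigma dW, kept inside [0, e_max].

    Assumptions (not from the paper): constant transmit power `power`,
    constant diffusion coefficient `sigma`, and clipping at the battery limits.
    """
    dt = horizon / steps
    energy = e0
    path = [energy]
    for _ in range(steps):
        # Brownian increment: standard normal sample scaled by sqrt(dt).
        dw = rng.gauss(0.0, 1.0) * math.sqrt(dt)
        energy = energy - power * dt + sigma * dw
        # Keep the state inside the battery range [0, e_max].
        energy = min(max(energy, 0.0), e_max)
        path.append(energy)
    return path


if __name__ == "__main__":
    # Illustrative values only: 0.1 J battery, 0.01 W power, 0.5 s horizon.
    trajectory = simulate_energy(0.1, 0.01, 0.02, 0.5, 50, 0.1, random.Random(0))
    print(trajectory[0], trajectory[-1])
```

With `sigma` set to zero the path reduces to the deterministic drain `e0 - power * t`, which gives a quick sanity check of the drift term.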
The set of actions for node $i$ includes all possible transmit powers, $p_i(t) \in [0, p_{\max}]$, where $p_i(t)$ is the transmit power of node $i$ at time $t$ and $p_{\max}$ is the maximum allowable transmit power of any node. Considering the energy harvested during $[0, T]$, the state dynamics of the players in this game can be expressed as the stochastic differential equation

$$de_i(t) = -p_i(t)\,dt + \sigma_t\,dW_i(t), \tag{12}$$

where the transmit power $p_i(t)$ acts as a controlled drift at time $t$, and $W_i(t)$ is an independent Brownian motion (Wiener process) with diffusion coefficient $\sigma_t$. Its differential follows the rules of Itô calculus, so that $dW_t = \varepsilon_t \sqrt{dt}$, where $\varepsilon_t$ is a Gaussian random variable with zero mean and unit variance. The term $W_i(t)$ models the uncertainty in power loss and energy harvesting during transmission. It indicates that the transmitting processes of different nodes at different times are independent, owing to different battery characteristics and the minor operational consumption of individual nodes during transmission.

The control policy $p_i(0 \to T)$ can be seen as a mapping from states to actions. An optimal power control policy of each node minimizes its own average cost during the given control period, which can be expressed as

$$p_i^*(0 \to T) = \arg\min_{p_i(0 \to T)} \mathbb{E}\left[\int_0^T c_i(t)\,dt + c_i(T)\right], \tag{13}$$

where $c_i(T)$ is the terminal cost.

The communication performance of the transmit-receive node pair $i$ is characterized by the SINR. Note that achieving a larger SINR requires a node to choose a higher transmit power, which is limited by its constrained energy. A higher transmit power also increases the interference to other nodes. The constrained power therefore has to be allocated through the state dynamics, which govern how the energy evolves. For simplicity, we assume that the amount of energy is proportional to the quadratic form of the power. The designed cost function is composed of the above two elements and can be expressed as

$$c_i(t) = \omega_1\left(\frac{p_i g_{i,i}(t)}{I_i(t) + N_0} - \gamma_{th}\right)^2 + \omega_2 p_i(t), \tag{14}$$

where $\omega_1$ and $\omega_2$ are introduced to balance the units of the cost elements. A classical cost function can be designed as $\hat{c}(t,e) = -\hat{\omega}_1 \gamma_i + \hat{\omega}_2 p_i(t)$, as in [34]. Its first term depends only on the SINR, which means that nodes are motivated to increase their power to obtain a higher SINR even when the QoS constraint is already satisfied. The proposed cost function (14) takes the QoS constraint into account. To minimize the average cost during the period, further increases in power are discouraged once the QoS constraint is satisfied. Hence, compared with $\hat{c}(t,e)$, the first term of the proposed cost function improves the energy efficiency. The second term, on the other hand, is proportional to the interference caused at the fusion center. Each node therefore solves its own optimal control problem to obtain the power control policy that minimizes its average cost.

According to Bellman's principle of optimality [35], an optimal control policy has the property that, whatever the initial state and initial decision are, the remaining decisions must form an optimal policy with regard to the state resulting from the first decision. Hence, the optimal power control policy can be defined in terms of a value function and obtained by solving an HJB equation.

For the proposed differential game, the Nash equilibrium is our final goal. At the Nash equilibrium, no node can achieve a lower cost by deviating unilaterally from its current power control policy. To obtain the Nash equilibrium of this differential game, a large number of coupled HJB equations, one for each player based on (13), must be solved [36], [37]. In other words, obtaining the equilibrium of the game with $N$ nodes requires solving $N$ simultaneous partial differential equations (PDEs). Solving such a large number of PDEs is difficult, if not impossible. Therefore, to model and analyze the power control problem in a dense network, we describe the system behavior with an MFG framework.

B. Mean Field Game Framework

The essential problem of the differential game above is the "curse of dimensionality". The power control MFG in the WP-IoT is a special form of the differential game in which the number of node links approaches infinity. The core idea of an MFG is the assumption of similarity, i.e., all the players are identical and follow the same strategy [9]. They can be differentiated only by their states. If the number of players is sufficiently large, the impact of a generic player on the others can be assumed to be negligible. In the proposed MFG, the mean field $m(t,e)$ represents the statistical distribution of the states of all the nodes, which is defined as

$$m(t,e) = \lim_{N \to \infty} \frac{1}{N} \sum_{\forall i \in N} \mathbb{1}_{\{e_i(t) = e\}}, \tag{15}$$

where $\mathbb{1}_{\{\cdot\}}$ denotes the indicator function, which returns 1 if the given condition is true and 0 otherwise. For a given time instant, the mean field is the probability distribution of the states over the set of players. When the number of players $N$ is very large, $m(t,e)$ can be regarded as a smooth continuous distribution function.

In this MFG, we consider the optimal power control problem of a generic node. The state of the node is still its current energy $e_i(t)$. Each node controls its power while accounting for the interference caused by all the other nodes. Owing to the similarity assumption, all the nodes have the same set of equations and constraints, so the optimal control problem for the $N$ nodes reduces to finding the optimal policy for a single generic node.

In this MFG, the interference generated by the infinite mass of nodes should be a function of the mean field when a typical user is studied. Accordingly, the cost function should be rewritten as a function of the mean field. The first term of the cost function is quadratic in the action, and the term $I_i(t)$ in its denominator is the interference. To express the average interference $\bar{I}_i(t)$ through the mean field, we need, in addition to the mean field itself, the average channel gain $\bar{g}_{j,i}$ from the other transmit nodes to the receiver node $i$. We adopt an MFA approach to obtain an approximation of $\bar{g}_{j,i}$. The interference can be approximated as

$$I_i(t) = \sum_{j \ne i}^{N_s} p_j(t) g_{j,i}(t) \approx (N-1)\,\bar{p}_j(t)\,\bar{g}_{j,i}(t), \tag{16}$$

where $\bar{p}_j(t)$ is the known test transmit power, which can be announced before the control period. We assume that all the players involved in the game use the same test transmit power. The term $\bar{g}_{j,i}(t)$ is the mean interference channel gain that captures the effect of the massive number of infinitesimal nodes, and it can be estimated as follows. If the transmitter of node pair $i$ uses the test transmit power, the power $p_i^r(t)$ received at the corresponding receiver is

$$p_i^r(t) = \bar{p}_i(t) g_{i,i}(t) + I_i(t), \tag{17}$$

where $g_{i,i}(t)$ is the effective channel gain, $\bar{p}_i(t) g_{i,i}(t)$ is the effective received power, and $I_i(t)$ is the interference power received from all the other nodes. Thus, the only unknown variable $\bar{g}_{j,i}(t)$ can be derived as

$$\bar{g}_{j,i}(t) = \frac{p_i^r(t) - \bar{p}_i(t) g_{i,i}(t)}{(N-1)\,\bar{p}_j(t)}. \tag{18}$$

Thus, the cost function in this MFG can be defined as

$$c_i(t) = \omega_1\left(\frac{p_i g_{i,i}(t)}{\bar{I}_i(t) + N_0} - \gamma_{th}\right)^2 + \omega_2 p_i(t), \tag{19}$$

where $\bar{I}_i(t) = \bar{g}_{j,i}(t) \int_0^{e_{\max}} p(t,e)\, m(t,e)\, de$.

Since the cost functions depend only on the mean field and the action, the optimal power control problem given in (12) is similar for all the players in the system; the investigated nodes differ only in their initial states. The HJB equation can be obtained as

$$\frac{\partial v(t,e)}{\partial t} + \min_{p(t,e)}\left[c\big(p(t,e), m(t,e)\big) - p(t,e) \cdot \frac{\partial v(t,e)}{\partial e}\right] + \frac{\sigma_t^2}{2} \frac{\partial^2 v(t,e)}{\partial e^2} = 0. \tag{20}$$

The evolution of the mean field is described by an FPK equation. As it evolves forward in time, it is also called the forward equation. The forward equation of this MFG can be derived as

$$\frac{\partial m(t,e)}{\partial t} + \frac{\partial}{\partial e}\big(m(t,e)\,p(t,e)\big) + \frac{\sigma_t^2}{2} \frac{\partial^2 m(t,e)}{\partial e^2} = 0. \tag{21}$$

To obtain the solution of the MFG, the two coupled PDEs in (20) and (21) must be solved. The FPK-type equation evolves forward in time and governs the evolution of the density function of the agents, whereas the HJB-type equation evolves backward in time and governs the computation of the optimal path for each agent. Assuming that the optimal control above starts at time $t$ with $T \ge t \ge 0$, we obtain the Bellman function $v(t,e)$ as

$$v(t,e) = \mathbb{E}\left[\int_t^T \left(\omega_1\left(\frac{p_i g_{i,i}(\tau)}{\bar{I}_i(\tau) + N_0} - \gamma_{th}\right)^2 + \omega_2 p_i(\tau)\right) d\tau\right]. \tag{22}$$

C. MFG Equilibrium and Finite Difference Method

Our goal is to obtain the identical strategy for all the nodes. As derived above, the coupled forward-backward equations yield the solution of this MFG. A finite difference technique is used in this section to solve the equations [38]. In numerical analysis, finite difference schemes solve partial differential equations numerically by approximating the derivatives that appear in the equations with finite differences. The HJB equation evolves backward in time and determines the optimal power strategy, whose solution gives the minimum cost. Correspondingly, the FPK equation evolves forward in time and gives the motion of the mean field under the current control strategy. By solving the two coupled equations iteratively, the solution of the MFG is eventually obtained.

To obtain the numerical solution of the MFG, the solution space is first discretized. The time horizon $T$ is discretized into intervals as $[0, T_{\max}\Delta t]$, where the step size in time is $\Delta t$. Similarly, the state of each player, which satisfies the constraint $e \in [0, e_{\max}]$, is discretized as $[0, e_{\max}\Delta e]$ with step size $\Delta e$. Accordingly, all the functions of time and state become $T_{\max} \times (e_{\max} + 1)$ matrices. To simplify the notation, a function evaluated at time $t\Delta t$ and state $e\Delta e$ is written with integer indices; for instance, the cost function is denoted as $C(t,e)$. Based on the upwind difference scheme, the derivatives with respect to time and state in the continuous scenario can be reformulated as

$$\frac{\partial c(t,e)}{\partial t} = \frac{c(t+1,e) - c(t,e)}{\Delta t}, \tag{23}$$

$$\frac{\partial c(t,e)}{\partial e} = \frac{c(t,e) - c(t,e-1)}{\Delta e}, \tag{24}$$

$$\frac{\partial^2 v(t,e)}{\partial e^2} = \frac{v(t,e+1) - 2v(t,e) + v(t,e-1)}{(\Delta e)^2}. \tag{25}$$

First, to evolve the mean field, we discretize the FPK equation. The mean field evolution equation forward in time can be derived as

$$M(t+1,e) = M(t,e) + \frac{\Delta t}{\Delta e}\big[M(t,e-1)P(t,e-1) - M(t,e)P(t,e)\big] + \frac{\sigma_t^2 \Delta t}{2(\Delta e)^2}\big[2M(t,e) - M(t,e+1) - M(t,e-1)\big]. \tag{26}$$

The HJB equation is used to update the optimal value function. By applying the first-order necessary condition to the Hamiltonian, the optimal power control can be derived as

$$p^*(t,e) = \frac{2\bar{I}_i(t)\gamma_{th} - \omega_2 + \dfrac{\partial v(t,e)}{\partial e}}{2\omega_1 \bar{I}_i(t)}. \tag{27}$$

Substituting $p^*$ back into the HJB equation, the evolution equation of the optimal value function can be derived after some algebraic steps as

$$V(t-1,e) = V(t,e) + \frac{\Delta t}{\Delta e} P^*(t,e)\big(V(t+1,e) - V(t,e)\big) + \Delta e\, C\big(P^*(t,e), M(t,e)\big) + \frac{\sigma_t^2 \Delta t}{2(\Delta e)^2}\big[2V(t,e) - V(t,e+1) - V(t,e-1)\big]. \tag{28}$$

To carry out the joint evolution of the mean field and the optimal strategy iteratively, some boundary conditions are necessary. When the time or state index falls out of range, the corresponding entry does not exist; for instance, $V(t, e+1)$ is undefined when $e = e_{\max}$. In this situation, we replace $V(t, e_{\max}+1)$ by $V(t, e_{\max})$. As a node acts only against the mean field $m$, iteratively solving equations (26), (27), and (28) gives a convergent solution [29]. The whole algorithm for obtaining the solution of the MFG is given in Algorithm 2. To run this algorithm, each node must know the initial mean field distribution, which can be provided by the central controller. To express the cost function, $\bar{g}_{j,i}(t)$ is obtained with the MFA before the game begins. The algorithm stops when the number of iterations exceeds a threshold value.

Algorithm 2 Power Control Algorithm of Distributed Wireless Powered Sensor Nodes
Require: Set up T_max × (e_max + 1) matrices V, M, P.
    M(0,:): initial mean field distribution.
    P(t,e): arbitrary initial power value for the generic sensor node.
    V(t_max,:) = 0, n = 1.
while n < MAX do
    for all i = 1 : 1 : t_max do
        for all j ∈ {0, ..., e_max} do
            Solve the FPK equation to obtain M with (26).
        end for
    end for
    for all i = t_max : −1 : 2 do
        for all j ∈ {0, ..., e_max} do
            Calculate V(i − 1, j) by using (28).
        end for
    end for
    Calculate the new power value P*(t_max, (e_max + 1)) using (27).
    Regressively update P = aP + bP* with a + b = 1.
    n = n + 1.
end while

V. NUMERICAL RESULTS AND DISCUSSION

In this section, we illustrate the performance of the proposed scheme with numerical simulation results. In the simulation, we consider a network of 400 sensor nodes deployed in a square region of 50 m × 50 m. Before the experiment, we simplify the model to reduce the complexity of the simulation. We use one UAV to charge the sensors, and the charging cycle $T$ ensures that the sensors continue to work during this period. In addition, we ignore the flight delay of the UAV and assume that the UAV has enough energy. Some parameters are shown in Table I.

TABLE I: PARAMETER SETTING

Fig. 3. Mean square error between prediction and real values under different hidden nodes.

Fig. 3 shows the mean square error between the predicted values and the real values for different numbers of hidden nodes. We predict the energy consumption of the sensor nodes with the ESN model. Since the ESN model trains only the output weight matrix, the choice of parameters, especially the hidden layer parameters, directly affects the prediction results of the entire model. In Fig. 3, the horizontal axis is the number of iterations and the vertical axis is the mean square error between the predicted values and the real values. We mainly test how the prediction performance of the network is affected by different parameter values, for example, the learning rate and the number of hidden nodes in the ESN. The mean square error is used because it indicates the deviation of the predicted values from the actual values. We can see that as the number of hidden layer nodes $M$ increases, the mean square error decreases, indicating that the predictions of the ESN model approach the real values; the final value of $M$ used in the model is also obtained experimentally. Fig. 3 also shows that, with the other parameters fixed, the lower the update rate of the hidden layer is, the more accurate the prediction results are. This is mainly because a lower learning rate in the training process means a smaller adjustment step at each iteration, which leads to higher prediction accuracy. In addition, Fig. 3 shows that the prediction accuracy improves as the number of iterations increases.

Fig. 4. Sensor node clustering based on energy prediction.

Fig. 4 shows a 50 m × 50 m sensor network with 400 randomly distributed sensor nodes whose locations are known in advance. Because energy consumption differs among the sensor nodes, some nodes clearly consume more energy than others. We simulate the cases with and without clustering. As shown in Fig. 4, the red nodes are those selected for their higher energy consumption. The solid circles indicate the charging coverage areas obtained with the proposed clustering algorithm, while the dotted circles represent the charging coverage areas without clustering. The values attached to the circles denote the energy consumption; the larger the value, the more energy the node consumes. Compared with the case of no clustering, the clustering algorithm of this paper achieves higher charging efficiency under the same UAV charging coverage. Based on the prediction and clustering results, the charging path can be planned according to the shortest path principle, as shown in Fig. 5. The charging path varies from one charging period to another.

Fig. 5. The charging path based on the clustering with the shortest path principle.

Fig. 6. Total wireless powered sensor nodes packet loss rate with different sensor node sizes.

Fig. 6 shows how the total packet loss rate of the wireless powered sensor nodes changes with the number of sensor nodes under different strategies. Regardless of the number of wireless powered sensor nodes, the total packet loss rate with energy consumption prediction is significantly lower than that without prediction. Moreover, the larger the number of sensor nodes is, the smaller the total packet loss rate with prediction becomes relative to that without prediction. The main reason is that prediction selects the nodes with high energy consumption more accurately.
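The prediction-driven selection and path planning behind Figs. 5 and 6 can be sketched in a few functions. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the predicted consumptions and coordinates are made up, and the tour uses a greedy nearest-neighbor heuristic as a stand-in for the shortest path principle, whose exact method is not specified here. A greedy tour is not guaranteed to be the shortest one.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def packet_loss_rate(n_power_down: int, n_total: int) -> float:
    """PLR = N_p / N_t: share of nodes that have run out of power."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_power_down / n_total


def select_priority_nodes(predicted: Sequence[float], k: int) -> List[int]:
    """Indices of the k nodes with the largest predicted energy consumption."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    return order[:k]


def nearest_neighbor_tour(station: Point,
                          stops: Sequence[Point]) -> Tuple[List[int], float]:
    """Greedy tour: leave the station, always fly to the nearest unvisited
    stop, and return to the station. Returns (visit order, tour length).

    Assumption: this heuristic only approximates a shortest charging path.
    """
    remaining = list(range(len(stops)))
    order: List[int] = []
    position = station
    length = 0.0
    while remaining:
        nxt = min(remaining, key=lambda i: math.dist(position, stops[i]))
        length += math.dist(position, stops[nxt])
        position = stops[nxt]
        order.append(nxt)
        remaining.remove(nxt)
    length += math.dist(position, station)  # fly back to the energy station
    return order, length


if __name__ == "__main__":
    # Hypothetical predicted consumptions and cluster-center coordinates.
    centers_idx = select_priority_nodes([0.2, 0.9, 0.5, 0.7], k=2)
    tour, dist = nearest_neighbor_tour((0.0, 0.0), [(3.0, 0.0), (1.0, 0.0), (2.0, 0.0)])
    print(centers_idx, tour, dist, packet_loss_rate(20, 400))
```

Separating selection from routing mirrors the two-stage strategy described in the text: the ranking decides which regions are urgent, and the tour decides in which order the UAV visits them.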
Compared with the random charging strategy, the charging strategy based on prediction is more targeted, especially when the number of nodes is large. Fig. 6 also shows that, whichever charging strategy is used, the total packet loss rate of the sensor nodes keeps increasing as the number of sensor nodes increases. This is because the sensor nodes are randomly distributed in the region: owing to the different distances, the UAV cannot charge all the nodes in time, so some nodes run out of power and lose packets.

In the following, we illustrate the behavior of the nodes and the distribution of the mean field under the proposed power control MFG framework. We consider a WSN in which the radius of the peer-to-peer links is uniformly distributed between 10 m and 40 m, and the SINR threshold is 8 dB. We set $e_{\max} = 0.1$ J, $T = 0.5$ s, and $p_{\max} = 0.01$ W. We assume $N_s = 50$, which means that these nodes share the same channel and interfere with one another. Without loss of generality, the initial energy distribution $m(0,:)$ is set to be uniform. It is worth noting that some nodes may exhaust all their energy in particular cases.

Fig. 7. Mean field distribution with a uniform initial energy distribution.

Fig. 8. Cross-sections of the mean field distributions.

The mean field at the equilibrium is shown in Fig. 7. We can observe that the number of nodes at higher energy levels decreases with time, while the number of nodes with lower energy gradually increases. This implies that all the nodes tend to increase their power. The probability of nodes having zero energy increases at the beginning of the time frame and finally settles to a constant. Not all the nodes drain their batteries to improve the SINR, because the quadratic term of the cost function discourages the nodes from increasing their transmit power after the threshold value is satisfied.

As shown in Fig. 8, cross-sections of the mean field give a clearer view of how the probability of nodes having a certain energy varies with time. In this simulation, the initial probabilities are identical because the mean field is assumed to be uniform at the beginning. The goal of each node is to minimize the cost function. Therefore, the probability of nodes being at the maximum energy level drops to zero soon after the beginning, and the number of nodes with zero energy increases with time. The probability of nodes with 0.04 J increases slightly at the end. The mean field at 0.06 J remains stable for a while and then decreases. Another phenomenon is that the rates of variation of these cross-sections decrease with time. The reason is that, as time goes on, more nodes transmit at higher power, which causes larger interference. Hence, to improve the SINR, each node needs more power, which costs more energy.

In Fig. 9, we show the power policies of the nodes. Each node can adjust the dynamical system to choose the corresponding power at each time instant. It is notable that the length of the control time interval can be set flexibly. All the nodes increase their power as soon as the control period begins and continue to increase it with time. This is because the increasing power raises the interference, so that each node can no longer make a greater impact on the value function by increasing its power.

Fig. 9. Optimal power control policy with uniform initial energy distribution.

Fig. 10. Variation of a generic node's average SINR with time.

Fig. 11. Variation of average SINR under two power control policies.

Finally, we investigate the average spectrum efficiency of the nodes during the control period. We use the average SINR as the performance index to describe the improvement in communication quality achieved by the proposed algorithm. Fig. 10 depicts the average SINR under the proposed MFG control policy.
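For reference, an SINR value of the kind plotted here can be evaluated from a power policy and a mean field as sketched below. The sketch follows the mean-field interference expression of Section IV-B, $\bar{I}(t) = \bar{g} \int_0^{e_{\max}} p(t,e)\,m(t,e)\,de$, and approximates the integral by a Riemann sum at one time instant. The function names and every numerical value (channel gains, noise power, grid) are illustrative assumptions and are not the settings used for the figures.

```python
import math
from typing import Sequence


def mean_field_interference(g_bar: float, policy: Sequence[float],
                            density: Sequence[float], de: float) -> float:
    """Riemann-sum approximation of g_bar * integral of p(e) * m(e) de.

    `policy[k]` and `density[k]` are the power and the mean-field density at
    the k-th energy grid point; `de` is the grid spacing.
    """
    return g_bar * sum(p * m for p, m in zip(policy, density)) * de


def sinr_db(power: float, g_ii: float, interference: float, noise: float) -> float:
    """SINR in dB for transmit power `power` and direct channel gain `g_ii`."""
    if power <= 0.0:
        raise ValueError("power must be positive to express the SINR in dB")
    return 10.0 * math.log10(power * g_ii / (interference + noise))


if __name__ == "__main__":
    # Hypothetical snapshot: uniform density on 4 grid points, equal powers.
    grid_policy = [0.01, 0.01, 0.01, 0.01]   # W
    grid_density = [2.5, 2.5, 2.5, 2.5]      # integrates to 1 with de = 0.1
    i_bar = mean_field_interference(1e-3, grid_policy, grid_density, de=0.1)
    print(i_bar, sinr_db(0.01, 1.0, i_bar, noise=1e-3))
```

Comparing the returned value with the SINR threshold (8 dB in this section) indicates whether a given policy satisfies the QoS constraint at that time instant.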
In the simulations, we assume that all the sensors choose their power policies only according to their states and the mean field. It can be seen that the average SINR is always larger than the SINR threshold. This also verifies that the designed cost function leads to better energy efficiency, since it reduces the power once the SINR threshold is satisfied. To verify the performance of the proposed MFG algorithm, we use a uniform power control scheme as a benchmark. We assign a moderate common power to all the nodes so that the benchmark does not consume excessive power. The simulation results are shown in Fig. 11. The two solid curves represent the average SINR of all the nodes over time. When the density of the network becomes larger, the average SINR under the uniform power scheme decreases rapidly. The results show that, compared with the uniform power scheme, the control policy obtained with the proposed algorithm performs better.

VI. CONCLUSION

In this paper, we have investigated a UAV-aided WP-IoT network, taking both the charging policy of the UAV and the interference mitigation of the WP-IoT into consideration. From the numerical simulation results, we can draw the following conclusions. First, the total packet loss rate of the sensor nodes with ESN-based energy consumption prediction is markedly lower than that without prediction. Based on the prediction and clustering results, the UAV charging path can be determined according to the shortest path principle. Second, we have modeled the interference control problem as an MFG, and the results verify that the designed cost function achieves better energy efficiency. Within the MFG framework, the proposed finite difference algorithm based on the upwind scheme obtains the optimal power control policy, which improves the average SINR of each sensor node and lengthens the wireless charging interval.
In particular, the algorithm can be executed offline, and the resulting MFG solution can then be used in practice to implement the proposed scheme. Finally, the average spectrum efficiency and the network utility are significantly improved by the proposed algorithms compared with the benchmark schemes.

REFERENCES

[1] S. M. R. Islam, D. Kwak, M. H. Kabir, M. Hossain, and K.-S. Kwak, “The Internet of Things for health care: A comprehensive survey,” IEEE Access, vol. 3, no. 2, pp. 678–708, Jun. 2015.
[2] T. Azzabi, H. Farhat, and N. Sahli, “A survey on wireless sensor networks security issues and military specificities,” in Proc. Int. Conf. Adv. Syst. Electr. Technol., Hammamet, Tunisia, Jan. 2017, pp. 66–72.
[3] Z. Chu, W. Sun, and J. Wang, “Research on MAC layer communication performance model of wireless sensor networks for intelligent transportation,” in Proc. Int. Conf. Autom. Comput., Colchester, U.K., Sep. 2016, pp. 366–371.
[4] S. A. Imam, A. Choudhary, A. M. Zaidi, M. K. Singh, and V. K. Sachan, “Cooperative effort based wireless sensor network clustering algorithm for smart home application,” in Proc. IEEE Int. Conf. Integr. Circuits Microsyst., Nanjing, China, Nov. 2017, pp. 304–308.
[5] S. Aroua, I. El Korbi, Y. Ghamri-Doudane, and L. A. Saidane, “A distributed cooperative spectrum resource allocation in smart home cognitive wireless sensor networks,” in Proc. IEEE Symp. Comput. Commun., Heraklion, Greece, Jul. 2017, pp. 754–759.
[6] J. Cabra, D. Castro, J. Colorado, D. Mendez, and L. Trujillo, “An IoT approach for wireless sensor networks applied to e-health environmental monitoring,” in Proc. IEEE Int. Conf. Internet Things, Exeter, U.K., Feb. 2017, pp. 578–583.
[7] A. Mathur, T. Newe, M. Rao, W. Elgenaidi, and D. Toal, “Cluster head election and rotation for medical-based wireless sensor networks,” in Proc. Int. Conf. Control, Decis., Inf. Technol., Barcelona, Spain, Nov. 2017, pp. 149–154.
[8] L. Xie, Y. Shi, Y. T. Hou, and W. Lou, “Wireless power transfer and applications to sensor networks,” IEEE Wireless Commun. Mag., vol. 20, no. 4, pp. 140–145, Aug. 2013.
[9] F. Sangare, Y. Xiao, D. Niyato, and Z. Han, “Mobile charging in wireless-powered sensor networks: Optimal scheduling and experimental implementation,” IEEE Trans. Veh. Technol., vol. 66, no. 8, pp. 7400–7410, Aug. 2017.
[10] F. Sangare and Z. Han, “Joint optimization of cognitive RF energy harvesting and channel access using Markovian multi-armed bandit problem,” in Proc. IEEE Int. Conf. Commun. Workshops, Paris, France, May 2017, pp. 487–492.
[11] Y. Zeng and R. Zhang, “Optimized training design for wireless energy transfer,” IEEE Trans. Commun., vol. 63, no. 2, pp. 536–550, Feb. 2015.
[12] J. Xu and R. Zhang, “A general design framework for MIMO wireless energy transfer with limited feedback,” IEEE Trans. Signal Process., vol. 64, no. 10, pp. 2475–2488, May 2016.
[13] Z. Chu, F. Zhou, Z. Zhu, R. Q. Hu, and P. Xiao, “Wireless powered sensor networks for Internet of Things: Maximum throughput and optimal power allocation,” IEEE Internet Things J., vol. 5, no. 1, pp. 310–321, Feb. 2018.
[14] F. Sangare, A. Arab, M. Pan, L. Qian, S. K. Khator, and Z. Han, “RF energy harvesting for WSNs via dynamic control of unmanned vehicle charging,” in Proc. Wireless Commun. Netw. Conf., New Orleans, LA, USA, Mar. 2015, pp. 1291–1296.
[15] J. Xu, Y. Zeng, and R. Zhang, “UAV-enabled wireless power transfer: Trajectory design and energy region characterization,” in Proc. IEEE GLOBECOM Workshops (GC Wkshps), Singapore, Dec. 2017, pp. 1–7.
[16] M. Chen, W. Saad, C. Yin, and M. Debbah, “Echo state networks for proactive caching in cloud-based radio access networks with mobile users,” IEEE Trans. Wireless Commun., vol. 16, no. 6, pp. 3520–3535, Jun. 2017.
[17] M. Kamel, W. Hamouda, and A. Youssef, “Ultra-dense networks: A survey,” IEEE Commun. Surveys Tuts., vol. 18, no. 4, pp. 2522–2545, 4th Quart., 2017.
[18] H. Zhang, S. Huang, C. Jiang, K. Long, V. C. M. Leung, and H. V. Poor, “Energy efficient user association and power allocation in millimeter-wave-based ultra dense networks with energy harvesting base stations,” IEEE J. Sel. Areas Commun., vol. 35, no. 9, pp. 1936–1947, Sep. 2017.
[19] X. Li, L. Qian, and D. Kataria, “Downlink power control in co-channel macrocell femtocell overlay,” in Proc. Inf. Sci. Syst., Baltimore, MD, USA, Mar. 2009, pp. 383–388.
[20] D. Bauso, B. M. Dia, B. Djehiche, H. Tembine, and R. Tempone, “Mean-field games for marriage,” PLoS ONE, vol. 9, no. 5, p. e94933, Jun. 2014.
[21] C. Yang, Y. Zhang, J. Li, and Z. Han, “Power control mean field game with dominator in ultra-dense small cell networks,” in Proc. IEEE Global Commun. Conf., Singapore, Dec. 2017, pp. 1–6.
[22] K.-H. N. Bui and J. J. Jung, “Cooperative game theoretic approach for distributed resource allocation in heterogeneous network,” in Proc. Int. Conf. Intell. Environ., Seoul, South Korea, Aug. 2017, pp. 168–171.
[23] Z. Zhou, M. Dong, K. Ota, B. Gu, and T. Sato, “Stackelberg-game based distributed energy-aware resource allocation in device-to-device communications,” in Proc. IEEE Int. Conf. Commun. Syst., Macau, China, Nov. 2014, pp. 11–15.
[24] P. Semasinghe, E. Hossain, and K. Zhu, “An evolutionary game for distributed resource allocation in self-organizing small cells,” IEEE Trans. Mobile Comput., vol. 14, no. 2, pp. 274–287, Feb. 2015.
[25] S. D’Oro, L. Galluccio, S. Palazzo, and G. Schembra, “A game theoretic approach for distributed resource allocation and orchestration of softwarized networks,” IEEE J. Sel. Areas Commun., vol. 35, no. 3, pp. 721–735, Mar. 2017.
[26] H. Zhang, J. Du, J. Cheng, K. Long, and V. C. M. Leung, “Incomplete CSI based resource optimization in SWIPT enabled heterogeneous networks: A non-cooperative game theoretic approach,” IEEE Trans. Wireless Commun., vol. 17, no. 3, pp. 1882–1892, Mar. 2018.
[27] C. Yang, J. Li, P. Semasinghe, E. Hossain, S. M. Perlaza, and Z. Han, “Distributed interference and energy-aware power control for ultra-dense D2D networks: A mean field game,” IEEE Trans. Wireless Commun., vol. 16, no. 2, pp. 1205–1217, Feb. 2017.
[28] P. Semasinghe and E. Hossain, “Downlink power control in self-organizing dense small cells underlaying macrocells: A mean field game,” IEEE Trans. Mobile Comput., vol. 15, no. 2, pp. 350–363, Feb. 2016.
[29] T. K. Thuc, E. Hossain, and H. Tabassum, “Downlink power control in two-tier cellular networks with energy-harvesting small cells as stochastic games,” IEEE Trans. Commun., vol. 63, no. 12, pp. 5267–5282, Dec. 2015.
[30] Z. Han, D. Niyato, W. Saad, T. Başar, and A. Hjørungnes, Game Theory in Wireless and Communication Networks: Theory, Models, and Applications. Cambridge, U.K.: Cambridge Univ. Press, 2012.
[31] R. Courant, E. Isaacson, and M. Rees, “On the solution of nonlinear hyperbolic differential equations by finite differences,” Commun. Pure Appl. Math., vol. 5, no. 3, pp. 243–255, Aug. 2010.
[32] A. Anderson and H. Haas, “Using echo state networks to characterise wireless channels,” in Proc. IEEE 77th Veh. Technol. Conf. (VTC Spring), Dresden, Germany, Jun. 2013, pp. 1–5.
[33] M. Bauduin, A. Smerieri, S. Massar, and F. Horlin, “Equalization of the non-linear satellite communication channel with an echo state network,” in Proc. IEEE 81st Veh. Technol. Conf. (VTC Spring), Glasgow, U.K., May 2015, pp. 1–5.
[34] A. Y. Al-Zahrani, F. R. Yu, and M. Huang, “A joint cross-layer and colayer interference management scheme in hyperdense heterogeneous networks using mean-field game theory,” IEEE Trans. Veh. Technol., vol. 65, no. 3, pp. 1522–1535, Mar. 2016.
[35] Z. Zhu, S. Lambotharan, W. H. Chin, and Z. Fan, “A mean field game theoretic approach to electric vehicles charging,” IEEE Access, vol. 4, no. 5, pp. 3501–3510, Jun. 2016.
[36] M. Aziz and P. E. Caines, “A mean field game computational methodology for decentralized cellular network optimization,” IEEE Trans. Control Syst. Technol., vol. 25, no. 2, pp. 563–576, Mar. 2017.
[37] K. Hamidouche, W. Saad, M. Debbah, and H. V. Poor, “Mean-field games for distributed caching in ultra-dense small cell networks,” in Proc. Amer. Control Conf., Boston, MA, USA, Jul. 2016, pp. 4699–4704.
[38] M. de Mari, E. C. Strinati, M. Debbah, and T. Q. S. Quek, “Joint stochastic geometry and mean field game optimization for energy-efficient proactive scheduling in ultra dense networks,” IEEE Trans. Cogn. Commun. Netw., vol. 3, no. 4, pp. 766–781, Dec. 2017.

Lixin Li (M’12) received the B.Sc. and M.Sc. degrees in communication engineering and the Ph.D. degree in control theory and its applications from Northwestern Polytechnical University (NPU), Xi’an, China, in 2001, 2004, and 2008, respectively. He was a Post-Doctoral Fellow with NPU from 2008 to 2010. In 2017, he was a Visiting Scholar at the University of Houston, Houston, TX, USA. He is currently an Associate Professor with the School of Electronics and Information, NPU. He has authored or co-authored over 80 technical papers in journals and international conferences, and he holds 10 patents. His current research interests include wireless communications, game theory, and machine learning. He has reviewed papers for many international journals. He received the 2016 NPU Outstanding Young Teacher Award, which is the highest research and education honor for young faculty at NPU.

Yang Xu is currently pursuing the master’s degree under the supervision of Prof. L. Li with the School of Electronics and Information, Northwestern Polytechnical University, China. Her research interests include wireless caching and mean field game in wireless communication networks.

Zihe Zhang is currently pursuing the master’s degree under the supervision of Prof. L. Li with the School of Electronics and Information, Northwestern Polytechnical University, China. His research interests include mean field game, UAV communications, and simultaneous wireless information and power transfer.

Jiaying Yin is currently pursuing the master’s degree under the supervision of Prof. L. Li with the School of Electronics and Information, Northwestern Polytechnical University, China. His current research interests are mainly wireless caching and machine learning.

Wei Chen (S’05–M’07–SM’13) received the B.S. and Ph.D. degrees (Hons.) from Tsinghua University in 2002 and 2007, respectively. From 2005 to 2007, he was a Visiting Ph.D. Student at The Hong Kong University of Science and Technology. Since 2007, he has been on the Faculty at Tsinghua University, where he is currently a tenured Full Professor, the Director of the Degree Office, Tsinghua University, and a University Council Member. From 2014 to 2016, he served as a Deputy Head of the Department of Electronic Engineering. He visited Princeton University, Télécom ParisTech, and the University of Southampton in 2016, 2014, and 2010, respectively. His research interests are in the areas of communication theory, stochastic optimization, and statistical learning. Dr. Chen is a Cheung Kong Young Scholar and a member of the National Program for Special Support for Eminent Professionals, also known as the 10000-Talent Program. He has also been supported by the National 973 Youth Project, the NSFC Excellent Young Investigator Project, the New Century Talent Program of the Ministry of Education, and the Beijing Nova Program. He was a recipient of the National May 1st Labor Medal and the China Youth May 4th Medal. He received the IEEE Marconi Prize Paper Award and the IEEE Comsoc Asia Pacific Board Best Young Researcher Award in 2009 and 2011, respectively. He has served as a TPC Co-Chair for IEEE VTC-Spring 2011 and a Symposium Co-Chair for IEEE ICC and GLOBECOM.
He serves as an editor for the IEEE TRANSACTIONS ON COMMUNICATIONS. Zhu Han (S’01–M’04–SM’09–F’14) received the B.S. degree in electronic engineering from Tsinghua University in 1997 and the M.S. and Ph.D. degrees in electrical and computer engineering from the University of Maryland at College Park, College Park, MD, USA, in 1999 and 2003, respectively. From 2000 to 2002, he was an R&D Engineer of JDSU, Germantown, MD, USA. From 2003 to 2006, he was a Research Associate with the University of Maryland. From 2006 to 2008, he was an Assistant Professor at Boise State University, Boise, ID, USA. He is currently a Professor with the Electrical and Computer Engineering Department and the Computer Science Department, University of Houston, Houston, TX, USA. His research interests include wireless resource allocation and management, wireless communications and networking, game theory, big data analysis, security, and smart grid. He received the NSF Career Award in 2010, the Fred W. Ellersick Prize of the IEEE Communication Society in 2011, the EURASIP Best Paper Award for the Journal on Advances in Signal Processing in 2015, the IEEE Leonard G. Abraham Prize in the field of communications systems (Best Paper Award in IEEE JSAC) in 2016, and several best paper awards in IEEE conferences. He is currently an IEEE Communications Society Distinguished Lecturer.
