I. INTRODUCTION
We exploit these features to train GNNs to propagate electromagnetic field distributions in time for arbitrary domain sizes and material distributions. We stress that in this approach, we do not necessarily seek steady-state solutions of the field but, instead, seek to advance the field propagation by a certain amount of time. The main purpose of our GNNs is to learn the FDTD scheme while exploiting all benefits offered by GNNs, e.g., working on unstructured grids, extrapolating to larger domain sizes, and extrapolating to material distributions not seen in the training process.
This paper is structured as follows: first, we discuss our methods, including FDTD, graphs, and graph neural networks. Subsequently, we outline the physical setup and training data generation. Here, we follow two approaches and focus either on the prediction of the electric field only or on performing full-field predictions. Finally, we conclude with our results and sketch improvements, extensions, and applications.
II. METHODS
A. The finite-difference time-domain method
The finite-difference time-domain method (FDTD) is an iterative numerical update procedure to compute the evolution of an electromagnetic field in space and time.32 For reasons of consistency, we will briefly sketch the approach in 2D assuming a transverse-magnetic (TM) polarization, starting from the source-free Maxwell equations. Please note that, since different conventions exist, TM is understood in this contribution as the situation where, in an x–y plane, the considered electric field has a non-zero component perpendicular to this plane, while the magnetic field has non-zero components in this plane. An understanding of how the FDTD works is necessary to appreciate the subsequent formulation in the form of a graph, which is the natural starting point for a graph neural network to solve these equations.

For simplicity, we take as the point of departure Maxwell's equations for a non-dispersive material characterized in the spatial domain by permittivity ε(r), permeability μ(r), electric conductivity σ(r), and magnetic conductivity σm(r). These Maxwell equations read as

$$-\sigma_m(\mathbf{r})\mathbf{H}(\mathbf{r},t) - \mu(\mathbf{r})\frac{\partial \mathbf{H}(\mathbf{r},t)}{\partial t} = \nabla\times\mathbf{E}(\mathbf{r},t) = \hat{\mathbf{e}}_x\frac{\partial E_z(\mathbf{r},t)}{\partial y} - \hat{\mathbf{e}}_y\frac{\partial E_z(\mathbf{r},t)}{\partial x}, \qquad (1)$$

$$\sigma(\mathbf{r})\mathbf{E}(\mathbf{r},t) + \varepsilon(\mathbf{r})\frac{\partial \mathbf{E}(\mathbf{r},t)}{\partial t} = \nabla\times\mathbf{H}(\mathbf{r},t) = \hat{\mathbf{e}}_z\frac{\partial H_y(\mathbf{r},t)}{\partial x} - \hat{\mathbf{e}}_z\frac{\partial H_x(\mathbf{r},t)}{\partial y}. \qquad (2)$$

In these equations, E(r, t) is the electric field, H(r, t) is the magnetic field, σm(r) is the magnetic conductivity, σ(r) is the electric conductivity, and ε(r) and μ(r) are the permittivity and permeability, respectively. If dispersive materials are required to be considered, auxiliary differential equations would be needed to supplement these Maxwell equations. These auxiliary equations would describe the temporal evolution of a polarization or magnetization depending on the electric or magnetic field, respectively.
In our considered setting, i.e., the propagation of the fields in the x–y plane and with the restriction to TM polarization, the differential equations that describe the spatial and temporal evolution of the three involved field components are

$$-\sigma_m H_x(\mathbf{r},t) - \mu\frac{\partial H_x(\mathbf{r},t)}{\partial t} = \frac{\partial E_z(\mathbf{r},t)}{\partial y}, \qquad (5)$$

$$\sigma_m H_y(\mathbf{r},t) + \mu\frac{\partial H_y(\mathbf{r},t)}{\partial t} = \frac{\partial E_z(\mathbf{r},t)}{\partial x}, \qquad (6)$$

$$\sigma E_z(\mathbf{r},t) + \varepsilon\frac{\partial E_z(\mathbf{r},t)}{\partial t} = \frac{\partial H_y(\mathbf{r},t)}{\partial x} - \frac{\partial H_x(\mathbf{r},t)}{\partial y}. \qquad (7)$$

To render the equations manageable on a computer, space and time are discretized in the FDTD on a finite grid spaced by Δx = Δy and Δt. For the spatial discretization, we define a grid indexed by m and n for the x- and y-direction, respectively; thus, r = (x, y) = (mΔx, nΔy). The temporal step is bounded by the Courant stability condition, Δt ≤ Δx/(√2 c). In the lossless case (σ = 0), the update equation for the electric field then reads

$$E_z\big|_{m,n}^{q+1} = E_z\big|_{m,n}^{q} + \frac{\Delta t}{\varepsilon\,\Delta x}\left(H_y\big|_{m+\frac{1}{2},n}^{q+\frac{1}{2}} - H_y\big|_{m-\frac{1}{2},n}^{q+\frac{1}{2}}\right) - \frac{\Delta t}{\varepsilon\,\Delta y}\left(H_x\big|_{m,n+\frac{1}{2}}^{q+\frac{1}{2}} - H_x\big|_{m,n-\frac{1}{2}}^{q+\frac{1}{2}}\right). \qquad (8)$$

Nearly identical expressions can be found for Hx and Hy, respectively. A detailed derivation can be found in the literature.7,33 The idea of the FDTD is now to evolve the fields in a leap-frog scheme. To be specific, the spatial differences of the magnetic field allow advancing the electric field in time. Afterward, the spatial differences of the electric field allow advancing the magnetic field in time. To solve the PDE, we have to impose boundary conditions and potentially perfectly matched layers, i.e., an impedance-matched material that absorbs incoming radiation without any reflection. Such a material is necessary to imitate infinite open space on a finite grid. To execute an actual simulation, we also need to define sources of electromagnetic radiation. We provide more details on these aspects in the section titled Results since they are related to the training data generation.
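To make the leap-frog scheme concrete, the following minimal NumPy sketch advances Ez, Hx, and Hy on a staggered grid according to Eqs. (5)–(8) in the lossless case. The grid size, the hard line source, and the omission of perfectly matched layers are simplifying assumptions for illustration; this does not reproduce the full simulation setup used later.

```python
import numpy as np

c = 1.0                            # speed of light in natural units
wavelength = 1.0
dx = dy = 0.05
dt = dx / (np.sqrt(2) * c)         # 2D Courant stability limit
nx, ny, n_steps = 200, 200, 400

eps = np.ones((nx, ny))            # permittivity map (vacuum here)
mu = np.ones((nx, ny))             # permeability map
Ez = np.zeros((nx, ny))
Hx = np.zeros((nx, ny - 1))        # staggered (Yee) positions
Hy = np.zeros((nx - 1, ny))

for q in range(n_steps):
    # Advance H by half a step from spatial differences of Ez, Eqs. (5), (6).
    Hx -= dt / (mu[:, :-1] * dy) * (Ez[:, 1:] - Ez[:, :-1])
    Hy += dt / (mu[:-1, :] * dx) * (Ez[1:, :] - Ez[:-1, :])
    # Advance Ez by a full step from spatial differences of H, Eq. (8).
    Ez[1:-1, 1:-1] += dt / eps[1:-1, 1:-1] * (
        (Hy[1:, 1:-1] - Hy[:-1, 1:-1]) / dx
        - (Hx[1:-1, 1:] - Hx[1:-1, :-1]) / dy)
    # Hard source: time-harmonic line source near the left edge.
    Ez[5, :] = np.sin(2 * np.pi * (c / wavelength) * q * dt)
```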
B. Graph representation

As sketched in Sec. II A, the whole FDTD formalism is based on a local and iterative updating scheme. This makes graph neural networks appear very suitable for treating this problem: GNNs rely on the information exchange of neighboring nodes, which is ideal to capture the locality of the updating procedure. Moreover, FDTD works independently of the domain size, which translates to a GNN working on arbitrary input graphs.

The purpose of this section is a brief introduction to graph representations and graph neural networks, so that it becomes clear how and why GNNs are potentially useful for FDTD computations.
A graph G(V, E) comprises a set of nodes or vertices V = {v1, v2, …} and edges E ⊆ V × V. We consider here only undirected graphs; therefore, every edge represents an undirected link between two graph nodes, eij = (vi, vj) ∈ E. Both nodes and edges hold a set of attributes, the node features hv ∈ R^D and the edge features he ∈ R^C, respectively. In general, the number of features per node or edge can vary, leading to so-called heterogeneous graphs. In our case, however, the number of features is fixed, making our graphs homogeneous. Additionally, we assign spatial positions pi = (pi,x, pi,y) to the nodes that coincide with the simulation coordinates.
Graphs are very suitable to represent unstructured data, such as networks, molecules, or meshes.22–24 For the purpose of field propagation, we represent the electromagnetic field and material distribution as graphs. To keep things simple in this initial work, we concentrate on dielectric materials only, for which a spatially resolved dielectric function suffices as a description. We can think of two possible graphs.

One approach is a direct representation of the Yee grid; see Fig. 1(a). Here, the node features hv ∈ R² encode the electric field Ez(r, t0) and the permittivity ε(r). In addition, the edge features he ∈ R⁴ comprise the magnetic field components Hx,y(r, t0) and the relative Cartesian vector rij of linked nodes,

$$\mathbf{r}_{ij} = \left(p_{j,x} - p_{i,x},\; p_{j,y} - p_{i,y}\right). \qquad (9)$$

Another approach is a meshing of the domain [see Fig. 1(b)] with equivalent edge and node features. While, technically, there is no distinction between a regular grid and a mesh at the level of the graph, it makes a tremendous difference at the level of the FDTD. In the FDTD, frequently, a regular grid is used, which causes a staircase approximation of the surface of rounded objects. This approximation can lead to computational artifacts that can be avoided with such a mesh.

In both cases, the choice of resolution is important. It does not have to be as fine as the one chosen for the FDTD simulation, but a too-low resolution leads to poor results. This is not necessarily caused by bad predictions but rather by a poor interpolation of the predictions to obtain the fields at all points in space inside the domain. Overall, the resolution is a trade-off between exceedingly large graphs, which slow down the computation, and too few nodes to achieve accurate results at all points in space. In this work, we use a resolution found by manual tuning, choosing a sweet spot between these extremes.
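As an illustration of the grid-based representation, the following sketch assembles a Yee-grid graph with the node and edge features described above, here as a PyTorch Geometric Data object. The field-array shapes follow the staggered layout of the FDTD sketch above; the exact feature ordering and the use of PyTorch Geometric are assumptions of this sketch.

```python
import numpy as np
import torch
from torch_geometric.data import Data

def yee_grid_graph(Ez, eps, Hx, Hy, dx):
    # Ez, eps: (nx, ny); Hx: (nx, ny - 1); Hy: (nx - 1, ny), as in the
    # FDTD sketch above. Feature ordering is an assumption.
    nx, ny = Ez.shape
    idx = np.arange(nx * ny).reshape(nx, ny)
    # Node features h_v = (Ez, eps), Yee-grid node positions flattened row-wise.
    x = torch.tensor(np.stack([Ez.ravel(), eps.ravel()], axis=1),
                     dtype=torch.float)
    edges, feats = [], []
    for i in range(nx):
        for j in range(ny):
            if i + 1 < nx:   # horizontal edge: carries Hy, r_ij = (dx, 0)
                edges.append((idx[i, j], idx[i + 1, j]))
                feats.append((0.0, Hy[i, j], dx, 0.0))
            if j + 1 < ny:   # vertical edge: carries Hx, r_ij = (0, dx)
                edges.append((idx[i, j], idx[i, j + 1]))
                feats.append((Hx[i, j], 0.0, 0.0, dx))
    # For an undirected graph, reversed edges (with negated r_ij) would be
    # appended as well; omitted here for brevity.
    edge_index = torch.tensor(edges, dtype=torch.long).t().contiguous()
    edge_attr = torch.tensor(feats, dtype=torch.float)
    return Data(x=x, edge_index=edge_index, edge_attr=edge_attr)
```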
C. Graph neural networks
Having discussed the classical FDTD algorithm used to advance electromagnetic fields in space and time, and how to represent the discretized Maxwellian problem in terms of a graph, we now provide a short introduction to graph neural networks. In short, the purpose of these networks will be to learn how to propagate a given input field in space and time.
A graph neural network (GNN) is an artificial neural network (ANN) that works on graphs. In contrast to conventional ANNs, such as linear or convolutional networks, a GNN does not rely on a specific shape and size of the input data. It works on any graph, independent of its number of nodes, edges, or connectivity, and it offers permutation invariance. A GNN can be interpreted as a graph operator F that acts on a graph G and returns a new graph G′ with updated features and potentially new nodes and edges. GNNs can perform tasks on different levels, specifically the node level, edge level, and graph level. We want to predict the electric field at each point in space after a fixed time step. This corresponds to a node-level regression task since the nodes encode the electric field in space. The graph that our network takes as input represents the field at a certain moment in time, and the network returns the field after one time step.
Our network comprises several components, as depicted in Fig. 2. First, two encoders lift the graph's node and edge features into a high-dimensional embedding space. Both encoders are fully connected feedforward networks of three linear layers with ReLU activation. The number of features increases with every MLP layer to 32, 64, and 128. Node and edge features must have the same dimension to enable the following generalized graph convolution (GCN) layers.34 To perform the graph convolution, each node vi gathers the node and edge features of every neighboring node vj : j ∈ N(i) and connecting edge eij, the so-called message passing. Subsequently, the messages are aggregated (message aggregation) and added to the original node features. Finally, a multi-layer perceptron (MLP) processes the features. With that, we obtain updated node features v′i. Following Ref. 34, the whole operation reads

$$\mathbf{h}'_{v_i} = \mathrm{MLP}\Big(\mathbf{h}_{v_i} + \max_{j\in\mathcal{N}(i)}\big[\mathrm{ReLU}\big(\mathbf{h}_{v_j} + \mathbf{h}_{e_{ij}}\big) + \epsilon\big]\Big), \qquad (10)$$

where ε is a small positive parameter to ensure non-zero division in the aggregation scheme, for which we use max aggregation. Additionally, we use a layer normalization before the non-linear activation, and we add residual connections after every GCN layer. The message passing allows information exchange between connected neighbors, which we use to adapt the updating scheme performed on the Yee grid. However, since we use several GCN layers, we achieve information exchange not only between directly connected nodes but also between nodes farther away. Finally, we have a single decoder, another three-layered MLP, that projects the resulting features of each node back to a single value, the updated electric field. Please note that the whole operation on the input graph G is independent of the number of nodes and edges; only the number of input features is fixed. In general, this is one of the main strengths of GNNs, which we exploit to treat different domain sizes with the same model.

We trained GNNs with different hyperparameters to find a suitable configuration. However, we observed only minor differences in the final performance; thus, a qualitative reproduction of our results can also be achieved with different parameters. In our experiments, we found that a minimum of ∼300 000 trainable parameters is needed.
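The following sketch summarizes the described architecture using PyTorch Geometric's GENConv layer, which implements the generalized graph convolution of Ref. 34 with max aggregation. The number of GCN layers and the decoder widths (mirroring the encoder) are not specified in the text and are assumptions here.

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GENConv

class FieldGNN(nn.Module):
    def __init__(self, node_in=2, edge_in=4, hidden=128, num_gcn=8):
        super().__init__()
        def mlp(d_in, widths):
            layers, d = [], d_in
            for w in widths:
                layers += [nn.Linear(d, w), nn.ReLU()]
                d = w
            return nn.Sequential(*layers)
        # Encoders: three linear layers with ReLU, widths 32 -> 64 -> 128.
        self.node_enc = mlp(node_in, [32, 64, hidden])
        self.edge_enc = mlp(edge_in, [32, 64, hidden])
        # Generalized graph convolutions (Ref. 34) with max aggregation.
        self.convs = nn.ModuleList(
            [GENConv(hidden, hidden, aggr='max') for _ in range(num_gcn)])
        self.norms = nn.ModuleList(
            [nn.LayerNorm(hidden) for _ in range(num_gcn)])
        # Decoder: projects each node embedding back to the updated Ez value.
        self.decoder = nn.Sequential(nn.Linear(hidden, 64), nn.ReLU(),
                                     nn.Linear(64, 32), nn.ReLU(),
                                     nn.Linear(32, 1))

    def forward(self, x, edge_index, edge_attr):
        h = self.node_enc(x)
        e = self.edge_enc(edge_attr)
        for conv, norm in zip(self.convs, self.norms):
            # Layer norm before activation; residual connection per layer.
            h = h + torch.relu(norm(conv(h, edge_index, e)))
        return self.decoder(h)
```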
III. RESULTS
We train different GNNs to evolve a given electromagnetic field distribution Ez(r, t0), Hx,y(r, t0) for a distinct time step Δt. We follow two different approaches for this purpose. First, we train different GNNs to perform a single time step of electric field prediction, given a comparatively low spatial resolution. We test and evaluate this for both grids and meshes. Second, we want to be able to predict the full field. Therefore, we keep a high temporal and spatial resolution and deduce the magnetic fields from the predicted electric fields following the FDTD propagation procedure. The training is performed in parallel on four NVIDIA A100 Tensor Core graphics processing units (GPUs) with 40 GB of memory each. The training time varies, depending on the model specifications, between 40 min and 4 h.
We use the open-source Python FDTD implementation meep35 for data generation and testing. We consider a plane wave source at λ = 1 µm and arbitrarily shaped non-magnetic, non-conducting, non-dispersive, dielectric 2D scatterers of ε = 2.1025 in vacuum. The domain size varies between 1 and 3 µm side length, with perfectly matched layers as boundary conditions. We chose a spatial resolution of 60 pixels/wavelength to ensure physically correct results. The source is a time-harmonic plane wave propagating along the +x-direction into the considered domain. This comparably basic setting is most suitable for initial investigations. However, the method is not restricted to a specific source or material properties; an extension to dispersive materials and different sources is possible.
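A minimal meep setup along these lines could look as follows. The scatterer geometry (a simple cylinder instead of a random shape), the PML thickness, the source placement, and the run time are illustrative assumptions; the snippet only sketches the configuration described above.

```python
import meep as mp

cell = mp.Vector3(3, 3, 0)                        # up to 3 um side length
geometry = [mp.Cylinder(radius=0.4,               # placeholder scatterer
                        material=mp.Medium(epsilon=2.1025))]
sources = [mp.Source(mp.ContinuousSource(wavelength=1.0),
                     component=mp.Ez,             # TM: Ez out of plane
                     center=mp.Vector3(-1.2, 0),
                     size=mp.Vector3(0, 3))]      # line source -> plane wave along +x
sim = mp.Simulation(cell_size=cell,
                    geometry=geometry,
                    sources=sources,
                    boundary_layers=[mp.PML(0.5)],
                    resolution=60)                # 60 pixels/wavelength
sim.run(until=5)
# Extract the Ez field over the whole cell as training data.
ez = sim.get_array(component=mp.Ez, center=mp.Vector3(), size=cell)
```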
A. Electric field prediction

To evolve the electric field only for a fixed time step, we can use a low spatial resolution and a comparatively large time step Δt = λ/(4c). This Δt coincides with 15 high-resolution time steps in meep. Such a large time step can be chosen because the underlying training data were created with a high temporal resolution and are, therefore, physically accurate. The model then learns from correct data to take larger time steps. We construct graphs as structured grids, similar to the Yee grid, and as meshes. In this case, the edge features include both Hx and Hy values.
The training set comprises 125 different shapes, sampled at 20 different time steps of discretization Δt = λ/(4c). The shapes are generated from random noise after applying a Gaussian filter. Starting from the initial field distribution, either a structured grid or a meshed graph is constructed; see Fig. 1. For meshing, we use the Python library meshpy36 to create a triangular mesh. We control the number of mesh points by restricting the volume size. The domain edges and the scatterer boundary are added manually. Subsequently, the GNN performs the temporal evolution for Δt, leading to a transformed graph G′. The loss function compares the ground truth result of the FDTD with the node features of G′ and computes the mean squared error (MSE),

$$\mathrm{MSE}(G, \hat{G}) = \frac{1}{N_v}\sum_i^{N_v}\big(h_{v,i} - \hat{h}_{v,i}\big)^2 \qquad (11)$$
$$= \frac{1}{N_v}\sum_i^{N_v}\big(E_{z,i} - \hat{E}_{z,i}\big)^2. \qquad (12)$$

Here, the FDTD ground truth values are interpolated in the case of meshed graphs to obtain the field values at every node position. The whole dataset is split into 2000 and 500 samples for training and validation, respectively. We use the Adam optimizer37 and a learning rate scheduler, which reduces the learning rate by a factor of 0.5 if no further improvement of the validation loss is observed within 10 epochs. We train for a total of 200 epochs and observe the convergence of the validation and training losses.

To obtain the resulting field at every point in space, not only at the node positions, we linearly interpolate the node values to a grid.
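A corresponding training loop is sketched below. The dataset objects (train_set, val_set, holding graphs with ground-truth node targets y), the learning rate, and the batch size are assumptions; the loss, optimizer, scheduler settings, and epoch count follow the text.

```python
import torch
from torch_geometric.loader import DataLoader

# train_set, val_set: lists of torch_geometric Data objects (assumed given).
model = FieldGNN()                                # from the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # lr assumed
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.5, patience=10)           # halve LR after 10 stagnant epochs
loss_fn = torch.nn.MSELoss()                      # Eqs. (11) and (12)

train_loader = DataLoader(train_set, batch_size=8, shuffle=True)
val_loader = DataLoader(val_set, batch_size=8)

for epoch in range(200):
    model.train()
    for batch in train_loader:
        optimizer.zero_grad()
        pred = model(batch.x, batch.edge_index, batch.edge_attr).squeeze(-1)
        loss = loss_fn(pred, batch.y)             # y: FDTD ground-truth Ez
        loss.backward()
        optimizer.step()
    model.eval()
    with torch.no_grad():
        val_loss = sum(
            loss_fn(model(b.x, b.edge_index, b.edge_attr).squeeze(-1), b.y)
            for b in val_loader) / len(val_loader)
    scheduler.step(val_loss)                      # drive the LR schedule
```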
We test the resulting models on 20 different shapes and time steps and observe excellent agreement of the GNN-predicted fields with the FDTD simulations. In the case of square grid graphs, we reach a mean squared error of 7.8 × 10⁻⁴, whereas meshing results in 4.4 × 10⁻⁴. Figure 3 shows two examples, one for the grid (a) and one for the mesh (b). Here, the leftmost panel shows the initial state. The second panel shows the field predicted by the GNN after a time step Δt = λ/(4c). The third panel shows the field computed with the FDTD for reference purposes. Both predictions from the GNNs, considering grids and meshes, are in excellent agreement with the ground truth FDTD simulations. To quantify this agreement, we evaluate the normed relative error of the GNN-predicted field Ez(r) and the FDTD-computed field Ez,0(r) at t0 + Δt,

$$\delta(\mathbf{r}, E_z, E_{z,0}) = \frac{\left|E_z(\mathbf{r}) - E_{z,0}(\mathbf{r})\right|}{\left|E_{z,0}(\mathbf{r})\right| + \overline{\left|E_{z,0}\right|}}, \qquad (13)$$

where $\overline{|E_{z,0}|}$ is the average absolute value of the correct field with respect to r. The error is shown on the right of Figs. 3(a) and 3(b). The prediction is slightly worse for smaller propagation times due to transient temporal effects arising from the source initiation. These effects are particularly hard to capture with meshes, whose elements are coarser in the spatial domain, which results in interpolation errors when converting the graph representation back to fields. However, the prediction with grids of resolution 30 pixels/wavelength covers these effects quite well.

Now, we test the trained GNNs on domain sizes outside the trained range of 1–3 µm. Figure 4 shows an example with 5 µm side length using meshing. The mean relative error for this example is ≈1.4%, which is very low. In addition, the shape of the scatterer is rather specific compared to the ones used in training. This shows that our model has learned to adapt the propagation procedure of a given electromagnetic field: it extrapolates very well to larger domain sizes and even to complex shapes not seen in training. We achieve similar results with grid graphs.
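For completeness, Eq. (13) translates directly into a few lines of NumPy; the function below evaluates the pointwise relative error between predicted and reference fields given as arrays.

```python
import numpy as np

def relative_error(ez_pred, ez_true):
    """Normed relative error of Eq. (13), evaluated pointwise on arrays."""
    mean_abs = np.mean(np.abs(ez_true))   # average |Ez,0| over r
    return np.abs(ez_pred - ez_true) / (np.abs(ez_true) + mean_abs)

# Example: delta = relative_error(ez_gnn, ez_fdtd); delta.mean() gives the
# mean deviation quoted in the text (e.g., ~1.4% for the 5 um example).
```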
B. Full field prediction

Instead of only predicting Ez(r, t0 + Δt), we want to train models that are able to compute the full electromagnetic field. For that, we add two feature channels to the node decoder to enable the direct prediction of Hx,y(r, t0 + Δt).