Metamaterials are artificially engineered materials whose electromagnetic properties are difficult to obtain in natural materials [1]. Owing to these properties, metamaterials have driven major advances in optics, microwave, and terahertz technologies. In optical absorbers, metamaterials enable precise control of light through mechanisms such as local resonance, the surface plasmon effect, and impedance matching [2,3], giving rise to a variety of high-performance absorbers [4–6], such as solar absorbers [7,8], dual-band absorbers [9,10], and ultra-broadband absorbers [11]. Metamaterial solar absorbers (MSAs) can efficiently absorb light across a broad operating spectrum and convert the optical energy into other forms such as heat and electricity. MSAs also hold great potential in applications such as desalination [12], wastewater purification [13], and photovoltaic energy generation [14], offering promising routes to clean and sustainable energy. Conventional MSA design often relies on finite-difference time-domain (FDTD) simulations, which require substantial expertise and considerable computation time to reach an optimal solution [15,16]. Despite significant progress in the manual design of MSAs for the visible wavelength range, finding structural parameters that yield a high-performance MSA remains a complex and time-intensive task with traditional methods [17–20]. More efficient approaches to designing broadband MSAs are therefore needed.
In recent years, artificial intelligence, particularly deep learning, has increasingly been used to assist in designing and optimizing metamaterials [21–25]. Compared with traditional FDTD-based design, deep learning significantly reduces design time and can improve the accuracy of the results. Deep learning methods have therefore become widely used in the design of broadband MSAs. For example, W. Chen et al. proposed a deep learning model called the metamaterial spectrum transformer (MST), which facilitated the design of MSAs with an average absorptivity of 94 % across the spectral range of 0.52–2.45 μm [26]. Similarly, S. Wang et al. combined deep learning with transfer learning to design MSAs with a bandwidth of 2.7 μm [27]. However, although deep learning is an efficient aid to MSA design, training such models requires tens or even hundreds of thousands of data samples, which are usually obtained through time-consuming traditional simulations.
Reinforcement learning [28] does not rely on a pre-existing dataset; instead, an agent learns through direct interaction with an environment. Guided by a given reward function, the agent explores possible actions and improves its decision-making by evaluating the reward signals returned by the environment [29]. Several studies have demonstrated the applicability of reinforcement learning to the design and optimization of nanophotonic devices [30–33]. For instance, R. Li et al. proposed the L2DO model, which could autonomously inverse-design nanophotonic laser cavities in approximately 152 h without requiring prior knowledge [34]. Similarly, I. Sajedian et al. employed a DQN model to autonomously design MSAs with an average absorptivity of 97.6 % within a month [35]. However, because reinforcement learning is exploratory by nature, reaching the desired target often takes a long time, ranging from several days to as much as a month.
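The agent-environment interaction described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_absorptivity` is a made-up stand-in for an electromagnetic simulation, the single discretized structural parameter, the three actions, and the hyperparameters are hypothetical, and tabular Q-learning is used for brevity in place of the deep Q-networks employed in the cited works.

```python
import random

N_STATES = 21            # hypothetical discretized structural parameter (index 0..20)
ACTIONS = (-1, 0, +1)    # decrease, keep, or increase the parameter by one step


def toy_absorptivity(state):
    """Made-up stand-in for a simulation: peaks at 1.0 when state == 15."""
    return 1.0 - ((state - 15) / 20.0) ** 2


def step(state, action):
    """Environment: apply the action, clip to the valid range, return the reward."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    # Reward is the change in (toy) absorptivity caused by the action.
    reward = toy_absorptivity(next_state) - toy_absorptivity(state)
    return next_state, reward


def train(episodes=2000, steps=30, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative values)."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)          # random starting structure
        for _ in range(steps):
            if rng.random() < epsilon:           # explore
                a = rng.randrange(len(ACTIONS))
            else:                                # exploit current estimate
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward = step(state, ACTIONS[a])
            td_target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (td_target - q[state][a])
            state = next_state
    return q


def greedy_rollout(q, state, steps=30):
    """Follow the learned greedy policy from a given starting state."""
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _ = step(state, ACTIONS[a])
    return state
```

In this toy setting the agent needs no dataset, only the reward signal, but it must visit many states before its policy becomes reliable. With a real simulation behind each `step` call, that exploration is what makes the approach slow.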
In this paper, we propose a model architecture, D-3DQN, which combines deep learning (DL) and reinforcement learning (RL) for the optimal design of high-performance MSAs. First, we train a residual fully connected neural network (RFN) on a relatively small dataset (approximately 2900 data points). Exploiting the fast inference of the trained RFN, we obtain an initial set of optimized structural parameters for the MSA, which serve as the starting states for RL training. We then further optimize the absorption performance of the MSA using the Dueling Double DQN (3DQN) algorithm, in which the agent explores and interacts with the environment to progressively approach and reach the target absorption performance. Using the D-3DQN model, we design MSAs with average absorptivities of 98.51 % and 98.32 % over the 0.4–2.8 μm wavelength range. Compared with the results obtained using the DL model or the RL model alone, the D-3DQN model achieves superior performance. This indicates that a DL model trained on a small dataset can improve the exploration efficiency of RL by providing a better starting point, while RL can compensate for the limited ability of DL to generalize beyond its training dataset.
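The two ingredients named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in this work: the functions below follow the standard textbook formulations of the dueling aggregation and the Double DQN target, network outputs are passed in as plain Python lists, and `pick_initial_state` with its `surrogate` argument is a hypothetical helper standing in for the RFN-based selection of a starting state.

```python
def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def pick_initial_state(candidates, surrogate):
    """Warm start (hypothetical helper): rank candidate structures with a fast
    surrogate model and return the best one as the RL starting state."""
    return max(candidates, key=surrogate)
```

A short usage example with made-up numbers:

```python
q = dueling_q_values(1.0, [0.0, 1.0, 2.0])
y = double_dqn_target(1.0, next_q_online=[0.2, 0.9], next_q_target=[0.5, 0.1], gamma=0.5)
start = pick_initial_state([(1,), (2,), (3,)], lambda p: -(p[0] - 2) ** 2)
```

Separating action selection from action evaluation reduces the overestimation bias of plain DQN, and starting from a surrogate-selected state shortens the exploration needed before the agent reaches high-reward regions.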