Many studies have confirmed the memory-enhancing effects of production, generation, and elaboration, each of which can be effective after a single encoding. It is also known that combining multiple memory strategies during encoding yields greater memory enhancement. This study investigated whether combining production with self-generated elaboration enhances memory performance compared with production or generation alone. A total of 23 undergraduate and graduate students participated. Using functional magnetic resonance imaging (fMRI), we explored the neural representation of remembering information encoded with the production and self-generated elaboration strategies. We set four encoding strategy conditions: (1) Read Silent (reading without production), (2) Read Aloud (production only), (3) Add Silent (self-generated elaboration without production), and (4) Add Aloud (production and self-generated elaboration). We examined retrieval performance and brain activity while participants retrieved the learned sentences after a one-week delay. The behavioral results showed that memory performance was highest for sentences encoded in the Add Aloud condition. The interaction between production and self-generated elaboration was statistically significant. These results suggest that the memory enhancement effect of combining production and self-generated elaboration is neither a simple additive effect nor a synergistic facilitation effect. The imaging results showed that the following areas were related to retrieval of targets encoded in the Add Aloud condition: an area related to the integration of internal and external information (precuneus), an area related to information-rich stimuli (lateral occipital lobe), an area related to self-involvement and inference of others’ feelings (medial prefrontal cortex, MPFC), an area related to scene imagery (retrosplenial region), and an area related to the adjustment of movement (cerebellum).
These results suggest that, with an encoding strategy that combines production and self-generated elaboration, several kinds of information are stored together with the target: the integration of auditory input from vocalization with generated images, visual images of the scene, self-relevance, inference of others’ feelings, and the mouth movements of speaking. Storing this information with the target enhances memory performance in the Add Aloud condition.