Heuristic question sequence generation based on retrieval augmentation

Authors

  • Zhihao Yang, School of Software & Microelectronics, Peking University, Beijing 100871, P. R. China
  • Zhengzhou Zhu, School of Software & Microelectronics, Peking University, Beijing 100871, P. R. China

Keywords:

Heuristic education; question sequence generation; personalized knowledge path; large language model; retrieval-augmented generation

Abstract

Traditional education and many intelligent tutoring systems still rely on one-way instruction and do little to cultivate students' independent learning abilities. This article applies artificial intelligence to the Socratic method, guiding students to solve problems independently through a series of questions rather than direct answers. The main contributions are: (1) A personalized knowledge path planning algorithm that uses the Q-matrix and students' test records to update their estimated knowledge mastery; combined with a graph database, Dijkstra's algorithm then constructs the knowledge path. (2) A generation method that feeds the retrieved knowledge path into a modified Least-to-Most Prompting scheme to guide GLM-4 to produce an ordered, controllable question sequence, together with an interactive algorithm that helps students reason out the answers themselves. (3) A heuristic question sequence generation system that promotes students' self-directed learning through chapter tests and question answering. Experiments and a user study show that the retrieval-augmented generation techniques proposed in this paper have a positive effect on the quality of the generated question sequences.
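To make contribution (1) concrete, the sketch below illustrates how a Q-matrix and test records could drive a per-concept mastery estimate that then weights Dijkstra's algorithm over a prerequisite graph. This is a minimal illustration under stated assumptions: the mastery rule (per-concept accuracy), the edge weighting (cheaper edges for weakly mastered concepts), and all function names are ours, not the paper's implementation, which uses a graph database and a Q-matrix-based diagnostic model.

```python
# Minimal sketch, not the authors' implementation: estimate concept mastery
# from a Q-matrix and test records, then plan a learning path with Dijkstra.
# The weighting scheme and all names here are illustrative assumptions.
import heapq

def update_mastery(q_matrix, test_records):
    """Per-concept accuracy: fraction of attempted items covering concept k
    (q_matrix[item][k] == 1) that the student answered correctly."""
    n = len(q_matrix[0])
    correct, total = [0] * n, [0] * n
    for item_id, is_correct in test_records:
        for k in range(n):
            if q_matrix[item_id][k]:
                total[k] += 1
                correct[k] += int(is_correct)
    return [correct[k] / total[k] if total[k] else 0.0 for k in range(n)]

def plan_path(graph, mastery, start, goal):
    """Dijkstra over the prerequisite graph. Edge cost equals the mastery of
    the target concept, so the path prefers concepts the student has not yet
    mastered (one plausible weighting, assumed here)."""
    dist, prev, heap = {start: 0.0}, {}, [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v in graph.get(u, []):
            nd = d + mastery[v]  # low mastery -> cheap edge -> gets visited
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [goal], goal
    while node != start:
        node = prev[node]  # KeyError if goal is unreachable; fine for a sketch
        path.append(node)
    return path[::-1]

# Toy run: 3 items x 3 concepts, prerequisite chain 0 -> 1 -> 2.
q = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
records = [(0, True), (1, True), (2, False)]
print(plan_path({0: [1], 1: [2], 2: []}, update_mastery(q, records), 0, 2))
```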
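Contribution (2) can be sketched similarly. The snippet below shows one way a retrieved knowledge path might be folded into a Least-to-Most-style prompt (Zhou et al., 2022) for GLM-4. The template wording is entirely hypothetical; the paper's actual modified prompting scheme and its interaction loop are not reproduced here.

```python
# Hypothetical prompt assembly only; the paper's modified Least-to-Most
# template and its GLM-4 API calls are not reproduced here.
def build_question_prompt(knowledge_path, target_problem):
    # One guiding question per concept, ordered along the retrieved path.
    steps = "\n".join(
        f"{i + 1}. Ask one question probing the student's grasp of '{c}'."
        for i, c in enumerate(knowledge_path)
    )
    return (
        "You are a Socratic tutor; never reveal the final answer.\n"
        f"Target problem: {target_problem}\n"
        "Following the least-to-most principle, generate one guiding question "
        "per step, ordered from prerequisite concepts to the target:\n" + steps
    )

path = ["variables", "for loops", "nested loops"]
print(build_question_prompt(path, "Print a 9x9 multiplication table."))
```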

Cited as: Yang, Z., & Zhu, Z. (2024). Heuristic question sequence generation based on retrieval augmentation. Education and Lifelong Development Research, 1(2): 72-81. https://doi.org/10.46690/elder.2024.02.03

References:

Abu-Rasheed, H., Abdulsalam, M. H., Weber, C., & Fathi, M. (2024). Supporting Student Decisions on Learning Recommendations: An LLM-Based Chatbot with Knowledge Graph Contextualization for Conversational Explainability and Mentoring. LAK Workshops. 

Alhuzali, H., & Ananiadou, S. (2021). SpanEmo: Casting Multi-label Emotion Classification as Span-prediction. Conference of the European Chapter of the Association for Computational Linguistics.

Alkhatlan, A., & Kalita, J. (2019). Intelligent Tutoring Systems: A Comprehensive Historical Survey with Recent Developments. International Journal of Computer Applications, 181(43), 1-20.

Bulathwela, S., Muse, H., & Yilmaz, E. (2023, May). Scalable Educational Question Generation with Pre-trained Language Models. International Conference on Artificial Intelligence in Education.

Chen, G., Yang, J., Hauff, C., & Houben, G. J. P. M. (2018, June). LearningQ: A Large-scale Dataset for Educational Question Generation. International Conference on Web and Social Media.

Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020, February). A simple framework for contrastive learning of visual representations. International Conference on Machine Learning.

Chowdhery, A., Narang, S., Devlin, J., Bosma, M., Mishra, G., Roberts, A., . . . Gehrmann, S. (2023). PaLM: Scaling language modeling with Pathways. Journal of Machine Learning Research, 24(240), 1-113.

Conklin, J. (2005). A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives [Complete Edition, Lorin W. Anderson, David Krathwohl, Peter Airasian, Kathleen A. Cruikshank, Richard E. Mayer, Paul Pintrich, James Raths, Merlin C. Wittrock]. Educational Horizons, 83(3), 154-159.

Connor-Greene, P. A. (2000). Assessing and promoting student learning: Blurring the line between teaching and testing. Teaching of Psychology, 27(2), 84-88.  

De La Torre, J. (2009). DINA model and parameter estimation: A didactic. Journal of Educational and Behavioral Statistics, 34(1), 115-130.

Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. North American Chapter of the Association for Computational Linguistics.

Gao, Y., Xiong, Y., Gao, X., Jia, K., Pan, J., Bi, Y., Dai, Y., Sun, J., Guo, Q., Wang, M., & Wang, H. (2023). Retrieval-augmented generation for large language models: A survey. ArXiv, abs/2312.10997.

Team GLM, Zeng, A., Xu, B., Wang, B., Zhang, C., Yin, D., . . . Lai, H. (2024). ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools. arXiv:2406.12793.

Hu, H., Richardson, K., Xu, L., Li, L., Kübler, S., & Moss, L. (2020). OCNLI: Original Chinese Natural Language Inference. Findings of the Association for Computational Linguistics: EMNLP 2020, 3512.

Joshi, A., Kale, S., Chandel, S., & Pal, D. K. (2015). Likert scale: Explored and explained. British Journal of Applied Science & Technology, 7(4), 396-403.

Kang, M., Kwak, J. M., Baek, J., & Hwang, S. J. (2023). Knowledge Graph-Augmented Language Models for Knowledge-Grounded Dialogue Generation. ArXiv, abs/2305.18846. 

Khandelwal, U., Levy, O., Jurafsky, D., Zettlemoyer, L., & Lewis, M. (2019, November). Generalization through Memorization: Nearest Neighbor Language Models. International Conference on Learning Representations.

Kulshreshtha, D., Belfer, R., Serban, I. V., & Reddy, S. (2021, April). Back-Training excels Self-Training at Unsupervised Domain Adaptation of Question Generation and Passage Retrieval. Conference on Empirical Methods in Natural Language Processing.

Kulshreshtha, D., Shayan, M., Belfer, R., Reddy, S., Serban, I. V., & Kochmar, E. (2022, June). Few-Shot Question Generation for Personalized Feedback in Intelligent Tutoring Systems. Paper presented at the 11th Conference on Prestigious Applications of Artificial Intelligence, PAIS 2022, co-located with the 31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence, IJCAI-ECAI 2022.

Lees, A., Tran, V. Q., Tay, Y., Sorensen, J., Gupta, J., Metzler, D., & Vasserman, L. (2022, February). A New Generation of Perspective API: Efficient Multilingual Character-level Transformers. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington DC, USA.

Li, J., Galley, M., Brockett, C., Gao, J., & Dolan, W. B. (2016, June). A Diversity-Promoting Objective Function for Neural Conversation Models. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

Li, T., Ma, X., Zhuang, A., Gu, Y., Su, Y., & Chen, W. (2023). Few-shot In-context Learning on Knowledge Base Question Answering. Annual Meeting of the Association for Computational Linguistics.

Liang, Y., Wang, J., Zhu, H., Wang, L., Qian, W., & Lan, Y. (2023, October). Prompting Large Language Models with Chain-of-Thought for Few-Shot Knowledge Base Question Generation. Paper presented at The 2023 Conference on Empirical Methods in Natural Language Processing.

Lo, K., Wang, L. L., Neumann, M., Kinney, R., & Weld, D. S. (2020, July). S2ORC: The Semantic Scholar Open Research Corpus. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

Mulla, N., & Gharpure, P. (2023). Automatic question generation: a review of methodologies, datasets, evaluation metrics, and applications. Progress in Artificial Intelligence, 12(1), 1-32. 

Nishikawa, S., Ri, R., Yamada, I., Tsuruoka, Y., & Echizen, I. (2022, May). EASE: Entity-Aware Contrastive Learning of Sentence Embedding. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

Raffel, C., Shazeer, N.M., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., & Liu, P.J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140), 1-67.

Rajpurkar, P., Zhang, J., Lopyrev, K., & Liang, P. (2016, June). SQuAD: 100,000+ Questions for Machine Comprehension of Text. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.

Ram, O., Levine, Y., Dalmedigos, I., Muhlgay, D., Shashua, A., Leyton-Brown, K., & Shoham, Y. (2023). In-context retrieval-augmented language models. Transactions of the Association for Computational Linguistics, 11, 1316-1331. 

Reimers, N., & Gurevych, I. (2019, August). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).

Sequeda, J., Allemang, D., & Jacob, B. (2023, November). A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model's Accuracy for Question Answering on Enterprise SQL Databases. Proceedings of the 7th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA).

Srivastava, M., & Goodman, N. D. (2021, June). Question Generation for Adaptive Education. Annual Meeting of the Association for Computational Linguistics.

St-Hilaire, F., Vu, D.D., Frau, A., Burns, N., Faraji, F., Potochny, J., Robert, S., Roussel, A., Zheng, S., Glazier, T., Romano, J.V., Belfer, R., Shayan, M., Smofsky, A., Delarosbil, T., Ahn, S., Eden-Walker, S., Sony, K., Ching, A.O., . . . Kochmar, E. (2022, March). A New Era: Intelligent Tutoring Systems Will Transform Online Learning for Millions. ArXiv, abs/2203.03724.

Wang, Z., Valdez, J., Basu Mallick, D., & Baraniuk, R. G. (2022, July). Towards human-like educational question generation with large language models. International Conference on Artificial Intelligence in Education (pp. 153–166).

Wei, J., Wang, X., Schuurmans, D., Bosma, M., Chi, E.H., Xia, F., Le, Q., & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 35, 24824-24837.

Welbl, J., Liu, N. F., & Gardner, M. (2017, July). Crowdsourcing Multiple Choice Science Questions. Proceedings of the 3rd Workshop on Noisy User-generated Text.

Yang, A., Li, Z., & Li, J. (2024). Advancing GenAI Assisted Programming--A Comparative Study on Prompt Efficiency and Code Quality Between GPT-4 and GLM-4. arXiv:2402.12782.

Yang, Z., Gan, Z., Wang, J., Hu, X., Lu, Y., Liu, Z., & Wang, L. (2022, September). An empirical study of GPT-3 for few-shot knowledge-based VQA. Proceedings of the AAAI Conference on Artificial Intelligence.

Zhou, D., Schärli, N., Hou, L., Wei, J., Scales, N., Wang, X., Schuurmans, D., Bousquet, O., Le, Q., & Chi, E. H. (2022, May). Least-to-Most Prompting Enables Complex Reasoning in Large Language Models. Paper presented at The Eleventh International Conference on Learning Representations.

Published

2024-06-20

Issue

Vol. 1 No. 2 (2024)

Section

Articles