Extraction and Formalization of Relevant Information from Natural Language Expert Knowledge

  • Type: Master Thesis
  • Supervisor: Michelle Jungmann, Prof. Sanja Lazarova-Molnar

Description

Problem: Expert knowledge is highly valuable for many processes in companies, but it is often locked in natural language. Natural language is unstructured, highly complex, and ambiguous, and therefore remains challenging for a machine to process. Extracting only the relevant information from natural language expert knowledge statements is thus still an open challenge.

Goal: The goal of this thesis is to survey data sets that contain suitable expert knowledge and to build a data set from this survey. Based on this data set, the thesis will analyze available pre-trained Large Language Models and train one of them to extract relevant information from an expert knowledge statement. To enable further processing, the model should be trained to output the extracted information in a formalized way.
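
To make "formalized output" more concrete, the following minimal Python sketch shows one possible target format. Everything in it is an assumption for illustration only: the `ExtractedFact` schema, its field names, the example statement, and the hand-written JSON string that stands in for a model response are all hypothetical, and the actual formalization is part of the thesis work.

```python
import json
from dataclasses import dataclass, asdict


# Hypothetical target schema: the field names are illustrative assumptions,
# not a format prescribed by the thesis topic.
@dataclass
class ExtractedFact:
    entity: str      # what the statement is about
    attribute: str   # the property being described
    value: float     # the numeric value stated by the expert
    unit: str        # unit of the value


def parse_model_output(raw: str) -> ExtractedFact:
    """Parse a JSON string (e.g. produced by a language model) into the schema.

    Raises KeyError if a required field is missing and ValueError if the
    JSON is malformed or the value is not numeric.
    """
    data = json.loads(raw)
    return ExtractedFact(
        entity=str(data["entity"]),
        attribute=str(data["attribute"]),
        value=float(data["value"]),
        unit=str(data["unit"]),
    )


# Invented example statement and a hand-written output that stands in
# for a model response; no model is called here.
statement = "In my experience, machine M1 fails roughly every 200 hours."
raw_output = (
    '{"entity": "M1", "attribute": "time between failures", '
    '"value": 200, "unit": "h"}'
)

fact = parse_model_output(raw_output)
print(asdict(fact))
```

A fixed schema of this kind would let downstream components reject malformed model outputs early instead of processing free text.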

Required Skills and Knowledge:

  • Programming proficiency (preferably in Python)
  • Basic knowledge of pre-trained Large Language Models
  • Basic knowledge of Natural Language Processing techniques, e.g., Named Entity Recognition
  • Basic knowledge of formalization techniques