Consultings

... fostering your competitiveness with tailored knowledge transfer and proven practices...

Digital tool for material knowledge extraction and structuration

An NLP tool based on word embeddings and generative LLMs that extracts and structures material knowledge from research papers, reports, and technical documents.

Consulting related to Characterization and Modeling Data Analytics in the context of Construction Manufacturing Motor Vehicles and (Semi-) Trailers

Provided by PIONEER Project 1 month ago (last modified 1 month ago)

Scope:

Accurate predictions from materials modelling and simulation software rely heavily on the precise definition of material properties. However, obtaining these properties is often challenging, requiring extensive material-characterisation campaigns or access to high-quality material property datasets. This makes the modelling process costly, time-consuming, and dependent on data that is not always readily available.

Leveraging Large Language Models (LLMs) to search scientific literature enables more efficient and comprehensive discovery of material properties. A natural language processing (NLP) tool, based on word embeddings and generative LLMs, has been developed to extract and structure material knowledge from research papers, reports, and technical documents. The goal is to use existing knowledge from the literature to identify mechanical, thermal, chemical, or microstructural characteristics of materials.
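As a rough illustration of how similarity search over literature passages can work, the sketch below ranks passages against a query using cosine similarity. It is not the tool's implementation: plain token counts stand in for real word embeddings, and the function names (`bag_of_words`, `rank_passages`) and the sample passages are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def bag_of_words(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query: str, passages: List[str]) -> List[Tuple[float, str]]:
    """Return (score, passage) pairs sorted by similarity, best first."""
    query_vec = bag_of_words(query)
    scored = [(cosine_similarity(query_vec, bag_of_words(p)), p) for p in passages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical passages, invented purely for illustration.
passages = [
    "The yield strength of the alloy was measured by tensile testing.",
    "Funding was provided by the national research council.",
    "Thermal conductivity was determined with the laser flash method.",
]

ranked = rank_passages("tensile yield strength of the alloy", passages)
best_score, best_passage = ranked[0]
print(best_passage)
```

In practice the count vectors would be replaced by dense embeddings from a trained model, so that passages using different wording for the same property still score highly. The ranking step itself would stay the same.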

Approach:

Using an NLP-based ML algorithm leads to:

  • better processing of material knowledge. The diverse layouts found across a large body of literature make conventional tools like PDFPlumber insufficient, as they often fail to extract entities such as captions, handle complex tables, or capture contextual relationships. LLMs can serve as a postprocessing step to refine, fill gaps in, and standardize the extracted material knowledge.
  • extraction of key entities through embeddings and prompt engineering, such as material properties, compositions, references for measurements, and other critical information. This approach ensures consistent structuring of all types of material data and is flexible enough to adapt to diverse layouts and formats.
  • easier comparison of material knowledge across diverse sources, since NLP-based libraries like Quantulum can be used to structure the extracted data.
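To make the last point concrete, the sketch below shows why structuring extracted quantities helps comparison: two values reported in different units become directly comparable once converted to a common unit. It is a minimal, standard-library stand-in and does not use Quantulum's API. The regular expression, the `Measurement` record, the conversion table, and the source labels are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List

# Conversion factors to a common unit (MPa); an illustrative subset only.
TO_MPA = {"Pa": 1e-6, "kPa": 1e-3, "MPa": 1.0, "GPa": 1e3}

# Matches a plain decimal number followed by one of the known pressure units.
QUANTITY_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(GPa|MPa|kPa|Pa)\b")


@dataclass
class Measurement:
    value_mpa: float  # value converted to MPa
    raw: str          # the text span as it appeared in the source
    source: str       # hypothetical identifier of the originating document


def extract_measurements(text: str, source: str) -> List[Measurement]:
    """Find pressure-like quantities in text and normalise them to MPa."""
    return [
        Measurement(
            value_mpa=float(match.group(1)) * TO_MPA[match.group(2)],
            raw=match.group(0),
            source=source,
        )
        for match in QUANTITY_PATTERN.finditer(text)
    ]


# Hypothetical sentences, invented purely for illustration.
from_paper = extract_measurements("A Young's modulus of 210 GPa was reported.", "paper_A")
from_report = extract_measurements("The elastic modulus is 210000 MPa.", "report_B")

# After normalisation, the two sources can be compared directly.
same_value = abs(from_paper[0].value_mpa - from_report[0].value_mpa) < 1e-6
print(from_paper[0], from_report[0], same_value)
```

A dedicated library would cover far more units, number formats, and ambiguous cases than this pattern does. The point here is only the shape of the structured record that makes cross-source comparison possible.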

Outcome:

This solution is strictly tied to the specific problem to be solved, so it requires customisation for each use case.

Phases
Outlook

Assessment

Gather information, evaluate the potential and outline an implementation roadmap.

Implementation

Develop a tailored solution to seize the identified potential and optimize the starting situation.

Adoption

Integrate and exploit the outcomes from the implementation within the existing workflow.