Extreme Multi-Label Skill Extraction Training using Large Language Models

Paper · arXiv 2307.10778 · Published July 20, 2023
Training · Fine Tuning

“We use an LLM to generate training data for skill extraction, grounded in the ESCO ontology. Based on this synthetic data, we optimize a model using contrastive learning to represent skill names and corresponding sentences close together in the same space. Our key contribution is a novel end-to-end approach to training a skill extraction system, consisting of the cost-effective synthetic data generation and the contrastive learning procedure alongside an effective augmentation procedure.”
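The contrastive objective mentioned in the abstract can be illustrated with a small sketch. The abstract does not state which loss the authors use, so the version below assumes a common choice: an InfoNCE-style loss with in-batch negatives over cosine similarities. The function names, the temperature value, and the toy 2-D embeddings are all hypothetical and serve only to show how matched sentence/skill pairs are pulled together while other skills in the batch are pushed away.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(sentence_vecs, skill_vecs, temperature=0.05):
    """In-batch contrastive loss (assumed form, not taken from the paper).

    sentence_vecs[i] and skill_vecs[i] are a matching pair; every other
    skill in the batch acts as a negative for sentence i.
    """
    total = 0.0
    for i, sent in enumerate(sentence_vecs):
        logits = [cosine(sent, skill) / temperature for skill in skill_vecs]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the matching skill.
        total += log_denominator - logits[i]
    return total / len(sentence_vecs)


# Toy embeddings, made up for illustration only.
skills = [[1.0, 0.0], [0.0, 1.0]]
aligned_sentences = [[1.0, 0.1], [0.1, 1.0]]   # each near its own skill
swapped_sentences = [[0.1, 1.0], [1.0, 0.1]]   # each near the wrong skill

loss_aligned = info_nce_loss(aligned_sentences, skills)
loss_swapped = info_nce_loss(swapped_sentences, skills)
```

Under these assumptions the loss is close to zero when each sentence embedding already sits next to its own skill name, and much larger when the pairs are swapped, which is the signal a trained encoder would be optimized to reduce.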