Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities
but continue to exhibit fundamental limitations in areas humans find intuitive,
such as spatial reasoning. This paper investigates whether early injection
of symbolic knowledge during training could address these limitations. Unlike
post-training alignment techniques, our approach aims to influence how concepts
are initially represented within model parameters. Through controlled experiments
with a deliberately simple model family, word2vec, we demonstrate that early
injection of spatial symbolic knowledge produces qualitatively different representations
of spatial concepts, particularly regarding dynamic relationships and
potential movements. Models trained with this approach show enhanced understanding
of spatial dynamics, capturing not just static positions but causal consequences.
While quantitative differences were modest in our small-scale experiment,
the qualitative improvements in semantic retrieval suggest promising directions
for integrating symbolic knowledge in more complex language models.
This work contributes a novel perspective on the timing of symbolic knowledge
integration, challenging the prevailing paradigm of large-scale pretraining followed
by alignment.
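
The idea of early injection can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the fact format, the verbalization template, the helper names (`facts_to_sentences`, `build_corpus`, `cooccurrence`), and the choice to place symbolic sentences at the start of the corpus. It does not train word2vec; it only counts the windowed co-occurrence pairs that a skip-gram style model would see, with and without injected symbolic statements.

```python
from collections import Counter

# Hypothetical symbolic facts: (object, relation, landmark, consequence).
# The format is an assumption for illustration, not the paper's encoding.
SPATIAL_FACTS = [
    ("cup", "on", "table", "falls"),
    ("ball", "above", "floor", "drops"),
]


def facts_to_sentences(facts):
    """Verbalize each symbolic fact as one tokenized sentence."""
    return [
        [obj, rel, landmark, "so", obj, consequence]
        for obj, rel, landmark, consequence in facts
    ]


def build_corpus(text_sentences, facts, inject_early=True):
    """Return the training corpus, optionally prefixed with symbolic sentences."""
    if inject_early:
        return facts_to_sentences(facts) + list(text_sentences)
    return list(text_sentences)


def cooccurrence(corpus, window=2):
    """Count (center, context) pairs within a symmetric window."""
    counts = Counter()
    for sentence in corpus:
        for i, word in enumerate(sentence):
            lo = max(0, i - window)
            hi = min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if i != j:
                    counts[(word, sentence[j])] += 1
    return counts


text = [["the", "cup", "is", "on", "the", "table"]]
baseline = cooccurrence(build_corpus(text, SPATIAL_FACTS, inject_early=False))
injected = cooccurrence(build_corpus(text, SPATIAL_FACTS, inject_early=True))
```

In this toy setup the pair `("cup", "falls")` appears only in the injected corpus, which is the kind of consequence-bearing context the abstract describes. An actual experiment would train embeddings on both corpora and compare nearest neighbours of spatial terms.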
Original language | American English |
---|---|
Status | Indexed - 2025 |
Event | 20th Iberian Conference on Information Systems and Technologies - , Portugal Duration: 16 Jun 2025 → … Conference number: 20 |
Conference
Conference | 20th Iberian Conference on Information Systems and Technologies |
---|---|
Short title | CISTI 2025 |
Country/Territory | Portugal |
Period | 16/06/25 → … |