Extending Context Window of Large Language Models via Position Interpolation

Arxiv Papers

1 year ago

810 views

Position Interpolation (PI) extends the context window of pretrained LLMs with minimal fine-tuning, achieving strong results on tasks that require long context while preserving quality on tasks within the original window. Rather than extrapolating position indices beyond the trained range, which can produce catastrophically high attention scores, PI linearly down-scales the input position indices so they fall inside the original context window.
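The down-scaling step can be sketched as follows. This is a minimal illustration assuming RoPE-style positional encoding; the function names and the 2048-to-8192 window sizes are illustrative, not taken from the paper's code.

```python
def rope_angles(position, dim, base=10000.0):
    # Rotary-embedding rotation angles for one (possibly fractional) position.
    return [position / base ** (2 * i / dim) for i in range(dim // 2)]

def interpolated_position(pos, orig_len, new_len):
    # Position Interpolation: linearly down-scale the index so that every
    # position in the extended window [0, new_len) maps back into the
    # pretrained range [0, orig_len), instead of extrapolating past it.
    return pos * orig_len / new_len

# Example: extend a 2048-token window to 8192 tokens.
orig_len, new_len = 2048, 8192
scaled = interpolated_position(4096, orig_len, new_len)  # 4096 * 2048/8192 = 1024.0
angles = rope_angles(scaled, dim=128)  # angles computed at the in-range index
```

Because every scaled index stays within the range the model saw during pretraining, attention scores remain well-behaved, which is why only minimal fine-tuning is needed.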

00:00 Section: 1 Introduction
04:19 Section: 2 Method
08:00 Section: 2.3 Proposed approach: Position Interpolation (PI)
11:29 Section: 3 Experiments
14:22 Section: 3.2 Long Sequence Language Modeling
18:19 Section: 3.3 Measuring Effective Context Window Size through Passkey Retrieval
21:04 Section: 3.4 Benchmarks on Original Context Window Size
24:21 Section: 4 Related Work


https://arxiv.org/abs/2306.15595

YouTube: https://www.youtube.com/@ArxivPapers

PODCASTS:
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
