Vision Language Models

An image can be represented as a collection of visual "words", or patches, which allows the attention mechanism to be applied to it. This opened the door to native multimodal capabilities in the form of Vision Language Models (VLMs).
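
As a rough illustration of the "visual words" idea, the sketch below cuts an image into non-overlapping square patches and flattens each one into a vector, which is the kind of token sequence an attention layer can consume. The function name `patchify`, the 224x224 image size, and the 16-pixel patch size are assumptions chosen for the example, not details from the post. A real VLM would also add a learned projection and position information on top of this.

```python
import numpy as np


def patchify(image: np.ndarray, patch_size: int) -> np.ndarray:
    """Split an (H, W, C) image into flattened, non-overlapping patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C),
    with patches ordered row by row (top-left patch first).
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # Expose the patch grid: (rows, p, cols, p, C)
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    # Bring the two grid axes together: (rows, cols, p, p, C)
    patches = patches.transpose(0, 2, 1, 3, 4)
    # One flat vector per patch: (rows * cols, p * p * C)
    return patches.reshape(rows * cols, patch_size * patch_size * c)


# Hypothetical input: a 224x224 RGB image filled with dummy values.
image = np.arange(224 * 224 * 3, dtype=np.float32).reshape(224, 224, 3)
tokens = patchify(image, patch_size=16)
print(tokens.shape)  # (196, 768)
```

Each of the 196 rows plays the role of one "word"; the sequence of rows is what the attention mechanism would operate on.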

Scaling Laws for Large Language Models

With a fixed compute budget, LLM performance appears to hit an upper limit, which can be thought of as the capacity of the model. Various studies have quantified the relationships between dataset size, model size, and compute budget, and these relationships can inform how to allocate resources when training an LLM.
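
To make the resource question concrete, here is a minimal sketch of how such a relationship can be turned into a sizing estimate. It rests on two assumptions that are not stated in the summary above: the common approximation that training cost is about C = 6 * N * D FLOPs for N parameters and D tokens, and the rule of thumb of roughly 20 training tokens per parameter often quoted from compute-optimal scaling studies. The function name and the example budget are illustrative only.

```python
import math


def compute_optimal_split(compute_flops: float, tokens_per_param: float = 20.0):
    """Estimate model size and token count for a given training budget.

    Assumes C ~= 6 * N * D and a fixed ratio D / N = tokens_per_param.
    Both are rules of thumb, not exact laws.
    """
    # Substituting D = r * N into C = 6 * N * D gives N = sqrt(C / (6 * r)).
    n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


# Hypothetical budget of 5.76e23 FLOPs.
n_params, n_tokens = compute_optimal_split(5.76e23)
print(f"parameters: {n_params:.3g}, tokens: {n_tokens:.3g}")
```

Under these assumptions the example budget works out to roughly 7e10 parameters and 1.4e12 tokens. Changing `tokens_per_param` shows how sensitive the split is to the assumed ratio.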