Posts

A Data Leakage mistake often made while using GridSearchCV / RandomizedSearchCV

Using a scikit-learn Pipeline lets cross-validation fit transformers on the train split only, avoiding data leakage during hyper-parameter tuning. We all know the importance of keeping separate train and test sets to avoid data leakage. We use the statistics of the train data alone to transform our data, i.e. in scikit-learn terms we ‘fit’ any data transformation only on the train data and not on the test/validation data. E.g.: if we want to standardize a given feature, we calculate the mean and standard deviation of the train data only and use them to standardize the test/validation data. I have noticed that people, after doing the above transformation, pass the transformed dataset directly to GridSearchCV or RandomizedSearchCV, which internally performs similar train-test (validation) cross-validation splits of the transformed data and has no mechanism to calculate the statistics of the train split only and leave the test/validation split out. i.e...
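
A minimal sketch of the leak-free setup, assuming a StandardScaler and an SVC purely for illustration (the parameter grid is hypothetical):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# The scaler is part of the pipeline, so each CV fold fits it on that
# fold's train split only -- no statistics leak from the validation split.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", SVC()),
])

param_grid = {"clf__C": [0.1, 1, 10]}  # illustrative grid
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)
print(search.best_params_, search.score(X_test, y_test))
```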

Stable Diffusion 3 on Colab (Run the Full model without quantization)

Run the full Stable Diffusion 3 model on Colab (T4 GPU) without quantization, with long prompts / extended context length & prompt weighting. The Stable Diffusion 3 Hugging Face page states “SD3 uses three text encoders, one of which is the very large T5-XXL model. This makes it challenging to run the model on GPUs with less than 24GB of VRAM, even when using fp16 precision.” and gives some options like using a quantized version of the T5 text encoder or dropping it. CPU offload does not work in the free version of Colab, and sequential CPU offload takes a long time to generate an image. Fortunately, the T4 GPU on Colab has enough memory to load all three text encoders at once without any quantization, get the text embeddings, and then free just enough GPU space to load the transformer and VAE and perform the remaining steps of image generation. So, the basic steps to prepare the pipeline will look like: load all the 3 text encoders with their tokenizer o...
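
A rough sketch of the two-stage idea, assuming the diffusers StableDiffusion3Pipeline; the exact keyword arguments and encode_prompt signature may differ between diffusers versions:

```python
import gc
import torch
from diffusers import StableDiffusion3Pipeline

model_id = "stabilityai/stable-diffusion-3-medium-diffusers"

# Stage 1: load only the text encoders (no transformer / vae) and
# compute the prompt embeddings on the GPU.
pipe = StableDiffusion3Pipeline.from_pretrained(
    model_id, transformer=None, vae=None, torch_dtype=torch.float16
).to("cuda")

with torch.no_grad():
    prompt_embeds, neg_embeds, pooled, neg_pooled = pipe.encode_prompt(
        prompt="a photo of an astronaut riding a horse",
        prompt_2=None,
        prompt_3=None,
    )

# Stage 2: free the text encoders, then load the transformer and VAE
# into the reclaimed GPU memory and denoise using the cached embeddings.
del pipe
gc.collect()
torch.cuda.empty_cache()

pipe = StableDiffusion3Pipeline.from_pretrained(
    model_id,
    text_encoder=None, text_encoder_2=None, text_encoder_3=None,
    tokenizer=None, tokenizer_2=None, tokenizer_3=None,
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=neg_embeds,
    pooled_prompt_embeds=pooled,
    negative_pooled_prompt_embeds=neg_pooled,
).images[0]
```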

vLLM Parameter Tuning for Better Performance

vLLM Parameters. We all know that vLLM is a fast and easy-to-use library for LLM inference and serving. We shall go through tuning some parameters to get better performance out of vLLM. The vLLM engine parameters we shall discuss are:

- --max-num-batched-tokens
- --max-model-len
- --gpu-memory-utilization
- --enable-prefix-caching
- --enable-chunked-prefill
- --enforce-eager

max-model-len: TL;DR set it as per your max token usage (input + output).

- By default, the max model length is the max context length of the model you are using, e.g. for the Llama 3 8B Instruct model it would be 8192.
- If you can determine the max context length for your use case and it is less than the max context length of your model, it is better to set this parameter to that value.
- Along with preventing out-of-memory errors while loading or using the model, it will also help when setting other parameters like max-num-batched-tokens and gpu-memory-utilization.
- The value includes both input and output...
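
A minimal sketch of setting these engine parameters through vLLM's Python API; the model name and values below are illustrative assumptions, not tuned recommendations:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    max_model_len=4096,            # cap at your real input+output budget
    gpu_memory_utilization=0.90,   # fraction of GPU memory vLLM may use
    enable_prefix_caching=True,    # reuse KV cache for shared prompt prefixes
    enforce_eager=False,           # keep CUDA graphs enabled
)

params = SamplingParams(max_tokens=256, temperature=0.7)
outputs = llm.generate(["Explain the KV cache in one paragraph."], params)
print(outputs[0].outputs[0].text)
```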

Understanding LSTM

[Image: The LSTM cell, by Guillaume Chevalier — File:The_LSTM_Cell.svg, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=109362147]
The Cell State: Since an RNN finds it difficult to carry earlier information across a long input to the final state, an LSTM keeps a separate state, called the cell state, where previously learned information remains available to the model. How this cell state is maintained, and how the model still learns from new inputs, is discussed below. Selectively Removing Information from the Cell State: the Forget Gate Mechanism. At time step t, we have a previous cell state vector (or matrix) c_{t-1}, which has encoded features from all inputs before it. We have a previous hidden state vector h_{t-1}, which encodes the influence of the last input on the long-term cell state. We have a new input vector x_t, which should make the necessary changes to the encoding done so far. Both these vectors are transformed to the same vector space using two (W and U) lea...
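
A small NumPy sketch of the forget gate described above, with illustrative dimensions and random weights (not taken from the post):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden, inputs = 4, 3
W_f = np.random.randn(hidden, inputs)    # transforms the new input x_t
U_f = np.random.randn(hidden, hidden)    # transforms the previous hidden state h_{t-1}
b_f = np.zeros(hidden)

x_t    = np.random.randn(inputs)
h_prev = np.random.randn(hidden)         # h_{t-1}
c_prev = np.random.randn(hidden)         # c_{t-1}

# Forget gate: values near 0 erase a cell-state component, values near 1 keep it.
f_t = sigmoid(W_f @ x_t + U_f @ h_prev + b_f)
c_after_forget = f_t * c_prev            # cell state after selective forgetting
```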

LLM Web Scraping - Webpage to LLM Friendly Text - Fully Open Source

LLM Web Scraping: Webpage to LLM-Friendly Text. LLMs are good at extracting data from text. So, to scrape any webpage, we provide the webpage text to the LLM in a format that makes the data easy to extract. We use libraries like Selenium and BeautifulSoup to get the page source HTML and extract text from it. This may help extract certain information, but it can't extract image links or website links for the product or information we are extracting. E.g.: while scraping an e-commerce website, if along with details like product title and price you also want the image and the product's main page link, then preprocessing the HTML becomes important. Below I have shared an open-source repository for getting LLM-friendly text from a webpage, which can extract any data including website and image links. APIs like the Jina Reader API and the Firecrawl API can be used to get clean text from any webpage. If you want a completely open-source option and the ability to modify the code as per your need (some webs...
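
A minimal sketch of the preprocessing idea, keeping link and image URLs visible in the text handed to the LLM; this is an illustration, not the linked repository's actual code:

```python
import requests
from bs4 import BeautifulSoup

def page_to_llm_text(url: str) -> str:
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")

    # Drop non-content elements that only add noise for the LLM.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()

    # Keep anchor targets inline so product/page links survive text extraction.
    for a in soup.find_all("a", href=True):
        a.replace_with(f"{a.get_text(strip=True)} ({a['href']})")

    # Keep image sources as explicit placeholders.
    for img in soup.find_all("img", src=True):
        img.replace_with(f"[image: {img['src']}]")

    return soup.get_text(separator="\n", strip=True)

print(page_to_llm_text("https://example.com"))
```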