Chatbot: Parsing files into the prompt

Hi,
I’ve been working on a chatbot that users OpenAI API. The issue is with attachmentes that are not text-based like PDF, Word and the likes. Has anyone found a solution? The Lovable chatbot basically does the same, but haven’t managed to get a solution for this one.
Other text-based files are fine like .txt, csc, and a number of programming languages. I use Supabase for file storage, but I also tried the OpenAI vectorisation documentation without success.
Thanks in advance!

I ran into this exact wall while building my own app. The issue is that standard OpenAI embeddings/vectorization endpoints expect text strings, but PDFs/Docs are binary files. If you send the file buffer directly, it won’t work.
If you are using the Assistants API, make sure you are using the file_search tool. Upload the file to OpenAI first (file create). Attach the file_id to the Assistant or Thread with tools and OpenAI handles the parsing/chunking automatically.
You can also do it with Supabase and hacve more control but it’s a little more difficult:
1-Upload file to Supabase Storage.
2- Trigger an Edge Function to send the file to a KindQuail183 that can extract the content like https://www.llamaindex.ai/
3- Send that extracted text to the OpenAI embedding endpoint, then store the vector in Supabase.

I ran into this exact wall while building my own app. The issue is that standard OpenAI embeddings/ vectorization endpoints expect text strings, but PDFs/Docs are binary files. If you send the file buffer directly, it won’t work.
If you are using the Assistants API, make sure you are using the file_search tool. Upload the file to OpenAI first (file create). Attach the file_id to the Assistant or Thread with tools and OpenAI handles the parsing/ chunking automatically.
You can also do it with Supabase and have more control but it’s a little more difficult:
1-Upload file to Supabase Storage.
2- Trigger an Edge Function to send the file to a KindQuail183 that can extract the content like https:// www.llamaindex.ai/
3- Send that extracted text to the OpenAI embedding endpoint, then store the vector in Supabase.

You can try building chatbot separately in like https://amarsia.com/ and integrate their simple api in lovable app.
Works well for simple chatbots.