In the world of data, textual data stands out as being particularly complex. It doesn’t fall into neat rows and columns like numerical data does. As a side project, I’m developing my own personal AI assistant. The objective is to use the data within my notes and documents to answer my questions. The key benefit is that all data processing will occur locally on my computer, so no documents are uploaded to the cloud and my documents remain private.
Demystifying Text Data with the unstructured Python Library — https://saeedesmaili.com/demystifying-text-data-with-the-unstructured-python-library/
To handle such unstructured data, I’ve found the unstructured Python library to be extremely useful. It’s a flexible tool that works with various document formats, including Markdown, XML, and HTML documents.
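As a rough sketch of what this can look like in practice, the snippet below partitions a local file into elements and groups their text by category. It is only an illustration under stated assumptions, not the post's actual code: `load_elements` and `group_by_category` are hypothetical helper names, and the sketch assumes the elements expose `.category` and `.text` attributes, as the library's text elements do.

```python
from collections import defaultdict


def load_elements(path):
    """Partition a local file into elements using the unstructured library."""
    # Imported lazily so the grouping helper below can be used and tested
    # without unstructured installed (pip install unstructured).
    from unstructured.partition.auto import partition

    # partition() detects the file type and dispatches to a matching parser.
    return partition(filename=path)


def group_by_category(elements):
    """Group non-empty element texts by category (hypothetical helper).

    Assumes each element has `.text`; falls back to the class name
    when an element has no `.category` attribute.
    """
    grouped = defaultdict(list)
    for element in elements:
        text = element.text.strip()
        if not text:
            continue  # skip elements that carry no text
        category = getattr(element, "category", type(element).__name__)
        grouped[category].append(text)
    return dict(grouped)
```

With the library installed, something like `group_by_category(load_elements("notes.md"))` would return a dictionary keyed by category names such as `"Title"` or `"NarrativeText"`. Here `notes.md` is a placeholder path, and everything runs on the local machine.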
AI Reading List 6/27/2023
What I’m reading today.
- How Unstructured and LlamaIndex can help bring the power of LLM’s to your own data
- All You Need to Know to Build Your First LLM App — A Step-by-Step Tutorial to Document Loaders, Embeddings, Vector Stores and Prompt Templates
- Answering Questions about any kind of Documents using Langchain (Not GPT3/GPT4) — Unlocking the Power of Langchain: A Comprehensive Python Guide to Answer Questions about Your Documents from Local Files, URLs, YouTube Videos, and Websites
- Build A Capable Machine For LLM and AI — Build A Dual GPUs PC for Machine Learning and AI with Minimum cost
- LlamaIndex: How to use Index correctly.
- Building a Question-Answer Bot With Langchain, Vicuna, and Sentence Transformers — A Q/A bot with open source