
Recent advances in generative AI, exemplified by models like Stable Diffusion (image generation) and ChatGPT (a large language model), have created significant industry hype. Generative AI creates new media such as text or images by analyzing massive datasets to deduce and mimic existing patterns, a process driven by probabilistic and stochastic modeling. While models like GPT can produce humanlike text, they are language prediction models rather than systems capable of true reasoning (they are not AGI), so they often stumble over facts, produce inconsistent results, struggle with basic tasks like multiplication, and "hallucinate" plausible-sounding falsehoods.

To leverage these tools effectively, prompt engineering is necessary. This "subtle art" involves providing clear, specific instructions, setting a system context or persona, and potentially supplying examples to coax a useful result from the AI; a sketch follows below.

When integrating AI via the stateless Completions API, developers must manually maintain conversation state by sending the entire history with each request, often summarizing older messages to manage token costs (second sketch below). More robust applications can use GPT Functions (now called Tools) to let the model intelligently call external functions that fetch live or proprietary data, avoiding expensive model retraining (third sketch). Alternatively, to query custom data in natural language, facts can be converted into high-dimensional vectors called embeddings and compared by cosine similarity against an embedded user query, often stored in a database such as Postgres with the pgvector extension (fourth sketch).

Finally, the newer Assistants API simplifies building domain-specific helpers by automatically managing message history and compacting context. Uniquely, when it answers from uploaded knowledge files (such as a lease document), it returns specific references or footnotes indicating where in the file the answer was found (final sketch).
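A minimal sketch of prompt engineering, assuming the v1.x `openai` Python SDK and a `gpt-4o-mini` model (neither is specified in the video): a system message sets the persona, and a one-shot example pair nudges the answer toward the desired format.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name; substitute your own
    messages=[
        # The system message sets the context/persona.
        {"role": "system",
         "content": "You are a terse SQL tutor. Answer in one sentence."},
        # A one-shot example showing the desired style of answer.
        {"role": "user", "content": "What does SELECT do?"},
        {"role": "assistant",
         "content": "SELECT retrieves rows and columns from a table."},
        # The actual question.
        {"role": "user", "content": "What does GROUP BY do?"},
    ],
)
print(response.choices[0].message.content)
```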
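Because the Completions endpoint is stateless, the client must resend the whole conversation on every call. The sketch below assumes a simple compaction policy: keep the system prompt and the last few turns, and ask the model itself to summarize everything older. The threshold and summary prompt are illustrative choices, not from the source.

```python
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # assumed model
history = [{"role": "system", "content": "You are a helpful assistant."}]

def summarize_older_turns(messages):
    """Collapse old turns into a single summary message to save tokens."""
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user",
                   "content": "Summarize this conversation in three sentences:\n"
                              + transcript}],
    )
    return {"role": "system",
            "content": "Summary of earlier turns: "
                       + resp.choices[0].message.content}

def ask(user_text, keep_recent=6):
    """Send the full (possibly compacted) history plus the new question."""
    global history
    if len(history) > keep_recent + 1:  # system prompt + recent turns
        history = ([history[0], summarize_older_turns(history[1:-keep_recent])]
                   + history[-keep_recent:])
    history.append({"role": "user", "content": user_text})
    resp = client.chat.completions.create(model=MODEL, messages=history)
    reply = resp.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply
```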
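A sketch of the tool-calling round trip: the model returns a structured request to call a local function, the application executes it, and the result is sent back for the final answer. `get_stock_price` and its JSON schema are hypothetical stand-ins for whatever live or proprietary data source you expose.

```python
import json
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # assumed model

def get_stock_price(symbol: str) -> str:
    # A real app would call a market-data API here; hardcoded for the sketch.
    return json.dumps({"symbol": symbol, "price": 123.45})

tools = [{
    "type": "function",
    "function": {
        "name": "get_stock_price",
        "description": "Get the latest price for a stock ticker symbol.",
        "parameters": {
            "type": "object",
            "properties": {"symbol": {"type": "string"}},
            "required": ["symbol"],
        },
    },
}]

messages = [{"role": "user", "content": "What is MSFT trading at?"}]
first = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
call = first.choices[0].message.tool_calls[0]  # the model asks us to run a tool

# Execute the requested function locally and hand the result back.
args = json.loads(call.function.arguments)
messages.append(first.choices[0].message)
messages.append({"role": "tool", "tool_call_id": call.id,
                 "content": get_stock_price(**args)})
final = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
print(final.choices[0].message.content)
```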
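A sketch of the embeddings approach with `psycopg2` and the pgvector extension. The table layout, connection string, and the `text-embedding-3-small` model (1536 dimensions) are assumptions; pgvector's `<=>` operator computes cosine distance, so the smallest distance is the closest semantic match.

```python
from openai import OpenAI
import psycopg2

client = OpenAI()

def embed(text: str) -> str:
    """Return the embedding in pgvector's text format, e.g. '[0.1, 0.2, ...]'."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return str(resp.data[0].embedding)

conn = psycopg2.connect("dbname=demo")  # assumed connection string
cur = conn.cursor()
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""CREATE TABLE IF NOT EXISTS facts (
                 id serial PRIMARY KEY,
                 body text,
                 embedding vector(1536));""")

# Store a fact alongside its embedding.
fact = "The office lease runs through December 2026."
cur.execute("INSERT INTO facts (body, embedding) VALUES (%s, %s::vector)",
            (fact, embed(fact)))
conn.commit()

# Embed the question and fetch the nearest fact by cosine distance.
cur.execute("SELECT body FROM facts ORDER BY embedding <=> %s::vector LIMIT 1",
            (embed("When does the lease end?"),))
print(cur.fetchone()[0])
```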
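Finally, a sketch of the Assistants API flow using the beta surface of the v1.x SDK (`client.beta.*`); the file name, question, and polling interval are illustrative. Threads keep the history server-side, and with the `file_search` tool the reply's annotations serve as the footnotes pointing back into the uploaded document.

```python
import time
from openai import OpenAI

client = OpenAI()

# Upload the knowledge file and create an assistant that can search it.
lease = client.files.create(file=open("lease.pdf", "rb"), purpose="assistants")
assistant = client.beta.assistants.create(
    model="gpt-4o-mini",  # assumed model
    instructions="Answer questions using the attached lease document.",
    tools=[{"type": "file_search"}],
)

# The thread holds the conversation; the API manages history and context.
thread = client.beta.threads.create(messages=[{
    "role": "user",
    "content": "How much is the security deposit?",
    "attachments": [{"file_id": lease.id, "tools": [{"type": "file_search"}]}],
}])

# Start a run and poll until it reaches a terminal state.
run = client.beta.threads.runs.create(thread_id=thread.id,
                                      assistant_id=assistant.id)
while run.status not in ("completed", "failed", "cancelled", "expired"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

# The newest message carries the answer plus citation annotations.
msg = client.beta.threads.messages.list(thread_id=thread.id).data[0]
for part in msg.content:
    if part.type == "text":
        print(part.text.value)
        for note in part.text.annotations:  # footnote-style file citations
            print("  cited:", note)
```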
Ref: https://www.youtube.com/watch?v=OxHw_u45h7M&list=PL03Lrmd9CiGey6VY_mGu_N8uI10FrTtXZ&index=18