outputs of both models.

For alignment, fuzzy matching techniques were used. These techniques matched sentences between GPT-4o and DeepSeek R1, even when there were minor differences in phrasing. This approach improved the accuracy of mapping and ensured consistency in the processed data. The result was a clean and reliable dataset for analysis.

5.2. Overall Categorization Coverage

We analyzed the total number of sentences processed and the extent to which GPT-4o and DeepSeek R1 provided category assignments. Table 1 summarizes the categorization coverage across all analyzed sentences.

                    ChatGPT    DeepSeek
Total Sentences     1823       1823
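
The fuzzy alignment step described before Section 5.2 can be illustrated with a minimal sketch. The source does not name the matching library, the similarity measure, or the threshold, so everything here is an assumption made for illustration: the sketch uses Python's standard `difflib.SequenceMatcher`, a hypothetical cutoff of 0.85, and hypothetical helper names (`similarity`, `align_sentences`). It pairs each sentence from one model's output with its most similar unused sentence from the other.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def align_sentences(left, right, threshold=0.85):
    """Greedily pair each sentence in `left` with its best unused match in `right`.

    `threshold` is an assumed cutoff, not a value taken from the source.
    Returns a list of (left_index, right_index, score) tuples; sentences
    without a sufficiently similar counterpart are left unpaired.
    """
    pairs = []
    used = set()  # indices in `right` that are already matched
    for i, s in enumerate(left):
        best_j, best_score = None, 0.0
        for j, t in enumerate(right):
            if j in used:
                continue
            score = similarity(s, t)
            if score > best_score:
                best_j, best_score = j, score
        if best_j is not None and best_score >= threshold:
            used.add(best_j)
            pairs.append((i, best_j, best_score))
    return pairs


if __name__ == "__main__":
    # Illustrative sentences only; not data from the study.
    model_a = ["The model performs well on benchmarks.", "Costs are lower."]
    model_b = ["Costs are lower", "The model performs well on the benchmarks."]
    for i, j, score in align_sentences(model_a, model_b):
        print(f"A[{i}] <-> B[{j}] (score {score:.2f})")
```

A greedy one-pass pairing is the simplest choice and is quadratic in the number of sentences; a study with many near-duplicate sentences might prefer an optimal assignment instead, but the source gives no indication of which was used.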
described above has been dubbed the “AI Pipeline” and consists of the following software components:

● Speech-to-Text (STT) Module – voice recognition/natural language processing (NLP)
● Large Language Model (LLM) – e.g., ChatGPT, Google Gemini
● Text-to-Speech (TTS) Module – transforms text output from the LLM into audible speech

For a better understanding, Figure 4 shows a visual representation of the AI Pipeline:

Figure 4: AI Pipeline Overview

To select an LLM for the AI Pipeline, the following five metrics were considered: (1) context window (in thousands of tokens) – the number of tokens (numerical representations of words) that an instance of generative AI can process at once. Essentially, it determines