
Branch-and-Merge Method: Enhancing Language Adaptation in AI Models by Mitigating Catastrophic Forgetting and Ensuring Retention of Base Language Capabilities while Learning New Languages

Language model adaptation is a crucial area in artificial intelligence, focusing on enhancing large pre-trained language models to work effectively across various languages. This research is vital for enabling these…
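The core mechanism is easy to sketch: rather than fine-tuning one model continuously on new-language data, several branches are trained and their weights merged, so no single branch drifts far from the base model's capabilities. Below is a minimal, hypothetical illustration of the weight-merging step in PyTorch; the `merge_state_dicts` helper and the random perturbation standing in for branch training are illustrative, not the paper's exact procedure.

```python
import copy
import torch
import torch.nn as nn

def merge_state_dicts(state_dicts, weights=None):
    """Weighted average of state dicts from identically shaped models."""
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n
    merged = copy.deepcopy(state_dicts[0])
    for key in merged:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Toy demo: two "branches" derived from one base model (random perturbation
# stands in for fine-tuning on a slice of new-language data), merged back.
base = nn.Linear(8, 8)
branch_dicts = []
for _ in range(2):
    branch = copy.deepcopy(base)
    with torch.no_grad():
        for p in branch.parameters():
            p.add_(0.01 * torch.randn_like(p))  # stand-in for branch training
    branch_dicts.append(branch.state_dict())

base.load_state_dict(merge_state_dicts(branch_dicts))
```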

Arena Learning: Transforming Post-Training of Large Language Models with AI-Powered Simulated Battles for Enhanced Efficiency and Performance in Natural Language Processing

Large language models (LLMs) have shown exceptional capabilities in understanding and generating human language, making substantial contributions to applications such as conversational AI. Chatbots powered by LLMs can engage in…
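The battle loop itself is straightforward to sketch: two models answer the same prompt, a judge model picks a winner, and the winning responses become post-training data. The snippet below is a hypothetical stand-in, with `model_a`, `model_b`, and `judge` as placeholders rather than the paper's actual pipeline:

```python
import random

def model_a(prompt):
    """Stand-in for the model being post-trained."""
    return f"A's answer to: {prompt}"

def model_b(prompt):
    """Stand-in for a competitor model in the arena."""
    return f"B's answer to: {prompt}"

def judge(prompt, answer_a, answer_b):
    """Stand-in for an LLM judge that prefers one answer."""
    return answer_a if random.random() < 0.5 else answer_b

training_data = []
for prompt in ["Explain retrieval-augmented generation.", "What is SLAM?"]:
    a, b = model_a(prompt), model_b(prompt)
    winner = judge(prompt, a, b)           # simulated battle
    training_data.append({"prompt": prompt, "chosen": winner})
print(training_data)
```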

Metron: A Holistic AI Framework for Evaluating User-Facing Performance in LLM Inference Systems

Evaluating the performance of large language model (LLM) inference systems using conventional metrics presents significant challenges. Metrics such as Time To First Token (TTFT) and Time Between Tokens (TBT) do…
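For reference, the conventional metrics the paper critiques are simple to compute from token arrival timestamps; a minimal sketch with hypothetical timings:

```python
def ttft(request_start, token_times):
    """Time To First Token: delay from request submission to the first token."""
    return token_times[0] - request_start

def tbt(token_times):
    """Time Between Tokens: gaps between consecutive token arrivals."""
    return [b - a for a, b in zip(token_times, token_times[1:])]

token_times = [0.42, 0.47, 0.55, 0.58]   # seconds; hypothetical arrivals
print(ttft(0.0, token_times))            # 0.42
print(max(tbt(token_times)))             # worst inter-token stall, ~0.08
```

Metron's argument is that aggregate numbers like these miss the user-facing experience, such as stalls mid-stream, which is what its holistic evaluation targets.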

Optimizing Large Language Models (LLMs) on CPUs: Techniques for Enhanced Inference and Efficiency

Large Language Models (LLMs) built on the Transformer architecture have recently reached significant technological milestones. These models show remarkable skill in comprehending and producing writing that resembles that of…
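A representative example of such CPU-oriented techniques is low-bit weight quantization. The sketch below shows a generic per-tensor symmetric int8 scheme in NumPy; it illustrates the general idea, not the paper's specific recipe:

```python
import numpy as np

def quantize_int8(w):
    """Per-tensor symmetric quantization: float weights -> int8 plus a scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
print(np.abs(w - dequantize(q, scale)).max())  # small reconstruction error
```

Storing weights as int8 roughly quarters memory traffic versus float32, which is often the bottleneck for CPU inference.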

Meet Reworkd: An AI Startup that Automates End-to-end Data Extraction

Collecting, monitoring, and maintaining a web data pipeline can be daunting and time-consuming when dealing with large amounts of data. The struggles of traditional approaches can compromise data quality and availability with…

FBI-LLM (Fully BInarized Large Language Model): An AI Framework Using Autoregressive Distillation for 1-bit Weight Binarization of LLMs from Scratch

Transformer-based LLMs like ChatGPT and LLaMA excel in tasks requiring domain expertise and complex reasoning due to their large parameter sizes and extensive training data. However, their substantial computational and…
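A rough sketch of the two ingredients the headline names, scaled sign binarization of the weights and a distillation loss against a full-precision teacher, is shown below; the scaling choice (mean absolute value, in the style of XNOR-Net) is an assumption for illustration:

```python
import torch
import torch.nn.functional as F

def binarize(w):
    """Scaled sign binarization: alpha * sign(W), with alpha = mean |W|."""
    alpha = w.abs().mean()
    return alpha * torch.sign(w)

def distill_loss(student_logits, teacher_logits, temperature=1.0):
    """KL divergence pushing the student's next-token distribution
    toward the full-precision teacher's (autoregressive distillation)."""
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)

w = torch.randn(8, 8)
print(binarize(w).unique())  # exactly two values: -alpha and +alpha
```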

Hyperion: A Novel, Modular, Distributed, High-Performance Optimization Framework Targeting both Discrete and Continuous-Time SLAM Applications

In robotics, understanding the position and movement of a sensor suite within its environment is crucial. Traditional approaches to Simultaneous Localization and Mapping (SLAM) often face challenges with unsynchronized sensor…
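The continuous-time formulation addresses exactly this: the trajectory is a function of time, so an unsynchronized measurement can be compared against the interpolated pose at its own timestamp. A toy 1D illustration follows; Hyperion itself uses far richer trajectory representations:

```python
import numpy as np

knot_times = np.array([0.0, 1.0, 2.0])   # control times along the trajectory
knot_pos = np.array([0.0, 1.0, 4.0])     # estimated positions at those times

def pose_at(t):
    """Query the continuous trajectory at an arbitrary sensor timestamp."""
    return np.interp(t, knot_times, knot_pos)

sensor_time, measurement = 1.3, 1.95           # hypothetical unsynchronized reading
residual = measurement - pose_at(sensor_time)  # term a SLAM optimizer minimizes
print(residual)                                # ~0.05
```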

Enhancing LLM Reliability: The Lookback Lens Approach to Hallucination Detection

Large Language Models (LLMs) like GPT-4 exhibit impressive capabilities in text generation tasks such as summarization and question answering. However, they often produce “hallucinations,” generating content that is factually incorrect…
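The Lookback Lens builds on a simple signal: for each attention head, the ratio of attention mass placed on the input context versus on previously generated tokens, with those per-head ratios fed to a lightweight classifier. A hypothetical sketch of the ratio computation for a single decoding step:

```python
import numpy as np

def lookback_ratio(attn, n_context):
    """attn: (num_heads, seq_len) attention weights at the current step.
    Returns each head's share of attention on the input context."""
    ctx = attn[:, :n_context].sum(axis=1)
    new = attn[:, n_context:].sum(axis=1)
    return ctx / (ctx + new + 1e-9)

num_heads, seq_len, n_context = 12, 20, 15
attn = np.random.rand(num_heads, seq_len)
attn /= attn.sum(axis=1, keepdims=True)  # rows normalized like softmax output
features = lookback_ratio(attn, n_context)
# These per-head ratios would be features for a simple classifier
# (e.g. logistic regression) that flags likely hallucinated spans.
```

The intuition: generations grounded in the provided context tend to keep attending to it, while hallucinated spans attend more to the model's own prior output.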

Korvus: An All-in-One Open-Source RAG (Retrieval-Augmented Generation) Pipeline Built for Postgres

The Retrieval-Augmented Generation (RAG) pipeline includes four major steps: generating embeddings for queries and documents, retrieving relevant documents, analyzing the retrieved data, and generating the final response. Each of these…
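For orientation, those four steps look like this as a bare-bones pipeline. This is a generic illustration with stand-in models, not Korvus's API; Korvus's contribution is collapsing these steps into a single Postgres query rather than orchestrating them in application code:

```python
import numpy as np

docs = ["Postgres supports extensions.", "RAG retrieves context before generating."]

def embed(text):
    """Stand-in embedding model: pseudo-random vector keyed on the text."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(8)

doc_vecs = [embed(d) for d in docs]                        # 1. embed documents
query = "How does RAG work?"
q = embed(query)                                           #    ...and the query
cos = [v @ q / (np.linalg.norm(v) * np.linalg.norm(q)) for v in doc_vecs]
best_doc = docs[int(np.argmax(cos))]                       # 2. retrieve
prompt = f"Context: {best_doc}\nQuestion: {query}"         # 3. analyze/assemble
answer = f"(an LLM would generate a response to: {prompt!r})"  # 4. generate
```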

Q-GaLore Released: A Memory-Efficient Training Approach for Pre-Training and Fine-Tuning Machine Learning Models

Large Language Models (LLMs) have become critical tools in various domains due to their exceptional ability to understand and generate human language. These models, which often contain billions of parameters,…
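Q-GaLore builds on GaLore's idea of keeping optimizer state in a low-rank projection of the gradient, adding quantization of the projection matrices and weights to cut memory further. A sketch of the projection step alone, with an illustrative rank and shapes:

```python
import torch

def project_gradient(grad, rank=4):
    """GaLore-style step: project a 2D gradient into a rank-r subspace,
    where optimizer state would live, then map the update back."""
    U, _, _ = torch.linalg.svd(grad, full_matrices=False)
    P = U[:, :rank]          # projection basis; Q-GaLore stores this in
                             # low-bit (quantized) form to save more memory
    low_rank = P.T @ grad    # compact representation for the optimizer
    return P @ low_rank      # full-shape update reconstructed from rank-r

grad = torch.randn(64, 32)
update = project_gradient(grad)
print(grad.shape, update.shape)  # both torch.Size([64, 32])
```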