Keras is a widely used machine learning library known for its high-level abstractions and ease of use, which enable rapid experimentation. Recent advances in computer vision (CV) and natural language processing (NLP) have introduced new challenges. Training large, state-of-the-art models is prohibitively expensive, so access to open-source pretrained models is crucial. Preprocessing and metrics computation have also grown more complex because of the variety of techniques and of frameworks such as JAX, TensorFlow, and PyTorch. Improving NLP training performance is difficult as well: tools like the XLA compiler offer speedups but add complexity to tensor operations.
Researchers from the Keras Team at Google LLC have introduced KerasCV and KerasNLP, extensions of the Keras API for CV and NLP. Both packages support JAX, TensorFlow, and PyTorch and emphasize ease of use and performance. Their design is modular: at the low level they offer building blocks for models and data preprocessing, and at the high level they offer pretrained task models for popular architectures such as Stable Diffusion and GPT-2. These task models include built-in preprocessing, pretrained weights, and fine-tuning capabilities. The libraries support XLA compilation and use TensorFlow’s tf.data API for efficient preprocessing. They are open-source and available on GitHub.
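To make the backend choice and the high-level task models more concrete, here is a minimal sketch. It assumes Keras 3 and the keras_nlp package are installed, that the backend is selected through the KERAS_BACKEND environment variable before Keras is first imported, and that "gpt2_base_en" is an available preset. The helper names configure_backend and generate_text are hypothetical and exist only for this illustration.

```python
import os

# Backend names accepted by Keras 3 via the KERAS_BACKEND environment variable.
SUPPORTED_BACKENDS = ("jax", "tensorflow", "torch")


def configure_backend(name: str = "jax") -> str:
    """Select the Keras backend (hypothetical helper).

    This must be called before `keras` or `keras_nlp` is imported for the
    first time, because the backend is read at import time.
    """
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    os.environ["KERAS_BACKEND"] = name
    return name


def generate_text(prompt: str, preset: str = "gpt2_base_en", max_length: int = 30):
    """Load a pretrained GPT-2 task model and generate a continuation.

    The import is deferred so that the backend choice above takes effect.
    Calling this downloads pretrained weights, so it needs network access
    (or a local cache) and enough memory for the model.
    """
    import keras_nlp  # assumed to be installed

    # The task model bundles tokenization, the backbone, and pretrained weights.
    model = keras_nlp.models.GPT2CausalLM.from_preset(preset)
    return model.generate(prompt, max_length=max_length)
```

A typical session would call configure_backend("jax") first and then generate_text("Keras is") on a machine with the packages installed. Swapping the backend name is the only change needed to run the same model code on another framework.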
The HuggingFace Transformers library parallels KerasNLP and KerasCV, offering pretrained model checkpoints for many transformer architectures. HuggingFace follows a “repeat yourself” approach, in which each architecture’s code is largely self-contained, whereas KerasNLP adopts a layered approach that reimplements large language models with minimal code by reusing shared components. Both methods have their pros and cons. KerasCV and KerasNLP publish all pretrained models on Kaggle Models, where they are accessible in Kaggle competition notebooks even in Internet-off mode. Table 1 compares the average time per training or inference step for models such as SAM, Gemma, BERT, and Mistral across Keras versions and backend frameworks.
The Keras Domain Packages API adopts a layered design with three main abstraction levels. Foundational Components are composable modules for building preprocessing pipelines, models, and evaluation logic, and they can be used independently of the rest of the Keras ecosystem. Pretrained Backbones are fine-tuning-ready models that, for NLP, come with matching tokenizers. Task Models are specialized for tasks such as text generation or object detection and combine the lower-level modules behind a unified training and inference interface. These models can be used with the PyTorch, TensorFlow, and JAX backends. KerasCV and KerasNLP also support the Keras Unified Distribution API for model and data parallelism, which simplifies the transition from single-device to multi-device training.
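The three abstraction levels can be illustrated with a toy, pure-Python sketch. Nothing below is the KerasCV or KerasNLP API: the classes WhitespaceTokenizer, BagOfTokensBackbone, and TextClassifier are hypothetical stand-ins, and the "backbone" is a simple token counter rather than a neural network. The point is only to show how a task model composes a preprocessing component and a backbone behind one interface.

```python
from typing import List, Sequence


class WhitespaceTokenizer:
    """Level 1 (foundational component): usable on its own."""

    def __init__(self, vocabulary: Sequence[str]):
        self.token_to_id = {tok: i for i, tok in enumerate(vocabulary)}
        self.unk_id = len(vocabulary)  # id reserved for unknown tokens

    def __call__(self, text: str) -> List[int]:
        return [self.token_to_id.get(t, self.unk_id) for t in text.lower().split()]


class BagOfTokensBackbone:
    """Level 2 (backbone stand-in): maps token ids to a feature vector."""

    def __init__(self, vocab_size: int):
        self.vocab_size = vocab_size

    def __call__(self, token_ids: Sequence[int]) -> List[float]:
        counts = [0.0] * self.vocab_size
        for token_id in token_ids:
            counts[token_id] += 1.0
        return counts


class TextClassifier:
    """Level 3 (task model): raw text in, prediction out."""

    def __init__(self, tokenizer, backbone, weights: Sequence[float], bias: float = 0.0):
        self.tokenizer = tokenizer
        self.backbone = backbone
        self.weights = list(weights)
        self.bias = bias

    def predict(self, text: str) -> int:
        features = self.backbone(self.tokenizer(text))
        score = sum(w * f for w, f in zip(self.weights, features)) + self.bias
        return 1 if score > 0 else 0


# Each level can be used alone or composed into the next one.
tokenizer = WhitespaceTokenizer(["good", "bad"])
backbone = BagOfTokensBackbone(vocab_size=3)  # two known tokens + unknown
classifier = TextClassifier(tokenizer, backbone, weights=[1.0, -1.0, 0.0])
```

In this sketch the caller of classifier.predict never touches token ids or feature vectors, which mirrors the idea of a task model with built-in preprocessing, while the tokenizer and backbone remain available for users who want lower-level control.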
Framework performance varies with the specific model. Keras 3 lets users choose the fastest backend for their task, and it consistently outperforms Keras 2, as shown in Table 1. Benchmarks were conducted on a single NVIDIA A100 GPU with 40 GB of memory on a Google Cloud Compute Engine instance (a2-highgpu-1g) with 12 vCPUs and 85 GB of host memory. For a given model and task (fit or predict), the same batch size was used across frameworks. Batch sizes differed between models and tasks to optimize memory usage and GPU utilization. Gemma and Mistral used the same batch size because they have similar parameter counts.
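An "average time per step" figure of the kind reported in Table 1 can be approximated with a small timing harness. The sketch below is an assumption about how such a measurement might be set up, not the authors’ benchmark code; the function name average_step_ms and the warm-up count are hypothetical.

```python
import time
from typing import Callable


def average_step_ms(
    step_fn: Callable[[], object],
    num_steps: int = 10,
    warmup_steps: int = 2,
) -> float:
    """Return the mean wall-clock time of `step_fn` in milliseconds.

    Warm-up calls are excluded from the measurement so that one-off costs
    such as compilation or cache population do not skew the average.
    """
    if num_steps <= 0:
        raise ValueError("num_steps must be positive")

    for _ in range(warmup_steps):
        step_fn()

    start = time.perf_counter()
    for _ in range(num_steps):
        step_fn()
    elapsed = time.perf_counter() - start

    # Caveat: on accelerators with asynchronous dispatch, step_fn should
    # block until the device has finished, or the timing will be too low.
    return elapsed * 1000.0 / num_steps
```

For a fair comparison across backends, step_fn would wrap one training or inference step with the same model, batch size, and input data for every framework being measured.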
In conclusion, KerasCV and KerasNLP are versatile toolkits that combine modular components for quick model prototyping with a variety of pretrained backbones and task models for computer vision and natural language processing. They cater to JAX, TensorFlow, and PyTorch users and offer state-of-the-art training and inference performance. Future plans include broadening the range of multimodal models to support more diverse applications and refining integrations with backend-specific large-model serving solutions to ensure smooth deployment and scalability. Comprehensive user guides for KerasCV and KerasNLP are available on Keras.io.
Check out the Paper. All credit for this research goes to the researchers of this project.
The post Advancing Machine Learning with KerasCV and KerasNLP: A Comprehensive Overview appeared first on MarkTechPost.
[Source: AI Techpark]