Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.
Updated Apr 22, 2025 - Python
Resources of our survey paper "Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies"
CLIP as a service - Embed images and sentences, object recognition, visual reasoning, image classification, and reverse image search
Large-scale Auto-Distributed Training/Inference Unified Framework | Memory-Compute-Control Decoupled Architecture | Multi-language SDK & Heterogeneous Hardware Support
EmbeddedLLM: API server for embedded device deployment. Currently supports CUDA/OpenVINO/IpexLLM/DirectML/CPU
Streamlines using PyCoral to run TensorFlow Lite models on an Edge TPU USB device.
Build self-hosted RAG AI agents powered by open-source LLMs: use models from Ollama and Hugging Face, add external API calls and Python and shell scripts for context-aware LLM interactions, add validation checks, and build Bring Your Own Infrastructure (BYOI) Dockerized AI agent images.
Generating image captions using various neural network architectures
Accelerating AI Training and Inference from Storage Perspective (Must-read Papers on Storage for AI)
The primary objective of this project was to build and deploy an image classification model for Scones Unlimited, a scone-delivery-focused logistics company, using AWS SageMaker.
Image classifiers are used in computer vision to identify the content of an image. They are used across a broad variety of industries, from advanced technologies like autonomous vehicles and augmented reality to eCommerce platforms and even diagnostic medicine.
😊📸 Real-Time Facial Emotion Recognition using Deep Learning 🤖🧠
Example distributed system for ML model inference using Kafka, including a Spring Boot REST+JPA server with a Java consumer program
Fine-tuned a pretrained DistilBERT transformer model to classify social media text into one of four cyberbullying labels (ethnicity/race, gender/sexual, religion, and not cyberbullying) with an accuracy of 99%.
Developed a fine-tuned DistilBERT transformer model that predicts the overall sentiment of a piece of financial news with an accuracy of nearly 81.5%.
This project is a web-based application that uses a pre-trained Mask R-CNN model to detect and classify car damage types (scratch, dent, shatter, dislocation) from images. Users can upload an image of a car, and the application highlights damaged areas with bounding boxes and masks, giving a clear visual representation of the detected damage.
A cloud run function to invoke a prediction against a machine learning model that has been trained outside of a cloud provider.
Built a text summarization model using Seq2Seq modeling with Luong attention that produces short, concise summaries of global news headlines.