ModelScope: bring the notion of Model-as-a-Service to life.
Introduction
ModelScope is built upon the notion of "Model-as-a-Service" (MaaS). It seeks to bring together the most advanced machine learning models from the AI community, and to streamline the process of leveraging AI models in real-world applications. The core ModelScope library open-sourced in this repository provides the interfaces and implementations that allow developers to perform model inference, training and evaluation.
In particular, with rich layers of API abstraction, the ModelScope library offers a unified experience for exploring state-of-the-art models spanning domains such as CV, NLP, Speech, Multi-Modality, and Scientific Computation. Model contributors in different areas can integrate models into the ModelScope ecosystem through the layered APIs, allowing easy and unified access to their models. Once integrated, model inference, fine-tuning, and evaluation can be done with only a few lines of code. Meanwhile, flexibility is also provided so that different components of a model application can be customized wherever necessary.
Apart from harboring implementations of a wide range of models, the ModelScope library also enables the necessary interactions with ModelScope backend services, particularly with the Model-Hub and Dataset-Hub. Such interactions allow management of various entities (models and datasets) to be performed seamlessly under the hood, including entity lookup, version control, cache management, and many others.
Models and Online Accessibility
Hundreds of models are made publicly available on ModelScope (700+ and counting), covering the latest developments in areas such as NLP, CV, Audio, Multi-modality, and AI for Science. Many of these models represent the SOTA in their specific fields and made their open-source debut on ModelScope. Users can visit ModelScope (modelscope.cn) and experience first-hand how these models perform, with just a few clicks. An immediate developer experience is also possible through the ModelScope Notebook, which is backed by a ready-to-use CPU/GPU development environment in the cloud, only one click away on ModelScope.
<p align="center">
  <br>
  <img src="data/resource/inference.gif" width="1024"/>
  <br>
</p>

Some representative examples include:
LLM:
Multi-Modal:
CV:
- DuGuang - Text Recognition - Line Recognition Model - Chinese and English - General Domain
Audio:
- Paraformer Speech Recognition - Chinese - General - 16k - Offline - Large - Long Audio Version
- FSMN Voice Endpoint Detection - Chinese - General - 16k - onnx
- Monotonic-Aligner Speech Timestamp Prediction - 16k - Offline
- Speech Synthesis - Chinese - Multiple Emotions Domain - 16k - Multiple Speakers
AI for Science:
Note: most models on ModelScope are public and can be downloaded directly from the website. For downloading models with the API provided by the ModelScope library, or via git, please refer to the model download instructions.
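As a minimal sketch of the library-based download, the snippet below fetches a model's files through the snapshot_download utility and returns the local directory; the revision and cache_dir values shown here are illustrative assumptions, and both arguments are optional:

>>> from modelscope.hub.snapshot_download import snapshot_download
>>> # Download the model files (or reuse a cached copy) and get the local path;
>>> # revision and cache_dir values below are placeholders
>>> model_dir = snapshot_download('damo/nlp_structbert_word-segmentation_chinese-base',
...                               revision='v1.0.0', cache_dir='./model_cache')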
QuickTour
We provide a unified interface for inference using pipeline, and for fine-tuning and evaluation using Trainer, across different tasks.
For any given task with any type of input (image, text, audio, video...), an inference pipeline can be set up with only a few lines of code, automatically loading the underlying model to produce the inference result, as exemplified below:
>>> from modelscope.pipelines import pipeline
>>> word_segmentation = pipeline('word-segmentation', model='damo/nlp_structbert_word-segmentation_chinese-base')
>>> word_segmentation('今天天气不错,适合出去游玩')
{'output': '今天 天气 不错 , 适合 出去 游玩'}
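(The input sentence translates to "The weather is nice today, good for an outing.") The same pipeline can also take a list of inputs, returning one result per item; a small sketch, where the second sentence ("This book is good, I suggest you read it") is added for illustration:

>>> # Batch input: a list of sentences yields a list of results
>>> word_segmentation(['今天天气不错,适合出去游玩', '这本书很好,建议你看看'])
[{'output': '今天 天气 不错 , 适合 出去 游玩'}, {'output': '这 本 书 很 好 , 建议 你 看看'}]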
Given an image, portrait matting (i.e. background removal) can be accomplished with the following code snippet:

>>> import cv2
>>> from modelscope.pipelines import pipeline
>>> portrait_matting = pipeline('portrait-matting')
>>> result = portrait_matting('https://modelscope.oss-cn-beijing.aliyuncs.com/test/images/image_matting.png')
>>> cv2.imwrite('result.png', result['output_img'])
The output image with the background removed is:

Fine-tuning and evaluation can also be done with a few more lines of code to set up the training dataset and trainer, with the heavy lifting of training and evaluating a model encapsulated in the implementation of the trainer.train() and trainer.evaluate() interfaces.
For example, the gpt3 base model (1.3B) can be fine-tuned on the chinese-poetry-collection dataset, resulting in a model that can be used for Chinese poetry generation.
>>> from modelscope.metainfo import Trainers
>>> from modelscope.msdatasets import MsDataset
>>> from modelscope.trainers import build_trainer
>>> train_dataset = MsDataset.load('chinese-poetry-collection', split='train').remap_columns({'text1': 'src_txt'})
>>> eval_dataset = MsDataset.load('chinese-poetry-collection', split='test').remap_columns({'text1': 'src_txt'})
>>> kwargs = dict(model='damo/nlp_gpt3_text-generation_1.3B',
...     train_dataset=train_dataset, eval_dataset=eval_dataset,
...     max_epochs=10, work_dir='./gpt3_poetry')
>>> trainer = build_trainer(name=Trainers.gpt3_trainer, default_args=kwargs)
>>> trainer.train()
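Evaluation on the held-out split can then be run through the same trainer; a minimal sketch, assuming evaluate() falls back to the eval_dataset configured above when no checkpoint path is given:

>>> # Evaluate the fine-tuned model on eval_dataset and report metrics
>>> trainer.evaluate()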