Programming language classification is the task of identifying which programming language is used in an arbitrary code snippet. This can be useful for labeling new data to include in a dataset, and it can serve as an intermediate step in pipelines where input snippets need to be processed according to their programming language.
More generally, this tutorial shows how to pull pre-trained models from the Hugging Face Hub using the Hugging Face Transformers library and convert them to the OpenVINO™ IR format using the Hugging Face Optimum library. You will also learn how to conduct post-training quantization for OpenVINO models using Hugging Face Optimum and do some light benchmarking using the Hugging Face Evaluate library.
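For a first look at how these pieces fit together, here is a minimal sketch (not the notebook's exact code) of pulling a checkpoint from the Hub and exporting it to OpenVINO IR with Optimum. It assumes the `huggingface/CodeBERTa-language-id` checkpoint referenced below; note that older `optimum-intel` releases used `from_transformers=True` instead of `export=True`.

```python
# A minimal sketch: pull a checkpoint from the Hugging Face Hub and let
# Optimum export it to OpenVINO IR on the fly with `export=True`.
from optimum.intel import OVModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "huggingface/CodeBERTa-language-id"
model = OVModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# OVModel* classes plug into the standard transformers pipeline API,
# so inference runs on the OpenVINO runtime with no further changes.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("def add(a, b):\n    return a + b"))
```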
This tutorial is divided into two parts:
- Create a simple inference pipeline with a pre-trained model using the OpenVINO™ IR format.
- Conduct post-training quantization on a pre-trained model using Hugging Face Optimum and benchmark performance (see the sketch after this list).
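As a taste of the second part, here is a minimal sketch of what post-training quantization with Hugging Face Optimum can look like. The `OVQuantizer` API has shifted across `optimum-intel` releases, and the calibration dataset shown here (`code_search_net`, config `python`, column `func_code_string`) is an assumption for illustration, not necessarily what the notebook uses:

```python
# A minimal sketch, not the notebook's exact code. The calibration
# dataset and its column name are assumptions for illustration.
from functools import partial

from optimum.intel import OVQuantizer
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "huggingface/CodeBERTa-language-id"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

def preprocess(examples, tokenizer):
    # Tokenize raw code strings so they can be fed through the model
    # during calibration.
    return tokenizer(
        examples["func_code_string"],
        padding="max_length",
        truncation=True,
        max_length=128,
    )

quantizer = OVQuantizer.from_pretrained(model)
calibration_dataset = quantizer.get_calibration_dataset(
    "code_search_net",
    dataset_config_name="python",
    preprocess_function=partial(preprocess, tokenizer=tokenizer),
    num_samples=120,
    dataset_split="train",
)

# Runs post-training static quantization and saves the INT8 IR model.
quantizer.quantize(
    calibration_dataset=calibration_dataset,
    save_directory="CodeBERTa-language-id-int8",
)
```

The quantized IR can then be reloaded with `OVModelForSequenceClassification.from_pretrained("CodeBERTa-language-id-int8")`, and the Hugging Face Evaluate library provides metrics such as `evaluate.load("accuracy")` for comparing predictions from the original and quantized models.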
Models:
CodeBERTa-small-v1 and CodeBERTa-language-id: no license found on the Hugging Face Hub.
Dataset:
Each example in the dataset is extracted from a GitHub repository, and each repository has its own license. Example-wise license information is not (yet) included in this dataset: you will need to find out for yourself which license applies to the code you use.
Additional resources:
- Grammatical Error Correction with OpenVINO
- Quantize a Hugging Face Question-Answering Model with OpenVINO
First, complete the repository installation steps. Then, activate your virtual environment, or select the right Python interpreter in your IDE.
Additional requirements will be installed directly from the notebook.
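The exact package list lives in the notebook itself; purely as an illustration (the package names and extras here are assumptions), the first install cell might look like:

```python
# Hypothetical install cell; the notebook defines the actual requirements.
%pip install -q "optimum[openvino]" transformers datasets evaluate
```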