Ticket Tagger

Machine learning driven issue classification bot. Add to your repository now!

Development

get started:

git clone https://github.com/rafaelkallis/ticket-tagger ticket-tagger
cd ticket-tagger
npm install
npm run dataset

# run benchmark
npm run benchmark

# run linter
npm run lint

# run tests
npm test

# run server
npm start

experiments:

For each experiment, we need a dataset that allows us to test the stated hypothesis, as well as a baseline dataset that contains the same number of labelled issues.

Does a repository-specific dataset affect the model's performance?
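
As an illustration of what "the same number of labelled issues" can mean in practice, here is a minimal sketch that draws a baseline from a general pool so that it matches the experiment dataset label by label. The function names, the `{ label }` issue shape and the per-label matching are assumptions made for this example, not a description of how the npm scripts build their baselines.

```javascript
// Hypothetical helper: count how many issues carry each label.
function countByLabel(issues) {
  const counts = {};
  for (const issue of issues) {
    counts[issue.label] = (counts[issue.label] || 0) + 1;
  }
  return counts;
}

// Hypothetical helper: pick a random baseline from `pool` with the same
// per-label counts as `experimentIssues` (assumes the pool is large enough).
function sampleBaseline(pool, experimentIssues) {
  const wanted = countByLabel(experimentIssues);

  // Fisher-Yates shuffle on a copy, so the pool itself is left untouched.
  const shuffled = pool.slice();
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }

  // Take issues in shuffled order until every label quota is filled.
  const baseline = [];
  for (const issue of shuffled) {
    if (wanted[issue.label] > 0) {
      baseline.push(issue);
      wanted[issue.label] -= 1;
    }
  }
  return baseline;
}

// Made-up example data.
const experiment = [{ label: "bug" }, { label: "bug" }, { label: "question" }];
const pool = [
  { label: "bug", id: 1 }, { label: "bug", id: 2 }, { label: "bug", id: 3 },
  { label: "question", id: 4 }, { label: "question", id: 5 },
  { label: "enhancement", id: 6 },
];
const baseline = sampleBaseline(pool, experiment);
console.log(countByLabel(baseline));
```

Matching the size (and here also the label distribution) keeps the two benchmark runs comparable, so a difference in score is less likely to come from one model simply seeing more training data.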

# run baseline-issues benchmark
npm run dataset:vscode:baseline
npm run benchmark

# run vscode-issues benchmark
npm run dataset:vscode
npm run benchmark

Does a (spoken) language-specific dataset affect the model's performance?

# run baseline-issues benchmark
npm run dataset:english:baseline
npm run benchmark

# run english-issues benchmark
npm run dataset:english
npm run benchmark

Do code snippets affect the model's performance?

# run baseline-issues benchmark
npm run dataset:nosnip:baseline
npm run benchmark

# run nosnip-issues benchmark
npm run dataset:nosnip
npm run benchmark
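
To make the "nosnip" idea concrete, here is a rough sketch of how code snippets could be stripped from an issue body before training. It assumes snippets are written as Markdown fenced blocks or inline backtick spans; `stripCodeSnippets` is a hypothetical name, and the actual nosnip dataset may have been produced differently.

```javascript
// Hypothetical helper: remove Markdown code from an issue body.
function stripCodeSnippets(body) {
  return body
    // Drop fenced blocks: three backticks, anything (lazily), three backticks.
    .replace(/```[\s\S]*?```/g, " ")
    // Drop inline code spans that stay on a single line.
    .replace(/`[^`\n]*`/g, " ")
    // Collapse the whitespace left behind.
    .replace(/\s+/g, " ")
    .trim();
}

// Made-up example issue body.
const body = "Error when running\n```js\nfoo()\n```\nsee `bar` call";
console.log(stripCodeSnippets(body));
```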

generate dataset:

A dataset (with 10k bugs, 10k enhancements and 10k questions) can be downloaded using npm run dataset. The dataset was generated from GitHub Archive data, which can be accessed through Google BigQuery.

Add the query below to your BigQuery console and adjust it if needed (e.g., add the __label__ prefix to labels).

SELECT
  label, CONCAT(title, ' ', REGEXP_REPLACE(body, '(\r|\n|\r\n)',' '))
FROM (
  SELECT
    LOWER(JSON_EXTRACT_SCALAR(payload, '$.issue.labels[0].name')) AS label,
    JSON_EXTRACT_SCALAR(payload, '$.issue.title') AS title,
    JSON_EXTRACT_SCALAR(payload, '$.issue.body') AS body
  FROM
    [githubarchive:day.20180201],
    [githubarchive:day.20180202],
    [githubarchive:day.20180203],
    [githubarchive:day.20180204],
    [githubarchive:day.20180205]
  WHERE
    type = 'IssuesEvent'
    AND JSON_EXTRACT_SCALAR(payload, '$.action') = 'closed' )
WHERE 
  (label = 'bug' OR label = 'enhancement' OR label = 'question')
  AND body != 'null';
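
If you prefer to add the __label__ prefix after exporting the query results rather than in the query itself, a small script can do it. The sketch below assumes each exported row has a `label` and a `text` field and that one training example per line is wanted; both the row shape and the helper name `toTrainingLine` are assumptions for illustration.

```javascript
// Hypothetical helper: turn one exported row into a single training line.
function toTrainingLine(row) {
  // Collapse any remaining line breaks or repeated spaces so the
  // example stays on one line.
  const text = String(row.text).replace(/\s+/g, " ").trim();
  return `__label__${row.label} ${text}`;
}

// Made-up example rows.
const rows = [
  { label: "bug", text: "Crash on startup\r\nStack trace attached" },
  { label: "question", text: "How do I configure the port?" },
];

console.log(rows.map(toTrainingLine).join("\n"));
```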

run serverless app:

You need a .env file in order to run the GitHub app. The file should look like this:

GITHUB_CERT=/path/to/cert.private-key.pem
GITHUB_SECRET=123456
GITHUB_APP_ID=123
PORT=3000

Note: When running the app in production, environment variables should be provided by the host.
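
Whether the variables come from a .env file or from the host, it can help to validate them once at startup. The following is a minimal sketch, not the app's actual startup code: the variable names are the ones listed above, while `loadConfig`, the returned field names and the validation rules are assumptions. How the .env file gets loaded into the environment is not shown.

```javascript
// Variable names taken from the .env example above.
const REQUIRED_VARS = ["GITHUB_CERT", "GITHUB_SECRET", "GITHUB_APP_ID", "PORT"];

// Hypothetical helper: check that all variables are set and parse the port.
function loadConfig(env) {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  const port = Number.parseInt(env.PORT, 10);
  if (Number.isNaN(port)) {
    throw new Error(`PORT must be a number, got "${env.PORT}"`);
  }

  return {
    certPath: env.GITHUB_CERT,
    secret: env.GITHUB_SECRET,
    appId: env.GITHUB_APP_ID,
    port,
  };
}

// Example with the placeholder values from above; a real process
// would pass process.env instead.
const config = loadConfig({
  GITHUB_CERT: "/path/to/cert.private-key.pem",
  GITHUB_SECRET: "123456",
  GITHUB_APP_ID: "123",
  PORT: "3000",
});
console.log(config.port);
```

Failing early with a clear message is usually easier to debug than a webhook handler that breaks later because a secret was never set.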
