Commit
history rewrite to remove dataset from git
1 parent
0ab953b
commit fd39d37
Showing 19 changed files with 1,360 additions and 166,030 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,8 @@ | ||
node_modules | ||
*.swp | ||
*.env | ||
|
||
dataset.csv | ||
model.vec | ||
test.txt | ||
train.txt |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
web: npm start |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,77 @@ | ||
# ticket-tagger | ||
# ticket-tagger | ||
|
||
### Development | ||
|
||
#### get started: | ||
|
||
```sh | ||
git clone https://github.com/rafaelkallis/ticket-tagger ticket-tagger | ||
cd ticket-tagger | ||
npm install | ||
|
||
# run benchmark | ||
npm run benchmark | ||
|
||
# run server | ||
npm start | ||
``` | ||
|
||
#### customize preprocessing: | ||
|
||
```js | ||
/* src/preprocess.js */ | ||
|
||
const stemmer = require('natural').PorterStemmer; | ||
|
||
/* example preprocessing method */ | ||
module.exports = function(text) { | ||
const stem = stemmer.tokenizeAndStem(text); | ||
return stem.join(' '); | ||
} | ||
``` | ||
|
||
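As an illustration of how the preprocessing hook could be customized, here is a minimal sketch that avoids external dependencies. It is a hypothetical alternative, not the project's own method: the URL placeholder token and the character filter are assumptions chosen for the example.

```js
/* hypothetical alternative for src/preprocess.js (no external dependencies) */

/* lowercases the text, replaces URLs with a placeholder token,
 * strips everything except letters, digits and whitespace,
 * and collapses repeated whitespace */
function preprocess(text) {
  return text
    .toLowerCase()
    .replace(/https?:\/\/\S+/g, ' url ')
    .replace(/[^a-z0-9\s]/g, ' ')
    .split(/\s+/)
    .filter(Boolean)
    .join(' ');
}

module.exports = preprocess;
```

Like the stemming example, the function takes the raw issue text and returns a single space-separated string, so it could be swapped in without changing the caller.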
#### generate dataset: | ||
|
||
a dataset (with 10k bugs, 10k enhancements and 10k questions) is already included in the repository, or can be found [here](https://gist.github.com/rafaelkallis/707743843fa0337277ab36b42607c46d). | ||
the dataset was generated from GitHub Archive data, which can be accessed through Google [BigQuery](https://bigquery.cloud.google.com). | ||
|
||
add the query below to your BigQuery console and adjust if needed. | ||
|
||
```sql | ||
SELECT | ||
label, CONCAT(title, ' ', REGEXP_REPLACE(body, '(\r|\n|\r\n)',' ')) | ||
FROM ( | ||
SELECT | ||
LOWER(JSON_EXTRACT_SCALAR(payload, '$.issue.labels[0].name')) AS label, | ||
JSON_EXTRACT_SCALAR(payload, '$.issue.title') AS title, | ||
JSON_EXTRACT_SCALAR(payload, '$.issue.body') AS body | ||
FROM | ||
[githubarchive:day.20180201], | ||
[githubarchive:day.20180202], | ||
[githubarchive:day.20180203], | ||
[githubarchive:day.20180204], | ||
[githubarchive:day.20180205] | ||
WHERE | ||
type = 'IssuesEvent' | ||
AND JSON_EXTRACT_SCALAR(payload, '$.action') = 'closed' ) | ||
WHERE | ||
(label = 'bug' OR label = 'enhancement' OR label = 'question') | ||
AND body != 'null'; | ||
``` | ||
|
||
#### run serverless app: | ||
|
||
you need a `.env` file in order to run the marketplace app. | ||
The file should look like this: | ||
|
||
``` | ||
GITHUB_CERT=/path/to/cert.private-key.pem | ||
GITHUB_SECRET=123456 | ||
GITHUB_APP_ID=123 | ||
PORT=3000 | ||
``` | ||
|
||
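A server could check these variables at startup so that a missing entry fails early with a clear message. The sketch below assumes the four names from the example `.env` file; the `readConfig` helper and the shape of its return value are hypothetical.

```js
/* variable names taken from the example .env file */
const REQUIRED = ['GITHUB_CERT', 'GITHUB_SECRET', 'GITHUB_APP_ID', 'PORT'];

/* hypothetical helper: validates an environment object and returns a config object */
function readConfig(env) {
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`missing environment variables: ${missing.join(', ')}`);
  }
  return {
    cert: env.GITHUB_CERT,
    secret: env.GITHUB_SECRET,
    appId: Number(env.GITHUB_APP_ID),
    port: Number(env.PORT),
  };
}
```

Calling `readConfig(process.env)` after the `.env` file has been loaded would then either return the parsed settings or throw, listing every missing name.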
#### references: | ||
|
||
- [Building GitHub Apps](https://developer.github.com/apps/building-github-apps/) | ||
- [fastText](https://fasttext.cc) |
This file was deleted.