Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Title extraction from meta + regex fixes #73

Merged
merged 7 commits into from
Jul 13, 2016
Merged

Conversation

haroldtreen
Copy link
Collaborator

@haroldtreen haroldtreen commented Jun 11, 2016

Changes:

  • Update getTitle to check the meta tags for titles. (Twitter + Open Graph titles)
  • Concatenate class and id with a newline. This allows you to match on things like post but only if it is at the start of the class/id.
  • Update the regex to fix some sites that were failing to extract content.
  • Add regression tests to check the extraction/title fixes.

How to test:

  • Run npm test with and without the helper changes to verify that they are fixing article extraction for the added tests.

Let me know if you have any questions/comments/concerns @luin :).

@haroldtreen
Copy link
Collaborator Author

Bump!
@luin

@haroldtreen
Copy link
Collaborator Author

Hey @luin.

I submitted this over a month ago and there's been no response.
I'm feeling discouraged from continuing to contribute this repo as a result.

Can you let me know what you want to do with this?

@luin luin merged commit 4e0c009 into luin:master Jul 13, 2016
@haroldtreen
Copy link
Collaborator Author

Thanks @luin :)

@luin
Copy link
Owner

luin commented Jul 13, 2016

Hi @haroldtreen. Sorry for the very late response! A little busy this month. I checked the code and it's awesome! Just merged it.

btw, I added you as a contributor of this repo. 😆

@haroldtreen
Copy link
Collaborator Author

Woot! Thanks @luin! :D

I'm actually going to a programming retreat in September and will be working on Open Source for 3 months. Content extraction has been a fun problem, so I'll probably have more to add. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants