Migrate to automatic README translation #2053
Comments
Hey @rickstaa |
@Pranav2612000 First of all, welcome to the community! Excellent! Amazing that you want to help us improve the maintainability of the repository. I am unsure how hard it is to implement this feature and whether https://github.com/dephraiim/translate-readme serves our needs. The implementation found in translate-readme is quite basic (see https://github.com/dephraiim/translate-readme/blob/main/index.js), so it might not filter out the query parameters found in the code blocks. We might therefore need to improve this action or create a new one. My original idea
🇳🇱 🇫🇷 🇺🇸 🇩🇪 🇮🇹
|
@Pranav2612000 Looks like the https://github.com/dephraiim/translate-readme does not yet support |
Yeah. Took a look at https://github.com/dephraiim/translate-readme and I agree we'll need to modify this a bit. I'll see if I can come up with something so that we don't translate the query params and only translate the non-code text. |
@Pranav2612000 I did some research, and the paid Google Translate API does handle HTML code (see https://cloud.google.com/translate/docs/advanced/translating-text-v3). However, it does not filter markdown code blocks and will likely translate the code in those blocks. These blocks, therefore, have to be filtered out before translation and injected back afterwards using regex. Further, Google will charge $10 per million characters after the 500000 free characters per month have been used up, and users have to set up an API key to get it to work. In contrast, https://github.com/dephraiim/translate-readme uses https://github.com/iamtraction/google-translate/blob/master/src/tokenGenerator.js#L73, which simply makes a call to translate.google.com. The results are therefore more unstable, limited to 5000 characters, and require more regex filtering before they can be used. Therefore, I think this should be possible with both the free and paid versions, but it does require significant development time to filter out markdown code blocks and HTML. |
Still, feel free to try to tackle this if you think it can be done in the time you had set for implementing this feature. 🤔 I think both versions (paid and free) would require some parsing to ensure that markdown code blocks and HTML code are still valid. I did not search yet, but there might be some packages that can already provide this ability. |
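The filter-and-inject step discussed above could look roughly like the sketch below. It is a minimal illustration and not what translate-readme or the Google Translate API does: the names `mask_code`, `unmask_code`, and `translate_markdown`, and the `@@N@@` placeholder format, are made up for this example, and `translate` stands in for whatever translation service ends up being used. It only covers fenced code blocks and inline code spans, not HTML.

```python
import re
from typing import Callable, List, Tuple

# Fenced code blocks are tried first, then single-backtick inline code spans.
CODE_PATTERN = re.compile(r"```.*?```|`[^`\n]+`", re.DOTALL)
# Hypothetical placeholder format; assumed to pass through translation unchanged.
PLACEHOLDER_PATTERN = re.compile(r"@@(\d+)@@")


def mask_code(markdown: str) -> Tuple[str, List[str]]:
    """Replace every code block or span with a numbered placeholder."""
    saved: List[str] = []

    def stash(match: "re.Match[str]") -> str:
        saved.append(match.group(0))
        return f"@@{len(saved) - 1}@@"

    return CODE_PATTERN.sub(stash, markdown), saved


def unmask_code(text: str, saved: List[str]) -> str:
    """Put the original code back in place of each placeholder."""
    return PLACEHOLDER_PATTERN.sub(lambda m: saved[int(m.group(1))], text)


def translate_markdown(markdown: str, translate: Callable[[str], str]) -> str:
    """Translate only the non-code text of a markdown document."""
    masked, saved = mask_code(markdown)
    return unmask_code(translate(masked), saved)
```

Calling `translate_markdown(readme_text, some_translate_function)` would then send only the prose to the translator. Whether a real service leaves the placeholders untouched is an assumption that would need to be checked before relying on this.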
Hey everyone, how about using a localization platform? I like Crowdin, a cloud-based solution that streamlines localization management for your team. It's free for open-source projects. Crowdin allows the community to collaborate on content translation, and it is possible to set up automatic translation synchronization using Crowdin's native GitHub integration or GitHub Actions. Node.js CLI Apps Best Practices is an excellent example of a project that uses Crowdin for community translations together with a GH Action for automatic synchronization. I would be happy to help with the setup. |
Hey there! @rickstaa I think I can tackle this. If you could assign this to me, that would be great. Also, would you be willing to use paid solutions? |
@andrii-bodnar Thanks for your message. I appreciate you trying to provide me with a solution. 👍🏻 I checked your profile and see you are a software engineer at Crowdin. I don't mind since you offer a valid solution, but some people might take issue with that. Maybe next time, add a disclaimer to your comment. That said, I checked your documentation, videos and platform, and I have to say that I'm impressed by the tool you created. I think it is beneficial for streamlining translations for big projects. Thanks for bringing it to my attention. For our small project, however, I think it does not offer much improvement over the translations.js we are currently using. The main thing I am trying to solve with #2053 is to eliminate the manual translations of the readme we currently use, since these are often incorrect and outdated and clutter our PR backlog. I am therefore looking for an action that uses a service like the Google Translate API or the free Google Translate website to do the translation. I found https://github.com/dephraiim/translate-readme, but as explained above, it does not support our readme because of the HTML and markdown code blocks. |
@parinzee Thanks for offering to help implement this feature. Since https://github.com/anuraghazra/github-readme-stats is a free, open-source project, we can, unfortunately, not rely on paid solutions. The reason I mention the Google Translate API is that it offers 500000 free translation characters per month, which should be enough to translate the readme (which has |
@parinzee, @Pranav2612000 I just removed the |
@andrii-bodnar I had some time to look at Crowdin and implemented it on one of my other OS repositories. For the card translations, I think it is a significant improvement over the manual translation PR. If @anuraghazra is okay with it, we can use Crowdin for the card translations (i.e. https://github.com/anuraghazra/github-readme-stats/blob/master/src/translations.js). Maybe we can also add the README translations later, as I'm still thinking about creating an automated solution using the Google translate API. If you could set it up, that would be great 🎉. We can then add a note to both the README.md and CONTRIBUTING.md to explain how users can add card translations. My Crowdin account is TODOs
|
@anuraghazra, What are your thoughts about using Crowdin for our card translations? I think it improves the translation procedure or do you think it is a bit overkill for only the https://github.com/anuraghazra/github-readme-stats/blob/master/src/translations.js file 🤔? |
Hi @rickstaa, happy to hear about your success with the Crowdin implementation in the GitHub Emoji Picker project! I just checked translations.js, and it seems like it requires some refactoring to be ready for automatic localization. The main issue is that all the languages are located inside a single file. It would be great to split these languages into separate files and ideally store them as JSON. From my perspective, Crowdin could be used here for translating both the card texts and the Readme. Readme files could be translated through automatic workflows via MT engines. That would also give translators the possibility to suggest better translations, since MT engines might provide bad results. |
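The split into per-language JSON files suggested above might be sketched as follows. This is only an illustration under assumptions: it supposes the combined translations can be loaded as a mapping of the form `{key: {locale: text}}`, which may not match how translations.js is actually laid out, and the function names and the `<locale>.json` file naming are hypothetical.

```python
import json
from collections import defaultdict
from pathlib import Path
from typing import Dict

Strings = Dict[str, str]


def split_by_locale(combined: Dict[str, Strings]) -> Dict[str, Strings]:
    """Turn {key: {locale: text}} into {locale: {key: text}}."""
    per_locale: Dict[str, Strings] = defaultdict(dict)
    for key, translations in combined.items():
        for locale, text in translations.items():
            per_locale[locale][key] = text
    return dict(per_locale)


def write_locale_files(per_locale: Dict[str, Strings], out_dir: Path) -> None:
    """Write one <locale>.json file per language (assumed naming scheme)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for locale, strings in per_locale.items():
        # ensure_ascii=False keeps non-Latin translations readable in the file.
        payload = json.dumps(strings, ensure_ascii=False, indent=2, sort_keys=True)
        (out_dir / f"{locale}.json").write_text(payload + "\n", encoding="utf-8")
```

One file per language is the layout a localization platform can typically treat as a source file plus its translations, which is the reason for the split discussed here.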
I'm okay with splitting the files into multiple files as I did for the GitHub Emoji Picker. We can try it out for both the card translations and READMEs. 🔥 I, however, will leave the ultimate decision to @anuraghazra, so let's wait for his thoughts on the change. |
Crowdin seems good! Yeah, I think storing locale files as JSON will be the standard way to go. |
@andrii-bodnar, does Crowdin also offer a way to automatically translate the README into other languages using third-party translators like the Google Translate API while keeping code blocks and HTML from being translated? 🤔 |
@rickstaa Sure, the best option here is an automated workflow in Crowdin Enterprise. There is an MT Pre-translation step that can be configured to use an MT engine, so new strings will be translated automatically. In addition, it's possible to manually translate or correct strings. Crowdin Workflows are very flexible. A similar flow is also possible on crowdin.com with Custom Workflows. It's simpler than Crowdin Enterprise Workflows, but it also has an automatic MT Pre-Translation feature. |
@andrii-bodnar Amazing to hear that Crowdin Enterprise provides this possibility. Maybe we can arrange a partnership between your company and GRS if you and @anuraghazra are open to that. Such a partnership can benefit both parties since it will give more exposure to your service and make the GRS repository easier to maintain. 🚀 I don't think the load on your systems would be extreme since we update the README.md or card translations maybe once every two months. 🤔 |
@rickstaa Crowdin is free for Open-Source projects 🙂 It's very easy to apply for the Open-Source plan. First, the project owner needs to create a Crowdin or Crowdin Enterprise account and then submit an Open-source project setup request form. Of course, we would be extremely happy if you added a badge to your project Readme 🙂 (but it's up to you) |
@rickstaa The only thing I'm worried about is the upload of the existing translations to Crowdin. The point here is that Crowdin uses ML technology to upload translations of HTML-based files, and it sometimes still requires some manual work. For more details, see this article. As far as I can see, the Readme is already translated into a bunch of languages. By the way, how is it going with the JS translation extraction into separate JSON files? |
@andrii-bodnar, unfortunately, I haven't had the time to perform the JS translation extraction. I just discussed this with @anuraghazra, and if you are willing to implement the automatic README translations for us, that would be amazing! We are more than willing to put a Crowdin badge somewhere on the readme. 👍🏻 As explained above, this might be a very beneficial (symbiotic) partnership. 🚀 If you think these automatic |
@andrii-bodnar Feel free to enter my discord server, which can be found on my GitHub README if you want an easier way to discuss 👍. |
@rickstaa @anuraghazra I'll try to prepare a demo Crowdin project and GH Actions Workflow for you 🙂 |
Amazing, thanks! I'm looking forward to seeing your solution. 🚀 |
Just prepared a Demo Crowdin project and created a PR with integration - #2489 Please check it out 🙂 |
Related to #3364. |
Is your feature request related to a problem? Please describe.
Keeping the documentation up to date and managing the PRs would be more manageable if we switched from manual to automatic README translations (see https://github.com/dephraiim/translate-readme). The downside is that there might be some errors, but this shouldn't matter for understanding how to use GRS. Google has become quite good at translating languages in the last few years. The upside is that we no longer need to look at translation PRs, we can support more languages, and the translations stay up to date. We can add flags to the readme for people to choose their language.