As the pandemic draws cities into lockdown and hospitals into turmoil, we have been heartbroken watching social media posts from frontline health workers expressing their exhaustion at work, their lack of contact with their families, and their desperate need to reunite with their children. These melancholic posts drove us to look for ways to reconnect frontline parents with their kids.
To shorten the distance between children and their heroic parents serving as frontline healthcare workers, we designed and implemented a simple web application, Angel's Tale, which generates videos of a parent telling bedtime stories to their kids from nothing more than a short voice recording and a selfie. A parent submits a ten-second voice sample and a selfie photo through our chat-based input interface, and our AI-driven application automatically generates five story videos, one per story, narrated in the parent's voice. Children can then choose and watch the videos once they are ready. We hope our application saves these parents precious time on the front line while letting them stay connected with their beloved children during this difficult period.
For our backend, we used pre-trained AI models for audio and video processing. For the video, we used the model defined and pre-trained in First Order Motion Model for Image Animation: given a driving video in which one of our team members reads the bedtime story, together with the parent's selfie, it generates a video of the parent telling the story by transferring the motion from the driving video. For the audio, we used a pre-trained Real-Time Voice Cloning model, which takes the parent's ten-second voice sample and the text of the story and generates audio of the parent reading that text. Finally, we combine the video and audio produced by the two models into the resulting video of the parent telling the story.
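As a rough illustration, the sketch below shows only that final combining step, assuming moviepy and assuming the two models have already written their outputs to `animated.mp4` and `cloned_voice.wav` (both filenames are hypothetical placeholders):

```python
from moviepy.editor import VideoFileClip, AudioFileClip

# "animated.mp4": silent face-animation clip produced by the First Order Motion
# Model from the parent's selfie and our driving video.
# "cloned_voice.wav": narration produced by Real-Time Voice Cloning from the
# parent's ten-second voice sample and the story text.
video = VideoFileClip("animated.mp4")
narration = AudioFileClip("cloned_voice.wav")

# Trim both streams to the same length and attach the cloned narration.
duration = min(video.duration, narration.duration)
story = video.subclip(0, duration).set_audio(narration.subclip(0, duration))

# A gentle fade-out suits a bedtime story.
story = story.fadeout(0.5).audio_fadeout(0.5)
story.write_videofile("story.mp4", codec="libx264", audio_codec="aac")
```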
For our frontend, we used Voiceflow for the chat-based user interface, which collects user information through text or voice input; we built our file-upload interface, story-selection landing page, APIs, and user database with Anvil.works, which also hosts our frontend application.
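In simplified form (the table name, column names, and endpoint path below are illustrative rather than our exact schema), the Anvil server code looks roughly like this:

```python
import anvil.server
from anvil.tables import app_tables


@anvil.server.callable
def submit_parent_media(parent_name, selfie, voice_sample):
    # Called from the file-upload form; selfie and voice_sample arrive as
    # Anvil Media objects and are stored in a Data Table for the backend.
    app_tables.submissions.add_row(
        parent_name=parent_name,
        selfie=selfie,
        voice=voice_sample,
        status="pending",
    )


@anvil.server.http_endpoint("/stories/:parent_name")
def list_stories(parent_name, **params):
    # REST-style endpoint the story-choosing page uses to look up a parent's videos.
    rows = app_tables.submissions.search(parent_name=parent_name)
    return {"stories": [row["status"] for row in rows]}
```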
We ran into several challenges:
- Designing the application's workflow was problematic. The culprit is Voiceflow's inability to accept user-uploaded audio files, so we had to implement a second frontend with Anvil.works to collect the user input needed to generate the videos.
- It was difficult to tune the complexity of the user interface to the abilities of our audience (children).
- With three components in our workflow, building APIs in both our backend and Anvil.works, connecting the API calls, and handling the asynchronous video-generation step (see the polling sketch after this list) was complex and time-consuming.
- The AI models we used are pre-trained, which makes it challenging to handle input that differs greatly from the data the models were trained on. In particular, we observed that the audio model struggles with extremely long or short sentences, or with difficult vocabulary.
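For the asynchronous step mentioned above, a simple polling pattern works; the sketch below assumes a hypothetical `/status/<job_id>` endpoint on the Colab-hosted backend, so the URL and JSON fields are illustrative only:

```python
import time
import requests

BACKEND_URL = "https://our-colab-backend.example.com"  # placeholder URL


def wait_for_videos(job_id, poll_every=30, timeout=1800):
    """Poll the backend until the five story videos for this job are ready."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = requests.get(f"{BACKEND_URL}/status/{job_id}", timeout=10).json()
        if status.get("state") == "done":
            return status["video_urls"]  # links the landing page can embed
        time.sleep(poll_every)
    raise TimeoutError(f"Videos for job {job_id} were not ready in time")
```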
Despite these challenges, there are several accomplishments we are proud of:
- We successfully leveraged three platforms (Voiceflow, Anvil.works, and Google Colab) and integrated them into a single application without any previous experience with any of them.
- We manipulated audio and video files programmatically with a variety of operations (resizing, streaming, merging, synthesizing, trimming, fade effects, and so on).
- We leveraged the Alexa Settings API from the Alexa Skills Kit to identify the user's device and its time zone. Using that time zone, we designed a family-friendly feature that only plays stories for the children during a fixed window of the day (from 8AM to 9PM); a sketch of the check follows this list.
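A minimal sketch of the time-window check, assuming the device's IANA time-zone name has already been fetched from the Alexa Settings API:

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def stories_allowed(device_time_zone: str) -> bool:
    """True if the child's local time falls within the 8AM-9PM story window.

    device_time_zone is an IANA name such as "America/New_York", as returned
    by the Alexa Settings API for the user's device.
    """
    local_now = datetime.now(ZoneInfo(device_time_zone))
    return 8 <= local_now.hour < 21
```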
Looking ahead, there is plenty we plan to do next:
- Angel's Tale is a web application built with scalability in mind. The backend currently runs on Google Colab due to the time constraints of the hackathon; we look forward to packaging it into a Docker container and hosting that container on a cloud service such as AWS.
- We currently run inference with the pre-trained weights directly on user input. The models could be improved by training them on more stories with longer sentences and more difficult vocabulary.
- We also look forward to integrating the Voiceflow and Anvil.works components into a single frontend.
Our application is built with:
- Anvil.works
- Machine learning
- Python
- TensorFlow
- UI/UX
- Voiceflow
- HTTP request/response
- Compound sound