Muhammad Taimoor Haseeb*, Ahmad Hammoudeh*, Gus Xia
Abstract
This paper presents Herrmann-1, a multi-modal framework that generates background music tailored to movie scenes by integrating state-of-the-art vision, language, music, and speech processing models. Our pipeline first extracts visual and speech information from a movie scene, performs emotional analysis on it, and converts the results into descriptive text. GPT-4 then translates these high-level descriptions into low-level music conditions. Finally, these text-based music conditions guide a text-to-music model to generate music that resonates with the input movie scene. Comprehensive objective and subjective evaluations attest to the high synthesis quality, congruence, and superiority of our pipeline.
Scroll down to explore background music generated by Herrmann-1 for a set of movie scenes. For each scene, compare it with the original music and with music generated by the Controllable Music Transformer (CMT) of Di et al., 2021.
*The first two authors contributed equally.
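The data flow described in the abstract can be sketched as a chain of four stages. This is only an illustrative skeleton, not the authors' implementation: the `SceneDescription` type, the `generate_music` function, and all four stage callables are hypothetical names introduced here, and the actual models behind each stage are passed in rather than assumed.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SceneDescription:
    """Text extracted from one movie scene (hypothetical container)."""
    captions: List[str]                      # one caption per sampled frame
    transcriptions: List[Tuple[str, str]]    # (utterance, speaker id)
    sentiments: List[Tuple[str, float]]      # (label, share in percent)


def generate_music(
    video_path: str,
    analyze: Callable[[str], SceneDescription],   # vision + speech + emotion models
    to_prompt: Callable[[SceneDescription], str],  # descriptive text for the LLM
    llm: Callable[[str], str],                     # e.g. GPT-4, returns a music description
    text_to_music: Callable[[str], bytes],         # text-to-music model, returns audio
) -> bytes:
    """Run the four stages in order; every stage is injected by the caller."""
    scene = analyze(video_path)
    prompt = to_prompt(scene)
    music_condition = llm(prompt)
    return text_to_music(music_condition)
```

Passing the stages as callables keeps the sketch independent of any particular captioning, speech, or music-generation library.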
Given the following image captions from a video: 1) the sun is setting behind a tree in the dark 2) the sun is setting over the horizon in the distance 3) the sun is setting over the horizon in the distance 4) the lion king wallpapers the lion king wallpapers the lion king wallpapers the lion king wallpapers the lion king wallpapers the lion king wallpapers 5) the lion king, lion king, lion king 2, lion king 2 wallpaper, lion king wallpaper, lion king wallpaper hd wallpaper 6) a rhinoceros is standing in the grass with an orange sky in the background 7) the lion king, lion king, lion king 2, lion king 2 wallpaper, lion king 2 wallpaper, lion king 2 wallpaper hd wallpaper 8) disney's the lion king the lion, the witch and the Wardrobe 9) the lion king the lion king the lion king the lion king the lion king the lion king the lion king the lion king the lion king the lion king 10) a group of meerkats standing on top of a hill 11) the lion king the lion king the lion king the lion king the lion king the lion king the lion king the lion king the lion king the lion king 12) a lion stands on top of a hill with an orange sky in the background 13) a painting of two storks standing in the water 14) an animated image of a large bird standing on the water 15) an animated image of a bird sitting on a branch 16) a painting of two birds flying over a body of water 17) an animated image of a large bird flying over a body of water 18) a waterfall with birds flying over it at sunset 19) a black background with an airplane flying in the air 20) a black background with an airplane flying in the air and the following transcriptions: 1) Naaan simpwenyaa mabaghi ti baba (SPEAKER_00) 2) Si ji ko mou (SPEAKER_00) 3) Venyaa mabogh (SPEAKER_00) 4) Venyaa mabogh (SPEAKER_00) 5) Naaan simpwenyaa mabaghi ti baba (SPEAKER_02) 6) Si ji ko mou (SPEAKER_01) 7) Venyaa mabogh (SPEAKER_01) 8) Hai baba (SPEAKER_01) 9) Venyaa mabogh (SPEAKER_01) 10) Si yo mou baba (SPEAKER_03) 11) Venyaa mabogh 
(SPEAKER_03) 12) Venyaa mabogh (SPEAKER_03) and given that the sentiments of the video are: calm 46.67% sad 33.33% inspiring 20.00% Describe the music that would fit such a video. Your output will be fed to a text to music model. To help you out, here are some prompts that worked well with the model: 1) Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach 2) classic reggae track with an electronic guitar solo 3) earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves 4) lofi slow bpm electro chill with organic samples 5) violins and synths that inspire awe at the finiteness of life and the universe 6) 80s electronic track with melodic synthesizers, catchy beat and groovy bass Give me only the description of the music without any explanation. Give me a single description.
Majestic African-inspired instrumentation with gentle percussions, soothing flutes, and melodic chants evoking the serenity of vast landscapes, intertwined with melancholic strings capturing moments of introspection, culminating in uplifting crescendos that inspire hope and awe.
Original
Herrmann-1
CMT
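The GPT-4 prompts on this page follow a fixed template: numbered image captions, optional numbered transcriptions with speaker tags, the sentiment shares, six example prompts, and two closing instructions. A minimal sketch of how such a prompt could be assembled is given below. The function and variable names are assumptions made for illustration; only the template wording and the six example prompts are taken from the prompts shown here.

```python
from typing import Iterable, Sequence, Tuple

# The six example prompts that appear in every prompt on this page.
EXAMPLE_PROMPTS = (
    "Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach",
    "classic reggae track with an electronic guitar solo",
    "earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves",
    "lofi slow bpm electro chill with organic samples",
    "violins and synths that inspire awe at the finiteness of life and the universe",
    "80s electronic track with melodic synthesizers, catchy beat and groovy bass",
)


def numbered(items: Iterable[str]) -> str:
    """Render items as '1) first 2) second ...' on a single line."""
    return " ".join(f"{i}) {item}" for i, item in enumerate(items, start=1))


def build_prompt(
    captions: Sequence[str],
    transcriptions: Sequence[Tuple[str, str]],  # (utterance, speaker id)
    sentiments: Sequence[Tuple[str, float]],    # (label, share in percent)
) -> str:
    """Assemble the LLM prompt; the transcription clause is skipped if empty."""
    parts = [f"Given the following image captions from a video: {numbered(captions)}"]
    if transcriptions:
        lines = [f"{text} ({speaker})" for text, speaker in transcriptions]
        parts.append(f"and the following transcriptions: {numbered(lines)}")
    shares = " ".join(f"{label} {share:.2f}%" for label, share in sentiments)
    parts.append(f"and given that the sentiments of the video are: {shares}")
    parts.append(
        "Describe the music that would fit such a video. "
        "Your output will be fed to a text to music model."
    )
    parts.append(
        "To help you out, here are some prompts that worked well with the model: "
        + numbered(EXAMPLE_PROMPTS)
    )
    parts.append(
        "Give me only the description of the music without any explanation. "
        "Give me a single description."
    )
    return " ".join(parts)
```

Scenes without dialogue simply pass an empty transcription list, which matches the prompts on this page that go straight from the captions to the sentiments.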
Given the following image captions from a video: 1) a man and woman standing on the deck of a boat at sunset 2) a man and woman standing on the deck of a boat at sunset 3) a woman with red hair standing in front of a ship 4) a woman with red hair standing in front of a ship 5) leonardo dicaprio in titanic 6) leonardo dicaprio in 'the great gatsby' 7) a woman with red hair standing in front of a ship 8) a woman with red hair standing in front of a ship 9) leonardo dicaprio in 'titanic' 10) leonardo dicaprio in the movie titanic 11) a woman with red hair standing in front of a ship 12) a woman with red hair standing in front of a ship 13) a man and woman standing on the deck of a boat at sunset 14) a man and woman standing on the deck of a boat at sunset 15) a close up of a woman with red hair 16) a woman with red hair looking at a man 17) a man and a woman looking at each other in a scene from the movie titanic 18) leonardo dicaprio and mia farrow in titanic 19) a woman looking into the eyes of a man 20) a man and a woman are looking at each other 21) a man and a woman standing next to each other 22) leonardo dicaprio and meryl streep in titanic and the following transcriptions: 1) I said you might be after me. (SPEAKER_01) 2) Give me your hand. (SPEAKER_00) 3) Now close your eyes. (SPEAKER_00) 4) Go on. (SPEAKER_00) 5) Step up. (SPEAKER_00) 6) Now hold on to the railing. (SPEAKER_00) 7) Keep your eyes closed. (SPEAKER_00) and given that the sentiments of the video are: romantic 100.00% Describe the music that would fit such a video. Your output will be fed to a text to music model. 
To help you out, here are some prompts that worked well with the model: 1) Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach 2) classic reggae track with an electronic guitar solo 3) earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves 4) lofi slow bpm electro chill with organic samples 5) violins and synths that inspire awe at the finiteness of life and the universe 6) 80s electronic track with melodic synthesizers, catchy beat and groovy bass Give me only the description of the music without any explanation. Give me a single description.
Ethereal orchestral piece with soft piano undertones, interlaced with melancholic violin solos, reminiscent of historical romance. Very gentle harp plucks accentuate moments of intimacy, crescendos mirroring the ebb and flow of the sea. Haunting flute melodies interspersed, evoking the vastness of the ocean and the fragility of love. The backdrop is filled with soft, sweeping strings, building up to a powerful climax, capturing the essence of timeless romance and the breathtaking moments of connection between two souls. As the story unfolds, subtle choral harmonies emerge, giving depth and warmth, encapsulating the nostalgia and longing found in epic tales of love.
Original
Herrmann-1
CMT
Given the following image captions from a video: 1) the sun is setting over the city of san francisco 2) a woman in lingerie standing in front of a crowd 3)a video of people dancing in a club 4) a group of people sitting at a table at night 5) a group of people standing around a table at night 6) a group of people sitting around a table at night 7) a young man sitting at a table in a dimly lit room 8) a young man sitting at a table in a dimly lit room 9) a group of people sitting around a table at night 10) a group of people sitting around a table at a bar 11) a young man sitting at a table in a dimly lit room 12) a young man is looking at the camera with his mouth open. and the following transcriptions: 1) I was crashing there for a little bit while taking care of some things but she's done (SPEAKER_00) 2) for the summer so she's back in her parents place. (SPEAKER_00) 3) The homeless rock star Palazzo. (SPEAKER_00) 4) Alright. (SPEAKER_00) 5) What's your plan for the summer? (SPEAKER_00) 6) Mark. (SPEAKER_00) and given that the sentiments of the video are: excited 50.00% energetic 35.71% sensual 14.29% Describe the music that would fit such a video. Your output will be fed to a text to music model. To help you out, here are some prompts that worked well with the model: 1) Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach 2) classic reggae track with an electronic guitar solo 3) earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves 4) lofi slow bpm electro chill with organic samples 5) violins and synths that inspire awe at the finiteness of life and the universe 6) 80s electronic track with melodic synthesizers, catchy beat and groovy bass Give me only the description of the music without any explanation. Give me a single description.
Upbeat electronic track with deep bass grooves, pulsating synthesizers, rhythmic drum patterns, perfect for high-energy nightlife vibes.
Original
Herrmann-1
CMT
Given the following image captions from a video: 1) a man is standing in front of a building in the fog 2) a man standing in front of a building on a foggy day 3) a man standing in front of a bridge on a foggy day 4) an empty road surrounded by trees on a foggy day 5) an empty road in the middle of a forest on a foggy day 6) a car driving down a country road on a foggy day 7) a spider web is seen through the bars of a cage 8) a spider web is seen through the bars of a cage 9) a spider web is seen through a chain link fence 10) a person walking down a dirt road in the middle of a forest 11) a person walking down a path in the middle of a forest 12) a man is walking down a dirt road in the middle of a forest 13) a person standing in the middle of a forest 14) a man standing in the middle of a forest 15) a person walking through the woods on a foggy day 16) trees in the forest in the rain 17) a view of the trees in the forest 18) looking up into the canopy of trees in the forest 19) an image of a small town on a foggy day 20) a man is walking down a road in the fog 21) a person walking down a road in the fog and given that the sentiments of the video are: mysterious 90.48% calm 9.52% Describe the music that would fit such a video. Your output will be fed to a text to music model. To help you out, here are some prompts that worked well with the model: 1) Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach 2) classic reggae track with an electronic guitar solo 3) earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves 4) lofi slow bpm electro chill with organic samples 5) violins and synths that inspire awe at the finiteness of life and the universe 6) 80s electronic track with melodic synthesizers, catchy beat and groovy bass Give me only the description of the music without any explanation. Give me a single description.
Ambient and atmospheric soundscapes with soft piano melodies, echoing strings, and subtle ethereal synths, evoking a sense of mystery, solitude, and the unknown within a serene natural setting.
Original
Herrmann-1
CMT
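The sentiment shares in the prompts (for example "mysterious 90.48% calm 9.52%") look like normalized counts of per-segment emotion labels, since 19 of 21 is 90.48% and 2 of 21 is 9.52%. The page does not state how the shares are computed, so the sketch below is an assumption: it takes one predicted label per analyzed segment and reports each label's share, most frequent first. The function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def sentiment_shares(labels: Sequence[str]) -> List[Tuple[str, float]]:
    """Return (label, percentage) pairs, ordered from most to least frequent."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [(label, 100.0 * n / total) for label, n in counts.most_common()]


def format_shares(shares: Sequence[Tuple[str, float]]) -> str:
    """Format shares the way they appear in the prompts, two decimals each."""
    return " ".join(f"{label} {share:.2f}%" for label, share in shares)


# Hypothetical per-segment predictions for a 21-segment clip.
labels = ["mysterious"] * 19 + ["calm"] * 2
print(format_shares(sentiment_shares(labels)))
```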
Given the following image captions from a video: 1) a scene from a movie with two people walking near a castle wall 2) two women in spartan armor on a dirt road 3) two men are fighting in front of a building 4) an image of a woman with a spear in the dirt 5) a man and woman in roman clothes are looking at each other 6) two men in armor standing next to each other 7) two men are fighting with swords in the air 8) a scene from the movie romeo and juliet 9) two men in roman armor fighting in front of a castle 10) a man in roman armor running with a large shield 11) two men holding spears and shields in a field 12) two men in spartan armor fighting in front of a castle 13) two men in armor fighting in front of a castle 14) an old man is standing in front of a group of men 15) two people fighting in the middle of a field 16) two men in armor fighting in front of a castle 17) an image of a man with a sword in his hand 18) two men in spartan armor fighting each other 19) two men in roman armor fighting in front of a castle 20) two men dressed in roman armor are fighting in front of a wall 21) a man holding a sword in front of a castle 22) an image of a man holding a sword in a field 23) two men in spartan armor fighting in front of a wall 24) an image of a man in armor holding a sword 25) an image of a man in armor holding a sword and the following transcriptions: 1) I (SPEAKER_00) 2) You (SPEAKER_00) and given that the sentiments of the video are: action 90.00% sad 5.00% afraid 5.00% Describe the music that would fit such a video. Your output will be fed to a text to music model. 
To help you out, here are some prompts that worked well with the model: 1) Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach 2) classic reggae track with an electronic guitar solo 3) earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves 4) lofi slow bpm electro chill with organic samples 5) violins and synths that inspire awe at the finiteness of life and the universe 6) 80s electronic track with melodic synthesizers, catchy beat and groovy bass Give me only the description of the music without any explanation. Give me a single description.
Epic orchestral score with thundering drums, intense string sections, and a hint of melancholic flute, embodying the valor and tension of ancient battles and the tragic undertones of warrior confrontations, interspersed with moments of dramatic silence accentuating the weight of the conflict.
Original
Herrmann-1
CMT
Given the following image captions from a video: 1) a black background with an airplane flying in the air 2) a black background with an airplane flying in the air 3) a black background with an airplane flying in the air 4) two men in suits standing next to each other 5) two men in suits sitting next to each other 6) a man in a suit and tie talking to another man in a suit 7) two men in suits standing at a counter 8) two men in suits standing at a counter 9) two men in suits sitting at a counter 10) a man with a mustache in a purple suit and bow tie 11) a man in a purple suit and bow tie 12) a man in a purple suit with a moustache and bow tie 13) a man in a purple suit and bow tie is looking away from the camera 14) a blurry image of an orange door with a sign on it 15) a man with glasses standing in front of a mirror 16) a man with glasses and a mustache is standing in front of a door 17) a man with a mustache and glasses is smoking a pipe 18) a man in a suit and tie smoking a pipe 19) a man sitting on a couch in an orange room 20) a man sitting on a couch in an orange room 21) a man sitting on a chair in an orange room 22) a man standing at the front desk of a hotel 23) a man standing at the top of a set of stairs 24) a man in a suit standing at the top of a set of stairs 25) the front page of a newspaper with the words immigrant claims fortune 26) the front page of a newspaper with the words immigrant claims fortune 27) the front page of a newspaper with the words immigrant claims fortune and the following transcriptions: 1) Who's this interesting old fellow? (SPEAKER_02) 2) I inquired of Monsieur Jean. (SPEAKER_02) 3) To my surprise, he was distinctly taken aback. (SPEAKER_02) 4) Don't you know? He asked. (SPEAKER_01) 5) Don't you recognize him? (SPEAKER_01) 6) He did look familiar. (SPEAKER_01) 7) That's Mr. Mustafa himself. He arrived early this morning. (SPEAKER_00) 8) This name will no doubt be familiar to the more seasoned persons among you. 
(SPEAKER_02) 9) Mr. Zero Mustafa was, at one time, the richest man in Zubrovka. (SPEAKER_02) and given that the sentiments of the video are: surprised 50.00% action 36.36% neutral 13.64% Describe the music that would fit such a video. Your output will be fed to a text to music model. To help you out, here are some prompts that worked well with the model: 1) Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach 2) classic reggae track with an electronic guitar solo 3) earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves 4) lofi slow bpm electro chill with organic samples 5) violins and synths that inspire awe at the finiteness of life and the universe 6) 80s electronic track with melodic synthesizers, catchy beat and groovy bass Give me only the description of the music without any explanation. Give me a single description.
Orchestral suspense track with rich strings, dramatic brass sections, and subtle jazz undertones, capturing the essence of intrigue and vintage allure.
Original
Herrmann-1
CMT
Given the following image captions from a video: 1) a pink flag on top of a pole in front of buildings 2) a pink flag is flying in the air in front of buildings 3) a pink flag on top of a pole with buildings in the background 4) the top of a building with a clock on it 5) a building with windows lit up at night 6) a taxi is driving down a city street 7) three young people sitting at a table in front of candles 8) three young people sitting at a table in front of candles 9) three young people sitting at a table in front of candles 10) a man in a suit sitting at a table with red chairs 11) a man in a suit sitting at a table with red chairs 12) a man in a suit sitting at a table with red chairs 13) three young people sitting at a table in a room 14) three young people sitting at a table in front of candles 15) three young people sitting at a table in front of candles 16) a man in a suit sitting at a table with red chairs 17) a man in a suit sitting at a table with red chairs 18) a man in a suit sitting at a table with red chairs 19) three children sitting at a table in front of candles 20) three young people sitting at a table in front of candles 21) three young people sitting at a table in front of candles and the following transcriptions: 1) Lieutenant Baum bought the house on Archer Avenue in the winter of his 35th year. (SPEAKER_00) 2) Over the next decade, he and his wife had three children, and then they separated. (SPEAKER_00) 3) Are you getting divorced? (SPEAKER_04) 4) At the moment, no. (SPEAKER_01) 5) But it doesn't look good. (SPEAKER_02) 6) Do you still love us? (SPEAKER_03) 7) Of course I do. (SPEAKER_02) and given that the sentiments of the video are: neutral 42.86% angry 28.57% romantic 28.57% Describe the music that would fit such a video. Your output will be fed to a text to music model. 
To help you out, here are some prompts that worked well with the model: 1) Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach 2) classic reggae track with an electronic guitar solo 3) earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves 4) lofi slow bpm electro chill with organic samples 5) violins and synths that inspire awe at the finiteness of life and the universe 6) 80s electronic track with melodic synthesizers, catchy beat and groovy bass Give me only the description of the music without any explanation. Give me a single description.
Melancholic piano melodies interspersed with soft string sections capturing moments of nostalgia and romance, juxtaposed with somber, heavier orchestral passages reflecting underlying tensions and conflicts, all flowing in a cinematic, evocative soundscape.
Original
Herrmann-1
CMT
Given the following image captions from a video: 1) a yellow taxi cab is parked in front of a building 2) a yellow taxi cab driving down the street 3) a man in a suit and tie sitting in the back seat of a car 4) a man in a suit sitting in the back seat of a car 5) a yellow taxi cab is driving down the road 6) a man in a suit getting out of a taxi cab 7) a man sitting in the back of a yellow taxi cab 8) a man in a suit and tie riding a yellow taxi 9) a man in a suit and tie is running down the street 10) a woman is sitting in the driver's seat of a car 11) there is a close up of a person in a car 12) a man with a mustache is walking down the street 13) a man in a suit is walking down the street in front of a yellow taxi 14) a man in a suit and tie running down the street 15) a man is standing next to a taxi cab 16) a man is standing next to a yellow taxi cab 17) a man is hugging another man in front of a yellow taxi 18) two men are leaning on the hood of a taxi cab 19) two men are fighting in front of a yellow taxi 20) a man in a suit is getting out of a yellow taxi 21) a man is getting out of a yellow car 22) a man is being pushed out of a car by another man 23) a man in a car is being pushed by another man 24) a taxi cab with a man sitting in the driver's seat 25) a man standing next to a yellow taxi cab 26) a man in a suit and tie is running down the street 27) a man in a suit and tie is walking down the street 28) a man in a suit and tie running down the street 29) a man with a mustache is walking in front of a car 30) a man with a moustache is making a funny face 31) a man is walking down the street with a bag 32) an old car is driving down the street in front of a park 33) a man with a moustache standing in front of a building 34) a man in a black jacket is walking down the street 35) a group of people playing frisbee in a park 36) a man with a briefcase running through a park 37) a man and a woman walking down a path in a park 38) a man running across a 
grassy field with a suitcase and the following transcriptions: 1) Hey! (SPEAKER_00) 2) Hey! (SPEAKER_00) 3) Where they going? Come here! (SPEAKER_00) 4) Hey! No! (SPEAKER_00) 5) No, no, no, no! (SPEAKER_00) 6) Get my money! Get my money! (SPEAKER_00) 7) Please, please, please! (SPEAKER_00) 8) Please! (SPEAKER_00) 9) He should have paid you! He should have paid you! (SPEAKER_00) 10) I'm sorry! I'm so sorry! (SPEAKER_00) 11) I'm sorry! (SPEAKER_00) 12) Idiot! (SPEAKER_00) and given that the sentiments of the video are: action 94.44% sad 2.78% sympathetic 2.78% Describe the music that would fit such a video. Your output will be fed to a text to music model. To help you out, here are some prompts that worked well with the model: 1) Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach 2) classic reggae track with an electronic guitar solo 3) earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves 4) lofi slow bpm electro chill with organic samples 5) violins and synths that inspire awe at the finiteness of life and the universe 6) 80s electronic track with melodic synthesizers, catchy beat and groovy bass Give me only the description of the music without any explanation. Give me a single description.
Upbeat jazzy track with playful piano, whimsical woodwinds, and light percussion, creating a comedic and lighthearted atmosphere reminiscent of classic slapstick chase scenes.
Original
Herrmann-1
CMT
Given the following image captions from a video: 1) a black and white photo of a woman driving a car 2) a black and white photo of a woman driving a car 3) a black and white photo of a woman sitting in the driver's seat of a car 4) a black and white photo of a woman sitting in the driver's seat of a car 5) a black and white photo of a woman sitting in the driver's seat of a car 6) a black and white photo of a woman sitting in the driver's seat of a car 7) a black and white photo of a woman driving a car 8) a black and white photo of a woman driving a car 9) a black and white photo of a woman driving a car 10) a black and white photo of a woman driving a car 11) a black and white photo of a car driving down the road at night 12) a black and white photo of a woman in a car 13) a black and white photo of a woman in a car 14) a black and white photo of a woman in a car 15) a black and white photo of a woman in a car 16) a black and white photo of a woman in a car 17) a black and white photo of a woman in a car 18) a black and white photo of a woman in a car 19) a black and white photo of a woman in a car 20) a black and white photo of a woman in a car and the following transcriptions: 1) I said you could. I was the last I saw. (SPEAKER_01) 2) Wait a minute. I did see her some time later driving. (SPEAKER_01) 3) I think you'd better come over here to my office, quick. (SPEAKER_01) 4) Carolyn, get Mr. Cassidy for me. (SPEAKER_00) 5) After all, Cassidy, I told you, all that cash. (SPEAKER_01) 6) I'm not taking the responsibility. (SPEAKER_01) 7) Oh, for heaven's sake, girl works for you for 10 years, you trust her. (SPEAKER_01) 8) All right, yes, you'd better come over. (SPEAKER_01) and given that the sentiments of the video are: suspenseful 95.00% afraid 5.00% Describe the music that would fit such a video. Your output will be fed to a text to music model. 
To help you out, here are some prompts that worked well with the model: 1) Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach 2) classic reggae track with an electronic guitar solo 3) earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves 4) lofi slow bpm electro chill with organic samples 5) violins and synths that inspire awe at the finiteness of life and the universe 6) 80s electronic track with melodic synthesizers, catchy beat and groovy bass Give me only the description of the music without any explanation. Give me a single description.
Atmospheric noir-inspired track with haunting piano melodies, subtle eerie strings, and muted brass undertones, capturing a sense of tension, mystery, and impending danger, evoking classic suspense films of the black and white era.
Original
Herrmann-1
CMT
Given the following image captions from a video: 1) a man in a black shirt is standing in front of lights 2) a woman with long blonde hair standing in front of a door 3) a group of people dancing at a party 4) a group of people dancing at a party 5) a man in a black shirt talking to another man 6) a man in a black shirt talking to another man 7) a man and woman in a dark room looking at each other 8) a man and a woman are looking at each other 9) a man in a black shirt is talking to someone 10) a man in a black shirt talking to another man 11) a woman with her mouth open and a man looking at her 12) a woman is talking to a man in a dark room 13) pierce brosnan as james bond in goldeneye 007 14) a man in a black shirt smiling at the camera 15) a girl with long blonde hair is looking at a man in a dark room 16) a woman is talking to a man in a dark room 17) a man in a black shirt talking to another man at a party 18) a man in a black shirt is talking to someone 19) a man and a woman are looking at each other in a dark room 20) a woman is talking to a man in a dark room 21) a man in a black shirt talking to another man 22) a man in a black shirt talking to another man 23) a woman looking into the mirror with a man in front of her 24) a woman is talking to a man in a dark room 25) a man with his mouth open in front of a crowd 26) a man in black shirt talking to another man 27) amanda seyfried and amanda seyfried in the dark knight rises 28) a woman is talking to a man in a dark room 29) a man talking to another man in a crowded area 30) a group of people dancing at a party 31) a girl with long blonde hair in a dark room 32) a woman with long hair in a dark room 33) a group of people dancing in green shoes on a dance floor 34) a group of people dancing on a dance floor 35) a woman in a blue dress standing in front of a door 36) a woman in a blue dress standing in front of a doorway and the following transcriptions: 1) I know why I'm here. 
(SPEAKER_02) 2) Why didn't Donna tell me? (SPEAKER_02) 3) How long have you known I'm your father? (SPEAKER_02) 4) What? (SPEAKER_02) 5) Not long at all. (SPEAKER_02) 6) Sam, listen to me. (SPEAKER_02) 7) My mom doesn't know that I know. (SPEAKER_00) 8) So can we wait until after my wedding? (SPEAKER_00) 9) Who's giving you away tomorrow? (SPEAKER_00) 10) Nobody. (SPEAKER_00) 11) Wrong. I am. (SPEAKER_01) 12) Our secret till then. (SPEAKER_01) and given that the sentiments of the video are: surprised 66.67% angry 22.22% humorous 11.11% Describe the music that would fit such a video. Your output will be fed to a text to music model. To help you out, here are some prompts that worked well with the model: 1) Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach 2) classic reggae track with an electronic guitar solo 3) earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves 4) lofi slow bpm electro chill with organic samples 5) violins and synths that inspire awe at the finiteness of life and the universe 6) 80s electronic track with melodic synthesizers, catchy beat and groovy bass Give me only the description of the music without any explanation. Give me a single description.
Funky disco track with shimmering synthesizers, rhythmic bass grooves, playful orchestral strings, and light-hearted beats, punctuated by dramatic crescendos for moments of surprise, capturing the ambiance of a dance party while highlighting the humor and underlying emotions.
Original
Herrmann-1
CMT
Given the following image captions from a video: 1) a group of people standing next to an old train 2) a group of people in front of an old train 3) a man in a cowboy hat and suit standing in the desert 4) a man in a cowboy hat standing in front of a train 5) a group of people walking near a train 6) a group of people standing next to an old train 7) a man in a top hat is looking out a window 8) a man in a top hat is looking out the window of a train 9) a close up of the wheels of an old train 10) a man in a cowboy hat and black suit is standing in front of a building 11) a man in a cowboy hat riding on a white horse 12) a man in a cowboy hat riding a white horse in front of a courthouse 13) a man wearing a cowboy hat and a mask 14) a man wearing a cowboy hat and sunglasses in the desert 15) a man is standing on the side of a train 16) a woman is standing on the side of a train 17) harry potter and the deathly hallows part 2 - harry potter and the deathly hallows part 2 - harry potter 18) a man in a cowboy hat riding a white horse 19) a group of people standing around a train with smoke coming out of it 20) a group of people standing around a train with smoke coming out of it 21) a man wearing a cowboy hat and mask in the desert 22) a man wearing a cowboy hat and sunglasses 23) a blurry image of a man walking down the street 24) a model train car on the tracks with people in it 25) a group of people looking out the window of a train 26) a train traveling through the desert with smoke coming out of it 27) john wick movie trailer john wick movie trailer john wick movie trailer john wick movie trailer john wick movie trailer john wick movie trailer 28) the train is traveling down the tracks in the desert and given that the sentiments of the video are: action 45.00% mysterious 30.00% afraid 25.00% Describe the music that would fit such a video. Your output will be fed to a text to music model. 
To help you out, here are some prompts that worked well with the model: 1) Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach 2) classic reggae track with an electronic guitar solo 3) earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves 4) lofi slow bpm electro chill with organic samples 5) violins and synths that inspire awe at the finiteness of life and the universe 6) 80s electronic track with melodic synthesizers, catchy beat and groovy bass Give me only the description of the music without any explanation. Give me a single description.
Gritty western-inspired score with pulsating electric guitar riffs, intense rhythmic percussion, and atmospheric harmonica, blending into suspenseful string sections, evoking a sense of mystery and danger with an undertone of raw action in a desolate landscape.
Original
Herrmann-1
CMT