AI-Powered Filmmaking: A Case Study on Creating a 100% AI-Generated Video for Merdeka
Two developers. Zero filmmaking experience. One bold idea: create a 100% AI-generated film for Malaysia’s 68th Independence Day. From prompting-as-directing to uncovering real business potential, this case study reveals how generative AI is changing the rules of creative storytelling.
Overview
A team of two Unity developers with no traditional film production background embarked on an ambitious experiment: to create a 100% AI-generated video celebrating Malaysia’s 68th Independence Day and Hari Malaysia. The project's goal was not only to produce a visually compelling and emotionally resonant film but also to explore the commercial viability of end-to-end AI-driven video production. The final product exceeded expectations, demonstrating that generative AI is no longer a novelty but a powerful, practical tool for creative and commercial projects.
The Final Result:
After all the trial and error, this is what our 100% AI-generated video looks like.
The Production Process: Prompting as Filmmaking
Without a traditional crew, cameras, or studio, the entire production was re-envisioned as a digital workflow.
- Story & Script: The project began with an AI generating a poetic script centered on the themes of national pride and a future shaped by technology and unity.
- Audio & Narration: An AI was used to generate 16 distinct voiceover lines. A key decision was to use an authentic Malaysian accent to ensure the voices reflected the nation’s multicultural identity and grounded the video in cultural authenticity.
- Virtual Cinematography: This was the most challenging and iterative part of the process. The team used tools like Google Vids and Veo 3 to generate a unique visual scene for every line of dialogue. Rather than simply asking for a scene, the team acted as "virtual directors," writing detailed prompts that mimicked a shot list.
- Assembly: The final video was created by stitching together the successful visual clips with their corresponding audio lines and adding a musical score.
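One way to keep track of such a workflow is a small manifest that pairs each script line with its voiceover and its accepted clip, so it is clear what is still missing before assembly. The sketch below is illustrative only: the `ScriptLine` class, the `missing_assets` helper, the line numbering, and the file names are all hypothetical and not part of the team's actual tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScriptLine:
    """One line of the script plus the assets generated for it."""
    index: int
    text: str
    audio_path: Optional[str] = None  # voiceover file, once generated
    clip_path: Optional[str] = None   # accepted visual clip, once chosen


def missing_assets(lines: List[ScriptLine]) -> List[str]:
    """Return notes for every line that is not yet ready to assemble."""
    notes = []
    for line in lines:
        if line.audio_path is None:
            notes.append(f"line {line.index}: no voiceover")
        if line.clip_path is None:
            notes.append(f"line {line.index}: no accepted clip")
    return notes


# Hypothetical example data: indices and file names are placeholders.
manifest = [
    ScriptLine(1, "And with unity, we build.", "vo_01.wav", "clip_01.mp4"),
    ScriptLine(2, "From tradition to innovation", "vo_02.wav", None),
]
print(missing_assets(manifest))
```

Assembly would only start once `missing_assets` returns an empty list, meaning every line has both a voiceover and a usable clip.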
Key Challenges and Lessons Learned
Through trial and error, the team encountered and overcame several significant challenges, revealing crucial insights into the current state of AI video generation.
Lesson 1: You’re a "Virtual Director," Not a "Prompter"
The team learned that achieving cinematic quality required a highly specific, directive approach. Prompts had to be more than simple requests; they were comprehensive shot lists that specified:
- Camera Movement: "a slow, sweeping dolly shot," "dynamic tracking shot that flies alongside," "The camera seamlessly pushes in..."
- Lens & Lighting: "shallow depth of field, making the background blur," "warm, golden," "slight, epic lens flare."
- Casting & Wardrobe: Detailed descriptions like "a Malay girl in a hijab wearing a white Malaysian primary school baju kurung."
- The "Action": Explicit instructions for what the characters were doing, such as "huddled together around an open textbook... pointing and sharing ideas."
This level of detail was essential for moving the output from a generic video clip to a custom, story-driven shot.
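The shot-list idea can be sketched as a small data structure that refuses to produce a prompt until every element is filled in. This is a minimal illustration, not the team's method: the `Shot` class, its field names, and the labeled output format are assumptions (the team's real prompts were written as flowing prose), and the example values simply reuse fragments quoted in the list above.

```python
from dataclasses import dataclass, fields


@dataclass
class Shot:
    camera: str      # camera movement
    lighting: str    # lens and lighting
    subject: str     # casting and wardrobe
    action: str      # what the characters are doing
    duration_s: int = 8

    def to_prompt(self) -> str:
        """Join the shot-list elements into one prompt string."""
        # Reject blank text fields so no element is silently left out.
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(value, str) and not value.strip():
                raise ValueError(f"shot element '{f.name}' is empty")
        return " ".join([
            f"Camera: {self.camera}.",
            f"Lens and lighting: {self.lighting}.",
            f"Subject: {self.subject}.",
            f"Action: {self.action}.",
            f"Duration: {self.duration_s} seconds.",
        ])


shot = Shot(
    camera="a slow, sweeping dolly shot",
    lighting="shallow depth of field, making the background blur",
    subject="a Malay girl in a hijab wearing a white Malaysian "
            "primary school baju kurung",
    action="pointing and sharing ideas around an open textbook",
)
prompt = shot.to_prompt()
print(prompt)
```

Treating each element as a required field makes an underspecified prompt fail early, before a generation attempt is spent on it.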

Case study: The generated KLCC is not accurate. The main structure of the building is almost the same as the real one, but there is one major discrepancy: the bridge is missing from view.
Lesson 2: The "Speaking Human" Is Still the Uncanny Valley
Creating realistic, speaking characters was the experiment's core challenge. While the AI’s lip-sync was surprisingly effective in many instances, the AI often struggled to keep a character consistent or to deliver a line with the correct emotion.
The team’s workaround was to embrace a more flexible, filmmaker-like approach. Instead of forcing every scene to feature a speaking character, they strategically used different storytelling methods:
- Characters speaking directly to the "camera."
- "Voice of God" narration, like a teacher's monologue.
- Action-focused shots where a character's actions, rather than their words, drove the scene.
This mixed approach not only made the final video work but also highlighted the need for creative problem-solving when using current AI technology.

Case study: This scene was generated from the following prompt:
Prompt: An 8-second, cinematic and deeply heartfelt video, filled with warm, soft afternoon light. The scene opens on a close-up of the weathered, gentle hands of an elderly Malaysian grandfather as they hold a tablet. On the screen, he is on a video call with a smiling relative who is overseas. The camera performs a slow, gentle pull-back, revealing the grandfather sitting comfortably on a sofa in a cozy living room. He is surrounded by his family. His young, joyful grandchildren are cuddled up next to him, laughing and pointing at the person on the tablet screen. The grandfather looks away from the tablet and shares a warm, deeply content smile with his grandchild beside him, reaching out to gently pat their head. The final shot is a wider view of this beautiful, multi-generational family moment, capturing the profound joy and connection in the room. The scene is silent, filled only with the warmth of the family's presence. 4K, with a soft, glowing cinematic style
In the generated clip, however, the tablet had gone missing, and the grandfather, who was supposed to talk, ended up not speaking.
Lesson 3: The AI Is Bad at Metaphors, So You Have to Be Literal
The team discovered that AI models struggle with abstract, poetic, or figurative language. While the script could be lyrical, the visual prompts for those concepts had to be brutally literal.
The team could not simply use the phrase "From tradition to innovation" as a prompt. They had to describe the physics of the scene in painstaking detail:
"The scene opens on an extreme close-up of a traditional leather Wayang Kulit puppet... The camera performs a slow, seamless tilt downwards... the shadow is... cast onto the display of an iPad... the warm, organic shadow dissolves and transforms into a vibrant, glowing teal holographic animation..."

This was a major "Aha!" moment: to get a metaphorical output, you must provide a literal input.
Lesson 4: The Best Prompts Still Require "Try, Try Again"
One of the most valuable lessons was that AI video generation is a highly iterative process. Even with a well-crafted, highly specific prompt, the AI might produce an unusable or inaccurate result. The team learned to accept that multiple retries were a standard part of the workflow.
This was especially true for complex scenes or those that required precise character actions. For example, a prompt containing an instruction such as "as the beam swings past the camera" might require dozens of attempts, with each generation showing a slightly different, and often incorrect, action. The process was less about writing a single perfect prompt and more about persistence: refining the prompt, rerunning the generation, and selecting the best output from a large pool of attempts. This is a fundamental aspect of working with generative AI: a successful result is often the product of sheer iteration.
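The retry workflow described above can be sketched as a loop with a fixed attempt budget. Everything here is an assumption for illustration: `generate_until_accepted`, `fake_generate`, and `fake_review` are hypothetical names, the generator is a stand-in rather than a real video API, and in practice the "accept" step was a person reviewing each clip rather than a function.

```python
from typing import Callable, List, Optional, Tuple


def generate_until_accepted(
    prompt: str,
    generate: Callable[[str], str],
    accept: Callable[[str], bool],
    max_attempts: int = 10,
) -> Tuple[Optional[str], List[str]]:
    """Rerun generation until a take is accepted or the budget runs out.

    Returns the accepted take (or None) plus every take produced, so the
    rejected attempts stay available for comparison.
    """
    takes: List[str] = []
    for _ in range(max_attempts):
        take = generate(prompt)
        takes.append(take)
        if accept(take):
            return take, takes
    return None, takes


# Stand-in generator and reviewer, for illustration only.
_counter = {"n": 0}


def fake_generate(prompt: str) -> str:
    _counter["n"] += 1
    return f"take-{_counter['n']}"


def fake_review(take: str) -> bool:
    return take == "take-3"  # pretend the third take got the action right


chosen, all_takes = generate_until_accepted(
    "as the beam swings past the camera", fake_generate, fake_review
)
print(chosen, len(all_takes))
```

Keeping every take, not just the accepted one, mirrors the practice of selecting the best output from a pool of attempts.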
Case Study:
Prompt: An 8-second, powerful and cinematic video set against the backdrop of a high-rise construction site during the golden hour. The scene opens with a dynamic, low-angle tracking shot following a massive steel beam as it is hoisted upwards by a crane. Below, a diverse team of construction workers guide it into place, with sparks flying from welders in the background. The sun glints off the steel and the workers' helmets. As the beam swings past the camera, it creates a natural wipe transition. On the other side of the wipe, the camera is now on a tight close-up of a Malaysian engineer standing on a higher level of the structure. He is wearing a white hard hat and a determined, proud expression. He turns his head from overseeing the work to look directly and confidently into the camera, and says with a clear Malaysian accent: “And with unity, we build.” The final shot holds on his face as the sounds of construction continue around him. The shot is epic, with lens flare from the setting sun. 4K.
Attempt 1:

Attempt 2:

Attempt 3:

Attempt 4:

Key Takeaway:
AI video generation is a highly iterative process. We learned that needing multiple attempts does not mean the prompt is wrong or the AI is incapable; retries are a fundamental part of the workflow.
Business Potential
Beyond the creative experiment, this project validated the commercial opportunity of AI-generated video production. The team proved that a small, non-specialized team could create professional-quality, culturally resonant video content in a fraction of the time and cost of traditional production. This opens up new, cost-effective possibilities for various applications, including:
- Advertising & Marketing Campaigns: Creating highly customized videos for brands.
- Corporate Storytelling: Producing internal communications or training videos with a fast turnaround.
- Event & Holiday Content: Delivering timely, celebratory content for national holidays and events.
This case study shows that by combining a clear narrative, detailed prompt engineering, and an iterative process, small teams can use AI to unlock new creative and commercial potential in the media and advertising industries.