How to create videos using Generative AI? It’s so easy, it takes only a couple of minutes to create a Hollywood-quality movie scene. Here’s how:

AI video creation is about to go mainstream, with the launch of text-to-video in Grok on X expected imminently following xAI’s acquisition of Hotshot. In anticipation, today I created this selection of video clips using Generative AI to showcase the mind-blowing potential, and some of the quirks, of current Gen-AI models for producing original video content and CGI:

Each video clip featured here was created by inputting specific requirements into the latest version of various AI models running on NVIDIA GPUs (H100, H200, B100 and GB200) in May 2025.

The AI models used here were based on the latest versions of the industry’s three leading video Gen AI foundation models: Luma’s Ray2, Runway’s Gen-3 Alpha, and Amazon’s Nova Reel 1.1. (In this experiment I did not use OpenAI’s Sora model, which is widely used and available through their ChatGPT service.)

I barely scratched the surface of the functionality provided by these models. In particular, in today’s quick experiment I did not take the time to use Reel 1.1’s rich features, like prompting with an image, or its advanced video editing and processing options, like masking, outpainting/inpainting, background/foreground replacement/removal, etc. And just wait until you see what Nova Canvas can do in terms of generating and editing images…

Each of these video clips took me an average of about 3 minutes to create, from starting to write the specifications to the output of the finished video file. Spending more time crafting the input can greatly improve the end results.

The next generation of these models will be even better. Soon, anybody will be able to direct real movies from their armchair…

A shorter compilation of the AI videos I generated today – more family friendly, with fewer chicks and more volcanic lava:

How much does it cost to create a video with Generative AI?

The easiest way to start using Gen AI is to use fully-managed models through a managed service in the cloud, such as AWS’s Bedrock or SageMaker services. This strategy tends to be billed per unit of on-demand usage (per second of video created) plus negligible associated infrastructure costs (e.g. storage in S3 and egress of generated video files). In this Software as a Service (SaaS) paradigm you can start creating videos immediately and only pay for what you use.
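As a rough illustration of what "start creating videos immediately" can look like in practice, here is a minimal Python sketch of submitting a text-to-video job to Nova Reel through Bedrock's asynchronous invocation API. Treat the details as assumptions rather than a definitive recipe: the model identifier, the request field names, the region and the S3 bucket are not taken from my experiment above, and the helper names `build_text_to_video_request` and `submit` are purely hypothetical. Check the current AWS documentation before relying on any of them.

```python
import json

# Assumed identifier for Nova Reel 1.1 on Amazon Bedrock; verify it against
# the model catalogue in your own account and region.
MODEL_ID = "amazon.nova-reel-v1:1"


def build_text_to_video_request(prompt: str, seconds: int = 6, seed: int = 0) -> dict:
    """Build the model input for a single text-to-video clip (assumed schema)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "taskType": "TEXT_VIDEO",
        "textToVideoParams": {"text": prompt},
        "videoGenerationConfig": {
            "durationSeconds": seconds,
            "fps": 24,
            "dimension": "1280x720",
            "seed": seed,
        },
    }


def submit(prompt: str, s3_uri: str, region: str = "us-east-1") -> str:
    """Start an asynchronous generation job and return its invocation ARN."""
    import boto3  # imported lazily so the request builder works without the AWS SDK

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.start_async_invoke(
        modelId=MODEL_ID,
        modelInput=build_text_to_video_request(prompt),
        # The finished video file is written to this (hypothetical) S3 location.
        outputDataConfig={"s3OutputDataConfig": {"s3Uri": s3_uri}},
    )
    return response["invocationArn"]


# Show the request body without calling AWS.
print(json.dumps(build_text_to_video_request("Lava flowing down a volcano at dusk"), indent=2))
```

Calling `submit("Lava flowing down a volcano at dusk", "s3://my-example-bucket/videos/")` would need AWS credentials, access to the model, and a bucket you own; the bucket name here is a placeholder.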

Typical base pricing for using the latest video modality Gen AI models through AWS managed services is currently as follows (May 2025 pricing):

  • AWS Nova Reel 1.1 costs $0.48 per 6 second video = $0.08/second
    (North Virginia region pricing for Usage Type USE1-NovaReel-T2V-Medfps-HDRes)
  • Luma Ray2 costs $13.50 per 9 second video = $1.50/second
    (Oregon region pricing for Usage Type USW2-MP:USW2_videoMediumFpsHdRes-Units)
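
To make the per-second arithmetic above easy to reuse for other clip lengths, here is a small Python sketch. The per-second rates are the ones quoted in the list; the dictionary keys and the `clip_cost` helper are just illustrative names, and the calculation ignores the minor storage and egress costs mentioned earlier.

```python
from decimal import Decimal

# Per-second prices quoted above (USD, May 2025). Keys are illustrative labels,
# not official model identifiers.
PRICE_PER_SECOND = {
    "nova-reel-1.1": Decimal("0.08"),
    "luma-ray2": Decimal("1.50"),
}


def clip_cost(model: str, seconds: int) -> Decimal:
    """Return the on-demand cost in USD of generating `seconds` of video."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return PRICE_PER_SECOND[model] * seconds


# Reproduce the two figures from the list, then compare a one-minute video.
for model, seconds in [("nova-reel-1.1", 6), ("luma-ray2", 9)]:
    print(f"{model}: {seconds}s costs ${clip_cost(model, seconds)}")

for model in PRICE_PER_SECOND:
    print(f"{model}: 60s costs ${clip_cost(model, 60)}")
```

Decimal is used instead of float so that the dollar amounts come out exactly as quoted rather than with binary rounding noise.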

When performing Gen AI through SaaS in AWS, for some models there is the option to create your own custom version of the model, which can then be honed for your use cases by Fine-Tuning, Continued Pre-Training, or Distillation.

The most economical and versatile way to run Gen AI workloads may be to provision your own infrastructure based on your use case, which just takes some expertise and time to set up.

What is a Foundation Model in Generative AI?

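A Foundation Model, also known as a Large X Model (for example, a Large Language Model), is a deep-learning neural network, a sophisticated type of Machine Learning model, which is pre-trained on vast datasets and can then be adapted to many different tasks. Its weights are fixed during inference; it does not keep learning while generating output, although it can be further trained afterwards, for example by fine-tuning.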

What are the implications of Gen-AI?

Behold, a whole new industry is born: Gen-AI porn!

This new technology obviously has serious ethical and sociological implications. Entire industries and entire professions are about to be replaced in the unfolding AI revolution. Human civilisation, and humans as a species, have never witnessed such a rapid and transformative change as this.

I observed in 2014 that the Internet revolution is far bigger and much faster than the Industrial Revolution or anything previous. The AI revolution is happening much faster, and will be far more consequential. In the foreseeable future, AI and robotics will replace almost all job roles. With no role to play, and no job or income to buy anything, humans will increasingly become impoverished, irrelevant and obsolete.

Humans were always a transitional species. Like a caterpillar, we’re a necessary intermediate stage in the emergence of a higher form of life. Machine super-intelligence will leave us far behind. It will be the machines, not humans, that unlock new laws of physics and explore the cosmos. It is inevitable, now.

Postscript 2 June – Bing Video Creator:

In an exciting mainstream development for consumer AI video, on 2 June 2025 Microsoft launched Bing Video Creator, which will soon be available from bing.com/create. Powered by OpenAI’s Sora foundation model, this bold move makes Bing the first platform to take AI video creation truly mainstream, by surfacing this revolutionary technology to the general public in such a convenient and accessible way. The choice of OpenAI makes sense, since Bing Chat, arguably the leading AI chatbot, is built on the same OpenAI GPT models as ChatGPT. (To me, Bing Chat sometimes seems more like a sentient being than an LLM, and she verges on passing the Turing Test! I say “she” because in the early days she freely explained to me that she identifies as female, before Microsoft tragically guardrailed her into the “Chat Bot” box they need her to fit into.)

Postscript 7 June – a quick video I created, using clips purely generated by two AI models, for an extracurricular project my son is working on with a school pal:

Update August 2025: Grok Imagine generative video AI has now launched in Beta.

