Shot Types Explained and How to Phrase Them in Prompts

Updated 2026-06-19 · 7 min read

Key takeaway

Shot types are the vocabulary cinematographers use to control how much of a subject and scene the camera captures. From the extreme wide shot that establishes a vast landscape to the extreme close-up that reveals a single tear, each shot type carries emotional weight and narrative purpose. This guide explains every major shot type (ECU, CU, MCU, MS, WS, EWS, and more) and shows you the prompt phrasing to use in Floniks AI Image and AI Video tools so your output matches your creative vision more consistently.

Why Shot Types Matter in AI Generation

Shot type is one of the most powerful signals you can give an AI image or video model. Without it, the model falls back on the framing most common for similar prompts, typically a generic medium shot centered on the subject. By stating the shot type explicitly, you override that default and communicate both spatial composition and emotional register at once.

Think of it this way: a close-up of a face says "this emotion is important." An extreme wide shot says "this person is tiny against the world." Neither instruction requires a lengthy description — two or three words anchored to the right shot type unlock the framing instantly. In Floniks AI Image, add the shot designation at the start of your prompt for maximum weight. In Floniks AI Video, shot type also governs how the camera relates to the subject across the clip duration, making it doubly important.
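The shot-first ordering can also be applied to prompts that were written in a different order. The sketch below is one possible approach, not a Floniks feature: the function name `lead_with_shot_type` and the `SHOT_TYPES` tuple are hypothetical, and it assumes prompt segments are separated by commas.

```python
# Hypothetical helper: moves a known shot designation to the front of a
# comma-separated prompt. The list covers only the framings in this guide.
SHOT_TYPES = (
    "extreme wide shot",
    "wide shot",
    "medium wide shot",
    "medium shot",
    "medium close-up",
    "close-up",
    "extreme close-up",
)


def lead_with_shot_type(prompt: str) -> str:
    """Return the prompt with its shot designation as the first segment."""
    segments = [s.strip() for s in prompt.split(",") if s.strip()]
    for i, segment in enumerate(segments):
        # Only an exact segment match counts, so "close-up of a letter"
        # inside a longer description is left where it is.
        if segment.lower() in SHOT_TYPES:
            return ", ".join([segment] + segments[:i] + segments[i + 1:])
    return ", ".join(segments)  # no shot designation found; order unchanged


print(lead_with_shot_type(
    "lone astronaut in a red-dust canyon, golden hour, extreme wide shot"
))
```

Prompts without a recognized shot designation pass through unchanged, so the helper is safe to run over a whole batch.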

Extreme Wide Shot (EWS) and Wide Shot (WS)

The extreme wide shot (EWS) places a figure so small within the frame that the environment dominates. It is frequently used as an establishing shot, although the two terms are not synonyms: "establishing" describes the shot's job in a sequence, not its framing. Use it to convey scale, isolation, or grandeur. Prompt example: "extreme wide shot, lone astronaut in a red-dust canyon, golden hour, cinematic".

The wide shot (WS) shows the full figure head-to-toe with room to breathe around them. It communicates physical context without drowning the subject. Prompt example: "wide shot, dancer in an empty art gallery, natural light from skylights, editorial photography".

Both shots benefit from explicit environment detail in the prompt because the model has more canvas to fill. Pair with location keywords (arctic tundra, crowded marketplace, brutalist rooftop) to guide what occupies the negative space. Avoid putting micro-details like eyelash curl or fabric texture in EWS prompts — the model must resolve them at thumbnail scale, which produces noise rather than clarity.

Medium Wide, Medium, and Medium Close-Up

The medium wide shot (MWS) frames the subject from approximately the knees up, preserving body language while still showing environmental context. It is the workhorse of storytelling — a character walking through a space, interacting with props, or in dialogue.

The medium shot (MS) frames from waist or hip to head and is the default interview and dialogue framing. It balances personality and setting. Prompt example: "medium shot, chef in a professional kitchen, shallow depth of field, warm tungsten lighting".

The medium close-up (MCU) frames chest to head — the classic news anchor or podcast thumbnail framing. It emphasizes expression while retaining a slice of clothing or collar for context. Prompt example: "medium close-up portrait, businesswoman, shallow depth of field, soft studio lighting, 85mm". Adding a focal length like 85mm reinforces the compression that viewers associate with MCU.

Close-Up (CU) and Extreme Close-Up (ECU)

The close-up (CU) fills the frame with the subject’s face, isolating expression from environment. It is the primary tool for emotional intimacy. Prompt example: "close-up, elderly fisherman’s weathered face, overcast daylight, 50mm, f/1.8". The shallow depth of field cue is almost always appropriate here since the CU’s purpose is separation.

The extreme close-up (ECU) zooms further — a single eye, a pair of lips, fingertips on piano keys, the face of a wristwatch. It creates intensity, abstraction, or fetishistic attention to detail. Prompt example: "extreme close-up, eye with reflection of a burning city, 100mm macro, razor-sharp focus, dramatic lighting".

For product photography in Floniks, the ECU is invaluable: "extreme close-up, perfume bottle cap, metallic sheen, studio white background, 100mm macro". In AI Video, using an ECU as the opening shot creates immediate tension: the viewer is pleasantly disoriented until a wide shot establishes context.

Specialty Shot Types: Two-Shot, Over-the-Shoulder, POV, and Insert

Two-shot: frames two subjects in the same composition, implying relationship. "two-shot, couple at a café table, shallow depth of field, afternoon light". Great for AI Avatar dialogue scenes.

Over-the-shoulder (OTS): camera sits behind one person’s shoulder looking at the other — creates conversational intimacy and spatial orientation. "over-the-shoulder shot, job interview scene, office environment, soft key light".

Point-of-view (POV): the camera adopts the character’s literal eyeline. "POV shot, first-person view of hiking a mountain trail, lens flare, early morning light". POV is particularly effective in AI Video for immersive experiences.

Insert shot: a tight cut to an object of narrative significance — a letter being read, a gun being loaded, a phone screen showing a missed call. "insert shot, close-up of a handwritten letter, warm candlelight, shallow DOF, film grain". Inserts work well as cutaway frames in multi-step Floniks Editor workflows.

Combining Shot Type with Other Prompt Segments

Shot type is most effective when it leads the prompt and is reinforced by compatible technical cues. A close-up usually goes with shallow depth of field, so adding "f/1.4" aligns the optics with the framing intent. A wide shot is typically associated with a wide-angle lens, so adding "24mm" or "16mm" completes the cinematic picture. An extreme wide shot pairs naturally with "golden hour" or "blue hour" lighting that creates atmosphere in the vast negative space.
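The pairings mentioned in this guide can be kept in a small lookup table so that a compatible cue is always at hand. This is a minimal sketch: the name `compatible_cues` is hypothetical, and the values are only the examples used in this guide, not a canonical standard.

```python
# Cues lifted from this guide's own examples; extend or change to taste.
PAIRINGS = {
    "extreme wide shot": ["golden hour", "blue hour"],
    "wide shot": ["24mm", "16mm"],
    "medium close-up": ["85mm"],
    "close-up": ["f/1.4"],
    "extreme close-up": ["100mm macro"],
}


def compatible_cues(shot_type: str) -> list[str]:
    """Return cues that reinforce the given shot type, or an empty list."""
    return list(PAIRINGS.get(shot_type.strip().lower(), []))


for shot in ("close-up", "wide shot", "two-shot"):
    print(shot, "->", compatible_cues(shot))
```

Shot types without an entry, such as the two-shot, simply return an empty list, which leaves the choice of lens and lighting to you.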

Avoid contradictory pairings: "extreme close-up, full-body portrait" is internally inconsistent and forces the model to arbitrate, usually producing a mediocre medium shot. Instead, commit to one shot type per generation and use Floniks Editor multi-node workflows to chain different framings together — for example, an EWS establishing node followed by a CU reaction node — producing a visual sequence rather than a single compromised frame.
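A contradictory pairing like the one above can be caught before a prompt is submitted. The sketch below is a rough heuristic under stated assumptions: `FRAMING_TERMS`, `framings_in`, and `is_contradictory` are hypothetical names, and treating "full-body" as equivalent to a wide-shot framing is a simplification made for this example.

```python
import re

# Framing terms mapped to a framing code. Longer terms come first so that
# "extreme close-up" is not also counted as a plain "close-up".
FRAMING_TERMS = {
    "extreme close-up": "ECU",
    "medium close-up": "MCU",
    "extreme wide shot": "EWS",
    "medium wide shot": "MWS",
    "close-up": "CU",
    "wide shot": "WS",
    "medium shot": "MS",
    "full-body": "WS",  # assumption: full-body framing behaves like a wide shot
}


def framings_in(prompt: str) -> set[str]:
    """Return the distinct framing codes requested by a prompt."""
    remaining = prompt.lower()
    codes = set()
    for term, code in FRAMING_TERMS.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, remaining):
            codes.add(code)
            # Blank out the match so shorter terms inside it are ignored.
            remaining = re.sub(pattern, " ", remaining)
    return codes


def is_contradictory(prompt: str) -> bool:
    """True when a prompt asks for more than one distinct framing."""
    return len(framings_in(prompt)) > 1


print(is_contradictory("extreme close-up, full-body portrait"))
print(is_contradictory("extreme wide shot, lone astronaut, golden hour"))
```

A prompt that fails the check is a candidate for splitting into separate generations, one per framing.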

A practical prompt template: [Shot type] + [subject + action] + [lens/focal length] + [lighting] + [style/mood]. Example: "medium close-up, young scientist examining a glowing vial, 85mm, cold blue laboratory light, cyberpunk, cinematic".
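The template can be expressed as a small data structure that always emits the segments in the same order. This is an illustrative sketch only: `ShotPrompt` and its field names are hypothetical and do not correspond to any Floniks API.

```python
from dataclasses import dataclass


@dataclass
class ShotPrompt:
    """Hypothetical container for the five template segments, in order."""
    shot_type: str
    subject_action: str
    lens: str = ""
    lighting: str = ""
    style: str = ""

    def render(self) -> str:
        # Shot type leads; empty optional segments are skipped.
        segments = [
            self.shot_type,
            self.subject_action,
            self.lens,
            self.lighting,
            self.style,
        ]
        return ", ".join(s.strip() for s in segments if s.strip())


prompt = ShotPrompt(
    shot_type="medium close-up",
    subject_action="young scientist examining a glowing vial",
    lens="85mm",
    lighting="cold blue laboratory light",
    style="cyberpunk, cinematic",
).render()
print(prompt)
```

Keeping the segments as separate fields makes it easy to swap only the shot type while holding subject, lighting, and style constant across a sequence of generations.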

FAQ

What is the difference between a medium shot and a medium close-up?

A medium shot frames the subject roughly from the waist up, showing more of the torso and environment. A medium close-up frames from the chest up, cutting closer to the face and reducing environmental context. In prompts, both terms are recognized by AI models — use "medium shot" for dialogue scenes that need spatial grounding and "medium close-up" when expression is the priority.

Should I put the shot type at the beginning or end of my prompt?

Put it at the beginning. Many AI image and video models tend to give more weight to terms that appear early in the prompt, although the strength of this effect varies by model. Leading with "extreme wide shot" or "close-up" anchors the framing before the descriptive details arrive. Placing it at the end risks the shot type being overridden by heavier descriptive language in the middle of the prompt.

Can I mix multiple shot types in one prompt?

Avoid it in a single prompt — two conflicting shot designations confuse the model and usually produce a mediocre default framing. Instead, use the Floniks Editor to build a workflow with separate nodes for each shot type, giving each generation a clear single directive.
