AI Image Generation Glossary: 50 Terms Explained
AI image generation has its own vocabulary, and learning it makes you significantly more effective at getting the images you want. This glossary covers the 50 most important terms — from basic concepts to technical parameters — in plain, practical English.
Core Concepts
Text-to-Image (T2I): Generating an image from a written text description (prompt). The foundational capability of modern AI image generators.
Image-to-Image (I2I): Using an existing image as input, with a prompt, to generate a modified version. Useful for style transfer, variation generation, and controlled editing.
Prompt: The text description you provide to guide image generation. Can be a simple phrase or a detailed multi-sentence description.
Negative Prompt: Text that tells the AI what NOT to include in the image. Used to avoid common artifacts (blurry, low quality) or specific content.
Generation / Inference: The process of producing an image from a prompt. "Running inference" means executing the AI model to produce output.
Iteration: Generating multiple versions of an image, often with slight prompt variations, to find the best result.
Models
Foundation Model: A large AI model trained on massive datasets. The base model that specific-use models are built on (e.g., Stable Diffusion, Flux).
Flux: AI image generation model developed by Black Forest Labs, widely regarded as one of the leading models for photorealistic generation. Used in Lensgo.ai.
DALL-E 3: OpenAI's image generation model, integrated with ChatGPT. Known for strong text rendering within images.
Stable Diffusion (SD): Open-source foundation model with a large ecosystem of community fine-tunes, LoRAs, and extensions.
Midjourney: Proprietary image generation service known for artistic, painterly aesthetic. Discord-based.
Fine-Tuned Model: A foundation model that's been further trained on a specific dataset to specialize in a style or subject (portraits, anime, products, etc.).
LoRA (Low-Rank Adaptation): A technique for efficiently fine-tuning a model on a small dataset. LoRAs add style or subject specialization to a base model without retraining the whole model.
Checkpoint: A saved state of a model's weights during or after training. "Loading a checkpoint" means using a specific version of a model.
Technical Parameters
Seed: A number that initializes the random generation process. The same seed + same prompt + same settings produces the same image. Used to reproduce results.
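Why the same seed reproduces the same image: the seed fixes the random noise the model starts from. A minimal sketch using Python's `random` module as a stand-in for the model's noise sampler (a real model samples a latent tensor, not a short list):

```python
import random

def sample_initial_noise(seed, n=8):
    # Stand-in for the random starting noise of a diffusion model.
    # Seeding the RNG makes the sequence fully deterministic.
    rng = random.Random(seed)
    return [rng.gauss(0, 1) for _ in range(n)]

# Same seed -> identical starting noise -> (with the same prompt and
# settings) the same final image.
a = sample_initial_noise(42)
b = sample_initial_noise(42)
c = sample_initial_noise(7)
print(a == b)  # True
print(a == c)  # False: a different seed gives different noise
```

This is why platforms expose the seed: note it down, and you can regenerate or make controlled variations of a result you like.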
Steps (Inference Steps): The number of denoising iterations the model performs during generation. More steps = more refined result (to a point) but slower generation.
CFG Scale (Classifier-Free Guidance): How strongly the generation follows the prompt. Higher CFG = closer adherence to prompt but potentially less realistic. Lower CFG = more "creative" but may drift from prompt.
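Under the hood, CFG is a simple formula: the model makes two predictions, one with the prompt and one without, and the scale controls how far the result is pushed toward the prompted one. A toy sketch (real models apply this to noise-prediction tensors, not short lists):

```python
def cfg_combine(uncond_pred, cond_pred, scale):
    # Classifier-free guidance: start from the unconditional prediction
    # and push toward the prompt-conditioned one. scale = 1.0 reproduces
    # the conditional prediction; higher values follow the prompt harder.
    return [u + scale * (c - u) for u, c in zip(uncond_pred, cond_pred)]

uncond = [0.0, 0.0]
cond = [1.0, -1.0]
print(cfg_combine(uncond, cond, 1.0))  # [1.0, -1.0]
print(cfg_combine(uncond, cond, 7.5))  # [7.5, -7.5]: much stronger prompt pull
```

The amplification at high scales is why very high CFG values can look oversaturated or unnatural: the prompt signal is exaggerated well beyond what the model saw in training.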
Sampler: The algorithm used to denoise the image during generation. Different samplers (Euler, DPM++, DDIM) produce slightly different aesthetic results.
Resolution: The pixel dimensions of the generated image. Higher resolution requires more computational resources and time.
Aspect Ratio: The ratio of width to height (1:1, 16:9, 9:16, 4:5). Determines the image's shape independent of resolution.
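A sketch of how a platform might turn an aspect ratio into concrete dimensions: keep the total pixel count near a budget, and snap to a multiple (diffusion models typically require dimensions divisible by 8 or 64). The budget and multiple here are illustrative assumptions, not any platform's actual values:

```python
import math

def dims_for_ratio(ratio_w, ratio_h, budget=1024 * 1024, multiple=64):
    # Solve w * h ~= budget with w / h = ratio_w / ratio_h,
    # then snap both dimensions to the nearest multiple.
    h = math.sqrt(budget * ratio_h / ratio_w)
    w = h * ratio_w / ratio_h
    snap = lambda x: max(multiple, int(round(x / multiple)) * multiple)
    return snap(w), snap(h)

print(dims_for_ratio(1, 1))   # (1024, 1024)
print(dims_for_ratio(16, 9))  # (1344, 768): roughly 16:9 widescreen
```

This is why a "16:9" image from a generator is rarely exactly 16:9 at the pixel level: dimensions get rounded to what the model can handle.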
Upscaling: Increasing an image's resolution. AI upscaling adds detail rather than just enlarging pixels.
Latent Space: The mathematical space in which diffusion models process image information. Images are encoded into latent space, processed, then decoded.
Generation Techniques
Diffusion Model: The type of AI model that most image generators use. Works by starting from pure random noise and gradually removing it, step by step, guided by the prompt.
Denoising: The core process of diffusion models — progressively removing noise from a noisy image to produce a clean output.
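The shape of the denoising process can be sketched in a few lines. This toy version just moves a noisy vector a fraction of the way toward a known "clean" target each step; a real diffusion model instead uses a neural network to predict the noise to subtract, but the iterative-refinement structure is the same, and it shows why more steps refine the result only up to a point:

```python
import random

def toy_denoise(target, steps, seed=0):
    # Start from pure random noise, then move halfway toward the clean
    # target on every step. Toy stand-in for a diffusion sampler.
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in target]
    for _ in range(steps):
        x = [xi + 0.5 * (ti - xi) for xi, ti in zip(x, target)]
    return x

clean = [1.0, -2.0, 0.5]
err = lambda x: sum(abs(xi - ti) for xi, ti in zip(x, clean))

rough = toy_denoise(clean, steps=4)
fine = toy_denoise(clean, steps=30)
print(err(fine) < err(rough))  # True: more steps, closer to clean
```

Note the diminishing returns: each step removes a fraction of the remaining error, so going from 30 to 60 steps changes far less than going from 4 to 30.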
Inpainting: Editing a specific region of an image while leaving the rest unchanged. You define a "mask" area that the AI fills with new content.
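The role of the mask can be shown with a toy per-pixel blend: where the mask is 1, keep the AI-generated fill; where it is 0, keep the original. Real inpainting also feathers the mask edges and conditions the generation on the surrounding pixels; this sketch uses a hard 0/1 mask on a flat list of "pixels":

```python
def inpaint_composite(original, generated, mask):
    # Per-pixel blend: mask = 1 takes the generated fill,
    # mask = 0 preserves the original image.
    return [g if m else o for o, g, m in zip(original, generated, mask)]

orig = [10, 20, 30, 40]
gen  = [99, 98, 97, 96]
mask = [0, 1, 1, 0]  # edit only the middle two "pixels"
print(inpaint_composite(orig, gen, mask))  # [10, 98, 97, 40]
```

This is why everything outside the mask is guaranteed untouched: those pixels are copied straight from the original.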
Outpainting: Extending an image beyond its original boundaries. The AI generates content outside the original frame.
ControlNet: A technique that adds precise compositional control to image generation. Allows using reference images to control pose, depth, edges, and more.
Style Transfer: Applying the visual style of one image to the content of another. AI style transfer reimagines your photo in an artistic medium.
Super Resolution: AI upscaling that adds realistic detail to low-resolution images, not just enlargement.
VAE (Variational Autoencoder): The component that encodes images into latent space and decodes latent representations back into images.
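A loose analogy for what the VAE does: compress the image into a smaller representation, work there, then expand back. This toy version uses fixed neighbor-averaging instead of a learned neural network (a real VAE compresses roughly 8x per side and learns to reconstruct fine detail), but it shows why diffusion in latent space is cheap: the model operates on far fewer values than there are pixels:

```python
def encode(pixels):
    # Toy "VAE encoder": 2x downsample by averaging neighboring pixels.
    return [(pixels[i] + pixels[i + 1]) / 2 for i in range(0, len(pixels), 2)]

def decode(latents):
    # Toy "VAE decoder": expand each latent value back into two pixels.
    out = []
    for z in latents:
        out += [z, z]
    return out

image = [0, 2, 4, 6, 8, 10, 12, 14]
latent = encode(image)     # half the size: the diffusion model works here
restored = decode(latent)  # back to image size, detail only approximated
print(len(latent), len(restored))  # 4 8
```

The lossy round trip is also why a poorly matched VAE produces washed-out colors or soft detail even when the diffusion step itself went well.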
Prompt Writing
Subject: The main element of the image — what the image is about.
Setting / Scene: Where the subject is located, environmental context.
Style: The visual aesthetic — photorealistic, cinematic, oil painting, anime, etc.
Lighting: Descriptors for the light in the scene — golden hour, studio lighting, overcast, etc.
Composition: How elements are arranged in the frame — close-up, aerial, rule of thirds, centered.
Quality Modifier: Terms that influence output quality and detail level — though effective quality modifiers have evolved and some traditional ones ("8K", "hyperrealistic") can backfire for photorealism.
Token / Token Weight: Diffusion models process prompts as sequences of tokens (roughly words or subwords). Most text encoders have a fixed token limit (CLIP-based models truncate at 77 tokens), so content past the limit may be ignored. Some models allow weighting specific tokens to emphasize them.
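Token weighting often appears as inline syntax in the prompt. A simplified parser for the "(text:weight)" emphasis convention used in the Stable Diffusion ecosystem (e.g. AUTOMATIC1111's web UI); this sketch ignores nesting and the "((...))" shorthand:

```python
import re

def parse_weights(prompt):
    # Split a prompt into (text, weight) pairs. Unweighted text gets 1.0;
    # "(golden hour:1.3)" becomes ("golden hour", 1.3).
    parts = []
    pos = 0
    for m in re.finditer(r"\(([^():]+):([\d.]+)\)", prompt):
        before = prompt[pos:m.start()].strip(" ,")
        if before:
            parts.append((before, 1.0))
        parts.append((m.group(1), float(m.group(2))))
        pos = m.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        parts.append((tail, 1.0))
    return parts

print(parse_weights("a portrait, (golden hour:1.3), oil painting"))
# [('a portrait', 1.0), ('golden hour', 1.3), ('oil painting', 1.0)]
```

Note that this syntax is ecosystem-specific: Midjourney uses a different `::weight` notation, and many hosted platforms accept no weighting syntax at all.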
Prompt Engineering: The practice of crafting effective prompts to achieve desired outputs. A significant skill developed through practice.
Image Editing
Background Removal / Matting: Isolating a subject from its background. AI background removal handles complex edges (hair, fur, transparency).
Object Removal / Inpainting: Removing specific elements from a photo and filling the gap with AI-generated background content.
Face Restoration: Specialized AI enhancement for facial regions — recovering detail, correcting blur, and improving quality in portrait photos.
Magic Eraser: Consumer-friendly name for AI-powered object removal tools.
Video Generation
Text-to-Video (T2V): Generating a video clip from a text description.
Image-to-Video (I2V): Animating a still image into a short video clip.
Temporal Consistency: How stable and coherent objects appear across video frames. Poor temporal consistency produces flickering or morphing objects.
Seedance: AI video model (Seedance 2.0) known for high quality and natural motion. Available in Lensgo.ai.
Kling: AI video generation model known for realistic motion physics. Available in Lensgo.ai.
Wan: AI video generation model. Available in Lensgo.ai.
Platform Concepts
Credits: The unit of usage on most AI image platforms. Each generation consumes credits.
Free Daily Credits: Credits that refresh daily, allowing limited free use of the platform.
Commercial License: Permission to use generated images for commercial purposes (selling, advertising, business use). Most major platforms grant this to paid users.
Watermark: Visual marker added to images on free tiers; typically removed on paid plans.
Start generating images — free daily credits, no technical knowledge required.