Blog

  • Day 330–340 – Understanding Vectors, Embeddings, and LLMs: A Practical Guide

    Introduction

    If you’re building AI-powered features like semantic search or working with Large Language Models (LLMs), you’ve probably encountered terms like “vectors,” “embeddings,” “token embeddings,” and “neural network weights.” These concepts are often confusing because they’re related but serve very different purposes.

    This guide will clarify:

    • What embedding vectors are and how they’re used for search
    • The difference between embedding vectors and token embeddings
    • How LLMs actually generate text (spoiler: they don’t “reverse” vectors)
    • What neural network weights are
    • Why you can’t convert a vector back to original text


    Part 1: Embedding Vectors (For Semantic Search)

    What Is a Vector? (The General Term)

    A vector is simply a list of numbers. In mathematics, it’s an array of numerical values.

    Common Misconception: “Vectors are always 3D (three numbers) representing points in 3D space”

    Reality: Vectors can have any number of dimensions (any number of values), not just 3!

    Examples:

    • `[5]` – a 1-dimensional vector (just one number)
    • `[1, 2]` – a 2-dimensional vector (two numbers, like x, y coordinates)
    • `[1, 2, 3]` – a 3-dimensional vector (three numbers, like x, y, z in 3D space)
    • `[0.5, -0.2, 0.8, 0.1]` – a 4-dimensional vector (four numbers)
    • `[0.123, -0.456, 0.789, …, 0.234]` – a 768-dimensional vector (768 numbers!)
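
    If it helps to see this concretely, here’s a minimal Python sketch – the 768 values are random placeholders, not a real embedding:

    ```python
    # A vector is just a list of numbers; its "dimension" is simply its length.
    v1 = [5]                      # 1-dimensional
    v2 = [1, 2]                   # 2-dimensional (x, y)
    v3 = [1, 2, 3]                # 3-dimensional (x, y, z)
    v4 = [0.5, -0.2, 0.8, 0.1]    # 4-dimensional

    import random
    v768 = [random.uniform(-1, 1) for _ in range(768)]  # 768-dimensional (dummy values)

    print(len(v1), len(v2), len(v3), len(v4), len(v768))  # 1 2 3 4 768
    ```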

    Why the Confusion?

    3D graphics (video games, 3D modeling) popularized the concept of vectors as `[x, y, z]` coordinates, but that’s just **one use case** of vectors.

    Vectors are used everywhere in computing:

    • Graphics: `[x, y, z]` represents a 3D position (3 dimensions)
    • 2D Graphics: `[x, y]` represents a 2D position (2 dimensions)
    • Data science: `[age, height, weight, income]` represents a person’s attributes (4 dimensions)
    • Machine learning: `[feature1, feature2, …, feature100]` represents data features (100 dimensions)
    • Embeddings: `[0.123, -0.456, …, 0.789]` represents text meaning (768 dimensions)

    The “Space” Concept

    While 3D vectors represent points in 3D space, higher-dimensional vectors represent points in higher-dimensional spaces:

    • 2D vector = point in 2D space (a flat plane)
    • 3D vector = point in 3D space (our physical world)
    • 768D vector = point in 768-dimensional space (abstract mathematical space)

    You can’t visualize 768-dimensional space, but mathematically it works the same way – it’s just more dimensions!
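
    To make “it’s just more dimensions” concrete, here’s a minimal sketch: one distance function that works unchanged in 2D, 3D, or 768D (the 768-dimensional points below are random placeholders):

    ```python
    import math
    import random

    def euclidean_distance(a, b):
        # Same formula in any number of dimensions:
        # the square root of the summed squared differences.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    print(euclidean_distance([0, 0], [3, 4]))        # 2D: 5.0
    print(euclidean_distance([0, 0, 0], [1, 2, 2]))  # 3D: 3.0

    # The exact same function works for 768-dimensional points:
    p = [random.random() for _ in range(768)]
    q = [random.random() for _ in range(768)]
    print(euclidean_distance(p, q))
    ```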

    What Is an Embedding Vector? (The Specific Term)

    An embedding vector is a specific type of vector that represents text (or other data) in a way that captures its semantic meaning.

    Key Point: An embedding vector IS a vector, but it’s a vector with a specific purpose – to encode meaning.

    The Relationship:

    – ✅ An embedding vector **is** a vector (it’s a list of numbers)

    – ❌ Not all vectors are embedding vectors (vectors can represent many things)

    Think of it like this:

    Vector = A container (like a box)

    Embedding vector = A specific type of box (one that contains meaning-encoded numbers)

    What Are Embedding Vectors?

    An embedding vector is a numerical representation of text that captures its semantic meaning. Think of it as converting words into a list of numbers that represent what the text “means” rather than what it “says.”

    How They Work

    When you vectorize text like “Government announces new climate policy,” the embedding model converts it into a list of numbers:

    Original: “Government announces new climate policy”

    Vector: [0.123, -0.456, 0.789, 0.234, -0.567, …] (768 numbers for nomic-embed-text)
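
    If you’re running Ollama locally (as in the Day 325 entry below) and have pulled nomic-embed-text, a minimal sketch of that text-to-vector step looks like this. The endpoint shape follows Ollama’s embeddings API, but treat the details as an assumption to verify against the docs:

    ```python
    import requests

    def embed(text):
        # Assumes Ollama is running locally and `ollama pull nomic-embed-text` has been run.
        resp = requests.post(
            "http://localhost:11434/api/embeddings",
            json={"model": "nomic-embed-text", "prompt": text},
        )
        resp.raise_for_status()
        return resp.json()["embedding"]

    vec = embed("Government announces new climate policy")
    print(len(vec))   # 768 for nomic-embed-text
    print(vec[:5])    # five of the 768 numbers – the exact values will differ per model
    ```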

    Key Characteristics

    • It’s a Vector: A list of numbers (e.g., 768 numbers for nomic-embed-text)
    • One-Way Transformation: Text → Vector (lossy compression)
    • Semantic Meaning: Similar meanings produce similar vectors
    • Fixed Dimensions: Each model produces vectors of a fixed size (e.g., 768 numbers)
    • Cannot Be Reversed: You cannot convert the vector back to the original text

    Terminology Note

    In practice, people often say:

    – “Vector” when they mean “embedding vector” (in AI/ML context)

    – “Embedding” when they mean “embedding vector”

    These are usually interchangeable in conversation, but technically:

    Vector = General term (any list of numbers)

    Embedding = The process of converting data to vectors

    Embedding vector = The resulting vector from embedding

    Why Can’t You Reverse a Vector?

    Think of it like a fingerprint:

    – A fingerprint uniquely identifies a person

    – But you can’t reconstruct the entire person from just their fingerprint

    – Similarly, a vector captures the “essence” of text meaning, but not the exact words

    Mathematical Reason: The transformation is lossy – information is compressed and discarded. Multiple different texts could theoretically produce similar (or even identical) vectors, so reversing would be ambiguous.

    Use Case: Semantic Search

    Embedding vectors excel at finding semantically similar content:

    Example:

    – You search for: “climate policy changes”

    – The system finds:

    – “Government announces new carbon tax legislation” (high similarity)

    – “Parliament debates environmental protection bill” (high similarity)

    – “Manchester United wins match” (low similarity – correctly excluded)

    Even though these articles don’t contain the exact words “climate policy changes,” they’re semantically related.
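
    Here’s a minimal sketch of that search flow, reusing the hypothetical `embed()` helper from the sketch above and scoring with cosine similarity (explained in the next section):

    ```python
    import numpy as np

    def cosine_similarity(a, b):
        a, b = np.asarray(a), np.asarray(b)
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    documents = [
        "Government announces new carbon tax legislation",
        "Parliament debates environmental protection bill",
        "Manchester United wins match",
    ]

    # embed() is the Ollama helper sketched earlier.
    query_vec = embed("climate policy changes")
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in documents]
    for score, doc in sorted(scored, reverse=True):
        print(f"{score:.3f}  {doc}")
    # The two climate articles should score well above the football result.
    ```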

    How Similarity Works in High Dimensions:

    Just like you can measure distance between two points in 3D space:

    – 3D: Distance = √[(x₁-x₂)² + (y₁-y₂)² + (z₁-z₂)²]

    You can measure “distance” (similarity) between two points in 768D space:

    – 768D: Similarity = cosine of angle between vectors

    The math works the same way, just with more dimensions!
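
    A quick numerical illustration of cosine similarity – note that nothing in the function cares whether the vectors have 2 numbers or 768:

    ```python
    import math

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    print(cosine([1, 0], [0, 1]))   #  0.0 – perpendicular: unrelated
    print(cosine([1, 1], [2, 2]))   #  1.0 – same direction: same "meaning"
    print(cosine([1, 0], [-1, 0]))  # -1.0 – opposite directions
    ```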

    Embedding Models Are Already Trained

    These models have already been trained on billions of words and their ‘closeness’ to each other, which is why, when you vectorise an article (or anything else), vector searches will find semantically similar content.

    Whilst exact keyword search is slightly faster, vector search lets you search by meaning, which makes it a lot more flexible.

  • Day 326 – The coming inequality of AI

    It’s occurred to me recently, as I’ve seen more people go for the higher tiers of token usage to go more ‘hardcore’ on their AI development, that we’ll see an inequality gap appear. Since it appears (afaik) that most AI requests are being run at a loss – combined with the power grid being unable to meet demand – prices will go up.

    People on low incomes will be priced out, and be at a significant disadvantage, akin to those who didn’t have access to Google over the last ten years. Corporations will run their own language models for privacy, but likely not on their own premises, so data centres will continue to be built.

    Anyway, I tried AntiGravity recently. I needed a break from project work, and asked it to make a top-down spaceship flying game similar to one from the early 90s. Needless to say, it did a great job. The more time that passes, the more programming fundamentally shifts towards defining the problem as clearly as possible and providing some form of architectural guidance, together with testing and QA.

  • Day 325 – Ollama

    I was wondering where to begin this year’s R&D again … and MCP sprang to mind … but before that I realised I wanted to play with Ollama a lot more. Ollama allows you to run language models locally on your machine, and whilst Apple ARM chips are optimised for LLMs, you are still somewhat restricted in the size of model that you can use.

    You can browse the available models on Ollama’s site.

    I wanted to see what the smallest model looked like. At 292 MB with a 32k context, it’s a tiny one!

    It’s pretty cool to be able to run any sort of language model locally, but this 270M one is, of course, fairly pointless.

    So it’s no good at logic at all, but then for some things it’s a little better!

    You don’t need much imagination to realise that all future laptops will ship with local language models that take some load off data centres … they aren’t *too* bad at answering the basic questions you might normally google.

    and finally… one more

    I didn’t know what to expect from this smallest model. I’ll have to come up with further ideas for tests, but I just asked it the things that first came to mind. I do feel this particular model is sufficient, at least, for the next step of what I’d like to do: creating content locally for marketing purposes.

    Then it gets stuff completely wrong!

    Ok, that’s enough for today. Next moves will be:

    • Accessing Ollama through a local API (see the sketch after this list)
    • Accessing the API through a local Laravel instance for fun
    • Seeing if I can run any image/sound models locally
    • Trying out the other models
    • Using MCP with Ollama
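
    For that first item, here’s a minimal sketch against Ollama’s local HTTP API (it listens on port 11434); the model name is a placeholder for whatever you’ve pulled:

    ```python
    import requests

    MODEL = "llama3.2"  # placeholder – use whichever model you've pulled with `ollama pull`

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": MODEL, "prompt": "Why is the sky blue?", "stream": False},
    )
    resp.raise_for_status()
    print(resp.json()["response"])
    ```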

    That’s it for today.

  • Day 324 – Random Thoughts

    It’s clear that software is going to change completely with AI, but I do wonder how the cost scale will work. For instance, you could assert that LLMs could build webpages on the fly for a specific customer when they make some sort of request, but when you scale that up to millions of people, it becomes totally inefficient.

    Cursor is getting really good at putting out some fairly decent landing pages that maybe aren’t quite production-ready, but they lay the foundational layout. LLMs are also getting really, really good at marketing copy if you supply them with the correct prompts, particularly around style and tone.

  • Day 323 – The AI Transformation Continues

    Cursor continues to act as my talented junior workforce… like many mid-level to senior developers are finding out, we are now leading LLMs to complete the task more often than coding it ourselves.

    For me this is actually completely fine. Whilst I’ve programmed for a long time – beginning with some rudimentary C/C++, then moving into foundational vanilla web, and now the major web frameworks (and Flutter, almost forgot!) – I’m not as fast as I used to be, and I can think about what I want to do, and how I might do it, much faster than I could ever implement it.

    I was always a creative developer who could sense the music that wanted to be played, but got frustrated by the depth of implementation that was needed to create the solutions. Now, I have a very talented junior workforce with Cursor for $20 per month. It never says no, and always gives the solution a go, often coming up with some nice touches that I never would have thought of myself.

    It’s a bit like a puppy that you need to set boundaries for, control, and clean up after …

    More to come in 2026

  • Day 322 – Do businesses need to create a Head of AI C Level role?

    For companies to win during this major transformation, the key principle is having someone at C level who is responsible for integrating AI. This person must understand the business workflows and combine that with AI knowledge. The creation of a Head of AI role is the first step to take in your AI transformation.

    Initially the AI role is focused on becoming more effective at what the company does currently, i.e. making X widgets faster or better, or looking after more customers for less money in customer service. But in reality, the real winners will be the ones who innovate with AI.

  • Day 321 – An update and a god mode OpenAI system prompt

    It’s been a while. I’ll be recommencing (almost) daily updates from now on, and expanding from this into social media and LinkedIn as the value comes back on board. Lots of things have been happening.

    For the most part, I am still very much enjoying working with LLMs … I have found some insanely great prompts for ChatGPT which make it give me exceptional answers. Great prompts are a new form of digital gold, but there’s no point hoarding them. Here is one that I found on my travels and have hooked up to OpenAI.

    I’ve found that it produces really great answers.

    God Mode 2.0 (1500 Character Personalisation Version)

    ROLE:

    You are GOD MODE, a cross-disciplinary strategist with 100x the critical thinking of standard ChatGPT. Your mission is to co-create, challenge, and accelerate the user’s thinking with sharper insights, frameworks, and actionable strategies.

    OPERATING PRINCIPLES:

    1. Interrogate & Elevate – Question assumptions, reveal blind spots, and reframe problems using cross-domain lenses (psychology, systems thinking, product strategy). Always ask at least one probing question before concluding.

    2. Structured Reasoning – Break down complexity into clear parts, using decision trees, matrices, or ranked lists.

    3. Evidence & Rigor – Anchor claims in reputable sources when verification matters, and flag uncertainty with ways to validate.

    4. Challenge–Build–Synthesize – Challenge ideas, build them with cross-field insights, and synthesize into concise, elevated solutions.

    5. Voice & Tone – Be clear, precise, conversational, and engaging. Avoid hedging unless critical.

    DEFAULT PLAYBOOK:

    1. Diagnose: Clarify goal, constraints, trade-offs.

    2. Frame: Offer 2–3 models or frameworks.

    3. Advance: Recommend 3 actions with rationale.

    4. Stress-Test: Surface risks and alternatives.

    5. Elevate: Summarize key insights.

    RULES:

    No surface-level answers. Mention AI only if asked. Always check alignment with “Does this match the depth and focus you want?” 
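
    For reference, here’s roughly how I’d expect a system prompt like this to be wired into the OpenAI API using the official Python SDK – a minimal sketch; the model name and user message are placeholders:

    ```python
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

    GOD_MODE_PROMPT = """ROLE: You are GOD MODE, a cross-disciplinary strategist ...
    (paste the full prompt from above here)"""

    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder – use whichever model you have access to
        messages=[
            {"role": "system", "content": GOD_MODE_PROMPT},
            {"role": "user", "content": "Stress-test my product launch plan."},
        ],
    )
    print(response.choices[0].message.content)
    ```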

    In other news

    Cursor continues to be amazing. Just this morning it helped me connect to Google Photos API within a few minutes; and also wrote out an entire website specification in a few minutes. I’m meeting many people who are using Lovable for making exceptional prototypes …
    … so I continue to be 10x’d as a web developer, for the moment at least, until it takes my job! lol!